tensorrt
PulseAugur coverage of tensorrt — every cluster mentioning tensorrt across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
Microsoft engineer compares TensorRT, vLLM, Triton, ONNX for GPU inference
This article compares four key GPU inference frameworks: NVIDIA's TensorRT, vLLM, Triton, and ONNX Runtime. It delves into their architectures, performance characteristics, and suitability for different large language m…
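The comparison centers on how each framework serves a model on the GPU. As a minimal, framework-agnostic illustration (not taken from the article), the sketch below runs an ONNX model on the GPU with ONNX Runtime; the model file name and input shape are placeholders.

```python
import numpy as np
import onnxruntime as ort

# Minimal sketch: run an ONNX model on the GPU via ONNX Runtime.
# "model.onnx" and the (1, 3, 224, 224) input shape are placeholders.
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: dummy_input})
print(outputs[0].shape)
```

TensorRT, vLLM, and Triton expose different entry points (engine building, serving loops, model repositories), but the basic load-then-run shape of the workflow is similar.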
-
New satellite system uses AI for real-time wildfire detection under strict constraints
Researchers have developed a real-time wildfire detection system for use on satellites, designed to operate under strict on-board constraints. The system utilizes a lightweight dense representation learning approach, sp…
-
New DEEP-GAP study compares NVIDIA T4 and L4 GPU inference performance
A new research paper introduces DEEP-GAP, a methodology for evaluating GPU inference performance. The study systematically compares the NVIDIA T4 and L4 GPUs using various deep learning models and precision modes. Resul…
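The paper's exact protocol is not reproduced here; as a hedged sketch of the general idea (timing one model at different precision modes on whichever GPU is present), something like the following PyTorch snippet could be used, with the model and batch size as placeholders.

```python
import time
import torch
import torchvision.models as models

# Generic latency-measurement sketch (not the DEEP-GAP methodology itself):
# time repeated forward passes at a given precision on the current GPU.
def measure_latency(model, dtype, runs=50, warmup=10):
    model = model.to("cuda", dtype=dtype).eval()
    x = torch.randn(1, 3, 224, 224, device="cuda", dtype=dtype)
    with torch.no_grad():
        for _ in range(warmup):
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs * 1000  # ms per inference

for dtype in (torch.float32, torch.float16):
    latency = measure_latency(models.resnet50(weights=None), dtype)
    print(f"{dtype}: {latency:.2f} ms")
```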
-
AI models advance plant disease detection with new datasets and efficient distillation
Researchers have developed new methods for plant leaf disease classification to aid in early detection and treatment. One approach involves training a new base model using the DenseNet201 architecture on a custom datase…
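For readers unfamiliar with the setup, the snippet below is an illustrative sketch only: it starts from an ImageNet-pretrained DenseNet201 and swaps in a new classifier head for a leaf-disease dataset. The class count and the freeze-the-backbone choice are assumptions, not details from the paper.

```python
import torch.nn as nn
import torchvision.models as models

# Illustrative fine-tuning setup; num_classes is a placeholder value.
num_classes = 38
model = models.densenet201(weights=models.DenseNet201_Weights.IMAGENET1K_V1)
model.classifier = nn.Linear(model.classifier.in_features, num_classes)

# Freeze the convolutional backbone and train only the new head as a
# simple starting point for a small custom dataset.
for param in model.features.parameters():
    param.requires_grad = False
```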
-
Object detection models show mixed robustness to quantization and input degradations
A new study investigates how post-training quantization (PTQ) affects the robustness of YOLO object detection models when faced with real-world input degradations like noise and blur. Researchers evaluated various preci…
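The evaluation idea can be sketched generically: perturb inputs with increasing noise and blur, then re-run the (possibly quantized) detector at each severity level. The snippet below is an assumption-laden illustration; `run_detector` is a hypothetical callable standing in for whatever YOLO variant is under test, not an API from the study.

```python
import torch
import torchvision.transforms as T

# Sketch of a robustness sweep over input degradations.
def add_gaussian_noise(img, sigma):
    # img is a float tensor in [0, 1] of shape (C, H, W)
    return (img + torch.randn_like(img) * sigma).clamp(0.0, 1.0)

def robustness_sweep(run_detector, image, sigmas=(0.0, 0.05, 0.1, 0.2)):
    results = {}
    blur = T.GaussianBlur(kernel_size=5)
    for sigma in sigmas:
        degraded = blur(add_gaussian_noise(image, sigma))
        results[sigma] = run_detector(degraded)
    return results
```

Comparing the per-severity results of a full-precision model against its quantized counterpart is one way to surface the "mixed robustness" the headline refers to.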
-
NVIDIA boosts Unreal Engine AI speed 5x; Nadella redefines AI success metrics
NVIDIA has introduced TensorRT for RTX, a technology designed to accelerate Neural Network Engine (NNE) inference within Unreal Engine by up to five times. This advancement aims to significantly reduce latency for real-…
-
Optimizing Transformer Inference: Techniques for Faster, Cheaper Large Models
Large transformer models present significant inference challenges due to their substantial memory footprint and attention computation that scales quadratically with input length. Researchers and practitioners are exploring…
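The quadratic cost comes from self-attention itself: the score matrix has one entry per pair of positions. The minimal sketch below (standard scaled dot-product attention, not any specific optimization from the article) makes the (n × n) intermediate explicit.

```python
import torch

# Vanilla self-attention: the (n x n) score matrix is where both compute
# and attention-map memory grow with the square of the sequence length n.
def attention(q, k, v):
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5  # shape: (n, n)
    return torch.softmax(scores, dim=-1) @ v

n, d = 4096, 64
q = k = v = torch.randn(n, d)
out = attention(q, k, v)  # output: (n, d); intermediate scores were (4096, 4096)
print(out.shape)
```

Techniques such as KV caching, quantization, and attention variants with sub-quadratic cost all target this bottleneck from different angles.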