tensorrt
PulseAugur coverage of TensorRT: every cluster mentioning TensorRT across labs, papers, and developer communities, ranked by signal.
-
Transformer model enhances security for autonomous vehicle platoons
Researchers have developed AIMformer, a transformer-based framework designed for real-time detection of misbehavior in vehicular platoons. This system utilizes multi-head self-attention to analyze temporal dynamics with…
-
New RAMS system adapts YOLOv8 tiers for edge AI perception
Researchers have developed RAMS, a novel runtime controller designed for embedded edge perception systems. RAMS dynamically switches between different tiers of YOLOv8 models based on real-time device resource monitoring…
-
NVIDIA launches Halos OS for certified robotaxi safety
NVIDIA has introduced the Halos Operating System (OS) to enhance safety in autonomous vehicles, particularly for robotaxis. This new OS, built on the NVIDIA DRIVE Hyperion platform, provides a certified foundation for A…
-
LogNEO framework uses GPT-Neo for real-time log anomaly detection
Researchers have developed LogNEO, a new framework for detecting anomalies in system logs using EleutherAI's GPT-Neo model. This system employs a novel reinforcement learning approach with a position-aware reward scheme…
-
Together AI builds world's fastest speech-to-text stack
Together AI has developed a highly efficient speech-to-text system, significantly outperforming existing models in speed. Their approach addresses the unique challenges of audio data processing, which is substantially l…
-
New framework tackles industrial Edge AI deployment challenges
This paper introduces a new systems framework designed to improve the deployment of Edge AI applications on industrial embedded platforms. It argues that treating AI deployment as a systems problem, rather than just a m…
-
DEMON engine enables real-time diffusion control as musical instrument
Researchers have developed DEMON, a real-time diffusion engine that allows users to control the denoising process like a musical instrument. This system enables live performance adjustments to various parameters, achiev…
-
AI video inference sped up 3x by optimizing pipeline, not model
Researchers have developed a method to significantly accelerate video inference for computer vision models without altering the model itself. By optimizing the pipeline of frame reading, model inference, and result visu…
-
New framework optimizes LLM inference energy use on multi-GPU systems
Researchers have developed EnergyLens, a framework designed to optimize the energy consumption of large language models (LLMs) during inference on multi-GPU systems. This tool addresses the challenge of predicting and r…
-
New satellite system uses AI for real-time wildfire detection under strict constraints
Researchers have developed a real-time wildfire detection system for use on satellites, designed to operate under strict on-board constraints. The system utilizes a lightweight dense representation learning approach, sp…
-
New DEEP-GAP study compares NVIDIA T4 and L4 GPU inference performance
A new research paper introduces DEEP-GAP, a methodology for evaluating GPU inference performance. The study systematically compares the NVIDIA T4 and L4 GPUs using various deep learning models and precision modes. Resul…
-
AI models advance plant disease detection with new datasets and efficient distillation
Researchers have developed new methods for plant leaf disease classification to aid in early detection and treatment. One approach involves training a new base model using the DenseNet201 architecture on a custom datase…
-
Object detection models show mixed robustness to quantization and input degradations
A new study investigates how post-training quantization (PTQ) affects the robustness of YOLO object detection models when faced with real-world input degradations like noise and blur. Researchers evaluated various preci…
-
NVIDIA boosts Unreal Engine AI speed 5x; Nadella redefines AI success metrics
NVIDIA has introduced TensorRT for RTX, a technology designed to accelerate Neural Network Engine (NNE) inference within Unreal Engine by up to five times. This advancement aims to significantly reduce latency for real-…
-
Optimizing Transformer Inference: Techniques for Faster, Cheaper Large Models
Large transformer models present significant inference challenges due to their substantial memory footprint and computation costs, which scale quadratically with input length. Researchers and practitioners are exploring…