Researchers have developed a method for accelerating neural network inference by splitting Convolutional Neural Network (CNN) computations between Deep Learning Processing Units (DPUs) and Graphics Processing Units (GPUs). This 'Split CNN Inference' approach processes the initial layers on a DPU near the data source and the remaining layers on a GPU, significantly reducing latency. A Graph Neural Network (GNN) model was also introduced to predict the optimal layer partition point for various CNN architectures, achieving 96.27% accuracy.
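The split described above can be sketched as running a prefix of the layer list on one device and the suffix on another. This is a minimal, device-agnostic illustration; the function name, the toy layers, and the in-process "transfer" point are assumptions for illustration, not the paper's actual API.

```python
# Hypothetical sketch of split CNN inference: layers[:split_at] stand in for
# the DPU stage near the data source, layers[split_at:] for the GPU stage.
from typing import Callable, List

Layer = Callable[[list], list]

def split_inference(layers: List[Layer], x: list, split_at: int) -> list:
    """Run the first `split_at` layers, then the rest on the second device."""
    # Stage 1: early layers execute on the DPU stand-in.
    for layer in layers[:split_at]:
        x = layer(x)
    # In a real deployment the intermediate activation would be transferred
    # DPU -> GPU at this boundary; this sketch keeps everything in-process.
    # Stage 2: remaining layers execute on the GPU stand-in.
    for layer in layers[split_at:]:
        x = layer(x)
    return x

# Toy element-wise layers standing in for convolutions.
double = lambda v: [2 * t for t in v]
inc = lambda v: [t + 1 for t in v]
print(split_inference([double, inc, double], [1, 2], split_at=1))  # [6, 10]
```

The partition index `split_at` is the quantity the paper's GNN model predicts: it trades DPU compute against the size of the activation that must cross the DPU-to-GPU link.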
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT Potential for reduced latency in edge AI applications by optimizing hardware utilization for CNN inference.
RANK_REASON Academic paper proposing a new method for accelerating neural network inference.