HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference

New HGQ-LUT and da4ml methods speed up DNN training and FPGA deployment

By the PulseAugur editorial team · [3 sources] · 2026-04-24 07:13

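Researchers have developed HGQ-LUT, a new method for training lookup-table (LUT) based neural networks that makes training more than 100x faster on modern GPUs. The method introduces specialized layers and fine-grained quantization to explore the accuracy-resource trade-off automatically, with no manual tuning. HGQ-LUT has been integrated into an open-source toolchain, making it practical to deploy these efficient DNNs in applications such as the CERN Large Hadron Collider.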
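To make the lookup-table idea concrete, the following is a minimal Python sketch and not HGQ-LUT's actual implementation. The neuron `toy_neuron`, its integer weights, and the helpers `build_lut` and `lut_entries` are hypothetical names chosen for illustration. It shows that once inputs are quantized to a few bits, any small function can be precomputed into a table, and that the table size grows exponentially with the input bit width, which is the accuracy-resource trade-off mentioned above.

```python
from itertools import product

def build_lut(func, n_inputs, bits):
    """Enumerate every quantized input combination and store func's output."""
    levels = range(2 ** bits)
    return {combo: func(combo) for combo in product(levels, repeat=n_inputs)}

def lut_entries(n_inputs, bits):
    """Number of table entries needed for n_inputs inputs of `bits` bits each."""
    return 2 ** (n_inputs * bits)

def toy_neuron(x):
    # Hypothetical 2-input neuron: made-up integer weights and bias,
    # followed by a ReLU-like clip to an unsigned 2-bit output range [0, 3].
    acc = 3 * x[0] - 2 * x[1] + 1
    return max(0, min(3, acc))

lut = build_lut(toy_neuron, n_inputs=2, bits=2)

# Inference is now a table read: no multiply or add at run time.
print(len(lut))           # 16 entries: 4 levels per input, 2 inputs
print(lut[(1, 2)])        # 3*1 - 2*2 + 1 = 0
print(lut_entries(6, 4))  # 16777216 entries for six 4-bit inputs
```

Lowering the bit width of an input shrinks the table by a factor of two per bit removed, so per-input precision has a direct cost in table size.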

Impact: Speeds up training of DNNs for FPGA deployment, enabling more efficient real-time inference for demanding applications.

Ranking rationale: This is an academic paper detailing a new training method for DNNs on FPGAs.


AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

arXiv cs.LG · Chang Sun, Zhiqiang Que, Bakhtiar Zadeh, Qibin Liu, Kevin H. Alvarez, Wayne Luk, Maria Spiropulu · 2026-04-27 04:00

HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference

arXiv:2604.22293v1 · Announce Type: cross · Abstract: Lookup-table (LUT) based neural networks can deliver ultra-low latency and excellent hardware efficiency on FPGAs by mapping arithmetic operations directly onto the logic primitives. However, state-of-the-art LUT-aware training (L…

arXiv cs.LG · Chang Sun, Zhiqiang Que, Vladimir Loncar, Wayne Luk, Maria Spiropulu · 2026-04-27 04:00

da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs

arXiv:2507.04535v2 · Announce Type: replace-cross · Abstract: Neural networks with a latency requirement on the order of microseconds, like the ones used at the CERN Large Hadron Collider, are typically deployed on FPGAs fully unrolled and pipelined. A bottleneck for the deployment o…

arXiv cs.LG · Maria Spiropulu · 2026-04-24 07:13

HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference

Lookup-table (LUT) based neural networks can deliver ultra-low latency and excellent hardware efficiency on FPGAs by mapping arithmetic operations directly onto the logic primitives. However, state-of-the-art LUT-aware training (LAT) approaches remain difficult to use in practice…
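The da4ml entry above names distributed arithmetic without explaining it, and its abstract is truncated here. As background only, the sketch below shows the classical textbook form of distributed arithmetic for a dot product with constant weights. It is not da4ml's algorithm, and the function names `build_da_table` and `da_dot` and the example weights are hypothetical. Multiplications are replaced by a precomputed table of weight sums, addressed one input bit-plane at a time, followed by shift-and-accumulate.

```python
def build_da_table(weights):
    """Precompute the sum of weights for every subset of inputs (2**n entries)."""
    n = len(weights)
    return [
        sum(w for i, w in enumerate(weights) if (addr >> i) & 1)
        for addr in range(2 ** n)
    ]

def da_dot(weights, xs, bits):
    """Dot product of constant weights with unsigned `bits`-bit inputs.

    Processes one bit position at a time: no multiplications are performed,
    only table lookups, shifts, and additions.
    """
    table = build_da_table(weights)
    acc = 0
    for b in range(bits):
        # Address = bit b of every input, packed into one integer.
        addr = sum(((x >> b) & 1) << i for i, x in enumerate(xs))
        acc += table[addr] << b  # weight the partial sum by 2**b
    return acc

# Illustrative values only: three constant weights, three unsigned 3-bit inputs.
weights = [3, -2, 5]
xs = [2, 7, 1]
result = da_dot(weights, xs, bits=3)
reference = sum(w * x for w, x in zip(weights, xs))
print(result, reference)  # both are -3
```

The table has one entry per subset of inputs, so this form suits small fan-in with fixed weights, which matches the fully unrolled, constant-weight setting the abstract describes.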

