Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators

New bitwise systolic array speeds up QNN inference on FPGAs

By PulseAugur Editorial · [1 source] · 2026-06-24 04:00

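Researchers have developed a novel bitwise systolic array architecture designed for runtime-reconfigurable, multi-precision quantized neural networks (QNNs). The architecture addresses a limitation of existing hardware multipliers, which cannot adjust their precision dynamically for mixed-precision QNN models. The design was implemented and tested on an Ultra96 FPGA, where it showed speedups of 1.3185x to 3.5671x for mixed-precision model inference. It also has a shorter critical-path delay, which allows clock frequencies of up to 250 MHz.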
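To make the idea of runtime-reconfigurable precision concrete, here is a minimal software sketch of bit-level multiplication. It is not the paper's hardware design: the function names `bitwise_multiply` and `mixed_precision_dot` are hypothetical, operands are assumed to be unsigned, and the loops only model the arithmetic that an array of 1-bit cells could carry out. The point it illustrates is that a product can be built from 1-bit partial products, so the operand widths become parameters chosen per call rather than fixed in the multiplier.

```python
def bitwise_multiply(a: int, w: int, a_bits: int, w_bits: int) -> int:
    """Multiply two unsigned integers by summing 1-bit partial products.

    a_bits and w_bits are runtime parameters (an assumption of this sketch),
    standing in for a precision setting that can change between layers.
    """
    if not (0 <= a < (1 << a_bits)) or not (0 <= w < (1 << w_bits)):
        raise ValueError("operand does not fit in the requested precision")

    acc = 0
    for i in range(a_bits):
        a_i = (a >> i) & 1              # i-th bit of the activation
        for j in range(w_bits):
            w_j = (w >> j) & 1          # j-th bit of the weight
            # A 1-bit product is a logical AND; its weight is 2**(i + j).
            acc += (a_i & w_j) << (i + j)
    return acc


def mixed_precision_dot(acts, weights, a_bits: int, w_bits: int) -> int:
    """Dot product in which every multiply uses the same chosen bit widths."""
    return sum(
        bitwise_multiply(a, w, a_bits, w_bits) for a, w in zip(acts, weights)
    )
```

For example, `bitwise_multiply(200, 13, 8, 4)` uses 8 x 4 = 32 one-bit products, while `bitwise_multiply(3, 2, 2, 2)` uses only 4. Under the assumptions above, lower-precision layers need proportionally less work, which is the kind of saving a mixed-precision model can exploit.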

Impact: The architecture could enable faster, more efficient inference of complex AI models on resource-constrained edge devices.

Ranking rationale: This cluster contains an academic paper detailing a new hardware architecture for AI inference.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI · Yuhao Liu, Salim Ullah, Akash Kumar · 2026-06-24 04:00

Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators

arXiv:2602.23334v2 Announce Type: replace-cross Abstract: Neural network accelerators have been widely applied to edge devices for complex tasks like object tracking, image recognition, etc. Previous works have explored the quantization technologies in related lightweight acceler…
