PulseAugur

New bitwise systolic array boosts QNN inference speed on FPGAs

Researchers have developed a bitwise systolic array architecture for runtime-reconfigurable multi-precision quantized neural networks (QNNs). The architecture addresses a limitation of existing hardware multipliers, which cannot dynamically adjust precision for mixed-precision QNN models. Implemented and tested on an Ultra96 FPGA, the design achieves speedups of 1.3185x to 3.5671x for mixed-precision model inference. It also has a shorter critical path delay, enabling higher clock frequencies of up to 250 MHz.
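As a rough illustration of the general idea behind bitwise multi-precision multiplication, the sketch below decomposes two unsigned operands into single bits, combines them with 1-bit ANDs, and accumulates the shifted partial products. It is a software analogy only, not the paper's architecture, which this summary does not describe in detail. The function names `bitwise_multiply` and `bitwise_dot`, the unsigned-only operands, and the `a_bits`/`w_bits` parameters are assumptions made for the example.

```python
from typing import Sequence


def bitwise_multiply(a: int, w: int, a_bits: int, w_bits: int) -> int:
    """Multiply two unsigned integers using only 1-bit ANDs, shifts, and adds.

    a_bits and w_bits are chosen at call time, which loosely mirrors the
    idea of selecting operand precision at runtime (hypothetical interface).
    """
    if not (0 <= a < (1 << a_bits)) or not (0 <= w < (1 << w_bits)):
        raise ValueError("operand does not fit in the requested bit width")
    acc = 0
    for i in range(a_bits):
        a_bit = (a >> i) & 1
        for j in range(w_bits):
            w_bit = (w >> j) & 1
            # Each 1-bit product contributes with weight 2^(i + j).
            acc += (a_bit & w_bit) << (i + j)
    return acc


def bitwise_dot(acts: Sequence[int], weights: Sequence[int],
                a_bits: int, w_bits: int) -> int:
    """Dot product built from the bitwise multiplier above."""
    return sum(bitwise_multiply(a, w, a_bits, w_bits)
               for a, w in zip(acts, weights))


if __name__ == "__main__":
    # 4-bit activation times 2-bit weight, then a 2-bit by 2-bit dot product.
    print(bitwise_multiply(5, 3, a_bits=4, w_bits=2))
    print(bitwise_dot([1, 2, 3], [3, 2, 1], a_bits=2, w_bits=2))
```

In this sketch, lowering `a_bits` or `w_bits` shrinks the number of 1-bit operations, which is the intuition for why lower-precision layers can run faster on hardware that works at bit granularity.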

IMPACT This architecture could enable more efficient and faster inference of complex AI models on edge devices with limited resources.

RANK_REASON The cluster contains an academic paper detailing a new hardware architecture for AI inference.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI · TIER_1 · English (EN) · Yuhao Liu, Salim Ullah, Akash Kumar

    Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators

    arXiv:2602.23334v2 Announce Type: replace-cross Abstract: Neural network accelerators have been widely applied to edge devices for complex tasks like object tracking, image recognition, etc. Previous works have explored the quantization technologies in related lightweight acceler…