Mitigating scalability challenges in LUT-based neural networks via pruning optimisations
Researchers have developed a lookup-table (LUT) based approximate matrix multiplication unit (LUT-MU) designed to improve the scalability and energy efficiency of neural networks. The architecture integrates a pruning strategy into the MADDNESS algorithm to contain the growth in resources as problem sizes and precision requirements increase. Deployed in neural networks for the MNIST, CIFAR-10, and ImageNet datasets, the LUT-MU has shown significant improvements in throughput and energy efficiency over CUDA-based and quantized implementations, with only a minor impact on accuracy.
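
To make the idea of LUT-based approximate matrix multiplication concrete, the following is a minimal NumPy sketch, not the LUT-MU design itself. It assumes a product-quantization style scheme: each row of the input matrix is split into subvectors, each subvector is replaced by the index of its nearest prototype, and the product with the weight matrix is then assembled from precomputed table entries using only lookups and additions. MADDNESS itself encodes inputs with learned hash trees rather than a nearest-prototype search, so the encoding step here is a simplified stand-in, and the pruning strategy is not modelled. All function names (build_luts, encode, lut_matmul) are hypothetical.

```python
import numpy as np


def build_luts(prototypes, B):
    """Precompute dot products between every prototype and the weights.

    prototypes: (C, K, d) array, K prototypes for each of C subspaces.
    B:          (C * d, N) weight matrix.
    Returns an array of shape (C, K, N).
    """
    C, K, d = prototypes.shape
    B_sub = B.reshape(C, d, -1)  # rows of B grouped per subspace
    return np.einsum("ckd,cdn->ckn", prototypes, B_sub)


def encode(A, prototypes):
    """Map each subvector of A to the index of its nearest prototype.

    Assumption: nearest-prototype search stands in for the hash-tree
    encoding used by MADDNESS.
    A: (M, C * d). Returns integer codes of shape (M, C).
    """
    C, K, d = prototypes.shape
    A_sub = A.reshape(A.shape[0], C, d)
    diffs = A_sub[:, :, None, :] - prototypes[None, :, :, :]
    dists = (diffs ** 2).sum(axis=-1)  # (M, C, K) squared distances
    return dists.argmin(axis=-1)


def lut_matmul(codes, luts):
    """Approximate A @ B using table lookups and additions only."""
    C = luts.shape[0]
    gathered = luts[np.arange(C), codes, :]  # (M, C, N)
    return gathered.sum(axis=1)  # (M, N)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, K, d, N, M = 4, 16, 8, 10, 32  # illustrative sizes only
    prototypes = rng.normal(size=(C, K, d))
    A = rng.normal(size=(M, C * d))
    B = rng.normal(size=(C * d, N))

    approx = lut_matmul(encode(A, prototypes), build_luts(prototypes, B))
    exact = A @ B
    print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

In this sketch the table holds C * K * N entries, so its size grows with the number of subspaces, the number of prototypes per subspace, and the output width. That growth is the kind of resource expansion the pruning strategy is described as managing.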