IMEC: A Memory-Efficient Convolution Algorithm For Quantised Neural Network Accelerators
Citation:
Eashan Wadhwa, Shashwat Khandelwal, Shreejith Shanker, IMEC: A Memory-Efficient Convolution Algorithm For Quantised Neural Network Accelerators, 33rd IEEE International Conference on Application-specific Systems, Architectures and Processors, Gothenburg, Sweden, July 2022, IEEE, 2022
Abstract:
Quantised convolution neural networks (QCNNs) on FPGAs have shown tremendous potential for deploying deep learning on resource-constrained devices closer to the data source or in embedded applications. An essential building block of (Q)CNNs is the convolutional layer. FPGA implementations use modified versions of convolution kernels to reduce the resource overheads using variations of the sliding kernel algorithm. While these alleviate resource consumption to a certain degree, they still consume considerable (distributed) memory resources, requiring the use of larger FPGA devices with sufficient on-chip memory elements to implement deep QCNNs. In this paper, we present the Inverse Memory Efficient Convolution (IMEC) algorithm, a novel strategy to lower the memory consumption of convolutional layers in QCNNs. IMEC lowers the footprint of intermediate matrix buffers incurred within the convolutional layers and the multiply-accumulate (MAC) operators required at each layer through a series of data organisation and computational optimisations. We evaluate IMEC by integrating it into the BNN-PYNQ framework that can compile high-level QCNN representations to the FPGA bitstream. Our results show that IMEC can optimise memory footprint and the overall resource overhead of the convolutional layers by ∼33% and ∼20% (LUT and FF count) respectively, across multiple quantisation levels (1-bit to 8-bit), while maintaining the same inference accuracy as the state-of-the-art QCNN implementations.
Author's Homepage:
http://people.tcd.ie/shankers
Author: Shanker, Shreejith
Other Titles:
33rd IEEE International Conference on Application-specific Systems, Architectures and Processors
Publisher:
IEEE
Type of material:
Conference Paper
Availability:
Full text available
Keywords:
Research Subject Categories::TECHNOLOGY, Convolution Neural Networks, Field Programmable Gate Arrays, Inference Algorithms
Subject (TCD):
Making Ireland, Smart & Sustainable Planet, Telecommunications, ARTIFICIAL NEURAL NETWORKS, Quantised Neural Networks, Reconfigurable Computing
Source URI:
https://xilinx.github.io/finn/
Licences: