Title: ACane: An Efficient FPGA-based Embedded Vision Platform with Accumulation-as-Convolution Packing for Autonomous Mobile Robots
Venue: ASP-DAC 2024
Abstract: Convolutional Neural Networks (CNNs) have been extensively deployed on autonomous mobile robots in recent years. Embedded platforms based on field-programmable gate arrays (FPGAs) with digital signal processors (DSPs) combine low-precision quantization with DSP-packing methods to implement large CNN models. However, DSP-packing is limited in how much it can improve computation performance, because zero bits must be inserted to prevent bit contamination of the output operands. In this paper, we propose ACane, a compact FPGA-based vision platform for autonomous mobile robots. It is based on a novel DSP-packing technique called accumulation-as-convolution packing, which effectively packs low-bit values into a single DSP while boosting convolution operations. ACane also applies optimized data mapping and dataflow to improve the computation parallelism of the DSP-packing. ACane achieves the highest DSP efficiency (1.465 GOPS/DSP) and energy efficiency (361.8 GOPS/W), which are 1.98-8.32× and 4.03-25.5× higher, respectively, than those of state-of-the-art FPGA-based vision works.
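
The bit-contamination limitation mentioned in the abstract can be illustrated with a small sketch of conventional DSP-packing. This is not ACane's accumulation-as-convolution packing, which the abstract does not describe in detail. It is a generic model that assumes unsigned 4-bit operands and uses a Python integer multiply as a stand-in for one wide DSP multiplier. The function name pack_multiply and its parameters are hypothetical.

```python
def pack_multiply(a_low, a_high, w, bits=4, shift=None):
    """Compute a_low*w and a_high*w with a single wide multiplication.

    Assumption: all operands are unsigned `bits`-bit integers.
    `shift` is the bit offset of the second operand inside the packed word.
    Each partial product needs 2*bits bits, so any shift below 2*bits lets
    the products overlap (bit contamination).
    """
    if shift is None:
        shift = 2 * bits  # minimum contamination-free offset for unsigned inputs
    packed = a_low + (a_high << shift)   # two activations in one operand
    product = packed * w                 # stands in for one DSP multiply
    mask = (1 << shift) - 1
    return product & mask, product >> shift  # (low product, high product)


# With enough zero bits between the operands, both products are exact.
ok = pack_multiply(15, 3, 15)

# With too small an offset, the low product spills into the high field.
contaminated = pack_multiply(15, 3, 15, shift=6)
```

In this sketch the zero bits between the two packed operands carry no useful data, which is the overhead the abstract identifies as the limit on conventional DSP-packing throughput.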