Deep Neural Network Interlayer Feature Map Compression Based on Least-Squares Fitting

Deep convolutional neural networks (CNNs) generate a large amount of interlayer data during computation, which leads to long data-exchange delays and high power consumption. This paper proposes a Least-Squares Fitting Compression (LSFC) method that compresses the interlayer data to address this problem. In LSFC, the feature maps are first divided into block groups, and two base blocks are then selected for each block group. Finally, the LSFC core computes the fitting parameters, which are selectively stored in on-chip memory according to the mean-squared error (MSE) of the fit. The proposed compression method is implemented in hardware and integrated into an AI accelerator, where it supports on-the-fly compression with only slight hardware overhead and latency. Experiments show that LSFC reduces the required on-chip storage space by 21.9% to 33.6% during CNN computation without loss of network prediction accuracy.
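The fitting step can be illustrated with a minimal software sketch. The abstract does not specify the exact fitting model, so the code below assumes that each block in a group is approximated as a linear combination of the two base blocks, `target ≈ a*base1 + b*base2`, and that the two parameters are kept only when the MSE stays under a threshold. The function names, the `mse_threshold` parameter, and the `"fit"`/`"raw"` tags are hypothetical and are not taken from the paper.

```python
def fit_block(target, base1, base2):
    """Least-squares fit of target ≈ a*base1 + b*base2 (blocks given as flat lists).

    Assumed model for illustration; solves the 2x2 normal equations directly.
    Returns (a, b), or None when the base blocks are (nearly) collinear.
    """
    s11 = sum(x * x for x in base1)
    s22 = sum(y * y for y in base2)
    s12 = sum(x * y for x, y in zip(base1, base2))
    s1t = sum(x * t for x, t in zip(base1, target))
    s2t = sum(y * t for y, t in zip(base2, target))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        return None
    a = (s1t * s22 - s2t * s12) / det
    b = (s2t * s11 - s1t * s12) / det
    return a, b


def fit_mse(target, base1, base2, a, b):
    """Mean-squared error between the target block and its fitted reconstruction."""
    errors = [(t - (a * x + b * y)) ** 2 for t, x, y in zip(target, base1, base2)]
    return sum(errors) / len(errors)


def compress_block(target, base1, base2, mse_threshold):
    """Keep only the two fitting parameters when the fit is good enough.

    Falls back to storing the raw block otherwise (hypothetical policy
    standing in for the MSE-based selective storage).
    """
    params = fit_block(target, base1, base2)
    if params is not None:
        a, b = params
        if fit_mse(target, base1, base2, a, b) <= mse_threshold:
            return ("fit", (a, b))
    return ("raw", list(target))


if __name__ == "__main__":
    base1 = [1.0, 0.0, 2.0, 1.0]
    base2 = [0.0, 1.0, 1.0, 3.0]
    well_fitted = [2.0, 3.0, 7.0, 11.0]   # exactly 2*base1 + 3*base2
    poorly_fitted = [5.0, -4.0, 0.0, 1.0]
    print(compress_block(well_fitted, base1, base2, mse_threshold=0.01))
    print(compress_block(poorly_fitted, base1, base2, mse_threshold=0.01))
```

In this sketch a well-fitted block is replaced by two numbers, while a poorly fitted block is stored unchanged, which is one plausible reading of storing the parameters selectively according to the MSE.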
