Details
![Chenjia Xie Headshot](https://confcats-catavault.s3.amazonaws.com/CATAVault/ieeecass/master/files/styles/cc_user_photo/s3/user-pictures/21591.jpg?h=2c4e73f8&itok=H4mwtFOs)
- Affiliation: Nanjing University
- Country
Deep convolutional neural networks (CNNs) generate a large amount of interlayer data during computation, resulting in long data-exchange delays and high power consumption. This paper proposes a Least-Squares Fitting Compression (LSFC) method that compresses the interlayer data to address this problem. In LSFC, the feature maps are first divided into block groups; then, two base blocks are selected for each block group. Finally, the LSFC core computes the fitting parameters, which are selectively stored in on-chip memory according to the mean-squared error (MSE) of the fit. The proposed compression method is implemented in hardware and integrated into an AI accelerator to support on-the-fly compression with only slight hardware overhead and latency. Experiments show that LSFC reduces the required on-chip storage space by 21.9% to 33.6% during CNN computation without loss of network prediction accuracy.
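The three steps described in the abstract (grouping blocks, choosing two base blocks, fitting and storing by MSE) can be illustrated with a small software sketch. This is not the paper's hardware design, and several details are assumptions made only for illustration: the first two blocks of a group are taken as the base blocks, each remaining block is modeled as a linear combination `alpha * b1 + beta * b2`, and `mse_threshold` and the `("fit", ...)` / `("raw", ...)` storage format are hypothetical.

```python
from typing import List, Sequence, Tuple

Block = List[float]
# Hypothetical storage entry: ("fit", (alpha, beta)) or ("raw", block).
Stored = Tuple[str, object]


def fit_two_bases(target: Sequence[float],
                  b1: Sequence[float],
                  b2: Sequence[float]) -> Tuple[float, float]:
    """Return alpha, beta minimising ||target - alpha*b1 - beta*b2||^2."""
    # Entries of the 2x2 normal equations.
    s11 = sum(x * x for x in b1)
    s22 = sum(y * y for y in b2)
    s12 = sum(x * y for x, y in zip(b1, b2))
    t1 = sum(t * x for t, x in zip(target, b1))
    t2 = sum(t * y for t, y in zip(target, b2))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        # Degenerate bases (collinear or zero): fit on b1 only.
        return (t1 / s11 if s11 > 0 else 0.0), 0.0
    alpha = (t1 * s22 - t2 * s12) / det
    beta = (t2 * s11 - t1 * s12) / det
    return alpha, beta


def fit_mse(target: Sequence[float], b1: Sequence[float],
            b2: Sequence[float], alpha: float, beta: float) -> float:
    """Mean-squared error of the two-base approximation."""
    return sum((t - alpha * x - beta * y) ** 2
               for t, x, y in zip(target, b1, b2)) / len(target)


def compress_group(blocks: List[Block], mse_threshold: float) -> List[Stored]:
    """Keep fitting parameters when the fit is good enough, else the raw block."""
    # Assumption: the first two blocks serve as the base blocks.
    b1, b2 = blocks[0], blocks[1]
    stored: List[Stored] = [("raw", b1), ("raw", b2)]
    for block in blocks[2:]:
        alpha, beta = fit_two_bases(block, b1, b2)
        if fit_mse(block, b1, b2, alpha, beta) <= mse_threshold:
            stored.append(("fit", (alpha, beta)))  # two numbers replace a block
        else:
            stored.append(("raw", block))  # fit too lossy, store uncompressed
    return stored


# Example group: the third block equals 2*b1 + 3*b2, the fourth does not fit well.
group = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
    [2.0, 3.0, 4.0, 6.0],
    [5.0, -1.0, 0.0, 7.0],
]
result = compress_group(group, mse_threshold=1e-6)
```

In this sketch the storage saving comes from replacing a whole block with two parameters whenever the MSE check passes; how the actual LSFC core selects base blocks and sets its MSE criterion is not specified in the abstract.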