Details
Vasilis Sakellariou
- Affiliation: Khalifa University

Abstract
In this paper, a method to reduce the number of multiplications in convolutional layers by exploiting the properties of the Residue Number System (RNS) is proposed. RNS decomposes the elementary computations into a number of small bit-width, independent channels that can be processed in parallel. Due to the small dynamic range of each RNS channel, the weight kernels of a convolution naturally contain many common factors. By identifying these common factors and rearranging the order of computations so that the input feature-map terms corresponding to the same factor are added first, the number of multiplications can be reduced by up to 97% for state-of-the-art CNN models. The remaining multiplications are also simplified, as they are implemented through shift-add operations or fixed-operand multipliers. ASIC implementations of the proposed Processing Element (PE) architecture show speedups of up to 2.67x and 1.64x over binary and conventional RNS counterparts, respectively. Compared to a conventional RNS PE implementation, the proposed method also yields a 20% reduction in area and a 16% reduction in power consumption.
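As a rough illustration of the factor-grouping idea, the sketch below models a single RNS channel with a small modulus `m` in Python. It is not the authors' hardware implementation; the function name `rns_channel_dot` and the toy values are hypothetical. Because weights reduced modulo `m` can take at most `m` distinct values, inputs sharing the same weight residue are summed first, and each distinct residue is multiplied only once.

```python
def rns_channel_dot(weights, inputs, m):
    """Dot product of `weights` and `inputs` modulo m, grouping
    inputs by their weight residue to minimize multiplications."""
    # Accumulate the sum of inputs for each distinct weight residue.
    sums_by_factor = {}
    for w, x in zip(weights, inputs):
        f = w % m
        if f:  # a residue of 0 contributes nothing
            sums_by_factor[f] = (sums_by_factor.get(f, 0) + x) % m
    # One multiplication per distinct nonzero residue (at most m - 1),
    # instead of one per weight.
    return sum(f * s for f, s in sums_by_factor.items()) % m


# Hypothetical toy example: 6 weights collapse to 2 distinct residues
# mod 8, so only 2 multiplications are performed instead of 6.
weights = [23, 7, 23, 14, 7, 23]
inputs = [5, 3, 8, 1, 6, 2]
print(rns_channel_dot(weights, inputs, 8))  # 6, matching sum(w*x) % 8
```

In larger kernels the effect compounds: the number of multiplications per channel is bounded by the modulus rather than by the kernel size, which is consistent with the large reductions the abstract reports.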