Details
Presenter(s)
![Yu-Che Yen Headshot](https://confcats-catavault.s3.amazonaws.com/CATAVault/ieeecass/master/files/styles/cc_user_photo/s3/user-pictures/15633.jpg?h=83b6248d&itok=fuuosIdy)
- Display Name: Yu-Che Yen
- Affiliation: National Sun Yat-sen University
Abstract
We present a quantization algorithm that finds the per-layer bit-width in deep neural network (DNN) models, and then propose multi-precision designs for two fundamental DNN arithmetic operations: multiplication and non-linear activation function evaluation. The multi-precision multiplier with truncated partial-product bits supports four precision modes (4-bit, 8-bit, 12-bit, and 16-bit) on shared hardware resources, so power consumption in the low-precision modes can be reduced by turning off the unneeded circuits. For the evaluation of non-linear activation functions, we compare three approaches across a range of precision requirements and observe that the best design method depends on the required bit accuracy.
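As a rough illustration of the first idea, the sketch below assigns each layer the smallest of the four supported bit-widths whose quantization error stays within an error budget. This is a minimal sketch, not the algorithm presented in the talk: the uniform symmetric quantizer, the mean-squared-error criterion, and the names `quantize`, `per_layer_bitwidth`, and `err_budget` are all illustrative assumptions.

```python
import numpy as np

# The four precision modes named in the abstract.
CANDIDATE_BITS = [4, 8, 12, 16]

def quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Uniform symmetric quantization of a weight tensor to `bits` bits
    (an assumed quantizer, for illustration only)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(weights).max() / qmax
    return np.round(weights / scale) * scale

def per_layer_bitwidth(layers, err_budget=1e-2):
    """Assign each layer the smallest candidate bit-width whose mean
    squared quantization error stays within `err_budget` (assumed metric)."""
    assignment = []
    for w in layers:
        chosen = CANDIDATE_BITS[-1]           # fall back to full precision
        for bits in CANDIDATE_BITS:
            err = np.mean((quantize(w, bits) - w) ** 2)
            if err <= err_budget:
                chosen = bits
                break
        assignment.append(chosen)
    return assignment

# Usage: three synthetic "layers" with different dynamic ranges.
rng = np.random.default_rng(0)
layers = [rng.normal(0, s, size=(64, 64)) for s in (0.05, 0.5, 2.0)]
print(per_layer_bitwidth(layers))
```

The truncated-partial-product multiplier can likewise be modeled behaviorally. The function below drops the partial-product bits in the low-order columns before accumulation; the `trunc` parameter and the unsigned-integer model are assumptions, not details from the talk.

```python
def truncated_multiply(a: int, b: int, bits: int = 8, trunc: int = 4) -> int:
    """Behavioral model of an unsigned `bits`-bit multiply whose
    partial-product bits below column `trunc` are discarded."""
    total = 0
    for j in range(bits):
        if (b >> j) & 1:                   # j-th partial product: a << j
            pp = a << j
            pp &= ~((1 << trunc) - 1)      # zero the truncated low columns
            total += pp
    return total

exact = 173 * 91
approx = truncated_multiply(173, 91, bits=8, trunc=4)
print(exact, approx, exact - approx)      # small, bounded truncation error
```

In hardware, dropping these low-order columns removes the corresponding adder cells from the partial-product array, which is one source of the area and power savings available in the low-precision modes.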