Details
Presenter(s)
Chen-Chien Kao
- Affiliation: National Taiwan University
- Country: Taiwan
Abstract
This work presents an accelerator that performs randomized canonical polyadic decomposition (CPD) of large-scale tensors for neural network compression. A mixing method that combines the Walsh-Hadamard transform (WHT) and the discrete cosine transform (DCT) is proposed to replace the fast Fourier transform, yielding faster convergence. It reduces the computations for the transformation by 83% and the computations for solving the required least-squares problem by 75%. The proposed accelerator is flexible, supporting tensor decomposition for sizes up to 512×512×9×9. Compared to the previous work, it supports larger tensors and achieves 112× lower latency under the same conditions.
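The core of the mixing idea, randomly flipping signs and applying a fast orthogonal transform before subsampling the rows of a least-squares problem, can be sketched in plain NumPy. This is only an illustration of the general technique (a subsampled randomized Hadamard transform) using the WHT alone; the abstract's WHT+DCT combination and the accelerator's hardware realization are not reproduced here, and all function names below are hypothetical.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform along axis 0 (length must be a power of two)."""
    x = x.copy()
    n = x.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # butterfly: sum
            x[i + h:i + 2 * h] = a - b  # butterfly: difference
        h *= 2
    return x / np.sqrt(n)               # orthonormal scaling

def srht_sketch(A, b, m, rng):
    """Mix rows with random signs + WHT, then keep m random rows.

    Solving the smaller sketched least-squares problem approximates
    the full solution at a fraction of the cost.
    """
    n = A.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)
    A_mixed = fwht(signs[:, None] * A)
    b_mixed = fwht(signs * b)
    rows = rng.choice(n, size=m, replace=False)
    return A_mixed[rows], b_mixed[rows]

# Toy problem: recover x_true from a 256-row system using a 64-row sketch.
rng = np.random.default_rng(0)
n, d = 256, 8
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true
A_s, b_s = srht_sketch(A, b, m=64, rng=rng)
x_hat, *_ = np.linalg.lstsq(A_s, b_s, rcond=None)
print(np.allclose(x_hat, x_true, atol=1e-6))
```

Because the sign flips and the WHT spread each row's energy evenly ("mixing"), a small random subset of mixed rows preserves the least-squares solution with high probability, which is what makes the subsampled solve so much cheaper than the full one.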