Video s3
    Details
    Presenter(s)
    Chen-Chien Kao (National Taiwan University, Taiwan)
    Author(s)
    Chen-Chien Kao (National Taiwan University)
    Yi-Yen Hsieh (National Taiwan University)
    Chao-Hung Chen (Industrial Technology Research Institute)
    Chia-Hsiang Yang (National Taiwan University)
    Abstract

    This work presents an accelerator that implements randomized CPD (canonical polyadic decomposition) of large-scale tensors for neural network compression. A mixing method that combines the Walsh-Hadamard transform and the discrete cosine transform is proposed to replace the fast Fourier transform, yielding faster convergence. The mixing reduces the transform computation by 83% and the computation for solving the required least-squares problems by 75%. The proposed accelerator flexibly supports tensor decomposition for tensor sizes up to 512×512×9×9. Compared with prior work, this design supports larger tensors and achieves 112× lower latency under the same conditions.
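    The sketch below is a minimal, generic illustration of the family of sketched least-squares solves that such mixing transforms enable: random sign flips followed by a Walsh-Hadamard mix spread the energy of the rows so that a small uniform row sample approximately preserves the least-squares solution. This is not the paper's accelerator algorithm; the function name, the sketch size, and the use of an explicit Hadamard matrix (rather than a fast in-place WHT, or the paper's combined WHT/DCT mixing) are assumptions made only for demonstration.

    ```python
    # Generic SRHT-style sketched least squares (illustrative only; not the
    # paper's hardware algorithm). Assumptions: explicit Hadamard matrix
    # instead of a fast transform, WHT-only mixing, arbitrary sketch size.
    import numpy as np
    from scipy.linalg import hadamard

    def sketched_least_squares(A, b, sketch_size, rng=None):
        """Approximately solve min_x ||Ax - b|| via a mixing-transform sketch."""
        rng = np.random.default_rng(rng)
        m, n = A.shape
        # Pad rows to the next power of two so the Hadamard matrix is defined.
        m_pad = 1 << (m - 1).bit_length()
        A_pad = np.vstack([A, np.zeros((m_pad - m, n))])
        b_pad = np.concatenate([b, np.zeros(m_pad - m)])
        # Mixing step: random sign flips, then an orthonormal Walsh-Hadamard transform.
        signs = rng.choice([-1.0, 1.0], size=m_pad)
        H = hadamard(m_pad) / np.sqrt(m_pad)
        A_mixed = H @ (signs[:, None] * A_pad)
        b_mixed = H @ (signs * b_pad)
        # Subsample a few mixed rows; mixing spreads row energy, so a small
        # uniform sample retains the least-squares geometry with high probability.
        rows = rng.choice(m_pad, size=sketch_size, replace=False)
        x, *_ = np.linalg.lstsq(A_mixed[rows], b_mixed[rows], rcond=None)
        return x

    # Example: a 512-row overdetermined system reduced to a 64-row sketch.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((512, 16))
    x_true = rng.standard_normal(16)
    b = A @ x_true + 0.01 * rng.standard_normal(512)
    x_hat = sketched_least_squares(A, b, sketch_size=64, rng=1)
    print(np.linalg.norm(x_hat - x_true))
    ```

    The design point this illustrates is the one the abstract exploits: because the mixed rows are nearly exchangeable, the expensive dense least-squares solve can be replaced by a much smaller one, which is where the reported reductions in transform and solver computation come from.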

    Slides
    • [SHORT] Hardware Acceleration in Large-Scale Tensor Decomposition for Neural Network Compression (PDF)