Video s3
    Details
    Presenter
    Jian Huang (Nanjing University)
    Authors
    Jian Huang (Nanjing University)
    Jinming Lu (Nanjing University)
    Zhongfeng Wang (Nanjing University, China)
    Abstract

    In this paper, an architecture is proposed for energy-efficient DNN training. It leverages triple sparsity in training, eliminating more unnecessary computation than prior works. Meanwhile, a two-level grained mask-matching scheme is introduced for efficient sparsity detection: coarse-grained mask-match units are reused among PEs to save power and area, while fine-grained mask-match units sit inside each PE to maintain throughput. As a result, our architecture achieves a throughput of 42.1 TOPS and an energy efficiency of 174.0 TOPS/W, the latter 2.12× higher than the state-of-the-art training processor.
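    The two-level mask-matching idea from the abstract can be sketched in software. The following is a minimal illustration, not the authors' design: the block size, mask layout, and function names are assumptions. A coarse mask (one bit per block) lets whole all-zero blocks be skipped cheaply, and a fine per-element match is only performed inside blocks that survive the coarse check.

    ```python
    # Hedged sketch of two-level (coarse/fine) mask matching for sparsity
    # detection. Block size and mask representation are assumptions made
    # for illustration, not taken from the paper.

    BLOCK = 4  # assumed coarse-grain block size

    def coarse_mask(mask, block=BLOCK):
        # One bit per block: 1 if any element in the block is nonzero.
        return [int(any(mask[i:i + block])) for i in range(0, len(mask), block)]

    def matched_pairs(act_mask, wgt_mask, block=BLOCK):
        """Indices where both activation and weight are nonzero.

        Coarse match first skips blocks that are all-zero on either side
        (shared, cheap check); fine match then ANDs element bits inside
        surviving blocks (per-PE check, keeps throughput)."""
        pairs = []
        ca, cw = coarse_mask(act_mask, block), coarse_mask(wgt_mask, block)
        for b, (a_bit, w_bit) in enumerate(zip(ca, cw)):
            if not (a_bit and w_bit):          # coarse match: skip the block
                continue
            base = b * block
            for i in range(base, min(base + block, len(act_mask))):
                if act_mask[i] and wgt_mask[i]:  # fine match inside the block
                    pairs.append(i)
        return pairs

    act = [1, 0, 0, 0,  0, 0, 0, 0,  1, 1, 0, 0]
    wgt = [1, 0, 1, 0,  0, 0, 0, 0,  0, 1, 0, 1]
    print(matched_pairs(act, wgt))  # → [0, 9]; block 1 is skipped coarsely
    ```

    Only the positions where both masks are set generate work, which is the computation-elimination effect the abstract attributes to sparsity exploitation.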

    Slides
    • An Efficient Hardware Architecture for DNN Training by Exploiting Triple Sparsity (application/pdf)