Video s3

Details

Presenter: Yubin Qin (Tsinghua University)
    Abstract

    This paper proposes an algorithm–hardware co-design approach for energy-efficient DNN training. The proposed hybrid pruning algorithm reaches a 2.01× higher pruning ratio, significantly reducing the training computation requirement. Unlike methods that introduce a drastically irregular sparsity distribution, the proposed pruning is hardware-friendly and enables processors to achieve high energy efficiency. We further design HPPU with a 2-level ADS and a line-compression dataflow to fully exploit the hybrid sparsity. HPPU achieves 1.43× higher PE utilization thanks to predictive sparsity detection. The line-compression dataflow simplifies sparsity processing, which reduces power consumption and increases training speed by relieving the critical path. As a result, HPPU reaches a peak energy efficiency of 126.04 TFLOPS/W, 1.67× higher than [7] and 1.53× higher than [8].
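    To make the idea of hybrid pruning concrete, the sketch below combines coarse structured (block-wise) pruning with fine-grained unstructured pruning, which is one common way to trade pruning ratio against hardware regularity. This is an illustrative assumption only: the function name `hybrid_prune`, the block size, and the keep ratios are hypothetical and not taken from the HPPU paper, whose actual algorithm is defined in the full text.

    ```python
    import numpy as np

    def hybrid_prune(weights, block_size=4, block_keep=0.5, elem_keep=0.5):
        """Hypothetical hybrid pruning sketch: prune whole low-norm blocks
        first (hardware-friendly, regular), then prune small individual
        elements inside the surviving blocks (higher pruning ratio)."""
        out = weights.copy()
        rows, cols = out.shape
        # Stage 1: structured pruning - zero entire blocks with low L1 norm.
        blocks = out.reshape(rows // block_size, block_size,
                             cols // block_size, block_size)
        norms = np.abs(blocks).sum(axis=(1, 3))
        thresh = np.quantile(norms, 1.0 - block_keep)
        mask = (norms >= thresh)[:, None, :, None]
        out = (blocks * mask).reshape(rows, cols)
        # Stage 2: unstructured pruning - drop small surviving elements.
        nonzero = np.abs(out[out != 0])
        if nonzero.size:
            t = np.quantile(nonzero, 1.0 - elem_keep)
            out[np.abs(out) < t] = 0.0
        return out

    w = np.random.randn(8, 8)
    pruned = hybrid_prune(w)
    sparsity = (pruned == 0).mean()
    ```

    The two-stage structure mirrors the trade-off the abstract describes: the block stage keeps the sparsity pattern regular enough for efficient hardware, while the element stage pushes the overall pruning ratio higher.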

    Slides
    • HPPU: An Energy-Efficient Sparse DNN Training Processor with Hybrid Weight Pruning (PDF)