Abstract
This paper proposes a scalable DNN processor that can be flexibly reconfigured to maximize inference efficiency across a wide range of DNN models. The processor consists of 18 computing Nodes, each supporting multiple precision modes. To improve computation throughput, we propose a sub-image parallelization strategy, in which the original input image is divided into multiple sub-images that are computed on multiple Nodes in parallel. In addition, a cross-layer pipeline is implemented to improve resource utilization. The proposed processor is implemented in 28-nm CMOS technology and achieves a peak performance of 4.17 TOPS and an energy efficiency of 2.08 TOPS/W.
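To illustrate the idea behind the sub-image parallelization strategy, the following is a minimal software sketch: the input image is tiled into a grid of sub-images, each sub-image is processed independently (standing in for one computing Node), and the results are reassembled. The helper names (`split_into_subimages`, `merge_subimages`, `node_compute`) are hypothetical and the per-Node work is a placeholder elementwise operation, not the processor's actual datapath; a real implementation would also need halo regions at tile borders for convolutions.

```python
import numpy as np

def split_into_subimages(image, grid=(2, 2)):
    # Hypothetical helper: tile an H x W image into grid[0] * grid[1] blocks.
    rows = np.array_split(image, grid[0], axis=0)
    return [block for row in rows for block in np.array_split(row, grid[1], axis=1)]

def merge_subimages(blocks, grid=(2, 2)):
    # Reassemble the blocks produced by split_into_subimages.
    rows = [np.concatenate(blocks[r * grid[1]:(r + 1) * grid[1]], axis=1)
            for r in range(grid[0])]
    return np.concatenate(rows, axis=0)

def node_compute(sub):
    # Stand-in for one Node's per-sub-image work; an elementwise op keeps
    # the sketch self-contained (no tile-border halo handling needed).
    return sub * 2

image = np.arange(16).reshape(4, 4)
subs = split_into_subimages(image, grid=(2, 2))          # 4 sub-images -> 4 Nodes
result = merge_subimages([node_compute(s) for s in subs], grid=(2, 2))
assert np.array_equal(result, image * 2)
```

In hardware, the list comprehension over sub-images corresponds to the Nodes running concurrently, which is where the throughput gain comes from.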