A Quality-Oriented Reconfigurable Convolution Engine Using Cross-Shaped Sparse Kernels for Highly-Parallel CNN Acceleration

Abstract

Computational imaging CNNs are computation-intensive and need complexity reduction for high-throughput applications. This paper presents cross-shaped sparse kernels for quality-oriented and low-overhead complexity reduction. Compared to depth reduction, they improve image quality by 0.03-0.31 dB on classic denoising and super-resolution networks. Moreover, we design a highly-parallel reconfigurable convolution engine to support the proposed complexity-saving method. Implemented in TSMC 40 nm technology, it uses 9.85M logic gates to deliver 8.2 TOPS of computing performance, and it needs only 8.4% logic overhead and 14.9% additional power for reconfigurability. In a case study on ERNets, it achieves 10.1-14.8x higher area efficiency for pixel throughput compared to SparTen.
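To make the idea of a cross-shaped sparse kernel concrete, the following minimal sketch assumes that "cross-shaped" means keeping only the centre row and centre column of an odd-sized k x k kernel; the abstract does not define the pattern, so this reading is an assumption. The helper names (cross_mask, apply_mask, conv2d_valid) are hypothetical and for illustration only, and the direct-loop convolution is a software stand-in, not the proposed hardware engine.

```python
# Illustrative sketch only. ASSUMPTION: a cross-shaped kernel keeps the
# centre row and centre column of a k x k kernel and zeroes the corners.

def cross_mask(k):
    """Return a k x k 0/1 mask that is 1 on the centre row and centre column."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that a centre row/column exists")
    c = k // 2
    return [[1 if (r == c or col == c) else 0 for col in range(k)]
            for r in range(k)]

def apply_mask(kernel, mask):
    """Element-wise product of a kernel and a 0/1 mask."""
    return [[w * m for w, m in zip(krow, mrow)]
            for krow, mrow in zip(kernel, mask)]

def conv2d_valid(image, kernel):
    """Direct 'valid' 2-D correlation that skips zero weights.

    Returns (output, mac_count), where mac_count is the number of
    multiply-accumulate operations actually performed.
    """
    k = len(kernel)
    h, w = len(image), len(image[0])
    # Pre-collect the non-zero taps so zero weights cost nothing.
    taps = [(r, c, kernel[r][c])
            for r in range(k) for c in range(k) if kernel[r][c] != 0]
    out, macs = [], 0
    for y in range(h - k + 1):
        row = []
        for x in range(w - k + 1):
            acc = 0
            for r, c, wt in taps:
                acc += wt * image[y + r][x + c]
                macs += 1
            row.append(acc)
        out.append(row)
    return out, macs

# Usage: compare a dense 3x3 kernel with its cross-shaped counterpart.
image = [[4 * r + c for c in range(4)] for r in range(4)]   # 4x4 ramp image
dense = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
sparse = apply_mask(dense, cross_mask(3))

dense_out, dense_macs = conv2d_valid(image, dense)
sparse_out, sparse_macs = conv2d_valid(image, sparse)
print(dense_macs, sparse_macs)
```

Under this assumed pattern, a 3x3 kernel keeps 5 of its 9 weights, so each output pixel needs 5 multiply-accumulates instead of 9. The quality and efficiency figures reported in the abstract come from the paper's own networks and hardware and are not reproduced by this sketch.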