SQNR-Based Layer-Wise Mixed-Precision Schemes with Computational Complexity Consideration

In this paper, we present two simple methods: one for fast analysis in mixed-precision determination, and one for reducing computational complexity in inference. With the proposed analysis based on the signal-to-quantization-noise ratio (SQNR), we can significantly reduce the time required to determine a mixed-precision scheme compared to the conventional mAP-based approach, with negligible loss of accuracy. In addition, by incorporating the hyperparameters of the target neural networks into the mixed-precision determination process, we can reduce the computational complexity of inference. We applied the two proposed methods to the SSDlite network with MobileNet-v2 and to the YOLOv2 network, and evaluated the quantized networks with the proposed mixed-precision schemes on the Pascal VOC dataset.
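As a rough illustration of why an SQNR-based analysis can be fast, the sketch below measures SQNR directly on a layer's weights instead of running a full mAP evaluation for every candidate bit-width. It is not the method from the paper: the symmetric uniform quantizer, the candidate bit-widths, the 30 dB target, and the names `quantize`, `sqnr_db`, and `choose_bits` are all assumptions made for this example, and the "layers" are synthetic random values rather than real network weights.

```python
import math
import random


def quantize(values, bits):
    """Symmetric uniform quantization to a signed `bits`-bit grid (assumed quantizer)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    scale = max_abs / levels              # step size of the quantization grid
    return [max(-levels, min(levels, round(v / scale))) * scale for v in values]


def sqnr_db(values, quantized):
    """SQNR in dB: signal power divided by quantization-noise power."""
    signal = sum(v * v for v in values)
    noise = sum((v - q) ** 2 for v, q in zip(values, quantized))
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)


def choose_bits(layers, candidates=(4, 6, 8), target_db=30.0):
    """Per layer, pick the smallest candidate bit-width whose SQNR meets the
    (hypothetical) target; fall back to the largest candidate otherwise."""
    scheme = {}
    for name, weights in layers.items():
        chosen = max(candidates)
        for bits in sorted(candidates):
            if sqnr_db(weights, quantize(weights, bits)) >= target_db:
                chosen = bits
                break
        scheme[name] = chosen
    return scheme


# Synthetic stand-ins for layer weights (not taken from any real network).
rng = random.Random(0)
layers = {
    "conv1": [rng.gauss(0.0, 1.0) for _ in range(2000)],
    "conv2": [rng.uniform(-1.0, 1.0) for _ in range(2000)],
}
scheme = choose_bits(layers)
print(scheme)
```

Each SQNR measurement here needs only one pass over a layer's values, whereas an mAP-based search has to evaluate the detector on a validation set for every candidate scheme. A complexity-aware variant could additionally weight each layer's choice by a cost term derived from its hyperparameters (for example, kernel size and channel counts), but the exact weighting is not given in the abstract, so it is left out of this sketch.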

Slides

SQNR-Based Layer-Wise Mixed-Precision Schemes with Computational Complexity Consideration (application/pdf)
