Video s3
    Details
    Presenter(s)
    Zhiyong Li
    Affiliation
    KAIST
    Abstract

    A hybrid floating-point (FP) and fixed-point (FXP) deep learning processor with an outlier-aware channel splitting algorithm is proposed for image-to-image applications on mobile devices. The proposed algorithm reduces 16-bit FP data to 8-bit FXP data, and only a small fraction of outliers (< 10%) is computed in 16-bit FP, while the image reconstruction quality is maintained. This reduces external memory access (EMA) by 45.5%. Moreover, the hierarchical processor accelerates both the dense 8-bit FXP data and the sparse 16-bit FP data, and a functional L2 memory aggregates their convolution outputs in a pipelined manner, which reduces latency by 98%. The proposed system is simulated in 28 nm CMOS technology and occupies 4.16 mm². The hierarchical processor successfully demonstrates ×4-scale Full-HD super-resolution generation at 76 frames per second (fps) with 133.3 mW power consumption at a 0.9 V supply and an energy efficiency of 3.6 TOPS/W, which is 3.27× higher than that of the previous 16-bit FXP processor.
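
The idea of splitting data into a dense low-precision part and a sparse high-precision part can be illustrated with a minimal sketch. This is not the authors' channel splitting algorithm, whose details the abstract does not give. It assumes a simple per-element magnitude ranking, a symmetric 8-bit scale, and a plain dot product standing in for a convolution. The names `to_fp16`, `split_outliers`, and `dot_hybrid` are hypothetical.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def split_outliers(values, outlier_fraction=0.1):
    """Split values into a dense 8-bit part and a sparse FP16 outlier part.

    Assumption: the largest-magnitude elements are treated as outliers.
    Returns (q, scale, outliers): q holds signed 8-bit integers (0 at
    outlier positions), and outliers maps index -> FP16-rounded value.
    """
    n_out = int(len(values) * outlier_fraction)
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    outlier_idx = set(order[:n_out])

    # The quantization scale is set by the inliers only, so a few large
    # values no longer stretch the 8-bit range.
    inlier_max = max(
        (abs(v) for i, v in enumerate(values) if i not in outlier_idx),
        default=0.0,
    )
    scale = inlier_max / 127 if inlier_max > 0 else 1.0

    q = [0 if i in outlier_idx else round(v / scale) for i, v in enumerate(values)]
    outliers = {i: to_fp16(values[i]) for i in outlier_idx}
    return q, scale, outliers


def dot_hybrid(q, scale, outliers, weights):
    """Compute the dense and sparse partial results separately, then sum them."""
    dense = scale * sum(qi * w for qi, w in zip(q, weights))
    sparse = sum(v * weights[i] for i, v in outliers.items())
    return dense + sparse


activations = [0.1, -0.2, 0.05, 0.3, -0.15, 0.25, -0.05, 0.2, 0.0, 50.0]
weights = [1.0] * len(activations)
q, scale, outliers = split_outliers(activations)
approx = dot_hybrid(q, scale, outliers, weights)
exact = sum(a * w for a, w in zip(activations, weights))
```

Because the operation is linear, the dense and sparse partial sums can be computed independently and added afterwards, which loosely mirrors the aggregation of the two convolution outputs described in the abstract. In this toy input the single large value (50.0) is kept in FP16, so the remaining values use the full 8-bit range instead of being crushed toward zero.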

    Slides
    • A 3.6 TOPS/W Hybrid FP-FXP Deep Learning Processor with Outlier Compensation for Image-to-Image Application (application/pdf)