    Details
    Presenter(s)
    Dongyue Li
    Affiliation
    Shanghai Jiao Tong University
    Abstract

    In this paper, we present ReRAM-Sharing, a software-hardware co-design scheme that explores fine-grained weight-sharing compression for ReRAM-based accelerators. Because ADC bandwidth and ADC count are limited, DNN computation on ReRAM crossbars is conducted at a smaller granularity, denoted as the Operation Unit (OU). Motivated by this, we propose the ReRAM-Sharing algorithm, which applies weight sharing at the OU level to exploit fine-grained sparsity. ReRAM-Sharing reduces the redundancy of DNNs while maintaining their representation capability. Moreover, since the ReRAM-Sharing algorithm is orthogonal to traditional pruning techniques, the two can be combined to shrink the model size further. We then propose the ReRAM-Sharing architecture, which adds an index table and adders to a traditional ReRAM-based accelerator to support the ReRAM-Sharing algorithm.
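    The following is a minimal sketch of the idea in Python/NumPy, not the paper's implementation: it clusters the weights of one OU into a small shared codebook, then evaluates the OU's matrix-vector product through an index table plus per-value accumulation, mirroring the index table and adders described above. The OU size (8x8), the codebook size K, the use of k-means clustering, and all function names are illustrative assumptions.

    import numpy as np

    OU_ROWS, OU_COLS, K = 8, 8, 4  # assumed OU granularity and codebook size

    def share_weights_ou(w_ou, k=K, iters=20):
        """Cluster one OU's weights into k shared values (simple 1-D k-means).
        Returns (codebook, index table) in place of the dense OU weights."""
        flat = w_ou.ravel()
        # Initialize centroids from weight quantiles for stable convergence.
        codebook = np.quantile(flat, np.linspace(0.0, 1.0, k))
        for _ in range(iters):
            idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
            for c in range(k):
                if np.any(idx == c):
                    codebook[c] = flat[idx == c].mean()
        return codebook, idx.reshape(w_ou.shape)

    def ou_mvm_shared(x, codebook, index):
        """Matrix-vector product of one OU using the index table and adders:
        inputs hitting the same shared weight are summed first, then each
        partial sum is scaled once by its shared value."""
        y = np.zeros(index.shape[1])
        for col in range(index.shape[1]):
            sums = np.zeros(len(codebook))
            np.add.at(sums, index[:, col], x)  # adder tree per shared value
            y[col] = sums @ codebook           # one multiply per shared value
        return y

    # Usage: compress one random OU and check the approximation error.
    rng = np.random.default_rng(0)
    w = rng.normal(size=(OU_ROWS, OU_COLS))
    x = rng.normal(size=OU_ROWS)
    cb, idx = share_weights_ou(w)
    print(np.abs(x @ w - ou_mvm_shared(x, cb, idx)).max())

    With K shared values per OU, each crossbar cell needs only a log2(K)-bit index into the OU's codebook rather than a full-precision weight, which is where the storage reduction in such a scheme comes from.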

    Slides
    • ReRAM-Sharing: Fine-Grained Weight Sharing for ReRAM-Based Deep Neural Network Accelerator (PDF)