    Details

    Presenter(s)
        Mohammed F Tolba (Khalifa University)

    Author(s)
        Mohammed F Tolba (Khalifa University)
        Hani Saleh (Khalifa University)
        Mahmoud Alqutayri (Khalifa University)
        Baker Mohammad (Khalifa University)
    Abstract

    Large deep neural network (DNN) models are computation and memory intensive, which limits their deployment, especially on edge devices with limited resources. This paper introduces a Scaling-Weight-based Convolution (SWC) technique to reduce DNN model size and the cost of arithmetic operations. This is achieved by using a small set of high-precision weights (the maximum absolute weight, "MAW") and a large set of low-precision weights (scaling weights, "SWs"). The result is a smaller model with minimal loss in accuracy compared to simply reducing precision. Moreover, a scaling and quantized network-acceleration processor (SQNAP) based on the SWC method is proposed to achieve high speed and low power with reduced memory accesses. The proposed SWC eliminates more than 90% of the multiplications in the network. A full analysis on the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 image-recognition datasets is presented using different DNN models, including LeNet, ResNet, AlexNet, and VGG-16.
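
    The abstract does not give implementation details, but the minimal Python sketch below illustrates one plausible reading of the SWC idea: each filter keeps a single high-precision maximum absolute weight (MAW), the remaining weights are stored as low-precision scaling weights (SWs), and accumulation uses only cheap low-precision multiplies followed by one high-precision multiplication per output. The function names, bit width, and data are hypothetical, not taken from the paper.

        import numpy as np

        def swc_quantize(weights, sw_bits=4):
            """Decompose a filter into one high-precision MAW and low-precision SWs (illustrative)."""
            maw = np.max(np.abs(weights))                 # single high-precision value per filter
            levels = 2 ** (sw_bits - 1) - 1               # signed low-precision range, e.g. [-7, 7]
            sws = np.round(weights / maw * levels).astype(np.int8)  # low-precision scaling weights
            return maw, sws, levels

        def swc_dot(inputs, maw, sws, levels):
            """Accumulate with small-integer SWs, then apply one high-precision MAW multiply."""
            acc = np.dot(inputs, sws.astype(np.int32))    # stands in for low-precision MACs
            return acc * (maw / levels)                   # single high-precision multiplication

        # Hypothetical usage with random data
        rng = np.random.default_rng(0)
        w = rng.normal(size=64).astype(np.float32)
        x = rng.normal(size=64).astype(np.float32)
        maw, sws, levels = swc_quantize(w, sw_bits=4)
        print("exact:", float(np.dot(x, w)), "swc approx:", float(swc_dot(x, maw, sws, levels)))

    Under this assumed decomposition, the only full-precision multiplication per output is the final scaling by MAW, which is consistent with the abstract's claim that most multiplications are eliminated.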

    Slides
    • Reduce Computing Complexity of Deep Neural Networks Through Weight Scaling (PDF)