Video
    Details
    Presenter(s)
    Anaam Ansari
    Affiliation: Santa Clara University
    Author(s)
    Anaam Ansari
    Affiliation: Santa Clara University
    Allen Shelton
    Affiliation: Santa Clara University
    Tokunbo Ogunfunmi
    Affiliation: Santa Clara University
    Abstract

    Hardware acceleration of deep neural networks is critical for many edge applications. The acceleration solutions available today typically target GPU, CPU, FPGA, and ASIC platforms. The Single Partial Product 2-D Convolution (SPP2D) is a hardware architecture for fast 2-D convolution that can be used to implement a convolutional neural network (CNN). SPP2D avoids re-fetching input weights when computing partial products, and it computes the output for any input and kernel size with lower latency and higher throughput than several other popular techniques. In this paper, we use SPP2D for a full hardware implementation of the VGG-16 CNN, as well as for a compressed (pruned and quantized) version of the network that requires less on-chip memory, thereby reducing the most power-consuming task: moving data from off-chip to on-chip memory.
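    The partial-product idea described above can be sketched in software: as each input pixel streams in, it is multiplied once by every kernel tap and the products are scattered into the output accumulators, so no input value is fetched twice. The sketch below is a hedged toy illustration of this general scheme, not the SPP2D hardware itself; the function name and shapes are illustrative assumptions.

    ```python
    import numpy as np

    def conv2d_stream(x, w):
        """Valid 2-D convolution (CNN-style cross-correlation) computed by
        streaming input pixels: each pixel's partial products are scattered
        to every output it overlaps, so each input is read exactly once.
        Toy model of partial-product accumulation, not the SPP2D RTL."""
        H, W = x.shape
        K, _ = w.shape
        out = np.zeros((H - K + 1, W - K + 1))
        for i in range(H):              # stream pixels row by row
            for j in range(W):
                for ki in range(K):     # scatter this pixel's partial
                    for kj in range(K):  # products to the outputs it feeds
                        oi, oj = i - ki, j - kj
                        if 0 <= oi < out.shape[0] and 0 <= oj < out.shape[1]:
                            out[oi, oj] += x[i, j] * w[ki, kj]
        return out
    ```

    A conventional sliding-window convolution instead gathers the same input pixel once per overlapping output window; the scatter formulation above makes the single-fetch property explicit.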

    Slides
    • A Fast Compressed Hardware Architecture for Deep Neural Networks (application/pdf)