    Details
    Authors
    Ali Al-shaarawy (University of Toronto)
    Roman Genov (University of Toronto; York University)
    Abstract

    Deep neural network accelerators based on the von Neumann architecture are fundamentally bottlenecked by the need to transfer data between memory and compute units. Memristor crossbar-based accelerators overcome this by leveraging Kirchhoff's current law to perform matrix-vector multiplication (MVM) in-memory. Their device programming schemes, however, remain relatively inefficient, requiring devices to be written sequentially one at a time. Parallel writing schemes have recently emerged that program entire crossbars simultaneously through the outer product of bit-line voltages and word-line pulse widths. We propose a scheme that leverages singular value decomposition (SVD) and low-rank approximation to generate all the word-line and bit-line vectors needed to program a convolutional neural network (CNN) onto a memristor crossbar-based accelerator. Our scheme reduces programming latency by 50% relative to sequential programming schemes, while maintaining high test accuracy on state-of-the-art image classification models.
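
    The core idea can be sketched as follows: a target conductance matrix is decomposed via SVD and truncated to a small rank, so each rank-1 term supplies one word-line vector and one bit-line vector for a single parallel write pulse. This is an illustrative sketch only; the function and variable names are assumptions, not the authors' actual implementation.

```python
import numpy as np

def lowrank_write_vectors(G_target, rank):
    """Decompose a target conductance matrix into rank-1 outer products.

    Each column pair (word_lines[:, k], bit_lines[:, k]) could serve as the
    word-line and bit-line programming vectors for one parallel write pulse;
    `rank` pulses then approximate the full matrix. Names are hypothetical.
    """
    U, S, Vt = np.linalg.svd(G_target, full_matrices=False)
    # Split each singular value symmetrically between the two vectors.
    sqrt_S = np.sqrt(S[:rank])
    word_lines = U[:, :rank] * sqrt_S      # shape (rows, rank)
    bit_lines = Vt[:rank, :].T * sqrt_S    # shape (cols, rank)
    return word_lines, bit_lines

# Example: approximate a random 64x64 conductance map with 4 parallel pulses
# instead of 64*64 sequential device writes.
rng = np.random.default_rng(0)
G = rng.random((64, 64))
W, B = lowrank_write_vectors(G, rank=4)
G_approx = W @ B.T  # sum of 4 outer products
rel_err = np.linalg.norm(G - G_approx) / np.linalg.norm(G)
```

    The trade-off mirrors the one in the abstract: a lower rank means fewer write pulses (lower programming latency) at the cost of a coarser approximation of the target weights.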