Video s3
    Details
    Author(s)
    Martin Hardieck, Universität Kassel
    Tobias Habermann, Fulda University of Applied Sciences
    Fabian Wagner, Universität Kassel
    Michael Mecik, Fulda University of Applied Sciences
    Martin Kumm, Fulda University of Applied Sciences
    Peter Zipf, Universität Kassel
    Abstract

    We present a training tool flow for deep neural networks (DNNs) optimized for hardware-efficient FPGA implementation based on reconfigurable constant-coefficient multipliers (RCCMs). RCCMs replace costly generic multipliers with shift-and-add operations. Previous work showed that RCCMs offer a better alternative for saving FPGA area than low-precision arithmetic. This work proposes an improved tool flow that enables layer-wise weight quantization, a larger search space through additional RCCM coefficient sets, and optimized retraining. This leads to improved accuracy compared to the previous method. In addition, hardware requirements are lower, as only 1 to 3 adders per multiplication are used, which reduces the overall complexity and the required memory bandwidth simultaneously. We evaluate our tool flow using multiple networks (ResNets) on the ImageNet data set.
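
    To make the shift-and-add principle behind RCCMs concrete, the following is a minimal Python sketch; it is not the paper's implementation, and the function name and its arguments are hypothetical. It shows how multiplying by a constant can be decomposed into a few shifts and signed additions:

        def shift_add_multiply(x: int, shifts: list[int], signs: list[int]) -> int:
            # Multiply x by the constant c = sum(s * 2**k for s, k in zip(signs, shifts)).
            # On an FPGA the shifts are free wiring, so the cost is
            # len(shifts) - 1 adders/subtractors instead of a generic multiplier.
            result = 0
            for s, k in zip(signs, shifts):
                result += s * (x << k)
            return result

        # Example: 23 = 2**4 + 2**3 - 2**0 in canonical signed-digit form,
        # so 23 * x needs only two adders: (x << 4) + (x << 3) - x.
        assert shift_add_multiply(5, [4, 3, 0], [1, 1, -1]) == 23 * 5

    An RCCM, as described in the abstract, generalizes this idea by making the shift-and-add network reconfigurable over a fixed set of coefficients, which is why the tool flow restricts the quantized weights to such coefficient sets during training.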