    Details
    Presenter(s)
    Alessandro Capotondi
    Affiliation
    Università degli Studi di Modena e Reggio Emilia
    Abstract

    Low-precision integer arithmetic is a necessary ingredient for enabling Deep Learning inference on tiny, resource-constrained IoT edge devices. This work presents CMix-NN, a flexible open-source mixed low-precision inference library for quantized networks, supporting independent per-tensor quantization of weights and activations at 8, 4, or 2 bits. Using CMix-NN, we deploy on an STM32H7 microcontroller a set of MobileNet-family networks at the largest input resolution (224) and highest accuracies (up to 68% Top-1) when compressed with a mixed low-precision technique, achieving up to +8% accuracy improvement over any previously published solution for MCU devices.
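To make the mixed-precision idea concrete, below is a minimal C sketch (not the actual CMix-NN API; all names are hypothetical) of a dot product between 8-bit activations and 4-bit weights packed two per byte, the kind of kernel a sub-byte inference library must provide. Weights are signed 4-bit values in [-8, 7], with the even-index weight in the low nibble.

```c
#include <stdint.h>

/* Sign-extend the low and high nibbles of a packed byte to int8_t. */
static int8_t unpack_low(uint8_t b)  { return (int8_t)(uint8_t)(b << 4) >> 4; }
static int8_t unpack_high(uint8_t b) { return (int8_t)b >> 4; }

/* Dot product of n 8-bit activations with n 4-bit weights packed
 * two-per-byte (n assumed even). A 32-bit accumulator is used so the
 * sum of int8 x int4 products cannot overflow for realistic n. */
int32_t dot_a8_w4(const int8_t *act, const uint8_t *w_packed, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n / 2; ++i) {
        acc += act[2 * i]     * unpack_low(w_packed[i]);
        acc += act[2 * i + 1] * unpack_high(w_packed[i]);
    }
    return acc;
}
```

On a Cortex-M7 such as the STM32H7's core, a production kernel would unpack and multiply-accumulate multiple elements per cycle with SIMD instructions; the scalar loop above only illustrates the packing convention.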

    Slides