    Details
    Presenter(s)
    Marian Verhelst
    Affiliation
    KU Leuven
    Abstract

    TinyML strives for powerful machine inference on resource-scarce distributed devices. Intelligent applications at ultra-low energy and low latency require (1) compact compute and memory structures that are (2) used at very high utilization. This has resulted in a wide variety of accelerator designs in the state of the art. However, it is becoming increasingly clear that every intelligent edge device will need a diverse set of heterogeneous co-processors, so that each workload can run on the most compatible accelerator, or combination of accelerators. Moreover, by using multiple cores in parallel and streaming data between them, the required amount of on-chip memory and IO bandwidth can be reduced, yielding area, energy, and latency savings. This talk will introduce the benefits and challenges of such heterogeneous ML systems, supported by practical examples of efficient deep inference.
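    The idea of matching each workload to its most compatible accelerator can be illustrated with a minimal greedy mapping sketch. Everything here is assumed for illustration: the accelerator names, the layer types, and the cost numbers are hypothetical, and the policy (minimizing an energy-delay product per layer) is just one plausible mapping criterion, not the method presented in the talk.

    ```python
    # Hypothetical sketch: greedily map each network layer onto the
    # heterogeneous accelerator with the lowest energy-delay product.
    # Accelerator names, layer types, and cost numbers are illustrative only.

    ACCELERATORS = ["dense-array", "sparse-engine", "dw-conv-core"]

    # Illustrative cost table: (energy in nJ, latency in us) per layer/core pair.
    COST = {
        ("conv3x3",   "dense-array"):   (10.0, 5.0),
        ("conv3x3",   "sparse-engine"): (14.0, 9.0),
        ("conv3x3",   "dw-conv-core"):  (12.0, 7.0),
        ("dw-conv",   "dense-array"):   (11.0, 8.0),
        ("dw-conv",   "sparse-engine"): (13.0, 9.0),
        ("dw-conv",   "dw-conv-core"):  (4.0,  3.0),
        ("sparse-fc", "dense-array"):   (9.0,  6.0),
        ("sparse-fc", "sparse-engine"): (3.0,  2.0),
        ("sparse-fc", "dw-conv-core"):  (15.0, 10.0),
    }

    def map_layers(layers):
        """Assign each layer to the accelerator minimizing energy * latency."""
        mapping = {}
        for layer in layers:
            best = min(
                ACCELERATORS,
                key=lambda acc: COST[(layer, acc)][0] * COST[(layer, acc)][1],
            )
            mapping[layer] = best
        return mapping

    network = ["conv3x3", "dw-conv", "sparse-fc"]
    print(map_layers(network))
    # Each layer lands on the core that suits it best under this cost model.
    ```

    A real mapper would also model inter-core data movement, since streaming intermediate tensors between cores (rather than spilling to shared memory) is where the on-chip memory and IO savings mentioned above come from.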