Video s3

    Details

    Author(s)
        Washington University in St. Louis
        Gert Cauwenberghs, University of California, San Diego

    Abstract

    At a fundamental level, an energy imbalance exists between training and inference in machine learning (ML) systems. While inference involves recall using a fixed or learned set of parameters that can be energy-optimized using compression and sparsification techniques, training involves searching over the entire set of parameters and hence requires repeated memorization, caching, pruning, and annealing. In this paper, we introduce three performance walls that determine training energy efficiency, namely the memory-wall, the update-wall, and the consolidation-wall. While emerging compute-in-memory ML architectures can address the memory-wall bottleneck (the energy dissipated due to repeated memory access), the approach is agnostic to the energy dissipated due to the number and precision of the training updates (the update-wall) and to the information transfer between short-term and long-term memories (the consolidation-wall). To overcome these performance walls, we propose a learning-in-memory (LIM) paradigm that endows ML system memories with metaplasticity and whose thermodynamic properties match the physics and energetics of learning.
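
    For illustration only (not from the paper): a toy back-of-the-envelope sketch of how training energy might be decomposed into the three walls named above. All per-event energy constants and event counts are hypothetical placeholders; the sketch only shows the structure of such an accounting, in which compute-in-memory reduces the memory-wall term while leaving the update-wall and consolidation-wall terms untouched.

```python
# Toy illustration (not from the paper): a training-energy estimate split into
# the three "walls" named in the abstract. Every constant below is a
# hypothetical placeholder, chosen only to show the structure of the estimate.

# Hypothetical per-event energy costs (joules)
E_MEM_ACCESS = 10e-12    # per memory access (memory-wall)
E_UPDATE = 1e-12         # per parameter update at a given precision (update-wall)
E_CONSOLIDATE = 5e-12    # per short-term -> long-term memory transfer (consolidation-wall)

def training_energy(num_params: int, num_steps: int,
                    accesses_per_param_per_step: float = 2.0,
                    consolidations_per_param: float = 10.0) -> dict:
    """Return a per-wall breakdown of a purely illustrative training-energy estimate."""
    memory_wall = num_params * num_steps * accesses_per_param_per_step * E_MEM_ACCESS
    update_wall = num_params * num_steps * E_UPDATE
    consolidation_wall = num_params * consolidations_per_param * E_CONSOLIDATE
    return {
        "memory_wall_J": memory_wall,
        "update_wall_J": update_wall,
        "consolidation_wall_J": consolidation_wall,
        "total_J": memory_wall + update_wall + consolidation_wall,
    }

if __name__ == "__main__":
    # Example: 10^9 parameters trained for 10^5 steps (hypothetical workload).
    breakdown = training_energy(num_params=10**9, num_steps=10**5)
    for name, joules in breakdown.items():
        print(f"{name}: {joules:.3e} J")
```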