Compute-in-RRAM with Limited On-Chip Resources

Abstract

Compute-in-memory (CIM) is a new computing paradigm that addresses the memory-wall problem in deep learning accelerators. Resistive random-access memory (RRAM) is an emerging non-volatile memory that is well suited as on-chip embedded memory for storing the weights of deep neural network (DNN) models. In this paper, we first review general design considerations for RRAM-CIM prototype chips integrated with CMOS peripheral circuitry, such as the weight mapping scheme and the analog-to-digital conversion requirements. Second, we discuss the challenges in CIM chip design when the chip area is too constrained to hold all the weights of large-scale DNN models. Finally, we present a design methodology that enables runtime reconfiguration of DNN models on a custom CIM chip instance with fixed hardware resources.