Details
- Affiliation: University of Notre Dame
The increasingly central role of speech-based human-computer interaction necessitates on-device, low-latency, low-power, high-accuracy keyword spotting (KWS). State-of-the-art accuracies on speech-related tasks have been achieved by long short-term memory (LSTM) neural network (NN) models. Such models are typically computationally intensive because of their heavy use of matrix-vector multiplication (MVM) operations. Compute-in-memory (CIM) architectures, while well suited to MVM operations, have not seen widespread adoption for LSTMs. In this paper we show how resistive random access memory (ReRAM) based CIM architectures might be adapted for KWS using LSTMs. We find that a hybrid system composed of CIM cores and digital cores achieves 90% test accuracy on the Google speech data set at a cost of 25 µJ/decision. Operating on 5-bit inputs, producing 6-bit outputs, and performing all digital computations at 8-bit precision, our proposed system achieves a 3.7x improvement in computational efficiency compared to equivalent digital systems that deliver the same accuracy.
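To make the bit widths quoted above concrete, the following is a minimal sketch of a matrix-vector multiplication with 5-bit quantized inputs and 6-bit quantized outputs. It is not the paper's implementation: the uniform symmetric quantizer, the function names `quantize` and `cim_mvm`, and the full-scale values `in_scale` and `out_scale` are all assumptions chosen for illustration, and the analog ReRAM crossbar is modeled as an ideal dot product with no device noise.

```python
def quantize(value, bits, full_scale):
    """Snap value onto a signed uniform grid with `bits` bits over
    [-full_scale, full_scale]. (Assumed quantizer, for illustration only.)"""
    levels = 2 ** (bits - 1) - 1                   # 15 positive levels for 5 bits
    clipped = max(-full_scale, min(full_scale, value))
    code = round(clipped / full_scale * levels)    # integer code in [-levels, levels]
    return code * full_scale / levels


def cim_mvm(weights, inputs, in_bits=5, out_bits=6, in_scale=1.0, out_scale=4.0):
    """Idealized CIM core: quantize the inputs, accumulate each row's
    dot product, then quantize each accumulated output.
    The scale values are hypothetical, not taken from the paper."""
    x_q = [quantize(x, in_bits, in_scale) for x in inputs]
    outputs = []
    for row in weights:
        acc = sum(w * x for w, x in zip(row, x_q))      # ideal analog accumulation
        outputs.append(quantize(acc, out_bits, out_scale))
    return outputs


if __name__ == "__main__":
    W = [[0.5, -0.25], [1.0, 1.0]]
    x = [1.0, -1.0]
    print(cim_mvm(W, x))
```

In a hybrid system of the kind described, the remaining LSTM operations (gate nonlinearities and element-wise products) would run in the digital cores; under the same assumptions they could reuse `quantize` with `bits=8`.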