A 110nW Always-on Keyword Spotting Chip Using Spiking CNN in 40nm CMOS

This paper presents an ultra-low-power keyword spotting (KWS) chip for the always-on ambient sensing function of Artificial Intelligence of Things (AIoT) devices. The core KWS engine is based on a spiking convolutional neural network (SCNN) model, chosen for its sparse activations and the addition-only operations inside its spiking neurons. The proposed SCNN model improves on the existing frame-wise incremental computation structure by adding a spike processing unit (SPU) that reduces the number of computation cycles. The power and latency of the whole system are reduced by 16.5% and 43.2%, respectively. Extensive network quantization reduces the weight bit-length to 4 bits, and only 1-bit activations are required. The chip also supports power gating controlled by an energy-based voice activity detection (VAD) module to further reduce power consumption in random and sparse event (RSE) scenarios. Full-chip simulation results show that the chip consumes only 110 nW, with a 2.15% false alarm rate and a 3.00% false reject rate, in a 10% voice-event stream test. It achieves state-of-the-art recognition accuracy of 99% and 96% on one-keyword and two-keyword detection tasks, respectively.
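To illustrate why 1-bit activations make a spiking neuron addition-only, the following minimal Python sketch models a single integrate-and-fire neuron with signed 4-bit integer weights. It is not the chip's implementation: the neuron model (integrate-and-fire with reset by subtraction), the threshold value, the quantization scale, and the names `quantize_4bit` and `if_neuron_step` are all assumptions made for illustration, since the abstract does not specify them.

```python
def quantize_4bit(weight, scale):
    """Map a real-valued weight to a signed 4-bit integer in [-8, 7].

    The uniform scale is a hypothetical choice; the abstract does not
    describe the quantization scheme actually used.
    """
    q = round(weight / scale)
    return max(-8, min(7, q))


def if_neuron_step(membrane, spikes_in, weights, threshold):
    """Advance an integrate-and-fire neuron by one time step.

    spikes_in holds 1-bit activations (0 or 1), so the usual weighted sum
    reduces to adding the weights of the inputs that spiked. No multiply
    is needed, and inputs that did not spike cost nothing.
    Returns (new_membrane, spike_out).
    """
    for spike, w in zip(spikes_in, weights):
        if spike:
            membrane += w  # addition only
    if membrane >= threshold:
        # Assumed reset rule: subtract the threshold after firing.
        return membrane - threshold, 1
    return membrane, 0


# Hypothetical example values.
weights = [quantize_4bit(w, scale=0.1) for w in (0.33, -0.21, 0.48, 0.72)]
membrane, spike = if_neuron_step(0, [1, 0, 1, 0], weights, threshold=6)
```

In this sketch only two of the four inputs spike, so the step performs two additions and one comparison. That is the kind of saving that sparse, binary activations offer over a multiply-accumulate datapath.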