Video s3
    Details
    Presenter(s)
    Yan Xiong
    Affiliation
    Arizona State University
    Abstract

    Recent work on neural network architectures has focused on bridging the gap between performance/efficiency and programmability. We consider implementations of three popular neural networks, ResNet, AlexNet, and the ASGD weight-dropped recurrent neural network (AWD RNN), on a low-power programmable architecture. The architecture consists of lightweight cores interconnected by caches and crossbars that support run-time reconfiguration between shared and private cache modes. We present efficient implementations of key neural network kernels and evaluate the performance of each kernel under the different cache modes. The best-performing cache modes are then used in the implementation of the end-to-end networks. Simulation results show high energy efficiency, with ResNet, AlexNet, and AWD RNN achieving 196.0 GOPS/W, 150.5 GOPS/W, and 120.7 GOPS/W, respectively, in a 14 nm technology node.
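
    As a rough illustration of the per-kernel tuning step described in the abstract, the sketch below shows how a best cache mode might be chosen for each kernel from simulated efficiency numbers before assembling the end-to-end network. The kernel names, cache-mode labels, and GOPS/W values are placeholders for illustration only, not measurements or code from the presented work.

```python
# Hypothetical sketch: pick the best cache mode per kernel from simulated
# efficiency (GOPS/W). All names and numbers below are placeholders, not
# results from the presented work.

# Simulated per-kernel efficiency under each cache mode (placeholder values).
kernel_results = {
    "conv2d": {"shared": 180.0, "private": 150.0},
    "gemm":   {"shared": 140.0, "private": 170.0},
    "lstm":   {"shared": 110.0, "private": 125.0},
}

def best_cache_mode(results):
    """Return a mapping of kernel -> cache mode with the highest GOPS/W."""
    return {kernel: max(modes, key=modes.get) for kernel, modes in results.items()}

if __name__ == "__main__":
    for kernel, mode in best_cache_mode(kernel_results).items():
        print(f"{kernel}: run in {mode} cache mode")
```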
