Video s3
    Details
    Presenter(s)
        Yan Xiong (Arizona State University)
    Author(s)
        Yan Xiong (Arizona State University)
        Jingtao Li (Arizona State University)
        David Blaauw (University of Michigan)
        Hunseok Kim (University of Michigan)
        Trevor Mudge (University of Michigan)
        Ronald Dreslinski (University of Michigan; Arizona State University)
    Abstract

    Convolutional neural networks (CNNs) are built from convolution layers of various types (2D, point-wise, depth-wise), and these layers are typically the most computationally intensive parts. The differences in convolution kernel sizes and memory requirements are best supported by reconfigurable architectures that can easily adapt to each layer's characteristics. In this work, we exploit run-time reconfiguration to enhance the performance of convolution kernels on a low-power reconfigurable architecture, Transmuter. The architecture consists of lightweight cores interconnected by caches and crossbars that support run-time reconfiguration between different cache modes (shared or private), different dataflow modes (systolic or parallel), and different computation mapping schemes. To achieve run-time reconfiguration, we propose a decision-tree-based engine that selects the optimal Transmuter configuration at low cost. The proposed method is evaluated on commonly used CNN models such as ResNet18, VGG11, AlexNet, and MobileNetV3. Simulation results show that run-time reconfiguration improves the energy efficiency of Transmuter by 3.1x-13.7x across all networks.
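
    The abstract does not spell out how the decision-tree engine maps layer characteristics to a Transmuter configuration. The Python sketch below is only an illustration of that general idea: the feature set, thresholds, and names (LayerFeatures, TransmuterConfig, select_config, the mapping-scheme labels) are all hypothetical and are not taken from the paper.

# Illustrative sketch only: the paper's actual decision-tree engine, features,
# and configuration space are not given on this page. All names and thresholds
# below are assumptions made for the example.
from dataclasses import dataclass

@dataclass
class LayerFeatures:
    kind: str          # "conv2d", "pointwise", or "depthwise"
    kernel_size: int   # e.g. 1, 3, 5
    in_channels: int
    out_channels: int
    fmap_size: int     # input feature-map height/width

@dataclass
class TransmuterConfig:
    cache_mode: str    # "shared" or "private"
    dataflow: str      # "systolic" or "parallel"
    mapping: str       # computation-mapping scheme (label is hypothetical)

def select_config(f: LayerFeatures) -> TransmuterConfig:
    """Toy decision tree: pick a run-time configuration from layer features."""
    if f.kind == "depthwise":
        # Depth-wise layers reuse little data across channels -> private caches.
        return TransmuterConfig("private", "parallel", "weight-stationary")
    if f.kind == "pointwise" or f.kernel_size == 1:
        # 1x1 convolutions behave like GEMMs -> systolic dataflow, shared cache.
        return TransmuterConfig("shared", "systolic", "output-stationary")
    # Regular 2D convolutions: branch on working-set size (made-up threshold).
    if f.in_channels * f.kernel_size * f.kernel_size > 4096:
        return TransmuterConfig("shared", "systolic", "input-stationary")
    return TransmuterConfig("private", "parallel", "output-stationary")

if __name__ == "__main__":
    layer = LayerFeatures(kind="conv2d", kernel_size=3,
                          in_channels=64, out_channels=128, fmap_size=56)
    print(select_config(layer))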

    Slides
    • Improving Energy Efficiency of Convolutional Neural Networks on Multi-Core Architectures Through Run-Time Reconfiguration (PDF)