Details
![Kyeongjong Lim Headshot](https://confcats-catavault.s3.amazonaws.com/CATAVault/ieeecass/master/files/styles/cc_user_photo/s3/user-pictures/13511_0.jpg?h=a61f5ce9&itok=WWf_nZqV)
- Affiliation: Seoul National University
- Country
Convolutional neural network (CNN) based object detectors, such as You Only Look Once (YOLO), achieve remarkable performance but come with high computational complexity and large memory bandwidth requirements. It is therefore challenging to design an accelerator for such detectors on edge devices, which have a limited power budget and limited on-chip memory. In this paper, we propose an energy- and memory-efficient CNN accelerator for YOLO. First, we propose a novel dataflow that reduces the number of filter switches by 99.56% on average. Second, we propose a layer-wise on-chip memory reuse scheme in which multi-bank on-chip buffers are efficiently utilized for both feature maps (FMs) and filters, without accessing external memory for FMs. The proposed design is implemented on a Xilinx ZC706 FPGA and consumes 5.05 W while achieving a throughput of 370.5 GOPS for Tiny-YOLOv2, using only 640 DSPs and 322.5 BRAMs. Our design achieves an energy efficiency of 73.39 GOPS/W, which outperforms previous works.
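As a quick sanity check on the reported figures, the short sketch below recomputes energy efficiency as throughput divided by power, along with throughput per DSP. The function and constant names are illustrative and do not come from the paper. Using the rounded values quoted in the abstract gives roughly 73.37 GOPS/W, slightly below the reported 73.39 GOPS/W; the gap is presumably due to rounding of the published throughput and power figures, though that is an assumption.

```python
# Figures quoted in the abstract (Tiny-YOLOv2 on a Xilinx ZC706 FPGA).
THROUGHPUT_GOPS = 370.5  # reported throughput
POWER_W = 5.05           # reported power consumption
DSP_COUNT = 640          # reported DSP usage


def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Return energy efficiency in GOPS/W (throughput divided by power)."""
    return throughput_gops / power_w


def throughput_per_dsp(throughput_gops: float, dsp_count: int) -> float:
    """Return throughput per DSP block in GOPS/DSP (illustrative metric)."""
    return throughput_gops / dsp_count


efficiency = energy_efficiency(THROUGHPUT_GOPS, POWER_W)
per_dsp = throughput_per_dsp(THROUGHPUT_GOPS, DSP_COUNT)

print(f"Energy efficiency: {efficiency:.2f} GOPS/W")
print(f"Throughput per DSP: {per_dsp:.3f} GOPS/DSP")
```

The per-DSP figure is not stated in the abstract; it is included only as one hypothetical way to normalize throughput by resource usage when comparing FPGA designs.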