Abstract
To address the heavy memory-access overhead and limited computational parallelism of edge devices, this paper proposes eSSpMV, an embedded-FPGA-based hardware accelerator tailored for Symmetric Sparse Matrix-Vector Multiplication (SSpMV). We propose an optimized data format, Symmetric Compressed Sparse Row (SCSR), together with a fully pipelined computation unit compatible with that format. Experimental results show that eSSpMV achieves a 2.9× speedup over the state-of-the-art FPGA implementation while reducing LUT and DSP usage by 39.3% and 32.3%, respectively. Compared with edge CPU and GPU implementations, eSSpMV achieves a 9.3× speedup over the CPU and 13.1× better energy efficiency than the GPU.
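The abstract does not spell out the SCSR layout, so the following is only a minimal software sketch of the general idea behind symmetric SpMV, not the authors' format or hardware design. It assumes a plain CSR structure that stores just the lower triangle (including the diagonal) of a symmetric matrix; the function name `symmetric_spmv` and the arrays `row_ptr`, `col_idx`, and `values` are illustrative names, not taken from the paper. Each stored off-diagonal entry is read once and used twice, which is where the memory-traffic saving of symmetric storage comes from.

```python
def symmetric_spmv(n, row_ptr, col_idx, values, x):
    """Compute y = A @ x for a symmetric n x n matrix A.

    Assumption: only the lower triangle of A (diagonal included) is stored,
    in ordinary CSR arrays. This is a generic illustration, not the SCSR
    format described in the paper.
    """
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            a = values[k]
            y[i] += a * x[j]      # contribution of A[i][j]
            if j != i:
                y[j] += a * x[i]  # mirrored contribution of A[j][i]
    return y


# A = [[4, 1, 0],
#      [1, 3, 2],
#      [0, 2, 5]]   (only the lower triangle is stored)
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

print(symmetric_spmv(3, row_ptr, col_idx, values, x))  # [6.0, 13.0, 19.0]
```

Note that the mirrored update writes to `y[j]` for rows other than the current one, so a parallel or pipelined implementation has to handle those scattered accumulations carefully; how eSSpMV does so is not stated in the abstract.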