    Details
    Presenter(s)
    Seung-Hwan Bae
    Affiliation
    Seoul National University
    Abstract

    In this paper, a new cache compression scheme for floating-point numbers is proposed for CNNs. The exponent is compressed with a Golomb-Rice code for an efficient hardware implementation. The compression syntax is carefully designed, by distinguishing the two different types of data used in a CNN, so that the size of the compressed data stays close to the entropy, which is the theoretical limit. Since the mantissa of CNN data can hardly be compressed by entropy coding, it is simply quantized for data reduction, which may not degrade the CNN performance significantly thanks to the error robustness of a CNN.
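
    To make the general idea concrete, the minimal Python sketch below Rice-codes the 8-bit exponent of a float32 (here relative to the bias, via a zigzag map) and truncates the 23-bit mantissa to a fixed width. The bias-relative mapping, the Rice parameter k=2, and the 8-bit truncation quantizer are illustrative assumptions; the paper's actual compression syntax, including the distinction between the two CNN data types and the handling of special values such as zeros, is not reproduced here.

```python
import struct


def rice_encode(n: int, k: int) -> str:
    """Rice code with parameter k: unary quotient, a '0' terminator, then the k-bit remainder."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + format(r, f"0{k}b")


def zigzag(d: int) -> int:
    """Map signed exponent offsets to non-negative integers so small magnitudes get short codes."""
    return 2 * d if d >= 0 else -2 * d - 1


def compress_float32(x: float, k: int = 2, mantissa_bits: int = 8) -> str:
    """Illustrative only: Rice-code the bias-removed exponent of a float32 and truncate its mantissa."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # biased 8-bit exponent
    mantissa = bits & 0x7FFFFF          # 23-bit mantissa
    exp_code = rice_encode(zigzag(exponent - 127), k)
    mant_code = format(mantissa >> (23 - mantissa_bits), f"0{mantissa_bits}b")
    return f"{sign}{exp_code}{mant_code}"


# Example: 0.15625 (exponent -3) compresses to 13 bits instead of 32.
print(compress_float32(0.15625))  # -> '0100101000000' (sign | exponent code | quantized mantissa)
```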

    Slides
    • Cache Compression with Golomb-Rice Code and Quantization for Convolutional Neural Networks (application/pdf)