UCViT: Hardware-Friendly Vision Transformer via Unified Compression

We present UCViT, a unified compressed version of the Vision Transformer (ViT). UCViT compresses the computation-intensive ViT model with a single unified compression method that converts dense matrix multiplications into hardware-friendly, low-bit-width operations dominated by shifts and additions. To compensate for the resulting accuracy degradation, we additionally introduce a small, relatively high-precision matrix associated with the multi-head attention mechanism of Transformer-based models.
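To illustrate why shift-and-add arithmetic can stand in for dense multiplication, the following is a minimal sketch and not the UCViT method itself. It assumes that each weight is rounded to a signed power of two, so that multiplying an integer activation by a weight becomes a bit shift. The function names (`quantize_pow2`, `shift_add_matvec`), the exponent range, and the fixed-point format with `frac_bits` fractional bits are all hypothetical choices made for this example.

```python
import numpy as np

def quantize_pow2(w, min_exp=-4, max_exp=0):
    """Round each weight to sign * 2**exp, with exp clipped to [min_exp, max_exp].

    Assumption for illustration only: a plain power-of-two quantizer.
    """
    sign = np.sign(w).astype(np.int64)
    # Floor the magnitude so log2 is defined for zero weights (their sign is 0).
    mag = np.maximum(np.abs(w), 2.0 ** (min_exp - 1))
    exp = np.clip(np.round(np.log2(mag)), min_exp, max_exp).astype(np.int64)
    return sign, exp

def shift_add_matvec(x_int, sign, exp, frac_bits=4):
    """Compute W @ x using only shifts, additions, and subtractions.

    x_int holds integer activations. The result is a fixed-point integer
    vector with frac_bits fractional bits; requires min(exp) >= -frac_bits.
    """
    rows, cols = sign.shape
    out = np.zeros(rows, dtype=np.int64)
    for i in range(rows):
        acc = 0
        for j in range(cols):
            # Multiplying by 2**exp is a left shift once the fixed-point
            # offset frac_bits has been added to the exponent.
            term = int(x_int[j]) << (int(exp[i, j]) + frac_bits)
            if sign[i, j] > 0:
                acc += term
            elif sign[i, j] < 0:
                acc -= term
        out[i] = acc
    return out

# Usage: weights that are already powers of two are reproduced exactly.
W = np.array([[0.5, -0.25], [1.0, 0.125]])
x = np.array([3, -2])
sign, exp = quantize_pow2(W)
y = shift_add_matvec(x, sign, exp) / 2.0 ** 4  # back to real values
```

In this sketch the only per-weight storage is a sign and a small exponent, which is what makes the operation low bit-width. Weights that are not exact powers of two incur a rounding error, which is the kind of accuracy loss a higher-precision correction term would need to offset.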

Slides

UCViT: Hardware-Friendly Vision Transformer via Unified Compression (application/pdf)
