Video s3
    Details
    Author(s)
    Alexandre Mercat, Tampere University
    Ari Lemmetti, Tampere University
    Joose Sainio, Tampere University
    Jarno Vanne, Tampere University
    Abstract

    High Efficiency Video Coding (HEVC) sets the stage for economical video transmission and storage, but its inherent computational complexity calls for powerful implementations. This paper addresses the principal performance bottleneck of HEVC codecs by introducing AVX2-vectorized algorithms for the HEVC interpolation filters. The proposed speed-up techniques include 1) a data permutation scheme for the horizontal interpolation stage; 2) a sliding window strategy for the vertical interpolation stage; 3) optimal usage of horizontal and vertical interpolation during fractional motion estimation; and 4) a lane-based approach that doubles the vector length from the 128 bits of legacy vector extensions to the full 256 bits of AVX2. Our AVX2-optimized interpolation filters were benchmarked as part of the practical open-source Kvazaar HEVC encoder. On an 8-core Intel Xeon processor, they were shown to be 9.7 and 8.5 times as fast as scalar interpolation with the Kvazaar ultrafast and veryslow presets, respectively. In both cases, switching from scalar to vectorized interpolation more than doubles the encoder speed, which underlines the importance of interpolation optimizations in modern encoders.
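For readers unfamiliar with the operation being vectorized, the following is a minimal scalar sketch of a horizontal 8-tap luma interpolation pass, i.e. the kind of baseline the abstract compares against. It is not taken from Kvazaar or from the paper: the function name `interp_row_scalar`, the table name `luma_taps`, and the single-stage rounding to 8-bit output are assumptions made for illustration. Only the filter coefficients are the standard HEVC luma taps for the quarter-, half-, and three-quarter-sample positions.

```c
#include <assert.h>
#include <stdint.h>

/* Standard HEVC 8-tap luma coefficients for the 1/4, 1/2 and 3/4 sample
 * positions. Each row sums to 64, so a flat signal is left unchanged. */
static const int8_t luma_taps[3][8] = {
    { -1, 4, -10, 58, 17,  -5, 1,  0 },   /* frac = 1 (quarter)       */
    { -1, 4, -11, 40, 40, -11, 4, -1 },   /* frac = 2 (half)          */
    {  0, 1,  -5, 17, 58, -10, 4, -1 },   /* frac = 3 (three-quarter) */
};

/* Hypothetical helper (name and signature are illustrative only).
 * Filters one row of 8-bit samples horizontally.
 *   src   : points at the first sample to interpolate; the caller must
 *           guarantee 3 valid samples to the left of src[0] and 4 valid
 *           samples to the right of src[width - 1].
 *   frac  : fractional position in quarter-sample units, 1..3.
 * Assumption: the result is rounded and clipped straight to 8 bits,
 * whereas a real encoder typically keeps higher intermediate precision. */
static void interp_row_scalar(const uint8_t *src, uint8_t *dst,
                              int width, int frac)
{
    assert(frac >= 1 && frac <= 3);
    const int8_t *c = luma_taps[frac - 1];

    for (int x = 0; x < width; x++) {
        int sum = 0;
        /* 8 multiply-accumulates per output sample: this inner loop is
         * what SIMD implementations spread across vector lanes. */
        for (int k = 0; k < 8; k++)
            sum += c[k] * src[x + k - 3];

        /* Round, normalize by 64, and clip to [0, 255]. Negative sums
         * are clamped before shifting to keep the shift well defined. */
        int v = sum + 32;
        v = (v < 0) ? 0 : (v >> 6);
        dst[x] = (uint8_t)(v > 255 ? 255 : v);
    }
}
```

Each output sample costs eight multiplications and seven additions on overlapping input windows, which is why this stage lends itself to the permutation and lane-based vectorization schemes the abstract lists.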