    Details
    Presenter(s)
    Cunhui Dong, University of Science and Technology of China
    Author(s)
    Cunhui Dong, University of Science and Technology of China
    Haichuan Ma, University of Science and Technology of China
    Dong Liu, Hong Kong Polytechnic University
    John Woods, Rensselaer Polytechnic Institute
    Abstract

    Scalability is an important requirement for video coding when coded videos are streamed over dynamic-bandwidth networks. State-of-the-art scalable video coding schemes adopt layer-based methods built upon H.265, represented by the SHVC standard. Compared with layer-based schemes, wavelet-based schemes were long suspected to be less efficient. We improve the compression efficiency of wavelet-based scalable video coding by leveraging recent progress in deep learning. First, we propose an entropy coding method that uses trained convolutional neural networks (CNNs) for probability estimation to compress the wavelet-transformed subbands. Second, we design a CNN-based method for the inverse temporal wavelet transform. We integrate the two proposed methods into a traditional wavelet-based scalable video coding scheme, namely Interframe-EZBC. Together, the two methods achieve more than 20% bit savings. Our scheme then outperforms the SHVC reference software by 9.09%, 6.55%, and 8.66% BD-rate reductions for the Y, U, and V components, respectively.
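    For illustration only, the following is a minimal sketch of the kind of CNN-based probability estimation described above, assuming a PyTorch-style model whose output parameterizes a per-coefficient distribution that is then handed to an arithmetic coder; the layer sizes, distribution choice, and names here are hypothetical and are not taken from the paper or its software.

# Illustrative sketch (not the paper's actual network): a small CNN that
# estimates per-coefficient probability parameters for a wavelet subband,
# which an arithmetic coder could then use to code the quantized coefficients.
import torch
import torch.nn as nn

class SubbandProbabilityModel(nn.Module):
    # Predicts a mean and scale for each coefficient from already-decoded
    # context (hypothetical Laplacian/Gaussian parameterization).
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, kernel_size=3, padding=1),  # mean, log-scale
        )

    def forward(self, context):
        mean, log_scale = self.net(context).chunk(2, dim=1)
        return mean, log_scale.exp()

# Toy usage: a 32x32 context map of one subband (placeholder zeros).
model = SubbandProbabilityModel()
mean, scale = model(torch.zeros(1, 1, 32, 32))
# The (mean, scale) pairs would parameterize a discretized distribution whose
# cumulative probabilities drive the arithmetic coder for each coefficient.

    In a real codec the context fed to such a model would have to be strictly causal (for example via masked convolutions or coefficient-by-coefficient decoding) so that the decoder can reproduce exactly the same probabilities as the encoder.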

    Slides
    • Wavelet-Based Learned Scalable Video Coding (PDF)