Video s3
    Details
    Author(s)
    Display Name
    Ardavan Elahi
    Affiliation
    University of Tehran
    Display Name
    Ali Falahati
    Affiliation
    Avanco
    Display Name
    Farhad Pakdaman
    Affiliation
    Tampere University
    Display Name
    Mehdi Modarressi
    Affiliation
    University of Tehran
    Display Name
    Moncef Gabbouj
    Affiliation
    Tampere University
    Abstract

    Because a considerable portion of captured video is consumed by machine learning tasks rather than by a human audience, transmitting video in such applications requires efficient compression tailored for machine vision. However, existing compression solutions are optimized for human vision. This paper presents a methodology to optimize an existing video compression standard, HEVC, for a machine vision task, object detection (OD). To this end, (1) a dataset of compressed videos, covering several compression ratios and their corresponding OD performance, is collected to enable modeling; (2) a trade-off point (knee point) between bitrate and OD performance is defined, marking the point beyond which no major improvement is achieved; and (3) an extensive set of features is extracted and studied to model this point via a practical machine learning method. The resulting solution predicts the knee point with MAE = 1.28, resulting in a ΔRecall of only 0.012 and a bitrate reduction of 86.56% compared to OD on very high-quality video.
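    To make the knee-point idea concrete, the following is a minimal sketch, not the authors' definition. It assumes the knee point is expressed as an HEVC quantization parameter (QP, where higher values mean coarser quantization and lower bitrate) and that it is the highest QP reached before recall falls more than a chosen tolerance below the recall of the highest-quality encode. The function names, the tolerance, and the sample numbers are all hypothetical and serve only as illustration.

```python
from typing import Sequence


def find_knee_qp(qps: Sequence[int], recalls: Sequence[float],
                 tolerance: float = 0.02) -> int:
    """Return the highest QP reached before recall drops more than
    `tolerance` below the recall of the lowest-QP (highest-quality) encode.

    This is an assumed, simplified knee-point rule for illustration only.
    """
    if not qps or len(qps) != len(recalls):
        raise ValueError("qps and recalls must be non-empty and equal in length")

    # Walk the rate/accuracy curve from highest quality (lowest QP) upward.
    points = sorted(zip(qps, recalls))
    reference_recall = points[0][1]
    knee = points[0][0]
    for qp, recall in points[1:]:
        if reference_recall - recall > tolerance:
            break  # first QP at which the recall loss exceeds the tolerance
        knee = qp
    return knee


def mean_absolute_error(predicted: Sequence[float],
                        actual: Sequence[float]) -> float:
    """Mean absolute error between predicted and ground-truth knee points."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and equal in length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical measurements for one video (illustrative values, not paper data).
example_qps = [22, 27, 32, 37, 42, 47]
example_recalls = [0.80, 0.80, 0.79, 0.785, 0.70, 0.50]
knee = find_knee_qp(example_qps, example_recalls)
print(f"knee QP: {knee}")
```

    In this sketch, a predictor of the kind described in the abstract would be evaluated by comparing its predicted knee points against values found this way, with `mean_absolute_error` playing the role of the reported MAE.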