Appearance-Motion United Auto-Encoder Framework for Video Anomaly Detection
    Details
    Presenter(s)
    Yang Liu, Fudan University, China
    Author(s)
    Yang Liu, Fudan University
    Jing Liu, Fudan University
    Jieyu Lin, University of Toronto
    Mengyang Zhao, Fudan University
    Liang Song, Fudan University
    Abstract

    The key to video anomaly detection is understanding the appearance and motion differences between normal and abnormal events. However, previous works have either considered appearance and motion in isolation or treated them without distinction, preventing the model from exploiting the unique characteristics of each. In this paper, we propose an appearance-motion united auto-encoder framework that jointly learns the prototypical spatial and temporal patterns of normal events. The method comprises a spatial auto-encoder that learns appearance normality, a temporal auto-encoder that learns motion normality, and a channel attention-based spatial-temporal decoder that fuses the spatial and temporal features. Experimental results on standard benchmarks demonstrate the effectiveness of the united normality learning: our method outperforms state-of-the-art methods with AUCs of 97.4% and 73.6% on the UCSD Ped2 and ShanghaiTech datasets, respectively.
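    The sketch below illustrates the general structure the abstract describes: two encoders for appearance and motion, channel attention over the concatenated features, and a shared decoder. It is a minimal PyTorch approximation, not the authors' implementation; the layer sizes, the squeeze-and-excitation style of the attention, and the use of a frame difference as the motion input are all assumptions.

    # Minimal PyTorch sketch of a dual-encoder anomaly-detection auto-encoder.
    # All architectural details here are assumptions, not the paper's design.
    import torch
    import torch.nn as nn

    def conv_block(in_ch, out_ch):
        # Downsampling conv block shared by both encoders.
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    class ChannelAttention(nn.Module):
        # Squeeze-and-excitation style channel attention, an assumed
        # stand-in for the paper's channel attention-based fusion.
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.fc = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )

        def forward(self, x):
            return x * self.fc(x)  # re-weight channels, keep spatial map

    class AppearanceMotionAE(nn.Module):
        def __init__(self):
            super().__init__()
            # Spatial encoder: appearance normality from an RGB frame.
            self.spatial_enc = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
            # Temporal encoder: motion normality from a frame difference
            # (assumed motion representation).
            self.temporal_enc = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
            # Channel attention over the concatenated spatial-temporal features.
            self.attn = ChannelAttention(128)
            # Decoder reconstructs the frame from the fused features.
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
            )

        def forward(self, frame, frame_diff):
            fused = torch.cat([self.spatial_enc(frame),
                               self.temporal_enc(frame_diff)], dim=1)
            return self.decoder(self.attn(fused))

    # Usage: at test time, a large reconstruction error flags an anomaly.
    model = AppearanceMotionAE()
    frame = torch.randn(1, 3, 64, 64)
    diff = torch.randn(1, 3, 64, 64)
    recon = model(frame, diff)
    score = torch.mean((recon - frame) ** 2)  # per-frame anomaly score proxy

    Because the auto-encoder is trained only on normal events, it reconstructs them well; abnormal appearance or motion yields a high reconstruction error, which serves as the anomaly score.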

    Slides
    • Appearance-Motion United Auto-Encoder Framework for Video Anomaly Detection (PDF)