V3Trans-Crowd: a Video-Based Visual Transformer for Crowd Management Monitoring

Details

Author(s)

Yuqi Zuo

Affiliation: King Abdullah University of Science and Technology

Aymen Hamrouni

Affiliation: King Abdullah University of Science and Technology

Hakim Ghazzai

Affiliation: King Abdullah University of Science and Technology

Yehia Massoud

Affiliation: King Abdullah University of Science and Technology

Abstract

Crowd behavior monitoring and situation assessment remain very challenging problems. Such tasks present two main difficulties. First, the interactions among individuals and their fusion into group behavior introduce complexity that must be assessed and analyzed. Second, the resulting actions must be classified, which can help identify danger and avoid undesired consequences. In this paper, we propose a transformer-based crowd management monitoring framework called V3Trans-Crowd that captures information from video data and extracts meaningful output to categorize the behavior of the crowd. We provide an improved hierarchical transformer for multi-modal tasks. Inspired by 3D visual transformers, our proposed 3D visual model, V3Trans-Crowd, is shown to achieve higher accuracy than state-of-the-art methods when tested on the standard Crowd-11 dataset.
