Authors
Yimao Xiong, Xiangling Ding, Qing Gu (Hunan University of Science and Technology)
    Abstract

    Deep video inpainting can be exploited to remove specific target objects from a video. When such inpainted videos spread on social media, they can easily foster misleading public perceptions, so it is necessary to localize the regions altered by deep video inpainting. This paper addresses the problem by enhancing the traces that inpainting leaves behind. Concretely, consecutive RGB frames and their error-level analysis (ELA) frames are first fed into the encoder in parallel to extract richer trace features of the inpainted regions, and multi-modal features are generated at different scales through channel-level feature fusion. A cascade of eight ConvGRUs is then embedded in the decoder to capture temporal anomalies between video frames. In particular, an eight-direction local attention module is introduced at the last level of the encoder; it attends to each pixel's neighborhood along eight directions and captures the inconsistency among pixels in the inpainted regions. As a result, the proposed method recovers more tampering details than state-of-the-art methods on the constructed datasets.
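The ELA input mentioned in the abstract is a standard forensic transform: re-save a frame as JPEG at a fixed quality and take the absolute difference with the original, since tampered regions tend to recompress differently. A minimal sketch with NumPy and Pillow is below; the function name and the quality setting of 90 are illustrative assumptions, as the paper's abstract does not specify them.

```python
import io

import numpy as np
from PIL import Image


def ela_frame(frame: np.ndarray, quality: int = 90) -> np.ndarray:
    """Error-level analysis of an RGB uint8 frame.

    Re-saves the frame as JPEG at the given quality (90 is an assumed
    setting) and returns the contrast-stretched absolute difference
    with the original, which tends to highlight inpainted regions.
    """
    buf = io.BytesIO()
    Image.fromarray(frame).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    recompressed = np.asarray(Image.open(buf), dtype=np.int16)
    diff = np.abs(frame.astype(np.int16) - recompressed)
    # Stretch to the full 0-255 range so faint traces become visible.
    scale = 255.0 / max(int(diff.max()), 1)
    return np.clip(diff * scale, 0, 255).astype(np.uint8)


# Example on a synthetic frame (a real pipeline would use video frames).
frame = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
ela = ela_frame(frame)
```

The ELA frame has the same shape as the input, so it can be fed to the encoder in parallel with the RGB frame as the abstract describes.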
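The eight-direction local attention module compares each pixel's feature with its neighbors along the eight compass directions. The NumPy sketch below shows only the neighborhood-gathering idea, with a mean absolute difference as a crude stand-in for the learned attention weights; the function names and zero-padding choice are assumptions, not the paper's implementation.

```python
import numpy as np

# Offsets for the eight compass directions around a pixel (dy, dx).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
           (0, 1), (1, -1), (1, 0), (1, 1)]


def eight_direction_neighbors(feat: np.ndarray) -> np.ndarray:
    """Return an (8, H, W, C) stack of the (H, W, C) feature map
    shifted along each of the eight directions, zero-padded at the
    border, so entry d holds each pixel's neighbor in direction d."""
    h, w, _ = feat.shape
    padded = np.pad(feat, ((1, 1), (1, 1), (0, 0)))
    shifted = [padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
               for dy, dx in OFFSETS]
    return np.stack(shifted, axis=0)


def local_inconsistency(feat: np.ndarray) -> np.ndarray:
    """Mean absolute difference between each pixel and its eight
    neighbors: a simplified proxy for the inconsistency the
    attention module is designed to capture."""
    neighbors = eight_direction_neighbors(feat)
    return np.abs(neighbors - feat[None]).mean(axis=(0, 3))  # (H, W)


# An isolated feature spike scores higher than its smooth surroundings.
feat = np.zeros((5, 5, 2), dtype=np.float32)
feat[2, 2] = 1.0
score = local_inconsistency(feat)
```

In the actual model these differences would feed learned attention weights over encoder features rather than a raw mean.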