    Author(s)
    Runhao Li (Nanyang Technological University)
    Zhenyu Weng (Nanyang Technological University)
    Huiping Zhuang (South China University of Technology)
    Yongming Chen (Nanyang Technological University)
    Zhiping Lin (Nanyang Technological University)
    Abstract

    Cross-modal retrieval methods retrieve relevant data across different modalities. Supervised cross-modal retrieval methods usually achieve higher accuracy than unsupervised ones because they can exploit the semantic information provided by clean labels. However, noisy labels in the training data degrade the performance of supervised methods. In this work, we present a novel framework called Neighborhood Learning for Cross-Modal Retrieval (NLCMR) that is robust to noisy labels by exploiting the information contained in the neighborhood of each sample. NLCMR consists of two main components: Clustering with Neighborhood Alignment, which reduces the impact of noisy labels and improves clustering robustness, and Neighborhood Contrastive Learning, which learns from noisy data by exploring pairwise and neighborhood information. Extensive experiments on three multi-modal datasets demonstrate the effectiveness of NLCMR.
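
    The abstract does not give the concrete loss formulation, so the sketch below only illustrates the general idea behind a neighborhood-aware contrastive objective: in addition to its paired item, each anchor treats its k nearest neighbors in the embedding space as extra positives, which reduces reliance on possibly noisy labels. The function name, the choice of k, and the temperature are illustrative assumptions, not the paper's actual design.

    # Minimal, illustrative PyTorch sketch of a neighborhood-aware
    # cross-modal contrastive loss. NOT the paper's actual formulation;
    # `neighborhood_contrastive_loss`, `k`, and `temperature` are assumed.
    import torch
    import torch.nn.functional as F

    def neighborhood_contrastive_loss(img_emb, txt_emb, k=5, temperature=0.1):
        """InfoNCE-style loss over image/text embeddings of shape (N, D):
        each anchor's positives are its true pair plus its k nearest
        neighbors found in the image embedding space."""
        img_emb = F.normalize(img_emb, dim=1)
        txt_emb = F.normalize(txt_emb, dim=1)
        k = min(k, img_emb.size(0) - 1)  # guard against small batches

        # Cross-modal similarity logits (image rows vs. text columns).
        logits = img_emb @ txt_emb.t() / temperature          # (N, N)

        # Positive mask: the diagonal (true pairs) plus each anchor's
        # k nearest neighbors in the image embedding space.
        with torch.no_grad():
            sim_img = img_emb @ img_emb.t()                   # (N, N)
            sim_img.fill_diagonal_(-float("inf"))             # exclude self
            nn_idx = sim_img.topk(k, dim=1).indices           # (N, k)
            pos_mask = torch.eye(img_emb.size(0),
                                 device=img_emb.device, dtype=torch.bool)
            pos_mask.scatter_(1, nn_idx, True)

        # Average log-probability over each anchor's positives.
        log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
        loss = -(log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1)
        return loss.mean()

    Treating neighbors as additional positives is one common way to make a contrastive objective depend on the data geometry rather than on the (potentially corrupted) labels, which matches the abstract's stated goal of learning from noisy data via pairwise and neighborhood information.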