Unsupervised Joint-Semantics Autoencoder Hashing for Multimedia Retrieval
EasyChair Preprint 11137 · 13 pages · Date: October 23, 2023

Abstract
Cross-modal hashing has emerged as a prominent approach for large-scale multimedia retrieval, offering advantages in computational speed and storage efficiency over traditional methods. However, unsupervised cross-modal hashing methods still face challenges due to the lack of semantic label guidance and the difficulty of handling cross-modal heterogeneity. In this paper, we propose a new unsupervised cross-modal hashing method called Unsupervised Joint-Semantics Autoencoder Hashing (UJSAH) for multimedia retrieval. First, we introduce a joint-semantics similarity matrix that effectively preserves the semantic information in multimodal data. The matrix integrates the original neighborhood structure of the data, allowing it to better capture the associations between modalities and to accurately mine the underlying relationships within the data. Second, we design an autoencoder built on a dual prediction network, which translates semantic information between modalities and ensures that the generated binary hash codes preserve the semantic information of each modality. Experimental results on several classical datasets show that UJSAH significantly outperforms existing methods on multimodal retrieval tasks. The experimental code is published at https://github.com/YunfeiChenMY/UJSAH.

Keyphrases: Cross-modal Hashing, Dual Prediction, Joint-Semantics, Multimedia Retrieval
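The abstract does not give the exact construction of the joint-semantics similarity matrix, but the underlying idea of fusing intra-modal neighborhood structure can be illustrated with a minimal sketch. The snippet below builds cosine-similarity matrices from image and text features and combines them with a first-order fusion weight and a second-order (neighbor-of-neighbor) term; the function name, the feature inputs, and the hyperparameters alpha and eta are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def joint_semantics_similarity(img_feat, txt_feat, alpha=0.5, eta=0.4):
    """Illustrative fusion of intra-modal neighborhood structure into a
    single joint similarity matrix (a sketch, not the paper's exact method).

    img_feat: (n, d_img) image features; txt_feat: (n, d_txt) text features.
    alpha balances the two modalities; eta mixes in a second-order term.
    """
    def cosine_sim(x):
        x = F.normalize(x, dim=1)      # unit-length rows
        return x @ x.t()               # (n, n) cosine-similarity matrix

    s_img = cosine_sim(img_feat)       # image neighborhood structure
    s_txt = cosine_sim(txt_feat)       # text neighborhood structure
    s = alpha * s_img + (1 - alpha) * s_txt        # first-order fusion
    s_high = (s @ s.t()) / s.shape[0]              # second-order neighbors
    return (1 - eta) * s + eta * s_high            # joint-semantics matrix
```

In methods of this family, such a matrix typically supervises hash learning by encouraging inner products of code pairs to match the corresponding similarity entries.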
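The dual-prediction autoencoder is likewise described only at a high level. A minimal PyTorch sketch of that idea might look like the following: each modality is encoded into a continuous code (binarized by sign at retrieval time), a decoder reconstructs its own modality, and a prediction head maps the code to the other modality's features so the codes carry semantics of both. All layer sizes and module names are hypothetical.

```python
import torch
import torch.nn as nn

class DualPredictionAutoencoder(nn.Module):
    """Sketch of an autoencoder with cross-modal prediction heads.
    Architecture details are assumptions, not the paper's exact design."""

    def __init__(self, d_img, d_txt, n_bits=64):
        super().__init__()
        self.enc_img = nn.Sequential(nn.Linear(d_img, 512), nn.ReLU(),
                                     nn.Linear(512, n_bits), nn.Tanh())
        self.enc_txt = nn.Sequential(nn.Linear(d_txt, 512), nn.ReLU(),
                                     nn.Linear(512, n_bits), nn.Tanh())
        self.dec_img = nn.Linear(n_bits, d_img)    # reconstruct image features
        self.dec_txt = nn.Linear(n_bits, d_txt)    # reconstruct text features
        self.pred_i2t = nn.Linear(n_bits, d_txt)   # predict text from image code
        self.pred_t2i = nn.Linear(n_bits, d_img)   # predict image from text code

    def forward(self, img, txt):
        h_img, h_txt = self.enc_img(img), self.enc_txt(txt)
        b_img, b_txt = torch.sign(h_img), torch.sign(h_txt)  # binary codes
        outputs = {
            "rec_img": self.dec_img(h_img), "rec_txt": self.dec_txt(h_txt),
            "img2txt": self.pred_i2t(h_img), "txt2img": self.pred_t2i(h_txt),
        }
        return h_img, h_txt, b_img, b_txt, outputs
```

Training such a sketch would combine reconstruction and cross-modal prediction losses with a similarity-preservation term based on the joint-semantics matrix above.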