Semi-supervised Multimodal Hashing
Retrieving nearest neighbors across correlated data in multiple modalities,
such as image-text pairs on Facebook and video-tag pairs on YouTube, has become
a challenging task due to the huge amount of data. Multimodal hashing methods
that embed data into binary codes can speed up retrieval and reduce
storage requirements. Unsupervised multimodal hashing methods are usually
inferior to supervised ones, while supervised ones require too much
manually labeled data, so this paper proposes a semi-supervised multimodal
hashing method that uses only part of the labels. It first computes
the transformation matrices for the data matrices and the label matrix. Then, with
these transformation matrices, fuzzy logic is introduced to estimate a label
matrix for unlabeled data. Finally, it uses the estimated label matrix to learn
hashing functions for data in each modality to generate a unified binary code
matrix. Experiments show that, with 50% of the labels, the proposed
semi-supervised method performs in the middle of the compared supervised
methods, and with 90% of the labels it approaches the performance of the
best supervised method. With only 10% of the labels, it can still compete
with the weakest of the compared supervised methods.
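To make the speed and storage argument concrete, here is a minimal sketch of retrieval over binary codes. It is not the method proposed in the paper: random sign projections stand in for the learned hashing functions, and the names `hash_codes`, `hamming`, and `rank_by_hamming` are hypothetical. In a multimodal setting, each modality would have its own hashing function mapping into the same code space, so a query from one modality can be ranked against codes from another in the same way.

```python
import random

def hash_codes(vectors, projections):
    """Map each real-valued vector to an int whose bits are the signs of its projections.

    Assumption: random projections replace the learned hashing functions.
    """
    codes = []
    for v in vectors:
        code = 0
        for p in projections:
            dot = sum(a * b for a, b in zip(v, p))
            code = (code << 1) | (1 if dot >= 0 else 0)
        codes.append(code)
    return codes

def hamming(a, b):
    """Hamming distance between two codes: XOR, then count the set bits."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered by increasing Hamming distance to the query."""
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming(query_code, database_codes[i]))

rng = random.Random(0)
n_bits, dim = 16, 8
projections = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
database = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(100)]

db_codes = hash_codes(database, projections)
query_code = hash_codes([database[42]], projections)[0]  # query equals item 42
ranking = rank_by_hamming(query_code, db_codes)
```

Each item is stored as a single 16-bit integer instead of eight floating-point values, and comparing two items costs one XOR and one bit count, which is where the savings described above come from.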
Label Prediction Framework for Semi-Supervised Cross-Modal Retrieval
Cross-modal data matching refers to retrieval of data from one modality
given a query from another modality. In general, supervised algorithms achieve
better retrieval performance than their unsupervised counterparts, as
they can learn more representative features by leveraging the available label
information. However, this comes at the cost of requiring a large number of
labeled examples, which may not always be available. In this work, we propose a
novel framework in a semi-supervised setting, which can predict the labels of
the unlabeled data using complementary information from different modalities.
The proposed framework can be used as an add-on with any baseline cross-modal
algorithm to significantly improve performance, even when labeled data is
limited. Finally, we analyze the challenging scenario where the unlabeled
examples can even come from classes not in the training data and evaluate the
performance of our algorithm in this setting. Extensive evaluation using
several baseline algorithms across three different datasets shows the
effectiveness of our label prediction framework.
Comment: 12 pages, 3 tables, 2 figures, 1 algorithm flowchart
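The abstract does not say how labels are predicted, so the following is only an illustrative sketch of the general idea of using two modalities to pseudo-label unlabeled pairs. It assumes a nearest-centroid classifier per modality and accepts a label only when both modalities agree. All function names and the toy data are hypothetical, and this agreement rule is a stand-in, not the framework from the paper.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_centroids(features, labels):
    """Compute one centroid per class from the labeled examples of one modality."""
    by_class = {}
    for f, y in zip(features, labels):
        by_class.setdefault(y, []).append(f)
    return {y: centroid(fs) for y, fs in by_class.items()}

def nearest_class(centroids, x):
    """Class whose centroid is closest to x."""
    return min(centroids, key=lambda y: sq_dist(centroids[y], x))

def predict_labels(img_centroids, txt_centroids, img_feats, txt_feats):
    """Pseudo-label each unlabeled pair; None marks pairs where the modalities disagree."""
    predicted = []
    for xi, xt in zip(img_feats, txt_feats):
        yi = nearest_class(img_centroids, xi)
        yt = nearest_class(txt_centroids, xt)
        predicted.append(yi if yi == yt else None)
    return predicted

# Toy labeled data (hypothetical): paired image and text features.
labels = ["cat", "cat", "dog", "dog"]
img_labeled = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
txt_labeled = [[1.0, 0.0], [1.0, 1.0], [-4.0, 0.0], [-5.0, 1.0]]

img_centroids = fit_centroids(img_labeled, labels)
txt_centroids = fit_centroids(txt_labeled, labels)

# Unlabeled pairs: the third pair has conflicting evidence across modalities.
img_unlabeled = [[0.2, 0.4], [5.5, 5.1], [0.1, 0.2]]
txt_unlabeled = [[0.9, 0.3], [-4.2, 0.6], [-4.6, 0.4]]

pseudo = predict_labels(img_centroids, txt_centroids, img_unlabeled, txt_unlabeled)
print(pseudo)  # ['cat', 'dog', None]
```

Pairs that receive a label could then be added to the training set of a baseline cross-modal algorithm. Leaving disagreeing pairs unlabeled is one simple way to avoid forcing a known class onto examples that may come from classes not in the training data.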