10,463 research outputs found
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
Joint Intermodal and Intramodal Label Transfers for Extremely Rare or Unseen Classes
In this paper, we present a label transfer model from texts to images for
image classification tasks. The problem of image classification is often much
more challenging than text classification. On one hand, labeled text data is
more widely available than the labeled images for classification tasks. On the
other hand, text data tends to have natural semantic interpretability, and they
are often more directly related to class labels. On the contrary, the image
features are not directly related to concepts inherent in class labels. One of
our goals in this paper is to develop a model for revealing the functional
relationships between text and image features as to directly transfer
intermodal and intramodal labels to annotate the images. This is implemented by
learning a transfer function as a bridge to propagate the labels between two
multimodal spaces. However, the intermodal label transfers could be undermined
by blindly transferring the labels of noisy texts to annotate images. To
mitigate this problem, we present an intramodal label transfer process, which
complements the intermodal label transfer by transferring the image labels
instead when relevant text is absent from the source corpus. In addition, we
generalize the inter-modal label transfer to zero-shot learning scenario where
there are only text examples available to label unseen classes of images
without any positive image examples. We evaluate our algorithm on an image
classification task and show the effectiveness with respect to the other
compared algorithms.Comment: The paper has been accepted by IEEE Transactions on Pattern Analysis
and Machine Intelligence. It will apear in a future issu
Simple to Complex Cross-modal Learning to Rank
The heterogeneity-gap between different modalities brings a significant
challenge to multimedia information retrieval. Some studies formalize the
cross-modal retrieval tasks as a ranking problem and learn a shared multi-modal
embedding space to measure the cross-modality similarity. However, previous
methods often establish the shared embedding space based on linear mapping
functions which might not be sophisticated enough to reveal more complicated
inter-modal correspondences. Additionally, current studies assume that the
rankings are of equal importance, and thus all rankings are used
simultaneously, or a small number of rankings are selected randomly to train
the embedding space at each iteration. Such strategies, however, always suffer
from outliers as well as reduced generalization capability due to their lack of
insightful understanding of procedure of human cognition. In this paper, we
involve the self-paced learning theory with diversity into the cross-modal
learning to rank and learn an optimal multi-modal embedding space based on
non-linear mapping functions. This strategy enhances the model's robustness to
outliers and achieves better generalization via training the model gradually
from easy rankings by diverse queries to more complex ones. An efficient
alternative algorithm is exploited to solve the proposed challenging problem
with fast convergence in practice. Extensive experimental results on several
benchmark datasets indicate that the proposed method achieves significant
improvements over the state-of-the-arts in this literature.Comment: 14 pages; Accepted by Computer Vision and Image Understandin
- …