Simple to Complex Cross-modal Learning to Rank
The heterogeneity gap between different modalities poses a significant
challenge to multimedia information retrieval. Some studies formalize
cross-modal retrieval as a ranking problem and learn a shared multi-modal
embedding space in which to measure cross-modality similarity. However,
previous methods often build the shared embedding space from linear mapping
functions, which may not be expressive enough to capture more complicated
inter-modal correspondences. Additionally, current studies assume that all
rankings are equally important, so either all rankings are used
simultaneously or a small number of rankings is selected at random to train
the embedding space at each iteration. Such strategies, however, suffer from
outliers and reduced generalization because they ignore how human cognition
proceeds from easy to hard. In this paper, we incorporate self-paced learning
with diversity into cross-modal learning to rank and learn an optimal
multi-modal embedding space based on non-linear mapping functions. This
strategy improves the model's robustness to outliers and yields better
generalization by training the model gradually, from easy rankings drawn from
diverse queries to more complex ones. An efficient alternating algorithm is
developed to solve the resulting challenging optimization problem, with fast
convergence in practice. Extensive experiments on several benchmark datasets
show that the proposed method achieves significant improvements over the
state of the art.
Comment: 14 pages; Accepted by Computer Vision and Image Understanding
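The easy-to-complex selection with diversity described in the abstract can be sketched as follows. This is a minimal, hypothetical simplification, not the paper's implementation: it uses an SPLD-style closed-form selection rule in which, within each query group, rankings are sorted by loss and admitted while their loss stays below a rank-dependent threshold, so easy rankings enter training first and selection is spread across diverse queries. The function name and parameters (`lam` for the pace, `gamma` for the diversity strength) are illustrative assumptions.

```python
import numpy as np

def spl_diversity_weights(losses, groups, lam, gamma):
    """Select 'easy' samples (low loss) while encouraging diversity
    across groups (e.g., queries). Hypothetical simplification of a
    self-paced-learning-with-diversity (SPLD) selection step: within
    each group, samples are sorted by loss, and a sample is selected
    if its loss falls below a threshold that shrinks with its rank,
    which discourages exhausting a single easy group."""
    v = np.zeros_like(losses, dtype=float)
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        order = idx[np.argsort(losses[idx])]  # easiest first
        for rank, i in enumerate(order):
            # rank-dependent threshold: lam + gamma / (sqrt(r+1) + sqrt(r))
            thresh = lam + gamma / (np.sqrt(rank + 1) + np.sqrt(rank))
            v[i] = 1.0 if losses[i] < thresh else 0.0
    return v
```

Raising `lam` over iterations admits progressively harder rankings, mimicking the easy-to-complex curriculum the abstract describes.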
voxel2vec: A Natural Language Processing Approach to Learning Distributed Representations for Scientific Data
Relationships in scientific data, such as the numerical and spatial
distribution relations of features in univariate data, the relations among
scalar-value combinations in multivariate data, and the association of
volumes in time-varying and ensemble data, are intricate and complex. This
paper presents voxel2vec, a novel unsupervised representation learning model
that learns distributed representations of scalar values and scalar-value
combinations in a low-dimensional vector space. Its basic assumption is that
if two scalar values or scalar-value combinations have similar contexts, they
usually have high feature similarity. By representing scalar values and
scalar-value combinations as symbols, voxel2vec learns the similarity between
them from their spatial-distribution contexts and then allows us to explore
the overall association between volumes through transfer prediction. We
demonstrate the usefulness and effectiveness of voxel2vec by comparing it
with the isosurface similarity map of univariate data and by applying the
learned distributed representations to feature classification for
multivariate data and to association analysis for time-varying and ensemble
data.
Comment: Accepted by IEEE Transactions on Visualization and Computer Graphics (TVCG)
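The context-based learning idea above can be sketched as a word2vec-style pipeline over a volume. The function names, the uniform binning scheme, and the tiny skip-gram trainer below are illustrative assumptions, not the paper's implementation: scalar values are quantized into symbols, (center, context) pairs are drawn from adjacent voxels along each axis, and embeddings are fit with skip-gram negative sampling.

```python
import numpy as np

def voxel_context_pairs(volume, n_bins=16):
    """Quantize scalar values into symbols and emit (center, context)
    pairs from axis-adjacent voxels. A hypothetical simplification of
    the symbol-and-context construction sketched in the abstract."""
    lo, hi = volume.min(), volume.max()
    symbols = np.minimum(
        ((volume - lo) / (hi - lo + 1e-12) * n_bins).astype(int), n_bins - 1
    )
    pairs = []
    for axis in range(3):
        a = np.moveaxis(symbols, axis, 0)
        for c, ctx in zip(a[:-1].ravel(), a[1:].ravel()):
            pairs.append((c, ctx))  # both directions: each voxel is a
            pairs.append((ctx, c))  # center for its spatial neighbor
    return symbols, pairs

def train_skipgram(pairs, vocab, dim=8, lr=0.05, epochs=3, seed=0):
    """Tiny skip-gram with one negative sample per pair (SGNS),
    for illustration only."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(vocab, dim))  # symbol embeddings
    C = rng.normal(scale=0.1, size=(vocab, dim))  # context embeddings
    for _ in range(epochs):
        for center, ctx in pairs:
            neg = rng.integers(vocab)
            for tgt, label in ((ctx, 1.0), (neg, 0.0)):
                score = 1.0 / (1.0 + np.exp(-W[center] @ C[tgt]))
                grad = score - label
                w_old = W[center].copy()
                W[center] -= lr * grad * C[tgt]
                C[tgt] -= lr * grad * w_old
    return W
```

Cosine similarity between rows of `W` then gives a learned similarity between scalar-value symbols, analogous to the feature similarity the abstract describes.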