Deep Multi-view Learning to Rank
We study the problem of learning to rank from multiple information sources.
Though multi-view learning and learning to rank have been studied extensively
leading to a wide range of applications, multi-view learning to rank as a
synergy of both topics has received little attention. The aim of the paper is
to propose a composite ranking method while keeping a close correlation with
the individual rankings simultaneously. We present a generic framework for
multi-view subspace learning to rank (MvSL2R), and two novel solutions are
introduced under the framework. The first solution captures information of
feature mappings from within each view as well as across views using
autoencoder-like networks. Novel feature embedding methods are formulated in
the optimization of multi-view unsupervised and discriminant autoencoders.
Moreover, we introduce an end-to-end solution to learning towards both the
joint ranking objective and the individual rankings. The proposed solution
enhances the joint ranking with minimum view-specific ranking loss, so that it
can achieve the maximum global view agreements in a single optimization
process. The proposed method is evaluated on three different ranking problems,
i.e., university ranking, multi-view lingual text ranking, and image data
ranking, providing superior results compared to related methods.
Comment: Published at IEEE TKDE
Learning Social Image Embedding with Deep Multimodal Attention Networks
Learning social media data embedding by deep models has attracted extensive
research interest and has fueled many applications, such as link
prediction, classification, and cross-modal search. However, for social images
which contain both link information and multimodal contents (e.g., text
description, and visual content), simply employing the embedding learnt from
network structure or data content results in sub-optimal social image
representation. In this paper, we propose a novel social image embedding
approach called Deep Multimodal Attention Networks (DMAN), which employs a deep
model to jointly embed multimodal contents and link information. Specifically,
to effectively capture the correlations between multimodal contents, we propose
a multimodal attention network to encode the fine-granularity relation between
image regions and textual words. To leverage the network structure for
embedding learning, a novel Siamese-Triplet neural network is proposed to model
the links among images. With the joint deep model, the learnt embedding can
capture both the multimodal contents and the nonlinear network information.
Extensive experiments are conducted to investigate the effectiveness of our
approach in the applications of multi-label classification and cross-modal
search. Compared to state-of-the-art image embeddings, our proposed DMAN
achieves significant improvement in the tasks of multi-label classification and
cross-modal search.
Deep Learning based Recommender System: A Survey and New Perspectives
With the ever-growing volume of online information, recommender systems have
been an effective strategy to overcome such information overload. The utility
of recommender systems cannot be overstated, given their widespread adoption in
many web applications, along with their potential to ameliorate many
problems related to over-choice. In recent years, deep learning has garnered
considerable interest in many research fields such as computer vision and
natural language processing, owing not only to stellar performance but also the
attractive property of learning feature representations from scratch. The
influence of deep learning is also pervasive, recently demonstrating its
effectiveness when applied to information retrieval and recommender systems
research. Evidently, the field of deep learning in recommender systems is
flourishing. This article aims to provide a comprehensive review of recent
research efforts on deep learning based recommender systems. More concretely,
we devise a taxonomy of deep learning based recommendation models,
along with a comprehensive summary of the state of the art. Finally,
we expand on current trends and provide new perspectives pertaining to this new
exciting development of the field.
Comment: The paper has been accepted by ACM Computing Surveys.
https://doi.acm.org/10.1145/328502
End-to-End Cross-Modality Retrieval with CCA Projections and Pairwise Ranking Loss
Cross-modality retrieval encompasses retrieval tasks where the fetched items
are of a different type than the search query, e.g., retrieving pictures
relevant to a given text query. The state-of-the-art approach to cross-modality
retrieval relies on learning a joint embedding space of the two modalities,
where items from either modality are retrieved using nearest-neighbor search.
In this work, we introduce a neural network layer based on Canonical
Correlation Analysis (CCA) that learns better embedding spaces by analytically
computing projections that maximize correlation. In contrast to previous
approaches, the CCA Layer (CCAL) allows us to combine existing objectives for
embedding space learning, such as pairwise ranking losses, with the optimal
projections of CCA. We show the effectiveness of our approach for
cross-modality retrieval on three different scenarios (text-to-image,
audio-sheet-music and zero-shot retrieval), surpassing both Deep CCA and a
multi-view network using freely learned projections optimized by a pairwise
ranking loss, especially when little training data is available (the code for
all three methods is released at: https://github.com/CPJKU/cca_layer).
Comment: Preliminary version of a paper published in the International Journal
of Multimedia Information Retrieval
Heterogeneous Multi-task Learning for Human Pose Estimation with Deep Convolutional Neural Network
We propose a heterogeneous multi-task learning framework for human pose
estimation from monocular images with a deep convolutional neural network. In
particular, we simultaneously learn a pose-joint regressor and a sliding-window
body-part detector in a deep network architecture. We show that including the
body-part detection task helps to regularize the network, directing it to
converge to a good solution. We report competitive and state-of-the-art results on
several data sets. We also empirically show that the learned neurons in the
middle layer of our network are tuned to localized body parts.
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven to be an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we encourage remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.
Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine