37,196 research outputs found
Deep Learning based Recommender System: A Survey and New Perspectives
With the ever-growing volume of online information, recommender systems have
been an effective strategy to overcome such information overload. The utility
of recommender systems cannot be overstated, given its widespread adoption in
many web applications, along with its potential impact to ameliorate many
problems related to over-choice. In recent years, deep learning has garnered
considerable interest in many research fields such as computer vision and
natural language processing, owing not only to stellar performance but also the
attractive property of learning feature representations from scratch. The
influence of deep learning is also pervasive, recently demonstrating its
effectiveness when applied to information retrieval and recommender systems
research. Evidently, the field of deep learning in recommender system is
flourishing. This article aims to provide a comprehensive review of recent
research efforts on deep learning based recommender systems. More concretely,
we provide and devise a taxonomy of deep learning based recommendation models,
along with providing a comprehensive summary of the state-of-the-art. Finally,
we expand on current trends and provide new perspectives pertaining to this new
exciting development of the field.Comment: The paper has been accepted by ACM Computing Surveys.
https://doi.acm.org/10.1145/328502
Learning Compositional Visual Concepts with Mutual Consistency
Compositionality of semantic concepts in image synthesis and analysis is
appealing as it can help in decomposing known and generatively recomposing
unknown data. For instance, we may learn concepts of changing illumination,
geometry or albedo of a scene, and try to recombine them to generate physically
meaningful, but unseen data for training and testing. In practice however we
often do not have samples from the joint concept space available: We may have
data on illumination change in one data set and on geometric change in another
one without complete overlap. We pose the following question: How can we learn
two or more concepts jointly from different data sets with mutual consistency
where we do not have samples from the full joint space? We present a novel
answer in this paper based on cyclic consistency over multiple concepts,
represented individually by generative adversarial networks (GANs). Our method,
ConceptGAN, can be understood as a drop in for data augmentation to improve
resilience for real world applications. Qualitative and quantitative
evaluations demonstrate its efficacy in generating semantically meaningful
images, as well as one shot face verification as an example application.Comment: 10 pages, 8 figures, 4 tables, CVPR 201
Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text
Collaborative filtering (CF) is the key technique for recommender systems
(RSs). CF exploits user-item behavior interactions (e.g., clicks) only and
hence suffers from the data sparsity issue. One research thread is to integrate
auxiliary information such as product reviews and news titles, leading to
hybrid filtering methods. Another thread is to transfer knowledge from other
source domains such as improving the movie recommendation with the knowledge
from the book domain, leading to transfer learning methods. In real-world life,
no single service can satisfy a user's all information needs. Thus it motivates
us to exploit both auxiliary and source information for RSs in this paper. We
propose a novel neural model to smoothly enable Transfer Meeting Hybrid (TMH)
methods for cross-domain recommendation with unstructured text in an end-to-end
manner. TMH attentively extracts useful content from unstructured text via a
memory module and selectively transfers knowledge from a source domain via a
transfer network. On two real-world datasets, TMH shows better performance in
terms of three ranking metrics by comparing with various baselines. We conduct
thorough analyses to understand how the text content and transferred knowledge
help the proposed model.Comment: 11 pages, 7 figures, a full version for the WWW 2019 short pape
TransNFCM: Translation-Based Neural Fashion Compatibility Modeling
Identifying mix-and-match relationships between fashion items is an urgent
task in a fashion e-commerce recommender system. It will significantly enhance
user experience and satisfaction. However, due to the challenges of inferring
the rich yet complicated set of compatibility patterns in a large e-commerce
corpus of fashion items, this task is still underexplored. Inspired by the
recent advances in multi-relational knowledge representation learning and deep
neural networks, this paper proposes a novel Translation-based Neural Fashion
Compatibility Modeling (TransNFCM) framework, which jointly optimizes fashion
item embeddings and category-specific complementary relations in a unified
space via an end-to-end learning manner. TransNFCM places items in a unified
embedding space where a category-specific relation (category-comp-category) is
modeled as a vector translation operating on the embeddings of compatible items
from the corresponding categories. By this way, we not only capture the
specific notion of compatibility conditioned on a specific pair of
complementary categories, but also preserve the global notion of compatibility.
We also design a deep fashion item encoder which exploits the complementary
characteristic of visual and textual features to represent the fashion
products. To the best of our knowledge, this is the first work that uses
category-specific complementary relations to model the category-aware
compatibility between items in a translation-based embedding space. Extensive
experiments demonstrate the effectiveness of TransNFCM over the
state-of-the-arts on two real-world datasets.Comment: Accepted in AAAI 2019 conferenc
End-to-End Cross-Modality Retrieval with CCA Projections and Pairwise Ranking Loss
Cross-modality retrieval encompasses retrieval tasks where the fetched items
are of a different type than the search query, e.g., retrieving pictures
relevant to a given text query. The state-of-the-art approach to cross-modality
retrieval relies on learning a joint embedding space of the two modalities,
where items from either modality are retrieved using nearest-neighbor search.
In this work, we introduce a neural network layer based on Canonical
Correlation Analysis (CCA) that learns better embedding spaces by analytically
computing projections that maximize correlation. In contrast to previous
approaches, the CCA Layer (CCAL) allows us to combine existing objectives for
embedding space learning, such as pairwise ranking losses, with the optimal
projections of CCA. We show the effectiveness of our approach for
cross-modality retrieval on three different scenarios (text-to-image,
audio-sheet-music and zero-shot retrieval), surpassing both Deep CCA and a
multi-view network using freely learned projections optimized by a pairwise
ranking loss, especially when little training data is available (the code for
all three methods is released at: https://github.com/CPJKU/cca_layer).Comment: Preliminary version of a paper published in the International Journal
of Multimedia Information Retrieva
Coupled Deep Learning for Heterogeneous Face Recognition
Heterogeneous face matching is a challenge issue in face recognition due to
large domain difference as well as insufficient pairwise images in different
modalities during training. This paper proposes a coupled deep learning (CDL)
approach for the heterogeneous face matching. CDL seeks a shared feature space
in which the heterogeneous face matching problem can be approximately treated
as a homogeneous face matching problem. The objective function of CDL mainly
includes two parts. The first part contains a trace norm and a block-diagonal
prior as relevance constraints, which not only make unpaired images from
multiple modalities be clustered and correlated, but also regularize the
parameters to alleviate overfitting. An approximate variational formulation is
introduced to deal with the difficulties of optimizing low-rank constraint
directly. The second part contains a cross modal ranking among triplet domain
specific images to maximize the margin for different identities and increase
data for a small amount of training samples. Besides, an alternating
minimization method is employed to iteratively update the parameters of CDL.
Experimental results show that CDL achieves better performance on the
challenging CASIA NIR-VIS 2.0 face recognition database, the IIIT-D Sketch
database, the CUHK Face Sketch (CUFS), and the CUHK Face Sketch FERET (CUFSF),
which significantly outperforms state-of-the-art heterogeneous face recognition
methods.Comment: AAAI 201
- …