Discrete Factorization Machines for Fast Feature-based Recommendation
User and item features of side information are crucial for accurate
recommendation. However, the large number of feature dimensions, e.g., usually
larger than 10^7, results in expensive storage and computational cost. This
prohibits fast recommendation especially on mobile applications where the
computational resource is very limited. In this paper, we develop a generic
feature-based recommendation model, called Discrete Factorization Machine
(DFM), for fast and accurate recommendation. DFM binarizes the real-valued
model parameters (e.g., float32) of every feature embedding into binary codes
(e.g., boolean), and thus supports efficient storage and fast user-item score
computation. To avoid the severe quantization loss of the binarization, we
propose a convergent updating rule that resolves the challenging discrete
optimization of DFM. Through extensive experiments on two real-world datasets,
we show that 1) DFM consistently outperforms state-of-the-art binarized
recommendation models, and 2) DFM shows very competitive performance compared
to its real-valued version (FM), demonstrating the minimized quantization loss.
This work is accepted by IJCAI 2018.
Comment: Appeared in IJCAI 201
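The storage and scoring benefit of binary codes described in the abstract above can be illustrated with a minimal sketch. It assumes plain sign thresholding to obtain the codes, which is not DFM's learned discrete optimization (the abstract says DFM is designed to avoid the quantization loss of binarization). The function names `binarize`, `binary_score`, and `signs` are hypothetical. The sketch only shows why the inner product of two k-bit codes, read as +1/-1 vectors, reduces to an XOR and a bit count.

```python
def binarize(vec):
    """Pack the sign pattern of a real-valued vector into an int (bit i = 1 if vec[i] >= 0)."""
    code = 0
    for i, x in enumerate(vec):
        if x >= 0:
            code |= 1 << i
    return code

def binary_score(code_a, code_b, k):
    """Inner product of two k-bit codes read as +1/-1 vectors: k - 2 * Hamming distance."""
    hamming = bin(code_a ^ code_b).count("1")
    return k - 2 * hamming

def signs(vec):
    """Explicit +1/-1 vector, used only to check the bitwise shortcut."""
    return [1 if x >= 0 else -1 for x in vec]

# Toy embeddings (illustrative values, not from the paper).
user = [0.3, -1.2, 0.7, -0.1]
item = [0.5, 0.4, -0.9, -0.2]
k = len(user)

fast = binary_score(binarize(user), binarize(item), k)
slow = sum(a * b for a, b in zip(signs(user), signs(item)))
storage_ratio = (32 * k) // k  # float32 bits per dimension vs. 1 bit per dimension
```

Here `fast` and `slow` agree, and each dimension shrinks from 32 bits to 1 bit.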
Message-Passing Inference on a Factor Graph for Collaborative Filtering
This paper introduces a novel message-passing (MP) framework for the
collaborative filtering (CF) problem associated with recommender systems. We
model the movie-rating prediction problem popularized by the Netflix Prize,
using a probabilistic factor graph model and study the model by deriving
generalization error bounds in terms of the training error. Based on the model,
we develop a new MP algorithm, termed IMP, for learning the model. To show the
superiority of the IMP algorithm, we compare it with the closely related
expectation-maximization (EM) based algorithm and a number of other matrix
completion algorithms. Our simulation results on Netflix data show that, while
the methods perform similarly with large amounts of data, the IMP algorithm is
superior for small amounts of data. This alleviates the cold-start problem of
CF systems in practice. Another advantage of the IMP algorithm is that it can
be analyzed using the technique of density evolution (DE) that was originally
developed for MP decoding of error-correcting codes.
Collaborative Deep Learning for Recommender Systems
Collaborative filtering (CF) is a successful approach commonly used by many
recommender systems. Conventional CF-based methods use the ratings given to
items by users as the sole source of information for learning to make
recommendations. However, the ratings are often very sparse in many
applications, causing CF-based methods to degrade significantly in their
recommendation performance. To address this sparsity problem, auxiliary
information such as item content information may be utilized. Collaborative
topic regression (CTR) is an appealing recent method taking this approach which
tightly couples the two components that learn from two different sources of
information. Nevertheless, the latent representation learned by CTR may not be
very effective when the auxiliary information is very sparse. To address this
problem, we generalize recent advances in deep learning from i.i.d. input to
non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian
model called collaborative deep learning (CDL), which jointly performs deep
representation learning for the content information and collaborative filtering
for the ratings (feedback) matrix. Extensive experiments on three real-world
datasets from different domains show that CDL can significantly advance the
state of the art.
Multi-modal Image Processing based on Coupled Dictionary Learning
In real-world scenarios, many data processing problems often involve
heterogeneous images associated with different imaging modalities. Since these
multimodal images originate from the same phenomenon, it is realistic to assume
that they share common attributes or characteristics. In this paper, we propose
a multi-modal image processing framework based on coupled dictionary learning
to capture similarities and disparities between different image modalities. In
particular, our framework can capture favorable structure similarities across
different image modalities such as edges, corners, and other elementary
primitives in a learned sparse transform domain, instead of the original pixel
domain, that can be used to improve a number of image processing tasks such as
denoising, inpainting, or super-resolution. Practical experiments demonstrate
that incorporating multimodal information using our framework brings notable
benefits.
Comment: SPAWC 2018, 19th IEEE International Workshop On Signal Processing
Advances In Wireless Communication
Content-aware Neural Hashing for Cold-start Recommendation
Content-aware recommendation approaches are essential for providing
meaningful recommendations for new (i.e., cold-start) items
in a recommender system. We present a content-aware neural hashing-based
collaborative filtering approach (NeuHash-CF), which generates binary hash
codes for users and items, such that the highly efficient Hamming distance can
be used for estimating user-item relevance. NeuHash-CF is modelled as an
autoencoder architecture, consisting of two joint hashing components for
generating user and item hash codes. Inspired by semantic hashing, the item
hashing component generates a hash code directly from an item's content
information (i.e., it generates cold-start and seen item hash codes in the same
manner). This contrasts with existing state-of-the-art models, which treat the two
item cases separately. The user hash codes are generated directly based on user
id, through learning a user embedding matrix. We show experimentally that
NeuHash-CF significantly outperforms state-of-the-art baselines by up to 12%
NDCG and 13% MRR in cold-start recommendation settings, and up to 4% in both
NDCG and MRR in standard settings where all items are present while training.
Our approach uses 2-4x shorter hash codes, while obtaining the same or better
performance compared to the state of the art, thus also enabling a
notable storage reduction.
Comment: Accepted to SIGIR 202
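The use of Hamming distance to estimate user-item relevance can be sketched as follows. This is an illustration only: the 8-bit codes and item ids are made up, the helper names `hamming` and `rank_items` are hypothetical, and the autoencoder that NeuHash-CF uses to produce the codes is not modelled here. The sketch covers only the retrieval step once hash codes exist.

```python
def hamming(a, b):
    """Number of bit positions in which two integer-encoded hash codes differ."""
    return bin(a ^ b).count("1")

def rank_items(user_code, item_codes):
    """Return item ids ordered from smallest to largest Hamming distance to the user code."""
    return sorted(item_codes, key=lambda item_id: hamming(user_code, item_codes[item_id]))

# Hypothetical 8-bit codes; a real system would use longer, learned codes.
user_code = 0b10110010
item_codes = {
    "a": 0b10110011,  # differs from the user code in 1 bit
    "b": 0b01001101,  # differs in all 8 bits
    "c": 0b10000010,  # differs in 2 bits
}

ranking = rank_items(user_code, item_codes)
```

Because each comparison is an XOR followed by a bit count, ranking cost scales with code length, so shorter codes reduce both storage and comparison work.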