Exact and efficient top-K inference for multi-target prediction by querying separable linear relational models
Many complex multi-target prediction problems that concern large target
spaces are characterised by a need for efficient prediction strategies that
avoid the computation of predictions for all targets explicitly. Examples of
such problems emerge in several subfields of machine learning, such as
collaborative filtering, multi-label classification, dyadic prediction and
biological network inference. In this article we analyse efficient and exact
algorithms for computing the top-K predictions in the above problem settings,
using a general class of models that we refer to as separable linear relational
models. We show how to use those inference algorithms, which are modifications
of well-known information retrieval methods, in a variety of machine learning
settings. Furthermore, we study the possibility of scoring items incompletely,
while still retaining an exact top-K retrieval. Experimental results in several
application domains reveal that the so-called threshold algorithm is very
scalable, often performing many orders of magnitude more efficiently than the
naive approach.
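
To make the idea of exact top-K retrieval without scoring every target concrete, here is a minimal sketch of a threshold-style algorithm in the spirit of the information retrieval methods the abstract mentions. It assumes a score of the form s(y) = u · t(y), where `u` is a query-side vector and row `T[y]` is the target-side vector; the names `build_index`, `threshold_top_k`, `u`, `T` and `index` are illustrative and not taken from the paper.

```python
import heapq
import numpy as np

def build_index(T):
    """Precompute one sorted list per dimension: target ids by decreasing T[:, r]."""
    return np.argsort(-T, axis=0, kind="stable")

def threshold_top_k(u, T, index, k):
    """Return (ids of the k highest-scoring targets, number of targets fully scored).

    Assumes scores are u . T[y] and that k <= number of targets.
    """
    n, d = T.shape
    seen = set()
    heap = []  # min-heap of (score, target id) holding the current best k
    for depth in range(n):
        bound = 0.0  # upper bound on the score of any target not yet seen
        for r in range(d):
            # Walk list r from the top if u[r] >= 0, from the bottom otherwise,
            # so that contributions u[r] * T[y, r] are visited in decreasing order.
            pos = depth if u[r] >= 0 else n - 1 - depth
            item = int(index[pos, r])
            bound += u[r] * T[item, r]
            if item not in seen:
                seen.add(item)
                score = float(T[item] @ u)  # random access: full score
                if len(heap) < k:
                    heapq.heappush(heap, (score, item))
                elif score > heap[0][0]:
                    heapq.heapreplace(heap, (score, item))
        # Stop once the k-th best score cannot be beaten by any unseen target.
        if len(heap) == k and heap[0][0] >= bound:
            break
    best = sorted(heap, key=lambda p: (-p[0], p[1]))
    return [item for _, item in best], len(seen)
```

The per-dimension sorted lists depend only on `T`, so in this sketch they are built once and reused across queries; only the targets encountered before the stopping condition holds are scored in full.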
A Graphical Model Formulation of Collaborative Filtering Neighbourhood Methods with Fast Maximum Entropy Training
Item neighbourhood methods for collaborative filtering learn a weighted graph
over the set of items, where each item is connected to those it is most similar
to. The prediction of a user's rating on an item is then given by the user's ratings
of neighbouring items, weighted by their similarity. This paper presents a new
neighbourhood approach which we call item fields, whereby an undirected
graphical model is formed over the item graph. The resulting prediction rule is
a simple generalization of the classical approaches, which takes into account
non-local information in the graph, allowing its best results to be obtained
when using drastically fewer edges than other neighbourhood approaches. A fast
approximate maximum entropy training method based on the Bethe approximation is
presented, which uses a simple gradient ascent procedure. When using
precomputed sufficient statistics on the Movielens datasets, our method is
faster than maximum likelihood approaches by two orders of magnitude.
Comment: ICML201
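
As a rough illustration of the classical item-neighbourhood rule that the abstract generalises (not the item fields model itself), a prediction can be sketched as a similarity-weighted average of the user's ratings on neighbouring items. The function name `predict_rating`, the dictionary layout, and the normalisation by the sum of absolute similarities are assumptions made for this example.

```python
def predict_rating(user_ratings, neighbours):
    """Similarity-weighted average of the user's ratings on neighbouring items.

    user_ratings: {item: rating} for one user (hypothetical layout)
    neighbours:   {item: similarity} for the item being predicted
    Returns None when the user has rated none of the neighbours.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for item, sim in neighbours.items():
        if item in user_ratings:
            weighted_sum += sim * user_ratings[item]
            weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total > 0 else None
```

For instance, with ratings `{"a": 4.0, "b": 2.0}` and neighbour similarities `{"a": 0.75, "b": 0.25, "c": 0.5}`, only items `a` and `b` contribute to the prediction.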
Dynamic Matrix Factorization with Priors on Unknown Values
Advanced and effective collaborative filtering methods based on explicit
feedback assume that unknown ratings do not follow the same model as the
observed ones (not missing at random). In this work, we build on this
assumption, and introduce a novel dynamic matrix factorization framework that
allows an explicit prior to be set on unknown values. When new ratings, users, or
items enter the system, we can update the factorization in time independent of
the size of data (number of users, items and ratings). Hence, we can quickly
recommend items even to very recent users. We test our methods on three large
datasets, including two very sparse ones, in static and dynamic conditions. In
each case, we outrank state-of-the-art matrix factorization methods that do not
use a prior on unknown ratings.
Comment: in the Proceedings of 21st ACM SIGKDD Conference on Knowledge
Discovery and Data Mining 201