Message-Passing Inference on a Factor Graph for Collaborative Filtering
This paper introduces a novel message-passing (MP) framework for the
collaborative filtering (CF) problem associated with recommender systems. We
model the movie-rating prediction problem popularized by the Netflix Prize,
using a probabilistic factor graph model and study the model by deriving
generalization error bounds in terms of the training error. Based on this model,
we develop a new MP algorithm, termed IMP. To demonstrate the strengths of the
IMP algorithm, we compare it with the closely related
expectation-maximization (EM) based algorithm and a number of other matrix
completion algorithms. Our simulation results on Netflix data show that, while
the methods perform similarly with large amounts of data, the IMP algorithm is
superior for small amounts of data. This mitigates the cold-start problem of
CF systems in practice. Another advantage of the IMP algorithm is that it can
be analyzed using the technique of density evolution (DE), which was originally
developed for MP decoding of error-correcting codes.
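As a hypothetical illustration of the kind of factor-graph computation the IMP algorithm builds on (these are generic sum-product updates, not the paper's actual IMP rules), messages between variable and factor nodes can be sketched as:

```python
import numpy as np

def variable_to_factor(incoming):
    """Sum-product message from a variable node: elementwise product of the
    incoming messages from its other neighboring factors, renormalized."""
    msg = np.prod(incoming, axis=0)
    return msg / msg.sum()

def factor_to_variable(factor_table, msg_other):
    """Message from a pairwise factor f(x, y) to variable x: marginalize out y,
    weighting each value of y by the incoming message about y."""
    msg = factor_table @ msg_other
    return msg / msg.sum()
```

In a CF factor graph, the variables would be user/movie states and the factors would encode the observed ratings; density evolution then tracks the distribution of such messages across iterations.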
Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm
We show that matrix completion with trace-norm regularization can be
significantly hurt when entries of the matrix are sampled non-uniformly. We
introduce a weighted version of the trace-norm regularizer that works well also
with non-uniform sampling. Our experimental results demonstrate that the
weighted trace-norm regularization indeed yields significant gains on the
(highly non-uniformly sampled) Netflix dataset. Comment: 9 pages
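The weighted trace norm described here replaces ||M||_* with the trace norm of the matrix rescaled by the row and column sampling marginals. A minimal sketch (the marginal-based weighting follows the abstract; the concrete NumPy instantiation is our own):

```python
import numpy as np

def weighted_trace_norm(M, row_probs, col_probs):
    """Weighted trace norm ||diag(p)^(1/2) M diag(q)^(1/2)||_*, where p and q
    are the row/column sampling marginals (each summing to 1)."""
    W = np.sqrt(row_probs)[:, None] * M * np.sqrt(col_probs)[None, :]
    return np.linalg.norm(W, ord="nuc")
```

Under uniform sampling (p_i = 1/n, q_j = 1/m) this reduces to the ordinary trace norm scaled by 1/sqrt(nm), which is why the two regularizers coincide in the uniform case.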
Restricted strong convexity and weighted matrix completion: Optimal bounds with noise
We consider the matrix completion problem under a form of row/column weighted
entrywise sampling, including the case of uniform entrywise sampling as a
special case. We analyze the associated random observation operator, and prove
that with high probability, it satisfies a form of restricted strong convexity
with respect to weighted Frobenius norm. Using this property, we obtain as
corollaries a number of error bounds on matrix completion in the weighted
Frobenius norm under noisy sampling and for both exact and near low-rank
matrices. Our results are based on measures of the "spikiness" and
"low-rankness" of matrices that are less restrictive than the incoherence
conditions imposed in previous work. Our technique involves an M-estimator that
includes controls on both the rank and spikiness of the solution, and we
establish non-asymptotic error bounds in the weighted Frobenius norm for
recovering matrices lying within \ell_q-"balls" of bounded spikiness. Using
information-theoretic methods, we show that no algorithm can achieve better
estimates (up to a logarithmic factor) over these same sets, showing that our
conditions on matrices and associated rates are essentially optimal.
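The "spikiness" measure referred to above is commonly defined as the ratio of a matrix's scaled maximum entry to its Frobenius norm. A minimal sketch under that assumption:

```python
import numpy as np

def spikiness(M):
    """Spikiness ratio sqrt(n*m) * ||M||_inf / ||M||_F: equals 1 for a
    perfectly flat matrix and grows up to sqrt(n*m) for a single spike."""
    n, m = M.shape
    return np.sqrt(n * m) * np.abs(M).max() / np.linalg.norm(M)
```

Bounding this ratio is a weaker requirement than the incoherence conditions of earlier matrix-completion work, since it only forbids the mass of the matrix from concentrating on a few entries.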
Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction
We study the interplay between surrogate methods for structured prediction
and techniques from multitask learning designed to leverage relationships
between surrogate outputs. We propose an efficient algorithm based on trace
norm regularization which, unlike previous methods, does not require
explicit knowledge of the coding/decoding functions of the surrogate framework.
As a result, our algorithm can be applied to the broad class of problems in
which the surrogate space is large or even infinite dimensional. We study
excess risk bounds for trace norm regularized structured prediction, implying
the consistency and learning rates for our estimator. We also identify relevant
regimes in which our approach can enjoy better generalization performance than
previous methods. Numerical experiments on ranking problems indicate that
enforcing low-rank relations among surrogate outputs may indeed provide a
significant advantage in practice. Comment: 42 pages, 1 table
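In a finite-dimensional surrogate space, trace-norm regularization across surrogate outputs amounts to penalizing the nuclear norm of a shared weight matrix. A hypothetical least-squares instantiation (the paper works with general surrogate losses and possibly infinite-dimensional spaces; `multitask_objective`, `X`, `Y`, and `lam` are our own illustrative names):

```python
import numpy as np

def multitask_objective(W, X, Y, lam):
    """Trace-norm regularized least squares across tasks:
    (1/n) * ||X W - Y||_F^2 + lam * ||W||_*.
    The nuclear-norm penalty encourages low-rank relations
    between the columns of W (one column per surrogate output)."""
    n = X.shape[0]
    residual = X @ W - Y
    return (residual ** 2).sum() / n + lam * np.linalg.norm(W, ord="nuc")
```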
1-Bit Matrix Completion
In this paper we develop a theory of matrix completion for the extreme case
of noisy 1-bit observations. Instead of observing a subset of the real-valued
entries of a matrix M, we obtain a small number of binary (1-bit) measurements
generated according to a probability distribution determined by the real-valued
entries of M. The central question we ask is whether or not it is possible to
obtain an accurate estimate of M from this data. In general this would seem
impossible, but we show that the maximum likelihood estimate under a suitable
constraint returns an accurate estimate of M when ||M||_{\infty} <= \alpha, and
rank(M) <= r. If the log-likelihood is a concave function (e.g., the logistic
or probit observation models), then we can obtain this maximum likelihood
estimate by optimizing a convex program. In addition, we also show that if
instead of recovering M we simply wish to obtain an estimate of the
distribution generating the 1-bit measurements, then we can eliminate the
requirement that ||M||_{\infty} <= \alpha. For both cases, we provide lower
bounds showing that these estimates are near-optimal. We conclude with a suite
of experiments that both verify the implications of our theorems as well as
illustrate some of the practical applications of 1-bit matrix completion. In
particular, we compare our program to standard matrix completion methods on
movie rating data in which users submit ratings from 1 to 5. In order to use
our program, we quantize this data to a single bit, but we allow the standard
matrix completion program to have access to the original ratings (from 1 to 5).
Surprisingly, the approach based on binary data performs significantly better.
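Under the logistic observation model mentioned above, the quantity the constrained maximum likelihood estimate optimizes is the log-likelihood of the observed signs, which is concave in M. A minimal sketch of its negative (the estimator additionally constrains ||M||_inf and the nuclear norm, which is omitted here):

```python
import numpy as np

def neg_log_likelihood(M, Y, mask):
    """Negative log-likelihood of 1-bit observations Y in {-1, +1} under the
    logistic model P(Y_ij = +1) = 1 / (1 + exp(-M_ij)), summed over observed
    entries (mask == True). Uses -log P(Y_ij) = log(1 + exp(-Y_ij * M_ij))."""
    z = Y[mask] * M[mask]
    return np.sum(np.log1p(np.exp(-z)))
```

Because this function is convex in M, minimizing it over a nuclear-norm ball yields the convex program described in the abstract.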
Exploring Algorithmic Limits of Matrix Rank Minimization under Affine Constraints
Many applications require recovering a matrix of minimal rank within an
affine constraint set, with matrix completion a notable special case. Because
the problem is NP-hard in general, it is common to replace the matrix rank with
the nuclear norm, which acts as a convenient convex surrogate. While elegant
theoretical conditions elucidate when this replacement is likely to be
successful, they are highly restrictive and convex algorithms fail when the
ambient rank is too high or when the constraint set is poorly structured.
Non-convex alternatives fare somewhat better when carefully tuned; however,
convergence to locally optimal solutions remains a continuing source of
failure. Against this backdrop we derive a deceptively simple and
parameter-free probabilistic PCA-like algorithm that is capable, over a wide
battery of empirical tests, of successful recovery even at the theoretical
limit where the number of measurements equals the degrees of freedom in the
unknown low-rank matrix. Somewhat surprisingly, this is possible even when the
affine constraint set is highly ill-conditioned. While proving general recovery
guarantees remains elusive for non-convex algorithms, Bayesian-inspired or
otherwise, we nonetheless show conditions whereby the underlying cost function
has a unique stationary point located at the global optimum; no existing cost
function we are aware of satisfies this same property. We conclude with a
simple computer vision application involving image rectification and a standard
collaborative filtering benchmark.
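The convex baseline this abstract contrasts against is nuclear-norm minimization, whose core primitive is singular value thresholding, the proximal operator of tau * ||.||_*. A minimal sketch of that step (the paper's own PCA-like algorithm is different and parameter-free):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: prox of tau * ||.||_* at X.
    Shrinks each singular value toward zero by tau, truncating at zero,
    which is what drives convex nuclear-norm solvers toward low rank."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt
```

Iterating such a step within a projected/proximal gradient loop gives the standard convex matrix-completion algorithms whose limits the abstract discusses.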
Link Prediction in Graphs with Autoregressive Features
In this paper, we consider the problem of link prediction in time-evolving
graphs. We assume that certain graph features, such as the node degree, follow
a vector autoregressive (VAR) model and we propose to use this information to
improve the accuracy of prediction. Our strategy involves a joint optimization
procedure over the space of adjacency matrices and VAR matrices which takes
into account both sparsity and low rank properties of the matrices. Oracle
inequalities are derived and illustrate the trade-offs in the choice of
smoothing parameters when modeling the joint effect of sparsity and low rank
property. The estimate is computed efficiently using proximal methods through a
generalized forward-backward algorithm. Comment: NIPS 201
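The VAR component of this approach can be illustrated in isolation: given a time series of graph feature vectors (e.g., node degrees per snapshot), a VAR(1) transition matrix can be fit by least squares. A minimal unregularized sketch (the paper's joint estimator additionally imposes sparsity and low-rank penalties via proximal methods):

```python
import numpy as np

def fit_var1(F):
    """Least-squares fit of a VAR(1) model F[t+1] ~ F[t] @ A, given a
    (T, d) array F of graph features, one d-vector per time step."""
    X, Y = F[:-1], F[1:]
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return A
```

The fitted A then supplies the autoregressive prediction of the next snapshot's features, which the joint optimization combines with the low-rank adjacency estimate.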