Learning with the Weighted Trace-norm under Arbitrary Sampling Distributions
We provide rigorous guarantees on learning with the weighted trace-norm under
arbitrary sampling distributions. We show that the standard weighted trace-norm
might fail when the sampling distribution is not a product distribution (i.e.
when row and column indexes are not selected independently), present a
corrected variant for which we establish strong learning guarantees, and
demonstrate that it works better in practice. We provide guarantees when
weighting by either the true or empirical sampling distribution, and suggest
that even if the true distribution is known (or is uniform), weighting by the
empirical distribution may be beneficial.
A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion
We consider in this paper the problem of noisy 1-bit matrix completion under
a general non-uniform sampling distribution using the max-norm as a convex
relaxation for the rank. A max-norm constrained maximum likelihood estimate is
introduced and studied. The rate of convergence for the estimate is obtained.
Information-theoretical methods are used to establish a minimax lower bound
under the general sampling model. The minimax upper and lower bounds together
yield the optimal rate of convergence for the Frobenius norm loss.
Computational algorithms and numerical performance are also discussed.
Comment: 33 pages, 3 figures
Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm
We show that matrix completion with trace-norm regularization can be
significantly hurt when entries of the matrix are sampled non-uniformly. We
introduce a weighted version of the trace-norm regularizer that works well also
with non-uniform sampling. Our experimental results demonstrate that the
weighted trace-norm regularization indeed yields significant gains on the
(highly non-uniformly sampled) Netflix dataset.
Comment: 9 pages
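For reference, the weighted trace-norm regularizer mentioned in this abstract is usually written as below. The symbols are assumptions introduced here and do not appear in the listing: X is an n-by-m matrix, and p and q are the row and column marginals of the sampling distribution.

```latex
\[
  \|X\|_{\mathrm{tr}(p,q)}
  \;=\;
  \bigl\| \operatorname{diag}(\sqrt{p})\; X \;\operatorname{diag}(\sqrt{q}) \bigr\|_{\mathrm{tr}},
  \qquad
  p \in \mathbb{R}^{n},\; q \in \mathbb{R}^{m},\;
  \sum_{i} p_i = \sum_{j} q_j = 1 .
\]
% Under uniform sampling, p_i = 1/n and q_j = 1/m, so the weighted norm
% reduces to the ordinary trace-norm scaled by 1/sqrt(nm).
```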
Matrix Completion via Max-Norm Constrained Optimization
Matrix completion has been well studied under the uniform sampling model and
the trace-norm regularized methods perform well both theoretically and
numerically in such a setting. However, the uniform sampling model is
unrealistic for a range of applications and the standard trace-norm relaxation
can behave very poorly when the underlying sampling scheme is non-uniform.
In this paper we propose and analyze a max-norm constrained empirical risk
minimization method for noisy matrix completion under a general sampling model.
The optimal rate of convergence is established under the Frobenius norm loss in
the context of approximately low-rank matrix reconstruction. It is shown that
the max-norm constrained method is minimax rate-optimal and yields a unified
and robust approximate recovery guarantee, with respect to the sampling
distributions. The computational effectiveness of this method is also
discussed, based on first-order algorithms for solving convex optimizations
involving max-norm regularization.
Comment: 33 pages
On landmark selection and sampling in high-dimensional data analysis
In recent years, the spectral analysis of appropriately defined kernel
matrices has emerged as a principled way to extract the low-dimensional
structure often prevalent in high-dimensional data. Here we provide an
introduction to spectral methods for linear and nonlinear dimension reduction,
emphasizing ways to overcome the computational limitations currently faced by
practitioners with massive datasets. In particular, a data subsampling or
landmark selection process is often employed to construct a kernel based on
partial information, followed by an approximate spectral analysis termed the
Nyström extension. We provide a quantitative framework to analyse this
procedure, and use it to demonstrate algorithmic performance bounds on a range
of practical approaches designed to optimize the landmark selection process. We
compare the practical implications of these bounds by way of real-world
examples drawn from the field of computer vision, whereby low-dimensional
manifold structure is shown to emerge from high-dimensional video data streams.
Comment: 18 pages, 6 figures, submitted for publication
Noisy low-rank matrix completion with general sampling distribution
In the present paper, we consider the problem of matrix completion with
noise. Unlike previous works, we consider a quite general sampling distribution
and we do not need to know or to estimate the variance of the noise. Two new
nuclear-norm penalized estimators are proposed, one of them of "square-root"
type. We analyse their performance under high-dimensional scaling and provide
non-asymptotic bounds on the Frobenius norm error. Up to a logarithmic factor,
these performance guarantees are minimax optimal in a number of circumstances.
Comment: Published at http://dx.doi.org/10.3150/12-BEJ486 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
On Low-rank Trace Regression under General Sampling Distribution
A growing number of modern statistical learning problems involve estimating a
large number of parameters from a (smaller) number of noisy observations. In a
subset of these problems (matrix completion, matrix compressed sensing, and
multi-task learning) the unknown parameters form a high-dimensional matrix B*,
and two popular approaches for the estimation are convex relaxation of
rank-penalized regression or non-convex optimization. It is also known that
these estimators satisfy near optimal error bounds under assumptions on rank,
coherence, or spikiness of the unknown matrix.
In this paper, we introduce a unifying technique for analyzing all of these
problems via both estimators that leads to short proofs for the existing
results as well as new results. Specifically, first we introduce a general
notion of spikiness for B* and consider a general family of estimators and
prove non-asymptotic error bounds for their estimation error. Our approach
relies on a generic recipe to prove restricted strong convexity for the
sampling operator of the trace regression. Second, and most notably, we prove
similar error bounds when the regularization parameter is chosen via K-fold
cross-validation. This result is significant in that existing theory on
cross-validated estimators does not apply to our setting since our estimators are
not known to satisfy their required notion of stability. Third, we study
applications of our general results to four subproblems of (1) matrix
completion, (2) multi-task learning, (3) compressed sensing with Gaussian
ensembles, and (4) compressed sensing with factored measurements. For (1), (3),
and (4) we recover matching error bounds as those found in the literature, and
for (2) we obtain (to the best of our knowledge) the first such error bound. We
also demonstrate how our framework applies to the exact recovery problem in
(3) and (4).
Comment: 32 pages, 1 figure
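The abstract above refers to choosing the regularization parameter by K-fold cross-validation. The following is a minimal sketch of that selection loop only. It assumes a toy one-parameter ridge estimator as a stand-in for the paper's trace-regression estimators, and the names `k_fold_indices`, `ridge_slope`, and `cv_select` are hypothetical, introduced purely for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k disjoint, near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def ridge_slope(xs, ys, lam):
    """Toy penalized estimator (an assumption, not the paper's estimator):
    argmin_b sum_i (y_i - b * x_i)^2 + lam * b^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def cv_select(xs, ys, lambdas, k=5, seed=0):
    """Return (lambda, error) with the smallest mean held-out squared error."""
    n = len(xs)
    folds = k_fold_indices(n, k, seed)
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        total = 0.0
        for fold in folds:
            held_out = set(fold)
            train = [i for i in range(n) if i not in held_out]
            # Fit on the k-1 training folds, score on the held-out fold.
            b = ridge_slope([xs[i] for i in train], [ys[i] for i in train], lam)
            total += sum((ys[i] - b * xs[i]) ** 2 for i in fold)
        err = total / n
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
    ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
    print(cv_select(xs, ys, [0.0, 0.1, 1.0, 10.0]))
```

Each candidate value is scored on data that was not used to fit it, and the value with the lowest average held-out error is kept. Replacing `ridge_slope` with a matrix estimator leaves the selection loop unchanged.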