Low Rank Matrix Completion with Exponential Family Noise
The matrix completion problem consists in reconstructing a matrix from a
sample of entries, possibly observed with noise. A popular class of estimators,
known as nuclear norm penalized estimators, is based on minimizing the sum of
a data fitting term and a nuclear norm penalization. Here, we investigate the
case where the noise distribution belongs to the exponential family and is
sub-exponential. Our framework allows for a general sampling scheme. We first
consider an estimator defined as the minimizer of the sum of a log-likelihood
term and a nuclear norm penalization and prove an upper bound on the Frobenius
prediction risk. The rate obtained improves on previous works on matrix
completion for exponential family noise. When the sampling distribution is known, we
propose another estimator and prove an oracle inequality w.r.t. the
Kullback-Leibler prediction risk, which translates immediately into an upper
bound on the Frobenius prediction risk. Finally, we show that all the rates
obtained are minimax optimal up to a logarithmic factor.
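As a minimal sketch of the estimator family this abstract describes, specialized to the Gaussian (squared-loss) member of the exponential family, a proximal-gradient loop can alternate a gradient step on the data-fitting term with singular-value soft-thresholding, the prox operator of the nuclear norm. The function names below are illustrative, not the paper's:

```python
import numpy as np

def svt(X, tau):
    """Singular-value soft-thresholding: the prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nuclear_norm_complete(Y, mask, lam, step=1.0, iters=200):
    """Proximal gradient on 0.5 * ||mask * (M - Y)||_F^2 + lam * ||M||_*.

    Y is the (zero-filled) observed matrix, mask is 1 on observed entries.
    """
    M = np.zeros_like(Y)
    for _ in range(iters):
        grad = mask * (M - Y)             # gradient of the quadratic data-fitting term
        M = svt(M - step * grad, step * lam)
    return M
```

With a small penalty level, the iterates interpolate the observed entries closely while the nuclear norm keeps the completed matrix (approximately) low-rank.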
Exponential Family Matrix Completion under Structural Constraints
We consider the matrix completion problem of recovering a structured matrix
from noisy and partial measurements. Recent works have proposed tractable
estimators with strong statistical guarantees for the case where the underlying
matrix is low-rank, and the measurements consist of a subset either of the
exact individual entries or of the entries perturbed by additive Gaussian
noise, which is thus implicitly suited for thin-tailed continuous data.
Arguably, common applications of matrix completion require estimators for (a)
heterogeneous data types, such as skewed-continuous, count, binary, etc., (b)
heterogeneous noise models (beyond Gaussian), which capture varied
uncertainty in the measurements, and (c) heterogeneous structural constraints
beyond low-rank, such as block-sparsity, or a superposition structure of
low-rank plus elementwise sparsity, among others. In this paper, we provide
a vastly unified framework for generalized matrix completion by considering a
matrix completion setting wherein the matrix entries are sampled from any
member of the rich family of exponential family distributions; and impose
general structural constraints on the underlying matrix, as captured by a
general regularizer. We propose a simple convex regularized M-estimator
for the generalized framework, and provide a unified and novel
statistical analysis for this general class of estimators. We finally
corroborate our theoretical results on simulated datasets.
Comment: 20 pages, 9 figures
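To make the regularized M-estimator concrete for one exponential-family member, here is a hedged sketch for binary (Bernoulli) entries with a nuclear-norm regularizer, i.e. the logistic specialization of the framework. Names and parameter choices are ours, not from the paper:

```python
import numpy as np

def svt(X, tau):
    """Prox of tau * nuclear norm (singular-value soft-thresholding)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def bernoulli_mestimator(Y, mask, lam, step=1.0, iters=300):
    """Regularized M-estimator for binary entries: proximal gradient on the
    observed negative Bernoulli log-likelihood plus lam * ||Theta||_*,
    where Theta holds the natural parameters (logits)."""
    Theta = np.zeros_like(Y, dtype=float)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Theta))   # mean parameter of the Bernoulli family
        grad = mask * (p - Y)              # exponential-family score: mean minus data
        Theta = svt(Theta - step * grad, step * lam)
    return Theta
```

Because the logistic loss has gradient Lipschitz constant 1/4, a unit step size keeps the proximal-gradient iterations monotone, so the penalized objective only decreases.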
On Low-rank Trace Regression under General Sampling Distribution
A growing number of modern statistical learning problems involve estimating a
large number of parameters from a (smaller) number of noisy observations. In a
subset of these problems (matrix completion, matrix compressed sensing, and
multi-task learning) the unknown parameters form a high-dimensional matrix B*,
and two popular approaches for the estimation are convex relaxation of
rank-penalized regression or non-convex optimization. It is also known that
these estimators satisfy near optimal error bounds under assumptions on rank,
coherence, or spikiness of the unknown matrix.
In this paper, we introduce a unifying technique for analyzing all of these
problems via both estimators that leads to short proofs for the existing
results as well as new results. Specifically, first we introduce a general
notion of spikiness for B* and consider a general family of estimators and
prove non-asymptotic error bounds for their estimation error. Our approach
relies on a generic recipe to prove restricted strong convexity for the
sampling operator of the trace regression. Second, and most notably, we prove
similar error bounds when the regularization parameter is chosen via K-fold
cross-validation. This result is significant in that existing theory on
cross-validated estimators does not apply to our setting, since our estimators are
not known to satisfy their required notion of stability. Third, we study
applications of our general results to four subproblems of (1) matrix
completion, (2) multi-task learning, (3) compressed sensing with Gaussian
ensembles, and (4) compressed sensing with factored measurements. For (1), (3),
and (4) we recover matching error bounds as those found in the literature, and
for (2) we obtain (to the best of our knowledge) the first such error bound. We
also demonstrate how our framework applies to the exact recovery problem in
(3) and (4).
Comment: 32 pages, 1 figure
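The cross-validation result above can be illustrated on the matrix-completion subproblem: the observed entries are split into K folds, the estimator is fit with each fold hidden, and the penalty level with the smallest held-out error is selected. This is a simplified sketch under our own naming, not the paper's code:

```python
import numpy as np

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def fit(Y, mask, lam, iters=150):
    """Nuclear-norm penalized least squares on the observed entries."""
    M = np.zeros_like(Y)
    for _ in range(iters):
        M = svt(M - mask * (M - Y), lam)
    return M

def cv_lambda(Y, mask, lams, K=3, seed=0):
    """Choose the regularization parameter by K-fold CV over the observed
    entries: each fold is hidden during fitting and scored by its
    held-out squared error, averaged over folds."""
    rng = np.random.default_rng(seed)
    obs = np.argwhere(mask > 0)
    rng.shuffle(obs)
    folds = np.array_split(obs, K)
    scores = []
    for lam in lams:
        err = 0.0
        for f in folds:
            train = mask.copy()
            train[f[:, 0], f[:, 1]] = 0.0   # hide this fold while fitting
            M = fit(Y, train, lam)
            err += np.mean((M[f[:, 0], f[:, 1]] - Y[f[:, 0], f[:, 1]]) ** 2)
        scores.append(err / K)
    return lams[int(np.argmin(scores))], scores
```

Note that the folds partition the observed entries, not the rows or columns, which is the natural resampling unit in trace regression with entrywise sampling.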
Bayesian Robust Tensor Factorization for Incomplete Multiway Data
We propose a generative model for robust tensor factorization in the presence
of both missing data and outliers. The objective is to explicitly infer the
underlying low-CP-rank tensor capturing the global information and a sparse
tensor capturing the local information (also considered as outliers), thus
providing the robust predictive distribution over missing entries. The
low-CP-rank tensor is modeled by multilinear interactions between multiple
latent factors on which the column sparsity is enforced by a hierarchical
prior, while the sparse tensor is modeled by a hierarchical view of the Student-t
distribution that associates an individual hyperparameter with each element
independently. For model learning, we develop an efficient closed-form
variational inference under a fully Bayesian treatment, which can effectively
prevent the overfitting problem and scales linearly with data size. In contrast
to existing related works, our method can perform model selection automatically
and implicitly, without the need to tune parameters. More specifically, it can
discover the ground-truth CP rank and automatically adapt the sparsity-inducing
priors to various types of outliers. In addition, the tradeoff between
the low-rank approximation and the sparse representation can be optimized in
the sense of maximum model evidence. The extensive experiments and comparisons
with many state-of-the-art algorithms on both synthetic and real-world datasets
demonstrate the superiority of our method from several perspectives.
Comment: in IEEE Transactions on Neural Networks and Learning Systems, 201
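The paper's method is fully Bayesian with variational inference; as a much simpler point-estimate analogue of the same low-rank-plus-outlier decomposition, one can run block-coordinate descent on a matrix (or tensor unfolding) with a nuclear-norm term for the global part and an l1 term for the outliers. All names below are ours, and this is a surrogate, not the paper's algorithm:

```python
import numpy as np

def svt(X, tau):
    """Prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Elementwise soft-thresholding: the prox of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def lowrank_plus_sparse(Y, lam_L, lam_S, iters=100):
    """Block-coordinate descent on
    0.5 * ||Y - L - S||_F^2 + lam_L * ||L||_* + lam_S * ||S||_1,
    splitting Y into a low-rank part L and a sparse outlier part S."""
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    for _ in range(iters):
        L = svt(Y - S, lam_L)    # exact minimizer over L with S fixed
        S = soft(Y - L, lam_S)   # exact minimizer over S with L fixed
    return L, S
```

Unlike the Bayesian treatment, this surrogate needs lam_L and lam_S to be tuned by hand, which is exactly the burden the paper's automatic model selection removes.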
Poisson Matrix Completion
We extend the theory of matrix completion to the case where we make Poisson
observations for a subset of entries of a low-rank matrix. We consider the
(now) usual matrix recovery formulation through maximum likelihood with proper
constraints on the matrix, and establish theoretical upper and lower bounds
on the recovery error. Our bounds are nearly optimal up to a logarithmic
factor. These bounds are obtained by adapting
the arguments used for one-bit matrix completion [Davenport et al., 2012]
(although these two problems are different in nature) and the adaptation
requires new techniques exploiting properties of the Poisson likelihood
function and tackling the difficulties posed by the locally sub-Gaussian
characteristic of the Poisson distribution. Our results highlight a few
important distinctions of Poisson matrix completion compared to the prior work
in matrix completion including having to impose a minimum signal-to-noise
requirement on each observed entry. We also develop an efficient iterative
algorithm and demonstrate its good performance in recovering solar flare
images.
Comment: Submitted to IEEE for publication
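A hedged sketch of a constrained-likelihood iteration in the spirit of this abstract: gradient steps on the observed Poisson negative log-likelihood sum(M - Y * log M), a light nuclear-norm shrinkage toward low rank, and clipping to entrywise bounds [lo, hi] (the minimum signal-to-noise requirement). The specific parametrization, bounds, and names are our assumptions, not the paper's algorithm:

```python
import numpy as np

def svt(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def poisson_complete(Y, mask, lam, lo=1.0, hi=25.0, step=0.01, iters=400):
    """Penalized maximum likelihood for Poisson counts Y observed on mask.
    M holds the entrywise Poisson rates, kept in [lo, hi]."""
    M = np.full_like(Y, (lo + hi) / 2.0, dtype=float)
    for _ in range(iters):
        grad = mask * (1.0 - Y / M)        # gradient of sum(M - Y * log M)
        M = svt(M - step * grad, step * lam)
        M = np.clip(M, lo, hi)             # enforce the entrywise bounds
    return M
```

The lower bound lo keeps the likelihood gradient well-behaved (and reflects the minimum signal-to-noise condition), while the small step size keeps the iteration stable.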