58 research outputs found
Noisy low-rank matrix completion with general sampling distribution
In the present paper, we consider the problem of matrix completion with
noise. Unlike previous works, we consider a quite general sampling distribution
and we do not need to know or to estimate the variance of the noise. Two new
nuclear-norm penalized estimators are proposed, one of them of "square-root"
type. We analyse their performance under high-dimensional scaling and provide
non-asymptotic bounds on the Frobenius norm error. Up to a logarithmic factor,
these performance guarantees are minimax optimal in a number of circumstances.
Comment: Published at http://dx.doi.org/10.3150/12-BEJ486 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Matrix completion by singular value thresholding: sharp bounds
We consider the matrix completion problem where the aim is to estimate a
large data matrix for which only a relatively small random subset of its
entries is observed. Popular approaches to the matrix completion problem are
iterative thresholding methods. In spite of their empirical success, the
theoretical guarantees of such iterative thresholding methods are poorly
understood. The goal of this paper is to provide strong theoretical
guarantees, similar to those obtained for nuclear-norm penalization methods and
one step thresholding methods, for an iterative thresholding algorithm which is
a modification of the softImpute algorithm. An important consequence of our
result is the exact minimax optimal rates of convergence for the matrix completion
problem, which were known until now only up to a logarithmic factor
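As a rough illustration of the kind of iterative thresholding method this abstract refers to, the sketch below implements a generic soft-impute-style loop with NumPy: fill the unobserved entries with the current estimate, then soft-threshold the singular values. It is not the modified algorithm analysed in the paper; the function names, the stopping rule, and the parameters `lam`, `n_iter` and `tol` are assumptions made for the example.

```python
import numpy as np


def soft_threshold_svd(M, lam):
    """Shrink every singular value of M by lam, clipping at zero."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    return (U * s_shrunk) @ Vt


def soft_impute(Y, mask, lam, n_iter=200, tol=1e-6):
    """Generic soft-impute-style iteration (illustrative sketch only).

    Y    : 2-D array holding the observed values (entries where mask is
           False are ignored).
    mask : boolean array, True where an entry of Y is observed.
    lam  : soft-thresholding level (hypothetical tuning parameter).
    """
    X = np.zeros(Y.shape, dtype=float)
    for _ in range(n_iter):
        # Keep observed entries, fill the rest with the current estimate.
        filled = np.where(mask, Y, X)
        X_new = soft_threshold_svd(filled, lam)
        # Stop when the iterate barely changes (assumed stopping rule).
        if np.linalg.norm(X_new - X) <= tol * max(1.0, np.linalg.norm(X)):
            X = X_new
            break
        X = X_new
    return X
```

When every entry is observed, the loop reduces to a single soft-thresholding of the singular values of `Y`, which is a convenient sanity check on the implementation.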
Non-asymptotic approach to varying coefficient model
In the present paper we consider the varying coefficient model which
represents a useful tool for exploring dynamic patterns in many applications.
Existing methods typically provide asymptotic evaluation of precision of
estimation procedures under the assumption that the number of observations
tends to infinity. In practical applications, however, only a finite number of
measurements are available. In the present paper we focus on a non-asymptotic
approach to the problem. We propose a novel estimation procedure which is based
on recent developments in matrix estimation. In particular, for our estimator,
we obtain upper bounds for the mean squared and the pointwise estimation
errors. The obtained oracle inequalities are non-asymptotic and hold for finite
sample size
Optimal graphon estimation in cut distance
Consider the twin problems of estimating the connection probability matrix of
an inhomogeneous random graph and the graphon of a W-random graph. We establish
the minimax estimation rates with respect to the cut metric for classes of
block constant matrices and step function graphons. Surprisingly, our results
imply that, from the minimax point of view, the raw data, that is, the
adjacency matrix of the observed graph, is already optimal and more involved
procedures cannot improve the convergence rates for this metric. This
phenomenon contrasts with optimal rates of convergence with respect to other
classical distances for graphons such as the $l_1$ or $l_2$ metrics
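To make the cut metric more concrete, the following sketch computes the cut norm of a small matrix by brute force, taking the maximum of the absolute entry sum over all row subsets S and column subsets T. The normalisation by the number of entries is one common convention and is an assumption here, as are the function names; the enumeration is exponential in the number of rows, so it is only meant for toy examples.

```python
from itertools import product


def cut_norm(A):
    """Brute-force cut norm of a small matrix given as a list of rows.

    Returns max over row subsets S and column subsets T of
    |sum_{i in S, j in T} A[i][j]|, divided by the number of entries
    (assumed normalisation).
    """
    n_rows, n_cols = len(A), len(A[0])
    best = 0.0
    for S in product([0, 1], repeat=n_rows):
        # Column sums restricted to the rows selected by S.
        col_sums = [
            sum(A[i][j] for i in range(n_rows) if S[i]) for j in range(n_cols)
        ]
        # For a fixed S, the best T keeps all positive or all negative sums.
        pos = sum(c for c in col_sums if c > 0)
        neg = -sum(c for c in col_sums if c < 0)
        best = max(best, pos, neg)
    return best / (n_rows * n_cols)


def cut_distance(A, B):
    """Cut norm of the entrywise difference A - B (same shapes assumed)."""
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return cut_norm(diff)
```

For example, under these conventions `cut_distance(adjacency, probabilities)` would measure how far an observed adjacency matrix is from a candidate connection probability matrix, with both arguments being hypothetical inputs supplied by the reader.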
Constructing confidence sets for the matrix completion problem
In the present note we consider the problem of constructing honest and
adaptive confidence sets for the matrix completion problem. For the Bernoulli
model with known variance of the noise we provide a realizable method for
constructing confidence sets that adapt to the unknown rank of the true matrix
Link Prediction in the Stochastic Block Model with Outliers
The Stochastic Block Model is a popular model for network analysis in the presence of community structure. However, in numerous examples, the assumptions underlying this classical model are violated by the behaviour of a small number of outlier nodes such as hubs, nodes with mixed membership profiles, or corrupted nodes. In addition, real-life networks are likely to be incomplete, due to non-response or machine failures. We introduce a new algorithm to estimate the connection probabilities in a network, which is robust to both outlier nodes and missing observations. Under fairly general assumptions, this method detects the outliers and achieves the best known error for the estimation of connection probabilities with polynomial computation cost. In addition, we prove sub-linear convergence of our algorithm. We provide a simulation study which demonstrates the good behaviour of the method in terms of outlier selection and prediction of the missing links