Spectral Method and Regularized MLE Are Both Optimal for Top-$K$ Ranking
This paper is concerned with the problem of top-$K$ ranking from pairwise
comparisons. Given a collection of $n$ items and a few pairwise comparisons
across them, one wishes to identify the set of $K$ items that receive the
highest ranks. To tackle this problem, we adopt the logistic parametric model
--- the Bradley-Terry-Luce model, where each item is assigned a latent
preference score, and where the outcome of each pairwise comparison depends
solely on the relative scores of the two items involved. Recent works have made
significant progress towards characterizing the performance (e.g. the mean
square error for estimating the scores) of several classical methods, including
the spectral method and the maximum likelihood estimator (MLE). However, where
they stand regarding top-$K$ ranking remains unsettled.
We demonstrate that under a natural random sampling model, the spectral
method alone, or the regularized MLE alone, is minimax optimal in terms of the
sample complexity --- the number of paired comparisons needed to ensure exact
top-$K$ identification, for the fixed dynamic range regime. This is
accomplished via optimal control of the entrywise error of the score estimates.
We complement our theoretical studies by numerical experiments, confirming that
both methods yield low entrywise errors for estimating the underlying scores.
Our theory is established via a novel leave-one-out trick, which proves
effective for analyzing both iterative and non-iterative procedures. Along the
way, we derive an elementary eigenvector perturbation bound for probability
transition matrices, which parallels the Davis-Kahan $\sin\Theta$ theorem for
symmetric matrices. This also allows us to close the gap between the $\ell_2$
error upper bound for the spectral method and the minimax lower limit.

Comment: Add discussions on the setting of the general condition number
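(Aside: a minimal simulation of the Bradley-Terry-Luce model and the
spectral method, for readers who want to see the pipeline end to end.
This sketch is illustrative rather than the paper's exact procedure: it
assumes a complete comparison graph, whereas the paper samples pairs at
random, and every parameter choice below is hypothetical.)

import numpy as np

rng = np.random.default_rng(0)
n, K, L = 50, 5, 30              # items, top-K size, comparisons per pair
theta = rng.normal(size=n)       # latent preference scores
w = np.exp(theta)

# BTL model: item i beats item j with probability w_i / (w_i + w_j)
wins = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        k = rng.binomial(L, w[i] / (w[i] + w[j]))
        wins[i, j], wins[j, i] = k, L - k

# Spectral method: build a Markov chain whose stationary distribution
# is proportional to the score vector w (rank-centrality construction).
d = 2 * n                                  # keeps off-diagonal row sums below 1
P = wins.T / (L * d)                       # P[i, j] proportional to how often j beats i
np.fill_diagonal(P, 1.0 - P.sum(axis=1))
vals, vecs = np.linalg.eig(P.T)            # stationary distribution solves pi P = pi
pi = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
pi /= pi.sum()

print("estimated top-K:", sorted(np.argsort(-pi)[:K].tolist()))
print("true top-K:     ", sorted(np.argsort(-theta)[:K].tolist()))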
Principal component analysis for big data
Big data is transforming our world, revolutionizing operations and analytics
everywhere, from financial engineering to biomedical sciences. The complexity
of big data often makes dimension reduction techniques necessary before
conducting statistical inference. Principal component analysis, commonly
referred to as PCA, has become an essential tool for multivariate data analysis
and unsupervised dimension reduction, the goal of which is to find a lower
dimensional subspace that captures most of the variation in the dataset. This
article provides an overview of methodological and theoretical developments of
PCA over the last decade, with a focus on its applications to big data analytics.
We first review the mathematical formulation of PCA and its theoretical
development from the viewpoint of perturbation analysis. We then briefly
discuss the relationship between PCA and factor analysis as well as its
applications to large covariance estimation and multiple testing. PCA also
finds important applications in many modern machine learning problems, and we
focus on community detection, ranking, mixture models, and manifold learning in
this paper.

Comment: review article, in press with Wiley StatsRef
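(Aside: the mathematical formulation reviewed above reduces to a few
lines of linear algebra. Below is a minimal PCA-via-SVD sketch on toy
data; the dimensions and variable names are hypothetical.)

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20)) @ rng.normal(size=(20, 20))  # toy correlated data

k = 3                                    # target subspace dimension
Xc = X - X.mean(axis=0)                  # center each variable
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt[:k]                      # top-k principal directions (loadings)
scores = Xc @ components.T               # reduced-dimension representation
explained = s[:k] ** 2 / np.sum(s ** 2)  # fraction of variance captured
print("variance explained by top", k, "PCs:", np.round(explained, 3))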
Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices
This paper is concerned with the interplay between statistical asymmetry and
spectral methods. Suppose we are interested in estimating a rank-1 and
symmetric matrix $M^{\star} \in \mathbb{R}^{n \times n}$, yet only a
randomly perturbed version $M$ is observed. The noise matrix
$M - M^{\star}$ is composed of zero-mean independent (but not
necessarily homoscedastic) entries and is, therefore, not symmetric in general.
This might arise, for example, when we have two independent samples for each
entry of $M^{\star}$ and arrange them into an {\em asymmetric} data
matrix $M$. The aim is to estimate the leading eigenvalue and
eigenvector of $M^{\star}$. We demonstrate that the leading eigenvalue
of the data matrix $M$ can be $O(\sqrt{n})$ times more accurate --- up
to some log factor --- than its (unadjusted) leading singular value in
eigenvalue estimation. Further, the perturbation of any linear form of the
leading eigenvector of $M$ --- say, entrywise eigenvector perturbation
--- is provably well-controlled. This eigen-decomposition approach is fully
adaptive to heteroscedasticity of noise without the need of careful bias
correction or any prior knowledge about the noise variance. We also provide
partial theory for the more general rank-$r$ case. The takeaway message is
this: arranging the data samples in an asymmetric manner and performing
eigen-decomposition could sometimes be beneficial.

Comment: accepted to Annals of Statistics, 2020. 37 pages
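(Aside: the takeaway is easy to check numerically. Below is a minimal
sketch with toy, hypothetical parameters (homoscedastic Gaussian noise,
a unit spike, n = 1000), comparing the leading eigenvalue of the
asymmetric data matrix against its unadjusted leading singular value as
estimates of the true leading eigenvalue. It illustrates the phenomenon
only, not the paper's analysis; the gap widens as n grows.)

import numpy as np

rng = np.random.default_rng(0)
n, lam_star, sigma = 1000, 1.0, 0.01     # toy choices; sigma*sqrt(n) < lam_star

u = rng.normal(size=n)
u /= np.linalg.norm(u)
M_star = lam_star * np.outer(u, u)       # rank-1 symmetric ground truth

# Independent zero-mean noise in every entry, so M is NOT symmetric
# (as if two independent samples per entry were arranged asymmetrically).
M = M_star + sigma * rng.standard_normal((n, n))

eig_est = np.real(np.linalg.eigvals(M)).max()    # leading eigenvalue of M
sv_est = np.linalg.svd(M, compute_uv=False)[0]   # leading singular value of M

print(f"eigenvalue error:     {abs(eig_est - lam_star):.4f}")
print(f"singular-value error: {abs(sv_est - lam_star):.4f}")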