Phase Transitions in Semidefinite Relaxations
Statistical inference problems arising within signal processing, data mining,
and machine learning naturally give rise to hard combinatorial optimization
problems. These problems become intractable when the dimensionality of the data
is large, as is often the case for modern datasets. A popular idea is to
construct convex relaxations of these combinatorial problems, which can be
solved efficiently for large scale datasets.
Semidefinite programming (SDP) relaxations are among the most powerful
methods in this family, and are surprisingly well-suited for a broad range of
problems where data take the form of matrices or graphs. It has been observed
several times that, when the `statistical noise' is small enough, SDP
relaxations correctly detect the underlying combinatorial structures.
In this paper we develop asymptotic predictions for several `detection
thresholds,' as well as for the estimation error above these thresholds. We
study some classical SDP relaxations for statistical problems motivated by
graph synchronization and community detection in networks. We map these
optimization problems to statistical mechanics models with vector spins, and
use non-rigorous techniques from statistical mechanics to characterize the
corresponding phase transitions. Our results clarify the effectiveness of SDP
relaxations in solving high-dimensional statistical problems.Comment: 71 pages, 24 pdf figure
ECA: High Dimensional Elliptical Component Analysis in non-Gaussian Distributions
We present a robust alternative to principal component analysis (PCA) ---
called elliptical component analysis (ECA) --- for analyzing high dimensional,
elliptically distributed data. ECA estimates the eigenspace of the covariance
matrix of the elliptical data. To cope with heavy-tailed elliptical
distributions, a multivariate rank statistic is exploited. At the model-level,
we consider two settings: either that the leading eigenvectors of the
covariance matrix are non-sparse or that they are sparse. Methodologically, we
propose ECA procedures for both non-sparse and sparse settings. Theoretically,
we provide both non-asymptotic and asymptotic analyses quantifying the
theoretical performances of ECA. In the non-sparse setting, we show that ECA's
performance is highly related to the effective rank of the covariance matrix.
In the sparse setting, the results are twofold: (i) We show that the sparse ECA
estimator based on a combinatoric program attains the optimal rate of
convergence; (ii) Based on some recent developments in estimating sparse
leading eigenvectors, we show that a computationally efficient sparse ECA
estimator attains the optimal rate of convergence under a suboptimal scaling.
Comment: to appear in JASA (T&M)
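The multivariate rank statistic at the heart of ECA, the multivariate Kendall's tau, can be sketched as below; the elliptical (multivariate-t-style) data generator and dimensions are illustrative assumptions, not the paper's simulation design.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 5

# Heavy-tailed elliptical data: a Gaussian core with diagonal scatter matrix,
# divided by a chi-square radial factor (multivariate t with 3 degrees of freedom).
evals = np.array([10.0, 1.0, 1.0, 1.0, 1.0])      # leading eigenvector is e_1
Z = rng.normal(size=(n, d)) * np.sqrt(evals)
w = rng.chisquare(df=3, size=n)
X = Z / np.sqrt(w / 3)[:, None]

# Multivariate Kendall's tau: average outer product of normalized pairwise
# differences. For elliptical data its eigenvectors coincide with those of
# the covariance matrix, which is what ECA exploits.
diffs = X[:, None, :] - X[None, :, :]             # (n, n, d) pairwise differences
norms = np.linalg.norm(diffs, axis=-1)
mask = norms > 0                                   # exclude i == j pairs
U = np.zeros_like(diffs)
U[mask] = diffs[mask] / norms[mask][:, None]
K = np.einsum('ijk,ijl->kl', U, U) / mask.sum()

# The leading eigenvector of K estimates the leading covariance eigenvector.
v = np.linalg.eigh(K)[1][:, -1]
alignment = abs(v[0])                              # true direction is e_1
print(f"alignment with true leading eigenvector: {alignment:.2f}")
```

Normalizing each pairwise difference discards the heavy-tailed radial part of the distribution while preserving directions, which is why the eigenspace estimate survives heavy tails.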
MATRIX REDUCTION IN NUMERICAL OPTIMIZATION
Matrix reduction by eliminating some terms in the expansion of a matrix has been applied to a variety of numerical problems in many different areas. Since matrix reduction has different purposes for particular problems, the reduced matrices also have different meanings. In regression problems in statistics, the reduced parts of the matrix are considered to be noise or observation error, so the given raw data are purified by the matrix reduction. In factor analysis and principal component analysis (PCA), the reduced parts are regarded as idiosyncratic (unsystematic) factors, which are not shared by multiple variables in common. In solving constrained convex optimization problems, the reduced terms correspond to unnecessary (inactive) constraints which do not help in the search for an optimal solution.
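The PCA-style reduction described above, splitting a data matrix into a common (systematic) part and an idiosyncratic remainder, can be sketched with a truncated SVD; the factor-model simulation below is an illustrative assumption, not data from the study.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, r = 100, 8, 2

# Data generated from r common factors plus idiosyncratic noise, the setting
# where PCA-style matrix reduction separates the two parts.
F = rng.normal(size=(n, r))                 # common factors
L = rng.normal(size=(r, p))                 # loadings
E = 0.1 * rng.normal(size=(n, p))           # idiosyncratic noise
X = F @ L + E

# Matrix reduction by truncated SVD: keep the r dominant terms of the
# expansion X = sum_i s_i u_i v_i^T and discard the rest as noise.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_systematic = (U[:, :r] * s[:r]) @ Vt[:r]
X_idiosyncratic = X - X_systematic          # the reduced-away part

rel_err = np.linalg.norm(X_systematic - F @ L) / np.linalg.norm(F @ L)
print(f"relative error of rank-{r} part vs true common component: {rel_err:.3f}")
```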
In using matrix reduction, it is both critical and difficult to determine how and how much we will reduce the matrix. This decision is very important since it determines the quality of the reduced matrix and the final solution. If we reduce too much, fundamental properties will be lost. On the other hand, if we reduce too little, we cannot expect enough benefit from the reduction. It is also a difficult decision because the criteria for the reduction must be based on the particular type of problem.
In this study, we investigate matrix reduction for three numerical optimization problems. First, the total least squares problem uses matrix reduction to remove noise from observed data that follow an underlying linear model. We propose a new method that makes the matrix reduction successful under relaxed noise assumptions. Second, we apply matrix reduction to the problem of estimating a covariance matrix of stock returns, which is used in financial portfolio optimization. We summarize all previously proposed estimation methods in a common framework and present a new and effective Tikhonov method. Third, we present a new algorithm that solves semidefinite programming problems while adaptively reducing inactive constraints. In this constraint reduction, the Schur complement matrix for the Newton equations is the object of the matrix reduction. For all three problems, we propose appropriate criteria to determine the intensity of the matrix reduction, and we verify the correctness of our criteria by experimental results and mathematical proofs.
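As a minimal sketch of the first of these problems, classical total least squares reduces the augmented matrix [A b] by deleting its smallest singular term; the model, noise levels, and dimensions below are illustrative assumptions, not the study's experiments.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 200, 3

# Linear model with noise in BOTH A and b, the setting total least squares targets.
x_true = np.array([1.0, -2.0, 0.5])
A_clean = rng.normal(size=(n, p))
b_clean = A_clean @ x_true
A = A_clean + 0.05 * rng.normal(size=(n, p))
b = b_clean + 0.05 * rng.normal(size=n)

# Classical TLS via SVD: form C = [A b], drop the smallest singular term,
# and read the solution off the last right singular vector.
C = np.column_stack([A, b])
Vt = np.linalg.svd(C, full_matrices=False)[2]
v = Vt[-1]                                   # right singular vector of sigma_min
x_tls = -v[:p] / v[p]

err = np.linalg.norm(x_tls - x_true) / np.linalg.norm(x_true)
print(f"relative error of TLS estimate: {err:.3f}")
```

Unlike ordinary least squares, which attributes all error to b, this reduction perturbs A and b jointly by the smallest amount (in Frobenius norm) that makes the system consistent.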