
    Phase Transitions in Semidefinite Relaxations

    Statistical inference problems arising within signal processing, data mining, and machine learning naturally give rise to hard combinatorial optimization problems. These problems become intractable when the dimensionality of the data is large, as is often the case for modern datasets. A popular idea is to construct convex relaxations of these combinatorial problems, which can be solved efficiently for large-scale datasets. Semidefinite programming (SDP) relaxations are among the most powerful methods in this family, and are surprisingly well-suited for a broad range of problems where data take the form of matrices or graphs. It has been observed several times that, when the 'statistical noise' is small enough, SDP relaxations correctly detect the underlying combinatorial structures. In this paper we develop asymptotic predictions for several 'detection thresholds,' as well as for the estimation error above these thresholds. We study some classical SDP relaxations for statistical problems motivated by graph synchronization and community detection in networks. We map these optimization problems to statistical mechanics models with vector spins, and use non-rigorous techniques from statistical mechanics to characterize the corresponding phase transitions. Our results clarify the effectiveness of SDP relaxations in solving high-dimensional statistical problems. Comment: 71 pages, 24 PDF figures
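    The abstract does not spell out the relaxation, but a minimal sketch of the classical SDP it refers to, for Z2 synchronization (one of the problems studied), can be written with numpy and cvxpy. The problem size, signal strength, and eigenvector rounding below are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np
import cvxpy as cp

# Toy Z2 synchronization instance: hidden +/-1 labels plus symmetric noise.
rng = np.random.default_rng(0)
n = 40
x = np.sign(rng.standard_normal(n))          # ground-truth labels
W = rng.standard_normal((n, n))
W = (W + W.T) / np.sqrt(2 * n)               # symmetric Gaussian noise
snr = 2.0                                    # illustrative signal strength
Y = (snr / n) * np.outer(x, x) + W           # observed matrix

# Classical SDP relaxation: maximize <Y, X> over PSD matrices with unit diagonal.
X = cp.Variable((n, n), symmetric=True)
problem = cp.Problem(cp.Maximize(cp.trace(Y @ X)),
                     [X >> 0, cp.diag(X) == 1])
problem.solve()

# Round the leading eigenvector of the SDP solution back to +/-1 labels.
x_hat = np.sign(np.linalg.eigh(X.value)[1][:, -1])
print("overlap with truth:", abs(x_hat @ x) / n)
```

    Sweeping snr in a toy model like this is the kind of experiment the paper's threshold predictions concern: below the detection threshold the rounded labels are uncorrelated with the truth, above it the overlap is bounded away from zero.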

    ECA: High Dimensional Elliptical Component Analysis in non-Gaussian Distributions

    We present a robust alternative to principal component analysis (PCA), called elliptical component analysis (ECA), for analyzing high-dimensional, elliptically distributed data. ECA estimates the eigenspace of the covariance matrix of the elliptical data. To cope with heavy-tailed elliptical distributions, a multivariate rank statistic is exploited. At the model level, we consider two settings: either the leading eigenvectors of the covariance matrix are non-sparse or they are sparse. Methodologically, we propose ECA procedures for both the non-sparse and sparse settings. Theoretically, we provide both non-asymptotic and asymptotic analyses quantifying the performance of ECA. In the non-sparse setting, we show that ECA's performance is closely tied to the effective rank of the covariance matrix. In the sparse setting, the results are twofold: (i) we show that the sparse ECA estimator based on a combinatorial program attains the optimal rate of convergence; (ii) building on recent developments in estimating sparse leading eigenvectors, we show that a computationally efficient sparse ECA estimator attains the optimal rate of convergence under a suboptimal scaling. Comment: to appear in JASA (T&M)
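    As a rough illustration of the rank-based idea, the sketch below computes the multivariate Kendall's tau, a multivariate rank statistic whose eigenvectors coincide with those of the covariance matrix under elliptical distributions. Whether this is exactly the statistic used in the paper is an assumption, and the multivariate-t test data are synthetic:

```python
import numpy as np

def multivariate_kendalls_tau(X):
    """Average outer product of the spatial signs of pairwise differences.
    For elliptical data, its eigenvectors match the covariance eigenvectors."""
    n, d = X.shape
    K = np.zeros((d, d))
    pairs = 0
    for i in range(n):
        diffs = X[i] - X[i + 1:]                         # all pairs (i, j), j > i
        norms = np.linalg.norm(diffs, axis=1, keepdims=True)
        signs = diffs / np.where(norms > 0, norms, 1.0)  # spatial signs
        K += signs.T @ signs
        pairs += len(diffs)
    return K / pairs

# Heavy-tailed elliptical test data: multivariate t with 3 degrees of freedom.
rng = np.random.default_rng(0)
d, n = 5, 500
A = rng.standard_normal((d, d))
g = rng.multivariate_normal(np.zeros(d), A @ A.T, size=n)
X = g / np.sqrt(rng.chisquare(df=3, size=(n, 1)) / 3)

K = multivariate_kendalls_tau(X)
leading = np.linalg.eigh(K)[1][:, -1]                    # estimated top eigenvector
```

    The appeal of the rank statistic is visible here: the spatial signs have unit norm regardless of how heavy the tails are, so the estimator is not destabilized by extreme observations the way the sample covariance is.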

    Matrix Reduction in Numerical Optimization

    Matrix reduction by eliminating some terms in the expansion of a matrix has been applied to a variety of numerical problems in many different areas. Since matrix reduction serves different purposes for particular problems, the reduced matrices also have different meanings. In regression problems in statistics, the reduced parts of the matrix are considered to be noise or observation error, so the given raw data are purified by the matrix reduction. In factor analysis and principal component analysis (PCA), the reduced parts are regarded as idiosyncratic (unsystematic) factors, which are not shared by multiple variables in common. In solving constrained convex optimization problems, the reduced terms correspond to unnecessary (inactive) constraints which do not help in the search for an optimal solution. In using matrix reduction, it is both critical and difficult to determine how, and how much, to reduce the matrix. This decision is important because it determines the quality of the reduced matrix and the final solution: if we reduce too much, fundamental properties are lost; if we reduce too little, we cannot expect enough benefit from the reduction. It is also a difficult decision because the criteria for the reduction must be based on the particular type of problem. In this study, we investigate matrix reduction for three numerical optimization problems. First, the total least squares problem uses matrix reduction to remove noise in observed data which follow an underlying linear model; we propose a new method that makes the matrix reduction successful under relaxed noise assumptions. Second, we apply matrix reduction to the problem of estimating a covariance matrix of stock returns, which is used in financial portfolio optimization; we summarize all the previously proposed estimation methods in a common framework and present a new and effective Tikhonov method. Third, we present a new algorithm to solve semidefinite programming problems by adaptively reducing inactive constraints; in this constraint reduction, the Schur complement matrix for the Newton equations is the object of the matrix reduction. For all three problems, we propose appropriate criteria to determine the intensity of the matrix reduction, and we verify the correctness of our criteria through experimental results and mathematical proofs.
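    As background for the first of the three problems, here is a minimal sketch of the classical SVD-based total least squares solution, in which the smallest singular direction of the augmented matrix is discarded as noise. This is the textbook method, not the thesis's method for relaxed noise assumptions:

```python
import numpy as np

def total_least_squares(A, b):
    """Classical TLS: drop the smallest singular direction of [A | b].
    Discarding that rank-one term is the matrix-reduction step."""
    Z = np.column_stack([A, b])
    _, _, Vt = np.linalg.svd(Z)
    v = Vt[-1]                       # right singular vector of smallest singular value
    return -v[:-1] / v[-1]           # TLS estimate of x

# Toy linear model with noise in both the data matrix and the observations.
rng = np.random.default_rng(1)
n, d = 200, 3
x_true = rng.standard_normal(d)
A0 = rng.standard_normal((n, d))
A = A0 + 0.05 * rng.standard_normal((n, d))      # noisy data matrix
b = A0 @ x_true + 0.05 * rng.standard_normal(n)  # noisy right-hand side
print(total_least_squares(A, b))                 # close to x_true
```

    Unlike ordinary least squares, which attributes all error to b, this reduction treats A and b symmetrically, which is why it suits errors-in-variables models.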