Efficient Estimation and Inference in Nonconvex Low-Complexity Models

Abstract

Low-complexity models serve as a pivotal tool for extracting key information from large-scale data, spanning a wide array of machine learning applications. However, due to computational limits and the nonconvexity issues that arise in high dimensions, modern data analysis calls for new procedures that allow significant reductions in sample size and computational cost while preserving near-optimal statistical accuracy. This thesis is devoted to the development of efficient estimation and inference methods for low-rank models, and to the exploration of the theoretical foundations underlying these approaches.

We start with statistical estimation of the column space of an unknown matrix given noisy and partial observations, focusing on the highly unbalanced case where the column dimension far exceeds the row dimension. We investigate an efficient spectral method and establish near-optimal statistical guarantees in terms of both $\ell_2$ and $\ell_{2,\infty}$ estimation accuracy. When applied to concrete statistical applications (tensor completion, principal component analysis, and community recovery), the general framework leads to significant performance improvements over the prior literature.

Moving beyond matrix-type data, we study a natural higher-order generalization: noisy tensor completion. Given that existing methods either are computationally expensive or fail to achieve statistically optimal performance, we propose a two-stage nonconvex algorithm that attains near-optimal computational efficiency (i.e., linear time complexity) and statistical accuracy (i.e., minimal sample complexity and optimal estimation accuracy) at once.

In addition to estimation, we further characterize the non-asymptotic distribution of the proposed nonconvex estimator down to fine scales, and develop a data-driven inferential procedure to construct optimal entrywise confidence intervals for the unknowns, one that fully adapts to unknown noise distributions and heteroscedastic noise. As a byproduct, the distributional theory justifies the statistical optimality of the nonconvex estimator: its $\ell_2$ estimation error is un-improvable, even in terms of the pre-constant. All of this is attained through an integrated consideration of statistics and nonconvex optimization, together with a fine-grained analysis of spectral methods.
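
To make the spectral step concrete, the following is a minimal sketch of a column-space estimator for the unbalanced regime, assuming a "diagonal-deleted" Gram-matrix variant of the kind standard in this setting; the function name and interface are illustrative, not the thesis code.

```python
import numpy as np

def spectral_column_space(Y: np.ndarray, r: int) -> np.ndarray:
    """Estimate the rank-r column space of a d1 x d2 matrix from a noisy
    observation Y, where d2 >> d1 (highly unbalanced regime).

    Returns a d1 x r matrix with orthonormal columns.
    """
    # Gram matrix of the rows (d1 x d1): aggregates information across
    # the long dimension d2.
    G = Y @ Y.T
    # Zero out the diagonal to remove the bias that the noise variance
    # contributes to diag(Y Y^T); this is the "diagonal deletion" step.
    np.fill_diagonal(G, 0.0)
    # The top-r eigenvectors of the adjusted (symmetric) Gram matrix
    # span the column-space estimate.
    eigvals, eigvecs = np.linalg.eigh(G)
    return eigvecs[:, np.argsort(eigvals)[::-1][:r]]
```

Working with the $d_1 \times d_1$ Gram matrix rather than the full $d_1 \times d_2$ matrix is what makes the unbalanced regime computationally attractive: the eigendecomposition touches only the short dimension.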
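
The two-stage template for noisy tensor completion can likewise be sketched. The snippet below assumes a symmetric rank-$r$ CP model and a squared loss over the observed entries; the names (`grad_step`, `idx`, `vals`, `p`, `eta`) are illustrative, and the precise loss and initialization in the thesis may differ in details.

```python
import numpy as np

def grad_step(U, idx, vals, p, eta):
    """One gradient-descent step on
        f(U) = (1/2p) * sum_{(i,j,k) in Omega} (sum_l U[i,l]U[j,l]U[k,l] - T_ijk)^2,
    where U is d x r and row U_m holds the m-th CP factor loadings."""
    i, j, k = idx                                    # arrays of observed indices
    res = np.sum(U[i] * U[j] * U[k], axis=1) - vals  # entrywise residuals
    G = np.zeros_like(U)
    # Chain rule: each observed entry (i, j, k) contributes to rows i, j, k.
    np.add.at(G, i, res[:, None] * (U[j] * U[k]))
    np.add.at(G, j, res[:, None] * (U[i] * U[k]))
    np.add.at(G, k, res[:, None] * (U[i] * U[j]))
    return U - (eta / p) * G

# Stage 1 (spectral initialization, omitted here): form U0 from the top-r
# eigenvectors of a suitable matricization of the observed tensor, rescaled
# by the sampling rate p.  Stage 2: iterate U = grad_step(U, idx, vals, p, eta).
```

Each iteration costs time proportional to the number of observed entries times $r$, which is the source of the linear-time behavior mentioned above.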
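
Finally, under the Gaussian approximation furnished by the distributional theory, the entrywise confidence intervals take the familiar plug-in form (notation illustrative: $\widehat{T}_{ijk}$ denotes the nonconvex estimate of an entry and $\widehat{\sigma}_{ijk}$ a data-driven standard-error estimate):

$$
\Big[\, \widehat{T}_{ijk} - z_{1-\alpha/2}\,\widehat{\sigma}_{ijk},\;\; \widehat{T}_{ijk} + z_{1-\alpha/2}\,\widehat{\sigma}_{ijk} \,\Big].
$$

Because $\widehat{\sigma}_{ijk}$ is estimated from the data itself, the interval adapts to unknown and heteroscedastic noise without requiring knowledge of the noise distribution.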
