
    Sparse CCA: Adaptive Estimation and Computational Barriers

    Canonical correlation analysis is a classical technique for exploring the relationship between two sets of variables. It has important applications in analyzing high dimensional datasets originating from genomics, imaging, and other fields. This paper considers adaptive minimax and computationally tractable estimation of leading sparse canonical coefficient vectors in high dimensions. First, we establish separate minimax estimation rates for the canonical coefficient vectors of each set of random variables under no structural assumption on the marginal covariance matrices. Second, we propose a computationally feasible estimator that attains the optimal rates adaptively under an additional sample size condition. Finally, we show that a sample size condition of this kind is needed for any randomized polynomial-time estimator to be consistent, assuming hardness of certain instances of the Planted Clique detection problem. The result is faithful to the Gaussian models used in the paper. As a byproduct, we obtain the first computational lower bounds for sparse PCA under the Gaussian single-spiked covariance model.
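
    As a point of reference for the estimation target, the sketch below computes the leading canonical correlation and coefficient vectors on synthetic Gaussian data using the classical whitened-SVD form of CCA. It is a minimal numpy illustration, not the paper's adaptive sparse estimator; the function name classical_cca, the ridge term reg, and the synthetic data-generating model are assumptions made for the example.

    import numpy as np

    def classical_cca(X, Y, reg=1e-8):
        """Leading canonical correlation and coefficient vectors via a whitened SVD.

        Classical (non-sparse) CCA for illustration only; the paper's adaptive
        sparse estimator additionally exploits sparsity of the coefficient vectors.
        """
        X = X - X.mean(axis=0)
        Y = Y - Y.mean(axis=0)
        n = X.shape[0]
        Sxx = X.T @ X / n + reg * np.eye(X.shape[1])   # marginal covariance of X (regularized)
        Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])   # marginal covariance of Y (regularized)
        Sxy = X.T @ Y / n                              # cross-covariance

        def inv_sqrt(S):
            vals, vecs = np.linalg.eigh(S)
            return vecs @ np.diag(vals ** -0.5) @ vecs.T

        # Canonical pairs are the singular vectors of the whitened cross-covariance
        M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
        U, s, Vt = np.linalg.svd(M)
        a = inv_sqrt(Sxx) @ U[:, 0]     # leading canonical coefficient vector for X
        b = inv_sqrt(Syy) @ Vt[0, :]    # leading canonical coefficient vector for Y
        return s[0], a, b

    # Synthetic Gaussian data sharing a single latent factor (illustrative parameters)
    rng = np.random.default_rng(0)
    n, p, q = 500, 30, 40
    z = rng.standard_normal((n, 1))
    X = 0.5 * z @ rng.standard_normal((1, p)) + rng.standard_normal((n, p))
    Y = 0.5 * z @ rng.standard_normal((1, q)) + rng.standard_normal((n, q))
    rho, a, b = classical_cca(X, Y)
    print("estimated leading canonical correlation:", round(rho, 3))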

    Computational Hardness of Certifying Bounds on Constrained PCA Problems

    Given a random n×n symmetric matrix W drawn from the Gaussian orthogonal ensemble (GOE), we consider the problem of certifying an upper bound on the maximum value of the quadratic form x⊤Wx over all vectors x in a constraint set S ⊂ Rⁿ. For a certain class of normalized constraint sets S we show that, conditional on certain complexity-theoretic assumptions, there is no polynomial-time algorithm certifying a better upper bound than the largest eigenvalue of W. A notable special case included in our results is the hypercube S = {±1/√n}ⁿ, which corresponds to the problem of certifying bounds on the Hamiltonian of the Sherrington-Kirkpatrick spin glass model from statistical physics. Our proof proceeds in two steps. First, we give a reduction from the detection problem in the negatively-spiked Wishart model to the above certification problem. We then give evidence that this Wishart detection problem is computationally hard below the classical spectral threshold, by showing that no low-degree polynomial can (in expectation) distinguish the spiked and unspiked models. This method for identifying computational thresholds was proposed in a sequence of recent works on the sum-of-squares hierarchy, and is believed to be correct for a large class of problems. Our proof can be seen as constructing a distribution over symmetric matrices that appears computationally indistinguishable from the GOE, yet is supported on matrices whose maximum quadratic form over x ∈ S is much larger than that of a GOE matrix.
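
    To make the hypercube special case concrete, here is a small numerical sketch (an illustration only, not the paper's reduction). It samples a GOE matrix normalized so that the largest eigenvalue is about 2, then uses a greedy sign-flip heuristic to lower-bound the maximum of x⊤Wx over S = {±1/√n}ⁿ. The true maximum is known to approach roughly 2 × 0.763 ≈ 1.53 (the Sherrington-Kirkpatrick ground-state energy), so the spectral certificate is far from tight; the hardness result says this gap cannot be closed by a polynomial-time certifier. The size n = 1000 and the greedy heuristic are illustrative choices, not part of the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000

    # Sample W from the GOE, normalized so that lambda_max(W) -> 2 as n grows
    A = rng.standard_normal((n, n))
    W = (A + A.T) / np.sqrt(2 * n)

    # Spectral certificate: x^T W x <= lambda_max(W) for every unit vector x
    lam_max = np.linalg.eigvalsh(W)[-1]

    # Greedy lower bound on max x^T W x over the hypercube x in {+-1/sqrt(n)}^n:
    # flip one sign at a time whenever the flip increases the quadratic form
    x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)
    improved = True
    while improved:
        improved = False
        g = W @ x                         # recompute g = W x at the start of each sweep
        for i in range(n):
            # Flipping x_i changes x^T W x by -4*x_i*g_i + 4*W_ii*x_i^2
            delta = -4.0 * x[i] * g[i] + 4.0 * W[i, i] * x[i] ** 2
            if delta > 1e-10:
                g -= 2.0 * x[i] * W[:, i] # keep g consistent with the flipped sign
                x[i] = -x[i]
                improved = True

    print(f"spectral certificate lambda_max : {lam_max:.3f}")    # about 2
    print(f"greedy hypercube value          : {x @ W @ x:.3f}")  # noticeably smaller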