14,337 research outputs found

    On the Power of Adaptivity in Matrix Completion and Approximation

    Full text link
    We consider the related tasks of matrix completion and matrix approximation from missing data and propose adaptive sampling procedures for both problems. We show that adaptive sampling allows one to eliminate standard incoherence assumptions on the matrix row space that are necessary for passive sampling procedures. For exact recovery of a low-rank matrix, our algorithm judiciously selects a few columns to observe in full and, with few additional measurements, projects the remaining columns onto their span. This algorithm exactly recovers an n×nn \times n rank rr matrix using O(nrμ0log2(r))O(nr\mu_0 \log^2(r)) observations, where μ0\mu_0 is a coherence parameter on the column space of the matrix. In addition to completely eliminating any row space assumptions that have pervaded the literature, this algorithm enjoys a better sample complexity than any existing matrix completion algorithm. To certify that this improvement is due to adaptive sampling, we establish that row space coherence is necessary for passive sampling algorithms to achieve non-trivial sample complexity bounds. For constructing a low-rank approximation to a high-rank input matrix, we propose a simple algorithm that thresholds the singular values of a zero-filled version of the input matrix. The algorithm computes an approximation that is nearly as good as the best rank-rr approximation using O(nrμlog2(n))O(nr\mu \log^2(n)) samples, where μ\mu is a slightly different coherence parameter on the matrix columns. Again we eliminate assumptions on the row space

    Provable Sparse Tensor Decomposition

    Full text link
    We propose a novel sparse tensor decomposition method, namely Tensor Truncated Power (TTP) method, that incorporates variable selection into the estimation of decomposition components. The sparsity is achieved via an efficient truncation step embedded in the tensor power iteration. Our method applies to a broad family of high dimensional latent variable models, including high dimensional Gaussian mixture and mixtures of sparse regressions. A thorough theoretical investigation is further conducted. In particular, we show that the final decomposition estimator is guaranteed to achieve a local statistical rate, and further strengthen it to the global statistical rate by introducing a proper initialization procedure. In high dimensional regimes, the obtained statistical rate significantly improves those shown in the existing non-sparse decomposition methods. The empirical advantages of TTP are confirmed in extensive simulated results and two real applications of click-through rate prediction and high-dimensional gene clustering.Comment: To Appear in JRSS-
    corecore