5 research outputs found

    Optimal Principal Component Analysis in Distributed and Streaming Models

    Full text link
    We study the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix A∈Rm×n,A \in R^{m \times n}, a rank parameter k<rank(A)k < rank(A), and an accuracy parameter 0<ϵ<10 < \epsilon < 1, we want to output an m×km \times k orthonormal matrix UU for which ∣∣A−UUTA∣∣F2≤(1+ϵ)⋅∣∣A−Ak∣∣F2, || A - U U^T A ||_F^2 \le \left(1 + \epsilon \right) \cdot || A - A_k||_F^2, where Ak∈Rm×nA_k \in R^{m \times n} is the best rank-kk approximation to AA. This paper provides improved algorithms for distributed PCA and streaming PCA.Comment: STOC2016 full versio
    corecore