26 research outputs found

    Quantile and pseudo-Huber Tensor Decomposition

    This paper studies the computational and statistical aspects of quantile and pseudo-Huber tensor decomposition. The integrated investigation of computational and statistical issues of robust tensor decomposition poses challenges due to the non-smooth loss functions. We propose a projected sub-gradient descent algorithm for tensor decomposition, equipped with either the pseudo-Huber loss or the quantile loss. In the presence of both heavy-tailed noise and Huber's contamination error, we demonstrate that our algorithm exhibits two-phase convergence under a carefully chosen step size schedule. The algorithm converges linearly and delivers an estimator that is statistically optimal with respect to both the heavy-tailed noise and arbitrary corruptions. Interestingly, our results achieve the first minimax optimal rates under Huber's contamination model for noisy tensor decomposition. Compared with existing literature, quantile tensor decomposition removes the requirement of specifying a sparsity level in advance, making it more flexible for practical use. We also demonstrate the effectiveness of our algorithms in the presence of missing values. Our methods are subsequently applied to the food balance dataset and the international trade flow dataset, both of which yield intriguing findings.
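
    For readers unfamiliar with the two robust losses named in this abstract, the sketch below writes out their standard textbook definitions and (sub)gradients for a scalar residual. It is not the paper's algorithm: the projection step, the tensor structure and the step size schedule are not shown, and the function names and the parameters `delta` and `tau` are illustrative assumptions.

```python
import math

# Standard definitions only; names and defaults are illustrative assumptions.

def pseudo_huber(r, delta=1.0):
    """Pseudo-Huber loss: roughly r^2 / 2 near zero, roughly delta * |r| for large |r|."""
    return delta ** 2 * (math.sqrt(1.0 + (r / delta) ** 2) - 1.0)

def pseudo_huber_grad(r, delta=1.0):
    """Derivative of the pseudo-Huber loss; bounded in magnitude by delta."""
    return r / math.sqrt(1.0 + (r / delta) ** 2)

def quantile_loss(r, tau=0.5):
    """Quantile (check) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return r * (tau - (1.0 if r < 0 else 0.0))

def quantile_subgrad(r, tau=0.5):
    """One valid sub-gradient of the quantile loss (it is non-smooth at r = 0)."""
    if r > 0:
        return tau
    if r < 0:
        return tau - 1.0
    return 0.0  # any value in [tau - 1, tau] is a sub-gradient at zero

if __name__ == "__main__":
    for r in (-10.0, -0.1, 0.0, 0.1, 10.0):
        print(r, pseudo_huber(r), pseudo_huber_grad(r),
              quantile_loss(r), quantile_subgrad(r))
```

    Both (sub)gradients stay bounded however large the residual is, which is the usual reason these losses are preferred to squared error under heavy-tailed noise or outliers.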

    Exact Clustering in Tensor Block Model: Statistical Optimality and Computational Limit

    High-order clustering aims to identify heterogeneous substructures in multiway datasets that arise commonly in neuroimaging, genomics, social network studies, etc. The non-convex and discontinuous nature of this problem poses significant challenges in both statistics and computation. In this paper, we propose a tensor block model and two computationally efficient methods, the high-order Lloyd algorithm (HLloyd) and high-order spectral clustering (HSC), for high-order clustering. The convergence guarantees and statistical optimality are established for the proposed procedure under a mild sub-Gaussian noise assumption. Under the Gaussian tensor block model, we completely characterize the statistical-computational trade-off for achieving high-order exact clustering based on three different signal-to-noise ratio regimes. The analysis relies on new techniques of high-order spectral perturbation analysis and a "singular-value-gap-free" error bound in tensor estimation, which are substantially different from the matrix spectral analyses in the literature. Finally, we show the merits of the proposed procedures via extensive experiments on both synthetic and real datasets. Comment: 65 pages