
    Spectral norm of random tensors

    We show that the spectral norm of a random $n_1\times n_2\times \cdots \times n_K$ tensor (or higher-order array) scales as $O\left(\sqrt{(\sum_{k=1}^{K}n_k)\log(K)}\right)$ under a sub-Gaussian assumption on the entries. The proof is based on a covering number argument. Since the spectral norm is dual to the tensor nuclear norm (the tightest convex relaxation of the set of rank-one tensors), the bound implies that this convex relaxation yields a sample complexity that is linear in the sum of the dimensions, which is much smaller than that of other recently proposed convex relaxations of tensor rank based on unfolding. Comment: 5 pages.
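    A minimal numpy sketch of how the bound can be sanity-checked numerically, assuming a 3-way tensor with i.i.d. standard Gaussian entries; the higher-order power iteration below is an illustrative heuristic for the spectral norm (it may return a local optimum) and is not the covering-number argument used in the proof.

```python
import numpy as np

def tensor_spectral_norm(T, n_iter=200, seed=0):
    """Estimate max <T, u x v x w> over unit vectors u, v, w
    by higher-order power iteration (heuristic: may hit a local optimum)."""
    rng = np.random.default_rng(seed)
    n1, n2, n3 = T.shape
    u = rng.standard_normal(n1); u /= np.linalg.norm(u)
    v = rng.standard_normal(n2); v /= np.linalg.norm(v)
    w = rng.standard_normal(n3); w /= np.linalg.norm(w)
    for _ in range(n_iter):
        u = np.einsum('ijk,j,k->i', T, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', T, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', T, u, v); w /= np.linalg.norm(w)
    return np.einsum('ijk,i,j,k->', T, u, v, w)

# Compare the estimate against the sqrt((n1+n2+n3) * log K) scaling (K = 3 modes);
# the theorem gives this scaling only up to a constant factor.
dims = (30, 40, 50)
T = np.random.default_rng(1).standard_normal(dims)
print(tensor_spectral_norm(T), np.sqrt(sum(dims) * np.log(len(dims))))
```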

    Convex Tensor Decomposition via Structured Schatten Norm Regularization

    We discuss structured Schatten norms for tensor decomposition, a family that includes two recently proposed norms ("overlapped" and "latent") for convex-optimization-based tensor decomposition, and connect tensor decomposition with the wider literature on structured sparsity. Based on the properties of the structured Schatten norms, we mathematically analyze the performance of the "latent" approach to tensor decomposition, which was empirically found to perform better than the "overlapped" approach in some settings, and show theoretically that this is indeed the case. In particular, when the unknown true tensor is low-rank in a specific mode, this approach performs as well as if the mode with the smallest rank were known. Along the way, we prove a novel duality result for structured Schatten norms, establish the consistency of the estimator, and discuss its identifiability. Numerical simulations confirm that our theory precisely predicts the scaling behavior of the mean squared error. Comment: 12 pages, 3 figures.
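    As a small illustration of the two norms discussed above, the sketch below computes the overlapped Schatten-1 norm of a tensor as the sum of nuclear norms of its mode-k unfoldings; the latent norm has no such closed form (it infimizes over decompositions of the tensor into a sum of components), so it is only noted in a comment. Function names are illustrative, not taken from the paper's code.

```python
import numpy as np

def unfold(T, mode):
    """Mode-k unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def overlapped_schatten1(T):
    """Overlapped Schatten-1 norm: sum over modes of the nuclear norm of the
    mode-k unfolding.  The latent norm instead minimizes the sum of nuclear
    norms of unfold(W_k, k) over all splittings T = sum_k W_k, which requires
    solving a convex program rather than evaluating a single formula."""
    return sum(np.linalg.norm(unfold(T, k), 'nuc') for k in range(T.ndim))

rng = np.random.default_rng(0)
# A tensor that is low-rank (rank 2) only in mode 0.
T = (rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40 * 50))).reshape(30, 40, 50)
print(overlapped_schatten1(T))
```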

    Sparsity-accuracy trade-off in MKL

    We empirically investigate the best trade-off between sparse and uniformly-weighted multiple kernel learning (MKL) under elastic-net regularization on real and simulated datasets. We find that the best trade-off parameter depends not only on the sparsity of the true kernel-weight spectrum but also on the linear dependence among the kernels and the number of samples. Comment: 8 pages, 2 figures.
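    For concreteness, one common way to parametrize an elastic-net MKL regularizer with a single trade-off parameter $\tau$ between the sparse ($\ell_1$-type) and uniform ($\ell_2$-type) extremes, for component functions $f_m$ in RKHSs $\mathcal{H}_m$, is
    \[ R_\tau(f_1,\dots,f_M) = (1-\tau)\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m} + \tau\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^{2}, \qquad \tau\in[0,1], \]
    so that $\tau=0$ recovers sparse (block-$\ell_1$) MKL and $\tau=1$ corresponds to a uniformly-weighted kernel combination. This is only a sketch of the general form; the exact parametrization used in the experiments may differ.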

    Super-Linear Convergence of Dual Augmented-Lagrangian Algorithm for Sparsity Regularized Estimation

    We analyze the convergence behaviour of a recently proposed algorithm for regularized estimation called the Dual Augmented Lagrangian (DAL). Our analysis is based on a new interpretation of DAL as a proximal minimization algorithm. We show theoretically that, under some conditions, DAL converges super-linearly in a non-asymptotic and global sense. Owing to a special modelling of sparse estimation problems in the context of machine learning, the assumptions we make are milder and more natural than those made in conventional analyses of augmented Lagrangian algorithms. In addition, the new interpretation enables us to generalize DAL to a wide variety of sparse estimation problems. We experimentally confirm our analysis on a large-scale $\ell_1$-regularized logistic regression problem and extensively compare the efficiency of the DAL algorithm with previously proposed algorithms on both synthetic and benchmark datasets. Comment: 51 pages, 9 figures.
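    DAL itself applies proximal minimization to an augmented-Lagrangian dual, which is more involved than fits in a few lines; as a hedged illustration of the kind of problem it targets, the sketch below solves the same $\ell_1$-regularized logistic regression objective with plain proximal gradient (ISTA) using the soft-thresholding operator. This is a simple baseline, not the DAL method analyzed in the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista_logreg(X, y, lam, n_iter=500):
    """Proximal gradient (ISTA) for
        min_w  sum_i log(1 + exp(-y_i * x_i^T w)) + lam * ||w||_1,  y_i in {-1, +1}.
    A simple baseline; DAL instead works on an augmented-Lagrangian dual with a
    proximal inner step and converges super-linearly under suitable conditions."""
    step = 4.0 / (np.linalg.norm(X, 2) ** 2)  # 1/L with L = ||X||_2^2 / 4 for the logistic loss
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = -X.T @ (y / (1.0 + np.exp(y * (X @ w))))
        w = soft_threshold(w - step * grad, step * lam)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50); w_true[:5] = 1.0
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(200))
# Indices of non-zero coefficients (the true support is {0,...,4}).
print(np.flatnonzero(np.abs(ista_logreg(X, y, lam=5.0)) > 1e-8))
```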

    Fast Convergence Rate of Multiple Kernel Learning with Elastic-net Regularization

    We investigate the learning rate of multiple kernel learning (MKL) with elastic-net regularization, which consists of an $\ell_1$-regularizer for inducing sparsity and an $\ell_2$-regularizer for controlling smoothness. We focus on a sparse setting where the total number of kernels is large but the number of non-zero components of the ground truth is relatively small, and prove that elastic-net MKL achieves the minimax learning rate on the $\ell_2$-mixed-norm ball. Our bound is sharper than previously known convergence rates and has the property that the smoother the truth is, the faster the convergence rate becomes. Comment: 21 pages, 0 figures.
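    For reference, the $\ell_2$-mixed norm in the minimax statement can be read, assuming component functions $f_m$ in RKHSs $\mathcal{H}_m$, as
    \[ \|f\|_{\ell_2\text{-mix}} = \Bigl(\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^{2}\Bigr)^{1/2}, \qquad f = \sum_{m=1}^{M} f_m, \]
    so that the minimax rate is taken over a ball $\{f : \|f\|_{\ell_2\text{-mix}} \le R\}$. This is a sketch of the standard definition; the paper's exact normalization and radius may differ.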