
    Fitting Spectral Decay with the k-Support Norm

    The spectral k-support norm enjoys good estimation properties in low rank matrix learning problems, empirically outperforming the trace norm. Its unit ball is the convex hull of rank-k matrices with unit Frobenius norm. In this paper we generalize the norm to the spectral (k,p)-support norm, whose additional parameter p can be used to tailor the norm to the decay of the spectrum of the underlying model. We characterize the unit ball and we explicitly compute the norm. We further provide a conditional gradient method to solve regularization problems with the norm, and we derive an efficient algorithm to compute the Euclidean projection on the unit ball in the case p = ∞. In numerical experiments, we show that allowing p to vary significantly improves performance over the spectral k-support norm on various matrix completion benchmarks and better captures the spectral decay of the underlying model.
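    As a rough illustration of the conditional gradient step for the base case p = 2 (the spectral k-support norm), the sketch below solves a norm-ball-constrained matrix completion problem; the linear minimization oracle exploits the fact that the extreme points of the unit ball are rank-k matrices with unit Frobenius norm. The radius tau, step rule and function names are illustrative, not the authors' implementation.

        import numpy as np

        def lmo_spectral_k_support(G, k, tau):
            # Extreme points of the ball {Z : ||Z||_{k-sup} <= tau} are tau times
            # rank-k matrices with unit Frobenius norm, so <G, Z> is minimized
            # using the top-k singular triplets of G with rescaled singular values.
            U, s, Vt = np.linalg.svd(G, full_matrices=False)
            nrm = np.linalg.norm(s[:k])
            if nrm == 0.0:
                return np.zeros_like(G)
            sk = s[:k] / nrm
            return -tau * (U[:, :k] * sk) @ Vt[:k, :]

        def frank_wolfe_completion(Y, mask, k, tau, n_iter=200):
            # min_Z 0.5 * ||mask * (Z - Y)||_F^2   s.t.   ||Z||_{k-sup} <= tau
            Z = np.zeros_like(Y, dtype=float)
            for t in range(n_iter):
                G = mask * (Z - Y)              # gradient of the smooth objective
                S = lmo_spectral_k_support(G, k, tau)
                gamma = 2.0 / (t + 2.0)         # standard Frank-Wolfe step size
                Z = (1.0 - gamma) * Z + gamma * S
            return Z

    For other values of p the same template applies; only the rescaling of the top-k singular values inside the oracle changes (for example, for p = ∞ they are all set to one).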

    Structured sparsity via optimal interpolation norms

    We study norms that can be used as penalties in machine learning problems. In particular, we consider norms that are defined by an optimal interpolation problem and whose additional structure can be used to encourage specific characteristics, such as sparsity, in the solution to a learning problem. We first study a norm that is defined as an infimum of quadratics parameterized over a convex set. We show that this formulation includes the k-support norm for sparse vector learning, and its Moreau envelope, the box-norm. These extend naturally to spectral regularizers for matrices, and we introduce the spectral k-support norm and spectral box-norm. We study their properties and we apply the penalties to low rank matrix and multitask learning problems. We next introduce two generalizations of the k-support norm. The first of these is the (k, p)-support norm. In the matrix setting, the additional parameter p allows us to better learn the curvature of the spectrum of the underlying solution. A second application is to multilinear algebra. By considering the rank of its matricizations, we obtain a k-support norm that can be applied to learn a low rank tensor. For each of these norms we provide an optimization method to solve the underlying learning problem, and we present numerical experiments. Finally, we present a general framework for optimal interpolation norms. We focus on a specific formulation that involves an infimal convolution coupled with a linear operator, and which captures several of the penalties discussed in this thesis. We then introduce an algorithm to solve regularization problems with norms of this type, and we provide numerical experiments to illustrate the method.
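    For concreteness, below is a small sketch of how the vector k-support norm can be evaluated in closed form from the sorted absolute values (the standard characterization), with the spectral k-support norm obtained by applying the same computation to the singular values of a matrix; function names and the fallback branch are illustrative, assuming the characterization stated in the comments.

        import numpy as np

        def k_support_norm(w, k):
            # Closed-form evaluation (1-based indexing in this comment): with
            # z_1 >= ... >= z_d the sorted absolute values, find the unique r in
            # {0, ..., k-1} such that
            #   z_{k-r-1} > (1/(r+1)) * sum_{i >= k-r} z_i >= z_{k-r}   (z_0 = +inf),
            # then the norm is sqrt(sum_{i < k-r} z_i^2 + (sum_{i >= k-r} z_i)^2 / (r+1)).
            z = np.sort(np.abs(np.asarray(w, dtype=float)))[::-1]
            for r in range(k):
                tail = z[k - r - 1:].sum()
                head = z[k - r - 2] if k - r - 2 >= 0 else np.inf
                if head > tail / (r + 1) >= z[k - r - 1]:
                    return np.sqrt((z[:k - r - 1] ** 2).sum() + tail ** 2 / (r + 1))
            return float(np.linalg.norm(z))  # fallback; should not be reached

        def spectral_k_support_norm(W, k):
            # Spectral version: apply the vector norm to the singular values.
            return k_support_norm(np.linalg.svd(W, compute_uv=False), k)

    For k = 1 this reduces to the l1 norm and for k = d to the l2 norm, the two extremes the k-support norm interpolates between.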

    Learning Sets with Separating Kernels

    We consider the problem of learning a set from random samples. We show how relevant geometric and topological properties of a set can be studied analytically using concepts from the theory of reproducing kernel Hilbert spaces. A new kind of reproducing kernel, which we call a separating kernel, plays a crucial role in our study and is analyzed in detail. We prove a new analytic characterization of the support of a distribution, which naturally leads to a family of provably consistent regularized learning algorithms, and we discuss the stability of these methods with respect to random sampling. Numerical experiments show that the approach is competitive with, and often better than, other state-of-the-art techniques. Comment: final version.
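    The flavour of such regularized set estimators can be conveyed by a schematic kernel sketch: declare a point inside the support when its feature-space representer lies close to the (regularized) span of the training features. This is only an analogue of the approach, assuming a Gaussian kernel, Tikhonov-style regularization and a hypothetical threshold tau; it is not the authors' exact estimator.

        import numpy as np

        def gaussian_kernel(A, B, sigma=1.0):
            # Pairwise Gaussian kernel matrix between the rows of A and B.
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-d2 / (2.0 * sigma ** 2))

        def support_indicator(X_train, X_test, lam=1e-3, tau=0.1, sigma=1.0):
            # Regularized squared distance of k_x to the span of the training
            # features; points with a small distance are labeled as in-support.
            n = len(X_train)
            K = gaussian_kernel(X_train, X_train, sigma)
            Kx = gaussian_kernel(X_train, X_test, sigma)       # shape (n, m)
            inv = np.linalg.inv(K + n * lam * np.eye(n))
            kxx = np.ones(len(X_test))                         # k(x, x) = 1 for the Gaussian kernel
            dist2 = kxx - np.einsum('ij,ik,kj->j', Kx, inv, Kx)
            return dist2 <= tau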

    Atomic norm denoising with applications to line spectral estimation

    Motivated by recent work on atomic norms in inverse problems, we propose a new approach to line spectral estimation that provides theoretical guarantees for the mean-squared-error (MSE) performance in the presence of noise and without knowledge of the model order. We propose an abstract theory of denoising with atomic norms and specialize this theory to provide a convex optimization problem for estimating the frequencies and phases of a mixture of complex exponentials. We show that the associated convex optimization problem can be solved in polynomial time via semidefinite programming (SDP). We also show that the SDP can be approximated by an l1-regularized least-squares problem that achieves nearly the same error rate as the SDP but can scale to much larger problems. We compare both SDP and l1-based approaches with classical line spectral analysis methods and demonstrate that the SDP outperforms the l1 optimization, which in turn outperforms MUSIC, Cadzow's, and Matrix Pencil approaches in terms of MSE over a wide range of signal-to-noise ratios. Comment: 27 pages, 10 figures. A preliminary version of this work appeared in the Proceedings of the 49th Annual Allerton Conference in September 2011. Numerous numerical experiments added to this version in accordance with suggestions by anonymous reviewers.
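    A minimal sketch of the gridded l1 (Lasso) approximation mentioned above, not the atomic-norm SDP itself: frequencies are restricted to a uniform grid, the dictionary of sampled complex exponentials is built explicitly, and the complex Lasso is solved with plain proximal gradient (ISTA). Grid size, regularization weight and iteration count are illustrative.

        import numpy as np

        def complex_soft_threshold(z, t):
            # Proximal operator of t * ||.||_1 for complex vectors (shrink magnitudes).
            mag = np.abs(z)
            return np.maximum(1.0 - t / np.maximum(mag, 1e-12), 0.0) * z

        def l1_line_spectral(y, n_grid=4096, lam=0.1, n_iter=500):
            # min_x 0.5 * ||A x - y||_2^2 + lam * ||x||_1 over a fine frequency grid.
            n = len(y)
            freqs = np.arange(n_grid) / n_grid
            A = np.exp(2j * np.pi * np.outer(np.arange(n), freqs)) / np.sqrt(n)
            L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
            x = np.zeros(n_grid, dtype=complex)
            for _ in range(n_iter):
                grad = A.conj().T @ (A @ x - y)
                x = complex_soft_threshold(x - grad / L, lam / L)
            return freqs, x                    # nonzero entries of x mark estimated frequencies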

    A Kernel Perspective for Regularizing Deep Neural Networks

    We propose a new point of view for regularizing deep neural networks by using the norm of a reproducing kernel Hilbert space (RKHS). Even though this norm cannot be computed, it admits upper and lower approximations leading to various practical strategies. Specifically, this perspective (i) provides a common umbrella for many existing regularization principles, including spectral norm and gradient penalties, or adversarial training, (ii) leads to new effective regularization penalties, and (iii) suggests hybrid strategies combining lower and upper bounds to get better approximations of the RKHS norm. We experimentally show this approach to be effective when learning on small datasets or when training adversarially robust models. Comment: ICML.
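    One of the upper-bound-style surrogates mentioned above, a per-layer spectral norm penalty, can be sketched with a simple power iteration; this is an illustrative stand-in rather than the paper's exact penalties or training code.

        import numpy as np

        def spectral_norm(W, n_iter=30, seed=0):
            # Power iteration estimate of the largest singular value of a weight matrix.
            rng = np.random.default_rng(seed)
            v = rng.standard_normal(W.shape[1])
            for _ in range(n_iter):
                u = W @ v
                u /= np.linalg.norm(u) + 1e-12
                v = W.T @ u
                v /= np.linalg.norm(v) + 1e-12
            return float(u @ W @ v)

        def spectral_penalty(weights):
            # Sum of per-layer spectral norms, to be added (times a small
            # coefficient) to the training loss as an upper-bound-style regularizer.
            return float(sum(spectral_norm(W) for W in weights))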