1,068 research outputs found

    Optimization with Sparsity-Inducing Penalties

    Get PDF
    Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted â„“2\ell_2-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view

    Graph Regularized Tensor Sparse Coding for Image Representation

    Full text link
    Sparse coding (SC) is an unsupervised learning scheme that has received an increasing amount of interests in recent years. However, conventional SC vectorizes the input images, which destructs the intrinsic spatial structures of the images. In this paper, we propose a novel graph regularized tensor sparse coding (GTSC) for image representation. GTSC preserves the local proximity of elementary structures in the image by adopting the newly proposed tubal-tensor representation. Simultaneously, it considers the intrinsic geometric properties by imposing graph regularization that has been successfully applied to uncover the geometric distribution for the image data. Moreover, the returned sparse representations by GTSC have better physical explanations as the key operation (i.e., circular convolution) in the tubal-tensor model preserves the shifting invariance property. Experimental results on image clustering demonstrate the effectiveness of the proposed scheme

    Convex and Network Flow Optimization for Structured Sparsity

    Get PDF
    We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of l_2- or l_infinity-norms over groups of variables. Whereas much effort has been put in developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: On the one hand, we show that the proximal operator associated with a sum of l_infinity-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques, and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches.Comment: to appear in the Journal of Machine Learning Research (JMLR

    PADDLE: Proximal Algorithm for Dual Dictionaries LEarning

    Full text link
    Recently, considerable research efforts have been devoted to the design of methods to learn from data overcomplete dictionaries for sparse coding. However, learned dictionaries require the solution of an optimization problem for coding new data. In order to overcome this drawback, we propose an algorithm aimed at learning both a dictionary and its dual: a linear mapping directly performing the coding. By leveraging on proximal methods, our algorithm jointly minimizes the reconstruction error of the dictionary and the coding error of its dual; the sparsity of the representation is induced by an â„“1\ell_1-based penalty on its coefficients. The results obtained on synthetic data and real images show that the algorithm is capable of recovering the expected dictionaries. Furthermore, on a benchmark dataset, we show that the image features obtained from the dual matrix yield state-of-the-art classification performance while being much less computational intensive
    • …
    corecore