1,068 research outputs found
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted -penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view
Graph Regularized Tensor Sparse Coding for Image Representation
Sparse coding (SC) is an unsupervised learning scheme that has received an
increasing amount of interests in recent years. However, conventional SC
vectorizes the input images, which destructs the intrinsic spatial structures
of the images. In this paper, we propose a novel graph regularized tensor
sparse coding (GTSC) for image representation. GTSC preserves the local
proximity of elementary structures in the image by adopting the newly proposed
tubal-tensor representation. Simultaneously, it considers the intrinsic
geometric properties by imposing graph regularization that has been
successfully applied to uncover the geometric distribution for the image data.
Moreover, the returned sparse representations by GTSC have better physical
explanations as the key operation (i.e., circular convolution) in the
tubal-tensor model preserves the shifting invariance property. Experimental
results on image clustering demonstrate the effectiveness of the proposed
scheme
Convex and Network Flow Optimization for Structured Sparsity
We consider a class of learning problems regularized by a structured
sparsity-inducing norm defined as the sum of l_2- or l_infinity-norms over
groups of variables. Whereas much effort has been put in developing fast
optimization techniques when the groups are disjoint or embedded in a
hierarchy, we address here the case of general overlapping groups. To this end,
we present two different strategies: On the one hand, we show that the proximal
operator associated with a sum of l_infinity-norms can be computed exactly in
polynomial time by solving a quadratic min-cost flow problem, allowing the use
of accelerated proximal gradient methods. On the other hand, we use proximal
splitting techniques, and address an equivalent formulation with
non-overlapping groups, but in higher dimension and with additional
constraints. We propose efficient and scalable algorithms exploiting these two
strategies, which are significantly faster than alternative approaches. We
illustrate these methods with several problems such as CUR matrix
factorization, multi-task learning of tree-structured dictionaries, background
subtraction in video sequences, image denoising with wavelets, and topographic
dictionary learning of natural image patches.Comment: to appear in the Journal of Machine Learning Research (JMLR
PADDLE: Proximal Algorithm for Dual Dictionaries LEarning
Recently, considerable research efforts have been devoted to the design of
methods to learn from data overcomplete dictionaries for sparse coding.
However, learned dictionaries require the solution of an optimization problem
for coding new data. In order to overcome this drawback, we propose an
algorithm aimed at learning both a dictionary and its dual: a linear mapping
directly performing the coding. By leveraging on proximal methods, our
algorithm jointly minimizes the reconstruction error of the dictionary and the
coding error of its dual; the sparsity of the representation is induced by an
-based penalty on its coefficients. The results obtained on synthetic
data and real images show that the algorithm is capable of recovering the
expected dictionaries. Furthermore, on a benchmark dataset, we show that the
image features obtained from the dual matrix yield state-of-the-art
classification performance while being much less computational intensive
- …