Sharp Oracle Inequalities for Square Root Regularization
We study a set of regularization methods for high-dimensional linear
regression models. These penalized estimators have the square root of the
residual sum of squared errors as loss function, and any weakly decomposable
norm as penalty function. This fit measure is chosen because of its property
that the estimator does not depend on the unknown standard deviation of the
noise. On the other hand, a generalized weakly decomposable norm penalty is
very useful in being able to deal with different underlying sparsity
structures. We can choose a different sparsity inducing norm depending on how
we want to interpret the unknown parameter vector. Structured sparsity
norms, as defined in Micchelli et al. [18], are special cases of weakly
decomposable norms, therefore we also include the square root LASSO (Belloni et
al. [3]), the group square root LASSO (Bunea et al. [10]) and a new method
called the square root SLOPE (in a similar fashion to the SLOPE from Bogdan et
al. [6]). For this collection of estimators, our results provide sharp oracle
inequalities together with the Karush-Kuhn-Tucker conditions. We discuss some
examples of estimators and, based on a simulation, illustrate some advantages
of the square root SLOPE.
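As a concrete illustration of the square-root loss described above, here is a minimal proximal-gradient sketch of the square root LASSO, minimizing ||y - Xb||_2 + lam * ||b||_1. The function names and step-size choice are illustrative assumptions, not the estimators or algorithms used in the paper; the key property shown is that the gradient of the square-root loss, -X^T r / ||r||_2, normalizes away the noise scale.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the prox of t * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sqrt_lasso(X, y, lam, step=0.01, n_iter=2000):
    """Illustrative proximal-gradient sketch (not the paper's algorithm) for
    minimize_b ||y - X b||_2 + lam * ||b||_1.
    The square-root loss is differentiable wherever the residual r = y - X b
    is nonzero, with gradient -X^T r / ||r||_2."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ b
        norm_r = np.linalg.norm(r)
        if norm_r < 1e-12:   # exact fit: loss is non-smooth here, so stop
            break
        grad = -(X.T @ r) / norm_r
        b = soft_threshold(b - step * grad, step * lam)
    return b
```

Because the gradient is scaled by ||r||_2, a fixed lam behaves comparably across noise levels, which is the scale-free property the abstract highlights.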
On Sparsity Inducing Regularization Methods for Machine Learning
During the past years there has been an explosion of interest in learning
methods based on sparsity regularization. In this paper, we discuss a general
class of such methods, in which the regularizer can be expressed as the
composition of a convex function with a linear function. This setting
includes several methods such as the group Lasso, the Fused Lasso, multi-task
learning and many more. We present a general approach for solving
regularization problems of this kind, under the assumption that the proximity
operator of the function is available. Furthermore, we comment on the
application of this approach to support vector machines, a technique pioneered
by the groundbreaking work of Vladimir Vapnik.
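The general approach the abstract describes, solving a regularized problem given only the proximity operator of the regularizer, can be sketched with a generic proximal-gradient (ISTA-style) loop. The group Lasso prox below is one concrete choice to plug in; the function names and the simple step-size rule are assumptions for illustration, not the paper's notation.

```python
import numpy as np

def prox_group_l2(v, t, groups):
    """Prox of t * sum_g ||v_g||_2 (the group Lasso penalty):
    blockwise soft-thresholding of each group of coordinates."""
    out = v.copy()
    for g in groups:
        norm_g = np.linalg.norm(v[g])
        scale = max(1.0 - t / norm_g, 0.0) if norm_g > 0 else 0.0
        out[g] = scale * v[g]
    return out

def ista(X, y, lam, prox, step=None, n_iter=500):
    """Generic proximal-gradient loop for
    minimize_b 0.5 * ||y - X b||^2 + lam * Omega(b),
    assuming only that prox(v, t) computes the prox of t * Omega."""
    if step is None:
        # 1 / Lipschitz constant of the gradient of the smooth part
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)
        b = prox(b - step * grad, step * lam)
    return b
```

Swapping in a different prox (elementwise soft-thresholding, a fused-difference prox, and so on) changes the regularizer without touching the solver, which is the modularity the abstract emphasizes.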
Identifying disease sensitive and quantitative trait-relevant biomarkers from multidimensional heterogeneous imaging genetics data via sparse multimodal multitask learning
Motivation: Recent advances in brain imaging and high-throughput genotyping techniques enable new approaches to study the influence of genetic and anatomical variations on brain functions and disorders. Traditional association studies typically perform independent and pairwise analysis among neuroimaging measures, cognitive scores and disease status, and ignore the important underlying interacting relationships between these units.
Proximal Methods for Hierarchical Sparse Coding
Sparse coding consists in representing signals as sparse linear combinations
of atoms selected from a dictionary. We consider an extension of this framework
where the atoms are further assumed to be embedded in a tree. This is achieved
using a recently introduced tree-structured sparse regularization norm, which
has proven useful in several applications. This norm leads to regularized
problems that are difficult to optimize, and we propose in this paper efficient
algorithms for solving them. More precisely, we show that the proximal operator
associated with this norm is computable exactly via a dual approach that can be
viewed as the composition of elementary proximal operators. Our procedure has a
complexity linear, or close to linear, in the number of atoms, and allows the
use of accelerated gradient techniques to solve the tree-structured sparse
approximation problem at the same computational cost as traditional ones using
the L1-norm. Our method is efficient and scales gracefully to millions of
variables, which we illustrate in two types of applications: first, we consider
fixed hierarchical dictionaries of wavelets to denoise natural images. Then, we
apply our optimization tools in the context of dictionary learning, where
learned dictionary elements naturally organize in a prespecified arborescent
structure, leading to a better performance in reconstruction of natural image
patches. When applied to text documents, our method learns hierarchies of
topics, thus providing a competitive alternative to probabilistic topic models.
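The composition of elementary proximal operators mentioned in the abstract can be sketched as follows: each elementary operator is a block soft-thresholding of one group, and for hierarchical groups the operators are applied with descendants before their ancestors (leaves first, root last). The function names and the assumption that the caller supplies the groups already in that order are illustrative; this is a sketch of the idea, not the paper's implementation.

```python
import numpy as np

def group_shrink(v, g, t):
    """Elementary prox: block soft-thresholding of the coordinates in g
    by threshold t, i.e. the prox of t * ||v_g||_2."""
    norm_g = np.linalg.norm(v[g])
    if norm_g > 0:
        v[g] *= max(1.0 - t / norm_g, 0.0)
    return v

def tree_prox(v, ordered_groups, weights, t):
    """Sketch of the tree-structured prox: apply the elementary group
    shrinkages in sequence, children before their ancestors, which for
    hierarchical l2 groups yields the exact prox via the dual argument
    the abstract refers to."""
    out = v.copy()
    for g, w in zip(ordered_groups, weights):
        out = group_shrink(out, g, t * w)
    return out
```

One linear pass over the groups touches each coordinate once per group containing it, which is consistent with the (near-)linear complexity in the number of atoms claimed in the abstract.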