Sharp Oracle Inequalities for Square Root Regularization
We study a class of regularization methods for high-dimensional linear
regression models. These penalized estimators use the square root of the
residual sum of squared errors as their loss function and any weakly
decomposable norm as their penalty function. This loss is chosen because the
resulting estimator does not depend on the unknown standard deviation of the
noise. The generalized weakly decomposable norm penalty, in turn, makes it
possible to handle different underlying sparsity structures: we can choose a
sparsity-inducing norm according to how we want to interpret the unknown
parameter vector. Structured sparsity norms, as defined in Micchelli et al.
[18], are special cases of weakly decomposable norms, so our framework also
covers the square root LASSO (Belloni et al. [3]), the group square root LASSO
(Bunea et al. [10]), and a new method we call the square root SLOPE
(constructed analogously to the SLOPE of Bogdan et al. [6]). For this
collection of estimators, our results provide sharp oracle inequalities
together with the Karush-Kuhn-Tucker conditions. We discuss some examples of
estimators, and a simulation illustrates advantages of the square root SLOPE.
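
The simplest member of this family is the square root LASSO, which minimizes
||y - Xb||_2 / sqrt(n) + lam * ||b||_1. Below is a minimal sketch of how such
an estimator can be computed with the cvxpy convex-optimization library; the
data-generating setup, the scaling of the objective, and the sigma-free choice
of lam are illustrative assumptions, not taken from the paper.

```python
import numpy as np
import cvxpy as cp

# Illustrative toy problem (not from the paper): sparse ground truth,
# Gaussian design, Gaussian noise.
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 3.0
sigma = 1.0  # noise level; the estimator below never uses it
y = X @ beta_true + sigma * rng.standard_normal(n)

# Square root LASSO: minimize ||y - X b||_2 / sqrt(n) + lam * ||b||_1.
# Because the loss is the square root of the mean residual sum of squares,
# a sensible lam can be chosen without knowing sigma.
lam = 1.1 * np.sqrt(2 * np.log(p) / n)  # sigma-free tuning (one common choice)
b = cp.Variable(p)
objective = cp.norm(y - X @ b, 2) / np.sqrt(n) + lam * cp.norm1(b)
cp.Problem(cp.Minimize(objective)).solve()

print("estimated support size:", int(np.sum(np.abs(b.value) > 1e-4)))
```

Replacing the l1 penalty cp.norm1(b) with a group norm or with the sorted l1
(SLOPE) norm would give the group square root LASSO and the square root SLOPE,
respectively.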
Square Root LASSO: well-posedness, Lipschitz stability and the tuning trade-off
This paper studies well-posedness and parameter sensitivity of the Square
Root LASSO (SR-LASSO), an optimization model for recovering sparse solutions to
linear inverse problems in finite dimension. An advantage of the SR-LASSO
(e.g., over the standard LASSO) is that the optimal tuning of the
regularization parameter is robust with respect to measurement noise. We
provide three point-based regularity conditions at a solution of the
SR-LASSO: the weak, intermediate, and strong assumptions. It is shown that the
weak assumption implies uniqueness of the solution in question. The
intermediate assumption yields a directionally differentiable and locally
Lipschitz solution map (with explicit Lipschitz bounds), whereas the strong
assumption gives continuous differentiability of said map around the point in
question. Our analysis leads to new theoretical insights into the comparison
between the SR-LASSO and the LASSO from the viewpoint of tuning-parameter
sensitivity: the noise-robust optimal parameter choice of the SR-LASSO comes at
the "price" of elevated tuning-parameter sensitivity. Numerical results support
and showcase the theoretical findings.
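
As a rough, self-contained illustration of this trade-off (an
assumption-laden sketch, not an experiment from the paper), the following
compares the SR-LASSO and the standard LASSO with tuning parameters fixed once
and the noise level varied; all data, scalings, and parameter values are
hypothetical, and cvxpy is again assumed.

```python
import numpy as np
import cvxpy as cp

# Hypothetical setup: fix both tuning parameters once, then vary sigma.
rng = np.random.default_rng(1)
n, p = 100, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = 2.0

lam_sr = 1.1 * np.sqrt(2 * np.log(p) / n)  # sigma-free SR-LASSO choice
lam_ls = 0.1                               # LASSO parameter, fixed once

for sigma in (0.1, 1.0, 5.0):
    y = X @ beta_true + sigma * rng.standard_normal(n)

    # SR-LASSO: square-root (unsquared) loss.
    b_sr = cp.Variable(p)
    cp.Problem(cp.Minimize(
        cp.norm(y - X @ b_sr, 2) / np.sqrt(n) + lam_sr * cp.norm1(b_sr)
    )).solve()

    # Standard LASSO: squared loss, same fixed parameter at every noise level.
    b_ls = cp.Variable(p)
    cp.Problem(cp.Minimize(
        cp.sum_squares(y - X @ b_ls) / (2 * n) + lam_ls * cp.norm1(b_ls)
    )).solve()

    err_sr = np.linalg.norm(b_sr.value - beta_true)
    err_ls = np.linalg.norm(b_ls.value - beta_true)
    print(f"sigma={sigma}: SR-LASSO error {err_sr:.3f}, LASSO error {err_ls:.3f}")
```

In such a setup one typically finds that the fixed, sigma-free SR-LASSO
parameter remains reasonable as sigma grows, while the fixed LASSO parameter
does not; the paper's point is that this robustness is paid for by elevated
sensitivity of the SR-LASSO solution to perturbations of the tuning parameter.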