
Differentially Private Empirical Risk Minimization with Sparsity-Inducing Norms

Author: Deisenroth Marc Peter
Kumar K S Sesh
Publication date: 13/05/2019

Differential privacy is concerned with prediction quality while measuring the privacy impact on individuals whose information is contained in the data. We consider differentially private risk minimization problems with regularizers that induce structured sparsity. These regularizers are known to be convex, but they are often non-differentiable. We analyze the standard differentially private algorithms, such as output perturbation, Frank-Wolfe and objective perturbation. Output perturbation is a differentially private algorithm that is known to perform well for minimizing risks that are strongly convex. Previous works have derived excess risk bounds that are independent of the dimensionality. In this paper, we assume a particular class of convex but non-smooth regularizers that induce structured sparsity and loss functions for generalized linear models. We also consider differentially private Frank-Wolfe algorithms to optimize the dual of the risk minimization problem. We derive excess risk bounds for both these algorithms. Both bounds depend on the Gaussian width of the unit ball of the dual norm. We also show that objective perturbation of the risk minimization problems is equivalent to the output perturbation of a dual optimization problem. This is the first work that analyzes the dual optimization problems of risk minimization problems in the context of differential privacy.
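To make the idea of output perturbation concrete, here is a minimal sketch under stated assumptions; it is not the algorithm analyzed in the paper. It swaps in a plain squared-L2 regularizer (which makes the risk strongly convex) instead of a sparsity-inducing norm, uses the logistic loss with feature vectors of norm at most 1, relies on the classical sensitivity bound 2 / (n * lam) for that setting, and calibrates Gaussian noise with the standard formula for eps < 1. The function name `output_perturbation` and all parameter names are hypothetical.

```python
import numpy as np

def output_perturbation(X, y, lam, eps, delta, rng, iters=2000):
    """Sketch: solve L2-regularized logistic regression, then add Gaussian noise.

    Assumptions (not from the source): rows of X have norm <= 1, labels y are
    in {-1, +1}, the regularizer is (lam / 2) * ||w||^2, and 0 < eps < 1.
    """
    n, d = X.shape
    w = np.zeros(d)
    # The averaged logistic loss is (1/4)-smooth when ||x_i|| <= 1.
    step = 1.0 / (0.25 + lam)
    for _ in range(iters):
        margins = y * (X @ w)
        grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / n + lam * w
        w -= step * grad
    # L2 sensitivity of the exact minimizer for a 1-Lipschitz loss.
    # Gradient descent only approximates that minimizer, so this is a sketch.
    sensitivity = 2.0 / (n * lam)
    # Standard Gaussian-mechanism calibration for (eps, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return w + rng.normal(0.0, sigma, size=d), sigma
```

The noise scale shrinks as 1 / (n * lam), which is one way to see why strong convexity matters for this mechanism.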

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

Theoretical Properties of the Overlapping Groups Lasso

Author: Percival Daniel
Publication date: 09/11/2011

We present two sets of theoretical results on the grouped lasso with overlap of Jacob, Obozinski and Vert (2009) in the linear regression setting. This method allows for joint selection of predictors in sparse regression, allowing for complex structured sparsity over the predictors encoded as a set of groups. This flexible framework suggests that arbitrarily complex structures can be encoded with an intricate set of groups. Our results show that this strategy has unexpected theoretical consequences for the procedure. In particular, we give two sets of results: (1) finite sample bounds on prediction and estimation, and (2) asymptotic distribution and selection. Both sets of results give insight into the consequences of choosing an increasingly complex set of groups for the procedure, as well as what happens when the set of groups cannot recover the true sparsity pattern. Additionally, these results demonstrate the differences and similarities between the grouped lasso procedure with and without overlapping groups. Our analysis shows the set of groups must be chosen with caution: an overly complex set of groups will damage the analysis.
Comment: 20 pages, submitted to Annals of Statistics

arXiv.org e-Print Archive

Crossref

Sparse Recovery via Differential Inclusions

Author: Osher Stanley
Ruan Feng
Xiong Jiechao
Yao Yuan
Yin Wotao
Publication venue: Elsevier BV
Publication date: 21/01/2016

In this paper, we recover sparse signals from their noisy linear measurements by solving nonlinear differential inclusions, an approach based on the notion of inverse scale space (ISS) developed in applied mathematics. Our goal here is to bring this idea to bear on a challenging problem in statistics, i.e., finding the oracle estimator, which is unbiased and sign-consistent, using dynamics. We call our dynamics Bregman ISS and Linearized Bregman ISS. A well-known shortcoming of LASSO and other convex regularization approaches lies in the bias of their estimators. However, we show that under proper conditions, there exists a bias-free and sign-consistent point on the solution paths of such dynamics, which corresponds to a signal that is an unbiased estimate of the true signal and whose entries have the same signs as those of the true signal, i.e., the oracle estimator. Therefore, their solution paths are regularization paths better than the LASSO regularization path, since the points on the latter path are biased when sign-consistency is reached. We also show how to efficiently compute their solution paths in both continuous and discretized settings: the full solution paths can be exactly computed piece by piece, and a discretization leads to Linearized Bregman iteration, which is a simple iterative thresholding rule and easy to parallelize. Theoretical guarantees such as sign-consistency and minimax optimal $l_2$-error bounds are established in both continuous and discrete settings for specific points on the paths. Early-stopping rules for identifying these points are given. The key treatment relies on the development of differential inequalities for differential inclusions and their discretizations, which extends the previous results and leads to exponentially fast recovery of sparse signals before selecting wrong ones.
Comment: In Applied and Computational Harmonic Analysis, 201
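The "simple iterative thresholding rule" mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the commonly stated form of the Linearized Bregman iteration: an auxiliary variable z accumulates scaled residual correlations, and the estimate is a soft-thresholded, rescaled copy of z. The names `linearized_bregman_path`, `kappa` (damping factor) and `alpha` (step size) are hypothetical, and the defaults are chosen only so that alpha * kappa stays small enough for stability on well-scaled designs.

```python
import numpy as np

def soft_threshold(z, t=1.0):
    """Componentwise shrinkage: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def linearized_bregman_path(X, y, kappa=10.0, alpha=0.01, iters=500):
    """Return the whole iterate path, one row per iteration (row 0 is zero).

    Assumed update (a sketch, not necessarily the paper's exact scheme):
        z    <- z + (alpha / n) * X^T (y - X beta)
        beta <- kappa * soft_threshold(z, 1)
    """
    n, p = X.shape
    z = np.zeros(p)
    beta = np.zeros(p)
    path = [beta.copy()]
    for _ in range(iters):
        z = z + (alpha / n) * (X.T @ (y - X @ beta))
        beta = kappa * soft_threshold(z, 1.0)
        path.append(beta.copy())
    return np.array(path)
```

Because the path starts at zero and coordinates enter one by one as |z| crosses the threshold, the iteration count plays the role of a regularization parameter, which is why early stopping matters for this kind of scheme.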

arXiv.org e-Print Archive

Simple Error Bounds for Regularized Noisy Linear Inverse Problems

Author: Hassibi Babak
Oymak Samet
Thrampoulidis Christos
Publication date: 01/01/2014

Consider estimating a structured signal $\mathbf{x}_0$ from linear, underdetermined and noisy measurements $\mathbf{y}=\mathbf{A}\mathbf{x}_0+\mathbf{z}$, via solving a variant of the lasso algorithm: $\hat{\mathbf{x}}=\arg\min_\mathbf{x}\{ \|\mathbf{y}-\mathbf{A}\mathbf{x}\|_2+\lambda f(\mathbf{x})\}$. Here, $f$ is a convex function aiming to promote the structure of $\mathbf{x}_0$, say $\ell_1$-norm to promote sparsity or nuclear norm to promote low-rankness. We assume that the entries of $\mathbf{A}$ are independent and normally distributed and make no assumptions on the noise vector $\mathbf{z}$, other than it being independent of $\mathbf{A}$. Under this generic setup, we derive a general, non-asymptotic and rather tight upper bound on the $\ell_2$-norm of the estimation error $\|\hat{\mathbf{x}}-\mathbf{x}_0\|_2$. Our bound is geometric in nature and obeys a simple formula; the roles of $\lambda$, $f$ and $\mathbf{x}_0$ are all captured by a single summary parameter $\delta(\lambda\partial f(\mathbf{x}_0))$, termed the Gaussian squared distance to the scaled subdifferential. We connect our result to the literature and verify its validity through simulations.
Comment: 6 pages, 2 figures
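As an illustration of the summary parameter, the sketch below estimates the Gaussian squared distance to the scaled subdifferential for the special case where f is the l1-norm. It assumes the usual definition, the expected squared Euclidean distance from a standard Gaussian vector g to the set lambda * (subdifferential of f at x_0); the abstract does not spell this definition out, so treat it as an assumption. For the l1-norm the distance splits by coordinate: on the support of x_0 it is (g_i - lambda * sign(x_0,i))^2, and off the support it is the squared soft-thresholded value of g_i. The function names are hypothetical.

```python
import math
import numpy as np

def l1_gaussian_distance_mc(x0, lam, num_samples=20000, seed=0):
    """Monte Carlo estimate of E[dist^2(g, lam * subdiff ||x0||_1)], g ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    support = x0 != 0
    signs = np.sign(x0[support])
    g = rng.standard_normal((num_samples, x0.size))
    # On the support the subgradient is pinned to lam * sign(x0_i).
    on = (g[:, support] - lam * signs) ** 2
    # Off the support it ranges over [-lam, lam], so only the excess counts.
    off = np.maximum(np.abs(g[:, ~support]) - lam, 0.0) ** 2
    return float(np.mean(on.sum(axis=1) + off.sum(axis=1)))

def l1_gaussian_distance_exact(n, k, lam):
    """Closed form of the same quantity for a k-sparse vector in R^n."""
    tail = 0.5 * math.erfc(lam / math.sqrt(2.0))               # P(g > lam)
    density = math.exp(-0.5 * lam**2) / math.sqrt(2.0 * math.pi)
    off = 2.0 * ((1.0 + lam**2) * tail - lam * density)        # E[(|g| - lam)_+^2]
    return k * (1.0 + lam**2) + (n - k) * off
```

Only the support size, the dimension and lambda enter the closed form, which matches the abstract's point that a single geometric quantity summarizes the roles of the regularizer, its weight and the signal.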

arXiv.org e-Print Archive

Crossref

Caltech Authors