Differentially Private Empirical Risk Minimization
Privacy-preserving machine learning algorithms are crucial for the
increasingly common setting in which personal data, such as medical or
financial records, are analyzed. We provide general techniques to produce
privacy-preserving approximations of classifiers learned via (regularized)
empirical risk minimization (ERM). These algorithms are private under the
ε-differential privacy definition due to Dwork et al. (2006). First we
apply the output perturbation ideas of Dwork et al. (2006) to ERM
classification. Then we propose a new method, objective perturbation, for
privacy-preserving machine learning algorithm design. This method entails
perturbing the objective function before optimizing over classifiers. If the
loss and regularizer satisfy certain convexity and differentiability criteria,
we prove theoretical results showing that our algorithms preserve privacy, and
provide generalization bounds for linear and nonlinear kernels. We further
present a privacy-preserving technique for tuning the parameters in general
machine learning algorithms, thereby providing end-to-end privacy guarantees
for the training process. We apply these results to produce privacy-preserving
analogues of regularized logistic regression and support vector machines. We
obtain encouraging results from evaluating their performance on real
demographic and benchmark data sets. Our results show that both theoretically
and empirically, objective perturbation is superior to the previous
state-of-the-art, output perturbation, in managing the inherent tradeoff
between privacy and learning performance.
Comment: 40 pages, 7 figures, accepted to the Journal of Machine Learning Research
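The output perturbation idea above can be sketched concretely: train a strongly convex regularized classifier, then add noise calibrated to the sensitivity of its minimizer. This is a minimal illustrative sketch, not the paper's exact algorithm; the function names, the gradient-descent solver, and the sensitivity bound of roughly 2/(n·λ) for L2-regularized ERM with unit-norm features are assumptions made for the example.

```python
# Hedged sketch of output perturbation for L2-regularized logistic regression.
# The noise calibration (sensitivity ~ 2/(n*lam), gamma-distributed noise norm)
# is an illustrative assumption, not the paper's exact mechanism.
import numpy as np

def train_erm(X, y, lam, steps=500, lr=0.1):
    """Plain (non-private) regularized logistic regression via gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        # Gradient of mean logistic loss plus (lam/2)*||w||^2
        grad = -(X * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w

def output_perturbation(X, y, lam, eps, rng):
    """Add noise scaled to the (assumed) L2 sensitivity of the ERM minimizer."""
    w = train_erm(X, y, lam)
    n, d = X.shape
    sensitivity = 2.0 / (n * lam)  # assumed bound for unit-norm features
    # Noise with uniformly random direction and gamma-distributed L2 norm
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    norm = rng.gamma(shape=d, scale=sensitivity / eps)
    return w + norm * direction

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ np.ones(5))
w_priv = output_perturbation(X, y, lam=0.1, eps=1.0, rng=rng)
```

Note the privacy/utility tradeoff visible in the code: smaller λ or ε inflates the noise norm, which is exactly the regime where the abstract reports objective perturbation doing better.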
Differentially Private Empirical Risk Minimization with Sparsity-Inducing Norms
Differential privacy seeks to preserve prediction quality while limiting
the privacy impact on individuals whose information is contained in
the data. We consider differentially private risk minimization problems with
regularizers that induce structured sparsity. These regularizers are known to
be convex but they are often non-differentiable. We analyze the standard
differentially private algorithms, such as output perturbation, Frank-Wolfe and
objective perturbation. Output perturbation is a differentially private
algorithm that is known to perform well for minimizing risks that are strongly
convex. Previous works have derived excess risk bounds that are independent of
the dimensionality. In this paper, we assume a particular class of convex but
non-smooth regularizers that induce structured sparsity and loss functions for
generalized linear models. We also consider differentially private Frank-Wolfe
algorithms to optimize the dual of the risk minimization problem. We derive
excess risk bounds for both these algorithms. Both the bounds depend on the
Gaussian width of the unit ball of the dual norm. We also show that objective
perturbation of the risk minimization problems is equivalent to the output
perturbation of a dual optimization problem. This is the first work that
analyzes the dual optimization problems of risk minimization problems in the
context of differential privacy.
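Objective perturbation, mentioned in both abstracts above, moves the noise from the output into the training objective: a random linear term is added before optimizing. The sketch below is an illustrative assumption, not either paper's calibrated mechanism; in particular the noise scale 2/ε and the gradient-descent solver are placeholders, and real calibrations depend on smoothness and convexity constants of the loss.

```python
# Hedged sketch of objective perturbation: minimize the regularized empirical
# risk plus a random linear term (b/n) . w. Noise scale is illustrative only.
import numpy as np

def objective_perturbation(X, y, lam, eps, rng, steps=500, lr=0.1):
    n, d = X.shape
    # Random linear perturbation with gamma-distributed norm (assumed scale)
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    b = rng.gamma(shape=d, scale=2.0 / eps) * direction
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        # Gradient of mean logistic loss + (lam/2)*||w||^2 + (b/n) . w
        grad = (-(X * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0)
                + lam * w + b / n)
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = np.sign(X @ np.ones(3))
w_priv = objective_perturbation(X, y, lam=0.1, eps=1.0, rng=rng)
```

Because the perturbation enters the objective rather than the solution, the optimizer can partially absorb it, which is one intuition for the equivalence with output perturbation of a dual problem claimed above.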
Efficient Private ERM for Smooth Objectives
In this paper, we consider efficient differentially private empirical risk
minimization from the viewpoint of optimization algorithms. For strongly convex
and smooth objectives, we prove that gradient descent with output perturbation
not only achieves nearly optimal utility, but also significantly improves the
running time of previous state-of-the-art private optimization algorithms, for
both ε-DP and (ε, δ)-DP. For non-convex but smooth
objectives, we propose an RRPSGD (Random Round Private Stochastic Gradient
Descent) algorithm, which provably converges to a stationary point with privacy
guarantee. Besides the expected utility bounds, we also provide guarantees in
high probability form. Experiments demonstrate that our algorithm consistently
outperforms existing methods in both utility and running time.
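The gradient-perturbation approach underlying private SGD methods like the one above can be sketched as follows. This is a generic DP-SGD-style loop, not the paper's RRPSGD algorithm; the clipping threshold and noise multiplier are placeholders, and a real (ε, δ)-DP guarantee requires calibrating σ through a composition analysis over the iterations.

```python
# Hedged sketch of gradient perturbation: clip each gradient's norm, add
# Gaussian noise, take a step. Parameters are illustrative, not calibrated.
import numpy as np

def noisy_gd(grad_fn, w0, n_steps, lr, clip, sigma, rng):
    w = w0.copy()
    for _ in range(n_steps):
        g = grad_fn(w)
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # clip L2 norm
        g = g + rng.normal(scale=sigma * clip, size=g.shape)  # Gaussian noise
        w -= lr * g
    return w

rng = np.random.default_rng(2)
A = np.diag([1.0, 10.0])
grad = lambda w: A @ w  # gradient of the smooth objective 0.5 * w^T A w
w = noisy_gd(grad, np.array([5.0, 5.0]), n_steps=200,
             lr=0.05, clip=50.0, sigma=0.01, rng=rng)
```

For non-convex smooth objectives, the same loop only promises convergence to a stationary point rather than a global minimum, which matches the guarantee stated in the abstract.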
Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
Machine learning models can leak information about the data used to train them. To mitigate this issue, Differentially Private (DP) variants of optimization algorithms like Stochastic Gradient Descent (DP-SGD) have been designed to trade off utility for privacy in Empirical Risk Minimization (ERM) problems. In this paper, we propose Differentially Private proximal Coordinate Descent (DP-CD), a new method to solve composite DP-ERM problems. We derive utility guarantees through a novel theoretical analysis of inexact coordinate descent. Our results show that, thanks to larger step sizes, DP-CD can exploit imbalance in gradient coordinates to outperform DP-SGD. We also prove new lower bounds for composite DP-ERM under coordinate-wise regularity assumptions, which are nearly matched by DP-CD. For practical implementations, we propose to clip gradients using coordinate-wise thresholds that emerge from our theory, avoiding costly hyperparameter tuning. Experiments on real and synthetic data support our results, and show that DP-CD compares favorably with DP-SGD.
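The coordinate-wise clipping idea above can be illustrated with a minimal sketch. This is not the paper's DP-CD algorithm: the cyclic coordinate selection, per-coordinate thresholds, step sizes, and noise scale below are all illustrative assumptions, and the sketch omits the proximal step that handles the composite (non-smooth) part of the objective.

```python
# Hedged sketch of private coordinate descent with coordinate-wise clipping,
# in the spirit of DP-CD. All parameters here are illustrative placeholders.
import numpy as np

def dp_cd_sketch(grad_fn, w0, n_steps, step_sizes, clips, sigma, rng):
    """Update one coordinate per step with per-coordinate clip and noise."""
    w = w0.copy()
    d = w.shape[0]
    for t in range(n_steps):
        j = t % d  # cyclic selection (random selection is also common)
        g_j = grad_fn(w)[j]  # full gradient computed for simplicity
        g_j = np.clip(g_j, -clips[j], clips[j])    # coordinate-wise clipping
        g_j += rng.normal(scale=sigma * clips[j])  # per-coordinate noise
        w[j] -= step_sizes[j] * g_j
    return w

rng = np.random.default_rng(3)
A = np.diag([1.0, 100.0])
grad = lambda w: A @ w  # quadratic with imbalanced coordinate smoothness
step_sizes = np.array([0.5, 0.005])  # larger step on the smoother coordinate
clips = np.array([5.0, 200.0])
w = dp_cd_sketch(grad, np.array([1.0, 1.0]), n_steps=400,
                 step_sizes=step_sizes, clips=clips, sigma=0.001, rng=rng)
```

The quadratic example makes the abstract's point tangible: the two coordinates have very different smoothness, so per-coordinate step sizes and clipping thresholds let each coordinate make progress at its own scale, which is where DP-CD gains over a single global step size and clip.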