Differentially Private Empirical Risk Minimization
Privacy-preserving machine learning algorithms are crucial for the
increasingly common setting in which personal data, such as medical or
financial records, are analyzed. We provide general techniques to produce
privacy-preserving approximations of classifiers learned via (regularized)
empirical risk minimization (ERM). These algorithms are private under the
ε-differential privacy definition due to Dwork et al. (2006). First, we
apply the output perturbation ideas of Dwork et al. (2006) to ERM
classification. Then we propose a new method, objective perturbation, for
privacy-preserving machine learning algorithm design. This method entails
perturbing the objective function before optimizing over classifiers. If the
loss and regularizer satisfy certain convexity and differentiability criteria,
we prove theoretical results showing that our algorithms preserve privacy, and
provide generalization bounds for linear and nonlinear kernels. We further
present a privacy-preserving technique for tuning the parameters in general
machine learning algorithms, thereby providing end-to-end privacy guarantees
for the training process. We apply these results to produce privacy-preserving
analogues of regularized logistic regression and support vector machines. We
obtain encouraging results from evaluating their performance on real
demographic and benchmark data sets. Our results show that both theoretically
and empirically, objective perturbation is superior to the previous
state-of-the-art, output perturbation, in managing the inherent tradeoff
between privacy and learning performance.

Comment: 40 pages, 7 figures, accepted to the Journal of Machine Learning
Research
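As a rough illustration of the output perturbation idea described above, here is a minimal sketch, not the paper's exact construction: for an L2-regularized ERM objective that is λ-strongly convex, the minimizer has L2 sensitivity on the order of 2/(nλ), so the trained weights can be released after adding noise whose norm follows a Gamma distribution with scale 2/(nλε) and whose direction is uniform. The function name and argument layout are hypothetical.

```python
import numpy as np

def output_perturbation(w_nonprivate, n, lam, eps, rng):
    """Release a privatized copy of trained ERM weights.

    Adds noise calibrated to the L2 sensitivity 2/(n*lam) of a
    lam-strongly-convex regularized ERM objective: a uniformly random
    direction scaled by a Gamma(d, 2/(n*lam*eps))-distributed norm.
    """
    d = w_nonprivate.shape[0]
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)          # uniform on the sphere
    norm = rng.gamma(shape=d, scale=2.0 / (n * lam * eps))
    return w_nonprivate + norm * direction
```

Note how the noise scale shrinks as the sample size n, the regularization strength λ, or the privacy budget ε grows, which is the privacy/learning tradeoff the abstract refers to.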
Empirical risk minimization in inverse problems
We study estimation of a multivariate function f when the observations are
available from the function Af, where A is a known linear operator. Both the
Gaussian white noise model and density estimation are studied. We define an
L2-empirical risk functional which is used to define a δ-net minimizer and a
dense empirical risk minimizer.
Upper bounds for the mean integrated squared error of the estimators are given.
The upper bounds show how the difficulty of the estimation depends on the
operator through the norm of the adjoint of the inverse of the operator and on
the underlying function class through the entropy of the class. Corresponding
lower bounds are also derived. As examples, we consider convolution operators
and the Radon transform. In these examples, the estimators achieve the optimal
rates of convergence. Furthermore, a new type of oracle inequality is given for
inverse problems in additive models.

Comment: Published at http://dx.doi.org/10.1214/09-AOS726 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
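To make the setting concrete, here is one standard way such an L2-empirical risk can be derived in the density estimation case, written in assumed notation rather than quoted from the paper. Since ||g − f||² = ||g||² − 2⟨g, f⟩ + ||f||², and ⟨g, f⟩ = ⟨(A⁻¹)*g, Af⟩ = E[((A⁻¹)*g)(X)] when X is drawn from the density Af, the only unknown term can be estimated from the sample, which is exactly where the adjoint of the inverse of the operator enters the difficulty of the problem:

```latex
% Sketch in assumed notation: X_1,\dots,X_n drawn from the density Af.
\gamma_n(g) \;=\; \|g\|_{L_2}^2
  \;-\; \frac{2}{n}\sum_{i=1}^{n}\bigl((A^{-1})^{*}g\bigr)(X_i),
\qquad
\hat f \;\in\; \operatorname*{arg\,min}_{g \in \mathcal{G}} \; \gamma_n(g).
```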
On concentration for (regularized) empirical risk minimization
Rates of convergence for empirical risk minimizers have been well studied in
the literature. In this paper, we aim to provide a complementary set of
results, in particular by showing that after normalization, the risk of the
empirical minimizer concentrates on a single point. Such results have been
established by Chatterjee (2014) for constrained estimators in the
normal sequence model. We first generalize and sharpen this result to
regularized least squares with convex penalties, making use of a "direct"
argument based on Borell's theorem. We then study generalizations to other loss
functions, including the negative log-likelihood for exponential families
combined with a strictly convex regularization penalty. The results in this
general setting are based on more "indirect" arguments as well as on
concentration inequalities for maxima of empirical processes.

Comment: 27 pages
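For context, the normal-sequence-model result being generalized can be stated loosely as follows, in a paraphrase with assumed notation: for observations y = θ* + ε with ε ~ N(0, σ²Iₙ) and the least squares estimator constrained to a closed convex set K, the error norm ||θ̂ − θ*|| concentrates around the deterministic maximizer t* of a Gaussian-width functional:

```latex
% Paraphrase of the Chatterjee (2014)-type statement; notation assumed.
\hat\theta \;=\; \operatorname*{arg\,min}_{\theta \in K} \|y - \theta\|^2,
\qquad
t_* \;=\; \operatorname*{arg\,max}_{t \ge 0}
  \Bigl( \, \mathbb{E}\sup_{\substack{\theta \in K \\ \|\theta - \theta^*\| \le t}}
  \langle \varepsilon,\, \theta - \theta^* \rangle \;-\; \tfrac{t^2}{2} \Bigr),
```

so that, with high probability, ||θ̂ − θ*|| = t*(1 + o(1)); concentration of the (normalized) risk on a single point is the analogous phenomenon studied here for regularized estimators.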
Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk.

Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotated data).
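In the supervised case, empirical risk minimization with the log-loss for a probabilistic grammar reduces to relative-frequency (count-and-normalize) estimation of the rule probabilities. A minimal sketch of that reduction, with a hypothetical function name and input format:

```python
from collections import Counter, defaultdict

def mle_rule_probs(derivations):
    """Supervised log-loss ERM for a probabilistic grammar.

    `derivations` is a list of derivations, each a list of (lhs, rhs)
    rule applications (hypothetical format). Minimizing empirical
    log-loss over rule probabilities, subject to the probabilities of
    rules sharing a left-hand side summing to one, yields the
    relative-frequency estimate count(rule) / count(lhs).
    """
    counts = Counter(rule for d in derivations for rule in d)
    lhs_totals = defaultdict(float)
    for (lhs, _rhs), c in counts.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}
```

The NP-hardness result quoted above concerns the unsupervised setting, where derivations are latent and no such closed form exists, motivating the EM-like approximation.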
Explainable Empirical Risk Minimization
The widespread use of modern machine learning methods in decision making
crucially depends on their interpretability or explainability. The human users
(decision makers) of machine learning methods are often not only interested in
getting accurate predictions or projections. Rather, as a decision-maker, the
user also needs a convincing answer (or explanation) to the question of why a
particular prediction was delivered. Explainable machine learning might be a
legal requirement when used for decision making with an immediate effect on the
health of human beings. As an example consider the computer vision of a
self-driving car whose predictions are used to decide whether to stop the car. We
have recently proposed an information-theoretic approach to construct
personalized explanations for predictions obtained from ML. This method was
model-agnostic and only required some training samples of the model to be
explained along with a user feedback signal. This paper uses an
information-theoretic measure for the quality of an explanation to learn
predictors that are intrinsically explainable to a specific user. Our approach
is not restricted to a particular hypothesis space, such as linear maps or
shallow decision trees, whose predictor maps are considered as explainable by
definition. Rather, we regularize an arbitrary hypothesis space using a
personalized measure for the explainability of a particular predictor
- …
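As a hypothetical illustration of regularizing an arbitrary hypothesis space toward explainability: the paper's actual measure is information-theoretic and personalized to a user, but the structure of the resulting ERM problem can be sketched with a simple stand-in penalty that pulls the learned predictor toward a user-supplied reference predictor u that the user already understands. All names below are illustrative, not from the paper.

```python
import numpy as np

def explainable_erm(X, y, u, lam):
    """Sketch of explainability-regularized least squares.

    Minimizes ||X w - y||^2 + lam * ||w - u||^2, where the squared
    distance to a user-provided reference predictor u stands in for a
    personalized explainability penalty. Solved in closed form via the
    normal equations (X^T X + lam I) w = X^T y + lam u.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * u
    return np.linalg.solve(A, b)
```

As λ grows, the learned predictor collapses onto the user-explainable reference; as λ → 0, it reverts to ordinary ERM, mirroring the explainability/accuracy tradeoff the abstract describes.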