Differentially Private Convex Optimization with Piecewise Affine Objectives
Differential privacy is a recently proposed notion of privacy that provides
strong privacy guarantees without any assumptions on the adversary. The paper
studies the problem of computing a differentially private solution to convex
optimization problems whose objective function is piecewise affine. Such
problems are motivated by applications in which the affine functions that
define the objective contain sensitive user information. We propose several
privacy-preserving mechanisms and analyze the trade-off between optimality
and the level of privacy for each mechanism. Numerical experiments are also
presented to evaluate their performance in practice.
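One simple member of this family of mechanisms is output perturbation: compute the non-private minimizer of the piecewise affine objective, then release it with Laplace noise. The sketch below is a minimal illustration under assumed values; the objective coefficients, the grid search, and the `sensitivity` and `epsilon` constants are hypothetical placeholders, not the paper's mechanisms or analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Piecewise affine objective f(x) = max_i (a[i] * x + b[i]); the (a_i, b_i)
# pairs play the role of the sensitive user data.
a = np.array([1.0, -0.5, 2.0])
b = np.array([0.0, 1.0, -1.0])

def f(x):
    return np.max(a * x + b)

# Non-private minimizer found by grid search (illustration only).
grid = np.linspace(-5, 5, 10001)
x_star = grid[np.argmin([f(x) for x in grid])]

# Output perturbation: add Laplace noise scaled to sensitivity / epsilon.
# `sensitivity` is a hypothetical bound on how much x_star can move when one
# user's affine piece changes; it must be derived for a concrete setting.
epsilon = 1.0
sensitivity = 0.5
x_private = x_star + rng.laplace(scale=sensitivity / epsilon)
```

The trade-off the abstract refers to is visible here: a smaller `epsilon` (stronger privacy) inflates the noise scale and pushes `x_private` further from the true minimizer.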
Differentially Private Empirical Risk Minimization
Privacy-preserving machine learning algorithms are crucial for the
increasingly common setting in which personal data, such as medical or
financial records, are analyzed. We provide general techniques to produce
privacy-preserving approximations of classifiers learned via (regularized)
empirical risk minimization (ERM). These algorithms are private under the
ε-differential privacy definition due to Dwork et al. (2006). First, we
apply the output perturbation ideas of Dwork et al. (2006) to ERM
classification. Then we propose a new method, objective perturbation, for
privacy-preserving machine learning algorithm design. This method entails
perturbing the objective function before optimizing over classifiers. If the
loss and regularizer satisfy certain convexity and differentiability criteria,
we prove theoretical results showing that our algorithms preserve privacy, and
provide generalization bounds for linear and nonlinear kernels. We further
present a privacy-preserving technique for tuning the parameters in general
machine learning algorithms, thereby providing end-to-end privacy guarantees
for the training process. We apply these results to produce privacy-preserving
analogues of regularized logistic regression and support vector machines. We
obtain encouraging results from evaluating their performance on real
demographic and benchmark data sets. Our results show that both theoretically
and empirically, objective perturbation is superior to the previous
state-of-the-art, output perturbation, in managing the inherent tradeoff
between privacy and learning performance.
Comment: 40 pages, 7 figures; accepted to the Journal of Machine Learning Research.
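Objective perturbation can be sketched for regularized logistic regression as: draw a random vector once, add the linear term bᵀw/n to the objective before optimizing, then run an ordinary solver. Everything below (the toy data, the noise scale `2.0 / epsilon`, the step size) is an illustrative assumption; the paper calibrates the noise distribution to convexity and differentiability bounds on the loss.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: labels in {-1, +1}; rows play the role of sensitive records.
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = np.sign(X @ w_true + 0.1 * rng.normal(size=200))

n, d = X.shape
lam = 0.1       # L2 regularization strength
epsilon = 1.0   # hypothetical privacy budget

# Objective perturbation: add (b @ w) / n to the regularized logistic loss
# BEFORE optimizing. The Gaussian scale here is a placeholder, not the
# calibrated distribution from the paper's privacy proof.
b = rng.normal(scale=2.0 / epsilon, size=d)

def grad(w):
    z = y * (X @ w)
    # gradient of mean logistic loss + L2 regularizer + linear noise term
    g_loss = -(X * (y / (1 + np.exp(z)))[:, None]).mean(axis=0)
    return g_loss + lam * w + b / n

w = np.zeros(d)
for _ in range(2000):
    w -= 0.5 * grad(w)  # plain gradient descent on the perturbed objective
```

Because the perturbation is folded into the objective rather than added to the output, the released classifier `w` is an exact minimizer of a randomly shifted problem, which is the source of objective perturbation's better privacy/accuracy trade-off.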
Gaussian Mechanisms Against Statistical Inference: Synthesis Tools
In this manuscript, we provide a set of tools (in terms of semidefinite programs) to synthesize Gaussian mechanisms that maximize the privacy of databases. Information about the database is disclosed through queries requested by (potentially) adversarial users. We aim to keep part of the database private (the sensitive information); however, disclosed data could be used to estimate this private information. To prevent accurate estimation by adversaries, we pass the requested data through distorting (privacy-preserving) mechanisms before transmission and send the distorted data to the user. These mechanisms consist of a coordinate transformation and an additive dependent Gaussian vector. We formulate the synthesis of distorting mechanisms as semidefinite programs in which we minimize the mutual information (our privacy metric) between private data and the disclosed distorted data, subject to a desired distortion level (a bound on how different the actual and distorted data are allowed to be).
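For jointly Gaussian private and disclosed data, the mutual information used as the privacy metric has a closed form, which the sketch below evaluates for a hand-picked mechanism. The matrices `G` and `Sigma_Z` stand in for the quantities the semidefinite program would actually optimize; they are assumptions for illustration only.

```python
import numpy as np

# Private data S ~ N(0, Sigma_S); disclosed data Y = G @ S + Z with
# Z ~ N(0, Sigma_Z) independent of S. G (coordinate transformation) and
# Sigma_Z (additive Gaussian noise) are fixed by hand here.
Sigma_S = np.array([[2.0, 0.5],
                    [0.5, 1.0]])
G = np.array([[1.0, 0.0],
              [0.0, 0.2]])
Sigma_Z = np.eye(2)

# For jointly Gaussian (S, Y):
#   I(S; Y) = 0.5 * log( det(Sigma_Y) / det(Sigma_{Y|S}) ),
# and Sigma_{Y|S} = Sigma_Z because Z is independent of S.
Sigma_Y = G @ Sigma_S @ G.T + Sigma_Z
mi = 0.5 * np.log(np.linalg.det(Sigma_Y) / np.linalg.det(Sigma_Z))

# More additive noise lowers the mutual information (more privacy),
# at the cost of a larger distortion of the disclosed data.
Sigma_Z_big = 25.0 * np.eye(2)
Sigma_Y_big = G @ Sigma_S @ G.T + Sigma_Z_big
mi_big = 0.5 * np.log(np.linalg.det(Sigma_Y_big) / np.linalg.det(Sigma_Z_big))
```

The synthesis problem in the paper chooses `G` and `Sigma_Z` to drive this quantity down while keeping the distortion below the desired level.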
Information-Theoretic Privacy through Chaos Synchronization and Optimal Additive Noise
We study the problem of maximizing privacy of data sets by adding random
vectors generated via synchronized chaotic oscillators. In particular, we
consider the setup where information about data sets, queries, is sent through
public (unsecured) communication channels to a remote station. To hide private
features (specific entries) within the data set, we corrupt the response to
queries by adding random vectors. We send the distorted query (the sum of the
requested query and the random vector) through the public channel. The
distribution of the additive random vector is designed to minimize the mutual
information (our privacy metric) between private entries of the data set and
the distorted query. We cast the synthesis of this distribution as a convex
program in the probabilities of the additive random vector. Once we have the
optimal distribution, we propose an algorithm to generate pseudo-random
realizations from this distribution using trajectories of a chaotic oscillator.
At the other end of the channel, we have a second chaotic oscillator, which we
use to generate realizations from the same distribution. Note that if we obtain
the same realizations on both sides of the channel, we can simply subtract the
realization from the distorted query to recover the requested query. To
generate equal realizations, we need the two chaotic oscillators to be
synchronized, i.e., we need them to generate exactly the same trajectories on
both sides of the channel synchronously in time. We force the two chaotic
oscillators into exponential synchronization using a driving signal.
Simulations are presented to illustrate our results.
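The synchronization ingredient can be illustrated with the classic Pecora-Carroll drive-response scheme for the Lorenz system, a standard example of exponential synchronization through a scalar driving signal. This is a stand-in for, not a reproduction of, the paper's construction: the receiver copies the sender's transmitted x state and integrates only its own (y, z) subsystem, which contracts, so the two oscillators converge to the same trajectory.

```python
# Lorenz parameters and a simple forward-Euler integration.
sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
dt, steps = 1e-3, 100_000

# Sender state and receiver state start from different initial conditions;
# only xs (the driving signal) is "transmitted" over the channel.
xs, ys, zs = 1.0, 1.0, 1.0
yr, zr = -5.0, 20.0

for _ in range(steps):
    # sender: full Lorenz dynamics
    dxs = sigma * (ys - xs)
    dys = xs * (rho - zs) - ys
    dzs = xs * ys - beta * zs
    # receiver: same (y, z) dynamics, but driven by the transmitted xs
    dyr = xs * (rho - zr) - yr
    dzr = xs * yr - beta * zr
    xs, ys, zs = xs + dt * dxs, ys + dt * dys, zs + dt * dzs
    yr, zr = yr + dt * dyr, zr + dt * dzr

# After synchronization, both ends hold the same trajectory and can seed
# identical pseudo-random realizations, as in the paper's scheme.
sync_error = abs(ys - yr) + abs(zs - zr)
```

The error dynamics of the (y, z) subsystem are strictly contracting for these parameters, so `sync_error` decays exponentially regardless of the mismatched initial conditions.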