Differentially Private Empirical Risk Minimization with Sparsity-Inducing Norms
Differential privacy is concerned with preserving prediction quality while
controlling the privacy impact on individuals whose information is contained
in the data. We consider differentially private risk minimization problems with
regularizers that induce structured sparsity. These regularizers are known to
be convex but they are often non-differentiable. We analyze the standard
differentially private algorithms, such as output perturbation, Frank-Wolfe and
objective perturbation. Output perturbation is a differentially private
algorithm that is known to perform well for minimizing risks that are strongly
convex. Previous works have derived excess risk bounds that are independent of
the dimensionality. In this paper, we assume a particular class of convex but
non-smooth regularizers that induce structured sparsity and loss functions for
generalized linear models. We also consider differentially private Frank-Wolfe
algorithms to optimize the dual of the risk minimization problem. We derive
excess risk bounds for both these algorithms. Both the bounds depend on the
Gaussian width of the unit ball of the dual norm. We also show that objective
perturbation of the risk minimization problems is equivalent to the output
perturbation of a dual optimization problem. This is the first work that
analyzes the dual optimization problems of risk minimization problems in the
context of differential privacy.
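Output perturbation, mentioned above, releases the exact minimizer plus noise calibrated to how much one individual's record can move that minimizer. A minimal sketch, assuming an L-Lipschitz loss and a lam-strongly convex regularized objective (the standard setting in which the minimizer's L2-sensitivity is bounded by 2L/(n·lam)); this illustrates the generic mechanism, not the specific algorithms analyzed in the paper:

```python
import numpy as np

def output_perturbation(theta_hat, n, lam, L, eps, delta):
    """Release an ERM minimizer with (eps, delta)-differential privacy via
    output perturbation. Generic sketch: assumes the loss is L-Lipschitz
    and the regularized objective is lam-strongly convex, so the
    L2-sensitivity of the minimizer to one changed example is at most
    2*L / (n * lam)."""
    sensitivity = 2.0 * L / (n * lam)
    # Gaussian mechanism: noise scale for (eps, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noise = np.random.normal(0.0, sigma, size=theta_hat.shape)
    return theta_hat + noise
```

For n = 1000, lam = 0.1, L = 1, the noise scale is roughly 0.1 per coordinate, which shrinks as the dataset grows; this dimension-dependence of the noise is exactly what the Gaussian-width bounds in the paper refine.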
Differentially Private Decomposable Submodular Maximization
We study the problem of differentially private constrained maximization of
decomposable submodular functions. A submodular function is decomposable if it
takes the form of a sum of submodular functions. The special case of maximizing
a monotone, decomposable submodular function under cardinality constraints is
known as the Combinatorial Public Projects (CPP) problem [Papadimitriou et al.,
2008]. Previous work by Gupta et al. [2010] gave a differentially private
algorithm for the CPP problem. We extend this work by designing differentially
private algorithms for both monotone and non-monotone decomposable submodular
maximization under general matroid constraints, with competitive utility
guarantees. We complement our theoretical bounds with experiments demonstrating
empirical performance, which improves over the differentially private
algorithms for the general case of submodular maximization and is close to the
performance of non-private algorithms.
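The standard template for private monotone submodular maximization under a cardinality constraint, in the spirit of Gupta et al. [2010], is to run the greedy algorithm but make each pick with the exponential mechanism. A hedged sketch (function and parameter names are illustrative, and the budget split eps/k per step is the simplest composition, not necessarily the one used in the paper):

```python
import numpy as np

def dp_greedy_max(marginal_gain, ground_set, k, eps, sensitivity=1.0):
    """Differentially private greedy maximization of a monotone submodular
    function under a cardinality constraint (sketch in the spirit of
    Gupta et al. [2010]). Each of the k greedy steps selects an element via
    the exponential mechanism, spending eps/k; `sensitivity` bounds how much
    one individual's data can change any marginal gain."""
    rng = np.random.default_rng(0)
    chosen = []
    remaining = list(ground_set)
    eps_step = eps / k
    for _ in range(k):
        gains = np.array([marginal_gain(chosen, e) for e in remaining])
        # Exponential mechanism: P(e) proportional to
        # exp(eps_step * gain(e) / (2 * sensitivity)).
        scores = eps_step * gains / (2.0 * sensitivity)
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        pick = rng.choice(len(remaining), p=probs)
        chosen.append(remaining.pop(pick))
    return chosen
```

For decomposable functions each user contributes one summand, which is what makes the per-step sensitivity bound small and the utility guarantee competitive.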
Characterizing the Sample Complexity of Private Learners
In 2008, Kasiviswanathan et al. defined private learning as a combination of
PAC learning and differential privacy. Informally, a private learner is applied
to a collection of labeled individual information and outputs a hypothesis
while preserving the privacy of each individual. Kasiviswanathan et al. gave a
generic construction of private learners for (finite) concept classes, with
sample complexity logarithmic in the size of the concept class. This sample
complexity is higher than what is needed for non-private learners, hence
leaving open the possibility that the sample complexity of private learning may
be sometimes significantly higher than that of non-private learning.
We give a combinatorial characterization of the sample size sufficient and
necessary to privately learn a class of concepts. This characterization is
analogous to the well known characterization of the sample complexity of
non-private learning in terms of the VC dimension of the concept class. We
introduce the notion of probabilistic representation of a concept class, and
our new complexity measure RepDim corresponds to the size of the smallest
probabilistic representation of the concept class.
We show that any private learning algorithm for a concept class C with sample
complexity m implies RepDim(C)=O(m), and that there exists a private learning
algorithm with sample complexity m=O(RepDim(C)). We further demonstrate that a
similar characterization holds for the database size needed for privately
computing a large class of optimization problems and also for the well studied
problem of private data release.
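The generic construction of Kasiviswanathan et al. referenced above selects a hypothesis from a finite class with the exponential mechanism, scored by empirical error; its analysis is what gives the O(log |C|) sample complexity. A minimal sketch of that construction (helper names are illustrative):

```python
import numpy as np

def private_learner(hypotheses, data, eps):
    """Generic private learner for a finite concept class, sketching the
    exponential-mechanism construction of Kasiviswanathan et al.
    `hypotheses` is a list of functions x -> {0, 1}; `data` is a list of
    (x, y) labeled examples. The score of a hypothesis is minus its number
    of mistakes, which has sensitivity 1 to one changed example."""
    rng = np.random.default_rng(0)
    errors = np.array([sum(h(x) != y for x, y in data) for h in hypotheses])
    scores = -eps * errors / 2.0  # exponential mechanism, sensitivity 1
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return hypotheses[rng.choice(len(hypotheses), p=probs)]
```

The characterization in this paper replaces "one hypothesis per concept" with a probabilistic representation, which is why RepDim, rather than log |C|, is the right measure.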
Optimal Lower Bounds for Universal and Differentially Private Steiner Tree and TSP
Given a metric space on n points, an {\alpha}-approximate universal algorithm
for the Steiner tree problem outputs a distribution over rooted spanning trees
such that for any subset X of vertices containing the root, the expected cost
of the induced subtree is within an {\alpha} factor of the optimal Steiner tree
cost for X. An {\alpha}-approximate differentially private algorithm for the
Steiner tree problem takes as input a subset X of vertices, and outputs a tree
distribution that induces a solution within an {\alpha} factor of the optimal
as before, and satisfies the additional property that for any set X' that
differs in a single vertex from X, the tree distributions for X and X' are
"close" to each other. Universal and differentially private algorithms for TSP
are defined similarly. An {\alpha}-approximate universal algorithm for the
Steiner tree problem or TSP is also an {\alpha}-approximate differentially
private algorithm. It is known that both problems admit O(log n)-approximate
universal algorithms, and hence O(log n)-approximate differentially private
algorithms as well. We prove an \Omega(log n) lower bound on the approximation
ratio achievable for the universal Steiner tree problem and the universal TSP,
matching the known upper bounds. Our lower bound for the Steiner tree problem
holds even when the algorithm is allowed to output a more general solution of a
distribution on paths to the root.
Differentially Private Convex Optimization with Piecewise Affine Objectives
Differential privacy is a recently proposed notion of privacy that provides
strong privacy guarantees without any assumptions on the adversary. The paper
studies the problem of computing a differentially private solution to convex
optimization problems whose objective function is piecewise affine. This
problem is motivated by applications in which the affine functions that define
the objective function contain sensitive user information. We propose several
privacy preserving mechanisms and provide analysis on the trade-offs between
optimality and the level of privacy for these mechanisms. Numerical experiments
are also presented to evaluate their performance in practice.
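To make the setting concrete: a piecewise affine objective f(x) = max_i (a_i·x + b_i) is convex but non-smooth, and one simple (illustrative, not one of the paper's proposed mechanisms) way to privatize it is to minimize by subgradient descent and release the minimizer with Laplace output perturbation; the `sensitivity` parameter, how far one user's affine piece can move the minimizer, is assumed given here:

```python
import numpy as np

def private_piecewise_affine_min(A, b, eps, sensitivity, steps=2000, lr=0.05):
    """Minimize f(x) = max_i (A[i]·x + b[i]) by subgradient descent, then
    release the result with Laplace output perturbation. An illustrative
    sketch under the stated assumptions, not the paper's mechanisms."""
    d = A.shape[1]
    x = np.zeros(d)
    for t in range(1, steps + 1):
        i = int(np.argmax(A @ x + b))    # index of the active affine piece
        x -= (lr / np.sqrt(t)) * A[i]    # subgradient step with decaying rate
    noise = np.random.default_rng(0).laplace(0.0, sensitivity / eps, size=d)
    return x + noise
```

The trade-off the paper analyzes shows up directly: a larger eps means less Laplace noise and better optimality, at the cost of weaker privacy.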
Learning Coverage Functions and Private Release of Marginals
We study the problem of approximating and learning coverage functions. A
function $c: 2^{[n]} \to \mathbb{R}^{+}$ is a coverage function if there exists
a universe $U$ with non-negative weights $w(u)$ for each $u \in U$ and subsets
$A_1, \ldots, A_n$ of $U$ such that $c(S) = \sum_{u \in \cup_{i \in S} A_i} w(u)$.
Alternatively, coverage functions can be described as non-negative linear
combinations of monotone disjunctions. They are a natural subclass of
submodular functions and arise in a number of applications.
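The definition above translates directly into code: the value of a set S is the total weight of the universe elements covered by the union of its sets. A small sketch (variable names are illustrative):

```python
def coverage_value(S, sets, weights):
    """Evaluate a coverage function c(S): the sum of w(u) over all universe
    elements u covered by the union of the sets A_i for i in S."""
    covered = set().union(*(sets[i] for i in S)) if S else set()
    return sum(weights[u] for u in covered)
```

For example, with sets [{'a','b'}, {'b','c'}] and weights {'a': 1, 'b': 2, 'c': 3}, adding the second set to the first contributes only w('c') = 3, since 'b' is already covered; this diminishing-returns behavior is exactly the submodularity noted above.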
We give an algorithm that, for any $\gamma, \delta > 0$, given random and
uniform examples of an unknown coverage function $c$, finds a function $h$ that
approximates $c$ within factor $1+\gamma$ on all but a $\delta$-fraction of the
points in time $poly(n, 1/\gamma, 1/\delta)$. This is the first fully-polynomial
algorithm for learning an interesting class of functions in the demanding PMAC
model of Balcan and Harvey (2011). Our algorithms are based on several new
structural properties of coverage functions. Using the results in (Feldman and
Kothari, 2014), we also show that coverage functions are learnable agnostically
with excess $\ell_1$-error $\epsilon$ over all product and symmetric
distributions in time $n^{O(\log(1/\epsilon))}$. In contrast, we show that,
without assumptions on the distribution, learning coverage functions is at
least as hard as learning polynomial-size disjoint DNF formulas, a class of
functions for which the best known algorithm runs in time $2^{\tilde{O}(n^{1/3})}$
(Klivans and Servedio, 2004).
As an application of our learning results, we give simple
differentially-private algorithms for releasing monotone conjunction counting
queries with low average error. In particular, for any $k$, we obtain private
release of $k$-way marginals with average error $\bar{\alpha}$ in time …