Preserving Randomness for Adaptive Algorithms
Suppose Est is a randomized estimation algorithm that uses n random bits and outputs values in R^d. We show how to execute Est on k adaptively chosen inputs using only n + O(k log(d + 1)) random bits instead of the trivial nk (at the cost of mild increases in the error and failure probability). Our algorithm combines a variant of the INW pseudorandom generator [Impagliazzo et al., 1994] with a new scheme for shifting and rounding the outputs of Est. We prove that modifying the outputs of Est is necessary in this setting, and furthermore, our algorithm's randomness complexity is near-optimal in the case d {-1, 1} using O(n log n) * poly(1/theta) queries to F and O(n) random bits (independent of theta), improving previous work by Bshouty et al. [Bshouty et al., 2004].
Private Learning Implies Online Learning: An Efficient Reduction
We study the relationship between the notions of differentially private
learning and online learning in games. Several recent works have shown that
differentially private learning implies online learning, but an open problem of
Neel, Roth, and Wu [Neel et al., 2018] asks whether this implication is
efficient. Specifically, does an efficient differentially private learner
imply an efficient online learner? In this paper we resolve this open question
in the context of pure differential privacy. We derive an efficient black-box
reduction from differentially private learning to online learning from expert
advice.
Optimal Error Rates for Interactive Coding I: Adaptivity and Other Settings
We consider the task of interactive communication in the presence of
adversarial errors and present tight bounds on the tolerable error-rates in a
number of different settings.
Most significantly, we explore adaptive interactive communication where the
communicating parties decide who should speak next based on the history of the
interaction. Braverman and Rao [STOC'11] show that non-adaptively one can code
for any constant error rate below 1/4 but not more. They asked whether this
bound could be improved using adaptivity. We answer this open question in the
affirmative (with a slightly different collection of resources): Our adaptive
coding scheme tolerates any error rate below 2/7 and we show that tolerating a
higher error rate is impossible. We also show that in the setting of Franklin
et al. [CRYPTO'13], where parties share randomness not known to the adversary,
adaptivity increases the tolerable error rate from 1/2 to 2/3. For
list-decodable interactive communications, where each party outputs a constant
size list of possible outcomes, the tight tolerable error rate is 1/2.
Our negative results hold even if the communication and computation are
unbounded, whereas for our positive results communication and computation are
polynomially bounded. Most prior work considered coding schemes with a linear
amount of communication, while allowing unbounded computations. We argue that
studying tolerable error rates in this relaxed context helps to identify a
setting's intrinsic optimal error rate. We set forward a strong working
hypothesis which stipulates that for any setting the maximum tolerable error
rate is independent of many computational and communication complexity
measures. We believe this hypothesis to be a powerful guideline for the design
of simple, natural, and efficient coding schemes and for understanding the
(im)possibilities of coding for interactive communications.
Optimal Principal Component Analysis in Distributed and Streaming Models
We study the Principal Component Analysis (PCA) problem in the distributed
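and streaming models of computation. Given a matrix $A \in \mathbb{R}^{m \times n}$, a
rank parameter $k$, and an accuracy parameter $\varepsilon$, we
want to output an $m \times k$ orthonormal matrix $U$ for which $\|A - U U^T A\|_F^2 \le (1 + \varepsilon) \|A - A_k\|_F^2$, where $A_k$ is the best rank-$k$ approximation to $A$.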
This paper provides improved algorithms for distributed PCA and streaming
PCA. Comment: STOC 2016 full version.
Differentially Private Mixture of Generative Neural Networks
Generative models are used in a wide range of applications building on large
amounts of contextually rich information. Due to possible privacy violations of
the individuals whose data is used to train these models, however, publishing
or sharing generative models is not always viable. In this paper, we present a
novel technique for privately releasing generative models and entire
high-dimensional datasets produced by these models. We model the generator
distribution of the training data with a mixture of generative neural
networks. These are trained together and collectively learn the generator
distribution of a dataset. Data is divided into clusters, using a novel
differentially private kernel k-means, then each cluster is given to separate
generative neural networks, such as Restricted Boltzmann Machines or
Variational Autoencoders, which are trained only on their own cluster using
differentially private gradient descent. We evaluate our approach using the
MNIST dataset, as well as call detail records and transit datasets, showing
that it produces realistic synthetic samples, which can also be used to
accurately compute an arbitrary number of counting queries. Comment: A shorter version of this paper appeared at the 17th IEEE
International Conference on Data Mining (ICDM 2017). This is the full
version, published in IEEE Transactions on Knowledge and Data Engineering
(TKDE).