49 research outputs found

Learning Poisson Binomial Distributions

Author: A Röllin
A Schönhage
AD Barbour
Andrew C Berry
AYu Volkova
B Roos
Carl-Gustav Esseen
D Dubhashi
Donald E Knuth
ET Whittaker
Eugene Salamin
H Chernoff
J Keilson
JL Johnson
JM Steele
Ken-Iti Sato
L Birgé
L Cam Le
L Devroye
LHY Chen
P Deheuvels
RP Brent
S Kotz
Sandra Fillebrown
SD Poisson
SL Hodges
SX Chen
SYT Soon
VG Mikhailov
W Hoeffding
Werner Ehm
YG Yatracos
YH Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

We consider a basic problem in unsupervised learning: learning an unknown Poisson Binomial Distribution. A Poisson Binomial Distribution (PBD) over $\{0,1,\dots,n\}$ is the distribution of a sum of $n$ independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 \cite{Poisson:37} and are a natural $n$-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result we give a highly efficient algorithm which learns to $\epsilon$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\epsilon^3)$ samples, independent of $n$. The running time of the algorithm is quasilinear in the size of its input data, i.e., $\tilde{O}(\log(n)/\epsilon^3)$ bit-operations. (Observe that each draw from the distribution is a $\log(n)$-bit string.) Our second main result is a proper learning algorithm that learns to $\epsilon$-accuracy using $\tilde{O}(1/\epsilon^2)$ samples, and runs in time $(1/\epsilon)^{\mathrm{poly}(\log(1/\epsilon))} \cdot \log n$. This is nearly optimal, since any algorithm for this problem must use $\Omega(1/\epsilon^2)$ samples. We also give positive and negative results for some extensions of this learning problem to weighted sums of independent Bernoulli random variables.

Comment: Revised full version. Improved sample complexity bound of O~(1/eps^2
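To make the objects in this abstract concrete, the following is a minimal sketch of a PBD and of the total variation distance used to measure accuracy. It is not the learning algorithm of the paper: the function names (`pbd_pmf`, `sample_pbd`, `empirical_pmf`, `total_variation`) are hypothetical, and the naive empirical estimator shown here is only a baseline whose sample needs grow with $n$, unlike the bounds stated above.

```python
import random


def pbd_pmf(probs):
    """Exact PMF of a sum of independent Bernoulli(p_i) variables.

    Adds one Bernoulli variable at a time by convolution, so the
    result has len(probs) + 1 entries (support {0, ..., n}).
    """
    pmf = [1.0]  # the empty sum is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # new variable equals 0
            nxt[k + 1] += mass * p      # new variable equals 1
        pmf = nxt
    return pmf


def sample_pbd(probs, rng):
    """Draw one sample: the number of Bernoulli trials that succeed."""
    return sum(1 for p in probs if rng.random() < p)


def empirical_pmf(samples, n):
    """Naive baseline estimate: relative frequencies over {0, ..., n}."""
    counts = [0] * (n + 1)
    for s in samples:
        counts[s] += 1
    return [c / len(samples) for c in counts]


def total_variation(p, q):
    """Total variation distance between two PMFs on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))
```

A typical use would be to draw samples with `sample_pbd(probs, random.Random(0))`, form `empirical_pmf(samples, len(probs))`, and compare it with `pbd_pmf(probs)` via `total_variation`.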

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Learning Poisson Binomial Distributions with Differential Privacy

Author: Giannakopoulos Agamemnon
Γιαννακόπουλος Αγαμέμνων
Publication date: 01/01/2017

In this thesis we attempt to unify two research fields. The first concerns Distribution Learning and the second Differential Privacy. More specifically, given a learning algorithm that learns a Poisson Binomial distribution to ε-accuracy, we investigate whether the algorithm is differentially private. We show that the algorithm achieves Differential Privacy under certain assumptions. If the distribution is close to an (n,k)-Binomial distribution, the algorithm remains differentially private. If the distribution is close to a k-sparse form, the Differential Privacy property depends on the number of elements of the algorithm.

This thesis tries to bring together two major research areas: Distribution Learning and Differential Privacy. More specifically, given a highly efficient algorithm which learns a Poisson Binomial distribution with ε-accuracy, we study its Differential Privacy property. We show that if the PBD is close to an (n,k)-Binomial form, the algorithm is differentially private. If the PBD is close to a k-Sparse form the algorithm's privacy depends on PBD cardinalit

Pergamos : Unified Institutional Repository / Digital Library Platform of the National and Kapodistrian University of Athens

Properly Learning Poisson Binomial Distributions in Almost Polynomial Time

Author: Diakonikolas Ilias
Kane D.
Stewart Alistair
Publication date: 26/06/2016

Edinburgh Research Explorer

A Polynomial Time Algorithm for Lossy Population Recovery

Author: Moitra Ankur
Saks Michael
Publication date: 01/01/2013

We give a polynomial time algorithm for the lossy population recovery problem. In this problem, the goal is to approximately learn an unknown distribution on binary strings of length $n$ from lossy samples: for some parameter $\mu$, each coordinate of the sample is preserved with probability $\mu$ and otherwise is replaced by a `?'. The running time and number of samples needed for our algorithm are polynomial in $n$ and $1/\varepsilon$ for each fixed $\mu > 0$. This improves on the algorithm of Wigderson and Yehudayoff, which runs in quasi-polynomial time for any $\mu > 0$, and on the polynomial time algorithm of Dvir et al., which was shown to work for $\mu \gtrapprox 0.30$ by Batman et al. In fact, our algorithm also works in the more general framework of Batman et al., in which there is no a priori bound on the size of the support of the distribution. The algorithm we analyze is implicit in previous work; our main contribution is to analyze the algorithm by showing (via linear programming duality and connections to complex analysis) that a certain matrix associated with the problem has a robust local inverse even though its condition number is exponentially small. A corollary of our result is the first polynomial time algorithm for learning DNFs in the restriction access model of Dvir et al.
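The sampling model described in this abstract can be illustrated with a short sketch. It only generates lossy samples; it does not implement the recovery algorithm. The names `lossy_sample` and `draw_lossy_samples` and the dictionary representation of the unknown distribution are assumptions made for illustration.

```python
import random


def lossy_sample(string, mu, rng):
    """Keep each coordinate with probability mu, else replace it by '?'."""
    return "".join(c if rng.random() < mu else "?" for c in string)


def draw_lossy_samples(population, mu, count, rng):
    """Draw `count` lossy samples.

    `population` maps binary strings (all of the same length n) to
    their probabilities; this dict form is a hypothetical convenience.
    """
    strings = list(population.keys())
    weights = list(population.values())
    samples = []
    for _ in range(count):
        # Pick a string from the hidden distribution, then erase coordinates.
        original = rng.choices(strings, weights=weights)[0]
        samples.append(lossy_sample(original, mu, rng))
    return samples
```

For example, `draw_lossy_samples({"0101": 0.7, "1110": 0.3}, 0.5, 10, random.Random(0))` would return ten length-4 strings over the alphabet `0`, `1`, `?`; the learner sees only these and must estimate the weights of the original strings.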

arXiv.org e-Print Archive

CiteSeerX

Crossref