Testing non-uniform k-wise independent distributions over product spaces (extended abstract)

Author: A. Czumaj
A. Joffe
A. Terras
A.F. Nikiforov
B. Chor
C. Bertram-Kretzberg
C.P. Neuman
D. Ron
G. Even
G.H. Hardy
H.J.S. Smith
J. Naor
J.R. Silvester
M. Blum
M. Luby
N. Alon
N. Alon
N. Alon
O. Goldreich
R. Karp
R. Kumar
R. Rubinfeld
S. Yekhanin
T. Batu
V. Grolmusz
Y. Azar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

A distribution D over Σ_1 × ⋯ × Σ_n is called (non-uniform) k-wise independent if for any set of k indices {i_1, ..., i_k} and for any z_1 ⋯ z_k ∈ Σ_{i_1} × ⋯ × Σ_{i_k}, Pr_{X∼D}[X_{i_1} ⋯ X_{i_k} = z_1 ⋯ z_k] = Pr_{X∼D}[X_{i_1} = z_1] ⋯ Pr_{X∼D}[X_{i_k} = z_k]. We study the problem of testing (non-uniform) k-wise independent distributions over product spaces. For the uniform case we show an upper bound on the distance between a distribution D and the set of k-wise independent distributions in terms of the sum of Fourier coefficients of D at vectors of weight at most k. Such a bound was previously known only for the binary field. For the non-uniform case, we give a new characterization of distributions being k-wise independent and further show that such a characterization is robust. These results greatly generalize the results of Alon et al. [1] on uniform k-wise independence over the binary field to non-uniform k-wise independence over product spaces. Our results yield natural testing algorithms for k-wise independence with time and sample complexity sublinear in terms of the support size when k is a constant. The main technical tools employed include discrete Fourier transforms and the theory of linear systems of congruences.

Funding: National Science Foundation (U.S.) (NSF grant 0514771); National Science Foundation (U.S.) (grant 0728645); National Science Foundation (U.S.) (Grant 0732334); Marie Curie International Reintegration Grants (Grant PIRG03-GA-2008-231077); Israel Science Foundation (Grant 1147/09); Israel Science Foundation (Grant 1675/09); Massachusetts Institute of Technology (Akamai Presidential Fellowship)
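The definition in this abstract can be checked directly when a distribution is small enough to write down in full. The following is a minimal brute-force sketch, not the sublinear tester the abstract describes: it assumes the distribution is given explicitly as a dictionary from outcome tuples to probabilities, and the function names `marginal` and `is_k_wise_independent` are hypothetical.

```python
from itertools import combinations, product
from math import isclose, prod


def marginal(dist, idx):
    """Marginal of dist (dict: outcome tuple -> probability) on coordinates idx."""
    out = {}
    for x, p in dist.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def is_k_wise_independent(dist, k, tol=1e-9):
    """Check Pr[X_I = z] == product of single-coordinate marginals for every k-subset I."""
    n = len(next(iter(dist)))
    singles = [marginal(dist, (i,)) for i in range(n)]
    for idx in combinations(range(n), k):
        joint = marginal(dist, idx)
        # Symbols with zero marginal probability make both sides zero, so
        # it is enough to range over symbols that actually occur.
        supports = [[key[0] for key in singles[i]] for i in idx]
        for z in product(*supports):
            expected = prod(singles[i][(zi,)] for i, zi in zip(idx, z))
            if not isclose(joint.get(z, 0.0), expected, abs_tol=tol):
                return False
    return True


# Uniform distribution on 3-bit strings of even parity:
# pairwise independent, but not 3-wise independent.
xor_dist = {x: 0.25 for x in product((0, 1), repeat=3) if sum(x) % 2 == 0}

# A non-uniform product distribution is k-wise independent for every k.
biases = (0.3, 0.6, 0.5)
product_dist = {
    x: prod(b if xi == 1 else 1 - b for b, xi in zip(biases, x))
    for x in product((0, 1), repeat=3)
}
```

The running time is exponential in the number of coordinates, so this is only useful for sanity checks on toy examples.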

CiteSeerX

DSpace@MIT

Crossref

Testing k-wise independent distributions

Author: Xie Ning, Ph. D. Massachusetts Institute of Technology
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 119-123).

A probability distribution over {0, 1}^n is k-wise independent if its restriction to any k coordinates is uniform. More generally, a discrete distribution D over Σ_1 × ⋯ × Σ_n is called (non-uniform) k-wise independent if for any subset of k indices {i_1, ..., i_k} and for any z_1 ∈ Σ_{i_1}, ..., z_k ∈ Σ_{i_k}, Pr_{X∼D}[X_{i_1} ⋯ X_{i_k} = z_1 ⋯ z_k] = Pr_{X∼D}[X_{i_1} = z_1] ⋯ Pr_{X∼D}[X_{i_k} = z_k]. k-wise independent distributions look random "locally" to an observer of only k coordinates, even though they may be far from random "globally". Because of this key feature, k-wise independent distributions are important concepts in probability, complexity, and algorithm design. In this thesis, we study the problem of testing (non-uniform) k-wise independent distributions over product spaces.

For the problem of distinguishing k-wise independent distributions supported on the Boolean cube from those that are δ-far in statistical distance from any k-wise independent distribution, we upper bound the number of required samples by O(n^k/δ^2) and lower bound it by Ω(n^{(k-1)/2}/δ) (these bounds hold for constant k, and essentially the same bounds hold for general k). To achieve these bounds, we use novel Fourier analysis techniques to relate a distribution's statistical distance from k-wise independence to its biases, a measure of the parity imbalance it induces on a set of variables. The relationships we derive are tighter than previously known, and may be of independent interest.

We then generalize our results to distributions over larger domains. For the uniform case we show an upper bound on the distance between a distribution D and the set of k-wise independent distributions in terms of the sum of Fourier coefficients of D at vectors of weight at most k. For the non-uniform case, we give a new characterization of distributions being k-wise independent and further show that such a characterization is robust based on our results for the uniform case. Our results yield natural testing algorithms for k-wise independence with time and sample complexity sublinear in terms of the support size of the distribution when k is a constant. The main technical tools employed include the discrete Fourier transform and the theory of linear systems of congruences.

by Ning Xie. Ph.D.
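The "biases" mentioned in this abstract can be illustrated on the Boolean cube. For a nonempty index set S, the bias is the probability that the parity of the bits in S is even minus the probability that it is odd. A distribution over {0, 1}^n is uniform k-wise independent exactly when every bias on a set of size 1 to k is zero. The sketch below estimates biases from samples; the function names are hypothetical, and it does not reproduce the thesis's acceptance thresholds or sample bounds.

```python
from itertools import combinations


def empirical_bias(samples, subset):
    """Estimate Pr[parity on subset is even] - Pr[parity on subset is odd]."""
    total = 0
    for x in samples:
        parity = sum(x[i] for i in subset) % 2
        total += 1 if parity == 0 else -1
    return total / len(samples)


def max_bias_up_to_k(samples, k):
    """Largest absolute empirical bias over nonempty index sets of size at most k."""
    n = len(samples[0])
    return max(
        abs(empirical_bias(samples, subset))
        for size in range(1, k + 1)
        for subset in combinations(range(n), size)
    )


# The four even-parity 3-bit strings, each appearing once: every bias on
# one or two coordinates vanishes, but the bias on all three equals 1.
even_parity_samples = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

A tester in this spirit would draw samples, compute the biases on small sets, and reject when they are collectively too large; choosing the threshold and the number of samples is where the analysis in the thesis is needed.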

DSpace@MIT

Deterministic parallel algorithms for bilinear objective functions

Author: Harris David G.
Publication date: 21/06/2018

Many randomized algorithms can be derandomized efficiently using either the method of conditional expectations or probability spaces with low independence. A series of papers, beginning with work by Luby (1988), showed that in many cases these techniques can be combined to give deterministic parallel (NC) algorithms for a variety of combinatorial optimization problems, with low time- and processor-complexity. We extend and generalize a technique of Luby for efficiently handling bilinear objective functions. One noteworthy application is an NC algorithm for maximal independent set. On a graph $G$ with $m$ edges and $n$ vertices, this takes $\tilde O(\log^2 n)$ time and $(m + n) n^{o(1)}$ processors, nearly matching the best randomized parallel algorithms. Other applications include reduced processor counts for algorithms of Berger (1997) for maximum acyclic subgraph and Gale-Berlekamp switching games. This bilinear factorization also gives better algorithms for problems involving discrepancy. An important application of this is to automata-fooling probability spaces, which are the basis of a notable derandomization technique of Sivakumar (2002). Our method leads to large reduction in processor complexity for a number of derandomization algorithms based on automata-fooling, including set discrepancy and the Johnson-Lindenstrauss Lemma.
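For readers unfamiliar with the method of conditional expectations named in this abstract, here is a textbook illustration on a different problem, maximum cut. It is sequential and is not the paper's parallel algorithm. A uniformly random two-sided assignment cuts half the edges in expectation; fixing vertices one at a time so that the conditional expectation never drops yields a deterministic cut with at least half the edges. The function name and graph encoding are assumptions, and the graph is assumed to have no self-loops.

```python
def derandomized_cut(n, edges):
    """Greedy cut via conditional expectations; cuts at least half of the edges."""
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    side = {}
    for v in range(n):
        # Edges to unplaced neighbours contribute 1/2 in expectation whichever
        # side v takes, so only already placed neighbours matter.
        placed_on_0 = sum(1 for u in adjacency[v] if side.get(u) == 0)
        placed_on_1 = sum(1 for u in adjacency[v] if side.get(u) == 1)
        # Put v opposite the majority of its placed neighbours.
        side[v] = 1 if placed_on_0 >= placed_on_1 else 0

    cut_size = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut_size


# Complete graph on four vertices (six edges).
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
k4_side, k4_cut = derandomized_cut(4, k4_edges)
```

Each step is inherently sequential here, which is the obstacle that combining conditional expectations with low-independence probability spaces is meant to address in the parallel setting.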

arXiv.org e-Print Archive

On choosing and bounding probability metrics

Author: Aldous
Barron
Bernardo
Borovkov
Cam
Chung
Cover
Csiszar
Diaconis
Diaconis
Diaconis
Dudley
Dudley
Hartigan
Huber
Ibragimov
Jacod
Kakutani
Kolmogorov
Kuipers
Kullback
Kullback
LeCam
LeCam
Lehmann
Liese
Lindsay
Lindvall
Linnik
Lukacs
Lévy
Mathai
Nummelin
Orey
Petrov
Prokhorov
Rachev
Reiss
Rosenthal
Shannon
Strassen
Su
Su
Szulga
Tierney
Tierney
Williams
Zolotarev
Publication date: 01/01/2002

When studying convergence of measures, an important issue is the choice of probability metric. In this review, we provide a summary and some new results concerning bounds among ten important probability metrics/distances that are used by statisticians and probabilists. We focus on these metrics because they are either well-known, commonly used, or admit practical bounding techniques. We summarize these relationships in a handy reference diagram, and also give examples to show how rates of convergence can depend on the metric chosen.

Comment: To appear, International Statistical Review. Related work at http://www.math.hmc.edu/~su/papers.htm
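As a small illustration of the kind of bounds this review surveys, the sketch below computes three standard distances between discrete distributions and checks two well-known relations: H^2/2 <= TV <= H, and Pinsker's inequality TV <= sqrt(KL/2). The Hellinger distance here uses the convention without a 1/2 factor, which is an assumption about normalization (conventions differ across texts), and KL is measured in nats. The example vectors are made up for illustration.

```python
import math


def total_variation(p, q):
    """Half the L1 distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hellinger(p, q):
    """Hellinger distance, convention sqrt(sum (sqrt(p_i) - sqrt(q_i))^2)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def kl_divergence(p, q):
    """Relative entropy KL(p || q) in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

tv = total_variation(p, q)
h = hellinger(p, q)
kl = kl_divergence(p, q)

# Relations between the three quantities for this pair of distributions.
hellinger_bounds_hold = h ** 2 / 2 <= tv <= h
pinsker_holds = tv <= math.sqrt(kl / 2)
```

Bounds like these let a convergence rate proved in one metric transfer to another, possibly with a worse rate.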

arXiv.org e-Print Archive

CiteSeerX

Scholarship@Claremont

Crossref

A directed isoperimetric inequality with application to Bregman near neighbor lower bounds

Author: Chaudhuri Kamalika
Marcel
Talagrand Michel
Publication date: 16/05/2015

Bregman divergences $D_\phi$ are a class of divergences parametrized by a convex function $\phi$ and include well known distance functions like $\ell_2^2$ and the Kullback-Leibler divergence. There has been extensive research on algorithms for problems like clustering and near neighbor search with respect to Bregman divergences. In all cases, the algorithms depend not just on the data size $n$ and dimensionality $d$, but also on a structure constant $\mu \ge 1$ that depends solely on $\phi$ and can grow without bound independently. In this paper, we provide the first evidence that this dependence on $\mu$ might be intrinsic. We focus on the problem of approximate near neighbor search for Bregman divergences. We show that under the cell probe model, any non-adaptive data structure (like locality-sensitive hashing) for $c$-approximate near-neighbor search that admits $r$ probes must use space $\Omega(n^{1 + \frac{\mu}{c r}})$. In contrast, for LSH under $\ell_1$ the best bound is $\Omega(n^{1+\frac{1}{cr}})$. Our new tool is a directed variant of the standard boolean noise operator. We show that a generalization of the Bonami-Beckner hypercontractivity inequality exists "in expectation" or upon restriction to certain subsets of the Hamming cube, and that this is sufficient to prove the desired isoperimetric inequality that we use in our data structure lower bound. We also present a structural result reducing the Hamming cube to a Bregman cube. This structure allows us to obtain lower bounds for problems under Bregman divergences from their $\ell_1$ analog. In particular, we get a (weaker) lower bound for approximate near neighbor search of the form $\Omega(n^{1 + \frac{1}{cr}})$ for an $r$-query non-adaptive data structure, and new cell probe lower bounds for a number of other near neighbor questions in Bregman space.

Comment: 27 pages
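To make the objects in this abstract concrete, the sketch below evaluates the standard Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> for two choices of phi: the squared Euclidean norm, which gives squared Euclidean distance, and negative entropy, which gives the Kullback-Leibler divergence on probability vectors. Function names and example vectors are hypothetical, and the code does not touch the structure constant mu or the lower-bound argument.

```python
import math


def bregman(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


def sq_norm(v):
    return sum(t * t for t in v)


def sq_norm_grad(v):
    return [2 * t for t in v]


def neg_entropy(v):
    # Assumes strictly positive coordinates.
    return sum(t * math.log(t) for t in v)


def neg_entropy_grad(v):
    return [math.log(t) + 1 for t in v]


x = [0.5, 0.3, 0.2]
y = [0.2, 0.2, 0.6]

sq_dist = bregman(sq_norm, sq_norm_grad, x, y)           # squared Euclidean distance
kl_xy = bregman(neg_entropy, neg_entropy_grad, x, y)     # KL(x || y) on the simplex
kl_yx = bregman(neg_entropy, neg_entropy_grad, y, x)     # KL(y || x), generally different
```

The gap between `kl_xy` and `kl_yx` shows the asymmetry that distinguishes general Bregman divergences from metrics such as $\ell_1$.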

arXiv.org e-Print Archive

Crossref