Learning mixtures of structured distributions over discrete domains
Let $\mathcal{C}$ be a class of probability distributions over the discrete domain $[n] = \{1, \ldots, n\}$. We show that if $\mathcal{C}$ satisfies a rather general condition -- essentially, that each distribution in $\mathcal{C}$ can be well-approximated by a variable-width histogram with few bins -- then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of $k$ unknown distributions from $\mathcal{C}$.
We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems.
Comment: preliminary full version of SODA'13 paper
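To make the structural condition concrete, here is a minimal Python sketch (our own illustration; the dyadic bin choice is ad hoc, not the paper's construction) that flattens a monotone distribution over $[n]$ onto a few variable-width bins and measures the resulting total variation error:

```python
import numpy as np

def flatten(p, bins):
    """Replace p by its average over each bin: a variable-width histogram."""
    h = np.empty_like(p)
    for lo, hi in bins:              # bins partition {0, ..., n-1}
        h[lo:hi] = p[lo:hi].mean()
    return h

def tv(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * np.abs(p - q).sum()

# A monotone (geometrically decaying) distribution over [n]; monotone
# distributions are among those that admit few-bin histogram approximations.
n = 1024
p = 0.99 ** np.arange(n)
p /= p.sum()

# Dyadic bin boundaries: exponentially wider bins out in the flat tail
# (an ad hoc illustrative choice, not the paper's scheme).
edges = sorted({0, n} | {2 ** i for i in range(11)})
bins = list(zip(edges[:-1], edges[1:]))

h = flatten(p, bins)
print(len(bins), tv(p, h))   # 11 bins, modest total variation error
```

Note that the flattened `h` is still a probability distribution (each bin keeps its total mass), which is what makes such histograms usable as hypothesis distributions.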
On Extracting Common Random Bits From Correlated Sources on Large Alphabets
Suppose Alice and Bob receive strings X=(X_1,...,X_n) and Y=(Y_1,...,Y_n), each uniformly random in [s]^n, but so that X and Y are correlated. For each symbol i, we have that Y_i=X_i with probability 1-ε, and otherwise Y_i is chosen independently and uniformly from [s]. Alice and Bob wish to use their respective strings to extract a uniformly chosen common sequence from [s]^k, but without communicating. How well can they do? The trivial strategy of outputting the first k symbols yields an agreement probability of (1-ε+ε/s)^k. In a recent work, Bogdanov and Mossel showed that in the binary case, where s=2 and k=k(ε) is large enough, it is possible to extract k bits with a better agreement probability rate. In particular, it is possible to achieve agreement probability (kε)^{-1/2}·2^{-kε/(2(1-ε/2))} using a random construction based on Hamming balls, and this is optimal up to lower-order terms. In this paper, we consider the same problem over larger alphabet sizes s, and we show that the agreement probability rate changes dramatically as the alphabet grows. In particular, we show that no strategy can achieve agreement probability better than (1-ε)^k(1+δ(s))^k, where δ(s)→0 as s→∞. We also show that Hamming ball-based constructions have a much lower agreement probability rate than the trivial algorithm as s→∞. Our proofs and results are intimately related to subtle properties of hypercontractive inequalities.
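The per-symbol agreement probability behind the trivial strategy is easy to verify by simulation. A minimal Monte Carlo sketch (all names are ours, for illustration):

```python
import random

def trivial_agreement(k, s, eps, trials, rng):
    """Monte Carlo estimate of P[first k symbols of X and Y fully agree]."""
    hits = 0
    for _ in range(trials):
        x = [rng.randrange(s) for _ in range(k)]
        # Y_i = X_i with probability 1 - eps, else uniform on [s].
        y = [xi if rng.random() >= eps else rng.randrange(s) for xi in x]
        hits += (x == y)
    return hits / trials

k, s, eps = 5, 4, 0.2
rng = random.Random(0)
est = trivial_agreement(k, s, eps, 20000, rng)
exact = (1 - eps + eps / s) ** k   # per-symbol agreement is 1 - eps + eps/s
print(est, exact)                  # the estimate should track the formula
```

The `eps/s` term is the chance that a resampled symbol happens to land back on X_i, which is why larger alphabets push the trivial rate toward (1-ε)^k.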
Convergence, unanimity and disagreement in majority dynamics on unimodular graphs and random graphs
In majority dynamics, agents located at the vertices of an undirected simple graph update their binary opinions synchronously by adopting those of the majority of their neighbors.
On infinite unimodular transitive graphs (e.g., Cayley graphs), when initial opinions are chosen from a distribution that is invariant with respect to the graph automorphism group, we show that the opinion of each agent almost surely either converges, or else eventually oscillates with period two; this is known to hold for finite graphs, but not for all infinite graphs.
On Erdős-Rényi random graphs with degrees Ω(√n), we show that when initial opinions are chosen i.i.d., all agents converge to the initial majority opinion with constant probability. Conversely, on random 4-regular finite graphs, we show that with high probability different agents converge to different opinions.
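The converge-or-oscillate-with-period-two behavior is easy to observe in simulation. A minimal sketch (our own illustration, using one common tie-breaking convention -- an agent with a split neighborhood keeps its current opinion -- which may differ from the paper's exact convention):

```python
import random

def majority_step(opinions, nbrs):
    """One synchronous round: each agent adopts the majority opinion of its
    neighbors; on a tie it keeps its current opinion (one convention)."""
    new = []
    for v, ns in enumerate(nbrs):
        ones = sum(opinions[u] for u in ns)
        if 2 * ones > len(ns):
            new.append(1)
        elif 2 * ones < len(ns):
            new.append(0)
        else:
            new.append(opinions[v])
    return new

# A cycle on n vertices with i.i.d. uniform initial opinions.
n = 20
nbrs = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
rng = random.Random(1)
state = [rng.randrange(2) for _ in range(n)]

# Iterate until the (finite) state space repeats, then read off the period.
seen = {}
t = 0
while tuple(state) not in seen:
    seen[tuple(state)] = t
    state = majority_step(state, nbrs)
    t += 1
period = t - seen[tuple(state)]
print(period)   # by Goles-Olivos, synchronous majority dynamics on a finite
                # graph always ends in a cycle of period 1 or 2
```

On finite graphs the period-at-most-two guarantee is the classical Goles-Olivos theorem for symmetric threshold networks; the abstract's contribution is extending the dichotomy to infinite unimodular transitive graphs.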
Optimal Algorithms for Testing Closeness of Discrete Distributions
We study the question of closeness testing for two discrete distributions.
More precisely, given samples from two distributions $p$ and $q$ over an
$n$-element set, we wish to distinguish whether $p = q$ versus $p$ is at least
$\eps$-far from $q$, in either $\ell_1$ or $\ell_2$ distance. Batu et al. gave
the first sub-linear time algorithms for these problems, which matched the
lower bounds of Valiant up to a logarithmic factor in $n$, and a polynomial
factor of $\eps$.
In this work, we present simple (and new) testers for both the $\ell_1$ and
$\ell_2$ settings, with sample complexity that is information-theoretically
optimal, to constant factors, both in the dependence on $n$, and the dependence
on $\eps$; for the $\ell_1$ testing problem we establish that the sample
complexity is $\Theta(\max\{n^{2/3}/\eps^{4/3}, n^{1/2}/\eps^2\})$.
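Testers in this line of work are built around collision-type statistics. The sketch below (our own illustration under the standard Poissonization assumption, with the accept/reject threshold and variance analysis omitted, so it is not the paper's exact algorithm) shows why such a statistic separates $p = q$ from far pairs:

```python
import numpy as np

rng = np.random.default_rng(0)

def closeness_stat(x, y):
    """For Poissonized sample counts x, y, each term (x_i - y_i)^2 - x_i - y_i
    has expectation m^2 (p_i - q_i)^2, so the sum is an unbiased estimator
    of m^2 * ||p - q||_2^2."""
    return np.sum((x - y) ** 2 - x - y)

n, m = 1000, 5000
p = np.full(n, 1.0 / n)                # uniform
q = np.full(n, 1.0 / n)
q[: n // 2] += 0.5 / n                 # perturbed: ||p - q||_1 = 0.5
q[n // 2:] -= 0.5 / n

# Poissonized sampling: coordinate counts are independent Poisson(m * prob).
same = closeness_stat(rng.poisson(m * p), rng.poisson(m * p))
diff = closeness_stat(rng.poisson(m * p), rng.poisson(m * q))
print(same, diff)   # near 0 when p == q; near m^2 * ||p - q||_2^2 = 6250 when far
```

Subtracting $x_i + y_i$ cancels the Poisson variance term, which is what makes the statistic unbiased for the squared $\ell_2$ distance; a real tester thresholds it at a level calibrated to $n$, $m$, and $\eps$.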
Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms
Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We
consider the problem of {\em density estimation}, in which a learning algorithm
is given i.i.d. draws from $p$ and must (with high probability) output a
hypothesis distribution that is close to $p$. The main contribution of this
paper is a highly efficient density estimation algorithm for learning using a
variable-width histogram, i.e., a hypothesis distribution with a piecewise
constant probability density function.
In more detail, for any $k$ and $\eps$, we give an algorithm that makes
$\tilde{O}(k/\eps^2)$ draws from $p$, runs in $\tilde{O}(k/\eps^2)$
time, and outputs a hypothesis distribution $h$ that is piecewise constant with
$\tilde{O}(k)$ pieces. With high probability the hypothesis
satisfies $d_{\mathrm{TV}}(p, h) \leq C \cdot \mathrm{opt}_k(p) + \eps$,
where $d_{\mathrm{TV}}$ denotes the total variation distance (statistical
distance), $C$ is a universal constant, and $\mathrm{opt}_k(p)$ is the smallest
total variation distance between $p$ and any $k$-piecewise constant
distribution. The sample size and running time of our algorithm are optimal up
to logarithmic factors. The "approximation factor" $C$ in our result is
inherent in the problem, as we prove that no algorithm with sample size bounded
in terms of $k$ and $\eps$ can achieve an approximation factor better than $C$,
regardless of what kind of hypothesis distribution it uses.
Comment: conference version appears in NIPS 2014
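As an illustration of the general approach (our own sketch, not the paper's algorithm), one simple variable-width scheme places breakpoints at empirical quantiles, so every piece carries equal sample mass -- narrow pieces where the density is high, wide pieces where it is flat:

```python
import numpy as np

rng = np.random.default_rng(0)

def equal_mass_histogram(samples, t):
    """Piecewise-constant density on [0, 1) whose breakpoints are empirical
    quantiles, so each of the t pieces carries ~1/t of the sample mass.
    (An illustrative variable-width scheme, not the paper's algorithm.)"""
    qs = np.quantile(samples, np.linspace(0.0, 1.0, t + 1))
    qs[0], qs[-1] = 0.0, 1.0              # pin the support to [0, 1)
    heights = (1.0 / t) / np.diff(qs)     # density = piece mass / piece width
    def pdf(x):
        idx = np.clip(np.searchsorted(qs, x, side="right") - 1, 0, t - 1)
        return heights[idx]
    return qs, heights, pdf

# i.i.d. draws from a Beta(2, 5) density on [0, 1).
samples = rng.beta(2.0, 5.0, size=20000)
qs, heights, pdf = equal_mass_histogram(samples, 20)

# Crude total variation distance to the true density, on a fine grid.
grid = np.linspace(0.0, 1.0, 10001)[:-1] + 0.5e-4
true_pdf = 30.0 * grid * (1.0 - grid) ** 4          # Beta(2, 5) density
tvd = 0.5 * np.mean(np.abs(pdf(grid) - true_pdf))   # ~ (1/2) integral |h - f|
print(tvd)   # small: the histogram tracks the smooth density
```

The hypothesis integrates to one by construction (each piece holds mass exactly 1/t); the hard part the paper solves is choosing breakpoints that compete with the best $k$-piecewise constant approximation, not just with a fixed quantile rule.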
The management of municipal solid waste in Hong Kong : a study of civic engagement strategies
Published or final version. Politics and Public Administration. Master of Public Administration.