14 research outputs found
Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms
Let $p$ be an unknown and arbitrary probability distribution over $[n]$. We
consider the problem of {\em density estimation}, in which a learning algorithm
is given i.i.d. draws from $p$ and must (with high probability) output a
hypothesis distribution that is close to $p$. The main contribution of this
paper is a highly efficient density estimation algorithm for learning using a
variable-width histogram, i.e., a hypothesis distribution with a piecewise
constant probability density function.
In more detail, for any $k$ and $\epsilon$, we give an algorithm that makes
$\tilde{O}(k/\epsilon^2)$ draws from $p$, runs in $\tilde{O}(k/\epsilon^2)$
time, and outputs a hypothesis distribution $h$ that is piecewise constant with
$\tilde{O}(k)$ pieces. With high probability the hypothesis
satisfies $d_{TV}(p, h) \le C \cdot \mathrm{opt}_k(p) + \epsilon$,
where $d_{TV}$ denotes the total variation distance (statistical
distance), $C$ is a universal constant, and $\mathrm{opt}_k(p)$ is the smallest
total variation distance between $p$ and any $k$-piecewise constant
distribution. The sample size and running time of our algorithm are optimal up
to logarithmic factors. The "approximation factor" $C$ in our result is
inherent in the problem, as we prove that no algorithm with sample size bounded
in terms of $k$ and $\epsilon$ can achieve $d_{TV}(p, h) \le \mathrm{opt}_k(p) + \epsilon$ regardless of what kind of
hypothesis distribution it uses.
Comment: conference version appears in NIPS 2014
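The abstract above does not spell out its algorithm, but a minimal sketch of what a variable-width histogram hypothesis looks like is an equal-mass binning: each of the $k$ bins holds roughly a $1/k$ fraction of the samples, so dense regions get narrow bins and sparse regions get wide ones. This toy construction (the function name `equal_mass_histogram` and all parameters are illustrative, not from the paper) is far simpler than the paper's near-optimal procedure:

```python
import numpy as np

def equal_mass_histogram(samples, k):
    """Fit a k-piece variable-width histogram by equal-mass binning.

    Illustrative sketch only, not the paper's algorithm: bin edges are
    placed at empirical quantiles so each bin carries mass ~1/k.
    """
    s = np.sort(np.asarray(samples, dtype=float))
    # Bin edges at empirical quantiles 0, 1/k, 2/k, ..., 1.
    edges = np.quantile(s, np.linspace(0.0, 1.0, k + 1))
    widths = np.diff(edges)
    widths[widths == 0] = 1e-12          # guard against duplicate quantiles
    heights = (1.0 / k) / widths         # each bin carries mass 1/k
    return edges, heights

rng = np.random.default_rng(0)
samples = rng.beta(2, 5, size=100_000)   # skewed distribution on [0, 1)
edges, heights = equal_mass_histogram(samples, k=20)

# The hypothesis is a valid density: its bins' masses sum to 1.
total_mass = float(np.sum(heights * np.diff(edges)))
print(round(total_mass, 6))
```

Note how the bins near the mode of the Beta(2, 5) distribution come out much narrower than those in the right tail, which is exactly the adaptivity that fixed-width histograms lack.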
Robust Learning of Fixed-Structure Bayesian Networks
We investigate the problem of learning Bayesian networks in a robust model
where an $\epsilon$-fraction of the samples are adversarially corrupted. In
this work, we study the fully observable discrete case where the structure of
the network is given. Even in this basic setting, previous learning algorithms
either run in exponential time or lose dimension-dependent factors in their
error guarantees. We provide the first computationally efficient robust
learning algorithm for this problem with dimension-independent error
guarantees. Our algorithm has near-optimal sample complexity, runs in
polynomial time, and achieves error that scales nearly-linearly with the
fraction of adversarially corrupted samples. Finally, we show on both synthetic
and semi-synthetic data that our algorithm performs well in practice.
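To see why robustness matters here, consider the naive (non-robust) estimator for a single conditional probability of the network: the empirical frequency. The toy sketch below (purely illustrative, not the paper's algorithm; the corruption scheme is an assumption for demonstration) shows that an $\epsilon$-fraction of adversarial samples can bias each such parameter by up to $\epsilon$, and across many parameters these per-coordinate errors compound, which is why a dimension-independent guarantee is nontrivial:

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps, p_true = 100_000, 0.1, 0.5

# Clean Bernoulli(p_true) observations for one CPT entry of the network.
clean = rng.binomial(1, p_true, size=n)

# Adversary replaces an eps-fraction of the samples with all-ones.
corrupted = clean.copy()
corrupted[: int(eps * n)] = 1

# Naive empirical frequency on the corrupted sample is biased by ~eps/2 here
# (in general by up to eps); a robust estimator must cap this at O(eps)
# *simultaneously* across all parameters without losing dimension factors.
naive = float(corrupted.mean())
print(f"true={p_true}, naive={naive:.3f}, bias={naive - p_true:+.3f}")
```

Running this shows the naive estimate landing near 0.55 instead of 0.5, i.e., an error on the order of the corruption fraction for just one parameter.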