Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms

Abstract

Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We consider the problem of {\em density estimation}, in which a learning algorithm is given i.i.d. draws from $p$ and must (with high probability) output a hypothesis distribution that is close to $p$. The main contribution of this paper is a highly efficient density estimation algorithm for learning using a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any $k$ and $\epsilon$, we give an algorithm that makes $\tilde{O}(k/\epsilon^2)$ draws from $p$, runs in $\tilde{O}(k/\epsilon^2)$ time, and outputs a hypothesis distribution $h$ that is piecewise constant with $O(k \log^2(1/\epsilon))$ pieces. With high probability the hypothesis $h$ satisfies $d_{\mathrm{TV}}(p,h) \leq C \cdot \mathrm{opt}_k(p) + \epsilon$, where $d_{\mathrm{TV}}$ denotes the total variation distance (statistical distance), $C$ is a universal constant, and $\mathrm{opt}_k(p)$ is the smallest total variation distance between $p$ and any $k$-piecewise constant distribution. The sample size and running time of our algorithm are optimal up to logarithmic factors. The "approximation factor" $C$ in our result is inherent in the problem, as we prove that no algorithm with sample size bounded in terms of $k$ and $\epsilon$ can achieve $C < 2$, regardless of what kind of hypothesis distribution it uses.

Comment: conference version appears in NIPS 2014.
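For concreteness, the quantities appearing in the guarantee above can be spelled out as follows; these are the standard definitions, restated here for convenience, and the notation $\mathcal{C}_k$ for the class of $k$-piecewise constant distributions is introduced only for this restatement:
\[
d_{\mathrm{TV}}(p,q) \;=\; \sup_{A \subseteq [0,1)} \bigl| p(A) - q(A) \bigr|,
\qquad
\mathrm{opt}_k(p) \;=\; \inf_{q \in \mathcal{C}_k} d_{\mathrm{TV}}(p,q),
\]
where the supremum ranges over measurable subsets of $[0,1)$ and $\mathcal{C}_k$ is the class of distributions over $[0,1)$ whose probability density function is piecewise constant with at most $k$ pieces (when $p$ and $q$ both have densities, $d_{\mathrm{TV}}(p,q)$ equals half the $L_1$ distance between those densities). In this notation, the algorithm outputs an $h \in \mathcal{C}_{O(k \log^2(1/\epsilon))}$ satisfying $d_{\mathrm{TV}}(p,h) \leq C \cdot \mathrm{opt}_k(p) + \epsilon$.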
