
    Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators

    We provide a finite-sample analysis of a general framework for using k-nearest neighbor statistics to estimate functionals of a nonparametric continuous probability density, including entropies and divergences. Rather than plugging a consistent density estimate (which requires $k \to \infty$ as the sample size $n \to \infty$) into the functional of interest, the estimators we consider fix k and perform a bias correction. This is more efficient computationally and, as we show in certain cases, statistically, leading to faster convergence rates. Our framework unifies several previous estimators, for most of which ours are the first finite-sample guarantees. Comment: 16 pages, 0 figures
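To make the "fix k and perform a bias correction" idea concrete, here is a minimal sketch of one well-known estimator of this type, the Kozachenko-Leonenko entropy estimator. It is offered as an illustration only and is not necessarily the exact form analyzed in the paper. The function names (`digamma_int`, `kl_entropy`), the brute-force neighbor search, and the Gaussian test sample are all assumptions made for this example.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(m):
    """Digamma at a positive integer m: psi(m) = -gamma + sum_{j=1}^{m-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, m))


def kl_entropy(points, k=3):
    """Fixed-k nearest-neighbor estimate of differential entropy (in nats).

    points: list of equal-length tuples drawn from a continuous density
            (duplicate points would give a zero distance and break the log).
    k:      number of neighbors, held fixed as the sample size grows.
    """
    n = len(points)
    d = len(points[0])
    # log-volume of the d-dimensional Euclidean unit ball
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    log_dist_sum = 0.0
    for i, x in enumerate(points):
        # brute-force distances to all other points: O(n^2) overall
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        log_dist_sum += math.log(dists[k - 1])  # distance to the k-th neighbor
    # the -psi(k) term is the bias correction that replaces letting k grow
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


# Hypothetical check on a 1-D standard normal sample.
rng = random.Random(0)
sample = [(rng.gauss(0.0, 1.0),) for _ in range(500)]
estimate = kl_entropy(sample, k=3)
true_entropy = 0.5 * math.log(2 * math.pi * math.e)
print(estimate, true_entropy)
```

The neighbor count stays at k = 3 regardless of sample size; the digamma terms compensate for the bias that a naive plug-in with fixed k would incur.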

    (Un)Conditional Sample Generation Based on Distribution Element Trees

    Recently, distribution element trees (DETs) were introduced as an accurate and computationally efficient method for density estimation. In this work, we demonstrate that the DET formulation promotes an easy and inexpensive way to generate random samples similar to a smooth bootstrap. These samples can be generated unconditionally, but also, without further complications, conditionally utilizing available information about certain probability-space components. Comment: published online in the Journal of Computational and Graphical Statistics
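For readers unfamiliar with the comparison point, the following is a minimal sketch of a classical smooth bootstrap in one dimension: resample the data with replacement and add kernel noise. It is not the DET-based sampler from the paper, only the baseline technique the abstract refers to. The function name `smooth_bootstrap`, the Gaussian kernel, and the bandwidth value are assumptions chosen for illustration.

```python
import random
import statistics


def smooth_bootstrap(data, n_samples, bandwidth, rng):
    """Draw n_samples values from a Gaussian-kernel-smoothed version of data.

    Each draw picks an observed point uniformly at random (the ordinary
    bootstrap step) and perturbs it with N(0, bandwidth^2) noise (the
    smoothing step), so new values fall between the observed ones.
    """
    return [rng.choice(data) + rng.gauss(0.0, bandwidth) for _ in range(n_samples)]


# Hypothetical data: 200 draws from a standard normal.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
samples = smooth_bootstrap(data, n_samples=1000, bandwidth=0.3, rng=rng)
print(statistics.mean(data), statistics.mean(samples))
```

Unlike the ordinary bootstrap, the generated values are not restricted to the observed data points, which is the property the abstract attributes to DET-based sampling as well.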

    Information Theoretic Structure Learning with Confidence

    Information theoretic measures (e.g. the Kullback-Leibler divergence and Shannon mutual information) have been used for exploring possibly nonlinear multivariate dependencies in high dimensions. If these dependencies are assumed to follow a Markov factor graph model, this exploration process is called structure discovery. For discrete-valued samples, estimates of the information divergence over the parametric class of multinomial models lead to structure discovery methods whose mean squared error achieves parametric convergence rates as the sample size grows. However, a naive application of this method to continuous nonparametric multivariate models converges much more slowly. In this paper we introduce a new method for nonparametric structure discovery that uses weighted ensemble divergence estimators that achieve parametric convergence rates and obey an asymptotic central limit theorem that facilitates hypothesis testing and other types of statistical validation. Comment: 10 pages, 3 figures