(Un)Conditional Sample Generation Based on Distribution Element Trees

Author: Meyer, Daniel W.
Publication venue: Informa UK Limited
Publication date: 14/06/2018

Recently, distribution element trees (DETs) were introduced as an accurate and computationally efficient method for density estimation. In this work, we demonstrate that the DET formulation provides an easy and inexpensive way to generate random samples, similar to a smooth bootstrap. These samples can be generated unconditionally, but also, without further complications, conditionally, utilizing available information about certain probability-space components. Comment: published online in the Journal of Computational and Graphical Statistics.
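To make the sampling idea concrete, here is a minimal sketch of smooth-bootstrap-style sampling from a piecewise-constant density estimate. It is not the DET algorithm: an ordinary fixed-grid NumPy histogram stands in for the adaptive distribution elements, and the function names `sample_piecewise_constant` and `sample_conditional` are hypothetical. An element is chosen with probability equal to its mass, and a point is then drawn uniformly inside it. For the conditional case, the slice of the joint estimate containing the known component is renormalized first.

```python
import numpy as np

def sample_piecewise_constant(edges, masses, n, rng):
    """Draw n samples from a 1D piecewise-constant density.

    edges  : element boundaries, length k + 1 (a histogram grid here,
             standing in for adaptive distribution elements)
    masses : probability mass of each element, length k, summing to 1
    """
    edges = np.asarray(edges, dtype=float)
    masses = np.asarray(masses, dtype=float)
    # Step 1: pick an element with probability equal to its mass.
    idx = rng.choice(len(masses), size=n, p=masses)
    # Step 2: draw uniformly inside the chosen element.
    return rng.uniform(edges[idx], edges[idx + 1])

def sample_conditional(x_edges, y_edges, counts2d, x0, n, rng):
    """Sample y given x = x0 from a 2D piecewise-constant estimate."""
    # Locate the x-element containing x0 (clipped to the grid).
    i = int(np.searchsorted(x_edges, x0, side="right")) - 1
    i = min(max(i, 0), len(x_edges) - 2)
    row = counts2d[i].astype(float)
    # Renormalize that slice of the joint estimate to get conditional masses.
    return sample_piecewise_constant(y_edges, row / row.sum(), n, rng)

rng = np.random.default_rng(0)

# Unconditional sampling from a 1D estimate.
data = rng.normal(size=5000)
counts, edges = np.histogram(data, bins=20)
samples = sample_piecewise_constant(edges, counts / counts.sum(), 1000, rng)

# Conditional sampling: y given x = 0 for correlated 2D data.
x = rng.normal(size=5000)
y = 0.8 * x + 0.6 * rng.normal(size=5000)
counts2d, x_edges, y_edges = np.histogram2d(x, y, bins=15)
y_given_x0 = sample_conditional(x_edges, y_edges, counts2d, 0.0, 1000, rng)
```

Because new points are spread uniformly within each element rather than resampled from the original data, the generated samples do not repeat the observed values, which is the sense in which the approach resembles a smooth bootstrap.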

arXiv.org e-Print Archive

FigShare

Offline and Online Density Estimation for Large High-Dimensional Data

Author: Majdara, Aref
Publication venue: Digital Commons @ Michigan Tech
Publication date: 01/01/2018

Density estimation has wide applications in machine learning and data analysis, including clustering, classification, multimodality analysis, bump hunting, and anomaly detection. In high-dimensional spaces, the sparsity of data in local neighborhoods makes many parametric and nonparametric density estimation methods inefficient. This work presents the development of computationally efficient algorithms for high-dimensional density estimation based on Bayesian sequential partitioning (BSP). A copula transform is used to separate the estimation of marginal and joint densities, with the purpose of reducing the computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is presented. Some example applications of high-dimensional density estimation in density-based classification and clustering are also presented. Another challenge in density estimation arises in dealing with online sources of data, where data arrive over an open-ended and non-stationary stream. This calls for efficient algorithms for online density estimation. An online density estimator needs to be capable of providing up-to-date estimates of the density, within the bounds of the available computing resources and the requirements of the application. In response, the BBSP method for online density estimation is introduced. It collects and processes the data in blocks of fixed size and then takes a weighted average over the block-wise estimates of the density. The proper choice of block size is discussed via simulations for streams of synthetic and real datasets. Further, to improve efficiency in offline and online density estimation, a progressive update of the binary partitions in BBSP is proposed, which, as simulation results show, leads to improved accuracy as well as speed-up for various block sizes.
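The block-wise scheme described in this abstract can be illustrated with a small sketch. It rests on several assumptions not taken from the source: a fixed-grid histogram replaces the BSP estimator for each block, the weighted average uses an exponential forgetting factor (`decay`), and the class name `BlockwiseHistogramDensity` is hypothetical. The abstract does not state how BBSP chooses its weights.

```python
import numpy as np

class BlockwiseHistogramDensity:
    """Online 1D density estimate built from fixed-size blocks.

    Assumption: each block is summarized by a histogram on a fixed grid
    (a stand-in for BSP), and block estimates are combined with an
    exponentially weighted average (an assumed weighting scheme).
    """

    def __init__(self, edges, block_size, decay=0.9):
        self.edges = np.asarray(edges, dtype=float)
        self.block_size = block_size
        self.decay = decay
        self.buffer = []
        self.density = None  # current estimate, one value per bin

    def update(self, x):
        """Add one observation; refresh the estimate when a block is full."""
        self.buffer.append(x)
        if len(self.buffer) < self.block_size:
            return
        # Density estimate for the completed block.
        block_density, _ = np.histogram(self.buffer, bins=self.edges, density=True)
        if self.density is None:
            self.density = block_density
        else:
            # Weighted average of the old estimate and the new block estimate.
            self.density = self.decay * self.density + (1.0 - self.decay) * block_density
        self.buffer = []

rng = np.random.default_rng(1)
est = BlockwiseHistogramDensity(edges=np.linspace(-10.0, 10.0, 41), block_size=200)

# A non-stationary stream: the mean shifts from 0 to 3 halfway through.
stream = np.concatenate([rng.normal(0.0, 1.0, 2000), rng.normal(3.0, 1.0, 2000)])
for value in stream:
    est.update(value)

total_mass = float(np.sum(est.density * np.diff(est.edges)))
```

Since every block histogram integrates to one over the grid and the update is a convex combination, the running estimate stays normalized. A smaller `decay` makes the estimate follow changes in the stream faster, at the cost of more noise.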

Michigan Technological University

Density estimation with distribution element trees

Author: Meyer, Daniel W.
Publication venue: Springer
Publication date: 03/05/2017

The estimation of probability densities based on available data is a central task in many statistical applications. Especially in the case of large ensembles with many samples or high-dimensional sample spaces, computationally efficient methods are needed. We propose a new method that is based on a decomposition of the unknown distribution in terms of so-called distribution elements (DEs). These elements enable an adaptive and hierarchical discretization of the sample space with large or small elements in regions with smoothly or highly variable densities, respectively. The novel refinement strategy that we propose is based on statistical goodness-of-fit and pairwise (as an approximation to mutual) independence tests that evaluate the local approximation of the distribution in terms of DEs. The capabilities of our new method are inspected based on several examples of different dimensionality and successfully compared with other state-of-the-art density estimators. ISSN: 0960-3174; ISSN: 1573-137

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Density estimation with distribution element trees

Author: A Achilleos, A Kogure, A Papoulis, BU Park, BW Silverman, C Kooperberg, Daniel W. Meyer, DO Loftsgaarden, DW Scott, GR Shorack, H Jiang, HB Mann, J Jing, J Zaunders, JS Marron, K Pearson, L Bagnato, L Breiman, L Ma, LF Shampine, M Rosenblatt, M Steele, MC Jones, MG Kendall, N Smirnov, R Cao, RB Nelsen, RM Neal, RR Curtin, SJ Sheather, TA O’Brien, TS Ferguson, WG Cochran, WH Wong, X Wang, Y Yang, ZI Botev
Publication venue: Springer Science and Business Media LLC

Crossref