Nonparametric ridge estimation
We study the problem of estimating the ridges of a density function. Ridge
estimation is an extension of mode finding and is useful for understanding the
structure of a density. It can also be used to find hidden structure in point
cloud data. We show that, under mild regularity conditions, the ridges of the
kernel density estimator consistently estimate the ridges of the true density.
When the data are noisy measurements of a manifold, we show that the ridges are
close and topologically similar to the hidden manifold. To find the estimated
ridges in practice, we adapt the modified mean-shift algorithm proposed by
Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical
experiments verify that the algorithm is accurate.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
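As an illustration of the mean-shift-style ridge search mentioned in this abstract, here is a minimal sketch of a subspace-constrained mean-shift iteration. It is not the authors' implementation: the Gaussian kernel, the use of the Hessian of the (unnormalized) density estimate, and the names `scms_step`, `scms`, `h` and `ridge_dim` are all assumptions made for the example.

```python
import numpy as np

def scms_step(x, data, h, ridge_dim=1):
    """One subspace-constrained mean-shift step with a Gaussian kernel (sketch)."""
    diffs = data - x                                    # (n, D) vectors X_i - x
    w = np.exp(-0.5 * np.sum(diffs**2, axis=1) / h**2)  # kernel weights
    mean_shift = w @ diffs / w.sum()                    # ordinary mean-shift vector
    D = data.shape[1]
    # Hessian of the unnormalized Gaussian kernel density estimate at x.
    hess = (diffs.T * w) @ diffs / h**4 - w.sum() * np.eye(D) / h**2
    _, eigvecs = np.linalg.eigh(hess)                   # eigenvalues in ascending order
    V = eigvecs[:, : D - ridge_dim]                     # directions of smallest eigenvalues
    # Move only within the span of those directions.
    return x + V @ (V.T @ mean_shift)

def scms(data, h, ridge_dim=1, n_iter=200, tol=1e-6):
    """Run the sketch from every data point; returns the moved points."""
    points = np.array(data, dtype=float)
    for i in range(len(points)):
        x = points[i]
        for _ in range(n_iter):
            x_new = scms_step(x, data, h, ridge_dim)
            converged = np.linalg.norm(x_new - x) < tol
            x = x_new
            if converged:
                break
        points[i] = x
    return points
```

Calling `scms(data, h=0.3)` on a two-dimensional point cloud returns one moved point per input point. The bandwidth `h` has to be chosen by the user, and points that start far from any ridge may not reach one.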
Nonparametric Inference For Density Modes
We derive nonparametric confidence intervals for the eigenvalues of the
Hessian at modes of a density estimate. This provides information about the
strength and shape of modes and can also be used as a significance test. We use
a data-splitting approach in which potential modes are identified using the
first half of the data and inference is done with the second half of the data.
To get valid confidence sets for the eigenvalues, we use a bootstrap based on
an elementary-symmetric-polynomial (ESP) transformation. This leads to valid
bootstrap confidence sets regardless of any multiplicities in the eigenvalues.
We also suggest a new method for bandwidth selection, namely, choosing the
bandwidth to maximize the number of significant modes. We show by example that
this method works well. Even when the true distribution is singular and hence
has no density (in which case cross-validation chooses a zero bandwidth), our
method chooses a reasonable bandwidth.
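The abstract does not spell out the ESP transformation, so the following is only a sketch of the underlying idea. The elementary symmetric polynomials of the eigenvalues of a matrix are, up to sign, the coefficients of its characteristic polynomial. They are therefore polynomial in the matrix entries and remain smooth even when eigenvalues coincide. The function names are hypothetical, and no bootstrap is implemented here.

```python
import numpy as np

def esp_of_eigenvalues(H):
    """Elementary symmetric polynomials e_1, ..., e_D of the eigenvalues of H."""
    H = np.asarray(H, dtype=float)
    coeffs = np.poly(H)                      # characteristic polynomial: [1, -e1, e2, ...]
    signs = (-1.0) ** np.arange(1, len(coeffs))
    return signs * coeffs[1:]

def eigenvalues_from_esp(e):
    """Map ESP values back to sorted eigenvalues by root finding."""
    e = np.asarray(e, dtype=float)
    signs = (-1.0) ** np.arange(1, len(e) + 1)
    coeffs = np.concatenate(([1.0], signs * e))
    # Assumes the roots are real, as they are for a symmetric H.
    return np.sort(np.roots(coeffs).real)
```

For a symmetric Hessian with eigenvalues 2 and 3, the transformed values are their sum and their product, and `eigenvalues_from_esp` recovers the eigenvalues in sorted order.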
Manifold estimation and singular deconvolution under Hausdorff loss
We find lower and upper bounds for the risk of estimating a manifold in
Hausdorff distance under several models. We also show that there are close
connections between manifold estimation and the problem of deconvolving a
singular measure.
Comment: Published at http://dx.doi.org/10.1214/12-AOS994 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
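For readers unfamiliar with the loss used here, the Hausdorff distance between two finite point sets can be computed as in the sketch below. Representing the sets as finite samples and the name `hausdorff_distance` are assumptions for illustration. The paper itself concerns manifolds, not finite sets.

```python
import numpy as np

def hausdorff_distance(A, B):
    """Symmetric Hausdorff distance between point sets A (m, D) and B (k, D)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    # Pairwise Euclidean distances, shape (m, k).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    a_to_b = d.min(axis=1).max()   # farthest point of A from the set B
    b_to_a = d.min(axis=0).max()   # farthest point of B from the set A
    return max(a_to_b, b_to_a)
```

The distance is large whenever either set contains a point far from the other set, which is why a single outlying point in an estimate is penalized under this loss.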
On the path density of a gradient field
We consider the problem of reliably finding filaments in point clouds.
Realistic data sets often have numerous filaments of various sizes and shapes.
Statistical techniques exist for finding one (or a few) filaments but these
methods do not handle noisy data sets with many filaments. Other methods can be
found in the astronomy literature but they do not have rigorous statistical
guarantees. We propose the following method. Starting at each data point we
construct the steepest ascent path along a kernel density estimator. We locate
filaments by finding regions where these paths are highly concentrated.
Formally, we define the density of these paths and we construct a consistent
estimator of this path density.
Comment: Published at http://dx.doi.org/10.1214/08-AOS671 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
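The first step of the method described above, following an ascent path of a kernel density estimator from a starting point, can be sketched as follows. This is not the authors' path density estimator. It assumes a Gaussian kernel and approximates the steepest ascent curve with damped mean-shift steps, which point along the gradient of the estimate. The names `gaussian_kde_value`, `ascent_path`, `h` and `step` are hypothetical.

```python
import numpy as np

def gaussian_kde_value(x, data, h):
    """Gaussian kernel density estimate at a single point x."""
    n, D = data.shape
    sq = np.sum((data - x) ** 2, axis=1)
    return np.exp(-0.5 * sq / h**2).sum() / (n * (2 * np.pi * h**2) ** (D / 2))

def ascent_path(x0, data, h, step=0.2, n_steps=300, tol=1e-8):
    """Discretized ascent path of the KDE, started at x0 (sketch)."""
    x = np.asarray(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        diffs = data - x
        w = np.exp(-0.5 * np.sum(diffs**2, axis=1) / h**2)
        shift = w @ diffs / w.sum()      # mean-shift vector, parallel to the KDE gradient
        if np.linalg.norm(shift) < tol:  # stop near a critical point
            break
        x = x + step * shift             # damped step along the ascent direction
        path.append(x.copy())
    return np.array(path)
```

Running `ascent_path(point, data, h)` for each data point gives one path per point. Smaller values of `step` follow the continuous ascent curve more closely at the cost of more iterations.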
Minimax Manifold Estimation
We find the minimax rate of convergence in Hausdorff distance for estimating a manifold $M$ of dimension $d$ embedded in $\mathbb{R}^D$ given a noisy sample from the manifold. Under certain conditions, we show that the optimal rate of convergence is $n^{-2/(2+d)}$. Thus, the minimax rate depends only on the dimension of the manifold, not on the dimension of the space in which $M$ is embedded.
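To make the stated rate concrete, substituting small values of the manifold dimension $d$ into the exponent $-2/(2+d)$ gives the following. This is plain arithmetic on the rate quoted in the abstract, not an additional result.

```latex
\[
d = 1:\; n^{-2/(2+1)} = n^{-2/3}, \qquad
d = 2:\; n^{-2/(2+2)} = n^{-1/2}, \qquad
d = 3:\; n^{-2/(2+3)} = n^{-2/5}.
\]
```

The ambient dimension $D$ does not appear in any of these exponents.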
