39,256 research outputs found

    Statistical Inference using the Morse-Smale Complex

    Full text link
    The Morse-Smale complex of a function ff decomposes the sample space into cells where ff is increasing or decreasing. When applied to nonparametric density estimation and regression, it provides a way to represent, visualize, and compare multivariate functions. In this paper, we present some statistical results on estimating Morse-Smale complexes. This allows us to derive new results for two existing methods: mode clustering and Morse-Smale regression. We also develop two new methods based on the Morse-Smale complex: a visualization technique for multivariate functions and a two-sample, multivariate hypothesis test.Comment: 45 pages, 13 figures. Accepted to Electronic Journal of Statistic

    Nonparametric ridge estimation

    Full text link
    We study the problem of estimating the ridges of a density function. Ridge estimation is an extension of mode finding and is useful for understanding the structure of a density. It can also be used to find hidden structure in point cloud data. We show that, under mild regularity conditions, the ridges of the kernel density estimator consistently estimate the ridges of the true density. When the data are noisy measurements of a manifold, we show that the ridges are close and topologically similar to the hidden manifold. To find the estimated ridges in practice, we adapt the modified mean-shift algorithm proposed by Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical experiments verify that the algorithm is accurate.Comment: Published in at http://dx.doi.org/10.1214/14-AOS1218 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Feature Selection For High-Dimensional Clustering

    Full text link
    We present a nonparametric method for selecting informative features in high-dimensional clustering problems. We start with a screening step that uses a test for multimodality. Then we apply kernel density estimation and mode clustering to the selected features. The output of the method consists of a list of relevant features, and cluster assignments. We provide explicit bounds on the error rate of the resulting clustering. In addition, we provide the first error bounds on mode based clustering.Comment: 11 pages, 2 figure
    • …
    corecore