
    Gaussian processes, kinematic formulae and Poincaré's limit

    We consider vector-valued, unit variance Gaussian processes defined over stratified manifolds and the geometry of their excursion sets. In particular, we develop an explicit formula for the expectation of all the Lipschitz-Killing curvatures of these sets. Whereas our motivation is primarily probabilistic, with statistical applications in the background, this formula also has an interpretation as a version of the classic kinematic fundamental formula of integral geometry. All of these aspects are developed in the paper. Particularly novel is the method of proof, which is based on an approximation to the canonical Gaussian process on the n-sphere. The n → ∞ limit, which gives the final result, is handled via recent extensions of the classic Poincaré limit theorem. Comment: Published at http://dx.doi.org/10.1214/08-AOP439 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Modeling and replicating statistical topology, and evidence for CMB non-homogeneity

    Under the banner of 'Big Data', the detection and classification of structure in extremely large, high-dimensional data sets is one of the central statistical challenges of our times. Among the most intriguing approaches to this challenge is 'TDA', or 'Topological Data Analysis', one of the primary aims of which is providing non-metric, but topologically informative, pre-analyses of data sets which make later, more quantitative analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications it has faced challenges due to an inability to handle issues of statistical reliability and robustness and, most importantly, an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis (the typical case for big data applications), replications can be generated to allow for conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical procedure for persistence diagram analysis in TDA. We demonstrate the basic ideas on a toy example, and the power of the approach in a novel and revealing analysis of CMB non-homogeneity.

    Validity of the expected Euler characteristic heuristic

    We study the accuracy of the expected Euler characteristic approximation to the distribution of the maximum of a smooth, centered, unit variance Gaussian process f. Using a point process representation of the error, valid for arbitrary smooth processes, we show that the error is in general exponentially smaller than any of the terms in the approximation. We also give a lower bound on this exponential rate of decay in terms of the maximal variance of a family of Gaussian processes f^x, derived from the original process f. Comment: Published at http://dx.doi.org/10.1214/009117905000000099 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org).