Gaussian processes, kinematic formulae and Poincaré's limit
We consider vector valued, unit variance Gaussian processes defined over
stratified manifolds and the geometry of their excursion sets. In particular,
we develop an explicit formula for the expectation of all the
Lipschitz--Killing curvatures of these sets. Whereas our motivation is
primarily probabilistic, with statistical applications in the background, this
formula has also an interpretation as a version of the classic kinematic
fundamental formula of integral geometry. All of these aspects are developed in
the paper. Particularly novel is the method of proof, which is based on an
approximation to the canonical Gaussian process on the $n$-sphere. The
$n\to\infty$ limit, which gives the final result, is handled via recent
extensions of the classic Poincaré limit theorem.
Comment: Published at http://dx.doi.org/10.1214/08-AOP439 in the Annals of
Probability (http://www.imstat.org/aop/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Modeling and replicating statistical topology, and evidence for CMB non-homogeneity
Under the banner of `Big Data', the detection and classification of structure
in extremely large, high-dimensional data sets is one of the central
statistical challenges of our times. Among the most intriguing approaches to
this challenge is `TDA', or `Topological Data Analysis', one of the primary
aims of which is providing non-metric, but topologically informative,
pre-analyses of data sets which make later, more quantitative analyses
feasible. While TDA rests on strong mathematical foundations from Topology, in
applications it has faced challenges due to an inability to handle issues of
statistical reliability and robustness and, most importantly, an inability
to make scientific claims with verifiable levels of statistical confidence. We
propose a methodology for the parametric representation, estimation, and
replication of persistence diagrams, the main diagnostic tool of TDA. The power
of the methodology lies in the fact that even if only one persistence diagram
is available for analysis -- the typical case for big data applications --
replications can be generated to allow for conventional statistical hypothesis
testing. The methodology is conceptually simple and computationally practical,
and provides a broadly effective statistical procedure for persistence diagram
TDA analysis. We demonstrate the basic ideas on a toy example, and the power of
the approach in a novel and revealing analysis of CMB non-homogeneity.
Validity of the expected Euler characteristic heuristic
We study the accuracy of the expected Euler characteristic approximation to
the distribution of the maximum of a smooth, centered, unit variance Gaussian
process f. Using a point process representation of the error, valid for
arbitrary smooth processes, we show that the error is in general exponentially
smaller than any of the terms in the approximation. We also give a lower bound
on this exponential rate of decay in terms of the maximal variance of a family
of Gaussian processes f^x, derived from the original process f.
Comment: Published at http://dx.doi.org/10.1214/009117905000000099 in the
Annals of Probability (http://www.imstat.org/aop/) by the Institute of
Mathematical Statistics (http://www.imstat.org
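
As a rough illustration of the expected Euler characteristic approximation named in the abstract above, the sketch below evaluates it in the simplest setting: a stationary, centered, unit variance Gaussian process on an interval. This special case is an assumption made here for illustration and is not taken from the paper; the names `gaussian_tail`, `expected_ec`, `length` and `lam2` are hypothetical. In one dimension the Euler characteristic of the excursion set counts its connected components, so its expectation is a Gaussian tail term plus the Rice upcrossing term.

```python
import math


def gaussian_tail(u: float) -> float:
    """Upper tail P(Z >= u) of a standard normal variable Z."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def expected_ec(u: float, length: float, lam2: float) -> float:
    """Expected Euler characteristic of {t in [0, length] : f(t) >= u}.

    Assumption (not from the abstract): f is stationary, centered and
    unit variance, with second spectral moment lam2 = Var(f'(t)).
    The excursion set has 1{f(0) >= u} + (number of upcrossings of u)
    connected components, so the expectation has two terms.
    """
    # Contribution of the left endpoint: P(f(0) >= u).
    boundary_term = gaussian_tail(u)
    # Rice formula for the expected number of upcrossings of level u.
    upcrossing_term = (
        length * math.sqrt(lam2) / (2.0 * math.pi) * math.exp(-0.5 * u * u)
    )
    return boundary_term + upcrossing_term


if __name__ == "__main__":
    # The heuristic uses expected_ec(u, ...) as an approximation to
    # P(max f >= u) at high levels u.
    for level in (2.0, 3.0, 4.0):
        print(level, expected_ec(level, length=10.0, lam2=1.0))
```

The heuristic takes this quantity as an approximation to the probability that the maximum of f exceeds u; the abstract's claim is that, for large u, the error is exponentially smaller than each term of the approximation.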