Stochastic Convergence of Persistence Landscapes and Silhouettes

Author: Chazal Frédéric
Fasy Brittany Terese
Lecci Fabrizio
Rinaldo Alessandro
Wasserman Larry
Publication date: 01/12/2013

Persistent homology is a widely used tool in Topological Data Analysis that encodes multiscale topological information as a multi-set of points in the plane called a persistence diagram. It is difficult to apply statistical theory directly to a random sample of diagrams. Instead, we can summarize the persistent homology with the persistence landscape, introduced by Bubenik, which converts a diagram into a well-behaved real-valued function. We investigate the statistical properties of landscapes, such as weak convergence of the average landscapes and convergence of the bootstrap. In addition, we introduce an alternate functional summary of persistent homology, which we call the silhouette, and derive an analogous statistical theory.
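The abstract above describes turning a persistence diagram into real-valued functions. As a minimal sketch, assuming a diagram is given as a list of (birth, death) pairs, the k-th landscape at t can be taken as the k-th largest "tent" value max(0, min(t - birth, death - t)), and a power-weighted silhouette as the average of the tents with weights |death - birth|^p. The function names and the example diagram are illustrative, not taken from the paper's software.

```python
def tent(birth, death, t):
    """Triangle function of one diagram point (birth, death), evaluated at t."""
    return max(0.0, min(t - birth, death - t))


def landscape(diagram, k, t):
    """k-th landscape value at t (k = 1 is the outermost layer)."""
    values = sorted((tent(b, d, t) for b, d in diagram), reverse=True)
    return values[k - 1] if k <= len(values) else 0.0


def silhouette(diagram, t, p=1.0):
    """Weighted average of the tents, with weights |death - birth| ** p."""
    weights = [abs(d - b) ** p for b, d in diagram]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # empty diagram or only zero-persistence points
    return sum(w * tent(b, d, t) for w, (b, d) in zip(weights, diagram)) / total


def average_landscape(diagrams, k, t):
    """Pointwise mean of the k-th landscape over a sample of diagrams."""
    return sum(landscape(dgm, k, t) for dgm in diagrams) / len(diagrams)


# Hypothetical diagram used only for illustration.
diagram = [(0.0, 4.0), (1.0, 3.0), (2.0, 6.0)]
```

Because both summaries are ordinary functions of t, a sample of diagrams can be averaged pointwise, which is what makes statements about average landscapes and the bootstrap possible.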

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Directory of Open Access Journals

Journal of Computational Geometry (JoCG - Carleton University, Computational Geometry Lab)

Persistent homology analysis of brain artery trees

Author: Bendich Paul
Marron
Miller Ezra
Pieloch Alex
Skwerer Sean
Publication date: 01/01/2016

New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set. The correlation with age continues to be significant even after controlling for correlations from earlier significant summaries.

PubMed Central

Carolina Digital Repository

New methods for fixed-margin binary matrix sampling, Fréchet covariance, and MANOVA tests for random objects in multiple metric spaces

Author: Fout Alex M.
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2022

2022 Summer. Includes bibliographical references.

Many approaches to the analysis of network data essentially view the data as Euclidean and apply standard multivariate techniques. In this dissertation, we refrain from this approach, exploring two alternate approaches to the analysis of networks and other structured data. The first approach seeks to determine how unique an observed simple, directed network is by comparing it to like networks which share its degree distribution. Generating networks for comparison requires sampling from the space of all binary matrices with the prescribed row and column margins, since enumeration of all such matrices is often infeasible for even moderately sized networks with 20-50 nodes. We propose two new sampling methods for this problem. First, we extend two Markov chain Monte Carlo methods to sample from the space non-uniformly, allowing flexibility in the case that some networks are more likely than others. We show that non-uniform sampling could impede the MCMC process, but in certain special cases is still valid. Critically, we illustrate the differential conclusions that could be drawn from uniform vs. non-uniform sampling. Second, we develop a generalized divide and conquer approach which recursively divides matrices into smaller subproblems which are much easier to count and sample. Each division step reveals interesting mathematics involving the enumeration of integer partitions and points in convex lattice polytopes. The second broad approach we explore is comparing random objects in metric spaces lacking a coordinate system. Traditional definitions of the mean and variance no longer apply, and standard statistical tests have needed reconceptualization in terms of only distances in the metric space. We consider the multivariate setting where random objects exist in multiple metric spaces, which can be thought of as distinct views of the random object. We define the notion of Fréchet covariance to measure dependence between two metric spaces, and establish consistency for the sample estimator. We then propose several tests for differences in means and covariance matrices among two or more groups in multiple metric spaces, and compare their performance on scenarios involving random probability distributions and networks with node covariates.
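To make the fixed-margin sampling problem in the abstract above concrete, here is a minimal sketch of the classical "checkerboard swap" Markov chain, which is a standard baseline and not the dissertation's new methods. It assumes a plain rectangular 0/1 matrix and ignores the zero-diagonal constraint of simple directed networks. The names `swap_step` and `sample_fixed_margins` are hypothetical.

```python
import random


def margins(matrix):
    """Return (row sums, column sums) of a 0/1 matrix given as a list of lists."""
    return [sum(row) for row in matrix], [sum(col) for col in zip(*matrix)]


def swap_step(matrix, rng):
    """Try one checkerboard swap in place; do nothing if the 2x2 block is not one."""
    r1, r2 = rng.sample(range(len(matrix)), 2)
    c1, c2 = rng.sample(range(len(matrix[0])), 2)
    a, b = matrix[r1][c1], matrix[r1][c2]
    c, d = matrix[r2][c1], matrix[r2][c2]
    # A checkerboard is [[1, 0], [0, 1]] or [[0, 1], [1, 0]]; flipping all four
    # entries turns one into the other and keeps every row and column sum.
    if a == d and b == c and a != b:
        for r, col in ((r1, c1), (r1, c2), (r2, c1), (r2, c2)):
            matrix[r][col] = 1 - matrix[r][col]


def sample_fixed_margins(matrix, n_steps=10_000, seed=0):
    """Run the swap chain from `matrix` and return the final state as a new matrix."""
    rng = random.Random(seed)
    state = [row[:] for row in matrix]
    for _ in range(n_steps):
        swap_step(state, rng)
    return state


# Hypothetical 4x4 adjacency-like matrix used only for illustration.
observed = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
sampled = sample_fixed_margins(observed, n_steps=2_000, seed=42)
```

Rejected proposals leave the state unchanged, so the proposal is symmetric and the chain is aimed at the uniform distribution over matrices with the given margins. Sampling non-uniformly, as the abstract discusses, would require an additional acceptance rule that this sketch does not include.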

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Interpretable statistics for complex modelling: quantile and topological learning

Author: Padellini Tullia
Publication date: 22/02/2019

As the complexity of our data has increased exponentially in recent decades, so has our need for interpretable features. This thesis revolves around two paradigms to approach this quest for insights. In the first part we focus on parametric models, where the problem of interpretability can be seen as a “parametrization selection”. We introduce a quantile-centric parametrization and we show the advantages of our proposal in the context of regression, where it allows us to bridge the gap between classical generalized linear (mixed) models and increasingly popular quantile methods. The second part of the thesis, concerned with topological learning, tackles the problem from a non-parametric perspective. As topology can be thought of as a way of characterizing data in terms of their connectivity structure, it allows us to represent complex and possibly high-dimensional data through a few features, such as the number of connected components, loops and voids. We illustrate how the emerging branch of statistics devoted to recovering topological structures in the data, Topological Data Analysis, can be exploited both for exploratory and inferential purposes, with a special emphasis on kernels that preserve the topological information in the data. Finally, we show with an application how these two approaches can borrow strength from one another in the identification and description of brain activity through fMRI data from the ABIDE project.
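The simplest of the topological features named in the abstract above, the number of connected components, can be illustrated with a short sketch. It assumes a point cloud in Euclidean space and a hypothetical scale parameter `eps`: two points are joined when they lie within `eps` of each other, and components are counted with a union-find structure. Whether the thesis uses this particular construction is not stated in the abstract.

```python
import math


def count_components(points, eps):
    """Count connected components of the graph joining points at distance <= eps."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = len(points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j  # merge the two components
                    components -= 1
    return components


# Hypothetical point cloud: one cluster near the origin, one near (10, 10).
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
counts = {eps: count_components(cloud, eps) for eps in (0.5, 1.0, 20.0)}
```

Tracking how this count changes as `eps` grows is the zero-dimensional case of the multiscale summaries that Topological Data Analysis works with.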

Archivio della ricerca- Università di Roma La Sapienza

SuPP & MaPP: Adaptable Structure-Based Representations for MIR Tasks

Author: Bugbee Erin H.
Kinnaird Katherine M.
McGuirl Melissa R.
Savard Claire
Publication venue: Smith ScholarWorks
Publication date: 01/01/2020

Accurate and flexible representations of music data are paramount to addressing MIR tasks, yet many of the existing approaches are difficult to interpret or rigid in nature. This work introduces two new song representations for structure-based retrieval methods: Surface Pattern Preservation (SuPP), a continuous song representation, and Matrix Pattern Preservation (MaPP), SuPP’s discrete counterpart. These representations come equipped with several user-defined parameters so that they are adaptable for a range of MIR tasks. Experimental results show MaPP as successful in addressing the cover song task on a set of Mazurka scores, with a mean precision of 0.965 and recall of 0.776. SuPP and MaPP also show promise in other MIR applications, such as novel-segment detection and genre classification, the latter of which demonstrates their suitability as inputs for machine learning problems.

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Smith College: Smith ScholarWorks

Unified Topological Inference for Brain Networks in Temporal Lobe Epilepsy Using the Wasserstein Distance

Author: Binder Jeffery R.
Chung Moo K.
De Paiva Felipe Branco
Hermann Bruce P.
Mathis Jedidiah
Meyerand Elizabeth
Nair Veena A.
Prabharakaren Vivek
Ramos Camille Garcia
Struck Aaron F.
Publication date: 13/02/2023

Persistent homology can extract hidden topological signals present in brain networks. Persistent homology summarizes the changes of topological structures over multiple different scales called filtrations. Doing so detects hidden topological signals that persist over multiple scales. However, a key obstacle to applying persistent homology to brain network studies has always been the lack of a coherent statistical inference framework. To address this problem, we present a unified topological inference framework based on the Wasserstein distance. Our approach has no explicit models or distributional assumptions. The inference is performed in a completely data-driven fashion. The method is applied to the resting-state functional magnetic resonance images (rs-fMRI) of temporal lobe epilepsy patients collected at two different sites: University of Wisconsin-Madison and the Medical College of Wisconsin. However, the topological method is robust to variations due to sex and acquisition, and thus there is no need to account for sex and site as categorical nuisance covariates. We are able to localize brain regions that contribute the most to topological differences. We made a MATLAB package available at https://github.com/laplcebeltrami/dynamicTDA that was used to perform all the analysis in this study.
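As a rough illustration of distance-based, assumption-free inference of the kind the abstract above describes, the sketch below is written in Python rather than the authors' MATLAB and is not their implementation. It assumes each network has already been reduced to a list of scalar topological values (for example, birth values) of the same length. For such one-dimensional data the optimal matching pairs the sorted values. All names here are hypothetical, and the test statistic (mean between-group distance) is a stand-in chosen for simplicity.

```python
import random


def wasserstein_1d(xs, ys, p=2):
    """Unnormalized p-Wasserstein distance between equal-size lists of scalars."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes the same number of values in each list")
    # In one dimension the optimal matching pairs the i-th smallest values.
    total = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return total ** (1.0 / p)


def between_group_mean(items, labels, dist):
    """Mean distance over all pairs of items carrying different labels."""
    pairs = [
        dist(items[i], items[j])
        for i in range(len(items))
        for j in range(i + 1, len(items))
        if labels[i] != labels[j]
    ]
    return sum(pairs) / len(pairs)


def permutation_p_value(items, labels, dist, n_perm=999, seed=0):
    """Permutation p-value: how often shuffled labels separate the groups as well."""
    rng = random.Random(seed)
    observed = between_group_mean(items, labels, dist)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_group_mean(items, shuffled, dist) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: four "networks" per group, three scalar values each.
group_a = [[0.0, 0.1, 0.2], [0.05, 0.1, 0.25], [0.0, 0.15, 0.2], [0.1, 0.1, 0.3]]
group_b = [[v + 1.0 for v in net] for net in group_a]
p_value = permutation_p_value(group_a + group_b, ["a"] * 4 + ["b"] * 4, wasserstein_1d)
```

General persistence diagrams are two-dimensional and may differ in size, so a full treatment would match points to the diagonal as well; the sorted-matching shortcut applies only under the equal-size, scalar assumption stated above.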

arXiv.org e-Print Archive

Directory of Open Access Journals

Topological and geometric inference of data

Author: Goucher Adam P
Publication venue: University of Cambridge
Publication date: 01/08/2020

The overarching problem under consideration is to determine the structure of the subspace on which a distribution is supported, given only a finite noisy sample thereof. The special case in which the subspace is an embedded manifold is given particular attention owing to its conceptual elegance, and asymptotic bounds are obtained on the admissible level of noise such that the manifold can be recovered up to homotopy equivalence. Attention then turns to how to accomplish this in practice. Following ideas from topological data analysis, simplicial complexes are used as discrete analogues of spaces suitable for computation. By utilising the prior assumption that the data lie on a manifold, topologically inspired techniques are proposed for refining the simplicial complex to better approximate this manifold. This is applied to the problem of nonlinear dimensionality reduction and found to improve the accuracy of reconstructing several synthetic and real-world datasets. The second chapter focuses on extending this work to the case where the ambient space is non-Euclidean. The interfaces between topological data analysis, functional data analysis, and shape analysis are thoroughly explored. Lipschitz bounds are proved which relate several metrics on the space of positive semidefinite matrices; they are then interpreted in the context of topological data analysis. This is applied to diffusion tensor imaging and phonology. The final chapter explores the case where the points are non-uniformly distributed over the embedded subspace. In particular, a method is proposed to overcome the shortcomings of witness complex construction when there are large deviations in the density. The theory of multidimensional persistence is leveraged to provide a succinct setting in which the structure of the data can be interpreted as a generalised stratified space.

Apollo (Cambridge)