Inference of Ancestral Recombination Graphs through Topological Data Analysis
The recent explosion of genomic data has underscored the need for
interpretable and comprehensive analyses that can capture complex phylogenetic
relationships within and across species. Recombination, reassortment and
horizontal gene transfer constitute examples of pervasive biological phenomena
that cannot be captured by tree-like representations. Starting from hundreds of
genomes, we are interested in the reconstruction of potential evolutionary
histories leading to the observed data. Ancestral recombination graphs
represent potential histories that explicitly accommodate recombination and
mutation events across orthologous genomes. However, they are computationally
costly to reconstruct, usually being infeasible for more than a few tens of
genomes. Recently, Topological Data Analysis (TDA) methods have been proposed
as robust and scalable methods that can capture the genetic scale and frequency
of recombination. We build upon previous TDA developments for detecting and
quantifying recombination, and present a novel framework that can be applied to
hundreds of genomes and can be interpreted in terms of minimal histories of
mutation and recombination events, quantifying the scales and identifying the
genomic locations of recombinations. We implement this framework in a software
package, called TARGet, and apply it to several examples, including small
migration between different populations, human recombination, and horizontal
evolution in finches inhabiting the Galápagos Islands.
Comment: 33 pages, 12 figures. The accompanying software, instructions and
example files used in the manuscript can be obtained from
https://github.com/RabadanLab/TARGe
Distances and Isomorphism between Networks and the Stability of Network Invariants
We develop the theoretical foundations of a network distance that has
recently been applied to various subfields of topological data analysis, namely
persistent homology and hierarchical clustering. While this network distance
has previously appeared in the context of finite networks, we extend the
setting to that of compact networks. The main challenge in this new setting is
the lack of an easy notion of sampling from compact networks; we solve this
problem in the process of obtaining our results. The generality of our setting
means that we automatically establish results for exotic objects such as
directed metric spaces and Finsler manifolds. We identify readily computable
network invariants and establish their quantitative stability under this
network distance. We also discuss the computational complexity involved in
precisely computing this distance, and develop easily-computable lower bounds
by using the identified invariants. By constructing a wide range of explicit
examples, we show that these lower bounds are effective in distinguishing
between networks. Finally, we provide a simple algorithm that computes a lower
bound on the distance between two networks in polynomial time and illustrate
our metric and invariant constructions on a database of random networks and a
database of simulated hippocampal networks.
Optimal rates of convergence for persistence diagrams in Topological Data Analysis
Computational topology has recently seen important developments toward
data analysis, giving rise to the field of topological data analysis.
Topological persistence, or persistent homology, appears as a fundamental tool
in this field. In this paper, we study topological persistence in general
metric spaces, with a statistical approach. We show that the use of persistent
homology can be naturally considered in general statistical frameworks and
persistence diagrams can be used as statistics with interesting convergence
properties. Some numerical experiments are performed in various contexts to
illustrate our results.
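
As a rough illustration of what it means to treat a persistence diagram as a statistic computed from a sample, the sketch below computes only the 0-dimensional part of a diagram for a finite point cloud. It is not taken from the paper: the function name `h0_persistence` is hypothetical, the points are assumed to lie in Euclidean space, and the filtration is assumed to add an edge at scale equal to the pairwise distance (some conventions use half that value). Every component is born at scale 0, so only the death scales are returned.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death scales of connected components as the scale grows.

    Hypothetical helper: every point starts as its own component (born at 0);
    a component dies when an edge first merges it into another one. The single
    component that never dies is not reported.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (the filtration order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # This edge merges two components, so one of them dies here.
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Three collinear points: two close together, one far away.
sample = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
deaths = h0_persistence(sample)
```

On this sample the two nearby points merge at scale 1 and the distant point joins at scale 4. Higher-dimensional features (loops, voids) need a full persistent homology computation, which this sketch does not attempt.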
Persistent topology for natural data analysis - A survey
Natural data offer a hard challenge to data analysis. One set of tools is
being developed by several teams to face this difficult task: persistent
topology. After a brief introduction to this theory, some applications to the
analysis and classification of cells, lesions, music pieces, gait, oil and gas
reservoirs, cyclones, galaxies, bones, brain connections, languages,
handwritten and gestured letters are shown.
Forman's Ricci curvature - From networks to hypernetworks
Networks and their higher order generalizations, such as hypernetworks or
multiplex networks, are ever more popular models in the applied sciences.
However, methods developed for the study of their structural properties go
little beyond the common name and the heavy reliance on combinatorial tools. We
show that, in fact, a geometric unifying approach is possible, by viewing them
as polyhedral complexes endowed with a simple yet powerful notion of
curvature - the Forman Ricci curvature. We systematically explore some aspects
related to the modeling of weighted and directed hypernetworks and present
expressive and natural choices involved in their definitions. A benefit of this
approach is a simple method of structure-preserving embedding of hypernetworks
in Euclidean N-space. Furthermore, we introduce a simple and efficient manner
of computing the well-established Ollivier-Ricci curvature of a hypernetwork.
Comment: to appear: Complex Networks '18 (oral presentation)
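
For orientation, the simplest special case of Forman Ricci curvature can be sketched in a few lines. The example below assumes an unweighted, undirected simple graph and ignores higher-dimensional faces, in which case the curvature of an edge (u, v) reduces to 4 - deg(u) - deg(v). It does not cover the weighted, directed or hypernetwork definitions discussed in the abstract, and the function name `forman_curvature` is hypothetical.

```python
from collections import defaultdict


def forman_curvature(edges):
    """Forman Ricci curvature of each edge in an unweighted simple graph.

    Assumes the edge-only (1-dimensional) case with unit weights, where
    F(u, v) = 4 - deg(u) - deg(v). Hypothetical helper for illustration.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {(u, v): 4 - degree[u] - degree[v] for u, v in edges}


# A path graph a - b - c - d.
path = [("a", "b"), ("b", "c"), ("c", "d")]
curvature = forman_curvature(path)
```

Edges attached to high-degree vertices receive strongly negative values, which is why this quantity is often used to flag hub-like regions of a network.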