A Fast Algorithm for Computing Geodesic Distances in Tree Space

Author: Owen Megan
Provan J. Scott
Publication date: 01/01/2011

Comparing and computing distances between phylogenetic trees are important biological problems, especially for models where edge lengths play an important role. The geodesic distance measure between two phylogenetic trees with edge lengths is the length of the shortest path between them in the continuous tree space introduced by Billera, Holmes, and Vogtmann. This tree space provides a powerful tool for studying and comparing phylogenetic trees, both in exhibiting a natural distance measure and in providing a Euclidean-like structure for solving optimization problems on trees. An important open problem is to find a polynomial time algorithm for finding geodesics in tree space. This paper gives such an algorithm, which starts with a simple initial path and moves through a series of successively shorter paths until the geodesic is attained.
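The abstract does not say which initial path the algorithm uses. As a rough illustration only, the sketch below assumes a cone-style path: edges unique to the first tree shrink to zero, edges unique to the second tree then grow, and edges shared by both trees are interpolated linearly. Its length is an upper bound on the geodesic distance, and it equals the geodesic distance when the two trees have the same topology. The dictionary representation (split label mapped to edge length) and the function name are hypothetical choices for this example, not taken from the paper.

```python
import math

def cone_path_length(t1, t2):
    """Length of a cone-style path between two trees (illustrative sketch).

    t1, t2: dicts mapping a split label to a positive edge length.
    Returns an upper bound on the geodesic distance in tree space.
    """
    common = t1.keys() & t2.keys()
    # Norm of the edges that appear in only one of the two trees.
    norm1 = math.sqrt(sum(t1[s] ** 2 for s in t1 if s not in common))
    norm2 = math.sqrt(sum(t2[s] ** 2 for s in t2 if s not in common))
    # Shared edges contribute an ordinary Euclidean term.
    common_sq = sum((t1[s] - t2[s]) ** 2 for s in common)
    return math.sqrt((norm1 + norm2) ** 2 + common_sq)

# Two four-leaf trees with different interior splits (assumed labels).
tree_a = {"ab|cd": 3.0}
tree_b = {"ac|bd": 4.0}
# Same topology as tree_a, different edge length.
tree_c = {"ab|cd": 1.0}

different_topology = cone_path_length(tree_a, tree_b)
same_topology = cone_path_length(tree_a, tree_c)
```

For trees with the same topology the result reduces to the Euclidean distance between the edge-length vectors; for trees with different topologies it is only a starting point that a shortening procedure would improve on.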

Carolina Digital Repository

Fast approximation of centrality and distances in hyperbolic graphs

Author: A Brandstädt
C Jordan
D Aingworth
D Dor
D Kratsch
DG Corneil
E Prisner
FF Dragan
M Abu-Ata
M Borassi
M Gromov
Martin R. Bridson
O Narayan
P Berman
R Impagliazzo
R Williams
SL Hakimi
V Chepoi
Y Dourisboure
Y Shavitt
Publication date: 16/05/2018

We show that the eccentricities (and thus the centrality indices) of all vertices of a $\delta$-hyperbolic graph $G=(V,E)$ can be computed in linear time with an additive one-sided error of at most $c\delta$, i.e., after a linear time preprocessing, for every vertex $v$ of $G$ one can compute in $O(1)$ time an estimate $\hat{e}(v)$ of its eccentricity $ecc_G(v)$ such that $ecc_G(v)\leq \hat{e}(v)\leq ecc_G(v)+ c\delta$ for a small constant $c$. We prove that every $\delta$-hyperbolic graph $G$ has a shortest path tree $T$, constructible in linear time, such that for every vertex $v$ of $G$, $ecc_G(v)\leq ecc_T(v)\leq ecc_G(v)+ c\delta$. These results are based on an interesting monotonicity property of the eccentricity function of hyperbolic graphs: the closer a vertex is to the center of $G$, the smaller its eccentricity is. We also show that the distance matrix of $G$ with an additive one-sided error of at most $c'\delta$ can be computed in $O(|V|^2\log^2|V|)$ time, where $c'< c$ is a small constant. Recent empirical studies show that many real-world graphs (including Internet application networks, web networks, collaboration networks, social networks, biological networks, and others) have small hyperbolicity. So, we analyze the performance of our algorithms for approximating centrality and distance matrix on a number of real-world networks. Our experimental results show that the obtained estimates are even better than the theoretical bounds. Comment: arXiv admin note: text overlap with arXiv:1506.01799 by other authors
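The lower half of the tree bound, $ecc_G(v)\leq ecc_T(v)$, holds for any spanning tree $T$, because tree distances are never shorter than graph distances. The sketch below checks this on a small example. It assumes a connected, unweighted graph given as an adjacency dictionary and roots the BFS tree at an arbitrary vertex; the abstract does not say how the paper chooses the root, so the upper bound $ecc_G(v)+c\delta$ is not guaranteed by this code. All function names are hypothetical.

```python
from collections import deque

def bfs(adj, src):
    """Return (distances, parents) of a breadth-first search from src."""
    dist, parent = {src: 0}, {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent

def exact_eccentricities(adj):
    """Exact eccentricities via one BFS per vertex (quadratic time)."""
    return {v: max(bfs(adj, v)[0].values()) for v in adj}

def tree_eccentricities(adj, root):
    """Eccentricities inside a BFS (shortest path) tree rooted at root."""
    _, parent = bfs(adj, root)
    tree = {v: [] for v in adj}
    for v, p in parent.items():
        if p is not None:
            tree[v].append(p)
            tree[p].append(v)
    # In a tree, the farthest vertex from any v is an end of a diameter,
    # so two extra BFS passes give every eccentricity in linear time.
    d_root, _ = bfs(tree, root)
    a = max(d_root, key=d_root.get)
    d_a, _ = bfs(tree, a)
    b = max(d_a, key=d_a.get)
    d_b, _ = bfs(tree, b)
    return {v: max(d_a[v], d_b[v]) for v in tree}

# A 6-cycle: every vertex has eccentricity 3 in the graph itself.
cycle = {i: [(i + 1) % 6, (i - 1) % 6] for i in range(6)}
ecc_g = exact_eccentricities(cycle)
ecc_t = tree_eccentricities(cycle, root=0)
```

On the 6-cycle the BFS tree is a path, so vertices far from the root get a tree eccentricity well above their true value, which shows why the choice of root matters for the quality of the estimate.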

arXiv.org e-Print Archive

Crossref

HAL AMU

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Exact Computation of a Manifold Metric, via Lipschitz Embeddings and Shortest Paths on a Graph

Author: Chu Timothy
Miller Gary
Sheehy Donald
Publication date: 21/04/2020

Data-sensitive metrics adapt distances locally based on the density of data points with the goal of aligning distances and some notion of similarity. In this paper, we give the first exact algorithm for computing a data-sensitive metric called the nearest neighbor metric. In fact, we prove the surprising result that a previously published $3$-approximation is an exact algorithm. The nearest neighbor metric can be viewed as a special case of a density-based distance used in machine learning, or it can be seen as an example of a manifold metric. Previous computational research on such metrics despaired of computing exact distances on account of the apparent difficulty of minimizing over all continuous paths between a pair of points. We leverage the exact computation of the nearest neighbor metric to compute sparse spanners and persistent homology. We also explore the behavior of the metric built from point sets drawn from an underlying distribution and consider the more general case of inputs that are finite collections of path-connected compact sets. The main results connect several classical theories such as the conformal change of Riemannian metrics, the theory of positive definite functions of Schoenberg, and screw function theory of Schoenberg and Von Neumann. We develop novel proof techniques based on the combination of screw functions and Lipschitz extensions that may be of independent interest. Comment: 15 pages
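The title points to shortest paths on a graph, but the abstract does not define the graph. As an illustration only, the sketch below assumes one common graph-based, density-sensitive distance: shortest paths in the complete graph on the data points, with each edge weighted by the squared Euclidean distance between its endpoints. Whether this matches the paper's construction, and with what scaling, is an assumption here; the function name is hypothetical.

```python
import heapq

def edge_squared_distances(points, source):
    """Dijkstra on the complete graph with squared-Euclidean edge weights.

    points: list of coordinate tuples; source: index of the start point.
    Returns the list of shortest-path costs from source to every point.
    """
    n = len(points)
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in range(n):
            if v == u:
                continue
            w = sum((a - b) ** 2 for a, b in zip(points[u], points[v]))
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Three collinear points: hopping through the middle point costs
# 1 + 1 = 2, cheaper than the direct edge of squared length 4.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
dists = edge_squared_distances(points, 0)
```

Squaring the edge lengths rewards paths that take many short hops through densely sampled regions, which is the sense in which such a distance adapts to the data.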

arXiv.org e-Print Archive

Crossref

Principal components analysis in the space of phylogenetic trees

Author: Nye Tom M. W.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 23/02/2012

Phylogenetic analysis of DNA or other data commonly gives rise to a collection or sample of inferred evolutionary trees. Principal Components Analysis (PCA) cannot be applied directly to collections of trees since the space of evolutionary trees on a fixed set of taxa is not a vector space. This paper describes a novel geometrical approach to PCA in tree-space that constructs the first principal path in an analogous way to standard linear Euclidean PCA. Given a data set of phylogenetic trees, a geodesic principal path is sought that maximizes the variance of the data under a form of projection onto the path. Due to the high dimensionality of tree-space and the nonlinear nature of this problem, the computational complexity is potentially very high, so approximate optimization algorithms are used to search for the optimal path. Principal paths identified in this way reveal and quantify the main sources of variation in the original collection of trees in terms of both topology and branch lengths. The approach is illustrated by application to simulated sets of trees and to a set of gene trees from metazoan (animal) species. Comment: Published in at http://dx.doi.org/10.1214/11-AOS915 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref