1,443 research outputs found

Counting Coalescent Histories

Author: Noah A. Rosenberg
Pamilo P.
Rannala B.
Takahata N.
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/01/2007

Given a species tree and a gene tree, a valid coalescent history is a list of the branches of the species tree on which coalescences in the gene tree take place. I develop a recursion for the number of valid coalescent histories that exist for an arbitrary gene tree/species tree pair, when one gene lineage is studied per species. The result is obtained by defining a concept of m-extended coalescent histories, enumerating and counting these histories, and taking the special case of m = 1. As a sum over valid coalescent histories appears in a formula for the probability that a random gene tree evolving along the branches of a fixed species tree has a specified labeled topology, the enumeration of valid coalescent histories can considerably reduce the effort required for evaluating this formula. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/63175/1/cmb.2006.0109.pd
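To make the definition above concrete, here is a minimal Python sketch that counts valid coalescent histories by direct recursion over the gene tree. It is not the m-extended recursion developed in the paper; it is a simpler dynamic program under the standard reading that a gene tree coalescence may sit on a species tree branch only if that branch is ancestral to all of the relevant taxa, and that a descendant coalescence cannot sit above its ancestor. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is a nested tuple of leaf labels,
# e.g. ((("A", "B"), "C"), "D") is a 4-taxon caterpillar.

def leaves(tree):
    """Return the frozenset of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def internal_nodes(tree):
    """Yield the node itself and every internal node below it."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree:
        yield from internal_nodes(child)

def count_histories(gene_tree, species_tree):
    """Count valid coalescent histories, one gene lineage per species.

    Each species tree branch is identified with the internal node
    directly beneath it.
    """
    @lru_cache(maxsize=None)
    def ways(g, s):
        # Ways to place all coalescences strictly below gene node g,
        # given that g itself coalesces on the branch above species node s.
        total = 1
        for child in g:
            if isinstance(child, str):
                continue  # leaves carry no coalescence
            need = leaves(child)
            # The child's coalescence may sit on s or any branch below s
            # that is still ancestral to all of the child's taxa.
            total *= sum(ways(child, t)
                         for t in internal_nodes(s)
                         if need <= leaves(t))
        return total

    need = leaves(gene_tree)
    return sum(ways(gene_tree, s)
               for s in internal_nodes(species_tree)
               if need <= leaves(s))

if __name__ == "__main__":
    caterpillar = (((("A", "B"), "C"), "D"), "E")
    print(count_histories(caterpillar, caterpillar))
```

For matching caterpillar trees the counts follow the Catalan numbers, which gives a quick sanity check on the sketch.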

CiteSeerX

Crossref

Deep Blue Documents at the University of Michigan

Coalescent histories for lodgepole species trees

Author: Disanto Filippo
Rosenberg Noah A.
Publication date: 01/01/2015

Coalescent histories are combinatorial structures that describe for a given gene tree and species tree the possible lists of branches of the species tree on which the gene tree coalescences take place. Properties of the number of coalescent histories for gene trees and species trees affect a variety of probabilistic calculations in mathematical phylogenetics. Exact and asymptotic evaluations of the number of coalescent histories, however, are known only in a limited number of cases. Here we introduce a particular family of species trees, the lodgepole species trees $(\lambda_n)_{n\geq 0}$, in which tree $\lambda_n$ has $m=2n+1$ taxa. We determine the number of coalescent histories for the lodgepole species trees, in the case that the gene tree matches the species tree, showing that this number grows with $m!!$ in the number of taxa $m$. This computation demonstrates the existence of tree families in which the growth in the number of coalescent histories is faster than exponential. Further, it provides a substantial improvement on the lower bound for the ratio of the largest number of matching coalescent histories to the smallest number of matching coalescent histories for trees with $m$ taxa, increasing a previous bound of $(\sqrt{\pi} / 32)[(5m-12)/(4m-6)] m \sqrt{m}$ to $[ \sqrt{m-1}/(4 \sqrt{e}) ]^{m}$. We discuss the implications of our enumerative results for phylogenetic computations.
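The two bound expressions quoted in this abstract can be compared numerically. The sketch below assumes the first expression, $(\sqrt{\pi} / 32)[(5m-12)/(4m-6)] m \sqrt{m}$, is the previous lower bound and the second, $[ \sqrt{m-1}/(4 \sqrt{e}) ]^{m}$, is the improved one; the function names are illustrative only. It suggests that the improvement is asymptotic: the second expression is smaller for small $m$ and overtakes the first only once $\sqrt{m-1}$ clearly exceeds $4\sqrt{e}$.

```python
import math

def previous_bound(m):
    """Polynomial-order expression quoted first in the abstract."""
    return (math.sqrt(math.pi) / 32) * ((5 * m - 12) / (4 * m - 6)) * m * math.sqrt(m)

def lodgepole_bound(m):
    """Super-exponential expression quoted second in the abstract."""
    return (math.sqrt(m - 1) / (4 * math.sqrt(math.e))) ** m

if __name__ == "__main__":
    # Odd values only, since the lodgepole family has m = 2n + 1 taxa.
    for m in (9, 45, 101, 201):
        print(m, previous_bound(m), lodgepole_bound(m))
```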

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Importance sampling for Lambda-coalescents in the infinitely many sites model

Author: Birkner
Birkner
Carr
Dong
Eldon
Ethier
Felsenstein
Griffiths
Griffiths
Griffiths
Griffiths
Hobolth
Hobolth
Jochen Blath
Matthias Birkner
Matthias Steinrücken
Möhle
Pepin
Pitman
Rogers
Sagitov
Schweinsberg
Sigurgíslason
Stephens
Tavaré
Ward
Árnason
Árnason
Árnason
Árnason
Publication venue: 'Elsevier BV'
Publication date: 09/05/2011

We present and discuss new importance sampling schemes for the approximate computation of the sample probability of observed genetic types in the infinitely many sites model from population genetics. More specifically, we extend the 'classical framework', where genealogies are assumed to be governed by Kingman's coalescent, to the more general class of Lambda-coalescents and further develop Hobolth et al.'s (2008) idea of deriving importance sampling schemes based on 'compressed genetrees'. The resulting schemes extend earlier work by Griffiths and Tavaré (1994), Stephens and Donnelly (2000), Birkner and Blath (2008) and Hobolth et al. (2008). We conclude with a performance comparison of classical and new schemes for Beta- and Kingman coalescents. Comment: 38 pages, 40 figures.
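As background for the Beta- and Kingman coalescents compared in this abstract, the following sketch computes merger rates for a Beta$(2-\alpha, \alpha)$ coalescent. It assumes the standard Lambda-coalescent definition, in which any given set of $k$ out of $b$ lineages merges at rate $\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$, which for the Beta measure reduces to $B(k-\alpha, b-k+\alpha)/B(2-\alpha, \alpha)$. The function names are illustrative, and the sketch does not implement any of the paper's importance sampling schemes.

```python
import math

def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def beta_coalescent_rate(b, k, alpha):
    """Rate at which one given set of k out of b lineages merges
    when Lambda is the Beta(2 - alpha, alpha) distribution."""
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    if not 0 < alpha < 2:
        raise ValueError("need 0 < alpha < 2")
    return math.exp(log_beta(k - alpha, b - k + alpha)
                    - log_beta(2 - alpha, alpha))

def total_merger_rate(b, alpha):
    """Total rate of any merger event among b lineages."""
    return sum(math.comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))

if __name__ == "__main__":
    # As alpha approaches 2, pairwise mergers dominate (Kingman-like).
    for alpha in (1.0, 1.5, 1.999):
        print(alpha, [beta_coalescent_rate(5, k, alpha) for k in range(2, 6)])
```

Two properties make useful checks: $\lambda_{2,2} = 1$ for any probability measure $\Lambda$, and the rates satisfy the consistency relation $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$.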

arXiv.org e-Print Archive

Crossref

Inference of Ancestral Recombination Graphs through Topological Data Analysis

Author: Camara Pablo G.
Levine Arnold J.
Rabadan Raul
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2016

The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment and horizontal gene transfer constitute examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are interested in the reconstruction of potential evolutionary histories leading to the observed data. Ancestral recombination graphs represent potential histories that explicitly accommodate recombination and mutation events across orthologous genomes. However, they are computationally costly to reconstruct, usually being infeasible for more than a few tens of genomes. Recently, Topological Data Analysis (TDA) methods have been proposed as robust and scalable methods that can capture the genetic scale and frequency of recombination. We build upon previous TDA developments for detecting and quantifying recombination, and present a novel framework that can be applied to hundreds of genomes and can be interpreted in terms of minimal histories of mutation and recombination events, quantifying the scales and identifying the genomic locations of recombinations. We implement this framework in a software package, called TARGet, and apply it to several examples, including small migration between different populations, human recombination, and horizontal evolution in finches inhabiting the Galápagos Islands. Comment: 33 pages, 12 figures. The accompanying software, instructions and example files used in the manuscript can be obtained from https://github.com/RabadanLab/TARGe

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Columbia University Academic Commons

Directory of Open Access Journals

PubMed Central

FigShare

A polynomial time algorithm for calculating the probability of a ranked gene tree given a species tree

Author: Degnan James H.
Stadler Tanja
Publication date: 01/01/2012

In this paper, we provide a polynomial time algorithm to calculate the probability of a ranked gene tree topology for a given species tree, where a ranked tree topology is a tree topology with the internal vertices being ordered. The probability of a gene tree topology can thus be calculated in polynomial time if the number of orderings of the internal vertices is polynomial in the number of taxa. However, the complexity of calculating the probability of a gene tree topology with an exponential number of rankings for a given species tree remains unknown.
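The distinction between topologies with few and with many rankings can be illustrated by counting the orderings of internal vertices directly. The sketch below is not the paper's algorithm; it only counts rankings of a rooted binary topology, using the fact that the rankings of the two subtrees below a node can be interleaved in any order. The nested-tuple encoding and function names are assumptions.

```python
from math import comb

def count_rankings(tree):
    """Return (internal node count, number of rankings) for a rooted
    binary tree given as nested 2-tuples of leaf labels."""
    if isinstance(tree, str):
        return 0, 1  # a leaf has no internal vertices to order
    left, right = tree
    n_left, r_left = count_rankings(left)
    n_right, r_right = count_rankings(right)
    # The root is ranked first; the orderings of the two subtrees are
    # then interleaved in comb(n_left + n_right, n_left) ways.
    interleavings = comb(n_left + n_right, n_left)
    return n_left + n_right + 1, interleavings * r_left * r_right

if __name__ == "__main__":
    caterpillar = ((("A", "B"), "C"), "D")
    balanced = (("A", "B"), ("C", "D"))
    print(count_rankings(caterpillar)[1], count_rankings(balanced)[1])
```

A caterpillar topology has a single ranking regardless of size, whereas the number of rankings of balanced topologies grows quickly with the number of taxa.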

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

PubMed Central

Enumeration of coalescent histories for caterpillar species trees and $p$-pseudocaterpillar gene trees

Author: Alimpiev Egor
Rosenberg Noah A
Publication date: 24/03/2021

For a fixed set $X$ containing $n$ taxon labels, an ordered pair consisting of a gene tree topology $G$ and a species tree $S$ bijectively labeled with the labels of $X$ possesses a set of coalescent histories: mappings from the set of internal nodes of $G$ to the set of edges of $S$ describing possible lists of edges in $S$ on which the coalescences in $G$ take place. Enumerations of coalescent histories for gene trees and species trees have produced suggestive results regarding the pairs $(G,S)$ that, for a fixed $n$, have the largest number of coalescent histories. We define a class of 2-cherry binary tree topologies that we term $p$-pseudocaterpillars, examining coalescent histories for non-matching pairs $(G,S)$, in the case in which $S$ has a caterpillar shape and $G$ has a $p$-pseudocaterpillar shape. Using a construction that associates coalescent histories for $(G,S)$ with a class of "roadblocked" monotonic paths, we identify the $p$-pseudocaterpillar labeled gene tree topology that, for a fixed caterpillar labeled species tree topology, gives rise to the largest number of coalescent histories. The shape that maximizes the number of coalescent histories places the "second" cherry of the $p$-pseudocaterpillar equidistantly from the root of the "first" cherry and from the tree root. A symmetry in the numbers of coalescent histories for $p$-pseudocaterpillar gene trees and caterpillar species trees is seen to exist around the maximizing value of the parameter $p$. The results provide insight into the factors that influence the number of coalescent histories possible for a given gene tree and species tree.
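The idea of counting monotonic paths subject to roadblocks can be sketched with a small dynamic program. The abstract does not say how the paper places its roadblocks, so the `roadblocks` argument below is a purely hypothetical set of forbidden lattice points, and the function name is illustrative. With no roadblocks the count reduces to the Catalan numbers, which are known to count coalescent histories for matching caterpillar trees.

```python
def count_monotonic_paths(n, roadblocks=frozenset()):
    """Count lattice paths from (0, 0) to (n, n) that use unit steps
    right or up, never rise above the diagonal y = x, and avoid every
    point in `roadblocks` (a hypothetical set of forbidden points)."""
    ways = [[0] * (n + 1) for _ in range(n + 1)]
    for x in range(n + 1):
        for y in range(x + 1):  # stay on or below the diagonal
            if (x, y) in roadblocks:
                continue  # blocked points keep a count of zero
            if x == 0 and y == 0:
                ways[x][y] = 1
                continue
            from_left = ways[x - 1][y] if x > 0 else 0
            from_below = ways[x][y - 1] if y > 0 else 0
            ways[x][y] = from_left + from_below
    return ways[n][n]

if __name__ == "__main__":
    print([count_monotonic_paths(n) for n in range(6)])
    print(count_monotonic_paths(4, roadblocks={(2, 1)}))
```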

arXiv.org e-Print Archive

The Time Machine: A Simulation Approach for Stochastic Trees

Author: Dempster A. P.
Edwards A. W. F.
Fearnhead P.
Gorur D.
Hudson R. R.
Kuhner M. K.
Stephens M.
Tavaré S.
Wilson I. J.
Publication venue: 'The Royal Society'
Publication date: 26/09/2010

In the following paper we consider a simulation technique for stochastic trees. One of the most important areas in computational genetics is the calculation and subsequent maximization of the likelihood function associated with such models. This typically consists of using importance sampling (IS) and sequential Monte Carlo (SMC) techniques. The approach proceeds by simulating the tree, backward in time from observed data, to a most recent common ancestor (MRCA). However, in many cases, the computational time and variance of estimators are too high to make standard approaches useful. In this paper we propose to stop the simulation early, which yields biased estimates of the likelihood surface. The bias is investigated from a theoretical point of view. Results from simulation studies are also given to investigate the balance between loss of accuracy, saving in computing time and variance reduction. Comment: 22 pages, 5 figures.
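The backward-in-time simulation described in this abstract can be illustrated with a minimal sketch. The abstract does not name the tree model or the stopping rule, so the code below assumes Kingman's coalescent (each pair of lineages merges at rate 1) and uses a hypothetical `stop_at` parameter that halts the simulation once a chosen number of lineages remains. It simulates waiting times only and does not implement the paper's estimators.

```python
import random

def simulate_time(n, rng, stop_at=1):
    """Simulate the time for n lineages to coalesce down to `stop_at`
    lineages under Kingman's coalescent (an assumed model)."""
    t = 0.0
    k = n
    while k > stop_at:
        rate = k * (k - 1) / 2  # one of k-choose-2 pairs merges
        t += rng.expovariate(rate)
        k -= 1
    return t

def mean_time(n, replicates, stop_at=1, seed=0):
    """Average simulated time over independent replicates."""
    rng = random.Random(seed)
    total = sum(simulate_time(n, rng, stop_at) for _ in range(replicates))
    return total / replicates

if __name__ == "__main__":
    n = 10
    print("to MRCA:", mean_time(n, 20000), "expected", 2 * (1 - 1 / n))
    print("stopped at 3 lineages:", mean_time(n, 20000, stop_at=3))
```

Under this model the expected time to go from $n$ to $s$ lineages is $2(1/s - 1/n)$, so most of the time to the MRCA is spent on the last few coalescences, which suggests why stopping early could save computation.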

arXiv.org e-Print Archive

CiteSeerX

Crossref

UCL Discovery