When two trees go to war
Rooted phylogenetic networks are often constructed by combining trees,
clusters, triplets or characters into a single network that in some
well-defined sense simultaneously represents them all. We review these four
models and investigate how they are related. In general, the model chosen
influences the minimum number of reticulation events required. However, when
one obtains the input data from two binary trees, we show that the minimum
number of reticulations is independent of the model. The number of
reticulations necessary to represent the trees, triplets, clusters (in the
softwired sense) and characters (with unrestricted multiple crossover
recombination) are all equal. Furthermore, we show that these results also hold
when not the number of reticulations but the level of the constructed network
is minimised. We use these unification results to settle several complexity
questions that have been open in the field for some time. We also give explicit
examples to show that already for data obtained from three binary trees the
models begin to diverge.
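To make the input models concrete, the sketch below derives clusters and rooted triplets from a single binary tree. It is illustrative only and not taken from the paper: the nested-tuple tree encoding and the function names `leaves`, `clusters` and `triplets` are assumptions, and clusters are taken here to exclude singletons and the full leaf set.

```python
from itertools import combinations

# Assumed encoding: a leaf is a string, an internal node is a pair (left, right).

def leaves(tree):
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clusters(tree):
    """Leaf sets below internal nodes, excluding the root (nontrivial clusters)."""
    found = set()

    def visit(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.add(leaves(node))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found

def triplets(tree):
    """Rooted triplets ab|c of a binary tree, encoded as (frozenset({a, b}), c)."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        left, right = node
        # a and b on one side of this node, c on the other, gives ab|c.
        for inside, outside in ((left, right), (right, left)):
            for a, b in combinations(sorted(leaves(inside)), 2):
                for c in leaves(outside):
                    found.add((frozenset((a, b)), c))
        visit(left)
        visit(right)

    visit(tree)
    return found

if __name__ == "__main__":
    tree = (("a", "b"), ("c", "d"))
    print(sorted(sorted(c) for c in clusters(tree)))
    print(len(triplets(tree)))
```

A binary tree on n leaves yields exactly one triplet for every three leaves, so comparing the triplet or cluster sets of two trees is a direct way to see where they conflict.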
Lateral transfer in Stochastic Dollo models
Lateral transfer, a process whereby species exchange evolutionary traits
through non-ancestral relationships, is a frequent source of model
misspecification in phylogenetic inference. Lateral transfer obscures the
phylogenetic signal in the data as the histories of affected traits are mosaics
of the overall phylogeny. We control for the effect of lateral transfer in a
Stochastic Dollo model and a Bayesian setting. Our likelihood is highly
intractable as the parameters are the solution of a sequence of large systems
of differential equations representing the expected evolution of traits along a
tree. We illustrate our method on a data set of lexical traits in Eastern
Polynesian languages and obtain an improved fit over the corresponding model
without lateral transfer.
Learning mutational graphs of individual tumour evolution from single-cell and multi-region sequencing data
Background. A large number of algorithms is being developed to reconstruct
evolutionary models of individual tumours from genome sequencing data. Most
methods can analyze multiple samples collected either through bulk multi-region
sequencing experiments or the sequencing of individual cancer cells. However,
the same method can rarely support both data types.
Results. We introduce TRaIT, a computational framework to infer mutational
graphs that model the accumulation of multiple types of somatic alterations
driving tumour evolution. Compared to other tools, TRaIT supports multi-region
and single-cell sequencing data within the same statistical framework, and
delivers expressive models that capture many complex evolutionary phenomena.
TRaIT improves accuracy, robustness to data-specific errors and computational
complexity compared to competing methods.
Conclusions. We show that the application of TRaIT to single-cell and
multi-region cancer datasets can produce accurate and reliable models of
single-tumour evolution, quantify the extent of intra-tumour heterogeneity and
generate new testable experimental hypotheses.
Genetic Stratigraphy of Key Demographic Events in Arabia
The issue of admixture in human populations is normally addressed by genome-wide (GW) studies, and several approaches have been developed to date admixture events [1,2,3,4,5]. Admixed populations bear chromosomes with segments of DNA from all contributing source groups, the size of which decreases over successive generations until recombination renders them undetectably short. Several algorithms attempt to date admixture events by inferring the size of the nuclear ancestry segments, and these can work well when dating recent episodes in human history, such as the sub-Saharan African input into the New World [6], but they fail to detect several known episodes that took place at earlier times, such as the African input into Iberia [1] and genetic exchanges across the Red Sea [7]. Simulations with the suite of methods available in the ADMIXTOOLS package indicated that these methods could detect admixture events as early as 500 generations ago, but real data did not allow the tracing of such old events [8]. A recent improved algorithm, called GLOBETROTTER, has been used to tackle the detection of the co-occurrence of several mixture events by decomposing each chromosome into a series of haplotypic chunks and then analysing each chunk independently [3], but the problem of detecting ancient events remains. Its application to the systematic screening of worldwide admixture events was able to reveal around 100 events, but all occurring over only the past 4,000 years [3
Inferring phylogenetic networks with maximum pseudolikelihood under incomplete lineage sorting
Phylogenetic networks are necessary to represent the tree of life expanded by
edges to represent events such as horizontal gene transfers, hybridizations or
gene flow. Not all species follow the paradigm of vertical inheritance of their
genetic material. While a great deal of research has flourished into the
inference of phylogenetic trees, statistical methods to infer phylogenetic
networks are still limited and under development. The main disadvantage of
existing methods is a lack of scalability. Here, we present a statistical
method to infer phylogenetic networks from multi-locus genetic data in a
pseudolikelihood framework. Our model accounts for incomplete lineage sorting
through the coalescent model, and for horizontal inheritance of genes through
reticulation nodes in the network. Computation of the pseudolikelihood is fast
and simple, and it avoids the burdensome calculation of the full likelihood
which can be intractable with many species. Moreover, estimation at the
quartet-level has the added computational benefit that it is easily
parallelizable. Simulation studies comparing our method to a full likelihood
approach show that our pseudolikelihood approach is much faster without
compromising accuracy. We applied our method to reconstruct the evolutionary
relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae),
which is characterized by widespread hybridizations.
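The quartet-level idea can be illustrated with a minimal sketch for the simplest case, a species tree with no reticulations. It is not the authors' implementation: the function names `quartet_cfs` and `log_pseudolikelihood` and the input layout are assumptions. It relies on the standard coalescent result that, for a four-taxon species tree ab|cd with internal branch length t in coalescent units, the matching gene-tree topology has probability 1 - (2/3)e^{-t} and each of the other two has probability (1/3)e^{-t}.

```python
import math

def quartet_cfs(t):
    """Expected quartet concordance factors (major, minor, minor) for a
    four-taxon species tree with internal branch length t (coalescent units)."""
    minor = math.exp(-t) / 3.0
    return (1.0 - 2.0 * minor, minor, minor)

def log_pseudolikelihood(quartets):
    """Sum of per-quartet multinomial log-likelihood terms.

    quartets: iterable of (counts, t), where counts holds the number of gene
    trees showing the species-tree split and the two alternative splits, and
    t is the assumed internal branch length for that quartet. Quartets are
    treated as if independent, which is what makes this a pseudolikelihood.
    """
    total = 0.0
    for counts, t in quartets:
        for n, p in zip(counts, quartet_cfs(t)):
            total += n * math.log(p)
    return total

if __name__ == "__main__":
    # Hypothetical gene-tree counts for two quartets.
    data = [((70, 15, 15), 1.0), ((40, 30, 30), 0.2)]
    print(log_pseudolikelihood(data))
```

Because each quartet contributes an independent term to the sum, the terms can be computed in parallel. Handling reticulation nodes would require network-specific concordance factors, which this sketch does not attempt.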