Parsimonious Inference of Hybridization in the Presence of Incomplete Lineage Sorting
Hybridization plays an important evolutionary role in several groups of organisms.
A phylogenetic approach to detect hybridization entails sequencing multiple loci
across the genomes of a group of species of interest, reconstructing their gene trees,
and taking their differences as indicators of hybridization. However, methods that
follow this approach mostly ignore population effects, such as incomplete lineage
sorting (ILS). Given that hybridization occurs between closely related organisms, ILS
may very well be at play and, hence, must be accounted for in the analysis
framework. To address this issue, we present a parsimony criterion for reconciling
gene trees within the branches of a phylogenetic network, and a local search heuristic
for inferring phylogenetic networks from collections of gene-tree topologies under this
criterion. This framework enables phylogenetic analyses while accounting for both
hybridization and ILS. Further, we propose two techniques for incorporating
information about uncertainty in gene-tree estimates. Our simulation studies
demonstrate the good performance of our framework in terms of identifying the
location of hybridization events, as well as estimating the proportions of genes that
underwent hybridization. In our experiments, the framework also handles large
data sets efficiently. Further, in analyzing a
yeast data set, we demonstrate issues that arise when analyzing real data sets. While
a probabilistic approach was recently introduced for this problem, and while
parsimonious reconciliations have accuracy issues under certain settings, our
parsimony framework provides a much more computationally efficient technique for
this type of analysis. Our framework now allows for genome-wide scans for
hybridization, while also accounting for ILS.
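As a rough illustration of the kind of parsimony score involved, the sketch below counts "extra lineages" (deep coalescences) when a gene tree is reconciled inside a species tree. It is a minimal sketch for the tree-only case and is not the network criterion or the search heuristic from the abstract above. The nested-tuple tree encoding and the function names (`leaves`, `clusters`, `maximal_clades`, `extra_lineages`) are assumptions made for this example, as is the restriction to one sampled allele per species.

```python
# Hypothetical illustration: trees are nested tuples of species names,
# e.g. (("A", "B"), ("C", "D")). One allele per species is assumed.

def leaves(tree):
    """Return the set of leaf labels under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clusters(tree):
    """Yield the leaf set (cluster) of every node in the tree."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clusters(child)

def maximal_clades(gene_tree, cluster):
    """Count maximal gene-tree clades whose leaves all lie in `cluster`.

    This equals the number of gene lineages leaving the species-tree
    branch above `cluster` under a most parsimonious reconciliation.
    """
    if leaves(gene_tree) <= cluster:
        return 1
    if isinstance(gene_tree, str):
        return 0
    return sum(maximal_clades(child, cluster) for child in gene_tree)

def extra_lineages(species_tree, gene_tree):
    """Sum, over species-tree branches, of (lineages leaving the branch - 1)."""
    return sum(
        max(0, maximal_clades(gene_tree, cluster) - 1)
        for cluster in clusters(species_tree)
    )

species = (("A", "B"), ("C", "D"))
concordant = (("A", "B"), ("C", "D"))
discordant = (("A", "C"), ("B", "D"))
print(extra_lineages(species, concordant))
print(extra_lineages(species, discordant))
```

A search over candidate topologies would sum this score across all gene trees and prefer the candidate with the lowest total. Extending the count to branches of a network, as the abstract describes, requires additional machinery that is not shown here.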
The inference of gene trees with species trees
Molecular phylogeny has focused mainly on improving models for the
reconstruction of gene trees based on sequence alignments. Yet, most
phylogeneticists seek to reveal the history of species. Although the histories
of genes and species are tightly linked, they are seldom identical, because
genes duplicate, are lost or horizontally transferred, and because alleles can
co-exist in populations for periods that may span several speciation events.
Building models describing the relationship between gene and species trees can
thus improve the reconstruction of gene trees when a species tree is known, and
vice-versa. Several approaches have been proposed to solve the problem in one
direction or the other, but in general neither gene trees nor species trees are
known. Only a few studies have attempted to jointly infer gene trees and
species trees. In this article we review the various models that have been used
to describe the relationship between gene trees and species trees. These models
account for gene duplication and loss, transfer or incomplete lineage sorting.
Some of them consider several types of events together, but none exists
currently that considers the full repertoire of processes that generate gene
trees along the species tree. Simulations as well as empirical studies on
genomic data show that combining gene tree-species tree models with models of
sequence evolution improves gene tree reconstruction. In turn, these better
gene trees provide a better basis for studying genome evolution or
reconstructing ancestral chromosomes and ancestral gene sequences. We predict
that gene tree-species tree methods that can deal with genomic data sets will
be instrumental to advancing our understanding of genomic evolution.
Comment: Review article in relation to the "Mathematical and Computational
Evolutionary Biology" conference, Montpellier, 201
Building explicit hybridization networks using the maximum likelihood and Neighbor-Joining approaches
Tree topologies are the simplest structures that can be used to represent the evolution of species. Over the last two decades, more complex structures, called phylogenetic networks, have been introduced to take into account the mechanisms of reticulate evolution, such as species hybridization and horizontal gene transfer among bacteria and viruses. Several algorithms and software packages have been developed in this context, but most of them yield as output only an implicit network, which can be difficult to interpret. In this paper, we introduce a new algorithm for inferring explicit hybridization networks from binary data. To build our explicit hybridization networks, we use a maximum likelihood approach applied to Neighbor-Joining tree configurations.
Inferring phylogenetic networks with maximum pseudolikelihood under incomplete lineage sorting
Phylogenetic networks are necessary to represent the tree of life expanded by
edges to represent events such as horizontal gene transfers, hybridizations or
gene flow. Not all species follow the paradigm of vertical inheritance of their
genetic material. While a great deal of research has flourished into the
inference of phylogenetic trees, statistical methods to infer phylogenetic
networks are still limited and under development. The main disadvantage of
existing methods is a lack of scalability. Here, we present a statistical
method to infer phylogenetic networks from multi-locus genetic data in a
pseudolikelihood framework. Our model accounts for incomplete lineage sorting
through the coalescent model, and for horizontal inheritance of genes through
reticulation nodes in the network. Computation of the pseudolikelihood is fast
and simple, and it avoids the burdensome calculation of the full likelihood
which can be intractable with many species. Moreover, estimation at the
quartet-level has the added computational benefit that it is easily
parallelizable. Simulation studies comparing our method to a full likelihood
approach show that our pseudolikelihood approach is much faster without
compromising accuracy. We applied our method to reconstruct the evolutionary
relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae),
which is characterized by widespread hybridizations.
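To make the quartet-level idea concrete, the sketch below sums per-quartet log-likelihood terms into a log pseudolikelihood. It is a minimal sketch under stated assumptions and not the authors' implementation: it covers only the tree-like case (no reticulation), where a four-taxon species tree with internal branch length t in coalescent units gives expected quartet frequencies of 1 - (2/3)e^{-t} for the matching topology and (1/3)e^{-t} for each of the other two. The function names and the input layout are hypothetical.

```python
import math

def expected_cfs(t):
    """Expected frequencies of the three quartet topologies on a 4-taxon
    species tree with internal branch length t (coalescent units).
    The first entry is the topology matching the species tree."""
    minor = math.exp(-t) / 3.0
    return (1.0 - 2.0 * minor, minor, minor)

def log_pseudolikelihood(quartets):
    """Sum independent per-quartet multinomial log-likelihood terms.

    `quartets` is an iterable of (counts, t) pairs, where `counts` holds
    the observed number of gene trees showing each of the three quartet
    topologies (species-tree topology first) and t is the internal branch
    length assumed for that four-taxon set.
    """
    total = 0.0
    for counts, t in quartets:
        for n, p in zip(counts, expected_cfs(t)):
            if n > 0:  # skip empty cells so log is never taken needlessly
                total += n * math.log(p)
    return total

# Two hypothetical four-taxon sets with made-up gene-tree counts.
data = [((80, 10, 10), 1.0), ((40, 30, 30), 0.1)]
print(log_pseudolikelihood(data))
```

Because each four-taxon set contributes its own term to the sum, the terms can be evaluated independently, which is what makes the computation easy to parallelize. Handling reticulation nodes changes the expected frequencies and is not covered by this sketch.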
Empirical Performance of Tree-Based Inference of Phylogenetic Networks
Phylogenetic networks extend the phylogenetic tree structure and allow for modeling vertical and horizontal evolution in a single framework. Statistical inference of phylogenetic networks is computationally prohibitive and currently limited to small networks. An approach that could significantly improve phylogenetic network space exploration is based on first inferring an evolutionary tree of the species under consideration, and then augmenting the tree into a network by adding a set of "horizontal" edges to better fit the data.
In this paper, we study the performance of such an approach on networks generated under a birth-hybridization model and explore its feasibility as an alternative to approaches that search the phylogenetic network space directly (without relying on a fixed underlying tree). We find that the concatenation method does poorly at obtaining a "backbone" tree that could be augmented into the correct network, whereas the popular species tree inference method ASTRAL does significantly better at such a task. We then evaluated the tree-to-network augmentation phase under the minimizing deep coalescence and pseudo-likelihood criteria. We find that even though this is a much faster approach than the direct search of the network space, the accuracy is much poorer, even when the backbone tree is a good starting tree.
Our results show that tree-based inference of phylogenetic networks could yield very poor results. As exploration of the network space directly in search of maximum likelihood estimates or a representative sample of the posterior is very expensive, significant improvements to the computational complexity of phylogenetic network inference are imperative if analyses of large data sets are to be performed. We show that a recently developed divide-and-conquer approach significantly outperforms tree-based inference in terms of accuracy, albeit still at a higher computational cost.
A matter of phylogenetic scale: Distinguishing incomplete lineage sorting from lateral gene transfer as the cause of gene tree discord in recent versus deep diversification histories
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/143759/1/ajb21064_am.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/143759/2/ajb21064.pd