1,133 research outputs found
The Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data are consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies in obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.
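As a rough illustration of the quantity this abstract refers to, consider only the simplest setting: one sampled allele per species and a single hybrid leaf taxon, so that exactly one gene lineage passes through the reticulation node. Under those assumptions (which are ours, not stated in the abstract), the probability of a gene tree topology reduces to a mixture over the two parental species trees. The symbols $N$, $T_1$, $T_2$, $G$, and $\gamma$ are introduced here for illustration; the general method described in the abstract is not reproduced.

```latex
% Assumed setting: network N with one hybridization event, inheritance
% probability \gamma, parental trees T_1 and T_2 (branch lengths inherited
% from N), and a single allele sampled from the hybrid taxon.
P(G \mid N, \gamma) \;=\; \gamma \, P(G \mid T_1) \;+\; (1 - \gamma) \, P(G \mid T_2)
```

When several lineages can traverse the reticulation node, this simple mixture over parental trees no longer applies in general, which is part of why a dedicated method is needed.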
An analytical comparison of coalescent-based multilocus methods: The three-taxon case
Incomplete lineage sorting (ILS) is a common source of gene tree incongruence
in multilocus analyses. A large number of methods have been developed to infer
species trees in the presence of ILS. Here we provide a mathematical analysis
of several coalescent-based methods. Our analysis is performed on a three-taxon
species tree and assumes that the gene trees are correctly reconstructed along
with their branch lengths.
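For readers unfamiliar with the three-taxon setting, the following minimal sketch computes the standard multispecies coalescent probabilities of the three rooted gene tree topologies, assuming one sampled lineage per species and a species tree ((A,B),C) whose internal branch has length t in coalescent units. The function name and the topology labels are hypothetical and chosen for illustration; this is not the analysis carried out in the paper.

```python
import math


def three_taxon_topology_probs(t):
    """Gene tree topology probabilities under the multispecies coalescent.

    Assumes species tree ((A,B),C), one lineage sampled per species, and
    an internal branch of length t measured in coalescent units.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # The A and B lineages fail to coalesce on the internal branch with
    # probability exp(-t); the three lineages then coalesce in the root
    # population, where each of the three pairings is equally likely.
    p_discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_discordant,  # matches the species tree
        "((A,C),B)": p_discordant,
        "((B,C),A)": p_discordant,
    }
```

With t = 0 the three topologies are equally likely, and as t grows the topology matching the species tree dominates, which is the basic signal that coalescent-based multilocus methods exploit.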
Consensus properties for the deep coalescence problem and their application for scalable tree search
Parsimonious Inference of Hybridization in the Presence of Incomplete Lineage Sorting
Hybridization plays an important evolutionary role in several groups of organisms.
A phylogenetic approach to detect hybridization entails sequencing multiple loci
across the genomes of a group of species of interest, reconstructing their gene trees,
and taking their differences as indicators of hybridization. However, methods that
follow this approach mostly ignore population effects, such as incomplete lineage
sorting (ILS). Given that hybridization occurs between closely related organisms, ILS
may very well be at play and, hence, must be accounted for in the analysis
framework. To address this issue, we present a parsimony criterion for reconciling
gene trees within the branches of a phylogenetic network, and a local search heuristic
for inferring phylogenetic networks from collections of gene-tree topologies under this
criterion. This framework enables phylogenetic analyses while accounting for both
hybridization and ILS. Further, we propose two techniques for incorporating
information about uncertainty in gene-tree estimates. Our simulation studies
demonstrate the good performance of our framework in terms of identifying the
location of hybridization events, as well as estimating the proportions of genes that
underwent hybridization. In our experiments, the framework also handles large
data sets efficiently. Further, in analyzing a
yeast data set, we demonstrate issues that arise when analyzing real data sets. While
a probabilistic approach was recently introduced for this problem, and while
parsimonious reconciliations have accuracy issues under certain settings, our
parsimony framework provides a much more computationally efficient technique for
this type of analysis. Our framework now allows for genome-wide scans for
hybridization, while also accounting for ILS.