The Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection

A Rokas; AD Leaché; B Carstens; B Holland; B Rannala; C Ané; C Ané; C Ané; C Meng; C Than; C Than; C Than; CR Linder; CV Than; D Huson; D Posada; D Ruths; D Swofford; DA Pollard; DL Swofford; ES Allman; ES Allman; EW Bloomquist; G Schwarz; H Akaike; H Huang; H Lanier; J Heled; J Mallet; J Mallet; J Wakeley; James H. Degnan; JH Degnan; JH Degnan; JH Degnan; JJ Doyle; Joseph Felsenstein; JP Huelsenbeck; K Burnham; L Liu; L Liu; L Nakhleh; LL Knowles; LS Kubatko; LS Kubatko; LS Kubatko; Luay Nakhleh; M DeGiorgio; M Nei; M Slatkin; ML Arnold; NA Rosenberg; SM Ross; SV Edwards; SV Edwards; TC Bruen; W Maddison; Y Wang; Y Wu; Y Yu; Yun Yu

Publication date: 1 January 2012
Publisher: Public Library of Science

Abstract

Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist only for very limited cases. No such methods exist for general cases, primarily because it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis of a group of three Drosophila species shows that the data are consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies for obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.
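To make the quantity discussed in the abstract concrete, the sketch below works through one of the "very limited cases": three taxa A, B and C with one sampled allele per species, where B is a hybrid. It is not the general method of the paper. It relies on the standard three-taxon coalescent result that, for an internal branch of length t in coalescent units, the concordant topology has probability 1 - (2/3)e^{-t} and each discordant topology has probability (1/3)e^{-t}. It also assumes that the network's distribution is a mixture of its two parental trees, weighted by a hybridization probability gamma. All function and parameter names (`tree_probs`, `network_probs`, `log_likelihood`, `gamma`, `t_ab`, `t_bc`) are hypothetical and chosen for illustration only.

```python
import math

# The three rooted topologies on taxa A, B, C, written as "sister pair | outgroup".
TOPOLOGIES = ("AB|C", "AC|B", "BC|A")


def tree_probs(concordant, t):
    """Gene tree topology probabilities on a three-taxon species tree.

    `concordant` is the topology that matches the species tree and `t` is
    the internal branch length in coalescent units.
    """
    p_disc = math.exp(-t) / 3.0  # probability of each discordant topology
    return {
        topo: (1.0 - 2.0 * p_disc) if topo == concordant else p_disc
        for topo in TOPOLOGIES
    }


def network_probs(gamma, t_ab, t_bc):
    """Mixture over the two parental trees of a hybrid taxon B (assumed model).

    With probability `gamma` the B lineage follows the tree ((A,B),C) with
    internal branch `t_ab`; otherwise it follows (A,(B,C)) with `t_bc`.
    """
    via_ab = tree_probs("AB|C", t_ab)
    via_bc = tree_probs("BC|A", t_bc)
    return {
        topo: gamma * via_ab[topo] + (1.0 - gamma) * via_bc[topo]
        for topo in TOPOLOGIES
    }


def log_likelihood(counts, gamma, t_ab, t_bc):
    """Multinomial log-likelihood (up to a constant) of observed topology counts."""
    probs = network_probs(gamma, t_ab, t_bc)
    return sum(n * math.log(probs[topo]) for topo, n in counts.items())
```

Setting `gamma` to 1 or 0 recovers a strictly bifurcating species tree, so comparing `log_likelihood` values at intermediate and boundary values of `gamma` gives a rough sense of how topology counts can favor a hybridization hypothesis over strict divergence in this restricted setting.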