Understanding How Stochasticity Impacts Reconstructions of Recent Species Divergent History.

Abstract

Molecular phylogenetic studies are complicated by the fact that differentiation between orthologous gene copies is determined by two stochastic processes: lineage sorting (the coalescent) and mutation. The former can lead to discrepancies between the species divergent history and the gene genealogies, while the latter can result in differences between the genealogies and the estimated gene trees. Only recently has the idea of incorporating the coalescent process into species-tree estimation been applied in empirical phylogenetics. My thesis examines the impacts of these two sources of stochasticity on reconstructing recent species divergent histories in which incomplete lineage sorting is prevalent. Using simulated data, I re-evaluate the effect of mutational variance on the accuracy of species-tree estimates obtained with different methods, ranging from the simplest “democratic voting” to a maximum-likelihood method that incorporates branch-length information. The implications for sampling design and for the choice of gene-tree and species-tree estimation methods are discussed in Chapters II and III. While future phylogenetic studies will benefit from the new species-tree estimation methods, what is not clear is the extent to which species relationships estimated with data and methods that predate these developments are robust. I proposed a parametric bootstrap species tree (PBST) approach to assess the reliability of past phylogenetic studies in which the stochastic lineage sorting process was overlooked, and applied the approach as a meta-analysis of east African cichlid phylogenies in Chapter IV. Another obstacle to applying species-tree estimation in empirical phylogenetic studies is obtaining a multi-locus sequence dataset. Next-generation sequencing (NGS) combined with the reduced representation library technique holds promise, but concerns exist about whether data with high NGS error rates are amenable to direct use in phylogenetic analysis.
The use of NGS as primary data for reconstructing the divergent history of four montane grasshopper species was explored in Chapter IV, and parametric simulation was used to assess three possible sources of uncertainty in the estimated species tree: the true species divergent history, sequencing errors, and the error-correction method. Possible improvements to sampling design and the methodological developments needed for future studies are discussed. The last chapter explored the use of gene divergent history combined with geographic information to infer speciation models.

Ph.D.
Ecology and Evolutionary Biology
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/91486/1/huatengh_1.pd
