4,036 research outputs found

    A rapid and scalable method for multilocus species delimitation using Bayesian model comparison and rooted triplets

    Multilocus sequence data provide far greater power to resolve species limits than the single locus data typically used for broad surveys of clades. However, current statistical methods based on a multispecies coalescent framework are computationally demanding, because of the number of possible delimitations that must be compared and the time-consuming likelihood calculations. New methods are therefore needed to open up the power of multilocus approaches to larger systematic surveys. Here, we present a rapid and scalable method that introduces two innovations. First, the method reduces the complexity of likelihood calculations by decomposing the tree into rooted triplets. The distribution of topologies for a triplet across multiple loci has a uniform trinomial distribution when the 3 individuals belong to the same species, but a skewed distribution if they belong to separate species, with a form that is specified by the multispecies coalescent. A Bayesian model comparison framework was developed and the best delimitation found by comparing the product of posterior probabilities of all triplets. The second innovation is a new dynamic programming algorithm for finding the optimum delimitation from all those compatible with a guide tree by successively analyzing subtrees defined by each node. This algorithm removes the need for the heuristic searches used by current methods, guarantees that the best solution is found, and could potentially be used in other systematic applications. We assessed the performance of the method with simulated, published and newly generated data. Analyses of simulated data demonstrate that the combined method has favourable statistical properties and scalability with increasing sample sizes. Analyses of empirical data from both eukaryotes and prokaryotes demonstrate its potential for delimiting species in real cases
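
    The contrast between the uniform and skewed triplet distributions described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard multispecies coalescent result that, for an internal branch of length t in coalescent units, the concordant topology has probability 1 - (2/3)exp(-t) and each discordant topology has probability (1/3)exp(-t). It also replaces the Bayesian model comparison of the abstract with a plain log-likelihood ratio at a fixed, user-supplied t. All function names are hypothetical.

```python
import math


def triplet_topology_probs(t):
    """Probabilities of the three rooted topologies of a triplet.

    t is the internal branch length in coalescent units (an assumed
    parameterisation). t = 0 corresponds to a single species and gives
    the uniform distribution (1/3, 1/3, 1/3).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    discordant = math.exp(-t) / 3.0
    return (1.0 - 2.0 * discordant, discordant, discordant)


def log_multinomial_likelihood(counts, probs):
    """Log-likelihood of topology counts across loci under a trinomial."""
    n = sum(counts)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coef + sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)


def log_likelihood_ratio(counts, t):
    """Separate-species model (fixed t) versus same-species model.

    counts[0] is the number of loci showing the topology that matches
    the assumed species tree; counts[1] and counts[2] are the two
    alternative topologies. Positive values favour separate species.
    """
    same = log_multinomial_likelihood(counts, triplet_topology_probs(0.0))
    separate = log_multinomial_likelihood(counts, triplet_topology_probs(t))
    return separate - same


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    print(log_likelihood_ratio([30, 5, 5], t=1.0))
    print(log_likelihood_ratio([13, 14, 13], t=1.0))
```

    With strongly skewed counts the ratio is positive, and with near-equal counts it is negative, which mirrors the qualitative behaviour the abstract describes.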

    Molecular phylogenies map to biogeography better than morphological ones

    Phylogenetic relationships are inferred principally from two classes of data: morphological and molecular. Most current phylogenies of extant taxa are inferred from molecules, and when morphological and molecular trees conflict the latter are often preferred. Although supported by simulations, the superiority of molecular trees has never been assessed empirically. Here we test phylogenetic accuracy using two independent data sources: biogeographical distributions and fossil first occurrences. For 48 pairs of morphological and molecular trees, we show that, on average, molecular trees provide a better fit to biogeographical data than their morphological counterparts, and that biogeographical congruence increases over research time. We find no significant differences in stratigraphical congruence between morphological and molecular trees. These findings have implications for understanding homoplasy in morphological data sets, the utility of morphology as a test of molecular hypotheses, and the implications of analysing fossil groups for which molecular data are unavailable

    Does behavior reflect phylogeny in swiftlets (Aves: Apodidae)? A test using cytochrome b mitochondrial DNA sequences

    Swiftlets are small insectivorous birds, many of which nest in caves and are known to echolocate. Due to a lack of distinguishing morphological characters, the taxonomy of swiftlets is primarily based on the presence or absence of echolocating ability, together with nest characters. To test the reliability of these behavioral characters, we constructed an independent phylogeny using cytochrome b mitochondrial DNA sequences from swiftlets and their relatives. This phylogeny is broadly consistent with the higher classification of swifts but does not support the monophyly of swiftlets. Echolocating swiftlets (Aerodramus) and the nonecholocating "giant swiftlet" (Hydrochous gigas) group together, but the remaining nonecholocating swiftlets belonging to Collocalia are not sister taxa to these swiftlets. While echolocation may be a synapomorphy of Aerodramus (perhaps secondarily lost in Hydrochous), no character of Aerodramus nests showed a statistically significant fit to the molecular phylogeny, indicating that nest characters are not phylogenetically reliable in this group

    Megaphylogenetic Specimen-Level Approaches to the Carex (Cyperaceae) Phylogeny Using ITS, ETS, and matK Sequences: Implications for Classification

    We present the first large-scale phylogenetic hypothesis for the genus Carex based on 996 of the 1983 accepted species (50.23%). We used a supermatrix approach using three DNA regions: ETS, ITS and matK. Every concatenated sequence was derived from a single specimen. The topology of our phylogenetic reconstruction largely agreed with previous studies. We also gained new insights into the early divergence structure of the two largest clades, core Carex and Vignea clades, challenging some previous evolutionary hypotheses about inflorescence structure. Most sections were recovered as non-monophyletic. Homoplasy of characters traditionally selected as relevant for classification, historical misunderstanding of how morphology varies across Carex, and regional rather than global views of Carex diversity seem to be the main reasons for the high levels of polyphyly and paraphyly in the current infrageneric classification

    Total Evidence, Average Consensus and Matrix Representation with Parsimony: What a Difference Distances Make

    Matrix representation with parsimony (MRP) can be used to combine trees in the supertree or the consensus settings. However, despite its popularity, it is still unclear whether MRP is really a consensus method or whether it behaves more like the total evidence approach. Previous simulations have shown that it approximates total evidence trees, whereas other studies have depicted similarities with average consensus trees. In this paper, we assess the hypothesis that MRP is equally related to both approaches. We conducted a simulation study to compare the accuracy of total evidence with that of various consensus methods, including MRP. Our results show that the total evidence trees are not significantly more accurate than average consensus trees that account for branch lengths, but that both perform better than MRP trees in the consensus setting. The accuracy rate of all methods was similarly affected by the number of taxa, the number of partitions, and the heterogeneity of the data
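
    As background to the MRP approach discussed in this abstract, the matrix step can be sketched as follows. This is a minimal illustration of the standard Baum-Ragan coding, in which each non-root clade of a source tree becomes one binary character: taxa inside the clade are scored 1, other taxa in that tree 0, and taxa absent from the tree "?". The abstract does not say which coding the authors used, so that choice is an assumption. Trees are represented here as nested tuples of taxon names, and the function names are hypothetical. The subsequent parsimony search on the matrix is not shown.

```python
def leaves(tree):
    """Return the set of taxon names below a node (a str is a leaf)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree, is_root=True):
    """Yield the leaf set of every internal node except the root."""
    if isinstance(tree, str):
        return
    if not is_root:
        yield leaves(tree)
    for child in tree:
        yield from clades(child, is_root=False)


def mrp_matrix(trees):
    """Build a Baum-Ragan matrix: one character per non-root clade."""
    all_taxa = sorted(set().union(*(leaves(t) for t in trees)))
    matrix = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    matrix[taxon].append("?")  # taxon missing from this tree
                elif taxon in clade:
                    matrix[taxon].append("1")
                else:
                    matrix[taxon].append("0")
    return {taxon: "".join(states) for taxon, states in matrix.items()}


if __name__ == "__main__":
    # Two small illustrative source trees with partly overlapping taxa.
    source_trees = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
    for taxon, row in mrp_matrix(source_trees).items():
        print(taxon, row)
```

    In the consensus setting all source trees share the same taxa, so no "?" entries arise; in the supertree setting the missing-data symbol handles taxa that a given tree does not sample.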

    Towards the true tree: Bioinformatic approaches in the phylogenetics and molecular evolution of the Endopterygota

    In this thesis, I use bioinformatic approaches to address new and existing issues surrounding large-scale phylogenetic analysis. A phylogenetic analysis pipeline is developed to aid an investigation of the suitability of integrating Cytochrome Oxidase Subunit 1 (cox1) into phylogenetic supermatrices. In the first two chapters I assess the effect of varying cox1 sample size within a large variable phylogenetic context. As well as intuitive results on increased quality with greater taxon sampling, there are clear monophyly patterns relating to local taxonomic sampling. Specifically, resampled taxa are more often monophyletic in cases when fewer consubfamilials are represented, with a tendency for these to remain unchanged in the degree of monophyly when rarefied. Sampling analyses are extended in chapter two using a mined Scarabaeoidea multilocus dataset, where taxa from given loci are used to improve existing matrices. Improvement in phylogenetic signal is best achieved by targeting cox1 to existing taxa, which suggests minimum parameters for cox1 adoption in large-scale phylogenetics. In chapter 3 I address recently-arisen issues related to phyloinformatic analysis of sequence-delineated matrices. There is ongoing work on setting species boundaries by sequence variation alone, but incongruence results in methodological issues upon integrating multiple loci delineated in this way. In the final chapter I assess the impact of heterogeneous substitution rates on large scale cox1 datasets. Although the number of heterogeneous sites in Coleoptera cox1 is substantial, their presence is found to be beneficial, as their removal negatively impacts the ability of the alignment to generate the 'known' topology. The homoplasy and heterogeneous characteristics of cox1 have not substantially impacted its utility, thus the cox1 datasets have potential to play a substantial role in the tree-of-life

    Reassessing the role of morphology in bryophyte phylogenetics : combined data improves phylogenetic inference despite character conflict

    Morphological data has gained renewed attention and has been shown to be crucial in clarifying the phylogenetic relationship in a wide range of taxa. In the last decades, phylogenetic analyses of sequence-level data have radically modified the systematic schemes within bryophytes (early non-vascular land plants) and have revealed a widespread pattern of conflict with morphology-based classifications. Yet, a comprehensive evaluation of character conflict has not yet been performed in the context of combined matrices. In this study, we evaluate the impact of morphology on bryophyte phylogeny following a total-evidence approach across 10 published matrices. The analysed matrices span a wide range of bryophytes, taxonomic levels, gene sampling and number of morphological characters and taxa. Data conflict was addressed by measuring: (i) the topological congruence between individual partitions, (ii) changes in support values of the combined data relative to the molecular partition and (iii) clade stability. The association between these measures and the number of morphological characters per taxon (Nc/T ratio) and the proportion of non-fixed characters (i.e., inapplicable, polymorphic and missing data) was explored. In the individual partition analyses, the Nc/T ratio correlated positively with the topological congruence in six to seven datasets depending on the weighting scheme. The proportion of non-fixed cells had a minor influence on congruence between data partitions. The number of characters and proportion of non-fixed data varied significantly between morphological datasets that improved congruence between data types. This variation suggests that morphological datasets affect the results of combined analyses in different ways, depending on the taxa studied. Combined analyses revealed that, despite the low congruence values between partitions, integrating data types improves support values and stability. However, while non-fixed data had no negative effect on support values, stability was reduced as the proportion of non-fixed cells increased. Nc/T ratio was negatively associated with support values and it showed ambiguous responses in stability evaluations. Overall, the results indicate that adding morphology may contribute to the inference of phylogenetic relationships of bryophytes despite character conflict. Our findings suggest that merely comparing (a) morphology-based classifications with molecular phylogenies or (b) the outcome from individual data partitions can misestimate data conflict. These findings imply that analyses of combined data may provide conservative assessments of data conflict and, eventually, lead to an improved sampling of morphological characters in large-scale analyses of bryophytes

    Assessing the Value of DNA Barcodes for Molecular Phylogenetics: Effect of Increased Taxon Sampling in Lepidoptera

    BACKGROUND: A common perception is that DNA barcode datamatrices have limited phylogenetic signal due to the small number of characters available per taxon. However, another school of thought suggests that the massively increased taxon sampling afforded through the use of DNA barcodes may considerably increase the phylogenetic signal present in a datamatrix. Here I test this hypothesis using a large dataset of macrolepidopteran DNA barcodes. METHODOLOGY/PRINCIPAL FINDINGS: Taxon sampling was systematically increased in datamatrices containing macrolepidopteran DNA barcodes. Sixteen family groups were designated as concordance groups and two quantitative measures, the taxon consistency index and the taxon retention index, were used to assess any changes in phylogenetic signal as a result of the increase in taxon sampling. DNA barcodes alone, even with maximal taxon sampling (500 species per family), were not sufficient to reconstruct monophyly of families, and increased taxon sampling generally increased the number of clades formed per family. However, the scores indicated a similar level of taxon retention (species from a family clustering together) in the cladograms as the number of species included in the datamatrix was increased, suggesting substantial phylogenetic signal below the 'family' branch. CONCLUSIONS/SIGNIFICANCE: The development of supermatrix, supertree or constrained tree approaches could enable the exploitation of the massive taxon sampling afforded through DNA barcodes for phylogenetics, connecting the twigs resolved by barcodes to the deep branches resolved through phylogenomics