    Sources of Signal in 62 Protein-Coding Nuclear Genes for Higher-Level Phylogenetics of Arthropods

    BACKGROUND: This study aims to investigate the strength of various sources of phylogenetic information that led to recent seemingly robust conclusions about higher-level arthropod phylogeny and to assess the role of excluding or downweighting synonymous change for arriving at those conclusions. METHODOLOGY/PRINCIPAL FINDINGS: The current study analyzes DNA sequences from 68 gene segments of 62 distinct protein-coding nuclear genes for 80 species. Gene segments analyzed individually support numerous nodes recovered in combined-gene analyses, but few of the higher-level nodes of greatest current interest. However, neither is there support for conflicting alternatives to these higher-level nodes. Gene segments with higher rates of nonsynonymous change tend to be more informative overall, but those with lower rates tend to provide stronger support for deeper nodes. Higher-level nodes with bootstrap values in the 80% - 99% range for the complete data matrix are markedly more sensitive to substantial drops in their bootstrap percentages after character subsampling than those with 100% bootstrap, suggesting that these nodes are likely not to have been strongly supported with many fewer data than in the full matrix. Data set partitioning of total data by (mostly) synonymous and (mostly) nonsynonymous change improves overall node support, but the result remains much inferior to analysis of (unpartitioned) nonsynonymous change alone. Clusters of genes with similar nonsynonymous rate properties (e.g., faster vs. slower) show some distinct patterns of node support but few conflicts. Synonymous change is shown to contribute little, if any, phylogenetic signal to the support of higher-level nodes, but it does contribute nonphylogenetic signal, probably through its underlying heterogeneous nucleotide composition. Analysis of seemingly conservative indels does not prove useful. 
CONCLUSIONS: Generating a robust molecular higher-level phylogeny of Arthropoda is currently possible with large amounts of data and an exclusive reliance on nonsynonymous change.

    Resolving Discrepancy between Nucleotides and Amino Acids in Deep-Level Arthropod Phylogenomics: Differentiating Serine Codons in 21-Amino-Acid Models

    BACKGROUND: In a previous study of higher-level arthropod phylogeny, analyses of nucleotide sequences from 62 protein-coding nuclear genes for 80 panarthropod species yielded significantly higher bootstrap support for selected nodes than did amino acids. This study investigates the cause of that discrepancy. METHODOLOGY/PRINCIPAL FINDINGS: The hypothesis is tested that failure to distinguish the serine residues encoded by two disjunct clusters of codons (TCN, AGY) in amino acid analyses leads to this discrepancy. In one test, the two clusters of serine codons (Ser1, Ser2) are conceptually translated as separate amino acids. Analysis of the resulting 21-amino-acid data matrix shows striking increases in bootstrap support, in some cases matching that in nucleotide analyses. In a second approach, nucleotide and 20-amino-acid data sets are artificially altered through targeted deletions, modifications, and replacements, revealing the pivotal contributions of distinct Ser1 and Ser2 codons. We confirm that previous methods of coding nonsynonymous nucleotide change are robust and computationally efficient by introducing two new degeneracy coding methods. We demonstrate for degeneracy coding that neither compositional heterogeneity at the level of nucleotides nor codon usage bias between Ser1 and Ser2 clusters of codons (or their separately coded amino acids) is a major source of non-phylogenetic signal. CONCLUSIONS: The incongruity in support between amino-acid and nucleotide analyses of the aforementioned arthropod data set is resolved by showing that “standard” 20-amino-acid analyses yield lower node support specifically when serine provides crucial signal. Separate coding of Ser1 and Ser2 residues yields support commensurate with that found by degenerated nucleotides, without introducing phylogenetic artifacts. 
While exclusion of all serine data leads to reduced support for serine-sensitive nodes, these nodes are still recovered in the ML topology, indicating that the enhanced signal from Ser1 and Ser2 is not qualitatively different from that of the other amino acids. This study was supported by grants from the National Science Foundation, U.S.A. (grant numbers 0531626, 1042845 and 0120635). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
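
The idea of translating the two serine codon clusters as separate residues can be illustrated with a minimal sketch. It assumes the standard genetic code and uses an arbitrary, hypothetical symbol (`Z`) for Ser2 (AGY), keeping `S` for Ser1 (TCN); the function name and symbol choice are illustrative and are not taken from the study.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical symbol for the AGY serine cluster (Ser2); any unused character would do.
SER2_SYMBOL = "Z"


def translate_21(nt_seq):
    """Translate an in-frame nucleotide sequence into a 21-state alphabet.

    TCN serines (Ser1) stay 'S'; AGT/AGC serines (Ser2) become SER2_SYMBOL.
    Gap codons ('---') become '-'; any other unrecognized codon becomes 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    seq = nt_seq.upper().replace("U", "T")
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon == "---":
            residues.append("-")
        elif codon in ("AGT", "AGC"):
            residues.append(SER2_SYMBOL)
        else:
            residues.append(STANDARD_CODE.get(codon, "X"))
    return "".join(residues)
```

For example, `translate_21("TCTAGCATGAGT")` gives `"SZMZ"`, whereas a conventional 20-state translation would give `"SSMS"` and hide the difference between the two serine clusters.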

    A Molecular Phylogeny for the Leaf-Roller Moths (Lepidoptera: Tortricidae) and Its Implications for Classification and Life History Evolution

    Tortricidae, one of the largest families of microlepidopterans, comprise about 10,000 described species worldwide, including important pests, biological control agents and experimental models. Understanding of tortricid phylogeny, the basis for a predictive classification, is currently provisional. We present the first detailed molecular estimate of relationships across the tribes and subfamilies of Tortricidae, assess its concordance with previous morphological evidence, and re-examine postulated evolutionary trends in host plant use and biogeography. We sequenced up to five nuclear genes (6,633 bp) in each of 52 tortricids spanning all three subfamilies and 19 of the 22 tribes, plus up to 14 additional genes, for a total of 14,826 bp, in 29 of those taxa plus all 14 outgroup taxa. Maximum likelihood analyses yield trees that, within Tortricidae, differ little among data sets and character treatments and are nearly always strongly supported at all levels of divergence. Support for several nodes was greatly increased by the additional 14 genes sequenced in just 29 of 52 tortricids, with no evidence of phylogenetic artifacts from deliberately incomplete gene sampling. There is strong support for the monophyly of Tortricinae and of Olethreutinae, and for grouping of these to the exclusion of Chlidanotinae. Relationships among tribes are robustly resolved in Tortricinae and mostly so in Olethreutinae. Feeding habit (internal versus external) is strongly conserved on the phylogeny. Within Tortricinae, a clade characterized by eggs being deposited in large clusters, in contrast to singly or in small batches, has markedly elevated incidence of polyphagous species. 
The five earliest-branching tortricid lineages are all species-poor tribes with mainly southern/tropical distributions, consistent with a hypothesized Gondwanan origin for the family. We present the first robustly supported phylogeny for Tortricidae, and a revised classification in which all of the sampled tribes are now monophyletic.

    Can Deliberately Incomplete Gene Sample Augmentation Improve a Phylogeny Estimate for the Advanced Moths and Butterflies (Hexapoda: Lepidoptera)?

    This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from an ongoing effort to resolve poorly understood deeper relationships in the large clade Ditrysia ( > 150,000 species) of the insect order Lepidoptera (butterflies and moths). Seeking to remedy the overall weak support for deeper divergences in an initial study based on five nuclear genes (6.6 kb) in 123 exemplars, we nearly tripled the total gene sample (to 26 genes, 18.4 kb) but only in a third (41) of the taxa. The resulting partially augmented data matrix (45% intentionally missing data) consistently increased bootstrap support for groupings previously identified in the five-gene (nearly) complete matrix, while introducing no contradictory groupings of the kind that missing data have been predicted to produce. Our results add to growing evidence that data sets differing substantially in gene and taxon sampling can often be safely and profitably combined. The strongest overall support for nodes above the family level came from including all nucleotide changes, while partitioning sites into sets undergoing mostly nonsynonymous versus mostly synonymous change. In contrast, support for the deepest node for which any persuasive molecular evidence has yet emerged (78–85% bootstrap) was weak or nonexistent unless synonymous change was entirely excluded, a result plausibly attributed to compositional heterogeneity. This node (Gelechioidea + Apoditrysia), tentatively proposed by previous authors on the basis of four morphological synapomorphies, is the first major subset of ditrysian superfamilies to receive strong statistical support in any phylogenetic study. 
A “more-genes-only” data set (41 taxa × 26 genes) also gave strong signal for a second deep grouping (Macrolepidoptera) that was obscured, but not strongly contradicted, in more taxon-rich analyses.
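
As a rough check on how much of such a partially augmented matrix is missing by design, the following sketch computes the fraction of empty cells from the block structure alone. The function name and parameters are illustrative assumptions; the inputs are the figures quoted in the abstract (123 taxa, 41 augmented taxa, 6.6 kb in the five-gene set, 18.4 kb in total).

```python
def designed_missing_fraction(n_taxa, n_augmented, base_kb, total_kb):
    """Fraction of matrix cells left empty purely by the sampling design.

    Assumes every taxon has the full base gene set and only the augmented
    taxa have the additional genes; gaps inside sequenced genes are ignored.
    """
    extra_kb = total_kb - base_kb
    missing = (n_taxa - n_augmented) * extra_kb
    return missing / (n_taxa * total_kb)


if __name__ == "__main__":
    frac = designed_missing_fraction(n_taxa=123, n_augmented=41,
                                     base_kb=6.6, total_kb=18.4)
    print(f"{frac:.1%}")  # prints 42.8%
```

Under these simplifying assumptions the design alone accounts for about 43% missing data, close to the 45% reported; the remainder may reflect incomplete sequences within sampled genes, though the abstract does not say so.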

    Phylogeny and Biogeography of Hawkmoths (Lepidoptera: Sphingidae): Evidence from Five Nuclear Genes

    The 1400 species of hawkmoths (Lepidoptera: Sphingidae) comprise one of the most conspicuous and well-studied groups of insects, and provide model systems for diverse biological disciplines. However, a robust phylogenetic framework for the family is currently lacking. Morphology is unable to confidently determine relationships among most groups. As a major step toward understanding relationships of this model group, we have undertaken the first large-scale molecular phylogenetic analysis of hawkmoths representing all subfamilies, tribes and subtribes. The data set consisted of 131 sphingid species and 6793 bp of sequence from five protein-coding nuclear genes. Maximum likelihood and parsimony analyses provided strong support for more than two-thirds of all nodes, including strong signal for or against nearly all of the fifteen current subfamily, tribal and sub-tribal groupings. Monophyly was strongly supported for some of these, including Macroglossinae, Sphinginae, Acherontiini, Ambulycini, Philampelini, Choerocampina, and Hemarina. Other groupings proved para- or polyphyletic, and will need significant redefinition; these include Smerinthinae, Smerinthini, Sphingini, Sphingulini, Dilophonotini, Dilophonotina, Macroglossini, and Macroglossina. The basal divergence, strongly supported, is between Macroglossinae and Smerinthinae+Sphinginae. All genes contribute significantly to the signal from the combined data set, and there is little conflict between genes. Ancestral state reconstruction reveals multiple separate origins of New World and Old World radiations. Our study provides the first comprehensive phylogeny of one of the most conspicuous and well-studied insects. The molecular phylogeny challenges current concepts of Sphingidae based on morphology, and provides a foundation for a new classification. 
While there are multiple independent origins of New World and Old World radiations, we conclude that broad-scale geographic distribution in hawkmoths is more phylogenetically conserved than previously postulated.

    Toward reconstructing the evolution of advanced moths and butterflies (Lepidoptera: Ditrysia): an initial molecular study

    Background: In the mega-diverse insect order Lepidoptera (butterflies and moths; 165,000 described species), deeper relationships are little understood within the clade Ditrysia, to which 98% of the species belong. To begin addressing this problem, we tested the ability of five protein-coding nuclear genes (6.7 kb total), and character subsets therein, to resolve relationships among 123 species representing 27 (of 33) superfamilies and 55 (of 100) families of Ditrysia under maximum likelihood analysis. Results: Our trees show broad concordance with previous morphological hypotheses of ditrysian phylogeny, although most relationships among superfamilies are weakly supported. There are also notable surprises, such as a consistently closer relationship of Pyraloidea than of butterflies to most Macrolepidoptera. Monophyly is significantly rejected by one or more character sets for the putative clades Macrolepidoptera as currently defined (P < 0.05) and Macrolepidoptera excluding Noctuoidea and Bombycoidea sensu lato (P ≤ 0.005), and nearly so for the superfamily Drepanoidea as currently defined (P < 0.08). Superfamilies are typically recovered or nearly so, but usually without strong support. Relationships within superfamilies and families, however, are often robustly resolved. We provide some of the first strong molecular evidence on deeper splits within Pyraloidea, Tortricoidea, Geometroidea, Noctuoidea and others. Separate analyses of mostly synonymous versus non-synonymous character sets revealed notable differences (though not strong conflict), including a marked influence of compositional heterogeneity on apparent signal in the third codon position (nt3). As available model partitioning methods cannot correct for this variation, we assessed overall phylogeny resolution through separate examination of trees from each character set. 
Exploration of "tree space" with GARLI, using grid computing, showed that hundreds of searches are typically needed to find the best-feasible phylogeny estimate for these data. Conclusion: Our results (a) corroborate the broad outlines of the current working phylogenetic hypothesis for Ditrysia, (b) demonstrate that some prominent features of that hypothesis, including the position of the butterflies, need revision, and (c) resolve the majority of family and subfamily relationships within superfamilies as thus far sampled. Much further gene and taxon sampling will be needed, however, to strongly resolve individual deeper nodes.
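
Compositional heterogeneity at the third codon position (nt3) can be inspected descriptively with a short sketch like the one below. It is not the test used in the study; the function names and the GC3-spread summary are illustrative assumptions, and the input is assumed to be an in-frame coding alignment given as a mapping from taxon name to sequence.

```python
from collections import Counter


def nt3_composition(aligned_cds):
    """Base frequencies at third codon positions of one in-frame sequence.

    Gaps and ambiguity codes are skipped; returns a dict over A, C, G, T.
    """
    third = aligned_cds.upper()[2::3]
    counts = Counter(b for b in third if b in "ACGT")
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in "ACGT"}


def gc3_spread(alignment):
    """Difference between the highest and lowest nt3 GC content across taxa.

    A large spread hints at compositional heterogeneity among taxa; it is a
    crude descriptive summary, not a formal statistical test.
    """
    gc3 = []
    for seq in alignment.values():
        comp = nt3_composition(seq)
        gc3.append(comp["G"] + comp["C"])
    return max(gc3) - min(gc3)
```

For instance, `gc3_spread({"taxonA": "ATGGCC", "taxonB": "ATAGCA"})` contrasts a sequence whose third positions are all G or C with one whose third positions are all A.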

    Increased gene sampling strengthens support for higher-level groups within leaf-mining moths and relatives (Lepidoptera: Gracillariidae)

    Background: Researchers conducting molecular phylogenetic studies are frequently faced with the decision of what to do when weak branch support is obtained for key nodes of importance. As one solution, the researcher may choose to sequence additional orthologous genes of appropriate evolutionary rate for the taxa in the study. However, generating large, complete data matrices can become increasingly difficult as the number of characters increases. A few empirical studies have shown that augmenting genes even for a subset of taxa can improve branch support. However, because each study differs in the number of characters and taxa, there is still a need for additional studies that examine whether incomplete sampling designs are likely to aid in increasing deep node resolution. We target Gracillariidae, a Cretaceous-age (~100 Ma) group of leaf-mining moths, to test whether the strategy of adding genes for a subset of taxa can improve branch support for deep nodes. We initially sequenced ten genes (8,418 bp) for 57 taxa that represent the major lineages of Gracillariidae plus outgroups. After finding that many deep divergences remained weakly supported, we sequenced eleven additional genes (6,375 bp) for a 27-taxon subset. We then compared results from different data sets to assess whether one sampling design can be favored over another. The concatenated data set comprising all genes and all taxa and three other data sets of different taxon and gene sub-sampling design were analyzed with maximum likelihood. Each data set was subject to five different models and partitioning schemes of non-synonymous and synonymous changes. Statistical significance of non-monophyly was examined with the Approximately Unbiased (AU) test. Results: Partial augmentation of genes led to high support for deep divergences, especially when non-synonymous changes were analyzed alone. 
Increasing the number of taxa without an increase in number of characters led to lower bootstrap support; increasing the number of characters without increasing the number of taxa generally increased bootstrap support. More than three-quarters of nodes were supported with bootstrap values greater than 80% when all taxa and genes were combined. Gracillariidae, Lithocolletinae + Leucanthiza, and Acrocercops and Parectopa groups were strongly supported in nearly every analysis. Gracillaria group was well supported in some analyses, but less so in others. We find strong evidence for the exclusion of Douglasiidae from Gracillarioidea sensu Davis and Robinson (1998). Our results strongly support the monophyly of a G.B.R.Y. clade, a group comprised of Gracillariidae + Bucculatricidae + Roeslerstammiidae + Yponomeutidae, when analyzed with non-synonymous changes only, but this group was frequently split when synonymous and non-synonymous substitutions were analyzed together. Conclusions: 1) Partially or fully augmenting a data set with more characters increased bootstrap support for particular deep nodes, and this increase was dramatic when non-synonymous changes were analyzed alone. Thus, the addition of sites that have low levels of saturation and compositional heterogeneity can greatly improve results. 2) Gracillarioidea, as defined by Davis and Robinson (1998), clearly do not include Douglasiidae, and changes to current classification will be required. 3) Gracillariidae were monophyletic in all analyses conducted, and nearly all species can be placed into one of six strongly supported clades though relationships among these remain unclear. 4) The difficulty in determining the phylogenetic placement of Bucculatricidae is probably attributable to compositional heterogeneity at the third codon position. 
From our tests for compositional heterogeneity and strong bootstrap values obtained when synonymous changes are excluded, we tentatively conclude that Bucculatricidae is closely related to Gracillariidae + Roeslerstammiidae + Yponomeutidae.