    Species-Level Para- and Polyphyly in DNA Barcode Gene Trees: Strong Operational Bias in European Lepidoptera

    Mutanen, Marko et al.

    The proliferation of DNA data is revolutionizing all fields of systematic research. DNA barcode sequences, now available for millions of specimens and several hundred thousand species, are increasingly used in algorithmic species delimitations. This is complicated by occasional incongruences between species and gene genealogies, as indicated by situations where conspecific individuals do not form a monophyletic cluster in a gene tree. In two previous reviews, non-monophyly has been reported as being common in mitochondrial DNA gene trees. We developed a novel web service “Monophylizer” to detect non-monophyly in phylogenetic trees and used it to ascertain the incidence of species non-monophyly in COI (a.k.a. cox1) barcode sequence data from 4977 species and 41,583 specimens of European Lepidoptera, the largest data set of DNA barcodes analyzed in this regard. Particular attention was paid to accurate species identification to ensure data integrity. We investigated the effects of tree-building method, sampling effort, and other methodological issues, all of which can influence estimates of non-monophyly. We found a 12% incidence of non-monophyly, a value significantly lower than that observed in previous studies. Neighbor joining (NJ) and maximum likelihood (ML) methods yielded almost equal numbers of non-monophyletic species, but 24.1% of these cases of non-monophyly were only found by one of these methods. Non-monophyletic species tend to show either low genetic distances to their nearest neighbors or exceptionally high levels of intraspecific variability. Cases of polyphyly in COI trees arising as a result of deep intraspecific divergence are negligible, as the detected cases reflected misidentifications or methodological errors. Taking into consideration variation in sampling effort, we estimate that the true incidence of non-monophyly is ∼23%, but with operational factors still being included.
Within the operational factors, we separately assessed the frequency of taxonomic limitations (presence of overlooked cryptic and oversplit species) and identification uncertainties. We observed that operational factors are potentially present in more than half (58.6%) of the detected cases of non-monophyly. Furthermore, we observed that in about 20% of non-monophyletic species and entangled species, the lineages involved are either allopatric or parapatric, conditions where species delimitation is inherently subjective and particularly dependent on the species concept that has been adopted. These observations suggest that species-level non-monophyly in COI gene trees is less common than previously supposed, with many cases reflecting misidentifications, the subjectivity of species delimitation or other operational factors.

Most of the sequences used in this study were generated at the Biodiversity Institute of Ontario under the International Barcode of Life Project, funded by the Government of Canada through Genome Canada and the Ontario Genomics Institute. The generation of German data was funded by grants from the Bavarian State Ministry of Education, Culture, Science and the Arts (Barcoding Fauna Bavarica, BFB) and the German Federal Ministry of Education and Research (German Barcode of Life GBOL2: BMBF #01LI1101B). Molecular laboratory infrastructure and sequencing within the Nature of The Netherlands project was funded by a FES grant from the Dutch Ministry of Finance. The Finnish Barcode of Life project was funded by the Kone Foundation, the Finnish Cultural Foundation, and the University of Oulu. Support for this research was provided by the Spanish Ministerio de Economía y Competitividad [projects CGL2010-21226/BOS and CGL2013-48277-P to R.V.], by a Région Haute-Normandie post-doctoral fellowship [to R.R.], and by a Marie Curie International Outgoing Fellowship within the 7th European Community Framework Programme [project no. 625997 to V.D.]. Sequencing of Norwegian material was supported by the Natural History Museum, University of Oslo, and the Norwegian Barcode of Life Network (NorBOL). Sequencing within the framework of the Lepidoptera of the Alps Campaign was supported by the Promotion of Educational Policies, University and Research Department of the Autonomous Province of Bolzano – South Tyrol with funds to the project “Genetic biodiversity archive - DNA barcoding of Lepidoptera of the central Alpine region (South, East and North Tyrol),” the Austrian Federal Ministry of Science, Research and Economics with funds to ABOL (Austrian Barcode of Life), and by the regional institutions Tiroler Landesmuseen, inatura and Landesmuseum Kärnten. S.M.K. was funded by the international fellowship program at Stockholm University and Finnish Cultural Foundation. Sampling of Lepidoptera from Upper-Normandy (France) was supported by a grant by Conseil Régional de Haute-Normandie to Thibaud Decaëns, then member of the ECODIV laboratory at the University of Rouen.

Peer reviewed
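
The abstract's central test, whether conspecific specimens form a monophyletic cluster in a gene tree, can be illustrated with a minimal sketch. This is not the Monophylizer implementation. It assumes a rooted tree written as nested tuples, tip labels of the hypothetical form `Species|specimen_id`, and a made-up toy tree; all function names are illustrative only. A species counts as monophyletic here when some clade in the tree contains exactly its specimens and nothing else.

```python
from collections import defaultdict

# Hypothetical toy gene tree (not data from the study): nested tuples are
# clades, strings are specimens labelled "Species|specimen_id".
TREE = (
    (("A|1", "A|2"), "A|3"),
    (("B|1", "C|1"), ("B|2", "C|2")),
)


def species_of(tip):
    """Return the species part of a 'Species|specimen_id' label."""
    return tip.split("|")[0]


def tips(node):
    """List every specimen label below a node."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node:
        found.extend(tips(child))
    return found


def clades(node):
    """Yield the tip set of every clade, single tips included."""
    yield frozenset(tips(node))
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def non_monophyletic_species(tree):
    """Return species whose specimens do not form an exclusive clade.

    Assumes the tree is rooted; on an unrooted tree the result would
    depend on where the root is placed.
    """
    all_clades = set(clades(tree))
    members = defaultdict(set)
    for tip in tips(tree):
        members[species_of(tip)].add(tip)
    return sorted(
        species
        for species, specimens in members.items()
        if frozenset(specimens) not in all_clades
    )


if __name__ == "__main__":
    flagged = non_monophyletic_species(TREE)
    total = len({species_of(t) for t in tips(TREE)})
    print(flagged, f"{len(flagged)}/{total} species non-monophyletic")
```

In the toy tree, species A forms an exclusive clade, while B and C are interleaved and are both flagged. Species represented by a single specimen are trivially monophyletic under this definition, which is one reason sampling effort affects the estimated incidence.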