7 research outputs found

    PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region

    Abstract
    Background: DNA barcoding technology, which uses a short piece of DNA sequence to identify species, has a wide range of applications. To date, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants, and the ITS2 and psbA-trnH intergenic spacer (PTIGS) regions were later added as supplemental barcodes. The use of the PTIGS region as a supplemental barcode has been limited by the lack of computational tools that can handle significant insertions and deletions in PTIGS sequences. Here, we compared the most commonly used alignment-based and alignment-free methods and developed a web server that lets biologists carry out PTIGS-based DNA barcoding analyses.
    Results: First, we compared several alignment-based methods (BLAST and methods calculating P distance and Edit distance), the alignment-free Di-Nucleotide Frequency Profile (DNFP) method, and their combinations. We found that the DNFP and Edit-distance methods increased the identification success rate to ~80%, 20% higher than the most commonly used BLAST method. Second, the combined methods showed better overall success rate and performance. Last, we developed a web server that allows (1) retrieving various sub-regions and the consensus sequences of PTIGS, (2) annotating novel PTIGS sequences, (3) determining species identity by PTIGS sequences using eight methods, and (4) examining the identification efficiency and performance of the eight methods for various taxonomic groups.
    Conclusions: The Edit distance and DNFP methods have the highest discrimination power. Hybrid methods can be used to achieve a significant improvement in performance. These methods can be extended to applications using the core barcodes and the other supplemental DNA barcode, ITS2. To our knowledge, the web server developed here is the only one that allows species determination based on PTIGS sequences.
    The web server can be accessed at http://psba-trnh-plantidit.dnsalias.org
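As a rough illustration of the two methods the abstract reports as most discriminating, the sketch below computes an edit distance and a dinucleotide frequency profile, then assigns a query to the nearest reference sequence. It is not the PTIGS-IdIt implementation: the function names, the Euclidean distance between profiles, the nearest-reference rule, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# The 16 possible dinucleotides over the DNA alphabet, in a fixed order.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]


def dnfp(seq):
    """Return the dinucleotide frequency profile of seq as a list of 16 floats."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DINUCS)  # ignores pairs with ambiguous bases
    if total == 0:
        return [0.0] * len(DINUCS)
    return [counts[d] / total for d in DINUCS]


def dnfp_distance(a, b):
    """Euclidean distance between two profiles (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(dnfp(a), dnfp(b))) ** 0.5


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def identify(query, references, distance):
    """Return the name of the reference sequence closest to query."""
    return min(references, key=lambda name: distance(query, references[name]))


# Hypothetical toy references, far shorter than real PTIGS sequences.
references = {"species_a": "ACGTACGT", "species_b": "TTTTGGGG"}
best_by_edit = identify("ACGTACGA", references, edit_distance)
best_by_dnfp = identify("ACGTACGA", references, dnfp_distance)
```

Because the profile is built from overlapping pairs rather than from an alignment, it is unaffected by where insertions and deletions fall, which is the difficulty the abstract raises for PTIGS sequences.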

    Applied Barcoding: The Practicalities of DNA Testing for Herbals

    DNA barcoding is a widely accepted technique for the identification of plant materials, and its application to the authentication of commercial medicinal plants has attracted significant attention. The incorporation of DNA-based technologies into the quality testing protocols of international pharmacopoeias represents a step-change in status, requiring the establishment of standardized, reliable and reproducible methods. The process by which this can be achieved for any herbal medicine is described, using Hypericum perforatum L. (St John’s Wort) and potential adulterant Hypericum species as a case study. A range of practical issues are considered, including quality control of DNA sequences from public repositories and the construction of individual curated databases, choice of DNA barcode region(s) and the identification of informative polymorphic nucleotide sequences. A decision tree informs the structure of the manuscript and provides a template to guide the development of future DNA barcode tests for herbals.

    The complete chloroplast genome sequence of Rhododendron fortunei: structural comparative and phylogenetic analysis in the Ericaceae family

    Rhododendron fortunei (Ericaceae) has considerable horticultural and medicinal value. However, genomic information on R. fortunei is very limited. In this study, the complete chloroplast (cp) genome of R. fortunei was assembled and annotated, SSR loci were characterised, and comparative genomic and phylogenetic analyses were carried out. The results showed that the R. fortunei cp genome has a typical quadripartite structure (200,997 bp). The lengths of the large single copy region (LSC), the inverted repeat regions (IR), and the small single copy region (SSC) were 109,151 bp, 2,604 bp, and 44,619 bp, respectively. A total of 147 unique genes were identified, including 99 protein-coding genes, 42 tRNA genes, and 6 rRNA genes. Leucine (11.51%) and cysteine (1.15%) were the most and least abundant amino acids, respectively. All 30 codons with obvious codon usage bias were A/U-ending codons. Among the 77 simple sequence repeats, the majority were mononucleotide A/T repeats located in the intergenic spacer region. Five gene regions showed high levels of nucleotide diversity (Pi > 0.03). The comparative genome analysis revealed 7 hotspot intergenic regions (trnI-rpoB, trnT-rpl16, rpoA-psbJ, rps7-rrn16, ndhI-rps16, rps16-rps19, and rrn16-trnI) with great potential as molecular markers for species authentication. Expansion and contraction were detected in the IR region of the R. fortunei cp genome. In the phylogenetic tree, R. fortunei was closely related to R. platypodum. This research will be beneficial for evolutionary and genetic diversity studies of R. fortunei and related species in the Ericaceae family.
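The nucleotide diversity statistic (Pi) cited in the abstract is conventionally the average proportion of sites that differ between pairs of aligned sequences. The sketch below computes it for one aligned region, assuming equal-length sequences and treating every differing character, gaps included, as a mismatch. The function name and toy alignment are hypothetical, and the abstract does not say which software or window settings the authors used.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Average pairwise proportion of differing sites across aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same non-zero length")
    pairs = list(combinations(aligned, 2))
    # For each pair, count mismatching columns and normalise by alignment length.
    total = sum(sum(x != y for x, y in zip(a, b)) / length for a, b in pairs)
    return total / len(pairs)


# Hypothetical toy alignment of one region from three accessions.
pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
is_hotspot = pi > 0.03  # threshold taken from the abstract
```

Real analyses usually slide a window along a whole-genome alignment and report Pi per window; this function covers a single window only.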

    MISSEL: a method to identify a large number of small species-specific genomic subsequences and its application to viruses classification

    Continuous improvements in next-generation sequencing technologies have led to ever-increasing collections of genomic sequences, which have not been easily characterized by biologists and whose analysis requires huge computational effort. The classification of species has emerged as one of the main applications of DNA analysis and has been addressed with several approaches, e.g., multiple alignment-, phylogenetic tree-, statistical- and character-based methods.
