11 research outputs found

    Assembly and Analysis of the Complete Mitochondrial Genome of Capsella bursa-pastoris

    No full text
    Shepherd’s purse (Capsella bursa-pastoris) is a cosmopolitan annual weed and a promising model plant for studying allopolyploidization in the evolution of angiosperms. Though plant mitochondrial genomes are a valuable source of genetic information, they are hard to assemble. At present, the complete mitogenome of C. rubella is the only one available for the genus Capsella. In this work, we have assembled the complete mitogenome of C. bursa-pastoris using high-precision PacBio SMRT third-generation sequencing technology. It is 287,799 bp long and contains 32 protein-coding genes, 3 rRNAs, 25 tRNAs corresponding to 15 amino acids, and 8 open reading frames (ORFs) supported by RNAseq data. Though many repeat regions have been found, none of them is longer than 1 kbp, and the most frequent structural variant originating from these repeats is present in only 4% of the mitogenome copies. The mitochondrial DNA sequence of C. bursa-pastoris differs from that of C. rubella, but not from that of C. orientalis, by two long inversions, suggesting that C. orientalis could be its maternal progenitor species. In total, 377 C to U RNA editing sites have been detected. All genes except cox1 and atp8 contain RNA editing sites, and most of them lead to non-synonymous changes of amino acids. Most of the identified RNA editing sites are identical to the corresponding RNA editing sites in A. thaliana.

    Origin and diversity of Capsella bursa-pastoris from the genomic point of view

    No full text
    Background: Capsella bursa-pastoris, a cosmopolitan weed of hybrid origin, is an emerging model organism for the study of early consequences of polyploidy, being a fast-growing annual and a close relative of Arabidopsis thaliana. The development of this model is hampered by the absence of a reference genome sequence. Results: We present here a subgenome-resolved chromosome-scale assembly and a genetic map of the genome of Capsella bursa-pastoris. It shows that the subgenomes are mostly colinear, with no massive deletions, insertions, or rearrangements in any of them. A subgenome-aware annotation reveals the lack of genome dominance: both subgenomes carry a similar number of genes. While most chromosomes can be unambiguously recognized as derived from either the paternal or the maternal parent, we also found a homeologous exchange between two chromosomes. It led to the emergence of two hybrid chromosomes; this event is shared between distant populations of C. bursa-pastoris. The whole-genome analysis of 119 samples belonging to C. bursa-pastoris and its parental species C. grandiflora/rubella and C. orientalis reveals introgression from C. orientalis but not from C. grandiflora/rubella. Conclusions: C. bursa-pastoris does not show genome dominance. In the earliest stages of evolution of this species, a homeologous exchange occurred; its presence in all present-day populations of C. bursa-pastoris indicates a single origin of this species. The evidence coming from whole-genome analysis challenges the current view that C. grandiflora/rubella was a direct progenitor of C. bursa-pastoris; we hypothesize that it was an extinct (or undiscovered) species sister to C. grandiflora/rubella.

    Assessment of ITS1, ITS2, 5′-ETS, and trnL-F DNA Barcodes for Metabarcoding of Poaceae Pollen

    No full text
    Grass pollen is one of the major causes of allergy. Aerobiological monitoring is a necessary element of the complex of anti-allergic measures, but the similar pollen morphology of Poaceae species makes it challenging to discriminate species in airborne pollen mixes, which impairs the quality of aerobiological monitoring. One of the solutions to this problem is the metabarcoding approach employing DNA barcodes for taxonomical identification of species in a mix by high-throughput sequencing of the pollen DNA. A diverse set of 14 grass species of different genera was selected to create a local reference database of nuclear ITS1, ITS2, 5′-ETS, and plastome trnL-F DNA barcodes. Sequences for the database were Sanger sequenced from live field and herbarium specimens and collected from GenBank. New Poaceae-specific primers for 5′-ETS were designed and tested to obtain a 5′-ETS region less than 600 bp long, suitable for high-throughput sequencing. The DNA extraction method for single-species pollen samples and mixes was optimized to increase the yield for amplification and sequencing of pollen DNA. Barcode sequences were analyzed and compared by the barcoding gap and intra- and interspecific distances. Their capability to correctly identify grass pollen was tested on artificial pollen mixes of varying complexity. Metabarcoding analysis of the artificial pollen mixes showed that the nuclear DNA barcodes ITS1, ITS2, and 5′-ETS were more efficient than the plastome barcode in both amplification from pollen DNA and identification of grass species. Although the metabarcoding results were qualitatively congruent with the actual composition of the pollen mixes in most cases, the quantitative results based on read counts did not match the actual ratio of pollen grains in the mixes.

    Capsella bursa-pastoris de novo assembly

    No full text
    Files with genome assembly and annotation data for Capsella bursa-pastoris: Cbp_msk.genome.assembly.fasta (genome assembly sequences); Cbp_msk.anot.gff (genome annotation); Cbp_msk.anot.cds.fa (CDS sequences for the genome annotation)

    Additional file 1 of Origin and diversity of Capsella bursa-pastoris from the genomic point of view

    No full text
    Additional file 1: Fig. S1. Scheme for data acquisition for the genetic map. Fig. S2. An example of a contig fragment with a colored state of the markers. Due to the low coverage some of the markers are "noisy". Fig. S3. An example of chimeric assembly. Two adjacent markers located at a distance of ~90 kbp have 33 "recombinations" per 100 chromosomes, which is impossible and indicates independent inheritance of markers. Fig. S4. An example of correction of a local chimeric assembly. a Insertion of a foreign fragment(s) and b view of the site after correction. The green dashed lines show the signals indicating the proximity of the sites. Fig. S5. Simulation of the subgenome separation procedure. Example of the coverage by reads of the parental species of some reference contigs created from the genomes of C. orientalis and C. rubella for subgenome separation. Fig. S6. Analysis of introgression by admixture analysis in C. bursa-pastoris, for K=6. The colors of the line names correspond to the populations in Fig. 4b. Fig. S7. Fbranch matrix plotted using Dsuite f4-statistics results for different tree topologies of parental species and lineages of C. bursa-pastoris for subgenome O. a for ((((ME,EU),ASI),Co),Cgr-Outgroup) tree; b for ((((ASI,EU),ME),Co),Cgr-Outgroup) tree; c for ((((ASI,ME),EU),Co),Cgr-Outgroup) tree; d for (((ME,EU),ASI),Co-Outgroup) tree; e for (((ASI,EU),ME),Co-Outgroup) tree; f for (((ASI,ME),EU),Co-Outgroup) tree. Co – C. orientalis, Cgr – C. rubella/C. grandiflora, and ASI, ME, EU – lineages of C. bursa-pastoris. Fig. S8. Fbranch matrix plotted using Dsuite f4-statistics results for different tree topologies of parental species and lineages of C. bursa-pastoris for subgenome R. a for ((((ME,EU),ASI),Cgr),Co-Outgroup) tree; b for ((((ASI,EU),ME),Cgr),Co-Outgroup) tree; c for ((((ASI,ME),EU),Cgr),Co-Outgroup) tree; d for (((ME,EU),ASI),Cgr-Outgroup) tree; e for (((ASI,EU),ME),Cgr-Outgroup) tree; f for (((ASI,ME),EU),Cgr-Outgroup) tree. Co – C. orientalis, Cgr – C. rubella/C. grandiflora, and ASI, ME, EU – lineages of C. bursa-pastoris. Fig. S9. F2 data processing. Fig. S10. Building the SNP Database

    Aerobiological Monitoring and Metabarcoding of Grass Pollen

    No full text
    Grass pollen is one of the leading causes of pollinosis, affecting 10–30% of the world’s population. The allergenicity of pollen differs among Poaceae species and is estimated to range from moderate to high. Aerobiological monitoring is a standard method that allows one to track and predict the dynamics of allergen concentration in the air. Poaceae is a stenopalynous family, and thus grass pollen can usually be identified only at the family level with optical microscopy. Molecular methods, in particular the DNA barcoding technique, can be used to conduct a more accurate analysis of aerobiological samples containing the DNA of various plant species. This study aimed to test the possibility of using the ITS1 and ITS2 nuclear loci for determining the presence of grass pollen in air samples via metabarcoding and to compare the analysis results with the results of phenological observations. Based on the high-throughput sequencing data, we analyzed the changes in the composition of aerobiological samples taken in the Moscow and Ryazan regions for three years during the period of active flowering of grasses. Ten genera of the Poaceae family were detected in airborne pollen samples. The representation of most of them was similar for the ITS1 and ITS2 barcodes. At the same time, in some samples, the presence of specific genera was supported by only one sequence: either ITS1 or ITS2. Based on the analysis of the abundance of both barcode reads in the samples, the following order could describe the change with time in the dominant genera in the air: Poa, Alopecurus, and Arrhenatherum in early to mid-June; Lolium, Bromus, Dactylis, and Briza in mid- to late June; Phleum and Elymus in late June to early July; and Calamagrostis in early to mid-July. In most samples, the number of taxa found via metabarcoding analysis was higher than that in the phenological observations. The semi-quantitative analysis of high-throughput sequencing data reflects well the abundance of only the major grass species at the flowering stage.