60 research outputs found

    In silico Detection of EMS-Induced Mutations in an Arabis alpina Population

    Arabis alpina (Alpine rock-cress weed) is a flowering plant native to mountainous environments of the northern hemisphere. We analyzed 1,454,931,853 next-generation sequencing (NGS) reads from 38 sequenced Arabis alpina mutant individuals that were mutagenized using the chemical mutagen ethyl methanesulfonate (EMS). Using the BWA short-read mapper, 95% (1,387,167,658) of the NGS reads mapped to the Arabis alpina reference genome version 4. Using the SAMtools variant-detection algorithm, we detected a total of 1,457,917 mutations, with an average of 38,366 mutations per sample. Overall, the predicted mutations include 971,252 high-quality single nucleotide polymorphisms (SNPs) and 168,783 high-quality insertions and deletions (INDELs).
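
    The summary figures in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract; the variable names are illustrative and are not taken from the study's own scripts.

```python
# Figures quoted in the abstract above; variable names are illustrative only.
total_reads = 1_454_931_853
mapped_reads = 1_387_167_658
sample_count = 38
total_mutations = 1_457_917
high_quality_snps = 971_252
high_quality_indels = 168_783

# Fraction of reads that mapped to the reference genome.
mapping_rate = mapped_reads / total_reads

# Mean number of detected mutations per sequenced individual.
mean_mutations_per_sample = total_mutations / sample_count

# High-quality calls are a subset of all detected mutations.
high_quality_total = high_quality_snps + high_quality_indels

print(f"mapping rate: {mapping_rate:.1%}")
print(f"mean mutations per sample: {mean_mutations_per_sample:,.0f}")
print(f"high-quality variants: {high_quality_total:,} of {total_mutations:,} calls")
```

    Rounded to whole numbers, the mapping rate and the per-sample mean agree with the 95% and 38,366 reported in the abstract.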

    Analysis of Natural Variation in 30 Sorghum Landraces

    Sorghum is a next-generation crop species for food grain, feedstock, beverage and biofuel production. To discover highly desirable agronomic traits in sorghum, we analyzed 3.42 billion DNA sequences derived from 30 sorghum landraces sequenced using next-generation sequencing (NGS) technology. Using the BWA short-read aligner, 97% of the sequenced reads mapped successfully to the sorghum reference genome. Using the SAMtools variant-calling algorithm, we detected 68.14 million mutations, including 61.32 million DNA base substitutions or single nucleotide polymorphisms (SNPs) and 6.81 million insertions and deletions (INDELs). In our preliminary analysis using the snpEff variant annotation tool, we predicted a total of 134,207 high-impact mutations and 1.81 million moderate-impact mutations in the 30 sequenced sorghum landraces.

    Physcomitrella patens DCL3 Is Required for 22–24 nt siRNA Accumulation, Suppression of Retrotransposon-Derived Transcripts, and Normal Development

    Endogenous 24 nt short interfering RNAs (siRNAs), derived mostly from intergenic and repetitive genomic regions, constitute a major class of endogenous small RNAs in flowering plants. Accumulation of Arabidopsis thaliana 24 nt siRNAs requires the Dicer family member DCL3, and clear homologs of DCL3 exist in both flowering and non-flowering plants. However, the absence of a conspicuous 24 nt peak in the total RNA populations of several non-flowering plants has raised the question of whether this class of siRNAs might, in contrast to the ancient 21 nt microRNAs (miRNAs) and 21–22 nt trans-acting siRNAs (tasiRNAs), be an angiosperm-specific innovation. Analysis of non-miRNA, non-tasiRNA hotspots of small RNA production within the genome of the moss Physcomitrella patens revealed multiple loci that consistently produced a mixture of 21–24 nt siRNAs with a peak at 23 nt. These Pp23SR loci were significantly enriched in transposon content, depleted in overlap with annotated genes, and typified by dense concentrations of the 5-methyl cytosine (5mC) DNA modification. Deep sequencing of small RNAs from two independent Ppdcl3 mutants showed that the P. patens DCL3 homolog is required for the accumulation of 22–24 nt siRNAs, but not 21 nt siRNAs, at Pp23SR loci. The 21 nt component of Pp23SR-derived siRNAs was also unaffected in the RNA-dependent RNA polymerase mutant Pprdr6. Transcriptome-wide, Ppdcl3 mutants failed to accumulate 22–24 nt small RNAs from repetitive regions, while transcripts from two abundant families of long terminal repeat (LTR) retrotransposon-associated reverse transcriptases were up-regulated. Ppdcl3 mutants also displayed an acceleration of leafy gametophore production, suggesting that repetitive siRNAs may play a role in the development of P. patens. We conclude that intergenic/repeat-derived siRNAs are indeed a broadly conserved, distinct class of small regulatory RNAs within land plants.

    Volvox_scan: A Clustering Algorithm for Predicting Related Mutants in a Sequenced Population

    In forward genetics studies, the accurate detection of bona fide induced DNA mutations can be negatively impacted by contaminants introduced through DNA library and sample preparation errors, DNA sequencing and alignment errors, sample mislabeling and pollen contamination (in plants). These challenges reduce the accuracy of variant-calling algorithms for predicting DNA mutations in next-generation sequencing (NGS) datasets, leading to false-positive detections. For large-scale mutant population studies utilizing independently mutagenized individuals, filtering out common (or shared) variants is an effective way to mitigate false positives. Although filtering of common variants is a widely used technique, it can unintentionally remove genuine mutations, and so produce false negatives, if the sequenced mutant population includes mutants that are genetically related. Hence, determining which mutants are genetically related would be beneficial for downstream variant-call filtering. We implemented an efficient mutation clustering algorithm (volvox_scan) for detecting subpopulations of mutants in a sequenced population that are likely genetically related. We demonstrate the efficiency of the volvox_scan algorithm in uncovering clusters of likely related mutants from datasets of several large-scale mutant population studies.
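
    The abstract above does not describe how volvox_scan works internally, so the following is not that algorithm. It is a minimal sketch of the general idea, under stated assumptions: each mutant is a set of variant calls, mutants are linked when the Jaccard similarity of their variant sets reaches a threshold, and linked mutants are grouped by single-linkage clustering. The function names, the 0.5 threshold, and the toy data are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of variants common to two samples (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_related(variants_by_sample, min_similarity=0.5):
    """Single-linkage clustering of samples by variant overlap (union-find)."""
    parent = {s: s for s in variants_by_sample}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for a, b in combinations(variants_by_sample, 2):
        if jaccard(variants_by_sample[a], variants_by_sample[b]) >= min_similarity:
            parent[find(a)] = find(b)

    clusters = {}
    for s in variants_by_sample:
        clusters.setdefault(find(s), set()).add(s)
    return list(clusters.values())


def filter_common(variants_by_sample, clusters):
    """Drop variants seen in more than one cluster; keep those shared within one."""
    cluster_of = {s: i for i, members in enumerate(clusters) for s in members}
    seen_in = {}
    for sample, variants in variants_by_sample.items():
        for v in variants:
            seen_in.setdefault(v, set()).add(cluster_of[sample])
    return {
        sample: {v for v in variants if len(seen_in[v]) == 1}
        for sample, variants in variants_by_sample.items()
    }


# Toy data (made up): variants are (chromosome, position, alternate base) tuples.
samples = {
    "m1": {("chr1", 100, "A"), ("chr1", 250, "T"), ("chr2", 40, "A"), ("chr3", 7, "T")},
    "m2": {("chr1", 100, "A"), ("chr1", 250, "T"), ("chr2", 40, "A"), ("chr3", 7, "T"),
           ("chr2", 88, "T")},
    "m3": {("chr3", 7, "T"), ("chr4", 900, "A"), ("chr5", 12, "T")},
}
clusters = cluster_related(samples)
filtered = filter_common(samples, clusters)
```

    In this toy population, m1 and m2 are grouped together, so the variants they share with each other survive filtering, while the variant that also appears in the unrelated m3 is removed from all three. A naive filter that dropped every variant seen in more than one sample would have discarded all of m1's calls.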

    Mutation Detection Software for Bench Scientists

    The accurate detection of bona fide causal DNA mutations in a sequenced sample is critical for gene function discovery studies. The rapid sequencing rate and high-throughput nature of next-generation sequencing (NGS) technologies have greatly accelerated the pace of discoveries in molecular genetics. The computational task of detecting DNA variants in large-scale NGS datasets may be daunting for the typical bench scientist. In an era of tremendous advances in computing and reliable cyberinfrastructure availability, empowering wet-lab scientists to independently perform computational analysis on sequencing datasets, without requiring detailed knowledge of the underlying hardware and software infrastructure, is a very desirable goal. We describe the initial implementation of Variant-Scope, a user-friendly DNA variant-detection software package for the analysis of NGS datasets. Variant-Scope is implemented in Python and provides an intuitive user interface (UI), a setup configuration wizard, a computation status viewer, and a variety of summary charts and figures generated for an experimental study. Variant-Scope integrates cutting-edge NGS analysis software suites such as BWA (for read alignment), SAMtools (for variant detection), BEDTools (for genome arithmetic), and SnpEff (for variant-effect annotation), and is distributed as a package for users running Windows, macOS, or Linux. Variant-Scope is a very flexible package and can handle sequencing datasets ranging from whole-genome sequences of single individuals to large populations. The package is freely available and has been successfully utilized for the detection of both fast-neutron- and EMS-induced DNA variants in a variety of mutagenized populations.

    Re-Evaluation of Reportedly Metal Tolerant Arabidopsis thaliana Accessions

    Santa Clara, Limeport, and Berkeley are Arabidopsis thaliana accessions previously identified as diversely metal resistant. Yet these same accessions were determined to be genetically indistinguishable from the metal-sensitive Col-0. We robustly tested tolerance for Zn, Ni and Cu, and genetic relatedness, by growing these accessions under a range of Ni, Zn and Cu concentrations for three durations in multiple replicates. Neither metal resistance nor variance in growth was detected between them and Col-0. We re-sequenced the genomes of these accessions and all stocks available for each accession. In all cases they were nearly indistinguishable from the standard laboratory accession Col-0. As Santa Clara was allegedly collected from the Jasper Ridge serpentine outcrop in California, USA, we investigated the possibility of extant A. thaliana populations adapted to serpentine soils. Botanically vouchered Arabidopsis accessions in the Jepson database were overlaid with soil maps of California. This provided no evidence of A. thaliana collections from serpentine sites in California. Thus, our work demonstrates that the Santa Clara, Berkeley and Limeport accessions are not metal tolerant and not genetically distinct from Col-0, and that there are no known serpentine-adapted populations or accessions of A. thaliana.

    Supplemental File S17

    SIFT prediction results for the missense-annotated homozygous EMS-induced SNPs in all the 586 sequenced individuals

    Supplemental File S16

    Function description for the genes containing the SnpEff-annotated likely EMS-induced indels in all the 586 sorghum individuals

    Supplemental File S25

    Sample scripts for the variant detection and annotation pipeline