
    Identification of high-efficiency 3′GG gRNA motifs in indexed FASTA files with ngg2

    CRISPR/Cas9 is emerging as one of the most-used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously from site to site. A recent report identified a novel motif, called the 3′GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans. Furthermore, the report's authors highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a Python command-line tool, ngg2, to identify 3′GG gRNA sites from indexed FASTA files. As a proof of concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I also scanned the genomes of pig (Sus scrofa) and African elephant (Loxodonta africana) to demonstrate the tool's utility in non-model organisms. I identified more than 60 million single-match 3′GG motifs in these genomes. More than 61% of all protein-coding genes in the reference genomes had at least one unique 3′GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein-coding genes have at least one unique, overlapping 3′GG gRNA. These identified sites can be used as a starting point in gRNA selection, and the ngg2 tool provides an important ability to identify 3′GG editing sites in any species with an available genome sequence.
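The kind of scan the abstract describes can be illustrated with a minimal sketch. It is not ngg2's actual implementation. It assumes the 3′GG motif means a 20-nt protospacer whose last two bases are GG, immediately followed by an NGG PAM, and it assumes "single match" means the protospacer occurs exactly once in the scanned sequence. The function names `find_3gg_sites` and `unique_sites` are hypothetical, and the input is a plain string rather than an indexed FASTA file.

```python
import re
from collections import Counter

# Assumption: 3'GG site = 18 arbitrary bases + GG (20-nt protospacer), then an NGG PAM.
# The lookahead lets finditer report overlapping candidate sites.
MOTIF = re.compile(r"(?=([ACGT]{18}GG)([ACGT]GG))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITE_LENGTH = 23  # 20-nt protospacer + 3-nt PAM


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_3gg_sites(seq):
    """Return (start, strand, protospacer, pam) tuples for both strands.

    start is the 0-based leftmost position of the 23-nt site on the forward strand.
    """
    seq = seq.upper()
    sites = []
    for m in MOTIF.finditer(seq):
        sites.append((m.start(), "+", m.group(1), m.group(2)))
    rc = reverse_complement(seq)
    for m in MOTIF.finditer(rc):
        # Map the reverse-strand offset back to forward-strand coordinates.
        start = len(seq) - (m.start() + SITE_LENGTH)
        sites.append((start, "-", m.group(1), m.group(2)))
    return sites


def unique_sites(sites):
    """Keep only sites whose protospacer appears once (assumed meaning of 'single match')."""
    counts = Counter(protospacer for _, _, protospacer, _ in sites)
    return [site for site in sites if counts[site[2]] == 1]
```

For a whole genome, the same scan would be run per chromosome, and a real tool would also need to handle ambiguous bases and memory limits, which this sketch ignores.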

    Identification of high-efficiency 3'GG gRNA motifs in indexed FASTA files with ngg2

    Abstract: CRISPR/Cas9 is emerging as one of the most-used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously site-to-site. A recent report identified a novel motif, called the 3'GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans. Furthermore, they highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a Python command-line tool, ngg2, to identify 3'GG gRNA sites from indexed FASTA files. As a proof-of-concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I also scanned the genomes of pig (Sus scrofa) and African elephant (Loxodonta africana) to demonstrate the utility in non-model organisms. I identified more than 60 million single match 3'GG motifs in these genomes. Greater than 61% of all protein coding genes in the reference genomes had at least one unique 3'GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein coding genes have at least one unique, overlapping 3'GG gRNA. These identified sites can be used as a starting point …
    PeerJ PrePrints | https://dx.doi.org/10.7287/peerj.preprints.969v2 | CC-BY 4.0 Open Access