303 research outputs found

    Quantifying CRISPR off-target effects

    Recent advances in the era of genetic engineering have significantly improved our ability to make precise changes in the genomes of human cells. Throughout the years, clinical trials based on gene therapies have led to the cure of diseases such as X-linked severe combined immunodeficiency (SCID-X1), adenosine deaminase deficiency (ADA-SCID) and Wiskott–Aldrich syndrome. Despite the success gene therapy has had, there is still the risk of genotoxicity due to the potential oncogenesis introduced by utilising viral vectors. Research has focused on alternative strategies like genome editing without viral vectors as a means to reduce genotoxicity introduced by the viral vectors. Although there is an extensive use of RNA-guided genome editing via the clustered regularly interspaced short palindromic repeats (CRISPR) and associated protein-9 (Cas9) technology for biomedical research, its genome-wide target specificity and its genotoxic side effects remain controversial. There have been reports of on- and off-target effects created by CRISPR–Cas9 that can include small and large indels and inversions, highlighting the potential risk of insertional mutagenesis. In the last few years, a plethora of in silico, in vitro and in vivo genome-wide assays have been introduced with the sole purpose of profiling these effects. Here, we are going to discuss the genotoxic obstacles in gene therapies and give an up-to-date overview of methodologies for quantifying CRISPR–Cas9 effects

    Predicting Off-target Effects in CRISPR-Cas9 System using Graph Convolutional Network

    CRISPR-Cas9 is a powerful genome editing technology that has been widely applied in target gene repair and gene expression regulation. One of the main challenges for the CRISPR-Cas9 system is the occurrence of unexpected cleavage at some sites (off-targets), and predicting them is necessary given their relevance to gene editing research. Very few deep learning models have been developed so far that predict the off-target propensity of single guide RNA (sgRNA) at specific DNA fragments by using artificial feature extraction operations and machine learning techniques. Unfortunately, they implement a convoluted process that is difficult for researchers to understand and implement. This thesis focuses on developing a novel graph-based approach to predict off-target efficacy of sgRNA in the CRISPR-Cas9 system that is easy for researchers to understand and replicate. This is achieved by creating a graph with sequences as nodes and by performing link prediction using a Graph Convolutional Network (GCN) to predict the presence of links between sgRNA and off-target inducing target DNA sequences. Features for the sequences are extracted from within the sequences

    Generalisable Methods for Improving CRISPR Efficiency and Outcome Specificity using Machine Learning Algorithms

    CRISPR (clustered regularly interspaced short palindromic repeats) based genome editing has become a popular tool for a range of disciplines, including microbiology, agricultural science, and health. Driving these applications is the ability of the "programmable" system to target a predefined location in the genome. A single guide RNA (sgRNA) defines the target through Watson-Crick base pairing, and a class 2 type II CRISPR associated protein 9 (Cas9) nuclease cleaves the target, resulting in a double-strand break (DSB). This activates DNA repair, and depending on the repair pathway initiated, can result in arbitrary insertions/deletions or a predefined variant. Despite the versatility and ease of design enabled by this RNA-guided nuclease, it lacks specificity, regarding off-target effects, and efficiency, regarding the rate of successful editing outcomes. The overarching hypothesis of my thesis is that the disadvantages of CRISPR systems can be addressed by using machine learning to train generalisable models on existing and novel datasets. One pathway that demonstrates the need for prediction models is homology directed repair (HDR). HDR enables researchers to induce nearly any editing outcome; however, it is inefficient. With incomplete knowledge of its kinetics, no models existed for predicting its efficiency. I generated a novel dataset representing the efficiency of HDR. Using the Random Forests algorithm, I identified the sgRNA and the 3' region of the template as modulators of HDR efficiency. This novel finding relates to the kinetics of template interaction during HDR. Even with efficient gene editing, a potential problem is unwanted side effects, such as embryonic lethality. This can be solved by using CRISPR to create conditional knockout alleles, to control when and where knockouts occur.
To investigate the efficiency of this process, I used statistical analyses and the Random Forest algorithm to analyse a dataset generated by a consortium of 19 laboratories. I identified the inherent inefficiency of this method as defined by the efficiency of two simultaneous HDR events. Other experimental variables, like reagent concentrations or technician skill level, had no significant influence on efficiency. Because of the unrivalled versatility of this method, I created a statistical model for forecasting the efficiency of this technique from a low number of attempts, aiming to overcome its inherent inefficiency. While Cas9 is the most cited CRISPR system, alternative CRISPR systems can further expand the gene editing repertoire. To support the uptake of the more-recent Cas12a, I performed a comprehensive comparison between the two nucleases. I found support for Cas12a having a superior specificity. Despite this, editing outcome and efficiency prediction tools for Cas12a were scarce. Aiming to address this, I trained a Cas12a cleavage efficiency prediction model on representative data. This outperformed the current top model despite the dataset being 300x smaller, demonstrating the importance of clean data. Altogether, this thesis improves the knowledge of different CRISPR gene editing techniques. These findings can enable researchers to design efficient experiments as well as provide researchers guidance where certain techniques may be inherently inefficient. As well as resulting in CUNE (Computational Universal Nucleotide Editor) and Cas12aRF, it also identifies the generalisability of prediction models due to the high degree of influence on efficiency by the sgRNA and repair template design

    In silico design and analysis of targeted genome editing with CRISPR

    CRISPR/Cas systems have become a tool of choice for targeted genome engineering in recent years. Scientists around the world want to accelerate their research with the use of CRISPR/Cas systems, but are being slowed down by the need to understand the technology and computational steps needed for design and analysis. However, bioinformatics tools for the design and analysis of CRISPR experiments are being created to aid those scientists. For the design of CRISPR targeted genome editing experiments, CHOPCHOP has become one of the most cited and most used tools. After the initial publication of CHOPCHOP, our understanding of the CRISPR system underwent a scientific evolution. I therefore updated CHOPCHOP to accommodate the latest discoveries, such as designs for nickase and isoform targeting, machine learning algorithms for efficiency scoring and repair profile prediction, in addition to many others. At the other end of the genome engineering spectrum, there is a need for analysis of the data and validation of mutants. For the analysis of CRISPR targeted genome editing experiments, I have created ampliCan, an R package that, with the use of ‘editing aware’ alignment and automated normalization, performs precise estimation of editing efficiencies for thousands of CRISPR experiments. I have benchmarked ampliCan to display its strengths at handling a variety of editing indels, filtering out contaminant reads and performing HDR editing estimates. Both of these tools were developed with the idea that biologists without a deep understanding of CRISPR should be able to use them, and at the same time seasoned experts can adjust the settings for their purposes. I hope that these tools will facilitate adoption of CRISPR systems for targeted genome editing and indirectly allow for great discoveries in the future

    The Current State and Future of CRISPR-Cas9 gRNA Design Tools

    Recent years have seen the development of computational tools to assist researchers in performing CRISPR-Cas9 experiments optimally. More specifically, these tools aim to maximize on-target activity (guide efficiency) while also minimizing potential off-target effects (guide specificity) by analyzing the features of the target site. Nonetheless, currently available tools cannot robustly predict experimental success, as prediction accuracy depends on the approximations of the underlying model and how closely the experimental setup matches the data the model was trained on. Here, we present an overview of the available computational tools, their current limitations and future considerations. We discuss new trends around personalized health by taking genomic variants into account when predicting target sites, as well as other governing factors that can improve prediction accuracy

    Tools for experimental and computational analyses of off-target editing by programmable nucleases

    Genome editing using programmable nucleases is revolutionizing life science and medicine. Off-target editing by these nucleases remains a considerable concern, especially in therapeutic applications. Here we review tools developed for identifying potential off-target editing sites and compare the ability of these tools to properly analyze off-target effects. Recent advances in both in silico and experimental tools for off-target analysis have generated remarkably concordant results for sites with high off-target editing activity. However, no single tool is able to accurately predict low-frequency off-target editing, presenting a bottleneck in therapeutic genome editing, because even a small number of cells with off-target editing can be detrimental. Therefore, we recommend that at least one in silico tool and one experimental tool should be used together to identify potential off-target sites, and amplicon-based next-generation sequencing (NGS) should be used as the gold standard assay for assessing the true off-target effects at these candidate sites. Future work to improve off-target analysis includes expanding the true off-target editing dataset to evaluate new experimental techniques and to train machine learning algorithms; performing analysis using the particular genome of the cells in question rather than the reference genome; and applying novel NGS techniques to improve the sensitivity of amplicon-based off-target editing quantification. Off-target effects of programmable nucleases remain a critical issue for therapeutic applications of genome editing. This review compares experimental and computational tools for off-target analysis and provides recommendations for better assessments of off-target effects

    Development and application of CRISPR-based genetic tools in Bacillus species and Bacillus phages

    Recently, the clustered regularly interspaced short palindromic repeats (CRISPR) system has been developed into a precise and efficient genome editing tool. Since its discovery as an adaptive immune system in prokaryotes, it has been applied in many different research fields including biotechnology and medical sciences. The high demand for rapid, highly efficient and versatile genetic tools to thrive in bacteria-based cell factories accelerates this process. This review mainly focuses on significant advancements of the CRISPR system in Bacillus subtilis, including the achievements in gene editing, and on problems still remaining. Next, we comprehensively summarize this genetic tool's up-to-date development and utilization in other Bacillus species, including B. licheniformis, B. methanolicus, B. anthracis, B. cereus, B. smithii and B. thuringiensis. Furthermore, we describe the current application of CRISPR tools in phages to increase Bacillus hosts' resistance to virulent phages and phage genetic modification. Finally, we suggest potential strategies to further improve this advanced technique and provide insights into future directions of CRISPR technologies for rendering Bacillus species cell factories more effective and more powerful

    Engineering Proteins by Domain Insertion

    Protein domains are structural and functional subunits of proteins. The recombination of existing domains is a source of evolutionary innovation, as it can result in new protein features and functions. Inspired by nature, protein engineering commonly uses domain recombination in order to create artificial proteins with tailor-made properties. Customized control over protein activity, for instance, can be achieved by harnessing switchable domains and functionally linking them to effector domains. Many natural protein domains exhibit conformational changes in response to exogenous triggers. The insertion of light-switchable receptor domains into an effector protein of choice, for instance, allows the control of effector activity with light. The resulting optogenetic proteins represent powerful tools for the investigation of dynamic cellular processes with high precision in time and space. On top, optogenetic proteins enable manifold biotechnological applications and they are even considered potential candidates for future therapeutics. In this study, we first focused on CRISPR-Cas9 genome editing and applied a domain insertion strategy to genetically encoded inhibitors of the CRISPR nuclease from Neisseria meningitidis (NmeCas9), which due to its small size and high DNA sequence-specificity is of great interest for CRISPR genome editing applications. Fusing stabilizing domains to the NmeCas9 inhibitory protein AcrIIC1 allowed us to boost its inhibitory effect, thereby yielding a potent gene editing off-switch. Furthermore, the insertion of the light-responsive LOV2 domain from Avena sativa into AcrIIC3, the most potent inhibitor of NmeCas9, enabled the optogenetic control of gene editing via light-dependent NmeCas9 inhibition. Further investigation of the engineered inhibitors revealed the potential these proteins could have with respect to safe-guarding of the CRISPR technology by selectively reducing off-target editing. 
The laborious optimization that the engineered CRISPR inhibitors required at the time motivated us to investigate more systematically the possibilities and constraints of protein engineering by domain insertion using an unbiased insertion approach. Previously, single protein domains were usually introduced only at a few rationally selected sites into target proteins. Here, we inserted up to five structurally and functionally unrelated domains into several different candidate effector proteins at all possible positions. The resulting libraries of protein hybrids were screened for activity by fluorescence-activated cell sorting (FACS) and subsequent next-generation sequencing (Flow-seq). Training machine learning models on the resulting comprehensive datasets allowed us to dissect parameters that affect domain insertion tolerance and revealed that sequence conservation statistics are the most powerful predictors for domain insertion success. Finally, extending our experimental Flow-seq pipeline towards the screening of engineered, switchable effector variants yielded two potent optogenetic derivatives of the E. coli transcription factor AraC. These novel hybrids will enable the co-regulation of bacterial gene expression by light and chemicals. Taken together, our study showcases the design of functionally diverse protein switches for the control of gene editing and gene expression in mammalian cells and E. coli, respectively. In addition, the generation of large domain insertion datasets enabled, for the first time, the unbiased investigation of domain insertion tolerance in several evolutionarily unrelated proteins. Our study showcases the manifold opportunities and remaining challenges behind the engineering of proteins with new properties and functionalities by domain recombination

    Development of a high-throughput CRISPR/Cas9 based fluorescent reporter to study DNA double-strand break repair choices.

    DNA double-stranded breaks (DSBs) are the most toxic types of DNA lesions. Cells have several strategies to repair such lesions that can be grouped into end-protection and end-resection coupled mechanisms. To study the DNA DSB repair choices in vitro, we developed and employed Color-Assay-Tracing-Repair (CAT-R) as a dual-fluorescence reporter, taking advantage of the highly efficient CRISPR/Cas9 system to induce DSBs. We can distinguish point-mutations (InDels) from large-deletions/insertions in vitro and simultaneously quantify the rates of NHEJ vs end-resection based DNA repair. We combined this system with high-throughput flow cytometry and studied the DNA DSB repair choices. On the one hand, we evaluated the efficiency and potency of 26 different pharmacological compounds that are currently in clinical/preclinical studies targeting ATM, DNA-PK, ATR, and PARP, and we identified key differences in the way these compounds affect DNA DSB repair choice. On the other hand, in order to find new players that can regulate the choice between end-protection and end-resection, we performed a custom designed CRISPR/Cas9 arrayed genetic screen with CAT-R and evaluated the functions of 420 individual DNA repair components. For this, we developed a highly efficient transfection platform for arrayed CRISPR/Cas9 screens based on solid phase transfection with synthetic gRNA:tracrRNA complexes. In addition to known components of DNA DSB repair, we uncovered novel molecules that can tip the balance of the DNA DSB repair choice either towards end-protection or end-resection. In summary, CAT-R can be used to assess the functions of DNA DSB repair components in genetic/chemical screens in a variety of mammalian cells