9 research outputs found

    Understanding key features of bacterial restriction-modification systems through quantitative modeling

    Restriction-modification (R-M) systems are rudimentary bacterial immune systems. The main components include the restriction enzyme (R), which cuts specific unmethylated DNA sequences, and the methyltransferase (M), which protects the same DNA sequences. The expression of R-M system components is considered to be tightly regulated, to ensure successful establishment in a naïve bacterial host. R-M systems are organized in different architectures (convergent or divergent) and are characterized by different features, i.e. binding cooperativities, dissociation constants of dimerization, and translation rates, which ensure this tight regulation. It has been proposed that R-M systems should exhibit certain dynamical properties during system establishment, such as: i) a delayed expression of R with respect to M, ii) a fast transition of R from the "OFF" to the "ON" state, iii) increased stability of the steady-state levels of the toxic molecule (R). It is, however, unclear how different R-M system features and architectures ensure these dynamical properties, particularly since it is hard to address this question experimentally.
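The first two proposed properties (delayed R expression and a switch-like OFF-to-ON transition) can be illustrated with a toy model. The sketch below is not the model used in the work: it assumes a hypothetical regulator C that must accumulate before R is produced, a Hill-type activation term, and arbitrary rate constants chosen only for illustration.

```python
def simulate(t_end=10.0, dt=0.001, hill_n=4, k_half=0.5):
    """Forward-Euler integration of a toy M / C / R expression model.

    All parameters are illustrative assumptions, not fitted values.
    Returns a list of (time, M, R) tuples.
    """
    a_m = a_c = a_r = 1.0   # synthesis rates (arbitrary units)
    decay = 1.0             # shared first-order decay/dilution rate
    m = c = r = 0.0         # system enters a naive host: nothing expressed yet
    trajectory = []
    for step in range(int(round(t_end / dt)) + 1):
        trajectory.append((step * dt, m, r))
        # Cooperative (Hill) activation of R by the hypothetical regulator C.
        activation = c**hill_n / (k_half**hill_n + c**hill_n)
        dm = a_m - decay * m
        dc = a_c - decay * c
        dr = a_r * activation - decay * r
        m += dt * dm
        c += dt * dc
        r += dt * dr
    return trajectory


def time_to_half(trajectory, column):
    """First time at which a species reaches half of its final level."""
    final = trajectory[-1][column]
    for row in trajectory:
        if row[column] >= 0.5 * final:
            return row[0]
    return None


traj = simulate()
t_half_m = time_to_half(traj, 1)  # methyltransferase
t_half_r = time_to_half(traj, 2)  # restriction enzyme
print(f"M half-rise time: {t_half_m:.2f}, R half-rise time: {t_half_r:.2f}")
```

In this sketch R reaches half of its final level later than M, because R production stays low until C approaches `k_half`; a larger `hill_n` makes the transition sharper.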

    Methods for Epigenetic Analyses from Long-Read Sequencing Data

    Epigenetics, particularly the study of DNA methylation, is a cornerstone field for our understanding of human development and disease. DNA methylation has been included in the "hallmarks of cancer" due to its important function as a biomarker and its contribution to carcinogenesis and cancer cell plasticity. Long-read sequencing technologies, such as the Oxford Nanopore Technologies platform, have advanced the study of structural variations, while at the same time allowing direct measurement of DNA methylation on the same reads. With this, new avenues of analysis have opened up, such as long-range allele-specific methylation analysis, methylation analysis on structural variations, or relating nearby epigenetic modalities on the same read to one another. Basecalling and methylation calling of Nanopore reads is a computationally expensive task that requires complex machine learning architectures. Read-level methylation calls require different approaches to data management and analysis than those developed for methylation frequencies measured from short-read technologies or array data. Read- and genome-associated DNA methylation calls are 2-dimensional and, together with methylation caller uncertainties, far costlier to store than 1-dimensional methylation frequencies. Methods for storage, retrieval, and analysis of such data therefore require careful consideration. Downstream analysis tasks, such as methylation segmentation or differential methylation calling, can benefit from read information and allow uncertainty propagation. These avenues had not been considered in existing tools. In my work, I explored the potential of long-read DNA methylation analysis and tackled some of the challenges of data management and downstream analysis using state-of-the-art software architecture and machine learning methods. 
I defined a storage standard for reference-anchored and read-assigned DNA methylation calls, including methylation calling uncertainties and read annotations such as haplotype or sample information. This storage container is defined as a schema for the Hierarchical Data Format version 5 (HDF5), includes an index for rapid access to genomic coordinates, and is optimized for parallel computing with even load balancing. It further includes a Python API for creation, modification, and data access, including convenience functions for the extraction of important quality statistics via a command line interface. Furthermore, I developed software solutions for the segmentation and differential methylation testing of DNA methylation calls from Nanopore sequencing. This implementation takes advantage of the performance benefits provided by my high-performance storage container. It includes a Bayesian methylome segmentation algorithm which allows for the consensus instance segmentation of multiple sample- and/or haplotype-assigned DNA methylation profiles, while considering methylation calling uncertainties. Based on this segmentation, the software can then perform differential methylation testing and provides a large number of options for statistical testing and multiple testing correction. I benchmarked all tools on both simulated and publicly available real data, and show the performance benefits compared to previously existing and concurrently developed solutions. Next, I applied the methods to a cancer study on a chromothriptic cancer sample from a patient with Sonic Hedgehog Medulloblastoma. I here report regulatory genomic regions differentially methylated before and after treatment, allele-specific methylation in the tumor, as well as methylation on chromothriptic structures. Finally, I developed specialized methylation callers for the combined DNA methylation profiling of CpG, GpC, and context-free adenine methylation. 
These callers can be used to measure chromatin accessibility in a NOMe-seq-like setup, showing the potential of long-read sequencing for the profiling of transcription factor co-binding. In conclusion, this thesis presents and subsequently benchmarks new algorithmic and infrastructural solutions for the analysis of DNA methylation data from long-read sequencing.
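The idea of read-level methylation calls stored with their uncertainties and indexed by genomic coordinate can be sketched in a few lines. This is an in-memory illustration only, not the HDF5 container or Python API described in the abstract: the class names, the log-likelihood-ratio (LLR) representation of uncertainty, and the threshold value are all assumptions made for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class MethylCall:
    """One read-level call at one genomic position (hypothetical layout)."""
    pos: int        # reference coordinate
    read_id: str    # read the call was made on
    llr: float      # assumed uncertainty: LLR > 0 favors "methylated"


class CallStore:
    """Calls sorted by position, so a region lookup is a binary search."""

    def __init__(self, calls):
        self._calls = sorted(calls, key=lambda c: c.pos)
        self._positions = [c.pos for c in self._calls]

    def query(self, start, end):
        """Return all calls with start <= pos < end."""
        lo = bisect_left(self._positions, start)
        hi = bisect_left(self._positions, end)
        return self._calls[lo:hi]


def methylation_rate(calls, min_abs_llr=2.0):
    """Fraction of confident calls that are methylated; None if no confident calls."""
    confident = [c for c in calls if abs(c.llr) >= min_abs_llr]
    if not confident:
        return None
    return sum(c.llr > 0 for c in confident) / len(confident)


store = CallStore([
    MethylCall(100, "read1", 3.0),
    MethylCall(100, "read2", -4.0),
    MethylCall(150, "read1", 0.5),   # uncertain call, filtered out below
    MethylCall(200, "read2", 5.0),
])
region = store.query(100, 200)
rate = methylation_rate(region)
```

Keeping the per-read LLR rather than a single frequency per site is what makes the data 2-dimensional (reads by positions), and it lets downstream steps decide how to treat uncertain calls instead of discarding that information at storage time.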

    Biophysical modeling of bacterial restriction-modification systems

    Restriction-modification (R-M) and CRISPR-Cas systems use different mechanisms to perform their main function: defending prokaryotic cells from foreign DNA. Thermodynamic models of transcription regulation and dynamic models of transcript and protein expression were set up for four selected Type II R-M systems and a Type I-E CRISPR-Cas system. By simulating and analyzing the model dynamics, we identified properties of the systems' expression dynamics upon induction in a cell that may represent the evolutionary design principles of their regulation. Specifically, we examined: i) how perturbing the characteristic regulatory features of the R-M systems AhdI and EcoRV affects the three proposed dynamic principles; ii) whether the R-M system Kpn2I, with regulation at the level of transcription elongation, can provide the expected dynamic properties; iii) whether the known regulation of the R-M system Esp1396I is sufficient to reproduce the protein expression dynamics measured in single cells; iv) which properties are likely to be found in the unknown expression dynamics of the CRISPR-Cas system in Escherichia coli, predicted under the assumption that its transcription regulation mechanism can be approximated by a conceptually similar one from R-M systems. 
We showed that all four R-M systems, as well as CRISPR-Cas, are able to achieve two of the proposed dynamic principles: an initial delay of restriction endonuclease expression with respect to methyltransferase expression, and its rapid increase towards the steady state. Analysis of AhdI and EcoRV additionally supports the third principle, low fluctuations in the steady state. The insights gained into the design of these natural gene networks provide a better understanding of the relationship between their structure and function, as well as guidelines for the design of synthetic gene circuits.

    Transposon mutagenesis in RT 078 Clostridioides difficile

    Clostridioides difficile is a Gram-positive, spore-forming, anaerobic bacterium and a major cause of healthcare-associated diarrhoea. Significant increases in the incidence of hypervirulent strains, such as those belonging to PCR ribotype (RT) 027, and increased antibiotic resistance have formed the focus of current C. difficile clinical research. Hypervirulent strains belonging to RT 078, in contrast, have received comparatively less attention, despite the fact that they are widely recognized as being zoonotic, with a particular association with pigs. A greater understanding of RT 078 strains would benefit from the implementation of forward genetic approaches. Here we sought to implement Transposon directed insertion-site sequencing (TraDIS), a high-throughput method able to define gene essentiality under niche-specific conditions, to elucidate physiological changes such as sporulation and germination in RT 078 strains. As effective DNA transfer is a prerequisite for TraDIS implementation, the most efficient strains as both donor and recipient in conjugation were identified. Applying next generation sequencing technologies on 10 clinical isolates and subsequent methylome analysis demonstrated that although the tested strains of RT 078 were genetically similar (up to 99.99%), they possess a variety of potential Restriction-Modification (R-M) barriers. One of these R-M systems was circumvented using the novel Escherichia coli donor strain, sExpress. Improved DNA transformability in C. difficile RT 078 strain CD9301 made it an optimal target for further genetic manipulations and subsequent TraDIS analysis. Subsequently, several transposon delivery systems were evaluated, based on their potential to mediate random transposon insertion and reliable plasmid loss, to prevent interference of the transposase during downstream experiments in C. difficile. The Tet-inducible transposon vector pRF215 performed best in CD9301. 
Based on this plasmid system, the novel vector pMTL-MtV10 was created, which was additionally equipped with I-SceI digestion sites, to achieve increased plasmid clearance during library preparation. Using both plasmids, genes essential for growth in rich media were determined. In total, 448 essential genes were predicted. The incorporation of I-SceI sites into pMtV-10 did not, however, improve plasmid loss during the TraDIS library preparation. A further 398 genes were predicted to be essential for sporulation. The number of genes identified is most likely an underestimate, as the manual cut-off used to predict essentiality lacks sensitivity. The described findings lay the groundwork necessary for determining essentiality in RT 078 and improving our understanding of this important ribotype.
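Cut-off based essentiality prediction of the kind mentioned above can be sketched as follows. The abstract does not say which statistic was thresholded, so this example assumes a per-gene insertion index (transposon insertions divided by gene length); the gene names, counts, and cut-off value are made up for illustration.

```python
def insertion_index(n_insertions, gene_length_bp):
    """Transposon insertions per base pair of gene (assumed statistic)."""
    return n_insertions / gene_length_bp


def predict_essential(genes, cutoff):
    """Names of genes whose insertion index falls below a manual cut-off.

    `genes` maps gene name -> (insertion count, gene length in bp).
    """
    return sorted(
        name
        for name, (n_insertions, length) in genes.items()
        if insertion_index(n_insertions, length) < cutoff
    )


# Hypothetical library: geneA and geneB tolerate almost no insertions.
genes = {
    "geneA": (0, 1200),
    "geneB": (3, 1500),
    "geneC": (90, 900),
}
essential = predict_essential(genes, cutoff=0.01)
print(essential)
```

A stricter (lower) cut-off reduces false positives but misses genes that tolerate a few insertions, which is consistent with the abstract's remark that a manual cut-off likely underestimates the number of essential genes.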
