67 research outputs found

    Evolutionary Selection Against Short Nucleotide Sequences in Viruses and Their Related Hosts

    Viruses are under constant evolutionary pressure to effectively interact with the host intracellular factors, while evading its immune system. Understanding how viruses co-evolve with their hosts is a fundamental topic in molecular evolution and may also aid in developing novel viral-based applications such as vaccines, oncologic therapies, and anti-bacterial treatments. Here, based on a novel statistical framework and a large-scale genomic analysis of 2,625 viruses from all classes infecting 439 host organisms from all kingdoms of life, we identify short nucleotide sequences that are under-represented in the coding regions of viruses and their hosts. The under-representation of these sequences cannot be explained by the coding regions’ amino acid content, codon, and dinucleotide frequencies. We specifically show that short homooligonucleotide and palindromic sequences tend to be under-represented in many viruses, probably due to their effect on gene expression regulation and the interaction with the host immune system. In addition, we show that more sequences tend to be under-represented in dsDNA viruses than in other viral groups. Finally, we demonstrate, based on in vitro and in vivo experiments, how under-represented sequences can be used to attenuate Zika virus strains.
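
    The idea of an under-represented short sequence in the abstract above can be illustrated with a minimal sketch. It is not the paper's statistical framework: the abstract says the authors control for amino acid content, codon and dinucleotide frequencies, whereas this sketch assumes a much simpler null model in which bases occur independently at their observed frequencies. The function names (`representation_ratios`, `is_palindrome`, `is_homooligomer`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def is_palindrome(kmer):
    """True if the k-mer equals its own reverse complement (e.g. GAATTC)."""
    return kmer == "".join(COMPLEMENT[b] for b in reversed(kmer))


def is_homooligomer(kmer):
    """True if the k-mer is a run of a single base (e.g. AAAA)."""
    return len(set(kmer)) == 1


def representation_ratios(seq, k):
    """Observed/expected count of every k-mer in seq.

    ASSUMPTION: the expected count uses an independent-base null model
    built from the sequence's own base frequencies. This is far simpler
    than the null described in the abstract.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    base_freq = {b: seq.count(b) / len(seq) for b in "ACGT"}
    observed = Counter(seq[i:i + k] for i in range(n_windows))
    ratios = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        expected = float(n_windows)
        for b in kmer:
            expected *= base_freq[b]
        if expected > 0:
            # Ratios well below 1 flag candidate under-represented k-mers.
            ratios[kmer] = observed[kmer] / expected
    return ratios
```

    Under these assumptions, a ratio well below 1 marks a k-mer that occurs less often than base composition alone would predict; `is_palindrome` and `is_homooligomer` can then be used to check whether the flagged k-mers fall into the two classes the abstract highlights.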

    From Genes to Ecosystems: Resource Availability and DNA Methylation Drive the Diversity and Abundance of Restriction Modification Systems in Prokaryotes

    Together, prokaryotic hosts and their viruses numerically dominate the planet and are engaged in an eternal struggle of hosts evading viral predation and viruses overcoming defensive mechanisms employed by their hosts. Prokaryotic hosts have been found to carry several viral defense systems in recent years, of which Restriction Modification systems (RMs) were the first, discovered in the 1950s. While we have biochemically elucidated many of these systems in the last 70 years, we still struggle to understand what drives their gain and loss in prokaryotic genomes. In this work, we take a computational approach to understand the underlying evolutionary drivers of RMs by assessing ‘big data’ signals of RMs in prokaryotic genomes and incorporating molecular data in trait-based mathematical models. Focusing on the Cyanobacteria, we found a large discrepancy in the frequency of RMs per genome in different environmental contexts, where Cyanobacteria that live in oligotrophic nutrient conditions have few to no RMs and those in nutrient-rich conditions consistently have many RMs. While our models agree with the observation that increased nutrient inputs make the selective pressure to maintain RMs more intense, they were unable to reconcile the high numbers of RMs per genome with their potent defensive properties, a situation of apparent overkill. By incorporating viral methylation, an unavoidable effect of RMs, we were able to explain how organisms could carry over 15 RMs. With this discovery, we then reassessed the distribution of methyltransferases, an essential component of RMs that can also have alternate physiological roles in the cell. We expand on the conventional wisdom that methyltransferases that are widely phylogenetically conserved are associated with global cellular regulation. However, we also find that organisms with high numbers of RMs have a surprising amount of conservation in the methyltransferases that they carry.
These data suggest that caution should be used in associating phylogenetic signals with functional roles in methyltransferases, as different functional roles seem to overlap in their phylogenetic signal. Indeed, we suggest trait-based modeling may be the best tool for elucidating why organisms with a high selective pressure to maintain RMs appear to have conserved methyltransferases.

    Genetics of Halophilic Microorganisms

    Halophilic microorganisms are found in all domains of life and thrive in hypersaline (high salt content) environments. These unusual microbes have been a subject of study for many years due to their interesting properties and physiology. Studies of the genetics of halophilic microorganisms (from gene expression and regulation to genomics) have provided insight into the mechanisms by which life can exist at high salinity levels. Here, we highlight recent studies that advance the knowledge of biological function through examination of the genetics of halophilic microorganisms and their viruses.

    Environmental adaptability and stress tolerance of Laribacter hongkongensis: a genome-wide analysis

    Background: Laribacter hongkongensis is associated with community-acquired gastroenteritis and traveler's diarrhea and it can reside in human, fish, frogs and water. In this study, we performed an in-depth annotation of the genes in its genome related to adaptation to the various environmental niches. Results: L. hongkongensis possessed genes for DNA repair and recombination, basal transcription, alternative σ-factors and 109 putative transcription factors, allowing DNA repair and global changes in gene expression in response to different environmental stresses. For acid stress, it possessed a urease gene cassette and two arc gene clusters. For alkaline stress, it possessed six CDSs for transporters of the monovalent cation/proton antiporter-2 and NhaC Na+:H+ antiporter families. For heavy metals acquisition and tolerance, it possessed CDSs for iron and nickel transport and efflux pumps for other metals. For temperature stress, it possessed genes related to chaperones and chaperonins, heat shock proteins and cold shock proteins. For osmotic stress, 25 CDSs were observed, mostly related to regulators for potassium ion, proline and glutamate transport. For oxidative and UV light stress, genes for oxidant-resistant dehydratase, superoxide scavenging, hydrogen peroxide scavenging, exclusion and export of redox-cycling antibiotics, redox balancing, DNA repair, reduction of disulfide bonds, limitation of iron availability and reduction of iron-sulfur clusters are present. For starvation, it possessed phosphorus and, despite being asaccharolytic, carbon starvation-related CDSs. Conclusions: The L. hongkongensis genome possessed a high variety of genes for adaptation to acid, alkaline, temperature, osmotic, oxidative, UV light and starvation stresses and acquisition of and tolerance to heavy metals.

    Access denied : A closer look at anti-phage defense mechanisms

    Microorganisms are under constant threat from their viruses, bacteriophages (phages). In response to this pressure, they have developed multiple strategies to protect themselves from phage infection, including restriction-modification, BREX, DISARM, and CRISPR-Cas systems. However, phages have developed several mechanisms to evade these anti-phage defense systems. One common way is to modify their nucleic acids (DNA). This thesis encompasses the characterization of several CRISPR-Cas and DISARM proteins and a study of the effect of phage DNA modifications on the activity of these proteins.

    Studies of DNA demethylation


    Adapting to change : on the mechanism of type I-E CRISPR-Cas defence

    Host-pathogen interactions are among the most prevalent and evolutionarily important interactions known today. The predation of prokaryotes by their viruses happens on an especially large scale and has had a major influence on the evolutionary history of prokaryotes. Since most viruses are lytic at some point in their life cycle, there is a high selection pressure for prokaryotes to develop defense mechanisms. As described in Chapter 1, the CRISPR-Cas system is a relatively recently discovered defense system and is also the first adaptive defense system discovered in prokaryotes. CRISPR-Cas systems are widespread, occurring in the majority of archaea and also a considerable fraction of bacteria. This wide distribution is also reflected in the diversity of CRISPR-Cas systems, which are currently divided into 6 major types with a large number of subtypes. The type I-E system of Escherichia coli is a well-studied model system and of high relevance, since it is a major subtype of type I systems, which make up around 50% of all discovered CRISPR-Cas systems. CRISPR-Cas systems basically comprise the CRISPR array, made up of repeats and foreign-derived spacers, and a set of cas genes. Immunity is commonly divided into three functional stages: adaptation, expression and interference. Adaptation is the acquisition of new spacers from foreign nucleic acid and their incorporation into the CRISPR array. During expression, the CRISPR array is transcribed, processed and assembled with Cas proteins into CRISPR RNA (crRNA) guided ribonucleoprotein complexes (crRNP). Interference is the detection, binding and destruction of foreign nucleic acids by the crRNP and, in type I systems, the Cas3 nuclease. The type I-E system has an additional function, called primed adaptation. Primed adaptation is a more rapid and efficient version of regular (naïve) adaptation. In addition to the adaptation machinery, primed adaptation also requires the interference machinery.
Chapter 2 describes and compares a fundamental feature of most, if not all, CRISPR-Cas systems and also many other small-RNA-based systems. The mode of action of small RNAs relies on protein-assisted base pairing of the guide RNA with target mRNA or DNA to interfere with their transcription, translation or replication. Several unrelated classes of small non-coding RNAs have been identified, including eukaryotic RNA silencing associated small RNAs, prokaryotic small regulatory RNAs and prokaryotic CRISPR (clustered regularly interspaced short palindromic repeats) RNAs. All three groups identify their target sequence by base pairing after finding it in a pool of millions of other nucleotide sequences in the cell. In this complicated target search process, a region of 6 to 12 nucleotides of the small RNA termed the ‘seed’ plays a critical role. The seed is often a structurally pre-ordered region that increases accessibility and lowers the energy barrier of RNA-DNA duplex formation. Furthermore, the length of the seed is optimally chosen to allow rapid probing and also rejection of potential target sites. The seed is a perfect example of parallel evolution, showing that nature comes up with the same strategy independently multiple times. Chapter 3 provides a description and protocol of the Electrophoretic Mobility Shift Assay (EMSA) and its use for studying crRNPs. EMSA is a straightforward and inexpensive method for the determination and quantification of protein–nucleic acid interactions. It relies on the different mobility of free and protein-bound nucleic acid in a gel matrix during electrophoresis. Nucleic acid affinities of crRNPs can be quantified by calculating the dissociation constant (Kd). Protocols for two types of EMSA are described using the Cascade ribonucleoprotein complex from Escherichia coli as an example. One protocol uses plasmid DNA as substrate, while the other uses short linear oligonucleotides.
Plasmids can be easily visualized with traditional DNA staining, while oligos have to be radioactively labelled using the phosphorus isotope 32P. The EMSA method and these protocols are applied throughout the other chapters of this thesis. Chapter 4 focusses on the processes of interference and primed adaptation, specifically on their tolerance of mutations. Invaders can escape Type I-E CRISPR-Cas immunity in E. coli by making point mutations in the protospacer (especially in the seed) or its adjacent motif (PAM), but hosts quickly restore immunity by integrating new spacers in a positive feedback process termed priming. Here, we provide a systematic analysis of the constraints of both direct interference and subsequent priming in E. coli. We have defined a high-resolution genetic map of direct interference by Cascade and Cas3, which includes five positions of the protospacer at 6 nt intervals that readily tolerate mutations. Importantly, we show that priming is an extremely robust process capable of utilizing degenerate target regions with up to at least eleven mutations throughout the PAM and protospacer region. Priming is influenced by the number of mismatches and their position, and is nucleotide dependent. Our findings imply that even out-dated spacers containing many mismatches can induce a rapid primed CRISPR response against diversified or related invaders, giving microbes an advantage in the co-evolutionary arms race with their invaders. In Chapter 5 we elucidate the mechanism of priming. Specifically, we determine how new spacers are produced and selected for integration into the CRISPR array during priming. We show that priming is directly dependent on interference. Rapid priming occurs when the rate of interference is high; delayed priming occurs when the rate of interference is low.
Using in vitro assays and next generation sequencing, we show that Cas3 couples CRISPR interference to adaptation by producing DNA breakdown products that fuel the spacer integration process in a two-step, PAM-associated manner. The helicase-nuclease Cas3 pre-processes target DNA into fragments of about 30–100 nt enriched for thymine stretches in their 3’ ends. By reconstituting the spacer integration process in vitro, we show that the Cas1-2 complex further processes these fragments and integrates them sequence-specifically into CRISPR repeats by coupling of a 3’ cytosine of the fragment. Our results highlight that the selection of PAM-compliant spacers during priming is enhanced by the combined sequence specificities of Cas3 and the Cas1-2 complex, leading to an increased propensity of integrating functional CTT-containing spacers. In Chapter 6 we look deeper into a nucleotide-specific effect on priming that was discovered in Chapter 4. Immunity is based on the complementarity of host-encoded spacer sequences with protospacers on the foreign genetic element. The efficiency of both direct interference and primed acquisition depends on the degree of complementarity between spacer and protospacer. Previous studies focused on the number and positions of mutations, not the identity of the substituted nucleotide. In Chapter 4, we describe a nucleotide bias, showing a positive effect on priming of C substitutions and a negative effect on priming of G substitutions in the base-pairing strand of the protospacer. Here we show that these substitutions instead directly influence the efficiency of interference and therefore indirectly influence the efficiency of interference-dependent priming. We show that G substitutions have a profoundly negative effect on interference, while C substitutions are readily tolerated when in the same positions.
Furthermore, we show that this effect is based on strongly decreased binding of the effector complex Cascade to G mutants, while C mutants only minimally affect binding. In Chapter 5 we showed a connection between the rate of interference and the time of occurrence of priming. Here, we also quantify the extent of priming and show that priming is very prevalent in a population that shows intermediate levels of interference, while high or low levels of interference lead to a lower prevalence of priming. Chapter 7 describes an attempt to make use of our knowledge about the Cascade complex and develop it into a genome editing tool. The development of genome editing tools has made major leaps in the last decade. Recently, RNA guided endonucleases (RGENs) such as Cas9 or Cpf1 have revolutionized genome editing. These RGENs are the hallmark proteins of class 2 CRISPR-Cas systems. Here, we have explored the possibility of developing a new genome editing tool that makes use of the Cascade complex from E. coli. This RNA guided protein complex is fused to a FokI nuclease domain to cleave DNA sequence-specifically. We validate the tool in vitro using purified protein and two sets of guide RNAs, showing specific cleavage activity. The tool requires two target sites of 32 nt each at a distance of 30-40 nt and inward-facing, flexible three-nucleotide PAM sequences. Cleavage occurs in the middle between the two binding sites and primarily creates 4 nt overhangs. Furthermore, we show that an additional RFP can be fused to FokI-Cascade, allowing visualization of the complex in target cells. Unfortunately, we were not able to successfully apply the tool in vivo in eukaryotic cells.
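
The abstract above mentions quantifying the nucleic acid affinity of crRNPs from EMSA data by calculating the dissociation constant (Kd). As a rough illustration only, and not the thesis's protocol, the sketch below assumes simple one-site binding with protein in large excess over the labelled probe, so that the fraction of probe bound is [P] / (Kd + [P]). It estimates Kd by a least-squares line through the origin on the linearized form 1/θ − 1 = Kd · (1/[P]). The function names `fraction_bound` and `estimate_kd` are hypothetical.

```python
def fraction_bound(protein_conc, kd):
    """Fraction of probe shifted, assuming one-site binding and excess protein."""
    return protein_conc / (kd + protein_conc)


def estimate_kd(protein_concs, fractions_bound):
    """Estimate Kd from (protein concentration, fraction bound) pairs.

    ASSUMPTION: theta = P / (Kd + P), which rearranges to
    (1/theta - 1) = Kd * (1/P). Kd is the slope of a least-squares line
    through the origin. Points with theta outside (0, 1) or P <= 0 cannot
    be linearized and are skipped.
    """
    num = 0.0
    den = 0.0
    for p, theta in zip(protein_concs, fractions_bound):
        if p <= 0 or not 0 < theta < 1:
            continue
        x = 1.0 / p
        y = 1.0 / theta - 1.0
        num += x * y
        den += x * x
    if den == 0:
        raise ValueError("no usable data points")
    return num / den  # same concentration unit as protein_concs
```

The linearization keeps the sketch dependency-free, but it weights low-concentration points heavily; for real gel quantifications a nonlinear fit of the binding curve would usually be the more robust choice.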