14 research outputs found

    Research: A comprehensive and quantitative exploration of thousands of viral genomes

    The complete assembly of viral genomes from metagenomic datasets (short genomic sequences gathered from environmental samples) has proven to be challenging, so there are significant blind spots when we view viral genomes through the lens of metagenomics. One approach to overcoming this problem is to leverage the thousands of complete viral genomes that are publicly available. Here we describe our efforts to assemble a comprehensive resource that provides a quantitative snapshot of viral genomic trends – such as gene density, noncoding percentage, and abundances of functional gene categories – across thousands of viral genomes. We have also developed a coarse-grained method for visualizing viral genome organization for hundreds of genomes at once, and have explored the extent of the overlap between bacterial and bacteriophage gene pools. Existing viral classification systems were developed prior to the sequencing era, so we present our analysis in a way that allows us to assess the utility of the different classification systems for capturing genomic trends
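    The abstract names gene density and noncoding percentage as genomic trends but does not define them. The sketch below is one plausible way to compute both from gene coordinates; it is not the authors' pipeline. It assumes gene density means genes per kilobase, that coordinates are half-open intervals `[start, end)`, and that overlapping genes are merged so shared bases are counted once. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based (assumed convention)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping gene intervals so shared bases are counted once."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def genome_stats(genome_length: int, genes: List[Interval]) -> Dict[str, float]:
    """Return genes per kb and the percentage of bases outside any gene."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    coding_bases = sum(end - start for start, end in merge_intervals(genes))
    return {
        "gene_density_per_kb": len(genes) / (genome_length / 1000),
        "noncoding_percent": 100 * (genome_length - coding_bases) / genome_length,
    }


# Made-up example: a 10 kb genome with three genes, two of which overlap.
example = genome_stats(10_000, [(0, 3000), (2500, 6000), (7000, 9000)])
```

    In this made-up example the merged coding span covers 8,000 of 10,000 bases, so the noncoding share is 20% at a density of 0.3 genes per kb.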

    Virology By The Numbers: A Quantitative Exploration of Viral Energetics, Genomics, and Ecology

    Over the past couple of decades, technological advancements in sequencing and imaging have unequivocally proven that the world of viruses is far bigger and more consequential than previously imagined. There are an estimated 10^31 viruses inhabiting our planet, outnumbering even bacteria. Despite their astronomical numbers and staggering sequence diversity, environmental viruses are poorly characterized. In this thesis we will demonstrate our three-pronged exploration of viruses through the lenses of energetics (Chapters 2 and 3), genomics (Chapter 4) and ecology (Chapter 5). We will first focus on one of the defining features of viruses, namely their reliance on their host for energy, and demonstrate the energetic cost of building a virus and mounting an infection. In our second study, we present one of the largest surveys of complete viral genomes, providing a comprehensive and quantitative snapshot of viral genomic trends for thousands of viruses. In our third study, we shift our focus towards ecological questions surrounding the large number of commensal phages inhabiting the human body. We discovered that phage community composition could serve as a fingerprint, or a "phageprint", that is highly personal and stable over time. To our knowledge, this study is one of the largest studies of human phages and the first to demonstrate the feasibility of human identification based on phage sequences.

    The Energetic Cost of Building a Virus

    Viruses are incapable of autonomous energy production. Although many experimental studies make it clear that viruses are parasitic entities that hijack the host's molecular resources, a detailed estimate for the energetic cost of viral synthesis is largely lacking. To quantify the energetic cost of viruses to their hosts, we enumerated the costs associated with two very distinct but representative DNA and RNA viruses, namely T4 and influenza. We found that for these viruses, translation of viral proteins is the most energetically expensive process. Interestingly, the costs of building a single T4 phage and a single influenza virus are nearly the same. Due to influenza's higher burst size, however, the overall cost of a T4 phage infection is only 2-3% of the cost of an influenza infection. The costs of these infections relative to their host's estimated energy budget during the infection reveal that a T4 infection consumes about a third of its host's energy budget, whereas an influenza infection consumes only 1%. Building on our estimates for T4, we show how the energetic costs of double-stranded DNA viruses scale with virus size, revealing that the dominant cost of building a virus can switch from translation to genome replication above a critical virus size. Lastly, using our predictions for the energetic cost of viruses, we provide estimates for the strengths of selection and genetic drift acting on newly incorporated genetic elements in viral genomes, under conditions of energy limitation.
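    The arithmetic behind the burst-size comparison can be sketched as follows, under a deliberately simplified assumption: the cost of an infection is the per-virion cost multiplied by the burst size. The abstract gives no per-virion costs or burst sizes, so every number below is a hypothetical placeholder, chosen only so that the result lands inside the stated 2-3% range. This is not the authors' cost model.

```python
def infection_cost(cost_per_virion: float, burst_size: int) -> float:
    """Simplified model (an assumption): total cost = per-virion cost x burst size."""
    return cost_per_virion * burst_size


# Hypothetical placeholders, NOT values from the source.
COST_PER_VIRION = 1.0        # arbitrary units; taken as equal for both viruses
T4_BURST_SIZE = 200          # illustrative burst size
INFLUENZA_BURST_SIZE = 8000  # illustrative burst size

t4_cost = infection_cost(COST_PER_VIRION, T4_BURST_SIZE)
influenza_cost = infection_cost(COST_PER_VIRION, INFLUENZA_BURST_SIZE)

# With equal per-virion costs, the ratio reduces to the ratio of burst sizes.
relative_cost = t4_cost / influenza_cost

if __name__ == "__main__":
    print(f"T4 infection cost relative to influenza: {relative_cost:.1%}")
```

    Because the per-virion costs cancel, the relative infection cost in this simplified model depends only on the burst-size ratio, which is the point the abstract makes.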

    Defining the Energetic Costs of Cellular Structures

    All cellular structures are assembled from molecular building blocks, and molecular building blocks incur energetic costs to the cell. In an energy-limited environment, the energetic cost of a cellular structure imposes a fitness cost and impacts a cell's evolutionary trajectory. While the importance of energetic considerations was realized for decades, the distinction between direct energetic costs expended by the cell and potential energy that the cell diverts into cellular biomass components, which we define as the opportunity cost, was not explicitly made, leading to large differences in values for energetic costs of molecular building blocks used in the literature. We describe a framework that defines and separates various components relevant for estimating the energetic costs of molecular building blocks and the resulting cellular structures. This distinction among energetic costs is an essential step towards discussing the conversion of an energetic cost to a corresponding fitness cost

    Human Phageprints: A high-resolution exploration of oral phages reveals globally-distributed phage families with individual-specific and temporally-stable community compositions

    Metagenomic studies have revolutionized the study of novel phages. However, these studies trade depth of coverage for breadth. In this study we show that the targeted sequencing of a phage genomic region as small as 200-300 base pairs can provide sufficient sequence diversity to serve as an individual-specific barcode, or Phageprint. The targeted approach reveals a high-resolution view of phage communities that is not available through metagenomic datasets. By creating instructional videos and collection kits, we enabled citizen scientists to gather ~700 oral samples spanning ~100 individuals residing in different parts of the world. In examining phage communities at 6 different oral sites, and by comparing phage communities of individuals living across the globe, we were able to study the effect of spatial separation, ranging from several millimeters to thousands of kilometers. We found that a spatial separation of just a few centimeters (the distance between two oral sites) can already result in highly distinct phage community compositions. For larger distances, spanning the phage communities of different individuals living in different parts of the world, we did not observe any correlation between spatial distance and phage community composition: individuals residing in the same city did not have phage communities that were any more similar than those of individuals living on different continents. Additionally, we found that neither genetics nor cohabitation seems to play a role in the relatedness of phage community compositions across individuals. Cohabitating siblings and even identical twins did not have phage community compositions that were any more similar than those of unrelated individuals. The primary factor contributing to phage community composition relatedness is direct contact between two habitats, as is demonstrated by the similarity between oral phage community compositions of partners. Furthermore, by exploring phage communities across the span of a month, and in some cases several years, we observed highly stable community compositions. These studies consistently point to the existence of remarkably diverse and personal phage families that are stable in time and apparently present in people around the world.
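    The abstract does not say how similarity between community compositions was measured. As a minimal sketch of what "more similar" could mean, the code below compares two samples with the Bray-Curtis dissimilarity on relative abundances, a common choice in community ecology that is swapped in here as an assumption. The sample names and read counts are made up.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw per-variant read counts to fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {variant: n / total for variant, n in counts.items()}


def bray_curtis(sample_a: Dict[str, int], sample_b: Dict[str, int]) -> float:
    """Bray-Curtis dissimilarity: 0 = identical composition, 1 = no shared variants."""
    pa = relative_abundance(sample_a)
    pb = relative_abundance(sample_b)
    variants = set(pa) | set(pb)
    # For fractions that each sum to 1, Bray-Curtis equals 1 - sum of minima.
    shared = sum(min(pa.get(v, 0.0), pb.get(v, 0.0)) for v in variants)
    return 1.0 - shared


# Made-up read counts per sequence variant for three hypothetical samples.
person_1_day_0 = {"variant_A": 60, "variant_B": 40}
person_1_day_30 = {"variant_A": 55, "variant_B": 45}
person_2_day_0 = {"variant_C": 70, "variant_D": 30}

within_person = bray_curtis(person_1_day_0, person_1_day_30)
between_people = bray_curtis(person_1_day_0, person_2_day_0)
```

    In this toy data, the within-person dissimilarity is small and the between-person dissimilarity is maximal, which is the qualitative pattern a personal and temporally stable composition would produce.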

    Energetic cost of building a virus

    Viruses are incapable of autonomous energy production. Although many experimental studies make it clear that viruses are parasitic entities that hijack the molecular resources of the host, a detailed estimate for the energetic cost of viral synthesis is largely lacking. To quantify the energetic cost of viruses to their hosts, we enumerated the costs associated with two very distinct but representative DNA and RNA viruses, namely, T4 and influenza. We found that, for these viruses, translation of viral proteins is the most energetically expensive process. Interestingly, the costs of building a T4 phage and a single influenza virus are nearly the same. Due to influenza’s higher burst size, however, the overall cost of a T4 phage infection is only 2–3% of the cost of an influenza infection. The costs of these infections relative to their host’s estimated energy budget during the infection reveal that a T4 infection consumes about a third of its host’s energy budget, whereas an influenza infection consumes only ≈ 1%. Building on our estimates for T4, we show how the energetic costs of double-stranded DNA phages scale with the capsid size, revealing that the dominant cost of building a virus can switch from translation to genome replication above a critical size. Last, using our predictions for the energetic cost of viruses, we provide estimates for the strengths of selection and genetic drift acting on newly incorporated genetic elements in viral genomes, under conditions of energy limitation

    Intrinsically disordered proteins and conformational noise: Implications in cancer

    Intrinsically disordered proteins (IDPs) are proteins that lack a rigid 3D structure under physiological conditions, at least in vitro. Despite the lack of structure, IDPs play important roles in biological processes and transition from disorder to order upon binding to their targets. With multiple conformational states and rapid conformational dynamics, they engage in myriad and often “promiscuous” interactions. These stochastic interactions between IDPs and their partners, defined here as conformational noise, are an inherent characteristic of IDP interactions. The collective effect of conformational noise is an ensemble of protein network configurations, from which the most suitable can be explored in response to perturbations, conferring protein networks with remarkable flexibility and resilience. Moreover, the ubiquitous presence of IDPs as transcription factors and, more generally, as hubs in protein networks, is indicative of their role in the propagation of transcriptional (genetic) noise. As effectors of transcriptional and conformational noise, IDPs rewire protein networks and unmask latent interactions in response to perturbations. Thus, noise-driven activation of latent pathways could underlie state-switching events such as cellular transformation in cancer. To test this hypothesis, we created a model of a protein network with the topological characteristics of a cancer protein network and tested its response to a perturbation in the presence of IDP hubs and conformational noise. Because numerous IDPs are found to be epigenetic modifiers and chromatin remodelers, we hypothesize that they could further channel noise into stable, heritable genotypic changes.
