5,024 research outputs found

    Recovering complete and draft population genomes from metagenome datasets.

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost of application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution

    Time series genome-centric analysis unveils bacterial response to operational disturbance in activated sludge

    Understanding ecosystem response to disturbances and identifying the most critical traits for the maintenance of ecosystem functioning are important goals for microbial community ecology. In this study, we used 16S rRNA amplicon sequencing and metagenomics to investigate the assembly of bacterial populations in a full-scale municipal activated sludge wastewater treatment plant over a period of 3 years, including a 9-month period of disturbance characterized by short-term plant shutdowns. Following the reconstruction of 173 metagenome-assembled genomes, we assessed the functional potential, the number of rRNA gene operons, and the in situ growth rate of microorganisms present throughout the time series. Operational disturbances caused a significant decrease in bacteria with a single copy of the rRNA (rrn) operon. Despite moderate differences in resource availability, replication rates were distributed uniformly throughout time, with no differences between disturbed and stable periods. We suggest that the length of the growth lag phase, rather than the growth rate, is the primary driver of selection under disturbed conditions. Thus, the system could maintain its function in the face of disturbance by recruiting bacteria with the capacity to rapidly resume growth under unsteady operating conditions.
    Affiliation: Pérez, María Victoria. Agua y Saneamientos Argentinos S.a.; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres"; Argentina
    Affiliation: Guerrero, Leandro Demián. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres"; Argentina
    Affiliation: Orellana, Esteban. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres"; Argentina
    Affiliation: Figuerola, Eva Lucia Margarita. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres"; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Fisiología, Biología Molecular y Celular; Argentina
    Affiliation: Erijman, Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres"; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Fisiología, Biología Molecular y Celular; Argentina

    eXamine: a Cytoscape app for exploring annotated modules in networks

    Background. Biological networks have growing importance for the interpretation of high-throughput "omics" data. Statistical and combinatorial methods make it possible to obtain mechanistic insights through the extraction of smaller subnetwork modules. Further enrichment analyses provide set-based annotations of these modules. Results. We present eXamine, a set-oriented visual analysis approach for annotated modules that displays set membership as contours on top of a node-link layout. Our approach extends self-organizing maps to simultaneously lay out nodes, links, and set contours. Conclusions. We implemented eXamine as a freely available Cytoscape app. Using eXamine, we study a module that is activated by the virally encoded G-protein coupled receptor US28 and formulate a novel hypothesis about its functioning

    The Evolving Faces of the SARS-CoV-2 Genome

    Surveillance of the evolving SARS-CoV-2 genome, combined with epidemiological monitoring and emerging vaccination, has become a paramount task for controlling the pandemic, which is rapidly changing in time and space. Genomic surveillance must combine the generation and sharing of sequence data with appropriate bioinformatics monitoring and analysis methods. We applied molecular portrayal using self-organizing map machine learning (SOM portrayal) to characterize the diversity of the virus genomes, their mutual relatedness, and their development since the beginning of the pandemic. The genetic landscape obtained visualizes the relevant mutations in a lineage-specific fashion and provides developmental paths in genetic state space from early lineages towards the variants of concern alpha, beta, gamma and delta. The different genes of the virus have specific footprints in the landscape reflecting their biological impact. SOM portrayal provides a novel option for ‘bioinformatics surveillance’ of the pandemic, with strong advantages in visualization, intuitive perception and ‘personalization’ of the mutational patterns of the virus genomes

    Unusual Metabolism and Hypervariation in the Genome of a Gracilibacterium (BD1-5) from an Oil-Degrading Community.

    The candidate phyla radiation (CPR) comprises a large monophyletic group of bacterial lineages known almost exclusively based on genomes obtained using cultivation-independent methods. Within the CPR, Gracilibacteria (BD1-5) are particularly poorly understood due to undersampling and the inherently fragmented nature of available genomes. Here, we report the first closed, curated genome of a gracilibacterium from an enrichment experiment inoculated from the Gulf of Mexico and designed to investigate hydrocarbon degradation. The gracilibacterium rose in abundance after the community switched to dominance by Colwellia. Notably, we predict that this gracilibacterium completely lacks glycolysis, the pentose phosphate pathway, and the Entner-Doudoroff pathway. It appears to acquire pyruvate, acetyl coenzyme A (acetyl-CoA), and oxaloacetate via degradation of externally derived citrate, malate, and amino acids and may use compound interconversion and oxidoreductases to generate and recycle reductive power. The initial genome assembly was fragmented in an unusual gene that is hypervariable within a repeat region. Such extreme local variation is rare but characteristic of genes that confer traits under pressure to diversify within a population. Notably, the four major repeated 9-mer nucleotide sequences all generate a proline-threonine-aspartic acid (PTD) repeat. The genome of an abundant Colwellia psychrerythraea population has a large extracellular protein that also contains the repeated PTD motif. Although we do not know the host for the BD1-5 cell, the high relative abundance of the C. psychrerythraea population and the shared surface protein repeat may indicate an association between these bacteria.
    IMPORTANCE: CPR bacteria are generally predicted to be symbionts due to their extensive biosynthetic deficits. Although monophyletic, they are not monolithic in terms of their lifestyles. The organism described here appears to have evolved an unusual metabolic platform not reliant on glucose or pentose sugars. Its biology appears to be centered around bacterial host-derived compounds and/or cell detritus. Amino acids likely provide building blocks for nucleic acids, peptidoglycan, and protein synthesis. We resolved an unusual repeat region that would be invisible without genome curation. The nucleotide sequence is apparently under strong diversifying selection, but the amino acid sequence is under stabilizing selection. The amino acid repeat also occurs in a surface protein of a coexisting bacterium, suggesting colocation and possibly interdependence

    Genomic Inference of the Metabolism and Evolution of the Archaeal Phylum Aigarchaeota

    Microbes of the phylum Aigarchaeota are widely distributed in geothermal environments, but their physiological and ecological roles are poorly understood. Here we analyze six Aigarchaeota metagenomic bins from two circumneutral hot springs in Tengchong, China, to reveal that they are either strict or facultative anaerobes, and most are chemolithotrophs that can perform sulfide oxidation. Applying comparative genomics to the Thaumarchaeota and Aigarchaeota, we find that they both originated from thermal habitats, sharing 1154 genes with their common ancestor. Horizontal gene transfer played a crucial role in shaping genetic diversity of Aigarchaeota and led to functional partitioning and ecological divergence among sympatric microbes, as several key functional innovations were endowed by Bacteria, including dissimilatory sulfite reduction and possibly carbon monoxide oxidation. Our study expands our knowledge of the possible ecological roles of the Aigarchaeota and clarifies their evolutionary relationship to their sister lineage Thaumarchaeota

    Genomic Methods for Bacterial Infection Identification

    Hospital-acquired infections (HAIs) have high mortality rates around the world and are a challenge to medical science due to rapid mutation rates in their pathogens. A new methodology is proposed to identify bacterial species causing HAIs based on sets of universal biomarkers for next-generation microarray designs (i.e., nxh chips), rather than a priori selections of biomarkers. This method allows arbitrary organisms to be classified based on readouts of their DNA sequences, including whole genomes. The underlying models are based on the biochemistry of DNA, unlike traditional edit-distance based alignments. Furthermore, the methodology is fairly robust to genetic mutations, which would otherwise be likely to reduce accuracy. Standard machine learning methods (neural networks, self-organizing maps, and random forests) produce results to identify HAIs on nxh chips that are very competitive with, if not superior to, current standards in the field. The potential feasibility of translating these techniques to a clinical test is also discussed

    Evolution of size and pattern in the social amoebas

    A fundamental goal of biology is to understand how novel phenotypes evolved through changes in existing genes. The Dictyostelia or social amoebas represent a simple form of multicellularity, where starving cells aggregate to build fruiting structures. This review summarizes efforts to provide a framework for investigating the genetic changes that generated novel morphologies in the Dictyostelia. The foundation is a recently constructed molecular phylogeny of the Dictyostelia, which was used to examine trends in the evolution of novel forms and in the divergence of genes that shape these forms. There is a major trend towards the formation of large unbranched fruiting bodies, which is correlated with the use of cyclic AMP (cAMP) as a secreted signal to coordinate cell aggregation. The role of cAMP in aggregation arose through co-option of a pathway that originally acted to coordinate fruiting body formation. The genotypic changes that caused this innovation and the role of dynamic cAMP signaling in defining fruiting body size and pattern throughout social amoeba evolution are discussed. BioEssays 29:635–644, 2007. © 2007 Wiley Periodicals, Inc