4 research outputs found

    FROM THERMAL SPRINGS TO SUBWAY BENCHES: EXPLORING THE DIVERSITY OF CARBON MONOXIDE DEHYDROGENASES THROUGH METAGENOMES, PHYLOGENETICS, AND MACHINE LEARNING

    Carbon monoxide is well known as a toxic gas, but it can also be an important input and intermediate for microbial metabolism. Carbon monoxide dehydrogenases (CODHs) serve as key enzyme complexes for a variety of microbial carbon monoxide (CO) utilization pathways. Such pathways include the Wood-Ljungdahl pathway, which is important in methanogenesis and acetogenesis, metal and sulfate reduction pathways, hydrogen production, and others. The CODH enzymes allow microbes to turn CO, traditionally regarded as a toxic waste gas, into a useful carbon and energy source. Despite the flexibility of CODH enzymes, the use of carbon monoxide is still believed to be a fringe metabolism. Here we seek to expand the known diversity, distribution, and phylogeny of CODH catalytic subunit proteins by searching an expansive dataset of over 50,000 metagenome-assembled genomes. Our work has shown that this dataset contains 5,426 putative CODH protein sequences found within 4,001 metagenome-assembled genomes. Despite the considerable expansion of the known set of CODH sequences, our phylogenetic analysis has validated the protein's previously established phylogeny while showing a wider environmental and taxonomic distribution of CODHs. CODHs are often considered to occur primarily in areas with high levels of CO and are typically associated with thermal environments and extremophiles. In addition to the expected high-temperature environments, CODHs were found in metagenomes from diverse environments, from soils to subway benches, and in phyla ranging from archaeal Euryarchaeota to bacterial Actinobacterota. We have also constructed a machine learning model to extract functional information, using a sequence-only method to predict gene ontology terms (GO terms) associated with CODH function. While our model can achieve accurate prediction of GO terms, our work has shown some of the current limitations of the approach.
    This study reveals CODHs to be more diverse and ubiquitous enzymes than previously anticipated. Despite tripling the number of sequences in the phylogeny, we provide strong support for the previously established clades and report no new clades. This work has also identified some key areas for experimental follow-up regarding the importance of carbon monoxide and CODH genes in many environments.

    Microbiome preterm birth DREAM challenge: Crowdsourcing machine learning approaches to advance preterm birth research

    Every year, 11% of infants are born preterm with significant health consequences, and the vaginal microbiome is a risk factor for preterm birth. We crowdsource models to predict (1) preterm birth (PTB; <37 weeks) or (2) early preterm birth (ePTB; <32 weeks) from 9 vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from public raw data via phylogenetic harmonization. The predictive models are validated on two independent unpublished datasets representing 331 samples from 148 pregnant individuals. The top-performing models (among 148 and 121 submissions from 318 teams) achieve area under the receiver operating characteristic curve (AUROC) scores of 0.69 and 0.87 predicting PTB and ePTB, respectively. Alpha diversity, VALENCIA community state types, and composition are important features in the top-performing models, most of which are tree-based methods. This work is a model for translating microbiome data into clinically relevant predictive models and for better understanding preterm birth.

    Coexistence of specialist and generalist species within mixed plastic derivative-utilizing microbial communities

    Background: Plastic-degrading microbial isolates offer great potential to degrade, transform, and upcycle plastic waste. Tandem chemical and biological processing of plastic wastes has been shown to substantially increase the rates of plastic degradation; however, the focus of this work has been almost entirely on microbial isolates (either bioengineered or naturally occurring). We propose that a microbial community has even greater potential for plastic upcycling. A microbial community has greater metabolic diversity to process mixed plastic waste streams and has built-in functional redundancy for optimal resilience.
    Results: Here, we used two plastic-derivative degrading communities as a model system to investigate the roles of specialist and generalist species within the microbial communities. These communities were grown on five plastic-derived substrates: pyrolysis-treated high-density polyethylene, chemically deconstructed polyethylene terephthalate, disodium terephthalate, terephthalamide, and ethylene glycol. Short-read metagenomic and metatranscriptomic sequencing were performed to evaluate activity of microorganisms in each treatment. Long-read metagenomic sequencing was performed to obtain high-quality metagenome-assembled genomes and evaluate division of labor.
    Conclusions: Data presented here show that the communities are primarily dominated by Rhodococcus generalists and lower-abundance specialists for each of the plastic-derived substrates investigated here, supporting previous research that generalist species dominate batch culture. Additionally, division of labor may be present between Hydrogenophaga terephthalate-degrading specialists and lower-abundance protocatechuate-degrading specialists.
