15 research outputs found

    Efam: an expanded, metaproteome-supported HMM profile database of viral protein families

    Motivation: Viruses infect, reprogram and kill microbes, leading to profound ecosystem consequences, from elemental cycling in oceans and soils to microbiome-modulated diseases in plants and animals. Although metagenomic datasets are increasingly available, identifying viruses in them is challenging due to poor representation and annotation of viral sequences in databases. Results: Here, we establish efam, an expanded collection of Hidden Markov Model (HMM) profiles that represent viral protein families conservatively identified from the Global Ocean Virome 2.0 dataset. This resulted in 240 311 HMM profiles, each with at least 2 protein sequences, making efam >7-fold larger than the next largest, pan-ecosystem viral HMM profile database. Adjusting the criteria for viral contig confidence from 'conservative' to 'eXtremely Conservative' resulted in 37 841 HMM profiles in our efam-XC database. To assess the value of this resource, we integrated efam-XC into VirSorter viral discovery software to discover viruses from less-studied, ecologically distinct oxygen minimum zone (OMZ) marine habitats. This expanded database led to an increase in viruses recovered from every tested OMZ virome by ∼24% on average (up to ∼42%) and especially improved the recovery of often-missed shorter contigs (<5 kb). Additionally, to help elucidate lesser-known viral protein functions, we annotated the profiles using multiple databases from the DRAM pipeline and virion-associated metaproteomic data, which doubled the number of annotations obtainable by standard, single-database annotation approaches. Together, these marine resources (efam and efam-XC) are provided as searchable, compressed HMM databases that will be updated bi-annually to help maximize viral sequence discovery and study from any ecosystem

    Biosynthetic potential of the global ocean microbiome

    8 pages, 4 figures, supplementary information https://doi.org/10.1038/s41586-022-04862-3. This Article is contribution number 130 of Tara Oceans.
    Natural microbial communities are phylogenetically and metabolically diverse. In addition to underexplored organismal groups [1], this diversity encompasses a rich discovery potential for ecologically and biotechnologically relevant enzymes and biochemical compounds [2,3]. However, studying this diversity to identify genomic pathways for the synthesis of such compounds [4] and assigning them to their respective hosts remains challenging. The biosynthetic potential of microorganisms in the open ocean remains largely uncharted owing to limitations in the analysis of genome-resolved data at the global scale. Here we investigated the diversity and novelty of biosynthetic gene clusters in the ocean by integrating around 10,000 microbial genomes from cultivated and single cells with more than 25,000 newly reconstructed draft genomes from more than 1,000 seawater samples. These efforts revealed approximately 40,000 putative mostly new biosynthetic gene clusters, several of which were found in previously unsuspected phylogenetic groups. Among these groups, we identified a lineage rich in biosynthetic gene clusters (‘Candidatus Eudoremicrobiaceae’) that belongs to an uncultivated bacterial phylum and includes some of the most biosynthetically diverse microorganisms in this environment. From these, we characterized the phospeptin and pythonamide pathways, revealing cases of unusual bioactive compound structure and enzymology, respectively. Together, this research demonstrates how microbiomics-driven strategies can enable the investigation of previously undescribed enzymes and natural products in underexplored microbial groups and environments.
    This work was supported by funding from the ETH and the Helmut Horten Foundation; the Swiss National Science Foundation (SNSF) through project grants 205321_184955 to S.S., 205320_185077 to J.P. and the NCCR Microbiomes (51NF40_180575) to S.S.; by the Gordon and Betty Moore Foundation (https://doi.org/10.37807/GBMF9204) and the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 101000392 (MARBLES) to J.P.; by an ETH research grant ETH-21 18-2 to J.P.; and by the Peter and Traudl Engelhorn Foundation and by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 897571 to C.C.F. S.L.R. was supported by an ETH Zurich postdoctoral fellowship 20-1 FEL-07. M.L., L.M.C. and G.Z. were supported by EMBL Core Funding and the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft, project no. 395357507, SFB 1371 to G.Z.). M.B.S. was supported by the NSF grant OCE#1829831. C.B. was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement Diatomic, no. 835067). S.G.A. was supported by the Spanish Ministry of Economy and Competitiveness (PID2020-116489RB-I00). M.K. and H.M. were funded by the SNSF grant 407540_167331 as part of the Swiss National Research Programme 75 ‘Big Data’. M.K., H.M. and A.K. are also partially funded by ETH core funding (to G. Rätsch). With the institutional support of the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S). Peer reviewed

    The association of obesity with cardiovascular events in patients with peripheral artery disease

    No full text
    Objectives: This systematic review aimed to summarise published evidence that has assessed the association of obesity with major cardiovascular events (CVEs) (non-fatal myocardial infarction, non-fatal stroke, cardiovascular death) in patients with peripheral artery disease (PAD). Methods: Studies investigating the association of markers of obesity with CVEs were identified by searching the PUBMED database. To be eligible for inclusion, studies had to report an established measure of adiposity, i.e. body mass index (BMI), waist circumference (WC), waist-to-hip ratio (WHR) or an imaging technique to quantify adipose distribution. Results: A total of 9319 patients with PAD were followed for a mean of 1.0-5.7 years in the 7 studies identified. Four studies assessed BMI; one study assessed BMI and WC; one study assessed BMI, WC and WHR; one study assessed WHR. Both studies that assessed multiple adipose measures reported a stronger positive association of WC than of BMI with CVEs; one study reported fewer CVEs in obese subjects as defined by BMI; one study reported a negative association of overweight but not obesity, defined by BMI, with CVEs; one study reported an inverse association of BMI >20 with CVEs; one study did not find a significant association between WHR and cardiovascular death; one study did not find a significant association between BMI and CVEs. Meta-analysis of reported risk ratios found a mild positive association between combined measures of obesity and CVEs in patients with PAD (RR 1.09; 95% CI 1.03-1.16, P=0.006; random effects model). Conclusion: This meta-analysis suggests that obesity is an independent risk factor for CVEs in patients with PAD; however, larger and more homogeneous studies using equivalent anthropometric measures are needed for more definitive evidence
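
The pooled risk ratio in this abstract comes from a random-effects model. As an illustration only, the Python sketch below pools log risk ratios with the DerSimonian-Laird estimator. The abstract does not say which estimator the authors used, so that choice, the function name `pool_random_effects`, and the input values in the example call are all assumptions, not data from the review.

```python
import math


def pool_random_effects(rrs, cis):
    """Pool risk ratios with a DerSimonian-Laird random-effects model (assumed estimator).

    rrs: per-study risk ratios.
    cis: per-study (lower, upper) 95% confidence bounds.
    Returns (pooled_rr, ci_low, ci_high, tau2).
    """
    z = 1.959964  # normal quantile for a two-sided 95% interval
    y = [math.log(rr) for rr in rrs]
    # Recover each standard error from the CI width on the log scale.
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for lo, hi in cis]

    # Fixed-effect (inverse-variance) step, needed for the heterogeneity statistic Q.
    w = [1 / s ** 2 for s in se]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0

    # Random-effects weights add tau2 to every study's variance.
    w_star = [1 / (s ** 2 + tau2) for s in se]
    sws = sum(w_star)
    y_re = sum(wi * yi for wi, yi in zip(w_star, y)) / sws
    se_re = math.sqrt(1 / sws)
    return (
        math.exp(y_re),
        math.exp(y_re - z * se_re),
        math.exp(y_re + z * se_re),
        tau2,
    )


# Made-up inputs, purely to show the call shape.
pooled, low, high, tau2 = pool_random_effects(
    [1.20, 0.95, 1.10], [(1.00, 1.44), (0.80, 1.13), (0.90, 1.34)]
)
```

When the studies agree closely, `tau2` falls to zero and the result matches a fixed-effect pooled estimate; larger heterogeneity widens the interval.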

    The association of genetic variants of matrix metalloproteinases with abdominal aortic aneurysm: a systematic review and meta-analysis

    No full text
    Context: Aberrant matrix turnover is believed to play a key role in the pathogenesis of abdominal aortic aneurysm (AAA). Matrix metalloproteinases (MMPs) and their inhibitors (tissue inhibitor of metalloproteinases; TIMPs) are important enzymes in the control of extracellular matrix remodelling. Objective: The aim of this study was to investigate if single nucleotide polymorphisms (SNPs) within MMP and TIMP gene families are associated with the presence of AAA. Data sources: We performed a search of MEDLINE and EMBASE databases on the 21st November 2012. Study selection: Case-control studies assessing the association of at least one SNP in a MMP or TIMP gene with AAA were included. Data extraction: Data were independently extracted by two reviewers. A random effects model was used to calculate combined odds ratios for commonly investigated SNPs according to dominant, recessive and additive inheritance. Results: Thirteen studies examining 58 SNPs within 10 different MMP and TIMP genes were identified. Eight SNPs were assessed in at least 3 studies (combined sample size ranging from 141-2191 AAA cases and 340-2013 controls) and included in a meta-analysis. Results on 1258 cases and 1406 controls for MMP3 rs3025058 showed an association with AAA presence; best described by a dominant pattern of inheritance (OR=1.48, 95% CI 1.23-1.78, p=3.95×10⁻⁵). No associations with AAA were identified for other SNPs assessed in this study including rs1799750 (MMP1), rs3918242 (MMP9), rs486055 (MMP10), rs2276109 (MMP12), rs2252070 (MMP13), rs4898 (TIMP1) or rs9619311 (TIMP3). Conclusion: A common SNP within the MMP3 promoter region, previously suggested to increase MMP3 expression, appears to be a moderate risk factor for AAA

    Sustainable Aviation Fuel from Hydrothermal Liquefaction of Wet Wastes

    No full text
    Hydrothermal liquefaction (HTL) uses heat and pressure to liquefy the organic matter in biomass/waste feedstocks to produce biocrude. When hydrotreated, the biocrude is converted into transportation fuels including sustainable aviation fuel (SAF). Further, by liquefying the organic matter in wet wastes such as sewage sludge, manure, and food waste, HTL can prevent landfilling or other disposal methods such as anaerobic digestion or incineration. A significant roadblock to the development of a new route for SAF is the strict approval process and the large volumes required (>400 L) for testing. Tier α and β testing can predict some of the properties required for ASTM testing with <400 mL samples. The current study is the first to investigate the potential for utilizing wet-waste HTL biocrude (WWHTLB) as an SAF feedstock. Herein, several WWHTLB samples were produced from food waste, sewage sludge, and fats, oils, and grease, and subsequently hydrotreated and distilled to produce SAF samples. The fuels (both undistilled and distilled samples) were analyzed via elemental analysis and 2D-GC-MS. Herein, we report the Tier α and β analysis of an SAF sample derived originally from a WWHTLB. The results of this work indicate that the upgraded WWHTLB material exhibits key fuel properties, including carbon number distribution, distillation profile, surface tension, density, viscosity, heat of combustion, and flash point, all of which fall within the required range for aviation fuel. WWHTLB has therefore been shown to be a promising candidate feedstock for the production of SAF

    Prokaryotic-virus-encoded auxiliary metabolic genes throughout the global oceans

    No full text
    International audience. Background: Prokaryotic microbes have impacted marine biogeochemical cycles for billions of years. Viruses also impact these cycles, through lysis, horizontal gene transfer, and encoding and expressing genes that contribute to metabolic reprogramming of prokaryotic cells. While this impact is difficult to quantify in nature, we hypothesized that it can be examined by surveying virus-encoded auxiliary metabolic genes (AMGs) and assessing their ecological context. Results: We systematically developed a global ocean AMG catalog by integrating previously described and newly identified AMGs and then placed this catalog into ecological and metabolic contexts relevant to ocean biogeochemistry. From 7.6 terabases of Tara Oceans paired prokaryote- and virus-enriched metagenomic sequence data, we increased known ocean virus populations to 579,904 (up 16%). From these virus populations, we then conservatively identified 86,913 AMGs that grouped into 22,779 sequence-based gene clusters, 7248 (~32%) of which were not previously reported. Using our catalog and modeled data from mock communities, we estimate that ~19% of ocean virus populations carry at least one AMG. To understand AMGs in their metabolic context, we identified 340 metabolic pathways encoded by ocean microbes and showed that AMGs map to 128 of them. Furthermore, we identified metabolic “hot spots” targeted by virus AMGs, including nine pathways where most steps (≥ 0.75) were AMG-targeted (involved in carbohydrate, amino acid, fatty acid, and nucleotide metabolism), as well as other pathways where virus-encoded AMGs outnumbered cellular homologs (involved in lipid A phosphates, phosphatidylethanolamine, creatine biosynthesis, phosphoribosylamine-glycine ligase, and carbamoyl-phosphate synthase pathways). Conclusions: Together, this systematically curated, global ocean AMG catalog and analyses provide a valuable resource and foundational observations to understand the role of viruses in modulating global ocean metabolisms and their biogeochemical implications
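
The "hot spot" criterion in this abstract (at least 0.75 of a pathway's steps targeted by AMGs) can be read as a simple set-overlap ratio. The Python sketch below illustrates that reading only. The function names, step identifiers, and example pathway are hypothetical, and the authors' actual step definitions and mapping procedure are not given in the abstract.

```python
def targeted_fraction(pathway_steps, amg_steps):
    """Fraction of a pathway's distinct steps that have at least one AMG hit."""
    steps = set(pathway_steps)
    if not steps:
        return 0.0
    return len(steps & set(amg_steps)) / len(steps)


def is_hot_spot(pathway_steps, amg_steps, threshold=0.75):
    """Flag a pathway when the AMG-targeted fraction reaches the threshold."""
    return targeted_fraction(pathway_steps, amg_steps) >= threshold


# Hypothetical four-step pathway with AMG hits on three of its steps.
example_steps = ["step1", "step2", "step3", "step4"]
example_hits = {"step1", "step2", "step3"}
example_flag = is_hot_spot(example_steps, example_hits)

# The abstract's headline proportions, recomputed from its own counts.
novel_cluster_pct = 100 * 7248 / 22779   # gene clusters not previously reported
mapped_pathway_pct = 100 * 128 / 340     # pathways with at least one mapped AMG
```

Recomputing the first proportion gives a value that rounds to the ~32% quoted in the abstract.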

    Synthesizing Signaling Pathways from Temporal Phosphoproteomic Data

    No full text
    We present a method for automatically discovering signaling pathways from time-resolved phosphoproteomic data. The Temporal Pathway Synthesizer (TPS) algorithm uses constraint-solving techniques first developed in the context of formal verification to explore paths in an interaction network. It systematically eliminates all candidate structures for a signaling pathway where a protein is activated or inactivated before its upstream regulators. The algorithm can model more than one hundred thousand dynamic phosphosites and can discover pathway members that are not differentially phosphorylated. By analyzing temporal data, TPS defines signaling cascades without needing to experimentally perturb individual proteins. It recovers known pathways and proposes pathway connections when applied to the human epidermal growth factor and yeast osmotic stress responses. Independent kinase mutant studies validate predicted substrates in the TPS osmotic stress pathway. Köksal et al. present a computational technique, the temporal pathway synthesizer (TPS), that combines time series global phosphoproteomic data and protein-protein interaction networks to reconstruct the vast signaling pathways that control post-translational modifications. National Science Foundation (U.S.) (grant DBI-1553206); National Institutes of Health (U.S.) (training grant T32-HL007312); National Institutes of Health (U.S.) (grant U01-CA184898); National Institutes of Health (U.S.) (grant U54-NS09104
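
The temporal rule described in this abstract (discard structures in which a protein changes state before its upstream regulators) can be illustrated with a toy edge filter. The Python sketch below is not the TPS algorithm, which the abstract describes as a constraint-solving procedure over whole networks. It is a simplified per-edge check with hypothetical names (`prune_edges`, proteins `A`, `B`, `C`) and made-up activation times. Keeping edges that touch proteins with no observed timing is an assumption of this sketch.

```python
def prune_edges(candidate_edges, activation_time):
    """Keep directed edges (upstream, downstream) consistent with temporal order.

    candidate_edges: iterable of (upstream, downstream) protein pairs.
    activation_time: dict mapping a protein to its first observed change (e.g. minutes).
    Proteins missing from activation_time impose no constraint, so their edges are kept.
    """
    kept = []
    for upstream, downstream in candidate_edges:
        t_up = activation_time.get(upstream)
        t_down = activation_time.get(downstream)
        # Reject only when both times are known and the target changes first.
        if t_up is None or t_down is None or t_up <= t_down:
            kept.append((upstream, downstream))
    return kept


# Hypothetical data: A changes at 2 min, B at 8 min, C has no timing data.
edges = [("A", "B"), ("B", "A"), ("A", "C")]
times = {"A": 2, "B": 8}
consistent = prune_edges(edges, times)
```

In this example the edge from `B` to `A` is dropped because `A` changes before `B`, while the edge into the untimed protein `C` is retained.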

    efam: an expanded, metaproteome-supported HMM profile database of viral protein families.

    Motivation: Viruses infect, reprogram, and kill microbes, leading to profound ecosystem consequences, from elemental cycling in oceans and soils to microbiome-modulated diseases in plants and animals. Although metagenomic datasets are increasingly available, identifying viruses in them is challenging due to poor representation and annotation of viral sequences in databases. Results: Here we establish efam, an expanded collection of Hidden Markov Model (HMM) profiles that represent viral protein families conservatively identified from the Global Ocean Virome 2.0 dataset. This resulted in 240,311 HMM profiles, each with at least 2 protein sequences, making efam >7-fold larger than the next largest, pan-ecosystem viral HMM profile database. Adjusting the criteria for viral contig confidence from "conservative" to "eXtremely Conservative" resulted in 37,841 HMM profiles in our efam-XC database. To assess the value of this resource, we integrated efam-XC into VirSorter viral discovery software to discover viruses from less-studied, ecologically distinct oxygen minimum zone (OMZ) marine habitats. This expanded database led to an increase in viruses recovered from every tested OMZ virome by ∼24% on average (up to ∼42%) and especially improved the recovery of often-missed shorter contigs (<5 kb). Additionally, to help elucidate lesser-known viral protein functions, we annotated the profiles using multiple databases from the DRAM pipeline and virion-associated metaproteomic data, which doubled the number of annotations obtainable by standard, single-database annotation approaches. Together, these marine resources (efam and efam-XC) are provided as searchable, compressed HMM databases that will be updated bi-annually to help maximize viral sequence discovery and study from any ecosystem. Availability: The resources are available on the iVirus platform at (doi.org/10.25739/9vze-4143). Supplementary information: Supplementary data are available at Bioinformatics online

    Long-read powered viral metagenomics in the oligotrophic Sargasso Sea

    No full text
    Dominant microorganisms of the Sargasso Sea are key drivers of the global carbon cycle. However, associated viruses that shape microbial community structure and function are not well characterised. Here, we combined short and long read sequencing to survey Sargasso Sea phage communities in virus- and cellular fractions at viral maximum (80 m) and mesopelagic (200 m) depths. We identified 2,301 Sargasso Sea phage populations from 186 genera. Over half of the phage populations identified here lacked representation in global ocean viral metagenomes, whilst 177 of the 186 identified genera lacked representation in genomic databases of phage isolates. Viral fraction and cell-associated viral communities were decoupled, indicating viral turnover occurred across periods longer than the sampling period of three days. Inclusion of long-read data was critical for capturing the breadth of viral diversity. Phage isolates that infect the dominant bacterial taxa Prochlorococcus and Pelagibacter, usually regarded as cosmopolitan and abundant, were poorly represented