Search CORE

10 research outputs found

Unraveling the functional dark matter through global metagenomics

Author: Acinas Silvia G.
Azad Ariful
Baker David
Baltoumas Fotis A.
Buluç Aydin
Call Lee
Camargo Antonio Pedro
Chen I. Min
Iliopoulos Ioannis
Ivanova Natalia N.
Karatzas Evangelos
Konstantinidis Konstantinos T.
Kyrpides Nikos C.
Liu Sirui
Nayfach Stephen
Novel Metagenome Protein Families Consortium
Ouzounis Christos
Ovchinnikov Sergey
Pavlopoulos Georgios A.
Pett-Ridge Jennifer
Páez-Espino A. David
Roux Simon
Selvitopi Oguz
Tiedje James M.
Visel Axel
Publication venue: Nature Publishing Group
Publication date: 01/10/2023

30 pages, 4 figures, 1 table, supplementary information: https://doi.org/10.1038/s41586-023-06583-7.

Data availability: All of the analysed datasets along with their corresponding sequences are available from the IMG system (http://img.jgi.doe.gov/). A list of the datasets used in this study is provided in Supplementary Data 8. All data from the protein clusters, including sequences, multiple alignments, HMM profiles, 3D structure models, and taxonomic and ecosystem annotation, are available through NMPFamsDB, publicly accessible at www.nmpfamsdb.org. The 3D models are also available at ModelArchive under accession code ma-nmpfamsdb.

Code availability: Sequence analysis was performed using Tantan (https://gitlab.com/mcfrith/tantan), BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), LAST (https://gitlab.com/mcfrith/last), HMMER (http://hmmer.org/) and HH-suite3 (https://github.com/soedinglab/hh-suite). Clustering was performed using HipMCL (https://bitbucket.org/azadcse/hipmcl/src/master/). Additional taxonomic annotation was performed using Whokaryote (https://github.com/LottePronk/whokaryote), EukRep (https://github.com/patrickwest/EukRep), DeepVirFinder (https://github.com/jessieren/DeepVirFinder) and MMseqs2 (https://github.com/soedinglab/MMseqs2). 3D modelling was performed using AlphaFold2 (https://github.com/deepmind/alphafold) and TrRosetta2 (https://github.com/RosettaCommons/trRosetta2). Structural alignments were performed using TMalign (https://zhanggroup.org/TM-align/) and MMalign (https://zhanggroup.org/MM-align/). All custom scripts used for the generation and analysis of the data are available at Zenodo (https://doi.org/10.5281/zenodo.8097349).

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities1,2. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database3. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.

With the institutional support of the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S).

Peer reviewed.
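The two size thresholds quoted in the abstract above (proteins longer than 35 amino acids, clusters with more than 100 members) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used the tools listed under Code availability. The function names, the dictionary-based inputs and the toy data are assumptions made only for illustration.

```python
from collections import defaultdict

# Thresholds taken from the abstract; everything else here is hypothetical.
MIN_LENGTH = 35    # keep proteins strictly longer than 35 amino acids
MIN_MEMBERS = 100  # keep clusters with strictly more than 100 members


def filter_by_length(proteins, min_length=MIN_LENGTH):
    """Return only the proteins whose sequence is longer than min_length."""
    return {pid: seq for pid, seq in proteins.items() if len(seq) > min_length}


def large_clusters(assignments, min_members=MIN_MEMBERS):
    """Group protein IDs by cluster label and keep clusters above min_members."""
    clusters = defaultdict(list)
    for pid, label in assignments.items():
        clusters[label].append(pid)
    return {label: ids for label, ids in clusters.items() if len(ids) > min_members}


if __name__ == "__main__":
    # Toy input: one protein above the length cutoff, one exactly at it.
    toy_proteins = {"p1": "M" * 36, "p2": "M" * 35}
    print(sorted(filter_by_length(toy_proteins)))

    # Toy cluster assignment: c1 has 101 members, c2 has exactly 100.
    toy_assignments = {f"a{i}": "c1" for i in range(101)}
    toy_assignments.update({f"b{i}": "c2" for i in range(100)})
    print(sorted(large_clusters(toy_assignments)))
```

Both comparisons are strict, following the abstract's wording ("longer than", "more than"), so a 35-residue protein or a 100-member cluster is excluded.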

Digital.CSIC

The Standard European Vector Architecture (SEVA): a coherent platform for the analysis and deployment of complex prokaryotic phenotypes

Author: A. David Páez-Espino
Aitor de las Heras
Alejandro Arce-Rodríguez
Belén Calles
Esteban Martínez-García
Gonzalo Durante-Rodríguez
Juhyun Kim
Max Chavarría
Pablo I. Nikel
Rafael Silva-Rocha
Raúl Platero
Víctor de Lorenzo
Publication venue: Oxford University Press (OUP)
Publication date: 22/11/2012

The 'Standard European Vector Architecture' database (SEVA-DB, http://seva.cnb.csic.es) was conceived as a user-friendly, web-based resource and a material clone repository to assist in the choice of optimal plasmid vectors for de-constructing and re-constructing complex prokaryotic phenotypes. The SEVA-DB adopts simple design concepts that facilitate the swapping of functional modules and the extension of genome engineering options to microorganisms beyond typical laboratory strains. Under the SEVA standard, every DNA portion of the plasmid vectors is minimized, edited for flaws in their sequence and/or functionality, and endowed with physical connectivity through three inter-segment insulators that are flanked by fixed, rare restriction sites. Such a scaffold enables the exchangeability of multiple origins of replication and diverse antibiotic selection markers to shape a frame for their further combination with a large variety of cargo modules that can be used for varied end-applications. The core collection of constructs that are available at the SEVA-DB has been produced as a starting point for the further expansion of the formatted vector platform. We argue that adoption of the SEVA format can become a shortcut to fill the phenomenal gap between the existing power of DNA synthesis and the actual engineering of predictable and efficacious bacteria

Crossref

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

Metagenomic compendium of 189,680 DNA viruses from the human gut microbiome.

Author: Bhatt Ami S
Call Lee
Fischbach Michael A
Hugenholtz Philip
Ivanova Natalia N
Kyrpides Nikos C
Low Soo Jen
Nayfach Stephen
Proal Amy D
Páez-Espino David
Sberro Hila
Publication venue: eScholarship, University of California
Publication date: 01/01/2021

Bacteriophages have important roles in the ecology of the human gut microbiome but are under-represented in reference databases. To address this problem, we assembled the Metagenomic Gut Virus catalogue that comprises 189,680 viral genomes from 11,810 publicly available human stool metagenomes. Over 75% of genomes represent double-stranded DNA phages that infect members of the Bacteroidia and Clostridia classes. Based on sequence clustering we identified 54,118 candidate viral species, 92% of which were not found in existing databases. The Metagenomic Gut Virus catalogue improves detection of viruses in stool metagenomes and accounts for nearly 40% of CRISPR spacers found in human gut Bacteria and Archaea. We also produced a catalogue of 459,375 viral protein clusters to explore the functional potential of the gut virome. This revealed tens of thousands of diversity-generating retroelements, which use error-prone reverse transcription to mutate target genes and may be involved in the molecular arms race between phages and their bacterial hosts

PubMed Central

eScholarship - University of California

The Standard European Vector Architecture (SEVA): a coherent platform for the analysis and deployment of complex prokaryotic phenotypes.

Author: Arce-Rodríguez Alejandro
Calles Belén
Chavarría Max
de Las Heras Aitor
de Lorenzo Víctor
Durante-Rodríguez Gonzalo
Kim Juhyun
Martínez-García Esteban
Nikel Pablo I
Platero Raúl
Páez-Espino A David
Silva-Rocha Rafael
Publication venue: eScholarship, University of California
Publication date: 01/01/2013

eScholarship - University of California

IMG/VR v3: an integrated ecological and evolutionary framework for interrogating genomes of uncultivated viruses.

Author: Call Lee
Chen I-Min A
Chu Ken
Eloe-Fadrosh Emiley A
Ivanova Natalia N
Kyrpides Nikos C
Nayfach Stephen
Neches Russell Y
Palaniappan Krishna
Páez-Espino David
Ratner Anna
Reddy TBK
Roux Simon
Schulz Frederik
Woyke Tanja
Publication venue: eScholarship, University of California
Publication date: 01/01/2021

Viruses are integral components of all ecosystems and microbiomes on Earth. Through pervasive infections of their cellular hosts, viruses can reshape microbial community structure and drive global nutrient cycling. Over the past decade, viral sequences identified from genomes and metagenomes have provided an unprecedented view of viral genome diversity in nature. Since 2016, the IMG/VR database has provided access to the largest collection of viral sequences obtained from (meta)genomes. Here, we present the third version of IMG/VR, composed of 18 373 cultivated and 2 314 329 uncultivated viral genomes (UViGs), nearly tripling the total number of sequences compared to the previous version. These clustered into 935 362 viral Operational Taxonomic Units (vOTUs), including 188 930 with two or more members. UViGs in IMG/VR are now reported as single viral contigs, integrated proviruses or genome bins, and are annotated with a new standardized pipeline including genome quality estimation using CheckV, taxonomic classification reflecting the latest ICTV update, and expanded host taxonomy prediction. The new IMG/VR interface enables users to efficiently browse, search, and select UViGs based on genome features and/or sequence similarity. IMG/VR v3 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR

eScholarship - University of California

Influence of the polar light cycle on seasonal dynamics of an Antarctic lake microbial community.

Author: Allen Michelle A
Berg Maureen
Bevington James
Brazendale Sarah
Cavicchioli Ricardo
Chen I-Min A
Eloe-Fadrosh Emiley A
Hancock Alyce M
Huntemann Marcel
Kyrpides Nikos C
Nayfach Stephen
Panwar Pratibha
Páez-Espino David
Roux Simon
Schulz Frederik
Shapiro Nicole
Williams Timothy J
Woyke Tanja
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Background: Cold environments dominate the Earth's biosphere and microbial activity drives ecosystem processes thereby contributing greatly to global biogeochemical cycles. Polar environments differ to all other cold environments by experiencing 24-h sunlight in summer and no sunlight in winter. The Vestfold Hills in East Antarctica contains hundreds of lakes that have evolved from a marine origin only 3000-7000 years ago. Ace Lake is a meromictic (stratified) lake from this region that has been intensively studied since the 1970s. Here, a total of 120 metagenomes representing a seasonal cycle and four summers spanning a 10-year period were analyzed to determine the effects of the polar light cycle on microbial-driven nutrient cycles.

Results: The lake system is characterized by complex sulfur and hydrogen cycling, especially in the anoxic layers, with multiple mechanisms for the breakdown of biopolymers present throughout the water column. The two most abundant taxa are phototrophs (green sulfur bacteria and cyanobacteria) that are highly influenced by the seasonal availability of sunlight. The extent of the Chlorobium biomass thriving at the interface in summer was captured in underwater video footage. The Chlorobium abundance dropped from up to 83% in summer to 6% in winter and 1% in spring, before rebounding to high levels. Predicted Chlorobium viruses and cyanophage were also abundant, but their levels did not negatively correlate with their hosts.

Conclusion: Over-wintering expeditions in Antarctica are logistically challenging, meaning insight into winter processes has been inferred from limited data. Here, we found that in contrast to chemolithoautotrophic carbon fixation potential of Southern Ocean Thaumarchaeota, this marine-derived lake evolved a reliance on photosynthesis. While viruses associated with phototrophs also have high seasonal abundance, the negative impact of viral infection on host growth appeared to be limited. The microbial community as a whole appears to have developed a capacity to generate biomass and remineralize nutrients, sufficient to sustain itself between two rounds of sunlight-driven summer-activity. In addition, this unique metagenome dataset provides considerable opportunity for future interrogation of eukaryotes and their viruses, abundant uncharacterized taxa (i.e. dark matter), and for testing hypotheses about endemic species in polar aquatic ecosystems.

eScholarship - University of California

University of Tasmania Open Access Repository

UNSWorks

A genomic catalog of Earth's microbiomes.

Author: Abreu Helena
Arkin Adam P.
Chen I-Min
Chivian Dylan
Dehal Paramvir
Edirisinghe Janaka N.
Eloe-Fadrosh Emiley A.
et al.
Faria José P.
Henry Christopher S.
Huntemann Marcel
IMG/M Data Consortium
Ivanova Natalia N.
Jungbluth Sean P.
Kirton Edward
Kyrpides Nikos C.
Ladau Joshua
Magnabosco Cara
Mouncey Nigel
Mukherjee Supratim
Nayfach Stephen
Nielsen Torben
Palaniappan Krishna
Páez-Espino David
Reddy T.B.K.
Roux Simon
Schulz Frederik
Seshadri Rekha
Tringe Susannah G.
Udwary Daniel
Varghese Neha
Visel Axel
Wood-Charlson Elisha M.
Woyke Tanja
Wu Dongying
Publication venue: eScholarship, University of California
Publication date: 01/04/2021

The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth's continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes

Repository for Publications and Research Data

eScholarship - University of California

Minimum Information about an Uncultivated Virus Genome (MIUViG)

Author: Alejandro Reyes
Andrew M Kropinski
Arvind Varsani
Bas E Dutilh
Ben Temperton
Bonnie L Hurwitz
Catherine Putonti
Christelle Desnues
Clara Amid
Curtis A Suttle
David Páez-Espino
Emiley A Eloe-Fadrosh
Eugene V Koonin
Evelien M Adriaenssens
Francisco Rodriguez-Valera
François Enault
Frederik Schulz
Grieg F Steward
Guy R Cochrane
Hiroyuki Ogata
Ilene Karsch Mizrachi
J Rodney Brister
Jed A Fuhrman
Jens H Kuhn
Jessica M Labonté
Joanne B Emerson
K Eric Wommack
Karyna Rosario
Katrine L Whiteson
Kelly C Wrighton
Kyung-Bum Lee
Lisa Zeigler Allen
Lynn Schriml
Manuel Martinez-Garcia
Marie-Agnès Petit
Mark J Young
Mart Krupovic
Matthew B Sullivan
Melissa B Duhaime
Mya Breitbart
Natalia N Ivanova
Natalya Yutin
Nicole S Webster
Nikos C Kyrpides
Pascal Hingamp
Peer Bork
Pelin Yilmaz
Philip Hugenholtz
Ramy K Aziz
Rebecca A Daly
Rebecca Vega Thurber
Rex R Malmstrom
Rob Lavigne
Seth R Bordenstein
Shinichi Sunagawa
Simon Roux
Steven W Wilhelm
Susannah G Tringe
Takashi Yoshida
Tanja Woyke
Thomas Rattei
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

We present an extension of the Minimum Information about any (x) Sequence (MIxS) standard for reporting sequences of uncultivated virus genomes. Minimum Information about an Uncultivated Virus Genome (MIUViG) standards were developed within the Genomic Standards Consortium framework and include virus origin, genome quality, genome annotation, taxonomic classification, biogeographic distribution and in silico host prediction. Community-wide adoption of MIUViG standards, which complement the Minimum Information about a Single Amplified Genome (MISAG) and Metagenome-Assembled Genome (MIMAG) standards for uncultivated bacteria and archaea, will improve the reporting of uncultivated virus genomes in public databases. In turn, this should enable more robust comparative studies and a systematic exploration of the global virosphere.

Repositorio Institucional de la Universidad de Alicante

University of Liverpool Repository

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

HAL AMU

HAL Clermont Université

HAL Descartes

Loyola eCommons

Repository for Publications and Research Data

Crossref

HAL-INSU

HAL-IRD

eScholarship - University of California

Utrecht University Repository

HAL-Pasteur

A genomic catalog of Earth’s microbiomes

Author: Abreu Helena
Arkin Adam P.
Chen I-Min
Chivian Dylan
Dehal Paramvir
Edirisinghe Janaka N.
Eloe-Fadrosh Emiley A.
et al.
Faria José P.
Henry Christopher S.
Huntemann Marcel
IMG/M Data Consortium
Ivanova Natalia N.
Jungbluth Sean P.
Kirton Edward
Kyrpides Nikos C.
Ladau Joshua
Magnabosco Cara
Mouncey Nigel
Mukherjee Supratim
Nayfach Stephen
Nielsen Torben
Palaniappan Krishna
Páez-Espino David
Reddy T.B.K.
Roux Simon
Schulz Frederik
Seshadri Rekha
Tringe Susannah G.
Udwary Daniel
Varghese Neha
Visel Axel
Wood-Charlson Elisha M.
Woyke Tanja
Wu Dongying
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2021

The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth’s continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.

ISSN: 1546-1696
ISSN: 1087-015

Repository for Publications and Research Data

Minimum Information about an Uncultivated Virus Genome (MIUViG)

Author: Alejandro Reyes
Andrew M Kropinski
Arvind Varsani
Bas E Dutilh
Ben Temperton
Bonnie L Hurwitz
Catherine Putonti
Christelle Desnues
Clara Amid
Curtis A Suttle
David Páez-Espino
Emiley A Eloe-Fadrosh
Eugene V Koonin
Evelien M Adriaenssens
Francisco Rodriguez-Valera
François Enault
Frederik Schulz
Grieg F Steward
Guy R Cochrane
Hiroyuki Ogata
Ilene Karsch Mizrachi
J Rodney Brister
Jed A Fuhrman
Jens H Kuhn
Jessica M Labonté
Joanne B Emerson
K Eric Wommack
Karyna Rosario
Katrine L Whiteson
Kelly C Wrighton
Kyung-Bum Lee
Lisa Zeigler Allen
Lynn Schriml
Manuel Martinez-Garcia
Marie-Agnès Petit
Mark J Young
Mart Krupovic
Matthew B Sullivan
Melissa B Duhaime
Mya Breitbart
Natalia N Ivanova
Natalya Yutin
Nicole S Webster
Nikos C Kyrpides
Pascal Hingamp
Peer Bork
Pelin Yilmaz
Philip Hugenholtz
Ramy K Aziz
Rebecca A Daly
Rebecca Vega Thurber
Rex R Malmstrom
Rob Lavigne
Seth R Bordenstein
Shinichi Sunagawa
Simon Roux
Steven W Wilhelm
Susannah G Tringe
Takashi Yoshida
Tanja Woyke
Thomas Rattei
Publication venue: Springer Science and Business Media LLC

Crossref