University of Queensland eSpace

UNT Digital Library

ScholarBank@NUS

Millimeter-scale genetic gradients and community-level molecular convergence in a hypersaline microbial mat

Author: Brad M Bebout
Christian von Mering
DeLong EF
Garcia‐Pichel F
J Kirk Harris
Jeffrey J Walker
Jeroen Raes
John R Spear
Jorgensen BB
Kates M
Natalia Ivanova
Norman R Pace
Peer Bork
Philip Hugenholtz
Schmitt‐Wagner D
Tringe SG
Venter JC
Victor Kunin
von Mering C
Publication venue: Nature Publishing Group
Publication date: 01/01/2008

To investigate the extent of genetic stratification in structured microbial communities, we compared the metagenomes of 10 successive layers of a phylogenetically complex hypersaline mat from Guerrero Negro, Mexico. We found pronounced millimeter-scale genetic gradients that were consistent with the physicochemical profile of the mat. Despite these gradients, all layers displayed near-identical and acid-shifted isoelectric point profiles due to a molecular convergence of amino-acid usage, indicating that hypersalinity enforces an overriding selective pressure on the mat community.

MetaPath: identifying differentially abundant metabolic pathways in metagenomic datasets

Author: B Liu
B Rodriguez-Brito
Bo Liu
CS Riesenfeld
F Borson-Chazot
F Meyer
I Sharon
JD Storey
JR White
K Kurokawa
M Kanehisa
Mihai Pop
MR Fokkema
MT Dittrich
O Beja
PJ Turnbaugh
R Mojtabai
R Tungtrongchitr
RH Eckel
RL Tatusov
S Gallistl
S Hirsch
SG Tringe
T Ideker
TA Gianoulis
Publication venue: BioMed Central
Publication date: 01/01/2011

Enabled by rapid advances in sequencing technology, metagenomic studies aim to characterize entire communities of microbes, bypassing the need for culturing individual bacterial members. One major goal of metagenomic studies is to identify specific functional adaptations of microbial communities to their habitats. The functional profile and the abundances for a sample can be estimated by mapping metagenomic sequences to the global metabolic network consisting of thousands of molecular reactions. Here we describe a powerful analytical method (MetaPath) that can identify differentially abundant pathways in metagenomic datasets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge. First, we introduce a scoring function for an arbitrary subnetwork and find the max-weight subnetwork in the global network by a greedy search algorithm. Then we compute two p-values (p_abund and p_struct) using nonparametric approaches to answer two different statistical questions: (1) Is this subnetwork differentially abundant? (2) What is the probability of finding such good subnetworks by chance given the data and network structure? Finally, significant metabolic subnetworks are discovered based on these two p-values. In order to validate our methods, we have designed a simulated metabolic pathways dataset and show that MetaPath outperforms other commonly used approaches. We also demonstrate the power of our methods in analyzing two publicly available metagenomic datasets, and show that the subnetworks identified by MetaPath provide valuable insights into the biological activities of the microbiome. We have introduced a statistical method for finding significant metabolic subnetworks from metagenomic datasets. Compared with previous methods, results from MetaPath are more robust against noise in the data, and have significantly higher sensitivity and specificity (when tested on simulated datasets). When applied to two publicly available metagenomic datasets, the output of MetaPath is consistent with previous observations and also provides several new insights into the metabolic activity of the gut microbiome. The software is freely available at http://metapath.cbcb.umd.edu.
https://doi.org/10.1186/1753-6561-5-S2-S
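The two steps named in the abstract, a greedy search for a high-scoring subnetwork followed by a nonparametric significance test, can be illustrated with a small sketch. This is not the MetaPath implementation: the additive per-node score, the function names, and the toy network are all assumptions made for illustration, and the permutation test shown here only loosely corresponds to the structure-based p-value (p_struct) described above.

```python
import random

def greedy_subnetwork(graph, score, seed):
    """Grow a connected node set from `seed`, repeatedly adding the
    neighbouring node with the highest score while that score is positive.
    Assumption: a subnetwork's weight is the sum of its node scores."""
    chosen = {seed}
    total = score[seed]
    while True:
        frontier = {n for v in chosen for n in graph[v]} - chosen
        best = max(frontier, key=lambda n: score[n], default=None)
        if best is None or score[best] <= 0:
            return chosen, total
        chosen.add(best)
        total += score[best]

def best_subnetwork(graph, score):
    """Run the greedy search from every node and keep the heaviest result."""
    return max((greedy_subnetwork(graph, score, s) for s in graph),
               key=lambda result: result[1])

def permutation_p_value(graph, score, observed, n_perm=200, rng=None):
    """Shuffle scores over the fixed network and count how often the greedy
    search finds a subnetwork at least as heavy as the observed one."""
    rng = rng or random.Random(0)
    nodes = list(graph)
    values = [score[n] for n in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = dict(zip(nodes, values))
        if best_subnetwork(graph, shuffled)[1] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical four-reaction pathway laid out as a simple chain.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
score = {"A": 2.0, "B": 1.5, "C": -1.0, "D": 3.0}

nodes, weight = best_subnetwork(graph, score)
p_value = permutation_p_value(graph, score, weight)
```

In this toy network the search started from node C is the only one that crosses the negatively scored node, which is why the sketch restarts the greedy search from every node rather than from a single seed.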

Digital Repository at the University of Maryland

TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets

Author: A Djikeng
AH Wright
AR Quinlan
C Quince
D Gusfield
DB Jaffe
EA Dinsdale
Forest Rohwer
G Myers
G Navarro
GR Reyes
J Falgueras
JC Dohm
JR Cole
M Margulies
P Froussard
P Schloss
PD Schloss
PJA Cock
RA Baeza-Yates
Robert Edwards
Robert Schmieder
RV Thurber
S Diguistini
S Huse
S Nakamura
SG Tringe
V Kunin
Y Chen
Yan Wei Lim
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Sequencing metagenomes that were pre-amplified with primer-based methods requires the removal of the additional tag sequences from the datasets. The sequenced reads can contain deletions or insertions due to sequencing limitations, and the primer sequence may contain ambiguous bases. Furthermore, the tag sequence may be unavailable or incorrectly reported. Because of the potential for downstream inaccuracies introduced by unwanted sequence contaminations, it is important to use reliable tools for pre-processing sequence data. Results: TagCleaner is a web application developed to automatically identify and remove known or unknown tag sequences allowing insertions and deletions in the dataset. TagCleaner is designed to filter the trimmed reads for duplicates, short reads, and reads with high rates of ambiguous sequences. An additional screening for and splitting of fragment-to-fragment concatenations that gave rise to artificial concatenated sequences can increase the quality of the dataset. Users may modify the different filter parameters according to their own preferences. Conclusions: TagCleaner is a publicly available web application that is able to automatically detect and efficiently remove tag sequences from metagenomic datasets. It is easily configurable and provides a user-friendly interface. The interactive web interface facilitates export functionality for subsequent data processing, and is available at http://edwards.sdsu.edu/tagcleaner.
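Removing a known tag while tolerating insertions, deletions, and ambiguous bases can be sketched with a standard edit-distance table. The code below is an illustration only and does not reproduce TagCleaner's algorithm: the function names, the `max_errors` parameter, and the rule that an `N` in the tag matches any base are assumptions.

```python
def bases_match(tag_base, read_base):
    """Assumption: 'N' in the tag is an ambiguous base that matches anything."""
    return tag_base == "N" or tag_base == read_base

def trim_tag(read, tag, max_errors=1):
    """Remove a leading copy of `tag` from `read`, allowing up to
    `max_errors` substitutions, insertions, or deletions.
    Returns the read unchanged if no prefix is close enough."""
    # Only prefixes up to len(tag) + max_errors can be within the error budget.
    limit = min(len(read), len(tag) + max_errors)
    # prev[j] holds the edit distance between tag[:i] and read[:j].
    prev = list(range(limit + 1))
    for i in range(1, len(tag) + 1):
        cur = [i] + [0] * limit
        for j in range(1, limit + 1):
            cost = 0 if bases_match(tag[i - 1], read[j - 1]) else 1
            cur[j] = min(prev[j] + 1,         # tag base missing from the read
                         cur[j - 1] + 1,      # extra base inserted in the read
                         prev[j - 1] + cost)  # match or substitution
        prev = cur
    # min() returns the first minimum, so ties go to the shorter prefix
    # and less sequence is removed.
    best_len = min(range(limit + 1), key=prev.__getitem__)
    if prev[best_len] > max_errors:
        return read
    return read[best_len:]

exact = trim_tag("ACGTGGCCA", "ACGT")        # tag present without errors
mismatch = trim_tag("ACCTGGCCA", "ACGT")     # one substitution in the tag
ambiguous = trim_tag("ACGTGG", "ACNT")       # ambiguous base in the tag
untagged = trim_tag("TTTTGGCCA", "ACGT")     # no tag within one error
```

Resolving ties in favour of the shorter prefix is a conservative design choice for this sketch; a real tool might weigh the alternatives differently.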

Short clones or long clones? A simulation study on the use of paired reads in metagenomics

Author: C von Mering
D Benson
D Bentley
D MacLean
Daniel H Huson
DB Rusch
DC Richter
DH Huson
DR Bentley
FW J Kuever
I Korf
J Frias-Lopez
JC Venter
JE Koenig
K Mavromatis
M Ashburner
M Margulies
Max Schubach
R Overbeek
R Seshadri
S Mitra
SF Altschul
SG Tringe
Suparna Mitra
T Urich
T Woyke
V Kunin
VM Markowitz
W Miller
W Qi
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Metagenomics is the study of environmental samples using sequencing. Rapid advances in sequencing technology are fueling a vast increase in the number and scope of metagenomics projects. Most metagenome sequencing projects so far have been based on Sanger or Roche-454 sequencing, as only these technologies provide long enough reads, while Illumina sequencing has not been considered suitable for metagenomic studies due to a short read length of only 35 bp. However, now that reads of length 75 bp can be sequenced in pairs, Illumina sequencing has become a viable option for metagenome studies. Results: This paper addresses the problem of taxonomical analysis of paired reads. We describe a new feature of our metagenome analysis software MEGAN that allows one to process sequencing reads in pairs and makes assignments of such reads based on the combined bit scores of their matches to reference sequences. Using this new software in a simulation study, we investigate the use of Illumina paired-sequencing in taxonomical analysis and compare the performance of single reads, short clones and long clones. In addition, we also compare against simulated Roche-454 sequencing runs. Conclusion: This work shows that paired reads perform better than single reads, as expected, but also, perhaps slightly less obviously, that long clones allow more specific assignments than short ones. A new version of the program MEGAN that explicitly takes paired reads into account is available from our website.
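The idea of assigning a read pair from the combined bit scores of both mates can be illustrated as follows. This is a minimal sketch under stated assumptions, not MEGAN's code: the function names, the `top_fraction` cutoff, the lowest-common-ancestor step over hand-written lineages, and all scores are hypothetical.

```python
def combine_pair_scores(hits_mate1, hits_mate2):
    """Sum the bit scores of the two mates for each reference sequence."""
    combined = dict(hits_mate1)
    for ref, bits in hits_mate2.items():
        combined[ref] = combined.get(ref, 0.0) + bits
    return combined

def lowest_common_ancestor(lineages):
    """Return the longest prefix shared by all lineages (root first)."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared

def assign_pair(hits_mate1, hits_mate2, lineage_of, top_fraction=0.9):
    """Keep references whose combined score is within `top_fraction` of the
    best one, then place the pair on their lowest common ancestor."""
    combined = combine_pair_scores(hits_mate1, hits_mate2)
    if not combined:
        return []
    cutoff = top_fraction * max(combined.values())
    kept = [lineage_of[ref] for ref, s in combined.items() if s >= cutoff]
    return lowest_common_ancestor(kept)

# Hypothetical reference lineages and bit scores.
lineage_of = {
    "refA": ["Bacteria", "Proteobacteria", "Escherichia"],
    "refB": ["Bacteria", "Proteobacteria", "Salmonella"],
    "refC": ["Bacteria", "Firmicutes", "Bacillus"],
}
mate1 = {"refA": 60, "refB": 58, "refC": 55}
mate2 = {"refA": 70}

single = assign_pair(mate1, {}, lineage_of)     # first mate on its own
paired = assign_pair(mate1, mate2, lineage_of)  # both mates combined
```

With these made-up scores the single read is only placed at the domain level, while the pair is placed at the genus level because the second mate supports just one of the candidate references.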

ScholarBank@NUS

Analysis of 16S rRNA Amplicon Sequencing Options on the Roche/454 Next-Generation Titanium Sequencing Platform

Author: A Engelbrektson
A Lykidis
Ahmed Moustafa
Chiachi Hwang
Chris L. Wright
GD Wu
Hideyuki Tamaki
JR Cole
Jyothi Thimmapuram
M Palatinszky
MF Polz
ML Sogin
N Youssef
Q Wang
Qiaoyan Lin
SG Tringe
Shiping Wang
SM Huse
TJ Hamp
TM Schmidt
Wen-Tso Liu
Xiangzhen Li
Y Wang
Yoichi Kamagata
Z Liu
Publication venue: Public Library of Science

BACKGROUND: The 16S rRNA gene pyrosequencing approach has revolutionized studies in microbial ecology. While primer selection and short read length can affect the resulting microbial community profile, little is known about the influence of pyrosequencing methods on the sequencing throughput and the outcome of microbial community analyses. The aim of this study is to compare differences in output, ease, and cost among three different amplicon pyrosequencing methods for the Roche/454 Titanium platform. METHODOLOGY/PRINCIPAL FINDINGS: The following three pyrosequencing methods for 16S rRNA genes were selected in this study: Method-1 (standard method) is the recommended method for bi-directional sequencing using the LIB-A kit; Method-2 is a new option designed in this study for unidirectional sequencing with the LIB-A kit; and Method-3 uses the LIB-L kit for unidirectional sequencing. In our comparison among these three methods using 10 different environmental samples, Method-2 and Method-3 produced 1.5-1.6 times more usable reads than the standard method (Method-1) after quality-based trimming, and did not compromise the outcome of microbial community analyses. Specifically, Method-3 is the most cost-effective unidirectional amplicon sequencing method, as it provided the most reads and required the least effort in consumables management. CONCLUSIONS: Our findings clearly demonstrated that alternative pyrosequencing methods for 16S rRNA genes could drastically affect sequencing output (e.g. number of reads before and after trimming) but have little effect on the outcomes of microbial community analysis. This finding is important for both researchers and sequencing facilities utilizing 16S rRNA gene pyrosequencing for microbial ecological studies.

Analysis and comparison of very large metagenomes with fast clustering and functional annotation

Author: AC McHardy
AR Quinlan
B Rodriguez-Brito
D Sheskin
DB Rusch
DC Richter
DH Huson
E Portugaly
EA Dinsdale
EF DeLong
FE Angly
GW Tyson
H Noguchi
H Teeling
J Shendure
JC Venter
K Mavromatis
KJ Hoff
L Krause
PD Schloss
R Seshadri
RK Aziz
S Yooseph
SF Altschul
SG Tringe
SR Eddy
SR Gill
W Li
Weizhong Li
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

Bacterial Genomes: Habitat Specificity and Uncharted Organisms

Author: A Bernal
C Pedrós-Alió
D Wu
EA Dinsdale
FE Angly
Fernando Dini Andreote
Francisco Dini-Andreote
GR Burke
H Toh
J Raes
JA Gilbert
Jack T. Trevors
JAG Ranea
Jan Dirk van Elsas
JE Barrick
JK Harris
JT Trevors
L Oksana
L Philippot
M Touchon
M Wagner
ML Sogin
NR Pace
P Lapierre
P Yilmaz
PKH Lee
RT Jones
S Abby
SG Tringe
T Ishoey
T Woyke
TM Vogel
Welington Luiz Araújo
Publication venue: Springer-Verlag
Publication date: 01/01/2012

The capability and speed in generating genomic data have increased profoundly since the release of the draft human genome in 2000. Additionally, sequencing costs have continued to plummet as the next generation of highly efficient sequencing technologies (next-generation sequencing) became available and commercial facilities promote market competition. However, new challenges have emerged as researchers attempt to efficiently process the massive amounts of sequence data being generated. First, the described genome sequences are unequally distributed among the branches of bacterial life and, second, bacterial pan-genomes are often not considered when setting aims for sequencing projects. Here, we propose that scientists should aim for a more even genomic representation of organisms across most of the bacterial tree of life. Moreover, they should take into account the natural variation that is often observed within bacterial species and the role of the often changing surrounding environment and natural selection pressures, which is central to bacterial speciation and genome evolution. Not only will such efforts contribute to our overall understanding of the microbial diversity extant in ecosystems as well as the structuring of the extant genomes, but they will also facilitate the development of better methods for (meta)genome annotation.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes

Author: A Rodriguez
A Wilke
AC McHardy
B Rodriguez-Brito
D Field
D Paarmann
EA Dinsdale
EM Glass
F Liang
F Meyer
F Rohwer
GW Tyson
J Wilkening
J Wuyts
JC Venter
JR Cole
L Krause
L Wegley
LK McNeil
M D'Souza
M Kubal
M Margulies
N Fierer
R Leplae
R Olson
R Overbeek
R Stevens
RA Edwards
RK Aziz
SF Altschul
SG Tringe
SM Huse
T Jarvie
T Paczian
TZ DeSantis
XSS Mou
Publication venue: BioMed Central
Publication date: 01/01/2008

Methods for comparative metagenomics

Author: A Bernal
Alexander F Auch
B Rodriguez-Brito
C Lozupone
C von Mering
CL Wells
CR Woese
D Benson
D Bentley
Daniel C Richter
Daniel H Huson
DB Rusch
DH Huson
FD Ciccarelli
GW Tyson
HN Poinar
J Raes
JC Venter
L Krause
M Ashburner
M Margulies
N Lang-Unnasch
PB Eckburg
PJ Turnbaugh
R Lambert
R Overbeek
RL Tatusov
SF Altschul
SG Tringe
SR Gill
Stephan C Schuster
Suparna Mitra
T Urich
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Metagenomics is a rapidly growing field of research that aims at studying uncultured organisms to understand the true diversity of microbes, their functions, cooperation and evolution, in environments such as soil, water, ancient remains of animals, or the digestive system of animals and humans. The recent development of ultra-high throughput sequencing technologies, which do not require cloning or PCR amplification, and can produce huge numbers of DNA reads at an affordable cost, has boosted the number and scope of metagenomic sequencing projects. Increasingly, there is a need for new ways of comparing multiple metagenomics datasets, and for fast and user-friendly implementations of such approaches. Results: This paper introduces a number of new methods for interactively exploring, analyzing and comparing multiple metagenomic datasets, which will be made freely available in a new, comparative version 2.0 of the stand-alone metagenome analysis tool MEGAN. Conclusion: There is a great need for powerful and user-friendly tools for comparative analysis of metagenomic data and MEGAN 2.0 will help to fill this gap.