133 research outputs found

Subsystems-based servers for rapid annotation of genomes and metagenomes

Author: CS Riesenfeld
D Wu
F Meyer
F Meyer
J Handelsman
LK McNeil
R Overbeek
R Overbeek
RA Edwards
Ramy Karam Aziz
RK Aziz
RK Aziz
SC Schuster
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MetaPath: identifying differentially abundant metabolic pathways in metagenomic datasets

Author: B Liu
B Rodriguez-Brito
Bo Liu
CS Riesenfeld
F Borson-Chazot
F Meyer
I Sharon
JD Storey
JR White
K Kurokawa
M Kanehisa
Mihai Pop
MR Fokkema
MT Dittrich
O Beja
PJ Turnbaugh
PJ Turnbaugh
R Mojtabai
R Tungtrongchitr
RH Eckel
RL Tatusov
S Gallistl
S Hirsch
SG Tringe
T Ideker
TA Gianoulis
Publication venue: BioMed Central
Publication date: 01/01/2011

Enabled by rapid advances in sequencing technology, metagenomic studies aim to characterize entire communities of microbes, bypassing the need for culturing individual bacterial members. One major goal of metagenomic studies is to identify specific functional adaptations of microbial communities to their habitats. The functional profile and the abundances for a sample can be estimated by mapping metagenomic sequences to the global metabolic network consisting of thousands of molecular reactions. Here we describe a powerful analytical method (MetaPath) that can identify differentially abundant pathways in metagenomic datasets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge. First, we introduce a scoring function for an arbitrary subnetwork and find the max-weight subnetwork in the global network by a greedy search algorithm. Then we compute two p values (p_abund and p_struct) using nonparametric approaches to answer two different statistical questions: (1) Is this subnetwork differentially abundant? (2) What is the probability of finding such good subnetworks by chance given the data and network structure? Finally, significant metabolic subnetworks are discovered based on these two p values. In order to validate our methods, we have designed a simulated metabolic pathways dataset and show that MetaPath outperforms other commonly used approaches. We also demonstrate the power of our methods in analyzing two publicly available metagenomic datasets, and show that the subnetworks identified by MetaPath provide valuable insights into the biological activities of the microbiome. We have introduced a statistical method for finding significant metabolic subnetworks from metagenomic datasets. Compared with previous methods, results from MetaPath are more robust against noise in the data, and have significantly higher sensitivity and specificity (when tested on simulated datasets).
When applied to two publicly available metagenomic datasets, the output of MetaPath is consistent with previous observations and also provides several new insights into the metabolic activity of the gut microbiome. The software is freely available at http://metapath.cbcb.umd.edu. https://doi.org/10.1186/1753-6561-5-S2-S
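
The greedy subnetwork search and the chance-based p value described in this abstract can be illustrated with a small sketch. This is not the MetaPath implementation: the additive node scores, the function names (`greedy_subnetwork`, `best_subnetwork`, `permutation_p_value`), and the null model (shuffling scores across nodes of a fixed network) are all assumptions made here for illustration, since the abstract does not specify the scoring function or the resampling scheme.

```python
import random


def greedy_subnetwork(adj, score, seed):
    """Grow a connected node set from `seed`, always adding the best-scoring neighbour.

    Assumption: the subnetwork score is the plain sum of per-node scores.
    Growth stops when no neighbouring node has a positive score.
    """
    chosen = {seed}
    total = score[seed]
    while True:
        # Nodes adjacent to the current subnetwork but not yet part of it.
        frontier = sorted({n for c in chosen for n in adj[c]} - chosen)
        best = max(frontier, key=lambda n: score[n], default=None)
        if best is None or score[best] <= 0:
            return chosen, total
        chosen.add(best)
        total += score[best]


def best_subnetwork(adj, score):
    """Run the greedy search from every node and keep the highest-scoring result."""
    return max((greedy_subnetwork(adj, score, s) for s in adj), key=lambda r: r[1])


def permutation_p_value(adj, score, n_perm=200, rng=None):
    """Estimate how often a shuffled score assignment yields an equally good subnetwork.

    Assumption: the null model permutes node scores over the fixed network.
    """
    rng = rng or random.Random(0)
    observed = best_subnetwork(adj, score)[1]
    nodes = list(score)
    values = [score[n] for n in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = dict(zip(nodes, values))
        if best_subnetwork(adj, shuffled)[1] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy network: a chain of four reactions with made-up scores.
toy_adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
toy_score = {"a": 2.0, "b": 1.5, "c": -1.0, "d": 3.0}
toy_nodes, toy_total = best_subnetwork(toy_adj, toy_score)
toy_p = permutation_p_value(toy_adj, toy_score, n_perm=50)
```

Restarting the greedy search from every node is a simple way to reduce dependence on the starting point; a purely greedy search can still miss the true maximum-weight subnetwork on larger graphs.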

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

Fecal Microbiota in Premature Infants Prior to Necrotizing Enterocolitis

Author: A Jacquot
A Schwiertz
AM O'Hara
C Lozupone
C Palmer
Christopher Michael Young
CS Riesenfeld
Dipshikha Chakravortty
DN Frank
Douglas Theriaque
EC Claud
G Deshpande
George Casella
J Neu
JG Caporaso
Josef Neu
L Dethlefsen
M Hamady
M Hattori
M Mshvildadze
M Mshvildadze
Maria Ukhanova
Mark Hudak
MC Walsh
MF De La Cochetiere
MJ Bell
MJ Morowitz
Nan Li
PJ Turnbaugh
RA Caicedo
Renu Sharma
RS Munford
SR Coats
V Mai
Volker Mai
Xiaoyu Wang
Y Sun
Y Sun
Y Wang
Yijun Sun
Publication venue: Public Library of Science
Publication date: 06/06/2011

Intestinal luminal microbiota likely contribute to the etiology of necrotizing enterocolitis (NEC), a common disease in preterm infants. Microbiota development, a cascade of initial colonization events leading to the establishment of a diverse commensal microbiota, can now be studied in preterm infants using powerful molecular tools. Starting with the first stool and continuing until discharge, weekly stool specimens were collected prospectively from infants with gestational ages ≤32 completed weeks or birth weights ≤1250 g. High throughput 16S rRNA sequencing was used to compare the diversity of microbiota and the prevalence of specific bacterial signatures in nine NEC infants and in nine matched controls. After removal of short and low quality reads we retained a total of 110,021 sequences. Microbiota composition differed in the matched samples collected 1 week but not <72 hours prior to NEC diagnosis. We detected a bloom (34% increase) of Proteobacteria and a decrease (32%) in Firmicutes in NEC cases between the 1 week and <72 hour samples. No significant change was identified in the controls. At both time points, molecular signatures were identified that were increased in NEC cases. One of the bacterial signatures detected more frequently in NEC cases (p<0.01) matched closest to γ-Proteobacteria. Although this sequence grouped to the well-studied Enterobacteriaceae family, it did not match any sequence in GenBank by more than 97%. Our observations suggest that abnormal patterns of microbiota and potentially a novel pathogen contribute to the etiology of NEC.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Gene prediction in metagenomic fragments: A large scale machine learning approach

Author: A Lukashin
AL Delcher
BE Suzek
Burkhard Morgenstern
CJ van Rijsbergen
CM Bishop
CS Riesenfeld
D Frishman
DA Benson
DJC MacKay
F Sanger
F Wilcoxon
GW Tyson
H Noguchi
HY Ou
IT Nabney
J Besemer
J Handelsman
JC Venter
K Chen
Katharina J Hoff
KE Rudd
L Krause
M Ronaghi
M Tech
M Tech
Maike Tech
MS Rappe
P Hugenholtz
P Nielson
Peter Meinicke
R Amann
R Daniel
R Daniel
R Development Core Team
RA Edwards
Rolf Daniel
S Altschul
S Voget
SG Tringe
T Hastie
T Jarvie
Thomas Lingner
V Torsvik
VB Bajic
W Streit
Publication venue: BioMed Central
Publication date: 01/04/2008

Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of the fragments (Sanger sequencing, for example, yields fragments of 700 bp on average) and the unknown phylogenetic origin of most fragments require approaches to gene prediction that are different from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability.
Conclusion: Large scale machine learning methods are well-suited for gene prediction in metagenomic DNA fragments. In particular, the combination of linear discriminants and neural networks is promising and should be considered for integration into metagenomic analysis pipelines. The data sets can be downloaded from the URL provided (see Availability and requirements section).

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Predicting Prokaryotic Ecological Niches Using Genome Sequence Analysis

Author: A Marchler-Bauer
AB Simonson
B Snel
Barry S. Goldman
BC Patten
C Chothia
C Dale
C Elton
CA Orengo
CM Fraser
CR Woese
CR Woese
CS Riesenfeld
E Lerat
E Lerat
E Yabuuchi
F Harrison
F Tekaia
FD Ciccarelli
FM Cohan
G Apic
G Davidson
Garret Suen
Geraldine Butler
GM Garrity
H Ochman
H Ochman
H Ochman
J Felenstein
J Grinell
J Lin
JB Martiny
JB Martiny
JH Badger
JP Gogarten
JR Cole
JS Taylor
K Chen
K Riedel
KT Konstantinidis
N Goldenfeld
NA Moran
P Hugenholtz
RC Edgar
RD Finn
Roy D. Welch
RS Gupta
S Ohno
S Oliver
S Yang
SF Altschul
VM Markowitz
Y Boucher
Publication venue: Public Library of Science
Publication date: 01/01/2007

Automated DNA sequencing technology is so rapid that analysis has become the rate-limiting step. Hundreds of prokaryotic genome sequences are publicly available, with new genomes uploaded at the rate of approximately 20 per month. As a result, this growing body of genome sequences will include microorganisms not previously identified, isolated, or observed. We hypothesize that evolutionary pressure exerted by an ecological niche selects for a similar genetic repertoire in those prokaryotes that occupy the same niche, and that this is due to both vertical and horizontal transmission. To test this, we have developed a novel method to classify prokaryotes, by calculating their Pfam protein domain distributions and clustering them with all other sequenced prokaryotic species. Clusters of organisms are visualized in two dimensions as ‘mountains’ on a topological map. When compared to a phylogenetic map constructed using 16S rRNA, this map more accurately clusters prokaryotes according to functional and environmental attributes. We demonstrate the ability of this map, which we term a “niche map”, to cluster according to ecological niche both quantitatively and qualitatively, and propose that this method be used to associate uncharacterized prokaryotes with their ecological niche as a means of predicting their functional role directly from their genome sequence

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Simultaneous Identification of DNA and RNA Viruses Present in Pig Faeces Using Process-Controlled Deep Sequencing

Author: A Djikeng
A Reyes
AC Palmenberg
AJ Cann
CE Shannon
CS Riesenfeld
DJ Sencer
HK Kim
Houssam Attoui
HW Smith
J Dreier
J Sambrook
Jana Sachsenröder
Jens A. Hammerl
JG Victoria
JM Day
JR Miller
KC Jere
L Li
L Li
L Li
M Arumugan
M Breitbart
M Breitbart
M Breitbart
M Margulies
M Pignatelli
MI Costafreda
MJ Roossinck
MT Maidana-Giret
N Halaihel
O Blinkova
P Tang
P Ward
Paul Wrede
Pawel Janczyk
R Johne
R Lamendella
Reimar Johne
RV Thurber
S Minot
SE Midgley
SF Altschul
SR Finkbeiner
Stefan Hertwig
Sven Twardziok
T Shan
T Zhang
TG Phan
WH van der Poel
X Ge
Y Kim
Y Lin
Publication venue: Public Library of Science
Publication date: 13/04/2012

Background: Animal faeces comprise a community of many different microorganisms including bacteria and viruses. Only scarce information is available about the diversity of viruses present in the faeces of pigs. Here we describe a protocol, which was optimized for the purification of the total fraction of viral particles from pig faeces. The genomes of the purified DNA and RNA viruses were simultaneously amplified by PCR and subjected to deep sequencing followed by bioinformatic analyses. The efficiency of the method was monitored using a process control consisting of three bacteriophages (T4, M13 and MS2) with different morphology and genome types. Defined amounts of the bacteriophages were added to the sample and their abundance was assessed by quantitative PCR during the preparation procedure. Results: The procedure was applied to a pooled faecal sample of five pigs. From this sample, 69,613 sequence reads were generated. All of the added bacteriophages were identified by sequence analysis of the reads. In total, 7.7 % of the reads showed significant sequence identities with published viral sequences. They mainly originated from bacteriophages (73.9%) and mammalian viruses (23.9%); 0.8 % of the sequences showed identities to plant viruses. The most abundant detected porcine viruses were kobuvirus, rotavirus C, astrovirus, enterovirus B, sapovirus and picobirnavirus. In addition, sequences with identities to the chimpanzee stool-associated circular ssDNA virus were identified. Whole genome analysis indicates that this virus, tentatively designated as pig stool-associated circular ssDNA virus (PigSCV), represents a novel pi

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Achievements and new knowledge unraveled by metagenomic approaches

Author: A Henne
A Knietsch
A Knietsch
A Knietsch
A Majernik
AC McHardy
C Heath
C Jogler
C Manichanh
C Meilleur
C Simon
C Von Mering
C Wang
C Wu
Carola Simon
CB Abulencia
CC Lee
CJ Duan
CR Woese
CS Riesenfeld
CS Riesenfeld
DA Benson
DB Rusch
DC Richter
DG Lee
DH Haft
DH Huson
DH Huson
DL Cox-Foster
DT Pride
EA Dinsdale
EA Dinsdale
EF DeLong
EJ Biers
EM Gabor
F Hårdeman
F Meyer
FE Angly
G Li
G. P. Pathak
GR LeCleir
GW Tyson
H Teeling
H Teeling
H Yokouchi
HC Rees
HM Monzoorul
HN Poinar
I-C. Chen
J Bailly
J Frias-Lopez
J Handelsman
J Handelsman
J Pottkämper
J Wuyts
JA Gilbert
JA Gilbert
JC Venter
JF Biddle
JJ Banik
JK Rhee
JR Cole
JS Song
L Krause
L Wegley
LJ Jensen
LK McNeil
LL Williamson
M Ferrer
M Ferrer
M Kanehisa
M Strous
MH Lee
NN Diaz
O Béjà
P Lopez-Garcia
P Lorenz
P Lorenz
P Wei
PJ Turnbaugh
PW Van der Wielen
QC Meyer
R Daniel
R Overbeek
RA Edwards
RD Finn
RD Sleator
Rebecca Vega Thurber
RL Tatusov
Rolf Daniel
S Grant
S Karlin
S Morimoto
S Sjöling
S Voget
S Voget
S Yooseph
SF Altschul
SF Brady
SG Tringe
SJ Hallam
T Abe
T Abe
T Uchiyama
T Urich
T Waschkowitz
TC Galvao
TZ DeSantis
W Ludwig
Y Feng
Publication venue: Springer-Verlag
Publication date: 01/01/2009

Metagenomics has paved the way for cultivation-independent assessment and exploitation of microbial communities present in complex ecosystems. In recent years, significant progress has been made in this research area. A major breakthrough was the improvement and development of high-throughput next-generation sequencing technologies. The application of these technologies resulted in the generation of large datasets derived from various environments such as soil and ocean water. The analyses of these datasets opened a window into the enormous phylogenetic and metabolic diversity of microbial communities living in a variety of ecosystems. In this way, structure, functions, and interactions of microbial communities were elucidated. Metagenomics has proven to be a powerful tool for the recovery of novel biomolecules. In most cases, functional metagenomics comprising construction and screening of complex metagenomic DNA libraries has been applied to isolate new enzymes and drugs of industrial importance. For this purpose, several novel and improved screening strategies that allow efficient screening of large collections of clones harboring metagenomes have been introduced

Crossref

Springer - Publisher Connector

PubMed Central

A statistical toolbox for metagenomics: assessing functional diversity in microbial communities

Author: A Chao
A Chao
A Chao
A Chao
A Chao
AC McHardy
AE Magurran
AP Martin
B Rodriguez-Brito
C von Mering
CS Riesenfeld
DA Rasko
DB Rusch
DH Huson
DR Singleton
E Lerat
EF DeLong
EF Delong
GW Tyson
H Garcia Martin
H Teeling
H Teeling
JC Venter
JC Yue
JL Stein
Jo Handelsman
JP Wang
K Mavromatis
KP Burnham
KU Foerstner
L Excoffier
M Breitbart
M Breitbart
M Margulies
M Strous
MJ Anderson
MR Rondon
P Legendre
Patrick D Schloss
PD Schloss
PD Schloss
PD Schloss
PD Schloss
PD Schloss
PL Johnson
S Yooseph
SG Tringe
SJ Hallam
SJ Hallam
SR Gill
T Woyke
TD Read
TM Schmidt
VM Markowitz
W Ludwig
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The 99% of bacteria in the environment that are recalcitrant to culturing have spurred the development of metagenomics, a culture-independent approach to sample and characterize microbial genomes. Massive datasets of metagenomic sequences have been accumulated, but analysis of these sequences has focused primarily on the descriptive comparison of the relative abundance of proteins that belong to specific functional categories. More robust statistical methods are needed to make inferences from metagenomic data. In this study, we developed and applied a suite of tools to describe and compare the richness, membership, and structure of microbial communities using peptide fragment sequences extracted from metagenomic sequence data. Results: Application of these tools to acid mine drainage, soil, and whale fall metagenomic sequence collections revealed groups of peptide fragments with a relatively high abundance and no known function. When combined with analysis of 16S rRNA gene fragments from the same communities these tools enabled us to demonstrate that although there was no overlap in the types of 16S rRNA gene sequence observed, there was a core collection of operational protein families that was shared among the three environments. Conclusion: The results of comparisons between the three habitats were surprising considering the relatively low overlap of membership and the distinctively different characteristics of the three habitats. These tools will facilitate the use of metagenomics to pursue statistically sound genome-based ecological analyses.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evolutionarily Conserved Substrate Substructures for Automated Annotation of Enzyme Superfamilies

Author: A Aharoni
AE Todd
AG Murzin
Andrej Sali
AS Mildvan
C Kalyanaraman
C Steinbeck
CM Seibert
CS Riesenfeld
CT Porter
D Weininger
DJ Weininger
DM Schmidt
DM Schmidt
GL Holliday
HM Holden
I Friedberg
I Nobeli
I Schomburg
I Shah
J Barthelmes
JA Gerlt
JA Gerlt
JA Gerlt
JA Gerlt
JC Hermann
JC Hermann
JJ Diaz-Mejia
K Tipton
KA Frazer
KN Allen
L Holm
L Song
M Ashburner
M Bashton
M Kotera
MA Marti-Renom
ME Glasner
ME Glasner
MJ Bessman
MJ Keiser
N Nagano
NH Horowitz
NH Horowitz
NM O'Boyle
Patricia C. Babbitt
PC Babbitt
PC Babbitt
PC Babbitt
R Alves
RA Nagatani
Ranyee A. Chiang
Robert B. Russell
S Light
S Schmidt
SC Pegg
SC Rison
SD Copley
TL O'Loughlin
WR Pearson
Publication venue: Public Library of Science
Publication date: 01/08/2008

The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. 
Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Pyrosequencing of Antibiotic-Contaminated River Sediments Reveals High Levels of Resistance and Gene Transfer Elements

The high and sometimes inappropriate use of antibiotics has accelerated the development of antibiotic resistance, creating a major challenge for the sustainable treatment of infections world-wide. Bacterial communities often respond to antibiotic selection pressure by acquiring resistance genes, i.e. mobile genetic elements that can be shared horizontally between species. Environmental microbial communities maintain diverse collections of resistance genes, which can be mobilized into pathogenic bacteria. Recently, exceptional environmental releases of antibiotics have been documented, but the effects on the promotion of resistance genes and the potential for horizontal gene transfer have so far received limited attention. In this study, we have used culture-independent shotgun metagenomics to investigate microbial communities in river sediments exposed to waste water from the production of antibiotics in India. Our analysis identified very high levels of several classes of resistance genes as well as elements for horizontal gene transfer, including integrons, transposons and plasmids. In addition, two abundant previously uncharacterized resistance plasmids were identified. The results suggest that antibiotic contamination plays a role in the promotion of resistance genes and their mobilization from environmental microbes to other species and eventually to human pathogens. The entire life-cycle of antibiotic substances, before, during and after usage, should therefore be considered to fully evaluate their role in the promotion of resistance.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Chalmers Research

Chalmers Publication Library