
Harvard University - DASH

Phage Encoded H-NS: A Potential Achilles Heel in the Bacterial Defence System

The relationship between phage and their microbial hosts is difficult to elucidate in complex natural ecosystems. Engineered systems performing enhanced biological phosphorus removal (EBPR) offer stable, lower-complexity communities for studying phage-host interactions. Here, metagenomic data from an EBPR reactor dominated by Candidatus Accumulibacter phosphatis (CAP) led to the recovery of three complete and six partial phage genomes. Heat-stable nucleoid structuring (H-NS) protein, a global transcriptional repressor in bacteria, was identified in one of the complete phage genomes (EPV1), and was most similar to a homolog in CAP. We infer that EPV1 is a CAP-specific phage and has the potential to repress up to 6% of host genes based on the presence of putative H-NS binding sites in the CAP genome. These genes include CRISPR-associated proteins and a Type III restriction-modification system, which are key host defense mechanisms against phage infection. Further, EPV1 was the only member of the phage community found in an EBPR microbial metagenome collected seven months prior. We propose that EPV1 laterally acquired H-NS from CAP, providing it with a means to reduce bacterial defenses, a selective advantage over other phage in the EBPR system. Phage-encoded H-NS could constitute a previously unrecognized weapon in the phage-host arms race.

CiteSeerX

USFSP Digital Archive

Queensland University of Technology ePrints Archive

Scholar Commons - University of South Florida

University of Queensland eSpace

Assessing the Diversity and Specificity of Two Freshwater Viral Communities through Metagenomics

Transitions between saline and fresh waters have been shown to be infrequent for microorganisms. Based on host-specific interactions, the presence of specific clades among hosts suggests the existence of freshwater-specific viral clades. Yet little is known about the composition and diversity of temperate freshwater viral communities, and even if freshwater lakes and marine waters harbor distinct clades for particular viral sub-families, this distinction remains to be demonstrated on a community scale.

Hal - Université Grenoble Alpes

HAL Clermont Université

Agritrop

HAL Université de Savoie

FigShare

The GAAS Metagenomic Tool and Its Estimations of Viral and Microbial Average Genome Size in Four Major Biomes

Author: AC Paoletti
Alejandra Prieto-Davó
B Diez
B Zybailov
Baoli Zhu
Beltran Rodriguez-Mueller
C Desnues
Christelle Desnues
D Rasko
D Willner
Dana Willner
David L. Kirchman
DH Huson
Dionysios A. Antonopoulos
DL Wheeler
EA Dinsdale
EA Dinsdale
Egbert Mundt
Elizabeth A. Dinsdale
F Angly
F Meyer
F Rohwer
FE Angly
Florent E. Angly
FM Lauro
Folker Meyer
Forest Rohwer
Gary D. Stormo
GF Steward
I Hewson
I Letunic
J Raes
J Raes
JAG Ranea
John D. McPherson
K Holmfeldt
K Rosario
Katie Barott
KE Wommack
KE Wommack
KT Konstantinidis
L Florens
LB Koski
Linda Wegley
Lixin Zhang
LM Graves
M Dyall-Smith
M Pignatelli
Matthew Haynes
Matthew R. Henn
Matthew T. Cottrell
MG Weinbauer
Mike Furlan
P DasSarma
P Hugenholtz
R Sadreyev
R Sandaa
R Sandaa
R Seshadri
R. Michael Miller
Rebecca Vega-Thurber
Rick Stevens
RL Vega Thurber
Robert A. Edwards
Robert K. Naviaux
Robert Schmieder
RV Thurber
S Karlin
SD Bentley
SF Altschul
Tracey McDole
Yongfei Hu
Publication venue: Public Library of Science
Publication date: 01/01/2009

Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
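
The length normalization and similarity weighting mentioned in this abstract can be illustrated with a minimal sketch. This is not the GAAS implementation: the function names, the input layout (one list of `(genome_id, weight)` pairs per read) and the way weights are derived are assumptions made for illustration only.

```python
from collections import defaultdict


def relative_abundance(read_hits, genome_lengths):
    """Estimate genome relative abundance from weighted similarity hits.

    read_hits: one list per read of (genome_id, similarity_weight) pairs
               (hypothetical layout; weights could come from alignment scores).
    genome_lengths: dict mapping genome_id to genome length in bases.
    """
    weighted_counts = defaultdict(float)
    for hits in read_hits:
        total = sum(weight for _, weight in hits)
        if total <= 0:
            continue  # read without any usable similarity
        for genome, weight in hits:
            # Similarity weighting: each read contributes 1 in total,
            # split across its hits in proportion to their weights.
            weighted_counts[genome] += weight / total

    # Length normalization: a longer genome yields more reads per copy,
    # so divide the weighted read count by the genome length.
    per_copy = {g: c / genome_lengths[g] for g, c in weighted_counts.items()}
    norm = sum(per_copy.values())
    if norm == 0:
        return {}
    return {g: v / norm for g, v in per_copy.items()}


def average_genome_length(abundance, genome_lengths):
    """Abundance-weighted mean genome length of the community."""
    return sum(abundance[g] * genome_lengths[g] for g in abundance)
```

In this sketch, a community in which a 4,000-base genome receives four times as many reads as a 1,000-base genome comes out as equally abundant per genome copy, which is the effect length normalization is meant to capture.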

HAL AMU

DigitalCommons@Florida International University

HAL Descartes

Hal-Diderot

University of Queensland eSpace

eScholarship - University of California

ScholarlyCommons@Penn

Deep sequencing evidence from single grapevine plants reveals a virome dominated by mycoviruses

Author: A Djikeng
A López-Bueno
A. Rowhani
B Coetzee
CM Fauquet
F. Cordero
FE Angly
G Routh
GP Martelli
HN Pearson
J Atif
J. R. Úrbez-Torres
JR Úrbez-Torres
L Sage
LJ Crawford
M Al Rwahnih
M. Al Rwahnih
RA Edwards
S Nakamura
S. Daubert
SF Altschul
TJ White
YP Zhang
Publication venue: Springer Vienna
Publication date: 01/01/2010

We have characterized the virome in single grapevines by 454 high-throughput sequencing of double-stranded RNA recovered from the vine stem. The analysis revealed a substantial set of sequences similar to those of fungal viruses. Twenty-six putative fungal virus groups were identified from a single plant source. These represented half of all known mycoviral families including the Chrysoviridae, Hypoviridae, Narnaviridae, Partitiviridae, and Totiviridae. Three of the mycoviruses were associated with Botrytis cinerea, a common fungal pathogen of grapes. Most of the rest appeared to be undescribed. The presence of viral sequences identified by BLAST analysis was confirmed by sequencing PCR products generated from the starting material using primers designed from the genomic sequences of putative mycoviruses. To further characterize these sequences as fungal viruses, fungi from the grapevine tissue were cultured and screened with the same PCR probes. Five of the mycoviruses identified in the total grapevine extract were identified again in extracts of the fungal cultures.

Springer - Publisher Connector

Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data

Author: AH Singh
Aino I. Järvelin
Alison S. Waller
B Ewing
B Ewing
CB Abulencia
D Chivian
D Wu
Daniel R. Mende
DC Richter
DR Zerbino
ED Harrington
ES Lander
EW Myers
F Meyer
FE Angly
FE Angly
GW Tyson
H García Martín
H-H Chou
J Goecks
J Goll
J Handelsman
J Muller
J Peterson
J Qin
J Raes
J Raes
JC Venter
Jeroen Raes
John Parkinson
JR Miller
JR Miller
K Kurokawa
K Mavromatis
M Arumugam
M Arumugam
M Pignatelli
M Pop
Manimozhiyan Arumugam
Michelle M. Chan
MP Cox
Peer Bork
PJ Turnbaugh
PJA Cock
R Li
R Li
R Schmieder
RA Edwards
RL Warren
S Aparicio
SG Tringe
Shinichi Sunagawa
SR Gill
T Schoenfeld
TA Gianoulis
TC Glenn
VM Markowitz
W Zhu
Publication venue: Public Library of Science
Publication date: 01/01/2012

Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes), all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes), Illumina produced the best assemblies and more closely resembled the expected functional composition. For the most complex community (400 genomes), there was very little assembly of reads from any sequencing technology. However, due to the longer read length, the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities. Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available.

CiteSeerX

Copenhagen University Research Information System

MDC Repository

FigShare

Metagenomic Analysis of Respiratory Tract DNA Viral Communities in Cystic Fibrosis and Non-Cystic Fibrosis Individuals

Author: A Livraghi
AF Andersson
AJ Gentles
B Rodriguez-Brito
Bahador Nosrat
BE van Ewijk
BS Everitt
C Desnues
C Goerke
D Willner
Dana Willner
DF Rogers
DM Raskin
Douglas Conrad
EA Dinsdale
F Angly
F Harrison
F Klein
F Meyer
F Rohwer
F Wartha
FB Dean
FE Angly
Florent E. Angly
Forest Rohwer
GB Rogers
GB Rogers
GB Winnie
H Ochman
H See
J Azeredo
J Heyder
JA Fuhrman
Jeffrey A. Gold
JM Corne
Joas Silva
K Potrykus
KL Palmer
KL Palmer
L Zawadzka-Głos
LL Kulczycki
M Breitbart
M Breitbart
M Breitbart
Matthew Haynes
Mike Furlan
MJ Goldman
MR Knowles
P Green
P Lohavanichbutr
PJ Turnbaugh
PJ Turnbaugh
PM Beringer
R Overbeek
R Pinard
RK Aziz
Robert Schmieder
RV Miller
RV Thurber
S Nakamura
Sassan Tammadoni
SF Altschul
SG Tringe
SH Randell
SR Bencht
SR Gill
T Allander
T Schoenfeld
T Vadivukarasi
T Zhang
TE McManus
V Jain
WT Liu
X Xiang
Publication venue: Public Library of Science
Publication date: 09/10/2009

The human respiratory tract is constantly exposed to a wide variety of viruses, microbes and inorganic particulates from environmental air, water and food. Physical characteristics of inhaled particles and airway mucosal immunity determine which viruses and microbes will persist in the airways. Here we present the first metagenomic study of DNA viral communities in the airways of diseased and non-diseased individuals. We obtained sequences from sputum DNA viral communities in 5 individuals with cystic fibrosis (CF) and 5 individuals without the disease. Overall, diversity of viruses in the airways was low, with an average richness of 175 distinct viral genotypes. The majority of viral diversity was uncharacterized. CF phage communities were highly similar to each other, whereas Non-CF individuals had more distinct phage communities, which may reflect organisms in inhaled air. CF eukaryotic viral communities were dominated by a few viruses, including human herpesviruses and retroviruses. Functional metagenomics showed that all Non-CF viromes were similar, and that CF viromes were enriched in aromatic amino acid metabolism. The CF metagenomes occupied two different metabolic states, probably reflecting different disease states. There was one outlying CF virome which was characterized by an over-representation of Guanosine-5′-triphosphate,3′-diphosphate pyrophosphatase, an enzyme involved in the bacterial stringent response. Unique environments like the CF airway can drive functional adaptations, leading to shifts in metabolic profiles. These results have important clinical implications for CF, indicating that therapeutic measures may be more effective if used to change the respiratory environment, as opposed to shifting the taxonomic composition of resident microbiota.

Analysis and comparison of very large metagenomes with fast clustering and functional annotation

Author: AC McHardy
AR Quinlan
B Rodriguez-Brito
D Sheskin
DB Rusch
DC Richter
DH Huson
E Portugaly
EA Dinsdale
EF DeLong
FE Angly
GW Tyson
H Noguchi
H Noguchi
H Teeling
H Teeling
J Shendure
JC Venter
K Mavromatis
KJ Hoff
L Krause
PD Schloss
R Seshadri
RK Aziz
S Yooseph
S Yooseph
SF Altschul
SG Tringe
SR Eddy
SR Gill
W Li
W Li
W Li
W Li
Weizhong Li
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

Springer - Publisher Connector

Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads

Author: A Brady
AC McHardy
B Beszteri
B Langmead
DB Rusch
DC Richter
DH Huson
DR Kelley
EJ Biers
Emmanuel Dias-Neto
FE Angly
Fengzhu Sun
GL Rosen
GW Tyson
H Li
H Teeling
J Peterson
J Qin
Jacob A. Cram
JC Venter
Jed A. Fuhrman
JL Morgan
JS Liu
K Kurokawa
K Liolios
K Mavromatis
KE Nelson
Li C. Xia
M Monzoorul Haque
NN Diaz
PA Vaishampayan
PJ Turnbaugh
PJ Turnbaugh
PJ Turnbaugh
R Sandberg
R Stepanauskas
RJ Case
RM Engeman
S Chatterji
SF Altschul
SR Gill
T Woyke
Ting Chen
VM Markowitz
Y Chen
YW Wu
Publication venue: Public Library of Science
Publication date: 06/12/2011

Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or variants thereof, and often result in biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute Genome Relative Abundance of microbial communities using the Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based) even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.
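
The mixture-model idea in this abstract (ambiguous read assignments resolved by maximum likelihood, then corrected for genome size) can be sketched with a generic expectation-maximization loop. This is an illustrative sketch under stated assumptions, not the GRAMMy code: the function name, the per-read probability dictionaries and the fixed iteration count are all hypothetical choices.

```python
def em_relative_abundance(read_probs, genome_lengths, iterations=100):
    """Estimate genome relative abundance with a simple EM mixture model.

    read_probs: one dict per read mapping genome_id to an assumed
                probability of the read given that genome (hypothetical input;
                every genome_id must also appear in genome_lengths).
    genome_lengths: dict mapping genome_id to genome length in bases.
    """
    genomes = list(genome_lengths)
    # Start from uniform mixing proportions (share of reads per genome).
    pi = {g: 1.0 / len(genomes) for g in genomes}

    for _ in range(iterations):
        totals = {g: 0.0 for g in genomes}
        n_used = 0
        for probs in read_probs:
            denom = sum(pi[g] * p for g, p in probs.items())
            if denom <= 0:
                continue  # read cannot be explained by any genome
            n_used += 1
            # E-step: split the read across genomes by posterior probability.
            for g, p in probs.items():
                totals[g] += pi[g] * p / denom
        if n_used == 0:
            break
        # M-step: mixing proportion is the mean responsibility per genome.
        pi = {g: totals[g] / n_used for g in genomes}

    # Genome size correction: convert read shares to per-genome-copy shares.
    per_copy = {g: pi[g] / genome_lengths[g] for g in genomes}
    norm = sum(per_copy.values())
    return {g: v / norm for g, v in per_copy.items()}
```

A fixed iteration count keeps the sketch short; a practical version would more likely stop once the change in the mixing proportions falls below a tolerance.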

Metagenomic Analysis of RNA Viruses in a Fresh Water Lake

Author: A Djikeng
AB Martin-Cuadrado
AF Andersson
AI Culley
AI Culley
AI Culley
Appolinaire Djikeng
B Chen
CM Short
David J. Spiro
DB Rusch
de Miranda Jr
DH Huson
DL Cox-Foster
FE Angly
GM Allan
GW Tyson
I. King Jordan
M Breitbart
M Breitbart
M Breitbart
M Margulies
MH Garcia
MM Van
Norman G. Anderson
P Biagini
R Seshadri
RA Edwards
RA Watanabe
Ryan Kuzmickas
S Nakamura
SG Tringe
SJ Williamson
SM Goldberg
SR Bench
SR Bench
SR Finkbeiner
T Nabeshima
T Schoenfeld
T Zhang
Publication venue: Public Library of Science
Publication date: 29/09/2009

Freshwater lakes and ponds present an ecological interface between humans and a variety of host organisms. They are a habitat for the larval stage of many insects and may serve as a medium for intraspecies and interspecies transmission of viruses such as avian influenza A virus. Furthermore, freshwater bodies are already known repositories for disease-causing viruses such as Norwalk Virus, Coxsackievirus, Echovirus, and Adenovirus. While RNA virus populations have been studied in marine environments, to date there has been very limited analysis of the viral community in freshwater. Here we present a survey of RNA viruses in Lake Needwood, a freshwater lake in Maryland, USA. Our results indicate that, just as in studies of other aquatic environments, the majority of nucleic acid sequences recovered did not show any significant similarity to known sequences. The remaining sequences are mainly from viral types with significant similarity to approximately 30 viral families. We speculate that these novel viruses may infect a variety of hosts including plants, insects, fish, domestic animals and humans. Among these viruses we have discovered a previously unknown dsRNA virus closely related to Banna Virus, which is responsible for a febrile illness and is endemic to Southeast Asia. Moreover, we found multiple viral sequences distantly related to Israeli Acute Paralysis virus, which has been implicated in honeybee colony collapse disorder. Our data suggest that, due to their direct contact with humans and with domestic and wild animals, freshwater ecosystems might serve as repositories of a wide range of viruses (both pathogenic and non-pathogenic) and possibly be involved in the spread of emerging and pandemic diseases.