Scholar Commons - University of South Florida

University of Queensland eSpace

Viral Metagenomics: MetaView Software

Authors: Smith Jason, Zhou Carol
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 22/10/2007

The purpose of this report is to design and develop a tool for analyzing raw sequence read data from viral metagenomics experiments. The tool should compare read sequences against known viral nucleic acid sequence data and enable a user to determine, with some degree of confidence, which virus groups may be present in the sample. This project was conducted in two phases. In phase 1 we surveyed the literature and examined existing metagenomics tools, both to educate ourselves and to define more precisely the problem of analyzing raw read data from viral metagenomic experiments. In phase 2 we devised an approach and built a prototype code and database. The code takes viral metagenomic read data in FASTA format as input and accesses all complete viral genomes from Kpath for sequence comparison. The system executes at the UNIX command line and stores its output in an Oracle relational database. We describe here the approach we devised for handling unassembled, short-read data sets from viral metagenomics experiments. We also discuss the current capabilities of the MetaView code and the additional functionality that we believe should be added if further funding is acquired to continue the work.
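The abstract above states that the tool takes read data in FASTA format as input. As a rough illustration of what that input step involves, here is a minimal sketch of a FASTA reader. It is not the MetaView code; the function name `parse_fasta` and the example reads are hypothetical, and the choice to use the first header token as the read id is an assumption.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {read_id: sequence} from FASTA-formatted lines (hypothetical helper)."""
    reads: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: assume the read id is the first whitespace-delimited token.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            reads[current_id] = ""
        elif current_id is not None:
            # Sequence lines may wrap, so append to the current read.
            reads[current_id] += line.upper()
    return reads


# Made-up example input: two short reads, the first wrapped over two lines.
example = """>read1 sample=A
ACGTAC
GTTA
>read2
ttgaca
""".splitlines()

reads = parse_fasta(example)
```

Each parsed read could then be compared against reference genomes by whatever sequence-comparison method the pipeline uses; the abstract does not specify that step, so it is left out here.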

UNT Digital Library

Springer - Publisher Connector

Estimation of viral richness from shotgun metagenomes using a frequency count approach

Authors: Darrell O Bayles, Heather K Allen, James A Foster, John Bunge, Thaddeus B Stanton
Publication venue: Springer Nature
Publication date: 04/02/2013

BACKGROUND: Viruses are important drivers of ecosystem functions, yet little is known about the vast majority of viruses. Viral shotgun metagenomics enables the investigation of broad ecological questions in phage communities. One ecological characteristic is species richness, which is the number of different species in a community. Viruses do not have a phylogenetic marker analogous to the bacterial 16S rRNA gene with which to estimate richness, and so contig spectra are employed to measure the number of virus taxa in a given community. A contig spectrum is generated from a viral shotgun metagenome by assembling the random sequence reads into groups of sequences that overlap (contigs) and counting the number of sequences that group within each contig. Current tools available to analyze contig spectra to estimate phage richness are limited by relying on rank-abundance data.

RESULTS: We present statistical estimates of virus richness from contig spectra. The program CatchAll (http://www.northeastern.edu/catchall/) was used to analyze contig spectra in terms of frequency count data rather than rank-abundance, thus enabling formal statistical analyses. Also, the influence of potentially spurious low-frequency counts on richness estimates was minimized by two methods, empirical and statistical. The results show greater estimates of viral richness than previous calculations in nearly all environments analyzed, including swine feces and reclaimed fresh water.

CONCLUSIONS: CatchAll yielded consistent estimates of richness across viral metagenomes from the same or similar environments. Additionally, analysis of pooled viral metagenomes from different environments via mixed contig spectra resulted in greater richness estimates than those of the component metagenomes. Using CatchAll to analyze contig spectra will improve estimations of richness from viral shotgun metagenomes, particularly from large datasets, by providing statistical measures of richness.
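The contig spectrum described in this abstract is a frequency count: how many contigs contain exactly one read, exactly two reads, and so on. The sketch below shows that counting step only, assuming an assembler has already assigned each read to a contig. The function name `contig_spectrum`, the input layout (a read-to-contig mapping), and the treatment of unassembled reads as contigs of size one are assumptions for illustration; this is not the CatchAll program.

```python
from collections import Counter
from typing import Dict, List


def contig_spectrum(read_to_contig: Dict[str, str]) -> List[int]:
    """Return a list where entry k-1 is the number of contigs holding exactly k reads."""
    # Number of reads grouped into each contig.
    contig_sizes = Counter(read_to_contig.values())
    # Number of contigs observed at each size (the frequency counts).
    size_counts = Counter(contig_sizes.values())
    largest = max(size_counts, default=0)
    return [size_counts.get(k, 0) for k in range(1, largest + 1)]


# Made-up assignments: c1 has 3 reads, c2 has 2, c3 and c4 are singletons.
assignments = {
    "r1": "c1", "r2": "c1", "r3": "c1",
    "r4": "c2", "r5": "c2",
    "r6": "c3",
    "r7": "c4",
}
spectrum = contig_spectrum(assignments)  # [2, 1, 1]
```

Frequency counts in this form (number of contigs per size class) are the kind of input that a richness estimator would then model statistically.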

The Australian National University

Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness

Authors: Chang BC, Halgamuge Saman, Jayasundara Duleepa, Saeed Isaam, Tang Sen-Lin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/11/2018

Background: Estimating the number of different species (richness) in a mixed microbial population has been a main focus in metagenomic research. Existing methods of species richness estimation rely on the assumption that the reads in each assembled contig correspond to only one of the microbial genomes in the population. This assumption and the underlying probabilistic formulations of existing methods are not useful for quasispecies populations, where the strains are highly genetically related. The lack of knowledge of the number of different strains in a quasispecies population is observed to hinder the precision of existing Viral Quasispecies Spectrum Reconstruction (QSR) methods, because it leads to the uncontrolled reconstruction of a large number of in silico false positives. In this work, we formulated a novel probabilistic method for strain richness estimation specifically targeting viral quasispecies. Using this approach, we improved our recently proposed spectrum reconstruction pipeline ViQuaS to achieve higher precision in reconstructed quasispecies spectra without compromising recall. We also discuss how another popular QSR method, ShoRAH, can be improved using this new approach.

Results: On benchmark data sets, our estimation method provided accurate richness estimates (median estimation error < 0.2) and improved the precision of ViQuaS by 2%-13% and its F-score by 1%-9% without compromising recall. We also demonstrate that our estimation method can be used to improve the precision and F-score of ShoRAH by 0%-7% and 0%-5%, respectively.

Conclusions: The proposed probabilistic estimation method can be used to estimate the richness of viral populations with quasispecies behavior and to improve the accuracy of the quasispecies spectra reconstructed by the existing methods ViQuaS and ShoRAH in the presence of a moderate level of technical sequencing errors.
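The abstract reports results in terms of precision, recall, and F-score. For readers unfamiliar with these measures, the sketch below computes them for a reconstructed set of strains against a known set. It assumes the F-score is the balanced F1 (harmonic mean of precision and recall) and that a reconstructed strain counts as correct only if it matches a true strain exactly; both are simplifying assumptions, and the strain labels are made up.

```python
from typing import Set, Tuple


def precision_recall_f1(true_strains: Set[str], reconstructed: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for a reconstructed strain set."""
    true_positives = len(true_strains & reconstructed)
    precision = true_positives / len(reconstructed) if reconstructed else 0.0
    recall = true_positives / len(true_strains) if true_strains else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 4 true strains; 5 reconstructed, of which 3 are correct.
truth = {"s1", "s2", "s3", "s4"}
recon = {"s1", "s2", "s3", "x1", "x2"}
precision, recall, f1 = precision_recall_f1(truth, recon)
```

In this example the two spurious strains lower precision to 0.6 while recall stays at 0.75, which illustrates why limiting false-positive reconstructions can raise precision without changing recall.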

Phage Encoded H-NS: A Potential Achilles Heel in the Bacterial Defence System

The relationship between phage and their microbial hosts is difficult to elucidate in complex natural ecosystems. Engineered systems performing enhanced biological phosphorus removal (EBPR) offer stable, lower-complexity communities for studying phage-host interactions. Here, metagenomic data from an EBPR reactor dominated by Candidatus Accumulibacter phosphatis (CAP) led to the recovery of three complete and six partial phage genomes. Heat-stable nucleoid structuring (H-NS) protein, a global transcriptional repressor in bacteria, was identified in one of the complete phage genomes (EPV1) and was most similar to a homolog in CAP. We infer that EPV1 is a CAP-specific phage and has the potential to repress up to 6% of host genes, based on the presence of putative H-NS binding sites in the CAP genome. These genes include CRISPR-associated proteins and a Type III restriction-modification system, which are key host defense mechanisms against phage infection. Further, EPV1 was the only member of the phage community found in an EBPR microbial metagenome collected seven months prior. We propose that EPV1 laterally acquired H-NS from CAP, providing it with a means to reduce bacterial defenses and a selective advantage over other phage in the EBPR system. Phage-encoded H-NS could constitute a previously unrecognized weapon in the phage-host arms race.

CiteSeerX

Public Library of Science (PLOS)

USFSP Digital Archive

Queensland University of Technology ePrints Archive

Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

Authors: Kevin Chen, Lior Pachter
Publication venue: Public Library of Science
Publication date: 01/01/2005

The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

Caltech Authors

Molecular eco-systems biology: towards an understanding of community function

Authors: Bork P., Raes J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2008

Systems-biology approaches, which are driven by genome sequencing and high-throughput functional genomics data, are revolutionizing single-cell-organism biology. With the advent of various high-throughput techniques that aim to characterize complete microbial ecosystems (metagenomics, meta-transcriptomics and meta-metabolomics), we propose that the time is ripe to consider molecular systems biology at the ecosystem level (eco-systems biology). Here, we discuss the data types that are required to unite molecular microbiology and ecology to develop an understanding of community function, and discuss the potential shortcomings of these approaches.

MDC Repository

Estimating DNA coverage and abundance in metagenomes using a gamma approximation

Authors: Amrita Pati, Daniel Dalevi, Konstantinos Mavromatis, Natalia N. Ivanova, Nikos C. Kyrpides, Sean D. Hooper
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Shotgun sequencing generates large numbers of short DNA reads from either an isolated organism or, in the case of metagenomics projects, from the aggregate genome of a microbial community. These reads are then assembled based on overlapping sequences into larger, contiguous sequences (contigs). The feasibility of assembly and the coverage achieved (reads per nucleotide or distinct sequence of nucleotides) depend on several factors: the number of reads sequenced, the read length and the relative abundances of their source genomes in the microbial community. A low coverage suggests that most of the genomic DNA in the sample has not been sequenced, but it is often difficult to estimate either the extent of the uncaptured diversity or the amount of additional sequencing that would be most efficacious. In this work, we regard a metagenome as a population of DNA fragments (bins), each of which may be covered by one or more reads. We employ a gamma distribution to model this bin population due to its flexibility and ease of use. When a gamma approximation can be found that adequately fits the data, we may estimate the number of bins that were not sequenced and that could potentially be revealed by additional sequencing. We evaluated the performance of this model using simulated metagenomes and demonstrate its applicability on three recent metagenomic datasets.
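To make the dependence of coverage on read count and read length concrete, the sketch below uses the classical Lander-Waterman idealization for a single genome sampled uniformly at random. This is a swapped-in textbook model, not the gamma approximation the abstract proposes, and it ignores the relative-abundance factor that the abstract highlights. The function names and the numbers in the example are hypothetical.

```python
import math


def expected_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Mean coverage c = N * L / G (average number of reads covering each base)."""
    return num_reads * read_length / genome_length


def expected_unsequenced_fraction(coverage: float) -> float:
    """Under uniform random sampling, the expected fraction of bases with no read is e^(-c)."""
    return math.exp(-coverage)


# Made-up example: 10,000 reads of 100 bp from a 1 Mbp genome.
c = expected_coverage(num_reads=10_000, read_length=100, genome_length=1_000_000)
missing = expected_unsequenced_fraction(c)
```

With these numbers the mean coverage is 1.0, and roughly 37% of the genome would be expected to remain unsequenced. In a real community, genomes differ in abundance, which is one reason a single-genome formula like this is insufficient and a more flexible model is needed.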

eScholarship - University of California

Metagenomic Analysis of Lysogeny in Tampa Bay: Implications for Prophage Gene Expression

Authors: Amy Long, Forest Rohwer, Geraldine Butler, Jennifer Mobberley, John H. Paul, Lauren McDaniel, Matthew Haynes, Mya Breitbart
Publication venue: Public Library of Science
Publication date: 01/01/2008

Phage integrase genes often play a role in the establishment of lysogeny in temperate phage by catalyzing the integration of the phage into one of the host's replicons. To investigate temperate phage gene expression, an induced viral metagenome from Tampa Bay was sequenced by 454 pyrosequencing. The sequencing yielded 294,068 reads, of which 6.6% were identifiable. One hundred three sequences had significant similarity to integrases by BLASTX analysis (E ≤ 0.001). The four sequences with the strongest amino-acid-level similarity to integrases were selected, and real-time PCR primers and probes were designed for them. Initial testing with microbial fraction DNA from Tampa Bay revealed 1.9×10⁷ and 1300 gene copies L⁻¹ of Vibrio-like integrase and Oceanicola-like integrase, respectively. The other two integrases were not detected. The integrase assay was then tested on microbial fraction RNA extracted from 200 ml of Tampa Bay water sampled biweekly over a 12-month time series. Vibrio-like integrase gene expression was detected in three samples, with estimated copy numbers of 2.4 to 1280 L⁻¹. Clostridium-like integrase gene expression was detected in six samples, with estimated copy numbers of 37 to 265 L⁻¹. In all cases, detection of integrase gene expression corresponded to the occurrence of lysogeny as detected by prophage induction. Investigation of the environmental distribution of the two expressed integrases in the Global Ocean Survey Database found that the Vibrio-like integrase was present in genome equivalents of 3.14% of microbial libraries and in all four viral metagenomes. There were two similar genes in the library from British Columbia, and one similar gene was detected in each of the Gulf of Mexico and Sargasso Sea libraries. In contrast, eleven similar genes were observed in the Arctic library. The Clostridium-like integrase was less prevalent, being found in 0.58% of the microbial libraries and none of the viral libraries. These results underscore the value of metagenomic data in discovering signature genes that play important roles in the environment through their expression, as demonstrated by integrases in lysogeny.
