11 research outputs found

CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

Author: A Bateman
A Tridgell
Aaron Gussman
AC Stewart
AL Delcher
B Langmead
BE Suzek
C Hemmerich
C Rapier
Cesar Arze
D Field
D Hull
David R Riley
DL Wheeler
DR Zerbino
E Afgan
EE Schadt
F Meyer
J Dean
J Goecks
J Orvis
J White
James R White
JD Selengut
JG Caporaso
JP Mesirov
JR Cole
JR Miller
JR White
JT Dudley
K Galens
K Keahey
K Lagesen
Kevin Galens
LD Stein
M Reich
Mahesh Vangala
Malcolm Matalka
MC Schatz
O Trelles
Owen White
PD Schloss
RC Edgar
RK Aziz
RL Tatusov
S Angiuoli
Samuel V Angiuoli
SD Kahn
SF Altschul
SR Eddy
TM Lowe
W Florian Fricke
Publication venue: BioMed Central
Publication date: 01/01/2011

Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole-genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports the use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing.

https://doi.org/10.1186/1471-2105-12-35

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

High quality regulation: its popularity, its tools and its future

Author: James White
Malcolm Matalka
Samuel Angiuoli
W. Florian Fricke
Publication venue: Routledge
Publication date: 01/05/2009

Ideas regarding 'better regulation' and 'high-quality regulation' have become key aspects of contemporary administrative reform initiatives. What explains the popularity of this agenda? What does the comparative experience tell us about its impact? And what is its future? This article suggests that the contemporary debate is flawed by competing assumptions hiding behind a common language. A more promising approach is to embed high-quality regulation into regulatory conversations rather than imposing requirements through hierarchical means.

Crossref

LSE Research Online

Resources and costs for microbial sequence analysis evaluated using virtual machines and cloud computing.

Author: James R White
Malcolm Matalka
Owen White
Samuel V Angiuoli
W Florian Fricke
Publication venue: Public Library of Science (PLoS)
Publication date: 19/10/2011

The widespread popularity of genomic applications is threatened by the "bioinformatics bottleneck" resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly.

We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers.

Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.

Directory of Open Access Journals

PubMed Central

CloVR-Metagenomics: Functional and taxonomic microbial community characterization from metagenomic whole-genome shotgun (WGS) sequences – standard operating procedure, version 1.0

Author: Cesar Arze
James White
Malcolm Matalka
Samuel Angiuoli
The CloVR Team
W. Florian Fricke
Publication venue: 'Springer Science and Business Media LLC'

Crossref

CloVR-16S: Phylogenetic microbial community composition analysis based on 16S ribosomal RNA amplicon sequencing – standard operating procedure, version 1.0

Author: Cesar Arze
James White
Malcolm Matalka
Owen White
Samuel Angiuoli
The CloVR Team
W. Florian Fricke
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Resources and Costs for Microbial Sequence Analysis Evaluated Using Virtual Machines and Cloud Computing

Author: James R. White
Malcolm Matalka
Owen White
Samuel V. Angiuoli
Sarah K. Highlander
W. Florian Fricke
Publication venue: 'Public Library of Science (PLoS)'

Crossref

CloVR-16S: Phylogenetic microbial community composition analysis based on 16S ribosomal RNA amplicon sequencing – standard operating procedure, version 1.0

Author: Cesar Arze
James White
Malcolm Matalka
Owen White
Samuel Angiuoli
The CloVR Team
W. Florian Fricke
Publication date: 01/01/2011

The CloVR-16S pipeline employs several well-known phylogenetic tools and protocols for the analysis of 16S rRNA sequence datasets:

A) Mothur [1] – a C++-based software package used for clustering 16S rRNA sequences into operational taxonomic units (OTUs). Mothur creates OTUs using a matrix that describes pairwise distances between representative aligned sequences and subsequently estimates within-sample diversity (alpha diversity);
B) The Ribosomal Database Project (RDP) naive Bayesian classifier [2] assigns each 16S sequence to a reference taxonomy with associated empirical probabilities based on oligonucleotide frequencies;
C) QIIME [3] – a Python-based workflow package, allowing for sequence processing and phylogenetic analysis using different methods including phylogenetic distance (UniFrac [4]) for within- (alpha diversity) and between- (beta diversity) sample analysis;
D) Metastats [5] and custom R scripts used to generate additional statistical and graphical evaluations.

Though some of the protocols used in CloVR-16S overlap in purpose (e.g. OTU clustering), the end-user benefits from their overall complementary nature, as they focus on different aspects of the phylogenetic analysis. CloVR-16S accepts as input raw multiplex 454-pyrosequencer output, i.e. pooled pyrotagged sequences from multiple samples, or alternatively, pre-processed sequences from multiple samples in separate files. This protocol is available in CloVR beta versions 0.5 and 0.6.
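
As a rough illustration of the within-sample (alpha) diversity step described above, the sketch below computes the Shannon index from per-OTU read counts. It is not Mothur's or QIIME's code: the function name `shannon_index` and the sample counts are assumptions made for illustration, and the abstract does not state which diversity estimators CloVR-16S actually reports.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    h = 0.0
    for count in otu_counts:
        if count > 0:  # empty OTUs contribute nothing to the sum
            p = count / total
            h -= p * math.log(p)
    return h


# Hypothetical sample: reads assigned to four OTUs.
sample = [40, 30, 20, 10]
print(round(shannon_index(sample), 4))
```

A sample dominated by a single OTU gives a value near zero, while reads spread evenly over many OTUs give a higher value, which is the sense in which the index summarizes within-sample diversity.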

Crossref

Nature Precedings

CloVR-Microbe: Assembly, gene finding and functional annotation of raw sequence data from single microbial genome projects – standard operating procedure, version 1.0

Author: Cesar Arze
James R. White
Kevin Galens
Malcolm Matalka
Michelle Gwinn Giglio
Owen White
Samuel V. Angiuoli
The CloVR Team
W. Florian Fricke
Publication date: 01/01/2011

The CloVR-Microbe pipeline performs the basic processing and analysis steps required for standard microbial single-genome sequencing projects: A) Whole-genome shotgun sequence assembly; B) Identification of protein and RNA-coding genes; and C) Functional gene annotation. B) and C) are based on the IGS Annotation Engine (http://ae.igs.umaryland.edu/), which is described elsewhere (K Galens et al. submitted). The assembly component of CloVR-Microbe can be executed independently from the gene identification and annotation components. Alternatively, pre-assembled sequence contigs can be used to perform gene identifications and annotations. The pipeline input may consist of unassembled raw sequence reads from the Sanger, Roche/454 GS FLX or Illumina GAII or HiSeq sequencing platforms or of combinations of Sanger and Roche/454 sequence data. The pipeline output consists of results and summary files generated during the different pipeline steps. Annotated sequence files are generated that are compatible with common genome browser tools and can be submitted to the GenBank repository at NCBI. This protocol is available in CloVR beta versions 0.5 and 0.6.
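
The stage selection described above (assembly runnable on its own, or pre-assembled contigs going straight to gene identification and annotation) can be sketched as a small planning function. This is a minimal sketch, not CloVR code: `plan_stages`, its parameters, and the stage names are hypothetical and chosen only to mirror steps A) to C) in the abstract.

```python
def plan_stages(input_kind, annotate=True):
    """Return the ordered pipeline stages for a given input type.

    input_kind: "reads" (unassembled raw reads) or "contigs" (pre-assembled).
    annotate:   whether to run gene identification and functional annotation.
    """
    if input_kind == "reads":
        stages = ["assembly"]  # step A
    elif input_kind == "contigs":
        stages = []  # assembly is skipped for pre-assembled input
    else:
        raise ValueError(f"unknown input kind: {input_kind!r}")
    if annotate:
        # steps B and C always run together in this sketch
        stages += ["gene_identification", "functional_annotation"]
    if not stages:
        raise ValueError("nothing to run: contigs supplied but annotation disabled")
    return stages


print(plan_stages("reads"))
print(plan_stages("reads", annotate=False))
print(plan_stages("contigs"))
```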

Crossref

Nature Precedings

CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

Author: Angiuoli Samuel V
Arze Cesar
Fricke W Florian
Galens Kevin
Gussman Aaron
Matalka Malcolm
Riley David R
Vangala Mahesh
White James R
White Owen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2011

Abstract

Background: Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software.

Results: We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole-genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports the use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms.

Conclusion: The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing.

Directory of Open Access Journals