18,739 research outputs found

CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

Author: A Bateman
A Tridgell
Aaron Gussman
AC Stewart
AL Delcher
B Langmead
BE Suzek
C Hemmerich
C Rapier
Cesar Arze
D Field
D Hull
David R Riley
DL Wheeler
DR Zerbino
E Afgan
EE Schadt
F Meyer
J Dean
J Goecks
J Orvis
J White
James R White
JD Selengut
JG Caporaso
JP Mesirov
JR Cole
JR Miller
JR White
JT Dudley
K Galens
K Keahey
K Lagesen
Kevin Galens
LD Stein
M Reich
Mahesh Vangala
Malcolm Matalka
MC Schatz
O Trelles
Owen White
PD Schloss
RC Edgar
RK Aziz
RL Tatusov
S Angiuoli
Samuel V Angiuoli
SD Kahn
SF Altschul
SR Eddy
TM Lowe
W Florian Fricke
Publication venue: BioMed Central
Publication date: 01/01/2011

Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole-genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports the use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing.

https://doi.org/10.1186/1471-2105-12-35

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

Cloud Bioinformatics in a private cloud deployment

Author: Chang Victor
Publication venue: IGI Global
Publication date: 15/11/2013

Southampton (e-Prints Soton)

Cloud Storage and Bioinformatics in a private cloud deployment: Lessons for Data Intensive research

Author: Chang Victor
Walters Robert John
Wills Gary
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

This paper describes service portability for a private cloud deployment, including a detailed case study about Cloud Storage and bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). Our Cloud Storage design and deployment are based on Storage Area Network (SAN) technologies, details of which include functionalities, technical implementation, architecture and user support. Experiments for data services (backup automation, data recovery and data migration) are performed, and the results confirm that backup automation is completed swiftly and is reliable for data-intensive research. The data recovery result confirms that execution time is proportional to the quantity of recovered data, but the failure rate increases exponentially. The data migration result confirms that execution time is proportional to the disk volume of migrated data, but again the failure rate increases exponentially. In addition, the benefits of CCAF are illustrated using several bioinformatics examples such as tumour modelling, brain imaging, insulin molecules and simulations for medical training. The Cloud Storage solution described here offers cost reduction, time-saving and user friendliness.

Southampton (e-Prints Soton)

Crossref

Teeside University's Research Repository

Towards a Swiss National Research Infrastructure

Author: Bohnert Thomas
Edmonds Andrew
Eurich Markus
Flanders Dean
Flury Placi
Haug Sigve
Jamakovic-Kapic Almerina
Kunszt Peter
Leinen Simon
Maffioletti Sergio
Schiller Eryk
Stockinger Heinz
Publication date: 01/01/2013

In this position paper we describe the current status of, and plans for, a Swiss National Research Infrastructure. Swiss academic and research institutions are highly autonomous. Although loosely coupled, they do not rely on any centralized management entities. Therefore, a coordinated national research infrastructure can only be established by federating the various resources available locally at the individual institutions. The Swiss Multi-Science Computing Grid and the Swiss Academic Compute Cloud projects already serve a large number of diverse user communities. These projects also allow us to test the operational setup of such a heterogeneous federated infrastructure.

arXiv.org e-Print Archive

Crossref

ZHAW digitalcollection

ZORA

Bern Open Repository and Information System (BORIS)

Bringing Hadoop into Bioinformatics with Cloudgene and CloudMan

Author: Afgan Enis
Davidović Davor
Forer Lukas
Kronenberg Florian
Schönherr Sebastian
Weissensteiner Hansi
Publication date: 10/07/2015

Despite the evident potential of the MapReduce model and the existence of MapReduce-based bioinformatics algorithms and applications, these have yet to be widely adopted in bioinformatics data analysis. The Hadoop MapReduce model offers a simple framework for data parallelism by providing automated runtime recovery (for both task runtime and hardware failures), implicit scalability (tasks automatically run in parallel batch mode), as well as data replication and locality (reducing data movement and hence increasing processing capacity). We identify two prerequisites for wider adoption and higher utilization of MapReduce tools: (1) abstract the technical details of how multiple existing MapReduce tools are composed, and (2) provide easy access to the necessary compute infrastructure and the appropriate environment. Satisfying these requirements would allow bioinformatics domain experts to focus on the analysis while the required technical details are hidden. At BOSC 2012, two platforms were presented: Cloudgene, a MapReduce tool execution platform leveraging Hadoop, and CloudMan, a cloud resource manager. Since then, we have combined and extended these two platforms to provide a readily available and accessible Hadoop-based bioinformatics environment for the Cloud. Cloudgene, in addition to allowing arbitrary MapReduce tools to be integrated and used to craft an analysis, has been extended to serve as the job execution engine for what are currently two dedicated services: an imputation service developed in cooperation with the Center for Statistical Genetics, University of Michigan (available at imputationserver.sph.umich.edu) and an mtDNA analysis service (available at mtdnaserver.uibk.ac.at). Thus far, the “Michigan Imputation Server” has shown remarkable popularity and scalability, with over 690,000 human genomes imputed within one year. These services have been deployed on dedicated hardware and offer a simple interface for the specific tasks while the jobs are executed in MapReduce fashion. This demonstrates a positive disposition towards wider adoption of the MapReduce paradigm in the bioinformatics data analysis space, given accessible and effective solutions. To facilitate easy access to such MapReduce solutions for bioinformatics and broaden the availability of these services, we have extended CloudMan to provide a Hadoop-based environment with preconfigured Cloudgene. CloudMan handles the tasks of procuring the required cloud resources and configuring the appropriate environment, thus insulating the user from the low-level technical details otherwise required. Because CloudMan is compatible with multiple cloud technologies, it is now feasible to deploy this environment on a range of private and public clouds. This makes it possible for anyone to obtain a scalable Hadoop-based cluster with Cloudgene preinstalled and readily execute MapReduce tools. This talk will present the motivation for supporting greater adoption of MapReduce-based applications in the bioinformatics data analysis space, followed by the details of the described services and their functionality.
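
As a rough illustration of the MapReduce data-parallel pattern the abstract refers to, the sketch below counts k-mers across sequencing reads in plain Python. It is not Cloudgene, CloudMan or Hadoop code: the function names (`map_kmers`, `shuffle`, `reduce_counts`, `count_kmers`) and the k-mer counting task are assumptions chosen for illustration, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k=3):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    # Each read is mapped independently, so this step could run in parallel.
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], k=3))
```

Because each read is mapped independently and each key is reduced independently, a framework such as Hadoop can distribute both steps across machines and rerun failed tasks; the single-process version here only shows the shape of the computation.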

Full-text Institutional Repository of the Ruđer Bošković Institute

FigShare