24,333 research outputs found

    Cloud Bioinformatics in a private cloud deployment

    This chapter describes service portability for a private cloud deployment, including a detailed case study about Cloud Bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). The Cloud Bioinformatics design and deployment is based on Storage Area Network (SAN) technologies, details of which include functionalities, technical implementation, architecture, and user support. Bioinformatics applications are written on the SAN-based private cloud, which can simulate complex biological sciences and present them in a way that anyone without prior knowledge can understand. Several bioinformatics results are discussed, particularly brain segmentation, which demonstrates different parts of the brain simulated by the private cloud. In addition, benefits of CCAF are illustrated using several bioinformatics examples such as tumour modelling, brain imaging, insulin molecules, and simulations for medical training. The Cloud Bioinformatics solution offers cost reduction, time-saving, and user friendliness.

    Cloud Storage and Bioinformatics in a private cloud deployment: Lessons for Data Intensive research

    This paper describes service portability for a private cloud deployment, including a detailed case study about Cloud Storage and bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). Our Cloud Storage design and deployment is based on Storage Area Network (SAN) technologies, details of which include functionalities, technical implementation, architecture and user support. Experiments for data services (backup automation, data recovery and data migration) are performed and results confirm backup automation is completed swiftly and is reliable for data-intensive research. The data recovery result confirms that execution time is in proportion to quantity of recovered data, but the failure rate increases in an exponential manner. The data migration result confirms execution time is in proportion to disk volume of migrated data, but again the failure rate increases in an exponential manner. In addition, benefits of CCAF are illustrated using several bioinformatics examples such as tumour modelling, brain imaging, insulin molecules and simulations for medical training. Our Cloud Storage solution described here offers cost reduction, time-saving and user friendliness

    CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

    Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. https://doi.org/10.1186/1471-2105-12-35

    Cloud Computing in Bioinformatics

    Cloud Computing presents a new approach to allow the development of dynamic, distributed and highly scalable software. For this purpose, Cloud Computing offers services, software and computing infrastructure independently through the network. To achieve a system that supports these characteristics, Service-Oriented Architectures (SOA) and agent frameworks exist which provide tools for developing distributed and multi-agent systems that can be used for the establishment of Cloud Computing environments. This paper presents a CISM@ (Cloud computing Integrated into Service-oriented Multi-Agent) architecture set on top of the platforms and frameworks by adding new layers for integrating a SOA and Cloud Computing approach and facilitating the distribution and management of functionalities. CISM@ has been applied to a real case study consisting of the analysis of microarray data and has allowed the efficient management of the allocation of resources to the different system agents.

    Multilevel parallelism in sequence alignment using a streaming approach

    Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015. Ultrascale computing and bioinformatics are two rapidly growing fields with a big impact right now and even more so in the future. The introduction of next generation sequencing pushes current bioinformatics tools and workflows to their limits in terms of performance. This forces the tools to become increasingly performant to keep up with the growing speed at which sequencing data is created. Ultrascale computing can greatly benefit bioinformatics in the challenges it faces today, especially in terms of scalability, data management and reliability. But before this is possible, the algorithms and software used in the field of bioinformatics need to be prepared to be used in a heterogeneous distributed environment. For this paper we choose to look at sequence alignment, which has been an active topic of research to speed up next generation sequence analysis, as it is ideally suited for parallel processing. We present a multilevel stream based parallel architecture to transparently distribute sequence alignment over multiple cores of the same machine, multiple machines and cloud resources. The same concepts are used to achieve multithreaded and distributed parallelism, making the architecture simple to extend and adapt to new situations. A prototype of the architecture has been implemented using an existing commercial sequence aligner. We demonstrate the flexibility of the implementation by running it on different configurations, combining local and cloud computing resources.

    Brain Segmentation - A Case study of Biomedical Cloud Computing for Education and Research

    Medical imaging is widely adopted in hospitals and medical institutes, and new ways to improve existing medical imaging services are regularly explored. This paper describes how the adoption of Cloud Computing is useful for medical education and research, and describes the methodology, results and lessons learned. A working Bioinformatics Cloud platform can demonstrate computation and visualisation of brain imaging. The aim is to study segmentation of brains, which divides the brain into ten major regions. The Cloud platform has two functions: (i) it can highlight each of the ten segmented regions; and (ii) it can adjust the intensity of segmentation to allow basic study of brain medicine. Two types of benefits are reported. Firstly, all the medical student participants are reported to have a 20% improvement in their learning satisfaction. Secondly, 100% of volunteer participants are reported to have a positive learning experience.

    MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud

    This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record Roberto R. Expósito, Jorge Veiga, Jorge González-Domínguez, Juan Touriño; MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud, Bioinformatics, Volume 33, Issue 17, 1 September 2017, Pages 2762–2764 is available online at: https://doi.org/10.1093/bioinformatics/btx307 [Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted MapReduce programming model to fully exploit Big Data technologies on cloud-based infrastructures. Written in Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for scalable Big Data processing. On a 16-node cluster deployed on the Amazon EC2 cloud platform, MarDRe is up to 8.52 times faster than a representative state-of-the-art tool. Ministerio de Economia y Competitividad; TIN2016-75845-P. Ministerio de Educación; FPU014/0280

    PRIVATE CLOUD INITIATIVES USING BIOINFORMATICS RESOURCES AND APPLICATIONS FACILITY (BRAF)

    ABSTRACT The bioinformatics research community demands enormous compute resources to run bioinformatics tools. Next generation sequencing technologies have further increased the overall demand for computational analysis. Traditional cluster and grid computing are complex to program and use, whereas the advent of cloud computing offers a silver lining in the form of on-demand high-performance infrastructure. We have adopted the technology so that it can prove its worth with more benefits to the community. We have been able to build a private cloud using BRAF, which is a high-end cluster facility dedicated to bioinformatics. Open source equivalents to prominent commercially available solutions are used as the cloud middleware stack. In this paper we summarize our implementation of a virtualized private cloud environment using Eucalyptus.