4,335 research outputs found

    Design and evaluation of a genomics variant analysis pipeline using GATK Spark tools

    Scalable and efficient processing of genome sequence data, e.g. for variant discovery, is key to the mainstream adoption of high-throughput technology for disease prevention and for clinical use. Achieving scalability, however, requires a significant effort to enable the parallel execution of the analysis tools that make up the pipelines. This is facilitated by the new Spark versions of the well-known GATK toolkit, which offer a black-box approach by transparently exploiting the underlying MapReduce architecture. In this paper we report on our experience implementing a standard variant discovery pipeline using GATK 4.0 with Docker-based deployment over a cluster. We provide a preliminary performance analysis, comparing the processing times and cost to those of the new Microsoft Genomics Services.

    Performance-oriented Cloud Provisioning: Taxonomy and Survey

    Cloud computing is being viewed as the technology of today and the future. Through this paradigm, customers gain access to shared computing resources located in remote data centers that are hosted by cloud providers (CP). This technology allows for provisioning of various resources such as virtual machines (VM), physical machines, processors, memory, network, storage and software as per the needs of customers. Application providers (AP), who are customers of the CP, deploy applications on the cloud infrastructure, and these applications are then used by the end-users. To meet fluctuating application workload demands, dynamic provisioning is essential, and this article provides a detailed literature survey of dynamic provisioning within cloud systems with a focus on application performance. The well-known types of provisioning and the associated problems are clearly and pictorially explained, and the provisioning terminology is clarified. A very detailed and general cloud provisioning classification is presented, which views provisioning from different perspectives, aiding in understanding the process inside-out. Cloud dynamic provisioning is explained by considering resources, stakeholders, techniques, technologies, algorithms, problems, goals and more. Comment: 14 pages, 3 figures, 3 tables

    Value-Based Allocation of Docker Containers

    Recently, an increasing number of public cloud vendors have added Containers as a Service (CaaS) to their service portfolio. This is an adequate answer to the growing popularity of Docker, a software technology allowing Linux containers to run independently on a host in an isolated environment. As any software can be deployed in a container, the nature of containers differs, and thus assorted allocation and orchestration approaches are needed for their effective execution. In this paper, we focus on containers whose execution value for end users varies over time. A baseline and two dynamic allocation algorithms are proposed and compared with the default Docker scheduling algorithm. Experiments show that the proposed approach can increase the total value obtained from a workload up to three times, depending on the workload heaviness. It is also demonstrated that the algorithms scale well with the growing number of nodes in a cloud.