
    A Taxonomy of Workflow Management Systems for Grid Computing

    With the advent of Grid and application technologies, scientists and engineers are building increasingly complex applications to manage and process large data sets and to execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of the state of the art in Grid workflow systems, but also identifies areas that need further research.
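    A minimal sketch of how such a taxonomy could be applied in practice: each surveyed system is recorded against a handful of classification dimensions so that similarities and open gaps become easy to query. The dimension names and the example classifications below are illustrative placeholders, not the paper's actual terms or findings.

        from dataclasses import dataclass, field

        @dataclass
        class WorkflowSystemProfile:
            """Illustrative record classifying a Grid workflow system along
            taxonomy-style dimensions (example names, not the paper's exact terms)."""
            name: str
            workflow_structure: str          # e.g. "DAG" or "non-DAG"
            composition: str                 # e.g. "graph-based GUI", "language-based"
            scheduling: str                  # e.g. "centralised", "hierarchical", "decentralised"
            fault_tolerance: list[str] = field(default_factory=list)
            data_movement: str = "centralised"

        # Example (placeholder) classifications; comparing profiles side by side
        # makes design similarities and unexplored combinations visible.
        systems = [
            WorkflowSystemProfile("SystemA", "DAG", "language-based", "decentralised",
                                  ["retry", "checkpointing"], "mediated"),
            WorkflowSystemProfile("SystemB", "non-DAG", "graph-based GUI", "centralised",
                                  ["retry"], "peer-to-peer"),
        ]
        for s in systems:
            print(s.name, s.scheduling, s.fault_tolerance)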

    Multi-layered simulations at the heart of workflow enactment on clouds

    Scientific workflow systems face new challenges when supporting Cloud computing, as the information on the state of the used infrastructures is much less detailed than before. Thus, organising virtual infrastructures in a way that not only supports workflow execution but also optimises it for several service level objectives (e.g. maximum energy consumption limit, cost, reliability, availability) becomes reliant on good Cloud modelling and prediction information. While simulators have successfully aided research on such workflow management systems, the currently available Cloud-related simulation toolkits suffer from several issues (e.g. scalability and narrow scope) that hinder their applicability. To address these issues, this article introduces techniques for unifying two existing simulation toolkits, first by analysing the problems with the current simulators and then by illustrating the problems faced by workflow systems. For this purpose we use the example of the ASKALON environment, a scientific workflow composition and execution tool for cloud and grid environments. We illustrate the advantages of a workflow system with a directly integrated simulation back-end and show that the unification of the selected simulators does not affect the overall workflow execution simulation performance.
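    A rough sketch of the idea behind a directly integrated simulation back-end: the enactor asks a pluggable simulator for runtime and cost predictions before placing a task, rather than relying on detailed infrastructure state. The class and function names, the linear performance model, and the VM catalogue are assumptions for illustration only, not ASKALON's actual API.

        from dataclasses import dataclass

        @dataclass
        class Prediction:
            runtime_s: float
            cost: float

        class SimulationBackend:
            """Stand-in for a unified cloud simulator; returns crude estimates."""
            def predict(self, task_flops: float, vm_gflops: float, price_per_h: float) -> Prediction:
                runtime = task_flops / (vm_gflops * 1e9)
                return Prediction(runtime, price_per_h * runtime / 3600.0)

        def place_task(task_flops: float, vm_catalogue: dict, sim: SimulationBackend) -> str:
            # Pick the VM type with the lowest simulated cost; ties broken by runtime.
            def cost_then_runtime(vm_name):
                p = sim.predict(task_flops, *vm_catalogue[vm_name])
                return (p.cost, p.runtime_s)
            return min(vm_catalogue, key=cost_then_runtime)

        # Hypothetical catalogue: (GFLOPS, $/hour) per VM type.
        catalogue = {"small": (10.0, 0.05), "large": (40.0, 0.25)}
        print(place_task(5e12, catalogue, SimulationBackend()))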

    Fostering energy-awareness in simulations behind scientific workflow management systems

    Scientific workflow management systems face a new challenge in the era of cloud computing: the rich information on the state of the used infrastructures that was available in the past is gone. Thus, organising virtual infrastructures so that they not only support the workflow being executed but also optimise for several service level objectives (e.g., maximum energy consumption limit, cost, reliability, availability) becomes dependent on good infrastructure modelling and prediction techniques. While simulators have been successfully used in the past to aid research on such workflow management systems, the currently available cloud-related simulation toolkits suffer from several issues (e.g., scalability, narrow scope) that hinder their applicability. To address this need, this paper introduces techniques for unifying two existing simulation toolkits, first by analysing the problems with the current simulators and then by illustrating the problems faced by workflow systems through the example of the ASKALON environment. Finally, we show how the unification of the selected simulators improves on the discussed problems.
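    To make the energy-awareness objective concrete, the sketch below checks a simulated schedule against a maximum energy consumption limit using a common linear idle/peak power model; the model, constants, and function names are assumptions for illustration and do not reproduce the paper's simulator.

        def vm_power_watts(utilisation: float, idle_w: float = 100.0, peak_w: float = 250.0) -> float:
            # Linear interpolation between idle and peak power (assumed model).
            return idle_w + (peak_w - idle_w) * max(0.0, min(1.0, utilisation))

        def workflow_energy_kwh(schedule: list) -> float:
            """schedule: (runtime_seconds, utilisation) pairs for each simulated VM slot."""
            joules = sum(runtime * vm_power_watts(util) for runtime, util in schedule)
            return joules / 3.6e6  # joules -> kWh

        def violates_energy_slo(schedule, limit_kwh: float) -> bool:
            return workflow_energy_kwh(schedule) > limit_kwh

        # Two simulated VM slots: one hour at 80% load, half an hour at 40% load.
        print(violates_energy_slo([(3600, 0.8), (1800, 0.4)], limit_kwh=0.5))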

    BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments

    Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance computing (HPC) techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems (SWfMS) and databases. In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments. This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application. Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information. We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow. We analyze each workflow from both computational and scientific domain perspectives by using queries to a provenance and annotation database, some of which are available as pre-built features of the BioWorkbench web application. Through the provenance data, we show that the framework is scalable and achieves high performance, reducing the execution time of the case studies by up to 98%. We also show how the application of machine learning techniques can enrich the analysis process.
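    As an illustration of the kind of pre-built provenance query such a web application might abstract, the sketch below aggregates per-workflow task execution times from a relational provenance store; the SQLite schema and column names are hypothetical, not BioWorkbench's actual provenance model.

        import sqlite3

        # In-memory provenance store with a hypothetical task-execution table.
        conn = sqlite3.connect(":memory:")
        conn.execute("""CREATE TABLE task_execution (
                          workflow TEXT, task TEXT,
                          started_at REAL, finished_at REAL)""")
        conn.execute("INSERT INTO task_execution VALUES ('SwiftPhylo', 'tree_build', 0.0, 812.4)")
        conn.execute("INSERT INTO task_execution VALUES ('SwiftPhylo', 'alignment', 0.0, 153.9)")

        # Per-workflow total and longest-running task: the kind of performance view
        # that collected provenance makes possible without re-running the experiment.
        rows = conn.execute("""SELECT workflow,
                                      SUM(finished_at - started_at) AS total_s,
                                      MAX(finished_at - started_at) AS longest_s
                               FROM task_execution GROUP BY workflow""").fetchall()
        print(rows)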

    Fine-Grain Interoperability of Scientific Workflows in Distributed Computing Infrastructures

    Today there exists a wide variety of scientific workflow management systems, each designed to fulfill the needs of a certain scientific community. Unfortunately, once a workflow application has been designed in one particular system, it becomes very hard to share it with users working with different systems. Portability of workflows and interoperability between current systems barely exist. In this work, we present the fine-grained interoperability solution proposed in the SHIWA European project that brings together four representative European workflow systems: ASKALON, MOTEUR, WS-PGRADE, and Triana. The proposed interoperability is realised at two levels of abstraction: abstract and concrete. At the abstract level, we propose a generic Interoperable Workflow Intermediate Representation (IWIR) that can be used as a common bridge for translating workflows between different languages independent of the underlying distributed computing infrastructure. At the concrete level, we propose a bundling technique that aggregates the abstract IWIR representation and concrete task representations to enable workflow instantiation, execution and scheduling. We illustrate case studies using two real workflow applications designed in a native environment and then translated and executed by a foreign workflow system in a foreign distributed computing infrastructure.
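    A small sketch of what emitting an abstract-level intermediate representation might look like: a two-task workflow is serialized to an IWIR-style XML document with tasks and typed input/output ports. The element and attribute names only approximate IWIR, and the tiny workflow itself is invented for illustration.

        import xml.etree.ElementTree as ET

        def atomic_task(name: str, tasktype: str, inputs: list, outputs: list) -> ET.Element:
            # Build a task element with typed input/output ports (names approximate IWIR).
            task = ET.Element("task", name=name, tasktype=tasktype)
            in_ports = ET.SubElement(task, "inputPorts")
            for p in inputs:
                ET.SubElement(in_ports, "inputPort", name=p, type="file")
            out_ports = ET.SubElement(task, "outputPorts")
            for p in outputs:
                ET.SubElement(out_ports, "outputPort", name=p, type="file")
            return task

        # Hypothetical two-task workflow: align reads, then merge the results.
        root = ET.Element("IWIR", wfname="align_then_merge")
        root.append(atomic_task("align", "AlignTool", ["reads"], ["aligned"]))
        root.append(atomic_task("merge", "MergeTool", ["aligned"], ["result"]))
        print(ET.tostring(root, encoding="unicode"))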