2,719 research outputs found

    Elastic Business Process Management: State of the Art and Open Challenges for BPM in the Cloud

    Full text link
    With the advent of cloud computing, organizations are nowadays able to react rapidly to changing demands for computational resources. Not only individual applications, but also complete business processes can be hosted on virtual cloud infrastructures. This allows the realization of so-called elastic processes, i.e., processes which are carried out using elastic cloud resources. Despite the manifold benefits of elastic processes, there is still a lack of solutions supporting them. In this paper, we identify the state of the art of elastic Business Process Management with a focus on infrastructural challenges. We conceptualize an architecture for an elastic Business Process Management System and discuss existing work on scheduling, resource allocation, monitoring, decentralized coordination, and state management for elastic processes. Furthermore, we present two representative elastic Business Process Management Systems which are intended to counter these challenges. Based on our findings, we identify open issues and outline possible research directions for the realization of elastic processes and elastic Business Process Management.
    Comment: Please cite as: S. Schulte, C. Janiesch, S. Venugopal, I. Weber, and P. Hoenisch (2015). Elastic Business Process Management: State of the Art and Open Challenges for BPM in the Cloud. Future Generation Computer Systems, Volume NN, Number N, NN-NN., http://dx.doi.org/10.1016/j.future.2014.09.00
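
    As a hedged illustration of the resource allocation challenge discussed above (not taken from the paper; the names and the threshold policy are hypothetical), an elastic BPMS scheduler might make lease-or-release decisions along these lines:

```python
from dataclasses import dataclass

@dataclass
class ResourcePool:
    """Virtual machines currently leased from the cloud provider."""
    vms: int
    capacity_per_vm: int  # process steps each VM can run concurrently

def scaling_decision(pending_steps: int, pool: ResourcePool,
                     headroom: float = 0.2) -> int:
    """Return how many VMs to lease (+) or release (-).

    A toy policy: keep total capacity within `headroom` of demand, so the
    process landscape scales out under load and scales in when demand
    drops -- the elasticity the paper targets.
    """
    demand = pending_steps * (1 + headroom)
    needed = -(-int(demand) // pool.capacity_per_vm)  # ceiling division
    return max(needed, 1) - pool.vms                  # keep at least one VM

if __name__ == "__main__":
    pool = ResourcePool(vms=2, capacity_per_vm=10)
    print(scaling_decision(pending_steps=35, pool=pool))  # -> 3: lease 3 more VMs
```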

    A Taxonomy of Workflow Management Systems for Grid Computing

    Full text link
    With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art Grid workflow systems, but also identifies the areas that need further research.
    Comment: 29 pages, 15 figures
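
    As an illustration only (the dimension values below paraphrase axes commonly used in such taxonomies and are not quoted from the paper), classification dimensions of this kind can be captured directly as enumerations used to place a given system in the taxonomy:

```python
from enum import Enum

class Structure(Enum):        # how the workflow is modelled
    DAG = "directed acyclic graph"
    NON_DAG = "allows loops and conditional structures"

class Scheduling(Enum):       # where scheduling decisions are made
    CENTRALIZED = "single scheduler for the whole workflow"
    HIERARCHICAL = "central manager delegating to lower-level schedulers"
    DECENTRALIZED = "cooperating peer schedulers"

class DataMovement(Enum):     # how intermediate data travels
    CENTRALIZED = "staged through a central point"
    MEDIATED = "handled by a distributed data management service"
    PEER_TO_PEER = "transferred directly between resources"

# Classifying a hypothetical Grid workflow system along these axes:
classification = {
    "structure": Structure.DAG,
    "scheduling": Scheduling.CENTRALIZED,
    "data_movement": DataMovement.PEER_TO_PEER,
}
print(classification)
```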

    DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service

    Full text link
    Serverless computing is becoming increasingly popular due to its ease of use and fine-grained billing. These features make it appealing for stateful applications and serverless workflows. However, current serverless workflow systems use a controlflow-based invocation pattern to invoke functions: the invocation of a function depends on the state of its precursor functions, and a function can only begin executing once all its precursor functions have completed. This pattern can therefore lead to longer end-to-end execution times. We design and implement DFlow, a novel dataflow-based serverless workflow system that achieves high performance for serverless workflows. DFlow introduces a distributed scheduler (DScheduler) that uses a dataflow-based invocation pattern to invoke functions: the invocation depends on the data dependencies between functions, so a function can start to execute even while its precursor functions are still running. DFlow further features a distributed store (DStore) that applies effective fine-grained optimization techniques to eliminate function interaction, thereby enabling efficient data exchange. With the support of DScheduler and DStore, DFlow achieves an average improvement in 99%-ile latency of 60% over CFlow, 40% over FaaSFlow, 25% over FaaSFlowRedis, and 40% over KNIX, respectively. Further, it improves network bandwidth utilization by 2x-4x over CFlow and 1.5x-3x over FaaSFlow, FaaSFlowRedis and KNIX, respectively. DFlow also effectively reduces cold startup latency, achieving an average improvement of 5.6x over CFlow and 1.1x over FaaSFlow.
    Comment: 22 pages, 13 figures
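
    To make the two invocation patterns concrete, here is a minimal sketch (illustrative only; the names are not DFlow's API) contrasting a controlflow-style barrier with dataflow-style invocation, where the consumer starts before all of its precursors have finished:

```python
import concurrent.futures as cf
import time

def produce(name: str, delay: float) -> str:
    time.sleep(delay)                      # simulate a precursor function
    return f"{name}-output"

pool = cf.ThreadPoolExecutor()

# Controlflow-style invocation: the consumer is not even started until
# ALL precursor functions have completed (a barrier on function state).
fa = pool.submit(produce, "A", 1.0)
fb = pool.submit(produce, "B", 2.0)
cf.wait([fa, fb])                          # ~2s elapse before any consumer work
controlflow_result = f"combined({fa.result()}, {fb.result()})"

# Dataflow-style invocation: the consumer starts immediately and pulls each
# input the moment its producer finishes, so its work on A's output
# overlaps with B's still-running producer.
def consume(fut_a: cf.Future, fut_b: cf.Future) -> str:
    a = fut_a.result()                     # available after ~1s
    partial = a.upper()                    # useful work while B still runs
    return f"combined({partial}, {fut_b.result()})"

fa = pool.submit(produce, "A", 1.0)
fb = pool.submit(produce, "B", 2.0)
dataflow_result = pool.submit(consume, fa, fb).result()
print(controlflow_result, dataflow_result, sep="\n")
```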

    A Model for Scientific Workflows with Parallel and Distributed Computing

    Get PDF
    In the last decade we witnessed an immense evolution of computing infrastructures in terms of processing, storage and communication. On one hand, developments in hardware architectures have made it possible to run multiple virtual machines on a single physical machine. On the other hand, the increase in available network communication bandwidth has enabled the widespread use of distributed computing infrastructures, for example based on clusters, grids and clouds. These factors have enabled different scientific communities to develop and implement complex scientific applications, possibly involving large amounts of data. However, due to their structural complexity, these applications require decomposition models that allow multiple tasks to run in parallel and distributed environments. The scientific workflow concept arises naturally as a way to model applications composed of multiple activities. In fact, over the past decades many initiatives have been undertaken to model application development using the workflow paradigm, both in the business and in the scientific domains. Despite such intensive efforts, however, current scientific workflow systems and tools still have limitations, which pose difficulties to the development of emerging large-scale, distributed and dynamic applications.

    This dissertation proposes the AWARD model for scientific workflows with parallel and distributed computing. AWARD is an acronym for Autonomic Workflow Activities Reconfigurable and Dynamic. The AWARD model has the following main characteristics. It is based on a decentralized execution control model in which multiple autonomic workflow activities interact by exchanging tokens through input and output ports. The activities can be executed separately in diverse computing environments, such as a single computer or multiple virtual machines running on distributed infrastructures like clusters and clouds. It provides basic workflow patterns for parallel and distributed application decomposition, as well as other useful patterns supporting feedback loops and load balancing. The model can express applications based on a finite or infinite number of iterations, thus allowing long-running workflows, which are typical in scientific experimentation, to be modelled. A distinctive contribution of the AWARD model is its support for dynamic reconfiguration of long-running workflows. A dynamic reconfiguration modifies the structure of the workflow, for example by introducing new activities or changing the connections between activity input and output ports. The activity behavior can also be modified, for example by dynamically replacing the activity algorithm.

    In addition to proposing a new workflow model, this dissertation presents the implementation of a fully functional software architecture that supports the AWARD model. The implemented prototype was used to validate and refine the model across multiple workflow scenarios, with experimental results demonstrating the advantages of the model's major characteristics and contributions. The prototype was also used to develop application cases, such as a workflow supporting an implementation of the MapReduce model and a workflow supporting a text mining application developed by an external user. The extensive experimental work confirmed the adequacy of the AWARD model and its implementation for developing applications that exploit parallelism and distribution using the scientific workflow paradigm.
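
    As a minimal sketch of the token-based, decentralized execution control described above (the names and structure are illustrative assumptions, not the AWARD implementation), each activity can be modelled as an autonomous loop that consumes tokens on an input port and emits tokens on an output port:

```python
import queue
import threading

class Activity(threading.Thread):
    """An autonomic activity: consumes a token from its input port,
    applies its (dynamically replaceable) algorithm, and forwards the
    result token on its output port."""

    def __init__(self, algorithm, inport: queue.Queue, outport: queue.Queue):
        super().__init__(daemon=True)
        self.algorithm = algorithm          # replaceable at run time
        self.inport, self.outport = inport, outport

    def run(self):
        while True:
            token = self.inport.get()
            if token is None:               # poison pill ends the activity
                self.outport.put(None)
                break
            self.outport.put(self.algorithm(token))

# Wire two activities into a tiny pipeline: source -> square -> negate.
p1, p2, p3 = queue.Queue(), queue.Queue(), queue.Queue()
Activity(lambda t: t * t, p1, p2).start()
Activity(lambda t: -t, p2, p3).start()

for t in [1, 2, 3, None]:
    p1.put(t)
while (tok := p3.get()) is not None:
    print(tok)                              # -1, -4, -9
```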

    Evaluating the benefits of key-value databases for scientific applications

    Get PDF
    The convergence of Big Data applications with High-Performance Computing requires new methodologies to store, manage and process large amounts of information. Traditional storage solutions are unable to scale, which results in complex coding strategies. For example, the brain atlas of the Human Brain Project faces the challenge of processing large amounts of high-resolution brain images. Given the computing needs, we study the effects of replacing a traditional storage system with a distributed key-value database on a cell segmentation application. The original code uses HDF5 files on GPFS through an intricate interface, imposing synchronizations. By contrast, using Apache Cassandra or ScyllaDB through Hecuba greatly simplifies the application code. Thanks to the key-value data model, the number of synchronizations is reduced and the time dedicated to I/O scales when increasing the number of nodes.
    This project/research has received funding from the European Union's Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 720270 (Human Brain Project SGA1) and the Specific Grant Agreement No. 785907 (Human Brain Project SGA2). This work has also been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2017-SGR-1414).
    Postprint (author's final draft)
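
    The access pattern that removes those synchronizations can be sketched as follows, using the DataStax cassandra-driver directly rather than the Hecuba interface the paper relies on; the keyspace, table and column names are assumptions for illustration. Each worker writes its image tiles independently, keyed by coordinates, with no shared-file locking as in the HDF5-on-GPFS setup:

```python
# Requires a reachable Cassandra/ScyllaDB node and `pip install cassandra-driver`.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS atlas
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS atlas.tiles (
        image_id int, tile_x int, tile_y int, pixels blob,
        PRIMARY KEY ((image_id), tile_x, tile_y))
""")

# Writes from concurrent workers go to independent keys; the database,
# not the application, handles distribution and consistency.
insert = session.prepare(
    "INSERT INTO atlas.tiles (image_id, tile_x, tile_y, pixels) VALUES (?, ?, ?, ?)")
session.execute(insert, (1, 0, 0, b"\x00" * 16))

row = session.execute(
    "SELECT pixels FROM atlas.tiles WHERE image_id=1 AND tile_x=0 AND tile_y=0").one()
print(len(row.pixels))
```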