40 research outputs found

    Supporting Quality of Service in Scientific Workflows

    Get PDF
    While workflow management systems have been utilized in enterprises to support businesses for almost two decades, the use of workflows in scientific environments was fairly uncommon until recently. Nowadays, scientists use workflow systems to conduct scientific experiments, simulations, and distributed computations. However, most scientific workflow management systems have not been built using existing workflow technology; rather they have been designed and developed from scratch. Due to the lack of generality of early scientific workflow systems, many domain-specific workflow systems have been developed. Generally speaking, those domain-specific approaches lack common acceptance and tool support and offer lower robustness compared to business workflow systems. In this thesis, the use of the industry standard BPEL, a workflow language for modeling business processes, is proposed for the modeling and the execution of scientific workflows. Due to the widespread use of BPEL in enterprises, a number of stable and mature software products exist. The language is expressive (Turingcomplete) and not restricted to specific applications. BPEL is well suited for the modeling of scientific workflows, but existing implementations of the standard lack important features that are necessary for the execution of scientific workflows. This work presents components that extend an existing implementation of the BPEL standard and eliminate the identified weaknesses. The components thus provide the technical basis for use of BPEL in academia. The particular focus is on so-called non-functional (Quality of Service) requirements. These requirements include scalability, reliability (fault tolerance), data security, and cost (of executing a workflow). From a technical perspective, the workflow system must be able to interface with the middleware systems that are commonly used by the scientific workflow community to allow access to heterogeneous, distributed resources (especially Grid and Cloud resources). The major components cover exactly these requirements: Cloud Resource Provisioner Scalability of the workflow system is achieved by automatically adding additional (Cloud) resources to the workflow system’s resource pool when the workflow system is heavily loaded. Fault Tolerance Module High reliability is achieved via continuous monitoring of workflow execution and corrective interventions, such as re-execution of a failed workflow step or replacement of the faulty resource. Cost Aware Data Flow Aware Scheduler The majority of scientific workflow systems only take the performance and utilization of resources for the execution of workflow steps into account when making scheduling decisions. The presented workflow system goes beyond that. By defining preference values for the weighting of costs and the anticipated workflow execution time, workflow users may influence the resource selection process. The developed multiobjective scheduling algorithm respects the defined weighting and makes both efficient and advantageous decisions using a heuristic approach. Security Extensions Because it supports various encryption, signature and authentication mechanisms (e.g., Grid Security Infrastructure), the workflow system guarantees data security in the transfer of workflow data. Furthermore, this work identifies the need to equip workflow developers with workflow modeling tools that can be used intuitively. This dissertation presents two modeling tools that support users with different needs. The first tool, DAVO (domain-adaptable, Visual BPEL Orchestrator), operates at a low level of abstraction and allows users with knowledge of BPEL to use the full extent of the language. DAVO is a software that offers extensibility and customizability for different application domains. These features are used in the implementation of the second tool, SimpleBPEL Composer. SimpleBPEL is aimed at users with little or no background in computer science and allows for quick and intuitive development of BPEL workflows based on predefined components

    Runtime Adaptation of Scientific Service Workflows

    Get PDF
    Software landscapes are rather subject to change than being complete after having been built. Changes may be caused by a modified customer behavior, the shift to new hardware resources, or otherwise changed requirements. In such situations, several challenges arise. New architectural models have to be designed and implemented, existing software has to be integrated, and, finally, the new software has to be deployed, monitored, and, where appropriate, optimized during runtime under realistic usage scenarios. All of these situations often demand manual intervention, which causes them to be error-prone. This thesis addresses these types of runtime adaptation. Based on service-oriented architectures, an environment is developed that enables the integration of existing software (i.e., the wrapping of legacy software as web services). A workflow modeling tool that aims at an easy-to-use approach by separating the role of the workflow expert and the role of the domain expert. After the development of workflows, tools that observe the executing infrastructure and perform automatic scale-in and scale-out operations are presented. Infrastructure-as-a-Service providers are used to scale the infrastructure in a transparent and cost-efficient way. The deployment of necessary middleware tools is automatically done. The use of a distributed infrastructure can lead to communication problems. In order to keep workflows robust, these exceptional cases need to treated. But, in this way, the process logic of a workflow gets mixed up and bloated with infrastructural details, which yields an increase in its complexity. In this work, a module is presented that can deal automatically with infrastructural faults and that thereby allows to keep the separation of these two layers. When services or their components are hosted in a distributed environment, some requirements need to be addressed at each service separately. Although techniques as object-oriented programming or the usage of design patterns like the interceptor pattern ease the adaptation of service behavior or structures. Still, these methods require to modify the configuration or the implementation of each individual service. On the other side, aspect-oriented programming allows to weave functionality into existing code even without having its source. Since the functionality needs to be woven into the code, it depends on the specific implementation. In a service-oriented architecture, where the implementation of a service is unknown, this approach clearly has its limitations. The request/response aspects presented in this thesis overcome this obstacle and provide a SOA-compliant and new methods to weave functionality into the communication layer of web services. The main contributions of this thesis are the following: Shifting towards a service-oriented architecture: The generic and extensible Legacy Code Description Language and the corresponding framework allow to wrap existing software, e.g., as web services, which afterwards can be composed into a workflow by SimpleBPEL without overburdening the domain expert with technical details that are indeed handled by a workflow expert. Runtime adaption: Based on the standardized Business Process Execution Language an automatic scheduling approach is presented that monitors all used resources and is able to automatically provision new machines in case a scale-out becomes necessary. If the resource's load drops, e.g., because of less workflow executions, a scale-in is also automatically performed. The scheduling algorithm takes the data transfer between the services into account in order to prevent scheduling allocations that eventually increase the workflow's makespan due to unnecessary or disadvantageous data transfers. Furthermore, a multi-objective scheduling algorithm that is based on a genetic algorithm is able to additionally consider cost, in a way that a user can define her own preferences rising from optimized execution times of a workflow and minimized costs. Possible communication errors are automatically detected and, according to certain constraints, corrected. Adaptation of communication: The presented request/response aspects allow to weave functionality into the communication of web services. By defining a pointcut language that only relies on the exchanged documents, the implementation of services must neither be known nor be available. The weaving process itself is modeled using web services. In this way, the concept of request/response aspects is naturally embedded into a service-oriented architecture

    Task scheduling for application integration: A strategy for large volumes of data

    Get PDF
    Enterprise Application Integration is the research field, which provides methodologies, techniques and tools for modelling and implementing integration processes. An integration process performs the orchestration of a set of applications to keep them synchronised or to allow the creation of new features. It can be represented by a workflow composed of tasks and communication channels. Integration platforms are tools for the design and execution of integration processes in which, the runtime system is the component responsible for execution time of the tasks and the allocation of computational resources that perform them. The processing of a large volume of data, corresponding to execution of millions of tasks, can cause situations of overload, characterised by the accumulation of tasks in internal queues awaiting computational resources in the runtime systems, resulting in unacceptable response time for the external applications and users. Our research hypothesis is that the runtime systems of the integration platforms use simplistic heuristics for scheduling tasks, which does not allow them to maintain acceptable levels of performance when there are overload situations. In this research work, we developed (i) a representation for integration processes, (ii) a characterisation for your task schedules, (iii) a heuristic to deal with situations of overload, (iv) a mathematical model for a performance metric of the execution of integration processes and (v) a simulation tool for task scheduling heuristics. Our research results indicate that, in situations of overload, our heuristic promotes a balanced workload distribution and an increase in the performance of the execution of the integration processes.Integração de Aplicações Empresariais é o campo de pesquisa, que fornece metodologias, técnicas e ferramentas para modelar e implementar processos de integração. Um processo de integração executa a orquestração de um conjunto de aplicações para mantê-las sincronizadas ou para permitir a criação de novas funcionalidades. Ele pode ser representado por um fluxo de trabalho composto por tarefas e canais de comunicação. Plataformas de integração são ferramentas para projetar e executar processos de integração, nas quais o motor de execução é o componente responsável pelo tempo de execução das tarefas e pela alocação de recursos computacionais que as executam. O processamento de um grande volume de dados, correspondendo a execução de milhões de tarefas, pode causar situações de sobrecarga, caracterizadas pelo acúmulo de tarefas em filas internas que aguardam recursos computacionais nos motores de execução, resultando em tempos de resposta inaceitáveis para aplicações e usuários externos. Nossa hipótese de pesquisa é que os motores de execução das plataformas de integração usam heurísticas simplistas para agendar tarefas, o que não lhes permitem manter níveis aceitáveis de desempenho em situações de sobrecarga. Neste trabalho de pesquisa, desenvolvemos (i) uma representação para processos de integração, (ii) uma caracterização para seus agendamentos de tarefas, (iii) uma heurística para lidar com situações de sobrecarga, (iv) um modelo matemático para uma métrica de desempenho da execução de processos de integração e (v) uma ferramenta de simulação para heurísticas de agendamento de tarefas. Nossos resultados de pesquisa indicam que, em situações de sobrecarga, nossa heurística promove uma distribuição equilibrada da carga de trabalho e um aumento no desempenho da execução dos processos de integração

    Knowledge-base and techniques for effective service-oriented programming & management of hybrid processes

    Full text link
    Recent advances in Web 2.0, SOA, crowd-sourcing, social and collaboration technologies, as well as cloud-computing, have truly transformed the Internet into a global development and deployment platform. As a result, developers have been presented with ubiquitous access to countless Web-services, resources and tools. However, while enabling tremendous automation and reuse opportunities, new productivity challenges have also emerged: The exploitation of services and resources nonetheless requires skilled programmers and a development-centric approach; it is thus inevitably susceptible to the same repetitive, error-prone and time consuming integration work each time a developer integrates a new API. Business Process Management on the other hand were proposed to support service-based integration. It provided the benefit of automation and modelling, which appealed to non-technical domain-experts. The problem however: it proves too rigid for unstructured processes. Thus, without this level of support, building new application either requires extensive manual programming or resorting to homebrew solutions. Alternatively, with the proliferation of SaaS, various such tools could be used for independent portions of the overall process - although this either presupposes conforming to the in-built process, or results in "shadow processes" via use of e-mail or the like, in order to exchange information and share decisions. There has therefore been an inevitable gap in technological support between structured and unstructured processes. To address these challenges, this thesis deals with transitioning process-support from structured to unstructured. We have been motivated to harness the foundational capabilities of BPM for its application to unstructured processes. We propose to achieve this by: First, addressing the productivity challenges of Web-services integration - simplifying this process - whilst encouraging an incremental curation and collective reuse approach. We then extend this to propose an innovative Hybrid-Process Management Platform that holistically combines structured, semi-structured and unstructured activities, based on a unified task-model that encapsulates a spectrum of process specificity. We have thus aimed to bridge the current lacking technology gap. The approach presented has been exposed as service-based libraries and tools. Whereby, we have devised several use-case scenarios and conducted user-studies in order to evaluate the overall effectiveness of our proposed work

    Replicated execution of workflows

    Get PDF
    Workflows are the de facto standard for managing and optimizing business processes. Workflows allow businesses to automate interactions between business locations and partners residing anywhere on the planet. This, however, requires the workflows to be executed in a distributed and dynamic environment, where device and communication failures occur quite frequently. In case that a workflow execution becomes unavailable through such failures, the business operations that rely on the workflow might be hindered or even stopped, implying the loss of money. Consequently, availability is a key concern when using workflows in dynamic environments. In this thesis, we propose replication schemes for workflow engines to ensure the availability of the workflows that are executed by these engines. Of course, a workflow that is executed by a replicated workflow engine has to yield the same result as a non-replicated execution of that workflow. To this end, we formally define the equivalence of a replicated and a non-replicated execution called Single-Execution-Equivalence. Subsequently, we present replication schemes for both imperative and declarative workflow languages. Imperative workflow languages, such as the Web Service Business Process Execution Language (WS-BPEL), specify the execution order of activities through an ordering relation and are the predominant way of specifying workflow models. We implement a proof-of-concept for demonstrating the compatibility of our replication schemes with current (imperative) workflow technology. Declarative workflow languages provide greater flexibility by allowing the reordering of the activities within a workflow at run-time. We exploit this by executing differently ordered replicas on several nodes in the network for improving availability further

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)“ project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction raises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for their members and distinguished guests to openly discuss novel perspectives and topics of interests for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications

    Architectures and Algorithms for Cloud-Based Multimedia Conferencing

    Get PDF
    Multimedia conferencing is the real-time exchange of multimedia content between multiple parties. It is the basis of several applications, such as distance learning, online meetings, and massively multiplayer online games. Cloud-based provisioning of multimedia conferencing has several benefits, like resource efficiency, elasticity, and scalability. However, it remains very challenging. A challenge, for instance, is the lack of holistic architectures which cover both the infrastructure and the platform layers of cloud-based multimedia conferencing applications. Another challenge is the lack of appropriate algorithms for resource allocation in the conferencing cloud to accommodate the fluctuating number of participants, while meeting the required quality of services (QoS). Yet another example is the lack of suitable algorithms for scaling the multimedia conferencing applications in the cloud while meeting both QoS requirements and cost efficiency objective. Unfortunately, the solutions proposed so far do not address these challenges. This thesis focuses on the architectural and algorithmic challenges of cloud-based multimedia conferencing. It proposes architectural components and interfaces for multimedia conferencing application provisioning, covering both the Platform-as-a-Service (PaaS) and the Infrastructure-as-a-Service (IaaS) layers. The proposed interfaces simplify multimedia conference service provisioning for a wide range of application providers. On the algorithmic side, it proposes resource allocation mechanisms that support scalability in terms of the number of participants while meeting the QoS. These mechanisms allocate the actual resources (e.g., CPU, RAM, and storage) in an optimal manner. Besides these mechanisms, it proposes the scalability approaches for cloud-based multimedia conferencing applications. To ensure cost efficiency, these proposed solutions enable fine-grained scalability of the applications with respect to the number of participants while considering the QoS requirements. All algorithmic problems in this thesis are formulated using the Integer Linear Programming (ILP) and heuristics have been designed and validated to solve them
    corecore