35 research outputs found

    Grid Workflow Modelling Using Grid-Specific BPEL Extensions

    This paper discusses problems of Grid service composition using BPEL4WS. In particular, difficulties concerning the invocation of WSRF-based services are elucidated. A solution to this problem is presented by extending the BPEL specification, and an implementation based on the ActiveBPEL workflow enactment engine is described.

    A Framework for Model-Driven Scientific Workflow Engineering

    So-called scientific workflows are one important means in the context of data-intensive science for reliable and efficient scientific data processing in distributed computing infrastructures such as Grids. Scientific Workflow Management Systems (SWfMS) help scientists model and run scientific workflows, where a domain-specific layer for workflow modeling by a scientist and a technical layer for automated workflow execution can be distinguished. Initially, many SWfMS were developed from scratch using custom workflow technologies and languages, without applying existing and established business workflow technologies. Among the reasons were different life cycles for scientific and business workflows as well as incompatible interfaces and communication protocols of the respective execution infrastructures. Meanwhile, several business IT infrastructures have evolved to service-oriented architectures (SOAs), for which many Web service standards and technologies have been developed. The Web Services Business Process Execution Language (BPEL), for example, is a well-accepted standard for the implementation and execution of business workflows in SOAs. The SOA architecture pattern has been adopted in scientific IT infrastructures by so-called Service Grids based on existing standards and technologies. Due to this development, BPEL is also suitable for the execution of scientific workflows at the technical layer, which has been elaborated on in many publications and projects. However, BPEL is a workflow language for IT experts and was not originally suited for scientific workflow modeling by a scientist at the domain-specific layer. A domain-specific abstraction of BPEL is therefore required that can be specifically tailored for scientific workflow modeling, as well as a corresponding mapping to the technical layer.
These challenges of the domain-specific abstraction and the mapping are addressed in this thesis with the help of the Business Process Model and Notation (BPMN) standard and technologies from Model-Driven Software Development (MDSD). To this end, the MoDFlow approach for Model-Driven Scientific WorkFlow Engineering is presented to map domain-specific scientific workflow models via a BPMN-based intermediate layer to an executable workflow model. The intermediate layer is specified by MoDFlow.BPMN, which is a BPMN metamodel subset with custom extensions for the scientific domain. MoDFlow.BPMN2BPEL defines three consecutive transformation steps to map MoDFlow.BPMN to BPEL for workflow execution. Furthermore, different methods to utilize and extend MoDFlow.BPMN and MoDFlow.BPMN2BPEL are described in the MoDFlow approach, with a focus on the definition of so-called domain-specific languages (DSLs) for the modeling of scientific workflows at the domain-specific layer. The MoDFlow framework is an implementation of the MoDFlow approach, which is based on the Eclipse Modeling Framework (EMF). The MoDFlow framework is evaluated in three application scenarios, in which different utilization and extension mechanisms are examined. The first two application scenarios investigate the technical feasibility of the approach and support scientific workflows with parameter sweeps that are executed on a Grid infrastructure. The third application scenario has been conducted in collaboration with the PubFlow project, which aims to create an infrastructure to model and execute data publication workflows. Based on the Xtext framework, a textual DSL and a corresponding language infrastructure are defined for this purpose that support developers in creating data publication workflows. This scenario aims to illustrate the practicability of the MoDFlow framework.
PubFlow currently plans to implement an additional graphical DSL based on the BPMN notation and a corresponding workflow editor for scientists.

    Generic access to symbolic computing services

    Symbolic computation is one of the computational domains that require large computational resources. Computer Algebra Systems (CAS), the main tools used for symbolic computations, are mainly designed to be used as software tools installed on standalone machines that do not provide the required resources for solving large symbolic computation problems. In order to support symbolic computations, an infrastructure built upon massively distributed computational environments must be developed. Such an infrastructure requires a thorough analysis of the most important requirements raised by the symbolic computation world and must be built on the most suitable architectural styles and technologies. The architecture that we propose is composed of several main components: the Computer Algebra System (CAS) Server that exposes the functionality implemented by one or more supporting CASs through generic interfaces of Grid Services; the Architecture for Grid Symbolic Services Orchestration (AGSSO) Server that allows seamless composition of CAS Server capabilities; and client-side libraries to assist the users in describing workflows for symbolic computations directly within the CAS environment. We have also designed and developed a framework for automatic data management of mathematical content that relies on OpenMath encoding. To support the validation and fine-tuning of the system, we have developed a simulation platform that mimics the environment on which the architecture is deployed.

    Runtime Adaptation of Scientific Service Workflows

    Software landscapes are subject to continual change rather than being complete once they have been built. Changes may be caused by modified customer behavior, the shift to new hardware resources, or otherwise changed requirements. In such situations, several challenges arise. New architectural models have to be designed and implemented, existing software has to be integrated, and, finally, the new software has to be deployed, monitored, and, where appropriate, optimized during runtime under realistic usage scenarios. All of these situations often demand manual intervention, which makes them error-prone. This thesis addresses these types of runtime adaptation. Based on service-oriented architectures, an environment is developed that enables the integration of existing software (i.e., the wrapping of legacy software as web services). A workflow modeling tool is presented that aims at ease of use by separating the role of the workflow expert from the role of the domain expert. After the development of workflows, tools that observe the executing infrastructure and perform automatic scale-in and scale-out operations are presented. Infrastructure-as-a-Service providers are used to scale the infrastructure in a transparent and cost-efficient way. The necessary middleware tools are deployed automatically. The use of a distributed infrastructure can lead to communication problems. In order to keep workflows robust, these exceptional cases need to be treated. In this way, however, the process logic of a workflow gets mixed up and bloated with infrastructural details, which increases its complexity. In this work, a module is presented that can deal automatically with infrastructural faults and thereby preserves the separation of these two layers. When services or their components are hosted in a distributed environment, some requirements need to be addressed at each service separately.
Techniques such as object-oriented programming or the use of design patterns like the interceptor pattern ease the adaptation of service behavior or structures. Still, these methods require modifying the configuration or the implementation of each individual service. On the other hand, aspect-oriented programming allows functionality to be woven into existing code even without access to its source. Since the functionality needs to be woven into the code, it depends on the specific implementation. In a service-oriented architecture, where the implementation of a service is unknown, this approach clearly has its limitations. The request/response aspects presented in this thesis overcome this obstacle and provide a new, SOA-compliant method to weave functionality into the communication layer of web services. The main contributions of this thesis are the following: Shifting towards a service-oriented architecture: The generic and extensible Legacy Code Description Language and the corresponding framework allow existing software to be wrapped, e.g., as web services, which afterwards can be composed into a workflow by SimpleBPEL without overburdening the domain expert with technical details that are instead handled by a workflow expert. Runtime adaptation: Based on the standardized Business Process Execution Language, an automatic scheduling approach is presented that monitors all used resources and is able to automatically provision new machines in case a scale-out becomes necessary. If the load on the resources drops, e.g., because of fewer workflow executions, a scale-in is also performed automatically. The scheduling algorithm takes the data transfer between the services into account in order to prevent scheduling allocations that eventually increase the workflow's makespan due to unnecessary or disadvantageous data transfers.
Furthermore, a multi-objective scheduling algorithm that is based on a genetic algorithm is able to additionally consider cost, so that a user can define her own preferences, ranging from optimized execution times of a workflow to minimized costs. Possible communication errors are automatically detected and, according to certain constraints, corrected. Adaptation of communication: The presented request/response aspects allow functionality to be woven into the communication of web services. Because the pointcut language relies only on the exchanged documents, the implementation of the services need be neither known nor available. The weaving process itself is modeled using web services. In this way, the concept of request/response aspects is naturally embedded into a service-oriented architecture.

    Geospatial Web Services, Open Standards, and Advances in Interoperability: A Selected, Annotated Bibliography

    This paper is designed to help GIS librarians and information specialists follow developments in the emerging field of geospatial Web services (GWS). When built using open standards, GWS permit users to dynamically access, exchange, deliver, and process geospatial data and products on the World Wide Web, no matter what platform or protocol is used. Standards and specifications pertaining to geospatial ontologies, geospatial Web services, and interoperability are discussed in this bibliography. Finally, a selected, annotated list of bibliographic references by experts in the field is presented.

    Supporting Quality of Service in Scientific Workflows

    While workflow management systems have been utilized in enterprises to support businesses for almost two decades, the use of workflows in scientific environments was fairly uncommon until recently. Nowadays, scientists use workflow systems to conduct scientific experiments, simulations, and distributed computations. However, most scientific workflow management systems have not been built using existing workflow technology; rather, they have been designed and developed from scratch. Due to the lack of generality of early scientific workflow systems, many domain-specific workflow systems have been developed. Generally speaking, those domain-specific approaches lack common acceptance and tool support and offer lower robustness compared to business workflow systems. In this thesis, the use of the industry standard BPEL, a workflow language for modeling business processes, is proposed for the modeling and the execution of scientific workflows. Due to the widespread use of BPEL in enterprises, a number of stable and mature software products exist. The language is expressive (Turing-complete) and not restricted to specific applications. BPEL is well suited for the modeling of scientific workflows, but existing implementations of the standard lack important features that are necessary for the execution of scientific workflows. This work presents components that extend an existing implementation of the BPEL standard and eliminate the identified weaknesses. The components thus provide the technical basis for the use of BPEL in academia. The particular focus is on so-called non-functional (Quality of Service) requirements. These requirements include scalability, reliability (fault tolerance), data security, and cost (of executing a workflow).
From a technical perspective, the workflow system must be able to interface with the middleware systems that are commonly used by the scientific workflow community to allow access to heterogeneous, distributed resources (especially Grid and Cloud resources). The major components cover exactly these requirements. Cloud Resource Provisioner: Scalability of the workflow system is achieved by automatically adding (Cloud) resources to the workflow system's resource pool when the workflow system is heavily loaded. Fault Tolerance Module: High reliability is achieved via continuous monitoring of workflow execution and corrective interventions, such as re-execution of a failed workflow step or replacement of the faulty resource. Cost Aware Data Flow Aware Scheduler: The majority of scientific workflow systems only take the performance and utilization of resources for the execution of workflow steps into account when making scheduling decisions. The presented workflow system goes beyond that. By defining preference values for the weighting of costs and the anticipated workflow execution time, workflow users may influence the resource selection process. The developed multi-objective scheduling algorithm respects the defined weighting and makes both efficient and advantageous decisions using a heuristic approach. Security Extensions: Because it supports various encryption, signature, and authentication mechanisms (e.g., Grid Security Infrastructure), the workflow system guarantees data security in the transfer of workflow data. Furthermore, this work identifies the need to equip workflow developers with workflow modeling tools that can be used intuitively. This dissertation presents two modeling tools that support users with different needs. The first tool, DAVO (domain-adaptable, Visual BPEL Orchestrator), operates at a low level of abstraction and allows users with knowledge of BPEL to use the full extent of the language.
DAVO is a software tool that offers extensibility and customizability for different application domains. These features are used in the implementation of the second tool, SimpleBPEL Composer. SimpleBPEL is aimed at users with little or no background in computer science and allows for quick and intuitive development of BPEL workflows based on predefined components.

    Inside the NIGM Grid Service: Implementation, Evaluation and Extension

    Chinese and Western medicine have different understandings of and approaches to life, health, and illness; joining their complementary work and supporting it with advanced information technology could result in an improved health system. The Non-Invasive Blood Glucose Measurement (NIGM) Service is a grid-based implementation of a novel non-invasive method for measuring human blood glucose values exploiting Chinese meridian theory. In this paper, we describe the implementation of the NIGM service in detail, present an initial performance evaluation, and discuss an extension towards other non-invasive, long-term, diabetes-relevant measurements. Additionally, the adaptation of the ontology-based Medical records Annotation Tool (MedAT) framework for use in NIGM trials is elaborated. © 2008 IEEE.

    Enhancing Client Honeypots with Grid Services and Workflows

    Client honeypots are devices for detecting malicious servers on a network. They interact with potentially malicious servers and analyse the Web pages returned to assess whether these pages contain an attack. This type of attack is termed a 'drive-by-download'. Low-interaction client honeypots use a signature-based approach to detecting known malicious code. High-interaction client honeypots run client applications in full operating systems that are usually hosted by a virtual machine. The operating systems are either internally or externally monitored for anomalous behaviour. In recent years a growing number of client honeypot systems have been developed, but there is little interoperability between systems because each has its own custom operational scripts and data formats. By creating interoperability through standard interfaces we could more easily share usage of client honeypots and the data collected. Another problem is providing a simple means of managing an installation of client honeypots. Workflows are a popular technology for allowing end-users to co-ordinate e-science experiments, so these workflow systems can potentially be utilised for client honeypot management. To formulate requirements for management we ran moderate-scale scans of the .nz domain over several months using a manual script-based approach. The main requirements were a system that is user-oriented, loosely-coupled, and integrated with Grid computing, allowing for resource sharing across organisations. Our system design uses Grid services (extensions to Web services) to wrap client honeypots, a manager component acts as a broker for user access, and workflows orchestrate the Grid services. Our prototype wraps our case study, Capture-HPC, with these services, using the Taverna workflow system and a Web portal for user access.
When evaluating our experiences we found that while our system design met our requirements, currently a Java-based application operating on our Web services provides some advantages over our Taverna approach, particularly for modifying workflows, maintainability, and dealing with failure. The Taverna workflows, however, are better suited for the data analysis phase and have some usability advantages. Workflow languages such as Taverna are still relatively immature, so improvements are likely to be made. Both of these approaches are significantly easier to manage and deploy than the previous manual script-based method.

    An SOA-based model for the integrated provisioning of cloud and grid resources

    In recent years, the availability and usage models of networked computing resources within reach of e-Science have been changing rapidly, with many disparate paradigms coexisting: high-performance computing, grid, and, more recently, cloud. Unfortunately, none of these paradigms is recognized as the ultimate solution, and a convergence of them all should be pursued. At the same time, recent works have proposed a number of models and tools to address the growing needs and expectations in the field of e-Science. In particular, they have shown the advantages and the feasibility of modeling e-Science environments and infrastructures according to the service-oriented architecture. In this paper, we suggest a model to promote the convergence and the integration of the different computing paradigms and infrastructures for the dynamic on-demand provisioning of resources from multiple providers as a cohesive aggregate, leveraging the service-oriented architecture. In addition, we propose a design aimed at supporting a flexible, modular, workflow-based computing model for e-Science. The model is supplemented by a working prototype implementation together with a case study in the application domain of bioinformatics, which is used to validate the presented approach and to carry out some performance and scalability measurements.