47 research outputs found

    A Service-Oriented Architecture enabling dynamic services grouping for optimizing distributed workflows execution

    In this paper, we describe a Service-Oriented Architecture that allows the execution of service workflows to be optimized. We discuss the advantages of the service-oriented approach with regard to the enactment of scientific applications on a grid infrastructure. Based on the development of a generic Web-Services wrapper, we show how the flexibility of our architecture enables dynamic service grouping to optimize the application execution time. We demonstrate performance results on a real medical imaging application. On a production grid infrastructure, the proposed optimization yields a significant speed-up (from 1.2 to 2.9) compared to a traditional execution.
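The abstract above does not state the grouping rule it uses. As a rough illustration of what service grouping can mean, the sketch below merges services that form a strictly sequential chain in a workflow graph, so that each chain could be submitted as a single job. The function name, the edge-list input format, the example service names, and the one-to-one chaining rule are all assumptions made for this sketch, not details taken from the paper.

```python
from collections import defaultdict


def group_sequential_services(edges):
    """Group services linked one-to-one in an acyclic workflow graph.

    Hypothetical rule (an assumption, not the paper's algorithm): a service
    joins its predecessor's group when it has exactly one predecessor and
    that predecessor has exactly one successor.
    """
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    nodes = []  # services in order of first appearance
    for src, dst in edges:
        successors[src].append(dst)
        predecessors[dst].append(src)
        for node in (src, dst):
            if node not in nodes:
                nodes.append(node)

    def chained_to_previous(node):
        preds = predecessors[node]
        return len(preds) == 1 and len(successors[preds[0]]) == 1

    groups = []
    for node in nodes:
        if chained_to_previous(node):
            continue  # already absorbed into an earlier group
        group = [node]
        # Extend the group while the chain stays strictly one-to-one.
        while len(successors[group[-1]]) == 1:
            nxt = successors[group[-1]][0]
            if len(predecessors[nxt]) != 1:
                break
            group.append(nxt)
        groups.append(group)
    return groups


# Hypothetical imaging pipeline: three sequential steps, then a fan-out.
example_edges = [
    ("crop", "register"),
    ("register", "segment"),
    ("segment", "measure_a"),
    ("segment", "measure_b"),
]
print(group_sequential_services(example_edges))
# → [['crop', 'register', 'segment'], ['measure_a'], ['measure_b']]
```

Under this assumed rule, the three sequential steps would run as one grouped job, while the two branches after the fan-out stay separate so they can still run in parallel.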

    Generic web service wrapper for efficient embedding of legacy codes in service-based workflows

    In this paper, we present a generic wrapper that enables the optimization of legacy codes assembled in application workflows on grid infrastructures. We first describe the advantages of a service-based approach for job management. We then introduce our wrapper, which works at execution time and thus allows service grouping strategies to optimize the execution. We demonstrate performance results on a real medical imaging application. Finally, we propose a new service-oriented architecture for the whole system, from application composition to job submission on the grid.

    GRIDKIT: Pluggable overlay networks for Grid computing

    A 'second generation' approach to the provision of Grid middleware is now emerging, built on service-oriented architecture and web services standards and technologies. However, advanced Grid applications have significant demands that are not addressed by present-day web services platforms. As one prime example, current platforms do not support the rich diversity of communication 'interaction types' that are demanded by advanced applications (e.g. publish-subscribe, media streaming, peer-to-peer interaction). In this paper we describe the Gridkit middleware, which augments the basic service-oriented architecture to address this particular deficiency. We particularly focus on the communications infrastructure required to support multiple interaction types in a unified, principled and extensible manner, which we present in terms of the novel concept of pluggable overlay networks.

    On the construction of decentralised service-oriented orchestration systems

    Modern science relies on workflow technology to capture, process, and analyse data obtained from scientific instruments. Scientific workflows are precise descriptions of experiments in which multiple computational tasks are coordinated based on the dataflows between them. Orchestrating scientific workflows presents a significant research challenge: they are typically executed in a manner such that all data pass through a centralised computer server known as the engine, which causes unnecessary network traffic that leads to a performance bottleneck. These workflows are commonly composed of services that perform computation over geographically distributed resources, and involve the management of dataflows between them. Centralised orchestration is clearly not a scalable approach for coordinating services dispersed across distant geographical locations. This thesis presents a scalable decentralised service-oriented orchestration system that relies on a high-level data coordination language for the specification and execution of workflows. This system’s architecture consists of distributed engines, each of which is responsible for executing part of the overall workflow. It exploits parallelism in the workflow by decomposing it into smaller sub-workflows, and determines the most appropriate engines to execute them using computation placement analysis. This permits the workflow logic to be distributed closer to the services providing the data for execution, which reduces the overall data transfer in the workflow and improves its execution time. This thesis provides an evaluation of the presented system which concludes that decentralised orchestration provides scalability benefits over centralised orchestration, and improves the overall performance of executing a service-oriented workflow
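The abstract above mentions choosing engines through computation placement analysis but gives no details of that analysis. One simple way to picture the idea is a greedy rule that assigns each sub-workflow to the engine with the lowest total transfer cost to the services it invokes. The function name, the cost table, and every engine and service name below are hypothetical, and the greedy rule is an assumption made for this sketch rather than the thesis's method.

```python
def place_sub_workflows(sub_workflows, engines, cost):
    """Assign each sub-workflow to the engine with the lowest total cost.

    sub_workflows: dict mapping a sub-workflow name to the services it calls.
    engines: list of candidate engine names.
    cost: dict mapping (engine, service) to an assumed transfer cost.
    This greedy rule is an illustrative assumption only.
    """
    placement = {}
    for name, services in sub_workflows.items():
        placement[name] = min(
            engines,
            key=lambda engine: sum(cost[(engine, s)] for s in services),
        )
    return placement


# Hypothetical setup: two engines, three services in two regions.
engines = ["engine_eu", "engine_us"]
sub_workflows = {
    "align": ["blast_eu", "store_eu"],
    "render": ["plot_us"],
}
cost = {
    ("engine_eu", "blast_eu"): 1,
    ("engine_eu", "store_eu"): 1,
    ("engine_eu", "plot_us"): 8,
    ("engine_us", "blast_eu"): 8,
    ("engine_us", "store_eu"): 8,
    ("engine_us", "plot_us"): 1,
}
print(place_sub_workflows(sub_workflows, engines, cost))
# → {'align': 'engine_eu', 'render': 'engine_us'}
```

In this toy setup each sub-workflow lands on the engine nearest to its services, which reflects the abstract's point that moving workflow logic closer to the data reduces overall data transfer.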

    Generating eScience Workflows from Statistical Analysis of Prior Data

    A number of workflow design tools have been developed specifically to enable easy graphical specification of workflows that ensure systematic scientific data capture and analysis and precise provenance information. We believe that an important component that is missing from these existing workflow specification and enactment systems is integration with tools that enable prior detailed analysis of the existing data - and in particular statistical analysis. By thoroughly analyzing the existing relevant datasets first, it is possible to determine precisely where the existing data is sparse or insufficient and what further experimentation is required. Introducing statistical analysis to experimental design will reduce duplication and costs associated with fruitless experimentation and maximize opportunities for scientific breakthroughs. In this paper we describe a workflow specification system that we have developed for a particular eScience application (fuel cell optimization). Experimental workflow instances are generated as a result of detailed statistical analysis and interactive exploration of the existing datasets. This is carried out through a graphical data exploration interface that integrates the widely-used open source statistical analysis software package, R, as a web service

    Service-Oriented Ad Hoc Grid Computing

    The subject of this thesis is the design and implementation of an ad hoc Grid infrastructure. The vision of an ad hoc Grid further evolves conventional service-oriented Grid systems into a more robust, more flexible and more usable environment that is still standards-compliant and interoperable with other Grid systems. A lot of work in current Grid middleware systems is focused on providing transparent access to high performance computing (HPC) resources (e.g. clusters) in virtual organizations spanning multiple institutions. The ad hoc Grid vision presented in this thesis goes beyond this view by combining classical Grid components with more flexible components and usage models, making it possible to form an environment that combines dedicated HPC resources with a large number of personal computers forming a "Desktop Grid". Three examples from medical research, media research and mechanical engineering are presented as application scenarios for a service-oriented ad hoc Grid infrastructure. These sample applications are also used to derive requirements for the runtime environment as well as development tools for such an ad hoc Grid environment. These requirements form the basis for the design and implementation of the Marburg ad hoc Grid Environment (MAGE) and the Grid Development Tools for Eclipse (GDT). MAGE is an implementation of a WSRF-compliant Grid middleware that satisfies the criteria for an ad hoc Grid middleware presented in the introduction to this thesis. GDT extends the popular Eclipse integrated development environment with components that support application development both for traditional service-oriented Grid middleware systems and for ad hoc Grid infrastructures such as MAGE. These development tools represent the first fully model-driven approach to Grid service development integrated with infrastructure management components in service-oriented Grid computing.
This thesis concludes with a quantitative discussion of the performance overhead imposed by the presented extensions to a service-oriented Grid middleware as well as a discussion of the qualitative improvements gained by the overall solution. The conclusion of this thesis also gives an outlook on future developments and areas for further research. One of these qualitative improvements is "hot deployment", the ability to install and remove Grid services in a running node without interrupting other active services on the same node. Hot deployment has been introduced as a novelty in service-oriented Grid systems as a result of the research conducted for this thesis. It extends service-oriented Grid computing with a new paradigm, making installation of individual application components a functional aspect of the application. This thesis further explores the idea of using peer-to-peer (P2P) networking for Grid computing by combining a general-purpose P2P framework with a standards-compliant Grid middleware. In previous work the application of P2P systems has been limited to replica location and the use of P2P index structures for discovery purposes. The work presented in this thesis also uses P2P networking to realize seamless communication across network barriers. Even though the web service standards have been designed for the internet, the two-way communication requirement introduced by the WSRF standards, and particularly the notification pattern, is not well supported by the web service standards. This deficiency can be addressed by mechanisms that are part of such general-purpose P2P communication frameworks. Existing security infrastructures for Grid systems focus on protection of data during transmission and access control to individual resources or the overall Grid environment. This thesis focuses on security issues within a single node of a dynamically changing service-oriented Grid environment.
To counter the security threats arising from the new capabilities of an ad hoc Grid, a number of novel isolation solutions are presented. These solutions address security issues and isolation on a fine-grained level, providing a range of applicable basic mechanisms for isolation, from lightweight system call interposition to complete para-virtualization of the operating systems.

    Developing Use Cases and State Transition Models for Effective Protection of Electronic Health Records (EHRs) in Cloud

    This paper proposes a new object-oriented design of use cases and state transition models to effectively guard Electronic Health Records (EHRs). Privacy is an important factor that needs to be considered when publishing microdata. Government agencies and other organizations commonly publish microdata. When microdata is released, sensitive information about individuals can be disclosed. This constitutes a major problem for government and organizational sectors that release microdata. In order to secure sensitive information or to prevent its disclosure, we implement certain algorithms and methods. There are normally two types of information disclosure: identity disclosure and attribute disclosure. Identity disclosure occurs when an individual is linked to a particular record in the released data. Attribute disclosure occurs when new information about some individuals is revealed. This paper aims to discuss the existing techniques present in the literature for preserving privacy, as well as the incremental development, use cases and state transition models of the proposed system.

    Interacting with scientific workflows
