162 research outputs found

    WS-Pro: a Petri net based performance-driven service composition framework

    Get PDF
    As an emerging area gaining prevalence in the industry, Web Services was established to satisfy the needs for better flexibility and higher reliability in web applications. However, due to the lack of reliable frameworks and difficulties in constructing versatile service composition platform, web developers encountered major obstacles in large-scale deployment of web services. Meanwhile, performance has been one of the major concerns and a largely unexplored area in Web Services research. There is high demand for researchers to conceive and develop feasible solutions to design, monitor, and deploy web service systems that can adapt to failures, especially performance failures. Though many techniques have been proposed to solve this problem, none of them offers a comprehensive solution to overcome the difficulties that challenge practitioners. Central to the performance-engineering studies, performance analysis and performance adaptation are of paramount importance to the success of a software project. The industry learned through many hard lessons the significance of well-founded and well-executed performance engineering plans. An important fact is that it is too expensive to tackle performance evaluation, mostly through performance testing, after the software is developed. This is especially true in recent decades when software complexity has risen sharply. After the system is deployed, performance adaptation is essential to maintaining and improving software system reliability. Performance adaptation provides techniques to mitigate the consequence of performance failures and therefore is an important research issue. Performance adaptation is particularly meaningful for mission-critical software systems and software systems with inevitable frequent performance failures, such as Web Services. This dissertation focuses on Web Services framework and proposes a performance-driven service composition scheme, called WS-Pro, to support both performance analysis and performance adaptation. A formalism of transformation from WS-BPEL to Petri net is first defined to enable the analysis of system properties and facilitate quality prediction. A state-transition based proof is presented to show that the transformed Petri net model correctly simulates the behavior of the WS-BPEL process. The generated Petri net model was augmented using performance data supplied by both historical data and runtime data. Results of executing the Petri nets suggest that optimal composition plans can be achieved based on the proposed method. The performance of service composition procedure is an important research issue which has not been sufficiently treated by researchers. However, such an issue is critical for dynamic service composition, where re-planning must be done in a timely manner. In order to improve the performance of service composition procedure and enhance performance adaptation, this dissertation presents an algorithm to remove loops in the reachability graphs so that a large portion of the computation time of service composition can be moved to a pre-processing unit; hence the response time is shortened during runtime. We also extended the WS-Pro to the ubiquitous computing area to improve fault-tolerance

    Applying web services technology to implement distributed simulation for supply chain modeling and analysis

    Full text link

    Programming and parallelising applications for distributed infrastructures

    Get PDF
    The last decade has witnessed unprecedented changes in parallel and distributed infrastructures. Due to the diminished gains in processor performance from increasing clock frequency, manufacturers have moved from uniprocessor architectures to multicores; as a result, clusters of computers have incorporated such new CPU designs. Furthermore, the ever-growing need of scienti c applications for computing and storage capabilities has motivated the appearance of grids: geographically-distributed, multi-domain infrastructures based on sharing of resources to accomplish large and complex tasks. More recently, clouds have emerged by combining virtualisation technologies, service-orientation and business models to deliver IT resources on demand over the Internet. The size and complexity of these new infrastructures poses a challenge for programmers to exploit them. On the one hand, some of the di culties are inherent to concurrent and distributed programming themselves, e.g. dealing with thread creation and synchronisation, messaging, data partitioning and transfer, etc. On the other hand, other issues are related to the singularities of each scenario, like the heterogeneity of Grid middleware and resources or the risk of vendor lock-in when writing an application for a particular Cloud provider. In the face of such a challenge, programming productivity - understood as a tradeo between programmability and performance - has become crucial for software developers. There is a strong need for high-productivity programming models and languages, which should provide simple means for writing parallel and distributed applications that can run on current infrastructures without sacri cing performance. In that sense, this thesis contributes with Java StarSs, a programming model and runtime system for developing and parallelising Java applications on distributed infrastructures. The model has two key features: first, the user programs in a fully-sequential standard-Java fashion - no parallel construct, API call or pragma must be included in the application code; second, it is completely infrastructure-unaware, i.e. programs do not contain any details about deployment or resource management, so that the same application can run in di erent infrastructures with no changes. The only requirement for the user is to select the application tasks, which are the model's unit of parallelism. Tasks can be either regular Java methods or web service operations, and they can handle any data type supported by the Java language, namely les, objects, arrays and primitives. For the sake of simplicity of the model, Java StarSs shifts the burden of parallelisation from the programmer to the runtime system. The runtime is responsible from modifying the original application to make it create asynchronous tasks and synchronise data accesses from the main program. Moreover, the implicit inter-task concurrency is automatically found as the application executes, thanks to a data dependency detection mechanism that integrates all the Java data types. This thesis provides a fairly comprehensive evaluation of Java StarSs on three di erent distributed scenarios: Grid, Cluster and Cloud. For each of them, a runtime system was designed and implemented to exploit their particular characteristics as well as to address their issues, while keeping the infrastructure unawareness of the programming model. The evaluation compares Java StarSs against state-of-the-art solutions, both in terms of programmability and performance, and demonstrates how the model can bring remarkable productivity to programmers of parallel distributed applications

    Runtime Adaptation of Scientific Service Workflows

    Get PDF
    Software landscapes are rather subject to change than being complete after having been built. Changes may be caused by a modified customer behavior, the shift to new hardware resources, or otherwise changed requirements. In such situations, several challenges arise. New architectural models have to be designed and implemented, existing software has to be integrated, and, finally, the new software has to be deployed, monitored, and, where appropriate, optimized during runtime under realistic usage scenarios. All of these situations often demand manual intervention, which causes them to be error-prone. This thesis addresses these types of runtime adaptation. Based on service-oriented architectures, an environment is developed that enables the integration of existing software (i.e., the wrapping of legacy software as web services). A workflow modeling tool that aims at an easy-to-use approach by separating the role of the workflow expert and the role of the domain expert. After the development of workflows, tools that observe the executing infrastructure and perform automatic scale-in and scale-out operations are presented. Infrastructure-as-a-Service providers are used to scale the infrastructure in a transparent and cost-efficient way. The deployment of necessary middleware tools is automatically done. The use of a distributed infrastructure can lead to communication problems. In order to keep workflows robust, these exceptional cases need to treated. But, in this way, the process logic of a workflow gets mixed up and bloated with infrastructural details, which yields an increase in its complexity. In this work, a module is presented that can deal automatically with infrastructural faults and that thereby allows to keep the separation of these two layers. When services or their components are hosted in a distributed environment, some requirements need to be addressed at each service separately. Although techniques as object-oriented programming or the usage of design patterns like the interceptor pattern ease the adaptation of service behavior or structures. Still, these methods require to modify the configuration or the implementation of each individual service. On the other side, aspect-oriented programming allows to weave functionality into existing code even without having its source. Since the functionality needs to be woven into the code, it depends on the specific implementation. In a service-oriented architecture, where the implementation of a service is unknown, this approach clearly has its limitations. The request/response aspects presented in this thesis overcome this obstacle and provide a SOA-compliant and new methods to weave functionality into the communication layer of web services. The main contributions of this thesis are the following: Shifting towards a service-oriented architecture: The generic and extensible Legacy Code Description Language and the corresponding framework allow to wrap existing software, e.g., as web services, which afterwards can be composed into a workflow by SimpleBPEL without overburdening the domain expert with technical details that are indeed handled by a workflow expert. Runtime adaption: Based on the standardized Business Process Execution Language an automatic scheduling approach is presented that monitors all used resources and is able to automatically provision new machines in case a scale-out becomes necessary. If the resource's load drops, e.g., because of less workflow executions, a scale-in is also automatically performed. The scheduling algorithm takes the data transfer between the services into account in order to prevent scheduling allocations that eventually increase the workflow's makespan due to unnecessary or disadvantageous data transfers. Furthermore, a multi-objective scheduling algorithm that is based on a genetic algorithm is able to additionally consider cost, in a way that a user can define her own preferences rising from optimized execution times of a workflow and minimized costs. Possible communication errors are automatically detected and, according to certain constraints, corrected. Adaptation of communication: The presented request/response aspects allow to weave functionality into the communication of web services. By defining a pointcut language that only relies on the exchanged documents, the implementation of services must neither be known nor be available. The weaving process itself is modeled using web services. In this way, the concept of request/response aspects is naturally embedded into a service-oriented architecture

    Enhancing Client Honeypots with Grid Services and Workflows

    No full text
    Client honeypots are devices for detecting malicious servers on a network. They interact with potentially malicious servers and analyse the Web pages returned to assess whether these pages contain an attack. This type of attack is termed a 'drive-by-download'. Low-interaction client honeypots operate a signature-based approach to detecting known malicious code. High- interaction client honeypots run client applications in full operating systems that are usually hosted by a virtual machine. The operating systems are either internally or externally monitored for anomalous behaviour. In recent years there have been a growing number of client honeypot systems being developed, but there is little interoperability between systems because each has its own custom operational scripts and data formats. By creating interoperability through standard interfaces we could more easily share usage of client honeypots and the data collected. Another problem is providing a simple means of managing an installation of client honeypots. Work ows are a popular technology for allowing end-users to co-ordinate e-science experiments, so these work ow systems can potentially be utilised for client honeypot management. To formulate requirements for management we ran moderate-scale scans of the .nz domain over several months using a manual script-based approach. The main requirements were a system that is user-oriented, loosely-coupled, and integrated with Grid computing|allowing for resource sharing across organisations. Our system design uses Grid services (extensions to Web services) to wrap client honeypots, a manager component acts as a broker for user access, and workflows orchestrate the Grid services. Our prototype wraps our case study - Capture-HPC -with these services, using the Taverna workflow system, and a Web portal for user access. When evaluating our experiences we found that while our system design met our requirements, currently a Java-based application operating on our Web services provides some advantages over our Taverna approach - particularly for modifying workflows, maintainability, and dealing with failure. The Taverna workflows, however, are better suited for the data analysis phase and have some usability advantages. Workflow languages such as Taverna are still relatively immature, so improvements are likely to be made. Both of these approaches are significantly easier to manage and deploy than the previous manual script-based method

    Enhancing Client Honeypots with Grid Services and Workflows

    Get PDF
    Client honeypots are devices for detecting malicious servers on a network. They interact with potentially malicious servers and analyse the Web pages returned to assess whether these pages contain an attack. This type of attack is termed a 'drive-by-download'. Low-interaction client honeypots operate a signature-based approach to detecting known malicious code. High- interaction client honeypots run client applications in full operating systems that are usually hosted by a virtual machine. The operating systems are either internally or externally monitored for anomalous behaviour. In recent years there have been a growing number of client honeypot systems being developed, but there is little interoperability between systems because each has its own custom operational scripts and data formats. By creating interoperability through standard interfaces we could more easily share usage of client honeypots and the data collected. Another problem is providing a simple means of managing an installation of client honeypots. Work ows are a popular technology for allowing end-users to co-ordinate e-science experiments, so these work ow systems can potentially be utilised for client honeypot management. To formulate requirements for management we ran moderate-scale scans of the .nz domain over several months using a manual script-based approach. The main requirements were a system that is user-oriented, loosely-coupled, and integrated with Grid computing|allowing for resource sharing across organisations. Our system design uses Grid services (extensions to Web services) to wrap client honeypots, a manager component acts as a broker for user access, and workflows orchestrate the Grid services. Our prototype wraps our case study - Capture-HPC -with these services, using the Taverna workflow system, and a Web portal for user access. When evaluating our experiences we found that while our system design met our requirements, currently a Java-based application operating on our Web services provides some advantages over our Taverna approach - particularly for modifying workflows, maintainability, and dealing with failure. The Taverna workflows, however, are better suited for the data analysis phase and have some usability advantages. Workflow languages such as Taverna are still relatively immature, so improvements are likely to be made. Both of these approaches are significantly easier to manage and deploy than the previous manual script-based method

    Scientific Workflows for Metabolic Flux Analysis

    Get PDF
    Metabolic engineering is a highly interdisciplinary research domain that interfaces biology, mathematics, computer science, and engineering. Metabolic flux analysis with carbon tracer experiments (13 C-MFA) is a particularly challenging metabolic engineering application that consists of several tightly interwoven building blocks such as modeling, simulation, and experimental design. While several general-purpose workflow solutions have emerged in recent years to support the realization of complex scientific applications, the transferability of these approaches are only partially applicable to 13C-MFA workflows. While problems in other research fields (e.g., bioinformatics) are primarily centered around scientific data processing, 13C-MFA workflows have more in common with business workflows. For instance, many bioinformatics workflows are designed to identify, compare, and annotate genomic sequences by "pipelining" them through standard tools like BLAST. Typically, the next workflow task in the pipeline can be automatically determined by the outcome of the previous step. Five computational challenges have been identified in the endeavor of conducting 13 C-MFA studies: organization of heterogeneous data, standardization of processes and the unification of tools and data, interactive workflow steering, distributed computing, and service orientation. The outcome of this thesis is a scientific workflow framework (SWF) that is custom-tailored for the specific requirements of 13 C-MFA applications. The proposed approach – namely, designing the SWF as a collection of loosely-coupled modules that are glued together with web services – alleviates the realization of 13C-MFA workflows by offering several features. By design, existing tools are integrated into the SWF using web service interfaces and foreign programming language bindings (e.g., Java or Python). Although the attributes "easy-to-use" and "general-purpose" are rarely associated with distributed computing software, the presented use cases show that the proposed Hadoop MapReduce framework eases the deployment of computationally demanding simulations on cloud and cluster computing resources. An important building block for allowing interactive researcher-driven workflows is the ability to track all data that is needed to understand and reproduce a workflow. The standardization of 13 C-MFA studies using a folder structure template and the corresponding services and web interfaces improves the exchange of information for a group of researchers. Finally, several auxiliary tools are developed in the course of this work to complement the SWF modules, i.e., ranging from simple helper scripts to visualization or data conversion programs. This solution distinguishes itself from other scientific workflow approaches by offering a system of loosely-coupled components that are flexibly arranged to match the typical requirements in the metabolic engineering domain. Being a modern and service-oriented software framework, new applications are easily composed by reusing existing components

    On the construction of decentralised service-oriented orchestration systems

    Get PDF
    Modern science relies on workflow technology to capture, process, and analyse data obtained from scientific instruments. Scientific workflows are precise descriptions of experiments in which multiple computational tasks are coordinated based on the dataflows between them. Orchestrating scientific workflows presents a significant research challenge: they are typically executed in a manner such that all data pass through a centralised computer server known as the engine, which causes unnecessary network traffic that leads to a performance bottleneck. These workflows are commonly composed of services that perform computation over geographically distributed resources, and involve the management of dataflows between them. Centralised orchestration is clearly not a scalable approach for coordinating services dispersed across distant geographical locations. This thesis presents a scalable decentralised service-oriented orchestration system that relies on a high-level data coordination language for the specification and execution of workflows. This system’s architecture consists of distributed engines, each of which is responsible for executing part of the overall workflow. It exploits parallelism in the workflow by decomposing it into smaller sub-workflows, and determines the most appropriate engines to execute them using computation placement analysis. This permits the workflow logic to be distributed closer to the services providing the data for execution, which reduces the overall data transfer in the workflow and improves its execution time. This thesis provides an evaluation of the presented system which concludes that decentralised orchestration provides scalability benefits over centralised orchestration, and improves the overall performance of executing a service-oriented workflow
    corecore