
    Tools and models for high level parallel and Grid programming

    When algorithmic skeletons were first introduced by Cole in the late 1980s (50), the idea met with almost immediate success. The skeletal approach has proved effective when application algorithms can be expressed as compositions of skeletons. However, despite both their effectiveness and the progress made in the design and implementation of skeletal systems, algorithmic skeletons remain absent from mainstream practice. Cole and other researchers, in (51) and (19) respectively, analysed the problem: they identified the issues affecting skeletal systems and stated a set of principles that must be tackled to make such systems more effective and to take skeletal programming into the parallel mainstream. In this thesis we propose tools and models that address some of the issues affecting skeletal programming environments. We describe three novel approaches aimed at enhancing skeleton-based systems from different angles. First, we present a model that allows algorithmic skeletons to be customized by exploiting the macro data-flow abstraction. We then present two results on the exploitation of metaprogramming techniques for the run-time generation and optimization of macro data-flow graphs; in particular, we show how to generate and optimize macro data-flow graphs according both to programmer-provided non-functional requirements and to the features of the execution platform. The last result we present is Behavioural Skeletons, an approach aimed at addressing the limitations of skeletal programming environments when they are used to develop component-based Grid applications. We validated all of these approaches through several tests, performed with a set of tools we developed.
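    As a rough illustration of the skeleton concept discussed above (and not of the thesis's actual framework), the sketch below composes a two-stage pipeline of task farms in Python; the worker functions and the `farm`/`pipe` names are hypothetical. In a macro data-flow setting, such a composition would typically be compiled into a graph of independent instructions that fire when their input tokens become available.

```python
# Minimal illustration of skeleton composition (not the framework from the thesis):
# a two-stage pipeline in which each stage is a "farm" that applies a worker
# function to independent inputs in parallel.
from concurrent.futures import ThreadPoolExecutor

def farm(worker, inputs, workers=4):
    """Apply `worker` to every input item in parallel (task-farm skeleton)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worker, inputs))

def pipe(stages, inputs):
    """Feed the output of each skeleton stage into the next (pipeline skeleton)."""
    data = inputs
    for stage in stages:
        data = stage(data)
    return data

# Compose pipe(farm(f), farm(g)) over a stream of items (f and g are hypothetical).
f = lambda x: x * x          # first-stage worker
g = lambda x: x + 1          # second-stage worker
result = pipe([lambda d: farm(f, d), lambda d: farm(g, d)], range(8))
print(result)                # [1, 2, 5, 10, 17, 26, 37, 50]
```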

    Higher-Order Process Modeling: Product-Lining, Variability Modeling and Beyond

    We present a graphical and dynamic framework for the binding and execution of (business) process models. It is tailored to integrate 1) ad hoc processes modeled graphically, 2) third-party services discovered on the (Inter)net, and 3) (dynamically) synthesized process chains that solve situation-specific tasks, with the synthesis taking place not only at design time but also at runtime. Key to our approach is the introduction of type-safe stacked second-order execution contexts that allow for higher-order process modeling. Tamed by our underlying strict service-oriented notion of abstraction, this approach is also tailored to be used by application experts with little technical knowledge: users can select, modify, construct and then pass (component) processes during process execution as if they were data. We illustrate the impact and essence of our framework along a concrete, realistic (business) process modeling scenario: the development of Springer's browser-based Online Conference Service (OCS). The most advanced feature of our new framework allows one to combine online synthesis with the integration of the synthesized process into the running application. This ability leads to a particularly flexible way of implementing self-adaptation, and to a particularly concise and powerful way of achieving variability not only at design time, but also at runtime.
    Comment: In Proceedings Festschrift for Dave Schmidt, arXiv:1309.455
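    A minimal sketch of the "processes passed as data" idea, assuming hypothetical step names and plain Python functions rather than the framework's graphical, type-checked process models: a second-order step receives a component process as an input value and executes it within its own context.

```python
# Sketch of passing (component) processes as if they were data (hypothetical
# names, not the framework's API): a second-order process takes another
# process as an argument, selected or synthesized at runtime, and runs it.
from typing import Any, Callable, Dict

Process = Callable[[Dict[str, Any]], Dict[str, Any]]   # a process maps a context to a context

def approve_order(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {**ctx, "approved": True}

def run_with_audit(component: Process, ctx: Dict[str, Any]) -> Dict[str, Any]:
    """Second-order step: executes the given component process and records the call."""
    result = component(ctx)
    result.setdefault("audit", []).append(component.__name__)
    return result

print(run_with_audit(approve_order, {"order": 42}))
# {'order': 42, 'approved': True, 'audit': ['approve_order']}
```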

    InfraPhenoGrid: A scientific workflow infrastructure for Plant Phenomics on the Grid

    Plant phenotyping consists of observing the physical and biochemical traits of plant genotypes in response to environmental conditions. The challenges, in particular in the context of climate change and food security, are numerous. High-throughput platforms have been introduced to observe the dynamic growth of a large number of plants under different environmental conditions. Instead of considering a few genotypes at a time (as is the case when phenomic traits are measured manually), such platforms make it possible to use completely new kinds of approaches. However, the data sets produced by such widely instrumented platforms are huge, constantly growing, and produced by increasingly complex experiments, reaching a point where distributed computation is mandatory to extract knowledge from the data. In this paper, we introduce InfraPhenoGrid, the infrastructure we designed and deployed to efficiently manage data sets produced by the PhenoArch plant phenomics platform in the context of the French Phenome Project. Our solution consists of deploying scientific workflows on a Grid, using a middleware to pilot workflow executions. Our approach is user-friendly in the sense that, despite the intrinsic complexity of the infrastructure, running scientific workflows and understanding the results obtained (using provenance information) is kept as simple as possible for end users.
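    A minimal sketch of the kind of step-level provenance capture alluded to above, using hypothetical step and record names rather than the actual InfraPhenoGrid middleware: each workflow step records its inputs, outputs and timestamps so that results can later be traced.

```python
# Hypothetical illustration of provenance capture in a workflow (not the
# InfraPhenoGrid middleware itself): each step logs its inputs, outputs and
# timestamps so results can be explained after the fact.
import functools
import time

PROVENANCE = []   # in a real system this would go to a provenance store

def traced(step):
    @functools.wraps(step)
    def wrapper(*args, **kwargs):
        start = time.time()
        output = step(*args, **kwargs)
        PROVENANCE.append({
            "step": step.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": output,
            "started": start,
            "ended": time.time(),
        })
        return output
    return wrapper

@traced
def estimate_leaf_area(image_id: str) -> float:
    return 12.5   # placeholder for an actual image-analysis step

estimate_leaf_area("plant_001_day_07")
print(PROVENANCE[0]["step"], PROVENANCE[0]["output"])   # estimate_leaf_area 12.5
```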

    Work flows in life science

    The introduction of computer science technology into the life science domain has resulted in a new life science discipline called bioinformatics. Bioinformaticians are biologists who know how to apply computer science technology to perform computer-based experiments, also known as in-silico or dry-lab experiments. Various tools, such as databases, web applications and scripting languages, are used to design and run in-silico experiments. As the size and complexity of these experiments grow, new types of tools are required to design and execute the experiments and to analyse the results. Workflow systems promise to fulfill this role. The bioinformatician composes an experiment by using tools and web services as building blocks and connecting them, often through a graphical user interface. Workflow systems, such as Taverna, provide access to up to a few thousand resources in a uniform way. Although workflow systems are intended to make the bioinformatician's work easier, bioinformaticians experience difficulties in using them. This thesis is devoted to finding out which problems bioinformaticians experience when using workflow systems and to providing solutions for these problems.
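    A toy sketch of composing web services as workflow building blocks, with hypothetical endpoints and payloads (Taverna itself composes such services through a graphical interface rather than code):

```python
# Toy illustration of chaining two web services into a small in-silico
# experiment; the endpoints and payload fields are hypothetical.
import json
from urllib import request

def call_service(url: str, payload: dict) -> dict:
    """Invoke a REST-style service and decode its JSON reply."""
    data = json.dumps(payload).encode("utf-8")
    req = request.Request(url, data=data, headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

def workflow(gene_id: str) -> dict:
    # Step 1: fetch a sequence; Step 2: feed it into an analysis service.
    seq = call_service("https://example.org/sequence-service", {"id": gene_id})
    return call_service("https://example.org/blast-service", {"sequence": seq["fasta"]})

# workflow("BRCA2")  # would run if the hypothetical services existed
```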

    A generic framework for process execution and secure multi-party transaction authorization

    Process execution engines are not only an integral part of workflow and business process management systems but are increasingly used to build process-driven applications. In other words, they are potentially used in all kinds of software across all application domains. However, contemporary process engines and workflow systems are unsuitable for use in such diverse application scenarios for several reasons. The main shortcomings can be observed in the areas of interoperability, versatility, and programmability. Therefore, this thesis takes a step away from domain-specific, monolithic workflow engines towards generic and versatile process runtime frameworks, which enable the integration of process technology into all kinds of software. To achieve this, the idea and corresponding architecture of a generic and embeddable process virtual machine (ePVM), which supports defining process flows on the theoretical foundation of communicating extended finite state machines, are presented. The architecture focuses on core process functionality such as control flow and state management, monitoring, persistence, and communication, while using JavaScript as a process definition language. This approach leads to a very generic yet easily programmable process framework. A fully functional prototype implementation of the proposed framework is provided along with multiple example applications. Despite the fact that business processes are increasingly automated and controlled by information systems, humans are still involved, directly or indirectly, in many of them. Thus, for process flows involving sensitive transactions, a highly secure authorization scheme supporting asynchronous multi-party transaction authorization must be available within process management systems. Therefore, along with the ePVM framework, this thesis presents a novel approach for secure remote multi-party transaction authentication: the zone trusted information channel (ZTIC). The ZTIC approach uniquely combines multiple desirable properties, such as the highest level of security, ease of use, mobility, remote administration, and smooth integration with existing infrastructures, into one device and method. Extensively evaluating both the ePVM framework and the ZTIC, this thesis shows that ePVM in combination with the ZTIC approach represents a unique and very powerful framework for building workflow systems and process-driven applications, including support for secure multi-party transaction authorization.
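    The ePVM defines processes in JavaScript; the following Python sketch only illustrates the underlying communicating extended finite state machine idea (explicit states, extended state variables, and message-driven transitions) with a hypothetical approval process, not the ePVM API itself.

```python
# Conceptual sketch of a communicating extended finite state machine (CEFSM):
# named states plus extended state variables, with transitions driven by
# incoming messages. This is not the ePVM API, which uses JavaScript.

class ApprovalProcess:
    def __init__(self, required_approvals: int):
        self.state = "WAITING"
        self.approvals = 0                      # extended state variable
        self.required = required_approvals

    def receive(self, message: str) -> str:
        if self.state == "WAITING" and message == "approve":
            self.approvals += 1
            if self.approvals >= self.required:
                self.state = "APPROVED"
        elif self.state == "WAITING" and message == "reject":
            self.state = "REJECTED"
        return self.state

p = ApprovalProcess(required_approvals=2)
print(p.receive("approve"), p.receive("approve"))   # WAITING APPROVED
```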

    Scientific Workflows for Metabolic Flux Analysis

    Metabolic engineering is a highly interdisciplinary research domain that interfaces biology, mathematics, computer science, and engineering. Metabolic flux analysis with carbon tracer experiments (13C-MFA) is a particularly challenging metabolic engineering application that consists of several tightly interwoven building blocks such as modeling, simulation, and experimental design. While several general-purpose workflow solutions have emerged in recent years to support the realization of complex scientific applications, these approaches are only partially transferable to 13C-MFA workflows. Whereas problems in other research fields (e.g., bioinformatics) are primarily centered around scientific data processing, 13C-MFA workflows have more in common with business workflows. For instance, many bioinformatics workflows are designed to identify, compare, and annotate genomic sequences by "pipelining" them through standard tools like BLAST. Typically, the next workflow task in the pipeline can be determined automatically from the outcome of the previous step. Five computational challenges have been identified in the endeavor of conducting 13C-MFA studies: organization of heterogeneous data, standardization of processes and the unification of tools and data, interactive workflow steering, distributed computing, and service orientation. The outcome of this thesis is a scientific workflow framework (SWF) that is custom-tailored to the specific requirements of 13C-MFA applications. The proposed approach, namely designing the SWF as a collection of loosely coupled modules that are glued together with web services, eases the realization of 13C-MFA workflows by offering several features. By design, existing tools are integrated into the SWF using web service interfaces and foreign programming language bindings (e.g., Java or Python). Although the attributes "easy-to-use" and "general-purpose" are rarely associated with distributed computing software, the presented use cases show that the proposed Hadoop MapReduce framework eases the deployment of computationally demanding simulations on cloud and cluster computing resources. An important building block for allowing interactive, researcher-driven workflows is the ability to track all data needed to understand and reproduce a workflow. The standardization of 13C-MFA studies using a folder structure template and the corresponding services and web interfaces improves the exchange of information within a group of researchers. Finally, several auxiliary tools are developed in the course of this work to complement the SWF modules, ranging from simple helper scripts to visualization and data conversion programs. This solution distinguishes itself from other scientific workflow approaches by offering a system of loosely coupled components that are flexibly arranged to match the typical requirements of the metabolic engineering domain. Because the framework is modern and service-oriented, new applications are easily composed by reusing existing components.
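    A sketch of the "existing tools glued together with web services" idea, exposing a hypothetical command-line simulator (flux_simulator) through a plain XML-RPC endpoint; the SWF's actual service interfaces and module names are not shown here.

```python
# Hypothetical sketch: wrap an existing command-line tool behind a simple
# XML-RPC service so other loosely coupled modules can call it remotely.
# The tool name and arguments are placeholders, not the SWF's real services.
import subprocess
from xmlrpc.server import SimpleXMLRPCServer

def run_flux_simulation(model_file: str, measurements_file: str) -> str:
    """Invoke an existing command-line simulator and return its raw output."""
    result = subprocess.run(
        ["flux_simulator", "--model", model_file, "--data", measurements_file],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

server = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
server.register_function(run_flux_simulation, "run_flux_simulation")
# server.serve_forever()   # exposed as a service other modules can call
```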

    Sharing Semantic Resources

    The Semantic Web is an extension of the current Web in which information, so far created for human consumption, becomes machine readable, “enabling computers and people to work in cooperation”. To turn this vision into reality, several challenges remain open, the most important of which is sharing meaning formally represented with ontologies or, more generally, with semantic resources. This long-term goal of the Semantic Web converges in many ways with activities in the field of Human Language Technology, and in particular with the development of Natural Language Processing applications, where there is a great need for multilingual lexical resources. For instance, one of the most important lexical resources, WordNet, is also commonly regarded and used as an ontology. Another important phenomenon nowadays is the explosion of social collaboration: Wikipedia, the largest encyclopedia in the world, is the object of research as an up-to-date, comprehensive semantic resource. The main topic of this thesis is the management and exploitation of semantic resources in a collaborative way, trying to use already available resources such as Wikipedia and WordNet. This work presents a general environment able to turn the vision of shared and distributed semantic resources into reality and describes a distributed three-layer architecture that enables rapid prototyping of cooperative applications for developing semantic resources.
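    A small example of treating WordNet as an ontology, as mentioned above: walking the hypernym ("is-a") hierarchy of a concept with the NLTK interface. This assumes the nltk package and its WordNet corpus are installed; it is an illustration, not part of the environment described in the thesis.

```python
# Treat WordNet as an ontology: follow the hypernym ("is-a") chain of a concept.
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

synset = wn.synsets("dog")[0]            # typically dog.n.01
chain = synset.hypernym_paths()[0]       # one path from the root concept down to "dog"
print(" -> ".join(s.name() for s in chain))
# e.g. entity.n.01 -> physical_entity.n.01 -> ... -> canine.n.02 -> dog.n.01
```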

    A language and toolkit for the specification, execution and monitoring of dependable distributed applications

    This thesis addresses the problem of specifying the composition of distributed applications out of existing applications, possibly legacy ones. With the automation of business processes on the increase, more and more applications of this kind are being constructed. The resulting applications can be quite complex, are usually long-lived, and are executed in a heterogeneous environment. In a distributed environment, long-lived activities need support for fault tolerance and dynamic reconfiguration. Indeed, it is likely that the environment in which they run will change (nodes may fail, services may be moved elsewhere or withdrawn) during their execution, and the specification will have to be modified. There is also a need for modularity, scalability and openness. However, most existing systems consider only part of these requirements. A new area of research, called workflow management, has been trying to address these issues. This work first looks at what needs to be addressed to support the specification and execution of these new applications in a heterogeneous, distributed environment. A co-ordination language (scripting language) is developed that fulfils the requirements of specifying the composition and inter-dependencies of distributed applications with the properties of dynamic reconfiguration, fault tolerance, modularity, scalability and openness. The architecture of the overall workflow system and its implementation are then presented. The system has been implemented as a set of CORBA services, and the execution environment is built using a transactional workflow management system. Next, the thesis describes the design of a toolkit to specify, execute and monitor distributed applications. The design of the co-ordination language and the toolkit represents the main contribution of the thesis.
    Funding: UK Engineering and Physical Sciences Research Council, CaberNet, Northern Telecom (Nortel)
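    A toy sketch of the kind of composition and fault tolerance such a co-ordination script expresses, with hypothetical task names and plain Python functions standing in for distributed (CORBA) services: dependent tasks run in order, and a failed task falls back to an alternative.

```python
# Hypothetical sketch of composing inter-dependent tasks with a simple
# fault-tolerance policy (try an alternative service when one fails).
def compose(*tasks):
    def run(data):
        for task in tasks:          # tasks run in dependency order
            data = task(data)
        return data
    return run

def with_alternative(primary, fallback):
    def run(data):
        try:
            return primary(data)
        except Exception:
            return fallback(data)   # fall back to an alternative service
    return run

def flaky_booking_service(order):
    raise RuntimeError("service down")      # simulate a withdrawn or failed service

book_flight = with_alternative(flaky_booking_service,
                               lambda order: {**order, "flight": "backup provider"})
invoice = lambda order: {**order, "invoiced": True}

print(compose(book_flight, invoice)({"customer": "C42"}))
# {'customer': 'C42', 'flight': 'backup provider', 'invoiced': True}
```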