3,378 research outputs found
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure
Action semantics of unified modeling language
The Uni ed Modeling Language or UML, as a visual and general purpose modeling
language, has been around for more than a decade, gaining increasingly wide application
and becoming the de-facto industrial standard for modeling software systems. However,
the dynamic semantics of UML behaviours are only described in natural languages.
Speci cation in natural languages inevitably involves vagueness, lacks reasonability and
discourages mechanical language implementation. Such semi-formality of UML causes
wide concern for researchers, including us.
The formal semantics of UML demands more readability and extensibility due to its
fast evolution and a wider range of users. Therefore we adopt Action Semantics (AS),
mainly created by Peter Mosses, to formalize the dynamic semantics of UML, because
AS can satisfy these needs advantageously compared to other frameworks.
Instead of de ning UML directly, we design an action language, called ALx, and
use it as the intermediary between a typical executable UML and its action semantics.
ALx is highly heterogeneous, combining the features of Object Oriented Programming
Languages, Object Query Languages, Model Description Languages and more complex
behaviours like state machines. Adopting AS to formalize such a heterogeneous language
is in turn of signi cance in exploring the adequacy and applicability of AS.
In order to give assurance of the validity of the action semantics of ALx, a prototype
ALx-to-Java translator is implemented, underpinned by our formal semantic description
of the action language and using the Model Driven Approach (MDA). We argue that
MDA is a feasible way of implementing this source-to-source language translator because
the cornerstone of MDA, UML, is adequate to specify the static aspect of programming
languages, and MDA provides executable transformation languages to model mapping
rules between languages.
We also construct a translator using a commonly-used conventional approach, in
i
which a tool is employed to generate the lexical scanner and the parser, and then
other components including the type checker, symbol table constructor, intermediate
representation producer and code generator, are coded manually. Then we compare the
conventional approach with the MDA. The result shows that MDA has advantages over
the conventional method in the aspect of code quality but is inferior to the latter in
terms of system performance
Scientific Workflows for Metabolic Flux Analysis
Metabolic engineering is a highly interdisciplinary research domain that interfaces biology, mathematics, computer science, and engineering. Metabolic flux analysis with carbon tracer experiments (13 C-MFA) is a particularly challenging metabolic engineering application that consists of several tightly interwoven building blocks such as modeling, simulation, and experimental design. While several general-purpose workflow solutions have emerged in recent years to support the realization of complex scientific applications, the transferability of these approaches are only partially applicable to 13C-MFA workflows. While problems in other research fields (e.g., bioinformatics) are primarily centered around scientific data processing, 13C-MFA workflows have more in common with business workflows. For instance, many bioinformatics workflows are designed to identify, compare, and annotate genomic sequences by "pipelining" them through standard tools like BLAST. Typically, the next workflow task in the pipeline can be automatically determined by the outcome of the previous step. Five computational challenges have been identified in the endeavor of conducting 13 C-MFA studies: organization of heterogeneous data, standardization of processes and the unification of tools and data, interactive workflow steering, distributed computing, and service orientation. The outcome of this thesis is a scientific workflow framework (SWF) that is custom-tailored for the specific requirements of 13 C-MFA applications. The proposed approach – namely, designing the SWF as a collection of loosely-coupled modules that are glued together with web services – alleviates the realization of 13C-MFA workflows by offering several features. By design, existing tools are integrated into the SWF using web service interfaces and foreign programming language bindings (e.g., Java or Python). Although the attributes "easy-to-use" and "general-purpose" are rarely associated with distributed computing software, the presented use cases show that the proposed Hadoop MapReduce framework eases the deployment of computationally demanding simulations on cloud and cluster computing resources. An important building block for allowing interactive researcher-driven workflows is the ability to track all data that is needed to understand and reproduce a workflow. The standardization of 13 C-MFA studies using a folder structure template and the corresponding services and web interfaces improves the exchange of information for a group of researchers. Finally, several auxiliary tools are developed in the course of this work to complement the SWF modules, i.e., ranging from simple helper scripts to visualization or data conversion programs. This solution distinguishes itself from other scientific workflow approaches by offering a system of loosely-coupled components that are flexibly arranged to match the typical requirements in the metabolic engineering domain. Being a modern and service-oriented software framework, new applications are easily composed by reusing existing components
- …