170,640 research outputs found
Semantically Resolving Type Mismatches in Scientific Workflows
Scientists are increasingly utilizing Grids to manage large data sets and execute scientific experiments on distributed resources. Scientific workflows are used as means for modeling and enacting scientific experiments. Windows Workflow Foundation (WF) is a major component of Microsoft’s .NET technology which offers lightweight support for long-running workflows. It provides a comfortable graphical and programmatic environment for the development of extended BPEL-style workflows. WF’s visual features ease the syntactic composition of Web services into scientific workflows but do nothing to assure that information passed between services has consistent semantic types or representations or that deviant flows, errors and compensations are handled meaningfully. In this paper we introduce SAWSDL-compliant annotations for WF and use them with a semantic reasoner to guarantee semantic type correctness in scientific workflows. Examples from bioinformatics are presented
Workflow Partitioning and Deployment on the Cloud using Orchestra
Orchestrating service-oriented workflows is typically based on a design model
that routes both data and control through a single point - the centralised
workflow engine. This causes scalability problems that include the unnecessary
consumption of the network bandwidth, high latency in transmitting data between
the services, and performance bottlenecks. These problems are highly prominent
when orchestrating workflows that are composed from services dispersed across
distant geographical locations. This paper presents a novel workflow
partitioning approach, which attempts to improve the scalability of
orchestrating large-scale workflows. It permits the workflow computation to be
moved towards the services providing the data in order to garner optimal
performance results. This is achieved by decomposing the workflow into smaller
sub workflows for parallel execution, and determining the most appropriate
network locations to which these sub workflows are transmitted and subsequently
executed. This paper demonstrates the efficiency of our approach using a set of
experimental workflows that are orchestrated over Amazon EC2 and across several
geographic network regions.Comment: To appear in Proceedings of the IEEE/ACM 7th International Conference
on Utility and Cloud Computing (UCC 2014
Cost-aware scheduling of deadline-constrained task workflows in public cloud environments
Public cloud computing infrastructure offers resources on-demand, and makes it possible to develop applications that elastically scale when demand changes. This capacity can be used to schedule highly parallellizable task workflows, where individual tasks consist of many small steps. By dynamically scaling the number of virtual machines used, based on varying resource requirements of different steps, lower costs can be achieved, and workflows that would previously have been infeasible can be executed. In this paper, we describe how task workflows consisting of large numbers of distributable steps can be provisioned on public cloud infrastructure in a cost-efficient way, taking into account workflow deadlines. We formally define the problem, and describe an ILP-based algorithm and two heuristic algorithms to solve it. We simulate how the three algorithms perform when scheduling these task workflows on public cloud infrastructure, using the various instance types of the Amazon EC2 cloud, and we evaluate the achieved cost and execution speed of the three algorithms using two different task workflows based on a document processing application
A safety analysis approach to clinical workflows : application and evaluation
Clinical workflows are safety critical workflows as they have the potential to cause harm or death to patients. Their safety needs to be considered as early as possible in the development process. Effective safety analysis methods are required to ensure the safety of these high-risk workflows, because errors that may happen through routine workflow could propagate within the workflow to result in harmful failures of the system’s output. This paper shows how to apply an approach for safety analysis of clinic al workflows to analyse the safety of the workflow within a radiology department and evaluates the approach in terms of usability and benefits. The outcomes of using this approach include identification of the root causes of hazardous workflow failures that may put patients’ lives at risk. We show that the approach is applicable to this area of healthcare and is able to present added value through the detailed information on possible failures, of both their causes and effects; therefore, it has the potential to improve the safety of radiology and other clinical workflows
GRIDCC: Real-time workflow system
The Grid is a concept which allows the sharing of resources between distributed communities, allowing each to progress towards potentially different goals. As adoption of the Grid increases so are the activities that people wish to conduct through it. The GRIDCC project is a European Union funded project addressing the issues of integrating instruments into the Grid. This increases the requirement of workflows and Quality of Service upon these workflows as many of these instruments have real-time requirements. In this paper we present the workflow management service within the GRIDCC project which is tasked with optimising the workflows and ensuring that they meet the pre-defined QoS requirements specified upon them
Kronos: a workflow assembler for genome analytics and informatics.
BackgroundThe field of next-generation sequencing informatics has matured to a point where algorithmic advances in sequence alignment and individual feature detection methods have stabilized. Practical and robust implementation of complex analytical workflows (where such tools are structured into "best practices" for automated analysis of next-generation sequencing datasets) still requires significant programming investment and expertise.ResultsWe present Kronos, a software platform for facilitating the development and execution of modular, auditable, and distributable bioinformatics workflows. Kronos obviates the need for explicit coding of workflows by compiling a text configuration file into executable Python applications. Making analysis modules would still require programming. The framework of each workflow includes a run manager to execute the encoded workflows locally (or on a cluster or cloud), parallelize tasks, and log all runtime events. The resulting workflows are highly modular and configurable by construction, facilitating flexible and extensible meta-applications that can be modified easily through configuration file editing. The workflows are fully encoded for ease of distribution and can be instantiated on external systems, a step toward reproducible research and comparative analyses. We introduce a framework for building Kronos components that function as shareable, modular nodes in Kronos workflows.ConclusionsThe Kronos platform provides a standard framework for developers to implement custom tools, reuse existing tools, and contribute to the community at large. Kronos is shipped with both Docker and Amazon Web Services Machine Images. It is free, open source, and available through the Python Package Index and at https://github.com/jtaghiyar/kronos
Enabling adaptive scientific workflows via trigger detection
Next generation architectures necessitate a shift away from traditional
workflows in which the simulation state is saved at prescribed frequencies for
post-processing analysis. While the need to shift to in~situ workflows has been
acknowledged for some time, much of the current research is focused on static
workflows, where the analysis that would have been done as a post-process is
performed concurrently with the simulation at user-prescribed frequencies.
Recently, research efforts are striving to enable adaptive workflows, in which
the frequency, composition, and execution of computational and data
manipulation steps dynamically depend on the state of the simulation. Adapting
the workflow to the state of simulation in such a data-driven fashion puts
extremely strict efficiency requirements on the analysis capabilities that are
used to identify the transitions in the workflow. In this paper we build upon
earlier work on trigger detection using sublinear techniques to drive adaptive
workflows. Here we propose a methodology to detect the time when sudden heat
release occurs in simulations of turbulent combustion. Our proposed method
provides an alternative metric that can be used along with our former metric to
increase the robustness of trigger detection. We show the effectiveness of our
metric empirically for predicting heat release for two use cases.Comment: arXiv admin note: substantial text overlap with arXiv:1506.0825
Preferred Workflows for Syndromic Surveillance Systems
Workflows are a sequence of information processing operations that people carry out to meet certain in-formational goals [1]. Using various user-centered design (UCD) techniques we uncovered the workflows that epidemiologists wished to follow when using syndromic surveillance (SS) systems
- …
