158 research outputs found

Static Analysis of Taverna Workflows To Predict Provenance Patterns

Author: Alper Pinar
Belhajjame Khalid
Goble Carole
Publication venue: 'Elsevier BV'
Publication date: 01/10/2017

The University of Manchester - Institutional Repository

Static analysis of Taverna workflows to predict provenance patterns

Author: Pinar Alper
Khalid Belhajjame
Carole A. Goble
Publication venue: 'Elsevier BV'

Crossref

TOWARDS HARNESSING COMPUTATIONAL WORKFLOW PROVENANCE FOR EXPERIMENT REPORTING

Author: Alper Pinar
Publication date: 01/08/2016

The University of Manchester - Institutional Repository

Understanding Legacy Workflows through Runtime Trace Analysis

Publication date: 01/01/2015

When scientific software is written to specify processes, it takes the form of a workflow, and is often written in an ad-hoc manner in a dynamic programming language. There is a proliferation of legacy workflows implemented by non-expert programmers due to the accessibility of dynamic languages. Unfortunately, ad-hoc workflows lack a structured description as provided by specialized management systems, making ad-hoc workflow maintenance and reuse difficult, and motivating the need for analysis methods. The analysis of ad-hoc workflows using compiler techniques does not address dynamic languages: a program has so few constraints that its behavior cannot be predicted. In contrast, workflow provenance tracking has had success using run-time techniques to record data. The aim of this work is to develop a new analysis method for extracting workflow structure at run-time, thus avoiding issues with dynamics. The method captures the dataflow of an ad-hoc workflow through its execution and abstracts it with a process for simplifying repetition. An instrumentation system first processes the workflow to produce an instrumented version, capable of logging events, which is then executed on an input to produce a trace. The trace undergoes dataflow construction to produce a provenance graph. The dataflow is examined for equivalent regions, which are collected into a single unit. The workflow is thus characterized in terms of its treatment of an input. Unlike other methods, a run-time approach characterizes the workflow's actual behavior, including elements which static analysis cannot predict (for example, code dynamically evaluated based on input parameters). This also enables the characterization of dataflow through external tools. The contributions of this work are: a run-time method for recording a provenance graph from an ad-hoc Python workflow, and a method to analyze the structure of a workflow from provenance.
Methods are implemented in Python and are demonstrated on real world Python workflows. These contributions enable users to derive graph structure from workflows. Empowered by a graphical view, users can better understand a legacy workflow. This makes the wealth of legacy ad-hoc workflows accessible, enabling workflow reuse instead of investing time and resources into creating a workflow.
Dissertation/Thesis, Masters Thesis, Computer Science 201
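The pipeline described in this abstract (instrument, execute to get a trace, build a dataflow graph, collapse repetition) can be illustrated with a minimal sketch. This is not the thesis's implementation: `Value`, `Tracer`, `build_graph`, and `summarize` are hypothetical names chosen here, and counting repeated calls per step is only a crude stand-in for the thesis's detection of equivalent regions.

```python
import functools
import itertools


class Value:
    """A datum tagged with a unique id so trace events can refer to it (hypothetical)."""
    _counter = itertools.count()

    def __init__(self, data):
        self.data = data
        self.vid = f"d{next(Value._counter)}"


class Tracer:
    """Collects one event per instrumented call (hypothetical helper)."""

    def __init__(self):
        self.events = []  # (step name, input ids, output id)

    def instrument(self, func):
        @functools.wraps(func)
        def wrapper(*args):
            # Unwrap inputs, run the real step, then tag and log the output.
            out = Value(func(*(a.data for a in args)))
            self.events.append((func.__name__, [a.vid for a in args], out.vid))
            return out
        return wrapper


def build_graph(events):
    """Turn the trace into provenance edges: data -> call -> data."""
    edges = []
    for i, (name, inputs, output) in enumerate(events):
        call = f"{name}#{i}"
        edges.extend((vid, call) for vid in inputs)
        edges.append((call, output))
    return edges


def summarize(events):
    """Collapse repeated calls of the same step into one unit with a count."""
    counts = {}
    for name, _inputs, _output in events:
        counts[name] = counts.get(name, 0) + 1
    return counts


tracer = Tracer()


@tracer.instrument
def clean(text):
    return text.strip()


@tracer.instrument
def merge(a, b):
    return a + "," + b


raw = [Value(" a "), Value(" b ")]
cleaned = [clean(v) for v in raw]
result = merge(cleaned[0], cleaned[1])

edges = build_graph(tracer.events)
summary = summarize(tracer.events)
```

In this toy run the two `clean` calls are reported as a single repeated unit, while `edges` keeps the full data-to-call-to-data lineage for the one input that was executed.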

ASU Digital Repository

Perspectives on automated composition of workflows in the life sciences [version 1; peer review: 2 approved]

Author: Al Manir Mohammad Sadnan
Capella Gutiérrez Salvador
Ison Jon
Lamprecht Anna-Lena
Palmblad Magnus
Schwämmle Veit
Publication venue: 'F1000 Research Ltd'
Publication date: 01/01/2021

Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the “big picture” of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation. 
Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future.
Stian Soiland-Reyes was supported by BioExcel-2 Centre of Excellence, funded by European Commission Horizon 2020 programme under European Commission contract H2020-INFRAEDI-02-2018 823830. Carole Goble was supported by EOSC-Life, funded by European Commission Horizon 2020 programme under grant agreement H2020-INFRAEOSC-2018-2 824087. We gratefully acknowledge the financial support from the Lorentz Center, ELIXIR, and the Leiden University Medical Center (LUMC) that made the workshop possible. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Peer Reviewed. Article signed by 33 authors: Anna-Lena Lamprecht, Magnus Palmblad, Jon Ison, Veit Schwämmle, Mohammad Sadnan Al Manir, Ilkay Altintas, Christopher J. O. Baker, Ammar Ben Hadj Amor, Salvador Capella-Gutierrez, Paulos Charonyktakis, Michael R. Crusoe, Yolanda Gil, Carole Goble, Timothy J. Griffin, Paul Groth, Hans Ienasescu, Pratik Jagtap, Matúš Kalaš, Vedran Kasalica, Alireza Khanteymoori, Tobias Kuhn, Hailiang Mei, Hervé Ménager, Steffen Möller, Robin A. Richardson, Vincent Robert, Stian Soiland-Reyes, Robert Stevens, Szoke Szaniszlo, Suzan Verberne, Aswin Verhoeven, Katherine Wolstencroft. Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Workflow Provenance: from Modeling to Reporting

Author: Ferdous Rayhan 1992-
Publication venue: 'University of Saskatchewan Library'
Publication date: 12/03/2019

Workflow provenance is a crucial part of a workflow system as it enables data lineage analysis, error tracking, workflow monitoring, usage pattern discovery, and so on. Integrating provenance into a workflow system or modifying a workflow system to capture or analyze different provenance information is burdensome, requiring extensive development because provenance mechanisms rely heavily on the modelling, architecture, and design of the workflow system. Various tools and technologies exist for logging events in a software system. Unfortunately, logging tools and technologies are not designed for capturing and analyzing provenance information. Workflow provenance is not only about logging, but also about retrieving workflow-related information from logs. In this work, we propose a taxonomy of provenance questions and, guided by these questions, we created a workflow programming model 'ProvMod' with a supporting run-time library to provide automated provenance and log analysis for any workflow system. The design and provenance mechanism of ProvMod is based on recommendations from prominent research and is easy to integrate into any workflow system. ProvMod offers Neo4j graph database support to manage semi-structured heterogeneous JSON logs. The log structure is adaptable to any NoSQL technology. For each provenance question in our taxonomy, ProvMod provides the answer with data visualization using Neo4j and the ELK Stack. Besides analyzing performance from various angles, we demonstrate the ease of integration by integrating ProvMod with Apache Taverna and evaluate ProvMod usability by engaging users. Finally, we present two Software Engineering research cases (clone detection and architecture extraction) where our proposed model ProvMod and provenance questions taxonomy can be applied to discover meaningful insights.
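As a rough illustration of the idea that provenance means retrieving workflow information from logs, not only writing them, the sketch below answers two provenance-style questions over JSON log lines in plain Python. It does not use ProvMod, Neo4j, or the ELK Stack; the log fields (`event`, `module`, `output`, `message`) and both query functions are assumptions made up for this example, not ProvMod's actual schema or API.

```python
import json

# Hypothetical semi-structured log lines; the field names are assumptions.
log_lines = [
    '{"event": "start", "module": "align", "time": 0.0}',
    '{"event": "end", "module": "align", "time": 2.5, "output": "aln.txt"}',
    '{"event": "error", "module": "plot", "time": 3.0, "message": "missing input"}',
]

records = [json.loads(line) for line in log_lines]


def which_modules_failed(records):
    """Error tracking: names of modules that logged an error event."""
    return sorted({r["module"] for r in records if r["event"] == "error"})


def what_produced(records, output):
    """Data lineage: modules whose log records name the given output."""
    return [r["module"] for r in records if r.get("output") == output]


failed = which_modules_failed(records)
producers = what_produced(records, "aln.txt")
```

Because records are heterogeneous, the lineage query uses `r.get("output")` so that records without an `output` field are simply skipped.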

eCommons@USASK

University of Saskatchewan Research Archive

Perspectives on automated composition of workflows in the life sciences

Author: Al Manir M.S.
Altintas I.
Baker C.J.O.
Ben Hadj Amor A.
Capella-Gutierrez S.
Charonyktakis P.
Crusoe M.R.
Gil Y.
Goble C.
Griffin T.J.
Groth P.
Ienasescu H.
Ison J.
Jagtap P.
Kalaš M.
Kasalica V.
Khanteymoori A.
Kuhn T.
Lamprecht A.-L.
Mei H.
Ménager H.
Möller S.
Palmblad M.
Richardson R.A.
Robert V.
Schwämmle V.
Soiland-Reyes S.
Stevens R.
Szaniszlo S.
Verberne S.
Verhoeven A.
Wolstencroft K.
Publication venue: 'F1000 Research Ltd'
Publication date: 01/01/2021

International Migration, Integration and Social Cohesion online publications

The architecture of discovery net : towards grid-based discovery services

Author: Wendel Patrick
Publication date: 01/01/2008

Imperial Users onl

Spiral - Imperial College Digital Repository

A MULTI-FUNCTIONAL PROVENANCE ARCHITECTURE: CHALLENGES AND SOLUTIONS

Author: Naseri Mahsa
Publication venue: 'University of Saskatchewan Library'

In service-oriented environments, services are put together in the form of a workflow with the aim of distributed problem solving. Capturing the execution details of the services' transformations is a significant advantage of using workflows. These execution details, referred to as provenance information, are usually traced automatically and stored in provenance stores. Provenance data contains the data recorded by a workflow engine during a workflow execution. It identifies what data is passed between services, which services are involved, and how results are eventually generated for particular sets of input values. Provenance information is of great importance and has found its way into areas of computer science such as bioinformatics, databases, social networks, and sensor networks. Current exploitation and application of provenance data is very limited, as provenance systems were initially developed for specific applications. Thus, applying learning and knowledge discovery methods to provenance data can provide rich and useful information on workflows and services. Therefore, in this work, the challenges with workflows and services are studied to discover the possibilities and benefits of providing solutions by using provenance data. A multifunctional architecture is presented which addresses the workflow and service issues by exploiting provenance data. These challenges include workflow composition, abstract workflow selection, refinement, evaluation, and graph model extraction. The specific contribution of the proposed architecture is its novelty in providing a basis for taking advantage of the previous execution details of services and workflows along with artificial intelligence and knowledge management techniques to resolve the major challenges regarding workflows. The presented architecture is application-independent and could be deployed in any area. The requirements for such an architecture along with its building components are discussed.
Furthermore, the responsibilities of the components, related work, and the implementation details of the architecture and of each component are presented.

eCommons@USASK

University of Saskatchewan Research Archive