
    Perspectives on automated composition of workflows in the life sciences [version 1; peer review: 2 approved]

    Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the “big picture” of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation. Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future.
    Stian Soiland-Reyes was supported by the BioExcel-2 Centre of Excellence, funded by the European Commission Horizon 2020 programme under European Commission contract H2020-INFRAEDI-02-2018 823830. Carole Goble was supported by EOSC-Life, funded by the European Commission Horizon 2020 programme under grant agreement H2020-INFRAEOSC-2018-2 824087. We gratefully acknowledge the financial support from the Lorentz Center, ELIXIR, and the Leiden University Medical Center (LUMC) that made the workshop possible. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
    Peer reviewed. Article signed by 33 authors: Anna-Lena Lamprecht, Magnus Palmblad, Jon Ison, Veit Schwämmle, Mohammad Sadnan Al Manir, Ilkay Altintas, Christopher J. O. Baker, Ammar Ben Hadj Amor, Salvador Capella-Gutierrez, Paulos Charonyktakis, Michael R. Crusoe, Yolanda Gil, Carole Goble, Timothy J. Griffin, Paul Groth, Hans Ienasescu, Pratik Jagtap, Matúš Kalaš, Vedran Kasalica, Alireza Khanteymoori, Tobias Kuhn, Hailiang Mei, Hervé Ménager, Steffen Möller, Robin A. Richardson, Vincent Robert, Stian Soiland-Reyes, Robert Stevens, Szoke Szaniszlo, Suzan Verberne, Aswin Verhoeven, Katherine Wolstencroft. Postprint (published version).
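    As a purely illustrative aside, the six-stage life cycle summarized above can be written down as a small data structure. The Python sketch below uses the stage names from the abstract; the transition annotations are assumptions added for illustration, since the abstract only says that diverse tools and methods, usually incorporating domain knowledge, facilitate the transitions.

    from enum import Enum

    class LifecycleStage(Enum):
        """Six stages of the scientific workflow life cycle (from the abstract)."""
        SCIENTIFIC_QUESTION = 1    # scientific question or hypothesis
        CONCEPTUAL_WORKFLOW = 2
        ABSTRACT_WORKFLOW = 3
        CONCRETE_WORKFLOW = 4
        PRODUCTION_WORKFLOW = 5
        SCIENTIFIC_RESULTS = 6

    # Hypothetical examples of what could support each transition (assumptions).
    TRANSITIONS = {
        (LifecycleStage.SCIENTIFIC_QUESTION, LifecycleStage.CONCEPTUAL_WORKFLOW): "domain knowledge",
        (LifecycleStage.CONCEPTUAL_WORKFLOW, LifecycleStage.ABSTRACT_WORKFLOW): "semantic domain models",
        (LifecycleStage.ABSTRACT_WORKFLOW, LifecycleStage.CONCRETE_WORKFLOW): "automated composition over tool registries",
        (LifecycleStage.CONCRETE_WORKFLOW, LifecycleStage.PRODUCTION_WORKFLOW): "benchmarking with reference data",
        (LifecycleStage.PRODUCTION_WORKFLOW, LifecycleStage.SCIENTIFIC_RESULTS): "large-scale deployment and execution",
    }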

    Scientific Workflow Integration For Services Computing

    In recent years, significant scientific advances have increasingly been achieved through complex scientific processes. With the exponential growth of computing technologies and scientific data, a scientific workflow may comprise a large number of heterogeneous scientific services and applications, provided by different organizations. These services, applications, and their associated data are usually distributed across heterogeneous computing environments. The integration and management of such scientific workflows are pushing the limits of current workflow technology. This dissertation presents an integrated solution to composing, scheduling, executing and developing scientific workflows and scientific workflow management systems. To provide a foundation for workflow composition, scheduling, execution and management, we propose the first reference architecture for scientific workflow management systems. The reference architecture not only provides a high-level organization of subsystems and their interactions in a workflow system, but also provides a basis for comparison between different systems and guidance for the architectural design of an SWFMS in a specific scientific domain. To integrate heterogeneous services and applications and enable their composition into workflows, we propose a task template model which not only provides an appropriate abstraction of heterogeneous services and applications, but also encapsulates the composition and mapping of shims and functional task components within a task interface. Our proposed task specification language (TSL) not only integrates heterogeneous services and applications into uniform workflow tasks, but also provides a solution to both the TYPE-I and TYPE-II shimming problems in composing scientific workflows. To schedule scientific workflows in emerging services computing environments, we propose two workflow scheduling algorithms, the SHEFT algorithm and the SCPOR algorithm, which prioritize tasks in a workflow, map tasks onto suitable resources and order the execution of tasks on the assigned resources, so that the workflow makespan is minimized. Our extensive experiments have shown that our proposed algorithms not only outperform other algorithms for large-scale, data-intensive and compute-intensive workflows, but also allow the assigned resources to change elastically with the size of the workflows. To execute workflows in distributed computing environments, we propose a task run model to model the run-time behaviors of tasks. The proposed task run description language (TRDL) enables the execution of task instances constructed from heterogeneous services and applications. We also develop an SOA-based task management subsystem to manage all task templates, task instances and task runs for the invocation and execution of various heterogeneous task components. Finally, our SOA-based workflow management system, the VIEW system, and a VIEW-based workflow application system, the FiberFlow system, validate our architectures, models, languages, and algorithms.
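    The scheduling contribution can be made concrete with a generic list-scheduling sketch. The Python below is not the SHEFT or SCPOR algorithm itself, only a simplified HEFT-style illustration of the idea the abstract describes: rank tasks, then map each task onto the resource that minimizes its finish time; the data model is an assumption made for the example.

    def schedule(tasks, deps, cost, resources):
        """Simplified HEFT-style list scheduling (illustration only).
        tasks: list of task ids; deps: {task: set of predecessor tasks};
        cost: {(task, resource): execution time}; resources: list of resource ids."""
        rank = {}  # upward rank: average cost plus longest path to an exit task

        def upward_rank(t):
            if t not in rank:
                successors = [s for s in tasks if t in deps.get(s, set())]
                avg = sum(cost[(t, r)] for r in resources) / len(resources)
                rank[t] = avg + max((upward_rank(s) for s in successors), default=0.0)
            return rank[t]

        finish = {}                          # task -> finish time
        free = {r: 0.0 for r in resources}   # resource -> time it becomes available
        for t in sorted(tasks, key=upward_rank, reverse=True):  # highest rank first
            ready = max((finish[p] for p in deps.get(t, set())), default=0.0)
            # Choose the resource that minimizes this task's earliest finish time.
            best = min(resources, key=lambda r: max(ready, free[r]) + cost[(t, r)])
            finish[t] = max(ready, free[best]) + cost[(t, best)]
            free[best] = finish[t]
        return finish                        # the makespan is max(finish.values())

    # Example: two tasks on two resources, where t2 depends on t1.
    print(schedule(["t1", "t2"], {"t2": {"t1"}},
                   {("t1", "r1"): 3, ("t1", "r2"): 4, ("t2", "r1"): 2, ("t2", "r2"): 1},
                   ["r1", "r2"]))   # -> {'t1': 3, 't2': 4}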

    APE in the wild: automated exploration of proteomics workflows in the bio.tools registry

    The bio.tools registry is a main catalogue of computational tools in the life sciences. More than 17 000 tools have been registered by the international bioinformatics community. The bio.tools metadata schema includes semantic annotations of tool functions, that is, formal descriptions of tools' data types, formats, and operations with terms from the EDAM bioinformatics ontology. Such annotations enable the automated composition of tools into multistep pipelines or workflows. In this Technical Note, we revisit a previous case study on the automated composition of proteomics workflows. We use the same four workflow scenarios but instead of using a small set of tools with carefully handcrafted annotations, we explore workflows directly on bio.tools. We use the Automated Pipeline Explorer (APE), a reimplementation and extension of the workflow composition method previously used. Moving "into the wild" opens up an unprecedented wealth of tools and a huge number of alternative workflows. Automated composition tools can be used to explore this space of possibilities systematically. Inevitably, the mixed quality of semantic annotations in bio.tools leads to unintended or erroneous tool combinations. However, our results also show that additional control mechanisms (tool filters, configuration options, and workflow constraints) can effectively guide the exploration toward smaller sets of more meaningful workflows.
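    To illustrate how such semantic annotations enable composition, the toy Python sketch below chains tools whose declared output type matches the next tool's input type. This is not the composition method implemented in APE; the tool names and EDAM-style type labels are invented stand-ins for bio.tools entries and EDAM terms.

    # Hypothetical tools annotated with input and output data types.
    TOOLS = {
        "PeakPicker":   {"in": {"Mass spectrum"},            "out": {"Peak list"}},
        "SearchEngine": {"in": {"Peak list"},                 "out": {"Peptide identification"}},
        "Quantifier":   {"in": {"Peptide identification"},    "out": {"Protein quantification"}},
    }

    def compose(source_type, target_type, max_len=4):
        """Enumerate tool chains that transform source_type into target_type."""
        chains = []

        def extend(current_type, chain):
            if current_type == target_type:
                chains.append(list(chain))
                return
            if len(chain) >= max_len:
                return
            for name, ann in TOOLS.items():
                if current_type in ann["in"] and name not in chain:
                    for out in ann["out"]:
                        extend(out, chain + [name])

        extend(source_type, [])
        return chains

    print(compose("Mass spectrum", "Protein quantification"))
    # -> [['PeakPicker', 'SearchEngine', 'Quantifier']]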

    VPH-HF: A software framework for the execution of complex subject-specific physiology modelling workflows

    Computational medicine increasingly requires complex orchestrations of multiple modelling and simulation codes, written in different programming languages and with different computational requirements, which, once validated, need to be run many times on large cohorts of patients. The aim of this paper is to present a new open-source software framework, the VPH Hypermodelling Framework (VPH-HF). The VPH-HF overcomes the limitations of most workflow execution environments by supporting both Taverna and Muscle2; the addition of Muscle2 support makes it possible to execute very complex orchestrations that include strongly coupled models. The overhead that the VPH-HF imposes in exchange for this is small, and tends to be flat regardless of the complexity and the computational cost of the hypermodel being executed. We recommend the use of the VPH-HF to orchestrate any hypermodel with an execution time of 200 s or higher, which would confine the VPH-HF overhead to less than 10%. The VPH-HF also provides an automatic caching system over the execution of every hypomodel, which may provide considerable speed-up when the orchestration is run repeatedly over large numbers of patients or within stochastic frameworks, and the input sets are properly binned. The caching system also makes it easy to form the large input-set/output-set databases required to develop reduced-order models, and the framework offers the possibility to dynamically replace single models in the orchestration with reduced-order versions built from cached results, an essential feature when the orchestration of multiple models produces a combinatorial explosion of the computational cost.
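    The 200 s recommendation together with the stated 10% bound implies a roughly constant framework overhead on the order of 20 s per run. The caching idea, executing a component model only when its (binned) inputs have not been seen before, can be sketched in Python as follows; this is a minimal illustration, not the VPH-HF implementation, and the binning policy is an assumption.

    cache = {}

    def bin_inputs(params, width=0.05):
        """Quantize each numeric input to a bin of the given width (assumed policy)."""
        return tuple(round(v / width) for v in params)

    def run_model(model, params):
        key = (model.__name__, bin_inputs(params))
        if key not in cache:               # execute only on a cache miss
            cache[key] = model(params)
        return cache[key]

    def expensive_hypomodel(params):       # placeholder for a real simulation code
        return sum(p ** 2 for p in params)

    run_model(expensive_hypomodel, (0.101, 2.0))   # executes the model
    run_model(expensive_hypomodel, (0.102, 2.0))   # cache hit: falls into the same bins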

    Automatic Generation of Optimized Process Models from Declarative Specifications

    Process models are often generic, i.e., they describe similar cases or contexts. For instance, a process model for commissioning can cover vehicles with either an automatic or a manual transmission, by executing alternative tasks. A generic process model is not optimal compared to one tailored to a specific context. Given a declarative specification of the constraints and a specific context, we study how to automatically generate a good process model and propose a novel approach. We focus on the restricted case in which no task is repeated, as in commissioning and elsewhere, e.g., in manufacturing. Our approach uses a probabilistic search to find a good process model according to quality criteria. It can handle complex real-world specifications containing several hundred constraints and more than one hundred tasks. The process models generated with our scheme are superior (nearly twice as fast) to ones designed by hand by professional modelers.
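    As a toy illustration of the general idea (not the authors' actual approach), the Python below randomly samples candidate task orderings, keeps only those satisfying declarative precedence constraints, and scores them with a simple quality criterion; the task names, durations, and constraints are invented.

    import random

    TASKS = {"mount_seats": 3, "install_transmission": 5, "wire_harness": 2, "road_test": 4}

    # Declarative precedence constraints, as (before, after) pairs (invented example).
    CONSTRAINTS = [("install_transmission", "road_test"), ("wire_harness", "road_test")]

    def satisfies(order):
        pos = {task: i for i, task in enumerate(order)}
        return all(pos[a] < pos[b] for a, b in CONSTRAINTS)

    def total_completion_time(order):
        t = total = 0
        for task in order:            # sum of completion times as a crude quality score
            t += TASKS[task]
            total += t
        return total

    def search(samples=1000, seed=0):
        rng, best = random.Random(seed), None
        for _ in range(samples):
            order = list(TASKS)
            rng.shuffle(order)
            if satisfies(order) and (best is None or
                                     total_completion_time(order) < total_completion_time(best)):
                best = order
        return best

    print(search())   # e.g. ['wire_harness', 'mount_seats', 'install_transmission', 'road_test']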