BPMN4sML: A BPMN Extension for Serverless Machine Learning. Technology Independent and Interoperable Modeling of Machine Learning Workflows and their Serverless Deployment Orchestration
Machine learning (ML) continues to permeate all layers of academia, industry
and society. Despite its successes, mental frameworks to capture and represent
machine learning workflows in a consistent and coherent manner are lacking. For
instance, the de facto process modeling standard, Business Process Model and
Notation (BPMN), managed by the Object Management Group, is widely accepted and
applied. However, it lacks specific support for representing machine learning
workflows. Further, the number of heterogeneous tools for deployment of machine
learning solutions can easily overwhelm practitioners. Research is needed to
align the process from modeling to deploying ML workflows.
We analyze requirements for standards-based conceptual modeling for machine
learning workflows and their serverless deployment. Confronting the
shortcomings with respect to consistent and coherent modeling of ML workflows
in a technology independent and interoperable manner, we extend BPMN's
Meta-Object Facility (MOF) metamodel and the corresponding notation and
introduce BPMN4sML (BPMN for serverless machine learning). Our extension
BPMN4sML follows the same outline referenced by the Object Management Group
(OMG) for BPMN. We further address the heterogeneity in deployment by proposing
a conceptual mapping to convert BPMN4sML models to corresponding deployment
models using TOSCA.
BPMN4sML allows technology-independent and interoperable modeling of machine
learning workflows of various granularity and complexity across the entire
machine learning lifecycle. It aids in arriving at a shared and standardized
language to communicate ML solutions. Moreover, it takes the first steps toward
enabling conversion of ML workflow model diagrams to corresponding deployment
models for serverless deployment via TOSCA.
Comment: 105 pages, 3 tables, 33 figures
Computerization of workflows, guidelines and care pathways: a review of implementation challenges for process-oriented health information systems
There is a need to integrate the various theoretical frameworks and formalisms for modeling clinical guidelines, workflows, and pathways, in order to move beyond providing support for individual clinical decisions and toward the provision of process-oriented, patient-centered, health information systems (HIS). In this review, we analyze the challenges in developing process-oriented HIS that formally model guidelines, workflows, and care pathways. A qualitative meta-synthesis was performed on studies published in English between 1995 and 2010 that addressed the modeling process and reported the exposition of a new methodology, model, system implementation, or system architecture. Thematic analysis, principal component analysis (PCA) and data visualisation techniques were used to identify and cluster the underlying implementation ‘challenge’ themes. One hundred and eight relevant studies were selected for review. Twenty-five underlying ‘challenge’ themes were identified. These were clustered into 10 distinct groups, from which a conceptual model of the implementation process was developed. We found that the development of systems supporting individual clinical decisions is evolving toward the implementation of adaptable care pathways on the semantic web, incorporating formal, clinical, and organizational ontologies, and the use of workflow management systems. These architectures now need to be implemented and evaluated on a wider scale within clinical settings
Conceptual-level workflow modeling of scientific experiments using NMR as a case study
BACKGROUND: Scientific workflows improve the process of scientific experiments by making computations explicit, underscoring data flow, and emphasizing the participation of humans in the process when intuition and human reasoning are required. Workflows for experiments also highlight transitions among experimental phases, allowing intermediate results to be verified and supporting the proper handling of semantic mismatches and different file formats among the various tools used in the scientific process. Thus, scientific workflows are important for the modeling and subsequent capture of bioinformatics-related data. While much research has been conducted on the implementation of scientific workflows, the initial process of actually designing and generating the workflow at the conceptual level has received little consideration. RESULTS: We propose a structured process to capture scientific workflows at the conceptual level that allows workflows to be documented efficiently, results in concise models of the workflow and more-correct workflow implementations, and provides insight into the scientific process itself. The approach uses three modeling techniques to model the structural, data flow, and control flow aspects of the workflow. The domain of biomolecular structure determination using Nuclear Magnetic Resonance spectroscopy is used to demonstrate the process. Specifically, we show the application of the approach to capture the workflow for the process of conducting biomolecular analysis using Nuclear Magnetic Resonance (NMR) spectroscopy. CONCLUSION: Using the approach, we were able to accurately document, in a short amount of time, numerous steps in the process of conducting an experiment using NMR spectroscopy. The resulting models are correct and precise, as outside validation of the models identified only minor omissions in the models. 
In addition, the models provide an accurate visual description of the control flow for conducting biomolecular analysis using NMR spectroscopy experiments.
Identifying and Modelling Complex Workflow Requirements in Web Applications
Workflows play a major role in today's business, and their requirements
elicitation must therefore be accurate and clear in order to achieve the solution
closest to the business's needs. Owing to the popularity of Web applications, the
Web is becoming the standard platform for implementing business workflows. In this
context, Web applications and their workflows must be adapted to market demands
in such a way that time and effort are minimized. As they become more popular,
they must support different functional requirements, but they also
contain tangled and scattered behaviour. In this work we present a model-driven
approach for modelling workflows using a Domain Specific Language for Web
application requirements called WebSpec. We present an extension to WebSpec
based on Pattern Specifications for modelling crosscutting workflow requirements,
identifying tangled and scattered behaviour and reducing inconsistencies
early in the cycle.
When Are Two Workflows the Same?
In the area of workflow management, one is confronted with a large number of competing languages and the relations between them (e.g. relative expressiveness) are usually not clear. Moreover, even within the same language it is generally possible to express the same workflow in different ways, a feature known as variability. This paper aims at providing some of the formal groundwork for studying relative expressiveness and variability by defining notions of equivalence capturing different views on how workflow systems operate. Firstly, a notion of observational equivalence in the absence of silent steps is defined and related to classical bisimulation. Secondly, a number of equivalence notions in the presence of silent steps are defined. A distinction is made between the case where silent steps are visible (but not controllable) by the environment and the case where silent steps are not visible, i.e., there is an alternation between system events and environment interactions. It is shown that these notions of equivalence are different and do not coincide with classical notions of bisimulation with silent steps (e.g. weak and branching)
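As an illustration of the classical bisimulation that the abstract above uses as a reference point, the following is a minimal sketch of a naive strong-bisimilarity check on a labelled transition system. It is not the paper's own formalism: the function name `strong_bisimilar`, the triple-based encoding of transitions, and the two example workflows are all assumptions made for illustration, and silent steps are not modelled.

```python
from itertools import product


def strong_bisimilar(states, transitions, s, t):
    """Naive greatest-fixpoint check of strong bisimilarity.

    `transitions` is a set of (source, label, target) triples
    (a hypothetical encoding chosen for this sketch).
    """
    labels = {a for (_, a, _) in transitions}

    def succ(p, a):
        # All states reachable from p by one step labelled a.
        return {q for (p0, a0, q) in transitions if p0 == p and a0 == a}

    # Start from the full relation and discard pairs that break
    # the transfer condition until nothing changes.
    rel = set(product(states, states))
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            ok = all(
                all(any((p1, q1) in rel for q1 in succ(q, a)) for p1 in succ(p, a))
                and all(any((p1, q1) in rel for p1 in succ(p, a)) for q1 in succ(q, a))
                for a in labels
            )
            if not ok:
                rel.discard((p, q))
                changed = True
    return (s, t) in rel


# Two ways of writing "do a, then stop": one a-step versus two alternative a-steps.
same_states = {"p0", "p1", "q0", "q1", "q2"}
same_trans = {("p0", "a", "p1"), ("q0", "a", "q1"), ("q0", "a", "q2")}

# a.(b + c) versus a.b + a.c: the choice between b and c is made at different moments.
diff_states = {"p0", "p1", "p2", "p3", "q0", "q1", "q2", "q3", "q4"}
diff_trans = {
    ("p0", "a", "p1"), ("p1", "b", "p2"), ("p1", "c", "p3"),
    ("q0", "a", "q1"), ("q0", "a", "q2"), ("q1", "b", "q3"), ("q2", "c", "q4"),
}

print(strong_bisimilar(same_states, same_trans, "p0", "q0"))
print(strong_bisimilar(diff_states, diff_trans, "p0", "q0"))
```

The first pair is a small instance of variability, where the same behaviour is written in two ways. The second pair shows two workflows with the same traces that strong bisimulation nevertheless distinguishes.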
BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments
Advances in sequencing techniques have led to exponential growth in
biological data, demanding the development of large-scale bioinformatics
experiments. Because these experiments are computation- and data-intensive,
they require high-performance computing (HPC) techniques and can benefit from
specialized technologies such as Scientific Workflow Management Systems (SWfMS)
and databases. In this work, we present BioWorkbench, a framework for managing
and analyzing bioinformatics experiments. This framework automatically collects
provenance data, including both performance data from workflow execution and
data from the scientific domain of the workflow application. Provenance data
can be analyzed through a web application that abstracts a set of queries to
the provenance database, simplifying access to provenance information. We
evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree
assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a
RASopathy analysis workflow. We analyze each workflow from both computational
and scientific domain perspectives, by using queries to a provenance and
annotation database. Some of these queries are available as a pre-built feature
of the BioWorkbench web application. Through the provenance data, we show that
the framework is scalable and achieves high performance, reducing the execution
time of the case studies by up to 98%. We also show how the application of machine
learning techniques can enrich the analysis process.