
    Towards Collaborative Scientific Workflow Management System

    The big data explosion has impacted several domains in recent years, from research areas to a diverse range of business models. As this intensive amount of data opens up the possibility of several interesting knowledge discoveries, over the past few years a diverse range of research domains have shifted towards analyzing these massive amounts of data. The Scientific Workflow Management System (SWfMS) has gained much popularity in recent years for accelerating such data-intensive analyses, visualization, and discovery of important information. Data-intensive tasks are often significantly time-consuming and complex in nature, and hence SWfMSs are designed to efficiently support the specification, modification, execution, failure handling, and monitoring of the tasks in a scientific workflow. As far as the complexity, dimension, and volume of data are concerned, their effective analysis or management often becomes challenging for an individual and instead requires the collaboration of multiple scientists. Hence, the notion of the 'Collaborative SWfMS' was coined, which has gained significant interest among researchers in recent years, as none of the existing SWfMSs directly supports real-time collaboration among scientists. In collaborative SWfMSs, consistency management in the face of conflicting concurrent operations by the collaborators is a major challenge, owing to the highly interconnected document structure among the computational modules, where even a minor change in one part of the workflow can heavily impact other parts through the datalink relations among them. In addition to consistency management, studies show several other challenges that must be addressed for a successful design of collaborative SWfMSs, such as sub-workflow composition and execution by different sub-groups, the relationship between scientific workflows and collaboration models, sub-workflow monitoring, and seamless integration and access control of the workflow components among collaborators. In this thesis, we propose a locking scheme to facilitate consistency management in collaborative SWfMSs. The proposed method works by locking workflow components at a granular attribute level, in addition to supporting locks on a targeted part of the collaborative workflow. We conducted several experiments to analyze the performance of the proposed method in comparison to related existing methods. Our studies show that the proposed method can reduce the average waiting time of a collaborator by up to 36% while increasing the average workflow update rate by up to 15% in comparison to existing descendant modular-level locking techniques for collaborative SWfMSs. We also propose a role-based access control technique for the management of collaborative SWfMSs. We leverage the Collaborative Interactive Application Methodology (CIAM) to investigate role-based access control in the context of collaborative SWfMSs. We present our proposed method with a use case from the Plant Phenotyping and Genotyping research domain. Recent studies show that collaborative SWfMSs often present different sets of opportunities and challenges. From our investigation of existing research towards collaborative SWfMSs and the findings of our prior two studies, we propose an architecture for collaborative SWfMSs. We propose SciWorCS, a Collaborative Scientific Workflow Management System, as a proof of concept of the proposed architecture; to the best of our knowledge, it is the first of its kind.
We present several real-world use cases of scientific workflows using SciWorCS. Finally, we conduct several user studies using SciWorCS, comprising different real-world scientific workflows (i.e., from myExperiment), to understand user behavior and styles of work in the context of collaborative SWfMSs. In addition to evaluating SciWorCS, the user studies reveal several interesting facts which can contribute significantly to the research domain, as none of the existing methods considered such empirical studies, relying instead only on computer-generated simulation studies for evaluation.
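    The thesis's exact locking protocol is not given in the abstract; the following is a minimal sketch of the general idea of attribute-level locking, in which two collaborators may hold locks on different attributes of the same workflow module. All names (AttributeLockManager, acquire, release) are hypothetical, not SciWorCS's actual API.

```python
import threading

class AttributeLockManager:
    """Grants locks per (module, attribute) instead of per whole module,
    so two collaborators can edit different attributes of the same module."""
    def __init__(self):
        self._locks = {}                # (module_id, attr) -> owning user
        self._guard = threading.Lock()  # protects the lock table itself

    def acquire(self, user, module_id, attr):
        with self._guard:
            key = (module_id, attr)
            owner = self._locks.get(key)
            if owner is None:
                self._locks[key] = user
                return True
            return owner == user        # re-entrant for the same user

    def release(self, user, module_id, attr):
        with self._guard:
            if self._locks.get((module_id, attr)) == user:
                del self._locks[(module_id, attr)]

# Two collaborators edit different attributes of the same module concurrently:
mgr = AttributeLockManager()
assert mgr.acquire("alice", "module-3", "parameters")
assert mgr.acquire("bob", "module-3", "datalinks")        # no conflict
assert not mgr.acquire("bob", "module-3", "parameters")   # alice holds it
```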

    Understanding Legacy Workflows through Runtime Trace Analysis

    When scientific software is written to specify processes, it takes the form of a workflow, and is often written in an ad-hoc manner in a dynamic programming language. There is a proliferation of legacy workflows implemented by non-expert programmers due to the accessibility of dynamic languages. Unfortunately, ad-hoc workflows lack the structured description provided by specialized management systems, making ad-hoc workflow maintenance and reuse difficult and motivating the need for analysis methods. The analysis of ad-hoc workflows using compiler techniques does not carry over to dynamic languages: a program has so few constraints that its behavior cannot be predicted statically. In contrast, workflow provenance tracking has had success using run-time techniques to record data. The aim of this work is to develop a new analysis method for extracting workflow structure at run-time, thus avoiding the issues posed by dynamic language features. The method captures the dataflow of an ad-hoc workflow through its execution and abstracts it with a process for simplifying repetition. An instrumentation system first processes the workflow to produce an instrumented version, capable of logging events, which is then executed on an input to produce a trace. The trace undergoes dataflow construction to produce a provenance graph. The dataflow is examined for equivalent regions, which are collected into a single unit. The workflow is thus characterized in terms of its treatment of an input. Unlike other methods, a run-time approach characterizes the workflow's actual behavior, including elements which static analysis cannot predict (for example, code dynamically evaluated based on input parameters). This also enables the characterization of dataflow through external tools. The contributions of this work are: a run-time method for recording a provenance graph from an ad-hoc Python workflow, and a method to analyze the structure of a workflow from provenance. The methods are implemented in Python and are demonstrated on real-world Python workflows. These contributions enable users to derive graph structure from workflows. Empowered by a graphical view, users can better understand a legacy workflow. This makes the wealth of legacy ad-hoc workflows accessible, enabling workflow reuse instead of investing time and resources into creating a new workflow.
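    As a rough illustration of the run-time idea (not the thesis's actual instrumentation system, which processes the workflow source before execution), the sketch below records a dataflow edge for every call to a wrapped workflow step and accumulates the edges into a provenance record. The decorator-based approach and all names here are assumptions.

```python
import functools

provenance = []  # list of (step_name, input_object_ids, output_object_id) edges

def traced(fn):
    """Wrap a workflow step so each execution logs a dataflow edge."""
    @functools.wraps(fn)
    def wrapper(*args):
        out = fn(*args)
        # Record an edge from each argument object to the result object.
        provenance.append((fn.__name__, [id(a) for a in args], id(out)))
        return out
    return wrapper

@traced
def load(path):
    return [1, 2, 3]           # stand-in for reading an input file

@traced
def normalize(xs):
    m = max(xs)
    return [x / m for x in xs]

data = normalize(load("input.txt"))
for step, inputs, output in provenance:
    print(step, inputs, "->", output)   # edges of the provenance graph
```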

    A Study of the color management implementation on the RGB-based digital imaging workflow: digital camera to RGB printers

    An RGB (red, green, and blue color information) workflow is used in digital photography today because many of the devices involved (digital cameras, scanners, monitors, image recorders (LVT, or Light Value Technology), and some types of printers) are based on RGB color information. In addition, rapidly growing new media such as the Internet and CD-ROM (Compact Disc-Read-Only Memory) publishing use an RGB-based monitor as the output device. Because color is device-dependent, each device has a different method of representing color information, and each has a different range of colors it can reproduce. Most of the time, the range of colors (the color gamut) that a device can produce is smaller than that of the original capturing device. As a result, a color image reproduction does not accurately match its original. Therefore, in typical color image reproduction, matching a color image reproduction with its original is a significant problem that operators must overcome to achieve good-quality color image reproduction. Generally, there are two approaches to overcoming this problem. The first is trial-and-error in a legacy-based system; this method is effective in a pair-wise working environment and depends heavily on a skilled operator. The second is an ICC-based (ICC, or International Color Consortium) color management system (CMS), which is more practical in a multiple-device working environment. Using the right method leads to higher efficiency in digital photography production. Therefore, the purpose of this thesis project is to verify that an ICC-based CMS with an RGB workflow has higher efficiency (better utilization of resources and capacity) than a legacy-based traditional color reproduction workflow. In this study, RGB workflows from digital cameras to RGB digital printers were used because of the increasing number of digital camera users and the advantages of using an RGB workflow in digital photography. There were two experimental image reproduction workflows: the legacy-based system and the ICC-based color management system. Both used the same raw RGB images, captured from digital cameras, as their input files. The color images were modified with two different color matching methods according to each workflow. Then, they were printed on two RGB digital printers. Twenty observers were asked to evaluate the picture quality as well as the reproduction quality. The results demonstrated that both workflows were able to produce reproductions of acceptable picture quality. In terms of reproduction quality, the reproductions of the ICC-based CMS workflow were of higher quality than those of the legacy-based workflow. In addition, when the time usage of the workflow was taken into account, the ICC-based CMS had higher efficiency than the legacy-based system. However, many times, image production jobs do not start with optimum-quality raw images as in this study; for example, they may be under- or over-exposed or have some defects. These images need some retouching work or fine adjustment to improve their quality. In these cases, the ICC-based CMS with skilled operators can be applied to these types of production in order to achieve a high-efficiency workflow.
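    A single ICC-based conversion step of the kind this workflow relies on can be sketched with Pillow's ImageCms module (a binding to littleCMS). The file names and the choice of sRGB as the camera profile are placeholders, not details from the study.

```python
from PIL import Image, ImageCms

img = Image.open("camera_shot.jpg")               # raw RGB capture (placeholder file)
src = ImageCms.createProfile("sRGB")              # assume camera output is sRGB
dst = ImageCms.getOpenProfile("rgb_printer.icc")  # hypothetical printer profile

# Map source colors into the printer's gamut; the perceptual intent compresses
# out-of-gamut colors smoothly rather than clipping them.
converted = ImageCms.profileToProfile(
    img, src, dst,
    renderingIntent=ImageCms.INTENT_PERCEPTUAL,
    outputMode="RGB")
converted.save("print_ready.tif")
```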

    Identifying non-value added waste that delay emergency CT brain workflow using lean management principles

    Introduction: The Department of Radiology at Groote Schuur Hospital receives numerous emergency CT brain requests, especially from the Emergency and Trauma departments. Improvement in emergency CT brain workflow should reduce waiting times for CT scans, resulting in earlier diagnosis and treatment of these patients. The non-value-added waste (NVAW) in the CT brain workflow (steps regarded as wasteful to the customer) can be identified by use of a lean management tool, namely a value stream map (VSM: a flow analysis of the information required to provide service to the customer). Aim: The study aims to identify non-value-added waste in the CT brain workflow value stream map which may result in a delay in emergency CT brain reporting. Method: This study investigated NVAW in emergency CT brain workflow over 5 working days (Monday to Friday), between 08h00 and 22h00. Nineteen patients booked for an emergency CT brain scan by the Emergency Department (ED) between 08h00 and 22h00 over the specific 5-day working period were selected using convenience sampling. The indications for emergency CT brain scans in the sample were similar to those of the wider group of patients undergoing emergency CT brain scans. A VSM identifying all the relevant steps in the emergency CT brain workflow was constructed. The investigator accompanied each of the nineteen patients from the ED to the CT scanner and back, and manually recorded the time elapsed in minutes for each separate step on the data collection sheet. The outstanding information required was obtained from the Xiris system on the Philips PACS (Picture Archiving and Communication System). The average time interval for each of the steps indicated on the VSM was calculated, and the rate-limiting step(s) resulting in a delay in emergency CT brain reporting was identified. Results: Overall, the longest step was the time interval from completion of the scan to generation of the report (turnaround time, TAT), with an average of 72.21 minutes (p < 0.01). Conversely, the time interval from placement of the request by the clinician on the PACS to annotation by the radiologist was the shortest, with an average of 5.84 minutes. Discussion: The lean management system was used to identify the rate-limiting step(s) resulting in a delay in emergency CT brain reporting. Possible reasons identified for the delay caused by the rate-limiting step include the backlog in reporting of the large number of already-scanned cases, which may be due to staff constraints, as only one radiologist was on duty during most of the study period. Additional contributory factors include interruptions of radiology registrars by clinicians' telephonic queries during reporting sessions, and delay by the emergency doctor in authorising and facilitating transport of the patient from the emergency unit to the CT scanner. Conclusion: The value stream map tool in lean management can be utilised to identify non-value-added waste in emergency CT brain workflow.
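    The timing analysis underlying the value stream map reduces to averaging each step's duration across patients and flagging the longest step. A minimal sketch follows, with invented step names and timings rather than the study's data.

```python
from collections import defaultdict
from statistics import mean

# (patient, step, minutes) observations from a data collection sheet;
# values here are illustrative only.
observations = [
    (1, "request_to_annotation", 6.0),
    (1, "scan_to_report", 70.0),
    (2, "request_to_annotation", 5.5),
    (2, "scan_to_report", 74.5),
]

# Group durations by workflow step, then average per step.
by_step = defaultdict(list)
for _, step, minutes in observations:
    by_step[step].append(minutes)

averages = {step: mean(ts) for step, ts in by_step.items()}
bottleneck = max(averages, key=averages.get)   # the rate-limiting step
print(averages, "rate-limiting step:", bottleneck)
```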

    Performance analysis and optimization for workflow authorization

    Many workflow management systems have been developed to enhance the performance of workflow executions. The authorization policies deployed in a system may restrict task executions. Common authorization constraints include role constraints, Separation of Duty (SoD), Binding of Duty (BoD) and temporal constraints. This paper presents methods to check the feasibility of these constraints, and also determines the time durations during which the temporal constraints will not impose a negative impact on performance. Further, this paper presents an optimal authorization method, which is optimal in the sense that it minimizes a workflow's delay caused by the temporal constraints. The authorization analysis methods are also extended to analyze stochastic workflows, in which the tasks' execution times are not known exactly but follow certain probability distributions. Simulation experiments have been conducted to verify the effectiveness of the proposed authorization methods. The experimental results show that, compared with the intuitive authorization method, the optimal authorization method can reduce the delay caused by the authorization constraints and consequently reduce the workflows' response time.
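    As an illustration of a temporal-constraint feasibility check (the paper's exact formulation is not reproduced here), the sketch below computes the earliest start time at which a task fits entirely inside an authorization interval. The interval model and the feasibility rule are simplifying assumptions.

```python
def authorized_start(ready, duration, auth_intervals):
    """Return the earliest start >= ready such that [start, start + duration]
    lies inside one authorization interval, or None if infeasible."""
    for lo, hi in sorted(auth_intervals):
        start = max(ready, lo)
        if start + duration <= hi:
            return start
    return None

# A task becomes ready at t=10 and needs 5 time units; an authorized role
# is available during [0, 12] and [20, 40].
start = authorized_start(10, 5, [(0, 12), (20, 40)])
print(start)  # 20 -> the temporal constraint delays the task by 10 units
```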

    A framework for selecting workflow tools in the context of composite information systems

    When an organization faces the need to integrate workflow-related activities into its information system, it becomes necessary to have at hand a well-defined informational model to be used as a framework for determining the selection criteria onto which the requirements of the organization can be mapped. Some proposals exist that provide such a framework, remarkably the WfMC reference model, but they are designed to be applicable when workflow tools are selected independently from other software, departing from a set of well-known requirements. Often this is not the case: workflow facilities are needed as part of the procurement of a larger, composite information system, and therefore the general goals of the system have to be analyzed, assigned to its individual components and further detailed. We propose in this paper the MULTSEC method, which analyzes the initial goals of the system, determines the types of components that form the system architecture, builds quality models for each type and then maps the goals into detailed requirements which can be measured using quality criteria. We develop in some detail the quality model (compliant with the ISO/IEC 9126-1 quality standard) for the workflow type of tools; we show how the quality model can be used to refine and clarify the requirements in order to guarantee a highly reliable selection result; and we use it to evaluate two particular workflow solutions available in the market (kept anonymous in the paper). We develop our proposal using a particular selection experience we have recently been involved in, namely the procurement of a document management subsystem to be integrated into an academic data management information system for our university.
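    The mapping from goals to measurable selection criteria can be pictured as weighted scoring over a quality model's top-level characteristics. The sketch below borrows characteristic names from ISO/IEC 9126-1, but the weights, scores, and scoring rule are invented for illustration and are not MULTSEC's actual procedure.

```python
# Organizational priorities expressed as weights over quality characteristics
# (hypothetical values; weights sum to 1.0).
weights = {"functionality": 0.35, "reliability": 0.25,
           "usability": 0.20, "efficiency": 0.20}

# Evaluator scores (0-10) for two anonymous candidate workflow tools.
scores = {
    "tool_A": {"functionality": 8, "reliability": 6, "usability": 7, "efficiency": 5},
    "tool_B": {"functionality": 6, "reliability": 8, "usability": 6, "efficiency": 8},
}

# Weighted sum turns the quality model into a comparable selection criterion.
for tool, s in scores.items():
    total = sum(weights[c] * s[c] for c in weights)
    print(tool, round(total, 2))
```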

    Mining Event Logs to Support Workflow Resource Allocation

    Workflow technology is widely used to facilitate business processes in enterprise information systems (EIS), and it has the potential to reduce design time, enhance product quality and decrease product cost. However, significant limitations still exist: resource allocation, an important task in the context of workflow, is still largely performed manually, which is time-consuming. This paper presents a data mining approach to address the resource allocation problem (RAP) and improve the productivity of workflow resource management. Specifically, an Apriori-like algorithm is used to find the frequent patterns in the event log, and association rules are generated according to predefined resource allocation constraints. Subsequently, a correlation measure named lift is utilized to annotate the negatively correlated resource allocation rules for resource reservation. Finally, the rules are ranked using the confidence measure as resource allocation rules. Comparative experiments are performed using C4.5, SVM, ID3, Naïve Bayes and the presented approach, and the results show that the presented approach is effective in both accuracy and candidate resource recommendations.
    Comment: T. Liu et al., Mining event logs to support workflow resource allocation, Knowl. Based Syst. (2012), http://dx.doi.org/10.1016/j.knosys.2012.05.01
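    The rule measures named in the abstract (support, confidence, and lift) can be sketched directly. The event-log rows and the task/resource encoding below are illustrative assumptions, not the paper's dataset or its Apriori-style candidate generation.

```python
# Each log entry is the set of a task's attributes plus the resource that
# executed it; rows here are invented for illustration.
log = [
    {"task=review", "urgent", "resource=alice"},
    {"task=review", "urgent", "resource=alice"},
    {"task=review", "resource=bob"},
    {"task=approve", "resource=bob"},
]

def support(log, items):
    """Fraction of log entries containing every item in `items`."""
    return sum(items <= ev for ev in log) / len(log)

def confidence(log, lhs, rhs):
    return support(log, lhs | rhs) / support(log, lhs)

def lift(log, lhs, rhs):
    # lift > 1: positively correlated; lift < 1: negatively correlated,
    # which the paper's approach annotates for resource reservation.
    return confidence(log, lhs, rhs) / support(log, rhs)

rule = ({"task=review", "urgent"}, {"resource=alice"})
print(confidence(log, *rule))  # 1.0 -> rank highly as an allocation rule
print(lift(log, *rule))        # 2.0 -> positively correlated, keep the rule
```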

    An Approach for Supporting Ad-hoc Modifications in Distributed Workflow Management Systems

    Supporting enterprise-wide or even cross-organizational business processes is a characteristic challenge for any workflow management system (WfMS). Scalability in the presence of high loads, as well as the capability to dynamically modify running workflow (WF) instances (e.g., to cope with exceptional situations), are essential requirements in this context. If the latter, in particular, is not met, the WfMS will not have the flexibility necessary to cover the wide range of process-oriented applications deployed in many organizations. Scalability and flexibility have, for the most part, been treated separately in the relevant literature thus far. Even though both are basic needs for a WfMS, the requirements related to them are totally different. To achieve satisfactory scalability, on the one hand, the system needs to be designed such that a workflow instance can be controlled by several WF servers that are as independent from each other as possible. Yet dynamic WF modifications, on the other hand, necessitate a (logically) central control instance which knows the current, global state of a WF instance. For the first time, this paper presents methods which allow ad-hoc modifications (e.g., to insert, delete, or shift steps) to be performed in a distributed WfMS, i.e., in a WfMS with partitioned WF execution graphs and distributed WF control. It is especially noteworthy that the system succeeds in realizing the full functionality of the central case while, at the same time, achieving extremely favorable behavior with respect to communication costs.
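    The tension the paper describes can be sketched as follows: before applying an ad-hoc change, a coordinator must first assemble a consistent global instance state from the servers controlling each partition. The state model and the compatibility rule below are simplified assumptions, not the paper's actual protocol.

```python
def collect_global_state(partitions):
    """Ask each WF server for its partition's local step states; only then can
    a coordinator decide whether a modification is state-compatible."""
    return {step: state for p in partitions for step, state in p.items()}

def can_insert_after(global_state, anchor_step):
    # Simplified rule: inserting after a step is only safe while that step
    # has not yet completed (its outputs are not yet consumed downstream).
    return global_state.get(anchor_step) in ("ACTIVATED", "RUNNING")

# Two servers each control one partition of the same WF instance.
server1 = {"collect_sample": "COMPLETED", "analyze": "RUNNING"}
server2 = {"report": "ACTIVATED"}

state = collect_global_state([server1, server2])
print(can_insert_after(state, "analyze"))         # True: safe insertion point
print(can_insert_after(state, "collect_sample"))  # False: already completed
```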