35 research outputs found

    A Framework for Online Conformance Checking

    Conformance checking – a branch of process mining – focuses on establishing to what extent actual executions of a process are in line with the expected behavior of a reference model. Current conformance checking techniques only allow for a-posteriori analysis: the amount of (non-)conformant behavior is quantified after the completion of the process instance. In this paper we propose a framework for online conformance checking: not only do we quantify (non-)conformant behavior as the execution is running, we also restrict the computation to constant time complexity per event analyzed, thus enabling the online analysis of a stream of events. The framework is instantiated with ideas coming from the theory of regions, and state similarity. An implementation is available in ProM and promising results have been obtained. Peer reviewed. Postprint (author's final draft).

    Deletion of Genes Implicated in Protecting the Integrity of Male Germ Cells Has Differential Effects on the Incidence of DNA Breaks and Germ Cell Loss

    Infertility affects approximately 20% of couples in Europe and in 50% of cases the problem lies with the male partner. The impact of damaged DNA originating in the male germ line on infertility is poorly understood but may increase miscarriage. Mouse models allow us to investigate how deficiencies in DNA repair/damage response pathways impact on formation and function of male germ cells. We have investigated mice with deletions of ERCC1 (excision repair cross-complementing gene 1), MSH2 (MutS homolog 2, involved in mismatch repair pathway), and p53 (tumour suppressor gene implicated in elimination of germ cells with DNA damage). We demonstrate for the first time that depletion of ERCC1 or p53 from germ cells results in an increased incidence of unrepaired DNA breaks in pachytene spermatocytes and increased numbers of caspase-3 positive (apoptotic) germ cells. Sertoli cell-only tubules were detected in testes from mice lacking expression of ERCC1 or MSH2 but not p53. The number of sperm recovered from epididymes was significantly reduced in mice lacking testicular ERCC1 and 40% of sperm contained DNA breaks, whereas the numbers of sperm were not different to controls in adult Msh2 -/- or p53 -/- mice, nor did they have significantly compromised DNA. These data have demonstrated that deletion of Ercc1, Msh2 and p53 can have differential but overlapping effects on germ cell function and sperm production. These findings increase our understanding of the ways in which gene mutations can have an impact on male fertility.

    Process mining with streaming data

    On the application of sequential pattern mining primitives to process discovery:overview, outlook and opportunity identification

    Sequential pattern mining (SPM) is a well-studied theme in data mining, in which one aims to discover common sequences of item sets in a large corpus of temporal itemset data. Due to the sequential nature of data streams, supporting SPM in streaming environments is commonly studied in the area of data stream mining as well. On the other hand, stream-based process discovery (PD), originating from the field of process mining, focusses on learning process models on the basis of online event data. In particular, the main goal of the models discovered is to describe the underlying generating process in an end-to-end fashion. As both SPM and PD use data that are comparable in nature, that is, both involve time-stamped instances, one expects that techniques from the SPM domain are (partly) transferable to the PD domain. However, thus far, little work has been done in the intersection of the two fields. In this focus article, we therefore study the possible application of SPM techniques in the context of PD. We provide an overview of the two fields, covering their commonalities and differences, highlight the challenges of applying them, and present an outlook and several avenues for future work. This article is categorized under: Algorithmic Development > Spatial and Temporal Data Mining; Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Fundamental Concepts of Data and Knowledge > Big Data Mining.

    Event stream-based process discovery using abstract representations

    The aim of process discovery, originating from the area of process mining, is to discover a process model based on business process execution data. A majority of process discovery techniques rely on an event log as an input. An event log is a static source of historical data capturing the execution of a business process. In this paper, we focus on process discovery relying on online streams of business process execution events. Learning process models from event streams poses both challenges and opportunities, i.e. we need to handle unlimited amounts of data using finite memory and, preferably, constant time. We propose a generic architecture that allows for adopting several classes of existing process discovery techniques in the context of event streams. Moreover, we provide several instantiations of the architecture, accompanied by implementations in the process mining toolkit ProM (http://promtools.org). Using these instantiations, we evaluate several dimensions of stream-based process discovery. The evaluation shows that the proposed architecture allows us to lift process discovery to the streaming domain.
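
The finite-memory, constant-time-per-event constraint mentioned in this abstract can be illustrated with a minimal sketch. It is not the architecture from the paper: the class name `StreamingDirectlyFollows`, the `max_cases` parameter, and the choice of directly-follows counts as the maintained summary are all assumptions made for illustration. The sketch remembers only the last activity of a bounded number of cases and evicts the least recently active case when the bound is exceeded.

```python
from collections import Counter, OrderedDict


class StreamingDirectlyFollows:
    """Hypothetical sketch: directly-follows counts over an event stream.

    Memory stays bounded because at most `max_cases` cases are remembered,
    and the counter grows only with the number of distinct activity pairs
    (finite if the activity alphabet is finite).
    """

    def __init__(self, max_cases=1000):
        self.max_cases = max_cases
        self.last_activity = OrderedDict()  # case id -> last activity seen
        self.counts = Counter()             # (a, b) -> times b directly followed a

    def observe(self, case_id, activity):
        # Look up and remove the previous activity of this case, if remembered.
        previous = self.last_activity.pop(case_id, None)
        if previous is not None:
            self.counts[(previous, activity)] += 1
        # Re-insert the case so it becomes the most recently active one.
        self.last_activity[case_id] = activity
        # Evict the least recently active case once the bound is exceeded.
        if len(self.last_activity) > self.max_cases:
            self.last_activity.popitem(last=False)


sdf = StreamingDirectlyFollows(max_cases=2)
stream = [("c1", "a"), ("c2", "a"), ("c1", "b"), ("c3", "a"), ("c2", "b")]
for case_id, activity in stream:
    sdf.observe(case_id, activity)
```

Each call to `observe` performs a constant number of dictionary operations. The cost of the bound is visible in the example stream: case `c2` is evicted before its second event arrives, so its `a` to `b` step is not counted.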

    Online discovery of cooperative structures in business processes

    Process mining is a data-driven technique aiming to provide novel insights and help organizations to improve their business processes. In this paper, we focus on the cooperative aspect of process mining, i.e., discovering networks of cooperating resources that together perform processes. We use online streams of events as an input rather than event logs, which are typically used in an off-line setting. We present the Online Cooperative Network (OCN) framework, which defines online cooperative resource network discovery in a generic way. A prototypical implementation of the framework is available in the open source process mining toolkit ProM. By means of an empirical evaluation we show the applicability of the framework in the streaming domain. The techniques presented operate in a real-time fashion and are able to handle unlimited amounts of data. Moreover, the implementation allows visualizing network dynamics, which helps in gaining insights into changes in the execution of the underlying business process.

    Avoiding over-fitting in ILP-based process discovery

    The aim of process discovery is to discover a process model based on business process execution data, recorded in an event log. One of several existing process discovery techniques is the ILP-based process discovery algorithm. The algorithm is able to unravel complex process structures and provides formal guarantees w.r.t. the model discovered, e.g., the algorithm guarantees that a discovered model describes all behavior present in the event log. Unfortunately, the algorithm is unable to cope with exceptional behavior present in event logs. As a result, the application of ILP-based process discovery techniques in everyday process discovery practice is limited. This paper addresses this problem by proposing a filtering technique tailored towards ILP-based process discovery. The technique helps to produce process models that are less over-fitting w.r.t. the event log, more understandable, and more adequate in capturing the dominant behavior present in the event log. The technique is implemented in the ProM framework. Keywords: Process mining, Process discovery, Integer linear programming, Filtering

    Repairing outlier behaviour in event logs

    One of the main challenges in applying process mining on real event data is the presence of noise and rare behaviour. Applying process mining algorithms directly on raw event data typically results in complex, incomprehensible, and, in some cases, even inaccurate analyses. As a result, correct and/or important behaviour may be concealed. In this paper, we propose an event data repair method that tries to detect and repair outlier behaviour within the given event data. We propose a probabilistic method that is based on the occurrence frequency of activities in specific contexts. Our approach allows for removal of infrequent behaviour, which enables us to obtain a more global view of the process. The proposed method has been implemented in both the ProM and the RapidProM framework. Using these implementations, we conduct a collection of experiments that show that we are able to detect and modify most types of outlier behaviour in the event data. Our evaluation clearly demonstrates that we are able to help to improve process mining discovery results by repairing event logs upfront.

    Improving process discovery results by filtering outliers using conditional behavioural probabilities

    Process discovery, one of the key challenges in process mining, aims at discovering process models from process execution data stored in event logs. Most discovery algorithms assume that all data in an event log conform to correct execution of the process, and hence, incorporate all behaviour in their resulting process model. However, in real event logs, noise and irrelevant infrequent behaviour are often present. Incorporating such behaviour results in complex, incomprehensible process models concealing the correct and/or relevant behaviour of the underlying process. In this paper, we propose a novel general purpose filtering method that exploits observed conditional probabilities between sequences of activities. The method has been implemented in both the ProM toolkit and the RapidProM framework. We evaluate our approach using real and synthetic event data. The results show that the proposed method accurately removes irrelevant behaviour and, indeed, improves process discovery results.
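
The idea of filtering on conditional probabilities between activities can be illustrated with a much simplified sketch. It is not the method from the paper: it conditions only on the single preceding activity rather than on sequences of activities, and the function name `filter_traces`, the `threshold` parameter, and the toy log are assumptions made for illustration. A trace is dropped when any of its directly-follows steps has an estimated probability P(next | current) below the threshold.

```python
from collections import Counter


def filter_traces(log, threshold=0.2):
    """Hypothetical sketch: drop traces that contain an improbable step.

    P(b | a) is estimated as count(a directly followed by b) divided by
    count(a directly followed by anything), over the whole log.
    """
    pair_counts = Counter()    # (a, b) -> how often b directly follows a
    source_counts = Counter()  # a -> how often a is followed by any activity
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            pair_counts[(a, b)] += 1
            source_counts[a] += 1

    def probability(a, b):
        return pair_counts[(a, b)] / source_counts[a]

    # Keep a trace only if every one of its steps is sufficiently probable.
    return [
        trace
        for trace in log
        if all(probability(a, b) >= threshold for a, b in zip(trace, trace[1:]))
    ]


# Toy log: nine regular traces and one trace with a rare "a" -> "c" step.
log = [["a", "b", "c"]] * 9 + [["a", "c", "b"]]
filtered = filter_traces(log, threshold=0.2)
```

In the toy log, "c" follows "a" in one of ten cases, so the estimated probability is 0.1 and the deviating trace is removed at a threshold of 0.2. Lowering the threshold below 0.1 keeps all ten traces.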