
    Efficient edge filtering of directly-follows graphs for process mining

    Automated process discovery is a process mining operation that takes as input an event log of a business process and generates a diagrammatic representation of the process. In this setting, a common diagrammatic representation generated by commercial tools is the directly-follows graph (DFG). In some real-life scenarios, the DFG of an event log contains hundreds of edges, hindering its understandability. To overcome this shortcoming, process mining tools generally offer the possibility of filtering the edges in the DFG. We study the problem of efficiently filtering the DFG extracted from an event log while retaining the most frequent relations. We formalize this problem as an optimization problem, specifically, the problem of finding a sound spanning subgraph of a DFG with a minimal number of edges and a maximal sum of edge frequencies. We show that this problem is an instance of an NP-hard problem and outline several polynomial-time heuristics to compute approximate solutions. Finally, we report on an evaluation of the efficiency and optimality of the proposed heuristics using 13 real-life event logs.

    We thank Luciano García-Bañuelos for proposing the idea of combining the results of Chu-Liu-Edmonds’ algorithm to filter a DFG. We also thank Adriano Augusto for providing us with the implementation of the Split Miner filtering technique. This research was funded by the Spanish Ministry of Economy and Competitiveness (TIN2017-84796-C2-1-R) and the Galician Ministry of Education, Culture and Universities (ED431G/08). These grants are co-funded by the European Regional Development Fund (ERDF/FEDER program). D. Chapela-Campa is supported by the Spanish Ministry of Education, under the FPU national plan (FPU16/04428 and EST19/00135). This research is also funded by the Estonian Research Council (grant PRG1226).
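    As an illustration of the problem this abstract describes, a DFG can be built by counting directly-follows pairs in an event log, and a simple frequency-based filter can greedily keep the highest-frequency edges until the subgraph is "sound" in the minimal sense that every node is reachable from the start and can reach the end. This is a sketch of the problem setting, not one of the paper's heuristics; all function names are assumptions:

```python
from collections import Counter, defaultdict

def build_dfg(traces):
    """Count directly-follows relations, adding artificial start/end nodes."""
    dfg = Counter()
    for trace in traces:
        path = ["<start>"] + list(trace) + ["<end>"]
        for a, b in zip(path, path[1:]):
            dfg[(a, b)] += 1
    return dfg

def greedy_filter(dfg):
    """Greedy heuristic sketch: add edges in descending frequency order
    until every node is reachable from <start> and co-reachable to <end>."""
    nodes = {n for edge in dfg for n in edge}
    kept = []
    for edge, _freq in sorted(dfg.items(), key=lambda kv: -kv[1]):
        kept.append(edge)
        if is_sound(kept, nodes):
            break
    return kept

def is_sound(edges, nodes):
    """Minimal soundness check on the kept subgraph."""
    succ, pred = defaultdict(set), defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
        pred[b].add(a)
    return (reachable("<start>", succ) >= nodes and
            reachable("<end>", pred) >= nodes)

def reachable(src, adj):
    """Set of nodes reachable from src via the given adjacency map."""
    seen, stack = set(), [src]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(adj[n])
    return seen
```

    On a log with five traces ⟨a,b,c⟩ and one trace ⟨a,c⟩, the greedy filter drops the infrequent skip edge (a, c) while keeping the graph connected from start to end. Greedy selection by frequency alone does not guarantee a minimal or maximal-frequency subgraph, which is exactly why the paper studies the problem as NP-hard optimization.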

    Visualizing Business Process Deviance With Timeline Diagrams

    The thesis poses two main questions regarding the notion of “deviance mining” and proposes a new technique to visualise the differences between two event logs in terms of temporal dynamics in order to perform deviance mining. The objective of deviance mining is to pinpoint the origin of the problems and the deviance. Throughout the research of the related work, it is observed that most of the existing methods focus on the main structure of the process, which is the order of the tasks being executed. The new technique brings out the relative durations, i.e. temporal dynamics, of the activities in the normal and deviant traces by drawing a timeline diagram of the variants. Additionally, the proposed technique puts forward a set of different settings, such as the performance measure and the granularity level of the process, to customize the diagram. Lastly, a proof-of-concept tool abiding by the proposed approach is implemented and served on the web.
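    The temporal dynamics the technique visualises amount to comparing activity durations between a normal and a deviant log. A minimal sketch of that comparison (illustrative only; the event tuple format and function names are assumptions, not the thesis implementation) could be:

```python
from collections import defaultdict
from statistics import mean

def mean_durations(log):
    """Mean duration per activity over all traces in a log.
    Each trace is a list of (activity, start_time, end_time) events."""
    buckets = defaultdict(list)
    for trace in log:
        for activity, start, end in trace:
            buckets[activity].append(end - start)
    return {a: mean(ds) for a, ds in buckets.items()}

def duration_diff(normal_log, deviant_log):
    """How much longer (or shorter) each activity runs in the deviant log,
    on average, compared to the normal log."""
    normal = mean_durations(normal_log)
    deviant = mean_durations(deviant_log)
    return {a: deviant.get(a, 0) - normal.get(a, 0)
            for a in set(normal) | set(deviant)}
```

    A timeline diagram would then lay these per-activity durations out along a time axis per variant; the numeric comparison above is the data behind such a drawing.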

    Supporting ethnographic studies of ubiquitous computing in the wild

    Get PDF
    Ethnography has become a staple feature of IT research over the last twenty years, shaping our understanding of the social character of computing systems and informing their design in a wide variety of settings. The emergence of ubiquitous computing, however, raises new challenges for ethnography, distributing interaction across a burgeoning array of small, mobile devices and online environments which exploit invisible sensing systems. Understanding interaction requires ethnographers to reconcile interactions that are, for example, distributed across devices on the street with online interactions in order to assemble coherent understandings of the social character and purchase of ubiquitous computing systems. We draw upon four recent studies to show how ethnographers are replaying system recordings of interaction alongside existing resources such as video recordings to do this, and identify key challenges that need to be met to support ethnographic study of ubiquitous computing in the wild.

    Can recurrent neural networks learn process model structure?

    Various methods using machine and deep learning have been proposed to tackle different tasks in predictive process monitoring, such as forecasting, for an ongoing case, the most likely next event or suffix, its remaining time, or an outcome-related variable. Recurrent neural networks (RNNs), and more specifically long short-term memory networks (LSTMs), stand out in terms of popularity. In this work, we investigate the capability of such an LSTM to actually learn the underlying process model structure of an event log. We introduce an evaluation framework that combines variant-based resampling and custom metrics for fitness, precision and generalization. We evaluate four hypotheses concerning the learning capabilities of LSTMs, the effect of overfitting countermeasures, the level of incompleteness in the training set, and the level of parallelism in the underlying process model. We confirm that LSTMs can struggle to learn process model structure, even with simplistic process data and in a very lenient setup. Taking the correct anti-overfitting measures can alleviate the problem. However, these measures did not turn out to be optimal when hyperparameters were selected purely on prediction accuracy. We also found that decreasing the amount of information seen by the LSTM during training causes a sharp drop in generalization and precision scores. In our experiments, we could not identify a relationship between the extent of parallelism in the model and the generalization capability, but the results do indicate that the process' complexity might have an impact.
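    The variant-based resampling mentioned in the evaluation framework can be sketched in a few lines: traces are grouped by variant (their distinct activity sequence), and whole variants are held out for testing, so the model is evaluated on behaviour it never saw during training. The sketch below is illustrative; the function name and split policy are assumptions, not the authors' implementation:

```python
import random
from collections import defaultdict

def variant_split(log, test_fraction=0.25, seed=42):
    """Split an event log into train/test sets by trace variant, so that
    every variant (distinct activity sequence) appears on only one side."""
    variants = defaultdict(list)
    for trace in log:
        variants[tuple(trace)].append(trace)
    keys = sorted(variants)                      # deterministic base order
    random.Random(seed).shuffle(keys)            # reproducible shuffle
    n_test = max(1, int(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])
    train = [t for k in keys[n_test:] for t in variants[k]]
    test = [t for k in test_keys for t in variants[k]]
    return train, test
```

    Splitting by variant rather than by individual trace is what makes the generalization measurement meaningful: with a plain random trace split, frequent variants would leak into both sets and the model could score well by memorization alone.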