    Automated Process Discovery: A Literature Review and a Comparative Evaluation with Domain Experts

    Process mining methods allow analysts to use logs of historical executions of business processes to gain knowledge about the actual performance of these processes. One of the most widely studied process mining operations is automated process discovery. An automated process discovery method takes an event log as input and produces a business process model as output that captures the control-flow relations between the tasks described by the event log. Several automated process discovery methods have been proposed in the past two decades, striking different tradeoffs between scalability, accuracy and complexity of the resulting models. So far, automated process discovery methods have been evaluated in an ad hoc manner, with different authors employing different datasets, experimental setups, evaluation measures and baselines, often leading to incomparable conclusions and sometimes unreproducible results due to the use of non-publicly available datasets. In this setting, this thesis provides a systematic review of automated process discovery methods and a systematic comparative evaluation of existing implementations of these methods, conducted with domain experts, using a real-life event log extracted from an international software engineering company and four quality metrics. The review and evaluation results highlight gaps and unexplored tradeoffs in the field in the context of four business process model quality metrics. The results of this master's thesis allow researchers to address the shortcomings of automated process discovery methods and also answer the question of how usable process discovery techniques are in industry.
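
    As a rough illustration of the input/output relation described in this abstract, the sketch below derives a directly-follows relation from a toy event log. This is a common starting point for discovery methods in general, not the specific methods evaluated in the thesis; the function name `directly_follows` and the toy log are assumptions made for the example.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b.

    `log` is a list of traces; each trace is a list of activity labels.
    (Hypothetical helper for illustration only.)
    """
    counts = Counter()
    for trace in log:
        # Pair each event with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Toy event log: three cases of an invented process.
log = [
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "b", "c"],
]
dfg = directly_follows(log)
```

    The resulting counts can be read as the weighted edges of a simple control-flow graph; real discovery methods add filtering and model construction on top of such relations.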

    On the discovery of declarative control flows for artful processes

    Artful processes are those processes in which the experience, intuition, and knowledge of the actors are the key factors in determining the decision making. They are typically carried out by "knowledge workers," such as professors, managers, and researchers. They are often scarcely formalized or completely unknown a priori. Throughout this article, we discuss how we addressed the challenge of discovering declarative control flows in the context of artful processes. To this end, we devised and implemented a two-phase algorithm, named MINERful. The first phase builds a knowledge base, where statistical information extracted from logs is represented. During the second phase, queries are evaluated on that knowledge base in order to infer the constraints that constitute the discovered process. After outlining the overall approach and offering insight into the adopted process modeling language, we describe our discovery technique in detail. Thereupon, we analyze its performance, from both a theoretical and an experimental perspective. A user-driven evaluation of the quality of results is also reported on the basis of a real case study. Finally, a study on the fitness of discovered models with respect to synthetic and real logs is presented.
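
    The two-phase idea described above (first build a knowledge base of log statistics, then query it for constraints) can be sketched as follows. This is a minimal illustration under stated assumptions, not the MINERful implementation: the function names, the single Response(a, b) constraint template, and the support threshold are all choices made for the example.

```python
from collections import Counter
from itertools import product


def build_knowledge_base(log):
    """Phase 1: collect per-activity and per-pair statistics from the log."""
    occurrences = Counter()  # how often each activity occurs
    followed_by = Counter()  # (a, b) -> occurrences of a with some later b in the trace
    for trace in log:
        for i, a in enumerate(trace):
            occurrences[a] += 1
            for b in set(trace[i + 1:]):
                followed_by[(a, b)] += 1
    return occurrences, followed_by


def query_response_constraints(kb, threshold=1.0):
    """Phase 2: query the knowledge base for Response(a, b) constraints.

    Response(a, b) is kept when the share of occurrences of a that are
    eventually followed by b reaches the (assumed) support threshold.
    """
    occurrences, followed_by = kb
    constraints = {}
    for a, b in product(occurrences, repeat=2):
        if a == b:
            continue
        support = followed_by[(a, b)] / occurrences[a]
        if support >= threshold:
            constraints[(a, b)] = support
    return constraints


# Toy log with invented activity names.
log = [
    ["draft", "review", "submit"],
    ["draft", "submit"],
    ["draft", "review", "review", "submit"],
]
kb = build_knowledge_base(log)
constraints = query_response_constraints(kb)
relaxed = query_response_constraints(kb, threshold=0.6)
```

    Separating the two phases means the log is scanned only once; different thresholds or constraint templates can then be evaluated against the same knowledge base.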

    Cooperation between expert knowledge and data mining discovered knowledge: Lessons learned

    Expert systems are built from knowledge traditionally elicited from the human expert. It is precisely knowledge elicitation from the expert that is the bottleneck in expert system construction. On the other hand, a data mining system, which automatically extracts knowledge, needs expert guidance on the successive decisions to be made in each of the system phases. In this context, expert knowledge and data mining discovered knowledge can cooperate, maximizing their individual capabilities: data mining discovered knowledge can be used as a complementary source of knowledge for the expert system, whereas expert knowledge can be used to guide the data mining process. This article summarizes different examples of systems where there is cooperation between expert knowledge and data mining discovered knowledge and reports our experience of such cooperation gathered from a medical diagnosis project called Intelligent Interpretation of Isokinetics Data, which we developed. From that experience, a series of lessons were learned throughout project development. Some of these lessons are generally applicable and others pertain exclusively to certain project types.

    Conformance Checking of Mixed-paradigm Process Models

    Mixed-paradigm process models integrate strengths of procedural and declarative representations like Petri nets and Declare. They are specifically interesting for process mining because they allow capturing complex behaviour in a compact way. A key research challenge for the proliferation of mixed-paradigm models for process mining is the lack of corresponding conformance checking techniques. In this paper, we address this problem by devising the first approach that works with intertwined state spaces of mixed-paradigm models. More specifically, our approach uses an alignment-based replay to explore the state space and compute trace fitness in a procedural way. In every state, the declarative constraints are separately updated, such that violations disable the corresponding activities. Our technique provides for an efficient replay towards an optimal alignment by respecting all orthogonal Declare constraints. We have implemented our technique in ProM and demonstrate its performance in an evaluation with real-world event logs. Comment: Accepted for publication in Information System
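
    The interplay described above, where declarative constraints are updated in each state and disable activities that would violate them, can be illustrated with a much simpler greedy replay. This sketch is not the alignment-based technique of the paper and does not compute an optimal alignment; the transition-system encoding, the single Precedence(a, b) constraint type, and all names are assumptions made for the example.

```python
def enabled_activities(state, seen, transitions, precedence):
    """Activities allowed by the procedural part and not disabled by a constraint."""
    procedural = set(transitions.get(state, {}))
    # Precedence(a, b): b stays disabled until a has occurred.
    disabled = {b for a, b in precedence if a not in seen}
    return procedural - disabled


def replay(trace, transitions, precedence, initial="s0"):
    """Greedy replay; returns the fraction of events that could be replayed."""
    if not trace:
        return 1.0
    state, seen, replayed = initial, set(), 0
    for activity in trace:
        if activity in enabled_activities(state, seen, transitions, precedence):
            state = transitions[state][activity]
            seen.add(activity)
            replayed += 1
        # Otherwise the event is skipped and the current state is kept.
    return replayed / len(trace)


# Procedural part: a small transition system (state -> activity -> next state).
transitions = {
    "s0": {"register": "s1"},
    "s1": {"check": "s1", "approve": "s2"},
    "s2": {"pay": "s3"},
}
# Declarative part: "approve" may only happen after "check".
precedence = [("check", "approve")]

fit_ok = replay(["register", "check", "approve", "pay"], transitions, precedence)
fit_bad = replay(["register", "approve", "pay"], transitions, precedence)
```

    In the second trace, "approve" is procedurally possible in state s1 but is disabled by the precedence constraint, so the replay cannot advance and the fitness drops.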

    Towards an Entropy-based Analysis of Log Variability

    Rules, decisions, and workflows are intertwined components depicting the overall process. So far, imperative workflow modelling languages have played the major role in the description and analysis of business processes. Despite their undoubted efficacy in representing sequential executions, they hide circumstantial information leading to the enactment of activities, and obscure the rationale behind the verification of requirements, dependencies, and goals. This workshop aimed at providing a platform for the discussion and introduction of new ideas related to the development of a holistic approach that encompasses all those aspects. The objective was to extend the reach of the business process management audience towards the decisions and rules community and increase the integration between different imperative, declarative and hybrid modelling perspectives. Out of the high-quality submitted manuscripts, three papers were accepted for publication, with an acceptance rate of 50%. They helped foster a fruitful discussion among the participants about the respective impact and the interplay of the decision perspective and the process perspective.

    Leveraging Multi-Perspective A priori Knowledge in Predictive Business Process Monitoring

    Predictive business process monitoring is an area dedicated to exploiting past process execution data in order to predict the future unfolding of a currently executed business process instance. Most of the research done in this domain focuses on exploiting the past process execution data only, neglecting additional a priori knowledge that might become available at runtime. Recently, an approach was proposed that allows a priori knowledge on the control flow to be leveraged in the form of LTL rules. However, cases exist in which more granular a priori knowledge becomes available about perspectives that go beyond the pure control flow, such as data, time and resources (multi-perspective a priori knowledge). In this thesis, we propose a technique that makes it possible to leverage multi-perspective a priori knowledge when making predictions of complex sequences, i.e., sequences of events with a subset of the data attributes attached to them. The results, obtained by applying the proposed technique to 20 synthetic logs and 1 real-life log, show that the proposed technique is able to outperform state-of-the-art approaches by successfully leveraging multi-perspective a priori knowledge.