
    Advanced Methods in Business Process Deviance Mining

    Business process deviance refers to the phenomenon whereby a subset of the executions of a business process deviate, in a negative or positive way, with respect to its expected or desirable outcomes. Deviant executions of a business process include those that violate compliance rules, or executions that undershoot or exceed performance targets. Deviance mining is concerned with uncovering the reasons for deviant executions by analyzing business process event logs. In this thesis, the problem of explaining deviations in business processes is first investigated using features based on sequential and declarative patterns, and combinations thereof. The explanations are further improved by leveraging the data payload of events and traces in event logs, through features based on pure data attribute values and data-aware Declare constraints. The explanations characterizing the deviances are then extracted by direct and indirect methods for rule induction. Using synthetic and real-life logs from multiple domains, a range of feature types and different forms of decision rules are evaluated in terms of their ability to accurately discriminate between non-deviant and deviant executions of a process, as well as in terms of the final outcome returned to the users.
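    To make the feature-based approach concrete, here is a minimal illustrative sketch (not the thesis's actual method): it derives binary features from traces — individual activities and directly-follows pairs, a crude stand-in for sequential patterns — and induces a single discriminating rule. All names are hypothetical.

```python
def extract_features(trace):
    """Binary features: individual activities plus directly-follows pairs
    (a simplistic stand-in for sequential pattern features)."""
    feats = set(trace)
    feats.update(f"{a}->{b}" for a, b in zip(trace, trace[1:]))
    return feats

def best_rule(traces, labels):
    """One-rule induction: pick the feature whose presence best separates
    deviant (label 1) from non-deviant (label 0) traces."""
    featsets = [extract_features(t) for t in traces]
    vocab = set().union(*featsets)

    def accuracy(f):
        preds = [1 if f in fs else 0 for fs in featsets]
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)

    return max(vocab, key=accuracy)
```

    In a real deviance mining pipeline this single rule would be replaced by a full rule-induction or classification algorithm, but the feature-construction step follows the same pattern.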

    Degree Spectra, and Relative Acceptability of Notations


    On Two Different Kinds of Computational Indeterminacy

    It is often indeterminate what function a given computational system computes. This phenomenon has been referred to as “computational indeterminacy” or “multiplicity of computations”. In this paper, we argue that what has typically been considered and referred to as the (unique) challenge of computational indeterminacy in fact subsumes two distinct phenomena, which are typically bundled together and should be teased apart. One kind of indeterminacy concerns a functional (or formal) characterization of the system’s relevant behavior (briefly: how its physical states are grouped together and mapped to abstract states). Another kind concerns the manner in which the abstract (or computational) states are interpreted (briefly: what function the system computes). We discuss the similarities and differences between the two kinds of computational indeterminacy, their implications for certain accounts of “computational individuation” in the literature, and their relevance to different levels of description within the computational system. We also examine the interrelationships between our proposed accounts of the two kinds of indeterminacy and the main accounts of “computational implementation”.

    Visual encoding quality and scalability in information visualization

    Information visualization seeks to amplify cognition through interactive visual representations of data. It comprises human processes, such as perception and cognition, and computer processes, such as visual encoding. Visual encoding consists in mapping data variables to visual variables, and its quality is critical to the effectiveness of information visualizations. The scalability of a visual encoding is the extent to which its quality is preserved as the parameters of the data grow. Scalable encodings offer good support for basic analytical tasks at scale by carrying design decisions that consider the limits of human perception and cognition. In this thesis, I present three case studies that explore different aspects of visual encoding quality and scalability: information loss, perceptual scalability, and discriminability. In the first study, I leverage information theory to model encoding quality in terms of information content and complexity. I examine how information loss and clutter affect the scalability of hierarchical visualizations and contribute an information-theoretic algorithm for adjusting these factors in visualizations of large datasets. The second study centers on the question of whether a data property (outlierness) can be lost in the visual encoding process due to saliency interference with other visual variables. I designed a controlled experiment to measure the effectiveness of motion outlier detection in complex multivariate scatterplots. The results suggest a saliency deficit effect whereby global saliency undermines support for tasks that rely on local saliency. Finally, I investigate how discriminability, a classic visualization criterion, can explain recent empirical results on encoding effectiveness and provide the foundation for automated evaluation of visual encodings. I propose an approach for discriminability evaluation based on a perceptually motivated image similarity measure.
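    A toy illustration of the information-theoretic view of encoding quality (not the algorithm contributed in the thesis): quantizing a data variable into k visual bins — say, a k-step color ramp — loses information, measurable as the drop in Shannon entropy between the raw and the binned values.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def binning_loss(values, k):
    """Information lost (bits) when values are quantized into k
    equal-width bins, as in a k-step color encoding."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1  # guard against a constant variable
    binned = [min(int((v - lo) / width), k - 1) for v in values]
    return entropy(values) - entropy(binned)
```

    For eight distinct values encoded with only two bins, the loss is 3 − 1 = 2 bits; the encoding can no longer distinguish values within a bin.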

    Predictive and prescriptive monitoring of business process outcomes

    Recent years have witnessed a growing adoption of machine learning techniques for business improvement across various fields. Among other emerging applications, organizations are exploiting opportunities to improve the performance of their business processes by using predictive models for runtime monitoring. Such predictive process monitoring techniques take an event log (a set of completed business process execution traces) as input and use machine learning techniques to train predictive models. At runtime, these techniques predict either the next event, the remaining time, or the final outcome of an ongoing case, given its incomplete execution trace consisting of the events performed up to the present moment in the given case. In particular, a family of techniques called outcome-oriented predictive process monitoring focuses on predicting whether a case will end with a desired or an undesired outcome. The user of the system can use the predictions to decide whether or not to intervene, with the purpose of preventing an undesired outcome or mitigating its negative effects. Prescriptive process monitoring systems go beyond purely predictive ones by not only generating predictions but also advising the user if and how to intervene in a running case in order to optimize a given utility function. This thesis addresses the question of how to train, evaluate, and use predictive models for predictive and prescriptive monitoring of business process outcomes. The thesis proposes a taxonomy and performs a comparative experimental evaluation of existing techniques in the field. Moreover, we propose a framework for incorporating textual data into predictive monitoring systems. We introduce the notion of temporal stability to evaluate these systems and propose a prescriptive process monitoring framework for advising users if and how to act upon the predictions. The results suggest that the proposed solutions complement the existing techniques and can be useful for practitioners in implementing predictive process monitoring systems in real life.
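    As a rough sketch of the outcome-oriented and prescriptive setting described above (a deliberately naive stand-in, not the thesis's models), the hypothetical code below estimates the probability of an undesired outcome from completed traces and turns a runtime prediction into an intervene/wait recommendation via a threshold.

```python
from collections import defaultdict

def train(log):
    """log: list of (trace, outcome) pairs, where outcome 1 = undesired.
    Estimates P(undesired | activity observed in the case) by counting."""
    counts = defaultdict(lambda: [0, 0])  # activity -> [undesired, total]
    for trace, outcome in log:
        for act in trace:
            counts[act][0] += outcome
            counts[act][1] += 1
    return {a: u / t for a, (u, t) in counts.items()}

def recommend(model, prefix, threshold=0.7):
    """Prescriptive step: advise intervention when the estimated
    probability of an undesired outcome for the last observed
    activity exceeds the threshold."""
    p = model.get(prefix[-1], 0.0)
    return p, ("intervene" if p > threshold else "wait")
```

    A real system would use richer prefix encodings and a proper classifier, and would weigh the cost of intervening against the cost of the undesired outcome, per the utility function mentioned in the abstract.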

    Research directions in data wrangling: Visualizations and transformations for usable and credible data

    In spite of advances in technologies for working with data, analysts still spend an inordinate amount of time diagnosing data quality issues and manipulating data into a usable form. This process of ‘data wrangling’ often constitutes the most tedious and time-consuming aspect of analysis. Though data cleaning and integration are long-standing issues in the database community, relatively little research has explored how interactive visualization can advance the state of the art. In this article, we review the challenges and opportunities associated with addressing data quality issues. We argue that analysts might more effectively wrangle data through new interactive systems that integrate data verification, transformation, and visualization. We identify a number of outstanding research questions, including how appropriate visual encodings can facilitate apprehension of missing data, discrepant values, and uncertainty; how interactive visualizations might facilitate data transform specification; and how recorded provenance and social interaction might enable wider reuse, verification, and modification of data transformations.
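    To make “missing data” and “discrepant values” concrete, here is a minimal example of the kind of automatic check an interactive wrangling tool might surface visually: flagging missing entries and outliers beyond Tukey's fences. The function name and the coarse quartile approximation are illustrative, not from the article.

```python
def flag_discrepant(values, k=1.5):
    """Return the indices of missing entries (None) and of values lying
    outside Tukey's fences (k * IQR beyond the quartiles).
    Quartiles are approximated by simple index positions for brevity."""
    s = sorted(v for v in values if v is not None)
    q1, q3 = s[len(s) // 4], s[(3 * len(s)) // 4]
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values)
            if v is None or not (lo <= v <= hi)]
```

    The point made by the article is that such checks become far more useful when their results are linked to visual encodings the analyst can inspect and act on, rather than returned as bare index lists.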

    Explainable predictive monitoring of temporal measures of business processes

    Modern enterprise systems collect detailed data about the execution of the business processes they support. The widespread availability of such data in companies, coupled with advances in machine learning, has led to the emergence of data-driven and predictive approaches to monitor the performance of business processes. By using such predictive process monitoring approaches, potential performance issues can be anticipated and proactively mitigated. Various approaches have been proposed to address typical predictive process monitoring questions, such as what is the most likely continuation of an ongoing process instance, or when it will finish. However, most existing approaches prioritize accuracy over explainability. Yet in practice, explainability is a critical property of predictive methods. It is not enough to accurately predict that a running process instance will end up in an undesired outcome. It is also important for users to understand why this prediction is made and what can be done to prevent this undesired outcome. This thesis proposes two methods to build predictive models to monitor business processes in an explainable manner. This is achieved by decomposing a prediction into its elementary components. For example, to explain that the remaining execution time of a process execution is predicted to be 20 hours, we decompose this prediction into the predicted execution time of each activity that has not yet been executed. We evaluate the proposed methods against each other and various state-of-the-art baselines using a range of business processes from multiple domains. The evaluation reaffirms a fundamental trade-off between the explainability and accuracy of predictions. The research contributions of the thesis have been consolidated into an open-source tool for predictive business process monitoring, namely Nirdizati. It can be used to train predictive models using the methods described in this thesis, as well as third-party methods. These models are then used to make predictions for ongoing process instances; thus, the tool can also support users at runtime.
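    The decomposition idea — explaining a remaining-time prediction as a sum of per-activity estimates — can be sketched as follows. This is a simplified stand-in that uses historical mean durations, not the predictive models developed in the thesis; all names are hypothetical.

```python
from collections import defaultdict

def activity_means(log):
    """log: list of traces, each a list of (activity, duration_hours).
    Returns the mean historical duration of each activity."""
    sums = defaultdict(lambda: [0.0, 0])
    for trace in log:
        for act, duration in trace:
            sums[act][0] += duration
            sums[act][1] += 1
    return {a: s / n for a, (s, n) in sums.items()}

def explain_remaining_time(means, full_process, executed):
    """Predict remaining time as the sum of estimates for activities not
    yet executed, returning the per-activity breakdown as the explanation."""
    pending = [a for a in full_process if a not in executed]
    breakdown = {a: means.get(a, 0.0) for a in pending}
    return sum(breakdown.values()), breakdown
```

    The breakdown is the explanation: instead of an opaque “4 hours remaining”, the user sees which pending activities contribute how much, and therefore where an intervention would shorten the case.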

    The analysis of fault diagnosis tasks: do verbal reports speak for themselves?


    Data-Aware Declarative Process Mining with SAT

    Process Mining is a family of techniques for analyzing business process execution data recorded in event logs. Process models can be obtained as output of automated process discovery techniques or can be used as input to techniques for conformance checking or model enhancement. In Declarative Process Mining, process models are represented as sets of temporal constraints (instead of procedural descriptions where all control-flow details are explicitly modeled). An open research direction in Declarative Process Mining is whether multi-perspective specifications can be supported, i.e., specifications that not only describe the process behavior from the control-flow point of view, but also from other perspectives like data or time. In this paper, we address this question by considering SAT (Propositional Satisfiability Problem) as a solving technology for a number of classical problems in Declarative Process Mining, namely log generation, conformance checking, and temporal query checking. To do so, we first express each problem as a suitable FO (First-Order) theory whose bounded models represent solutions to the problem, and then find a bounded model of such a theory by compilation into SAT.
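    For intuition, a multi-perspective (data-aware) Declare constraint such as “response” can be evaluated directly on a single trace, as below. This direct check is only illustrative: the paper's contribution is the compilation of such problems into SAT, which this sketch does not perform, and the activity names and payload condition are invented for the example.

```python
def check_response(trace, a, b, cond=lambda payload: True):
    """Data-aware Declare 'response' constraint: every occurrence of
    activity `a` whose data payload satisfies `cond` must eventually be
    followed by an occurrence of `b`. Trace events are (activity, payload)
    pairs, where payload is a dict of attribute values."""
    for i, (act, payload) in enumerate(trace):
        if act == a and cond(payload):
            if not any(e == b for e, _ in trace[i + 1:]):
                return False  # activating event never answered by b
    return True
```

    Conformance checking over a whole log then amounts to evaluating each constraint of the model on each trace; the SAT-based approach instead encodes log, model, and data conditions together as one bounded satisfiability problem.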
