11 research outputs found

    Data Sovereignty and Data Space Ecosystems

    Get PDF

    TraVaS: Differentially Private Trace Variant Selection for Process Mining

    Full text link
    In the area of industrial process mining, privacy-preserving event data publication is becoming increasingly relevant. Consequently, the trade-off between high data utility and quantifiable privacy poses new challenges. State-of-the-art research mainly focuses on differentially private trace variant construction based on prefix expansion methods. However, these algorithms face several practical limitations such as high computational complexity, introducing fake variants, removing frequent variants, and a bounded variant length. In this paper, we introduce a new approach for direct differentially private trace variant release which uses anonymized \textit{partition selection} strategies to overcome the aforementioned restraints. Experimental results on real-life event data show that our algorithm outperforms state-of-the-art methods in terms of both plain data utility and result utility preservation

    TraVaG: Differentially Private Trace Variant Generation Using GANs

    Full text link
    Process mining is rapidly growing in the industry. Consequently, privacy concerns regarding sensitive and private information included in event data, used by process mining algorithms, are becoming increasingly relevant. State-of-the-art research mainly focuses on providing privacy guarantees, e.g., differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques for releasing trace variants still do not fulfill all the requirements of industry-scale usage. Moreover, providing privacy guarantees when there exists a high rate of infrequent trace variants is still a challenge. In this paper, we introduce TraVaG as a new approach for releasing differentially private trace variants based on \text{Generative Adversarial Networks} (GANs) that provides industry-scale benefits and enhances the level of privacy guarantees when there exists a high ratio of infrequent variants. Moreover, TraVaG overcomes shortcomings of conventional privacy preservation techniques such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data show that our approach outperforms state-of-the-art techniques in terms of privacy guarantees, plain data utility preservation, and result utility preservation

    Privacy-Preserving Process Mining: Differential Privacy for Event Logs

    Get PDF
    The final publication is available at Springer via http://dx.doi.org/10.1007/s12599-019-00613-3Privacy regulations for data can be regarded as a major driver for data sovereignty measures. A specific example for this is the case of event data that is recorded by information systems during the processing of entities in domains such as e-commerce or health care. Since such data, typically available in the form of event log files, contains personalized information on the specific processed entities, it can expose sensitive information that may be traced back to individuals. In recent years, a plethora of methods have been developed to analyse event logs under the umbrella of process mining. However, the impact of privacy regulations on the technical design as well as the organizational application of process mining has been largely neglected. This paper set out to develop a protection model for event data privacy which applies the well-established notion of differential privacy. Starting from common assumptions about the event logs used in process mining, this paper presents potential privacy leakages and means to protect against them. The paper also shows at which stages of privacy leakages a protection model for event logs should be used. Relying on this understanding, the notion of differential privacy for process discovery methods is instantiated, i.e., algorithms that aim at the construction of a process model from an event log. The general feasibility of our approach is demonstrated by its application to two publicly available real-life events logs.acceptedVersio

    Privaatsuskaitse tehnoloogiaid äriprotsesside kaeveks

    Get PDF
    Protsessikaeve tehnikad võimaldavad organisatsioonidel analüüsida protsesside täitmise käigus tekkivaid logijälgi eesmärgiga leida parendusvõimalusi. Nende tehnikate eelduseks on, et nimetatud logijälgi koondavad sündmuslogid on andmeanalüütikutele analüüside läbi viimiseks kättesaadavad. Sellised sündmuslogid võivad sisaldada privaatset informatsiooni isikute kohta kelle jaoks protsessi täidetakse. Sellistel juhtudel peavad organisatsioonid rakendama privaatsuskaitse tehnoloogiaid (PET), et võimaldada analüütikul sündmuslogi põhjal järeldusi teha, samas säilitades isikute privaatsust. Kuigi PET tehnikad säilitavad isikute privaatsust organisatsiooni siseselt, muudavad nad ühtlasi sündmuslogisid sellisel viisil, mis võib viia analüüsi käigus valede järeldusteni. PET tehnikad võivad lisada sündmuslogidesse sellist uut käitumist, mille esinemine ei ole reaalses sündmuslogis võimalik. Näiteks võivad mõned PET tehnikad haigla sündmuslogi anonüümimisel lisada logijälje, mille kohaselt patsient külastas arsti enne haiglasse saabumist. Käesolev lõputöö esitab privaatsust säilitavate lähenemiste komplekti nimetusega privaatsust säilitav protsessikaeve (PPPM). PPPM põhiline eesmärk on leida tasakaal võimaliku sündmuslogi analüüsist saadava kasu ja analüüsile kohaldatavate privaatsusega seonduvate regulatsioonide (näiteks GDPR) vahel. Lisaks pakub käesolev lõputöö lahenduse, mis võimaldab erinevatel organisatsioonidel protsessikaevet üle ühise andmete terviku rakendada, ilma oma privaatseid andmeid üksteisega jagamata. Käesolevas lõputöös esitatud tehnikad on avatud lähtekoodiga tööriistadena kättesaadavad. Nendest tööriistadest esimene on Amun, mis võimaldab sündmuslogi omanikul sündmuslogi anonüümida enne selle analüütikule jagamist. Teine tööriist on Libra, mis pakub täiendatud võimalusi kasutatavuse ja privaatsuse tasakaalu leidmiseks. Kolmas tööriist on Shareprom, mis võimaldab organisatsioonidele ühiste protsessikaartide loomist sellisel viisil, et ükski osapool ei näe teiste osapoolte andmeid.Process Mining Techniques enable organizations to analyze process execution traces to identify improvement opportunities. Such techniques need the event logs (which record process execution) to be available for data analysts to perform the analysis. These logs contain private information about the individuals for whom a process is being executed. In such cases, organizations need to deploy Privacy-Enhancing Technologies (PETs) to enable the analyst to drive conclusions from the event logs while preserving the privacy of individuals. While PETs techniques preserve the privacy of individuals inside the organization, they work by perturbing the event logs in such a way that may lead to misleading conclusions of the analysis. They may inject new behaviors into the event logs that are impossible to exist in real-life event logs. For example, some PETs techniques anonymize a hospital event log by injecting a trace that a patient may visit a doctor before checking in inside the hospital. In this thesis, we propose a set of privacy-preserving approaches that we call Privacy-Preserving Process Mining (PPPM) approaches to strike a balance between the benefits an analyst can get from analyzing these event logs and the requirements imposed on them by privacy regulations (e.g., GDPR). Also, in this thesis, we propose an approach that enables organizations to jointly perform process mining over their data without sharing their private information. The techniques proposed in this thesis have been proposed as open-source tools. The first tool is Amun, enabling an event log publisher to anonymize their event log before sharing it with an analyst. The second tool is called Libra, which provides an enhanced utility-privacy tradeoff. The third tool is Shareprom, which enables organizations to construct process maps jointly in such a manner that no party learns the data of the other parties.https://www.ester.ee/record=b552434

    Process Mining Handbook

    Get PDF
    This is an open access book. This book comprises all the single courses given as part of the First Summer School on Process Mining, PMSS 2022, which was held in Aachen, Germany, during July 4-8, 2022. This volume contains 17 chapters organized into the following topical sections: Introduction; process discovery; conformance checking; data preprocessing; process enhancement and monitoring; assorted process mining topics; industrial perspective and applications; and closing

    Measuring the impact of COVID-19 on hospital care pathways

    Get PDF
    Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A &E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Co-incidentally the hospital had implemented a Command Centre approach for patient-flow management affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A &E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns of monthly mean values of length of stay nor conformance throughout the phases of the installation of the hospital’s new Command Centre approach. Due to a deficit in the available A &E data, the findings for A &E pathways could not be interpreted
    corecore