
    Anonymization Techniques for Privacy-preserving Process Mining

    Process mining analyzes business processes using event logs. Each activity execution is recorded as an event in a trace, which represents the behavior of a process instance. Traces often contain sensitive information, such as patient data. This thesis addresses the privacy risks that arise from trace data and process mining. An empirical study of the re-identification risk in public event logs reveals a high risk, but further threats are also relevant. Anonymization is essential to address these risks, yet challenging because the behavioral aspects of the event log must be preserved for analysis, leading to a privacy-utility trade-off. New algorithms, SaCoFa and SaPa, anonymize traces with noise that guarantees differential privacy while maintaining utility. PRIPEL supplements the anonymized control flow with contextual information from the traces, enabling the publication of complete, protected logs. For k-anonymity, the PRETSA algorithm family merges privacy-violating traces based on a prefix representation of the event log, aiming for a syntactically similar result. Empirical evaluations demonstrate better utility preservation than existing techniques.
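    To illustrate the kind of trace merging described above, the following is a greatly simplified sketch in the spirit of PRETSA, not the published algorithm (which operates on a prefix-tree representation of the log): trace variants supported by fewer than k cases are merged into the most syntactically similar variant that already satisfies k-anonymity. The variants, counts, and similarity measure are illustrative assumptions.

```python
# Hedged sketch of k-anonymity by merging rare trace variants into similar
# frequent ones; not the actual PRETSA algorithm.
from difflib import SequenceMatcher

def merge_rare_variants(variant_counts, k=5):
    """Return a k-anonymous variant distribution by reassigning the cases
    of rare variants to their closest frequent variant."""
    frequent = {v: c for v, c in variant_counts.items() if c >= k}
    rare = {v: c for v, c in variant_counts.items() if c < k}
    result = dict(frequent)
    for variant, count in rare.items():
        # pick the frequent variant with the highest sequence similarity
        target = max(
            frequent,
            key=lambda f: SequenceMatcher(None, variant, f).ratio(),
        )
        result[target] += count
    return result

log = {
    ("register", "triage", "treat", "discharge"): 40,
    ("register", "treat", "discharge"): 12,
    ("register", "triage", "treat", "admit"): 2,  # violates k-anonymity
}
print(merge_rare_variants(log, k=5))
```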

    Privaatsuskaitse tehnoloogiaid äriprotsesside kaeveks

    Process mining techniques enable organizations to analyze process execution traces in order to identify improvement opportunities. Such techniques require the event logs that record process executions to be available to data analysts. These logs may contain private information about the individuals for whom a process is executed. In such cases, organizations need to deploy Privacy-Enhancing Technologies (PETs) so that analysts can draw conclusions from the event logs while the privacy of individuals is preserved. While PETs preserve the privacy of individuals inside the organization, they perturb the event logs in ways that may lead to misleading analysis results. They may inject behavior into the event logs that cannot occur in real life. For example, when anonymizing a hospital event log, some PETs may inject a trace in which a patient visits a doctor before checking into the hospital. This thesis proposes a set of privacy-preserving approaches, collectively called Privacy-Preserving Process Mining (PPPM), that strike a balance between the benefits an analyst can obtain from analyzing event logs and the requirements imposed by privacy regulations such as the GDPR. In addition, the thesis proposes an approach that enables organizations to jointly perform process mining over their data without sharing their private information.
    The techniques proposed in this thesis are available as open-source tools. The first tool, Amun, enables an event log publisher to anonymize an event log before sharing it with an analyst. The second tool, Libra, provides an improved utility-privacy trade-off. The third tool, Shareprom, enables organizations to construct process maps jointly in such a way that no party learns the data of the other parties.
    https://www.ester.ee/record=b552434
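    As a conceptual illustration of the joint process-map idea: Shareprom relies on secure multi-party computation so that no party sees another party's data; the toy sketch below only shows what would be aggregated (per-party directly-follows counts into one process map) and provides none of the cryptographic protection. The party names and traces are hypothetical.

```python
# Conceptual sketch only: combining directly-follows counts from two
# parties into one process map, WITHOUT the secure computation that
# Shareprom actually uses.
from collections import Counter

def directly_follows(traces):
    """Count directly-follows pairs (a, b) within one party's own traces."""
    df = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            df[(a, b)] += 1
    return df

party_a = [("register", "triage", "treat"), ("register", "treat")]
party_b = [("register", "triage", "admit")]

joint_map = directly_follows(party_a) + directly_follows(party_b)
print(joint_map)  # combined directly-follows counts for the joint process map
```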

    TraVaS: Differentially Private Trace Variant Selection for Process Mining

    In the area of industrial process mining, privacy-preserving event data publication is becoming increasingly relevant. Consequently, the trade-off between high data utility and quantifiable privacy poses new challenges. State-of-the-art research mainly focuses on differentially private trace variant construction based on prefix expansion methods. However, these algorithms face several practical limitations, such as high computational complexity, the introduction of fake variants, the removal of frequent variants, and a bounded variant length. In this paper, we introduce a new approach for direct differentially private trace variant release which uses anonymized partition selection strategies to overcome the aforementioned limitations. Experimental results on real-life event data show that our algorithm outperforms state-of-the-art methods in terms of both plain data utility and result utility preservation.
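    For intuition, the following is a hedged sketch of differentially private partition selection applied to trace variants, not the TraVaS algorithm itself: each observed variant's count receives Laplace noise, and a variant is released only if the noisy count clears a threshold, so no fake variants are injected and no prefix-expansion depth bounds the variant length. The epsilon and threshold values are illustrative; a rigorous (epsilon, delta) guarantee requires calibrating the threshold to delta.

```python
# Hedged sketch of noisy-threshold partition selection over trace variants
# (illustrative parameters; not the TraVaS algorithm).
import numpy as np

def private_variant_selection(variant_counts, epsilon=1.0, threshold=5.0, rng=None):
    """Release (variant, noisy_count) pairs whose noisy count exceeds the
    threshold; variants below the threshold remain hidden."""
    rng = rng or np.random.default_rng()
    released = {}
    for variant, count in variant_counts.items():
        noisy = count + rng.laplace(scale=1.0 / epsilon)
        if noisy > threshold:
            released[variant] = round(noisy)
    return released

log = {
    ("create order", "check credit", "ship"): 120,
    ("create order", "ship"): 30,
    ("create order", "check credit", "cancel"): 3,  # likely suppressed
}
print(private_variant_selection(log, epsilon=0.5, threshold=8.0))
```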

    TraVaG: Differentially Private Trace Variant Generation Using GANs

    Process mining is rapidly growing in the industry. Consequently, privacy concerns regarding sensitive and private information included in event data, used by process mining algorithms, are becoming increasingly relevant. State-of-the-art research mainly focuses on providing privacy guarantees, e.g., differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques for releasing trace variants still do not fulfill all the requirements of industry-scale usage. Moreover, providing privacy guarantees when there exists a high rate of infrequent trace variants is still a challenge. In this paper, we introduce TraVaG as a new approach for releasing differentially private trace variants based on Generative Adversarial Networks (GANs) that provides industry-scale benefits and enhances the level of privacy guarantees when there exists a high ratio of infrequent variants. Moreover, TraVaG overcomes shortcomings of conventional privacy preservation techniques such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data show that our approach outperforms state-of-the-art techniques in terms of privacy guarantees, plain data utility preservation, and result utility preservation.
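    To make the GAN idea concrete, the following is a hedged sketch of the general setup behind GAN-based trace-variant generation, not the TraVaG architecture or its differentially private training: a generator maps random noise to a sequence of activity probabilities, and a discriminator scores sequences as real or synthetic. The fixed-length encoding, layer sizes, and the omission of privacy-preserving optimization (e.g., DP-SGD) are all simplifying assumptions.

```python
# Hedged sketch of a minimal GAN for trace variants (PyTorch); dimensions
# and the fixed MAX_LEN encoding are simplifying assumptions, not TraVaG.
import torch
import torch.nn as nn

NUM_ACTIVITIES, MAX_LEN, NOISE_DIM = 8, 12, 16

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 64), nn.ReLU(),
            nn.Linear(64, MAX_LEN * NUM_ACTIVITIES),
        )

    def forward(self, z):
        # softmax over activities at each sequence position
        logits = self.net(z).view(-1, MAX_LEN, NUM_ACTIVITIES)
        return torch.softmax(logits, dim=-1)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(MAX_LEN * NUM_ACTIVITIES, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x)  # real/fake score per sequence

gen, disc = Generator(), Discriminator()
z = torch.randn(4, NOISE_DIM)
fake_variants = gen(z)            # shape: (4, MAX_LEN, NUM_ACTIVITIES)
print(disc(fake_variants).shape)  # torch.Size([4, 1])
```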