Practical Aspect of Privacy-Preserving Data Publishing in Process Mining
Process mining techniques such as process discovery and conformance checking
provide insights into actual processes by analyzing event data that are widely
available in information systems. These data are very valuable, but often
contain sensitive information, and process analysts need to balance
confidentiality and utility. Privacy issues in process mining have recently
received more attention from researchers, and this research should be
complemented by a tool that integrates the solutions and makes them available
in the real world. In
this paper, we introduce a Python-based infrastructure implementing
state-of-the-art privacy preservation techniques in process mining. The
infrastructure provides a hierarchy of usages from single techniques to the
collection of techniques, integrated as web-based tools. Our infrastructure
manages both standard and non-standard event data resulting from privacy
preservation techniques. It also stores explicit privacy metadata to track the
modifications applied to protect sensitive data.
Secure Multi-Party Computation for Inter-Organizational Process Mining
Process mining is a family of techniques for analysing business processes
based on event logs extracted from information systems. Mainstream process
mining tools are designed for intra-organizational settings, insofar as they
assume that an event log is available for processing as a whole. The use of
such tools for inter-organizational process analysis is hampered by the fact
that such processes involve independent parties who are unwilling to, or
sometimes legally prevented from, sharing detailed event logs with each other.
In this setting, this paper proposes an approach for constructing and querying
a common type of artifact used for process mining, namely the frequency and
time-annotated Directly-Follows Graph (DFG), over multiple event logs belonging
to different parties, in such a way that the parties do not share the event
logs with each other. The proposal leverages an existing platform for secure
multi-party computation, namely Sharemind. Since a direct implementation of DFG
construction in Sharemind suffers from scalability issues, the paper proposes
to rely on vectorization of event logs and to employ a divide-and-conquer
scheme for parallel processing of sub-logs. The paper reports on an
experimental evaluation that tests the scalability of the approach on real-life
logs.
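
As a plaintext illustration of the artifact this abstract describes, the following minimal sketch builds a frequency and time-annotated Directly-Follows Graph from an event log and merges the results computed on sub-logs. It is not the paper's Sharemind protocol and performs no secure computation. The tuple layout `(case_id, activity, timestamp)` and the function names `build_dfg` and `merge_dfgs` are assumptions made for this example, and the merge step assumes that every case falls entirely within one sub-log.

```python
from collections import defaultdict
from datetime import datetime


def build_dfg(event_log):
    """Build a frequency and time-annotated DFG (plaintext sketch).

    event_log: iterable of (case_id, activity, timestamp) tuples,
    a hypothetical layout chosen for this example.
    Returns (freq, total_seconds), both keyed by (activity_a, activity_b).
    """
    # Group events by case so that only events of the same case are paired.
    traces = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        traces[case_id].append((timestamp, activity))

    freq = defaultdict(int)
    total_seconds = defaultdict(float)
    for events in traces.values():
        events.sort(key=lambda e: e[0])  # order each trace by timestamp
        # Each pair of consecutive events contributes one directly-follows edge.
        for (t1, a1), (t2, a2) in zip(events, events[1:]):
            freq[(a1, a2)] += 1
            total_seconds[(a1, a2)] += (t2 - t1).total_seconds()
    return dict(freq), dict(total_seconds)


def merge_dfgs(parts):
    """Combine DFGs computed on sub-logs by summing their edge annotations.

    Assumes each case is contained in exactly one sub-log; otherwise
    edges that cross a sub-log boundary would be missed.
    """
    freq = defaultdict(int)
    total_seconds = defaultdict(float)
    for part_freq, part_seconds in parts:
        for edge, count in part_freq.items():
            freq[edge] += count
        for edge, seconds in part_seconds.items():
            total_seconds[edge] += seconds
    return dict(freq), dict(total_seconds)
```

Dividing `total_seconds[edge]` by `freq[edge]` gives the mean time between two activities. Because both annotations are plain sums, sub-logs can be processed independently and combined afterwards, which is the property a divide-and-conquer scheme relies on.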
PRIPEL: Privacy-Preserving Event Log Publishing Including Contextual Information
Event logs capture the execution of business processes in terms of executed
activities and their execution context. Since logs contain potentially
sensitive information about the individuals involved in the process, they
should be pre-processed before being published to preserve the individuals'
privacy. However, existing techniques for such pre-processing are limited to a
process' control-flow and neglect contextual information, such as attribute
values and durations. This precludes any form of process analysis that
involves contextual factors. To bridge this gap, we introduce PRIPEL, a
framework for privacy-aware event log publishing. Compared to existing work,
PRIPEL takes a fundamentally different angle and ensures privacy on the level
of individual cases instead of the complete log. This way, contextual
information as well as the long-tail process behaviour are preserved, which
enables the application of a rich set of process analysis techniques. We
demonstrate the feasibility of our framework in a case study with a real-world
event log.