Explain, Adapt and Retrain: How to improve the accuracy of a PPM classifier through different explanation styles
Recent papers have introduced a novel approach to explain why a Predictive
Process Monitoring (PPM) model for outcome-oriented predictions provides wrong
predictions. Moreover, they have shown how to exploit the explanations,
obtained using state-of-the-art post-hoc explainers, to identify, in a
semi-automated way, the most common features that induce a predictor to make
mistakes and, in turn, to reduce the impact of those features and increase the
accuracy of the predictive model. This work starts from the assumption that
frequent control flow patterns in event logs may represent important features
that characterize, and therefore explain, a certain prediction. Therefore, in
this paper, we (i) employ a novel encoding able to leverage DECLARE constraints
in Predictive Process Monitoring and compare the effectiveness of this encoding
with state-of-the-art Predictive Process Monitoring encodings, in particular
for the task of outcome-oriented predictions; (ii) introduce a completely
automated pipeline for the identification of the most common features inducing
a predictor to make mistakes; and (iii) show the effectiveness of the proposed
pipeline in increasing the accuracy of the predictive model by validating it on
different real-life datasets.
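As an illustration of what a DECLARE-based encoding can look like, the following is a minimal sketch that turns a trace into a binary vector, one entry per constraint. It is not the encoding proposed in the paper: the helper names (`existence`, `response`, `precedence`, `encode`), the three constraint templates chosen, and the activity names are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Trace = List[str]
Check = Callable[[Trace], bool]


def existence(a: str) -> Check:
    """existence(a): activity a occurs at least once in the trace."""
    return lambda trace: a in trace


def response(a: str, b: str) -> Check:
    """response(a, b): every occurrence of a is eventually followed by b."""
    def check(trace: Trace) -> bool:
        pending = False  # True while some a is still waiting for a later b
        for act in trace:
            if act == a:
                pending = True
            elif act == b:
                pending = False
        return not pending
    return check


def precedence(a: str, b: str) -> Check:
    """precedence(a, b): b may occur only after a has already occurred."""
    def check(trace: Trace) -> bool:
        seen_a = False
        for act in trace:
            if act == a:
                seen_a = True
            elif act == b and not seen_a:
                return False
        return True
    return check


def encode(trace: Trace, constraints: Dict[str, Check]) -> List[int]:
    """Binary feature vector: 1 if the trace satisfies the constraint, else 0."""
    return [int(check(trace)) for check in constraints.values()]


# Hypothetical constraints and trace, for illustration only.
constraints = {
    "existence(pay)": existence("pay"),
    "response(register, pay)": response("register", "pay"),
    "precedence(check, register)": precedence("check", "register"),
}
trace = ["register", "check", "pay"]
features = encode(trace, constraints)
```

Each vector position corresponds to a named control flow pattern, so a feature flagged by an explainer can be read back directly as a constraint over activities.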
Identifying the Key Attributes in an Unlabeled Event Log for Automated Process Discovery
Process mining discovers and analyzes a process model from historical event
logs. Prior methods use the key attributes of case-id, activity, and
timestamp hidden in an event log as clues to discover a process model. However,
a user needs to specify them manually, and this can be a laborious task. In
this paper, we propose a two-stage key attribute identification method to avoid
such a manual investigation, and thus this is a step toward fully automated
process discovery. One of the challenging tasks is how to avoid exhaustive
computation due to combinatorial explosion. For this, we narrow down candidates
for each key attribute by using supervised machine learning in the first stage
and identify the best combination of the key attributes by discovering process
models and evaluating them in the second stage. Our computational complexity
can be reduced from $O(N^3)$ to $O(k^3)$, where $N$ and $k$
are the numbers of columns and candidates we keep in the first stage,
respectively, and usually $k$ is much smaller than $N$. We evaluated our method
with 14 open datasets and showed that our method could identify the key
attributes even with for about 20 seconds for many datasets.
Comment: IEEE Transactions on Services Computing (Early Access version
- …
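The two-stage idea in the abstract above can be sketched as follows. Writing N for the number of columns and k for the number of candidates kept per key attribute, trying every assignment of three attributes costs on the order of N^3 evaluations, while shortlisting first costs at most k^3. This is a sketch under stated assumptions, not the authors' implementation: the first-stage scores are hard-coded where the paper uses a supervised model, and `evaluate` is a hypothetical stand-in for discovering a process model and scoring it.

```python
from itertools import product
from typing import Callable, Dict, Optional, Tuple

ROLES = ("case_id", "activity", "timestamp")
Assignment = Dict[str, str]


def top_k(scores: Dict[str, float], k: int) -> list:
    """Stage 1: keep the k columns with the highest score for one role."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def best_combination(
    stage1_scores: Dict[str, Dict[str, float]],
    k: int,
    evaluate: Callable[[Assignment], float],
) -> Tuple[Optional[Assignment], int]:
    """Stage 2: evaluate every combination of shortlisted columns.

    Returns the best assignment and the number of combinations evaluated,
    which is at most k ** 3 because each role has at most k candidates.
    """
    shortlists = [top_k(stage1_scores[role], k) for role in ROLES]
    best, best_score, tried = None, float("-inf"), 0
    for combo in product(*shortlists):
        if len(set(combo)) < len(combo):
            continue  # one column cannot play two roles at once
        tried += 1
        assignment = dict(zip(ROLES, combo))
        score = evaluate(assignment)
        if score > best_score:
            best, best_score = assignment, score
    return best, tried


# Hypothetical first-stage scores for a four-column log (assumed values).
stage1_scores = {
    "case_id":   {"id": 0.9, "act": 0.2, "ts": 0.1, "cost": 0.3},
    "activity":  {"id": 0.1, "act": 0.8, "ts": 0.1, "cost": 0.4},
    "timestamp": {"id": 0.2, "act": 0.1, "ts": 0.95, "cost": 0.3},
}
truth = {"case_id": "id", "activity": "act", "timestamp": "ts"}


def toy_evaluate(assignment: Assignment) -> float:
    """Placeholder quality score: fraction of roles assigned correctly."""
    return sum(assignment[r] == truth[r] for r in ROLES) / len(ROLES)


best, tried = best_combination(stage1_scores, k=2, evaluate=toy_evaluate)
```

In this sketch the expensive step is the call to `evaluate`, so the saving comes from calling it only on shortlisted combinations rather than on every triple of columns.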