Monotone Precision and Recall Measures for Comparing Executions and Specifications of Dynamic Systems
The behavioural comparison of systems is an important concern of software
engineering research. For example, the areas of specification discovery and
specification mining are concerned with measuring the consistency between a
collection of execution traces and a program specification. This problem is
also tackled in process mining with the help of measures that describe the
quality of a process specification automatically discovered from execution
logs. Though various measures have been proposed, it was recently demonstrated
that they neither fulfil essential properties, such as monotonicity, nor can
they handle infinite behaviour. In this paper, we address this research problem
by introducing a new framework for the definition of behavioural quotients. We
prove that corresponding quotients guarantee desired properties that existing
measures have failed to support. We demonstrate the application of the
quotients for capturing precision and recall measures between a collection of
recorded executions and a system specification. We use a prototypical
implementation of these measures to contrast their monotonic assessment with
measures that have been defined in prior research.
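The quotient framework itself is abstract; for intuition only, here is a minimal sketch (not the paper's construction) of set-based precision and recall between the distinct traces of a log and the runs of a model. Note that such naive set measures only apply to finite trace languages, which is exactly the limitation that motivates the paper's quotients for infinite behaviour.

```python
def recall(log_traces, model_traces):
    """Fraction of distinct logged traces that the model can reproduce."""
    log, model = set(log_traces), set(model_traces)
    return len(log & model) / len(log) if log else 1.0

def precision(log_traces, model_traces):
    """Fraction of distinct model runs that were actually observed."""
    log, model = set(log_traces), set(model_traces)
    return len(log & model) / len(model) if model else 1.0

log = [("a", "b", "c"), ("a", "c"), ("a", "b", "c")]       # multiset of executions
model = [("a", "b", "c"), ("a", "b", "b", "c"), ("a", "c")]  # finite model language
print(recall(log, model))     # 1.0: every logged trace is a model run
print(precision(log, model))  # 2/3: one model run was never observed
```

On finite languages these set quotients are trivially monotone (adding a shared trace cannot decrease either score); the paper's contribution is to obtain such guarantees also for infinite behaviour, where counting is no longer available.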
Active Learning of Points-To Specifications
When analyzing programs, large libraries pose significant challenges to
static points-to analysis. A popular solution is to have a human analyst
provide points-to specifications that summarize relevant behaviors of library
code, which can substantially improve precision and handle missing code such as
native code. We propose ATLAS, a tool that automatically infers points-to
specifications. ATLAS synthesizes unit tests that exercise the library code,
and then infers points-to specifications based on observations from these
executions. ATLAS automatically infers specifications for the Java standard
library, and produces better results for a client static information flow
analysis on a benchmark of 46 Android apps compared to using existing
handwritten specifications.
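The core idea, inferring a flow summary by running library code on identifiable inputs and observing what reaches the output, can be sketched as follows. This is an illustrative toy, not ATLAS itself: `Sentinel`, `observe_flow`, and `lib_wrap` are hypothetical names, and a real tool would synthesize the unit tests and handle heap reachability far more thoroughly.

```python
class Sentinel:
    """Uniquely identifiable object used to trace flows through library code."""
    def __init__(self, tag):
        self.tag = tag

def observe_flow(fn, *args):
    """Run fn once and report which argument positions flow to the result.

    A dynamic, test-driven approximation of a points-to summary: if the
    sentinel passed in position i is reachable from the return value, the
    summary records that fn's result may alias its i-th argument.
    """
    result = fn(*args)
    flows = set()
    for i, a in enumerate(args):
        if result is a:
            flows.add(i)
        elif isinstance(result, (list, tuple, set)) and any(x is a for x in result):
            flows.add(i)
    return flows

def lib_wrap(x):      # stands in for unknown (e.g. native) library code
    return [x]

s = Sentinel("arg0")
print(observe_flow(lib_wrap, s))  # {0}: the result contains argument 0
```

A specification inferred this way is unsound in general (one execution proves presence, not absence, of a flow), which is why observations from many synthesized tests are combined before a summary is emitted.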
An A*-algorithm for computing discounted anti-alignments in process mining
Process mining techniques aim at analyzing and monitoring processes through event data. Formal models like Petri nets serve as an effective representation of the processes. A central question in the field is to assess the conformance of a process model with respect to the real process executions. The notion of anti-alignment, which represents a model run that is as distant as possible from the process executions, has been demonstrated to be crucial to measure precision of models. However, the only known algorithm for computing anti-alignments has a high complexity, which prevents it from being applied to real-life problem instances. In this paper we propose a novel algorithm for computing anti-alignments, based on the well-known graph-based A* scheme. By introducing a discount factor in the edit distance used for the search of anti-alignments, we obtain the first efficient algorithm to approximate them. We show that this approximation is quite accurate in practice, by comparing it with the optimal results for small instances where the optimal algorithm can also compute anti-alignments. Finally, we compare the obtained precision metric with the state-of-the-art metrics in the literature on real-life examples.
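To illustrate the discounting idea in isolation: weighting the cost of an edit at position i by a factor theta**i makes early deviations dominate, which gives the A* search an informative, quickly converging cost. The sketch below is one plausible discounting scheme applied to a plain dynamic-programming edit distance between two fixed sequences; it is not the paper's search algorithm, and the exact placement of the exponents is an assumption for illustration.

```python
def discounted_edit_distance(u, v, theta=0.9):
    """Levenshtein-style distance where an edit at position i costs theta**(i-1).

    Illustrative only: the paper plugs a discounted edit distance of this
    kind into an A* search over model runs; here we just compute it between
    two concrete sequences by dynamic programming.
    """
    n, m = len(u), len(v)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):                      # delete all of u
        d[i][0] = d[i - 1][0] + theta ** (i - 1)
    for j in range(1, m + 1):                      # insert all of v
        d[0][j] = d[0][j - 1] + theta ** (j - 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if u[i - 1] == v[j - 1] else theta ** (max(i, j) - 1)
            d[i][j] = min(d[i - 1][j] + theta ** (i - 1),  # delete u[i-1]
                          d[i][j - 1] + theta ** (j - 1),  # insert v[j-1]
                          d[i - 1][j - 1] + cost)          # match / substitute
    return d[n][m]

print(discounted_edit_distance("abc", "abc"))  # 0.0: identical sequences
print(discounted_edit_distance("abc", "axc"))  # 0.9: one substitution at position 2
```

Because theta < 1 the cost of edits decays geometrically, so the total distance is bounded even for very long (in the limit, infinite) runs, which is what makes the discounted search practical.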
All That Glitters Is Not Gold: Towards Process Discovery Techniques with Guarantees
The aim of a process discovery algorithm is to construct from event data a
process model that describes the underlying, real-world process well.
Intuitively, the better the quality of the event data, the better the quality
of the model that is discovered. However, existing process discovery algorithms
do not guarantee this relationship. We demonstrate this by using a range of
quality measures for both event data and discovered process models. This paper
is a call to the community of IS engineers to complement their process
discovery algorithms with properties that relate qualities of their inputs to
those of their outputs. To this end, we distinguish four incremental stages for
the development of such algorithms, along with concrete guidelines for the
formulation of relevant properties and experimental validation. We will also
use these stages to reflect on the state of the art, which shows the need to
move forward in our thinking about algorithmic process discovery.Comment: 13 pages, 4 figures. Submitted to the International Conference on
Advanced Information Systems Engineering, 202
The Connection between Process Complexity of Event Sequences and Models Discovered by Process Mining
Process mining is a research area focusing on the design of algorithms that
can automatically provide insights into business processes by analysing
historic process execution data, known as event logs. Among the most popular
algorithms are those for automated process discovery, whose ultimate goal is to
generate the best process model that summarizes the behaviour recorded in the
input event log. Over the past decade, several process discovery algorithms
have been proposed but, until now, this research was driven by the implicit
assumption that a better algorithm would discover better process models, no
matter the characteristics of the input event log. In this paper, we take a
step back and question that assumption. Specifically, we investigate the
relations between measures capturing characteristics of the input event log
and the quality of the discovered process models. To this end, we review the
state-of-the-art process complexity measures, propose a new process complexity
measure based on graph entropy, and analyze this set of complexity measures on
an extensive collection of event logs and corresponding automatically
discovered process models. Our analysis shows that many process complexity
measures correlate with the quality of the discovered process models,
demonstrating the potential of using complexity measures as predictors for the
quality of process models discovered with state-of-the-art process discovery
algorithms. This finding is important for process mining research, as it
highlights that not only algorithms, but also connections between input data
and output quality should be studied.
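As a concrete illustration of an entropy-style complexity measure over an event log, consider the Shannon entropy of the directly-follows relation. This is a deliberately simple variant sketched here for intuition, assuming traces are given as sequences of activity labels; it is not claimed to be the exact graph-entropy measure proposed in the paper.

```python
from collections import Counter
from math import log2

def dfg_entropy(event_log):
    """Shannon entropy of the directly-follows edges of an event log.

    The more evenly behaviour is spread over distinct directly-follows
    edges, the higher the entropy and, intuitively, the more complex the
    process. (Illustrative stand-in, not the paper's exact measure.)
    """
    edges = Counter()
    for trace in event_log:
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    total = sum(edges.values())
    return -sum((c / total) * log2(c / total) for c in edges.values())

structured = [("a", "b", "c")] * 10        # a single repeated path
print(dfg_entropy(structured))             # 1.0: two equally frequent edges
```

A log whose directly-follows frequencies are concentrated on a few edges scores low, while an unstructured log with many rare edges scores high; correlating such scores with model quality is what the paper's empirical study does at scale.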
A Temporal Logic-Based Measurement Framework for Process Mining
The assessment of behavioral rules with respect to a given dataset is key in several research areas, including declarative process mining, association rule mining, and specification mining. The assessment is required to check how well a set of discovered rules describes the input data, as well as to determine to what extent data complies with predefined rules. In declarative process mining, in particular, some measures have been taken from association rule mining and adapted to support the assessment of temporal rules on event logs. Among them, support and confidence are the most commonly used, yet they are reportedly unable to provide sufficiently rich feedback to users and often cause spurious rules to be discovered from logs. In addition, these measures are designed to work on a predefined set of rules, thus lacking generality and extensibility. In this paper, we address this research gap by developing a general measurement framework for temporal rules based on Linear-time Temporal Logic with Past on Finite Traces (LTLpf). The framework is independent of the rule-specification language of choice and allows users to define new measures. We show that our framework can seamlessly adapt well-known measures of the association rule mining field to declarative process mining. Also, we test our software prototype implementing the framework on synthetic and real-world data, and investigate the properties characterizing those measures in the context of process analysis.
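For intuition, here is a minimal sketch of trace-level support for a temporal rule, using the classic Declare-style Response constraint as the example. The helper names and the particular definition of support (fraction of traces on which the rule holds, one common choice among several) are illustrative assumptions, not the framework's definitions.

```python
def holds_response(trace, a, b):
    """Response(a, b): every occurrence of a is eventually followed by a b."""
    for i, e in enumerate(trace):
        if e == a and b not in trace[i + 1:]:
            return False
    return True

def support(log, rule):
    """Fraction of traces in the log on which the rule holds."""
    return sum(rule(t) for t in log) / len(log)

log = [("a", "b"), ("a", "c", "b"), ("a", "c")]
print(support(log, lambda t: holds_response(t, "a", "b")))  # 2/3
```

A confidence-style measure would restrict the same ratio to the traces that actually activate the rule (those containing an `a`); the paper's framework generalizes exactly this kind of counting to arbitrary LTLpf rules and user-defined measures.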
Measuring Rule-based LTLf Process Specifications: A Probabilistic Data-driven Approach
Declarative process specifications define the behavior of processes by means
of rules based on Linear Temporal Logic on Finite Traces (LTLf). In a mining
context, these specifications are inferred from, and checked on, multi-sets of
runs recorded by information systems (namely, event logs). To this end, being
able to gauge the degree to which process data comply with a specification is
key. However, existing mining and verification techniques analyze the rules in
isolation, thereby disregarding their interplay. In this paper, we introduce a
framework to devise probabilistic measures for declarative process
specifications. Thereupon, we propose a technique that measures the degree of
satisfaction of specifications over event logs. To assess our approach, we
conduct an evaluation with real-world data, evidencing its applicability in
discovery, checking, and drift detection contexts.
Hyper Static Analysis of Programs - An Abstract Interpretation-Based Framework for Hyperproperties Verification
In the context of systems security, information flows play a central role. Unhandled information flows potentially leave the door open to very dangerous types of security attacks, such as code injection or sensitive information leakage. Information flow verification is based on a notion of dependency between a system's objects, which requires specifications expressing relations between different executions of a system. Specifications of this kind, called hyperproperties, go beyond classic trace properties, which are defined in terms of predicates over single executions. The problem of trace property verification is well studied, both from a theoretical and a practical point of view. Unfortunately, very few works deal with the verification of hyperproperties. Note that hyperproperties are not limited to information flows. Indeed, many other important problems can be modeled only through hyperproperties: process synchronization, availability requirements, integrity issues, error-resistant code checks, just to name a few. The sound verification of hyperproperties is not trivial: it is not easy to adapt classic verification methods, used for trace properties, to deal with hyperproperties. The added complexity derives from the fact that hyperproperties are defined over sets of sets of executions, rather than sets of executions, as is the case for trace properties. In general, passing to powersets involves many problems from a computability point of view, and this is the case also for systems verification. This thesis explores the problem of hyperproperty verification in both its theoretical and practical aspects. In particular, the aim is to extend verification methods used for trace properties to the more general case of hyperproperties. The verification is performed exploiting the framework of abstract interpretation, a very general theory for approximating the behavior of discrete dynamic systems.
Apart from the general setting, the thesis focuses on sound verification methods, based on static analysis, for computer programs. As a case study, which is also a leading motivation, the verification of information flow specifications has been taken into account, in the form of Non-Interference and Abstract Non-Interference. The second is a weakening of the first, useful in contexts where Non-Interference is too restrictive a specification. The results of the thesis have been implemented in a prototype analyzer for (Abstract) Non-Interference which is, to the best of the author's knowledge, the first attempt to implement a sound verifier for these specifications, based on abstract interpretation and taking into account the expressive power of hyperproperties.