




for the attainment of the doctoral degree
in natural sciences in the Department of
Mathematics and Computer Science
of the Faculty of Mathematics and Natural Sciences
of the Westfälische Wilhelms-Universität Münster
submitted by
Peter Lammich
from Freiburg im Breisgau
– 2011 –
Dean: Prof. Dr. Matthias Löwe
First reviewer: Prof. Dr. Markus Müller-Olm
Second reviewer: Prof. Dr. Helmut Seidl
Date of the oral examination: 28.6.2011
Date of the doctoral conferral: 28.6.2011
Abstract
We present a model-checking algorithm for dynamic pushdown networks with
monitors (Monitor-DPNs). Monitor-DPNs are a model for parallel programs
with recursive procedures, thread creation, and mutual exclusion via locks that
are bound to syntactic blocks in the program (monitors). We consider predeces-
sor set computation, with which many interesting properties can be expressed,
among them race-conditions, bitvector-analysis, and the (EF,EX)-fragment of the
branching time logic CTL.
Our algorithm is based on acquisition structures, which are a finite-state ab-
straction of executions. An acquisition structure contains enough information
to decide whether an execution is feasible w.r.t. the semantics of locks. By en-
coding the acquisition structures into the control-states of a Monitor-DPN, we
reduce lock-sensitive predecessor set computation to lock-insensitive predecessor
set computation, for which efficient algorithms are known.
For fixed-size, negation-free (EF,EX)-formulas, which are sufficient to describe
most interesting properties, our algorithm requires exponential time in the num-
ber of locks, but only polynomial time in the program size. To justify the ex-
ponential complexity, we show that checking most interesting properties is NP-
hard. We also show that the model-checking problem for fixed-size, negation-





1.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Preliminaries 11
2.1 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 Word Automata . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.2 Tree Automata . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3.1 Big-O Notation . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.2 Size of Input Data . . . . . . . . . . . . . . . . . . . . . . 16
2.3.3 Computational Complexity . . . . . . . . . . . . . . . . . . 18
3 Models 21
3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 Dynamic Pushdown Networks . . . . . . . . . . . . . . . . . . . . 22
3.2.1 Interleaving Semantics . . . . . . . . . . . . . . . . . . . . 23
3.2.2 Tree-Semantics . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Locks and Monitors . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3.1 Lock-Sensitive Interleaving Semantics . . . . . . . . . . . . 30
3.3.2 Well-Nestedness . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3.3 Lock-Sensitive Scheduler . . . . . . . . . . . . . . . . . . . 35
3.4 Summary and Related Work . . . . . . . . . . . . . . . . . . . . . 38
4 Lock-Sensitive Schedulability 45
4.1 Acquire/Release-Hedges . . . . . . . . . . . . . . . . . . . . . . . 45
4.1.1 Mapping Execution-Hedges to A/R-Hedges . . . . . . . . . 48
4.2 Schedules of Acquire/Release-Hedges . . . . . . . . . . . . . . . . 50
4.2.1 A Theory of Movers . . . . . . . . . . . . . . . . . . . . . 51
4.2.2 Disciplined Schedules of Execution-Hedges . . . . . . . . . 54
4.3 Acquisition Structures . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.1 Dependence-Graph . . . . . . . . . . . . . . . . . . . . . . 58
4.3.2 Acquisition- and Release-Graphs . . . . . . . . . . . . . . . 63
Contents
4.3.3 A Tree Automaton for Schedulable A/R-Hedges . . . . . . 65
4.4 Summary and Related Work . . . . . . . . . . . . . . . . . . . . . 71
5 Automata Constructions 75
5.1 A DPN-Acceptor for Schedulable Execution-Hedges . . . . . . . . 75
5.1.1 DPN-Acceptors . . . . . . . . . . . . . . . . . . . . . . . . 75
5.1.2 A DPN-Acceptor for Regular Sets of A/R-Hedges . . . . . 76
5.2 Cross-Product Construction . . . . . . . . . . . . . . . . . . . . . 89
5.3 Summary and Related Work . . . . . . . . . . . . . . . . . . . . . 96
6 Lock-Sensitive Predecessor Sets 99
6.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2 Lock-Insensitive Predecessor Set Computation . . . . . . . . . . . 100
6.3 Reduction to Lock-Insensitive Predecessor Set Computation . . . 102
6.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4.1 Atomic-Set Serializability Violation . . . . . . . . . . . . . 105
6.4.2 EF-Formula . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4.3 Bounded Model-Checking . . . . . . . . . . . . . . . . . . 109
6.5 Summary and Related Work . . . . . . . . . . . . . . . . . . . . . 109
7 Optimizations 113
7.1 Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.2 Queries from the Start Configuration . . . . . . . . . . . . . . . . 114
7.2.1 Consistency Check . . . . . . . . . . . . . . . . . . . . . . 114
7.2.2 Immediate Predecessor Sets . . . . . . . . . . . . . . . . . 115
7.2.3 Initial Releases . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3 Deadlocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3.1 Inescapable Locks . . . . . . . . . . . . . . . . . . . . . . . 118
7.3.2 No Spawn inside Monitors . . . . . . . . . . . . . . . . . . 122
7.3.3 Deadlock Detection . . . . . . . . . . . . . . . . . . . . . . 124
7.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
7.4.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.5 Summary and Related Work . . . . . . . . . . . . . . . . . . . . . 135
8 Non-Monitor Locking Disciplines 137
8.1 Well-Nested, Reentrant Locks . . . . . . . . . . . . . . . . . . . . 137
8.2 Well-Nested, Non-Reentrant Locks . . . . . . . . . . . . . . . . . 138
8.3 Summary and Related Work . . . . . . . . . . . . . . . . . . . . . 141
9 Complexity 143
9.1 Models and Properties . . . . . . . . . . . . . . . . . . . . . . . . 143
9.2 Monitors and Well-Nested, Non-Reentrant Locks . . . . . . . . . . 145
9.2.1 Lower Complexity Bounds . . . . . . . . . . . . . . . . . . 146
9.2.2 Upper Complexity Bounds . . . . . . . . . . . . . . . . . . 152
9.3 Stronger Synchronization Mechanisms . . . . . . . . . . . . . . . . 165







1 Introduction
Computer programming is prone to errors. Due to Rice’s theorem [107], there
exists no algorithm that automatically proves that a program meets its specification.
In practice, several methods are applied to verify program correctness. The
most widely used method is testing, i.e., running the program on some test input
and verifying that it yields the expected output. Testing is inherently incomplete,
as it only covers a finite subset of the possible executions. On the other hand,
testing is simple and requires no knowledge of formal methods, which explains
its widespread use. Another method is to rigorously prove that a program meets
its specification. This requires the use of sophisticated formal methods. More-
over, writing specifications and proofs is tedious and error-prone, in particular for
large programs. While interactive theorem provers and automated proof checkers
help to avoid errors in proofs, they require even more rigorous, machine readable
formalizations, which tend to be hardly understandable by humans.
Automatic verification methods lie in between testing and complete correctness
proofs. The goal is to automatically verify absence of certain errors, like buffer-
overflows or null-pointer accesses. Due to Rice’s theorem, such methods must be
incomplete, i.e., they may fail to verify correct programs, or unsound, i.e., they
may fail to detect errors. However, one tries to design methods that succeed for
typical programs.
Due to the widespread availability of concurrent hardware, like multicore pro-
cessors, concurrent programs are getting more and more important, and almost
all modern programming languages support concurrent programming. However,
concurrent programming is prone to subtle errors like race-conditions and dead-
locks that do not occur in sequential programs. Even worse, execution of concur-
rent programs is nondeterministic, as it depends on the order in which the steps
of the parallel processes are executed. Thus, there are concurrency related errors
that manifest themselves only at a very low probability, depending on external
parameters like the scheduler and the processor speed. Such errors are likely to
be missed by testing. On the other hand, formal methods for proving correctness
of concurrent programs are even more complex than those for sequential pro-




Example 1.1. As an example, consider the following Java [47] program:
public class Example implements Runnable {
    private static void write_terminal(String s) {
        for (int i = 0; i < s.length(); ++i) {
            System.out.print(s.charAt(i));
        }
        System.out.flush();
    }

    private synchronized void write(String s) {
        write_terminal(s);
    }

    public void run() {
        write("Hello");
    }

    public static void main(String args[]) {
        new Thread(new Example()).start();
        new Thread(new Example()).start();
    }
}
The program creates two threads, and each thread invokes the write() method to
print the string "Hello". To synchronize the concurrent access to the terminal,
the write() method is marked as synchronized. Probably, the intention of the
programmer was that the program outputs the string "HelloHello". And indeed,
running the program on the author’s machine 50 times yields the correct output
each time. However, the program contains a subtle error: The semantics of the
synchronized keyword is to synchronize on the monitor of the object on which
the method is invoked. In our case, this is the Example object that the thread
runs, and each thread runs its own Example object. Thus, the write()
methods of the two threads are synchronized on two different objects, and there is
actually no mutual exclusion! The error does not manifest itself because the
scheduler uses timeslicing to switch between the threads: As the time required
to complete the call of write() is small in relation to the duration
of a timeslice, a preemption inside the write() method is very unlikely. However,
this behavior may be different on other architectures or operating systems, such
that the error may manifest itself with a higher probability there. We will resume
this example in Section 7.4, where we show how the analysis developed in this
thesis detects the error.
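One possible repair of Example 1.1 can be sketched as follows, assuming that a single lock shared by all instances is acceptable. The class name SharedLockExample, the LOCK field, and the StringBuilder that stands in for the terminal are illustrative choices and are not taken from the program above.

```java
final class SharedLockExample implements Runnable {
    // One lock shared by all instances (an assumption of this sketch).
    private static final Object LOCK = new Object();
    // Stands in for the terminal so that the output can be inspected.
    private static final StringBuilder OUT = new StringBuilder();

    private static void writeTerminal(String s) {
        for (int i = 0; i < s.length(); ++i) {
            OUT.append(s.charAt(i));
        }
    }

    private void write(String s) {
        // All threads synchronize on the same object, so two calls cannot interleave.
        synchronized (LOCK) {
            writeTerminal(s);
        }
    }

    @Override
    public void run() {
        write("Hello");
    }

    // Starts two threads, waits for both, and returns what they wrote.
    static String runTwoThreads() {
        synchronized (LOCK) {
            OUT.setLength(0);
        }
        Thread t1 = new Thread(new SharedLockExample());
        Thread t2 = new Thread(new SharedLockExample());
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (LOCK) {
            return OUT.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads());
    }
}
```

Because both threads now contend for the same monitor, the characters of the two strings cannot be interleaved, regardless of the scheduler.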
Automatic verification methods often work in two phases: In the first phase, the
program to be verified is abstracted to an abstract model, on which the verification
problem is decidable. The abstraction is usually sound, i.e., behaviors of the
program correspond to behaviors of the abstract model. However, the abstraction
is not precise in general, i.e., the abstract model may have spurious behaviors that
do not correspond to actual behaviors of the program.
The second phase then decides the verification problem on the abstract model.
If the second phase succeeds, the program is proved correct (i.e., free of the
errors the analysis checks for), because all behaviors of the abstract model, which
are a superset of the behaviors of the concrete model, are proved correct. If,
however, the second phase fails, the abstract model has an erroneous behavior.
This behavior may either correspond to a behavior of the program, or it may be
spurious. In the former case, the program really has an error. In the latter case,
the program may be correct or not. If the decision procedure used in the second
phase is able to produce a description of the erroneous behavior of the abstract
model, this behavior can be projected on the program and tested for feasibility
by simulating the program. If the behavior is indeed feasible, a definite error
has been detected. Otherwise, information about the failed simulation may be
used to refine the abstraction, and the whole process is iterated with the refined
abstraction. This scheme is commonly referred to as counterexample-guided
abstraction refinement (CEGAR) [25].
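The refinement loop described above can be sketched as follows. This is only an illustration of the control flow: the interface Steps, its method names, the Verdict enum, and the round bound maxRounds are hypothetical and do not correspond to any concrete tool. The bound is an assumption of the sketch, added because the loop need not terminate in general.

```java
import java.util.Optional;

final class CegarSketch {
    /** Hypothetical hooks; M is the abstract model type, T an abstract error trace. */
    interface Steps<M, T> {
        M initialAbstraction();                  // first phase: abstract the program
        Optional<T> findAbstractError(M model);  // second phase: decide on the abstract model
        boolean isFeasible(T trace);             // simulate the trace on the program
        M refine(M model, T spuriousTrace);      // use the failed simulation to refine
    }

    enum Verdict { CORRECT, ERROR_FOUND, GAVE_UP }

    static <M, T> Verdict run(Steps<M, T> steps, int maxRounds) {
        M model = steps.initialAbstraction();
        for (int round = 0; round < maxRounds; ++round) {
            Optional<T> trace = steps.findAbstractError(model);
            if (!trace.isPresent()) {
                return Verdict.CORRECT;          // no erroneous abstract behavior
            }
            if (steps.isFeasible(trace.get())) {
                return Verdict.ERROR_FOUND;      // definite error of the program
            }
            model = steps.refine(model, trace.get()); // spurious: refine and iterate
        }
        return Verdict.GAVE_UP;
    }
}
```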
This thesis is focused on the second phase of automatic verification, i.e., de-
ciding verification problems on the abstract model. Our model, Monitor-DPN,
is an infinite-state model with unboundedly deep stacks and unboundedly many
threads that may be created dynamically. Additionally, synchronization between
threads via reentrant monitors is supported. What is not supported, and thus
has to be abstracted away in the first phase, are infinite-state data structures on
the heap, shared memory, and other synchronization mechanisms like wait/notify
and join. With the exception of join, support of any of these concepts leads to
undecidability of almost all interesting properties.
Our method is based on predecessor sets. For a set of configurations C, the pre-
decessor set of C consists of all configurations c′ such that there exists a sequence
of transitions from c′ to some configuration c ∈ C. Many interesting properties
can be expressed with predecessor sets, among them absence of race-conditions
and the (EF,EX)-fragment of the branching time logic CTL [27]. Moreover, our
technique can be applied to increase the precision of bounded model-checking,
where the objective is finding errors rather than proving their absence.
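To illustrate the definition of predecessor sets, the following sketch computes them for a finite transition graph by backward reachability. This is a simplification: the models considered in this thesis are infinite-state, so sets of configurations cannot be enumerated like this and have to be represented symbolically. The class and method names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class FinitePredecessorSets {
    /** edges.get(s) lists the direct successors of state s. */
    static Set<Integer> preStar(Map<Integer, List<Integer>> edges, Set<Integer> targets) {
        // Reverse the transition relation once.
        Map<Integer, List<Integer>> reverse = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : edges.entrySet()) {
            for (Integer succ : e.getValue()) {
                reverse.computeIfAbsent(succ, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Every target is its own predecessor via the empty sequence of transitions.
        Set<Integer> result = new TreeSet<>(targets);
        Deque<Integer> work = new ArrayDeque<>(targets);
        while (!work.isEmpty()) {
            Integer c = work.pop();
            List<Integer> preds = reverse.get(c);
            if (preds == null) {
                continue;
            }
            for (Integer pred : preds) {
                if (result.add(pred)) {
                    work.push(pred);
                }
            }
        }
        return result;
    }
}
```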
In the remainder of this chapter, we discuss related work and the contributions
of this thesis. Finally, we sketch the outline of this thesis.
1.1 Related Work
Testing of programs has been done since the early days of computers (cf. [45] for
an overview). Methods for rigorous correctness proofs of programs are usually
based on the work of Floyd [42] and Hoare [50] (Hoare-logic), with extensions
to handle concurrent programs (cf. [53, 95]) and recursive programs (cf. [5] for
an overview). However, those methods are quite complex and tedious for large
programs. Up to now, complete formal verification of programs with interactive
theorem provers and Floyd-Hoare style logic (e.g. [64, 118]) is mostly restricted
to academia.
Model-Checking Model-checking was developed and pushed forward by Clarke
and Emerson [26] and Queille and Sifakis [103] out of the need for automatic
methods to verify concurrent programs [24]. The basic idea is to automatically
verify a system specified by an abstract model against a specification in temporal
logic. An overview of the history of model-checking can be found in [24].
The biggest obstacle for model-checking is state-space explosion: The state-
space of a concurrent program has exponential size in the number of concurrently
executed processes. This limits the explicit exploration of the state-space to
rather small programs. The state-space explosion problem has been tackled with
different techniques, among them symbolic model-checking [20], where sets of
states are represented by binary decision diagrams, and partial order reduction
techniques [46, 97, 114], which reduce the number of states that need to be
explored by exploiting that steps of concurrent processes are often independent.
Model-checking can also be applied to infinite-state systems. For example,
Bouajjani et al. [13] model-check branching and linear time logics on pushdown
processes. However, regarding concurrent infinite-state systems, model-checking
tends to be undecidable. Mayr [86] provides a classification of infinite-state sys-
tems and obtains decidability and undecidability results for various temporal
logics. Among other results, Mayr [85, 86] shows that the (EF,EX)-fragment of
CTL (referred to as EF-logic) is decidable for models with parallel procedure
calls.1
Another technique to cope with state-space explosion or infinite-state systems is
bounded model-checking, where the state-space is only explored up to a certain
search depth [9, 10]. While this allows the application of alternative symbolic
techniques like SAT-solvers, bounding the search depth no longer allows one to prove
the system correct. One can only find errors, hoping that all errors manifest
themselves within a tractable search depth. The idea of bounded model-checking
can also be combined with precise techniques developed for pushdown systems.
Qadeer and Rehof [101] apply bounded model-checking to recursive, concurrent
Boolean programs, where the bound is not the number of steps, but the number
of context switches. Their technique is improved and generalized to DPNs by
Bouajjani et al. [15].
1Presented in a process-algebraic framework, called PA- and PAD-processes.
Predecessor Sets A well-known result by Büchi [19] is that the set of reachable
configurations of a pushdown system is regular. This has been extended by Caucal
[22, 23], who characterizes pushdown systems as rational transducers. This result
implies that predecessor and successor sets of regular sets of configurations are
regular again and can be computed. Bouajjani and Maler [11], Bouajjani et al.
[13], and Finkel et al. [41] apply predecessor and successor set computation to
decide temporal logics for pushdown systems. Successor and predecessor set
computation has been generalized to PA-processes by Lugiez and Schnoebelen
[82, 83]. While most branching time temporal logics are undecidable for PA-
processes, EF-logic can be naturally decided via predecessor set computation.
Thus, the algorithm of Lugiez and Schnoebelen [82, 83] is a succinct alternative
to the more complicated tableaux based method of Mayr [85, 86].
Dynamic Pushdown Networks Intuitively, PA-processes support recursion
and concurrency by allowing procedures to be called in parallel. After a parallel
call, the procedures run in parallel, and the call returns if all called procedures
have returned. However, concurrency in programming languages like Java [47]
works differently: Once a thread is created, it runs until termination and does
not synchronize with its creator (unless the creator invokes a join-statement).
In particular, a thread may survive the procedure that created it. This type of
concurrency is called dynamic thread creation and cannot be modeled by PA-
processes [16]. Bouajjani et al. [16, 17] propose dynamic pushdown networks
(DPNs), which support dynamic thread creation. They show that predecessor
set computation for DPNs can be done in polynomial time and preserves regular-
ity. A DPN is a pushdown system where rules may create (spawn) new threads
as a side effect. The spawned threads have their own stacks and run in parallel
with all other threads in the system. A disadvantage is that DPNs do not support
joining of threads as PA-processes do: Once a thread is spawned, it runs
completely independently of its creator. Thus, DPNs and PA-processes are
incomparable in expressive power. Hence, Bouajjani et al. [16, 17] propose
Constrained DPNs (CDPNs) that are strictly more general than DPNs and PA-
processes. They show that predecessor sets for CDPNs still preserve regularity
and are effectively computable.
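The informal description of DPNs above can be made concrete with a small data-structure sketch. The encoding is an assumption made for illustration (control states and stack symbols as strings, a configuration as a list of threads); the formal definition follows in Chapter 3 and may differ in its details.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class DpnSketch {
    /** One thread: a control state plus its own stack. */
    static final class ThreadState {
        String control;
        final Deque<String> stack = new ArrayDeque<>();

        ThreadState(String control, List<String> stackTopFirst) {
            this.control = control;
            this.stack.addAll(stackTopFirst); // head of the deque is the top of the stack
        }
    }

    /** A rule rewrites (control, top) and may spawn one new thread as a side effect. */
    static final class Rule {
        final String control, top, newControl;
        final List<String> replacement;  // replaces the top symbol, top first
        final String spawnControl;       // null if the rule spawns no thread
        final List<String> spawnStack;

        Rule(String control, String top, String newControl, List<String> replacement,
             String spawnControl, List<String> spawnStack) {
            this.control = control;
            this.top = top;
            this.newControl = newControl;
            this.replacement = replacement;
            this.spawnControl = spawnControl;
            this.spawnStack = spawnStack;
        }
    }

    /** Applies the rule to thread number index; returns false if it is not applicable. */
    static boolean step(List<ThreadState> config, int index, Rule rule) {
        ThreadState t = config.get(index);
        if (!t.control.equals(rule.control) || !rule.top.equals(t.stack.peek())) {
            return false;
        }
        t.stack.pop();
        for (int k = rule.replacement.size() - 1; k >= 0; --k) {
            t.stack.push(rule.replacement.get(k));
        }
        t.control = rule.newControl;
        if (rule.spawnControl != null) {
            // The spawned thread gets its own stack and never synchronizes with its creator.
            config.add(new ThreadState(rule.spawnControl, rule.spawnStack));
        }
        return true;
    }
}
```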
More recently, we have explored an alternative technique that computes suc-
cessor sets of DPNs [44, 74]. It is based on characterizing the set of executions
of a DPN as a tree automaton, and may also be applicable to CDPNs.
Communication between Threads In DPNs and CDPNs, the communica-
tion between threads is rather limited: In DPNs, the only communication be-
tween threads is the causal dependence of a thread on the step that created
it. In CDPNs, threads may additionally observe their children by stable con-
straints, e.g. observe termination or progress of spawned threads. However, real
programming languages support more powerful communication between threads,
like shared memory or message-passing. Unfortunately, most interesting proper-
ties are undecidable for concurrent pushdown systems with shared global state or
other synchronization mechanisms like rendezvous-communication and message-
passing [104].
Besides shared memory, real programming languages also use locks and moni-
tors to synchronize the access to shared resources. Intuitively, a lock is an object
that can be acquired by at most one thread at the same time. If a thread wants
to acquire a lock that is already acquired by another thread, it has to wait until
the other thread releases the lock. Monitors (cf. [51]) are a special discipline of
using locks: A monitor is a syntactic block in the program text that is associ-
ated with a lock. The lock is acquired upon entering the monitor and released
upon leaving the monitor. For example, the synchronized methods of Java [47]
implement monitors.
Monitors enforce locks to be used in a well-nested fashion, i.e., a thread can
only release the last lock that it has acquired and not yet released. For well-nested
locking, each thread can be thought of as having a stack of locks (lockstack).
An acquisition pushes the acquired lock onto the lockstack, and a release pops
the topmost lock from the stack. For monitors, locks are acquired and released
inside the same procedure. Thus, the lockstack is synchronized with the callstack,
which stores the return addresses.
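The lockstack of a single thread can be sketched as follows. The class models only the well-nestedness discipline of one thread, not mutual exclusion between threads, and its name and methods are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class LockStack {
    private final Deque<String> stack = new ArrayDeque<>();

    /** An acquisition pushes the acquired lock onto the lockstack. */
    void acquire(String lock) {
        stack.push(lock);
    }

    /** A release pops the topmost lock; any other release violates well-nestedness. */
    boolean release(String lock) {
        if (!lock.equals(stack.peek())) {
            return false;
        }
        stack.pop();
        return true;
    }

    int depth() {
        return stack.size();
    }
}
```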
In real programming languages, locks are often reentrant, i.e., a thread is al-
lowed to re-acquire a lock that it already possesses. Each acquisition-operation
increments a counter for the lock and each release-operation decrements the
counter. The thread releases the lock when the counter falls to zero.
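The counter-based view of reentrance can be sketched for a single thread as follows. The class and method names are illustrative; the return values indicate whether an operation actually takes or frees the lock, as opposed to being a reentrant operation.

```java
import java.util.HashMap;
import java.util.Map;

final class ReentrantCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    /** Returns true if this acquisition actually takes the lock (counter goes from 0 to 1). */
    boolean acquire(String lock) {
        int count = counts.merge(lock, 1, Integer::sum);
        return count == 1;
    }

    /** Returns true if this release actually frees the lock (counter falls to 0). */
    boolean release(String lock) {
        Integer count = counts.get(lock);
        if (count == null) {
            throw new IllegalStateException("lock not held: " + lock);
        }
        if (count == 1) {
            counts.remove(lock);
            return true;
        }
        counts.put(lock, count - 1);
        return false;
    }

    boolean holds(String lock) {
        return counts.containsKey(lock);
    }
}
```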
When proving absence of errors like race-conditions, it is essential not to ab-
stract away from locks, as they are typically used to avoid such errors. Kahlon
et al. [57] have shown that reachability analysis for parallel pushdown systems
with well-nested, non-reentrant locks remains decidable. They extended their re-
sults to checking fragments of LTL and CTL [55, 56], and to systems that satisfy
the bounded lock-chain criterion [54], a generalization of well-nested locks. Note
that reachability properties w.r.t. arbitrary, non-well-nested locks are undecidable
[57].
Their model, called parallel pushdown systems (PPDS), consists of a fixed number
of pushdown processes that run completely independently, i.e., there are neither
parallel procedure calls as in PA-processes, nor thread creations as in DPNs. They
decide pairwise reachability properties (viz. logics with double-indexed atomic
propositions in [55, 56]).
Preceding Results In this paragraph, we describe our own results that are re-
lated to this thesis. The main objective of our research is to extend DPNs with
more powerful communication mechanisms, such that interesting properties, like
absence of race-conditions, remain decidable. Considering the tight undecidabil-
ity boundary [104], mutual exclusion via locks is a natural candidate for such an
extension. Leveraging the acquisition history methods of [57], we first developed
an analysis for pairwise reachability of interprocedural flowgraphs with dynamic
thread creation and reentrant monitors [78]. This model differs from DPNs in
that a procedure may not pass a return value to its caller.
A main observation of [78] is that, as far as reachability is concerned, an ex-
ecution can always be reordered such that steps inside a monitor are scheduled
atomically. This is used to prune irrelevant steps from the execution. For pair-
wise reachability, the pruned execution contains no more than two threads that
are running simultaneously.
While [57] deals with non-reentrant, well-nested locks, our methods cover reen-
trant monitors. These two models are incomparable. Unfortunately, we have no
results for the natural generalization of both models, i.e., well-nested, reentrant
locks (cf. Section 8.1). Another difference between [57] and our method in [78] is
that [57] is based on successor set computation for pushdown automata, while we
use fixed point computation and an abstract interpretation framework [31, 32].
Our technique of reordering schedules implies a cut-off: For a specific execution,
we call the acquisition of a lock final if this lock is not released during the remain-
der of the execution. Obviously, the number of non-reentrant final acquisitions in
any execution is bounded by the number of locks. Symmetrically, a release of a
lock that is not acquired during the prefix of the execution is called initial. The
number of non-reentrant initial releases in any execution is also bounded by the
number of locks.
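The two notions can be illustrated on a linear execution with non-reentrant locks. The encoding of steps as strings of the form "+l" (acquire lock l) and "-l" (release lock l) is an assumption of this sketch, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

final class CutOffSketch {
    /** Locks whose acquisition is not followed by a release, in execution order. */
    static List<String> finalAcquisitions(List<String> execution) {
        Set<String> releasedLater = new HashSet<>();
        LinkedList<String> result = new LinkedList<>();
        // Scan backwards so that all later releases are known at each acquisition.
        for (int i = execution.size() - 1; i >= 0; --i) {
            String step = execution.get(i);
            String lock = step.substring(1);
            if (step.charAt(0) == '-') {
                releasedLater.add(lock);
            } else if (!releasedLater.contains(lock)) {
                result.addFirst(lock);
            }
        }
        return result;
    }

    /** Locks that are released without having been acquired earlier in the execution. */
    static List<String> initialReleases(List<String> execution) {
        Set<String> acquiredBefore = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String step : execution) {
            String lock = step.substring(1);
            if (step.charAt(0) == '+') {
                acquiredBefore.add(lock);
            } else if (!acquiredBefore.contains(lock)) {
                result.add(lock);
            }
        }
        return result;
    }
}
```

For the execution -x, +y, +z, -z, +z, the release of x is initial, and the acquisition of y and the second acquisition of z are final. Each lock contributes at most one entry to either list, which is the bound stated above.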
Using this cut-off, a lock-sensitive predecessor set computation for DPNs can
be implemented by guessing a sequence of initial releases and final acquisitions,
and using lock-insensitive predecessor set computation to check whether there is
a feasible execution with that sequence of initial releases and final acquisitions.
However, guessing has to be implemented by iterating over all possible sequences,
resulting in a super-exponential algorithm. We have not published these results,
as they were superseded by subsequent research [79].
In [79], we extend the acquisition history method to DPNs with well-nested,
non-reentrant locks (Lock-DPNs). We show how to compute predecessor sets in
time polynomial in the program size and exponential in the number of locks.
The executions of the threads in a parallel pushdown system are completely
independent. Thus, the execution of each thread can be seen as a linear sequence
of steps. The acquisition history method computes a summary of the executions
of each thread, and then combines these summaries to check whether there are
compatible executions. However, when threads are dynamically created, they are
not independent any more, as a thread does not start before it has been created.
A natural generalization is to describe executions as trees (execution-trees), such
that a step that creates a thread has two successors: The execution of the created
thread, and the remainder of the execution of the creating thread. By generaliz-
ing the acquisition history method from linear executions to execution-trees, we
obtain a tree automaton that characterizes all feasible execution-trees. Then, we
construct the cross-product of the Lock-DPN and this tree automaton, resulting
in a DPN without locks, whose executions correspond to feasible executions of
the original Lock-DPN. The cross-product DPN is then analyzed using the known
predecessor set algorithm of Bouajjani et al. [16].
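For the original setting of two independent threads, the summary-and-combine idea of the acquisition history method can be sketched as follows. The sketch reflects our reading of the criterion of [57] and makes several assumptions: both threads start without holding locks, locks are well-nested and non-reentrant, and steps are encoded as "+l" (acquire) and "-l" (release). It does not cover the generalization to execution-trees.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AcquisitionHistory {
    // For each lock still held at the end of the execution:
    // the locks acquired after its final acquisition.
    final Map<String, Set<String>> history = new LinkedHashMap<>();

    /** Summarizes the linear execution of one thread. */
    static AcquisitionHistory of(List<String> execution) {
        AcquisitionHistory ah = new AcquisitionHistory();
        for (String step : execution) {
            String lock = step.substring(1);
            if (step.charAt(0) == '+') {
                for (Set<String> later : ah.history.values()) {
                    later.add(lock);
                }
                ah.history.put(lock, new HashSet<String>());
            } else {
                ah.history.remove(lock); // released, so the acquisition was not final
            }
        }
        return ah;
    }

    /** Checks whether the two summarized executions can be interleaved. */
    static boolean compatible(AcquisitionHistory a, AcquisitionHistory b) {
        for (String l : a.history.keySet()) {
            if (b.history.containsKey(l)) {
                return false; // both threads would hold l at the end
            }
            for (String m : b.history.keySet()) {
                // Cyclic dependency: a needs m after finally taking l, b needs l after m.
                if (a.history.get(l).contains(m) && b.history.get(m).contains(l)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

For example, the executions +a, +b, -b and +b, +a, -a are incompatible: the first thread needs b after its final acquisition of a, and the second needs a after its final acquisition of b.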
Another generalization that we applied increased the number of threads that
can be handled by acquisition histories. For the pairwise reachability properties
of [57], it was sufficient to restrict to systems with two threads. A similar restric-
tion was achieved in [78] by pruning the executions. However, when computing
predecessor sets of arbitrary configurations, we have to consider more than two
threads. Fortunately, the generalization of acquisition histories to more than two
threads is quite straightforward.
Recall that DPNs and PA-processes are incomparable [16], and CDPNs are a
generalization of both models. A slightly less expressive generalization of both
models are DPNs with joins, where a thread may execute a join-statement to wait
until all threads that it has spawned have terminated. In parallel to the prepara-
tion of this thesis, we have explored analysis of DPNs with joins and well-nested,
non-reentrant locks (Join-Lock-DPNs). The acquisition history techniques can be
generalized to this scenario, and as a first result, we have shown that reachability
of regular sets of configurations is decidable in exponential time [44].
1.2 Contributions
In this thesis, we develop an algorithm for predecessor set computation for DPNs
with reentrant monitors (Monitor-DPNs). Our methods are largely based on
the ideas that we developed in [79] to compute predecessor sets of Lock-DPNs:
We also use execution-trees, characterize execution-trees of feasible executions by
DPN-Acceptors (a generalization of the tree automata used in [79]), and perform
a cross-product construction of a Monitor-DPN and a DPN-Acceptor, thus reduc-
ing lock-sensitive predecessor set computation to lock-insensitive predecessor set
computation, which is then performed by the algorithm of Bouajjani et al. [16].
The main technical differences are: (1) we analyze Monitor-DPNs rather than
Lock-DPNs, (2) we use DPN-Acceptors rather than tree automata, and (3) we
use a more modular approach. Monitor-DPNs and Lock-DPNs are incomparable
models, as Monitor-DPNs have reentrance and Lock-DPNs need not adhere to a
monitor-discipline. DPN-Acceptors generalize tree automata by limited use of a
stack, which allows them to model reentrant monitors. Our approach to analysis of
Monitor-DPNs consists of two phases. In the first phase, we reorder the steps
of an execution, as in [78], such that sequences between matching acquisition-
and release-operations are scheduled atomically. Additionally, we eliminate
reentrance in this phase. In the second phase, the acquisition history method is
applied to the reordered executions. Compared to the direct approach taken in
[79], our approach is more modular and results in simpler proofs.
Moreover, we show that many analysis problems for Lock-DPNs and Monitor-
DPNs are NP-complete, where the high complexity is induced by the number
of locks, not by the program size. This result also matches the runtime of our
predecessor set algorithm, which is exponential only in the number of locks.
When regarding slightly more expressive models, e.g. Join-Lock-DPNs, the anal-
ysis problems become PSPACE-hard.
Summarizing, the contributions of this thesis are the following:
• We characterize executions of DPNs by execution-trees, and show how to
constrain predecessor set computation w.r.t. sets of execution-trees via a
cross-product construction. While computing predecessor sets that are
constrained with a regular set of interleaved executions is not effective,
our results imply a polynomial algorithm for computing predecessor sets
constrained with a regular set of execution-trees, or even with a set of
execution-trees accepted by a DPN-Acceptor.
The idea of execution-trees has been around in our research group for a
while (cf. [111, 120]). The idea of the cross-product construction with a
tree automaton has been developed in [79], and is generalized to DPN-
Acceptors in this thesis.
• We generalize the acquisition histories of Kahlon et al. [55, 57] to execution-
trees, thus supporting lock-sensitive analysis of DPNs. We mainly worked
out this generalization in [79]. The main difference is that we analyze
DPNs with reentrant monitors, while [79] analyzes DPNs with well-nested,
non-reentrant locks. The technical approach in this thesis also differs
from [79]. While the formulation in [79] defines acquisition structures over
execution-trees, making the correctness proof quite involved, this thesis
presents a modular approach that results in simpler proofs.
• We use generalized acquisition histories and the cross-product construction
to reduce lock-sensitive predecessor set computation to lock-insensitive pre-
decessor set computation. This yields an algorithm for lock-sensitive pre-
decessor set computation on Monitor-DPNs that can be applied to check
various interesting properties, like absence of race-conditions and EF-logic.
• Finally, we show that various analysis problems for Monitor-DPNs are NP-
complete, among them detection of race-conditions and model-checking
negation-free EF-logic with a fixed number of operators. Slightly more gen-
eral problems, like reachability analysis on Join-Lock-DPNs, are PSPACE-
hard. Our NP-hardness results also apply to related problems on systems
with locks. For example, the pairwise reachability problem on parallel pushdown
systems (cf. [57]), as well as the model-checking problems studied by
Kahlon and Gupta [55, 56], are NP-hard.
1.3 Outline
The remainder of this thesis is organized as follows: In Chapter 2, we present
basic concepts that are frequently used in this thesis. In Chapter 3, we for-
mally define Monitor-DPNs and their semantics. We define an interleaving se-
mantics as a reference point, and a tree-semantics on execution-trees, and show
that the two are equivalent. In Chapter 4, we generalize the concept of ac-
quisition histories to execution-trees. As an intermediate model, we introduce
acquire/release-trees, where matching acquisition- and release-nodes are summa-
rized into single use-nodes. By reordering the steps of an execution such that
steps between matching acquisitions and releases are scheduled atomically, we
show that schedulability of a lock-execution-tree is equivalent to schedulability of
the corresponding acquire/release-tree. Then, we use acquisition histories to show
that the set of schedulable acquire/release-trees is regular. In Chapter 5, we show
how to combine a Monitor-DPN with a regular constraint on its acquire/release-
trees. This is done in two phases: In the first phase, we translate automata
on acquire/release-trees to DPN-Acceptors on execution-trees. In the second
phase, we do a cross-product construction between the Monitor-DPN and the
DPN-Acceptor. In the first part of Chapter 6, we use the results of the previous
chapters to reduce lock-sensitive predecessor set computation to lock-insensitive
predecessor set computation. As lock-insensitive predecessor set computation is
effective [16], we obtain an algorithm for lock-sensitive predecessor set compu-
tation, which is the main result of this thesis. In the second part of Chapter 6,
we sketch some applications of this algorithm. In Chapter 7, we describe opti-
mizations that simplify our algorithm for typical analysis scenarios, and present
a simple example that illustrates how our analysis finds the data-race in the
Java program from Example 1.1. Finally, we use the example to reveal problems
that an implementation of our algorithm has to solve, and propose solutions. In
Chapter 8, we briefly discuss how the methods of this thesis can be transferred to
the analysis of DPNs with well-nested, non-reentrant locks, and discuss why our
techniques do not apply to well-nested, reentrant locks. Moreover, we present a
polynomial time algorithm that checks whether a given DPN uses locks only in
a well-nested, non-reentrant fashion. In Chapter 9, we show that our analysis is
NP-complete, and establish lower complexity bounds for various related analysis
problems. Finally, Chapter 10 contains the conclusion of this thesis and indicates
topics of current and future research.
2 Preliminaries
In this chapter, we briefly introduce some basic concepts and notations that we
use in this thesis.
This chapter is organized as follows: In Section 2.1, we agree on conventions for
basic notations of sets, sequences, etc. In Section 2.2, we describe the concepts
of word and tree automata. In Section 2.3, we introduce concepts required to
estimate the complexity of our algorithms.
2.1 Notations
The set B := {⊤,⊥} is the set of truth values. The set of natural numbers
including zero is denoted by N. We set N+ := N\{0}. The symbol R denotes the
real numbers. We set R+ := {r ∈ R | r > 0}. We use the standard operations
∪,∩, \,∈ for sets. For a predicate P and a set S, the set {x ∈ S | P (x)} denotes
the set of all elements of S that satisfy P . We write A ∪˙ B for disjoint union of
the sets A and B, i.e., A ∪˙B = A ∪B, and, additionally, we assume A ∩B = ∅.
We use sets and their characteristic functions synonymously. For example, we
write valid(c) or c ∈ valid to denote that the element c is contained in the set
valid.
The set of finite length lists of elements from a set X is denoted by X∗. The
empty list is denoted by ε. A list with distinct elements is denoted by [x1, . . . , xn].
List concatenation is written as juxtaposition, i.e., l1l2 is the concatenation
of the lists l1 and l2. When clear from the context, we also write x1 . . . xn
instead of [x1, . . . , xn], and mix lists and single elements to form a new list, i.e.,
l x1 x2 l′ instead of l[x1, x2]l′. With |l|, we denote the length of the list l. When
unambiguous, we use lists in place of the set of their elements. For example, we
write x ∈ l to denote that l contains an element x, and l1∩ l2 to denote the set of
elements contained in both l1 and l2. To explicitly denote the set of elements of
a list l, we write set(l). Variables of a list type are sometimes written with a bar,
i.e., x¯ ∈ X∗, to distinguish them from variables x ∈ X of the element type. For
any set A, the predicate disjoint ⊆ A∗ holds for exactly the lists whose elements
are pairwise distinct:
disjoint(w) ⇐⇒ ∀w1, w2, w3 ∈ A∗, x, y ∈ A. w = w1xw2yw3 =⇒ x ≠ y.
Let $L_1 \subseteq X_1^*$ and $L_2 \subseteq X_2^*$ be sets of lists. Then L1L2 is the set of lists l1l2
with l1 ∈ L1 and l2 ∈ L2. The set of n-element lists of X is denoted by $X^n$. If
types are clear from the context, $X^1$ may be written as just X. This allows a
regular-expression-like notation for sets of lists. For example, let A = {a, . . . , z}
be the Latin alphabet, then {a}A∗({y} ∪ {z}) ⊆ A∗ is the set of all words that
start with the letter a and end with the letter y or z.
Given a relation R ⊆ A×B, we write aRb for (a, b) ∈ R. We define the inverse
$R^{-1} \subseteq B \times A$ of the relation R by
\[ R^{-1} := \{(b, a) \mid (a, b) \in R\}. \]
The image of a set S under R is denoted by R(S), or, alternatively, by SR, and
defined as
R(S) := SR := {b | ∃a ∈ S. (a, b) ∈ R}.
Given two relations R ⊆ A×B and S ⊆ B × C, we define the composition of R
and S by
(a, c) ∈ R ◦ S iff ∃b. (a, b) ∈ R ∧ (b, c) ∈ S.
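This definition is also used if relations are given in arrow-notation, i.e., →1 ◦ →2,
and also for ternary relations that represent labeled transition systems:
\[ c \;(\xrightarrow{a}_1 \circ \xrightarrow{b}_2)\; c' \quad\text{iff}\quad \exists \tilde{c}.\ c \xrightarrow{a}_1 \tilde{c} \xrightarrow{b}_2 c'. \]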
For relations of the form R ⊆ A × A, or R ⊆ A × Σ × A, R∗ denotes the
reflexive, transitive closure, and R+ denotes the transitive closure.
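The relational operations above can be sketched for finite relations as follows. This is only an illustration under the assumption that a relation is stored as a Python set of pairs; the function names are hypothetical and do not appear elsewhere in this thesis.

```python
def inverse(R):
    """R^{-1}: swap the components of every pair."""
    return {(b, a) for (a, b) in R}

def image(R, S):
    """R(S): all b that are R-related to some a in S."""
    return {b for (a, b) in R if a in S}

def compose(R, S):
    """R ∘ S in the left-to-right reading used here: first R, then S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def transitive_closure(R):
    """R^+: add composed pairs until nothing new appears."""
    closure = set(R)
    while True:
        new_pairs = compose(closure, R) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs

def reflexive_transitive_closure(R, A):
    """R^*: R^+ plus the identity on the carrier set A."""
    return transitive_closure(R) | {(a, a) for a in A}

R = {(1, 2), (2, 3)}
```

Note that `compose` follows the convention of the text, where R ◦ S applies R first; many textbooks use the opposite order.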
2.2 Automata
In this section, we define the well-known concepts of word and tree automata.
2.2.1 Word Automata
Word automata are a well-known concept to characterize regular sets of sequences
of symbols. A good introduction to word automata is [52]. We briefly sketch the
definition of word automata and some basic results here. A word automaton
A = (Σ, Q, I, F, δ) consists of a finite alphabet Σ, a finite set of states Q, a set
of initial states I ⊆ Q, a set of final states F ⊆ Q, and a set of transitions
δ ⊆ Q × Σ × Q. If the alphabet is clear from the context, we omit it and write
A = (Q, I, F, δ). We write q a−→δ q′ instead of (q, a, q′) ∈ δ, and write −→∗δ for the
reflexive, transitive closure of −→δ.
A word w ∈ Σ∗ is recognized in state q by A, iff there is a final state q′ ∈ F ,
such that (q, w, q′) ∈ δ∗. We write A(q) for the set of words recognized in state q.
A word w is recognized by A, iff it is recognized in an initial state. The language
L(A) of an automaton A is the set of all recognized words:
L(A) = {w ∈ Σ∗ | ∃q ∈ I, q′ ∈ F. (q, w, q′) ∈ δ∗}.
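As a rough illustration of this definition, the following sketch decides membership in L(A) by tracking the set of states reachable after each letter. The encoding (δ as a set of triples, states as integers) and all names are assumptions made for this example; the sample automaton recognizes {a}A∗({y} ∪ {z}) over the reduced alphabet {a, y, z}.

```python
def step(delta, states, letter):
    """All states reachable from `states` by reading one letter."""
    return {q2 for (q1, a, q2) in delta if q1 in states and a == letter}

def recognized_in(delta, final, q, word):
    """True iff (q, word, q') is in delta* for some final state q'."""
    states = {q}
    for letter in word:
        states = step(delta, states, letter)
    return bool(states & final)

def in_language(initial, final, delta, word):
    """True iff the word is recognized in some initial state."""
    return any(recognized_in(delta, final, q, word) for q in initial)

# Hypothetical automaton for {a} A* ({y} ∪ {z}) with A = {a, y, z}.
ALPHABET = "ayz"
INITIAL, FINAL = {0}, {2}
DELTA = ({(0, "a", 1)}
         | {(1, x, 1) for x in ALPHABET}
         | {(1, "y", 2), (1, "z", 2)})
```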
A set that is the language of some automaton is called regular. In order to
store the automaton A, we need polynomial space in the size of the alphabet and
the number of states:
|A| = poly(|Σ||Q|).
See Section 2.3 for details.
The class of regular sets is closed under union, intersection, and complement.
Given automata A and B over the alphabet Σ, automata A ∪ B, A ∩ B, and
Σ∗ \ A can be constructed such that
L(A ∪B) = L(A) ∪ L(B) L(A ∩B) = L(A) ∩ L(B) L(Σ∗ \ A) = Σ∗ \ L(A).
Union and intersection can be computed in polynomial time, the complement
can be computed in exponential time. Moreover, emptiness of the language of a
given automaton can be decided in polynomial time. The sizes of the automata
can be estimated by
\[ |A \cup B| = O(|A| + |B|) \qquad |A \cap B| = O(|A||B|) \qquad |\Sigma^* \setminus A| = 2^{O(|A|)}. \]
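The size bounds for union and intersection can be made concrete with the usual constructions: a disjoint union of the two automata, and a product automaton that runs both in lockstep. The sketch below assumes automata are given as tuples (Q, I, F, δ) over a shared alphabet; the two sample automata and all names are hypothetical.

```python
def intersect(A, B):
    """Product construction: |Q| = |QA| * |QB|."""
    (QA, IA, FA, dA), (QB, IB, FB, dB) = A, B
    states = {(p, q) for p in QA for q in QB}
    initial = {(p, q) for p in IA for q in IB}
    final = {(p, q) for p in FA for q in FB}
    delta = {((p, q), a, (p2, q2))
             for (p, a, p2) in dA for (q, b, q2) in dB if a == b}
    return states, initial, final, delta

def union(A, B):
    """Disjoint union: states are tagged with 0 or 1, |Q| = |QA| + |QB|."""
    def tag(i, X):
        Q, I, F, d = X
        return ({(i, q) for q in Q}, {(i, q) for q in I}, {(i, q) for q in F},
                {((i, q), a, (i, q2)) for (q, a, q2) in d})
    left, right = tag(0, A), tag(1, B)
    return tuple(left[k] | right[k] for k in range(4))

def accepts(A, word):
    _, initial, final, delta = A
    states = set(initial)
    for letter in word:
        states = {q2 for (q1, a, q2) in delta if q1 in states and a == letter}
    return bool(states & final)

# Words over {a, b} with an even number of a's, and words ending in b.
EVEN_A = ({0, 1}, {0}, {0}, {(0, "a", 1), (1, "a", 0), (0, "b", 0), (1, "b", 1)})
ENDS_B = ({0, 1}, {0}, {1}, {(0, "a", 0), (0, "b", 1), (1, "a", 0), (1, "b", 1)})
```

Complementation is omitted here, as it generally requires a determinization step, which is where the exponential blow-up comes from.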
A function h : Σ→ Γ∗ is called a homomorphism. We extend h to words and
sets of words by
h([a1, . . . , an]) = h(a1) . . . h(an) h(W ) = {h(w) | w ∈ W}.
The inverse $h^{-1} : 2^{\Gamma^*} \to 2^{\Sigma^*}$ of h is defined by
\[ h^{-1}(W) = \{w \in \Sigma^* \mid h(w) \in W\}. \]
The class of regular sets is closed under application of homomorphism and inverse
homomorphism, and given automata A over Σ and B over Γ, and a homomor-
phism h : Σ→ Γ∗, automata h(A) and h−1(B) can be constructed in polynomial
time such that
L(h(A)) = h(L(A)) L(h−1(B)) = h−1(L(B)).
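One standard way to obtain the automaton for the inverse homomorphism keeps the states of B and adds a transition from q to q′ labeled a whenever B can move from q to q′ by reading the word h(a). The following sketch illustrates this under the assumption that h is given as a dictionary from letters to strings; the names and the sample data are hypothetical.

```python
from itertools import product

def apply_hom(h, word):
    """Extend h letter-wise to a word."""
    return "".join(h[a] for a in word)

def run(delta, states, word):
    """States reachable from `states` by reading `word`."""
    for letter in word:
        states = {q2 for (q1, a, q2) in delta if q1 in states and a == letter}
    return states

def accepts(A, word):
    _, initial, final, delta = A
    return bool(run(delta, set(initial), word) & final)

def inverse_hom(h, B):
    """Automaton for h^{-1}(L(B)): one step on a simulates B on h(a)."""
    states, initial, final, delta = B
    new_delta = {(q, a, q2)
                 for q in states for a in h
                 for q2 in run(delta, {q}, h[a])}
    return states, initial, final, new_delta

H = {"x": "ab", "y": "", "z": "a"}                       # hypothetical homomorphism
AB_STAR = ({0, 1}, {0}, {0}, {(0, "a", 1), (1, "b", 0)})  # recognizes (ab)*
PRE = inverse_hom(H, AB_STAR)
```

The number of states does not change, and each new transition is found by a polynomial-time run of B, which is in line with the polynomial bound stated above.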
2.2.2 Tree Automata
Tree automata generalize the concept of word automata to trees. A good intro-
duction to tree automata is [29]. We briefly sketch the definition of tree automata
and some basic results here.
Given a finite, ranked alphabet C = C0 ∪˙ . . . ∪˙Cn, such that C0 ≠ ∅, the set TC
of ranked trees over C is defined as the least solution of the following constraints:
c(t1, . . . , tm) ∈ TC for c ∈ Cm, 0 ≤ m ≤ n, and t1, . . . , tm ∈ TC
In order to simplify the definitions below, we also define Cn+1 = Cn+2 = . . . := ∅.
A tree automaton A = (C,Q, F, δ) consists of a ranked alphabet C = C0 ∪˙
. . . ∪˙ Cn, a finite set of states Q, a set F ⊆ Q of final states, and a finite set
δ ⊆ C ×Q∗ ×Q of rules that are consistent w.r.t. the ranks of the constructors,
i.e., (c, r, q) ∈ δ =⇒ c ∈ C|r|. If the alphabet is clear from the context, we omit it
and write A = (Q,F, δ). We write a rule (c, q1 . . . qm, q) ∈ δ as c(q1, . . . , qm)→δ q.
We define the relation →∗δ⊆ TC × Q as the least solution of the following
constraints:
c(t1, . . . , tm)→∗δ q if t1 →∗δ q1 ∧ . . . ∧ tm →∗δ qm ∧ c(q1, . . . , qm)→δ q
If we have t→∗δ q, we say that the tree t is accepted in state q. A tree t is accepted
by the tree automaton, iff it is accepted in a final state. The language L(A) of a
tree automaton A is the set of all recognized trees:
L(A) = {t ∈ TC | ∃q ∈ F. t→∗δ q}
A set of trees is called regular, iff it is the language of some tree automaton.
An equivalent way of defining acceptance by a tree automaton is to include
the states of the tree automaton into the alphabet with rank 0, and interpret the
rules of the tree automaton as subtree-rewrite-rules, i.e., a rule c(q1, . . . , qm)→δ q
means that a subtree c(q1, . . . , qm) may be rewritten to q. A tree is accepted in
state q iff it can be rewritten to q. Using this characterization, we write t →∗δ q
also for trees t that contain states of the tree automaton. For example, we may
write f(g(q1), q2)→∗δ q3.
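The bottom-up reading of acceptance can be sketched as follows. Trees are assumed to be nested tuples whose first component is the constructor, and rules are triples (c, (q1, ..., qm), q); this encoding and the example automaton over Boolean terms are hypothetical.

```python
def reachable_states(tree, delta):
    """All states q with tree ->* q, computed bottom-up."""
    symbol, *children = tree
    child_states = [reachable_states(t, delta) for t in children]
    return {q for (c, qs, q) in delta
            if c == symbol and len(qs) == len(children)
            and all(qi in si for qi, si in zip(qs, child_states))}

def accepted(tree, final, delta):
    """A tree is accepted iff it is accepted in some final state."""
    return bool(reachable_states(tree, delta) & final)

# Hypothetical example: Boolean terms over t, f (rank 0) and and, or (rank 2).
# State "T" marks terms that evaluate to true, "F" those that evaluate to false.
DELTA = {("t", (), "T"), ("f", (), "F")}
for x in "TF":
    for y in "TF":
        DELTA.add(("and", (x, y), "T" if x == y == "T" else "F"))
        DELTA.add(("or", (x, y), "F" if x == y == "F" else "T"))
FINAL = {"T"}
```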
In order to store a tree automaton A = (C,Q, F, δ), we need space polynomial
in the size of the alphabet and the number of states:
|A| = poly(|C||Q|).
See Section 2.3 for details.
As for regular sets of words, the class of regular sets of trees is closed under
union, intersection, and complement. Given tree automata A and B over an
alphabet C, tree automata A ∪ B, A ∩ B, and TC \ A can be constructed such
that
L(A ∪B) = L(A) ∪ L(B) L(A ∩B) = L(A) ∩ L(B) L(TC \ A) = TC \ L(A).
Union and intersection can be computed in polynomial time, and the complement
can be computed in exponential time. Moreover, it can be decided in polynomial
time whether the language of a tree automaton is empty.
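A common way to decide emptiness is to compute the set of states in which at least one tree is accepted, by a simple fixpoint iteration over the rules. The sketch below uses the same hypothetical rule encoding (c, (q1, ..., qm), q) as a plain set of triples; it is one possible polynomial-time procedure, not necessarily the one intended by the cited literature.

```python
def inhabited_states(delta):
    """States q such that some tree t satisfies t ->* q."""
    inhabited = set()
    changed = True
    while changed:
        changed = False
        for (c, qs, q) in delta:
            # A rule fires once all of its argument states are inhabited.
            if q not in inhabited and all(p in inhabited for p in qs):
                inhabited.add(q)
                changed = True
    return inhabited

def is_empty(final, delta):
    """The language is empty iff no final state is inhabited."""
    return not (inhabited_states(delta) & final)
```

Each pass over the rules either adds a state or terminates the loop, so the number of passes is bounded by the number of states.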
2.3 Complexity
In this section we define the preliminaries required for the analysis of the com-
plexity of the algorithms that will be developed in this thesis. In Subsection 2.3.1,
we describe the big-O notation, in Subsection 2.3.2 we define how we measure
the size of the input data of our algorithms, and in Subsection 2.3.3 we briefly
introduce the basic concepts of computational complexity.
2.3.1 Big-O Notation
The big-O notation is a well-known tool to specify the space and time complexity
of algorithms. An introduction can be found in, for example, Wegener [119]. We
introduce the basic definitions here:
Given a monotone¹ function $g : \mathbb{N}_+^n \to \mathbb{R}_+$ for some n ∈ N+, we define the
class of functions O(g) by




We extend this notion to classes of monotone functions G by
f ∈ O(G) :⇐⇒ ∃g ∈ G. f ∈ O(g).
As is common in the literature, we abuse the equality sign in a non-symmetric
way, and write f = O(g) instead of f ∈ O(g), and O(f) = O(g) instead of
O(f) ⊆ O(g). If we mean real equality, we write O(f) ≡ O(g).
We have the following properties [119]:
cf ∈ O(f) for c ∈ R+
cO(f) ≡ O(f) for c ∈ R+
O(f1) + . . .+O(fk) ≡ O(f1 + . . .+ fk) ≡ O(max{f1, . . . , fk})
O(f)O(g) ≡ O(fg)
Other useful laws are
O(O(f)) ≡ O(f)
fO(g) ≡ O(fg)
For example, when regarding a polynomial with positive coefficients cn, . . . , c0 ∈
R+, its order is only determined by the largest exponent:
\[ O(c_n x^n + \ldots + c_1 x + c_0) \equiv O(x^n). \]
If we are not interested in the largest exponent of a polynomial, we also write
poly(g) for the class of polynomials in g:
\[ f \in \mathrm{poly}(g) :\iff \exists k.\ f \in O(g^k) \]
and extend this notion to classes of functions:
\[ f \in \mathrm{poly}(G) :\iff \exists g \in G.\ f \in \mathrm{poly}(g). \]
¹ We use a pointwise ordering on $\mathbb{N}_+^n$, i.e., (x1, . . . , xn) ≤ (y1, . . . , yn) iff x1 ≤ y1 ∧ . . . ∧ xn ≤ yn.
For poly, we have similar laws as for O:
$c f^k \in \mathrm{poly}(f)$ for c ∈ R+ and k ∈ N
$c\,\mathrm{poly}(f) \equiv \mathrm{poly}(f)$ for c ∈ R+





Between the classes of functions described by O and poly, we have the following
hierarchy:
\[ O(f^0) \subset O(f^1) \subset \ldots \subset \mathrm{poly}(f) \subset 2^{O(f)} \]
Typical complexities encountered in this thesis are poly(n), which we call
polynomial in n; $2^{\mathrm{poly}(n)}$, which we call exponential in n; and $\mathrm{poly}(n)\,2^{\mathrm{poly}(m)}$, which
we call polynomial in n and exponential in m.
2.3.2 Size of Input Data
We typically specify the complexity of an algorithm in terms of the size of its
input. The size of the input is the number of bits required to encode the input of
the algorithm.
The inputs of our algorithms are usually automata-like structures, consisting of
a finite alphabet, finite sets of symbols for states and stack, and a finite set of
rules. The rules are a relation on boundedly many elements from the alphabet
and symbol sets. For example, the rules of an automaton relate two states with
one letter from the alphabet. A rule of a tree automaton may relate more than
two states, depending on the arity of the function symbol.
Consider an automaton A = (Σ, Q, I, F, δ). We may drop letters and states
that do not occur in rules, without changing the language of the automaton².
Hence, we may safely assume that Σ and Q do not contain such letters or states.
Thus, to store A, it is sufficient to store the sets I, F , and δ, as Σ and Q can be
reconstructed from these sets. With |A|, we denote the size of A, i.e., the number
of bits required to store A.
² Actually, the precise alphabet is only important when complementing the automaton.
In order to store finite sets, there are two common options: explicit enumeration
of all the elements in the set (e.g. by some tree or hashset data structure), or
storing the set as a vector of bits representing its characteristic function. While
the former method is usually better for sparse sets, containing few elements
compared with the maximum possible elements, the latter method is better for dense
sets, containing a number of elements in the order of the maximum possible
elements. We choose the second possibility here, as it has a better worst-case
size.
Hence, in order to store the sets of initial and final states, we need 2|Q| bits,
and in order to store the set of rules, we need $|Q|^2|\Sigma|$ bits. Together, we need
$2|Q| + |Q|^2|\Sigma|$ bits. Using big-O notation, we have
\[ |A| = 2|Q| + |Q|^2|\Sigma| = O(|Q|^2|\Sigma|) = \mathrm{poly}(|Q||\Sigma|), \]
i.e., to store an automaton we need space quadratic in the number of states and
linear in the size of the alphabet, or, estimated more roughly, space polynomial
in the number of states and in the size of the alphabet.
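The bit-vector encoding behind this estimate can be sketched as follows. The function names and the fixed ordering of states and letters are assumptions made for this illustration; any fixed enumeration of Q and Σ would do.

```python
def encode(states, alphabet, initial, final, delta):
    """Concatenate the characteristic bit vectors of I, F and delta."""
    states, alphabet = sorted(states), sorted(alphabet)
    bits = [q in initial for q in states]                  # |Q| bits for I
    bits += [q in final for q in states]                   # |Q| bits for F
    bits += [(q, a, q2) in delta                           # |Q|^2 |Sigma| bits
             for q in states for a in alphabet for q2 in states]
    return bits

def size_in_bits(num_states, alphabet_size):
    """The size estimate 2|Q| + |Q|^2 |Sigma| from the text."""
    return 2 * num_states + num_states ** 2 * alphabet_size
```

For an automaton with three states and two letters, this encoding yields a vector of 24 bits regardless of how many rules are actually present, which is the worst-case behaviour the estimate describes.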
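A similar estimation works for tree automata A = (C,Q, F, δ). Here, the set of
final states requires |Q| bits, and the set of rules requires
\[ \sum_{i=0}^{n} |C_i||Q|^{i+1} = |C|\,\mathrm{poly}(|Q|) = \mathrm{poly}(|C||Q|) \]
bits, where we assume that the maximum rank n is fixed. Thus, we have
\[ |A| = O(|Q|) + |C|\,\mathrm{poly}(|Q|) = O(|C|\,\mathrm{poly}(|Q|)) = \mathrm{poly}(|C||Q|), \]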
i.e., we need space polynomial in the number of states and linear in the size of
the alphabet, or, more roughly, polynomial in the number of states and the size
of the alphabet.
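Example 2.1. Given a finite set $\mathcal{X}$, we construct an automaton
\[ A = (\{x^b \mid x \in \mathcal{X} \wedge b \in \mathbb{B}\},\ \{q\} \times 2^{\mathcal{X}},\ \{q\} \times 2^{\mathcal{X}},\ \{(q, \emptyset)\},\ \delta) \]
with the following rules:
\[ (q, X \cup \{x\}) \xrightarrow{x^\bot}_\delta (q, X) \qquad (q, X) \xrightarrow{x^\top}_\delta (q, X) \]
for all $X \subseteq \mathcal{X}$ and $x \in \mathcal{X}$.
Intuitively, this automaton computes the set of elements annotated with ⊥, i.e.,
a sequence w is accepted in state (q,X), where X is the set of elements annotated
with ⊥ occurring in w. Formally:
\[ w \in A(q, X) \iff X = \{x \mid x^\bot \in \mathrm{set}(w)\}. \]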




The size of the alphabet is $2|\mathcal{X}|$, the number of states is $2^{|\mathcal{X}|}$. Hence, the size
of the automaton can be estimated by
\[ |A| = O((2^{|\mathcal{X}|})^2 \cdot 2|\mathcal{X}|) = O(|\mathcal{X}|)\,2^{O(|\mathcal{X}|)} = 2^{O(|\mathcal{X}|)}. \]
In order to explicitly construct this automaton, one needs to instantiate the
rule-templates for all possible sets $X \subseteq \mathcal{X}$ and elements $x \in \mathcal{X}$. Hence, construction
of the automaton requires time $2^{O(|\mathcal{X}|)}$. However, when implementing our methods,
explicit construction of such automata should be avoided, and more compact
symbolic representations should be used instead.
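To make the blow-up tangible, the following sketch builds the automaton of Example 2.1 explicitly for a small set and checks the stated acceptance property by brute force. It assumes that the given finite set is distinct from its subsets X used in the rule-templates; the encoding (subsets as frozensets, the annotation ⊤/⊥ as True/False, the fixed component q dropped from states) and all names are hypothetical.

```python
from itertools import combinations, product

def powerset(xs):
    xs = sorted(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def build(universe):
    """Instantiate both rule-templates for every subset X and element x."""
    states = powerset(universe)                     # 2^|universe| states
    alphabet = [(x, b) for x in sorted(universe) for b in (True, False)]
    delta = set()
    for X in states:
        for x in universe:
            delta.add((X | {x}, (x, False), X))     # x annotated with bottom
            delta.add((X, (x, True), X))            # x annotated with top
    return states, alphabet, delta

def recognized_in(delta, X, word):
    """True iff word leads from state X to the final state (empty set)."""
    current = {X}
    for letter in word:
        current = {t for (s, a, t) in delta if s in current and a == letter}
    return frozenset() in current

STATES, ALPHABET, DELTA = build({"a", "b"})
```

Already for a two-element set, the construction yields 4 states and 16 rules; every further element doubles the number of states.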
2.3.3 Computational Complexity
In this subsection, we briefly introduce the concepts of computational complexity
that are used in this thesis. For a complete introduction, we refer to a textbook
on computational complexity, e.g. [96, 119].
2.3.3.1 Polynomial Reduction and Completeness
The notion of NP-completeness and (polynomial) reduction was introduced in
the landmark paper of Cook [30]. We briefly recall the basics here.
A (decision) problem consists of a set of inputs L. A (decision) algorithm
decides, given an input i, whether i ∈ L. Note that we use the term decision
problem here, while it is also common to use the term language. Moreover, we
assume that algorithms are given by one-tape Turing machines.
A decision problem L is in P, if there is a deterministic algorithm that de-
cides i ∈ L in time poly(|i|). A decision problem L is in NP, if there is a
nondeterministic algorithm that decides i ∈ L in time poly(|i|). The problem
is in PSPACE, if there is a deterministic algorithm that decides i ∈ L in space
poly(|i|). The problem is in NPSPACE, if there is a nondeterministic algorithm
that decides i ∈ L in space poly(|i|). A well-known result of Savitch [108] implies
that PSPACE = NPSPACE. Finally, the problem L is in EXPTIME, if there is a
deterministic algorithm that decides i ∈ L in time $2^{\mathrm{poly}(|i|)}$.
A polynomial reduction from a problem L to another problem L′ is a polynomial
time deterministic algorithm that computes a function f , such that i ∈ L if and
only if f(i) ∈ L′. We write L ≤P L′, if there is a polynomial reduction from L
to L′. Intuitively, L ≤P L′ means that the problem L is at most as hard to solve as
the problem L′.
A problem L is called NP-hard (PSPACE-hard), if it is at least as hard as any problem
L′ in NP (PSPACE), i.e., if for any such problem L′, we have L′ ≤P L. A problem
is called NP-easy (PSPACE-easy), if it is in NP (PSPACE). A problem is called
NP-complete (PSPACE-complete), if it is both NP-hard and NP-easy (PSPACE-hard
and PSPACE-easy).
Obviously, we have P ⊆ NP ⊆ PSPACE ⊆ EXPTIME. It is conjectured that
these inclusions are strict and, in particular, that there exist no polynomial time
deterministic algorithms for NP-hard problems. Moreover, although the best
known deterministic algorithms to solve NP-complete and PSPACE-complete
problems require exponential time, PSPACE-hard problems are generally consid-
ered more difficult than problems in NP.
2.3.3.2 Standard Problems
The standard method to show that a problem is NP-easy (PSPACE-easy) is
to specify a nondeterministic polynomial time (space) algorithm that solves the
problem. Alternatively, one can reduce the problem to a known problem in NP
(PSPACE). The standard method to show that a problem is NP-hard (PSPACE-
hard) is to reduce a known NP-hard (PSPACE-hard) problem to this problem.
Thus, it is useful to have a collection of NP-hard and PSPACE-hard problems,
from which a suitable one can be selected. The first problem that was shown to
be NP-hard is the Boolean satisfiability problem (SAT), i.e., checking whether
there is a valuation that satisfies a given Boolean formula [30]. Based on this
problem, many other problems can be shown to be NP-hard. A small collection
of such problems is presented in [30], and was extended by Karp [58]. We only
need the 3SAT-problem in this thesis.
3-Satisfiability The Boolean 3-satisfiability problem (3SAT for short) is one
of the first problems shown to be NP-complete [30, 58]. The 3SAT problem
is defined as follows: Given a set of n variables V = {v1, . . . , vn} ranging over
Boolean values, and a set of m clauses C = {(l11, l12, l13), . . . , (lm1, lm2, lm3)} over
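literals lij ∈ V ∪˙ {¬v | v ∈ V }, the problem is to decide whether there is a
valuation of the variables in V such that every clause contains at least one
literal that evaluates to ⊤.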
If there is such a valuation, the 3SAT-instance (V,C) is said to be satisfiable,
otherwise it is said to be unsatisfiable.
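For illustration, a 3SAT-instance can be decided by simply trying all valuations, which takes exponential time in the number of variables. The sketch below assumes the standard reading of satisfiability (every clause contains a literal that is true under the valuation) and encodes a literal as a pair of a variable and its polarity; this encoding and the names are hypothetical.

```python
from itertools import product

def satisfiable(variables, clauses):
    """Brute-force check: try all 2^n valuations of the variables."""
    for values in product([False, True], repeat=len(variables)):
        valuation = dict(zip(variables, values))
        # A literal (v, polarity) is true iff the valuation of v equals polarity.
        if all(any(valuation[v] == polarity for (v, polarity) in clause)
               for clause in clauses):
            return True
    return False

V = ["v1", "v2"]
# (v1 or v1 or v2) and (not v1 or not v1 or not v2)
SAT_INSTANCE = [(("v1", True), ("v1", True), ("v2", True)),
                (("v1", False), ("v1", False), ("v2", False))]
# (v1 or v1 or v1) and (not v1 or not v1 or not v1)
UNSAT_INSTANCE = [(("v1", True),) * 3, (("v1", False),) * 3]
```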
Quantified Boolean Formula A quantified Boolean formula (QBF) over n vari-
ables V = {v1 . . . vn} and m clauses C = {(l11, l12, l13), . . . , (lm1, lm2, lm3)}, where
lij ∈ V ∪˙ {¬v | v ∈ V }, is a formula of the form
\[ Q_1 v_1.\ Q_2 v_2 \ldots Q_n v_n.\ \bigwedge_{i=1}^{m} (l_{i1} \vee l_{i2} \vee l_{i3}) \]
such that Qi = ∀, if i is even, and Qi = ∃, if i is odd. We assume w.l.o.g. that






3 Models
In this chapter, we introduce the program model used in this thesis. Dynamic
pushdown networks with monitors (Monitor-DPNs) are an extension of pushdown
processes by dynamic thread creation and mutual exclusion via monitors. In
Section 3.1, we motivate Monitor-DPNs from both a practical and a theoretical
point of view. In Section 3.2, we formally introduce dynamic pushdown networks
(DPNs), a program model that supports recursive procedures and dynamic thread
creation. In Section 3.3, we introduce the concepts of locks and the common
monitor lock-usage discipline, and extend DPNs by monitors. For both DPNs
and Monitor-DPNs, we define an interleaving semantics and a tree-semantics,
and show that both semantics are equivalent. Finally, in Section 3.4, we briefly
summarize the results of this chapter and give an overview of related work.
3.1 Motivation
We motivate Monitor-DPNs from both a practical and a theoretical point of view.
From the practical point of view, one seeks abstract models of parallel
programs that have decidable properties and can express as many concepts of
parallel programming as possible. Modern parallel programming languages, like Java
[47] or C/C++ [112] with multi-threading libraries like pthreads [21], support
many concepts that pose challenges to automatic analysis. Among others, these
are
• procedures,
• dynamically allocated memory and pointers to data and code (e.g. virtual
methods),
• dynamic thread creation,
• shared memory between threads, and
• synchronization between threads (e.g. locks, join, wait/notify, message-
passing).
Out of these concepts, Monitor-DPNs fully support procedures, dynamic thread
creation, and synchronization between threads via reentrant monitors. However,
they have no direct support for dynamically allocated memory and pointers, nor
for shared memory or other synchronization primitives like join, wait/notify, or
message-passing.
The choice of supported concepts can be justified as follows: Support of dy-
namically allocated memory and pointers immediately leads to Turing-powerful
models, even without procedures and concurrency. For example, one can use
linked lists as counters. A model with decidable properties that supports
procedures and concurrency cannot support overly powerful synchronization mechanisms
like shared memory, wait/notify, or message-passing, as shown by Ramalingam
[104]. Also, non-well-nested locking leads to undecidability of simultaneous
reachability properties, as was shown by Kahlon et al. [57]. (Recently, decidability for
some weaker criterion than well-nestedness, called bounded lock-chains, was shown
[54].) For locks, we have the choice of supporting well-nested, non-reentrant locks,
or reentrant monitors¹. We choose reentrant monitors, as they are used by the
widespread Java programming language [47]. However, our methods also apply
to well-nested, non-reentrant locks with minor changes (cf. Chapter 8 and [79]).
Moreover, locks are typically used to avoid race-conditions. As verifying absence
of race-conditions is an important application of our methods, it is essential not
to abstract from locks.
From the theoretical point of view, one seeks generalizations of existing
models that still have decidable properties, such that further generalization re-
sults in undecidability or increased complexity. Monitor-DPNs are a general-
ization of dynamic pushdown networks (DPNs) by synchronization via monitors.
DPNs have been introduced by Bouajjani et al. [16], and are, themselves, a gener-
alization of the well-known pushdown systems (PDS) by dynamic thread creation.
Predecessor set computation for PDSs and DPNs can be done in polynomial time.
In this thesis, we provide a predecessor set computation for Monitor-DPNs that
runs in polynomial time in the size of the DPN and in exponential time in the
number of locks. To justify the exponential runtime, we show that the correspond-
ing decision problem, i.e., lock-sensitive reachability between regular sets of con-
figurations, is NP-complete (cf. Chapter 9). We also show that further extending
Monitor-DPNs with join-synchronization makes this problem PSPACE-hard. As
already mentioned, generalization to even more powerful synchronization mecha-
nisms, like shared memory or rendezvous-communication, makes most problems
undecidable [104].
¹ Unfortunately, we have no results for well-nested, reentrant locks, cf. Section 8.1.
3.2 Dynamic Pushdown Networks
Before we define Monitor-DPNs, we first introduce DPNs without locks.
Dynamic pushdown networks (DPNs) are a model of programs with procedures and
dynamic thread creation, which has been developed by Bouajjani et al. [16]. They
generalize pushdown systems by the ability to spawn new threads as side-effects
of transitions. We define DPNs and their semantics along the lines of [16]:
Definition 3.1 (Dynamic Pushdown Network). A dynamic pushdown network
(DPN) is a tuple M = (P,Γ,Act,∆), where P is a finite set of control-symbols,
Γ is a finite set of stack-symbols, Act is a finite set of actions, and ∆ is a finite
set of rules of the following types:
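\[ p\gamma \overset{a}{\hookrightarrow} p'w \quad \text{for } p, p' \in P,\ a \in Act,\ \gamma \in \Gamma, \text{ and } w \in \Gamma^* \qquad \text{(local)} \]
\[ p\gamma \overset{a}{\hookrightarrow} p_s w_s \sharp p'w \quad \text{for } p, p_s, p' \in P,\ a \in Act,\ \gamma \in \Gamma, \text{ and } w_s, w \in \Gamma^* \qquad \text{(spawn)} \]
Intuitively, a (local)-rule describes a classic pushdown transition. A (spawn)-
rule additionally creates a new thread as a side effect. The created thread starts
with the control-state ps and the stack ws.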
Usually, we use variable p for control-symbols, variable γ for stack-symbols,
variable a for actions, and variables w and r for stacks.
In order to store a DPN M = (P,Γ,Act,∆), we need space
|M | = poly(|P ||Γ||Act|).
The argument is the same as for automata and tree automata (cf. Section 2.3).
3.2.1 Interleaving Semantics
The interleaving semantics describes an execution as a sequence of steps, where
each step is made by one thread. Formally, it is defined as a labeled transition
system over configurations.
Definition 3.2 (Configuration). A configuration of a DPN M = (P,Γ,Act,∆)
is a list of thread-configurations, where each thread-configuration contains the
thread’s control-state and stack. The set of configurations over P and Γ is
\[ Conf_{P,\Gamma} := (P\Gamma^*)^* \]
If P and Γ are clear from the context, we also write Conf instead of ConfP,Γ.
Usually, we use ϕ as a variable for thread-configurations, and c for configurations.
Definition 3.3 (Interleaving Semantics of DPNs). Let M = (P,Γ,Act,∆) be a
DPN. Then, the relation $\to_M \subseteq Conf \times Act \times Conf$ is called the interleaving semantics
of M . It is defined as the least relation that satisfies the following constraints:
\[ c_1(p\gamma r)c_2 \xrightarrow{a}_M c_1(p'wr)c_2 \quad \text{if } p\gamma \overset{a}{\hookrightarrow} p'w \in \Delta \qquad \text{(local)} \]
\[ c_1(p\gamma r)c_2 \xrightarrow{a}_M c_1(p_s w_s)(p'wr)c_2 \quad \text{if } p\gamma \overset{a}{\hookrightarrow} p_s w_s \sharp p'w \in \Delta \qquad \text{(spawn)} \]
With $\to^*_M$, we denote the reflexive, transitive closure of $\to_M$, i.e., for $\bar a = a_1 \ldots a_n$,
we have $c \xrightarrow{\bar a}{}^*_M c'$ iff there are configurations c0, . . . , cn such that
\[ c = c_0 \xrightarrow{a_1}_M c_1 \xrightarrow{a_2}_M \cdots \xrightarrow{a_n}_M c_n = c'. \]
If the DPN M is clear from the context, we write −→ instead of −→M , and −→∗
instead of −→∗M .
The (local)-constraint matches the usual definition of the semantics of a push-
down system. A step induced by the (spawn)-constraint additionally spawns a
new thread and inserts its control-state and stack to the left of the spawning
thread.
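The two constraints can be sketched as a successor function on configurations. In this illustration, a configuration is a tuple of thread-configurations (p, stack), and a rule is a tuple ((p, γ), a, spawned, (p′, w)), where `spawned` is either None for a (local)-rule or the pair (ps, ws) for a (spawn)-rule. The encoding, the function names, and the five-rule sample DPN are hypothetical.

```python
def steps(rules, config):
    """Yield (a, c') for every step c --a--> c' of the interleaving semantics."""
    for i, (p, stack) in enumerate(config):
        if not stack:
            continue                       # a thread with empty stack cannot move
        for (lhs, a, spawned, (p2, w)) in rules:
            if lhs != (p, stack[0]):
                continue
            threads = ((p2, w + stack[1:]),)
            if spawned is not None:        # (spawn): new thread goes to the left
                threads = (spawned,) + threads
            yield a, config[:i] + threads + config[i + 1:]

def maximal_runs(rules, config):
    """Pairs (action sequence, final configuration) of runs that cannot be extended."""
    successors = list(steps(rules, config))
    if not successors:
        return {((), config)}
    return {((a,) + word, final)
            for a, nxt in successors
            for word, final in maximal_runs(rules, nxt)}

G, G2 = "g", "g'"                          # two stack symbols
RULES = [
    (("p", G), "a1", ("p1", (G,)), ("p1", (G2,))),   # spawn rule
    (("p1", G), "a2", None, ("p2", (G,))),
    (("p2", G), "a3", None, ("p3", (G,))),
    (("p1", G2), "a4", None, ("p2", (G2,))),
    (("p2", G2), "a5", None, ("p3", (G2,))),
]
START = (("p", (G,)),)
RUNS = maximal_runs(RULES, START)
```

Note that `maximal_runs` only terminates for DPNs without infinite runs, such as this sample; it is meant to illustrate the semantics, not to serve as an analysis.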
It is sometimes convenient to view configurations as plain sequences of elements
from P ∪Γ that start with an element from P . For this, we assume P ∩Γ = ∅. A
thread-configuration starts with a control-symbol, followed by the stack-elements
from top to bottom, and a configuration is the concatenation of its thread-
configurations. For example, the configuration [(p1, [γ1, γ2]), (p2, [γ3, γ4])] ∈ Conf
becomes the sequence [p1, γ1, γ2, p2, γ3, γ4]. Note that we used unambiguous list
notation for this example, in order to emphasize the difference between a con-
figuration and a sequence of elements from P ∪ Γ. From now on, we identify
configurations and sequences of elements from P ∪ Γ that start with an element
from P . A set of configurations is called regular, iff it is the language of some
automaton.
3.2.2 Tree-Semantics
Up to now, we have regarded an interleaving semantics of DPNs, i.e., an execution
is a totally ordered sequence of actions. In an interleaving semantics, we have
two types of nondeterministic choice: First, many rules may apply to a specific
thread to derive its next step, and, second, there may be many threads ready to
make a next step. The interleaving semantics makes no distinction between these
two types of nondeterminism. The idea of the tree-semantics is to separate these
two types of nondeterminism. In a first step, the rules that are applied to each
thread are chosen, and, in a second step, the order in which concurrent steps are
executed is chosen. The result of the first step is a partially ordered multiset
(pomset) of actions, the result of the second step is a topological ordering of
this set. As the only rules that affect more than one thread are spawn-rules, the
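partial ordering on the actions always has a tree shape.
Example 3.4. Consider a DPN with the following rules:
\[ p\gamma \overset{a_1}{\hookrightarrow} p_1\gamma \sharp p_1\gamma' \]
\[ p_1\gamma \overset{a_2}{\hookrightarrow} p_2\gamma \qquad p_1\gamma' \overset{a_4}{\hookrightarrow} p_2\gamma' \]
\[ p_2\gamma \overset{a_3}{\hookrightarrow} p_3\gamma \qquad p_2\gamma' \overset{a_5}{\hookrightarrow} p_3\gamma' \]
It has the (maximal) executions $p\gamma \xrightarrow{\bar a}{}^* p_3\gamma\, p_3\gamma'$ for $\bar a \in S \subseteq Act^*$ with
S = {a1a2a3a4a5, a1a2a4a3a5, a1a2a4a5a3, a1a4a2a5a3, a1a4a2a3a5, a1a4a5a2a3}.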
Note that all 6 possible choices of a¯ result from exactly the same steps executed
on the same threads in the same per-thread order. They only differ in the order in
which steps of parallel threads have been executed. The dependency of the chosen
steps yields the following partial ordering of the steps’ actions:
{a1 < a2, a1 < a4, a2 < a3, a4 < a5}






Its set of topological orderings is exactly the set S.
Execution-trees represent such partial orderings of actions as ranked trees.
Definition 3.5 (Execution-Trees). The set TAct of execution-trees over actions
Act is defined by the following grammar:
TAct ::= τ | a(TAct) | a(TAct,TAct), for a ∈ Act
Executions that start at a configuration with more than one thread are described
by a list of execution-trees, containing one tree per thread. Those lists are called
execution-hedges. We define the set HAct of execution-hedges by
\[ H_{Act} = (T_{Act})^*. \]
When clear from the context, we omit the index and write T and H instead of
TAct and HAct. Usually, we use t as a variable for execution-trees, and h for
execution-hedges.
Next, we define a semantics of DPNs that defines executions between configu-
rations as execution-trees (viz. execution-hedges).
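Definition 3.6 (Tree-Semantics). Let M = (P,Γ,Act,∆) be a DPN. The tree-semantics
of M is the least relation $\Rightarrow_M \subseteq P\Gamma^* \times T_{Act} \times Conf$ that satisfies
the following constraints:
\[ pr \overset{\tau}{\Longrightarrow}_M pr \]
\[ p\gamma r \overset{a(t)}{\Longrightarrow}_M c' \quad \text{if } p\gamma \overset{a}{\hookrightarrow} p'w \in \Delta \text{ and } p'wr \overset{t}{\Longrightarrow}_M c' \qquad \text{(local)} \]
\[ p\gamma r \overset{a(t_s,t)}{\Longrightarrow}_M c_s c' \quad \text{if } p\gamma \overset{a}{\hookrightarrow} p_s w_s \sharp p'w \in \Delta,\ p_s w_s \overset{t_s}{\Longrightarrow}_M c_s \text{ and } p'wr \overset{t}{\Longrightarrow}_M c' \qquad \text{(spawn)} \]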
We lift =⇒M to execution-hedges in the natural way, i.e.
ϕ1 . . . ϕn t1...tn=⇒M c1 . . . cn if ϕ1 t1=⇒M c1 ∧ . . . ∧ ϕn tn=⇒M cn
If the DPN M is clear from the context, we write =⇒ instead of =⇒M .
The topological orderings of the labels of an execution-hedge are called sched-
ules of the hedge.
Definition 3.7 (Schedules). Let Act be a set of actions. We define the scheduling
relation ⇝ ⊆ HAct × Act × HAct as the least relation that satisfies the following constraints:
h1[a(t)]h2 a⇝ h1[t]h2 (local)
h1[a(t1, t2)]h2 a⇝ h1[t1, t2]h2 (spawn)
The reflexive, transitive closure of ⇝ is denoted by ⇝∗.
The set sched(h) of schedules of an execution-hedge h is the set of maximal
runs of ⇝ from this hedge:
sched(h) = {a¯ | ∃h′ ∈ {τ}∗. h a¯⇝∗ h′}.
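The scheduler can be sketched as a small recursive function. The Python code below is an illustration under stated assumptions, not the thesis' formalization: the leaf τ is encoded as `None`, a local node a(t) as the pair `(a, t)`, and a spawn node a(ts, t) as the triple `(a, ts, t)`. The (local) step is assumed to replace a(t) by t, as it is used in the proof of Theorem 3.8. The tree chosen for Example 3.4 (a1 spawns the thread that performs a2 and a3) is likewise an assumption.

```python
TAU = None  # the leaf τ; a(t) is the pair (a, t); a(ts, t) is the triple (a, ts, t)


def steps(hedge):
    """Yield (action, successor hedge) for every scheduler step from `hedge`."""
    for i, tree in enumerate(hedge):
        if tree is TAU:
            continue  # a leaf offers no step
        if len(tree) == 2:  # (local): h1[a(t)]h2 -> h1[t]h2
            a, t = tree
            yield a, hedge[:i] + (t,) + hedge[i + 1:]
        else:  # (spawn): h1[a(ts, t)]h2 -> h1[ts, t]h2
            a, ts, t = tree
            yield a, hedge[:i] + (ts, t) + hedge[i + 1:]


def sched(hedge):
    """Return the set of schedules of `hedge`, a tuple of execution-trees."""
    if all(tree is TAU for tree in hedge):
        return {()}  # only the empty schedule is left
    return {(a,) + rest for a, succ in steps(hedge) for rest in sched(succ)}


# Assumed tree for Example 3.4: a1 spawns the thread doing a2, a3,
# and the spawning thread continues with a4, a5.
tree = ("a1", ("a2", ("a3", TAU)), ("a4", ("a5", TAU)))
schedules = sched((tree,))
```

Under these assumptions, `schedules` contains one tuple per interleaving of the two threads after the initial spawn, so it can be compared directly with the set S of Example 3.4.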
We now prove the coincidence between the interleaving semantics and the tree-
semantics that was already suggested by Example 3.4.
Theorem 3.8 (Equality of Interleaving Semantics and Tree-Semantics). Let M =
(P,Γ,Act,∆) be a DPN. The interleaving semantics has a run c a¯−→∗ c′ if and only
if there exists an execution-hedge h ∈ H with c h=⇒ c′ and a¯ ∈ sched(h).
Proof. For the =⇒-direction, we show that the scheduler can follow every step
of the interleaving semantics, i.e.
c a−→ cˆ hˆ=⇒ c′ =⇒ ∃h. c h=⇒ c′ ∧ h a⇝ hˆ (∗)
This is done by a case distinction on the constraint that was used to derive the
c a−→ cˆ step. If the (local)-constraint with pγ a↪→ p′w ∈ ∆ was used, we have
c = c1[pγr]c2 and cˆ = c1[p′wr]c2. Following the partitioning of cˆ, we can write hˆ
as hˆ = h1[tˆ]h2 and c′ as c′ = c′1c′tc′2, such that
c1 h1=⇒ c′1, p′wr tˆ=⇒ c′t, and c2 h2=⇒ c′2. By the (local)-constraint of
=⇒, we get pγr a(tˆ)=⇒ c′t, and thus c h1[a(tˆ)]h2=⇒ c′. Moreover, by the (local)-constraint
of the scheduler, we have h1[a(tˆ)]h2 a⇝ hˆ. If the c a−→ cˆ step was derived by the
(spawn)-constraint, the argument is analogous.
The =⇒-direction is now proved by induction over the reflexive, transitive
closure in c a¯−→∗ c′. If we have c = c′ and a¯ = ε, we set h = τ^|c|, i.e., the
execution-hedge that contains |c| leaves. We observe that c h=⇒ c and h ε⇝∗ h. Due
to h ∈ {τ}∗, we get ε ∈ sched(h).
Now, assume we have c a−→ cˆ a¯−→∗ c′. By induction hypothesis, we obtain hˆ with
cˆ hˆ=⇒ c′ and a¯ ∈ sched(hˆ). By (∗), we obtain h with c h=⇒ c′ and h a⇝ hˆ. Thus we
also get aa¯ ∈ sched(h).
For the ⇐=-direction, we show that the interleaving semantics can follow every
step of the scheduler, i.e.
c h=⇒ c′ ∧ h a⇝ hˆ =⇒ ∃cˆ. c a−→ cˆ ∧ cˆ hˆ=⇒ c′. (†)
This is, similar to the above proof of (∗), shown by case distinction over the
constraint used to derive h a⇝ hˆ.
The ⇐=-direction now follows from induction over a¯. In the case a¯ = ε, we
have h ∈ {τ}∗, and hence c = c′. Thus we have c ε−→∗ c′. In the case a¯ = aa¯′,
by unfolding the definition of sched, we obtain hˆ ∈ H and hf ∈ {τ}∗ such that
h a⇝ hˆ a¯′⇝∗ hf, and hence a¯′ ∈ sched(hˆ). By (†), we obtain cˆ such that c a−→ cˆ and cˆ hˆ=⇒ c′. By
induction hypothesis, we get cˆ a¯′−→∗ c′, and thus c aa¯′−→∗ c′.
Obviously, every execution-hedge has at least one schedule. Thus we get:
Corollary 3.9. Let M = (P,Γ,Act,∆) be a DPN and c, c′ ∈ Conf be configura-
tions. There is a run between c and c′, if and only if there is an execution-hedge
from c to c′. Formally:
∃a¯ ∈ Act∗. c a¯−→∗ c′ ⇐⇒ ∃h ∈ H. c h=⇒ c′.
3.3 Locks and Monitors
A disadvantage of DPNs is that there is no communication between threads.
Once a thread has been spawned, it runs completely independently of the other
threads of the system. One possibility for adding communication between threads
is to use locks. Locks allow parallel threads to synchronize their access to
resources. A thread can acquire and release locks, and each lock may be acquired
by at most one thread at the same time. If a thread wants to acquire a lock that
is currently acquired by another thread, it has to wait until the lock is released.
Locks may be reentrant, i.e., the same thread may re-acquire a lock that it
already holds. Intuitively, acquiring a lock increments a counter on that lock




A common lock-usage discipline is to acquire locks only on procedure calls, and
release them on the matching return. This corresponds to, e.g. synchronized-
methods in the Java programming language [47]. Locks used in this way are called
monitors. We model DPNs with monitors by binding locks to stack-symbols.
Moreover, we label rules such that their effect on the locks becomes visible:
Definition 3.10 (Lock-Actions). Let Act be a finite set of actions, and X be a
finite set of locks. Then,
ActX := {□a | a ∈ Act} ∪ {〈x, 〉x | x ∈ X}
denotes the set of lock-actions over actions Act and locks X . Actions of the form
□a are called base-actions, actions of the form 〉x are called release-actions, and
actions of the form 〈x are called acquisition-actions. Usually, we use x as a
variable for locks, X for sets of locks, and o for lock-actions.
In the context of lock-actions, also the actions from Act are called base-actions.
Definition 3.11 (Monitor-DPN). A Monitor-DPN M is a tuple
M = (P,Γ,Γ⊥,Act,X ,∆, locks),
where Act is a finite set of base-actions, X is a finite set of locks, (P,Γ,ActX ,∆)
is a DPN, locks : Γ → 2X , with ∀γ ∈ Γ. |locks(γ)| ≤ 1 is a mapping from stack-
symbols to sets of locks, where no stack-symbol is assigned more than one lock,
and Γ⊥ with ∅ ⊂ Γ⊥ ⊆ Γ, such that γ ∈ Γ⊥ =⇒ locks(γ) = ∅ is the set of
bottom stack-symbols. Moreover, the rules in ∆ are of the following types:
Base-rules: pγ □a↪→ p′γ′ with γ ∼ γ′
Push-rules: pγ □a↪→ p′γ1γ2 with γ ∼ γ2, locks(γ1) = ∅, and γ1 ∉ Γ⊥
Pop-rules: pγ □a↪→ p′ with locks(γ) = ∅ and γ ∉ Γ⊥
Spawn-rules: pγ □a↪→ psγs♯p′γ′ with γ ∼ γ′ and γs ∈ Γ⊥
Acquire-rules: pγ 〈x↪→ p′γ1γ2 with γ ∼ γ2 and locks(γ1) = {x}
Release-rules: pγ 〉x↪→ p′ with locks(γ) = {x}
where p, p′, ps ∈ P , γ, γ′, γs, γ1, γ2 ∈ Γ, a ∈ Act, x ∈ X , and γ ∼ γ′ means that γ
and γ′ hold the same locks and are both in Γ⊥ or both not in Γ⊥, i.e.
γ ∼ γ′ :⇐⇒ locks(γ) = locks(γ′) ∧ (γ ∈ Γ⊥ ⇐⇒ γ′ ∈ Γ⊥).
We lift the locks function to stacks, and configurations in the following way:
locks(γ1 . . . γn) = locks(γ1) ∪ . . . ∪ locks(γn) for γ1, . . . , γn ∈ Γ
locks(p1w1 . . . pnwn) = locks(w1) ∪ . . . ∪ locks(wn) for p1w1 . . . pnwn ∈ Conf
A stack of a Monitor-DPN is called valid if it is not empty, its bottommost
symbol is from Γ⊥, and all other symbols are from Γ\Γ⊥. We define the predicate
valid ⊆ Γ∗ by:
valid(w) ⇐⇒ w ∈ (Γ \ Γ⊥)∗Γ⊥.
We lift the valid-predicate to configurations in the natural way:
valid(p1w1 . . . pnwn) ⇐⇒ valid(w1) ∧ . . . ∧ valid(wn).
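To make the lifted locks function and the valid-predicate concrete, here is a minimal Python sketch over a made-up instance. The symbol names (`bot`, `f`, `mx`, `my`), the dictionary `LOCKS`, and the set `BOTTOM` are hypothetical and serve only as an illustration; stacks are tuples with the topmost symbol first, and a configuration is a list of (control-state, stack) pairs.

```python
# Hypothetical toy instance; the symbol names are made up for illustration.
LOCKS = {"bot": set(), "f": set(), "mx": {"x"}, "my": {"y"}}  # locks : Γ → 2^X
BOTTOM = {"bot"}  # Γ⊥


def well_formed():
    """Check the side conditions of Definition 3.11 on LOCKS and BOTTOM."""
    at_most_one = all(len(ls) <= 1 for ls in LOCKS.values())
    bottoms_ok = bool(BOTTOM) and all(g in LOCKS and not LOCKS[g] for g in BOTTOM)
    return at_most_one and bottoms_ok


def locks_of_stack(w):
    """locks(γ1 ... γn): union of the locks bound to the symbols of stack w."""
    return set().union(*(LOCKS[g] for g in w))


def locks_of_conf(c):
    """locks(p1w1 ... pnwn): union over all thread stacks."""
    return set().union(*(locks_of_stack(w) for _, w in c))


def valid_stack(w):
    """valid(w): w is non-empty, ends in a bottom symbol, and has no other one."""
    return len(w) > 0 and w[-1] in BOTTOM and all(g not in BOTTOM for g in w[:-1])


def valid_conf(c):
    """valid(p1w1 ... pnwn): every thread's stack is valid."""
    return all(valid_stack(w) for _, w in c)


conf = [("p", ("mx", "f", "bot")), ("q", ("my", "bot"))]
```

In this instance, the first thread holds x and the second holds y, and both stacks end in the single bottom symbol.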
Intuitively, base-rules make a transition without pushing symbols on the stack,
changing the locks, or creating a thread. Threads are created by spawn-rules.
Push- and pop-rules push and pop symbols from the stack, without changing the
lockstack, while acquisition- and release-rules push and pop symbols from the
stack that are bound to locks, thus changing the lockstack.
Moreover, there is a special class of stack-symbols Γ⊥ that occur as the bot-
tommost symbol of each stack and cannot be pushed or popped. Hence, all
valid stacks are non-empty, which simplifies the constructions in Chapter 6. As
Monitor-DPNs are a special case of DPNs, we can apply the lock-insensitive in-
terleaving and tree-semantics (cf. Definitions 3.3 and 3.6) to them. Validity of
configurations is preserved by transitions of the lock-insensitive semantics, i.e.
valid(c) ∧ c o¯−→∗ c′ =⇒ valid(c′)
valid(c) ∧ c h=⇒ c′ =⇒ valid(c′)
Proof. By inspecting the rules of a Monitor-DPN, we observe that symbols from
Γ⊥ cannot be pushed or popped, nor can a rule change the membership in Γ⊥
of the topmost stack-symbol. Hence, stacks of existing threads remain valid.
The bottommost stack-symbols of spawned threads are from Γ⊥, hence stacks of
spawned threads are initially valid. Thus, a single step preserves validity, i.e.
valid(c) ∧ c o−→ c′ =⇒ valid(c′).
From this, the first proposition is shown by straightforward induction. The second
proposition follows from the first one using Corollary 3.9.
Moreover, restricting the rules of a DPN to base-, push-, pop-, and spawn-rules
does not limit the modeling power of DPNs w.r.t. reachability properties, as rules
pushing arbitrarily many stack-symbols in one step can be simulated by a sequence
of push-rules, using intermediate states. I.e., for each rule pγ a↪→ p′γ1 . . . γn with













Obviously, each execution of the original DPN corresponds to an execution of
the new DPN, with some additional steps performing a-actions. Vice versa, each
execution of the new DPN that reaches a configuration with none of the new
control-states corresponds to an execution of the original DPN, when removing
the additional a-actions.
For the remainder of this section, we fix a Monitor-DPN
M = (P,Γ,Γ⊥,Act,X ,∆, locks).
In order to store a Monitor-DPN, we may assume that every lock actually
occurs in a rule. Thus, we have |X | = O(|Γ|), and hence
|M | = poly(|P ||Γ||Act||X |) = poly(|P ||Γ||Act|).
Each stack of a Monitor-DPN encodes a stack of locks.
Definition 3.12 (Lockstacks). The function lsM : Γ→ X ∗ maps a stack-symbol
to the list of locks that corresponds to this stack-symbol. This is either a singleton
list or the empty list. We define:
lsM(γ) = x if locks(γ) = {x}
lsM(γ) = ε otherwise
When clear from the context, we omit the DPN M and write ls instead of lsM .
We lift ls to stacks, thread-configurations, and configurations:
ls : Γ∗ → X ∗ ls : PΓ∗ → X ∗ ls : Conf → (X ∗)∗
where
ls(γ1 . . . γn) = ls(γ1) . . . ls(γn) for (γ1 . . . γn) ∈ Γ∗
ls(pw) = ls(w) for pw ∈ PΓ∗
ls(ϕ1 . . . ϕn) = [ls(ϕ1), . . . , ls(ϕn)] for (ϕ1 . . . ϕn) ∈ Conf
We use the variable name µ for lockstacks.
3.3.1 Lock-Sensitive Interleaving Semantics
A lock-sensitive semantics has to ensure that no lock is owned by two threads at
the same time. For this, we define the set of consistent configurations, which are
those configurations where no lock is on the lockstack of more than one thread:
Definition 3.13 (Consistent Configurations). A configuration c ∈ Conf is called
consistent, if and only if no lock is on the lockstack of more than one thread. We
define the set of consistent configurations by
Conf ls := {p1w1 . . . pnwn | n ∈ N ∧ ∀1 ≤ i < j ≤ n. locks(wi) ∩ locks(wj) = ∅}.
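Lockstacks and the consistency check can be sketched in a few lines of Python. The mapping `LOCK_OF` and its symbol names are hypothetical; it stores the (at most one) lock bound to each stack-symbol, with `None` meaning that no lock is bound. Stacks are tuples with the topmost symbol first.

```python
from itertools import combinations

# Hypothetical toy binding of stack-symbols to at most one lock (None = no lock).
LOCK_OF = {"bot": None, "f": None, "mx": "x", "my": "y"}


def ls_stack(w):
    """ls(γ1 ... γn): the lockstack encoded by stack w (topmost symbol first)."""
    return tuple(LOCK_OF[g] for g in w if LOCK_OF[g] is not None)


def ls_conf(c):
    """ls(ϕ1 ... ϕn): one lockstack per thread; c is a list of (state, stack)."""
    return [ls_stack(w) for _, w in c]


def consistent(c):
    """True iff no lock is on the lockstack of more than one thread."""
    return all(set(m1).isdisjoint(m2) for m1, m2 in combinations(ls_conf(c), 2))


good = [("p", ("mx", "f", "my", "bot")), ("q", ("f", "bot"))]
bad = [("p", ("mx", "bot")), ("q", ("mx", "bot"))]
```

Here `good` is consistent, since only the first thread holds locks, while `bad` is not, because both threads have x on their lockstacks.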
The lock-sensitive interleaving semantics is then defined as the restriction of
the lock-insensitive semantics to consistent configurations:
Definition 3.14 (Lock-Sensitive Interleaving Semantics). The lock-sensitive in-
terleaving semantics
−→ls,M⊆ Conf ls × ActX × Conf ls
is defined as the restriction of the lock-insensitive interleaving semantics to con-
sistent configurations:
c o−→ls,M c′ :⇐⇒ c, c′ ∈ Conf ls ∧ c o−→M c′.
If the Monitor-DPN M is clear from the context, we write −→ls instead of −→ls,M .
Lemma 3.15. The lock-sensitive interleaving semantics can be equivalently char-
acterized as the least solution of the following constraints:
c1(pγr)c2 □a−→ls,M c1(p′wr)c2 if c1(pγr)c2 ∈ Conf ls ∧ pγ □a↪→ p′w ∈ ∆ (local)
c1(pγr)c2 □a−→ls,M c1(psγs)(p′γ′r)c2 if c1(pγr)c2 ∈ Conf ls ∧ pγ □a↪→ psγs♯p′γ′ ∈ ∆ (spawn)
c1(pγr)c2 〈x−→ls,M c1(p′γ1γ2r)c2 if c1(pγr)c2 ∈ Conf ls ∧ pγ 〈x↪→ p′γ1γ2 ∈ ∆ ∧ x ∉ locks(c1c2) (acquire)
c1(pγr)c2 〉x−→ls,M c1(p′r)c2 if c1(pγr)c2 ∈ Conf ls ∧ pγ 〉x↪→ p′ ∈ ∆ (release)
Proof. For the =⇒-direction, we assume that we have a step c o−→ls c′. Unfolding
the definition of −→ls, we get c, c′ ∈ Conf ls and c o−→ c′. This step is either a local
or a spawn-step. In case of a local step, we get c = c1pγrc2, c′ = c1p′wrc2, and
pγ o↪→ p′w ∈ ∆. We make a case distinction over o. In the case o = □a, we
get the proposition due to the (local)-constraint. In the case o = 〈x, we have
w = γ1γ2 and locks(γ1) = {x}, due to the constraints for rules of Monitor-DPNs.
As c′ is consistent, we have x ∉ locks(c1c2), and we get the proposition due to the
(acquire)-constraint. In the case o = 〉x, we have w = ε, and get the proposition
due to the (release)-constraint. In case of a spawn-step, we have c = c1pγrc2,
c′ = c1pswsp′wrc2, and pγ o↪→ psws♯p′w ∈ ∆. Due to the constraints for rules
of Monitor-DPNs, we have ws = γs and w = γ′ for some γs, γ′ ∈ Γ, and the
proposition follows by the (spawn)-constraint.
For the ⇐=-direction, assume that c o−→ls c′ was derived due to the above
constraints. We have to show that c, c′ ∈ Conf ls and c o−→ c′.
We observe that the above constraints are a specialization of the constraints
defining −→. Thus, we have c o−→ c′. Moreover, all constraints require c ∈ Conf ls.
It remains to show c′ ∈ Conf ls.
We distinguish due to which constraint the step c o−→ls c′ was derived. In case
of the (local)- and (spawn)-constraints, we observe that, due to the constraints
on the rules of a Monitor-DPN, the lockstack of the local thread is not changed,
and the spawned thread’s lockstack is initially empty. Hence, with c ∈ Conf ls, we
also get c′ ∈ Conf ls.
In case of the (acquire)-constraint, we have locks(γ1) = {x} and locks(γ2) =
locks(γ). As we have c ∈ Conf ls, we have c′ ∈ Conf ls if the newly acquired
lock x does not break consistency. This is guaranteed by the side condition
x /∈ locks(c1c2).
In case of the (release)-constraint, we pop a lock from the thread’s lockstack.
Thus, with c ∈ Conf ls, we also get c′ ∈ Conf ls.
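The characterization of Lemma 3.15 translates directly into a successor function. The Python sketch below is an illustration only: the rule encoding, the names `RULES`, `held`, and `successors`, and the toy Monitor-DPN (one control-state p, a bottom symbol `bot`, and a symbol `crit` bound to lock x) are assumptions made for this example. The only lock-related check is the side condition of the (acquire)-constraint.

```python
# Hypothetical toy Monitor-DPN: symbol "crit" is bound to lock "x".
LOCK_OF = {"bot": None, "crit": "x"}

# A rule is (p, gamma, action, spawned, p_new, w_new); `spawned` is None or a
# pair (ps, gamma_s). Actions are ("base", a), ("acq", x), or ("rel", x).
RULES = [
    ("p", "bot", ("acq", "x"), None, "p", ("crit", "bot")),  # acquire-rule
    ("p", "crit", ("rel", "x"), None, "p", ()),              # release-rule
]


def held(threads):
    """Set of locks on the stacks of the given threads."""
    return {LOCK_OF[g] for _, w in threads for g in w if LOCK_OF[g] is not None}


def successors(conf, rules):
    """Yield (action, c') for each lock-sensitive step from consistent `conf`."""
    for i, (p, stack) in enumerate(conf):
        if not stack:
            continue
        gamma, rest = stack[0], stack[1:]
        others = conf[:i] + conf[i + 1:]
        for rp, rg, action, spawned, p_new, w_new in rules:
            if (rp, rg) != (p, gamma):
                continue
            # Side condition of (acquire): x is not held by any other thread.
            if action[0] == "acq" and action[1] in held(others):
                continue
            new = ((p_new, w_new + rest),)
            if spawned is not None:
                new = ((spawned[0], (spawned[1],)),) + new
            yield action, conf[:i] + new + conf[i + 1:]


start = (("p", ("bot",)), ("p", ("bot",)))
first = list(successors(start, RULES))
# After thread 0 has acquired x, thread 1 is blocked; only the release remains.
after_acquire = first[0][1]
second = list(successors(after_acquire, RULES))
```

From `start`, either thread may acquire x. Once one of them has done so, the only enabled step is the release by that thread, which mirrors the argument that the side condition x ∉ locks(c1c2) preserves consistency.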
3.3.2 Well-Nestedness
The binding of locks to the stack in Monitor-DPNs ensures that locks are always
used in a well-nested fashion, i.e., if a lock is released, it is the last lock that was
acquired and not yet released.
We want to define a lock-sensitive scheduler for execution-hedges. As an
execution-hedge does not contain information about the configuration from which
it was generated, it also carries no information about the lockstacks. In order to
define the notion of lock-sensitive scheduling, we explicitly add the lockstacks to
the execution-hedge. Here, it makes sense to assume that the execution-hedge
is well-nested w.r.t. those explicit lockstacks. Hence, we have to formalize the
notion of a well-nested execution-hedge.
In order to make the presentation of execution-trees more readable, we introduce
the convention to write a spawn-node annotated with a base-action as
a(t1, t2), instead of □a(t1, t2).
We define the relation ⇀ that describes how an execution-tree modifies a given
lockstack. Later in the thesis, we only need the resulting lockstack of the root
thread. Hence, we omit the resulting lockstacks of spawned threads. However,
we still assume that spawned threads are well-nested w.r.t. the empty lockstack.
Definition 3.16 (Lock-Transition Relation). Let Act be a set of base-actions and
X be a set of locks. The lock-transition relation ⇀Act,X⊆ X ∗ × TActX × X ∗, is
defined as the least relation that satisfies the following constraints:
µ τ⇀Act,X µ if µ ∈ X ∗ (leaf)
µ □a(t)⇀Act,X µ′ if µ t⇀Act,X µ′ (base)
µ a(ts,t)⇀Act,X µ′ if ∃µs. ε ts⇀Act,X µs and µ t⇀Act,X µ′ (spawn)
µ 〈x(t)⇀Act,X µ′ if xµ t⇀Act,X µ′ (acquire)
xµ 〉x(t)⇀Act,X µ′ if µ t⇀Act,X µ′ (release)
If Act and X are clear from the context, we write ⇀ instead of ⇀Act,X .
Based on the lock-transition relation, we define well-nestedness and some re-
lated notions on execution-trees.
Definition 3.17. Let t ∈ TActX be an execution-tree.
1. t is called well-nested w.r.t. a lockstack µ ∈ X ∗, iff there is a lockstack µ′
such that µ t⇀ µ′.
2. t is called well-nested, iff it is well-nested w.r.t. some lockstack. We define
Twn to be the set of well-nested execution-trees:
t ∈ Twn :⇐⇒ ∃µ, µ′. µ t⇀ µ′.
3. t is called non-releasing, iff it is well-nested w.r.t. the empty lockstack. We
define Tnr to be the set of non-releasing execution-trees:
t ∈ Tnr :⇐⇒ ∃µ. ε t⇀ µ.
4. t is called a same-level tree, iff ε t⇀ ε. We define Tsl to be the set of
same-level trees:
t ∈ Tsl :⇐⇒ ε t⇀ ε.
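Since the resulting lockstack is determined by the initial lockstack and the tree (cf. Lemma 3.21), the lock-transition relation can be sketched as a partial function. The Python code below is an illustration under assumptions: the node encoding is made up for this example, and the (base)-, (acquire)-, and (release)-constraints are assumed to take the form used in the proof of Lemma 3.21, i.e., acquisition pushes the lock onto the lockstack and release pops a matching topmost lock.

```python
TAU = None  # the leaf τ

# Assumed node encoding: ("base", a, t), ("spawn", a, ts, t),
# ("acq", x, t), ("rel", x, t). Lockstacks are tuples, topmost lock first.


def lock_transition(mu, t):
    """Return the lockstack mu' reached from mu via t, or None if there is none."""
    if t is TAU:
        return mu  # (leaf)
    kind = t[0]
    if kind == "base":
        return lock_transition(mu, t[2])
    if kind == "spawn":
        # The spawned tree must be well-nested w.r.t. the empty lockstack.
        if lock_transition((), t[2]) is None:
            return None
        return lock_transition(mu, t[3])
    if kind == "acq":
        return lock_transition((t[1],) + mu, t[2])
    if kind == "rel":
        if not mu or mu[0] != t[1]:
            return None  # the released lock is not on top of the lockstack
        return lock_transition(mu[1:], t[2])
    raise ValueError(f"unknown node kind: {kind!r}")


def non_releasing(t):
    """t is well-nested w.r.t. the empty lockstack."""
    return lock_transition((), t) is not None


def same_level(t):
    """t leads from the empty lockstack back to the empty lockstack."""
    return lock_transition((), t) == ()


crit = ("acq", "x", ("base", "a", ("rel", "x", TAU)))  # acquire x, a, release x
rel = ("rel", "x", TAU)
acq = ("acq", "x", TAU)
crossed = ("acq", "x", ("acq", "y", ("rel", "x", TAU)))  # not well-nested
```

In this sketch, `crit` is a same-level tree, `acq` is non-releasing but not same-level, `rel` is well-nested only w.r.t. lockstacks with x on top, and `crossed` is not well-nested w.r.t. any lockstack.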
The ⇀-relation is a transition relation on the local branch of a tree, i.e., the
nodes of the tree corresponding to steps of the thread at the root of the tree. In
order to describe its properties, it is useful to define concatenation of execution-
trees at the root thread.
Definition 3.18 (Tree Concatenation). The operator ·; · : T→ T→ T is induc-
tively defined as follows:
τ ; t′ = t′
a(t); t′ = a(t; t′)
a(ts, t); t′ = a(ts, t; t′)
Example 3.19. Intuitively, tree concatenation can be seen as a generalization
of list concatenation: If an execution-tree is understood as a list of operations
of the root thread, where spawn-operations have an additional execution-tree as
argument, tree concatenation is concatenation of such lists. For example, con-
sider the execution-trees t1 = a1(a2(ta, a3(tb, τ))) and t2 = a4(a5(tc, τ)). Their
concatenation t1; t2 can be visualized as follows:
a1 a2 a3 τ ; a4 a5 τ = a1 a2 a3 a4 a5 τ
ta tb tc ta tb tc
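The three defining equations of tree concatenation can be transcribed almost literally. In the following Python sketch, the encoding is an assumption made for illustration: τ is `None`, a(t) is the pair `(a, t)`, and a(ts, t) is the triple `(a, ts, t)`. The subtrees ta, tb, and tc of Example 3.19 are left unspecified in the text, so small placeholder trees are used for them.

```python
TAU = None  # the leaf τ; a(t) is the pair (a, t); a(ts, t) is the triple (a, ts, t)


def concat(t, t2):
    """Tree concatenation t; t2 at the root thread (Definition 3.18)."""
    if t is TAU:
        return t2  # τ; t' = t'
    if len(t) == 2:
        a, t1 = t
        return (a, concat(t1, t2))  # a(t); t' = a(t; t')
    a, ts, t1 = t
    return (a, ts, concat(t1, t2))  # a(ts, t); t' = a(ts, t; t')


# Placeholder trees for the spawned threads ta, tb, tc of Example 3.19.
ta, tb, tc = ("b1", TAU), ("b2", TAU), ("b3", TAU)
t1 = ("a1", ("a2", ta, ("a3", tb, TAU)))
t2 = ("a4", ("a5", tc, TAU))
expected = ("a1", ("a2", ta, ("a3", tb, ("a4", ("a5", tc, TAU)))))
```

Note that `concat` only descends along the local branch; spawned subtrees such as `ta` are carried along unchanged, which is what makes the operation behave like list concatenation on the steps of the root thread.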
Tree concatenation has properties similar to those of list concatenation. Some
important ones are formalized in the next lemma.
Lemma 3.20 (Properties of Tree Concatenation). The following properties hold
for all execution-trees t, t1, t2, t3 ∈ T:
1. Tree concatenation is associative, i.e., (t1; t2); t3 = t1; (t2; t3)
2. The empty tree is left- and right-neutral, i.e., t; τ = τ ; t = t
3. The empty tree can only be produced as the concatenation of empty trees,
i.e., t1; t2 = τ if and only if t1 = τ and t2 = τ .
Proof. By straightforward induction.
Associativity of tree concatenation justifies omitting parentheses, i.e., we write
t1; t2; t3 instead of t1; (t2; t3) or (t1; t2); t3. Moreover, we omit the ;-operator if
there is no ambiguity with list concatenation. In chains of ;-operations, we write
just the label o for a tree o(τ). For example, we write 〈xt1a(ts)〉xt2 instead
of 〈x(τ); t1;a(ts, τ); 〉x(τ); t2. This notation represents an execution-tree as a
sequence of steps of the root thread.
We now describe some basic properties of the lock-transition relation.
Lemma 3.21 (Properties of the Lock-Transition Relation). Let Act be a set of
actions and X be a set of locks. Moreover, let t, t′ ∈ TActX be execution-trees and
µ, µ′, µ′′ ∈ X ∗ be lockstacks, and x ∈ X be a lock. Then, the following holds:
1. Chaining ⇀-transitions commutes with tree concatenation, i.e.,
∃µ′. µ t⇀ µ′ t′⇀ µ′′ iff µ t;t′⇀ µ′′.
2. Transitions of ⇀ remain valid when elements are added at the bottom of
the lockstack, i.e., µ t⇀ µ′ implies µµ′′ t⇀ µ′µ′′.
3. The resulting lockstack is determined by the initial lockstack and the tree,
i.e., µ t⇀ µ′ ∧ µ t⇀ µ′′ implies µ′ = µ′′.
4. Same-level trees can be prepended to runs of ⇀, i.e., for ε t⇀ ε, we have
µ t;t′⇀ µ′ if and only if µ t′⇀ µ′.
5. The topmost lock on the lockstack is either popped at some point, or not
used at all, i.e., xµ t⇀ µ′ implies one of the following disjoint cases:
a) Either we obtain t1, t2 such that t = t1〉xt2 with ε t1⇀ ε, or
b) we obtain µ1 such that µ′ = µ1xµ and ε t⇀ µ1.
Additionally, case a) implies µ t2⇀ µ′.
Proof. Propositions 1-3 are shown by straightforward induction. Proposition 4 is
a straightforward corollary of Propositions 1-3.
Disjointness of the cases of Proposition 5 immediately follows from Proposi-
tions 1-4. Completeness of the cases is shown by induction on t, followed by
a case distinction over the definition of ⇀. The proof for the (leaf)-, (base)-,
(spawn)-, and (release)-constraints is straightforward. Here, we demonstrate the
case that xµ t⇀ µ′ was derived due to the (acquire)-constraint, i.e., we have a lock
y ∈ X , can write t as t = 〈yt′, and have yxµ t′⇀ µ′. By applying the induction
hypothesis, we get two cases:
1) We can write t′ as t′ = t1〉yt2 with ε t1⇀ ε and xµ t2⇀ µ′. In this case, we
apply the induction hypothesis again, and get the cases:
1.1) We can write t2 as t2 = t21〉xt22 with ε t21⇀ ε and µ t22⇀ µ′. Hence, we
get ε 〈yt1〉yt21⇀ ε, and thus we get case a) of the proposition.
1.2) We obtain µ1 with µ′ = µ1xµ and ε t2⇀ µ1. Hence we have ε 〈yt1〉yt2⇀ µ1,
and thus we get case b) of the proposition.
2) We obtain a µ1 with µ′ = µ1yxµ and ε t′⇀ µ1. Hence, we have ε 〈yt′⇀ µ1y,
and thus we get case b) of the proposition.
Moreover, the lock-transition relation simulates the tree-semantics w.r.t. the
lockstack of the local thread:
Lemma 3.22. For all configurations c′ ∈ Conf, thread-configurations ϕ, ϕ′ ∈
PΓ∗, and execution-trees t ∈ T, the following holds:
ϕ t=⇒ c′ϕ′ =⇒ ls(ϕ) t⇀ ls(ϕ′).
Proof. By straightforward induction over t=⇒, exploiting the restrictions of the
rules of a Monitor-DPN.
3.3.3 Lock-Sensitive Scheduler
As sketched in the previous section, the lock-sensitive scheduler has to explicitly
keep track of the lockstacks of each thread. For this purpose, we define lock-
execution-trees and lock-execution-hedges as execution-trees and -hedges paired
with lockstacks, such that the trees are well-nested w.r.t. their corresponding
lockstacks, and the lockstacks are consistent, i.e., no lock is on more than one
lockstack at the same time:
Definition 3.23 (Lock-Execution-Hedge). Let Act be a set of actions and X be
a set of locks. We define the set Tls ⊆ TActX ×X ∗ of lock-execution-trees and the
set Hls ⊆ (TActX ×X ∗)∗ of lock-execution-hedges by:
Tls := {(t, µ) | ∃µ′. µ t⇀ µ′}
Hls := {(t1, µ1) . . . (tn, µn) ∈ (Tls)∗ | ∀1 ≤ i < j ≤ n. µi ∩ µj = ∅}
If Act and X are not clear from the context, we make them explicit by writing
TlsAct,X and HlsAct,X .
The operations |1 : Hls → H and |2 : Hls → (X ∗)∗ project a lock-execution-hedge
to the execution-hedge and the list of lockstacks:
((t1, µ1) . . . (tn, µn))|1 := t1 . . . tn
((t1, µ1) . . . (tn, µn))|2 := µ1 . . . µn
The operation × : H× (X ∗)∗ → Hls pairs an execution-hedge with a list of lock-
stacks. It is only defined if the execution-hedge and the list of lockstacks have the
same length, and the result is a lock-execution-hedge, i.e., if the list of lockstacks
is consistent and the execution-trees are well-nested w.r.t. their corresponding
lockstacks:
(t1, . . . , tn)×(µ1, . . . , µm) := (t1, µ1) . . . (tn, µn), if m = n and (t1, µ1) . . . (tn, µn) ∈ Hls
(t1, . . . , tn)×(µ1, . . . , µm) is undefined otherwise
Note that Lemma 3.22 implies that, for a Monitor-DPN with a configuration
c ∈ Conf ls and an execution-hedge c h=⇒ c′, the function h×ls(c) is always defined.
The lock-sensitive scheduler is now defined as a labeled transition system over
lock-execution-hedges:
Definition 3.24 (Lock-Sensitive Scheduler). Let Act be a set of actions and X be
a set of locks. We define the lock-sensitive scheduling relation ⇝ls ⊆ Hls × ActX ×
Hls as the least relation that satisfies the following constraints:
h1[(□at, µ)]h2 □a⇝ls h1[(t, µ)]h2 (base)
h1[(〈xt, µ)]h2 〈x⇝ls h1[(t, xµ)]h2 if x ∉ (h1h2)|2 (acquire)
h1[(〉xt, xµ)]h2 〉x⇝ls h1[(t, µ)]h2 (release)
h1[(a(t1)t2, µ)]h2 □a⇝ls h1[(t1, ε), (t2, µ)]h2 (spawn)
The reflexive, transitive closure of ⇝ls is denoted by ⇝∗ls.
The set of lock-sensitive schedules of a lock-execution-hedge is the set of runs
of ⇝ls that consume the entire hedge: We define schedls : Hls → 2^((ActX)∗) by
schedls(h) = {o¯ | ∃h′. h′|1 ∈ {τ}∗ ∧ h o¯⇝∗ls h′}.
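The four constraints can be turned into an executable sketch. The Python code below is an illustration, not the thesis' formalization: the node encoding and the helper `critical` are assumptions made for this example. A lock-execution-hedge is a tuple of (tree, lockstack) pairs, and lockstacks are tuples with the topmost lock first.

```python
TAU = None  # the leaf τ

# Assumed node encoding: ("base", a, t), ("spawn", a, ts, t),
# ("acq", x, t), ("rel", x, t). A lock-execution-hedge is a tuple of
# (tree, lockstack) pairs; lockstacks are tuples, topmost lock first.


def ls_steps(hedge):
    """Yield (lock-action, successor hedge) for each lock-sensitive step."""
    for i, (t, mu) in enumerate(hedge):
        if t is TAU:
            continue
        kind, name = t[0], t[1]
        others = {x for j, (_, m) in enumerate(hedge) if j != i for x in m}
        if kind == "base":
            label, new = ("base", name), ((t[2], mu),)
        elif kind == "spawn":  # the spawned thread starts with an empty lockstack
            label, new = ("base", name), ((t[2], ()), (t[3], mu))
        elif kind == "acq":
            if name in others:
                continue  # side condition of (acquire) fails
            label, new = ("acq", name), ((t[2], (name,) + mu),)
        else:  # "rel": only enabled if the lock is on top of the lockstack
            if not mu or mu[0] != name:
                continue
            label, new = ("rel", name), ((t[2], mu[1:]),)
        yield label, hedge[:i] + new + hedge[i + 1:]


def sched_ls(hedge):
    """Return all lock-sensitive schedules that consume the entire hedge."""
    if all(t is TAU for t, _ in hedge):
        return {()}
    return {(o,) + rest for o, succ in ls_steps(hedge) for rest in sched_ls(succ)}


def critical(a):
    """Tree for: acquire x, perform base-action a, release x."""
    return ("acq", "x", ("base", a, ("rel", "x", TAU)))


schedules = sched_ls(((critical("a"), ()), (critical("b"), ())))
```

For two threads that each execute a critical section on the same lock x, a lock-insensitive scheduler would admit every interleaving of the two three-step sequences, whereas this sketch only admits the two schedules in which one critical section completes before the other begins.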
The (base)- and (spawn)-constraints work as their lock-insensitive counter-
parts, and do not modify the lockstack. The side condition of the (acquire)-
constraint ensures that a lock-acquisition is only executed if the lock is not ac-
quired by some other thread. The (release)-constraint releases the topmost lock
of the lockstack, if it matches the lock from the node’s label (which it always does
for well-nested execution-trees). Note that we defined the scheduler only on well-
nested lock-execution-hedges where no two processes hold the same lock (Hls).
The side condition of the (acquire)-constraint makes this restriction explicit.
We show that the lock-sensitive scheduler matches the lock-sensitive interleav-
ing semantics:
Theorem 3.25 (Equality of Lock-Sensitive Interleaving and Tree-Semantics).
Let c ∈ Conf ls be a consistent configuration and c′ ∈ Conf be an arbitrary configu-
ration. Then, any run of the lock-sensitive interleaving semantics between c and
c′ corresponds to a lock-sensitive schedule of an execution-hedge between c and c′.
Formally:
c o¯−→∗ls c′ ⇐⇒ ∃h ∈ H. c h=⇒ c′ ∧ o¯ ∈ schedls(h×ls(c)).
Proof. The proof follows that of Theorem 3.8. Analogously, we prove:
c o−→ls cˆ hˆ=⇒ c′ =⇒ ∃h. c h=⇒ c′ ∧ h×ls(c) o⇝ls hˆ×ls(cˆ) (∗)
c h=⇒ c′ ∧ h×ls(c) o⇝ls hˆ× ˆ¯µ =⇒ ∃cˆ ∈ Conf ls. c o−→ls cˆ hˆ=⇒ c′ ∧ ˆ¯µ = ls(cˆ) (†)
The proof of (∗) is analogous to that of Theorem 3.8: We split the configu-
ration and the execution-hedge into the thread that executes the step, and the
surrounding threads. The hedge h is obtained by prepending the executed step
to the corresponding tree of hˆ. The step h×ls(c) o⇝ls hˆ×ls(cˆ) is obtained by using
the corresponding constraint of the lock-sensitive scheduler. The side condition
for an (acquire)-constraint follows from the corresponding side condition of the
(acquire)-constraint of the lock-sensitive interleaving semantics.
Also the proof of (†) is analogous to that of Theorem 3.8: We split h and c
into the thread that was scheduled and the other threads. The node of h that
was scheduled corresponds to a possible step of o−→ls. As the node can be sched-
uled lock-sensitively, the corresponding step can also be executed lock-sensitively,
yielding a configuration cˆ ∈ Conf ls with c o−→ls cˆ. As the lock-sensitive scheduler
and the interleaving semantics modify the lockstacks in the same way, we have
ˆ¯µ = ls(cˆ).
From (∗), the =⇒-direction is shown by induction over o¯, analogously to the
proof of Theorem 3.8: In the case o¯ = ε, we have c = c′. We choose h = τ^|c|.
Obviously, we have c τ^|c|=⇒ c and ε ∈ schedls(τ^|c|×ls(c)).
In the case o¯ = oo¯′, we obtain cˆ with c o−→ls cˆ o¯′−→∗ls c′. By induction hypothesis,
we obtain hˆ such that cˆ hˆ=⇒ c′ and o¯′ ∈ schedls(hˆ×ls(cˆ)). By (∗), we obtain h with
c h=⇒ c′ and h×ls(c) o⇝ls hˆ×ls(cˆ). Together, we get oo¯′ ∈ schedls(h×ls(c)).
Similarly, the ⇐=-direction is shown from (†) by induction over o¯: In the case
o¯ = ε, we have h ∈ {τ}∗, hence c = c′, and thus c ε−→∗ls c′.
In the case o¯ = oo¯′, we obtain hˆ and ˆ¯µ, such that hˆ× ˆ¯µ is defined and we have
h×ls(c) o⇝ls hˆ× ˆ¯µ and o¯′ ∈ schedls(hˆ× ˆ¯µ). From (†), we obtain cˆ ∈ Conf ls with
cˆ hˆ=⇒ c′, c o−→ls cˆ, and ˆ¯µ = ls(cˆ). The induction hypothesis yields cˆ o¯′−→∗ls c′, and
together we get c oo¯′−→∗ls c′.
Similar to Corollary 3.9, we have:
Corollary 3.26. Let c ∈ Conf ls be a consistent configuration and c′ ∈ Conf be
an arbitrary configuration. Then, there is a run between c and c′, if and only if
there is a schedulable lock-execution-hedge between c and c′. Formally:
∃o¯ ∈ (ActX)∗. c o¯−→∗ls c′ ⇐⇒ ∃h ∈ H. c h=⇒ c′ ∧ schedls(h×ls(c)) ≠ ∅.
3.4 Summary and Related Work
In this chapter, we have introduced DPNs and Monitor-DPNs. DPNs are an
extension of pushdown systems by dynamic thread creation, and Monitor-DPNs
extend DPNs by synchronization between threads via monitors. For both mod-
els, we have defined an interleaving semantics and a tree-semantics. While an
execution of the interleaving semantics is a totally ordered sequence of steps, an
execution of the tree-semantics is a tree that does not order unrelated steps of
different threads. In order to obtain a sequential execution from an execution-
tree, it is processed by a scheduler. In Theorems 3.8 and 3.25, we have shown
the equivalence of the interleaving semantics and the tree-semantics—i.e., every
execution of the interleaving semantics corresponds to a schedule of an execution
of the tree-semantics.
In the remainder of this section, we discuss related work. We first discuss exten-
sions of pushdown systems by concurrency, then we discuss concurrent pushdown
systems with locks and with more powerful communication mechanisms. Finally,
we discuss the relation of our tree-semantics to true-concurrency semantics, and
to semantics that model configurations as multisets of threads rather than lists
of threads.
Pushdown Systems with Concurrency A well-known model that extends
pushdown systems by concurrency is that of PA-processes [7, 8], a process-algebraic
model of parallelism, where a configuration is represented as a term containing
sequential- and parallel-composition operators.
PA-processes have been integrated into a nice hierarchy of process models,
called the process rewrite system hierarchy (PRS-hierarchy), by Mayr [84]. We
briefly present Mayr’s definition of PA-processes (called (1,G)-PRS in the PRS-
hierarchy): PA-processes can be expressed as a system of transition rules on
process terms. In Mayr’s setting, there is a set Act of actions and a set Var of
process variables. A process term has the form:
P ::= ε | Var | P.P | P ‖ P
where ε is the empty term, Var is a process variable, t1.t2 is interpreted as sequen-
tial composition of t1 and t2, and t1 ‖ t2 is interpreted as parallel composition of
t1 and t2. Mayr always regards equivalence classes of process terms modulo as-
sociativity and commutativity of parallel composition, associativity of sequential
composition, and neutrality of the empty process w.r.t. sequential and parallel
composition (i.e., ε.t = t.ε = t and t ‖ ε = t). A PA-process is given by a set of
rules ∆ that are of the form x a−→ t, where x ∈ Var is a process variable, a ∈ Act
is an action and t ∈ P is an arbitrary process term. The rules of a PA-process
induce a labeled transition system on process terms as follows:
x a−→ t, if (x a−→ t) ∈ ∆
t1 ‖ t2 a−→ t′1 ‖ t2, if t1 a−→ t′1
t1 ‖ t2 a−→ t1 ‖ t′2, if t2 a−→ t′2
t1.t2 a−→ t′1.t2, if t1 a−→ t′1
When compared with DPNs, process variables match the stack-symbols and se-
quential composition matches the stack. As there is no notion of control-state,
any state of the process must be encoded into the stack-symbols. This implies the
first restriction compared to DPNs: While in a DPN, a rule of the form pγ a↪→ p′
can return a result to the caller, there is no such possibility in PA-processes, i.e.,
in a process term of the form t1.t2, the outcome of t1 cannot affect the result of
t2. In the PRS-hierarchy, this can be achieved by PAD-processes ((S,G)-PRS)
that have rules of the form x.y a−→ t′.
Parallelism in PA-processes and DPNs is, however, handled completely differently.
While concurrency in PA-systems is bound to the callstack, in DPNs, concurrency
adds a new dimension of infinity to the state-space. For example, after
executing a spawn-rule in a DPN, say pγ a↪→ pγs♯pγ′ that induces the transition
pγγc a−→ pγspγ′γc, both threads continue their executions independently, each on
its own stack. In particular, the spawning thread may pop stack-symbols while
the spawned thread is still running, as in the transition
pγspγ′γc −→ pγspγc. However, in PA-processes, after execution
of a rule of the form x a−→ t1 ‖ t2, inducing, for example, the transition
x.xc a−→ (t1 ‖ t2).xc, we have to wait until both processes t1 and t2 have terminated
before we can return to xc.
PA-systems (and also PAD-systems) are not an adequate model for dynamic
thread creation. This inadequacy is postulated by Bouajjani et al. [16], and
supported by showing that there is a DPN whose trace language does not match
the trace language of any PA-process. On the other hand, DPNs cannot model
the synchronization on termination of two parallel threads.
Unifying both models leads to Constrained DPNs (CDPNs) [16] that are a
generalization of both DPNs and PAD-processes. A CDPN is a DPN, where a
rule may be constrained by a stable regular condition on the control-states of the
threads spawned by the thread that executes the rule. A stable regular condition
is a regular expression over stable sets of control-states, where a stable set is a
set of control-states that is closed under transitions, i.e., once the control-state of
a thread is in a stable set, its control-state will remain in that stable set forever.
In particular, stable constraints allow a thread to wait until a spawned thread
has terminated.
In order to model the relationship of a thread to the threads that it has
spawned, configurations of CDPNs are represented as trees rather than lists. In
such a configuration tree, each node stores the configuration of a single thread,
and the successors of a node hold the configurations of the threads spawned by
the thread of this node. As shown by Bouajjani et al. [16], predecessor sets of
regular sets of configuration trees are again regular and can be computed. Hence,
reachability properties of CDPNs are decidable.
A common model for analyses with locks is that of parallel pushdown
systems (PPDS). A PPDS consists of a fixed number of threads, where each
thread is described by a pushdown system. PPDS are less general than both
PA and DPN. In the first paper on precise analysis of concurrent programs with
procedures and locks, Kahlon et al. [57] use parallel pushdown systems, and so
does most of the work based on this paper [39, 40, 54–56, 61, 63].
Regarding dynamic thread creation, we consider interprocedural flowgraphs
with monitors in [78], which are equivalent to Monitor-DPNs with a single control-
state. In [79], we consider DPNs with well-nested, non-reentrant locks and in [44],
we consider DPNs with well-nested, non-reentrant locks and join-operations. In
this thesis, we consider Monitor-DPNs. We do not handle join-operations, but
it should be possible to combine the techniques in this thesis and the techniques
that we developed in [44], in order to analyze Monitor-DPNs with join-operations.
However, lock-sensitive reachability analysis is PSPACE-hard when considering
join-operations, while it is NP-complete without join-operations (cf. Chapter 9).
Locks and Monitors. All precise analyses for concurrent programs with procedures
and locks that the author is aware of either restrict to non-reentrant locking
where locks are well-nested [39, 40, 44, 55–57, 61, 63, 79] or have bounded lock-
chains [54], or they consider reentrant monitors [60, 78]. We discuss the prob-
lem of analyzing models with reentrant locks that do not adhere to a monitor-
discipline in Section 8.1. Analysis of pushdown systems with locks that are not
well-nested is undecidable in general [57]. Recently, it has been shown that a
generalization of well-nestedness, so-called bounded lock-chains, is sufficient to
decide reachability [54].
Regarding well-nested, non-reentrant locks and reentrant monitors, neither is
more general than the other. As an example, consider the following pushdown
process:
    p → acquire x; q
    q → s1; q; s2        q → release x
where s1 and s2 are arbitrary statements. Clearly, executions starting at
p use locks in a well-nested, non-reentrant fashion. However, there is no push-
down process using monitors that has the same set of executions. Vice versa,
reentrant acquisition of monitors obviously cannot be modeled by non-reentrant
locks. However, DPNs with reentrant monitors can be converted to DPNs with
non-reentrant monitors, at the cost of an exponential blowup in the number of
monitors. For PPDS, such a conversion is presented in [60]. For Monitor-DPNs,
the idea of the conversion is to encode the set of currently acquired locks into
the control-state of the DPN, and flag each symbol on the stack whether the lock
associated to it is reentrant or not. These flags are required to distinguish reen-
trant from non-reentrant release-operations. In this thesis, a similar conversion
is performed in Section 4.1.
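The executions of the example process p from above can be enumerated with a short sketch. It assumes that s1 and s2 are atomic statements and that q unfolds its recursive rule n times before choosing the release rule; the function name `executions` is hypothetical.

```python
def executions(max_depth):
    """Yield the executions of p in which q recurses n = 0..max_depth times."""
    for n in range(max_depth + 1):
        yield ["acquire x"] + ["s1"] * n + ["release x"] + ["s2"] * n

for run in executions(2):
    print(" ; ".join(run))
```

In every execution the lock is acquired once and released once, so locking is well-nested and non-reentrant. The release, however, happens n recursion levels below the acquisition, which gives some intuition for why a lock bound to a syntactic block cannot reproduce this set of executions.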
More Powerful Communication. In PA-processes, communication is limited
to allow parallel processes to synchronize on termination. In DPNs, apart from
dynamic creation of threads, there is no possibility for threads to communicate.
CDPNs allow a restricted form of communication, as a thread may observe the
state of its children, but only with stable constraints. Locks and monitors add
some limited form of communication, by ensuring mutual exclusion.
In contrast, real programming languages allow for more powerful communi-
cation mechanisms. A common concept is to use shared memory that may be
read and written by any thread. Other well-known concepts include rendezvous-
communication, where two threads synchronize on a specific statement that both
threads must execute simultaneously, or message-passing, where data is passed
between threads via messages. However, precise reachability analysis of systems
with (at least) two pushdown threads and any of the above communication con-
cepts (shared memory, rendezvous, message-passing) is undecidable [104]. The
basic idea is to reduce the emptiness problem of the intersection of two context-
free languages to reachability queries of the pushdown systems, using communi-
cation to ensure that the executions of both pushdown systems produce the same
terminals.
This problem can be attacked by over- or under-approximation. The simplest
method is to ignore the effects of the additional synchronization, leading to an
over-approximation. Properties like absence of data-races are often ensured by
using locks, such that this approximation is often precise enough to prove those
properties. Another over-approximation, which is based on over-approximating
runs of communicating pushdown systems (CPDS), has been proposed by Boua-
jjani et al. [14].
A natural way to design under-approximations is bounded model-checking [9,
10], where the state-space of the system is only explored up to a certain depth. For
example, the KISS tool [102] considers only executions with up to two context switches.
This is extended by [101], where parallel pushdown systems (see footnote 2) are analyzed for
executions up to k context switches. In [15], this is generalized to DPNs with
shared global state. Moreover, while the bound in [101] is based on context
switches, the bound in [15] is based on how often information is passed between
threads via shared global state, while there may be unboundedly many context
switches and threads. The idea of [101] has also been adapted to concurrent
weighted pushdown systems [68], allowing for certain infinite-state abstractions
of program data. In Section 6.4, we discuss how to use the methods of our thesis
to increase the precision of bounded model-checking for Monitor-DPNs.
True-Concurrency Semantics. The ideas behind our tree-semantics are based
on true-concurrency semantics. True-concurrency semantics was first studied
for Petri-nets [98, 99], and also inspired the theory of traces [87]. As illustrated
in Example 3.4, an execution-tree induces a partial ordering on the multiset of
actions. Such partially ordered multisets, also called pomsets, have been intro-
duced as partial strings by Grabowski [48], and applied to Petri-nets. The term
"pomset" was first used by Pratt [100].
Apart from encoding a pomset, an execution-tree specifies an order on the suc-
cessors of an action, i.e., a spawn-node has a left successor, which describes the
spawned thread, and a right successor, which describes the continuation of the
spawning thread. This additional information is essential, as it allows us to track the
execution of a thread throughout the execution-tree. Examples are tree concatenation,
which continues the execution of the root thread (cf. Example 3.19), and
the lock-sensitive scheduler (cf. Section 3.3.3), which assigns the empty lockstack
to the spawned thread and the original lockstack to the spawning thread when
executing a spawn-node.
Ordered Configurations. As mentioned above, the successors of a node in an
execution-tree are ordered, and also the configurations of a DPN are an ordered
list of thread-configurations. When regarding CDPNs [16] or Join-Lock-DPNs
[44], configurations are modeled as trees, where each node represents a thread-
configuration, and the successors of a node are the spawned threads of this node.
Also in these models, the successors of a node are ordered. However, in typical
concurrent programs, there is no notion of ordering on threads. Indeed, in [78] we
represent configurations as unordered multisets of thread-configurations. However,
in order to apply the automata-theoretic techniques developed for DPNs,
we have to specify configurations as lists.
Footnote 2: They handle creation of a constant number of threads: The pushdown system that describes
the thread is started by the spawn-statement using communication.
Lugiez [81] studies DPNs with configurations that are unordered, unranked
trees. Sets of such trees are described by Presburger Tree Automata [123], and
their extension, Presburger Weighted Tree Automata [81]. While for the ordered
setting, the set of forward reachable configurations of a CDPN is not regular
[16], the main result of [81] is that, in the unordered setting, the set of forward
reachable configurations is accepted by a Presburger Weighted Tree Automaton.






4 Lock-Sensitive Schedulability

In the last chapter, we have defined Monitor-DPNs with an interleaving semantics
and a tree-semantics. Corollary 3.26 states that there is a lock-sensitive execution
between two configurations, if and only if there is a schedulable lock-execution-
hedge between those configurations. In this chapter, we show how to characterize
schedulable lock-execution-hedges, using a method based on acquisition histories
[57]. In the next chapter, we then show how to combine this characterization
with the Monitor-DPN to be analyzed.
The characterization of schedulable lock-execution-hedges is simpler to describe
and prove correct if only schedules are regarded that execute the steps between
matching acquisition- and release-operations atomically, i.e., no steps of other
threads are executed in between. Schedules with this property are called disci-
plined schedules. An arbitrary schedule of a lock-execution-hedge can always be
reordered to a disciplined schedule. Thus, it is sufficient to regard disciplined
schedules in order to describe schedulable lock-execution-hedges. Moreover, we
only need to consider non-reentrant acquisition- and release-steps, as reentrant
ones are always executable and have no effect on the set of allocated locks.
Lock-execution-hedges are an adequate model for arbitrary schedules, as the
scheduler processes one node per step. Similarly, we define lock-acquire/release-
hedges (lock-a/r-hedges) as the adequate model for disciplined schedules. In lock-
a/r-hedges, the nodes between (outermost) matched acquisition and release pairs
are represented by single nodes. Moreover, they contain no reentrant acquisitions
and releases.
This chapter is organized as follows: In Section 4.1, we define lock-a/r-hedges
and show how to transform lock-execution-hedges to lock-a/r-hedges. In Sec-
tion 4.2, we show that schedulability of lock-a/r-hedges and lock-execution-hedges
matches, exploiting that schedules can be reordered to disciplined schedules. In
Section 4.3, we use acquisition structures, a generalization of acquisition histories
[57], to characterize the set of schedulable lock-a/r-hedges as a tree automaton.
Finally, we briefly summarize the results of this chapter and discuss related work
in Section 4.4.
4.1 Acquire/Release-Hedges
In this section, we transform a lock-execution-hedge by collapsing nodes between
matched acquisition- and release-operations into single nodes, and renaming reentrant
lock-operations to operations that have no effect. The resulting structure
is called a lock-acquire/release-hedge.
For this section, let M = (P,Γ,Γ⊥,Act,X ,∆, locks) be a fixed Monitor-DPN.
We first define lock-acquire/release-hedges:
Definition 4.1 (Lock-Acquire/Release-Hedges). We define the following signa-
ture HARls for lock-acquire/release-hedges (lock-a/r-hedges for short):
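$H^{AR}_{ls} ::= \varepsilon_h \mid (T^{AR}, X) \mathbin{\#_h} H^{AR}_{ls}$ for $X \subseteq \mathcal{X}$
$T^{AR} ::= \tau \mid \langle\rangle_X(S^{AR}, T^{AR}) \mid \rangle_x(T^{AR}) \mid \langle_x(T^{AR})$ for $x \in \mathcal{X}$, $X \subseteq \mathcal{X}$
$S^{AR} ::= \varepsilon_s \mid T^{AR} \mathbin{\#_s} S^{AR}$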
The elements of TAR are called acquire/release-trees (a/r-trees for short). Ele-
ments of HARls and SAR are also written as lists, omitting the ε- and #-symbols,
and we use a similar notation as for execution-trees to avoid parentheses: We
write 〈xt, 〉xt, and 〈〉X(s)t instead of 〈x(t), 〉x(t), and 〈〉X(s, t).
Moreover, we assume that the sets of locks in the second components of the
elements of a lock-a/r-hedge are pairwise disjoint, that the locks in the trees are
non-reentrant, and that there are no matched acquire/release-pairs. Formally, we
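define the set of well-formed lock-a/r-hedges by $\mathrm{WF} := \mathrm{WF}^{\emptyset}_h$. For all $X \subseteq \mathcal{X}$, the
auxiliary sets $\mathrm{WF}^X_h \subseteq H^{AR}_{ls}$, $\mathrm{WF}^X_t \subseteq T^{AR}$, and $\mathrm{WF}_s \subseteq S^{AR}$ are defined as the least
solution of the following constraints:
$\varepsilon_h \in \mathrm{WF}^X_h$ if $X \subseteq \mathcal{X}$ (h-empty)
$(t, Y) \mathbin{\#_h} h \in \mathrm{WF}^X_h$ if $t \in \mathrm{WF}^Y_t \land X \cap Y = \emptyset \land h \in \mathrm{WF}^{X \cup Y}_h$ (h-cons)
$\tau \in \mathrm{WF}^X_t$ if $X \subseteq \mathcal{X}$ (leaf)
$\rangle_x t \in \mathrm{WF}^{\{x\} \cup X}_t$ if $x \notin X \land t \in \mathrm{WF}^X_t$ (release)
$\langle\rangle_Y(s)\, t \in \mathrm{WF}^X_t$ if $X \cap Y = \emptyset \land s \in \mathrm{WF}_s \land t \in \mathrm{WF}^X_t$ (use)
$\langle_x t \in \mathrm{WF}^X_t$ if $x \notin X \land t \in \mathrm{WF}^{\emptyset}_t \cap \mathrm{WF}^{\{x\} \cup X}_t$ (acquire)
$\varepsilon_s \in \mathrm{WF}_s$ (s-empty)
$t \mathbin{\#_s} s \in \mathrm{WF}_s$ if $t \in \mathrm{WF}^{\emptyset}_t \land s \in \mathrm{WF}_s$ (s-cons)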
For the remainder of this thesis, we restrict the set HARls to well-formed lock-a/r-
hedges.
As for lock-execution-hedges, we use the operators |1 and |2 to project a lock-
a/r-hedge to its a/r-hedge and its lockstacks, respectively.
Note that we explicitly defined the constructors ε and # for hedges and lists of
spawned threads, such that tree automata can be defined over those structures.
Intuitively, an element (t,X) of a lock-a/r-hedge consists of an a/r-tree t de-
scribing the execution of a thread, and a set X ⊆ X of locks that the thread
holds initially. In an a/r-tree, a 〈〉X(s, t)-node summarizes a sequence of nodes
between a matched acquisition- and release-node. The set X ⊆ X is the set of
used locks, i.e., the set of locks acquired (and released) in the summarized nodes.
The list s ∈ SAR is the list of spawned threads, and t ∈ TAR is the remainder of
the root thread. Nodes of the form 〈xt and 〉xt represent unmatched acquisitions
and releases.
Intuitively, in the WFXh -predicate, the set X models the locks that are initially
held by other elements of the hedge. The (h-cons)-constraint ensures that the
sets of initially held locks are pairwise disjoint, and that the lock-a/r-trees are
well-formed w.r.t. their set of initially held locks.
In the WFXt -predicate, the set X models the locks that are currently held
by the thread. The (release)-constraint ensures that the released lock is really
held by the thread. The (use)-constraint ensures that the set of used locks is
non-reentrant and that the list of spawned threads is well-formed. The (acquire)-
constraint ensures that the acquisition is non-reentrant (x /∈ X), and that the
remainder of the tree does not free any locks, i.e., that the acquisition is really
unmatched (t ∈ WF∅t ). Finally, the WFs-predicate ensures that all a/r-trees in
the list are well-formed w.r.t. the empty set of initially held locks.
Note that the WFt-predicate is defined similarly to the lock-transition relation on
execution-trees (cf. Definition 3.16). Because we have no reentrance, it is sufficient
to use sets of locks, instead of the lockstacks used for the lock-transition relation.
Additionally, the WFt-predicate makes explicit only the start lockset, while the
lock-transition relation is defined as a relation between two lockstacks.
The well-formedness of lock-a/r-hedges is similar to the consistency constraint
for lock-execution-hedges (cf. Definition 3.23). Additionally, it ensures that all
matching acquisition- and release-nodes are collapsed into use-nodes, and that
reentrance is completely eliminated, i.e., that there are no reentrant acquisition-
and release-nodes and that use-nodes are not annotated with reentrant locks.
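The well-formedness constraints of Definition 4.1 translate almost literally into a recursive check. The following is a minimal sketch under an assumed encoding that is not part of the thesis: a/r-trees are nested tuples with the tags `"leaf"`, `"acq"`, `"rel"` and `"use"`, and a lock-a/r-hedge is a list of (tree, initially held locks) pairs. All function names are hypothetical.

```python
# Assumed encoding of a/r-trees:
#   ("leaf",)          tau
#   ("acq", x, t)      unmatched acquisition of lock x, followed by t
#   ("rel", x, t)      unmatched release of lock x, followed by t
#   ("use", Y, s, t)   use-node: used locks Y, spawned trees s, followed by t
LEAF = ("leaf",)

def wf_tree(t, held):
    """Check that t is well-formed w.r.t. the set `held` of currently held locks."""
    held = frozenset(held)
    kind = t[0]
    if kind == "leaf":                                   # (leaf)
        return True
    if kind == "rel":                                    # (release)
        _, x, rest = t
        return x in held and wf_tree(rest, held - {x})
    if kind == "use":                                    # (use)
        _, used, spawned, rest = t
        return (held.isdisjoint(used)
                and wf_spawned(spawned)
                and wf_tree(rest, held))
    if kind == "acq":                                    # (acquire)
        _, x, rest = t
        return (x not in held
                and wf_tree(rest, frozenset())           # rest frees no locks
                and wf_tree(rest, held | {x}))
    raise ValueError(f"unknown node kind: {kind!r}")

def wf_spawned(trees):
    """(s-empty)/(s-cons): every spawned tree starts without any locks."""
    return all(wf_tree(t, frozenset()) for t in trees)

def wf_hedge(hedge, others=frozenset()):
    """(h-empty)/(h-cons): `others` are the locks held by earlier elements."""
    others = frozenset(others)
    for tree, held in hedge:
        held = frozenset(held)
        if not others.isdisjoint(held) or not wf_tree(tree, held):
            return False
        others |= held
    return True
```

For example, `("acq", "x", ("rel", "x", LEAF))` is rejected for the empty lock set: the pair is matched and would have to be represented by a use-node instead.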
4.1.1 Mapping Execution-Hedges to A/R-Hedges
In order to map lock-execution-hedges to lock-a/r-hedges, we have to identify
matching acquisitions and releases in the lock-execution-hedge. This is achieved
by the lock-transition relation (cf. Definition 3.16):
Lemma 4.2 (Case Distinction for Lock-Execution-Trees). The following case
distinction for lock-execution-trees (t, µ) ∈ Tls is exhaustive and unambiguous:
t = τ (leaf)
t = 2at′ ∧ (t′, µ) ∈ Tls for some a, t′ (base)
t = a(ts)t′ ∧ (ts, ε) ∈ Tls ∧ (t′, µ) ∈ Tls for some a, ts, t′ (spawn)
t = 〈xts〉xt′ ∧ ts ∈ Tsl ∧ (t′, µ) ∈ Tls for some x, ts, t′ (use)
t = 〈xt′ ∧ t′ ∈ Tnr for some x, t′ (acquire)
t = 〉xt′ ∧ µ = xµ′ ∧ (t′, µ′) ∈ Tls for some x, µ′, t′ (release)
Proof. We show that (t, µ) matches exactly one of the cases. From the definition
of Tls we obtain a µ′ such that µ t⇀ µ′. If t = τ , it is matched exactly by the
(leaf)-case. Otherwise, by definition of ⇀, the subtrees of t are also well-nested.
Hence, if the root-node of t is a base- or spawn-node, it is matched exactly by
the (base)- and (spawn)-case, respectively. If the root-node of t is a release-node,
by definition of ⇀, the topmost lock on the lockstack matches the released lock.
Thus, this case is matched exactly by the (release)-case. If the root-node of t
is an acquisition, i.e., t = 〈xt′, we have µ 〈x⇀ xµ t′⇀ µ′. We apply Lemma 3.21
(Proposition 5), and the two resulting disjoint cases a) and b) match the (use)-
and (acquire)-case.
Intuitively, Lemma 4.2 distinguishes whether an acquisition-node is matched
(use) or unmatched (acquire). In the matched case, the tree between the acqui-
sition and the matching release is a same-level tree, and in the unmatched case,
it is a non-releasing tree.
Now, we define the mapping from lock-execution-hedges to lock-a/r-hedges:
Definition 4.3. The functions ar := arh, arh : Hls → HARls, art : Tls → TAR,
and ars : Twn × 2X → 2X × TAR∗ are inductively defined over the structure of
lock-execution-hedges, using the case distinction of Lemma 4.2.
arh(ε) = ε
arh((t, µ)h) = (art(t, µ), set(µ))arh(h)
art(τ, µ) = τ
art(2at, µ) = 〈〉∅(ε)art(t, µ)
art(a(t1)t2, µ) = 〈〉∅([art(t1, ε)])art(t2, µ)
art(〈xt, µ) = 〈〉∅(ε)art(t, xµ) if t ∈ Tnr ∧ x ∈ set(µ)
art(〈xt, µ) = 〈xart(t, xµ) if t ∈ Tnr ∧ x /∈ set(µ)
art(〉xt, xµ) = 〈〉∅(ε)art(t, µ) if x ∈ set(µ)
art(〉xt, xµ) = 〉xart(t, µ) if x /∈ set(µ)
art(〈xt1〉xt2, µ) = 〈〉({x}\set(µ))∪u(s)art(t2, µ) if t1 ∈ Tsl ∧ ars(t1, set(µ)) = (u, s)
ars(τ,X) = (∅, ε)
ars(2at,X) = ars(t,X)
ars(a(t1)t2, X) = (u, art(t1, ε)s) if ars(t2, X) = (u, s)
ars(〈xt,X) = (({x} \X) ∪ u, s) if ars(t,X) = (u, s)
ars(〉xt,X) = ars(t,X)
The definition of art follows the case distinction of Lemma 4.2. Moreover, it
keeps track of the current lockstack, to eliminate reentrance. Reentrant locks
in use-nodes are simply not included into the set of used locks. Reentrant
acquisition- or release-nodes are replaced by a dummy use-node 〈〉∅(ε) that spawns
no threads and uses no locks. Also base-steps are translated to such dummy
nodes. Note that we intentionally defined ar such that reentrance elimination
preserves the structure of the tree, and replaced reentrant nodes as well as base-
nodes by dummy nodes, instead of omitting them in the translated tree. This
simplifies the simulation proof between the scheduler on lock-execution-hedges
and the one on lock-a/r-hedges (cf. proof of Lemma 4.11).
Well-Definedness of Definition 4.3. We have to show that the result of the ar-
function is well-formed.
Let h˜ = ar(h) for a lock-execution-hedge h ∈ Hls. We show h˜ ∈ WF∅h. Dis-
jointness of the sets of initially held locks in h˜ follows from disjointness of the
lockstacks in h. It remains to show that art(t, µ) ∈ WFset(µ)t for (t, µ) ∈ Tls.
This is done by induction on t, according to the case distinction of Lemma 4.2.
Generalizing the goal yields:
(t, µ) ∈ Tls =⇒ art(t, µ) ∈ WFset(µ)t
µ ts⇀ ε =⇒ ars(ts, X) ∈ 2X\X × WFs
We demonstrate the case t = 〈xt1 for x /∈ µ and t1 ∈ Tnr and the case t = 〈xt1〉xt2
for t1 ∈ Tsl. The other cases are straightforward or analogous.
In the case t = 〈xt1 for x /∈ µ and t1 ∈ Tnr, we have art(t, µ) = 〈xart(t1, xµ). By
induction hypothesis, we have art(t1, xµ) ∈ WFset(xµ)t . As t1 ∈ Tnr, it contains no
unmatched release-nodes, and neither does art(t1, xµ), hence we also have art(t1, xµ) ∈
WF∅t , and the proposition follows with the (acquire)-constraint of WFt.
In the case t = 〈xt1〉xt2 for t1 ∈ Tsl, we have art(t, µ) = 〈〉({x}\set(µ))∪u(s)art(t2, µ)
for (u, s) := ars(t1, set(µ)). By induction hypothesis, we have u ∩ set(µ) = ∅, s ∈
WFs and art(t2, µ) ∈ WFset(µ)t . The proposition follows with the (use)-constraint
of WFt.
4.2 Schedules of Acquire/Release-Hedges
In the last section, we defined lock-a/r-hedges and showed how to transform
lock-execution-hedges to lock-a/r-hedges. In this section, we define a scheduler
on lock-a/r-hedges and show that a lock-execution-hedge is schedulable if and
only if its lock-a/r-hedge is schedulable. While the if-direction of this theorem
is rather straightforward, the only if-direction requires reordering of the steps of
a schedule such that no steps of other threads are scheduled between matched
acquisition- and release-nodes of a thread.
The scheduler on lock-a/r-hedges is defined similarly to that on lock-execution-hedges:
In each step, it selects a schedulable root-node of the hedge, and replaces
it by its successors. Additionally, we force the scheduler not to schedule release-
nodes after acquisition-nodes.
Definition 4.4 (Scheduler on Lock-A/R-Hedges). We define the relations
 u, a, r⊆ HARls × HARls
as the least relations that satisfy the following constraints:
h1(〈〉Y (s, t), X)h2  u h1lift(s)(t,X)h2 if Y ∩ ⋃(h1h2)|2 = ∅ (use)
h1(〈x(t), X)h2  a h1(t, {x} ∪X)h2 if x /∈ ⋃(h1h2)|2 (acquire)
h1(〉x(t), {x} ∪X)h2  r h1(t,X)h2 if x /∈ X (release)
Where lift(t1 . . . tn) := (t1, ∅) . . . (tn, ∅). Moreover, we define  ru:= r ∪  u,
 au:= a ∪ u, and  := r ∪ u ∪ a.
A schedule of a lock-a/r-hedge completely schedules the hedge by first applying
a sequence of release- and use-steps ( ∗ru), and then a sequence of acquisition-
and use-steps ( ∗au). Analogously to the set schedls : Hls → 2(ActX )∗, we define the
predicate schedar : HARls → B by
schedar(h) :⇐⇒ ∃h′. h′|1 ∈ {τ}∗ ∧ h ∗ru ◦ ∗au h′.
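The scheduler of Definition 4.4 and the predicate schedar can be prototyped by exhaustive search. The sketch below is an illustration under assumptions, not the algorithm developed in this thesis (which characterizes schedulability by a tree automaton in Section 4.3). It assumes a hashable encoding of a/r-trees as nested tuples, with the used locks of a use-node given as a frozenset and its spawned trees as a tuple; all names are hypothetical.

```python
LEAF = ("leaf",)

def locks_of_others(hedge, i):
    """Union of the lock sets of all hedge elements except element i."""
    return frozenset().union(*(held for j, (_, held) in enumerate(hedge) if j != i))

def steps(hedge, kinds):
    """Yield every hedge reachable by one step whose node kind is in `kinds`."""
    for i, (tree, held) in enumerate(hedge):
        kind = tree[0]
        if kind not in kinds:
            continue
        others = locks_of_others(hedge, i)
        if kind == "use":                      # (use): used locks must be free
            _, used, spawned, rest = tree
            if others.isdisjoint(used):
                lifted = tuple((s, frozenset()) for s in spawned)
                yield hedge[:i] + lifted + ((rest, held),) + hedge[i + 1:]
        elif kind == "acq":                    # (acquire): lock must be free
            _, x, rest = tree
            if x not in others:
                yield hedge[:i] + ((rest, held | {x}),) + hedge[i + 1:]
        elif kind == "rel":                    # (release): lock must be held
            _, x, rest = tree
            if x in held:
                yield hedge[:i] + ((rest, held - {x}),) + hedge[i + 1:]

def schedulable(hedge):
    """Search for release/use steps followed by acquire/use steps that
    reduce every tree of the hedge to a leaf."""
    start = tuple((t, frozenset(h)) for t, h in hedge)
    todo, seen = [(start, 0)], {(start, 0)}    # phase 0: r/u steps, phase 1: a/u steps
    while todo:
        h, phase = todo.pop()
        if all(t == LEAF for t, _ in h):
            return True
        if phase == 0:
            succs = [(n, 0) for n in steps(h, {"rel", "use"})] + [(h, 1)]
        else:
            succs = [(n, 1) for n in steps(h, {"acq", "use"})]
        for s in succs:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return False
```

Two threads that perform unmatched acquisitions of x and y in opposite order form an unschedulable hedge, whereas a hedge in which one thread releases x and another acquires x is schedulable: the release is performed in the first phase and the acquisition in the second.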
The following theorem connects the scheduler on lock-a/r-hedges with the
scheduler on lock-execution-hedges.
Theorem 4.5. A lock-execution-hedge has a schedule if and only if its corre-
sponding lock-a/r-hedge has a schedule. Formally, for any lock-execution-hedge
h ∈ Hls, we have:
schedls(h) 6= ∅ ⇐⇒ schedar(ar(h))
In the remainder of this section, we prove this theorem. We proceed as follows:
First, we define a disciplined scheduler on lock-execution-hedges, that schedules
sequences of nodes between matched acquisitions and releases atomically. We
show that there exists a corresponding disciplined schedule for any schedule. In
this step, the main work of the proof is done. Next, we show a stepwise simulation
between the disciplined scheduler on lock-execution-hedges and the scheduler on
lock-a/r-hedges. In this step, we essentially prove the reentrance elimination
correct.
4.2.1 A Theory of Movers
For reordering of schedules to disciplined schedules, we adopt the concept of
movers from Lipton [80]. Section 4.4 contains a detailed comparison of the theory
developed here and [80].
In this subsection, we first describe a theory of movers for a rather general
setting, and then instantiate it to lock-sensitive scheduling. We assume that a
configuration is described by a list of process-configurations from P . Each process
may own locks from X , described by a function locks : P → 2X . The locks held by
the processes in a configuration are assumed to be disjoint. The key observation
is that a step of a process in a configuration only depends on the state of the
process itself and the locks held by the other processes. We describe transitions
by a step-relation r ⊆ P × P∗ × P × 2X . Instead of (p, h, p′, X) ∈ r, we write
p→r h, p′[X], meaning that, within a context where the other processes hold the
locks from X, process p makes a step to p′, spawning the processes in h. We lift
the step-relation to configurations:
h1ph2 ;r h1hsp′h2 iff p→r hs, p′[locks(h1h2)],
where locks(p1 . . . pn) := locks(p1) ∪ . . . ∪ locks(pn).
Moreover, we assume that step-relations preserve disjointness of locks, are antimonotone
w.r.t. the context, and that freshly spawned processes own no locks:
Definition 4.6 (Valid Step-Relations). A step-relation r is called valid, iff
locks(p) ∩X = ∅ ∧ p→r h, p′[X] =⇒ locks(p′) ∩X = ∅
X ⊆ X ′ ∧ p→r h, p′[X ′] =⇒ p→r h, p′[X]
p→r h, p′[X ′] =⇒ locks(h) = ∅
Steps that release locks or do not alter the set of locks owned by the process
are called decreasing steps. Steps that acquire locks or do not alter the set of
locks are called increasing steps. Formally:
r decreasing iff p→r h, p′[X] =⇒ locks(p) ⊇ locks(p′)
r increasing iff p→r h, p′[X] =⇒ locks(p) ⊆ locks(p′)
Two consecutive steps may either be unrelated, on the same process, or the
second step may be on a process spawned by the first step. The last two cases
are expressed by the combinators • and , that combine two step-relations into
a new step-relation:
p→r•s h1h2, p′[X] if p→r h1, p˜[X] ∧ p˜→s h2, p′[X] (seq)
p →rs h1h2p′sh3, p′[X] if p→r h1psh3, p′[X] ∧ ps →s h2, p′s[X ∪ locks(p′)] (spawn)
Recall that we assume that freshly spawned processes own no locks, thus processes
spawned by the first step are not considered for the context of the second step.
Moreover, we define the identity step-relation id by
p→id h, p′[X] iff p′ = p ∧ h = ε.
It is straightforward to show the following properties:
Lemma 4.7.
1. If r and s are valid, so are r • s and r  s. Moreover, id is valid.
2. • is associative, and id is left- and right-neutral:
(r • s) • t = r • (s • t) and r • id = id • r = r.
3. Sequential composition and spawn-composition imply composition of the
lifted relations:
;r•s ⊆;r ◦;s and ;rs ⊆;r ◦;s .
4. Union distributes over lifting: ;r∪s = ;r ∪;s .
The main statement of this subsection is the following:
Lemma 4.8 (Mover-Lemma). If there is an increasing step, followed by a de-
creasing step on an unrelated process, these steps can be swapped.
Formally, this is expressed as a case distinction: Given configurations h, h′ ∈
P∗, an increasing step-relation r, and a decreasing step-relation s, we have:
h;r ◦;s h′ =⇒ (h;r•s h′ ∨ h ;rs h′ ∨ h;s ◦;r h′)
Note that the above cases are not necessarily disjoint. For example, we have
h;id h;id h and also h ;id•id h for the identity step-relation id.
Proof. The proof works by analyzing the relation of the first and second step. If
the first and second step are in the same process, we get h ;r•s h′. If the second
step is in a process that was spawned by the first step, we get h ;rs h′. Finally, if
the first and second step are in independent processes, they can be swapped: The
first step is increasing, hence, before the first step, the configuration has fewer
locks. As the step-relation is antimonotone, the second step can be performed on
that configuration. As the second step is decreasing, the resulting configuration
has fewer locks than the configuration the first step was originally executed on.
Thus, the first step can also be executed on that configuration.
Formally, we have to do rather extensive splitting of configurations: We assume
h ;r h˜ ;s h′. By definition of the lifting of → to configurations, we can write
h = h1ph2 and h˜ = h1hsp˜h2, such that p→r hs, p˜[locks(h1h2)]. We now distinguish
on which part of the configuration the second step is performed. If it is performed
on p˜, we get h′s and p′ such that h′ = h1hsh′sp′h2 and p˜ →s h′s, p′[locks(h1h2)], and
thus p→r•s hsh′s, p′[locks(h1h2)]. This yields h;r•s h′.
If the second step is performed on a process from hs, we obtain ha, hb, h′s, and
p′s such that hs = hapshb, h′ = h1hah′sp′shbp˜h2, and ps →s h′s, p′s[locks(h1p˜h2)].
Thus, we have p →rs hah′sp′shb, p˜[locks(h1h2)], which implies h ;rs h′.
If the second step is performed on h1, we obtain ha, h′s, hb, p2, and p′2 such
that h1 = hap2hb, h′ = hah′sp′2hbhsp˜h2, and p2 →s h′s, p′2[locks(hahbp˜h2)]. As r
is increasing, we have locks(p) ⊆ locks(p˜). Due to antimonotonicity of the step-
relation, we get p2 →s h′s, p′2[locks(hahbph2)], and thus h ;s hah′sp′2hbph2. As s is
decreasing, we have locks(p2) ⊇ locks(p′2). Moreover, freshly spawned processes
hold no locks, thus we have locks(h′s) = ∅. Due to antimonotonicity of the
step-relation, we get p →r hs, p˜[locks(hah′sp′2hbh2)], and thus hah′sp′2hbph2 ;r h′.
Together, we get h;s ◦;r h′.
The case that the second step is performed on h2 is shown analogously.
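The swapping case of the mover-lemma can be checked exhaustively on a small instance. The following sketch is only an illustration under simplifying assumptions: a configuration is reduced to the tuple of lock sets of two processes, the increasing step is a plain acquisition by process 0, and the decreasing step is a plain release by process 1. The names `acquire`, `release`, `configs` and `swap_holds` are hypothetical.

```python
from itertools import product

LOCKS = ("x", "y")

def acquire(config, i, lock):
    """Increasing step: process i takes `lock`; None if some process holds it."""
    if any(lock in held for held in config):
        return None
    return config[:i] + (config[i] | {lock},) + config[i + 1:]

def release(config, j, lock):
    """Decreasing step: process j frees `lock`; None if it does not hold it."""
    if lock not in config[j]:
        return None
    return config[:j] + (config[j] - {lock},) + config[j + 1:]

def configs():
    """All two-process configurations with disjoint lock sets."""
    # owner[k] in {None, 0, 1} states which process holds LOCKS[k], if any.
    for owner in product((None, 0, 1), repeat=len(LOCKS)):
        yield tuple(frozenset(l for l, o in zip(LOCKS, owner) if o == p)
                    for p in (0, 1))

def swap_holds():
    """Whenever 'acquire by 0, then release by 1' is executable, the swapped
    order must be executable as well and lead to the same configuration."""
    for c, a, b in product(configs(), LOCKS, LOCKS):
        mid = acquire(c, 0, a)
        end = release(mid, 1, b) if mid is not None else None
        if end is None:
            continue                      # premise of the lemma not satisfied
        mid2 = release(c, 1, b)
        if mid2 is None or acquire(mid2, 0, a) != end:
            return False
    return True
```

The converse direction fails, which is why the lemma is stated for an increasing step followed by a decreasing one: if process 1 holds x, then releasing x and afterwards acquiring it in process 0 is executable, but the acquisition cannot be moved in front of the release.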
Moreover, we denote by r∗ the reflexive, transitive closure of r w.r.t. •, i.e., r∗
is the least relation that satisfies the following constraints:
r∗ ⊇ id and r∗ ⊇ r • r∗
Instantiation of the above theory to scheduling of lock-execution-hedges is
straightforward: A process-state is a lock-execution-tree, i.e., P := Tls. The
locks-function maps a lock-execution-tree to the set of locks of its lockstack, i.e.
locks(t, µ) = set(µ). Then, configurations correspond to lock-execution-hedges,
and we have locks(h) = {x | ∃µ ∈ h|2. x ∈ µ}.
4.2.2 Disciplined Schedules of Execution-Hedges
We now define the step-relations of the disciplined scheduler on lock-execution-
hedges. The disciplined scheduler on lock-execution-hedges schedules nodes be-
tween matched acquisitions and releases in one step.
Definition 4.9 (Disciplined Scheduler on Lock-Execution-Hedges). We first de-
fine the auxiliary function used : Twn → 2X that maps a well-nested tree to the
set of locks used in the local branch of this tree. The function used is defined
inductively over the structure of an execution-tree:
used(τ) = ∅ used(2at) = used(t)
used(a(t1)t2) = used(t2) used(〈xt) = {x} ∪ used(t)
used(〉xt) = used(t)
As a second auxiliary function, we define the function spawned : Twn → Hls
that collects the spawned trees of the local branch of a well-nested tree, and pairs
them with empty lockstacks:
spawned(τ) = ε spawned(2at) = spawned(t)
spawned(a(t1)t2) = (t1, ε)spawned(t2) spawned(〈xt) = spawned(t)
spawned(〉xt) = spawned(t)
Then, we define the relations →u , →a , and →r as the least relations that satisfy
the following constraints:
(2at, µ)→u ε, (t, µ)[X] (base)
(a(ts)t, µ)→u (ts, ε), (t, µ)[X] (spawn)
(〈xts〉xt, µ)→u spawned(ts), (t, µ)[X] (use)
if ts ∈ Tsl ∧ ({x} ∪ used(ts)) ∩X = ∅
(〈xt, µ)→a ε, (t, xµ)[X] if x /∈ X (acquire)
(〉xt, xµ)→r ε, (t, µ)[X] (release)
Moreover, we define:
→ru :=→r ∪ →u and →au :=→a ∪ →u .
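The two auxiliary functions of Definition 4.9 are plain structural recursions along the local branch of a tree. A minimal sketch, assuming an encoding of execution-trees as nested tuples that is not part of the thesis: `("leaf",)`, `("base", a, t)`, `("spawn", a, t1, t2)` with spawned tree `t1` and continuation `t2`, `("acq", x, t)` and `("rel", x, t)`. The empty lockstack is represented by the empty tuple.

```python
LEAF = ("leaf",)

def used(t):
    """Locks acquired on the local branch of t (spawned trees are ignored)."""
    kind = t[0]
    if kind == "leaf":
        return frozenset()
    if kind == "spawn":
        return used(t[3])                      # continue in the spawning thread
    if kind == "acq":
        return frozenset({t[1]}) | used(t[2])
    if kind in ("base", "rel"):
        return used(t[2])
    raise ValueError(f"unknown node kind: {kind!r}")

def spawned(t):
    """Trees spawned on the local branch of t, each paired with an empty lockstack."""
    kind = t[0]
    if kind == "leaf":
        return []
    if kind == "spawn":
        return [(t[2], ())] + spawned(t[3])
    if kind in ("base", "acq", "rel"):
        return spawned(t[2])
    raise ValueError(f"unknown node kind: {kind!r}")

# acquire y; spawn a thread that acquires z; release y
child = ("acq", "z", LEAF)
ts = ("acq", "y", ("spawn", "b", child, ("rel", "y", LEAF)))
```

For `ts`, only y counts as used: the lock z is acquired by the spawned thread, which is scheduled as a separate element of the hedge.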
Obviously, these relations fulfill the assumptions for step-relations, i.e., they
preserve disjointness of locks, are antimonotone, and spawned processes own no
locks. Using the mover-lemma, we can reorder the steps of a schedule to a
disciplined schedule:
Lemma 4.10. Any lock-sensitive schedule has a corresponding disciplined sched-
ule on lock-execution-hedges and, vice versa, any disciplined schedule has a cor-
responding lock-sensitive schedule, i.e., for any lock-execution-hedge h ∈ Hls, we
have
schedls(h) 6= ∅ ⇐⇒ ∃h′ ∈ Hls. h′|1 ∈ {τ}∗ ∧ h;ru ∗ ◦;au ∗ h′.
Proof. First, we generalize the statement by omitting the condition that the
schedule must be complete:
(∃o¯. h o¯;∗ls h′) ⇐⇒ h;ru ∗ ◦;au ∗ h′.
For the ⇐=-direction, we observe that steps of the disciplined scheduler corre-
spond to single steps or sequences of steps of the original scheduler: For ;a and
















If the ;u -step was derived by the (base)- or (spawn)-rule, we can immediately
derive the corresponding step from ;ls. If it was derived by the (use)-rule,
the proposition follows from the following statement, which is easily shown by
induction on ts:
µ ts⇀ µ′ ∧ used(ts) ∩ (h1h2)|2 = ∅ =⇒ ∃o¯. (ts, µ) o¯;∗ls spawned(ts)(τ, µ′).
For the =⇒-direction, we show how steps that we append to a disciplined
schedule can be moved left into this schedule to a valid position. Formally, we
show:
h;ru ∗ ◦;au ∗ h˜ ;u∗•r h′ =⇒ h;ru ∗ ◦;au ∗ h′ (∗)
Note that we show how to shift left a sequence of use-steps, followed by a release-
step. This generalization is needed for the induction proof.
Moreover, for all base-actions a ∈ Act and locks x ∈ X , we have:
2a;ls ⊆ ;au , and
〈x;ls ⊆ ;au , and
〉x;ls ⊆ ;u∗•r.
Now, it is straightforward to construct the disciplined schedule, by append-
ing the steps of the lock-sensitive schedule one by one to the end of the already




It remains to prove (∗). This is done by induction on the number of ;au -steps
in h ;ru ∗ ◦ ;au ∗ h˜. In the base-case, we have h ;ru ∗ h˜ ;u∗•r h′. Due to Lemma 4.7,
we get
;u∗•r ⊆;ru ∗,
and thus h;ru ∗ h′. If there is at least one ;au -step, we pick the last one, and get
h;ru ∗ ◦;au ∗ hˆ;au h˜ ;u∗•r h′.
As au is increasing and u∗ • r is decreasing, we can apply the mover-lemma
(Lemma 4.8), and get the following cases:
hˆ ;au•u∗•r h′: In this case, the au and u∗ • r steps are executed on the same thread.
We distinguish whether the first step is an acquisition- or a use-step. In
case of a use-step, we prepend the use-step to the sequence of use-steps
before the release-step, and apply the induction hypothesis. Formally, this
corresponds to the inequation
u • u∗ • r ⊆ u∗ • r,
that is easily shown. If the first step is an acquisition-step, it is matched
with the release-step, obtaining a single use-step that can be executed as
last step of the disciplined schedule. Formally, this corresponds to the
inequation
a • u∗ • r ⊆ u∗,
that is also easily shown. (Note that the acquired and released lock do
match, as the intermediate use-steps do not modify the lockstack.)
hˆ ;au(u∗•r) h
′ In this case, a freshly spawned thread would execute some use-steps,
and then release a lock. However, as freshly spawned threads have an
empty lockstack, use-steps do not change the lockstack, and a lock cannot
be released from an empty lockstack, this yields a contradiction. Formally,
we easily show:
au (u∗ • r) = ∅
hˆ ;u∗•r ◦;au h′: In this case, the induction hypothesis yields h ;ru ∗ ◦ ;au ∗ ◦ ;au h′,
and thus we get the proposition.
Next, we connect disciplined schedules of lock-execution-hedges with schedules
of lock-a/r-hedges.
Lemma 4.11. A disciplined schedule of a lock-execution-hedge corresponds to a
schedule of the lock-a/r-hedge. Formally, for any lock-execution-hedge h ∈ Hls,
we have:
(∃h′ ∈ Hls. h′|1 ∈ {τ}∗ ∧ h;ru ∗ ◦;au ∗ h′) ⇐⇒ schedar(ar(h)).
Proof. After unfolding the definition of schedar, we have to show
(∃h′ ∈ Hls. h′|1 ∈ {τ}∗ ∧ h;ru ∗ ◦;au ∗ h′)
⇐⇒ (∃ha ∈ HARls. ha|1 ∈ {τ}∗ ∧ ar(h) ∗ru ◦ ∗au ha).
It is straightforward to show h′|1 ∈ {τ}∗ ⇐⇒ ar(h′)|1 ∈ {τ}∗. Then, the
statement is proved by the following bisimulation argument:
(∃h′. ha = ar(h′) ∧ h;r h′) ⇐⇒ ar(h) r ha (1)
(∃h′. ha = ar(h′) ∧ h;u h′) ⇐⇒ ar(h) u ha (2)
(∃h′. ha = ar(h′) ∧ h;a h′) ⇐⇒ ar(h) a ha (3)
The proofs of the =⇒-directions are straightforward, by unfolding the defini-
tions. Reentrance elimination of ar causes no problems here, as the 〈〉∅-nodes
resulting from reentrance elimination are always schedulable.
For the⇐=-directions, we have to consider reentrance elimination. We demon-
strate the proof for (2) here, the other proofs are similar: So assume we have
ar(h) u ha. Hence, we can write ar(h) as ar(h) = ha1(〈〉Y (sa, ta), X)ha2, and have
ha = ha1lift(s




2)|2 = ∅. Accordingly, we can write h as
h = h1(t˜, µ)h2 with ar(h1) = ha1, ar(h2) = ha2, and
ar(t˜, µ) = (〈〉Y (sa, ta), X) (∗)





Now, we distinguish on the definition of ar (Definition 4.3), how (∗) was pro-
duced. The use-node may be the result of a translated base-node (i) or spawn-
node (ii), a sequence of nodes between a matched acquire/release-pair (iii), a
reentrant acquisition-node (iv), or a reentrant release-node (v). We demonstrate
the proof for case (iii) here, the other cases are trivial or analogous.
So assume we have t˜ = 〈xts〉xt with ts ∈ Tsl and Y = {x}\X∪u where (u, sa) =
ars(ts, X). By straightforward induction over ts, we get u = used(ts) \ (X) and
sa = ar(spawned(ts)). Hence, we get
(({x} ∪ used(ts)) \ set(µ)) ∩ locks(h1h2) = ∅.
As h is consistent, we have set(µ) ∩ locks(h1h2) = ∅, and thus ({x} ∪ used(ts)) ∩
locks(h1h2) = ∅. We set h′ := h1(spawned(ts))(t, µ)h2, and apply the (use)-
rule to obtain h ;u h′. Obviously, we have ar(h′) = ha, and this implies the
proposition.
Now, we complete the proof of Theorem 4.5:
Proof of Theorem 4.5. This theorem follows directly from Lemmas 4.10 and 4.11:
schedls(h) ≠ ∅
⇐⇒ ∃h′ ∈ Hls. h′|1 ∈ {τ}∗ ∧ h;ru ∗ ◦;au ∗ h′ (Lem. 4.10)
⇐⇒ schedar(ar(h)) (Lem. 4.11)

4.3 Acquisition Structures
In the last sections we have introduced lock-a/r-hedges that are built from lock-
execution-hedges by summarizing matched acquisition- and release-nodes into
use-nodes. We have defined a scheduler on lock-a/r-hedges, and shown that a
lock-execution-hedge has a schedule if and only if its lock-a/r-hedge is schedulable
(Theorem 4.5). In this section, we characterize the set of schedulable lock-a/r-
hedges by a tree automaton.
This section is structured as follows: In Subsection 4.3.1, we define the no-
tion of a dependence-graph. The dependence-graph captures the ordering con-
straints between the nodes of a lock-a/r-hedge that a schedule must adhere to.
We show that schedulability corresponds to acyclicity of the dependence-graph
and some additional consistency properties, regarding initially held locks and du-
plicate acquisitions. While the additional consistency properties are obviously
regular, showing that acyclicity of the (unbounded) dependence-graph is regular
requires some argumentation. The main idea is to use so-called acquisition-graphs
and release-graphs, which we introduce in Subsection 4.3.2. Finally, in Subsec-
tion 4.3.3, we put together the results to construct a tree automaton that accepts
exactly the schedulable lock-a/r-hedges.
4.3.1 Dependence-Graph
Given a lock-a/r-hedge, the following criteria must be satisfied by any schedule:
1. No lock may be acquired more than once.
2. A lock that is used or acquired, and not released must not be held initially
by any thread.
3. The nodes must be scheduled in tree-order.
4. No release may be scheduled after an acquisition.
5. A release of a lock must be scheduled before any usage of that lock, and,
symmetrically, an acquisition of a lock must be scheduled after any usage
of that lock.
Due to well-formedness, all acquisitions are unmatched. Hence, if the first cri-
terion is violated, the acquisition that is scheduled first will prevent the other
acquisitions from ever being scheduled. If the second criterion is violated, there
is a lock that is acquired throughout the whole execution. A usage or acquisition
of this lock can never be scheduled. The third and fourth criteria are obvious by
definition of the scheduler. Finally, if the fifth criterion is violated, there is a lock
that is initially owned by some thread until it is released. A usage of this lock
cannot be scheduled before this release. Symmetrically, a lock that is acquired
will never be released again, hence no usage of such a lock can be scheduled after
the acquisition.
Note that the first and second criteria are not symmetric w.r.t. acquisition-
and release-nodes. The counterpart of the first criterion (i.e., no lock is released
more than once) follows from well-formedness of the hedge. The counterpart of
the second criterion (i.e., a lock that is used or released, and not acquired, is not
held finally by any thread) also holds for any schedule.
Also note that the first and second criteria are independent of the order in
which the nodes of the hedge are scheduled. To capture whether the third to
fifth criteria can be satisfied, we define the notion of a dependence-graph, that
describes these criteria as edges in a graph over the nodes of the hedge.
Definition 4.12 (Dependence-Graph). Let h ∈ HAR be an a/r-hedge. The hedge
h induces a graph g(h) = (V(h),E(h)) over vertices V(h) and edges E(h), such
that each node v ∈ V(h) corresponds to exactly one non-leaf-node n(v) in h, and
E(h) contains an edge (v, v′) ∈ E(h), if and only if v′ corresponds to a direct
successor node of v in h.
From the graph g(h) of a hedge, we construct the dependence-graph gdep(h) =
(V(h),Edep(h)) with Edep(h) = E(h)∪Eadd, where Eadd contains the following edges:
Eadd := {(v, v′) | ∃x, y. n(v) = 〉x ∧ n(v′) = 〈y} (1)
∪ {(v, v′) | ∃x,X. n(v) = 〉x ∧ n(v′) = 〈〉{x}∪X} (2)
∪ {(v, v′) | ∃x,X. n(v) = 〈〉{x}∪X ∧ n(v′) = 〈x} (3)
We overload g,gdep,V,E,Edep, and Eadd to lock-a/r-hedges in the natural way, i.e.,
ignoring the lockstacks and just considering the trees.
Intuitively, in Eadd, the edges due to (1) describe that releases must be scheduled
before acquisitions. The edges due to (2) describe that a usage has to be scheduled
after the release of a used lock and the edges due to (3) describe that an acquisition
has to be scheduled after a usage of the acquired lock.
Well-Definedness. Up to naming of the nodes in V , the graph g(h) is unique for a
given hedge h. To formally define g(h), we use access strings to uniquely identify
the nodes in h. An access string s ∈ (N)+ is a non-empty sequence of natural
numbers, where each number indexes the next node on a path through the hedge.
The nodes in V(h) are then pairs of access strings and nodes from h. The edges
E(h) are inductively defined as follows:
E(t1 . . . tn) = 1 · E(t1) ∪ . . . ∪ n · E(tn)
E(τ) = ∅
E(〉xt) = {((ε, 〉x), (1, root(t)))} ∪ 1 · E(t)
E(〈xt) = {((ε, 〈x), (1, root(t)))} ∪ 1 · E(t)






where i · E := {(([i]a, b), ([i]a′, b′)) | ((a, b), (a′, b′)) ∈ E}, and root(t) is the root-
node of t. The nodes V(h) are the nodes that occur in E(h), i.e., V(h) := {v |
∃v′. (v, v′) ∈ E(h)∨ (v′, v) ∈ E(h)}. The mapping n(v) is defined as projection to
the second component: n(a, b) = b.
In order to reduce the notational overhead for the following proofs, we introduce
some notation for dependence-graphs: We omit the argument h from E(h) and
Edep(h) if it is clear from the context. Moreover, we write v →E v′ instead of
(v, v′) ∈ E, and v →∗E v′ instead of (v, v′) ∈ E∗, and use analogous notations for
Eadd, Edep, and →+.
When referring to nodes of the dependence-graph, we are often only interested
in the corresponding hedge-node. In such cases, we omit the access string, and
write, for example, 〈〉X →∗E 〉x, instead of ∃a, a′. ((a, 〈〉X), (a′, 〉x)) ∈ E∗. Moreover,
we sometimes are only interested whether a use-node uses a particular lock x. If
the type of x is clear from the context, we may write 〈〉x instead of 〈〉{x}∪X for
some X.
Using the dependence-graph, we formally capture the criteria specified above,
and show that they are not only necessary, but also sufficient for the existence of
a schedule:
Definition 4.13. For an a/r-hedge h ∈ HAR, we define the set of used, acquired,
and released locks as follows:
used(h) = ⋃{X | 〈〉X ∈ V(h)}
acq(h) = {x | 〈x ∈ V(h)}
rel(h) = {x | 〉x ∈ V(h)}
We overload used, acq, and rel to lock-a/r-hedges in the natural way.
Lemma 4.14. Given a lock-a/r-hedge h ∈ HARls, it is schedulable (i.e., schedar(h)
holds), if and only if the following criteria are satisfied:
∀v, v′ ∈ V(h). n(v) = 〈x ∧ n(v′) = 〈x =⇒ v = v′ (C1)
((used(h) ∪ acq(h)) \ rel(h)) ∩ h|2 = ∅ (C2)
∀v. (v, v) /∈ Edep(h)+ (C3)
Intuitively, (C1) states that no lock may be acquired more than once, (C2)
states that a lock that is used or acquired, and not released must not be held ini-
tially by any thread, and (C3) states that the dependence-graph must be acyclic.
Proof. For the =⇒-direction, assume schedar(h) holds, i.e., we have an h′ with
h′|1 ∈ {τ}∗ and h ∗ru ◦ ∗au h′. The proof is done by induction on ∗ru ◦ ∗au. In
the base-case, we have h = h′, hence h|1 ∈ {τ}∗, i.e., h only contains leaf-nodes
and we have acq(h) = used(h) = rel(h) = ∅, and V(h) = ∅. Thus, (C1)–(C3) are
trivially satisfied.
Now assume we have h ru h˜ ∗ru ◦ ∗au h′ or h au h˜ ∗au h′, and h˜ satisfies
(C1)–(C3). We have to show that h also satisfies (C1)–(C3). For this, let v be
the node scheduled1 in the first step, i.e., we have V(h) = {v} ∪˙ V(h˜).
On (C1): If v is a use- or release-node, the set of acquisition-nodes in h and h˜
is the same, and thus h satisfies (C1), because h˜ does. So assume v is an
acquisition-node, i.e., v = 〈x for some lock x. Hence we have h a h˜ ∗au h′
and x ∈ h˜|2. Then, h˜ contains no release-nodes, as such nodes cannot be
scheduled by  au. Hence, we have rel(h˜) = ∅. As h˜ satisfies (C2), we have
x /∈ acq(h˜), and thus h does not violate (C1) involving v. Moreover, h does
not violate (C1) involving any other nodes, as those nodes would also be
contained in h˜. Hence, h satisfies (C1).
On (C2): If v is a use-node, i.e., v = 〈〉X , we have used(h) = X ∪ used(h˜),
acq(h) = acq(h˜), rel(h) = rel(h˜), and h|2 = h˜|2. Because the first step was
scheduled, and h is well-formed, we have X ∩ h|2 = ∅, and thus (C2) also
holds for h.
If v is a release-node, we have used(h) = used(h˜), acq(h) = acq(h˜), and
rel(h) = {x} ∪ rel(h˜), where x is the released lock. Moreover, we have
h|2 = {x} ∪ h˜|2, and thus h satisfies (C2) because h˜ does.
If v is an acquisition-node, we have used(h) = used(h˜), rel(h) = rel(h˜), and
acq(h) = {x}∪acq(h˜) where x is the acquired lock. Because the acquisition
was scheduled, and h is well-formed, we have x /∈ h|2, and thus h satisfies
(C2) because h˜ does.
On (C3): We show that v has no incoming edges in Edep(h), and thus acyclicity
of gdep(h) is implied by acyclicity of gdep(h˜). Because v was scheduled, it is
a root-node of h, and thus has no incoming edges in E(h). We now show
that it also has no incoming edges in Eadd(h): An edge (v˜, v) ∈ Eadd(h) may
be due to one of the following cases (cf. Definition 4.12):
(1) v is an acquisition-node, and v˜ is a release-node. However, if v is
an acquisition-node, we are in the case h˜  ∗au h′. Moreover, we have
v˜ ∈ h˜. However, the release-node v˜ can never be scheduled by  au,
yielding a contradiction.
(2) v is a use-node, and v˜ is a release-node, i.e., we have v = 〈〉{x}∪X
and v˜ = 〉x. As h is well-formed, the release-node v˜ has no matching
acquisition, and thus releases a lock that is initially held by h, i.e., we
have x ∈ h|2. However, as the use-node v is schedulable and h is well-
formed, we also have ({x}∪X)∩h|2 = ∅, which yields a contradiction.
1Note that, in each step, the scheduler removes exactly one non-leaf root-node from the hedge.
(3) v is an acquisition-node, and v˜ is a use-node, i.e., v = 〈x and v˜ =
〈〉{x}∪X . We are in the case h˜  ∗au h′, and thus have rel(h˜) = ∅.
Moreover, we have x ∈ used(h˜), and because we just scheduled an
acquisition of lock x, we also have x ∈ h˜|2. Thus, h˜ violates (C2), in
contradiction to the assumption.
For the ⇐=-direction, we construct a schedule for h by induction on h. If h
contains only leaf-nodes, it is trivially schedulable. So assume h contains at least
one non-leaf-node, and satisfies (C1)–(C3). As gdep(h) is acyclic and contains no
leaf-nodes at all, there is a minimal (non-leaf) node v from gdep(h), i.e., ∀v˜. (v˜, v) /∈
Edep(h). We show that v can be scheduled, i.e., we obtain h˜ with h  x h˜ for
x ∈ {r, u, a} and V(h) = V(h˜) ∪˙ {v}. Moreover, we show that h˜ satisfies (C1)–
(C3), such that we can apply the induction hypothesis to continue the schedule.
Note that h˜ trivially satisfies (C1) and (C3), as it contains fewer nodes than h. We
now distinguish over the node v:
• If v is a release-node, i.e., v = 〉x, it can be scheduled, i.e., h r h˜. Moreover,
as h is well-formed, we have x ∈ h|2. As the locksets in h|2 are disjoint,
and the scheduling step h  r h˜ removes x from one of the locksets, we
have x /∈ h˜|2. Hence, h˜ satisfies (C2). By induction hypothesis, we have
schedar(h˜), and thus schedar(h).
• If v is a use-node, i.e., v = 〈〉X , we first show X ∩ h|2 = ∅, which implies
that v can be scheduled, i.e., h  u h˜. So assume that there is a lock
x ∈ X ∩ h|2. From x ∈ X, we get x ∈ used(h), and (C2) implies x ∈ rel(h).
Hence there is a release-node 〉x ∈ V(h). However, this implies the edge
(〉x, v) ∈ Eadd ⊆ Edep, in contradiction to minimality of v.
Moreover, we have used(h˜) ⊆ used(h), acq(h˜) = acq(h), rel(h˜) = rel(h),
and h˜|2 = h|2, and thus h˜ satisfies (C2) because h does. By induction
hypothesis, we get schedar(h˜), and thus schedar(h).
• Now let v be an acquisition-node, i.e., v = 〈x. First note that h does not
contain any release-nodes, as Eadd contains an edge from any release to any
acquisition-node, and v is minimal. Hence we have rel(h) = ∅. Moreover, we
have x ∈ acq(h), and with (C2) we get x /∈ h|2. Hence v can be scheduled,
i.e., h a h˜. As h satisfies (C1), it contains no second acquisition of x, and
we have x /∈ acq(h˜). Moreover, as any use-node 〈〉{x}∪X ∈ V(h) would imply
an edge (〈〉{x}∪X , v), and v is minimal, we have x /∈ used(h) = used(h˜).
Hence h˜ satisfies (C2) because h does. By induction hypothesis, we get
schedar(h˜), i.e., there is an h′ with h′|1 ∈ {τ}∗ and h˜ ∗ru ◦ ∗au h′. Because
h˜ contains no release-nodes, we even have h˜  ∗au h′, and thus h  ∗au h′.
Together, we get schedar(h).
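
The criteria (C1)–(C3) of Lemma 4.14 can be tried out on small examples by building the dependence-graph explicitly. The following is a minimal sketch, not the construction used later in this chapter: the tuple encoding of a/r-trees and all function names are assumptions made for illustration, and well-formedness of the input hedge is taken for granted rather than checked.

```python
from itertools import count

# Hypothetical encoding of a/r-trees (not taken from the source):
#   ("tau",)                      leaf
#   ("rel", x, t)                 release-node
#   ("acq", x, t)                 acquisition-node
#   ("use", X, [t1, ..., tn], t)  use-node with lock set X, spawned trees, continuation
# A lock-a/r-hedge is a list of (tree, initially_held_locks) pairs.

def build_graph(hedge):
    """Return (labels, E): vertex id -> (kind, payload), and the tree edges."""
    labels, edges, fresh = {}, set(), count()

    def visit(tree, parent):
        if tree[0] == "tau":          # leaves are not vertices of g(h)
            return
        v = next(fresh)
        labels[v] = (tree[0], tree[1])
        if parent is not None:
            edges.add((parent, v))
        children = tree[2] + [tree[3]] if tree[0] == "use" else [tree[2]]
        for child in children:
            visit(child, v)

    for tree, _held in hedge:
        visit(tree, None)
    return labels, edges

def additional_edges(labels):
    """The edge sets (1)-(3) of E_add from Definition 4.12."""
    add = set()
    for v, (kv, pv) in labels.items():
        for w, (kw, pw) in labels.items():
            if kv == "rel" and kw == "acq":               # (1) release before acquisition
                add.add((v, w))
            if kv == "rel" and kw == "use" and pv in pw:  # (2) release before usage
                add.add((v, w))
            if kv == "use" and kw == "acq" and pw in pv:  # (3) usage before acquisition
                add.add((v, w))
    return add

def is_acyclic(vertices, edges):
    """Repeatedly remove vertices without incoming edges; a cycle blocks this."""
    remaining, edges = set(vertices), set(edges)
    while remaining:
        minimal = {v for v in remaining if not any(w == v for (_, w) in edges)}
        if not minimal:
            return False
        remaining -= minimal
        edges = {(a, b) for (a, b) in edges if a in remaining and b in remaining}
    return True

def schedulable(hedge):
    """Check (C1)-(C3) directly on the explicit dependence-graph."""
    labels, tree_edges = build_graph(hedge)
    acquired = [p for (k, p) in labels.values() if k == "acq"]
    used = set().union(*[p for (k, p) in labels.values() if k == "use"])
    released = {p for (k, p) in labels.values() if k == "rel"}
    held = set().union(*[X for (_t, X) in hedge])
    c1 = len(acquired) == len(set(acquired))
    c2 = not (((used | set(acquired)) - released) & held)
    c3 = is_acyclic(labels, tree_edges | additional_edges(labels))
    return c1 and c2 and c3
```

For instance, two threads that finally acquire x and y, respectively, and afterwards each use the other thread's lock are rejected by (C3), while the same two threads with the usage placed before the acquisition are accepted.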
4.3.2 Acquisition- and Release-Graphs
Criteria (C1) and (C2) can obviously be checked by a finite tree automaton that
collects the acquisition-, release-, and use-sets in a bottom-up computation. How-
ever, (C3) involves building the dependence-graph, which has unbounded size. In
this subsection, we show how to check acyclicity of the dependence-graph within
finite state, i.e., without actually computing the dependence-graph. Instead, we
compute a compressed representation of the dependence-graph, called the release-
and acquisition-graph. We first observe that any cycle in the dependence-graph
consists of either use- and release-nodes, or use- and acquisition-nodes, i.e., there
is no cycle containing both acquisition- and release-nodes. Moreover, each cycle
contains at least one acquisition- or release-node. This motivates classifying
cycles by whether they contain an acquisition- or a release-node, and handling
each class separately. Let us consider a cycle in the dependence-graph that contains
at least one acquisition-node. It has the form
〈x →+Edep 〈x
As acquisition-nodes have no outgoing edges in Eadd, the first edge of the cycle
stems from E. As E is acyclic, the cycle contains at least one edge from Eadd, and
as in E there is no path from an acquisition- to a release-node, the first edge from
Eadd goes from a use-node to an acquisition-node. Thus we have
〈x →+E 〈〉x1 →Eadd 〈x1 →∗Edep 〈x.
By iterating this argumentation we get
〈x →+E 〈〉x1 →Eadd 〈x1 →+E . . .→+E 〈〉xn →Eadd 〈xn
with xn = x. Note that we use the notation 〈〉x instead of ∃X. 〈〉{x}∪X here.
The key idea of checking acyclicity in finite state is to represent the sequences
of the form 〈xi →+E 〈〉xi+1 →Eadd 〈xi+1 in the dependence-graph by a single edge in
the acquisition-graph, and identify the acquisition-nodes by the acquired locks.
For cycles over release-nodes, we do the symmetric construction. This leads to
the following definition of the acquisition- and release-graph:
Definition 4.15 (Acquisition- and Release-Graph). Let h ∈ HAR be an a/r-
hedge. We define the acquisition-graph Eacq(h) ⊆ X × X and the release-graph
Erel(h) ⊆ X × X as the least solution of the following constraints:
(x, x′) ∈ Eacq(h) if 〈x →+E 〈〉x′
(x, x′) ∈ Erel(h) if 〈〉x →+E 〉x′
Again, we omit the argument h of Eacq(h) and Erel(h) when clear from the context.
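
Definition 4.15 can be read as a bottom-up computation: every node only needs the used and released locks of its descendants. The sketch below follows that reading under an assumed tuple encoding of a/r-trees (leaf, release-, acquisition-, and use-nodes); the encoding and the names `acq_rel_graphs`, `hedge_graphs`, and `has_cycle` are hypothetical and serve illustration only.

```python
# Hypothetical encoding of a/r-trees (not taken from the source):
#   ("tau",) | ("rel", x, t) | ("acq", x, t) | ("use", X, [t1, ..., tn], t)

def acq_rel_graphs(tree):
    """Return (used, released, e_acq, e_rel) for the subtree rooted at `tree`."""
    kind = tree[0]
    if kind == "tau":
        return set(), set(), set(), set()
    if kind == "use":
        _, X, spawned, cont = tree
        parts = [acq_rel_graphs(t) for t in spawned + [cont]]
    else:
        _, x, cont = tree
        parts = [acq_rel_graphs(cont)]
    used = set().union(*[p[0] for p in parts])
    released = set().union(*[p[1] for p in parts])
    e_acq = set().union(*[p[2] for p in parts])
    e_rel = set().union(*[p[3] for p in parts])
    if kind == "acq":
        # edge (x, y) for every lock y used strictly below this acquisition of x
        e_acq |= {(x, y) for y in used}
    elif kind == "use":
        # edge (y, z) for every used lock y and every lock z released strictly below
        e_rel |= {(y, z) for y in X for z in released}
        used |= set(X)
    else:  # kind == "rel"
        released.add(x)
    return used, released, e_acq, e_rel

def hedge_graphs(trees):
    """Union of the acquisition- and release-graphs of all trees of a hedge."""
    e_acq, e_rel = set(), set()
    for t in trees:
        _, _, ea, er = acq_rel_graphs(t)
        e_acq |= ea
        e_rel |= er
    return e_acq, e_rel

def has_cycle(edges):
    """True iff the directed graph over locks given by `edges` contains a cycle."""
    edges = set(edges)
    nodes = {v for e in edges for v in e}
    while nodes:
        sources = {v for v in nodes if not any(b == v for _, b in edges)}
        if not sources:
            return True
        nodes -= sources
        edges = {(a, b) for (a, b) in edges if a in nodes and b in nodes}
    return False
```

Both graphs range over locks only, so their size is bounded by the square of the number of locks, independently of the size of the hedge.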
We now show that acquisition- and release-graphs can be used to correctly
detect cycles in the dependence-graph:
Lemma 4.16. Let h ∈ HAR be an a/r-hedge. The dependence-graph Edep(h)
contains a cycle, if and only if the acquisition-graph Eacq(h) or the release-graph
Erel(h) contain a cycle.
Proof. First of all, we show that the dependence-graph does not contain a path
from an acquisition- to a release-node, i.e.
∄x, x′. 〈x →∗Edep 〉x′ . (∗)
The proof is done by contradiction. Assume we have 〈x →∗Edep 〉x′ . We proceed by
induction on the length of the path. Clearly, the path cannot be empty. As h is
well-formed, there are no paths from acquisition- to release-nodes only in E. So
we pick the first edge in Eadd on the path, and get nodes v1, v2 with
〈x →∗E v1 →Eadd v2 →∗Edep 〉x′ .
We distinguish over the type of the edge v1 →Eadd v2 (cf. Definition 4.12). In case
(1) and (3), v2 is an acquisition-node, and the contradiction follows by induction
hypothesis. In case (2), v1 is a release-node, and the path 〈x →∗E v1 contradicts
well-formedness of h.
Now we prove the lemma: For the =⇒-direction, assume that Edep = E ∪ Eadd
contains a cycle. As E is acyclic (it is the graph of a hedge), the cycle contains
at least one edge from Eadd, i.e., it can be written as v →Eadd v′ →∗Edep v for
some nodes v, v′ ∈ V. We distinguish over the type of the edge v →Eadd v′ (cf.
Definition 4.12):
(1) v is a release-node and v′ is an acquisition-node. As we have v′ →∗Edep v,
this yields a contradiction to (∗).
(2) v is a release-node and v′ is a use-node. This case is shown analogously to
the next case.
(3) v is a use-node and v′ is an acquisition-node, and we have v = 〈〉x and
v′ = 〈x for some lock x. We show that any path starting with such an edge
yields a corresponding path in the acquisition-graph, i.e.
∀x, x′′. 〈〉x →Eadd 〈x →∗Edep 〈〉x′′ =⇒ x→+Eacq x′′. (†)
Applied to our path v →Eadd v′ →∗Edep v this yields the cycle x→+Eacq x.
The statement (†) is proved by induction on the length of the path 〈x →∗Edep
〈〉x′′ . Clearly, the path is not empty. If it only contains edges from E, we
have x →Eacq x′′ by definition of Eacq. So let us assume there is at least one
edge from Eadd. If we pick the first one, we can write the path as
〈〉x →Eadd 〈x →∗E v1 →Eadd v2 →∗Edep 〈〉x′′ .
The edge v1 →Eadd v2 must be of type (3), otherwise v1 or v2 would be a
release-node, in contradiction to (∗). Hence we have
〈〉x →Eadd 〈x →∗E 〈〉y →Eadd 〈y →∗Edep 〈〉x′′ .
By definition of Eacq, we have x →Eacq y. By induction hypothesis, we get
y →+Eacq x′′, and thus x→+Eacq x′′.
For the ⇐=-direction, we have to construct a cycle in the dependence-graph
from a cycle in the acquisition or release-graph. We do the construction for
acquisition-graphs here, the argumentation for release-graphs is analogous. So
assume we have a cycle x1 →Eacq . . .→Eacq xn with xn = x1. By definition of Eacq,
the edges result from paths in E, i.e., we have
〈x1 →+E 〈〉x2 〈x2 →+E 〈〉x3 . . . 〈xn−1 →+E 〈〉xn .
Moreover, Eadd contains the edges 〈〉xi →Eadd 〈xi that close the gaps in the path
above, and we get
〈x1 →+E 〈〉x2 →Eadd 〈x2 →+E 〈〉x3 →Eadd . . .→Eadd 〈xn−1 →+E 〈〉xn →Eadd 〈xn .
As we defined Edep = E ∪ Eadd, and set xn = x1, we get the cycle 〈x1 →+Edep 〈x1 in
the dependence-graph.
Note that we used quite informal . . .-notation in the proof of the⇐=-direction.
Formally, the proof is done by induction on n, generalizing the statement to non-
cyclic paths, i.e., removing the assumption xn = x1.
4.3.3 A Tree Automaton for Schedulable A/R-Hedges
In the last two subsections, we have defined criteria for schedulability of a lock-
a/r-hedge, and have shown how to encode them into finite state. In order to decide
schedulability of a lock-a/r-hedge, we need the set of finally acquired, initially
released, used, and initially held locks, as well as the acquisition and release-
graph. In this section, we show how to compute this information bottom-up over
the lock-a/r-hedge. Thus, schedulability of a lock-a/r-hedge can be expressed as
a deterministic bottom-up tree automaton.
The states of the tree automaton are called acquisition structures and hedge
acquisition structures. An acquisition structure is a five-tuple
$(r, g_r, u, a, g_a) \in 2^{X} \times 2^{X^2} \times 2^{X} \times 2^{X} \times 2^{X^2}$,
where r is the set of released locks, gr is the release-graph, u is the set of
used locks, a is the set of acquired locks, and ga is the acquisition-graph.
The set of acquisition structures is denoted by
\[ AS := 2^{X} \times 2^{X^2} \times 2^{X} \times 2^{X} \times 2^{X^2}. \]
When the tree automaton processes the list of lock-a/r-trees and sets of initially
held locks in the lock-a/r-hedge, it has to additionally collect the initially held
locks. At this phase, its states are hedge acquisition structures of the form (σ,X),
where σ ∈ AS is the acquisition structure of the hedge processed so far, and
X ⊆ X is the set of initially held locks. The set of hedge acquisition structures
is denoted by
\[ AS_h := AS \times 2^{X}. \]
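
As a rough size estimate, reading an acquisition structure as three sets of locks together with two binary relations over locks, the number of states can be counted as follows. This is only a sketch of the arithmetic under that reading:

```latex
\[
|AS| = \left(2^{|X|}\right)^{3} \cdot \left(2^{|X|^{2}}\right)^{2} = 2^{3|X| + 2|X|^{2}},
\qquad
|AS_h| = |AS| \cdot 2^{|X|} = 2^{4|X| + 2|X|^{2}} = 2^{O(|X|^{2})}.
\]
```

Here |X| denotes the number of locks; the state space is thus exponential in the square of the number of locks and independent of the size of the hedge.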
We introduce some operations on acquisition structures. First of all, we abbre-
viate the empty acquisition structure that consists of empty sets only by ∅5 ∈ AS.
Analogously, the empty hedge acquisition structure is abbreviated by ∅6 ∈ ASh,
i.e.
∅5 := (∅, ∅, ∅, ∅, ∅) ∅6 := (∅5, ∅).
Next, we define an operation for parallel composition of two acquisition struc-
tures. This operator is used to combine the acquisition structures at use-nodes
and the acquisition structures of the lock-a/r-trees in the hedge. The information
in the hedge acquisition structure can be used to check (C2) and (C3) after the
lock-a/r-hedge is completely processed. However, (C1), (i.e., the lock-a/r-hedge
contains no two final acquisitions of the same lock) must be checked during the
computation. For this purpose, we define the operations on acquisition structures
as partial functions that are undefined iff Criterion (C1) is violated. We explicitly
denote an undefined value with the symbol ⊥. We define ‖: AS× AS→ AS by:
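\[
(r, g_r, u, a, g_a) \parallel (r', g_r', u', a', g_a') :=
\begin{cases}
\bot & \text{if } a \cap a' \neq \emptyset \\
(r \cup r',\; g_r \cup g_r',\; u \cup u',\; a \cup a',\; g_a \cup g_a') & \text{otherwise}
\end{cases}
\]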
We overload parallel composition to hedge acquisition structures, and define
(σ,X) ‖ (σ′, X ′) := (σ ‖ σ′, X ∪X ′).
Obviously, the ‖-operator is associative and commutative, and ∅5 (∅6 respectively)
is a neutral element.
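
The partial operator ‖ can be sketched as a function that returns `None` for ⊥. The representation of acquisition structures as 5-tuples of frozensets, the names `par` and `par_h`, and the choice to let ⊥ propagate through further compositions are assumptions of this sketch, not definitions from the text.

```python
# The empty acquisition structure: five empty sets.
EMPTY5 = (frozenset(),) * 5

def par(s1, s2):
    """Parallel composition of acquisition structures; None plays the role of ⊥."""
    if s1 is None or s2 is None:      # assumption: ⊥ is propagated
        return None
    r1, gr1, u1, a1, ga1 = s1
    r2, gr2, u2, a2, ga2 = s2
    if a1 & a2:                       # same lock acquired on both sides: (C1) violated
        return None
    return (r1 | r2, gr1 | gr2, u1 | u2, a1 | a2, ga1 | ga2)

def par_h(h1, h2):
    """Overloading to hedge acquisition structures (sigma, X)."""
    if h1 is None or h2 is None:
        return None
    (s1, x1), (s2, x2) = h1, h2
    s = par(s1, s2)
    return None if s is None else (s, x1 | x2)
```

Since every component is combined by set union, associativity, commutativity, and neutrality of the empty structure carry over directly from the corresponding properties of union.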
Finally, we define the partial function as : HARls → ASh inductively over the
structure of lock-a/r-hedges. For this purpose, we set as := ash, and define the
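partial functions ash : HARls → ASh, ass : TAR∗ → AS, and ast : TAR → AS by:
\[
\begin{array}{ll}
\mathrm{as}_h(\varepsilon_h) := \mathrm{as}_h^{\varepsilon} &
\mathrm{as}_h((t,X)\,\#_h\,h) := \mathrm{as}_h^{\#}(\mathrm{as}_t(t), X, \mathrm{as}_h(h)) \\
\mathrm{as}_s(\varepsilon_s) := \mathrm{as}_s^{\varepsilon} &
\mathrm{as}_s(t\,\#_s\,s) := \mathrm{as}_s^{\#}(\mathrm{as}_t(t), \mathrm{as}_s(s)) \\
\mathrm{as}_t(\tau) := \mathrm{as}_t^{\tau} &
\mathrm{as}_t(\langle\rangle_X(s)\,t) := \mathrm{as}_t^{\langle\rangle}(X, \mathrm{as}_s(s), \mathrm{as}_t(t)) \\
\mathrm{as}_t(\langle_x\,t) := \mathrm{as}_t^{\langle}(x, \mathrm{as}_t(t)) &
\mathrm{as}_t(\rangle_x\,t) := \mathrm{as}_t^{\rangle}(x, \mathrm{as}_t(t))
\end{array}
\]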
for h ∈ HARls, t ∈ TAR, s ∈ TAR∗, and X ⊆ X . The right-hand sides of the above
recursion equations are defined as follows:
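\[
\begin{array}{rcl}
\mathrm{as}_h^{\varepsilon} &:=& \emptyset^6 \\
\mathrm{as}_h^{\#}(\sigma, X, (\sigma', X')) &:=& (\sigma, X) \parallel (\sigma', X') \\
\mathrm{as}_s^{\varepsilon} &:=& \emptyset^5 \\
\mathrm{as}_s^{\#}(\sigma, \sigma') &:=& \sigma \parallel \sigma' \\
\mathrm{as}_t^{\tau} &:=& \emptyset^5 \\
\mathrm{as}_t^{\langle\rangle}(X, \sigma', (r, g_r, u, a, g_a)) &:=& \sigma' \parallel (r,\; X \times r \cup g_r,\; X \cup u,\; a,\; g_a) \\
\mathrm{as}_t^{\langle}(x, (r, g_r, u, a, g_a)) &:=&
\begin{cases}
(r,\; g_r,\; u,\; \{x\} \cup a,\; \{x\} \times u \cup g_a) & \text{if } x \notin a \\
\bot & \text{otherwise}
\end{cases} \\
\mathrm{as}_t^{\rangle}(x, (r, g_r, u, a, g_a)) &:=& (\{x\} \cup r,\; g_r,\; u,\; a,\; g_a)
\end{array}
\]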
The following lemma shows that the as-function computes the expected result:
Lemma 4.17. For any lock-a/r-hedge h ∈ HARls, we have:
as(h) ≠ ⊥ =⇒ as(h) = (rel(h),Erel(h), used(h), acq(h),Eacq(h), set(h|2)) (1)
as(h) = ⊥ ⇐⇒ h violates (C1) (2)
Before we prove this lemma, we intuitively explain the recursion equations:
The acquisition structure of a leaf is obviously ∅5, as no locks have been released,
used, or acquired so far. On an acquisition-node, as〈t first checks whether the
lock has already been acquired. If this is the case, (C1) is violated and the
result is undefined. Otherwise, the acquired lock is added to the acquisition-
set a, and edges from the acquired lock to all locks u that are used after the
acquisition are added to the acquisition-graph ga. On a release-node, as〉t just
adds the released lock to the release-set r. On a use-node, first, ass computes the
parallel composition of the acquisition structures of the spawned threads. Then,
as〈〉t adds the set of used locks to the u-component of the local thread’s acquisition
structure. Moreover, edges from each used lock to each released lock are added
to the release-graph. The functions asεs and as#s compute the parallel composition
of the acquisition structures of a list of spawned threads, and the functions asεh
and as#h compute the parallel composition of the hedge acquisition structures of
a lock-a/r-hedge.
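
The recursion equations, as explained above, can be sketched as a single bottom-up pass. The tuple encoding of a/r-trees, the use of `None` for ⊥, and the names `as_t` and `as_h` are assumptions of this sketch; the helper `par` is the parallel composition with ⊥ propagated.

```python
# Hypothetical encoding of a/r-trees (not taken from the source):
#   ("tau",) | ("rel", x, t) | ("acq", x, t) | ("use", X, [t1, ..., tn], t)
# A lock-a/r-hedge is a list of (tree, initially_held_locks) pairs.

EMPTY5 = (frozenset(),) * 5

def par(s1, s2):
    """Parallel composition; None plays the role of ⊥ and is propagated."""
    if s1 is None or s2 is None:
        return None
    if s1[3] & s2[3]:                 # overlapping acquisition-sets
        return None
    return tuple(c1 | c2 for c1, c2 in zip(s1, s2))

def as_t(tree):
    """Acquisition structure (r, gr, u, a, ga) of a tree, or None if (C1) fails."""
    kind = tree[0]
    if kind == "tau":
        return EMPTY5
    if kind == "use":
        _, X, spawned, cont = tree
        sigma_s = EMPTY5
        for t in spawned:             # corresponds to as_s: compose spawned threads
            sigma_s = par(as_t(t), sigma_s)
        sigma_t = as_t(cont)
        if sigma_t is None:
            return None
        r, gr, u, a, ga = sigma_t
        X = frozenset(X)
        # used locks are added; each used lock points to each released lock
        return par(sigma_s, (r, gr | {(x, y) for x in X for y in r}, X | u, a, ga))
    _, x, cont = tree
    sigma = as_t(cont)
    if sigma is None:
        return None
    r, gr, u, a, ga = sigma
    if kind == "rel":
        return (r | {x}, gr, u, a, ga)
    if x in a:                        # second final acquisition of x
        return None
    return (r, gr, u, a | {x}, ga | {(x, y) for y in u})

def as_h(hedge):
    """Hedge acquisition structure as a 6-tuple, or None if (C1) fails."""
    sigma, held = EMPTY5, frozenset()
    for tree, X in hedge:
        sigma = par(as_t(tree), sigma)
        held |= frozenset(X)
    return None if sigma is None else sigma + (held,)
```

The result is returned as a flat 6-tuple so that it can be compared component-wise with the right-hand side of statement (1) of Lemma 4.17.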
Proof of Lemma 4.17. Proposition 1 is shown by induction on the structure of h.
We show for h ∈ HARls and t ∈ TAR and t1 . . . tn ∈ TAR∗:
ash(h) ≠ ⊥ =⇒ ash(h) = (rel(h),Erel(h), used(h), acq(h),Eacq(h), set(h|2))
ass(t1 . . . tn) ≠ ⊥ =⇒ ass(t1 . . . tn) = ast(t1) ‖ . . . ‖ ast(tn)
ast(t) ≠ ⊥ =⇒ ast(t) = (rel(t),Erel(t), used(t), acq(t),Eacq(t))
We demonstrate the cases t = 〈xt′ and t = 〈〉X(t1 . . . tn)t′. The other cases are
analogous or straightforward. In the case t = 〈xt′, we clearly have rel(t) = rel(t′),
Erel(t) = Erel(t′), used(t) = used(t′), and acq(t) = {x} ∪ acq(t′). Moreover, the
dependence-graph contains an edge from the acquisition-node at the root of t to
each use-node in t′. This corresponds to the edges {x}×used(t′) in the acquisition-
graph. Thus, we have Eacq(t) = {x}×used(t′) ∪ Eacq(t′). Unfolding the definition
of as〈t and applying the induction hypothesis completes the case.
In the case t = 〈〉X(t1 . . . tn)t′, we first observe that, due to well-formedness of
t, the trees t1 . . . tn contain no release-nodes. Thus, all release-nodes in t are from
t′, i.e., rel(t) = rel(t′). Moreover, we obviously have:
used(t) = X ∪ used(t1) ∪ . . . ∪ used(tn) ∪ used(t′)
acq(t) = acq(t1) ∪ . . . ∪ acq(tn) ∪ acq(t′)
Eacq(t) = Eacq(t1) ∪ . . . ∪ Eacq(tn) ∪ Eacq(t′)
The dependence-graph contains an edge from the 〈〉X-node at the root of t to every
release-node in t′. In the release-graph, this corresponds to the edges X×rel(t′),
and we have Erel(t) = X×rel(t′) ∪ Erel(t′). Using the induction hypothesis for ast
and ass, and unfolding the definition of ‖, we observe that these are exactly the
sets computed by ast(t).
For the⇐=-direction of Proposition 2, assume that h violates (C1). We have to
show that as(h) is undefined. As h violates (C1), there are two final acquisitions
of the same lock x in h. There are three cases how the two final acquisitions may
be distributed in h. In the first case, there is a use-node 〈〉X(t1 . . . tn)tn+1, such
that x ∈ acq(ti) ∩ acq(tj) for 1 ≤ i < j ≤ n + 1. In the second case, there is
an acquisition-node 〈xt′ such that x ∈ acq(t′), and in the third case, there are
lock-a/r-trees (t1, X1), (t2, X2) in h such that x ∈ acq(t1)∩ acq(t2). The first case
is handled by the ‖-operator, which is only defined if the acquisition-sets of the
operands are disjoint. In the first case, ass(t1 . . . tn) is the parallel composition
of the acquisition structures of t1 . . . tn. Hence, if j ≤ n, it is undefined. If
j = n+1, x is contained in the acquisition component of ass(t1 . . . tn), and also in
that of ast(tn+1). Thus, the parallel composition in as〈〉t(X, ass(t1 . . . tn), ast(tn+1))
is not defined. Analogously, in the third case, the function ash(h) computes the
parallel composition of the (hedge) acquisition structures of the lock-a/r-trees.
As the acquisition-sets are not disjoint, it is undefined. Finally, in the second
case, as〈t(x, ast(t′)) is undefined, as the side condition x /∈ a is not satisfied.
For the =⇒-direction of Proposition 2, assume that as(h) is undefined. We
have to show that h violates (C1). First, we spot a node in h where the function
is defined for all successor nodes, but undefined for that node. As the function
is always defined for leaf-nodes (i.e., ash(εh) = ∅6, ass(εs) = ast(τ) = ∅5), such
a node exists. Analogously to the ⇐=-direction, there are three cases for this
node: It may be a 〈〉X-node where the same lock is acquired in two subtrees, an
〈xt′-node, where x is acquired in t′, or a (t,X)#hh-node, where the same lock is
acquired in t and h. In any of these cases, there are two final acquisitions of the
same lock in h, i.e., (C1) is violated.
In order to check (C2) and (C3), we define the set of consistent hedge acquisi-
tion structures
Cons = {(r, gr, u, a, ga, X) | ((u ∪ a) \ r) ∩ X = ∅ and gr acyclic and ga acyclic}.
Finally, we prove the following theorem:
Theorem 4.18. A lock-a/r-hedge h ∈ HARls is schedulable, if and only if its
acquisition structure is consistent:
schedar(h) ⇐⇒ as(h) ∈ Cons
Proof. Lemma 4.14 states that schedulability of h is equivalent to h satisfying
(C1)–(C3). From Lemma 4.16, we know that (C3) is equivalent to the
acquisition- and release-graphs being acyclic. Moreover, from Lemma 4.17, we know that as(h)
computes the sets of acquired, released, used, and initially held locks, as well as
the acquisition- and release-graphs, and is undefined iff (C1) is violated. As the
set Cons contains only defined hedge acquisition structures that satisfy (C2) and
(C3), we get the proposition.
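As an illustration of the membership test as(h) ∈ Cons, the following Python sketch checks a hedge acquisition structure for consistency. It is a minimal sketch under assumptions that are not fixed by the text: graphs are represented as sets of directed edges, the set condition is read as ((u ∪ a) \ r) ∩ X = ∅, and the names is_acyclic and is_consistent are hypothetical.

```python
from typing import Dict, Set, Tuple

Lock = str
Graph = Set[Tuple[Lock, Lock]]  # assumed representation: set of directed edges


def is_acyclic(edges: Graph) -> bool:
    """Three-colour depth-first search; returns False iff a cycle exists."""
    succ: Dict[Lock, Set[Lock]] = {}
    for src, dst in edges:
        succ.setdefault(src, set()).add(dst)
        succ.setdefault(dst, set())
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in succ}

    def visit(node: Lock) -> bool:
        colour[node] = GREY
        for nxt in succ[node]:
            if colour[nxt] == GREY:  # back edge: cycle found
                return False
            if colour[nxt] == WHITE and not visit(nxt):
                return False
        colour[node] = BLACK
        return True

    return all(colour[node] != WHITE or visit(node) for node in succ)


def is_consistent(r: Set[Lock], gr: Graph, u: Set[Lock],
                  a: Set[Lock], ga: Graph, X: Set[Lock]) -> bool:
    """Membership in Cons, assuming the reading ((u | a) - r) & X == {}."""
    return not (((u | a) - r) & X) and is_acyclic(gr) and is_acyclic(ga)
```

Since a cycle in either graph can visit each lock at most once, the depth-first search is bounded by the number of locks.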
4.3.3.1 Constructing the Tree Automaton
From Theorem 4.18 and the inductive definition of as(h), it is straightforward
to construct a bottom-up deterministic tree automaton that accepts exactly the
schedulable lock-a/r-hedges:
Definition 4.19 (Tree Automaton for Schedulable Lock-A/R-Hedges). We de-
fine the automaton ACons = (Q,F, δ), where the states are hedge acquisition struc-
tures and acquisition structures, and the final states are the consistent hedge ac-
quisition structures, i.e., Q = AS ∪˙ ASh and F = Cons. The automaton has the
following rules δ:
εh →δ asεh                    (qt, X)#hq →δ as#h(qt, X, q)
εs →δ asεs                    qt#sqs →δ as#s(qt, qs)
τ →δ asτt                     〈〉X(qs)qt →δ as〈〉t(X, qs, qt)
〈xqt →δ as〈t(x, qt)           〉xqt →δ as〉t(x, qt)
for all qs, qt ∈ AS, q ∈ ASh, X ⊆ X , and x ∈ locks, such that the function on the
right-hand side of the rule is defined.
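The condition that a rule exists only where its right-hand side is defined can be illustrated by a small bottom-up evaluator. The following Python sketch is a strong simplification and not the automaton ACons itself: it tracks only the set a of finally acquired locks, supports only τ-leaves and 〈x-nodes, and computes the right-hand side on demand instead of tabulating δ. The names make_rules and run, and the tuple encoding of terms, are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Optional

State = FrozenSet[str]


def make_rules(locks: Iterable[str]) -> Dict[Hashable, Callable[..., Optional[State]]]:
    """Instantiate the rule templates for every lock x."""
    rules: Dict[Hashable, Callable[..., Optional[State]]] = {
        "tau": lambda: frozenset(),  # leaf: nothing acquired
    }
    for x in locks:
        # Final acquisition of x: undefined (None) if x is already acquired
        # below, mirroring the side condition "x not in a".
        rules[("acq", x)] = (lambda lock: lambda a: None if lock in a else a | {lock})(x)
    return rules


def run(term: tuple, rules) -> Optional[State]:
    """Evaluate a term bottom-up; None means that no rule is applicable."""
    symbol, *children = term
    states = []
    for child in children:
        state = run(child, rules)
        if state is None:
            return None
        states.append(state)
    rule = rules.get(symbol)
    return None if rule is None else rule(*states)


rules = make_rules(["x", "y"])
ok = (("acq", "x"), (("acq", "y"), ("tau",)))
bad = (("acq", "x"), (("acq", "x"), ("tau",)))
```

Here run(ok, rules) yields the set {x, y}, whereas run(bad, rules) yields None, because the same lock would be finally acquired twice.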
Lemma 4.20. The automaton ACons accepts exactly the lock-a/r-hedges with a
consistent acquisition structure:
L(ACons) = {h ∈ HARls | as(h) ∈ Cons}.
It can be constructed in time 2^{O(|X|^2)}, and its size is |ACons| = 2^{O(|X|^2)}.
Proof. The first proposition is obvious from the definition of as and Theorem 4.18.
Formally, it is shown by a straightforward induction over the structure of lock-
a/r-hedges.
The alphabet of ACons consists of the constructors for lock-a/r-hedges, these
are
{τ, εs, εh,#s}∪{#h(X) | X ⊆ X}∪{〈x | x ∈ X}∪{〉x | x ∈ X}∪{〈〉X | X ⊆ X}.
Hence, the alphabet has O(2^{|X|}) different symbols. Obviously, there are
(2^{3|X|})(2^{2|X|^2}) = 2^{O(|X|^2)}
different acquisition structures, and
(2^{4|X|})(2^{2|X|^2}) = 2^{O(|X|^2)}
different hedge acquisition structures. Thus, the automaton has 2^{O(|X|^2)} different
states, and we have
|ACons| = poly(2^{|X|} · 2^{O(|X|^2)}) = 2^{O(|X|^2)}.
By instantiating the rule-templates for all possible values, the automaton can be
constructed in time 2^{O(|X|^2)}.
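To get a feeling for the magnitudes, the two counting bounds from the proof can be evaluated for small lock sets. The following Python sketch simply computes the products stated above, with n standing for the number of locks; the function name count_bounds is hypothetical.

```python
def count_bounds(n: int) -> tuple:
    """Upper bounds on the number of (hedge) acquisition structures.

    Three (resp. four) subsets of the n locks contribute 2**(3n) (resp. 2**(4n)),
    and two graphs over the locks contribute 2**(2 * n**2).
    """
    graphs = 2 ** (2 * n * n)
    acquisition_structures = 2 ** (3 * n) * graphs
    hedge_acquisition_structures = 2 ** (4 * n) * graphs
    return acquisition_structures, hedge_acquisition_structures


# For two locks: (2**14, 2**16) == (16384, 65536)
two_locks = count_bounds(2)
```

The quadratic term in the exponent stems from the graphs and dominates already for small numbers of locks.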
We summarize the results of this chapter by the following corollary:
Corollary 4.21. A lock-execution-hedge is schedulable, if and only if its lock-
a/r-hedge is accepted by ACons:
∀h ∈ Hls. schedls(h) ≠ ∅ ⇐⇒ h ∈ ar^{-1}(L(ACons)).
Proof. Straightforward combination of the results of this chapter yields:
schedls(h) ≠ ∅
⇐⇒ schedar(ar(h))           by Theorem 4.5
⇐⇒ as(ar(h)) ∈ Cons         by Theorem 4.18
⇐⇒ ar(h) ∈ L(ACons)         by Lemma 4.20
⇐⇒ h ∈ ar^{-1}(L(ACons))    by definition of ·^{-1}
4.4 Summary and Related Work
In this chapter, we introduced lock-a/r-hedges that summarize matching acquisi-
tions and releases of lock-execution-hedges, and contain no reentrant operations
on locks. In order to relate schedulability of lock-execution-hedges to schedu-
lability of lock-a/r-hedges (Theorem 4.5), we introduced the concept of disci-
plined schedules, and showed that any schedule on a lock-execution-hedge can
be reordered to a disciplined schedule. We developed the concept of disciplined
schedules in [78], where we used it to prune executions that reach data-races,
and to compute acquisition histories of those executions in an abstract inter-
pretation [31, 32] framework. We described schedulability of lock-a/r-hedges by
acyclicity of dependence-graphs, and checked this acyclicity in finite state by
using acquisition structures (Theorem 4.18). Thus, we obtained a characteriza-
tion of the schedulable lock-execution-hedges by a regular set of lock-a/r-hedges
(Corollary 4.21).
In the remainder of this section, we first discuss the relation of our theory
of movers, which we use to reorder schedules to disciplined schedules, to the
original theory of Lipton [80]. Then, we discuss the development of acquisition
structures, and, finally, we highlight the advantages of our modular approach,
which uses lock-a/r-hedges as an intermediate model, over the direct approach
that we pursue in [79].
Movers To reorder arbitrary schedules to disciplined schedules, we have devel-
oped a theory of movers for lock-sensitive schedules (cf. Subsection 4.2.1). This
was inspired by the theory of movers presented by Lipton [80].
The programs in [80] have parbegin...parend-blocks for expressing parallel
execution, a finite set of semaphores a1, . . . , an for synchronization, and shared
variables. While we regard abstracted programs where we essentially replace
conditionals by nondeterministic choice, the programs in Lipton [80] are concrete
programs.
Semaphores are a well-known concept to ensure mutual exclusion [33]. A
semaphore is essentially a shared variable that holds a positive number. The op-
eration P (ai) decrements the semaphore and is only executable if the semaphore
is greater than zero. The operation V (ai) increments the semaphore. The oper-
ations P and V are executed atomically.
A lock can be modeled by a semaphore that is initialized with 1: The P -
operation acquires the lock, and the V -operation releases the lock.
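The modeling of a lock by a semaphore initialized with 1 can be sketched with Python's standard threading.Semaphore. The wrapper class SemaphoreLock and its method names p and v are hypothetical and only mirror the P- and V-operations described above.

```python
import threading


class SemaphoreLock:
    """A lock modeled by a semaphore that is initialized with 1."""

    def __init__(self) -> None:
        self._sem = threading.Semaphore(1)

    def p(self, blocking: bool = True) -> bool:
        """P-operation: decrement the semaphore, i.e., acquire the lock.

        With blocking=False, the call returns False instead of waiting
        when the semaphore is currently zero.
        """
        return self._sem.acquire(blocking=blocking)

    def v(self) -> None:
        """V-operation: increment the semaphore, i.e., release the lock."""
        self._sem.release()
```

A second P-operation without an intermediate V-operation is not executable, which corresponds to mutual exclusion on the lock.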
Lipton [80] establishes the notion of left movers and right movers. For a parallel
program, a statement f is a right mover if for any execution αfh of the program,
such that h and f are in different threads, also αhf is a valid execution that leads
to the same configuration.
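Symmetrically, a statement g is a left mover if for any execution αhg, such
that h and g are in different threads, also αgh is a valid execution that leads
to the same configuration.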
The main theorem concerning movers states that P -operations are always right
movers, and V -operations are always left movers.
In Subsection 4.2.2, we adapted this theory to construct disciplined schedules
from arbitrary schedules. Regarding movers, the main difference of the program
models is that we have dynamic thread creation, such that a statement cannot
move left beyond the creation of the thread executing this statement, although
the thread-creation-statement and the first statement of the created thread are
executed in different threads. Hence, we introduced the notion of unrelated state-
ments: Two consecutive statements of an execution are unrelated if they are in
different threads and the first one did not create the thread of the second one.
Moreover, for our purpose, we do not only need to move left release-statements,
but also statements unrelated to locking, as well as usages of locks, i.e., state-
ments that atomically execute a block of multiple statements between a matched
acquisition and release. Due to our abstraction, statements unrelated to lock-
ing are independent of the other threads’ configurations, and thus can always be
moved left or right.
Moving usages left is not always possible: As an example, regard the execution
〉x 〈x〉x, i.e., a release of x followed by a usage of x, and assume that the release
and usage are in different threads. Clearly, 〈x〉x 〉x is not a valid execution, as the
lock that shall be used is still held by the other thread.
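The example can be replayed with a small validity check for schedules. The following Python sketch rests on simplifying assumptions: locks are non-reentrant, a schedule is a list of (thread, operation, lock) triples, and the function name is_valid as well as the thread names t1 and t2 are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, str, str]  # (thread, "acq" or "rel", lock)


def is_valid(schedule: List[Step],
             initially_held: Optional[Dict[str, str]] = None) -> bool:
    """Check whether a schedule respects mutual exclusion on locks."""
    holder = dict(initially_held or {})  # lock -> thread that currently holds it
    for thread, op, lock in schedule:
        if op == "acq":
            if lock in holder:  # lock is held by some thread
                return False
            holder[lock] = thread
        elif op == "rel":
            if holder.get(lock) != thread:  # only the holder may release
                return False
            del holder[lock]
        else:
            raise ValueError(f"unknown operation: {op}")
    return True


held = {"x": "t1"}  # thread t1 initially holds x
release_then_usage = [("t1", "rel", "x"), ("t2", "acq", "x"), ("t2", "rel", "x")]
usage_then_release = [("t2", "acq", "x"), ("t2", "rel", "x"), ("t1", "rel", "x")]
```

With these definitions, is_valid(release_then_usage, held) holds, whereas is_valid(usage_then_release, held) does not, as t2 would have to acquire x while t1 still holds it.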
Thus, we classified statements (and sequences of statements executed atom-
ically, like usages) as increasing and decreasing, and only allowed moving over
increasing and decreasing statements. Our mover-lemma (Lemma 4.8) essentially
states that one can swap an increasing statement followed by an unrelated de-
creasing statement. Hence, decreasing statements can be seen as constrained left
movers: They can be moved left only over increasing statements. Symmetri-
cally, increasing statements can be seen as constrained right movers: They can
be moved right only over decreasing statements.
Acquisition Structures The concept of acquisition-graphs was invented by
Kahlon et al. [57], called acquisition histories there. Release-graphs have first
been used by Kahlon and Gupta [55], called backwards acquisition histories there.
In [55, 57] and also [56], (backwards) acquisition histories are applied to ana-
lyze reachability properties of parallel pushdown systems with well-nested, non-
reentrant locks (Lock-PPDS). A parallel pushdown system (PPDS) consists of
multiple pushdown systems running in parallel. DPNs can be seen as a general-
ization of PPDS to additionally allow dynamic thread creation.
In [57], pairwise reachability properties are checked. This allows limiting the
analysis to just two pushdown systems running in parallel (2PDS), and also makes
the acquisition-graphs simpler, as one has to check only for cycles of length two.
Also in the settings of [55] and [56], only cycles of length two need to be con-
sidered. In [78], we use acquisition structures to decide pairwise reachability
properties in programs with reentrant monitors and dynamic thread creation.
Although we consider dynamic thread creation there, the restriction to pairwise
reachability properties allows us to only consider cycles of length two, such that
we can use the acquisition histories of [57]. Finally, in [79], we consider lock-
sensitive predecessor set computation for DPNs with well-nested, non-reentrant
locks. Here, we introduce the concept of execution-trees, and compute acquisi-
tion structures over execution-trees. Moreover, as we consider arbitrary regular
sets of configurations instead of pairwise reachability, we had to generalize the
concept of (backwards) acquisition histories to check for arbitrary long cycles (of
course, bounded by the number of locks).
When analyzing parallel pushdown systems with locks, the analysis can be
implemented in a modular fashion, computing information for each of the PDSs
separately, and combining the information at the end of the analysis. This avoids
constructing the state-space of the whole system, thus bypassing the state-space
explosion problem [57]. When regarding DPNs, the modularity of our method
is not immediately obvious, as the distinct pushdown systems in a DPN are
connected by dynamic thread creation. However, we also do not construct the
state-space of the whole system, but combine the acquisition structures as late
as possible, i.e., at the use-nodes or when combining the acquisition structures of
the distinct trees of the lock-a/r-hedge.
Advantages of Lock-A/R-Hedges Our paper [79] is a direct predecessor of
the work presented in this thesis. Instead of reentrant monitors, we study well-
nested, non-reentrant locks there. Like in this thesis, we use acquisition structures
to characterize lock-sensitive schedulability of lock-execution-hedges. And, as in
this thesis, we reduce lock-sensitive reachability to lock-insensitive reachability,
by characterizing the schedulable executions as a tree automaton, and building
a cross-product of the program to be analyzed and that tree automaton.
An important difference between this thesis and [79] is how the characterization
of schedulable lock-execution-hedges is obtained. In this thesis, we
use lock-a/r-hedges as an abstraction of lock-execution-hedges. Lock-a/r-hedges
contain just the necessary information to decide schedulability: Reentrant lock-
ing is already eliminated, and matched acquisitions and releases are summarized
to use-nodes. On this level, schedulability can easily and intuitively be explained
via acyclicity of the dependence-graph, and the acquisition- and release-graph is
used as a tool to check acyclicity of the dependence-graph with a (finite state) tree
automaton. To establish the connection between schedulability of lock-execution-
hedges and lock-a/r-hedges, we reorder the schedule to form a disciplined sched-
ule, and then perform reentrance elimination. In this thesis, these concepts are
explained in a modular fashion, making the proofs simpler and reusable. For
example, adapting our methods to well-nested, non-reentrant locking, as briefly
discussed in Chapter 8, does not affect the characterization of schedulable lock-
a/r-hedges.
In contrast, the approach taken in [79] directly characterizes the set of schedu-
lable lock-execution-hedges by a tree automaton. The tree automaton computes
acquisition structures bottom-up, doing the matching of acquisitions and releases
by checking whether the acquired lock is released in the subtree that follows the
acquisition. If this is the case, the lock is removed from the release-set and the
release-graph, and is added to the use-set. The proof of correctness is done by
induction over the lock-execution-hedge. It involves all of the steps described
above: Reordering of schedules, checking acyclicity of dependence-graphs via
acquisition- and release-graphs, and characterizing schedulability by acyclicity of
dependence-graphs. This makes the proof quite involved. Indeed, for [79], the
proof was first developed on paper, and then mechanically verified with the inter-
active theorem prover Isabelle/HOL [94]. During the verification, we discovered
some cases that we had not considered in the first version of the proof, which
required rather complicated arguments. The proof script can be found in [72].
5 Automata Constructions
In the last chapter, we have characterized the set of schedulable lock-a/r-hedges
by the tree automaton ACons. In this chapter, we construct a cross-product DPN.
Its execution-hedges correspond to those lock-execution-hedges of the Monitor-
DPN whose lock-a/r-hedges are accepted by ACons—i.e., to the schedulable lock-
execution-hedges of the Monitor-DPN. In the next chapter, we then use the
cross-product DPN to reduce lock-sensitive predecessor set computation to lock-
insensitive predecessor set computation.
The construction is done in two steps: In Section 5.1, we transform a tree
automaton for lock-a/r-hedges into a DPN-Acceptor for lock-execution-hedges.
In Section 5.2, we describe a cross-product construction between a Monitor-DPN
and a DPN-Acceptor, yielding a new DPN. Its execution-hedges correspond to
the lock-execution-hedges of the Monitor-DPN that are accepted by the DPN-
Acceptor. Finally, in Section 5.3, we briefly summarize the results of this chapter
and discuss related work.
5.1 A DPN-Acceptor for Schedulable
Execution-Hedges
In Chapter 4, we have characterized the schedulable lock-a/r-hedges by the tree
automaton ACons. In this section, we show how to translate a tree automaton for
lock-a/r-hedges to a DPN-Acceptor for lock-execution-hedges.
5.1.1 DPN-Acceptors
Our goal is to derive a DPN that accepts exactly the schedulable lock-execution-
hedges. For this, we have to define a notion of acceptance for DPNs. The intuition
is to fix a set of initial and final configurations, and define all execution-hedges
between these sets of configurations as accepted. In order to handle locks, we use
Monitor-DPNs, i.e., we assume that the locks are bound to the stack. Moreover,
for our special purpose, we assume that any non-bottom stack-symbol has a lock,
and that non-release-rules of the DPN do not depend on the stack.
Definition 5.1 (DPN-Acceptor). A DPN-Acceptor D = (ACI ,ACF ,M) consists
of an initial automaton ACI, a final automaton ACF, and a Monitor-DPN M =
(P,Γ,Γ⊥,Act,X ,∆, locks), such that all non-bottom stack-symbols ofM are bound
to locks (1), non-release-rules of the DPN do not depend on the stack (2,3), and
ACI accepts only valid configurations (4):
∀γ ∈ Γ \ Γ⊥. locks(γ) ≠ ∅ (1)
pγ o↪→ p′wγ′ ∈ ∆ =⇒ γ′ = γ ∧ ∀γ′′ ∈ Γ. pγ′′ o↪→ p′wγ′′ ∈ ∆ (2)
pγ o↪→ psws#p′wγ′ ∈ ∆ =⇒ γ′ = γ ∧ ∀γ′′ ∈ Γ. pγ′′ o↪→ psws#p′wγ′′ ∈ ∆ (3)
L(ACI) ⊆ valid (4)
For a state q of ACI, the set D(q) of lock-execution-hedges accepted in state q
is defined as
D(q) := {h× ls(c) | ∃c′. c ∈ ACI(q) ∧ c h=⇒M c′ ∧ c′ ∈ L(ACF)},
and the language ofD is defined as the set of lock-execution-hedges between L(ACI)
and L(ACF):
L(D) := {h× ls(c) | ∃c′. c ∈ L(ACI) ∧ c h=⇒M c′ ∧ c′ ∈ L(ACF)}.
The additional restrictions of the DPN of a DPN-Acceptor are crucial for the
cross-product construction presented in Section 5.2. On the other hand, DPN-
Acceptors are expressive enough to accept the set of schedulable lock-execution-
hedges, as we show in the next subsection.
5.1.2 A DPN-Acceptor for Regular Sets of A/R-Hedges
Given a tree automaton A over lock-a/r-hedges, we show how to construct a DPN-
Acceptor DA such that L(DA) = ar−1(L(A)), i.e., DA accepts exactly those lock-
execution-hedges whose corresponding lock-a/r-hedges are accepted by A. When
setting A := ACons, the DPN-Acceptor DACons accepts exactly the schedulable
lock-execution-hedges.
In order to simulate an automaton over the corresponding lock-a/r-hedge, the
DPN-Acceptor has to identify matching acquisitions and releases. Moreover, it
must identify reentrant acquisitions and releases. For this purpose, the stack
of a thread simulates the lockstack. The stack is updated on acquisition- and
release-operations only. In order to distinguish final acquisitions from usages,
upon an acquisition, the DPN-Acceptor decides nondeterministically whether this
acquisition is final, or has a matched release-node. However, it has to ensure that
the decision is correct, and, otherwise, the hedge must not be accepted. When
deciding for a final acquisition, a flag in the control-state is set, indicating that
no more initial releases may be accepted. Thus, if there should be a release-node
after an acquisition-node that has been recognized as final, the DPN-Acceptor
has no rule to accept this release-node.
When deciding for a usage, the DPN-Acceptor remembers in its control-state
that it is currently inside a usage, and marks the first lock of the usage on the
stack. The usage is left when the marked stack-symbol is popped by the matching
release. Thus, if an acquisition is recognized as a usage, and there is no matching
release, the control-state after accepting the whole hedge indicates that the DPN-
Acceptor is still inside a usage, and waits for the matching release. The final
automaton ACF excludes configurations with such control-states. Moreover, the
initial automaton ACI ensures that all control-states are initially outside usages.
In order to identify reentrant acquisitions and releases, the DPN-Acceptor
stores the set of currently acquired locks in its control-state, and flags locks on
the stack as reentrant or non-reentrant. Using these flags, the set of currently
acquired locks can be updated correctly on initial releases. Inside usages, only
the set of used locks needs to be computed. For this purpose, it is not required
to update the set of acquired locks inside usages, nor to flag the used locks as
reentrant on the stack.
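The bookkeeping outside usages can be sketched as follows. The Python class below is a minimal illustration, not the DPN-Acceptor itself: it keeps only the flagged stack and the set of currently acquired locks, and ignores the control-state flags and the simulated tree automaton. The class name LockStack and its method names are hypothetical.

```python
from typing import List, Set, Tuple


class LockStack:
    """Stack of locks flagged as reentrant (True) or non-reentrant (False)."""

    def __init__(self) -> None:
        self.stack: List[Tuple[str, bool]] = []  # top of the stack is the last entry
        self.held: Set[str] = set()              # set of currently acquired locks

    def acquire(self, x: str) -> None:
        reentrant = x in self.held  # x is already on the stack below
        self.stack.append((x, reentrant))
        self.held.add(x)            # no effect for a reentrant acquisition

    def release(self) -> str:
        x, reentrant = self.stack.pop()
        if not reentrant:
            # Only the release matching the outermost acquisition of x
            # removes x from the set of acquired locks.
            self.held.discard(x)
        return x
```

The flag on the popped entry is what allows the set of acquired locks to be updated without inspecting the rest of the stack.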
The initial automaton ACI ensures that the flagging of locks as reentrant is
done correctly on the initial lockstacks, and that the sets of acquired locks in the
control-states are initialized correctly.
Finally, the DPN-Acceptor has to simulate the automaton A over lock-a/r-
hedges. For this purpose, it maintains the state of A in its control-state. As
the DPN-Acceptor identifies usages and reentrance, it has enough information
to correctly update the state of A. In order to identify usages and reentrance,
we have regarded the DPN-Acceptor from a top-down perspective. However, the
tree automaton A over lock-a/r-hedges works bottom-up. Hence, we regard the
DPN-Acceptor from a bottom-up perspective for the following explanation.
The final automaton ACF ensures that all states of the automaton are initialized
to states that may be accepted by leaf-rules. Outside usages, updating the states
of the tree automaton is straightforward, as each rule of the DPN matches a
node in the a/r-tree. On the last release of a usage, the control-state of the
DPN switches to a usage-state. This state stores the old state of the automaton.
Moreover, it keeps track of the set of used locks and the state of the automaton
over the list of spawned threads. This information is updated while accepting
the nodes inside the usage. The actual use-rule of the automaton is applied
when accepting the first acquisition of the usage, using the stored old state of the
automaton, the computed set of used locks, and the computed state for the list
of spawned threads. Finally, the initial automaton ACI computes the state of A
over the hedge, and ensures that it is an initial state.
Following the ideas presented above, for an automaton A over lock-a/r-hedges,
we define the DPN-Acceptor DA, such that L(DA) = ar−1(L(A)). In order to keep
the presentation simpler, we additionally assume that A does not depend on use-
nodes with empty lockset and no spawned threads (〈〉∅(ε)). This assumption is
obviously true for ACons.
Definition 5.2. Let A = (Q,F, δ) be an automaton over lock-a/r-hedges, such
that
〈〉∅(ε)q →∗δ q′ ⇐⇒ q = q′.
We define the DPN-Acceptor DA = (ACI ,ACF ,M) with the Monitor-DPN M =
(P,Γ, {⊥},Act,X ,∆, locks), where
P := {(pb, X, q) | b ∈ B, X ⊆ X , q ∈ Q}
∪˙ {(ub,q˜, X, u, q) | b ∈ B, q˜, q ∈ Q, X, u ⊆ X}
Γ := {xb | x ∈ X , b ∈ B} ∪˙ {⊥}
locks(xb) := {x} locks(⊥) := ∅
The rules ∆ of the DPN are the following:
(pb, X, q)γ 2a↪→ (pb, X, q)γ (base)
(pb, X, q)γ 2a↪→ (p⊥, ∅, q1)⊥#(pb, X, q2)γ if 〈〉∅([q1])q2 →∗δ q (spawn)
(pb, X, q)γ 〈x↪→ (p⊥, X, q)x⊤γ if x ∈ X (acq-r)
(pb, X, q)γ 〈x↪→ (p⊥, {x} ∪X, q′)x⊥γ if x ∉ X and 〈xq′ →δ q (acq-n)
(p⊤, X, q)x⊤ 〉x↪→ (p⊤, X, q) (rel-r)
(p⊤, X, q)x⊥ 〉x↪→ (p⊤, X \ {x}, q′) if 〉xq′ →δ q (rel-n)
(ub,q˜, X, u, q)γ 2a↪→ (ub,q˜, X, u, q)γ (u-base)
(ub,q˜, X, u, q)γ 2a↪→ (p⊥, ∅, q1)⊥#(ub,q˜, X, u, q2)γ if q1#sq2 →δ q (u-spawn)
(ub,q˜, X, {x} \X ∪ u, q)γ 〈x↪→ (ub,q˜, X, u, q)x⊥γ (u-acq)
(ub,q˜, X, u, q)x⊥ 〉x↪→ (ub,q˜, X, u, q) (u-rel)
(pb, X, q)γ 〈x↪→ (ub,q˜, X, u, q′)x⊤γ if 〈〉{x}\X∪u(q′)q˜ →δ q (u-begin)
(ub,q˜, X, ∅, q)x⊤ 〉x↪→ (pb, X, q˜) if εs →δ q (u-end)
for all b ∈ B, X, u ⊆ X , x ∈ X , a ∈ Act, γ ∈ Γ, and q, q1, q2, q′, q˜ ∈ Q.
The initial automaton is
ACI := ({s0}×Q ∪˙ {s1}×2^X×Q, {s0}×F, {(s0, q) | εh →δ q}, δI).
Its transition relation δI is the least solution of the following constraints:
(s1, ∅, q) ⊥−→δI (s0, q) (stack-bot)
(s1, {x} ∪X, q) xb−→δI (s1, X, q) for b = (x ∈ X) (stack)
(s0, q) (p⊤,X,qt)−→δI (s1, X, q′) if (qt, X)#hq′ →δ q (ctrl)
for all q, q′, qt ∈ Q, X ⊆ X , and x ∈ X .
The final automaton is ACF := ({s}, {s}, {s}, δF). It has just the single state s,
and its transition relation δF is the least solution of the following constraints:
s (pb,X,q)−→δF s if b ∈ B, X ⊆ X , and τ →δ q
s γ−→δF s if γ ∈ Γ
One easily verifies that the DPN M of the DPN-Acceptor is a Monitor-DPN
(cf. Definition 3.11). Moreover, all non-bottom stack-symbols are associated with
locks, non-release-rules do not depend on the stack, and we have L(ACI) ⊆ valid.
Thus, DA is a well-defined DPN-Acceptor.
The DPN implements the intuition sketched at the beginning of this subsection:
Outside usages, it uses states of the form (pb, X, q). The flag b indicates whether
initial releases are allowed (b = ⊤) or not (b = ⊥). The set X is the set of
currently acquired locks, and q is the current state of the tree automaton A. The
stack consists of the ⊥-symbol and locks that are flagged with Boolean values.
A stack entry x⊥ stands for a non-reentrant lock, i.e., there is no other x on the
stack below this entry. An entry x⊤ stands for a reentrant lock.
The (base)-rule accepts base-nodes in the execution-tree. As base-nodes cor-
respond to 〈〉∅(ε)-nodes in the a/r-tree, it does not change the state of A. The
(spawn)-rule accepts a spawn-node. The spawned thread is initialized with an
empty set of currently acquired locks, and initial releases are not allowed (p⊥).
The spawn-node corresponds to a 〈〉∅(ts)-node in the a/r-tree, and the state of
A is updated accordingly. The (acq-r)-rule accepts a reentrant final acquisition.
It does not modify the state of A, as the reentrant acquisition corresponds to a
〈〉∅(ε)-node in the lock-a/r-tree. The (acq-n)-rule accepts a non-reentrant final
acquisition. A non-reentrant final acquisition corresponds to a 〈x-node in the
a/r-tree, and the state of A is updated accordingly. Both the (acq-r)- and
(acq-n)-rules accept final acquisitions, and thus set the initial-release flag to ⊥, allowing
no more initial releases to be accepted. Analogously, the (rel-r)- and (rel-n)-
rules accept initial releases. They can only be applied if the initial-release flag
is still ⊤. Finally, the (u-begin)-rule switches to a use-state. The outermost
lock of the usage is flagged with ⊤. The state of A is updated according to a
use-rule, using the set of used locks and the state of A for the spawned threads
that was computed bottom-up during processing the nodes of the usage. The
(u-begin)-rule can be applied instead of an (acq-r)- or (acq-n)-rule, modeling the
nondeterministic decision whether an acquisition is final or the beginning of a
usage.
Inside usages, states of the form (ub,q˜, X, u, q) are used. The tuple (b, q˜) contains
the initial-release flag and the tree automaton’s state after the usage, which are
passed through unchanged. The set X is the set of currently acquired locks before
the usage. This set is also passed through unchanged, and used to avoid adding
reentrant locks to the set u of used locks. Finally, the state q keeps track of
the automaton’s state for the list of spawned threads. The (u-base)-rule does
not change the state at all, the (u-spawn)-rule updates the state for the list of
spawned threads according to a rule for #s-nodes of A. The (u-acq)-rule pushes
the acquired lock on the stack, flagged with ⊥, indicating that this lock is an
inner lock of a usage. Moreover, if the acquired lock is non-reentrant (w.r.t. the
set X of locks held before the usage), it is added to the set of used locks. Note
that, for an acquisition that reenters a lock of another non-reentrant acquisition
inside the same usage, adding the acquired lock to the set u of used locks has no
effect, as it is already contained in u. The (u-rel)-rule pops an inner lock of the
usage. Finally, the (u-end)-rule initializes the used-set and the state of A for the
list of spawned threads, as well as the initial-release flag b and the state q˜ of A
after the usage.
The automaton ACI uses states of the form (s0, q) and (s1, X, q). For the fol-
lowing explanation, assume that it reads a configuration backwards, starting at
a final state, and accepting in an initial state. It first reads the bottommost
stack-symbol of a thread-configuration, then the other stack-symbols, and finally
the control-state. The (s0, q)-states are used just before a new stack is read.
The (stack-bot)-rule ensures that the bottommost symbol of this stack is the
⊥-symbol. The (s1, X, q)-states are used during reading the stack. The set X
collects the set of locks on the stack, and is used by the (stack)-rule to ensure a
correct reentrance marking. If the stack is completely read, the (ctrl)-rule reads
the control-state, and ensures correct initialization. The q-component of the state
keeps track of A’s state for the hedge. It is passed through unchanged while ac-
cepting the stack, and only changed by the (ctrl)-rule. The final states of ACI
ensure that q is initialized to a state with which the empty hedge can be accepted,
and the initial states ensure that q is a final state of the tree automaton A.
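The check of the reentrance marking performed by the (stack-bot)- and (stack)-rules can be sketched as follows. The Python function below reads a stack from the bottom upwards, as in the explanation above, and returns the collected lockset, or None if the marking is incorrect. The list encoding of stacks (top first), the constant BOTTOM, and the function name lockset_if_well_marked are hypothetical.

```python
from typing import List, Optional, Set, Tuple, Union

BOTTOM = "bottom"  # stands for the bottom-of-stack symbol
Entry = Union[str, Tuple[str, bool]]  # BOTTOM or (lock, reentrant flag)


def lockset_if_well_marked(stack: List[Entry]) -> Optional[Set[str]]:
    """Return the set of locks on the stack if it is correctly marked.

    The stack is given top first and must end with BOTTOM. An entry (x, b)
    is correct iff b is True exactly when x occurs further below.
    """
    if not stack or stack[-1] != BOTTOM:
        return None
    held: Set[str] = set()
    for entry in reversed(stack[:-1]):  # from the bottom towards the top
        if entry == BOTTOM:             # BOTTOM only at the very bottom
            return None
        x, flag = entry
        if flag != (x in held):
            return None
        held.add(x)
    return held
```

The returned set plays the role of the X-component that the (ctrl)-rule compares against the lockset stored in the control-state.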
The automaton ACF ensures that all control-states of the DPN are outside
usages, and that the encoded states of the tree automaton A are states with
which the empty tree can be accepted.
The correctness of the construction is stated by the following theorem:
Theorem 5.3. Let A = (Q,F, δ) be an automaton over lock-a/r-hedges that does
not depend on empty use-nodes:
〈〉∅(ε)q →∗δ q′ ⇐⇒ q = q′.
Then, DA accepts exactly the lock-execution-hedges whose corresponding lock-a/r-
hedges are accepted by A:
L(DA) = ar^{-1}(L(A)).
The size of DA is polynomial in the size of A and the number of base-actions,
and exponential in the number of locks:
|DA| = poly(|A||Act|) · 2^{O(|X|)}.
Moreover, DA can be constructed in time poly(|A||Act|) · 2^{O(|X|)}.
Proof. After unfolding the definitions of L, ACI , and ar^{-1}, we have to show for all
lock-execution-hedges h ∈ Hls:
∃q ∈ F, c ∈ ACI(s0, q), c′ ∈ L(ACF). c h|1=⇒ c′ ∧ ls(c) = h|2
⇐⇒ ∃q ∈ F. ar(h)→∗δ q
By induction on the structure of lock-execution-hedges we show the following
more general statement: For all lock-execution-hedges h ∈ Hls, lock-execution-
trees (t, µ) ∈ Tls, and well-nested execution-trees ts ∈ Twn with µ ts⇀ ε, we have:
∃c ∈ ACI(s0, q), c′ ∈ L(ACF). c h|1=⇒ c′ ∧ ls(c) = h|2 ⇐⇒ arh(h)→∗δ q (1)
∃w ∈ Γ∗, c′ ∈ L(ACF). (pb, set(µ), q)w t=⇒ c′ ∧ w ∼ µ
⇐⇒ art(t, µ)→∗δ q ∧ (b = ⊤ ∨ t ∈ Tnr) (2)
∃c′ ∈ L(ACF). (ub,q˜, X, u, q)µ⊥x⊤w ts=⇒ c′[ϕ]
⇐⇒ ∃u′, u′′ ⊆ X , q′ ∈ Q, s ∈ TAR∗. ars(ts, X) = (u′′, s)
∧ ϕ = (ub,q˜, X, u′, q′)x⊤w ∧ u = u′ ∪ u′′ ∧ s#sq′ →∗δ q (3)
where w ∼ µ means that w is a valid stack with correct reentrance marking and
lockstack µ. It is defined in terms of the automaton ACI :
w ∼ µ :⇐⇒ ∀q˜. (s1, set(µ), q˜) w−→∗δI (s0, q˜) ∧ ls(w) = µ.
Note that it is straightforward to show that ACI correctly computes the lockset
and does not change the state of the simulated tree automaton while accepting
a stack. Thus, we have
(s1, X, q˜) w−→∗δI (s0, q′) =⇒ q′ = q˜ ∧X = set(ls(w)) ∧ w ∼ ls(w). (*1)
Moreover, the notation s#sq′ →∗δ q means that the list1 s of a/r-trees is ac-
cepted in state q, when starting in state q′:
s#sq′ →∗δ q :⇐⇒
    e1#s . . .#sen#sq′ →∗δ q    if s = e1#s . . .#sen#sεs for n > 0
    q = q′                       if s = εs
1We identify lists of a/r-trees (TAR∗) and terms over #s and εs here.
Intuitively, Statement (1) means that DA accepts a lock-execution-hedge h in
state (s0, q), if and only if A accepts the corresponding lock-a/r-hedge in state
q. Statement (2) is the same statement for lock-execution-trees, generalized to
capture the meaning of the p⊥-state that no initial releases are accepted any
more. Finally, Statement (3) captures the behavior when processing a usage. It
is generalized to also apply for the trees occurring inside a usage, where the locks
from µ are released, and to handle arbitrary states q′ with which the processing
of the list of spawned threads starts and arbitrary sets u′ of used locks at the end
of a usage. Here, µ⊥ means that every element of µ is flagged with ⊥, i.e., for
µ = x1 . . . xn, we define µ⊥ := x⊥1 . . . x⊥n . Note that, in Statement (3), the stack
w that describes the stack below the usage is implicitly universally quantified over the
whole statement.
We now demonstrate the cases of the induction:
Cases of Statement (1) In the case h = ε, we have arh(h) = εh, and get
∃c, c′. c ∈ ACI(s0, q) ∧ c′ ∈ L(ACF) ∧ c ε=⇒ c′ ∧ ls(c) = ε
⇐⇒ ε ∈ ACI(s0, q) (1.1)
⇐⇒ εh →δ q (1.2)
Equivalence (1.1) is due to c ε=⇒ c′ ⇐⇒ c = c′ = ε, ε ∈ L(ACF), and ls(ε) = ε.
Equivalence (1.2) is due to the definition of the final states of ACI .
In the case h = (t, µ)h2, we have arh(h) = (art(t, µ), set(µ))arh(h2). For the
=⇒-direction, we assume
c ∈ ACI(s0, q) ∧ c′ ∈ L(ACF) ∧ c [t]h2|1=⇒ c′ ∧ ls(c) = [µ]h2|2.
By definition of ACF , we obviously have
∀c1, c2. c1c2 ∈ L(ACF) ⇐⇒ c1 ∈ L(ACF) ∧ c2 ∈ L(ACF) (*2)
By unfolding the definition of [t]h2|1=⇒, splitting the run of ACI accordingly, and
applying (*1) and (*2), we obtain qt, q2 ∈ Q, w ∈ Γ∗, c2 ∈ Conf, and c′1, c′2 ∈
L(ACF) such that
c = (p⊤, set(µ), qt)wc2 ∧ c′ = c′1c′2 ∧ ls(w) = µ ∧ ls(c2) = h2|2
∧ (p⊤, set(µ), qt)w t=⇒ c′1 ∧ c2 h2|1=⇒ c′2
∧ (s0, q) (p⊤,set(µ),qt)−→δI (s1, set(µ), q2) w−→∗δI (s0, q2) ∧ c2 ∈ ACI(s0, q2) ∧ w ∼ µ.
Applying the induction hypothesis to t and h2 yields
art(t, µ)→∗δ qt ∧ arh(h2)→∗δ q2.
Moreover, due to the definition of δI, we have (qt, set(µ))#hq2 →δ q, and together
we get arh(h)→∗δ q.
For the ⇐=-direction, we assume arh(h)→∗δ q. Analyzing the last rule applied
by the run of the tree automaton, and using arh(h) = (art(t, µ), set(µ))#harh(h2),
we obtain qt, q2 ∈ Q, such that
art(t, µ)→∗δ qt ∧ arh(h2)→∗δ q2 ∧ (qt, set(µ))#hq2 →δ q.
Applying the induction hypothesis, we obtain w ∈ Γ∗, c2 ∈ ACI(s0, q2), and
c′1, c′2 ∈ L(ACF) with
(p>, set(µ), qt)w t=⇒ c′1 ∧ w ∼ µ ∧ c2 h2|1=⇒ c′2 ∧ ls(c2) = h2|2.
We set c := (p>, set(µ), qt)wc2. Unfolding the definition of ∼, and using the
(ctrl)-rule of ACI , we get c ∈ ACI(s0, q). Moreover, we have ls(c) = h|2. With (*2),
we get c′1c′2 ∈ L(ACF). Combining the executions of t and h2|1 yields c h|1=⇒ c′1c′2,
which completes the case.
Cases of Statement (2) In the case t = τ , we have art(t, µ) = τ . Moreover,
we have
∃w ∈ Γ∗, c′ ∈ L(ACF). (pb, set(µ), q)w τ=⇒ c′ ∧ w ∼ µ
⇐⇒ ∃w. (pb, set(µ), q)w ∈ L(ACF) ∧ w ∼ µ.
For the =⇒-direction, we unfold the definition of ACF , and get τ →δ q, which
implies the proposition.
For the ⇐=-direction, we observe that it is straightforward to find a stack
w ∈ Γ∗ with w ∼ µ, by simply adding the correct reentrance marking to µ, and
appending a ⊥-symbol. Moreover, due to τ →δ q, we have (pb, set(µ), q)w ∈
L(ACF), which implies the proposition.
In the case t = 2at1, we have art(t, µ) = 〈〉∅(ε)art(t1, µ). As base-steps have
no effect and are executable from all states (cf. (base)-constraint), we have
(pb, set(µ), q)w 2at1=⇒ c′ iff (pb, set(µ), q)w t1=⇒ c′. Similarly, we have 2at1 ∈ Tnr
iff t1 ∈ Tnr. Moreover, as A does not depend on use-nodes, we have 〈〉∅(ε)q →δ q′
iff q = q′. Applying the induction hypothesis completes the case.
In the case t = a(t1)t2, we have art(t, µ) = 〈〉∅([art(t1, ε)])art(t2, µ). For the
=⇒-direction, we assume
c′ ∈ L(ACF) ∧ (pb, set(µ), q)w t=⇒ c′ ∧ w ∼ µ.
Analyzing the rules of DA, the first step must have been derived by a (spawn)-
rule. Using (*2), we obtain q1, q2 ∈ Q and c′1, c′2 ∈ L(ACF) such that:
c′ = c′1c′2 ∧ 〈〉∅([q1])q2 →∗δ q ∧ (p⊥, ∅, q1)⊥ t1=⇒ c′1 ∧ (pb, set(µ), q2)w t2=⇒ c′2.
Obviously, we have ⊥ ∼ ε. Thus, we can apply the induction hypothesis on both,
t1 and t2, and get:
art(t1, ε)→∗δ q1 ∧ t1 ∈ Tnr ∧ art(t2, µ)→∗δ q2 ∧ (b = > ∨ t2 ∈ Tnr).
We have t ∈ Tnr if and only if t2 ∈ Tnr. Moreover, using the rule 〈〉∅([q1])q2 →∗δ q,
we get art(t, µ)→∗δ q, which completes the =⇒-direction of the spawn-case.
For the ⇐=-direction, we assume
art(t, µ)→∗δ q ∧ (b = > ∨ t ∈ Tnr).
Analyzing the last rule applied by →∗δ and unfolding the definition of art, we
obtain q1, q2 ∈ Q such that
〈〉∅([q1])q2 →∗δ q ∧ art(t1, ε)→∗δ q1 ∧ art(t2, µ)→∗δ q2.
As t is well-nested, we have t1 ∈ Tnr. Applying the induction hypothesis on both,
t1 and t2, and using that w ∼ ε iff w = [⊥], and that t ∈ Tnr iff t2 ∈ Tnr, we
obtain w ∈ Γ∗ and c′1, c′2 ∈ L(ACF) such that
(p⊥, ∅, q1)⊥ t1=⇒ c′1 ∧ (pb, set(µ), q2)w t2=⇒ c′2 ∧ w ∼ µ.
Using the (spawn)-rule of DA and (*2) completes the case.
In the case of a non-reentrant final acquisition (i.e., t = 〈xt1 with t1 ∈ Tnr and
x /∈ set(µ)), we have art(t, µ) = 〈xart(t1, xµ). For the =⇒-direction, we first have
to show that the root-node of t is accepted by the (acq-n)-rule, and not by the
(u-begin)-rule that also matches. This is done by contradiction. So assume the
root-node is accepted by a (u-begin)-rule. Then, we obtain q˜, q′, and u such that
(ub,q˜, set(µ), u, q′)x>w t1=⇒ c′.
By a straightforward induction on t1, using the case distinction of Lemma 4.2,
we show that for all w ∈ Γ∗, u ⊆ X , and q ∈ Q we have
(ub,q˜, X, u, q)w t1=⇒ c′ ∧ t1 ∈ Tnr =⇒ ∃u′, q′, w′, c′1. c′ = c′1[(ub,q˜, X, u′, q′)w′w].
Intuitively, the non-returning tree t1 does not pop any symbol from the stack,
and thus the (u-end)-rule that leaves the use-state is never applied. Hence, the
thread is still in a use-state after executing t1. In our case, we obtain u′, q′′, and
c′1 such that c′ = c′1[(ub,q˜, set(µ), u′, q′′)]. However, this contradicts c′ ∈ L(ACF).
Thus, the root-node of t is accepted by an (acq-n)-rule, and we obtain q′ ∈ Q
such that
〈xq′ →δ q ∧ (p⊥, set(xµ), q′)x⊥w t1=⇒ c′.
By unfolding the definition of ∼, we also have x⊥w ∼ xµ. Analogously to the
above cases, we apply the induction hypothesis to t1, and infer the proposition.
For the ⇐=-direction, we assume 〈xart(t1, xµ) →∗δ q. Hence, we obtain q′ ∈ Q
such that 〈xq′ →δ q and art(t1, xµ)→∗δ q′. As we have t1 ∈ Tnr, we can apply the
induction hypothesis and obtain w ∈ Γ∗ and c′ ∈ L(ACF) such that
(p⊥, set(xµ), q′)w t1=⇒ c′ ∧ w ∼ xµ.
By unfolding the definition of ∼, using x /∈ set(µ), and analyzing the rules of
ACI , we obtain w′ such that w = x⊥w′ and w′ ∼ µ. With the (acq-n)-rule we get
(pb, set(µ), q)w′ t=⇒ c′ for all b ∈ B, which completes the case.
Now assume that the root-node of t is a reentrant final acquisition or a reentrant
initial release:
(t = 〈xt1 ∧ t1 ∈ Tnr ∧ x ∈ set(µ)) ∨ (t = 〉xt1 ∧ µ = xµ′ ∧ x ∈ set(µ′)).
We have ar(t, µ) = 〈〉∅(ε)t˜1, where t˜1 = ar(t1, xµ) or t˜1 = ar(t1, µ′), respectively.
These cases are proved analogously to the case t = 2at1, and for a reentrant final
acquisition, we show—analogously to the above case—that the root-node of t is
accepted by the (acq-r)-rule rather than by the (u-begin)-rule.
In the case of a non-reentrant initial release (i.e., t = 〉xt1, µ = xµ′, and
x /∈ set(µ′)), we have art(t, µ) = 〉xart(t1, µ′). The only matching rule of DA is the
(rel-n)-rule, and the proposition is shown analogously to the cases above.
Finally, in the case of a usage (i.e., t = 〈xt1〉xt2 with t1 ∈ Tsl), we obtain u ⊆ X
and s ∈ TAR∗ with
art(t, µ) = 〈〉{x}\set(µ)∪u(s)art(t2, µ) and ars(t1, set(µ)) = (u, s).
For the =⇒-direction, we assume
c′ ∈ L(ACF) ∧ (pb, set(µ), q)w 〈xt1〉xt2=⇒ c′ ∧ w ∼ µ.
We first show that the 〈x-node is accepted by a (u-begin)-rule, and not by an (acq-
r)- or (acq-n)-rule, which also match. The proof is, again, done by contradiction.
So assume the 〈x-node is accepted by an (acq-r)-rule (The proof for an (acq-n)-




It is straightforward to show that after execution of the same-level tree t1, we
are in a p⊥-state again, and the stack is x>w, i.e., we obtain c′1 and q′ with
(p⊥, set(xµ), q)x>w t1=⇒ c′1(p⊥, set(xµ), q′)x>w 〉xt2=⇒ c′.
However, there is no rule to accept the release-node 〉x from a p⊥-state, in con-
tradiction to the above execution. Thus, the root-node of t is accepted by a
(u-begin)-rule, and we obtain q˜, q′ ∈ Q and u˜ ⊆ X with





Further splitting the execution of t1〉xt2, and using (*2), we obtain c′1, c′2 ∈ L(ACF)
and ϕ, ϕ′ ∈ PΓ∗, such that
c′ = c′1c′2 ∧ (ub,q˜, set(µ), u˜, q′)x>w t1=⇒ c′1ϕ ∧ ϕ 〉x=⇒ ϕ′ ∧ ϕ′ t2=⇒ c′2.
Applying Equivalence (3) of the induction hypothesis and using ars(t1, set(µ)) =
(u, s), we obtain u′ and q′′ such that
ϕ = (ub,q˜, set(µ), u′, q′′)x>w ∧ u˜ = u′ ∪ u ∧ s#sq′′ →∗δ q′.
Hence, the only applicable rule to accept the 〉x-node is the (u-end)-rule, and we
have
u′ = ∅ ∧ εs →δ q′′ ∧ ϕ′ = (pb, set(µ), q˜)w,
and thus u = u˜ and s →∗δ q′. Moreover, we can apply the induction hypothesis
for t2, and obtain
ar(t2, µ)→∗δ q˜ ∧ (b = > ∨ t2 ∈ Tnr).
Together, we get ar(t, µ) →∗δ q. Moreover, we have t ∈ Tnr iff t2 ∈ Tnr, which
completes the =⇒-direction of the use-case.
For the ⇐=-direction, we assume
〈〉{x}\set(µ)∪u(s)art(t2, µ)→∗δ q ∧ (b = > ∨ 〈xt1〉xt2 ∈ Tnr).
Hence, we obtain q′, q˜ ∈ Q such that
〈〉{x}\set(µ)∪u(q′)q˜ →δ q ∧ s→∗δ q′ ∧ art(t2, µ)→∗δ q˜.
Splitting the run s →∗δ q′ after the first rule, we obtain q′′ such that ε →δ
q′′ ∧ sq′′ →∗δ q′. Applying the induction hypothesis on t2 yields w ∈ Γ∗ and
c′2 ∈ L(ACF) such that (pb, set(µ), q˜)w t2=⇒ c′2 and w ∼ µ. With the (u-end)-rule,
we get
(ub,q˜, set(µ), ∅, q′′)x>w 〉x=⇒ (pb, set(µ), q˜)w.
Applying Equivalence (3) of the induction hypothesis on t1, instantiating µ = ε
and u′ = ∅, yields c′1 ∈ L(ACF) such that
(ub,q˜, set(µ), u, q′)x>w t1=⇒ c′1[(ub,q˜, set(µ), ∅, q′′)x>w].
Finally, with the (u-begin)-rule, we get
(pb, set(µ), q)w 〈x=⇒ (ub,q˜, set(µ), u, q′)x>w,
and with (*2), we have c′1c′2 ∈ L(ACF). Combining the executions derived above
completes the case.
Cases of Statement (3) In the case ts = τ , we have µ = ε and ars(ts, X) =
(∅, ε). For the =⇒-direction, we assume
(ub,q˜, X, u, q)x>w τ=⇒ c′ϕ ∧ c′ ∈ L(ACF).
Hence we have c′ = ε and ϕ = (ub,q˜, X, u, q)x>w. The statements u = u ∪ ∅ and
εs#sq →∗δ q are trivial, which completes the =⇒-direction.
For the ⇐=-direction, we assume
ϕ = (ub,q˜, X, u′, q′)x>w ∧ u = ∅ ∪ u′ ∧ ε#sq →δ q′.
Hence, we have u = u′ and q = q′. We choose c′ := ε. By definition of ACF , we
have c′ ∈ L(ACF), which completes the case.
In the case ts = 2at1, we have ars(ts, X) = ars(t1, X), and the proposition
is proved straightforwardly by applying the induction hypothesis and using the
definition of the (u-base)-rule that does not change the state or stack.
In the case ts = a(t1)t2, we have
ars(ts, X) = (u′′, [art(t1, ε)]s) with ars(t2, X) = (u′′, s).
For the =⇒-direction, we assume
(ub,q˜, X, u, q)µ⊥x>w a(t1)t2=⇒ c′[ϕ] ∧ c′ ∈ L(ACF).
The a-node was accepted by a (u-spawn)-rule, and with (*2) we obtain c′1, c′2 ∈
L(ACF) and q1, q2 ∈ Q such that
c′ = c′1c′2 ∧ q1#sq2 →δ q ∧ (p⊥, ∅, q1)⊥ t1=⇒ c′1 ∧ (ub,q˜, X, u, q2)µ⊥x>w t2=⇒ c′2[ϕ].
We have ⊥ ∼ ε, and thus can apply Equivalence (2) of the induction hypothesis on
t1, and get art(t1, ε)→∗δ q1. Applying Equivalence (3) of the induction hypothesis
to t2, we obtain u′ ⊆ X and q′ ∈ Q such that
ϕ = (ub,q˜, X, u′, q′)x>w ∧ u = u′ ∪ u′′ ∧ s#sq′ →∗δ q2.
Together, we get [art(t1, ε)]sq′ →∗δ q, which completes the =⇒-direction.
For the ⇐=-direction, we assume
ϕ = (ub,q˜, X, u′, q′)µ⊥x>w ∧ [art(t1, ε)]s#sq′ →∗δ q ∧ u = u′ ∪ u′′.
Hence, we obtain q1, q2 ∈ Q such that
q1#sq2 →δ q ∧ art(t1, ε)→∗δ q1 ∧ s#sq′ →∗δ q2.
Due to well-nestedness, we have t1 ∈ Tnr, and applying Equivalence (2) of the
induction hypothesis on t1 yields w ∈ Γ∗ and c′1 ∈ L(ACF) with
(p⊥, ∅, q1)w t1=⇒ c′1 ∧ w ∼ ε.
Applying Equivalence (3) of the induction hypothesis on t2, we obtain c′2 ∈ L(ACF)
such that
(ub,q˜, X, u, q2)µ⊥x>w t2=⇒ c′2[ϕ].
With the (u-spawn)-rule, we get (ub,q˜, X, u, q)µ⊥x>w a(t1)t2=⇒ c′1c′2[ϕ], and due to
(*2) we have c′1c′2 ∈ L(ACF), which completes the case.
In the case ts = 〈yt1, we have ars(ts, X) = ({y} \X ∪ u′′, s) where ars(t1, X) =
(u′′, s). Here, we need the generalization that allowed arbitrary lockstacks µ
above the outermost lock of the usage: By definition of ⇀, we have yµ t1⇀ ε.
This allows us to apply the induction hypothesis on t1. The following calculation
completes the case:
∃c′ ∈ L(ACF). (ub,q˜, X, u, q)µ⊥x>w 〈yt1=⇒ c′[ϕ]
⇐⇒ ∃u1 ⊆ X , c′ ∈ L(ACF). (ub,q˜, X, u1, q)(yµ)⊥x>w t1=⇒ c′[ϕ] ∧ u = {y} \X ∪ u1 (3.1)
⇐⇒ ∃u′, u′′ ⊆ X , q′ ∈ Q. ϕ = (ub,q˜, X, u′, q′)x>w ∧ s#sq′ →∗δ q
∧ u1 = u′ ∪ u′′ ∧ u = u′ ∪ ({y} \X ∪ u′′) (3.2)
Here, Equivalence (3.1) is due to the fact that the only rule that matches the
acquisition-node is (u-acq), and Equivalence (3.2) is due to the induction hy-
pothesis.
Finally, in the case ts = 〉yt1 we have ars(〉yt1, X) = ars(t1, X), and the proof
is done by straightforward application of the induction hypothesis, using that
(u-rel) is the only rule to accept the 〉y-node.
Size Estimation For the size estimation, we treat the components of DA sep-
arately. We set
DA = (ACI ,ACF ,M) with M = (P,Γ,Γ⊥,Act,X ,∆, locks).
For the control-states and stack-symbols, we have:
|P| = 2|Q|2^{|X|} + 2|Q|^2 2^{2|X|} = poly(|A|)2^{O(|X|)}
|Γ| = 2|X| + 1 = O(|X|)
The automaton ACI is defined over the alphabet P ∪ Γ, and we have
|P ∪ Γ| = poly(|A|)2^{O(|X|)}.
Moreover, ACI has
|Q| + 2^{|X|}|Q| = poly(|A|)2^{O(|X|)}
states, hence its size is
|ACI| = poly(poly(|A|)2^{O(|X|)} poly(|A|)2^{O(|X|)}) = poly(|A|)2^{O(|X|)}.
The automaton ACF is also defined over the alphabet P ∪ Γ, and has one state.
Hence, its size is
|ACF| = poly(poly(|A|)2^{O(|X|)}) = poly(|A|)2^{O(|X|)}.
Finally, the size of the Monitor-DPN M is
|M| = poly(|P||Γ||Act|) = poly(|A||Act|)2^{O(|X|)},
and together we get
|DA| = |ACI| + |ACF| + |M|
= poly(|A|)2^{O(|X|)} + poly(|A|)2^{O(|X|)} + poly(|A||Act|)2^{O(|X|)}
= poly(|A||Act|)2^{O(|X|)}.
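To get a feeling for the growth behind these bounds, the exact counting formulas for the control-states, the stack-symbols, and the states of ACI can be evaluated for small parameters. The following is only an illustrative sketch: the function name component_sizes and its parameter names are hypothetical, and the poly(...) factors of the size estimation are not modelled.

```python
def component_sizes(num_states: int, num_locks: int) -> dict:
    """Evaluate the exact counting formulas from the size estimation.

    num_states stands for |Q| (states of the tree automaton A) and
    num_locks for |X| (number of locks). Both names are assumptions
    made for this sketch.
    """
    q, x = num_states, num_locks
    return {
        # |P| = 2|Q|2^|X| + 2|Q|^2 2^(2|X|)
        "control_states": 2 * q * 2**x + 2 * q**2 * 2 ** (2 * x),
        # |Gamma| = 2|X| + 1
        "stack_symbols": 2 * x + 1,
        # number of states of A_CI: |Q| + 2^|X| |Q|
        "aci_states": q + 2**x * q,
    }


if __name__ == "__main__":
    # The number of control-states grows exponentially in the number of
    # locks, but only polynomially in |Q|.
    for locks in range(1, 5):
        print(locks, component_sizes(num_states=3, num_locks=locks))
```

Doubling |Q| at a fixed number of locks multiplies the dominant term by four, whereas each additional lock multiplies it by four regardless of |Q|.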
Combining Corollary 4.21, which states that the schedulable lock-execution-
hedges are characterized by ar−1(L(ACons)), and Theorem 5.3, which states that
L(DA) = ar−1(L(A)) for a tree automaton A, we get the main result of this
chapter:
Theorem 5.4. The DPN-Acceptor DACons is well-defined and accepts exactly the
schedulable lock-execution-hedges. Its size is polynomial in the number of base-
actions, and exponential in the number of locks:
h ∈ L(DACons) ⇐⇒ schedls(h) ≠ ∅ and |DACons| = poly(|Act|)2^{O(|X|^2)}.
Proof. From the definition of ACons (Definition 4.19), it is straightforward to show
that ACons does not depend on empty use-nodes. Thus, DACons is well-defined, and
the first part of the proposition follows from Corollary 4.21 and Theorem 5.3.
The size estimation follows from the size estimation for ACons (cf. Lemma 4.20)
and that for DA (cf. Theorem 5.3).
5.2 Cross-Product Construction
In the last section, we have constructed the DPN-Acceptor DACons that accepts
exactly the schedulable lock-execution-hedges. In this section, we combine DACons
with the Monitor-DPN to be analyzed. We do a general construction for ar-
bitrary DPN-Acceptors: Given a Monitor-DPN M1 and a DPN-Acceptor D2,
we construct a lock-insensitive cross-product DPN M× and automata ACI× and
ACF× , such that the execution-hedges of M× between L(ACI×) and L(ACF×) match
exactly the lock-execution-hedges of M1 that are accepted by D2. Note that
(ACI× ,ACF× ,M×) can intuitively be seen as a DPN-Acceptor; however, it does not
satisfy the additional restrictions that we imposed on DPN-Acceptors (i.e., that
all stack-symbols are bound to locks and that non-release-rules are independent
from the stack).
In the remainder of this section, we prove the following theorem:
Theorem 5.5. Let M1 be a Monitor-DPN and D2 be a DPN-Acceptor with
M1 = (P1,Γ1,Γ⊥,1,Act,X ,∆1, locks1),
D2 = (ACI ,ACF ,M2), and
M2 = (P2,Γ2,Γ⊥,2,Act,X ,∆2, locks2).
Then, a (lock-insensitive) DPN M× = (P×,Γ×,ActX ,∆×), automata ACI× and
ACF×, and a projection pi1 : (P× ∪ Γ×)∗ → (P1 ∪ Γ1)∗ can be constructed, such
that, for all execution-hedges h ∈ H and sequences c, c′ ∈ (P1 ∪ Γ1)∗, we have
c ∈ Conf ls1 ∩ valid1 ∧ c h=⇒1 c′ ∧ h×ls(c) ∈ L(D2)
⇐⇒ ∃c× ∈ L(ACI×), c′× ∈ L(ACF×). c× h=⇒× c′× ∧ pi1(c×) = c ∧ pi1(c′×) = c′.
The automaton ACI× can be constructed in time poly(|M1||D2|)2O(|X |), and ACF×
and M× can be constructed in time poly(|M1||D2|). Their sizes are
|ACI×| = poly(|M1||D2|)2O(|X |) and |ACF×|, |M×| = poly(|M1||D2|).
Note that we sometimes write ×, 1, 2 instead of M×, M1, M2, e.g. =⇒× instead
of =⇒M× .
The construction exploits that the stacks of the Monitor-DPN M1 and the
DPN-Acceptor’s DPN M2 are almost synchronized: On acquisition-actions, both
DPNs push a stack-symbol, on release-actions, both DPNs pop a stack-symbol,
and on spawn-actions, both DPNs preserve the height of the stack. Only on
base-actions, M1 may push or pop a stack-symbol, or preserve the height of the
stack, while M2 always preserves the height of the stack.
We use a cross-product construction, where the control-states of the new DPN
are pairs of control-states from P1 and P2. The stack-symbols are either pairs of
stack-symbols from Γ1 and Γ2, or just symbols from Γ1. The pairs are pushed on
acquisition-steps, the single symbols are pushed on base-steps. As non-release-
steps of M2 do not depend on the stack (by definition of DPN-Acceptors), the
cross-product DPN does not need the topmost stack-symbol of M2’s stack in
order to perform non-release-steps.
To account for the independence of the stack, we use the following abbreviations
for rules from ∆2:
p2 o↪→ p′2w2 ∈ ∆2 := ∃γ2. p2γ2 o↪→ p′2w2γ2 ∈ ∆2 for w2 ∈ Γ∗2
p2 o↪→ pˆ2γˆ2#p′2w2 ∈ ∆2 := ∃γ2. p2γ2 o↪→ pˆ2γˆ2#p′2w2γ2 ∈ ∆2 for w2 ∈ Γ∗2
As sketched above, the control-states of the new DPN are P× := P1 × P2, and
the stack-symbols are
Γ× := {γ1 ∈ Γ1 \ Γ⊥,1 | locks(γ1) = ∅}
∪ {(γ1, γ2) ∈ Γ1×Γ2 | locks(γ1) = locks(γ2) ≠ ∅}
∪ Γ⊥,1×Γ⊥,2.
The projections pi1 and pi2 are defined by
pi1(p1, p2) = p1 pi1(γ1) = γ1 pi1(γ1, γ2) = γ1
pi2(p1, p2) = p2 pi2(γ1) = ε pi2(γ1, γ2) = γ2
for p1 ∈ P1, p2 ∈ P2, γ1 ∈ Γ1, and γ2 ∈ Γ2. Note that pi1 and pi2 are homomor-
phisms. They are extended to sequences and sets of sequences in the natural way.
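The definition of Γ× and of the two projections can be sketched in Python. The encoding is an assumption made purely for illustration: unpaired symbols are strings, paired symbols and paired control-states are tuples, the locks-functions are dictionaries mapping each stack-symbol to a frozenset of locks, and ε is represented by omitting the symbol from the projected list. All function and parameter names are hypothetical.

```python
from itertools import product


def cross_symbols(gamma1, bottom1, locks1, gamma2, bottom2, locks2):
    """Stack-symbols of the cross-product DPN, following the three-part union."""
    # Non-bottom symbols of M1 without a lock stay unpaired.
    singles = {g1 for g1 in gamma1 - bottom1 if not locks1[g1]}
    # Symbols bound to the same non-empty set of locks are paired.
    pairs = {
        (g1, g2)
        for g1, g2 in product(gamma1, gamma2)
        if locks1[g1] and locks1[g1] == locks2[g2]
    }
    # Bottom symbols are always paired.
    bottoms = set(product(bottom1, bottom2))
    return singles | pairs | bottoms


def pi1(word):
    """Projection to the first component, extended to sequences."""
    return [s[0] if isinstance(s, tuple) else s for s in word]


def pi2(word):
    """Projection to the second component; unpaired symbols map to epsilon."""
    return [s[1] for s in word if isinstance(s, tuple)]


if __name__ == "__main__":
    none, x = frozenset(), frozenset({"x"})
    symbols = cross_symbols(
        {"a", "b", "bot1"}, {"bot1"}, {"a": none, "b": x, "bot1": none},
        {"c", "bot2"}, {"bot2"}, {"c": x, "bot2": none},
    )
    print(sorted(symbols, key=str))
    print(pi1([("p", "q"), "a", ("b", "c"), ("bot1", "bot2")]))
    print(pi2([("p", "q"), "a", ("b", "c"), ("bot1", "bot2")]))
```

Because both projections act symbol by symbol, they are homomorphisms on sequences by construction.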




(p1, p2)γ× 2a↪→ (p′1, p′2)γ′× ∈ ∆× (base)
if p1pi1(γ×) 2a↪→ p′1pi1(γ′×) ∈ ∆1 ∧ p2 2a↪→ p′2 ∈ ∆2 ∧ pi2(γ×) = pi2(γ′×)
(p1, p2)γ× 2a↪→ (pˆ1, pˆ2)(γˆ1, γˆ2)#(p′1, p′2)γ′× ∈ ∆× (spawn)
if p1pi1(γ×) 2a↪→ pˆ1γˆ1#p′1pi1(γ′×) ∈ ∆1 ∧ p2 2a↪→ pˆ2γˆ2#p′2 ∈ ∆2 ∧ pi2(γ×) = pi2(γ′×)
(p1, p2)γ× 2a↪→ (p′1, p′2)γˆ1γ′× ∈ ∆× (push)
if p1pi1(γ×) 2a↪→ p′1γˆ1pi1(γ′×) ∈ ∆1 ∧ p2 2a↪→ p′2 ∈ ∆2 ∧ pi2(γ×) = pi2(γ′×)
(p1, p2)γ1 2a↪→ (p′1, p′2) ∈ ∆× (pop)
if p1γ1 2a↪→ p′1 ∈ ∆1 ∧ p2 2a↪→ p′2 ∈ ∆2
(p1, p2)γ× 〈x↪→ (p′1, p′2)(γˆ1, γˆ2)γ′× ∈ ∆× (acq)
if p1pi1(γ×) 〈x↪→ p′1γˆ1pi1(γ′×) ∈ ∆1 ∧ p2 〈x↪→ p′2γˆ2 ∈ ∆2 ∧ pi2(γ×) = pi2(γ′×)
(p1, p2)(γ1, γ2) 〉x↪→ (p′1, p′2) ∈ ∆× (rel)
if p1γ1 〉x↪→ p′1 ∈ ∆1 ∧ p2γ2 〉x↪→ p′2 ∈ ∆2
where p1, pˆ1, p′1 ∈ P1, p2, pˆ2, p′2 ∈ P2, γ1, γˆ1 ∈ Γ1, γ2, γˆ2 ∈ Γ2, γ×, γ′× ∈ Γ×, x ∈ X ,
and a ∈ Act. This obviously defines a DPN, i.e., the sets P× and Γ× are finite.
Analogously to the notion of valid stacks for Monitor-DPNs, we define the
notion of valid stacks for the cross-product DPN:
Γ⊥,× := Γ⊥,1 × Γ⊥,2 valid× := (Γ× \ Γ⊥,×)∗Γ⊥,×.
We extend valid× to configurations in the obvious way. It is straightforward to
show that validity is preserved by steps of M×, and that valid configurations of
M× are projected to valid configurations of M1 and M2, respectively:
c× ∈ valid× ∧ c× −→× c′× =⇒ c′× ∈ valid×
c× ∈ valid× =⇒ pi1(c×) ∈ valid1 ∧ pi2(c×) ∈ valid2
(*1)
The automaton ACI× checks that the configuration is valid, that the encoded
lockstacks are consistent, and that the projection to the second component is an
initial configuration of D2, i.e., it is accepted by ACI . The automaton ACF× checks
that the second component is a final configuration of D2, i.e., it is accepted by
ACF . We define
ACI× := Avalid× ∩ pi−11 (AConf ls1) ∩ pi−12 (ACI) and ACF× := pi−12 (ACF),
where the automaton Avalid× accepts the valid configurations of M× and the au-
tomaton AConf ls1 accepts the consistent configurations of M1.
The automaton Avalid× is obtained straightforwardly from the definition of




We obviously have L(Avalid×) = valid× ∩ Conf×.
The automaton AConf ls1 is defined as
AConf ls1 := (P1 ∪ Γ1, 2^X × 2^X, {(∅, ∅)}, (2^X, 2^X), δls),
where the transition relation δls is the least solution of the following constraints:
(X, Y) p−→δls (∅, X ∪ Y) for p ∈ P1 (ctrl)
(X, Y) γ1−→δls (locks1(γ1) ∪ X, Y) for γ1 ∈ Γ1 if locks1(γ1) ∩ Y = ∅ (stack)
A state (X, Y) of the automaton stores the set X of locks on the current lockstack,
and the set Y of locks already seen on other stacks. When reading a control-
symbol, the (ctrl)-rule joins the two sets. When reading a stack-symbol, the
(stack)-rule updates the current lockset. It is only applicable if the lock associated
to the stack-symbol is not in the set of locks already seen on other lockstacks. It
is straightforward to show that






L(ACI×) = valid× ∩ pi−11 (Conf ls1 ) ∩ pi−12 (L(ACI)) and L(ACF×) = pi−12 (L(ACF)).
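The automaton AConf ls1 defined above can be simulated directly, because it is deterministic and every state is final: a word is rejected exactly when the (stack)-rule is not applicable. The following Python sketch is illustrative only. The function name consistent, the list encoding of configurations, and the dictionary encoding of locks1 are assumptions made for this example.

```python
def consistent(word, control_states, locks1):
    """Simulate the (ctrl)- and (stack)-rules on a configuration word.

    word is a list of control-symbols and stack-symbols, control_states is
    the set P1, and locks1 maps each stack-symbol to a frozenset of locks.
    """
    current, seen = frozenset(), frozenset()  # initial state (∅, ∅)
    for symbol in word:
        if symbol in control_states:
            # (ctrl): a new thread starts; join the two sets.
            current, seen = frozenset(), current | seen
        else:
            # (stack): only applicable if the lock was not seen on
            # another lockstack.
            if locks1[symbol] & seen:
                return False
            current = current | locks1[symbol]
    return True  # every state of the automaton is final


if __name__ == "__main__":
    locks = {
        "a": frozenset({"x"}),
        "b": frozenset({"y"}),
        "c": frozenset({"x"}),
        "bot": frozenset(),
    }
    print(consistent(["p", "a", "bot", "q", "b", "bot"], {"p", "q"}, locks))
    print(consistent(["p", "a", "bot", "q", "c", "bot"], {"p", "q"}, locks))
```

Note that a lock may occur several times on the same lockstack (reentrance); only a lock that is also held by a different thread leads to rejection.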
Given valid configurations c1 ∈ valid1 and c2 ∈ valid2 with the same lockstacks,
it is straightforward to construct a valid configuration c× ∈ valid×, whose projec-
tions are the given configurations. Vice versa, it is obvious that the projections
of a valid configuration c× ∈ valid× have the same lockstacks:
c1 ∈ valid1 ∧ c2 ∈ valid2 ∧ ls(c1) = ls(c2)
⇐⇒ ∃c× ∈ valid×. pi1(c×) = c1 ∧ pi2(c×) = c2
(*2)
Example 5.6. We give an example of how to construct a configuration c× out of two
given configurations c1 ∈ valid1 and c2 ∈ valid2. So assume we have c1 = pγ1 . . . γ8
and c2 = pˆγˆ1 . . . γˆ5 with γ8 ∈ Γ⊥,1 and γˆ5 ∈ Γ⊥,2. The locks-functions are:
locks1: γ1 ↦ x1, γ2 ↦ x2, γ3 ↦ ∅, γ4 ↦ x3, γ5 ↦ ∅, γ6 ↦ x4, γ7, γ8 ↦ ∅
locks2: γˆ1 ↦ x1, γˆ2 ↦ x2, γˆ3 ↦ x3, γˆ4 ↦ x4, γˆ5 ↦ ∅
Then, both configurations are valid and have the same lockstacks. The corre-
sponding configuration c× is
c× = (p, pˆ)(γ1, γˆ1)(γ2, γˆ2)γ3(γ4, γˆ3)γ5(γ6, γˆ4)γ7(γ8, γˆ5).
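The construction used in this example can be sketched in Python for single-thread configurations. The sketch assumes, without checking, that both configurations are valid and have the same lockstacks. The names combine, g1 to g8 (for γ1 to γ8), and h1 to h5 (for γˆ1 to γˆ5) are hypothetical and chosen only for this illustration.

```python
def combine(c1, c2, locks1):
    """Merge two single-thread configurations into a cross-product one.

    c1 and c2 are lists [control-state, stack-symbol, ...]; locks1 maps the
    stack-symbols of c1 to their (possibly empty) set of locks.
    """
    (p1, *stack1), (p2, *stack2) = c1, c2
    remaining2 = iter(stack2)
    result = [(p1, p2)]
    for index, g1 in enumerate(stack1):
        is_bottom = index == len(stack1) - 1
        if locks1[g1] or is_bottom:
            # Lock-bound symbols and the bottom symbol are paired with the
            # next symbol of the second stack.
            result.append((g1, next(remaining2)))
        else:
            # Symbols pushed by base-steps of M1 stay unpaired.
            result.append(g1)
    return result


locks1 = {
    "g1": {"x1"}, "g2": {"x2"}, "g3": set(), "g4": {"x3"},
    "g5": set(), "g6": {"x4"}, "g7": set(), "g8": set(),
}
c1 = ["p"] + [f"g{i}" for i in range(1, 9)]
c2 = ["q"] + [f"h{i}" for i in range(1, 6)]
c_cross = combine(c1, c2, locks1)

if __name__ == "__main__":
    print(c_cross)
```

Projecting c_cross back to its first components, or to the second components of its pairs, recovers c1 and c2, respectively.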
Finally, we show the following relation between the two component DPNs
M1,M2 and the cross-product DPN M×:
∀c×, c′× ∈ valid×. pi1(c×) h=⇒1 pi1(c′×) ∧ pi2(c×) h=⇒2 pi2(c′×) ⇐⇒ c× h=⇒× c′×. (*3)
Proof. With Theorem 3.8, the proposition is shown by straightforward induc-
tion, using the following statement over single lock-actions o ∈ ActX , thread-
configurations (p1, p2)w×, (p′1, p′2)w′× ∈ valid×, and configurations c′× ∈ valid×:
p1pi1(w×) o−→1 pi1(c′×)p′1pi1(w′×) ∧ p2pi2(w×) o−→2 pi2(c′×)p′2pi2(w′×)
⇐⇒ (p1, p2)w× o−→× c′×(p′1, p′2)w′×.
For the =⇒-direction, we assume
p1pi1(w×) o−→1 pi1(c′×)p′1pi1(w′×) ∧ p2pi2(w×) o−→2 pi2(c′×)p′2pi2(w′×).
We make a case distinction over the step o−→1. First, consider the case of a base-
step. We have o = 2a and c′× = ε. As pi1 projects each symbol to exactly one
symbol, we obtain γ× ∈ Γ× and v×, r× ∈ Γ∗× with w× = γ×r×, w′× = v×r×,
and p1pi1(γ×) 2a↪→ p′1pi1(v×) ∈ ∆1. As non-release-rules of M2 are independent
of the stack, we have p2 2a↪→ p′2 and pi2(w×) = pi2(w′×). We proceed with a case
distinction over v×, i.e., we distinguish whether the step of M1 was a base-, push-
or pop-step. In case of a base-step (i.e., v× = [γ′×] for some γ′× ∈ Γ×), we get
pi2(γ×) = pi2(γ′×). With the (base)-constraint, we get
(p1, p2)γ× 2a↪→ (p′1, p′2)γ′× ∈ ∆×,
and thus (p1, p2)w× 2a−→× (p′1, p′2)w′×.
In case of a push-step (i.e., v× = [γˆ×, γ′×] for some γˆ×, γ′× ∈ Γ×), we have
locks(pi1(γˆ×)) = ∅ and pi1(γˆ×) /∈ Γ⊥,1. Thus, we have γˆ× ∈ Γ1 which implies




↪→ (p′1, p′2)γˆ×γ′× ∈ ∆×,
and thus we have (p1, p2)w× 2a−→× (p′1, p′2)w′×.
In case of a pop-step (i.e., v× = ε), we have locks(pi1(γ×)) = ∅ and γ× /∈ Γ⊥,×.
Thus, we have γ× ∈ Γ1, and the (pop)-constraint yields
(p1, p2)γ× 2a↪→ (p′1, p′2) ∈ ∆×,
and thus we have (p1, p2)w× 2a−→× (p′1, p′2)w′×.
We continue the case distinction over the step o−→1: Consider the case of a
spawn-step. We have o = 2a and c′× = (pˆ1, pˆ2)(γˆ1, γˆ2) with (γˆ1, γˆ2) ∈ Γ⊥,×.
Analogously to the previous case, we obtain γ×, γ′× ∈ Γ× and r× ∈ Γ∗× with
pi2(γ×) = pi2(γ′×) and





↪→ pˆ1γˆ1]p′1pi1(γ′×) ∈ ∆1 p2
2a




↪→ (pˆ1, pˆ2)(γˆ1, γˆ2)](p′1, p′2)γ′× ∈ ∆×,
and thus we have (p1, p2)w× 2a−→× c′×[(p′1, p′2)w′×].
In case of an acquisition-step, we have o = 〈x and c′× = ε. Analogously to
the previous cases, we obtain γ×, (γˆ1, γˆ2), γ′× ∈ Γ× and r× ∈ Γ∗× with locks(γˆ1) =
locks(γˆ2) = {x}, pi2(γ×) = pi2(γ′×), and





↪→ p′1γˆ1pi1(γ′×) ∈ ∆1 ∧ p2 〈x↪→ p′2γˆ2 ∈ ∆2.
With the (acq)-constraint, we get
(p1, p2)γ× 〈x↪→ (p′1, p′2)γˆ×γ′× ∈ ∆×,
and thus (p1, p2)w× 〈x−→× (p′1, p′2)w′×.
In case of a release-step, we have o = 〉x and c′× = ε. We obtain (γ1, γ2) ∈ Γ×
with locks(γ1) = locks(γ2) = {x}, w× = (γ1, γ2)w′×, and
p1γ1 〉x↪→ p′1 ∈ ∆1 ∧ p2γ2 〉x↪→ p′2.
With the (rel)-constraint, we get
(p1, p2)γ× 〉x↪→ (p′1, p′2) ∈ ∆×,
and thus (p1, p2)w× 〉x−→× (p′1, p′2)w′×.
The proof of the ⇐=-direction is done symmetrically.
Now we are prepared to prove Theorem 5.5:
Proof of Theorem 5.5. The DPN M×, the automata ACI× and ACF×, and the
projection pi1 are constructed as described above.
For the =⇒-direction, we assume
c1 ∈ Conf ls1 ∩ valid1 ∧ c1 h=⇒1 c′1 ∧ h×ls(c1) ∈ L(D2).
By unfolding the definition of L(D2), we obtain c2 ∈ L(ACI) and c′2 ∈ L(ACF) with
c2 h=⇒2 c′2 ∧ ls(c1) = ls(c2).
As D2 is a DPN-Acceptor, we have L(ACI) ⊆ valid2, and thus c2 ∈ valid2. As
validity is preserved by executions, we also have c′1 ∈ valid1 and c′2 ∈ valid2.
As the lockstacks of a final configuration only depend on the lockstacks of the
initial configuration and on the execution-hedge, which are the same for both
executions, we have ls(c′1) = ls(c′2). With (*2), we construct valid configurations
c×, c′× ∈ valid× with
c1 = pi1(c×) ∧ c′1 = pi1(c′×) ∧ c2 = pi2(c×) ∧ c′2 = pi2(c′×).
With (*3) we get c× h=⇒× c′×. From c1 ∈ Conf ls1 ∩ valid1, we get c× ∈ pi−11 (Conf ls1 ),
and from c2 ∈ L(ACI), we get c× ∈ L(pi−12 (ACI)). Analogously, we get c′× ∈
L(pi−12 (ACF)). Due to c× ∈ valid×, we have c× ∈ L(Avalid×), and by unfolding the
definitions of ACI× and ACF×, we get c× ∈ L(ACI×) and c′× ∈ L(ACF×), which
completes the case.
For the ⇐=-direction, we assume
c× ∈ L(ACI×) ∧ c′× ∈ L(ACF×) ∧ c× h=⇒× c′×.
By definition of ACI× , we have c× ∈ valid×, pi1(c×) ∈ Conf ls1 , and pi2(c×) ∈ L(ACI).
By (*1) we also have c′× ∈ valid×. Hence, by (*3), we get
pi1(c×) h=⇒1 pi1(c′×) ∧ pi2(c×) h=⇒2 pi2(c′×).
With (*2), we have pi1(c×) ∈ valid1, pi2(c×) ∈ valid2, and ls(pi1(c×)) = ls(pi2(c×)).
By definition of ACF× , we also have pi2(c′×) ∈ L(ACF). Together, by folding the
definition of L(D2), we get h×ls(pi1(c×)) ∈ L(D2), which completes the case.
5.3 Summary and Related Work
In this chapter, we have mapped tree automata on lock-a/r-hedges to DPN-
Acceptors on lock-execution-hedges (Theorem 5.3). Using this construction, we
obtained the DPN-Acceptor DACons that accepts exactly the schedulable lock-
execution-hedges (Theorem 5.4). Then, we constructed a cross-product between
a Monitor-DPN and a DPN-Acceptor (Theorem 5.5). These results are used
in the next chapter to reduce lock-sensitive to lock-insensitive predecessor set
computation.
In [79], we construct a cross-product of a DPN and a hedge automaton. A hedge
automaton in [79] is the stackless analogue of a DPN-Acceptor: A hedge automaton
consists of an initial automaton and tree automata rules. An execution-hedge
is accepted if its trees are accepted by a sequence of states that is accepted by
the initial automaton. Note that we do not need a final automaton in [79]. The
cross-product construction is simpler than the one presented here, as only one
stack is involved. The more complicated construction in this thesis is required
because the set of schedulable lock-execution-hedges is not regular for reentrant
monitors, while it is regular for the well-nested, non-reentrant locks considered
in [79].
The cross-product construction relies on the fact that the stacks of the Monitor-
DPN and the DPN-Acceptor are updated nearly synchronously: On acquisition-
operations, a stack-symbol is pushed on both stacks, and on release-operations,
a stack-symbol is popped from both stacks. Alur and Madhusudan [2] introduce
the concept of visibly pushdown automata, which they further develop in [1, 3, 4].
Intuitively, in a visibly pushdown automaton, the labels of the rules “make visible”
the effect of the rule on the stack by indicating whether the rule pushes or pops
a stack-symbol, or does not alter the height of the stack. Like regular languages,
visibly pushdown languages are closed under union, intersection, concatenation,
Kleene-∗, and complement [2]. Our notion of Monitor-DPNs and DPN-Acceptors
can be seen as a generalization of visibly pushdown automata to DPNs: For DPN-
Acceptors, rules that push stack-symbols are labeled by acquisition-operations,
rules that pop stack-symbols are labeled by release-operations, and rules that do
not alter the height of the stack are labeled by base-operations. Thus, their effect
on the stack is fully visible in the execution-tree. However, for Monitor-DPNs,
push- and pop-rules are labeled by base-actions. Thus, their effect on the stack is
not visible. For this reason, we further restricted DPN-Acceptors such that non-
release-rules are independent from the stack. This restriction is strong enough
that the cross-product construction works, and weak enough that reentrant mon-
itors can still be modeled. It remains future research to explore whether the
theory of visibly pushdown languages yields a more elegant alternative to our
cross-product construction.
Kidd et al. [60] have taken a similar approach to handle reentrant monitors.
Roughly speaking, they model programs with reentrant monitors as communi-
cating pushdown systems [14], where they have one PDS per lock, and one PDS
per thread. Using intersection of visibly pushdown automata, they replace the
pushdown systems for locks by regular languages. This accelerates their model-
checking procedure, as the number of PDSs is reduced. Moreover, it increases
precision, as PDSs are over-approximated by their model-checker, while regular






In the last chapter, we have characterized schedulable lock-execution-hedges
by the DPN-Acceptor DACons , and combined the Monitor-DPN to be analyzed
with DACons , yielding a cross-product DPN, whose lock-insensitive executions cor-
respond to lock-sensitive executions of the Monitor-DPN. In this chapter, we
use these results to reduce lock-sensitive predecessor set computation to lock-
insensitive predecessor set computation on the cross-product DPN. The latter is
then performed by the algorithm of Bouajjani et al. [16].
This chapter is organized as follows: In Section 6.1, we define predecessor and
immediate predecessor sets. In Section 6.2, we recall the algorithm for lock-
insensitive predecessor set computation of Bouajjani et al. [16], which we use in
Section 6.3 to construct an algorithm for lock-sensitive predecessor set computa-
tion. In Section 6.4, we discuss applications of our algorithm to various analysis
problems. Finally, in Section 6.5, we summarize the results of this chapter and
discuss related work.
6.1 Definitions
In this section we define the predecessor and immediate predecessor set of a set
of configurations. Intuitively, the predecessor set of C is the set of configurations
from which a configuration in C can be reached. Analogously, the immediate
predecessor set of C is the set of configurations from which a configuration in C
can be reached in exactly one step.
Definition 6.1 (Predecessor Sets). Given a DPN M and a set of configurations
C ⊆ Conf, we define:
preM(C) := {c ∈ Conf | ∃c′ ∈ C. c −→M c′}
pre∗M(C) := {c ∈ Conf | ∃c′ ∈ C. c −→∗M c′}
When clear from the context, we omit the index M and write pre(C) and
pre∗(C).
The set pre(C) is called the immediate predecessor set of C, the set pre∗(C) is
called the predecessor set of C.
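For intuition, both sets can be computed explicitly when the step relation is finite. The Python sketch below makes this simplifying assumption, which does not hold for DPNs in general: their configuration sets are infinite and are therefore represented by automata. The names pre, pre_star, step, and targets are chosen for this illustration only.

```python
def pre(step, targets):
    """Immediate predecessors: configurations with one step into targets."""
    return {c for (c, c_next) in step if c_next in targets}


def pre_star(step, targets):
    """Predecessors: configurations that reach targets in zero or more steps."""
    predecessors = {}
    for c, c_next in step:
        predecessors.setdefault(c_next, set()).add(c)
    result = set(targets)
    worklist = list(targets)
    while worklist:
        c_next = worklist.pop()
        for c in predecessors.get(c_next, ()):
            if c not in result:
                result.add(c)
                worklist.append(c)
    return result


if __name__ == "__main__":
    # A small explicit step relation over abstract configurations 1..6.
    step = {(1, 2), (2, 3), (4, 3), (5, 6)}
    print(pre(step, {3}))
    print(pre_star(step, {3}))
```

As in the definition, pre_star always contains the target set itself, because every configuration reaches itself in zero steps.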
6 Lock-Sensitive Predecessor Sets
For Monitor-DPNs, we make a similar definition:
Definition 6.2 (Lock-Sensitive Predecessor Sets). Given a Monitor-DPN M and
a set of consistent and valid configurations C ⊆ Conf ls ∩ valid, we define:
prels,M(C) := {c ∈ Conf ls ∩ valid | ∃c′ ∈ C. c −→ls,M c′}
pre∗ls,M(C) := {c ∈ Conf ls ∩ valid | ∃c′ ∈ C. c −→∗ls,M c′}
Again, we omit the index M when the DPN is clear from the context.
The set prels(C) is called the lock-sensitive immediate predecessor set of C, the
set pre∗ls(C) is called the lock-sensitive predecessor set of C.
6.2 Lock-Insensitive Predecessor Set
Computation
In the last section, we have defined predecessor and immediate predecessor sets.
In this section, we sketch the lock-insensitive predecessor set computation by
Bouajjani et al. [16], slightly adapted to our notation.
The predecessor set computation takes as input a DPN M = (P,Γ,∆,Act) with
P ∩ Γ = ∅, and a finite automaton A = (P ∪ Γ, Q, I, F, δ). The first step is to
convert this automaton into an M-automaton. This is done for technical reasons,
as it makes the description of the predecessor set computation much simpler.
Definition 6.3. An automaton A = (P ∪Γ, Q, I, F, δ) with epsilon-transitions is
an M-automaton, iff
1. The states Q can be partitioned into the sets Q = Qc ∪̇ Qs,
2. for every q ∈ Qc and p ∈ P , there is a unique and distinguished state
qp ∈ Qs and a transition q p−→δ qp,
3. any other transition in δ is either of the form
q γ−→δ q′ for q ∈ Qs, γ ∈ Γ, and q′ ∈ Qs \ {qp | q ∈ Qc, p ∈ P}, or
q ε−→δ q′ for q ∈ Qs and q′ ∈ Qc, and
4. the initial states are in Qc, i.e., I ⊆ Qc.
For any automaton A, an M-automaton A′ with L(A′) = L(A) ∩ Conf can be
constructed [16]. In [16], no procedure for this construction is specified. We
sketch a possible construction here, with the goal of estimating the size of the
resulting M-automaton.
1. First, we intersect the automaton with an automaton for valid configurations.
As such an automaton has two states, this doubles the number of
states. Then, we remove all states from which no final state is reachable. This
ensures that initial states have no outgoing Γ-transitions, nor is there an
ε-path from an initial state to a state with an outgoing Γ-transition.
2. Next, we copy each state q of the automaton to obtain two copies q1 and q2.
Both copies get the same incoming edges as q, but q1 only gets outgoing ε-
transitions and Γ-transitions, and q2 only gets outgoing P -transitions. The
q2-states will become the states in Qc. Then, we propagate the initial states
over ε-transitions, i.e., every state that is reachable from an initial state via
an ε-path becomes initial, too. Finally, we delete all initial states that are
not q2-states. As the automaton only accepts valid configurations, this does
not change the language of the automaton.
3. Then, for each q2-state, we create a state q′2 and a transition $q_2' \xrightarrow{\varepsilon} q_2$, and
redirect all incoming transitions of q2 to q′2. This step ensures that states
from Qc have only incoming ε-transitions.
4. Next, by copying states, we ensure that each P -transition is the only in-
coming transition of its target-state. Then, we merge the target-states of
P -transitions from the same source state that are labeled with the same
control-symbol. This ensures that we can identify unique and distinguished
qp-states for each state with outgoing P -transitions.
5. Finally, we add missing qp-states and p-transitions.
The first step restricts the language of A to valid configurations. The other steps
do not change the language of the automaton. The first three steps blow up the
number of states of the automaton only by a constant factor, the last two steps
blow up the number of states by a factor of at most |P|. Thus, a conservative
estimate for the size |A′| of the resulting M-automaton is |A′| = |A| · poly(|M|).
In the following, we assume that A is an M-automaton. We define the automa-
ton pre∗M(A) := (Q, I, F, δ′), where δ′ is the least relation that contains δ and
satisfies the following conditions:
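$q_p \xrightarrow{\gamma}_{\delta'} q'$ if $p\gamma \overset{o}{\hookrightarrow} p_1 w_1 \in \Delta$ and $q \xrightarrow{p_1 w_1}{}^{*}_{\delta'} q'$ (R1)
$q_p \xrightarrow{\gamma}_{\delta'} q'$ if $p\gamma \overset{o}{\hookrightarrow} p_1 w_1 \sharp p_2 w_2 \in \Delta$ and $q \xrightarrow{p_1 w_1 p_2 w_2}{}^{*}_{\delta'} q'$ (R2)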
The relation δ′ can be computed by repeatedly adding transitions required by
(R1) or (R2) to δ, until both conditions are satisfied. As no new states are
added, this procedure terminates after polynomially many addition steps, and
each addition step needs polynomial time.
This procedure is called saturation, and, accordingly, (R1) and (R2) are called
saturation-rules. The intuition behind this procedure is the following: If there is
a configuration of the form $c_1 p' w r c_2$, and a rule $p\gamma \overset{o}{\hookrightarrow} p'w \in \Delta$, then the DPN
admits the transition $c_1 p\gamma r c_2 \xrightarrow{o}_M c_1 p' w r c_2$. Hence, the configuration $c_1 p\gamma r c_2$ is
a predecessor of the configuration $c_1 p' w r c_2$. This is exactly the intuition of the
saturation-rule (R1). The rule (R2) works similarly for spawn-steps.
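The saturation procedure can be sketched in a few lines of Python. This is a naive fixpoint iteration written for readability, not the efficient algorithm of [16]; the data representation (transitions as triples, `None` as ε-label, rules as triples `(p, gamma, rhs)` whose right-hand side is the word p1 w1 or p1 w1 p2 w2) and the toy DPN are assumptions of this example.

```python
EPS = None  # label of epsilon-transitions in this sketch


def eps_closure(states, delta):
    """All states reachable from `states` via epsilon-transitions only."""
    seen, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for (src, label, dst) in delta:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


def read(q, word, delta):
    """States reachable from q by reading `word`, allowing epsilon moves."""
    current = eps_closure({q}, delta)
    for symbol in word:
        step = {dst for (src, label, dst) in delta
                if src in current and label == symbol}
        current = eps_closure(step, delta)
    return current


def saturate(Qc, qp, delta, rules):
    """Add transitions until the saturation rules (R1) and (R2) hold."""
    delta = set(delta)
    changed = True
    while changed:
        changed = False
        for (p, gamma, rhs) in rules:
            for q in Qc:
                # Every state reached by the right-hand side yields q_p -gamma-> q'.
                for target in read(q, rhs, delta):
                    new = (qp[(q, p)], gamma, target)
                    if new not in delta:
                        delta.add(new)
                        changed = True
    return delta


def accepts(word, initial, final, delta):
    return any(read(q0, word, delta) & final for q0 in initial)


# Toy M-automaton for the language (p b)*, with P = {p} and Gamma = {a, b, c}.
Qc = {"c0"}
qp = {("c0", "p"): "s0"}
initial, final = {"c0"}, {"c0"}
delta = {("c0", "p", "s0"), ("s0", "b", "t"), ("t", EPS, "c0")}
# Base rule p a -> p b and spawn rule p c -> p b # p b, encoded as (p, gamma, rhs).
rules = [("p", "a", ("p", "b")), ("p", "c", ("p", "b", "p", "b"))]
saturated = saturate(Qc, qp, delta, rules)
```

In the toy example, the saturated automaton additionally accepts configurations such as p a and p c, from which a configuration of the form (p b)* can be reached. No states are added, so the loop terminates once no further transition can be inserted.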
We summarize this section by the following Theorem:
Theorem 6.4 (Predecessor Sets). For every DPN M and regular set of configurations
C ⊆ Conf, the sets pre∗M(C) and preM(C) are regular. Given an automaton
A, automata pre∗M(A) and preM(A) with
$L(\mathrm{pre}^*_M(A)) = \mathrm{pre}^*_M(L(A) \cap \mathrm{Conf})$ and $L(\mathrm{pre}_M(A)) = \mathrm{pre}_M(L(A) \cap \mathrm{Conf})$
can be effectively constructed in time poly(|M||A|).
The sizes of the automata can be estimated by
$|\mathrm{pre}^*_M(A)| = |A| \cdot \mathrm{poly}(|M|)$ and $|\mathrm{pre}_M(A)| = |A| \cdot \mathrm{poly}(|M|)$.
Proof. The correctness of the saturation procedure is proved in the appendix of
[17]. The modification of the algorithm to compute immediate predecessor sets
is described in [16, 17].
For the size estimation, observe that the number of states does not increase
during the saturation procedure. For the immediate predecessor algorithm, the
number of states is doubled.
We conclude with a short note on implementing the saturation algorithm: One
should use a representation of the M-automaton that does not explicitly store
qp-states that have no outgoing edges and are not final. In particular for DPNs
with many control-states, like the cross-product DPNs used in this thesis, this
optimization is essential.
6.3 Reduction to Lock-Insensitive Predecessor Set Computation
In Chapter 3, we have shown the correspondence between interleaving and tree-
semantics (Theorem 3.8 with Corollary 3.9 for the lock-insensitive case, and The-
orem 3.25 with Corollary 3.26 for the lock-sensitive case). In Chapter 4, we char-
acterized the schedulable lock-execution-hedges by their lock-a/r-hedges, and in
Chapter 5, we have shown how to combine this characterization with the Monitor-
DPN to be analyzed, yielding a cross-product DPN whose lock-insensitive exe-
cutions correspond to the lock-sensitive executions of the original Monitor-DPN
(Theorems 5.4 and 5.5). In the last section, we have shown how to compute
lock-insensitive predecessor sets. In this section, we apply these results to re-
duce lock-sensitive predecessor set computation on the original Monitor-DPN to
lock-insensitive predecessor set computation on the cross-product DPN.
Given a Monitor-DPN M , and configurations c, c′ ∈ Conf, we have:
c −→∗ls,M c′ ∧ c ∈ Conf ls ∩ valid
⇐⇒ ∃h ∈ H. c h=⇒M c′ ∧ schedls(h×ls(c)) ≠ ∅ ∧ c ∈ Conf ls ∩ valid (Cor. 3.26)
⇐⇒ ∃h ∈ H. c h=⇒M c′ ∧ (h×ls(c)) ∈ L(DACons) ∧ c ∈ Conf ls ∩ valid (Thm. 5.4)
⇐⇒ ∃h ∈ H, c× ∈ L(ACI×), c′× ∈ L(ACF×). c× h=⇒× c′× ∧ π1(c×) = c ∧ π1(c′×) = c′ (Thm. 5.5)
⇐⇒ ∃c× ∈ L(ACI×), c′× ∈ L(ACF×). c× −→∗× c′× ∧ π1(c×) = c ∧ π1(c′×) = c′ (Cor. 3.9)
Here, (ACI× ,ACF× ,M×) is the cross-product of M and DACons . Now, let C ⊆ Conf
be a set of configurations of M . We do the following calculation:
pre∗ls,M(C ∩ Conf ls ∩ valid)
= {c | c ∈ Conf ls ∩ valid ∧ ∃c′ ∈ C ∩ Conf ls ∩ valid. c −→∗ls,M c′} (1)
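= {c | c ∈ Conf ls ∩ valid ∧ ∃c′ ∈ C. c −→∗ls,M c′} (2)
= {c | ∃c′ ∈ C, c× ∈ L(ACI×), c′× ∈ L(ACF×). c× −→∗× c′× ∧ π1(c×) = c ∧ π1(c′×) = c′} (3)
= π1(pre∗×(π1⁻¹(C) ∩ L(ACF×)) ∩ L(ACI×)) (4)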
Here, Equation (1) is due to unfolding the definition of pre∗ls (Definition 6.2).
Equation (2) is due to the fact that consistency and validity of configurations
is preserved by the lock-sensitive interleaving semantics. Equation (3) uses the
calculation from above, and Equation (4) is by folding the definition of pre∗
(Definition 6.1). Summarized, we have
pre∗ls,M(C ∩ Conf ls ∩ valid) = π1(pre∗×(π1⁻¹(C) ∩ L(ACF×)) ∩ L(ACI×)).
Using the lock-insensitive predecessor set computation, and standard opera-
tions on automata, an automaton for the right-hand side of this equation can be
effectively computed.
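The shape of this equation can be illustrated on finite sets. In the following Python sketch, product configurations are modeled as pairs (c, annotation), π1 drops the annotation, and languages are plain sets; all names and the toy data are hypothetical and not derived from an actual Monitor-DPN. The real construction operates on automata, not on enumerated sets.

```python
def pre_star(targets, steps):
    """Backward reachability on a finite step relation."""
    result, frontier = set(targets), set(targets)
    while frontier:
        frontier = {c for (c, c2) in steps if c2 in frontier} - result
        result |= frontier
    return result


def lock_sensitive_pre_star(C, product_confs, product_steps, CI, CF):
    """Evaluate pi_1(pre*_x(pi_1^{-1}(C) & CF) & CI) on finite sets."""
    lifted = {cx for cx in product_confs if cx[0] in C}   # pi_1^{-1}(C)
    backwards = pre_star(lifted & CF, product_steps)      # pre*_x(... & CF)
    return {cx[0] for cx in backwards & CI}               # pi_1(... & CI)


# Hypothetical product system: annotations stand in for the acceptor state.
product_confs = {("c0", "i"), ("c1", "m"), ("c2", "f"),
                 ("c2", "bad"), ("c2", "n"), ("c3", "i")}
product_steps = {(("c0", "i"), ("c1", "m")),
                 (("c1", "m"), ("c2", "f")),
                 (("c3", "i"), ("c2", "bad"))}
CI = {("c0", "i"), ("c3", "i"), ("c2", "n")}   # initial product configurations
CF = {("c2", "f"), ("c2", "n")}                # final product configurations
```

In this toy instance, c3 reaches c2 in the underlying system, but only via a product run that does not end in a final product configuration, so c3 is excluded from the result.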
Together, we get the main theorem of this thesis that generalizes Theorem 6.4
to Monitor-DPNs:
Theorem 6.5 (Computing Lock-Sensitive Predecessor Sets for Monitor-DPNs).
Given a Monitor-DPN M and a regular set of valid, consistent configurations
C ⊆ Conf ls ∩ valid, the sets pre∗ls,M(C) and prels,M(C) are regular. Given an
automaton A, automata pre∗ls,M(A) and prels,M(A) with
L(pre∗ls,M(A)) = pre∗ls,M(L(A) ∩ Conf ls ∩ valid)
L(prels,M(A)) = prels,M(L(A) ∩ Conf ls ∩ valid)
can be effectively constructed in time $\mathrm{poly}(|M||A|) \cdot 2^{O(|\mathcal{X}|^2)}$. The sizes of the automata are:
$|\mathrm{pre}^*_{ls,M}(A)| = |A| \cdot \mathrm{poly}(|M|) \cdot 2^{O(|\mathcal{X}|^2)}$
$|\mathrm{pre}_{ls,M}(A)| = |A| \cdot \mathrm{poly}(|M|) \cdot 2^{O(|\mathcal{X}|^2)}$
Proof. We define
pre∗ls,M(A) := π1(pre∗×(π1⁻¹(A) ∩ ACF×) ∩ ACI×)
prels,M(A) := π1(pre×(π1⁻¹(A) ∩ ACF×) ∩ ACI×)
Note that the construction for immediate predecessor sets is correct, as single
steps of the cross-product DPN correspond to single steps of the Monitor-DPN.
The time and size estimation follows straightforwardly from the estimations
for the components: The automaton ACons can be constructed in time $2^{O(|\mathcal{X}|^2)}$,
and its size is $2^{O(|\mathcal{X}|^2)}$ (Lemma 4.20). Hence, the DPN-Acceptor DACons can be
constructed in time
$\mathrm{poly}(2^{O(|\mathcal{X}|^2)} |\mathrm{Act}|) \cdot 2^{O(|\mathcal{X}|)} = \mathrm{poly}(|\mathrm{Act}|) \cdot 2^{O(|\mathcal{X}|^2)}$,
and its size is $\mathrm{poly}(|\mathrm{Act}|) \cdot 2^{O(|\mathcal{X}|^2)}$ (Theorem 5.3). Finally, the cross-product can
be constructed in time
$\mathrm{poly}(|M| \cdot \mathrm{poly}(|\mathrm{Act}|) \cdot 2^{O(|\mathcal{X}|^2)}) \cdot 2^{O(|\mathcal{X}|)} = \mathrm{poly}(|M|) \cdot 2^{O(|\mathcal{X}|^2)}$,
and its size is $\mathrm{poly}(|M|) \cdot 2^{O(|\mathcal{X}|^2)}$ (Theorem 5.5). The remaining operations (intersection,
projection, inverse projection, and lock-insensitive predecessor set computation)
are all polynomial time. Moreover, the number of states of the resulting
automaton depends linearly on the number of states of the input automata, which
completes the proof.
Note that the algorithm is polynomial in the size of the automaton and the
DPN, and exponential only in the number of locks. For Monitor-DPNs derived
from typical programs, we expect the number of locks to be much smaller than
the size of the DPN itself.
6.4 Applications
In this section, we show how predecessor set computation can be used for program
analysis and verification. We first generalize the definition of predecessor sets:
Definition 6.6. Let M be a DPN, and S ⊆ Act∗ be a set of sequences of actions.
Then, we define
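$\mathrm{pre}_M[S](C) := \{c \in \mathrm{Conf} \mid \exists c' \in C, \bar{a} \in S.\ c \xrightarrow{\bar{a}}{}^{*}_{M} c'\}$.
Similarly, we define for a Monitor-DPN M and a set $S \subseteq (\mathrm{Act}_{\mathcal{X}})^*$:
$\mathrm{pre}_{ls,M}[S](C) := \{c \in \mathrm{Conf} \mid \exists c' \in C, \bar{o} \in S.\ c \xrightarrow{\bar{o}}{}^{*}_{ls,M} c'\}$.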
In order to get an effective predecessor set computation, we need to restrict
S to rather simple sets. For example, consider the set S given by the regular
expression (aā + bb̄ + τ)∗. Then, the statement p0γ0 ∈ preM[S](C) is true, if
and only if there is an execution from the start configuration to a configuration
in C that is properly synchronized w.r.t. rendezvous-communication. However,
reachability problems for pushdown systems communicating via rendezvous are
undecidable [104]. Hence, we only regard sets of the form S = A and S = A∗ for
a set A ⊆ ActX .
Obviously, we have pre = pre[Act] and pre∗ = pre[Act∗]. In order to compute
pre[A] or pre[A∗] for some set A ⊆ Act, we modify the DPN by removing all rules
with actions not in A, and then compute pre or pre∗ for the modified DPN. For
a DPN M and a set A ⊆ Act, we write M |A for the DPN that results from M
when removing all rules with actions not in A. Thus, we have preM[A] = preM|A
and preM[A∗] = pre∗M|A, and, analogously, prels,M[A] = prels,M|A and
prels,M[A∗] = pre∗ls,M|A.
Many interesting properties can be expressed by predecessor sets. For example,
if C characterizes some set of undesired configurations (e.g. configurations that
have a data-race or are deadlocked), a program has this undesired property if
and only if
p0γ0 ∈ pre[Act∗](C),
where p0γ0 is the start configuration of the program.
Also, more complex properties can be checked. For example, a variable x is
live at program point u if and only if
p0γ0 ∈ pre[Act∗](atu ∩ pre[(¬write(x))∗](pre[read(x)](Conf))),
where atu is the set of configurations that have at least one thread that is at
program point u, ¬write(x) is the set of actions that do not write to variable x,
and read(x) is the set of actions that read from variable x.
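On a finite labeled transition system, the action-restricted predecessor operators and the liveness query above can be written down directly. The following Python sketch assumes hypothetical names (`pre`, `pre_star`, `is_live`) and toy data; restricting to the allowed actions plays the role of passing from M to M|A.

```python
def pre(targets, steps, allowed):
    """pre[A]: one step labeled with an action from `allowed`."""
    return {c for (c, a, c2) in steps if a in allowed and c2 in targets}


def pre_star(targets, steps, allowed):
    """pre[A*]: arbitrarily many steps, all labeled with actions from `allowed`."""
    result, frontier = set(targets), set(targets)
    while frontier:
        frontier = pre(frontier, steps, allowed) - result
        result |= frontier
    return result


def is_live(start, confs, steps, actions, at_u, reads_x, writes_x):
    """start in pre[Act*](at_u & pre[(not write(x))*](pre[read(x)](Conf)))."""
    no_write = actions - writes_x
    read_next = pre(confs, steps, reads_x)
    read_later = pre_star(read_next, steps, no_write)
    return start in pre_star(at_u & read_later, steps, actions)


# Hypothetical toy systems: x is read after u without (live) or
# only after an intervening write (not live).
actions = {"init", "skip", "read_x", "write_x"}
confs = {"s", "u", "r", "e"}
live_steps = {("s", "init", "u"), ("u", "skip", "r"), ("r", "read_x", "e")}
dead_steps = {("s", "init", "u"), ("u", "write_x", "r"), ("r", "read_x", "e")}
```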
6.4.1 Atomic-Set Serializability Violation
Another analysis that can be expressed by predecessor sets is detection of atomic-
set serializability violations [6, 117]. An atomic-set serializability violation indi-
cates a high-level data-race [6]. Intuitively, high-level data-races are a gener-
alization of data-races to regions of memory that are assumed to be updated
atomically. As an example (taken from [6]), consider a program that has a vector (with an x-
and y-coordinate) shared between two threads. One thread sets the vector to
zero, and the other thread reads the vector. Setting the vector to zero is done
component-wise, and the update of each component is protected by a lock, as
well as the read access to each component of the vector. Obviously, there is no
data-race, as the accesses to both the x- and y-component are properly protected.
However, when the second thread is executed after the first component has been
updated, but before the second component has been updated, it will read some
inconsistent vector. Such behaviors are called high-level data-races.
We briefly sketch the approach of [117] to describe high-level data-races: The
programmer (or some heuristics) has to specify atomic sets of data and units
of work in the code. The data in an atomic set is assumed to belong together,
like the x- and y-component of the vector in the above example, and a unit of
work is assumed to perform an operation on the data, e.g. setting the vector to
zero. An execution is serializable, if its projection to each atomic set is equiv-
alent to an execution in which the units of work occur in serial order. Thus,
a violation of atomic-set serializability indicates a high-level data-race. Vaziri
et al. [117] identify 11 problematic access patterns that are complete in the sense
that the program violates atomic-set serializability if and only if it has an
execution with such a problematic pattern (see footnote 2). The access patterns are sequences of
read- and write-actions to elements of an atomic set l. For example, the pattern
$W_u(l); R_{u'}(l); W_u(l)$ means that unit of work u writes to l, then unit of work u′
reads from l, and, finally, u writes to l again. In this case, the thread executing
u′ will observe an intermediate state of the data, as u is currently updating it.
In the example above, u would be the setting of the vector to zero, and u′ would
be the reading of the vector.
Checking whether the program has an execution that has one of those prob-
lematic access patterns can be done by iterated predecessor set computation.
Intuitively, the predecessor set computation is iterated for each access in the
pattern, ensuring that the intermediate configurations execute the correspond-
ing access. In practice, one has to additionally identify the units of work. In
[61, 63], we have implemented atomic-set serializability violation detection for
parallel pushdown systems with locks, which are extracted from Java programs
using the random isolation method [62]. The detection of the problematic ac-
cess patterns described there can be adapted straightforwardly to predecessor set
computations for DPNs.
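The iteration over the accesses of a pattern can be sketched on a finite labeled transition system as follows. This is only an illustration of the idea of iterated predecessor sets: the function names, the action names, and the toy executions are hypothetical, and the identification of units of work mentioned above is not modeled.

```python
def pre(targets, steps, allowed):
    """One step labeled with an action from `allowed` into `targets`."""
    return {c for (c, a, c2) in steps if a in allowed and c2 in targets}


def pre_star(targets, steps, allowed):
    """Arbitrarily many steps with actions from `allowed` into `targets`."""
    result, frontier = set(targets), set(targets)
    while frontier:
        frontier = pre(frontier, steps, allowed) - result
        result |= frontier
    return result


def has_pattern(start, confs, steps, actions, pattern):
    """Is there an execution from `start` containing the accesses in order?

    pattern: list of sets of actions, in execution order.
    """
    target = set(confs)
    # Process the pattern back to front: each access must be executed,
    # possibly preceded by arbitrary other steps.
    for access in reversed(pattern):
        target = pre_star(pre(target, steps, access), steps, actions)
    return start in target


# Hypothetical actions: W_u writes within unit u, R_v reads within unit v.
actions = {"W_u", "R_v"}
confs = {"c0", "c1", "c2", "c3"}
interleaved = {("c0", "W_u", "c1"), ("c1", "R_v", "c2"), ("c2", "W_u", "c3")}
serial = {("c0", "W_u", "c1"), ("c1", "W_u", "c2"), ("c2", "R_v", "c3")}
pattern = [{"W_u"}, {"R_v"}, {"W_u"}]
```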
Footnote 2: Under the assumption that every unit of work that writes to one element of an atomic set
writes to all elements of this atomic set.
6.4.2 EF-Formula
Lugiez and Schnoebelen [83] show how to decide the branching time logic EF
for PA-processes, using predecessor set computations. These results transfer
immediately to DPNs and Monitor-DPNs. We briefly recapitulate the procedure of
Lugiez and Schnoebelen [83] here, slightly adapted to our model. The logic EF
has the following syntax:
ϕ ::= P | ¬ϕ | ϕ ∧ ϕ′ | EX〈A〉ϕ | EF〈A〉ϕ
The atomic propositions P are regular sets of configurations, for which automata
AP are given, and the constraints A are sets of actions. We also write EX instead
of EX〈Act〉, and EF instead of EF〈Act〉. Given a Monitor-DPN M and a consistent
and valid configuration c ∈ Conf ls ∩ valid, the semantics of EF-logic is defined as
follows:
c |= P :⇐⇒ c ∈ P ∩ Conf ls ∩ valid
c |= ¬ϕ :⇐⇒ c ∈ Conf ls ∩ valid ∧ c ⊭ ϕ
c |= ϕ ∧ ϕ′ :⇐⇒ c |= ϕ ∧ c |= ϕ′
c |= EX〈A〉ϕ :⇐⇒ ∃c′ ∈ Conf ls ∩ valid, o ∈ A. c′ |= ϕ ∧ c −o→ls,M c′
c |= EF〈A〉ϕ :⇐⇒ ∃c′ ∈ Conf ls ∩ valid, ō ∈ A∗. c′ |= ϕ ∧ c −ō→∗ls,M c′
Additionally, we define the connective ∨ via De Morgan’s law:
ϕ ∨ ϕ′ := ¬(¬ϕ ∧ ¬ϕ′).
With EF-logic, the reachability and live variable properties from the beginning
of this section can be expressed more succinctly: Reachability of an undesired
configuration from a regular set C can be expressed as p0γ0 |= EF C, and liveness
of a variable can be expressed as p0γ0 |= EF(atu∧EF〈¬write(x)〉EX〈read(x)〉Conf).
In order to decide EF-logic via predecessor set computation, we define
[[ϕ]] := {c ∈ Conf | c |= ϕ}.
Obviously, the following holds:
[[P ]] = L(AP ) ∩ Conf ls ∩ valid [[¬ϕ]] = (Conf ls ∩ valid) \ [[ϕ]]
[[ϕ ∧ ϕ′]] = [[ϕ]] ∩ [[ϕ′]] [[ϕ ∨ ϕ′]] = [[ϕ]] ∪ [[ϕ′]]
[[EX〈A〉ϕ]] = prels,M [A]([[ϕ]]) [[EF〈A〉ϕ]] = prels,M [A∗]([[ϕ]])
Thus, using the results of this chapter, [[ϕ]] is regular and an automaton for [[ϕ]]
can be computed.
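The equations for [[ϕ]] translate into a recursive evaluation. The following Python sketch evaluates EF-formulas on a finite labeled transition system in which every configuration is assumed to be consistent and valid; the tuple encoding of formulas and all names are assumptions of this example, and sets of configurations stand in for the automata used in the actual construction.

```python
def pre(targets, steps, allowed):
    """One step labeled with an action from `allowed` into `targets`."""
    return {c for (c, a, c2) in steps if a in allowed and c2 in targets}


def pre_star(targets, steps, allowed):
    """Arbitrarily many steps with actions from `allowed` into `targets`."""
    result, frontier = set(targets), set(targets)
    while frontier:
        frontier = pre(frontier, steps, allowed) - result
        result |= frontier
    return result


def sat(phi, confs, steps):
    """Return [[phi]] as a set of configurations.

    Formulas are tuples: ("prop", S), ("not", f), ("and", f, g),
    ("or", f, g), ("EX", A, f), ("EF", A, f).
    """
    kind = phi[0]
    if kind == "prop":
        return phi[1] & confs
    if kind == "not":
        return confs - sat(phi[1], confs, steps)
    if kind == "and":
        return sat(phi[1], confs, steps) & sat(phi[2], confs, steps)
    if kind == "or":
        return sat(phi[1], confs, steps) | sat(phi[2], confs, steps)
    if kind == "EX":
        return pre(sat(phi[2], confs, steps), steps, phi[1])
    if kind == "EF":
        return pre_star(sat(phi[2], confs, steps), steps, phi[1])
    raise ValueError(f"unknown connective: {kind!r}")


# Hypothetical toy system and the liveness formula
# EF(at_u and EF<not write(x)> EX<read(x)> Conf).
actions = {"init", "skip", "read_x", "write_x"}
confs = {"s", "u", "r", "e"}
steps = {("s", "init", "u"), ("u", "skip", "r"), ("r", "read_x", "e")}
live = ("EF", actions,
        ("and", ("prop", {"u"}),
         ("EF", actions - {"write_x"},
          ("EX", {"read_x"}, ("prop", confs)))))
```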
In order to do the actual computation, it is easier to compute an automaton
Aϕ with L(Aϕ)∩Conf ls∩ valid = [[ϕ]], i.e., we allow that the language of the result
automaton may contain inconsistent and invalid configurations. The automaton
Aϕ is computed as follows:
AP = AP A¬ϕ = ¬(Aϕ)
Aϕ∧ϕ′ = Aϕ ∩ Aϕ′ Aϕ∨ϕ′ = Aϕ ∪ Aϕ′
AEX〈A〉ϕ = pre′ls,M|A(Aϕ) AEF〈A〉ϕ = pre′∗ls,M|A(Aϕ)
where we assume that an atomic proposition P is specified as an automaton,
and pre′ls,M and pre′∗ls,M are the optimized predecessor set computations that may
return inconsistent configurations. They will be discussed in Section 7.2 (cf.
Theorem 7.2).
For each negation, this construction complements the automaton, resulting in
a worst-case exponential blowup. As an optimization, negation can be pushed
inwards, but not crossing EX- or EF-operations. It is straightforward to show
that the following holds:
[[¬¬ϕ]] = [[ϕ]]
[[¬(ϕ ∧ ϕ′)]] = [[¬ϕ ∨ ¬ϕ′]] [[¬(ϕ ∨ ϕ′)]] = [[¬ϕ ∧ ¬ϕ′]]
Thus we assume that the considered formula ϕ is in normal form w.r.t. the above
rules, i.e., negations are pushed inwards as far as possible. The size of the automa-
ton Aϕ (and also the time required to compute it) can be estimated recursively
over the formula ϕ. For the size s(ϕ) of Aϕ, we get:
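$s(P) = |P|$ $s(\neg\varphi) = 2^{O(s(\varphi))}$
$s(\varphi \wedge \varphi') \le s(\varphi) \cdot s(\varphi')$ $s(\varphi \vee \varphi') \le s(\varphi) + s(\varphi') + 1$
$s(EX\langle A\rangle\varphi) = s(\varphi) \cdot \mathrm{poly}(|M|)$ $s(EF\langle A\rangle\varphi) = s(\varphi) \cdot \mathrm{poly}(|M|) \cdot 2^{O(|\mathcal{X}|^2)}$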
If there are no negations in the formula, and we estimate s(ϕ) + s(ϕ′) + 1 ≤
s(ϕ) · s(ϕ′) (valid if both automata have more than two states), the size of the
resulting automaton is the product of the sizes of the automata for the atomic
propositions and a factor of poly(|M|) for each EX and $\mathrm{poly}(|M|) \cdot 2^{O(|\mathcal{X}|^2)}$ for each
EF. We define nP(ϕ) to be the number of atomic propositions, sP(ϕ) to be the
maximum size of the automaton for an atomic proposition, nEX(ϕ) to be the
number of EX-operators, and nEF(ϕ) to be the number of EF-operators. Then, in
the negation-free case, we can estimate the size of Aϕ by
$|A_\varphi| = s_P(\varphi)^{n_P(\varphi)} \cdot \mathrm{poly}(|M|)^{n_{EX}(\varphi)+n_{EF}(\varphi)} \cdot 2^{O(|\mathcal{X}|^2)\, n_{EF}(\varphi)}$.
If we have additional negations, the size increases exponentially for each
negation that has to be applied. As in [82], we define the alternation depth nalt(ϕ)
of negations in ϕ. Assuming that ϕ is in normal form, we define:
nalt(P ) = 0 nalt(¬ϕ) = 1 + nalt(ϕ)
nalt(ϕ ∧ ϕ′) = max(nalt(ϕ), nalt(ϕ′)) nalt(ϕ ∨ ϕ′) = max(nalt(ϕ), nalt(ϕ′))
nalt(EX〈A〉ϕ) = nalt(ϕ) nalt(EF〈A〉ϕ) = nalt(ϕ)
Then, the size of the result automaton can be estimated by
$|A_\varphi| = \mathrm{tower}\bigl(n_{alt}(\varphi),\ O(s_P(\varphi)^{n_P(\varphi)}) \cdot \mathrm{poly}(|M|)^{n_{EX}(\varphi)+n_{EF}(\varphi)} \cdot 2^{O(|\mathcal{X}|^2)\, n_{EF}(\varphi)}\bigr)$,
where tower(n, m) is a tower of n 2s, the innermost exponent being m:
$\mathrm{tower}(0, m) := m$ $\mathrm{tower}(n+1, m) := 2^{\mathrm{tower}(n, m)}$.
For the above estimation, we use the fact that $x \cdot 2^{O(x)} = 2^{O(x)}$, in order to shift
the factors upwards in the tower.
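The normal form, the alternation depth, and the tower function are easy to state as recursive functions. The following Python sketch uses a hypothetical tuple encoding of formulas, ("prop", P), ("not", f), ("and", f, g), ("or", f, g), ("EX", A, f), ("EF", A, f); it only illustrates the definitions and is not part of the decision procedure itself.

```python
def push_negations(phi):
    """Push negations inwards, but never across EX or EF."""
    kind = phi[0]
    if kind == "not":
        inner = phi[1]
        if inner[0] == "not":                    # [[not not f]] = [[f]]
            return push_negations(inner[1])
        if inner[0] == "and":                    # De Morgan
            return ("or", push_negations(("not", inner[1])),
                    push_negations(("not", inner[2])))
        if inner[0] == "or":                     # De Morgan
            return ("and", push_negations(("not", inner[1])),
                    push_negations(("not", inner[2])))
        return ("not", push_negations(inner))    # prop, EX, EF: stop here
    if kind in ("and", "or"):
        return (kind, push_negations(phi[1]), push_negations(phi[2]))
    if kind in ("EX", "EF"):
        return (kind, phi[1], push_negations(phi[2]))
    return phi                                   # atomic proposition


def nalt(phi):
    """Alternation depth of negations, for formulas in normal form."""
    kind = phi[0]
    if kind == "prop":
        return 0
    if kind == "not":
        return 1 + nalt(phi[1])
    if kind in ("and", "or"):
        return max(nalt(phi[1]), nalt(phi[2]))
    return nalt(phi[2])                          # EX and EF


def tower(n, m):
    """A tower of n 2s with innermost exponent m."""
    return m if n == 0 else 2 ** tower(n - 1, m)


# Hypothetical example: not(not P and EF<A> not not Q).
P, Q = ("prop", "P"), ("prop", "Q")
phi = ("not", ("and", ("not", P), ("EF", "A", ("not", ("not", Q)))))
normal = push_negations(phi)
```

In the example, normalization removes two of the three nested negations, so only one complementation remains to be paid for.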
6.4.3 Bounded Model-Checking
We already discussed bounded model-checking for DPNs with shared global state
[15] in Section 3.4. There, only executions up to a bounded number of contexts
are regarded. In each context, only a single thread may access the global state.
However, all threads may make steps and new threads may be created. Thus,
the number of contexts limits the number of communications via the global state.
The method described in [15] works by computing iterated predecessor sets. After
each predecessor set computation, the result automaton is modified to reflect the
context switch (i.e., change the thread that may access the shared memory).
This method can also be used with lock-sensitive predecessor set computation.
This increases the precision of the analysis, as locks are handled precisely, while
they must be emulated via the global state in the original method. However,
it also increases the complexity of the analysis, as we replace the polynomial
lock-insensitive predecessor set computation by the lock-sensitive predecessor set
computation that is exponential in the number of locks.
6.5 Summary and Related Work
In this chapter, we combined the results of the previous chapters to describe a
lock-sensitive predecessor set computation that works by reduction of the problem
to lock-insensitive predecessor set computation. Then, we sketched applications
of our algorithm to program analysis and model-checking.
We used the saturation algorithm of Bouajjani et al. [16] to compute lock-
insensitive predecessor sets for DPNs (cf. Theorem 6.4). Regarding pushdown
automata and, in general, prefix rewrite systems, the fact that the set of reach-
able configurations is regular is well-known, and dates back to Büchi [19]. In
[22, 23], this result is extended by showing that the relation on configurations
induced by runs of a prefix rewrite system is a rational transduction, and giving
a construction of the corresponding transducer. In [11, 13, 41], these results are
applied to model-checking of linear and branching time logics for pushdown au-
tomata, leveraging the automata-theoretic approach to model-checking [115, 116].
In [37], an efficient implementation for linear time logic is provided. Lugiez and
Schnoebelen [82, 83] show how to compute predecessor sets for the process alge-
bra PA [8, 12]. They show that predecessor sets of regular sets of process terms
are, again, regular and can be computed in polynomial time, and apply this result
to model-checking EF-logic. Esparza and Podelski [36] provide an efficient and
simple implementation of this algorithm. Finally, Bouajjani et al. [16] generalize
predecessor set computation to DPNs and CDPNs.
While the algorithm for DPNs [16] computes predecessor sets only for paths
of the form A or A∗ for some set of actions A, in [82], a computation for pre-
decessor sets w.r.t. any decomposable [109] set of paths is described. Recall that
predecessor set computation that is constrained with more complex sets of paths
(e.g. arbitrary regular sets) is undecidable, as it can be used to synchronize two
pushdown processes. Exploring whether predecessor sets w.r.t. arbitrary decom-
posable sets can also be computed for DPNs is left to future research.
Interestingly, when regarding execution-hedges instead of interleaved execu-
tions, we can compute predecessor sets that are constrained with languages of
arbitrary DPN-Acceptors. The languages of DPN-Acceptors include, in partic-
ular, all regular languages. In [79], we used a similar technique to compute
predecessor sets constrained with regular sets of execution-hedges.
In [35] predecessor and successor set computation for pushdown systems is
applied to interprocedural data flow analysis. In [36], this is extended to PA.
In [39], an approach to bitvector-analysis of parallel pushdown systems (PPDS)
with locks is presented (see footnote 3), which is also based on acquisition histories. While
bitvector-analysis is less general than the temporal logics considered in [55, 56],
or the phase-automata of [61, 63], or what can be expressed by lock-sensitive
predecessor set computation [79], the analysis of [39] is optimized to bitvector-
analysis, allowing a more efficient approach, in particular when computing a
solution for every control point in the program. Like our algorithm, the algorithm
in [39] has a worst-case runtime that is polynomial in the program size and
exponential (see footnote 4) in the number of locks.
In Section 6.4, we considered live variables analysis as an application for pre-
decessor sets. Regarding dataflow analysis of concurrent programs, in particular
kill/gen-analysis, there is a line of research that considers fixed point based tech-
niques rather than automata based techniques. Knoop et al. [65] describe precise
intraprocedural kill/gen-analyses for programs with parbegin...parend-blocks.
The basic idea is to compute the possible interference at each control location,
i.e., the set of statements that can possibly be executed in parallel. This ap-
proach has been extended to interprocedural analysis of programs with parallel
calls [110]. In [70], we extend the approach of [65] to interprocedural analysis
of programs with dynamic thread creation, and, in [76], to programs with both
dynamic thread creation and parallel calls.
Footnote 3: The paper actually presents the analysis without supporting procedures, but claims that
generalization to procedures is straightforward.
Footnote 4: The original paper states “double exponential”, but, in personal communication, the authors
confirmed that it should be (single) exponential.
In [78], we use fixed point based techniques and abstract interpretation [31] to
compute acquisition structures of programs with procedures and dynamic thread
creation. Reps et al. [105, 106] introduce weighted pushdown systems (WPDS),
which combine automata based and fixed point based techniques. Intuitively, a
weighted pushdown system is a pushdown system where rules are annotated by
weights from a semiring. Given two regular sets of configurations, one can com-
pute the least upper bound of the weights of all paths between the two sets of
configurations. The algorithm is similar to the saturation algorithm for predeces-
sor set computation in [11, 37]. Weighted pushdown systems have been extended
to support handling of local variables [67]. In [120, 121], weighted dynamic pushdown
networks are introduced, transferring the ideas of WPDSs to DPNs.
In Section 6.4.2, we considered the branching time logic EF. For PA- and PAD-
Processes, this logic has been shown decidable by Mayr [85, 86]. The elegant ap-
plication of predecessor set computation to decide EF-logic has been highlighted
by Lugiez and Schnoebelen [83]. For parallel pushdown systems communicating
with locks, Kahlon and Gupta [55, 56] have investigated several classes of lin-
ear time and branching time logics, and obtained decidability and undecidability
results. The atomic propositions regarded there are double-indexed properties,
fixing the simultaneous state of two processes, while our atomic propositions are
arbitrary regular sets of configurations. Kahlon and Gupta [55, 56] show (see footnote 5)
decidability of the linear time logics LTL\X (in [55]), LTL(X,F,F∞), LTL(X,G),
and the branching time alternation free mu-calculus (all in [56]). While deciding
EF-logic via predecessor set computations is straightforward, those logics require
more subtle algorithms. We leave it for future research to investigate whether
they can be transferred to DPNs with locks.
Atomic-set serializability violation detection has been implemented for Java
[47] programs [61–63] and applied to some programs of the ConTest bench-
mark suite [38], yielding promising results. The Java programs are abstracted
to parallel pushdown systems (PPDSs) with locks, using the random isolation
method [62]. While the original analysis of the PPDS [62] is based on an over-
approximation by communicating pushdown systems [14], we present a decision
procedure that is based on acquisition histories in [61, 63]. This decision proce-
dure has been implemented symbolically with BDDs, and resulted in a significant
speedup compared to the approximation by communicating pushdown systems.
Footnote 5: The published versions of these papers have a flaw: The algorithms do not properly keep track
of the correlation between the different threads that is induced by atomic propositions,
leading to incorrect results. The flaw has been confirmed by the authors in personal
communication, and can be fixed at the cost of increased complexity. For a detailed description
of this flaw, see [59, Sec. 7.9].
7 Optimizations
In the last chapter, we presented an algorithm for lock-sensitive predecessor set
computation and discussed its applications. In this chapter, we present optimiza-
tions that increase the efficiency of our algorithm for specific applications. Then,
we illustrate how our algorithm is used to detect the data-race in the program
of Example 1.1. With the help of this example, we identify problems that an
implementation of our algorithm has to solve, and discuss possible solutions.
This chapter is organized as follows: In Section 7.1, we introduce a notation
for DPNs as pseudocode programs, which is more convenient than specifying sets
of transition rules in some cases. In Section 7.2, we discuss optimizations that
can be applied to the typical case of querying whether the result of a predecessor
set computation contains the start configuration. In Section 7.3, we investigate
optimizations for deadlock-free DPNs. In Section 7.4, we illustrate the application
of our algorithm to data-race detection by means of a simple example program,
and discuss issues that have to be addressed by an implementation. Finally, in
Section 7.5, we briefly summarize the results of this chapter and discuss related
work.
7.1 Pseudocode
Instead of specifying a DPN by its set of states, stack-symbols, actions, and
rules, it is sometimes more convenient to specify a program in pseudocode. The
translation of these programs to DPNs is straightforward: Each label and each
unlabeled statement in the program corresponds to a stack-symbol. Also, sync-
blocks correspond to stack-symbols. The DPN has just one control-state $ and
one base-action a. Each statement is translated into a rule that changes the
topmost stack-symbol accordingly.
Example 7.1 (Translation of Pseudocode to DPN). As an example, consider
the following program:
p1: sync (x1){spawn p2;u: skip};
p2: sync (x2){}; v: skip;
To translate this program, we first add labels to all unlabeled statements and at
the end of non-returning procedures and synchronized blocks.
p1: sync (x1){l1: spawn p2;u: skip; l2: }; l3:
p2: sync (x2){l4: }; v: skip; l5:
Then, each label becomes a stack-symbol, and the rules are added according to the
statements in the program. The resulting Monitor-DPN is
M = ({$},Γ,Γ⊥, {a},X ,∆, locks),
where
Γ = {p1, l1, l2, l3, l4, l5, p2, u, v}
Γ⊥ = {p1, l3, p2, v, l5}
X = {x1, x2}
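locks = p1, l3, p2, v, l5 ↦ ∅, l1, l2, u ↦ {x1}, l4 ↦ {x2}
$\Delta = \{\$p_1 \overset{\langle x \dots}{\hookrightarrow} \dots,\ \dots,\ \dots \hookrightarrow \$,\ \$v \overset{a}{\hookrightarrow} \$l_5\}$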
Instead of sync-blocks, we may also use acq x- and rel x- statements.
7.2 Queries from the Start Configuration
Predecessor sets are often used to describe executions from the start configuration
(cf. the data-race detection and live variables analysis examples in Section 6.4).
In those cases, the result of a predecessor set computation is the input to the
next iteration, or just queried whether it contains the start configuration. In this
section, we describe optimizations that are possible in this case.
7.2.1 Consistency Check
As executions preserve consistency of configurations, we can pull intersections
with AConf ls out of predecessor set computations. We define
pre′∗ls(A) := π1(pre∗×(π1⁻¹(A) ∩ ACF×) ∩ Avalid× ∩ π2⁻¹(ACI)).
Then, due to the definitions of ACI× (cf. Section 5.2) and pre∗ls(A) (cf. Theorem 6.5),
we have
L(pre∗ls(A)) = L(π1(pre∗×(π1⁻¹(A) ∩ ACF×) ∩ Avalid× ∩ π1⁻¹(AConf ls) ∩ π2⁻¹(ACI)))
= L(pre′∗ls(A)) ∩ Conf ls,
where the first equality is by unfolding the definitions of pre∗ls(A) and ACI×, and
the second equality by π1 ◦ π1⁻¹ = id and L(AConf ls) = Conf ls.
Thus, when querying whether a consistent configuration c0 ∈ Conf ls (e.g. the
start configuration) is contained in the predecessor set, we can exploit the follow-
ing optimization:
c0 ∈ pre∗ls(A) ⇐⇒ c0 ∈ pre′∗ls (A).
As AConf ls has exponentially many states in the number of locks, the automaton
pre′∗ls (A)—that performs no intersection with AConf ls—is considerably smaller than
pre∗ls(A) in typical cases.
We defined the automaton pre∗ls(A) for arbitrary automata A, such that pre∗ls(A)
only considers consistent and valid configurations in L(A) (cf. Theorem 6.5).
Thus, when computing iterated predecessor sets, there is no need to intersect
the result of the inner computation with AConf ls , and the following optimization
applies:
L(pre∗ls(A1 ∩ pre∗ls(A2))) = L(pre∗ls(A1 ∩ pre′∗ls (A2))).
Similar optimizations apply for immediate predecessor set computations, and in
the next subsection we show a polynomial algorithm to compute an automaton
pre′ls(A). In many cases no intersection with AConf ls is required at all, e.g. in the
data-race detection and live variables analysis examples. We also applied these
optimizations to model-checking EF-logic in Subsection 6.4.2.
7.2.2 Immediate Predecessor Sets
When computing immediate predecessor sets, the lock-execution-hedge h contains
exactly one non-leaf-node. Thus, also the lock-a/r-hedge ar(h) has exactly one
non-leaf-node. Hence, its dependence-graph is trivially acyclic, and it cannot
contain two final acquisitions of the same lock. Thus, criteria (C1) and (C3) are
trivially satisfied, and it remains to check Criterion (C2):
((used(ar(h)) ∪ acq(ar(h))) \ rel(ar(h))) ∩ ar(h)|2 = ∅.
As a use-node in ar(h) with a non-empty set of used locks corresponds to at
least two non-leaf-nodes in h, there are no such use-nodes in ar(h), and we have
used(ar(h)) = ∅. Moreover, ar(h) contains exactly one node, hence it does not
contain both, a release- and an acquisition-node. Thus, (C2) simplifies to
acq(ar(h)) = ∅ ∨ (∃x. acq(ar(h)) = {x} ∧ x /∈ ar(h)|2).
This can be checked by a polynomial size tree automaton. Also the translation of
this tree automaton to a DPN-Acceptor can be done in polynomial time: Here,
the reentrance elimination caused the exponential blowup. However, if there is
just one node, reentrance elimination can be done in polynomial time, e.g. by
nondeterministically guessing the lock x of the node. Thus, except for the intersection
with AConf ls , immediate predecessor sets can be computed in polynomial
time. In the last section we argued that this intersection can be omitted in many
cases. However, in Section 9.2.1.1, we show that deciding whether a regular set
of configurations contains a consistent configuration is NP-hard, and thus also
immediate predecessor set computation with intersection with AConf ls is NP-hard.
This matches the fact that AConf ls has exponential size in the number of locks,
and thus the result automaton of the immediate predecessor set computation may
blow up exponentially upon intersection with AConf ls .
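The simplified form of Criterion (C2) for immediate predecessor sets can be sketched as a small predicate. This is only an illustration under assumptions: the function name and the representation of lock sets as Python sets are hypothetical, and `initially_held` stands for the set written ar(h)|2 above.

```python
def c2_single_step(acquired, initially_held):
    """Simplified Criterion (C2) for a hedge with exactly one non-leaf node.

    acquired       -- set of locks finally acquired by the single step
                      (either empty or a singleton {x})
    initially_held -- set of locks held before the step (ar(h)|2 in the text)
    """
    if not acquired:
        # acq(ar(h)) is empty: nothing can clash with the held locks
        return True
    if len(acquired) != 1:
        raise ValueError("a single step acquires at most one lock")
    (x,) = acquired
    # acq(ar(h)) = {x}: the acquired lock must not be initially held
    return x not in initially_held
```

The check touches only one lock and one set, which matches the observation that no exponential blowup arises for a single step.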
We summarize the results by the following theorem:
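Theorem 7.2. Given a Monitor-DPN $M$ and an automaton $A$, we can compute
automata $\mathit{pre}'^{*}_{ls,M}(A)$ and $\mathit{pre}'_{ls,M}(A)$, such that
\[ \begin{aligned}
L(\mathit{pre}'^{*}_{ls,M}(A)) \cap \mathit{Conf}_{ls} &= \mathit{pre}^{*}_{ls,M}(L(A) \cap \mathit{Conf}_{ls} \cap \mathit{valid}) \\
L(\mathit{pre}'_{ls,M}(A)) \cap \mathit{Conf}_{ls} &= \mathit{pre}_{ls,M}(L(A) \cap \mathit{Conf}_{ls} \cap \mathit{valid}).
\end{aligned} \]
The automaton $\mathit{pre}'^{*}_{ls,M}(A)$ can be computed in time $\mathrm{poly}(|M||A|) \cdot 2^{O(|\mathcal{X}|^2)}$ and
has size $|A| \cdot \mathrm{poly}(|M|) \cdot 2^{O(|\mathcal{X}|^2)}$. The automaton $\mathit{pre}'_{ls,M}(A)$ can be computed in time
$\mathrm{poly}(|M||A|)$ and has size $|A| \cdot \mathrm{poly}(|M|)$.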
7.2.3 Initial Releases
Moreover, the start configuration usually holds no locks. In this case, the execu-
tion that belongs to the outermost predecessor set computation contains no initial
releases. Consequently, the release-set and the release-graph of execution-hedges
from the start configuration are always empty. This simplifies the construction
of the automaton ACons, as the release-set and release-graph components of the
acquisition structure may be omitted. Also, the initial lockstack is empty, and
thus the component of the hedge acquisition structure that collects the initially
held locks may be omitted, and Criterion (C2) becomes trivial, as there are no
initially held locks. (Recall that (C2) (cf. Lemma 4.14) checks that the set of
used and acquired locks that are not released is disjoint from the set of initially
held locks.) Moreover, in the construction of DACons (cf. Section 5.1.2), we may
omit the initial-release flag.
As a final note, we mention that for iterated predecessor set computations like
p0γ0 ∈ pre∗ls(A1 ∩ pre∗ls(A2)),
the release-set for the inner predecessor set computation is, in general, not empty.
Thus, this optimization applies only to the outermost predecessor set computa-
tion.
7.3 Deadlocks
In the last section, we identified some optimizations that may be applied in
common cases. In this section, we investigate the correspondence between dead-
locks and cycles in the dependence-graph, and derive optimizations for certain
classes of deadlock-free DPNs. We aim at finding classes of DPNs where the
dependence-graph is always acyclic, and thus the acquisition- and release-graphs
are not required for the analysis. Note that the acquisition- and release-graphs
are sets over pairs of locks, i.e., there are $2^{|\mathcal{X}|^2}$ different acquisition-graphs. All
other sets involved in our analysis are just sets of locks. Thus, when omitting the
acquisition- and release-graphs, the exponential factor in the complexity of our
analysis reduces from $2^{O(|\mathcal{X}|^2)}$ to $2^{O(|\mathcal{X}|)}$.
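To give a feeling for the difference between the two exponential factors, the following sketch simply counts the possible values of a graph component (a subset of X × X) and of a lock-set component (a subset of X). The function names are hypothetical and serve illustration only.

```python
def graph_count(n_locks):
    """Number of distinct acquisition-graphs: subsets of X x X."""
    return 2 ** (n_locks * n_locks)


def lock_set_count(n_locks):
    """Number of distinct sets of locks: subsets of X."""
    return 2 ** n_locks


# Tabulate both counts for small numbers of locks.
for n in (1, 2, 3, 4):
    print(n, lock_set_count(n), graph_count(n))
```

Already for four locks, a graph component ranges over 65536 values, whereas a lock-set component ranges over 16.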
Intuitively, a deadlock is a configuration where some threads mutually wait on
each other to release a lock, or where a thread is terminated still holding a lock,
and thus blocking another thread. We distinguish a deadlock from a terminated
configuration, where threads may still hold locks, but do not block other threads.
Formally, we characterize a deadlock by a non-empty set of threads. Each thread
in a deadlock is either terminated and holds a lock, or all its outgoing steps are
acquisitions that are blocked by another thread from the deadlock. In order to
distinguish a deadlocked from a terminated configuration, we require at least one
thread in the deadlock to have outgoing acquisition-steps.
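Definition 7.3 (Deadlocked Configuration). Let $M$ be a Monitor-DPN. First,
we define the function $act : P\Gamma^* \to 2^{Act_{\mathcal{X}}}$ that maps a thread-configuration to the
set of possible actions from this configuration:
\[ act(pw) := \{ o \mid \exists c.\ pw \xrightarrow{o} c \}. \]
A configuration $c = p_1w_1 \ldots p_nw_n \in \mathit{Conf}_{ls}$ is called deadlocked, iff
\[ \begin{aligned}
\exists\, \emptyset \subset D \subseteq \{1, \ldots, n\}.\ & \forall i \in D.\ \bigl( (act(p_iw_i) = \emptyset \wedge ls(w_i) \neq \varepsilon) \\
& \qquad \vee\ (act(p_iw_i) \neq \emptyset \wedge \forall o \in act(p_iw_i).\ \exists j \in D, x \in ls(w_j).\ i \neq j \wedge o = \langle x) \bigr) \\
& \wedge\ \exists i \in D.\ act(p_iw_i) \neq \emptyset
\end{aligned} \]
The set of deadlocked configurations is denoted by $C_{dead}$.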
A DPN M with a start configuration c0 ∈ Conf ls is called deadlock-free, if no
deadlocked configuration can be reached from c0.
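Definition 7.3 can be checked on a single explicit configuration without enumerating all subsets D. The following sketch is not part of the analysis in this thesis; it assumes a hypothetical encoding where each thread is a pair (actions, held), actions being a set of (kind, name) pairs with kind "acq" for acquisitions, and held being the set of locks on the thread's lockstack. As the per-thread condition is monotone in D, it computes the largest candidate set by repeatedly discarding threads that are not blocked.

```python
def is_deadlocked(threads):
    """threads: list of (actions, held) pairs, one per thread.

    actions -- set of enabled actions as (kind, name) pairs;
               an acquisition of lock x is ("acq", x)
    held    -- set of locks currently on the thread's lockstack
    """
    def blocked(i, candidates):
        actions, held = threads[i]
        if not actions:
            # terminated thread: counts only if it still holds a lock
            return bool(held)
        held_by_others = set()
        for j in candidates:
            if j != i:
                held_by_others |= set(threads[j][1])
        # every enabled action must be an acquisition blocked by another candidate
        return all(kind == "acq" and lock in held_by_others
                   for kind, lock in actions)

    # greatest set D in which every thread satisfies the per-thread condition
    candidates = set(range(len(threads)))
    changed = True
    while changed:
        changed = False
        for i in sorted(candidates):
            if not blocked(i, candidates):
                candidates.remove(i)
                changed = True
    # at least one thread in D must have outgoing (acquisition) steps
    return any(threads[i][0] for i in candidates)
```

For instance, two threads that each hold one lock and wait for the other's lock are reported as deadlocked, whereas a single terminated thread that still holds a lock is only a terminated configuration.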
Assuming that the analyzed DPN is deadlock-free is realistic, if a separate
analysis for possible deadlocks is done before the reachability analysis. If this
prior analysis proves the DPN deadlock-free, we may exploit this knowledge in
our analysis.
The reason why we examine deadlock-free DPNs is the observation that cy-
cles in the dependence-graph (cf. Section 4.3.1) indicate deadlock-like conditions,
where a subset of threads is mutually waiting on each other to release a lock.
First of all, we give some examples of deadlock-free DPNs, where acquisition-
and release-graphs are required for precise reachability analysis. Then, we discuss
additional restrictions with which acquisition- and release-graphs are not required













Here, OR means nondeterministic choice—i.e., the corresponding DPN has two
rules that both can be applied, one acquiring the lock y, and the other continuing
the execution at label l. This program has no deadlock, as the thread p can always
decide to continue at l, and thus break the deadlock.
Moreover, the program has no data-race: The only lock-a/r-hedge that reaches
u and v simultaneously when starting with two threads, one at p and one at q, is
h = [(t1, ∅), (t2, ∅)], with
t1 = 〈x〈〉{y}(ε)τu and t2 = 〈y〈〉{x}(ε)τv
For better readability, we explicitly wrote the τ -nodes, and indexed them with
the label that is reached by the corresponding thread. Obviously, h satisfies
(C1) and (C2), but violates (C3). In our example, h has a cyclic acquisition-
graph. However, an analogous example is possible where h has a cyclic release-
graph. Hence, this example shows that both the acquisition- and release-graph
are required for precise reachability analysis, even for deadlock-free programs.
7.3.1 Inescapable Locks
The above program is deadlock-free only because the thread may nondeterministically
choose not to acquire the lock y and jump to label l instead,
thus breaking a possible deadlock. However, in typical programs, there is no
such option: One has to call some lock() function, and this function will not
return until the lock is acquired. An exception are libraries that support time-
outs for locking (i.e., the lock-operation returns with an error after waiting a
certain time for the lock), or tryLock()-operations (i.e., the lock-operation im-
mediately returns with an error if the lock is not free). Such operations can be
over-approximated by nondeterministically deciding whether to acquire the lock
or not.
For example, the java.util.concurrent.locks-package supports both, time-
outs and tryLock()-operations, while the original Java locking mechanism via
synchronized methods does not support either concept.
Locks that cannot be “escaped” by nondeterministically choosing another tran-
sition are called inescapable. Accordingly, a Monitor-DPN where all locks are
inescapable is called an inescapable Monitor-DPN.
Definition 7.4 (Inescapable Monitor-DPNs). Let
M = (P,Γ,Γ⊥,Act,X ,∆, locks)




↪→ p1w1 ∈ ∆ ∧ pγ o↪→ p2w2 ∈ ∆ =⇒ o = 〈x














The execution starts with a single thread at p that holds no locks. The correspond-
ing DPN is obviously inescapable and deadlock-free. We want to check whether
the labels u,v, and w are simultaneously reachable. The only lock-a/r-hedge that
reaches a configuration that is simultaneously at u,v, and w is h = [(t, ∅)] with
t = 〈〉∅(〈y〈〉∅(〈〉{x}(ε)τw)τv)〈x〈〉{y}(ε)τu










Obviously, h has no multiple final acquisitions of the same lock, and thus satisfies
(C1). As no locks are initially held by h, it trivially satisfies (C2). However, the
dependence-graph of h has a cycle:
〈x → 〈〉{y} → 〈y → 〈〉∅ → 〈〉{x} → 〈x
Hence, the labels u, v, and w are not simultaneously reachable.
The cycle in the dependence-graph leads to a cyclic acquisition-graph. Thus,




However, we do not need release-graphs: We show that any lock-a/r-hedge from
a reachable configuration of an inescapable, deadlock-free DPN has an acyclic
release-graph:
Lemma 7.6. Let $M$ be an inescapable, deadlock-free DPN with start configuration
$c_0 \in \mathit{Conf}_{ls}$. Let $c \in \mathit{Conf}_{ls}$ be a reachable configuration (i.e., $c_0 \to^{*}_{ls} c$) and $h$ an
execution-hedge from $c$ (i.e., $\exists c'.\ c \overset{h}{\Longrightarrow} c'$). Moreover, assume that the corresponding
lock-a/r-hedge $ar(h \times ls(c))$ satisfies (C1) and (C2). Then, its release-graph
$E_{rel}(ar(h \times ls(c)))$ is acyclic.
In the proof of this lemma, we will construct a partial schedule of a lock-
execution-hedge that leads to a deadlock. In order to connect the partial schedule
of the lock-execution-hedge to a run of the interleaving semantics, we need the
following lemma:
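Lemma 7.7. Let $h \in H$ be an execution-hedge and $c \in \mathit{Conf}_{ls}$ be a consistent configuration.
Then, for all $c'$ and $h_s \times \bar\mu_s \in H_{ls}$:
\[ \begin{aligned}
& c \overset{h}{\Longrightarrow} c' \;\wedge\; h \times ls(c) \leadsto^{*}_{ls} h_s \times \bar\mu_s \\
& \implies \exists \tilde c, \tilde h.\ c \overset{\tilde h}{\Longrightarrow} \tilde c \overset{h_s}{\Longrightarrow} c' \;\wedge\; sched_{ls}(\tilde h \times ls(c)) \neq \emptyset \;\wedge\; ls(\tilde c) = \bar\mu_s.
\end{aligned} \]
Proof. The lemma is shown by induction on $\leadsto^{*}_{ls}$. If $\leadsto^{*}_{ls}$ makes no steps, we have
$h = h_s$ and $ls(c) = \bar\mu_s$. Moreover, we have $c \overset{\tau^{|h|}}{\Longrightarrow} c \overset{h}{\Longrightarrow} c'$ and $sched_{ls}(\tau^{|h|} \times ls(c)) = \{\varepsilon\}$,
where $\tau^{|h|}$ is the execution-hedge that consists of $|h|$ leaf-nodes. Choosing
$\tilde c := c$ and $\tilde h := \tau^{|h|}$ completes the case.
In the case that $\leadsto^{*}_{ls}$ makes at least one step, we obtain $\hat h \times \hat{\bar\mu} \in H_{ls}$, such that
\[ h \times ls(c) \leadsto_{ls} \hat h \times \hat{\bar\mu} \leadsto^{*}_{ls} h_s \times \bar\mu_s. \]
If the first step is not a spawn, we split $h$, $c$, $\hat h$, and $\hat{\bar\mu}$ according to the first step. We
obtain $h_1$, $h_2$, $o$, $\hat t$, $\hat\mu$, $c_1$, and $c_2$ with
\[ h = h_1 (o; \hat t) h_2, \quad c = c_1 \varphi c_2, \quad \hat h = h_1 \hat t h_2, \quad \text{and} \quad \hat{\bar\mu} = ls(c_1)\, \hat\mu\, ls(c_2), \]
such that $o$ can be scheduled in the context of $ls(c_1 c_2)$. From $c \overset{h}{\Longrightarrow} c'$, we obtain
$\hat\varphi$, $c'_1$, $c'_2$, and $c'_\varphi$ such that
\[ \varphi \overset{o}{\Longrightarrow} \hat\varphi \overset{\hat t}{\Longrightarrow} c'_\varphi, \quad c_1 \overset{h_1}{\Longrightarrow} c'_1, \quad \text{and} \quad c_2 \overset{h_2}{\Longrightarrow} c'_2. \]
As the operation $o$ has the same effect on the lockstack, whether executed by the
scheduler or by the Monitor-DPN, we have $\hat\mu = ls(\hat\varphi)$. Together, we get
\[ \hat h \times ls(c_1 \hat\varphi c_2) \leadsto^{*}_{ls} h_s \times \bar\mu_s \;\wedge\; c_1 \hat\varphi c_2 \overset{\hat h}{\Longrightarrow} c'. \]
The induction hypothesis yields $\tilde c$ and $\tilde h$ such that
\[ c_1 \hat\varphi c_2 \overset{\tilde h}{\Longrightarrow} \tilde c \overset{h_s}{\Longrightarrow} c' \;\wedge\; sched_{ls}(\tilde h \times ls(c_1 \hat\varphi c_2)) \neq \emptyset \;\wedge\; ls(\tilde c) = \bar\mu_s. \]
Splitting $\tilde h$, we obtain $\tilde h_1$, $\tilde h_2$, $\tilde t$, $\tilde c_1$, $\tilde c_\varphi$, and $\tilde c_2$ such that
\[ \tilde h = \tilde h_1 \tilde t \tilde h_2, \quad \tilde c = \tilde c_1 \tilde c_\varphi \tilde c_2, \quad c_1 \overset{\tilde h_1}{\Longrightarrow} \tilde c_1, \quad \hat\varphi \overset{\tilde t}{\Longrightarrow} \tilde c_\varphi, \quad c_2 \overset{\tilde h_2}{\Longrightarrow} \tilde c_2. \]
Prepending the execution $\varphi \overset{o}{\Longrightarrow} \hat\varphi$, we get $\varphi \overset{o; \tilde t}{\Longrightarrow} \tilde c_\varphi$, and thus $c \overset{\tilde h_1 (o; \tilde t) \tilde h_2}{\Longrightarrow} \tilde c_1 \tilde c_\varphi \tilde c_2$. As
$o$ is schedulable in the context of $ls(c_1 c_2)$, we also have
\[ (\tilde h_1 (o; \tilde t) \tilde h_2) \times ls(c) \leadsto_{ls} \tilde h \times ls(c_1 \hat\varphi c_2), \]
which completes the case.
The case of a spawn-step is proved analogously.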
Note that the above proof could be done more conceptually with the notion of
a prefix of an execution-hedge: Intuitively, we can prove the lemma as follows:
The scheduler schedules a prefix of the hedge $h$, leaving the suffix $h_s$. Let this
prefix be $\tilde h$. Then, we split the execution $c \overset{h}{\Longrightarrow} c'$ into the execution of the prefix $\tilde h$
and the suffix $h_s$, obtaining a configuration $\tilde c$ with $c \overset{\tilde h}{\Longrightarrow} \tilde c \overset{h_s}{\Longrightarrow} c'$. As the scheduler
and the rules of the Monitor-DPN alter the lockstack in the same way, we have
$\bar\mu_s = ls(\tilde c)$. As the scheduler schedules the complete prefix $\tilde h \times ls(c)$ of $h \times ls(c)$, we
have $sched_{ls}(\tilde h \times ls(c)) \neq \emptyset$.
However, formalizing the notion of a prefix of an execution-hedge and con-
necting this notion with the hedge semantics and the scheduler requires quite
a large formal overhead. As this is the only place in this thesis where such a
notion is needed, we avoided this overhead, and performed a more direct (but
less conceptual) induction proof here.
Now, we are prepared to prove Lemma 7.6:
Proof of Lemma 7.6. By contraposition, we may assume that $E_{rel}(ar(h \times ls(c)))$
is cyclic and construct a schedule to a deadlocked configuration. Let $\tilde h :=
ar(h \times ls(c))$ be the corresponding lock-a/r-hedge.
First, we schedule an initial part of $\tilde h$: We repeatedly select a minimal node in
the dependence-graph and schedule it, until there are no minimal nodes left. As $\tilde h$
satisfies the criteria (C1) and (C2), and (C1) and (C2) are preserved by scheduling
steps, minimal nodes in the dependence-graph can always be scheduled. We get a
lock-a/r-hedge $\tilde h_1$ and a schedule from $\tilde h$ to $\tilde h_1$, such that $g_{dep}(\tilde h_1)$ contains no minimal
nodes.
All non-terminated threads in $\tilde h_1$ start with an acquisition- or a use-node, as
release-nodes have no incoming additional edges in $g_{dep}(\tilde h_1)$, and thus would be
minimal. Moreover, as the release-graph was cyclic, at least one¹ thread starts
with a use-node.
Let D ⊂ N be the set of threads that start with a use-node. (Threads are
identified by their position in the list h˜1.) Each thread i ∈ D starts with a
use-node 〈〉X that has an incoming edge in the dependence-graph. This incoming
edge must result from a release-node 〉x of one of the used locks x ∈ X. This
release-node cannot be in one of the threads outside D, as they are terminated
or start with final acquisitions, and due to well-formedness, initial releases do
not occur after final acquisitions. Let the index of the thread with the 〉x-node
be j ∈ D. Due to well-nestedness, the initially released lock must be on the
lockstack, i.e., we have x ∈ µj. Moreover, we have i ≠ j, as otherwise the usage
of x would be reentrant. Hence each thread in D starts with a usage that uses a
lock that is currently acquired by some other thread in D.
At this point, we have constructed a deadlock w.r.t. the scheduler for lock-
a/r-hedges. However, deadlocks are defined on the more fine grained interleaving
semantics. Thus, we transfer the deadlock to the scheduler for lock-execution-hedges
and then to the interleaving semantics: The schedule from $\tilde h$ to $\tilde h_1$ corresponds
to a schedule on lock-execution-hedges. We get $h_1 \times \bar\mu_1 \in H_{ls}$ such that $\tilde h_1 =
ar(h_1 \times \bar\mu_1)$ and $h \times ls(c) \leadsto^{*}_{ls} h_1 \times \bar\mu_1$. We continue scheduling $h_1 \times \bar\mu_1$ with the
lock-sensitive scheduler. For each thread in $D$, this schedule must get stuck before
completing the first usage, just before an acquisition of a lock that is currently
held by another thread in $D$. We get $h_2 \times \bar\mu_2 \in H_{ls}$ such that $h \times ls(c) \leadsto^{*}_{ls} h_2 \times \bar\mu_2$,
and for each thread in $D$, the first node of the corresponding tree in $h_2$ is an
acquisition of a lock held by some other thread in $D$. Using Lemma 7.7, we
obtain $c_p$, $h_p$ such that $c \overset{h_p}{\Longrightarrow} c_p \overset{h_2}{\Longrightarrow} c'$, $sched_{ls}(h_p \times ls(c)) \neq \emptyset$, and $ls(c_p) = \bar\mu_2$. By
Theorem 3.25, we also have $c \to^{*}_{ls} c_p$; thus, as $c$ is reachable, $c_p$ is also a reachable
configuration. Moreover, in $h_2 \times \bar\mu_2$, each thread in $D$ starts with an acquisition
of a lock that is currently held by some other thread in D. These acquisitions
correspond to acquisition-rules of M that are applicable to the threads in cp, and
as M is inescapable, there are no alternatives to these acquisition-rules. Hence,
cp is deadlocked.
The previous lemma implies that predecessor sets of inescapable, deadlock-free
DPNs can be computed precisely without using release-graphs, as long as only
(lock-insensitively) reachable configurations are considered.
¹ Actually, cycles in the release-graph have a minimum length of two, thus at least two threads
start with a use-node.
7.3.2 No Spawn inside Monitors
In the last subsection, we have shown that we do not need a release-graph when
analyzing deadlock-free, inescapable DPNs. However, as was illustrated in Example 7.5,
the acquisition-graph is still required for this class of DPNs. In this
subsection, we regard Monitor-DPNs that allow spawn-operations only when the
spawning thread is outside any monitor. Typically, the operations inside moni-
tors just access some protected data. Hence, this assumption may be realistic for
many programs.
We show that we need neither acquisition- nor release-graphs for precise
reachability analysis of inescapable, deadlock-free DPNs without
spawn-operations inside monitors.
Lemma 7.8. Let $M$ be an inescapable, deadlock-free Monitor-DPN with start configuration
$c_0 \in \mathit{Conf}_{ls}$, such that spawn-operations may only be performed outside locks. Formally, we require
that in each lock-a/r-hedge from $c_0$, there must be no path from an acquisition-node
to a use-node with a non-empty list of spawned threads. Moreover, let $c =
p_1w_1 \ldots p_nw_n$ be a reachable lock-configuration, and $\tilde h$ be a lock-a/r-hedge from $c$,
i.e., we have a lock-execution-hedge $h$ and a configuration $c' \in \mathit{Conf}$ such that
\[ c_0 \to^{*}_{ls} c \;\wedge\; c \overset{h}{\Longrightarrow} c' \;\wedge\; \tilde h = ar(h \times ls(c)). \]
Finally, assume that $\tilde h$ satisfies (C1) and (C2).
Then, the dependence-graph $g_{dep}(\tilde h)$ of $\tilde h$ is acyclic.
Proof. First note that the precondition implies that lock-a/r-hedges from any
reachable configuration, in particular h˜, contain no paths from acquisition-nodes
to use-nodes that spawn threads.
The proof is similar to the proof of Lemma 7.6, i.e., we assume that the
dependence-graph is cyclic, and construct a schedule to a deadlocked configuration.
As in the proof of Lemma 7.6, we repeatedly schedule minimal nodes of $g_{dep}(\tilde h)$
until there are no minimal nodes left, and get a schedule from $\tilde h$ to $\tilde h_1$, such that
$g_{dep}(\tilde h_1)$ contains no minimal nodes, $\tilde h_1$ satisfies (C1) and (C2), and we have a
corresponding schedule of the lock-execution-hedge, i.e., $h \times ls(c) \leadsto^{*}_{ls} h_1 \times \bar\mu_1$ and
$\tilde h_1 = ar(h_1 \times \bar\mu_1)$.
All non-terminated threads in h˜1 start with an acquisition- or a use-node. If
there is a thread starting with a use-node, we deduce the contradiction anal-
ogously to the proof of Lemma 7.6. So assume all non-terminated threads in
h˜1 start with an acquisition-node. Thus h˜1 contains no use-nodes that spawn
threads.
Let D be the set of non-terminated threads in h˜1. In the proof of Lemma 7.6,
any schedule of h˜1 got stuck in a deadlock. In this case, however, we have to
carefully construct the schedule to a deadlock.
In order to reach a deadlock, we do not need to schedule $\tilde h_1$ completely. We
remove leaf-nodes from $\tilde h_1$. (In this context, a leaf-node is a node that has $\tau$ as its
successor.) Removing nodes may break cycles in the dependence-graph.
We repeatedly remove leaf-nodes until there are no more leaf-nodes that
can be removed without making the dependence-graph acyclic. We arrive at
a lock-a/r-hedge $\tilde h_2$ that is a prefix of $\tilde h_1$. Let $h_2$ be the corresponding
lock-execution-hedge, i.e., we have $\tilde h_2 = ar(h_2 \times ls(\bar\mu_1))$.
In h˜2, each thread from D ends with a use-node, as all other nodes would
have been removed without breaking a cycle. Let h˜2 = (t1, X1) . . . (tn, Xn), and
consider a thread i ∈ D. The tree ti spawns no threads, i.e., it is actually a
list. The last node is a use-node that is part of a cycle in the dependence-graph,
i.e., it has an outgoing edge to some acquisition-node in a thread j ∈ D. As the
use-node is non-reentrant, we have j ≠ i. Hence, each thread in h˜2 ends with a
use-node that uses some lock that is finally acquired in some other thread of D.
Next, we remove all the use-nodes at the leaves of $\tilde h_2$, arriving at a lock-a/r-hedge
$\tilde h_3$ with corresponding execution-hedge $h_3$, i.e., $\tilde h_3 = ar(h_3 \times ls(\bar\mu_1))$. The
dependence-graph of $\tilde h_3$ is acyclic, i.e., it satisfies (C3). Moreover, as $\tilde h_1$ contains
no initial release-nodes and satisfies (C1) and (C2), and $\tilde h_3$ contains fewer nodes
than $\tilde h_1$, $\tilde h_3$ also satisfies (C1) and (C2). Hence, we can completely schedule $\tilde h_3$.
Thus, also the corresponding lock-execution-hedge h3×ls(µ¯1) is schedulable. As
h3 is a prefix of h1, we can combine this with the schedule h×ls(c) ;∗ls h1×µ¯1,
resulting in a schedule h×ls(c) ;∗ls hs×µ¯s for a lock-execution-hedge hs×µ¯s. By
construction of h3, in hs×µ¯s, each thread i ∈ D starts with the usage of a lock
that was finally acquired by another thread j ∈ D during the schedule of the
prefix of h. Thus, in µ¯s, thread j has this lock on its lockstack. Analogously to
the proof of Lemma 7.6, we construct an execution of the interleaving semantics
to a deadlocked configuration.
7.3.3 Deadlock Detection
In the last subsections, we have shown that we may omit the release-graph when
considering inescapable, deadlock-free DPNs. Moreover, if the DPN additionally
spawns no threads while inside monitors, we may omit both, the acquisition and
the release-graph.
Checking whether a DPN is inescapable is straightforward by examining the
rules. Checking whether a DPN spawns no threads inside monitors is also
straightforward: Whether a lock-a/r-hedge has a path from an acquisition- to
a use-node that spawns threads is a regular property, and the methods of Chap-
ter 5 can be used to check whether the original DPN has an execution with such
a path.
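The syntactic check for inescapability can be sketched as follows. The sketch assumes one particular reading of Definition 7.4, namely that whenever some rule with head pγ is labeled by an acquisition 〈x, every rule with the same head is labeled by 〈x as well. The rule encoding as (p, gamma, action, target) tuples and the function name are hypothetical.

```python
def is_inescapable(rules):
    """rules: iterable of (p, gamma, action, target) tuples.

    action -- a (kind, name) pair; an acquisition of lock x is ("acq", x)
    """
    # Collect the set of labels of all rules sharing the same head (p, gamma).
    labels_by_head = {}
    for p, gamma, action, _target in rules:
        labels_by_head.setdefault((p, gamma), set()).add(action)
    for labels in labels_by_head.values():
        has_acquisition = any(kind == "acq" for kind, _name in labels)
        # An acquisition must be the only label available at its head.
        if has_acquisition and len(labels) > 1:
            return False
    return True
```

A head that offers both an acquisition and some other step, as in the OR-example above, is rejected, as the thread could escape the acquisition there.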
In order to check whether a DPN is deadlock-free, we observe that the set
Cdead of deadlocked configurations is regular. For any set S of actions, the set
{pw | act(pw) = S} of configurations with that set of actions is regular. Whether
a rule is applicable depends only on the control-state and the topmost stack-
symbol. Hence, we can write the set as follows:
\[ \{pw \mid act(pw) = S\} = \{p\gamma\tilde w \mid \tilde w \in \Gamma^* \wedge S = \{o \mid \exists c.\ p\gamma \overset{o}{\hookrightarrow} c \in \Delta\}\}. \]
The set on the right-hand side of this equation is obviously regular. Moreover,
it is sufficient for a deadlock to only regard threads that currently hold locks,
plus at most one thread that does not hold a lock. Hence, for any deadlocked
configuration, we find a set D such that |D| ≤ |X |+1 according to Definition 7.3.
Thus, Cdead can be characterized as follows:
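\[ \begin{aligned}
C_{dead} = \bigcup \bigl\{ & \mathit{Conf}_{ls}\{[(p_1w_1)]\}\mathit{Conf}_{ls} \ldots \mathit{Conf}_{ls}\{[(p_nw_n)]\}\mathit{Conf}_{ls} \;\big|\; 0 < n \leq |\mathcal{X}| + 1 \\
& \wedge\ \forall 0 < i \leq n.\ \bigl( (act(p_iw_i) = \emptyset \wedge ls(w_i) \neq \varepsilon) \\
& \qquad \vee\ (\exists S \neq \emptyset.\ act(p_iw_i) = S \wedge \forall o \in S.\ \exists 0 < j \leq n, x \in ls(w_j).\ i \neq j \wedge o = \langle x) \bigr) \\
& \wedge\ \exists 0 < i \leq n.\ act(p_iw_i) \neq \emptyset \bigr\} \cap \mathit{Conf}_{ls}.
\end{aligned} \]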
The set on the right-hand side is obviously regular, and we can decide whether a
DPN with start configuration c0 is deadlock-free, by deciding
c0 /∈ pre∗ls(Cdead).
However, an automaton for Cdead has exponential size in the number of locks.
Thus, a precise deadlock analysis may not be worth the effort. However, there
are sufficient criteria for a DPN to be deadlock-free that are simpler to check,
e.g. global lock ordering.
Global lock ordering is a well-known criterion for ensuring absence of deadlocks.
We adapt this notion to Monitor-DPNs here: We define a relation < ⊆ X × X ,
such that x < y iff there is an execution from the start configuration in which a
thread acquires y while holding x. If < is acyclic, the DPN adheres to a global
lock ordering.
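Once the relation < has been collected, checking for a global lock ordering amounts to a cycle check on a directed graph over the locks. The following depth-first search is a generic sketch; the encoding of < as a list of pairs and the function name are assumptions made for illustration. Note that a pair (x, x) counts as a cycle in this sketch.

```python
def has_global_lock_ordering(pairs):
    """pairs: iterable of (x, y), meaning some thread acquires y while holding x."""
    successors = {}
    for x, y in pairs:
        successors.setdefault(x, set()).add(y)
        successors.setdefault(y, set())

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the DFS stack, finished
    color = {x: WHITE for x in successors}

    def on_cycle(x):
        color[x] = GREY
        for y in successors[x]:
            # a grey successor closes a cycle; white successors are explored
            if color[y] == GREY or (color[y] == WHITE and on_cycle(y)):
                return True
        color[x] = BLACK
        return False

    return not any(color[x] == WHITE and on_cycle(x) for x in successors)
```

The search visits every lock and every pair at most once, so the check is linear in the size of the relation.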
Lemma 7.9. If a Monitor-DPN M has a global lock ordering, and each thread
that is inside a monitor has a possible action (i.e., threads may not terminate
while inside monitors), then M is deadlock-free.
Proof. Assume the DPN reaches a deadlocked configuration. W.l.o.g. let this con-
figuration be c = p1w1 . . . pnwn. By definition of a deadlock (cf. Definition 7.3),
we obtain a non-empty set D ⊆ {1, . . . , n}, such that
∀i ∈ D. act(piwi) ≠ ∅ ∧ ∀o ∈ act(piwi). ∃j ∈ D, x ∈ ls(wj). i ≠ j ∧ o = 〈x.
Note that, by assumption, there are no terminated threads that hold locks. Thus
we omitted the case act(piwi) = ∅ ∧ ls(wi) ≠ ε.
We now construct a cycle in <. If we remove the threads that hold no locks
from D, the reduced set D still satisfies the deadlock condition. (Note that the
existential quantification ∃j ∈ D only considers threads that hold at least one
lock.) Moreover, D contains at least one thread that holds a lock. Thus, we may
safely assume that all threads in D hold at least one lock. In order to construct
the cycle, we start with some thread i1 ∈ D. By definition of <, we have x1 < x2
for each lock x1 ∈ ls(wi1) and x2 with 〈x2 ∈ act(pi1wi1). As both sets are not
empty, we can pick such a pair x1, x2, and find a thread i2 ≠ i1 with x2 ∈ ls(wi2).
Continuing this argumentation inductively, we get a chain x1 < x2 < x3 < . . ..
However, as there are only finitely many locks, this implies a cycle in <.
Note that the other direction of Lemma 7.9 is not true in general, i.e., there
are deadlock-free programs that do not adhere to a global lock ordering. For














Clearly, we have y < z < y. However, the program has no deadlock: Intuitively,
the lock x prevents that two threads simultaneously reach the location where <
is cyclic. Thus, a global lock ordering is a sufficient, but not a necessary criterion
for absence of deadlocks.
Checking whether each (lock-insensitive) execution of a Monitor-DPN adheres
to a global lock ordering, as well as checking whether threads may terminate
while holding locks, can be done by a straightforward, polynomial time analysis.
Such an analysis could be done before the actual predecessor set computation,
in order to determine whether the DPN meets the criteria to omit the release-
and/or acquisition-graph.
7.4 Example
In this section, we resume the example from the introduction (Example 1.1).
Although the scope of this thesis is limited to the analysis of Monitor-DPNs,
this example covers all steps from the Java program to the actual predecessor
set computation. However, the abstraction step from the Java program to the
Monitor-DPN is done with an ad-hoc technique.
We start with the following program written in Java [47]:
public class Example implements Runnable {
    private static void write_terminal(String s) {
        for (int i = 0; i < s.length(); ++i) {
            System.out.print(s.charAt(i));
            try {
                Thread.yield();
            } catch (Exception e) {}
        }
        System.out.flush();
    }

    private synchronized void write(String s) {
        g5: write_terminal(s);
        g6: {}
    }

    public void run() {
        g3: write("Hello");
        g4: {}
    }

    public static void main(String args[]) {
        g0: new Thread(new Example()).start();
        g1: new Thread(new Example()).start();
        g2: {}
    }
}
Note that the labels g0,g1,. . . have been added to illustrate the connection
of this program and its translation to a DPN that is presented below. The pro-
gram resembles the one from Example 1.1, except that we have instrumented the
write_terminal() method with a call to Thread.yield(), in order to increase
the probability that the concurrency error manifests itself.
The main method of this program spawns two instances of the Example-thread.
The run-method of this thread calls the write-method. This is a synchronized
method that accesses the terminal to print a string. The access to the terminal
is done by the write_terminal-method that models an unsynchronized access
to the terminal. It may be interrupted after each letter is printed. Additionally,
we hint the scheduler to actually interrupt the thread after each letter, using
the yield-method. This increases the probability that the data-race actually
manifests itself in erroneous output. Probably, the programmer expected this
program to output "HelloHello". However, when running the program a few
times, one observes different outputs. On the author’s machine, for example, the
most frequent output was "HHeelllloo", and sometimes, the author got
"HHelloello", or even the correct output "HelloHello". Obviously, the above program
contains a concurrency error. Deeper inspection of the program quickly reveals
the error: The write-method is synchronized, however it synchronizes on the
Example-object of the current thread. As each thread has its own instance of
Example, the two threads synchronize write on two different monitors, resulting
in a data-race on the write-access to the terminal. Note that, when the call to
yield() after the output of each letter is omitted, this concurrency error manifests
itself only with very small probability and may easily be missed by testing. Thus, this program is
a very simple example that motivates the use of automatic verification methods
for concurrent programs. Moreover, it is well-suited to illustrate our analysis,
point out possible implementation issues, and propose solutions to them, while
keeping the intermediate results at a suitable size for manual calculation.
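Before turning to the Monitor-DPN, the error can be reproduced on a tiny explicit-state model. The sketch below is not the analysis of this thesis: it enumerates all interleavings of a fixed number of threads, each of which enters write() under its monitor, leaves it, and terminates. The three-valued program counter and the function name are assumptions made for illustration.

```python
from collections import deque


def both_in_write(lock_of_thread):
    """lock_of_thread: tuple naming the monitor each thread synchronizes on.

    Returns True iff some interleaving has two threads inside write() at once.
    """
    n = len(lock_of_thread)
    # program counter per thread: 0 = before write(), 1 = inside write()
    # (holding its monitor), 2 = finished
    start = (0,) * n
    seen, todo = {start}, deque([start])
    while todo:
        pcs = todo.popleft()
        if sum(pc == 1 for pc in pcs) >= 2:
            return True
        held = {lock_of_thread[i] for i in range(n) if pcs[i] == 1}
        for i in range(n):
            if pcs[i] == 2:
                continue              # thread i has terminated
            if pcs[i] == 0 and lock_of_thread[i] in held:
                continue              # monitor is taken: thread i blocks
            successor = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
            if successor not in seen:
                seen.add(successor)
                todo.append(successor)
    return False
```

With one monitor per Example-instance, as in the program above, the search finds a state with both threads inside write(); with a single shared monitor it does not.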
The first step of the analysis is the translation of the Java program to a Monitor-
DPN. We abstract from the write_terminal-method, and replace it by a base-
action labeled w (for write-access). We construct the Monitor-DPN
M = (P,Γ,Γ⊥,Act,X ,∆, locks)
with
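$P = \{p\}$, $Act = \{a, w\}$,
$\Gamma = \{\gamma_5^1, \gamma_5^2, \gamma_6^1, \gamma_6^2\} \cup \Gamma_\bot$, $\Gamma_\bot = \{\gamma_0, \gamma_1, \gamma_2, \gamma_3^1, \gamma_3^2, \gamma_4^1, \gamma_4^2\}$,
$\mathcal{X} = \{x^1, x^2\}$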






























The stack-symbols γ0, γ1, . . . correspond to the labels g0,g1,. . . in the Java pro-
gram. The run methods and the main method have no return-rules, thus match-
ing our constraint that the bottommost stack-symbol must not be popped. The
two instances of the Example-class are resolved by duplication of the methods
and locks. We added the superscripts $\cdot^1$ and $\cdot^2$ to the affected stack-symbols and
locks. For the spawn-rules, we introduced a dummy action a that has no further
meaning, but is syntactically required, as any rule must have a label. One can
easily verify that M is a Monitor-DPN.
In general, translation of arbitrary Java programs to Monitor-DPNs is not as
easy as for this simple example. A major issue is pointer analysis, and its impli-
cations for locks: While DPNs only support boundedly many directly addressed
locks, Java programs may have unboundedly many locks that are addressed by
reference. When abstracting a Java program to a Monitor-DPN, a monitor from
Java can only be translated to a monitor in the DPN if pointer analysis has not
summarized different objects. If, for example, pointer analysis summarized
the two instances of the Example-object in our program, we could not identify
the summarized instance with a single lock, as the translated DPN would have
no data-race any more, and the abstraction would be unsound. General methods
for abstraction of concurrent Java programs to Monitor-DPNs are beyond the
scope of this thesis.
The next step of our analysis is to specify the error condition to be checked,
and to translate it to a predecessor set computation. We want to check whether
more than one thread can simultaneously call write_terminal. On the DPN,
this translates to checking whether more than one thread can do a $\Box w$-action
simultaneously. As only rules from $p\gamma_5^1$ and $p\gamma_5^2$ may perform a $\Box w$-action, this is
equivalent to checking whether a configuration from the regular set
\[ C := (P \cup \Gamma)^* P \{\gamma_5^1, \gamma_5^2\} (P \cup \Gamma)^* P \{\gamma_5^1, \gamma_5^2\} (P \cup \Gamma)^* \]
is reachable from the start configuration $p\gamma_0$. Hence, the DPN cannot reach two
$\Box w$-actions simultaneously, if and only if
\[ \mathit{pre}^{*}_{ls}(C) \cap p\gamma_0 = \emptyset. \]
In general the abstraction from the Java program to the DPN may introduce
spurious executions. Thus, if the above statement holds, the Java program def-
initely cannot call write_terminal simultaneously from two threads. However,
if the above statement does not hold, this only indicates a possible error: Either,
the Java program has a real error, or the error is due to some spurious execution
introduced by the abstraction. Note that, in our simple example, the abstraction
is precise, i.e., it introduces no spurious executions.
The set C is accepted by the following automaton A:
[Figure: automaton A with self-loops labeled P ∪ Γ, connected by two consecutive pairs of edges labeled P and {γ15 , γ25}.]
As we check for reachability from the start configuration, the optimizations
sketched in Section 7.2 can be applied: We may omit the release-set, the release-
graph, and the set of initially held locks, and we need not intersect the result
with Conf ls. Additionally, we may omit the initial-release flag in DACons .
Following the construction from the proof of Theorem 6.5, we first construct
the automaton ACons on lock-a/r-hedges. Then, we translate ACons to the DPN-
Acceptor DACons , which has the components DACons = (ACI ,ACF ,M2) with M2 =
(P2,Γ2,Γ⊥,2,Act,X ,∆2, locks2). Next, we do a cross-product construction be-
tween M and DACons , yielding the DPN M× = (P×,Γ×,ActX ,∆×), automata ACI×
and ACF× , and the projections pi1, pi2. Finally, we compute the automaton
pre′∗ls (A) := pi1(pre∗M× (pi−11 (A) ∩ ACF×) ∩ Avalid× ∩ pi−12 (ACI)),
and check pγ0 ∈ pre′∗ls (A).
The control-symbols in P× have the form
(p, (p, X, (u, a, ga))) or (p, (u(u˜,a˜,g˜a), X, Y, (u, a, ga))),
where p is the single control-state of M , X is the set of currently acquired locks,
Y is the set that accumulates used locks3, (u, a, ga) and (u˜, a˜, g˜a) are acquisi-
tion structures without release-sets and release-graphs. In order to make the
presentation more readable, we omit the control-state p of M , and just write
(p, X, (u, a, ga)) and (u(u˜,a˜,g˜a), X, Y, (u, a, ga)).
The stack-symbols in Γ× have the form (γ, xb) with γ ∈ Γ and locks(γ) = {x}
or (γ⊥,⊥) with γ⊥ ∈ Γ⊥. Note that, in our special case, all symbols are either
bottom stack-symbols or are associated with a lock, such that there are no stack-
symbols γ with γ ∈ Γ \ Γ⊥ and locks(γ) = ∅.







Here, we use the abbreviations
PF := {(p, X, ∅3) | X ⊆ X},
Γ5 := {(γi5, xi,b) | i ∈ {1, 2}, b ∈ B}, and
∅3 := (∅, ∅, ∅).
Next, we convert this automaton to an M-automaton. This is achieved by
duplicating some states. Note that, after this conversion, words that do not
represent configurations (i.e., that do not start with a control-symbol) are not
accepted any more. The resulting M-automaton has the following shape:
3In Definition 5.2, this set is named u. However, this name would clash with the u-component









[Figure: the resulting M-automaton, including an edge labeled (p, {x2}, ∅).]















This reveals a problem: Due to the M-automata condition, we have to create a
separate node for each control-state. For edges not present in the automaton, we
may omit these nodes and add them on demand during the saturation. However,
for edges present in the automaton, we have to actually create the separate states.
This results in an exponential blowup in the number of locks, as we have to add
a state for every possible set of allocated locks. Note that the automaton accepts
configurations with arbitrary stack, thus all sets of locks may actually occur on
some stack.
However, when inspecting our example DPN more deeply, we observe that not
all of these configurations are reachable, even lock-insensitively. Actually, only a
single configuration in L(A) is reachable lock-insensitively: pγ15γ14pγ25γ24pγ2. Using








The automaton for pi−11 (A′) ∩ pi−12 (ACF) is constructed accordingly. However,
we do not include configurations where the lockstack does not match the set of
acquired locks stored in the DPNs control-state. This is a correct optimization, as
ACI ensures that the locks match the stored sets in the initial configuration, and
this property is invariant under transitions of M2. As our automaton A′ accepts
only one configuration, there is only one set of locks for each control-state, and
we get the following automaton:
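[Figure: automaton over the states q1, q2, q3, q4, ..., with one edge per symbol of the configuration
(p, {x1}, ∅3) (γ15 , x1,⊥) (γ14 ,⊥)
(p, {x2}, ∅3) (γ25 , x2,⊥) (γ24 ,⊥)
(p, ∅, ∅3) (γ2,⊥)
The symbols of the first row lead from q1 to q4.]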
Note that we already introduced the ε-transitions necessary to make this au-
tomaton an M-automaton. We now apply the saturation algorithm. For this,
we show how to instantiate the rule-templates of ACons (Definition 4.19), DACons
(Definition 5.2), and the cross-product construction (Section 5.2) to obtain the
rules of M× that are used for the saturation.
1. Regard the rule
pγ13 〈x1↪→ pγ15γ14 ∈ ∆
and the rule
(p, ∅, as〈t(x1, ∅3))⊥ 〈x1↪→ (p, {x1} ∪ ∅, ∅3)x1,⊥⊥ ∈ ∆2.
The latter one is obtained by instantiating the (acq-n)-rule of DACons with the
acquisition-rule of ACons, the empty set of acquired locks, and the empty
acquisition structure ∅3. These rules are paired by the (acq)-rule of the
cross-product construction, yielding the rule
(p, ∅, (∅, {x1}, ∅))(γ13 ,⊥) 〈x1↪→ (p, {x1}, ∅3)(γ15 , x1,⊥)(γ14 ,⊥).
The right-hand side of this rule is recognized by the automaton from state q1
to q4. Hence, the saturation algorithm adds a shortcut using the left-hand
side. As there is no transition from q1 for the control-state (p, ∅, (∅, {x1}, ∅))
yet, we create a state q12, and add the transitions
q1 --(p, ∅, (∅, {x1}, ∅))--> q12 and q12 --(γ13 ,⊥)--> q4
to the automaton.
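2. Similarly to Step 1, we create a state q16 and add the transitions
q5 --(p, ∅, (∅, {x2}, ∅))--> q16 and q16 --(γ23 ,⊥)--> q8
to the automaton.
3. Regard the spawn-rule of ∆ with left-hand side pγ1 and the rule
(p, ∅, (∅, {x2}, ∅))⊥ 2a↪→ (p, ∅, (∅, {x2}, ∅))⊥](p, ∅, ∅3)⊥ ∈ DACons ,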
that is obtained by instantiating the (spawn)-rule of DACons with
〈〉∅([(∅, {x2}, ∅)])∅3 →∗ACons (∅, {x2}, ∅),
which, in turn, is easily derived from the rules of ACons. The (spawn)-rule
of the cross-product construction pairs these rules, yielding
(p, ∅, (∅, {x2}, ∅))(γ1,⊥) 2a↪→ (p, ∅, (∅, {x2}, ∅))(γ23 ,⊥)](p, ∅, ∅3)(γ2,⊥).
The right-hand side of this rule is recognized by the automaton, using the
sequence of states q5q16q8q9q10q11. Hence, we have to add the left-hand side
as a shortcut from state q5 to q11. There is already the state q16 for the
control-symbol (p, ∅, (∅, {x2}, ∅)), hence we only add the transition
q16 --(γ1,⊥)--> q11.
4. Similarly to Step 3, we add the state q22 and the transitions
q1 --(p, ∅, (∅, {x1, x2}, ∅))--> q22 and q22 --(γ0,⊥)--> q11
to the automaton.
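The shortcut-adding steps above follow the usual saturation scheme for predecessor sets. The following is a minimal sketch for a single pushdown system, without spawn-rules and without the M-automaton bookkeeping, so it is a simplification and not the algorithm of this thesis. The function names and the encoding of rules and transitions are assumptions made for illustration.

```python
def read_word(transitions, start, word):
    """States reachable from `start` by reading `word` in the automaton."""
    current = {start}
    for symbol in word:
        current = {t for (s, a, t) in transitions if s in current and a == symbol}
    return current


def pre_star(rules, transitions):
    """Saturate `transitions`, a set of triples (state, symbol, state).

    `rules` contains pairs ((p, gamma), (p2, word)) for a rule p gamma -> p2 word.
    Control-states double as automaton states. Whenever the right-hand side
    of a rule can be read from p2 to some state q, the shortcut p --gamma--> q
    is added, until no new transition appears.
    """
    saturated = set(transitions)
    changed = True
    while changed:
        changed = False
        for (p, gamma), (p2, word) in rules:
            for q in read_word(saturated, p2, word):
                if (p, gamma, q) not in saturated:
                    saturated.add((p, gamma, q))
                    changed = True
    return saturated


# Target: the single configuration p g5 g4, accepted in state "f".
target = {("p", "g5", "s"), ("s", "g4", "f")}
rules = [
    (("p", "g3"), ("p", ("g5", "g4"))),  # monitor entry modeled as a push
    (("p", "g1"), ("p", ("g3",))),
]
result = pre_star(rules, target)
print(("p", "g3", "f") in result, ("p", "g1", "f") in result)
```

As in Steps 1 to 4, each added transition is a shortcut for the left-hand side of a rule whose right-hand side is already recognized.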
At this point, the saturation is finished. We call the saturated automaton Asat.
Next, we have to check whether
pγ0 ∈ L(pi1(Asat ∩ Avalid× ∩ pi−12 (ACI))).
This is equivalent to
pi−11 (pγ0) ∩ Asat ∩ Avalid× ∩ pi−12 (ACI) ≠ ∅.
Intersection of Asat with pi−11 (pγ0) yields the following automaton:
q1 --(p, ∅, (∅, {x1, x2}, ∅))--> q22 --(γ0,⊥)--> q11
Intersection with Avalid× does not change this automaton. So it remains to inter-
sect the automaton with pi−12 (ACI). The automaton ACI (Definition 5.2) computes,
within its states, the hedge acquisition structure. It only accepts configurations
with consistent hedge acquisition structures.
However, in our setting, the stack-symbol γ0 of the initial configuration has no
locks. Moreover, the initial configuration has just one thread. Thus, intersection
with pi−12 (ACI) can be done by removing all edges that are labeled with cyclic
acquisition-graphs. The acquisition-graph of (p, ∅, (∅, {x1, x2}, ∅)) is empty, and
thus acyclic. Hence, intersection with pi−12 (ACI) also does not change the automa-
ton.
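Removing edges that are labeled with cyclic acquisition-graphs only needs a standard cycle test. A minimal sketch, assuming that an acquisition-graph is given as a set of directed edges (x, y) between locks; this encoding and the name `is_cyclic` are assumptions for illustration.

```python
def is_cyclic(edges):
    """Depth-first search for a directed cycle in a graph given as (src, dst) pairs."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
        successors.setdefault(dst, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in successors}

    def visit(node):
        color[node] = GREY
        for nxt in successors[node]:
            if color[nxt] == GREY:
                return True  # back edge closes a cycle
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in successors)


print(is_cyclic(set()), is_cyclic({("x1", "x2"), ("x2", "x1")}))
```

The empty graph, as for the control-symbol in our example, is reported as acyclic, so the corresponding edge is kept.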
The language of the automaton is obviously not empty, and thus a data-race
can be reached from the initial configuration. When adding some bookkeeping
functionality to the saturation algorithm, one can reconstruct the executions
associated with each configuration in the saturated automaton [111, 121]. This
could be used to construct an error trace.
7.4.1 Discussion
In this section, we have illustrated—with the help of a very simple example—how
our model-checking algorithm works. We performed some ad-hoc optimizations
that made the analysis simpler. We now comment on these optimizations and
their general applicability.
The first optimizations (i.e., omitting the release-set, the release-graph, the
initial-release flag, the intersection of the result with Conf ls, and the set of ini-
tially held locks) have been discussed in Section 7.2. They can always be applied
for non-iterated reachability queries from a start configuration that holds no locks,
make the involved data structures simpler, and avoid unnecessary blowup when
constructing the M-automaton for pi−11 (A)∩pi−12 (ACF). Moreover, if the start con-
figuration consists of just a single process, intersection of the saturated automaton
with pi−12 (ACI) reduces to removing edges labeled with cyclic acquisition-graphs.
Note that the example program is deadlock-free, as its global lock ordering re-
lation is empty. Moreover, the locks are inescapable and threads are not spawned
inside monitors. Thus, we could also have omitted the acquisition-graph (cf.
Subsection 7.3).
The automaton pi−12 (ACF) admits exponentially many control-states in the num-
ber of locks. A crucial optimization was to additionally intersect the automaton
pi−11 (A) ∩ pi−12 (ACF) with the set of those configurations where the set of locks
stored in the control-state matches the actual set of locks on the stack. This
optimization does not work if A, and thus pi−11 (A), contains all possible stacks, as
it is the case in our example. Thus, we used forward reachability information to
intersect A with the lock-insensitively reachable configurations. In our case, this
reduced the number of configurations accepted by A to a single configuration.
However, lock-sensitive successor sets of regular sets (and even of singleton sets)
are, in general, not regular [16], although they are still context-free [16]. Thus,
intersecting A with the set of reachable configurations cannot be done precisely,
but regular approximations of the precise context-free set may be used.
A simpler alternative would be to compute an over-approximation of the reach-
able locksets by a simple static analysis. When constructing ACF , only reachable
locksets have to be considered.
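One possible shape of such a static analysis is sketched below, under strong simplifying assumptions: the program is given as a hypothetical call graph, each procedure is optionally a monitor for one lock, and spawned threads start without locks. The parameters `calls`, `spawns`, and `monitor` and the function name are assumptions; the sketch ignores the locking semantics and therefore over-approximates.

```python
from collections import defaultdict, deque


def reachable_locksets(entry, calls, spawns, monitor):
    """Over-approximate, per procedure, the locksets it may execute with.

    calls[p]   -- procedures called by p (the caller's locks stay held)
    spawns[p]  -- procedures started by p in a new thread (no locks held)
    monitor[p] -- lock acquired on entry of p, or absent if p is no monitor
    """
    result = defaultdict(set)
    work = deque([(entry, frozenset())])
    while work:
        proc, held_before = work.popleft()
        lock = monitor.get(proc)
        held = held_before | {lock} if lock is not None else held_before
        if held in result[proc]:
            continue  # this lockset was already explored for proc
        result[proc].add(held)
        for callee in calls.get(proc, ()):
            work.append((callee, held))
        for child in spawns.get(proc, ()):
            work.append((child, frozenset()))
    return result


locksets = reachable_locksets(
    "main",
    calls={"run1": ["write1"], "run2": ["write2"]},
    spawns={"main": ["run1", "run2"]},
    monitor={"write1": "x1", "write2": "x2"},
)
print(dict(locksets))
```

The worklist terminates because there are only finitely many pairs of procedures and locksets. Only the locksets found this way would have to be considered when constructing ACF.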
The above optimization tackles a fundamental problem when implementing our
analysis: The predecessor set computation explores all states that are backward-
reachable from the target configurations. These typically include many states
that are not forward reachable. Computing the acquisition histories and locksets
for all those unreachable states may cause a significant blowup—in our case, we
had to explore stacks with exponentially many different (unreachable) locksets.
The optimizations try to restrict the states that are explored to forward reachable
states.
Another crucial optimization was the demand driven construction of the rules
of M×. There are exponentially many rules in M×, but, in our example, only
few of them are actually needed for the saturation. We expect that the set
of acquisition histories belonging to actual executions of the program is quite
small in typical programs. Thus, for typical programs, it should be profitable
to generate the rules of M× on demand, using symbolic techniques to compactly
represent M×.
The last optimization was to do the intersection of the saturated automaton
with the start configuration as early as possible. This reduced the number of
states in the automaton. This optimization is profitable for simple sets of start
configurations, where intersection is expected to reduce the number of edges and
states. Typical start configurations consist of just a single control-state and a
single stack-symbol, and thus are suited for this optimization.
7.5 Summary and Related Work
In this chapter, we have discussed optimizations of the lock-sensitive predecessor
set computation and presented an example of how our analysis finds a data-race.
In Section 7.2, we presented some simple optimizations for reachability queries
from the start configuration. In Section 7.3, we discussed the relation between
deadlocks and the acquisition and release-graph, with the result that acquisition
and release-graphs are not required for inescapable, deadlock-free DPNs without
spawn-operations inside monitors. A preliminary result of this kind has been de-
veloped in our group quite early [111]. There, inescapable, deadlock-free DPNs
where all spawn-operations are performed before the first monitor-operation are
regarded. Our criterion (i.e., no spawn-operations inside monitors) is a general-
ization of this criterion. In Section 7.4, we presented an example of our analysis
and indicated problems that have to be solved by an implementation of the analy-
sis. Despite the worst-case exponential blowup in the number of locks, acquisition
history based methods have been successfully implemented [39, 40, 57, 61, 63].
Our optimizations aim at reducing the number of unreachable executions for which
acquisition structures are computed. Weighted DPNs [120, 121] are DPNs where the
edges are annotated with weights. Wenner [120, 121] presents an extended satu-
ration algorithm for predecessor set computation that yields the automaton for
the predecessor set, and, additionally, a constraint system that associates edges
of the automaton with weights. Using this constraint system, one can compute
the least upper bound of the weights of all executions between two regular sets of
configurations. There is evidence that sets of acquisition structures can be used
as weights. For reachability analysis, the solution of the generated constraint
system needs only to be computed for the start configuration, thus avoiding the
expensive computation of acquisition structures for unreachable states. An elab-
oration and evaluation of this approach is left to future research.
Another approach, which is complementary to predecessor set computation, is
successor set computation via regular execution-trees [44, 74]. While the set of
execution-trees of a DPN is not regular in general, one can add more structure
to execution-trees such that this set becomes regular. The idea is to distin-
guish whether a pushed stack-symbol is popped during the execution or not.
In the former case, the push-rule is associated to a binary node in the regular
execution-tree, where the left successor describes the execution up to the match-
ing pop-rule, and the right successor describes the remaining execution after the
pop-rule. From the rules of a DPN, a tree automaton that accepts the regu-
lar execution-trees of the DPN can be efficiently constructed. Also the set of
regular execution-trees that have a lock-sensitive schedule is regular, as well as
the set of regular execution-trees that reach a configuration accepted by a given
automaton. Thus, lock-sensitive reachability analysis can be implemented by an
emptiness check of the intersection of three tree automata. There is evidence
that this emptiness check can be implemented in a way such that only acquisi-
tion structures for reachable executions are generated, thus avoiding the blowup
that we encountered for predecessor set computation. While we have already de-
scribed lock-sensitive reachability analysis via regular execution-trees [44] (even
for Join-Lock-DPNs), an implementation of this method and its extension to more
powerful analyses (e.g. bitvector-analysis and atomic-set serializability violation
detection) is subject of current research.
The problem of abstracting Java programs to DPNs is out of scope of this thesis.
From our point of view, the most interesting work in this direction is the random
isolation method [62], that can be used to abstract concurrent Java programs to
parallel pushdown systems with monitors. Applied on the ConTest benchmark
suite by Eytani et al. [38], this approach yields promising results [59, 61, 63]
in combination with acquisition history based methods. Leveraging the random




8 Non-Monitor Locking Disciplines
In the previous chapters, we described lock-sensitive predecessor set computation for
Monitor-DPNs. In this chapter, we briefly discuss other locking disciplines.
Restricting analyses to Monitor-DPNs can be justified by the fact that com-
mon high-level programming languages (e.g. Java [47]) use monitors rather than
arbitrary locking.1
However, we are also able to analyze DPNs that use locks not adhering to a
monitor-discipline. In this case, we have to restrict to well-nested, non-reentrant
locks: If locks are used in a non-well-nested fashion, reachability problems be-
come undecidable, even for non-reentrant locks [57]. For DPNs with well-nested,
reentrant locks, we do not know whether reachability properties are decidable.
We discuss this problem in Section 8.1. In Section 8.2, we briefly discuss how to
modify the methods of this thesis such that they apply to DPNs with well-nested,
non-reentrant locks. Moreover, we sketch a polynomial time algorithm to verify
that a given DPN uses locks only in a well-nested, non-reentrant fashion. Finally,
in Section 8.3, we briefly summarize the results of this chapter and discuss related
work.
8.1 Well-Nested, Reentrant Locks
We do not know whether reachability properties of DPNs with well-nested, reen-
trant locks are decidable. The cross-product construction (cf. Section 5.2) heavily
relies on the fact that the lockstack and the callstack of a Monitor-DPN are de-
pendent: The locks are bound to stack-symbols, and thus the lockstack can be
extracted from the callstack by projecting the stack-symbols to their associated
locks, as done by the ls-function. However, if locks may be acquired and released
independently from the callstack, we do not have this dependence any more: The
stack of the DPN M to be analyzed and the stack of the DPN-Acceptor DACons ,
which accepts schedulable lock-execution-hedges, are independent. Thus, the
cross-product construction would have to intersect two DPNs with independent
stacks. However, as DPNs are a more general model than pushdown systems,
and already the intersection of pushdown systems is not effective, intersection of
DPNs is not effective either.
1 Since Java version 5, the possibility for arbitrary locking has been introduced by the package
java.util.concurrent.locks. However, monitors are more tightly integrated within the
syntax of Java (synchronized-keyword), while locks are realized as library functions.
On the other hand, we cannot immediately derive an undecidability result from
this observation, as we are not forced to use the method of this thesis: There may
exist other methods for reachability analysis that do not have this problem. We
have not been able to obtain undecidability results for reachability properties of
DPNs with well-nested, reentrant locks, and leave this problem open for future
research.
8.2 Well-Nested, Non-Reentrant Locks
In the last section, we sketched that our method does not transfer to well-nested,
reentrant locks. However, it does transfer to well-nested, non-reentrant locks.
In this section, we briefly sketch how to adapt our lock-sensitive predecessor
set computation to DPNs with well-nested, non-reentrant locks (Lock-DPNs). As
this thesis is focused on reentrant monitors, we remain on an intuitive level here,
omitting formal definitions. For a complete description of an acquisition structure
based predecessor set computation for Lock-DPNs, we refer to our previous work
[79]. As an additional contribution over [79], we sketch a polynomial algorithm
that checks whether a DPN uses locks only in a well-nested, non-reentrant fashion.
Such an algorithm is important to check that the DPN to be analyzed is actually
well-nested and non-reentrant, as our predecessor set computation is unsound for
DPNs that violate the well-nestedness or non-reentrance assumption.
The interleaving semantics of Lock-DPNs can be defined as a labeled transition
system on configurations paired with lockstacks. We have to explicitly keep
track of lockstacks, while, for Monitor-DPNs, the lockstacks are encoded into
the callstacks. The tree-semantics, as well as lock-a/r-hedges and the definition
of ACons, remain the same. Also, the construction of the DPN-Acceptor DACons
(cf. Section 5.1.2) does not change. However, the heights of the stacks used by
DACons are bounded by the number of locks. Thus, the stacks can be encoded
into the control-states of DACons , such that the DPN of DACons effectively becomes
a tree automaton. We call the resulting structure a stackless DPN-Acceptor.
In [79], we describe a cross-product construction between a hedge automaton
and a DPN. It is straightforward to adapt this construction to cross-products of
Lock-DPNs and stackless DPN-Acceptors. Then, like for Monitor-DPNs, lock-
sensitive predecessor set computation is reduced to lock-insensitive predecessor
set computation on the cross-product DPN.
As there are more than |X |! different non-reentrant lockstacks for the set X
of locks, the hedge automaton for DACons has more than |X |! states. However,
if we assume that every execution of the DPN uses locks only in a well-nested,
non-reentrant fashion, the lockstacks in the configurations can be replaced by
locksets. Also for DACons (cf. Definition 5.2), we do not require the lockstacks to
eliminate reentrance any more, nor to exclude non-well-nested executions. The
summarization of usages, which is implemented by using flags on the lockstack,
can also be realized by storing the outermost lock of the usage in the control-
state. This way, the number of states of DACons can be reduced to 2^O(|X|^2), the
same bound as for monitors.
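The gap between lockstacks and locksets can be made concrete by counting. The sketch below assumes that a non-reentrant lockstack is a sequence of pairwise distinct locks of any length; the function names are illustrative.

```python
from math import perm


def count_lockstacks(num_locks):
    """Number of sequences of pairwise distinct locks, of any length 0..num_locks."""
    return sum(perm(num_locks, k) for k in range(num_locks + 1))


def count_locksets(num_locks):
    """Number of subsets of the set of locks."""
    return 2 ** num_locks


for n in range(1, 7):
    print(n, count_lockstacks(n), count_locksets(n))
```

Already the stacks that contain every lock contribute |X|! sequences, which is why replacing lockstacks by locksets pays off once well-nestedness and non-reentrance are assumed.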
In the remainder of this section, we present a polynomial time algorithm to verify
that all executions of a given DPN use locks only in a well-nested, non-reentrant
fashion. It can be used prior to the predecessor set computation, in order to refuse
to analyze a DPN that does not satisfy the well-nestedness or non-reentrance
assumption.
Given a DPN M with locks X , and a start configuration p0γ0 that holds no
locks, we check whether every execution-tree from the start configuration is non-
reentrant and well-nested. The idea is to do the check separately for each lock.
Given a lock x ∈ X , we construct an automaton Ax that accepts execution-
trees where the acquisition- and release-operations between an acquisition and
a release of x are balanced, ignoring the actual locks, and where x is only used
in a non-reentrant fashion. For example, Ax accepts 〈x〈y〉y〉x, but also 〈x〈y〉z〉x.
However, it does not accept 〈x〈y〉x, 〈x〈y〈x, nor 〈z〈y〉y〉x. The idea is to count the
number of locks that are above x on the lockstack. As locks are non-reentrant,
this number is bounded by the number of locks. For each lock x ∈ X , we define
the automaton Ax := (Q,F, δ) with
Q = {q_0^b | b ∈ B} ∪ {(q1, n) | 0 ≤ n < |X|}    F = {q_0^b | b ∈ B}
and the rules
τ →δ q_0^⊥                    (leaf)
2a q →δ q                     (base)
a(q_0^b) q →δ q               (spawn)
〈y q_0^b →δ q_0^b            (acq-y0)
〉y q_0^b →δ q_0^b            (rel-y0)
〈y (q1, n+1) →δ (q1, n)      (acq-y1)
〉y (q1, n) →δ (q1, n+1)      (rel-y1)
〈x (q1, 0) →δ q_0^⊤          (acq-x)
〉x (q_0^b) →δ (q1, 0)        (rel-x)
〈x q_0^⊥ →δ q_0^⊤            (acq-xf)
for any q ∈ Q, y ∈ X \ {x}, b ∈ B, and 0 ≤ n < |X| − 1.
Intuitively, the q_0^b-states indicate that we are outside an acquisition of x. The b-
flag indicates whether there is an acquisition of x in the local branch of the current
subtree. This is required to correctly accept an unmatched final acquisition of
x. Hence, the state q_0^⊥ can be interpreted as maybe outside x, while the state q_0^⊤
can be interpreted as definitely outside x. The (q1, n)-states indicate that we are
inside an acquisition of x, and there are n more locks on the lockstack above x.
Initially, the (leaf)-rule starts in state q_0^⊥ (i.e., we may be outside x or inside
a final acquisition of x). The (base)- and (spawn)-rules do not change the state.
The (spawn)-rule additionally ensures that the spawned thread is not currently
inside x, as this would violate well-nestedness. If outside (or maybe outside) an
acquisition of x, the (acq-y0)- and (rel-y0)-rules accept acquisitions and releases
of other locks than x, without changing the state. If inside an acquisition of x,
the (acq-y1)- and (rel-y1)-rules update the nesting counter. They can only be
applied if the counter would not overflow. Note that the counter can only overflow
on reentrant executions. The (acq-x)- and (rel-x)-rules accept an acquisition
or release of x. The (rel-x)-rule initializes the nesting counter to 0, and the
(acq-x)-rule requires the nesting counter to be 0 again, as otherwise the release
and acquisition of x would be mismatched. Finally, the (acq-xf)-rule accepts an
unmatched final acquisition, and sets the state to q_0^⊤, indicating that we are now
definitely outside an acquisition of x.
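The rules can be tried out on single-threaded executions. The following sketch runs Ax on a linear trace, i.e., on an execution-tree without spawn-nodes, so the (spawn)-rule is not covered. The trace encoding, the state names, and the function names are assumptions for illustration. As the automaton works bottom-up, the trace is read from its last action to its first.

```python
BOT, TOP = "q0_bot", "q0_top"  # stand-ins for the states q_0^⊥ and q_0^⊤


def accepts(trace, x, num_locks):
    """Run A_x on a list of ("acq", lock), ("rel", lock) or ("base", name) actions."""
    state = BOT  # (leaf)
    for kind, arg in reversed(trace):
        outside = state in (BOT, TOP)
        if kind == "base":
            continue  # (base)
        if arg != x:
            if outside:
                continue  # (acq-y0), (rel-y0)
            n = state[1]
            if kind == "acq" and n > 0:
                state = ("q1", n - 1)  # (acq-y1)
            elif kind == "rel" and n < num_locks - 1:
                state = ("q1", n + 1)  # (rel-y1)
            else:
                return False  # no applicable rule
        elif kind == "rel" and outside:
            state = ("q1", 0)  # (rel-x)
        elif kind == "acq" and state in (BOT, ("q1", 0)):
            state = TOP  # (acq-xf) or (acq-x)
        else:
            return False  # no applicable rule
    return state in (BOT, TOP)


def well_nested_non_reentrant(trace, locks):
    """Check the trace against A_x for every lock x."""
    return all(accepts(trace, x, len(locks)) for x in locks)


LOCKS = ["x", "y", "z"]
balanced = [("acq", "x"), ("acq", "y"), ("rel", "y"), ("rel", "x")]
mixed = [("acq", "x"), ("acq", "y"), ("rel", "z"), ("rel", "x")]
print(accepts(balanced, "x", 3), accepts(mixed, "x", 3))
print(well_nested_non_reentrant(balanced, LOCKS), well_nested_non_reentrant(mixed, LOCKS))
```

The second trace illustrates why the check is done for every lock: Ax ignores which locks are used between the acquisition and the release of x, but the automaton for z has no rule for the acquisition of y that it meets in state (q1, 0).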
We now show that non-reentrant and well-nested execution-trees can be char-
acterized by Ax as follows:
t is non-reentrant and well-nested, if and only if ∀x ∈ X . t ∈ L(Ax).
Proof. With the intuition described above, it is straightforward to show that, for
all x ∈ X , Ax accepts any well-nested and non-reentrant execution-tree. This
implies the =⇒-direction.
For the ⇐=-direction, we assume that t is reentrant or not well-nested, and
show that there is an x ∈ X such that t is not accepted by Ax.
If t is reentrant, it contains two acquisitions of a lock x in the same thread,
without a release of x in between. After accepting the second acquisition, Ax is
in state q_0^⊤. As there is no release of x, Ax still is in state q_0^⊤ when arriving at
the first acquisition. However, there is no rule to accept 〈x q_0^⊤, and thus t is not
accepted by Ax.
If t is not well-nested, it has an unmatched release of a lock x, or an acquisition
of a lock y that matches a release of a lock x ≠ y. In the former case, in Ax,
a subtree at a spawn-node or t itself is accepted with a (q1, n)-state. However,
there is no rule to accept an a(q1, n)-node, nor is (q1, n) a final state. Thus, t is
not accepted by Ax. In the latter case, Ax accepts the release-node of x in the
state (q1, 0). When arriving at the mismatched acquisition of y, the state is again
(q1, 0). As there is no rule to accept 〈y(q1, 0) in Ax, the tree t is not accepted.
Obviously, the size of Ax is polynomial in the size of M . Moreover, as Ax
is a bottom-up deterministic automaton, it can be complemented by making it
complete and complementing the set of final states, yielding the automaton Āx,





of M and Ax can be constructed in polynomial time (cf. [79] for details on the
cross-product construction between tree automata and DPNs). We have
∀t, c. p0γ0 t=⇒M c =⇒ t non-reentrant and well-nested
⇐⇒ ∀t, c. p0γ0 t=⇒M c =⇒ (∀x. t ∈ L(Ax))
⇐⇒ ∀x. ¬(∃t, c. p0γ0 t=⇒M c ∧ t ∈ L(Āx))
⇐⇒ ∀x. p0γ0 ∉ pi1(A^x_{CI×} ∩ pre∗_{M^x_×}(A^x_{CF×}))
where the first equivalence was shown above, the second equivalence is due to
basic rewriting, and the third equivalence is due to the correctness of the cross-
product construction. The right-hand side of this equivalence can be checked in
polynomial time, iterating over each lock x ∈ X and using the lock-insensitive
predecessor set computation from [16] (cf. Theorem 6.4) in each iteration. Thus,
it can be checked in polynomial time whether M uses locks only in a well-nested
and non-reentrant fashion.
8.3 Summary and Related Work
In this chapter, we indicated why our method does not transfer to well-nested,
reentrant locks. We left open the decidability for this class of models. How-
ever, we briefly sketched how to transfer our method to DPNs with well-nested,
non-reentrant locks (Lock-DPNs). Finally, we sketched a polynomial time algo-
rithm to check whether a DPN satisfies the well-nestedness and non-reentrance
assumption.
While reachability analysis for arbitrary locking is undecidable [57], it remains
decidable for parallel pushdown systems with bounded lock-chains [54]. The
bounded lock-chain criterion is a generalization of well-nestedness: Intuitively,
in an execution of a single thread, a lock-chain is a sequence of acquisitions and
releases of locks, such that the ith lock is released between the acquisitions of the
(i+1)st and (i+2)nd lock. Thus, well-nested lock-usage corresponds to lock-chains
of length at most one. Investigating whether those results transfer to DPNs with
locks is left to future research.
In [79], we describe predecessor set computation for DPNs with well-nested,
non-reentrant locks. The approach is largely the same as what we described
above. Like in this thesis, we define an interleaving semantics, a tree-semantics,
and a lock-sensitive scheduler. In contrast to this thesis, we use locksets instead
of lockstacks already in the definition of the semantics, assuming that locks are
used in a well-nested and non-reentrant fashion. We show that the set of schedu-
lable lock-execution-hedges is regular, and do a cross-product construction of the
DPN and the regular set of schedulable lock-execution-hedges. We do not use
the concept of lock-a/r-hedges, but characterize the schedulable lock-execution-
hedges directly. This results in more complicated definitions and proofs. (Cf.
Section 4.4 for a detailed comparison.)
9 Complexity
In Chapter 6, we described an algorithm for lock-sensitive predecessor set compu-
tation. It is based on acquisition structures and requires polynomial time in the
program size and exponential time in the number of locks. This algorithm can be
used to model-check EF-logic and various practical problems that can be encoded
into fragments of EF-logic. In Chapter 7, we analyzed an example and discussed
methods to avoid the worst-case run time for typical programs. Moreover, we
pointed to successful implementations of acquisition structure based methods for
parallel pushdown systems (PPDS) [39, 57, 61, 63] that indicate that the prob-
lem remains tractable in practice. In this chapter, we discuss the computational
complexity of analyzing programs communicating with locks. Our main result is
that model-checking negation-free EF-formulas with a fixed number of operators,
as well as many practically relevant properties, is NP-complete for Monitor-DPNs.
This gives strong evidence that the worst-case exponential runtime cannot be
avoided. The NP-hardness results also apply to practical analysis problems like
data-race detection and bitvector-analysis, which indicates that our method of us-
ing lock-sensitive predecessor set computations is adequate for those problems, as
it introduces no additional complexity. Moreover, we show that model-checking
problems become harder for more expressive properties or models. This indicates
that we have not missed extensions to more general models or properties that
could be made without increasing the complexity.
This chapter is organized as follows: In Section 9.1, we introduce the models
and properties that we will consider. In Section 9.2, we study the complexity of
checking models with monitors or well-nested, non-reentrant locks. In Section 9.3,
we consider models with stronger synchronization mechanisms like join and locks
or rendezvous-communication. Finally, in Section 9.4, we discuss our results and
give pointers to related work.
9.1 Models and Properties
In this section, we introduce the models and properties whose complexity will be
studied. The models are derived from DPNs, and the properties are fragments
of EF-logic (cf. Section 6.4.2).
A DPN has two dimensions of infinity: Unboundedly many threads may be
created, and the stacks may be unboundedly deep. We obtain our models by
restricting none, one, or both of these dimensions. A DPN without thread cre-
ation is a parallel pushdown system (PPDS). In a PPDS, each thread is described
by a pushdown system, and the number of threads does not change during the
run of a PPDS, i.e., all threads are already present in the start configuration.
If the number of threads is fixed (i.e., there are n threads for a constant n), we
get an nPDS. A model that consists of concurrent finite-state machines, where a
transition rule may spawn a new thread as a side effect, is called a dynamic finite-state
network (DFN). Similarly, in a parallel finite-state machine (PFSM), we have
multiple finite-state machines that are concurrently executed, and in an nFSM,
the number of threads is fixed to n.
For DPNs and DFNs, we assume that the start configuration consists of one
thread that has just one symbol on its stack (for DPNs) and holds no locks. For
PPDSs and PFSMs, we assume that the start configuration consists of a list of
threads, each thread having just one symbol on its stack (for PPDS), and holding
no locks.
For models with monitors but without a stack, we require that the lockstack is
encoded into the control-states, i.e., we have a function locks from control-states
to lockstacks that is compatible with the transition relation.
The properties that we consider are fragments of EF-logic. First, we distin-
guish between double-indexed and regular atomic propositions. A double-indexed
atomic proposition has the form p1‖p2, where p1 and p2 are control-states. It holds
for configurations where one thread is at p1 and another thread is at p2. Double-
indexed atomic propositions are suited to characterize data-races or atomicity
violations. A regular atomic proposition is an automaton that describes a set of
configurations. Note that regular atomic propositions are strictly more general
than double-indexed atomic propositions.
Second, we distinguish whether path-operators are nested. We regard pairwise
reachability (EF(p1‖p2)), i.e., whether two control-states p1 and p2 can simulta-
neously be reached from the start configuration; regular set reachability (EF(A));
iterated pairwise reachability (EF(p1‖p2 ∧ EF(p3‖p4))); and iterated regular set
reachability (EF(A1 ∧ EF(A2))). Moreover, we regard EF-formulas with a fixed
number of operators and EF-formulas with an unbounded number of operators,
as well as EF-formulas with (EF) and without (EF\neg) negation. Note that, for
double-indexed EF-formulas, we use the term fixed-size to indicate a formula with
a fixed number of operators. We avoid this term for EF-formulas with regular
atomic propositions, to clarify that we only fix the number of operators, not the
size of the automata for the atomic propositions.
Figure 9.1 illustrates the considered models and properties. An arrow points
from a more general to a more specific model or property, e.g., DPNs are more
general than DFNs and PPDS, and EF-logic is more general than negation-free
EF-logic.
Figure 9.1: Considered models and properties.
Table 9.1: Complexity results for models with reentrant monitors or well-nested,
non-reentrant locks. (~ marks an entry with a wavy underline.)

                          DPN     PPDS    2PDS    DFN     PFSM    nFSM
EF(p1‖p2)                 NP∗?    NP†?    NP†?    NP∗!    P       P
EF(A)                     NP      NP      NP†?    NP      NP      P
EF(p1‖p2 ∧ EF(p3‖p4))     NP      NP      NP      ~NP∗!   P       P
EF(A1 ∧ EF(A2))           NP      NP      NP      NP      NP      P
EF\neg (fixed #ops)       ~NP     NP      NP      NP      NP      P
EF (fixed #ops)           ≥ PSPACE‡               ≥ NP            P

9.2 Monitors and Well-Nested, Non-Reentrant
Locks
In this section, we consider the complexity of checking models with reentrant
monitors or well-nested, non-reentrant locks. Table 9.1 shows the complexity
results that we will establish in this section. The rows of the table are indexed
with the properties, and the columns are indexed with the models. Interestingly,
the hardness results already hold for models with non-reentrant monitors. This
locking discipline is more restrictive than the reentrant monitor and well-nested,
non-reentrant locking disciplines that are typically considered.
The entries in the table are annotated with various additional information that
we now explain: A ∗-annotation indicates that the NP-hardness result requires a
model that spawns threads inside monitors. Note that this might not be realistic,
as monitors are typically used to protect access to shared resources, and the code
inside monitors is kept small. Thus, one would usually not spawn threads inside
monitors. An ∗!-annotation means that we have a polynomial algorithm if we
do not spawn threads inside monitors. An ∗?-annotation means that we do not
know the exact complexity of the analysis problem, if we must not spawn threads
inside monitors, and the model is additionally deadlock-free and uses inescapable
locks.
A †-annotation indicates that the NP-hardness result requires a model that
has deadlocks or escapable locks. Allowing escapable locks may not be realistic,
as real programming languages are deterministic, and thus locks are inescapable.
However, escapable locks are required to model timeouts or tryLock-operations,
which are often used to avoid deadlocks and are also available for Java [47].
Assuming that the program to be analyzed may have deadlocks is realistic, as
deadlocks are a common programming error. However, if a prior analysis step has
verified the absence of deadlocks, we may assume that the program is deadlock-
free. Unfortunately, we do not know the exact complexity of some problems in the
deadlock-free case with inescapable locks, as we indicate with the †?-annotation.
A ‡-annotation means that the hardness result does not require locks at all.
Those hardness results are rather strong, as they do not depend on communica-
tion between threads.
The reg-annotation in the row for negation-free EF-logic means that we require
regular atomic propositions to establish the hardness result, while the hardness
result for EF-logic with negation already holds for single-indexed atomic propo-
sitions. We leave open the exact complexity for double-indexed EF-logic, as
indicated by the reg?-annotation. A lower bound is, of course, NP-hardness.
An underlined entry means that the hardness has to be proved for this entry.
The hardness for entries without an underline follows from the hierarchy of the
models and properties (cf. Figure 9.1).




A wavy underline indicates that the easiness has to be proved
for this entry. The easiness for entries without a wavy underline follows from the
hierarchy of the models and properties (cf. Figure 9.1).
Finally, for entries that are marked with ≥, we only show the hardness direc-
tion.
The remainder of this section is organized as follows: In Subsection 9.2.1, we
establish the lower complexity bounds, and in Subsection 9.2.2, we establish the
upper complexity bounds.
9.2.1 Lower Complexity Bounds
In this subsection, we establish the lower complexity bounds for the problems in
Table 9.1. We only show the bounds for underlined entries. The other results
follow from the hierarchy of the models and properties (cf. Figure 9.1).
The NP-hardness results are obtained by reduction from the 3SAT-problem.
For the PSPACE-hardness results, we perform a reduction from the QBF-problem
or refer to existing results. Note that all hardness results established in this
subsection already hold for models with non-reentrant monitors.
For the next paragraphs, assume that we have an arbitrary 3SAT-instance
(V,C) over n variables and m clauses.
NP-Hardness of EF(p1‖p2 ∧ EF(p3‖p4)) for 2PDS We reduce 3SAT to this
problem. Regard the following program:
p1: sync (v1){call p2} OR sync (¬v1){call p2};
p2: sync (v2){call p3} OR sync (¬v2){call p3}; return
...
pn: sync (vn){call pn+1} OR sync (¬vn){call pn+1}; return
pn+1: a: skip; return
q:
sync (l11){} OR sync (l12){} OR sync (l13){}
...
sync (lm1){} OR sync (lm2){} OR sync (lm3){}
b: skip
Note that the program is given as pseudocode. The symbols p1, . . . , pn+1 and q
denote procedures here, and should not be confused with control-states. We argue
on the basis of pseudocode here; a translation to 2PDS is straightforward. The
execution starts with two threads at p1 and q. The program uses monitors vi and
¬vi for 1 ≤ i ≤ n. Intuitively, the monitors encode a valuation of the variables.
Entering the monitor vi encodes the valuation vi = ⊥, and entering the monitor
¬vi encodes the valuation vi = ⊤. The first thread, which starts at p1, can reach
its label a with monitors encoding exactly all possible valuations of the variables.
Assuming that the first thread is at label a, the second thread, which starts at q,
can reach the label b if and only if the valuation chosen by the first thread satisfies
all clauses. Hence, to complete the reduction, we have to check whether the first
thread, called the chooser thread, can reach a, and afterwards, the second thread,
called the checker thread, can reach b while the chooser thread remains at a. This
can be expressed by an iterated pairwise reachability property, and we have that
p1γ1qγ2 |= EF(a‖q ∧ EF(a‖b)),
if and only if there is a valuation that satisfies all clauses. Here, γ1 and γ2 are
the bottommost stack-symbols of the threads, and we assume that the labels are
encoded into the control-state. Moreover, the program is obviously deadlock-free,
even if the locks are translated to inescapable locks.
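The reduction can be sanity-checked on small instances. The following Python sketch is only an illustration and not part of the construction: it abstracts the chooser thread by the set of monitors it holds at label a, and the checker thread by a pass over the clauses. The function names and the encoding of literals as signed integers are assumptions made for this example.

```python
from itertools import product

# A literal is a nonzero int: +i stands for v_i, -i for ¬v_i (an encoding
# assumed for this sketch). A clause is a tuple of three literals.

def chooser_locksets(n):
    """All sets of monitors the chooser thread can hold at label a.

    For each variable i the chooser enters either monitor v_i (+i) or
    monitor ¬v_i (-i), so there are 2^n possible locksets.
    """
    for signs in product((1, -1), repeat=n):
        yield frozenset(s * i for i, s in zip(range(1, n + 1), signs))

def checker_reaches_b(held, clauses):
    """The checker passes a clause iff one of its literal monitors is free."""
    return all(any(lit not in held for lit in clause) for clause in clauses)

def iterated_reachability(n, clauses):
    """Abstract check of EF(a‖q ∧ EF(a‖b)) for the program above."""
    return any(checker_reaches_b(held, clauses) for held in chooser_locksets(n))

def satisfiable(n, clauses):
    """Brute-force 3SAT, used only to cross-check the reduction."""
    for bits in product((False, True), repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

Holding monitor v_i corresponds to vi = ⊥, so a literal monitor is free exactly when the literal is true under the chosen valuation. On every instance, `iterated_reachability` and `satisfiable` should therefore agree.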
NP-Hardness of EF(A) for PFSM Again, we reduce 3SAT to this problem.
The idea is similar to the reduction above. However, we use one copy of the
monitor per clause, and use separate threads for each monitor. The regular set
to be reached ensures that the copies of the monitors are consistently entered.
We use the following program, with chooser threads tji and checker threads cj for
1 ≤ i ≤ n and 1 ≤ j ≤ m.
t_i^j : sync (v^j …
The regular set to be reached is defined by the regular expression
L(A) := ((a_1^1 … a_1^m) + (b_1^1 … b_1^m)) … ((a_n^1 … a_n^m) + (b_n^1 …
Obviously, A can be defined in size O(nm). We then have
t_1^1 … t_1^m … t_n^1 … t_n^m c_1 … c_m |= EF(A),
if and only if the 3SAT-formula is satisfiable. Moreover, the program is deadlock-
free, as it does not use nested monitors.
At this point, we have shown all problems except pairwise reachability to be NP-hard.
Note that regular set reachability and pairwise reachability are equivalent for 2PDS,
and that pairwise reachability problems can be reduced from PPDS to 2PDS, as
only two threads need to be considered (cf. [57]).
The difficulty in applying our reduction to pairwise reachability properties is to
ensure the correct sequence of checker and chooser thread. For iterated reachability,
ensuring the correct sequence was straightforward. For regular set reachability,
we could exploit the regular set to be reached to enforce consistency between
many threads. For pairwise reachability, however, we see no other option than
using thread-creation inside monitors or deadlocks.
NP-Hardness of EF(p1‖p2) for 2PDS Again, we reduce 3SAT to this problem.
We consider a chooser and a checker thread, like in the reduction for iterated
pairwise reachability. However, the whole checker thread is synchronized on an
additional lock x, and the chooser thread synchronizes on x before it reaches
label a. This way, the checker thread cannot start too early, as it would prevent
the chooser thread from reaching a. However, the program can deadlock: If
the chooser thread chooses an unsatisfiable configuration, and the checker thread
starts before the chooser thread has synchronized on x, both threads will get
stuck. Regard the following program:
p1: sync (v1){call p2} OR sync (¬v1){call p2};
p2: sync (v2){call p3} OR sync (¬v2){call p3}; return
...
pn: sync (vn){call pn+1} OR sync (¬vn){call pn+1}; return
pn+1: sync (x){}; a: skip; return
q: sync (x){
sync (l11){} OR sync (l12){} OR sync (l13){}
...
sync (lm1){} OR sync (lm2){} OR sync (lm3){}
b: }
With the arguments sketched above, we have that p1γ1qγ2 |= EF(a‖b), if and only
if the 3SAT-formula is satisfiable.
NP-Hardness of EF(p1‖p2) for DFN We reduce 3SAT to this problem. Here,
we use thread creation inside monitors to ensure the proper sequence of checker
and chooser threads. We regard the following program:
t0: sync (x){spawn t1; a: skip}
t1: sync (v1){spawn t2; sync (x){}} OR sync (¬v1){spawn t2; sync (x){}}
...
tn: sync (vn){spawn tn+1; sync (x){}} OR sync (¬vn){spawn tn+1; sync (x){}}
tn+1:
sync (l11){} OR sync (l12){} OR sync (l13){}
...
sync (lm1){} OR sync (lm2){} OR sync (lm3){}
b: skip
The chooser threads t1 . . . tn are spawned in sequence: After entering its monitor,
a chooser thread spawns the next chooser thread, and the chooser thread tn for
the last variable spawns the checker thread. Thus, the checker thread runs after
the chooser threads have made their choices. The additional lock x ensures that
the chooser threads do not leave their monitors too early. Note that the program
has no deadlocks: At any time, t0 can leave its monitor, and then, the chooser
threads can use x and leave their monitors. With the arguments sketched above,
we have t0 |= EF(a‖b), if and only if the 3SAT-formula is satisfiable.
PSPACE-Hardness for EF-Logic For full EF-logic, there are various known
hardness results for models without locks. Bouajjani et al. [13] show that model-checking
a fixed-size, single-indexed EF-formula is already PSPACE-hard for PDS,
a model without concurrency. This implies the PSPACE-hardness results
in the EF-row for a fixed number of operators.
A well-known result is that deciding emptiness of the intersection of a given
set of automata (INT) is PSPACE-complete [66]. INT can be reduced to model-
checking an EF-formula of the form EF(L(A1)∧ . . .∧ L(An)), if the model is able
to produce arbitrary length configurations. This is the case for any model that is
at least as general as DFN or 2PDS. This implies the PSPACE-hardness results
in the EF\neg-row.
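To give some intuition for why INT is expensive, the following Python sketch decides non-emptiness of the intersection by searching the product automaton. It is an illustration under simplifying assumptions (deterministic automata given as dictionaries, a hypothetical function name), not the reduction itself: the product may have as many states as the product of the individual state counts, which is exponential in the number of automata.

```python
from collections import deque

def intersection_nonempty(dfas, alphabet):
    """Search the product of the given DFAs for a commonly accepted word.

    Each DFA is a triple (start, accepting, delta), where delta maps
    (state, symbol) to the successor state; a missing entry means the
    automaton rejects. The product has up to prod(|Q_i|) states.
    """
    start = tuple(d[0] for d in dfas)
    seen, queue = {start}, deque([start])
    while queue:
        states = queue.popleft()
        # Accept iff every component is in an accepting state.
        if all(q in d[1] for q, d in zip(states, dfas)):
            return True
        for sym in alphabet:
            nxt = []
            for q, d in zip(states, dfas):
                if (q, sym) not in d[2]:
                    break  # some component has no move: dead product state
                nxt.append(d[2][(q, sym)])
            else:
                nxt = tuple(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False
```

The function returns True iff some word is accepted by all automata, i.e., iff the intersection is non-empty.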
Note that this reduction does not work for double-indexed negation-free EF-
formulas, and we have to leave open the exact complexity for this case. At
least, we obtain NP-hardness for PFSMs without locks: We reduce 3SAT to this
problem. Given a 3SAT-instance (V,C), we regard a PFSM with one chooser
thread per variable and the rules pi → vi and pi → ¬vi for 1 ≤ i ≤ n. The
choice of the variables is indicated by the states of the chooser threads. As we
work with unbounded size EF-formulas, we can encode the clauses directly into
the formula, and have






if and only if the formula is satisfiable. (Note that this formula is even single-
indexed.)
Esparza [34] shows that model-checking EF-logic is PSPACE-hard for basic
parallel processes. We adopt the idea of his reduction to show PSPACE-hardness
of model-checking PFSMs without locks. We reduce QBF to this problem. Given
a QBF-instance (V,C) over n variables and m clauses, we use the same PFSM as
above, i.e., we have one thread per variable and the rules pi → vi and pi → ¬vi.
Then, we construct an EF-formula that checks all possibilities of the variable
valuations, and tests the clauses for each possibility. Existential quantification is
modeled by the EF-operator, and universal quantification is modeled by the AG-
operator. Note that we have AGϕ = ¬EF¬ϕ. We regard the following formula:
ϕ := EF((v1 ∨ ¬v1) ∧ p2 ∧ … ∧ pn
        ∧ AG((v2 ∨ ¬v2) ∧ p3 ∧ … ∧ pn
            ⟹ …
                … l_ij) …))
We have p1 . . . pn |= ϕ, if and only if the QBF is true. This implies the hardness
results in the EF-row.
9.2.1.1 Filtering Consistent Configurations
The reductions presented in the previous paragraphs required reachability queries
only from consistent start configurations. Thus, our analysis algorithm does
not require intersection of the result with the automaton AConf ls (cf. Chapter 7).
However, if we are interested in the set of consistent predecessor configurations or
consistent immediate predecessor configurations of some automaton that may also
contain inconsistent configurations, we have to intersect the result set with the
automaton AConf ls that accepts the consistent configurations. In this subsection,
we show that this makes the problem hard. More precisely, we show that deciding
whether an automaton A with L(A) ⊆ valid accepts a consistent configuration is
NP-hard, and thus also deciding whether pre_{ls,M}(L(A) ∩ Conf_ls) ≠ ∅ is NP-hard,
although we have a polynomial algorithm to compute an automaton pre′_{ls,M}(A)
with L(pre′_{ls,M}(A)) ∩ Conf_ls = pre_{ls,M}(L(A) ∩ Conf_ls) (cf. Theorem 7.2).
Lemma 9.1. Given a Monitor-DPN M = (P,Γ,Γ⊥,Act,X ,∆, locks), and an
automaton A over the alphabet P ∪Γ such that L(A) ⊆ valid. Then, it is NP-hard
to decide whether
L(A) ∩ Conf_ls ≠ ∅.
The problem remains NP-hard, even if we assume that all configurations accepted
by A are non-reentrant.
Proof. The proof is, again, based on a reduction from 3SAT. Given a 3SAT-
instance (V,C) over n variables and m clauses, we construct a Monitor-DPN M
with a single control-state P = {p}, locks X = V ∪ {¬v | v ∈ V }, stack alphabet
X ⊎ {⊥}, and locks : ⊥ ↦ ∅, x ↦ {x}. Note that the other components of the
Monitor-DPN are not relevant for our construction, as we just need the DPN
to define the set Conf_ls of consistent configurations. We construct the following
automaton A:














Recall that, in a consistent configuration, each lock is held by at most one process.
The automaton A now produces configurations that consist of two processes. The
first process holds, for each variable vi, either the lock vi or ¬vi. The second
process holds, for each clause, the lock associated with one of the literals of
that clause. If A accepts a consistent configuration, the locks held by the first
and second process are disjoint, i.e., when accepting the first process, A chooses
a valuation such that, when accepting the second process, each clause can be
satisfied. Vice versa, for any satisfying valuation, a corresponding consistent
configuration is accepted by A. Obviously we have L(A) ⊆ valid, which completes
the proof.
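The argument can be replayed on small instances with the following Python sketch. It is an illustration only: instead of building the automaton A, it enumerates the two-process configurations that the proof describes and tests them for consistency. Literals are encoded as signed integers (+i for vi, -i for ¬vi), and all function names are assumptions made for this example.

```python
from itertools import product

def consistent(config):
    """A configuration is consistent iff no lock is held by two processes."""
    seen = set()
    for lockstack in config:
        held = set(lockstack)  # reentrant entries of one process collapse
        if seen & held:
            return False
        seen |= held
    return True

def accepted_configurations(n, clauses):
    """Enumerate the two-process configurations described in the proof.

    The first process holds v_i or ¬v_i for each variable; the second
    process holds one literal lock per clause.
    """
    for signs in product((1, -1), repeat=n):
        first = tuple(s * i for i, s in zip(range(1, n + 1), signs))
        for second in product(*clauses):
            yield (first, second)

def accepts_consistent(n, clauses):
    """True iff some enumerated configuration is consistent."""
    return any(consistent(c) for c in accepted_configurations(n, clauses))
```

The enumeration is exponential, of course; the point of the lemma is that the automaton A represents all these configurations in polynomial size, so that the hardness lies in the intersection with Conf_ls.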
The reduction can easily be modified such that all configurations accepted by
A are non-reentrant, i.e., no lock occurs twice on the same lockstack. For this
purpose, we work with 2m different locks v_i^1, …, v_i^m and ¬v_i^1, …, ¬v_i^m for each
variable vi. When choosing the valuation, we ensure that either all v_i^j or all ¬v_i^j
locks are on the lockstack. When checking the clauses, a literal l_jk is checked by
the lock (l_jk)^j. Thus, the lockstacks of both threads are non-reentrant.
In Section 7.2.2, we sketched a polynomial time algorithm for computing im-
mediate predecessor sets. However, the result could still contain inconsistent
configurations. If inconsistent configurations need to be filtered out from the
result, but the input automaton may contain inconsistent configurations, the
problem is NP-hard.
Theorem 9.2. Given as input a Monitor-DPN or Lock-DPN M and an automa-
ton A with L(A) ⊆ valid. Then, it is NP-hard to decide whether:
pre_{ls,M}(L(A) ∩ Conf_ls) ≠ ∅.
Proof. We reduce the problem of checking whether a given automaton A with
L(A) ⊆ valid accepts a consistent configuration (cf. Lemma 9.1) to our problem.
We set the rules of M such that there is a rule of the form pγ ↪^a pγ for each p ∈ P,
γ ∈ Γ, and some a ∈ Act. Thus, steps of the DPN do not change the configuration
and are always possible. Hence, A accepts a consistent configuration, if and only
if pre_{ls,M}(L(A) ∩ Conf_ls) ≠ ∅.
9.2.2 Upper Complexity Bounds
In the last subsection, we have established the lower complexity bounds of Ta-
ble 9.1. In this subsection, we establish the upper complexity bounds. We have
to provide algorithms for the entries marked with a wavy underline. The other
results then follow from the hierarchy of the models and properties illustrated in
Figure 9.1.
First of all, note that model-checking CTL for sequential finite-state processes
(1FSM) can be done in polynomial time [27]. Moreover, model-checking an nFSM
with m states can be reduced to model-checking its asynchronous product, which
is a 1FSM with m^n states. As n is fixed, we get a polynomial time algorithm.
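The product construction can be sketched as follows. This Python fragment is a simplified illustration: it ignores locks entirely, represents each machine by a successor map, and checks only pairwise reachability rather than full CTL. The function names and the data layout are assumptions made for this example.

```python
from collections import deque

def reachable_product_states(machines, starts):
    """Explore the asynchronous product of n finite-state machines.

    machines[i] maps a local state to the list of its local successors.
    A product state is a tuple of local states, and one step moves exactly
    one component. With m local states per machine there are at most m**n
    product states.
    """
    start = tuple(starts)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for i, trans in enumerate(machines):
            for succ in trans.get(state[i], ()):
                nxt = state[:i] + (succ,) + state[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

def pairwise_reachable(machines, starts, p1, p2):
    """EF(p1‖p2): some reachable state has one thread at p1, another at p2."""
    for state in reachable_product_states(machines, starts):
        for i, a in enumerate(state):
            for j, b in enumerate(state):
                if i != j and a == p1 and b == p2:
                    return True
    return False
```

For fixed n, the number of explored states is polynomial in m, which is the reason for the P entries in the nFSM column.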
The remainder of this subsection is organized as follows: In Subsection 9.2.2.1,
we give an alternative characterization of the complexity class NP that is better
suited for our purposes. In Subsection 9.2.2.2, we show that model-checking
negation-free EF-formulas with a fixed number of operators and regular atomic
propositions is in NP for Monitor-DPNs. In Subsection 9.2.2.3, we briefly sketch
that this result also transfers to Lock-DPNs. In Subsection 9.2.2.4, we show that
model-checking fixed-size, negation-free, double-indexed EF-formulas is in P for
Monitor-DFNs.
9.2.2.1 NP via Verification of Certificates
We characterize NP as those problems where membership can be verified by a
polynomial size certificate in polynomial time (cf. Papadimitriou [96], Chapter 9):
Theorem 9.3. A problem L is in NP, if and only if there is a polynomially
decidable and polynomially balanced relation R, such that
L := {x | ∃y. (x, y) ∈ R},
where a relation R is polynomially balanced, iff
(x, y) ∈ R ⟹ |y| ≤ |x|^k for some fixed k ≥ 1.
Intuitively, for each element x of the problem L, we find a certificate y that
has polynomial size in x, and can be verified in polynomial time.
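As a familiar instance of this characterization, consider 3SAT, which is also the source problem of the reductions in this chapter. In the following Python sketch, the instance x is a pair of the number of variables and a clause list, and the certificate y is a valuation with one truth value per variable. The encoding and the function names are assumptions made for this illustration.

```python
from itertools import product

def verify(instance, certificate):
    """Decide (x, y) ∈ R for 3SAT in time linear in the instance size.

    instance    = (n, clauses), clauses being tuples of signed ints
                  (+i for v_i, -i for ¬v_i)
    certificate = tuple of n booleans, one per variable
    """
    n, clauses = instance
    if len(certificate) != n:  # reject certificates of the wrong size
        return False
    return all(
        any(certificate[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def in_L(instance):
    """x ∈ L iff some certificate verifies; exhaustive search for testing."""
    n, _ = instance
    return any(verify(instance, y) for y in product((False, True), repeat=n))
```

Only `verify` has to run in polynomial time; `in_L` merely spells out the existential quantifier and takes exponential time.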
In the remainder of this subsection, we show how to characterize lock-sensitive
reachability problems for Monitor-DPNs in this way, thus proving that they are
in NP.
9.2.2.2 Model-Checking negation-free EF-Formulas with a Fixed Number
of Operators
We show that the problem is in NP for Monitor-DPNs. In order to obtain an NP-
algorithm, we proceed in two steps. In the first step, we develop a polynomial
verifier for lock-sensitive predecessor sets. Given a Monitor-DPN M and an
automaton A, the verifier takes some certificate C and computes an automaton
A′ that has polynomial size in M, A, and C, and whose language is a subset of
pre*_{ls,M}(L(A)). Moreover, we show that, for each configuration c ∈ pre*_{ls,M}(L(A)),
there is a certificate of size poly(|X |) such that c ∈ L(A′).
In the second step, this verifier is used to develop an NP-algorithm for model-
checking negation-free EF-formulas with a fixed number of operators.
Recall that we use the DPN-Acceptor DACons to characterize the set of schedu-
lable lock-execution-hedges. We then construct the cross-product of the Monitor-
DPN to be analyzed and DACons , and compute lock-insensitive predecessor sets on
the cross-product. The result has to be intersected with the set of consistent con-
figurations, described by the automaton AConf ls . The cross-product construction
and lock-insensitive predecessor set computation can be done in polynomial time.
However, the size of DACons and AConf ls is exponential in the number of locks.
The idea of the certificate is to regard the rules of DACons that are actually
required to accept a single lock-execution-hedge that witnesses membership of a
single configuration in the predecessor set. After some modifications to ACons and
DACons , we show that we only need polynomially many rules to accept a single
lock-execution-hedge. Similarly, we only need polynomially many rules of AConf ls to
accept a single configuration. Thus, the certificate consists of the rules required to
accept the lock-execution-hedge and the configuration, and the verifier performs
the cross-product construction, the lock-insensitive predecessor set algorithm,
and the required automata-operations. As all those operations are polynomial
time, the verifier runs in polynomial time.
We construct the verifier in three steps. First, we modify the automaton ACons
that accepts schedulable lock-a/r-hedges, such that every hedge that is accepted
requires only polynomially many rules. Second, we modify the translation to
the DPN-Acceptor DACons , such that any lock-execution-hedge that is accepted
requires only polynomially many rules. Third, we show that only polynomially
many rules of AConf ls are required to accept a single configuration.
Subsumption Order on Acquisition Structures For the first step, we require
the subsumption ordering on acquisition structures.
Definition 9.4 (Subsumption Order). We define an ordering ≤ ⊆ AS × AS on
acquisition structures by pointwise lifting of set-inclusion. For two acquisition
structures s = (r, g_r, u, a, g_a) and s′ = (r′, g′_r, u′, a′, g′_a), we define
s ≤ s′ :⟺ r ⊇ r′ ∧ g_r ⊆ g′_r ∧ u ⊆ u′ ∧ a ⊆ a′ ∧ g_a ⊆ g′_a.
We also overload ≤ to hedge acquisition structures, and define ≤ ⊆ ASh × ASh by:
(s, X) ≤ (s′, X′) :⟺ s ≤ s′ ∧ X ⊆ X′.
The ordering ≤ on (hedge) acquisition structures is called subsumption ordering.
Note that we defined the inclusion for release-sets the other way round. In-
tuitively, this is because a thread that releases more locks is more likely to be
executable than a thread that releases fewer locks.
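The definition translates directly into code. The following Python sketch represents an acquisition structure as a record of five finite sets and implements the subsumption test; the class and function names are assumptions made for this illustration, and the graphs are represented simply as sets of edges.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AcqStruct:
    r: frozenset   # initially released locks
    gr: frozenset  # release-graph, as a set of edges
    u: frozenset   # used locks
    a: frozenset   # finally acquired locks
    ga: frozenset  # acquisition-graph, as a set of edges

def subsumed(s, t):
    """s ≤ t: pointwise inclusion, reversed for the release-set r."""
    return (s.r >= t.r and s.gr <= t.gr and s.u <= t.u
            and s.a <= t.a and s.ga <= t.ga)

def hedge_subsumed(sx, tx):
    """(s, X) ≤ (t, Y) for hedge acquisition structures."""
    (s, X), (t, Y) = sx, tx
    return subsumed(s, t) and X <= Y
```

Each comparison is a subset test on sets of locks or edges, so the order can be decided in time polynomial in the number of locks.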
It is straightforward to show that consistency of acquisition histories is mono-
tonic w.r.t. the subsumption ordering:
Lemma 9.5.
s ∈ Cons ∧ s′ ≤ s ⟹ s′ ∈ Cons
Proof. Obvious by definition of Cons.
Intuitively, if we have s′ ≤ s, the acquisition structure s′ is subsumed by s, as
s′ is consistent if s is. In order to verify that an execution-hedge is schedulable,
it is sufficient to verify that its acquisition history is subsumed by a consistent
acquisition history.
Bounding the Tree Automaton on Lock-A/R-Hedges Regard a schedulable
lock-a/r-hedge accepted by ACons. It has no two final acquisitions of the same lock
and the set of used locks minus the set of initially released locks is disjoint from
the set of initially held locks. As the hedge is consistent and well-formed, it
contains no two initial releases of the same lock.
Now, regard an accepting run of ACons for this hedge. In its states, ACons
computes the acquisition history for the hedge in a bottom-up fashion. The
acquisition-graph and the sets of initially released and finally acquired locks
(ga-, r-, and a-components of the acquisition structure) are only changed at
acquisition- and release-nodes. Thus, during the run of ACons, there are only
O(|X |) different ga-, r- and a-components.
During processing a tree, the release-graph (gr-component) only increases, and
when joining an acquisition history with a hedge acquisition history at #h-nodes,
only disjoint release-graphs are joined, as no lock is initially released more than
once. As there are |X|^2 possible edges in a release-graph, there are only O(|X|^2)
different release-graphs in the states used during the run of ACons.
While processing the #h-nodes of the hedge, the set of initially held locks is
collected in the X-component of the hedge acquisition structure. As this set only
increases, there are only O(|X |) different X-components during the run. As the
sets of initially held locks of every thread are disjoint, the elements of the hedge
have only O(|X |) different sets of initially held locks. Thus, in order to process
all those sets, we only need to instantiate the #h-rule of ACons for O(|X |) different
X-sets.
However, there may be 2^|X| different u-components during the run: As the size
of the a/r-hedge is not bounded, it may consist of 2^|X| different threads, each
using a different set of locks. Thus, exponentially many rules may be required to
process a single hedge.
This is where the subsumption ordering comes into play. Instead of precisely
computing the use-sets, we nondeterministically guess the use-sets at the leaves of
the hedge and at acquisition-nodes. Then, we check that the actual use-sets are
below the guessed sets. Moreover, we ensure that the guessed use-sets of spawned
threads are the same as the guessed use-set of the spawning thread, such that
combination of threads at use-nodes does not involve different use-sets. This way,
we can bound the number of use-sets during the run of the automaton to O(|X |).
Moreover, for processing a use-node, we do not need to instantiate the rule for
the exact set of used locks, but it is sufficient to specify an upper bound. For this
purpose, we introduce rule-templates of the form 〈〉X(q1)q2 → q ∀X ⊆ Y , and
count such a template as a single rule. Later, we show how to translate those
templates to a DPN-Acceptor, requiring only polynomially many rules.




ε_h →δ as^ε_h[u ↦ u]                      for all u ⊆ X                      (h-empty)
(s_t, X) #_h s →δ as^#_h(s_t, X, s)        if s_t.u = s.u                     (h-cons)
ε_s →δ as^ε_s[u ↦ u]                      for all u ⊆ X                      (u-empty)
s_t #_s s_s →δ as^#_s(s_t, s_s)            if s_t.u = s_s.u                   (u-cons)
τ →δ as^τ_t[u ↦ u]                        for all u ⊆ X                      (nil)
〈〉X(s_s) s_t →δ as^〈〉_t(X, s_s, s_t)       ∀X ⊆ s_t.u, if s_t.u = s_s.u       (use)
〈x s_t →δ as^〈_t(x, s_t)[u ↦ u′]           for all u′ ⊇ s_t.u                 (acq)
〉x s_t →δ as^〉_t(x, s_t)                                                      (rel)
where we use the notation s.u for a (hedge) acquisition structure s ∈ AS ∪ ASh,
to identify the used-set component of s, and s[u ↦ u′] to replace the used-set
component of the (hedge) acquisition structure s by the set u′.
First of all, we show the correctness of A′Cons, i.e., we have
L(A′Cons) = L(ACons).
Proof. It is straightforward to show that any hedge h accepted by A′Cons with
acquisition structure σ′ is accepted by ACons with a smaller acquisition structure
σ ≤ σ′:
h ∈ A′Cons(σ′) ⟹ ∃σ. σ ≤ σ′ ∧ h ∈ ACons(σ).
As consistency is monotonic w.r.t. the subsumption ordering (Lemma 9.5), we
get L(A′Cons) ⊆ L(ACons).
For the ⊇-direction, we show the following statement, which implies L(A′Cons) ⊇
L(ACons): For all h ∈ HARls, t ∈ TAR, l ∈ TAR∗, σ ∈ AS, and X, u′ ⊆ X we have:
h ∈ ACons(σ, X) ∧ u′ ⊇ σ.u ⟹ h ∈ A′Cons(σ[u ↦ u′], X) (1)
t ∈ ACons(σ) ∧ u′ ⊇ σ.u ⟹ t ∈ A′Cons(σ[u ↦ u′]) (2)
l ∈ ACons(σ) ∧ u′ ⊇ σ.u ⟹ l ∈ A′Cons(σ[u ↦ u′]) (3)
This statement is shown by a rather straightforward induction on the structure
of lock-a/r-hedges, i.e., by induction on h, t, and l: The case h = ε is trivial.
In the case h = (t, X_t)hˆ, we obtain acquisition structures σ_t ∈ AS and (σˆ, Xˆ) ∈
ASh such that
(σ, X) = as^#_h(σ_t, X_t, (σˆ, Xˆ)) = (σ_t, X_t) ‖ (σˆ, Xˆ) ∧ t ∈ ACons(σ_t) ∧ hˆ ∈ ACons(σˆ, Xˆ).
By unfolding the definition of ‖, we get σ.u = σ_t.u ∪ σˆ.u, and thus u′ ⊇ σ_t.u and
u′ ⊇ σˆ.u. Hence, we can apply the induction hypothesis and get
t ∈ A′Cons(σ_t[u ↦ u′]) ∧ hˆ ∈ A′Cons(σˆ[u ↦ u′], Xˆ).
Using the (h-cons)-rule, we get
h ∈ A′Cons(as^#_h(σ_t[u ↦ u′], X_t, (σˆ[u ↦ u′], Xˆ))).
Moreover, we have
as^#_h(σ_t[u ↦ u′], X_t, (σˆ[u ↦ u′], Xˆ)) = as^#_h(σ_t, X_t, (σˆ, Xˆ))[u ↦ u′],
which completes the case.
The case t = τ is, again, trivial.




In the case t = 〈〉X_u(s)tˆ, we obtain σ_s, σˆ ∈ AS, such that
σ = as^〈〉_t(X_u, σ_s, σˆ) ∧ s ∈ ACons(σ_s) ∧ tˆ ∈ ACons(σˆ).
Hence, we have σ.u = σ_s.u ∪ σˆ.u ∪ X_u, and thus u′ ⊇ σ_s.u and u′ ⊇ σˆ.u and
u′ ⊇ X_u. The induction hypothesis yields
s ∈ A′Cons(σ_s[u ↦ u′]) ∧ tˆ ∈ A′Cons(σˆ[u ↦ u′]).
By the (use)-rule, we get t ∈ A′Cons(as^〈〉_t(X_u, σ_s[u ↦ u′], σˆ[u ↦ u′])). Moreover, we
have as^〈〉_t(X_u, σ_s[u ↦ u′], σˆ[u ↦ u′]) = as^〈〉_t(X_u, σ_s, σˆ)[u ↦ u′], which completes the
case.
In the case t = 〈xtˆ, we obtain σˆ ∈ AS, such that
σ = as^〈_t(x, σˆ) ∧ tˆ ∈ ACons(σˆ).
By induction hypothesis, we have tˆ ∈ A′Cons(σˆ), and with the (acq)-rule, we get
t ∈ A′Cons(as^〈_t(x, σˆ)[u ↦ u′]), which completes the case.
The case t = 〉xtˆ is proved by straightforward application of the induction
hypothesis and the (rel)-rule.
The case l = ε is trivial.
The case l = tlˆ is shown analogously to the case h = (t,X)hˆ.
Next, we show that, given a hedge h ∈ L(A′Cons), we find a polynomial size
subset of the rules of A′Cons, such that h is also accepted by this subset of rules.
As discussed above, we count rule-templates of the form 〈〉X(q1, q2)→δ q ∀X ⊆ Y
as one rule here. We already argued that, in an accepting run of ACons, there occur
only O(|X |) different acquisition-sets (a), release-sets (r), initially held locksets
(X), and acquisition-graphs (ga), as well as O(|X |2) different release-graphs. The
same argument also works for A′Cons. Moreover, the u-components of the states
of A′Cons only change at acquisition-nodes, and each u-component guessed at a
leaf-node is either equal to the u-component at an acquisition-node, or to the u-
component guessed by the (h-empty)-rule. Thus, there are only O(|X |) different
u-components during a run of A′Cons. Together, A′Cons requires only poly(|X |)
states to accept h. As we count the use-rule-templates as one rule, and need to
instantiate the (h-cons)-rule for only O(|X |) different sets X, it also requires only
poly(|X |) rules. Together, we get the following lemma:
9 Complexity
Lemma 9.6. There is a deterministic polynomial time algorithm that takes a
certificate C, and constructs an automaton ACCons of size poly(|C|), such that
L(ACCons) ⊆ L(ACons).
Moreover, given a schedulable lock-a/r-hedge h, there is a certificate C of size
poly(|X |), such that h ∈ L(ACCons).
Proof. The algorithm first checks that the certificate C consists of a subset of
rules from A′Cons and a single final state qi ∈ Cons. If this is not the case, the
certificate is considered invalid, and the algorithm returns an empty automaton.
Otherwise, it returns the automaton ACCons that consists of the rules from C, with
the only final state qi. We have
L(ACCons) ⊆ L(A′Cons) = L(ACons).
As described above, for every lock-a/r-hedge h accepted by A′Cons, we find
poly(|X |) rules of A′Cons that are sufficient to accept h. The certificate C for the
hedge h consists of these rules and the final state in which h is accepted.
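The certificate check from this proof can be illustrated by a minimal Python sketch. The names Rule, Automaton, restrict, is_rule, and is_final are assumptions made for illustration and do not occur in the construction itself. As A′Cons has exponentially many rules, the sketch assumes that membership of a single rule can be tested directly by a predicate, so that the check runs in time polynomial in the size of the certificate.

```python
from typing import Callable, FrozenSet, Iterable, NamedTuple, Tuple


class Rule(NamedTuple):
    label: str                 # node label the rule applies to
    children: Tuple[str, ...]  # states required at the child positions
    target: str                # state reached by applying the rule


class Automaton(NamedTuple):
    rules: FrozenSet[Rule]
    final: FrozenSet[str]


# Automaton with empty language, returned for invalid certificates.
EMPTY = Automaton(frozenset(), frozenset())


def restrict(cert_rules: Iterable[Rule], cert_final: str,
             is_rule: Callable[[Rule], bool],
             is_final: Callable[[str], bool]) -> Automaton:
    """Return the automaton described by the certificate, or EMPTY if it is invalid."""
    rules = frozenset(cert_rules)
    # Every certificate rule must be a rule of the full automaton,
    # and the single final state must be final there as well.
    if not all(is_rule(r) for r in rules) or not is_final(cert_final):
        return EMPTY
    return Automaton(rules, frozenset({cert_final}))
```

Since the returned automaton only contains rules and a final state of the full automaton, its language is included in that of the full automaton, which mirrors the inclusion stated in the lemma.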
Note that it is sufficient to have a certificate of polynomial size in the input,
in order to show membership in NP. Here, we additionally demonstrate that the
size of the certificate only depends polynomially on the number of locks, thus
showing that the high complexity is caused by the number of locks, and not by
the program size.
With a slightly more involved argument, we could bound the degree
of the polynomial, and show that a certificate of size O(|X |²) is sufficient. This
would match the complexity of the algorithm presented in Chapter 6. The idea is
to also guess the edges that will be added to the release-graph at a release-node,
and only check that the actual edges are below the guessed set. Moreover, the
certificate would not store the rules of the automaton, but only the sets of locks.
The instantiation of the rules with the guessed sets of locks can be done by the
verifier in polynomial time. This would only require O(|X |) different sets of locks
to be guessed. As each set of locks can be represented by |X | bits, the certificate
would have size O(|X |²).
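The size estimate can be made concrete with a small Python sketch. It assumes a fixed enumeration of the locks and encodes each set of locks as a bitmask with one bit per lock; the function names encode and certificate_bits are hypothetical.

```python
def encode(lockset, locks):
    """Encode a set of locks as an integer bitmask with one bit per lock."""
    index = {x: i for i, x in enumerate(locks)}  # fixed position of each lock
    mask = 0
    for x in lockset:
        mask |= 1 << index[x]
    return mask


def certificate_bits(locksets, locks):
    """Number of bits needed to store the given sets of locks as bitmasks."""
    return len(locksets) * len(locks)
```

With O(|X |) sets of |X | bits each, certificate_bits grows quadratically in the number of locks, which is the bound claimed above.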
Bounding the DPN-Acceptor In the last paragraph, we have shown that we
only need a polynomial size subset of the automaton A′Cons to accept a lock-a/r-
hedge. In this paragraph, we show the analogous result for lock-execution-hedges
and DPN-Acceptors.
Regard a run of the DPN-Acceptor DACCons (cf. Definition 5.2) that accepts a
schedulable lock-execution-hedge h, and assume that |C| = poly(|X |). The DPN
has control-states of the form (p, X, q) and (ub,q˜, X, u, q). The X-components are
the sets of currently acquired locks. They are only changed on final acquisitions
and initial releases, not while processing usages of locks. As h is consistent, there
are only O(|X |) different sets of initially held locks. As ar(h) is schedulable, there
is at most one final acquisition and one initial release for each lock. Thus, there
are only O(|X |) different X-components during the run of DACCons . As ACCons has
size poly(|X |), there are only poly(|X |) different q- and q˜-components. However,
there may be 2^|X | different u-components during a single run of DACCons . We had
the analogous problem for the automaton ACons: There, we had worst-case expo-
nentially many 〈〉X-nodes. The problem was solved by introducing rule-templates
of the form 〈〉X(q1)q2 → q ∀X ⊆ Y . In order to check whether an instance of
such a rule-template is applicable, the DPN-Acceptor does not need to precisely
collect the set of used locks, but only needs to check whether the set of used locks
is a subset of Y , i.e., whether each used lock is contained in Y .
We modify the DPN-Acceptor DA from Definition 5.2 accordingly. We obtain
D′A by replacing the following rules of DA:
(ub,q˜, X, u, q)γ 〈x↪→ (ub,q˜, X, u, q)x⊥γ if {x} \ X ⊆ u (u-acq)
(pb, X, q)γ 〈x↪→ (ub,q˜, X, u, q′)x⊤γ if 〈〉Y (q′)q˜ →δ q ∀Y ⊆ u and {x} \ X ⊆ u (u-begin)
(ub,q˜, X, u, q)x⊤ 〉x↪→ (pb, X, q˜) if εs →δ q (u-end)
where δ is the transition relation of A. The automata ACF and ACI are the same
as for DA.
Intuitively, we replace accumulation of the set of used locks by checking whether
the used locks are below the upper bound from the template-rule. The u-
component in the new set of rules stores this upper bound, and is not modified
during a usage. Hence, we do not require more u-components than there are
template-rules in A.
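The difference between accumulating the set of used locks and checking against a fixed upper bound can be sketched in Python as follows. The function names and the representation of a usage as a list of acquired locks are assumptions made for illustration; held plays the role of the X-component and bound the role of the u-component.

```python
def accumulate_then_check(acquired, held, bound):
    """Collect the exact set of used locks first, then compare it with the bound."""
    used = set()
    for x in acquired:
        if x not in held:  # acquisitions of already held locks are not counted
            used.add(x)
    return used <= bound


def check_on_the_fly(acquired, held, bound):
    """Store only the fixed bound and test each acquisition against it."""
    # Corresponds to the side condition that {x} minus X is a subset of u.
    return all(x in held or x in bound for x in acquired)
```

Both functions return the same result, but the second one never stores the accumulated set, so the only lock sets that have to be remembered during a usage are the bounds themselves.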
Analogously to Theorem 5.3, we have
L(D′A) = ar⁻¹(L(A)).
Proof. The proof is analogous to that of Theorem 5.3. For the induction, the
statements for hedges (1) and trees outside usages (2) remain the same. For
same-level trees ts and lockstacks µ with µ ts⇀ ε, we show the following statement
instead:
∃c′ ∈ L(ACF). (ub,q˜, X, u, q)µ⊥x⊤w ts=⇒ c′[ϕ]
⇐⇒ ∃u′′ ⊆ X , q′ ∈ Q, s ∈ TAR∗. ars(ts, X) = (u′′, s)
∧ ϕ = (ub,q˜, X, u, q′)x⊤w ∧ u′′ ⊆ u ∧ s#sq′ →∗δ q    (3)
The proof of the cases is analogous to that of Theorem 5.3.
Next, we have to show that we only need polynomially many states and rules
for a single run of D′ACCons that accepts the lock-execution-hedge h. We already
argued that we only need O(|X |) sets of currently acquired locks (X-components).
Moreover, each u-component stems from a template-rule of the tree automaton,
thus we only need O(|ACCons|) = poly(|X |) such components. The sets X in the
states of the initial automaton ACI are exactly the sets of initially held locks of
the threads in h, hence we only need O(|X |) of them. Also, the automaton ACF
requires only rules for the O(|X |) many X-components that actually occur during
the run of the DPN. Together, we get the following lemma:
Lemma 9.7. There is a deterministic algorithm that takes a certificate C, and
constructs in time poly(|C|) · O(|Act|) a DPN-Acceptor DCACons such that L(DCACons) ⊆
L(DACons) and |DCACons| = poly(|C|) · O(|Act|).
Moreover, for any lock-execution-hedge h ∈ L(DACons), there is a certificate C
of size poly(|X |), such that h ∈ L(DCACons).
Proof. We first check that the certificate has the form C = (C ′, X¯), for a certifi-
cate C ′ for the algorithm from Lemma 9.6, and a set X¯ ⊆ 2X of sets of locks. If
the certificate has the wrong format, we return an empty DPN-Acceptor.
Then, we run the algorithm from Lemma 9.6, and obtain AC′Cons. Next, DCACons is
constructed like D′AC′Cons, but with its rules instantiated
only for X-components from X¯.
As we have to instantiate the rules for base- and spawn-actions for all possible
action labels from Act, we get the extra factor |Act| in the size and runtime
estimation.





By construction, the rules of DCACons are a subset of the rules of D′AC′Cons, and thus we have
L(DCACons) ⊆ L(D′AC′Cons) = L(DAC′Cons) = ar⁻¹(L(AC′Cons)) ⊆ ar⁻¹(L(ACons)) = L(DACons).
Given a lock-execution-hedge h ∈ L(DACons), we have ar(h) ∈ L(ACons). By
Lemma 9.6, we obtain C ′ with |C ′| = poly(|X |) such that ar(h) ∈ L(AC′Cons). Thus,
we choose C := (C ′, X¯) where the set X¯ contains the X-components that occur
in a run of D′AC′Cons that accepts h. As argued above, their number is bounded
by poly(|X |).
Bounding the Automaton for Consistent Configurations The cross-product
construction requires an automaton AConf ls that accepts the consistent configura-
tions (cf. Section 5.2). It has exponentially many states in the number of locks.
However, for accepting a single consistent configuration, only polynomially many
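states are required. Thus, we get:
Lemma 9.8. There is an algorithm that takes a certificate C, and constructs an
automaton ACConf ls such that L(ACConf ls) ⊆ Conf ls. The algorithm needs time poly(|P ||Γ||C|).
Moreover, for any consistent configuration c ∈ Conf ls, there exists a certificate
C of size poly(|X |) such that c ∈ L(ACConf ls).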
Proof. We use the automaton AConf ls that was constructed in Section 5.2. The
certificate C is interpreted as a set of sets of locks, and the automaton ACConf ls is
constructed by instantiating the rules of AConf ls only for locksets from C. The time
estimation follows from |X | = O(|Γ|) (every lock is bound to a stack-symbol).
In a single accepting run of AConf ls , only O(|X |) different sets of locks are re-
quired, as the sets of locks only increase, and the set of locks for a thread is
disjoint from the overall set of locks. These sets of locks are chosen as certificate
C.
NP-Algorithm for Negation-Free EF-Formulas Combining the results of the
last paragraphs, we get:
Lemma 9.9. There is a polynomial time algorithm that, given a Monitor-DPN
M , an automaton A, and a certificate C, computes automata preCls,M(A) and
pre∗,Cls,M(A) such that
L(preCls,M(A)) ⊆ prels,M(L(A) ∩ Conf ls ∩ valid) and |preCls,M(A)| = |A|poly(|M ||C|),
L(pre∗,Cls,M(A)) ⊆ pre∗ls,M(L(A) ∩ Conf ls ∩ valid) and |pre∗,Cls,M(A)| = |A|poly(|M ||C|).
Moreover, given a configuration c ∈ prels,M(L(A) ∩ Conf ls ∩ valid), there is a
certificate C of size poly(|X |) such that c ∈ L(preCls,M(A)). Also, given a configu-
ration c ∈ pre∗ls,M(L(A) ∩ Conf ls ∩ valid), there is a certificate C of size poly(|X |)
such that c ∈ L(pre∗,Cls,M(A)).
Proof. The algorithm checks whether the certificate has the form C = (C1, C2),
such that C1 is a valid certificate for the algorithm from Lemma 9.7 and C2
is a valid certificate for the algorithm from Lemma 9.8. Otherwise, the empty
automaton is returned. For a valid certificate, we define
pre∗,Cls,M(A) := π1(pre∗×(π1⁻¹(L(A)) ∩ L(ACF×)) ∩ A′CI×),
where (ACI× ,ACF× ,M×) is the cross-product of M and DC1ACons = (ACI ,ACF ,M2),
and we set
A′CI× := valid× ∩ π1⁻¹(AC2Conf ls) ∩ π2⁻¹(ACI).
The size bound for this automaton is shown analogously to Theorem 6.5, using
Lemmas 9.7 and 9.8.
The proof is completed by the following calculation: Similar to Section 6.3, we
have
c ∈ pre∗ls,M(L(A) ∩ Conf ls ∩ valid)
⇐⇒ ∃h ∈ H, c′ ∈ L(A). c ∈ Conf ls ∩ valid ∧ c h=⇒ c′ ∧ (h×ls(c)) ∈ L(DACons)
With Lemma 9.7, we get:
⇐⇒ ∃h ∈ H, c′ ∈ L(A), C1. |C1| = poly(|X |)
∧ c ∈ Conf ls ∩ valid ∧ c h=⇒ c′ ∧ (h×ls(c)) ∈ L(DC1ACons)
Using the cross-product construction, we get, analogously to Section 5.2:
⇐⇒ ∃C1. |C1| = poly(|X |) ∧ c ∈ π1(pre∗×(π1⁻¹(L(A)) ∩ L(ACF×)) ∩ ACI×)
Unfolding the definition of ACI× , using Lemma 9.8, and folding the definition of A′CI× , we get:
⇐⇒ ∃C1, C2. |C1|, |C2| = poly(|X |) ∧ c ∈ π1(pre∗×(π1⁻¹(L(A)) ∩ L(ACF×)) ∩ A′CI×)
The argumentation for immediate predecessor sets is analogous.
This algorithm is now used to compute the semantics of a negation-free EF-
formula.
Theorem 9.10. Given a negation-free EF-formula ϕ, a Monitor-DPN M , and
a certificate C, there is an algorithm that computes an automaton ACϕ of size
|ACϕ | = sP(ϕ)^nP(ϕ) · poly(|M |)^(nEX(ϕ)+nEF(ϕ)) · poly(|C|),
such that L(ACϕ ) ⊆ [[ϕ]]. The algorithm is polynomial in |M |, |C|, and sP(ϕ), and
exponential in the number of operators of ϕ.
Moreover, for any configuration c ∈ [[ϕ]], we get a certificate C of size
|C| = poly(|X |(nEF(ϕ) + 1)),
such that c ∈ L(ACϕ ).
Proof. The automaton ACϕ is computed similarly to the algorithm presented in
Section 6.4.2. However, we use the nondeterministic predecessor set computation
from Lemma 9.9. Thus a certificate of size poly(|X |) is required for each of the
nEF(ϕ) EF-operators. (The immediate predecessor sets are computed with the
polynomial algorithm from Section 7.2.2.) Moreover, we intersect the final result
with the set of consistent configurations. Due to Lemma 9.8, this requires
another certificate of size poly(|X |).
The size of the resulting automaton is shown analogously to Section 6.4.2.
We immediately get the following corollary, which is the main result of this
subsection:
Corollary 9.11. Model-checking negation-free EF-formulas with a fixed number
of operators is in NP for Monitor-DPNs.
Finally, note that our result does not show NP-easiness for checking EF-
formulas with unboundedly many operators, as the verifier requires exponential
time in the number of atomic propositions and path-operators.
9.2.2.3 Well-Nested, Non-Reentrant Locks
In Chapter 8, we briefly discussed how the predecessor set computation that was
developed in this thesis can be adapted to DPNs with well-nested, non-reentrant
locks (Lock-DPNs). Here, it was sufficient to keep track of the set of currently
acquired locks in configurations, and the DPN-Acceptor DACons could be realized
without a stack. The same modifications that we made for monitors also apply
to the stackless version of DACons . Thus, we get an NP-algorithm also for
well-nested, non-reentrant locks. The cross-product construction of a Lock-DPN and
a stackless DPN-Acceptor can be done in polynomial time, along the lines of [79].
As this thesis is focused on reentrant monitors, we omit any formal details here,
and just state the result:
Theorem 9.12. Model-checking negation-free EF-formulas with a fixed number
of operators is in NP for Lock-DPNs.
9.2.2.4 Model-Checking DFNs without Spawns inside Monitors
In the last subsection, we have shown that model-checking Monitor-DPNs against
negation-free EF-formulas with a fixed number of operators and regular atomic
propositions is in NP. In this subsection, we present a polynomial time algo-
rithm for model-checking fixed-size, negation-free, double-indexed EF-formulas
for Monitor-DFNs without spawns inside monitors.
The algorithm is based on the observation that we only need to keep track of
two simultaneously running threads in order to check a double-indexed atomic
proposition. If a thread is spawned, both the spawning and the spawned thread
hold no locks, and thus can be discarded if not involved in satisfying any double-
indexed atomic proposition. Thus, in order to check a formula ϕ with n double-
indexed atomic propositions, it is sufficient to keep track of 2n threads simulta-
neously.
The first step of the polynomial algorithm is to bound the configurations of the
DFN to at most 2n threads. When executing a spawn-rule, we nondeterministi-
cally choose to drop the spawning or the spawned thread from the configuration
or to keep both threads if this would not exceed the limit of 2n threads. The
second step is to model-check the resulting bounded DFN, which can be done in
polynomial time.
Let M = (P,Act,X ,∆, locks) be the Monitor-DFN, and p0 ∈ P be the start
configuration. Note that locks : P → X ∗ is a function from control-states to
lockstacks here, and the rules are assumed to be compatible with this function.
The bounded DFN A is a finite-state machine with the states Q = {c ∈ Conf ls |
|c| ≤ 2n} and the following rules:
c1 p c2 o−→A c1 p′ c2 for p o↪→ p′ ∈ ∆ and c1 p′ c2 ∈ Conf ls (no-spawn)
c1 p c2 o−→A c1 ps p′ c2 for p o↪→ ps ♯ p′ ∈ ∆, |c1 ps p′ c2| ≤ 2n (spawn-1)
c1 p c2 o−→A c1 p′ c2 for p o↪→ ps ♯ p′ ∈ ∆ (spawn-2)
c1 p c2 o−→A c1 ps c2 for p o↪→ ps ♯ p′ ∈ ∆ (spawn-3)
The (no-spawn)-rule simply changes the state of a thread, and ensures that a step
can only be applied if the new configuration is consistent. The (spawn-1)-rule
spawns a new thread, and keeps both the spawning and the spawned thread.
This is only possible if the number of threads does not exceed 2n. The (spawn-
2)-rule drops the spawned thread, and the (spawn-3)-rule drops the spawning
thread. The bounded DFN has fewer than |P |^(2n+1) different states, thus its size is
polynomial in |P | (as n is a constant).
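The four rules of the bounded DFN can be read as a successor function on configurations. The following Python sketch illustrates this under several assumptions: configurations are tuples of control-states, action labels are omitted, rules are given as tuples ("step", p, p′) or ("spawn", p, ps, p′), and a configuration is regarded as consistent if no lock is held by two different threads. The names consistent and successors are hypothetical.

```python
def consistent(c, locks):
    """Check that no lock is held by two different threads of configuration c."""
    seen = set()
    for p in c:
        held = set(locks[p])  # locks on the lockstack of control-state p
        if held & seen:
            return False
        seen |= held
    return True


def successors(c, rules, locks, bound):
    """One-step successors of configuration c in the bounded DFN."""
    result = set()
    for i, p in enumerate(c):
        c1, c2 = c[:i], c[i + 1:]
        for rule in rules:
            if rule[1] != p:
                continue
            if rule[0] == "step":            # (no-spawn)
                new = c1 + (rule[2],) + c2
                if consistent(new, locks):
                    result.add(new)
            else:                            # spawn rule p -> ps # p'
                _, _, ps, p2 = rule
                both = c1 + (ps, p2) + c2
                if len(both) <= bound:       # (spawn-1): keep both threads
                    result.add(both)
                result.add(c1 + (p2,) + c2)  # (spawn-2): drop the spawned thread
                result.add(c1 + (ps,) + c2)  # (spawn-3): drop the spawning thread
    return result
```

Calling successors with bound = 2n never produces a configuration with more than 2n threads, as no rule other than (spawn-1) increases the number of threads.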
By assumption, spawn-statements are not done inside monitors, and spawned
threads hold no monitors initially. Thus, the (spawn-2)- and (spawn-3)-rules only
drop threads that hold no locks. Hence, any execution of A corresponds to an
execution of M :
b −→∗A b′ =⇒ ∃c′. b′ ⪯ c′ ∧ b −→∗M c′.
Here, ⪯ denotes the subword relation, i.e., we have c ⪯ c′, iff c can be obtained
from c′ by deleting some elements. As atomic propositions remain valid when
adding threads, we have
b |=A ϕ =⇒ b |=M ϕ.
For the other direction, we show that we do not need to keep more than 2n
threads in order to satisfy ϕ. Formally, we have to generalize the statement for the
inductive proof. We show
c |=M ϕ
=⇒ ∃b. b ⪯ c ∧ |b| ≤ idx(ϕ) ∧ (∀b′. b ⪯ b′ ⪯ c ∧ |b′| ≤ 2n =⇒ b′ |=A ϕ) ,
where idx(ϕ) is twice the number of atomic propositions in ϕ. Intuitively, this
proposition fixes a set b of threads that are required to satisfy ϕ, and states that
ϕ is satisfied on the bounded DFN for all sub-configurations of c that contain b.
This generalization is required for conjunction and disjunction of formulas, where
we have to consider the required threads of both formulas.
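The subword relation used in this argument can be checked by a single pass over the larger configuration. The following Python sketch assumes that configurations are represented as tuples of control-states; the function name is_subword is hypothetical.

```python
def is_subword(c, c_prime):
    """Return True iff c can be obtained from c_prime by deleting some elements."""
    remaining = iter(c_prime)
    # Each element of c must be found in the not yet consumed rest of c_prime,
    # which preserves the order of the threads.
    return all(any(p == q for q in remaining) for p in c)
```

For example, is_subword(("p", "r"), ("p", "q", "r")) holds, while is_subword(("r", "p"), ("p", "q", "r")) does not, as the order of the elements must be preserved.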
The proof is done by rather straightforward induction on the structure of ϕ: If
ϕ is an atomic proposition, we have idx(ϕ) = 2 and choose b to contain the two
threads that satisfy the atomic proposition. If ϕ is a conjunction or disjunction
of ϕ1 and ϕ2, we choose b to contain those threads required by ϕ1 or ϕ2. If we
have ϕ = EFϕ′ or ϕ = EXϕ′, we include a thread in b if it is required by ϕ′, or if
it spawns a required thread. During the execution, a spawned thread is dropped
if it is not required. Also the spawning thread is dropped if its only purpose was
to spawn a required thread. Thus, we do not need to consider more than idx(ϕ′)
threads at any point of the execution, and the execution can be simulated by A.
Summarized, we have p0 |=M ϕ if and only if p0 |=A ϕ. As model-checking of
CTL for finite-state machines can be done in polynomial time [27], we get the
following theorem:
Theorem 9.13. Given a DFN M that spawns no threads inside monitors,
a control-state p0, and a fixed-size, negation-free, double-indexed EF-formula
ϕ, there is a polynomial time algorithm that decides p0 |=M ϕ.
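The polynomial bound for the second step rests on the standard fixed point characterization of the EX- and EF-operators on finite-state machines. The following Python sketch illustrates this for an explicitly given transition relation. It is a naive illustration, not the algorithm of [27]; the names ex and ef and the dictionary-based representation of edges are assumptions.

```python
def ex(states, edges, target):
    """States that have at least one successor in target."""
    return {s for s in states if any(t in target for t in edges.get(s, ()))}


def ef(states, edges, target):
    """States from which some state in target is reachable (least fixed point)."""
    reach = set(target)
    while True:
        new = ex(states, edges, reach) - reach
        if not new:
            return reach
        reach |= new
```

Each iteration of ef adds at least one state, so the loop terminates after at most as many iterations as there are states.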
9.3 Stronger Synchronization Mechanisms
In the last sections, we discussed the complexity of checking models with locks.
We discussed monitors and well-nested, non-reentrant locks. In this section,
we discuss stronger synchronization mechanisms. As explained in Section 8.1,
we left open the decidability of checking models with well-nested, reentrant
locks. Non-well-nested, non-reentrant locks can be used to emulate rendezvous-
communication [57]. Moreover, in [44], we analyzed Join-Lock-DPNs, which use
well-nested, non-reentrant locks and join-synchronization. Thus we regard models
that synchronize via locks and joins, and models that synchronize via rendezvous.
Rendezvous-synchronization allows two threads to wait for each other, and
then simultaneously execute two statements. For this purpose, we assume a set
of synchronized actions {a1, a¯1, . . . , an, a¯n} with n ≥ 2, such that an ai-step must
pair with an a¯i step such that both steps are executed simultaneously, and, vice
versa, an a¯i step must pair with an ai-step. Rendezvous-synchronization, usually
in the form of synchronous message-passing, is found in many frameworks for
distributed computing, like the well-known Message-Passing Interface (MPI) (cf.
[49]).
Intuitively, join-synchronization allows a thread to wait until some other thread
has terminated. Join-synchronization is supported by many programming lan-
guages, e.g. Java [47]. For our purpose, we introduce Join-Lock-DPNs. A Join-
Lock-DPN is a Lock-DPN, with a special jo-action. A thread t can execute a
jo-action only if all threads created by t so far have terminated. Termination of
a thread is modeled by reaching a special control-state. We studied Join-Lock-
DPNs in [44], and showed that regular set reachability is decidable. Here, we show
that already deciding reachability of a single control-state is PSPACE-hard. For
a formal semantics of Join-Lock-DPNs, we refer to [44]. However, an intuitive




Table 9.2: Complexity of checking pairwise reachability.
EF(p1‖p2)   | DPN      | PPDS   | 2PDS   | DFN      | PFSM | nFSM
Monitors    | NP       | NP     | NP     | NP       | P    | P
Join-Lock   | ≥PSPACE  | —      | —      | ≥PSPACE  | —    | —
Rendezvous  | undec.   | undec. | undec. | ≥NP      | NP   | P
Table 9.2 shows the complexity for pairwise reachability properties for models
with various synchronization mechanisms. The complexities of the first row have
been shown in Section 9.2. The complexities for rendezvous-communication follow
straightforwardly from results of Taylor [113] (PFSM) and Ramalingam [104]
(2PDS). Note that we do not investigate here whether the NP-easiness results of
Taylor [113] also apply to DFNs.
It remains to show the results for Join-Lock-DPNs and Join-Lock-DFNs. Note
that join-synchronization only makes sense for models with thread creation. We
show, by reduction from QBF, that already deciding reachability of a single
control-state from the start configuration is PSPACE-hard for Join-Lock-DFNs.
Given a QBF (V,C), we construct the following program:
t1: sync (x1){spawn t2; join } OR sync (¬x1){spawn t2; join }; a: terminate
t2: sync (x2){spawn t3; join }; sync (¬x2){spawn t3; join }; terminate
...
tn: sync (xn){spawn tn+1; join }; sync (¬xn){spawn tn+1; join }; terminate
tn+1:
sync (l11){} OR sync (l12){} OR sync (l13){}
...
sync (lm1){} OR sync (lm2){} OR sync (lm3){}
terminate
The chooser threads t1 . . . tn choose all valuations that have to be checked, one by
one. An existentially quantified variable is modeled by nondeterministic choice,
and a universally quantified variable is modeled by sequentially producing both
valuations. It is straightforward to show that the above program can reach label a
if and only if the QBF is true. The above program translates to a Join-Lock-DFN
with non-reentrant monitors, and thus deciding single control-state reachability
for Join-Lock-DFNs is PSPACE-hard. The result for Join-Lock-DPNs follows
from the fact that Join-Lock-DPNs are more expressive than Join-Lock-DFNs.
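The construction of the program from a QBF can be sketched as a small generator in Python. The input encoding is an assumption made for illustration: the quantifier prefix is a string over 'E' and 'A' for the variables x1, ..., xn, and each clause is a tuple of three signed integers, where a negative number denotes a negated variable. The names lock and build_program are hypothetical.

```python
def lock(lit):
    """Name of the lock that represents a literal."""
    return f"x{lit}" if lit > 0 else f"¬x{-lit}"


def build_program(quantifiers, clauses):
    """Generate the text of the Join-Lock program for a QBF."""
    n = len(quantifiers)
    lines = []
    for i, q in enumerate(quantifiers, start=1):
        body = f"{{spawn t{i + 1}; join }}"
        # Existential variables choose one valuation nondeterministically,
        # universal variables produce both valuations sequentially.
        sep = " OR " if q == "E" else "; "
        tail = "a: terminate" if i == 1 else "terminate"
        lines.append(f"t{i}: sync ({lock(i)}){body}{sep}sync ({lock(-i)}){body}; {tail}")
    lines.append(f"t{n + 1}:")
    for clause in clauses:
        # The checker thread must acquire the lock of one literal per clause.
        lines.append("  " + " OR ".join(f"sync ({lock(l)}){{}}" for l in clause))
    lines.append("  terminate")
    return "\n".join(lines)
```

The generated program has one chooser thread per variable and one line per clause, so its size is linear in the size of the QBF.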
9.4 Discussion and Related Work
In this chapter, we have established upper and lower complexity bounds for vari-
ous model-checking problems. The main result is that model-checking negation-
free EF-formulas with a fixed number of operators and regular atomic proposi-
tions is NP-complete for Monitor-DPNs. The easiness part of this result implies
that there are NP-algorithms for various interesting problems that can be mapped
to those EF-formulas. These include detection of data-races and atomic-set se-
rializability violations, bitvector-analysis, and bounded model-checking with a
constant bound. Moreover, the high complexity of the problem is induced by the
number of locks, i.e., the number of nondeterministic choices required by the NP-
algorithm only depends on the number of locks, not on the program size. This
also matches the complexity of the algorithm that we developed in Chapter 6,
which is exponential only in the number of locks.
In order to show that our model-checking algorithm does not solve an overly general
problem at the cost of increased complexity, we have established lower bounds for
rather special problems. These imply that detection of data-races and atomic-
set serializability violations, as well as bitvector-analysis and bounded model-
checking with a bound of two rendezvous-communications is NP-hard, even for
more special models than Monitor-DPNs. Our hardness results also apply to
various problems on parallel pushdown systems communicating via well-nested,
non-reentrant locks, which have been studied recently without explicitly stating
lower complexity bounds: Pairwise reachability properties [57]; model-checking
of double-indexed LTL\X-formulas [55], LTL(X,F)-formulas [56], and alternation-free
mu-calculus [56]; reachability properties w.r.t. phase-automata [61, 63]; and
bitvector-analysis [39].
We also regarded deadlock-free models with inescapable locks and no spawns
inside monitors. The assumptions of inescapable locks and no spawning in-
side monitors are realistic, and assuming deadlock-free models makes sense if
a prior analysis is used to verify absence of deadlocks. Interestingly, the com-
plexity changes for this locking discipline: There is a polynomial algorithm for
model-checking fixed-size, double-indexed EF-formulas for DFNs, while regular
set reachability remains hard. Also, iterated pairwise reachability is hard for
models with a stack. However, we had to leave open the complexity of the pair-
wise reachability problem for models with a stack. This problem is important, as
it is equivalent to data-race detection, and a lower bound would also transfer to
bitvector-analysis and atomic-set serializability violation detection.
In order to show that our problem is a rather general NP-complete problem,
we have shown PSPACE-hardness of some slightly more general problems. When
increasing the expressiveness of the properties, the problem gets harder: For
EF-formulas with negation, the problem is PSPACE-hard, already for
fixed-size, single-indexed formulas and 1PDS without locks [13]. For unbounded size,
negation-free EF-formulas with regular atomic propositions, the problem is also
PSPACE-hard, already for DFNs and 1PDS. However, we had to leave open
the exact complexity of model-checking unbounded size, negation-free, double-
indexed EF-formulas. Here, the best lower bound that we could establish is
NP-hardness for PFSMs without locks.
Also increasing the expressiveness of the model makes the problem harder: De-
ciding reachability properties for Join-Lock-DPNs [44] is PSPACE-hard. For even
more powerful synchronization mechanisms, like shared memory or rendezvous,
problems tend to get undecidable for models with a stack [104].
We already mentioned most of the related work: Bouajjani et al. [13] ap-
ply predecessor set computation to model-check linear and branching time log-
ics for pushdown systems, and also establish lower complexity bounds. Among
other results, they show PSPACE-hardness of model-checking fixed-size (single-
indexed) EF-formulas for pushdown systems, which implies our hardness results
for 2PDS, PPDS, and DPNs. Esparza [34] studies the complexity of model-
checking linear and branching time logics for Petri-nets and basic parallel pro-
cesses (BPPs). Among other results, he shows that model-checking EF-logic for
BPPs is PSPACE-hard. We have transferred this result to PFSMs. Taylor [113]
shows that checking various interesting properties for PFSMs with rendezvous-
communication is NP-complete, and Ramalingam [104] shows that these problems
become undecidable when regarding a model with stacks, e.g. 2PDS.
Regarding program analysis for concurrent programs without locks, interproce-
dural bitvector-analysis seems to be robustly in polynomial time for many classes
of models, among them programs with parallel calls [110] and programs with
parallel calls and thread creation [76]. These analyses are based on the idea of
computing possible interference, which was first used by Knoop et al. [65] for
intraprocedural analysis of parallel programs. However, more complex analysis
problems tend to become harder again: Müller-Olm and Seidl [89, 92] show that
interprocedural copy-constants and some related problems (like faint variables
and slicing) are undecidable for programs with parallel calls. Even the
intraprocedural variants of these analyses are PSPACE-complete, and the intraprocedural
variants for loop-free programs are still (co)NP-complete. They also show that
their hardness results do not depend on the ability of parallel procedure calls to
synchronize on termination, and thus the problems are also undecidable for par-
allel pushdown systems or DPNs. Interestingly, interprocedural copy-constants
and faint variables become decidable for programs with parallel calls, when as-
signments are interpreted non-atomically, i.e., a context switch may occur after
reading the right-hand side, but before writing the left-hand side [90, 91]. How-
ever, the problem is still co-NP-hard, and there is evidence that it is strictly
harder than co-NP [90]. Whether this decidability result transfers to DPNs or
even DPNs with locks is left to future research.
Finally, we used a subsumption ordering on acquisition structures for our NP-
easiness result. We originally used a similar subsumption ordering as an optimiza-




Concurrent programming becomes more and more important, as concurrent hard-
ware, like multicore processors, is becoming popular. It is prone to subtle errors,
like data-races, that are easily missed by testing. This increases the need for
automatic verification methods for concurrent programs. Modern high-level lan-
guages, like Java, support concurrency via dynamic thread creation, monitors,
and shared memory.
In this thesis, we have studied Monitor-DPNs, an abstract model suited for this
type of concurrent programs. It supports procedures, dynamic thread creation,
and reentrant monitors. We have presented an algorithm for precise lock-sensitive
predecessor set computation that can be used to decide various interesting prop-
erties, among them absence of data-races and EF-logic.
Our algorithm requires polynomial time in the size of the program, and ex-
ponential time in the number of locks. However, the exponential complexity is
a worst-case bound that is not reached by typical programs, as suggested by
experimental results on the closely related PPDS-model [39, 57, 63]. Suitable
abstractions from real programming languages to Monitor-DPNs are subject of
current research.
While automatic verification tries to prove absence of errors, a complementary
method is error-detection that tries to find errors. Here, our methods can be
applied to increase the precision of bounded model-checking.
From a theoretical point of view, we have extended the decidability boundary
for model-checking of concurrent programs with locks. Prior to this research, the
state of the art was checking pairwise reachability (viz. double-indexed temporal
logics) for parallel pushdown systems with well-nested, non-reentrant locks. We
have extended this to reachability between arbitrary regular sets (viz. EF-logic
with regular atomic propositions) for DPNs with well-nested, non-reentrant locks
(in [79]) or reentrant monitors (in this thesis). Moreover, we have established
precise upper and lower complexity bounds for our problems, showing that model-
checking negation-free EF-logic with a fixed number of operators is NP-complete,
and that the complexity depends on the number of locks rather than on the size of
the program. The NP-hardness results already hold for many practical problems
like data-race detection and bitvector-analysis.
Our method is based on execution-trees, which are a true-concurrency seman-
tics for DPNs. In addition to encoding the causal order of steps, an execution-tree
is ordered, i.e., a spawn-node has a left successor that describes the execution of
the spawned thread, and a right successor that describes the remainder of the
10 Conclusion
execution of the spawning thread. The ordering allows us to keep track of the
execution of a thread, and enables handling of execution-trees with tree automata
based methods.
While reachability analysis with regular constraints on the interleaved execu-
tions is undecidable for DPNs, reachability analysis with regular constraints on
the execution-trees is efficiently decidable. It remains decidable even for con-
straints given by the language of DPN-Acceptors, a generalization of tree au-
tomata, which may make limited use of a stack. We obtained this result by re-
ducing constrained to unconstrained reachability analysis, using a cross-product
construction.
Moreover, we have characterized the set of schedulable execution-trees of a
Monitor-DPN by a DPN-Acceptor. This result has been obtained by generalizing
the acquisition history method of Kahlon et al. [57] to execution-trees. We used
acquire/release-trees as an intermediate model, to get an elegant and modular
approach.
Combining these results, we constrain reachability analysis to only consider
schedulable execution-trees, and thus reduce lock-sensitive reachability analysis
to lock-insensitive reachability analysis. The lock-insensitive reachability analysis
is then performed by the existing predecessor set algorithm of Bouajjani et al.
[16].
Topics of current and future research include the generalization to more ex-
pressive but still decidable models, exploration of efficient implementations, ex-
ploration of suitable abstractions from real programming languages, and me-
chanically verified analysis algorithms. We already generalized our methods to
reachability analysis for Join-Lock-DPNs [44]. The next step is generalization
to CDPNs with locks and to more powerful analyses, e.g. bitvector-analyses and
atomic-set serializability violation detection.
Regarding efficient implementations, we are currently exploring lock-sensitive
analysis via regular execution-trees [44, 74]. This technique computes successor
sets rather than predecessor sets, and thus does not suffer from state-space explo-
sion due to generating acquisition structures for unreachable executions. While
we only cover reachability properties from the start configuration in [44, 74],
we are currently extending our methods to iterated reachability properties, like
bitvector-analysis and atomic-set serializability violation detection. A topic for
future research is to use Weighted-DPNs [120, 121] to avoid exploring unreachable
executions during predecessor set computation.
For the actual implementation of our algorithms, we are experimenting with
logic programming languages derived from PROLOG [28]. They allow for a suc-
cinct representation of algorithms that are based on fixed point computations.
We may use the DATALOG [43] subset of PROLOG, where predicates are in-
terpreted over finite domains. Bddbddb [69, 122] is a symbolic implementation
of DATALOG that uses binary decision diagrams (BDDs) [18]. Our research group
has extended it with the concepts required for encoding analyses based on
acquisition structures [88]. Evaluation and further improvement of this approach remains
future research. We are also planning to evaluate other implementations of PRO-
LOG or related formalisms, like the succinct solver [93], which implements the
alternation free fragment of least fixed point logic (ALFP) (cf. [93]).
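
To illustrate what a fixed point computation over a finite domain looks like, the following Python sketch evaluates the textbook transitive-closure rules reach(X,Y) :- edge(X,Y) and reach(X,Z) :- reach(X,Y), edge(Y,Z) by naive bottom-up iteration. It is only an illustration of the principle: it does not use bddbddb or the succinct solver, it represents relations as explicit sets rather than BDDs, and the function name least_fixed_point is an assumption made for this example.

```python
def least_fixed_point(edges):
    """Naive bottom-up evaluation of the two reach rules: apply the
    recursive rule until no new facts are derived."""
    reach = set(edges)  # rule 1: every edge is a reach fact
    while True:
        # rule 2: join the current reach facts with the edge relation
        derived = {(x, z)
                   for (x, y) in reach
                   for (y2, z) in edges
                   if y == y2}
        if derived <= reach:
            return reach  # nothing new: least fixed point reached
        reach |= derived


facts = least_fixed_point({(1, 2), (2, 3), (3, 4)})
```

For the three edges above, the iteration adds (1, 3) and (2, 4) in the first round and (1, 4) in the second, and then stops. Termination is guaranteed because the domain is finite and the set of facts only grows.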
Abstraction from Java [47] to Monitor-DPNs is the subject of current research. The
greatest challenge is to get precise abstractions of locks, which are referenced via
pointers in Java. A promising approach is the random isolation method of Kidd
et al. [62] that we already used successfully for parallel pushdown systems [61, 63].
We have mechanically verified most of the results that led to this thesis [72, 77]
with the interactive theorem prover Isabelle/HOL [94]. The main result of
this thesis, i.e., lock-sensitive predecessor set computation for Monitor-DPNs,
has also been mechanically verified. The formal verification was completed before
this thesis was written; consequently, this thesis contains some changes relative
to the formalization in order to improve the presentation. The main objective of the verification
was to show that our methods are correct and effective. Another objective,
which is the subject of current research, is to derive verified and efficient analysis
algorithms from the formalizations. As a first step in this direction, we developed
an efficient and mechanically verified tree automata library [71], which is based
on our mechanically verified library of efficient collection data structures [73, 75].
We also have (unpublished) formalizations of the saturation algorithm for lock-
insensitive predecessor set computation [16, 17] and of the regular execution-
tree method [74]. Integrating and extending these results to obtain efficient and





[1] R. Alur. Marrying words and trees. In Proceedings of the twenty-sixth ACM
SIGMOD-SIGACT-SIGART symposium on Principles of database systems,
PODS ’07, pages 233–242, New York, NY, USA, 2007. ACM.
[2] R. Alur and P. Madhusudan. Visibly pushdown languages. In Proceedings
of the thirty-sixth annual ACM symposium on Theory of computing, STOC
’04, pages 202–211, New York, NY, USA, 2004. ACM.
[3] R. Alur and P. Madhusudan. Adding nesting structure to words. In
O. H. Ibarra and Z. Dang, editors, Developments in Language Theory, vol-
ume 4036 of Lecture Notes in Computer Science, pages 1–13. Springer Berlin
/ Heidelberg, 2006.
[4] R. Alur and P. Madhusudan. Adding nesting structure to words. J. ACM,
56:16:1–16:43, May 2009. ISSN 0004-5411.
[5] K. R. Apt. Ten years of Hoare’s logic: A survey - part I. ACM Trans.
Program. Lang. Syst., 3:431–483, October 1981.
[6] C. Artho, K. Havelund, and A. Biere. High-level data races. Software
Testing, Verification and Reliability, 13(4):207–227, 2003.
[7] J. Baeten and W. Weijland. Process algebra. Cambridge Tracts in Theo-
retical Computer Science, 18, 1990.
[8] J. Bergstra and J. Klop. Process theory based on bisimulation semantics.
In J. de Bakker, W. de Roever, and G. Rozenberg, editors, Linear Time,
Branching Time and Partial Order in Logics and Models for Concurrency,
volume 354 of Lecture Notes in Computer Science, pages 50–122. Springer
Berlin / Heidelberg, 1989.
[9] A. Biere, A. Cimatti, E. M. Clarke, and Y. Zhu. Symbolic model checking
without BDDs. In Proceedings of the 5th International Conference on Tools
and Algorithms for Construction and Analysis of Systems, TACAS ’99,
pages 193–207, London, UK, 1999. Springer-Verlag.
[10] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, and Y. Zhu. Bounded




[11] A. Bouajjani and O. Maler. Reachability analysis of pushdown automata.
In Proc. Intern. Workshop on Verification of Infinite-State Systems (Infin-
ity’96), 1996.
[12] A. Bouajjani, R. Echahed, and P. Habermehl. Verifying infinite state pro-
cesses with sequential and parallel composition. In Proceedings of the 22nd
ACM SIGPLAN-SIGACT symposium on Principles of programming lan-
guages, POPL ’95, pages 95–106, New York, NY, USA, 1995. ACM.
[13] A. Bouajjani, J. Esparza, and O. Maler. Reachability analysis of pushdown
automata: Application to model-checking. In Proceedings of the 8th Inter-
national Conference on Concurrency Theory, pages 135–150, London, UK,
1997. Springer-Verlag. ISBN 3-540-63141-0.
[14] A. Bouajjani, J. Esparza, and T. Touili. A generic approach to the static
analysis of concurrent programs with procedures. In Proceedings of the
30th ACM SIGPLAN-SIGACT symposium on Principles of programming
languages, POPL ’03, pages 62–73, New York, NY, USA, 2003. ACM.
[15] A. Bouajjani, J. Esparza, S. Schwoon, and J. Strejcek. Reachability anal-
ysis of multithreaded software with asynchronous communication. In 25th
Intern. Conf. on Foundations of Software Technology and Theoretical Com-
puter Science (FSTTCS’05). LNCS 3821, Springer, 2005.
[16] A. Bouajjani, M. Müller-Olm, and T. Touili. Regular symbolic analysis of
dynamic networks of pushdown systems. In Concurrency Theory. 16th Int.
Conf. (CONCUR), pages 473–487. LNCS 3653, Springer, 2005.
[17] A. Bouajjani, M. Müller-Olm, and T. Touili. Regular symbolic analysis
of dynamic networks of pushdown systems. Available on:
http://cs.uni-muenster.de/sev/publications, 2005. Extended version with
proofs.
[18] R. E. Bryant. Graph-based algorithms for boolean function manipulation.
IEEE Trans. Comput., 35:677–691, August 1986.
[19] J. R. Büchi. Regular canonical systems. Arch. Math. Logik Grundlag., 6:
91–111, 1964.
[20] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang.
Symbolic model checking: 10^20 states and beyond. In Proc. of Logic in
Computer Science (LICS), pages 428–439, 1990.
[21] D. R. Butenhof. Programming with POSIX Threads. Addison-Wesley, 1997.
Bibliography
[22] D. Caucal. On the regular structure of prefix rewriting. In A. Arnold,
editor, CAAP ’90, volume 431 of Lecture Notes in Computer Science, pages
87–102. Springer Berlin / Heidelberg, 1990.
[23] D. Caucal. On the regular structure of prefix rewriting. Theoretical Com-
puter Science, 106(1):61–86, 1992.
[24] E. Clarke. The birth of model checking. In O. Grumberg and H. Veith,
editors, 25 Years of Model Checking, volume 5000 of Lecture Notes in Com-
puter Science, pages 1–26. Springer Berlin / Heidelberg, 2008.
[25] E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith. Counterexample-
guided abstraction refinement. In E. Emerson and A. Sistla, editors, Com-
puter Aided Verification, volume 1855 of Lecture Notes in Computer Sci-
ence, pages 154–169. Springer Berlin / Heidelberg, 2000.
[26] E. M. Clarke and E. A. Emerson. Design and synthesis of synchroniza-
tion skeletons using branching-time temporal logic. In Logic of Programs,
Workshop, pages 52–71, London, UK, 1982. Springer-Verlag.
[27] E. M. Clarke, E. A. Emerson, and A. P. Sistla. Automatic verification of
finite-state concurrent systems using temporal logic specifications. ACM
Trans. Program. Lang. Syst., 8:244–263, April 1986.
[28] A. Colmerauer, H. Kanoui, P. Roussel, and R. Pasero. Un systeme de
communication homme-machine en Francais, 1973. Rapport de Recherche
en Intelligence Artificielle, Marseille.
[29] H. Comon, M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez,
S. Tison, and M. Tommasi. Tree automata techniques and applications.
Available on: http://www.grappa.univ-lille3.fr/tata, 2007. release
October, 12th 2007.
[30] S. A. Cook. The complexity of theorem-proving procedures. In Proceedings
of the third annual ACM symposium on Theory of computing, STOC ’71,
pages 151–158, New York, NY, USA, 1971. ACM.
[31] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model
for static analysis of programs by construction or approximation of fix-
points. In Proc. of POPL’77, pages 238–252, Los Angeles, California, 1977.
ACM Press, New York.
[32] P. Cousot and R. Cousot. Systematic design of program analysis frame-
works. In Conference Record of the Sixth Annual ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages, pages 269–282, San
Antonio, Texas, 1979. ACM Press, New York, NY.
[33] E. W. Dijkstra. Cooperating sequential processes. Technical Report EWD-123,
1965.
[34] J. Esparza. Decidability of model checking for infinite-state concurrent
systems. Acta Informatica, 34:85–107, 1997.
[35] J. Esparza and J. Knoop. An automata-theoretic approach to interproce-
dural data-flow analysis. In W. Thomas, editor, Foundations of Software
Science and Computation Structures, volume 1578 of Lecture Notes in Com-
puter Science, pages 642–642. Springer Berlin / Heidelberg, 1999.
[36] J. Esparza and A. Podelski. Efficient algorithms for pre* and post* on inter-
procedural parallel flow graphs. In Proceedings of the 27th ACM SIGPLAN-
SIGACT symposium on Principles of programming languages, POPL ’00,
pages 1–11, New York, NY, USA, 2000. ACM.
[37] J. Esparza, D. Hansel, P. Rossmanith, and S. Schwoon. Efficient algorithms
for model checking pushdown systems. In E. Emerson and A. Sistla, editors,
Computer Aided Verification, volume 1855 of Lecture Notes in Computer
Science, pages 232–247. Springer Berlin / Heidelberg, 2000.
[38] Y. Eytani, K. Havelund, S. D. Stoller, and S. Ur. Towards a framework
and a benchmark for testing tools for multi-threaded programs: Research
articles. Concurr. Comput. : Pract. Exper., 19:267–279, March 2007.
[39] A. Farzan and Z. Kincaid. Compositional bitvector analysis for concur-
rent programs with nested locks. In Proceedings of the 17th international
conference on Static analysis, SAS’10, pages 253–270, Berlin, Heidelberg,
2010. Springer-Verlag.
[40] A. Farzan, P. Madhusudan, and F. Sorrentino. Meta-analysis for atomicity
violations under nested locking. In Computer Aided Verification, pages
248–262. Springer-Verlag, 2009.
[41] A. Finkel, B. Willems, and P. Wolper. A direct symbolic approach to model
checking pushdown systems (extended abstract). Electronic Notes in Theo-
retical Computer Science, 9:27 – 37, 1997. Infinity’97, Second International
Workshop on Verification of Infinite State Systems.
[42] R. W. Floyd. Assigning meanings to programs. Mathematical aspects of
computer science, 19(19-32):1, 1967.
[43] H. Gallaire, J. Minker, and J.-M. Nicolas. An overview and introduction




[44] T. M. Gawlitza, P. Lammich, M. Müller-Olm, H. Seidl, and A. Wenner.
Join-lock-sensitive forward reachability analysis of concurrent programs
with dynamic process creation. To appear in Proc. of VMCAI 2011.
Springer, 2011.
[45] D. Gelperin and B. Hetzel. The growth of software testing. Commun. ACM,
31:687–695, June 1988.
[46] P. Godefroid. Using partial orders to improve automatic verification meth-
ods. In E. Clarke and R. Kurshan, editors, Computer-Aided Verification,
volume 531 of Lecture Notes in Computer Science, pages 176–185. Springer
Berlin / Heidelberg, 1991.
[47] J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java(TM) Language Specification.
Addison Wesley, 3rd edition, 2005. ISBN 978-0321246783.
[48] J. Grabowski. On partial languages. Annales Societatis Mathematicae
Polonae, Fundamenta Informaticae, IV(2):428–498, 1981.
[49] W. Gropp, E. Lusk, and A. Skjellum. Using MPI — Portable Parallel
Programming with the Message Passing Interface. MIT Press, 2nd edition,
1999.
[50] C. A. R. Hoare. An axiomatic basis for computer programming. Commun.
ACM, 12:576–580, October 1969.
[51] C. A. R. Hoare. Monitors: an operating system structuring concept. Com-
mun. ACM, 17:549–557, October 1974.
[52] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata
Theory, Languages, and Computation. Addison-Wesley, 3rd edition, 2006.
[53] C. B. Jones. Development Methods for Computer Programs including a
Notion of Interference. PhD thesis, Oxford University, June 1981. Printed
as: Programming Research Group, Technical Monograph 25.
[54] V. Kahlon. Boundedness vs. unboundedness of lock chains: Characterizing
decidability of cfl-reachability for threads communicating via locks. In Proc.
of the 24th Annual IEEE Symposium on Logic in Computer Science (LICS),
Los Angeles, USA, August 2009.
[55] V. Kahlon and A. Gupta. An automata-theoretic approach for model check-




[56] V. Kahlon and A. Gupta. On the analysis of interacting pushdown systems.
In Proceedings of POPL ’07, pages 303–314, New York, NY, USA, 2007.
ACM.
[57] V. Kahlon, F. Ivancic, and A. Gupta. Reasoning about threads communi-
cating via locks. In Proc. of CAV 2005, volume 3576 of LNCS. Springer,
2005.
[58] R. M. Karp. Reducibility among combinatorial problems. In Complexity of
Computer Computations, pages 85–103, 1972.
[59] N. Kidd. Static verification of data-consistency properties. PhD thesis,
Computer Sciences Department, University of Wisconsin, Madison, WI,
August 2009. Tech. Rep. TR-1665.
[60] N. Kidd, A. Lal, and T. Reps. Language strength reduction. In M. Alpuente
and G. Vidal, editors, Static Analysis, volume 5079 of Lecture Notes in
Computer Science, pages 283–298. Springer Berlin / Heidelberg, 2008.
[61] N. A. Kidd, P. Lammich, T. Touili, and T. W. Reps. A decision procedure
for detecting atomicity violations for communicating processes with
locks. In Proc. of SPIN Workshop on Model Checking of Software (SPIN).
Springer, June 2009.
[62] N. A. Kidd, T. W. Reps, J. Dolby, and M. Vaziri. Finding concurrency-
related bugs using random isolation. In Proc. of Verification, Model Check-
ing, and Abstract Interpretation (VMCAI). Springer, January 2009.
[63] N. A. Kidd, P. Lammich, T. Touili, and T. W. Reps. A decision procedure
for detecting atomicity violations for communicating processes with locks.
Proc. Int. Journal on Software Tools for Technology Transfer (STTT), 13
(1):37–60, 2010.
[64] T. Kleymann. Hoare Logic and VDM: Machine-Checked Soundness and
Completeness Proofs. PhD thesis, University of Edinburgh, 1998.
[65] J. Knoop, B. Steffen, and J. Vollmer. Parallelism for free: Efficient and
optimal bitvector analyses for parallel programs. TOPLAS, 18(3):268–299,
May 1996.
[66] D. Kozen. Lower bounds for natural proof systems. In Proceedings of the
18th Annual Symposium on Foundations of Computer Science, pages 254–
266, Washington, DC, USA, 1977. IEEE Computer Society.
[67] A. Lal, T. Reps, and G. Balakrishnan. Extended weighted pushdown systems.
In CAV, pages 434–448, 2005.
[68] A. Lal, T. Touili, N. Kidd, and T. Reps. Interprocedural analysis of concur-
rent programs under a context bound. In C. Ramakrishnan and J. Rehof,
editors, Tools and Algorithms for the Construction and Analysis of Sys-
tems, volume 4963 of Lecture Notes in Computer Science, pages 282–298.
Springer Berlin / Heidelberg, 2008.
[69] M. S. Lam, J. Whaley, V. B. Livshits, M. C. Martin, D. Avots, M. Carbin,
and C. Unkel. Context-sensitive program analysis as database queries. In
Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART sympo-
sium on Principles of database systems, PODS ’05, pages 1–12, New York,
NY, USA, 2005. ACM.
[70] P. Lammich. Fixpunkt-basierte optimale Analyse von Programmen mit
Thread-Erzeugung. Master’s thesis, University of Dortmund, May 2006.
[71] P. Lammich. Tree automata. In G. Klein, T. Nipkow, and L. Paulson, edi-
tors, The Archive of Formal Proofs. Nov 2009. Formal proof development.
[72] P. Lammich. Isabelle formalization of hedge-
constrained pre* and DPNs with locks. Available from
http://cs.uni-muenster.de/sev/publications/, 2009. Technical
Report.
[73] P. Lammich. Collections framework. In G. Klein, T. Nipkow, and L. Paul-
son, editors, The Archive of Formal Proofs. Nov 2009. Formal proof devel-
opment.
[74] P. Lammich. Tree automata for analyzing dynamic pushdown networks.
In Tagungsband des 15. Kolloquium Programmiersprachen und Grundlagen
der Programmierung (KPS 2009), Schriftenreihe des Instituts für Comput-
ersprachen. Technische Universität Wien, 2009.
[75] P. Lammich and A. Lochbihler. The Isabelle collections framework. In
M. Kaufmann and L. Paulson, editors, Interactive Theorem Proving, vol-
ume 6172 of Lecture Notes in Computer Science, pages 339–354. Springer
Berlin / Heidelberg, 2010.
[76] P. Lammich and M. Müller-Olm. Precise fixpoint-based analysis of pro-
grams with thread-creation. In Proc. of CONCUR 2007, pages 287–302.
Springer, 2007.
[77] P. Lammich and M. Müller-Olm. Formalization of conflict analysis of pro-
grams with procedures, thread creation, and monitors. In G. Klein, T. Nip-




[78] P. Lammich and M. Müller-Olm. Conflict analysis of programs with proce-
dures, dynamic thread creation, and monitors. In Proc. of SAS’08, volume
5079 of LNCS. Springer, 2008. ISBN 978-3-540-69163-1.
[79] P. Lammich, M. Müller-Olm, and A. Wenner. Predecessor sets of dynamic
pushdown networks with tree-regular constraints. In CAV, pages 525–539,
2009.
[80] R. J. Lipton. Reduction: a method of proving properties of parallel pro-
grams. Commun. ACM, 18(12):717–721, 1975. ISSN 0001-0782.
[81] D. Lugiez. Forward analysis of dynamic network of pushdown systems is
easier without order. In O. Bournez and I. Potapov, editors, Reachability
Problems, volume 5797 of Lecture Notes in Computer Science, pages 127–
140. Springer Berlin / Heidelberg, 2009.
[82] D. Lugiez and P. Schnoebelen. The regular viewpoint on pa-processes.
Theor. Comput. Sci., 274:89–115, March 2002.
[83] D. Lugiez and P. Schnoebelen. The regular viewpoint on pa-processes. In
Proceedings of the 9th International Conference on Concurrency Theory,
CONCUR ’98, pages 50–66, London, UK, 1998. Springer-Verlag.
[84] R. Mayr. Process rewrite systems. Information and Computation, 156:
2000, 1997.
[85] R. Mayr. Model checking pa-processes. In A. Mazurkiewicz and
J. Winkowski, editors, CONCUR ’97: Concurrency Theory, volume 1243
of Lecture Notes in Computer Science, pages 332–346. Springer Berlin /
Heidelberg, 1997.
[86] R. Mayr. Decidability and Complexity of Model Checking Problems for
Infinite-State Systems. PhD thesis, TU München, April 1998.
[87] A. Mazurkiewicz. Concurrent program schemes and their interpretations.
In DAIMI Report PB-78. Department of Computer Science, Aarhus Uni-
versity, Aarhus, Denmark, 1977.
[88] M. Mohr. Ein Datalog-Dialekt mit BDD-Semantik zur symbolischen Anal-
yse paralleler Programme. Master’s thesis, WWU Münster, 2011. submit-
ted.
[89] M. Müller-Olm. The complexity of copy constant detection in parallel
programs. In Proceedings of the 18th Annual Symposium on Theoretical




[90] M. Müller-Olm. Variations on Constants. Habilitation thesis, Fachbereich
Informatik, Universität Dortmund, August 2002.
[91] M. Müller-Olm. Precise interprocedural dependence analysis of parallel
programs. Theoretical Computer Science (TCS), 31:325–388, 2004.
[92] M. Müller-Olm and H. Seidl. On optimal slicing of parallel programs.
In Proceedings of the thirty-third annual ACM symposium on Theory of
computing, STOC ’01, pages 647–656, New York, NY, USA, 2001. ACM.
[93] F. Nielson, H. Seidl, and H. R. Nielson. A succinct solver for alfp. Nordic
J. of Computing, 9:335–372, December 2002.
[94] T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL — A Proof As-
sistant for Higher-Order Logic, volume 2283 of LNCS. Springer, 2002.
[95] S. Owicki and D. Gries. Verifying properties of parallel programs: an
axiomatic approach. Commun. ACM, 19:279–285, May 1976.
[96] C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
ISBN 0-201-53082-1.
[97] D. Peled. Combining partial order reductions with on-the-fly model-
checking. In D. Dill, editor, Computer Aided Verification, volume 818
of Lecture Notes in Computer Science, pages 377–390. Springer Berlin /
Heidelberg, 1994.
[98] C. A. Petri. Kommunikation mit Automaten. Bonn: Institut für Instru-
mentelle Mathematik, Schriften des IIM Nr. 2, 1962.
[99] C. A. Petri. Kommunikation mit Automaten. New York: Griffiss Air Force
Base, Technical Report RADC-TR-65–377, 1:1–Suppl. 1, 1966. English
translation.
[100] V. Pratt. Modelling concurrency with partial orders. International Journal
of Parallel Programming, 15:33–71, 1986.
[101] S. Qadeer and J. Rehof. Context-bounded model checking of concurrent
software. In TACAS, pages 93–107. Springer, 2005.
[102] S. Qadeer and D. Wu. Kiss: keep it simple and sequential. In Proceedings
of the ACM SIGPLAN 2004 conference on Programming language design




[103] J. Queille and J. Sifakis. Specification and verification of concurrent systems
in cesar. In M. Dezani-Ciancaglini and U. Montanari, editors, International
Symposium on Programming, volume 137 of Lecture Notes in Computer
Science, pages 337–351. Springer Berlin / Heidelberg, 1982.
[104] G. Ramalingam. Context-sensitive synchronization-sensitive analysis is un-
decidable. TOPLAS, 22(2):416–430, 2000.
[105] T. Reps, S. Schwoon, and S. Jha. Weighted pushdown systems and their
application to interprocedural dataflow analysis. In R. Cousot, editor, Static
Analysis, volume 2694 of Lecture Notes in Computer Science, pages 1075–
1075. Springer Berlin / Heidelberg, 2003.
[106] T. Reps, S. Schwoon, S. Jha, and D. Melski. Weighted pushdown systems
and their application to interprocedural dataflow analysis. Sci. Comput.
Program., 58(1-2):206–263, 2005.
[107] H. G. Rice. Classes of recursively enumerable sets and their decision prob-
lems. Transactions of the American Mathematical Society, 74(2):pp. 358–
366, 1953.
[108] W. J. Savitch. Relationships between nondeterministic and deterministic
tape complexities. J. Comput. Syst. Sci., 4:177–192, April 1970.
[109] Ph. Schnoebelen. Decomposable regular languages and the shuffle operator.
EATCS Bulletin, 67:283–289, Feb. 1999. URL
http://www.lsv.ens-cachan.fr/Publis/PAPERS/PS/Sch-BEATCS99.ps.
[110] H. Seidl and B. Steffen. Constraint-based inter-procedural analysis of par-
allel programs. Nordic Journal of Computing (NJC), 7(4):375–400, 2000.
[111] M. Strauch. Realisierung und Anwendung eines automatenbasierten
Ansatzes zur Analyse von Programmen mit Threads. Master’s thesis, Fach-
bereich Informatik der Universität Dortmund, 2007.
[112] B. Stroustrup. The C++ Programming Language (Special Edition). Addi-
son Wesley, Reading Mass. USA, 2000.
[113] R. N. Taylor. Complexity of analyzing the synchronization structure of
concurrent programs. Acta Informatica, 19:57–84, 1983.
[114] A. Valmari. A stubborn attack on state explosion. In Proceedings of the 2nd
International Workshop on Computer Aided Verification, CAV ’90, pages
156–165, London, UK, 1991. Springer-Verlag.
[115] M. Y. Vardi and P. Wolper. Automata theoretic techniques for modal logics
of programs: (extended abstract). In Proceedings of the sixteenth annual
ACM symposium on Theory of computing, STOC ’84, pages 446–456, New
York, NY, USA, 1984. ACM. ISBN 0-89791-133-4.
[116] M. Y. Vardi and P. Wolper. Automata-theoretic techniques for modal logics
of programs. J. Comput. Syst. Sci., 32:183–221, April 1986. ISSN 0022-
0000.
[117] M. Vaziri, F. Tip, and J. Dolby. Associating synchronization constraints
with data in an object-oriented language. In Conference record of the 33rd
ACM SIGPLAN-SIGACT symposium on Principles of programming lan-
guages, POPL ’06, pages 334–345, New York, NY, USA, 2006. ACM.
[118] D. von Oheimb. Hoare logic for mutual recursion and local variables. In
V. R. C. Pandu Rangan and R. Ramanujam, editors, Foundations of Soft-
ware Technology and Theoretical Computer Science, volume 1738 of LNCS,
pages 168–180. Springer, 1999.
[119] I. Wegener. Komplexitätstheorie: Grenzen der Effizienz von Algorithmen.
Springer, 2003. ISBN 978-3540001614.
[120] A. Wenner. Optimale Analyse gewichteter dynamischer Push-Down Net-
zwerke. Master’s thesis, WWU Münster, August 2008.
[121] A. Wenner. Weighted dynamic pushdown networks. In A. Gordon, editor,
Programming Languages and Systems, volume 6012 of LNCS, pages 590–
609. Springer Berlin / Heidelberg, 2010.
[122] J. Whaley and M. S. Lam. Cloning-based context-sensitive pointer alias
analysis using binary decision diagrams. In Proceedings of the ACM SIG-
PLAN 2004 conference on Programming language design and implementa-
tion, PLDI ’04, pages 131–144, New York, NY, USA, 2004. ACM.
[123] S. Zilio and D. Lugiez. Xml schema, tree logic and sheaves automata. Appli-







Born on 8 December 1980 in Freiburg im Breisgau.
Marital status: single.
Father's name: Siegfried Lammich.
Mother's name: Maria Lammich, née Perschke.
Schooling: Primary school (Grundschule) from 1987 to 1991 in Mülheim a. d. Ruhr.
Secondary school (Gymnasium) from 1991 to 2000 in Mülheim a. d. Ruhr.
University entrance qualification (Abitur): 14 June 2000 in Mülheim a. d. Ruhr.
Grade average: 1.7.
Civilian service (Zivildienst) from August 2000 to June 2001
in Mülheim a. d. Ruhr.
University studies: Computer science from October 2001 to May 2006
at the Universität Dortmund.
Doctoral program: Since summer semester 2007 at the Universität Münster.
Examinations: Diplom in computer science
on 18 May 2006 at the Universität Dortmund.
Overall grade: with distinction.
Employment: Student assistant at the Universität Dortmund
from April 2002 to March 2006.
Research assistant at the WWU Münster
since June 2006.
Start of dissertation: April 2007
at the Institut für Informatik (Univ. Münster)
under Prof. Dr. Markus Müller-Olm.
