Component refinement and CSC-solving for STG decomposition  by Schaefer, Mark & Vogler, Walter
Theoretical Computer Science 388 (2007) 243–266
Mark Schaefer∗, Walter Vogler
University of Augsburg, Universitätsstr. 14, 86135 Augsburg, Germany
Received 13 March 2006; received in revised form 5 August 2007; accepted 15 August 2007
Communicated by R. Gorrieri
Abstract
STGs (Signal Transition Graphs) give a formalism for the description of asynchronous circuits based on Petri nets. To overcome
the state explosion problem one may encounter during circuit synthesis, a nondeterministic algorithm for decomposing STGs was
suggested by Chu and improved by one of the present authors.
Here we study how CSC-solving – which is essential for circuit synthesis – can be combined with decomposition. For
this purpose, the correctness definition for decomposition is enhanced with internal signals and hierarchical decomposition is
proven correct. Based on this, it is shown that speed-independent CSC-solving preserves correctness and can be combined with
decomposition.
Furthermore, we use our new correctness definition to give the first correctness proof for the decomposition method of Carmona
and Cortadella. Finally, we compare three different implementation relations for STGs: one derived from our correctness definition;
one defined by Dill based on trace structures; and one derived from I/O-compatibility defined by Carmona and Cortadella.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Asynchronous circuits; STG; Petri nets; Decomposition; Implementation relation
1. Introduction
Signal Transition Graphs (STG) are a formalism for the description of asynchronous circuit behaviour. An STG
is a labelled Petri net where the labels denote signal changes between logical high and logical low. The synthesis
of circuits from STGs is supported by several tools, e.g. PETRIFY [5] and MPSAT [11], and it often involves the
generation of the reachability graph, which may have a size exponential in the size of the STG (state explosion).
To cope with this problem, Chu suggested a nondeterministic method for decomposing an STG (without internal
signals) into several smaller ones [4], see also [10]. The idea is that all components together can be synthesised
faster than the original STG while the corresponding circuits perform together in the same way as the circuit directly
synthesised from the specification. While there are strong restrictions on the structure and labelling of STGs in [4], the
improved decomposition algorithm of Vogler, Wollowski and Kangsah [16,15] works under – comparatively moderate
– restrictions on the labelling only.
∗ Corresponding address: University of Augsburg, Institute of Computer Science, Universität Augsburg, Universitätsstr. 14, 86135 Augsburg,
Germany. Tel.: +49 821 598 3109; fax: +49 821 598 2175.
E-mail addresses: mark.schaefer@informatik.uni-augsburg.de (M. Schaefer), vogler@informatik.uni-augsburg.de (W. Vogler).
0304-3975/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2007.08.005
Roughly, this decomposition algorithm works as follows. Initially, a partition of the output signals has to be chosen,
and for each set in this partition a component producing the respective output signals will be constructed as follows.
For each component, our algorithm finds a set of signals that (at least initially) can be regarded as irrelevant for the
output signals under consideration; then, it takes a copy of the original STG and turns each transition corresponding to
an irrelevant signal into a dummy (λ-labelled) transition; finally, it tries to remove all dummy transitions by so-called
secure transition contractions and deletions of (structurally) redundant places and redundant transitions.
In general, our algorithm might find during the generation of a component that additional signals are relevant;
then, it has to start anew from a suitably modified copy of the original STG. The algorithm terminates when no more
relevant signals are found or all signals are identified as relevant. This eventually gives a correct decomposition as
proven in [16,15].
Complete state coding (CSC) is an important property of STGs and must be achieved before an asynchronous
circuit can be synthesised; e.g. PETRIFY can solve CSC, i.e. modify an STG on the basis of its reachability graph by
introducing new internal signals such that CSC holds while the behaviour is preserved in some sense. While some
decomposition methods [3,18] have to assume that the original STG satisfies CSC, our decomposition algorithm is
more general since it does not presuppose this; on the other hand, the methods in [3,18] construct components with
CSC, while our components might not have CSC. For each such component one can solve CSC and synthesise a
separate circuit e.g. by using PETRIFY; compared to solving CSC for the original STG (with its potentially huge
reachability graph) and synthesising one circuit, this can be much faster, see experimental results in [16,15].
One would expect that the components generated by our decomposition algorithm are still correct when they have
been modified to achieve CSC. It would in fact also be interesting to know in what sense CSC-solving with PETRIFY
is correct, independently of the issue of decomposition; it seems that no correctness result for this has been proven so far.
For such correctness results, one needs a correctness definition that takes internal signals into account. The purpose of
this paper is to enhance the correctness definition of [16,15] appropriately, to study its properties and give applications
in the area of decomposition and CSC-solving.
As the main property of the new correctness notion, we show that it is preserved when decomposition is performed
hierarchically. This correctness of top-down decomposition is of interest in itself, but it also implies that the
implementation relation arising from our correctness notion is a preorder, and it can in particular be used to improve
the efficiency of our decomposition algorithm; a discussion of this and other methods can be found in [14]. Then
we prove that CSC-solving for speed-independent circuits as performed by PETRIFY is correct in our sense. With
our result on the correctness of top-down decomposition, we then conclude that speed-independent CSC-solving
can indeed be combined with the decomposition algorithm of [16,15]. As another contribution, we prove that the
decomposition method in [3] is correct in the sense of our enhanced correctness definition; in [3] itself, no correctness
proof is given. Finally, we compare our implementation relation with existing concepts.
The paper is organised as follows. In the next section, Petri Nets, STGs and their basic notions are introduced.
In Section 3 the correctness definition is enhanced with internal signals. In Section 4, we prove that top-down
decomposition is correct in terms of our enhanced correctness definition; the succeeding section studies correctness of
speed-independent CSC-solving on its own and in combination with decomposition. Section 6 shows the correctness
for the approach of [3], which is followed by the comparison of our implementation relation with those of Dill and of
Carmona and Cortadella (Section 7). We conclude with Section 8. In Appendix A we demonstrate CSC-solving by
means of a practical example: the VME bus controller; Appendix B contains the proofs which are too long for the body.
2. Basic definitions
This section provides basic notions for Petri nets and STGs; for a more detailed explanation see e.g. [6]. A Petri net
is a 4-tuple N = (P, T, W, MN ) where P is a finite set of places and T a finite set of transitions with P ∩ T = ∅.
W : P × T ∪ T × P → N0 is the weight function and MN the initial marking, where a marking is a function
P → N0 which assigns a number of tokens to each place. A node is a place or a transition and a Petri net can be
considered as a bipartite graph with weighted and directed edges between its nodes. Whenever a Petri net N, N ′ , N1,
etc. is introduced, the corresponding tuples (P, T, W, MN ), (P ′, T ′, W ′, MN ′ ), (P1, T1, W1, MN1 ) etc. are introduced
implicitly and the same applies to STGs later on.
The preset of a node x is denoted as •x and defined by •x = {y ∈ P ∪ T | W (y, x) > 0}, the postset of a node x
is denoted as x• and defined by x• = {y ∈ P ∪ T | W (x, y) > 0}. We write •x• as shorthand for •x ∪ x•. All these
notions are extended to sets as usual. We say that there is an arc from each y ∈ •x to x .
Fig. 1. (a) VME bus controller. (b) Corresponding reachability graph with state vectors; the order of signals is: dsr, ldtack, dtack, lds, d. There is a CSC
conflict for states M′ and M′′.
A transition t is enabled under a marking M if ∀p ∈ •t : M(p) ≥ W (p, t), which is denoted by M[t〉.
An enabled transition can fire or occur yielding a new marking M ′, written as M[t〉M ′ , if M[t〉 and M ′(p) =
M(p) − W (p, t) + W (t, p) for all p ∈ P . A transition sequence v = t1t2 . . . tn is enabled under a marking M
(yielding M ′) if M = M0[t1〉M1[t2〉M2 . . . Mn−1[tn〉Mn = M ′, and we write M[v〉, M[v〉M ′ resp.; v is called firing
sequence if MN [v〉. The empty transition sequence λ is enabled under every marking.
M ′ is called reachable from M if a transition sequence v with M[v〉M ′ exists. The set of all markings reachable
from M is denoted by [M〉. [MN 〉 is the set of reachable markings (of N), and we only deal with N where this set is
finite (i.e. N is bounded).
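The firing rule and the set of reachable markings can be illustrated with a short Python sketch. The names `enabled`, `fire` and `reachable`, the dictionary encoding of W, and the two-transition example net are illustrative assumptions and are not taken from the paper; the `limit` parameter is only a guard, since the exploration terminates on its own only for bounded nets.

```python
from collections import deque

def enabled(marking, pre):
    """M[t>: every input place p of t holds at least W(p, t) tokens."""
    return all(marking.get(p, 0) >= w for p, w in pre.items())

def fire(marking, pre, post):
    """Return M' with M'(p) = M(p) - W(p, t) + W(t, p) for all places p."""
    new = dict(marking)
    for p, w in pre.items():
        new[p] = new.get(p, 0) - w
    for p, w in post.items():
        new[p] = new.get(p, 0) + w
    # Drop empty places so that equal markings have equal representations.
    return {p: n for p, n in new.items() if n > 0}

def reachable(initial, transitions, limit=10_000):
    """Breadth-first exploration of [M_N>; `limit` guards against unbounded nets."""
    start = frozenset(initial.items())
    seen, queue = {start}, deque([start])
    while queue:
        m = dict(queue.popleft())
        for pre, post in transitions.values():
            if enabled(m, pre):
                nxt = frozenset(fire(m, pre, post).items())
                if nxt not in seen:
                    if len(seen) >= limit:
                        raise RuntimeError("marking limit exceeded; net may be unbounded")
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

# Hypothetical net: a cycle p1 -t1-> p2 -t2-> p1 with one token on p1.
transitions = {
    "t1": ({"p1": 1}, {"p2": 1}),   # (preset weights, postset weights)
    "t2": ({"p2": 1}, {"p1": 1}),
}
markings = reachable({"p1": 1}, transitions)
print(len(markings))  # two reachable markings: {p1} and {p2}
```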
An STG is a tuple N = (P, T, W, MN , In, Out, Int, l) where (P, T, W, MN ) is a Petri net and In, Out and Int
are disjoint sets of input, output and internal signals. We define the set of all signals Sig := In ∪ Out ∪ Int, the
set of locally controlled or just local signals Loc := Out ∪ Int and the set of external signals Ext := In ∪ Out.
l : T → Sig±, with Sig± := Sig × {+, −}, is the labelling function. In this paper we do not have to consider λ-labelled
dummy transitions, which play an important role in the decomposition algorithm of [16,15].
Sig± is the set of signal edges; a+ denotes that the value of signal a changes from logical low (written as 0) to
logical high (written as 1), and a− denotes a change in the other direction. We write a± if the direction is not important
or unknown; if such a term appears more than once in the same context, it always denotes the same direction. For
convenience, we often abbreviate input signal edge with input edge or just input when the meaning is clear from the
context (analogous for output and internal signals).
Some of the results of this paper do not depend on the fact that transition labels are of the form a+ or a−, i.e. they
can be applied in any setting where actions can be regarded as inputs, outputs or internal.
For an example of an STG, see the VME bus controller (or just VME) in Fig. 1(a), which will be used as an
example whenever possible. STGs are in principle drawn as ordinary Petri nets, i.e. places as circles containing tokens
denoting their marking and transitions as boxes containing their label. Different from this, it is often convenient to use
the marked graph style for STGs, i.e. unmarked places p with • p = {t} and p• = {t ′} are omitted and t and t ′ are
connected directly; furthermore, transitions labelled with output or internal signals are drawn with a thick border and
internal transitions are additionally shaded. So, VME has the output signals lds, dtack and d and the input signals dsr
and ldtack.
An STG can be taken as a formalism for asynchronous circuits which have no clock signal. Such a circuit
has input signals, which are under the control of its environment, and local signals, whose values are changed
by the circuit. The STG describes which output and internal signals should be performed; at the same time, it
describes assumptions about the environment, which should perform input signals only if this is specified by the
STG. We frequently identify STGs and the circuits they describe. Input signals which arrive when they are not
expected lead to malfunction, i.e. the circuit enters an unanticipated state or non-digital behaviour (called hazard)
occurs.
There exist several models for asynchronous circuits; for our decomposition method and in this paper we use the
speed-independent model (SI model) [12] with the following properties:
• In general, input and output signals can occur in an arbitrary order, in contrast to other models where they have to
alternate.
• Signals are considered to have no delay, i.e. a signal edge is received immediately by all listeners.
• The gates1 which generate the signals can work with arbitrary speed, thus the generation of signal edges can be
delayed.
The last item will become important in this paper when discussing input-properness, speed-independent CSC-solving
and I/O-compatibility.
We lift the notion of enabledness to transition labels and write M[a±〉〉M ′ if M[t〉M ′ and l(t) = a±. This is
extended to sequences as usual. A sequence v ∈ (Sig±)∗ is called a trace of a marking M if M[v〉〉, and a trace of N
if MN [v〉〉. The language of N is the set of all traces of N and denoted by L(N).
The reachability graph RGN of an STG N is an edge-labelled directed graph on the reachable markings with MN as
root; there is an edge from M to M ′ labelled s± ∈ Sig± whenever M[s±〉〉M ′ . RGN can be seen as a finite automaton
(where all states are final), and L(N) is the language of this automaton. N is deterministic if its reachability graph is
a deterministic automaton, i.e. if for each reachable marking M and each signal transition s± there is at most one M ′
with M[s±〉〉M ′. The reachability graph of VME can be found in Fig. 1(b). (The states are annotated with their state
vector, see below.)
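As a rough illustration of these notions, the following Python sketch builds a labelled reachability graph, tests determinism and checks whether a sequence of signal edges is a trace. It assumes a 1-safe net with arc weights 1, so that a marking can be encoded as a frozenset of marked places; this restriction, the function names and the two small example nets are assumptions made for the sketch only.

```python
from collections import deque

def reachability_graph(initial, transitions):
    """transitions: name -> (label, preset, postset), all sets being frozensets.
    Assumption: 1-safe net with arc weights 1, markings are sets of places."""
    edges, seen, queue = [], {initial}, deque([initial])
    while queue:
        m = queue.popleft()
        for label, pre, post in transitions.values():
            if pre <= m:                      # transition enabled under m
                nxt = (m - pre) | post
                edges.append((m, label, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen, edges

def is_deterministic(edges):
    """At most one successor marking per (marking, signal edge) pair."""
    succ = {}
    for m, label, nxt in edges:
        if succ.setdefault((m, label), nxt) != nxt:
            return False
    return True

def is_trace(initial, edges, word):
    """Check M_N[word>> by following labelled edges from the root."""
    current = {initial}
    for label in word:
        current = {n for m, l, n in edges if m in current and l == label}
        if not current:
            return False
    return True

fs = frozenset
# Hypothetical deterministic STG: p0 -a+-> p1 -x+-> p2.
det = {"t1": ("a+", fs({"p0"}), fs({"p1"})),
       "t2": ("x+", fs({"p1"}), fs({"p2"}))}
# Hypothetical nondeterministic STG: two a+ transitions with different successors.
nondet = {"t1": ("a+", fs({"p0"}), fs({"p1"})),
          "t2": ("a+", fs({"p0"}), fs({"p2"}))}

states, det_edges = reachability_graph(fs({"p0"}), det)
_, nondet_edges = reachability_graph(fs({"p0"}), nondet)
```

With these definitions, `is_trace(fs({"p0"}), det_edges, ["a+", "x+"])` holds, while `["x+"]` alone is not a trace.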
The identity of the transitions or places of an STG, as well as the names of the internal signals are not relevant for
us; hence, we regard STGs N and N ′ as equal if they are externally isomorphic, i.e. if they have the same input and
output signals, and we can rename the internal signals of N and then map the transitions (places resp.) of the resulting
STG bijectively onto the transitions (places resp.) of N ′ such that the weight function, the marking and the labelling
are preserved. Altogether, the external signals are preserved while the internal signals might be renamed.
For the modular construction of STGs, the operations hiding, relabelling and parallel composition are of interest.
Given an STG N and a set H of signals with H ∩ In = ∅, the hiding of H results in the STG:
N/H = (P, T, W, MN , In, Out \ H, Int ∪ H, l).
Given a bijection φ defined at least for the external signals of N , the relabelling of N is
φ(N) = (P, T, W, M0, φ(In), φ(Out), Int, φ ◦ l).
This assumes that, if necessary, the internal signals of N are renamed such that Int ∩ (φ(In) ∪ φ(Out)) = ∅ and φ is
extended to be the identity on the internal signals.
Observe that hiding and relabelling preserve determinism as defined above and the same will apply for parallel
composition. In particular, hiding neither changes the identity of signals nor removes them completely from the STG,
as is done in other settings.
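Since hiding and relabelling only touch the signal sets and the labelling, their effect on (In, Out, Int) can be sketched in a few lines of Python. The class `Signals`, the functions `hide` and `relabel`, and the renaming used in the example are hypothetical names for this sketch; the net structure (P, T, W, MN) and the labelling function are deliberately left out.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Signals:
    """Signal partition (In, Out, Int) of an STG; the net itself is omitted."""
    inputs: frozenset
    outputs: frozenset
    internal: frozenset = frozenset()

def hide(sig, hidden):
    """N/H: signals in H become internal; H must be disjoint from In."""
    hidden = frozenset(hidden)
    if hidden & sig.inputs:
        raise ValueError("H must not contain input signals")
    return replace(sig, outputs=sig.outputs - hidden,
                   internal=sig.internal | hidden)

def relabel(sig, phi):
    """phi(N) on the signal sets; phi must be injective on In and Out."""
    external = sig.inputs | sig.outputs
    renamed = {phi[s] for s in external}
    if len(renamed) != len(external):
        raise ValueError("phi must be a bijection on the external signals")
    if renamed & sig.internal:
        raise ValueError("rename the internal signals first")
    return replace(sig, inputs=frozenset(phi[s] for s in sig.inputs),
                   outputs=frozenset(phi[s] for s in sig.outputs))

# Signal sets of VME as stated in the text.
vme = Signals(frozenset({"dsr", "ldtack"}), frozenset({"lds", "dtack", "d"}))
hidden_d = hide(vme, {"d"})

# Hypothetical renaming of two external signals.
phi = {"dsr": "req", "ldtack": "ldtack", "lds": "lds", "dtack": "ack", "d": "d"}
renamed = relabel(vme, phi)
```

On this representation, hiding in two steps gives the same result as hiding the union at once, e.g. `hide(hide(vme, {"d"}), {"lds"}) == hide(vme, {"d", "lds"})`.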
In the following definition of parallel composition ‖, we will have to consider the distinction between input, output
and internal signals. The idea of parallel composition is that the composed systems run in parallel synchronising on
common signals. Since a system controls its outputs, we cannot allow a signal to be an output of more than one
component; input signals, on the other hand, can be shared. An output signal of one component can be an input of
one or several others, and in any case it is an output of the composition. Internal signals of one component are not
shared with other components (this can be achieved with a suitable renaming) and they become internal signals of the
composition. A composition can also be ill-defined due to what e.g. Ebergen [8] calls computational interference; this
is a semantic problem, and we will not consider it here, but later in the definition of correctness.
The parallel composition of STGs N1 and N2 is defined if Loc1 ∩ Loc2 = ∅ and Int1 ∩ In2 = Int2 ∩ In1 = ∅.
Then, let A = Sig1 ∩ Sig2 be the set of common signals; observe that A contains no internal signals. If e.g. s is an
output of N1 and an input of N2, then an occurrence of s in N1 is ‘seen’ by N2, i.e. it must be accompanied by an
occurrence of s in N2. Since we do not know a priori which s±-labelled transition of N2 will occur together with
some s±-labelled transition of N1, we have to allow for each possible pairing. Therefore, the parallel composition
N = N1||N2 is obtained from the disjoint union of N1 and N2 by combining each s±-labelled transition t1 of N1 with
each s±-labelled transition t2 from N2 if s ∈ A.
Fig. 2. Example of a parallel composition. (a) Part of STG N1 with a single output edge a+ in t1. (b) Part of STG N2 with two input edges a+ in
t2 and t3. (c) Part of N1||N2 with two output edges a+ in the combined transitions (t1, t2) and (t1, t3).
1 Gates can be considered as the basic building blocks of digital circuits; they calculate logical functions, e.g. the ordinary AND.
In the formal definition of parallel composition, ∗ is used as a dummy element, which is formally combined e.g.
with those transitions that do not have their label in the synchronisation set A. (We assume that ∗ is not a transition or
a place of any net.) Thus, N is defined by (see also Fig. 2)
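\[
\begin{aligned}
P &= P_1 \times \{*\} \,\cup\, \{*\} \times P_2 \\
T &= \{(t_1, t_2) \mid t_1 \in T_1,\ t_2 \in T_2,\ l_1(t_1) = l_2(t_2) \in A^{\pm}\} \\
  &\quad \cup\, \{(t_1, *) \mid t_1 \in T_1,\ l_1(t_1) \notin A^{\pm}\}
   \,\cup\, \{(*, t_2) \mid t_2 \in T_2,\ l_2(t_2) \notin A^{\pm}\} \\
W((p_1, p_2), (t_1, t_2)) &=
  \begin{cases}
    W_1(p_1, t_1) & \text{if } p_1 \in P_1,\ t_1 \in T_1 \\
    W_2(p_2, t_2) & \text{if } p_2 \in P_2,\ t_2 \in T_2
  \end{cases} \\
W((t_1, t_2), (p_1, p_2)) &=
  \begin{cases}
    W_1(t_1, p_1) & \text{if } p_1 \in P_1,\ t_1 \in T_1 \\
    W_2(t_2, p_2) & \text{if } p_2 \in P_2,\ t_2 \in T_2
  \end{cases} \\
l((t_1, t_2)) &=
  \begin{cases}
    l_1(t_1) & \text{if } t_1 \in T_1 \\
    l_2(t_2) & \text{if } t_2 \in T_2
  \end{cases} \\
M_N &= M_{N_1} \mathbin{\dot\cup} M_{N_2}, \text{ where } \dot\cup \text{ is the disjoint union, i.e.} \\
M_N((p_1, p_2)) &=
  \begin{cases}
    M_{N_1}(p_1) & \text{if } p_1 \in P_1 \\
    M_{N_2}(p_2) & \text{if } p_2 \in P_2
  \end{cases} \\
Int &= Int_1 \cup Int_2, \qquad Out = Out_1 \cup Out_2, \qquad In = (In_1 \cup In_2) \setminus Out.
\end{aligned}
\]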
It is not hard to see that parallel composition is associative and commutative up to external isomorphism and
||i∈I Ni is defined if each Ni ||N j is defined. Furthermore, one can consider the place set of the composition as the
disjoint union of the place sets of the components; therefore, we can consider markings of the composition (regarded
as multisets) as the disjoint union of markings of the components as exemplified for MN above; the latter makes clear
what we mean by the restriction M↾Pi for a marking M of the composition.
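The construction of the composed net can be sketched in Python as follows. The dictionary encoding of an STG (keys `In`, `Out`, `Int`, `T`, `M`), the function name `compose` and the concrete arcs of the example are assumptions made for this sketch; only the labels of t1, t2 and t3 follow Fig. 2.

```python
STAR = "*"  # dummy element, assumed not to be a place or transition name

def compose(n1, n2):
    """Parallel composition N1 || N2.
    Each STG is a dict with signal sets 'In', 'Out', 'Int', transitions
    'T' (name -> (label, preset weights, postset weights)) and marking 'M'."""
    loc1, loc2 = n1["Out"] | n1["Int"], n2["Out"] | n2["Int"]
    if loc1 & loc2 or n1["Int"] & n2["In"] or n2["Int"] & n1["In"]:
        raise ValueError("parallel composition is not defined")
    common = (n1["In"] | loc1) & (n2["In"] | loc2)   # synchronisation set A

    def lift1(d):
        return {(p, STAR): w for p, w in d.items()}

    def lift2(d):
        return {(STAR, p): w for p, w in d.items()}

    T = {}
    for t1, (lab1, pre1, post1) in n1["T"].items():
        if lab1[:-1] in common:
            # Pair t1 with every equally labelled transition of N2.
            for t2, (lab2, pre2, post2) in n2["T"].items():
                if lab2 == lab1:
                    T[(t1, t2)] = (lab1,
                                   {**lift1(pre1), **lift2(pre2)},
                                   {**lift1(post1), **lift2(post2)})
        else:
            T[(t1, STAR)] = (lab1, lift1(pre1), lift1(post1))
    for t2, (lab2, pre2, post2) in n2["T"].items():
        if lab2[:-1] not in common:
            T[(STAR, t2)] = (lab2, lift2(pre2), lift2(post2))

    out = n1["Out"] | n2["Out"]
    return {"In": (n1["In"] | n2["In"]) - out, "Out": out,
            "Int": n1["Int"] | n2["Int"], "T": T,
            "M": {**lift1(n1["M"]), **lift2(n2["M"])}}

# Labels as in Fig. 2; the arcs and markings below are hypothetical.
n1 = {"In": set(), "Out": {"a"}, "Int": set(), "M": {"p1": 1},
      "T": {"t1": ("a+", {"p1": 1}, {"p2": 1})}}
n2 = {"In": {"a"}, "Out": set(), "Int": set(), "M": {"q1": 1, "q2": 1},
      "T": {"t2": ("a+", {"q1": 1}, {"q3": 1}),
            "t3": ("a+", {"q2": 1}, {"q4": 1})}}
n = compose(n1, n2)
print(sorted(n["T"]))  # the combined transitions (t1, t2) and (t1, t3)
```

Note that a transition whose signal is in A but which has no equally labelled partner in the other STG simply does not appear in the composition, in line with the definition of T.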
STGs together with the three operations defined above form a circuit algebra as defined in Dill’s Ph.D. thesis [7],
when regarding externally isomorphic STGs as equal. For our further considerations we will use the properties
(CA6) : (N/H1)/H2 = N/(H1 ∪ H2) and
(CA8) : N1/H1||N2/H2 = (N1||N2)/(H1 ∪ H2) if Hi ∩ Sig3−i = ∅, i = 1, 2
satisfied by a circuit algebra.2
2 There are 7 additional laws a circuit algebra must fulfill (in our notation): (CA1) (N1||N2)||N3 = N1 ||(N2||N3) = N1||N2||N3, (CA2):
N1 ||N2 = N2||N1; (CA3): φ2(φ1(N)) = (φ2 ◦ φ1)(N), (CA4): φ(N1||N2) = φ(N1)||φ(N2), (CA5): id(N) = N , (CA7): N/∅ = N , (CA9):
φ(N/H ) = φ′(N)/φ′(H ), for φ = φ′ |Sig\H and φ(Sig \ H ) ∩ φ′(H ) = ∅. These properties are satisfied for our definitions, where (CA4) and
(CA9) only have to hold if both sides are defined.
While (CA6) is obvious, we will give a short argument for (CA8). Observe that (N1||N2)/{x} = (N1/{x})||N2 if
x ∈ Sig2: since x ∈ Sig2, the transitions labelled with x± in N1 are not paired with transitions of N2 and it is therefore
not important whether the hiding is done before or after the parallel composition. Using (CA6) and symmetry, this
can be generalised to sets H1 and H2 by induction.
Let RGN be the reachability graph of an STG N . A state vector is a function sv : Sig → {0, 1} mapping signals to
their state, where ‘0’ means logical low and ‘1’ logical high. A state assignment assigns a state vector to each marking
M of RGN denoted by svM .
A state assignment must satisfy for every signal x ∈ Sig and every pair of markings M, M ′ ∈ [MN 〉:
• M[x+〉〉M ′ implies svM (x) = 0, svM ′(x) = 1
• M[x−〉〉M ′ implies svM (x) = 1, svM ′(x) = 0
• M[y±〉〉M ′ for y ≠ x implies svM (x) = svM ′(x).
If such an assignment exists, it is uniquely defined by these properties,3 and the reachability graph (and also the
underlying STG) is called consistent. From an inconsistent STG, one cannot synthesise a circuit. As mentioned before,
the reachability graph of VME is annotated with its state assignment, which proves that VME is in fact consistent, see
Fig. 1(b).
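One possible way to test consistency on an explicitly given reachability graph is sketched below in Python. The idea, which is an assumption of this sketch and not a procedure from the paper, is to track for each marking the parity of the number of edges of each signal on a path from the root, and then to derive the initial value of each signal from the direction of its edges. Signals that never occur get the arbitrary value 0, in the spirit of the footnote on uniqueness. Markings are plain strings in the hypothetical examples.

```python
def state_assignment(initial, edges, signals):
    """Return {marking: {signal: 0 or 1}}, or None if the graph is inconsistent.
    edges: (M, label, M') triples of the reachability graph, labels like 'a+'."""
    adj = {}
    for m, label, n in edges:
        adj.setdefault(m, []).append((label, n))

    # parity[M][x]: number of x-edges (mod 2) on any path from the root to M.
    parity = {initial: {x: 0 for x in signals}}
    stack = [initial]
    while stack:
        m = stack.pop()
        for label, n in adj.get(m, []):
            expected = dict(parity[m])
            expected[label[:-1]] ^= 1
            if n not in parity:
                parity[n] = expected
                stack.append(n)
            elif parity[n] != expected:
                return None          # two paths disagree on some signal value

    # Each x+ edge needs value 0 at its source, each x- edge value 1.
    root_value = {}
    for m, label, _ in edges:
        x = label[:-1]
        before = 0 if label.endswith("+") else 1
        value = before ^ parity[m][x]
        if root_value.setdefault(x, value) != value:
            return None
    return {m: {x: root_value.get(x, 0) ^ p for x, p in par.items()}
            for m, par in parity.items()}

# Hypothetical consistent graph: a+ and a- alternate.
good = [("M0", "a+", "M1"), ("M1", "a-", "M0")]
# Hypothetical inconsistent graph: a+ occurs twice in a row.
bad = [("M0", "a+", "M1"), ("M1", "a+", "M0")]

sv = state_assignment("M0", good, {"a"})
```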
Another necessary condition for synthesis is complete state coding (CSC). We say that a consistent RGN (and N)
has CSC if:
∀x ∈ Loc, M, M ′ ∈ [MN 〉 : svM = svM ′ ⇒ (M[x±〉〉 ⇔ M ′[x±〉〉).
If an STG violates CSC, no asynchronous circuit can be synthesised because a circuit determines the activated local
signal edges only from the current state of its signals (the state vector); hence, the circuit cannot distinguish the two
markings with the same state vector and the same local signals must be enabled. It is possible that different input
signals are enabled in M and M ′ because these are not controlled by the circuit. For an example of a CSC violation,
see Fig. 1(b): the states labelled M ′ and M ′′ have the same state vector, but enable different output signals.
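The CSC conflict of VME can be reproduced with a small Python sketch. The arc list below encodes the commonly used marked-graph form of the VME read cycle and is an assumption of this sketch, since the drawing of Fig. 1(a) is not reproduced in the text; the signal sets and the signal order follow the text and the caption of Fig. 1. The sketch also assumes consistency and the all-zero initial state vector.

```python
from collections import deque

ORDER = ["dsr", "ldtack", "dtack", "lds", "d"]   # signal order of Fig. 1(b)
LOCAL = {"lds", "dtack", "d"}                    # outputs; VME has no internal signals

# Marked-graph arcs t -> t', one implicit place per arc (assumed structure).
ARCS = [("dsr+", "lds+"), ("lds+", "ldtack+"), ("ldtack+", "d+"),
        ("d+", "dtack+"), ("dtack+", "dsr-"), ("dsr-", "d-"),
        ("d-", "dtack-"), ("d-", "lds-"), ("lds-", "ldtack-"),
        ("dtack-", "dsr+"), ("ldtack-", "lds+")]
# Initially marked places (assumed): those in front of dsr+ and lds+.
INITIAL = frozenset({("dtack-", "dsr+"), ("ldtack-", "lds+")})

def successors(marking):
    """Yield (label, M') for every transition enabled under the marking."""
    for t in {dst for _, dst in ARCS}:
        pre = frozenset(a for a in ARCS if a[1] == t)
        if pre <= marking:
            post = frozenset(a for a in ARCS if a[0] == t)
            yield t, (marking - pre) | post

def explore():
    """Return {marking: state vector}, starting from the all-zero vector."""
    vec = {INITIAL: (0,) * len(ORDER)}
    queue = deque([INITIAL])
    while queue:
        m = queue.popleft()
        for label, nxt in successors(m):
            v = list(vec[m])
            v[ORDER.index(label[:-1])] = 1 if label.endswith("+") else 0
            if nxt not in vec:
                vec[nxt] = tuple(v)
                queue.append(nxt)
    return vec

def csc_conflicts(vec):
    """Pairs of markings with equal state vectors but different enabled local edges."""
    def local_edges(m):
        return {l for l, _ in successors(m) if l[:-1] in LOCAL}
    ms = list(vec)
    return [(a, b) for i, a in enumerate(ms) for b in ms[i + 1:]
            if vec[a] == vec[b] and local_edges(a) != local_edges(b)]

vectors = explore()
conflicts = csc_conflicts(vectors)
print(len(vectors), len(conflicts))
```

Under these assumptions the graph has 14 markings and a single conflicting pair with state vector 11010, where one marking enables d+ and the other lds−, matching the conflict between M′ and M′′ described above.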
3. Correctness with internal signals
Before we come to our new correctness definition for decomposition, we introduce an important notion, which
is related to the speed-independent model. As mentioned in the introduction, PETRIFY can modify an STG such
that CSC is satisfied. If one is interested in speed-independent circuits, as we are in this paper, one can require that
PETRIFY preserves the following important property.
Definition 1 (Input-Properness). An STG is input-proper if no input signal becomes enabled by the occurrence of an
internal signal, i.e. M1[s±〉〉M2 with M1 a reachable marking, ¬M1[a±〉〉 and M2[a±〉〉, a ∈ In, implies s ∉ Int.
Recall that an STG also specifies which inputs the environment may perform; if the environment performs an input
that is not enabled in the current marking of the STG, then such an unexpected input may lead to a malfunction of the
circuit as described in the previous section. To meet this specification, the environment must ‘know’ whether an input
is expected or not. Therefore, speed-independent STGs as well as asynchronous circuits have to be input-proper: as an
example where input-properness is violated, consider the case when the environment will produce an input signal after
the circuit produced a certain output, but the circuit must produce some internal signal before it is ready to receive this
input. Since the SI model allows that the production of the internal signal is delayed, the input from the environment
might arrive too early.
Actually, the implementation of non-input-proper STGs is still possible, but one has to make timing assumptions
about the relative order of signal transitions, e.g. one might assume that an input is slower than an internal signal if both
are triggered by the same output. However, such circuits are no longer speed-independent and are not considered here.
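Input-properness can be checked directly on a reachability graph, as the following Python sketch suggests. The function name, the edge-list encoding and both example graphs are hypothetical; the first example mirrors the violation described above, where an internal signal must occur before the input is enabled.

```python
def input_proper_violations(edges, inputs, internal):
    """edges: (M1, label, M2) triples of the reachability graph.
    Returns triples (M1, s±, a±) where the internal edge s± enables
    an input edge a± that was not enabled before."""
    enabled = {}
    for m, label, _ in edges:
        enabled.setdefault(m, set()).add(label)
    bad = []
    for m1, label, m2 in edges:
        if label[:-1] in internal:
            newly = enabled.get(m2, set()) - enabled.get(m1, set())
            bad.extend((m1, label, a) for a in sorted(newly) if a[:-1] in inputs)
    return bad

# Hypothetical violation: after output x+, internal u+ must occur before input a+.
violating = [("M0", "x+", "M1"), ("M1", "u+", "M2"), ("M2", "a+", "M3")]

# Hypothetical repair: a+ is already enabled right after x+, concurrently with u+.
proper = [("M0", "x+", "M1"), ("M1", "u+", "M2"), ("M1", "a+", "M3"),
          ("M2", "a+", "M4"), ("M3", "u+", "M4")]

print(input_proper_violations(violating, {"a"}, {"u"}))
print(input_proper_violations(proper, {"a"}, {"u"}))
```

The first call reports the single offending triple `("M1", "u+", "a+")`, the second an empty list.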
Now we give our improved correctness definition, which considers internal signals; afterwards, we will explain its
specific properties and why they are sound. In addition to internal signals of the components, we allow the components
to communicate with each other internally4; such signals have to be outputs of a component, but internal signals from
the perspective of the specification. Thus, we allow such signals to be hidden globally on the level of the parallel
composition using the set of signals H .
3 At least for every signal s ∈ Sig which actually occurs, i.e. M[s±〉〉 for some reachable marking M.
4 For the time being, our decomposition method does not produce such components, but this is definitely a future research topic.
Fig. 3. (a) Original STG N . (b) Decomposition into two components. In the second component and also in the parallel composition – but not in N
– the input signal b+ can occur at the initial marking yielding a marking M. But this is not a problem since no environment suitable for N will fire
b+ initially. Correspondingly, the marking M does not have to (and in fact cannot) appear in the STG-bisimulation for this decomposition.
Definition 2 (Correct Decomposition). A collection of deterministic components (Ci )i∈I is a correct decomposition
of (or simply correct w.r.t.) a deterministic STG N – also called specification – when hiding H , if C = (||i∈I Ci )/H
is defined, InC ⊆ InN , OutC ⊆ OutN and there is an STG-bisimulation B between the markings of N and those of C
with the following properties:
1. (MN , MC ) ∈ B.
2. For all (M, M ′) ∈ B, we have:
(N1) If a ∈ InN and M[a±〉〉M1, then either a ∈ InC , M ′[a±〉〉M ′1 and (M1, M ′1) ∈ B for some M ′1 or a ∉ InC
and (M1, M ′) ∈ B.
(N2) If x ∈ OutN and M[x±〉〉M1, then M ′[vx±〉〉M ′1 and (M1, M ′1) ∈ B for some M ′1 with v ∈ (IntC±)∗ .
(N3) If u ∈ IntN and M[u±〉〉M1, then M ′[v〉〉M ′1 and (M1, M ′1) ∈ B for some M ′1 and v ∈ (IntC±)∗.
(C1) If x ∈ OutC and M ′[x±〉〉M ′1, then M[vx±〉〉M1 and (M1, M ′1) ∈ B for some M1 with v ∈ (IntN±)∗.
(C2) If x ∈ Outi for some i ∈ I and M ′↾Pi [x±〉〉, then M ′[x±〉〉 (no computational interference).
(C3) If u ∈ IntC and M ′[u±〉〉M ′1, then M[v〉〉M1 and (M1, M ′1) ∈ B for some M1 and v ∈ (IntN ±)∗.
Here, and whenever we have a collection (Ci )i∈I in the following, Pi stands for PCi , Outi for OutCi etc.
In the simplest case, (Ci )i∈I consists of just one component C1 (immediately implying (C2)) and H is empty;
in this case we say that C1 is a (correct) implementation of N.
B describes how behaviour of N and C closely match each other, similar to ordinary bisimulation. As in [16,15], we
allow OutC to be a proper subset of OutN for the case when there are output signals, which are in fact never produced
by the specification. Our decomposition algorithm actually only produces components Ci where OutC = OutN ; in
any case, if equality is desired, it can be achieved by formally adding the missing output signals OutN \ OutC to some
set Outi .
For a different reason we allow InC to be a proper subset of InN ; there are cases where some inputs are just
irrelevant for the behaviour of a circuit, but they were possibly included by some design error. The decomposition
algorithm might detect such signals, since they are not needed for any component. Because of this possibility, in (N1)
an input signal transition of the specification does not have to be matched by the implementation.
(C2) ensures that no computational interference (mentioned before the definition of parallel composition) occurs;
i.e. if a component produces an output, then the other components expect this signal if it belongs to their inputs, and
no malfunction of these other components must be feared. (C2) is actually also satisfied for x ∈ Inti , since internal
signals of one component are by the definition of parallel composition unknown to the other components.
Remarkably, there is no condition that requires a matching for an input occurring in the implementation. On the one
hand, if the specification also allows such an input in a matching marking, then the markings after the input must match
again by (N1) due to determinism. On the other hand, there are very natural decompositions which allow additional
input edges compared to the specification, and it does no harm to include these decompositions in our definition:
since the specification also describes which inputs are or are not allowed for the environment, the additional inputs
will actually never occur if the decomposition runs in an environment it is meant for. (The additional input leads to a
marking which in a way corresponds to a don’t-care entry in a Karnaugh-diagram.)
As a consequence, the components might have behaviour and markings that never turn up if the components run
in an appropriate environment; also, these markings do not appear in B; see e.g. Fig. 3 taken from [16]. A subtle
property of our correctness definition is that it allows e.g. computational interference for such markings, which is
perfectly reasonable since such an interference will not occur in practical use.
Fig. 4. Decomposition of VME from Fig. 1(a) into two components C1 and C2.
The features discussed so far are taken from [16], where some more explanations can be found. The new features
deal with internal signals; they extend the definition of [16] conservatively: for STGs without internal signals, the
two correctness notions coincide. The consequence will be that the result about top-down decomposition in the next
section also applies in the setting of our decomposition algorithm, where we have not considered internal signals so
far.
First of all, we allow the hiding of some output signals H in the parallel composition of the components; this
concerns additional signals to enable communication between the components. It is no problem that we allow hiding
at the “top-level” only: by way of an example, assume that the components C1 and C2 communicate via a signal x
which should not be visible to the other components; this would be modelled by (((C1||C2)/{x}) || (||i∈I\{1,2}Ci ))/H .
Now this equals (||i∈I Ci)/(H ∪ {x}) by the properties (CA8) and (CA6) of a circuit algebra, where (CA8) is applicable
since x is only known to C1 and C2 and hence assumed to be not a signal of ||i∈I\{1,2}Ci . We will use similar reasoning
in Section 4 where a component will be replaced by a decomposition of its own.
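The top-level hiding argument can be written out step by step. The derivation below is a sketch in which R abbreviates the composition of the remaining components (a shorthand introduced here, not in the paper) and x is assumed not to be a signal of R, as in the text. Besides (CA8) and (CA6), it uses (CA7) and associativity (CA1) from the footnote listing the circuit algebra laws; if C1 and C2 are not the first two components, commutativity (CA2) is needed as well.

```latex
% R := \|_{i \in I \setminus \{1,2\}} C_i, with x \notin Sig_R
\begin{align*}
\bigl(((C_1 \| C_2)/\{x\}) \,\|\, R\bigr)/H
  &= \bigl(((C_1 \| C_2)/\{x\}) \,\|\, (R/\emptyset)\bigr)/H
     && \text{(CA7)} \\
  &= \bigl(((C_1 \| C_2) \,\|\, R)/(\{x\} \cup \emptyset)\bigr)/H
     && \text{(CA8), since } \{x\} \cap Sig_R = \emptyset \\
  &= \bigl((\|_{i \in I} C_i)/\{x\}\bigr)/H
     && \text{(CA1)} \\
  &= (\|_{i \in I} C_i)/(H \cup \{x\})
     && \text{(CA6)}
\end{align*}
```

The second condition of (CA8) holds trivially here because the hidden set on the side of R is empty.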
In (N2) and (C1) outputs do not have to be matched directly; (N2) allows the components to prepare the production
of this output by some internal signals, e.g. to achieve CSC or to inform other components; (C1) allows the
specification to perform internal signals also. In any case, from an external point of view each output is matched
by the same output.
In contrast, input signals must be matched directly; if the implementation could precede the input by some internal
signals, the environment could produce the input as specified in N at a stage where the implementation is not ready
yet to receive it, which could lead to malfunction as discussed above in connection with speed-independence and
input-properness. As for computational interference, the absence of this malfunction is only checked for markings
appearing in B, since only for these the problem is practically relevant.
In fact, the direct matching of inputs implies that the implementation is in a sense input-proper, at least in its
“reachable behaviour”: assume that M1[u±〉〉M2 with u ∈ IntC , M1 a reachable marking of C , and M2[a±〉〉 for some
a ∈ InC ; then either there is no pair (M, M1) in the STG-bisimulation (hence, M1 will not be reached if C works in a
proper environment) or ¬M[a±〉〉 (a proper environment will not produce a±) or M1[a±〉〉 by (N1). In the latter case
a± is not enabled by u±, hence input-properness is not violated.
Finally, (N3) and (C3) prescribe the matching of an internal signal by a sequence of internal signals – just as in
ordinary weak bisimulation. Note that we have several internal signals, since these have to be implemented physically;
but regarding external behaviour, the identity of an internal signal does not matter. In principle, performing an internal
signal could make a choice, e.g. by disabling an output; according to these clauses, this choice has to be matched.
An example for a decomposition for VME (generated with our tool DESIJ [14]) which is correct according to
Definition 2 can be found in Fig. 4. Observe that the input d of the right hand component is an output of the parallel
composition and is produced by the left hand component. The corresponding STG-bisimulation can be found in
Appendix A, Figs. A.1–A.3.
Translating the treatment of internal signals in the definition of the somewhat related notion of I/O-compatibility
[2] to our setting (see also Section 7), one would require that e.g. in (N3) (M1, M ′) ∈ B without involving any u –
and analogously in (C3); the idea is that internal signals cannot make decisions in digital circuits. There are several
reasons not to follow this idea. First of all, this concerns a property one might like all STGs to have and it is not related
to comparing STGs or to the communication between circuits – in contrast to e.g. computational interference; if one
wants this property in order to ensure physical implementability, it has to hold also for markings not appearing in B.
Therefore, this property has no adequate place in a correctness definition and should be required separately. Secondly,
one might want to use so-called ME-elements [17], which can make decisions; the respective signals could be internal
to the parallel composition. We see it as an advantage that we can cover such cases. Finally, the alternative definition
turned out to be technically inconvenient.
Observe that the alternative definition coincides with ours if the specification does not have internal signals; then,
(N3) is never applicable, and in (C3) we have v = λ and M = M1.
There is another important comment. Our correctness definition concerns the correctness of a decomposition, but it
also covers the question whether one STG is an implementation of another. With this notion, we will prove in Section 5
that speed-independent CSC-solving with PETRIFY produces a correct implementation.
One would like this implementation relation to be a preorder. Reflexivity is obvious (choose B as the identity), and
transitivity will follow from our first main result in the next section. One would also like it to be a precongruence for
the operations of interest. This is obvious for relabelling and easy for hiding (use the same STG-bisimulation). The
much more important case of parallel composition will be discussed in the next section.
Actually, one can see a more general result for hiding just as easily: (∗) if (Ci )i∈I is correct w.r.t. N when hiding H ,
then (Ci )i∈I is also correct w.r.t. N/H ′ when hiding H ∪ H ′. As a consequence, we can apply our decomposition algorithm
[16,15] also to an STG N1 with internal signals as follows. Since the algorithm can only decompose STGs without
internal signals, we change the internal signals of N1 to outputs, obtaining an STG N2 with N1 = N2/Int1. Then we
decompose N2, obtaining a correct decomposition (Ci )i∈I of N2. After that, the formerly internal signals are hidden
in N2 and in ||i∈I Ci and from (∗) we get that (Ci )i∈I is a correct decomposition of N1 = N2/Int1 when hiding Int1.
4. Decomposition of subcomponents
In this section we will show that correctness is preserved when we decompose a component of an STG decomposition
into subcomponents. This result makes it possible to design and implement STGs in a top-down fashion.
In particular, such top-down decomposition can be useful for efficiency of our decomposition algorithm. For
example, consider a case where only one component Ci of a decomposition needs a specific input signal a, which
therefore will be removed from every other component by the decomposition algorithm (cf. Section 1). Alternatively,
the algorithm could first construct a component C j which generates every output signal that is not produced by Ci , and
afterwards decompose it into smaller components. This way, the signal a will only be removed from one component
C j , which can improve performance. This and other strategies for decomposition are studied in [14]; there, it is also
discussed how to group the output signals for good efficiency.
Top-down decomposition as described above is possible under two minor conditions stated in the following
theorem: the composition of the subcomponents must have all output signals of the decomposed component, and
its internal signals must be unknown to the other components. The first condition is often automatically true or can be
achieved easily, as mentioned after the definition of correctness; the second is an obvious restriction required by our
definition of parallel composition and can trivially be fulfilled by renaming internal signals.
Theorem 3 (Correctness of Top-down Decomposition).
(1) Let N be an STG and (Ci )i∈I a correct decomposition of N when hiding HC. Furthermore, let (Ck)k∈K be
a correct decomposition of some C j when hiding HK ( j ∈ I , I ∩ K = ∅). Then (Ci )i∈I ′ with I ′ := I ∪ K − { j}
is a correct decomposition of N when hiding HC ∪ HK if
$$\bigcup_{k \in K} Out_{C_k} \setminus H_K = Out_{C_j} \quad\text{and}\quad \Bigl(\bigcup_{k \in K} Int_{C_k} \cup H_K\Bigr) \cap \bigcup_{i \in I \setminus \{j\}} Sig_{C_i} = \emptyset.$$
(2) The implementation relation is a preorder.
Proof. The proof of (1) requires a careful and detailed case analysis and can be found in Appendix B; here we will
just give the main idea.
The main part is to find an STG-bisimulation B between the markings of N and the new decomposition C ′: let B1
be the STG-bisimulation between the markings of N and the old decomposition C , B2 the one between the markings
of C j and its decomposition CK . We define a relation B between the markings of N and C ′ as follows:
(M1, M2∪˙M3) ∈ B ⇔ (M1, M2∪˙M j ) ∈ B1 and (M j , M3) ∈ B2 for some M j ,
where M1 is a marking of N , M j one of C j , M2 a combined marking of the untouched components, and M3 a marking
of CK .
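The construction of B is a relational composition through the intermediate marking M j. A minimal Python sketch, assuming that B1 and B2 are given as finite sets of pairs and that combined markings are encoded as tuples (the function name and the encoding are illustrative assumptions, not part of the paper):

```python
def compose_bisimulations(b1, b2):
    """Build B from B1 and B2 as in the proof sketch of Theorem 3(1).

    b1: set of pairs (M1, (M2, Mj))  -- N versus the old decomposition
    b2: set of pairs (Mj, M3)        -- C_j versus its decomposition C_K
    Returns the set of pairs (M1, (M2, M3)).
    """
    # Index B2 by the marking of C_j so that each lookup is cheap.
    by_mj = {}
    for mj, m3 in b2:
        by_mj.setdefault(mj, set()).add(m3)
    return {
        (m1, (m2, m3))
        for (m1, (m2, mj)) in b1
        for m3 in by_mj.get(mj, ())
    }
```

The sketch only builds the candidate relation; that it satisfies the clauses of an STG-bisimulation is the content of the case analysis in Appendix B.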
Regarding (2), we already know that the implementation relation is reflexive; transitivity is just a special case of
(1), where both hiding sets are empty and the decompositions have just one component each. So (1) tells us that,
if C is an implementation of N and C ′ an implementation of C , then C ′ is an implementation of N – except that
we do not have the two extra conditions required in (1). But observe that the second condition is trivially true since
$\bigcup_{i \in I \setminus \{j\}} Sig_{C_i}$ is empty. The first condition is only needed to prove claims that are obvious for this restricted case,
namely that the parallel composition C ′ (which has only one component here) is defined and that InC ′ ⊆ InN . □
Remark. One might expect that refining a component C j of (||i∈I Ci )/HC with (||k∈K Ck)/HK would give the STG
$$\bigl(\,\|_{i \in I \setminus \{j\}} C_i \;\|\; (\|_{k \in K} C_k / H_K)\bigr) / H_C,$$
where there is not just one hiding on the top-level as in the theorem. But with
the same reasoning already used in the discussion of Definition 2, we can derive from the properties (CA8) (use the
second assumption on HK ) and (CA6) of a circuit algebra that for H = HC ∪ HK :
$$\bigl(\,\|_{i \in I \setminus \{j\}} C_i \;\|\; (\|_{k \in K} C_k / H_K)\bigr) / H_C = \bigl((\|_{i \in I'} C_i) / H_K\bigr) / H_C = \|_{i \in I'} C_i / H.$$
As explained after Definition 2, our correctness definition coincides with the one of [16,15] if we restrict ourselves
to STGs without internal signals; hence, the above theorem also holds in this setting (where of course no hiding
is applied, i.e. the hiding sets are taken to be empty). Therefore, the theorem can indeed be used to improve the
decomposition of [16,15] as explained at the beginning of this section.
Surprisingly, the theorem also has an impact on the question whether the implementation relation between STGs
is a precongruence for parallel composition, which we will now show under some mild restrictions. Recall that, for
some N1||N2 to be defined, we only had some syntactic requirements regarding the signal sets; but the composition
only makes sense in the area of circuits, if we also ensure absence of computational interference; for the following
definition cf. the discussion on condition (C2) of Definition 2.
Definition 4 (Interference-free). A parallel composition N1||N2 is interference-free if, for all its reachable markings
M1 ∪˙ M2, i ∈ {1, 2} and x ∈ Outi , Mi [x±〉〉 implies M1 ∪˙ M2[x±〉〉.
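To make Definition 4 concrete, the following Python sketch checks interference-freeness on explicit reachability graphs rather than on nets. The dictionary layout (`init`, `edges`, `sig`, `out`), the encoding of signal edges as strings such as `"x+"`, and the function name are assumptions made for this illustration only; common signals are assumed to synchronise and all other signals to interleave.

```python
from collections import deque


def interference_free(n1, n2):
    """Check Definition 4 on the reachable states of n1 || n2.

    Each component is a dict (hypothetical layout) with the keys
    'init', 'edges' ({state: {label: next_state}}), 'sig' and 'out'.
    """
    comps = (n1, n2)
    start = (n1["init"], n2["init"])
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        moves = [c["edges"].get(s, {}) for c, s in zip(comps, state)]
        # An output enabled in one component must not be blocked
        # by the other component if that one also knows the signal.
        for i in (0, 1):
            other = 1 - i
            for label in moves[i]:
                sig = label[:-1]
                is_output = sig in comps[i]["out"]
                shared = sig in comps[other]["sig"]
                if is_output and shared and label not in moves[other]:
                    return False
        # Explore the successors of the composed state.
        for label in set(moves[0]) | set(moves[1]):
            sig = label[:-1]
            nxt = list(state)
            fireable = True
            for i in (0, 1):
                if sig in comps[i]["sig"]:
                    if label in moves[i]:
                        nxt[i] = moves[i][label]
                    else:
                        fireable = False
            nxt = tuple(nxt)
            if fireable and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

The check terminates only for finite reachability graphs, which is the setting assumed here.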
Corollary 5. If N2 is a correct implementation of N1, N1 and N2 have the same output signals, and N1||N is a
well-defined and interference-free parallel composition, then N2||N is a correct implementation of N1||N.
Proof. Since the composition is interference-free, the identity is an STG-bisimulation showing that the family (N1, N)
is a correct decomposition of N1||N ; note that in this setting all conditions for an STG-bisimulation are trivially
fulfilled except for (C2). With this observation, the claim follows from Theorem 3. □
Note that each of our operations hiding, renaming and parallel composition with another STG changes the set of
output signals in the same way, such that equality of these sets is preserved.
Corollary 6 (Implementation Relation as Precongruence). The implementation relation is a precongruence for
hiding, relabelling and parallel composition when restricted to STGs with the same output signals.
We will see another application of the theorem in the next section.
5. CSC-Solving for components of a decomposition
In this section we will prove that speed-independent CSC-solving fits into our correctness definition, i.e. that it
leads to a correct implementation. Theorem 3 then implies that speed-independent CSC-solving can be combined
with our decomposition algorithm. The latter could be shown directly without this theorem, but its use makes the
following proof much easier, because we have to consider only one component. First, we will introduce the operation
of input-proper event insertion, which is used by the tool PETRIFY to achieve CSC.5
Given an STG without CSC, one can (in many cases) insert internal signals into the STG such that their values
distinguish between the markings with equal state vectors but different enabled outputs. These insertions take place
5 One can also insert events on the level of Petri nets, see e.g. [11].
Fig. 5. Example for an event insertion. (a) A Petri net (to keep it small, transitions are labelled with signals). (b) Its reachability graph. The two
gray states are the region R where the new event u will be inserted. (c) The reachability graph with the inserted event u. The marking relation is
M = {(1, 1), (2, 2), (2, 2′), (3, 3), (4, 4), (4, 4′), (5, 5), (6, 6)}.
on the level of reachability graphs (as most of our considerations in this paper do). It is also possible to derive an
STG for the modified reachability graph, and although this is not important for the synthesis of a circuit, it fits our
manner-of-speaking well. We take the following definition of input-proper event insertions from [6]. One can perform
a number of these operations, arriving at an STG with CSC, and this we call speed-independent CSC-solving.6
In the following definition, a region denotes a set of markings where a new signal transition should be produced;
all these markings are duplicated, such that the state vectors of the new markings differ from the old ones only in
the value of the new internal signal. Furthermore, once the region is entered, it can only be left if the internal signal
was produced. A region has to be chosen such that the CSC conflict is actually destroyed and input-properness is
preserved. To prove the correctness of CSC-solving we only need the latter condition.
Definition 7 (Event Insertion). Let N be a deterministic STG, u± a signal transition not appearing in N for a
(possibly new) internal signal u and R ⊆ [MN 〉. The event insertion of u± at region R into N modifies the reachability
graph RGN (and results in a corresponding STG N ′) as follows (cf. e.g. Fig. 5):
(1) For every marking M ∈ R add a duplicate M ′ and add the transition M[u±〉〉M ′.
(2) If M1, M2 ∈ R and M1[s±〉〉M2, add the transition M ′1[s±〉〉M ′2.
(3) If M1 ∈ R, M2 ∈ [MN 〉 \ R and M1[s±〉〉M2, remove this transition and add M ′1[s±〉〉M2.
(4) The initial marking of N ′ is the same as that of N . Add u to Int.
The insertion is called input-proper if there is no M1[a±〉〉M2 in RGN with a ∈ In, M1 ∈ R and M2 ∉ R.
We define the marking relation M between the markings of N and of N ′ such that (M1, M2) ∈ M if M2 = M1 or
M2 = M ′1.
It is not hard to see that N ′ as above is deterministic again. An example for an event insertion can be found in
Fig. 5.
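Steps (1) to (3) of Definition 7 can be sketched in Python on an explicit edge set. The triple encoding `(source, label, target)`, the naming of a duplicate marking by appending a prime to its name, and the function name are assumptions of this illustration; the initial marking is left untouched as required by step (4).

```python
def insert_event(edges, region, event):
    """Insert `event` at `region` into a reachability graph.

    edges:  set of (source, label, target) triples
    region: set of markings R at which the event is inserted
    Returns the new edge set and the marking relation.
    """
    prime = {m: f"{m}'" for m in region}
    # Step (1): every marking of R gets a duplicate reached by the event.
    new_edges = {(m, event, prime[m]) for m in region}
    for src, label, dst in edges:
        if src in region and dst in region:
            # Step (2): edges inside R are kept and copied to the duplicates.
            new_edges.add((src, label, dst))
            new_edges.add((prime[src], label, prime[dst]))
        elif src in region:
            # Step (3): R can only be left from a duplicate marking.
            new_edges.add((prime[src], label, dst))
        else:
            new_edges.add((src, label, dst))
    states = {m for s, _, d in edges for m in (s, d)} | set(region)
    relation = {(m, m) for m in states} | {(m, prime[m]) for m in region}
    return new_edges, relation
```

For a chain 1 -a+-> 2 -x+-> 3 with R = {2}, the sketch yields the edges 1 -a+-> 2, 2 -u+-> 2' and 2' -x+-> 3, so that x+ can only occur after the inserted event.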
Of course, the insertion of just a single signal edge will generate an inconsistent STG; this is no problem for our
correctness notion, but for practical purposes event insertions are always performed in pairs (u+ and u− for a new
internal signal u), such that the result is consistent again; this is called signal insertion.
The following lemma is needed to prove the main theorem of this section. It is also needed to prove Proposition 9
which explains why we speak of an input-proper insertion and which in particular guarantees that input-properness is
preserved during CSC-solving.
Lemma 8. Let N be an STG and N ′ be obtained from N by the event insertion of u± at region R. Let (M1, M2) ∈ M
and a ∈ SigN .
6 Other methods of CSC-solving rely on timing-assumptions and are not treated here.
(1) If M2 = M ′1, then M1[a±〉〉Mˆ1 in N implies M2[a±〉〉Mˆ2 in N ′ with (Mˆ1, Mˆ2) ∈M.
(2) M2[a±〉〉Mˆ2 in N ′ implies M1[a±〉〉Mˆ1 in N with (Mˆ1, Mˆ2) ∈M.
Proof. (1) M2 = M ′1 implies M1 ∈ R and by Definition 7(2),(3) M2[a±〉〉Mˆ2 in N ′ with (Mˆ1, Mˆ2) ∈ M, where we have
Mˆ2 = Mˆ ′1 if case 2 is applicable and Mˆ2 = Mˆ1 if case 3 is applicable.
(2) The reasoning is similar. □
Proposition 9. Let N be an input-proper STG and let N ′ be obtained by the insertion of u± at R. Then N ′ is input-
proper if and only if the insertion is.
Proof. Assume that the insertion is not input-proper because of M1[a±〉〉M2; then we have in N ′: M1[u±〉〉M ′1 due
to Definition 7(1) and ¬M1[a±〉〉 and M ′1[a±〉〉M2 due to (3).
Vice versa, assume now that N ′ is not input-proper because of M1[v±〉〉M2, ¬M1[a±〉〉 and M2[a±〉〉M3 for some
a ∈ In. If v± is the newly inserted u±, then M2 = M ′1; we cannot have M2[a±〉〉M3 due to Definition 7(2) because
then M1[a±〉〉, and thus we must have M2[a±〉〉M3 due to Definition 7(3), i.e. the insertion is not input-proper because
of M1[a±〉〉M3 in N .
Otherwise, there are Mˆ1 and Mˆ2 with (Mˆ1, M1) ∈ M, (Mˆ2, M2) ∈ M, Mˆ1[v±〉〉Mˆ2 in N and Mˆ2[a±〉〉 in N by
Lemma 8(2). Since N is input-proper, this implies Mˆ1[a±〉〉; since ¬M1[a±〉〉 in N ′, this in turn implies Mˆ1 = M1 by
Lemma 8(1). Thus, Mˆ1 has ‘lost’ an a±-transition during the event insertion; this can only be due to Definition 7(3),
and also in this case, the insertion is not input-proper. □
Now we come to the main results of this section.
Theorem 10 (Correctness of CSC-Solving). Let N be an STG and N ′ be obtained from N by speed-independent
CSC-solving. Then N ′ is a correct implementation of N and vice versa.
Proof. N ′ is obtained from N by a sequence of input-proper event insertions. It suffices to show the claims for
one such insertion, and then the theorem follows from Theorem 3(2). Thus, assume that N ′ is obtained from N by the
input-proper insertion of u± at R, with M being the corresponding marking relation. Obviously, we have InN = InN ′ ,
OutN = OutN ′ .
N ′ is a correct implementation of N : We will show that M is an STG-bisimulation for N and N ′.
(1) Fulfilled by definition of event insertion.
(2) For this part observe that (C2) is trivially fulfilled, because we consider only one component. Now let
(M1, M2) ∈M.
For (N1)–(N3), we only have to consider M2 = M1 due to Lemma 8(1). If a ∈ InN and M1[a±〉〉Mˆ1 in N , then
Definition 7(3) cannot be applicable for this transition since the insertion is input-proper, hence M1[a±〉〉Mˆ1 in N ′ as
well with (Mˆ1, Mˆ1) ∈M and (N1) follows.
Now let x ∈ LocN and M1[x±〉〉Mˆ1 in N . Then we have in N ′ that M1[x±〉〉Mˆ1 if M1 ∉ R or Mˆ1 ∈ R, and that
M1[u± x±〉〉Mˆ1 otherwise; obviously (Mˆ1, Mˆ1) ∈ M and (N2) and (N3) follow.
(C1/C3): Let x ∈ LocN ′ and M2[x±〉〉Mˆ2 in N ′. If x± is not the inserted u±, Lemma 8(2) implies M1[x±〉〉Mˆ1
and (Mˆ1, Mˆ2) ∈M. Otherwise, we have M2 = M1, Mˆ2 = M ′1 and (M1, Mˆ2) ∈M.
N is a correct implementation of N ′: We will argue that M−1 is an STG-bisimulation for N ′ and N .
(1) Fulfilled by definition of event insertion.
(2) For this part observe that (C2) is trivially fulfilled, because we consider only one component. Furthermore,
N2/N3/C1/C3 are dual to C1/C3/N2/N3 above, so we only have to check (N1), which follows directly from
Lemma 8(2), since InN = InN ′ . □
Now we can conclude that speed-independent CSC-solving can be combined with decomposition. An example for
this can be found in Appendix A.
Corollary 11. Let (Ci )i∈I be a correct decomposition of N when hiding H , and let C ′i be obtained from Ci by speed-
independent CSC-solving for all i ∈ I . Then (C ′i )i∈I is a correct decomposition of N when hiding H .
Proof. It is sufficient to consider one component C j and to apply induction afterwards. By Theorem 10, C ′j is a correct
decomposition of C j . Furthermore it fulfills the preconditions of Theorem 3(1), esp. the crucial first one on HK since
HK = ∅ and event insertion does not change the sets of output and input signals. Therefore ((Ci )i∈I\{ j }, C ′j ) is a
correct decomposition of N . □
6. Correctness of an ILP approach to decomposition
In this section we will show that the decomposition method of Carmona and Cortadella [1,3], which has not
been proven correct so far, yields components which are correct decompositions according to our definition. For this
method, it is assumed that an STG with CSC is given, where CSC can also be achieved by modifications on the
STG-level, i.e. without considering the reachability graph. (It can also be given due to a suitable translation from a
description in a high-level language to STGs as in [18].) As explained at the end of Section 2, we can assume that
there are no internal signals.
The method of [1,3] works roughly as follows. Starting with a deterministic STG N that already has CSC, for every
output signal x a CSC support is determined; this is a set of signals, which guarantees CSC for x . Here is the formal
definition:
Definition 12 (CSC Support). Let N be an STG and S ⊆ SigN .
(1) For v ∈ (SigN ±)∗, code change(S, v) is defined as the function over S ⊆ SigN which maps each s ∈ S to the
difference between the numbers of s+ and of s− in v.
(2) S is called CSC support for the output signal x if the following holds for all reachable markings M1, M2: if
MN [v1〉〉M1, MN [v2〉〉M2 for some v1, v2 ∈ (SigN ±)∗ with code change(S, v1) = code change(S, v2), then
M1[x±〉〉 ⇔ M2[x±〉〉.
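Part (1) of Definition 12 amounts to a simple count. The following Python sketch assumes that a trace is a list of strings such as `"a+"` or `"a-"`; this encoding and the function name `code_change` are chosen for illustration.

```python
from collections import Counter


def code_change(support, trace):
    """Map each signal s of `support` to (#s+) - (#s-) in `trace`."""
    counts = Counter()
    for edge in trace:
        signal, direction = edge[:-1], edge[-1]
        if signal in support:
            counts[signal] += 1 if direction == "+" else -1
    # Signals of the support that do not occur in the trace map to 0.
    return {s: counts[s] for s in support}
```

Two traces with the same code change over S lead to markings that cannot be told apart by the signals in S; part (2) requires that such markings agree on whether x± is enabled.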
From the previous definition one can derive an integer linear programming problem (ILP) for an output x and a
signal set S. The infeasibility of this problem then implies that S is a CSC support for x . Actually, the algorithm of
Carmona and Cortadella uses a slightly weaker definition of CSC support, which nevertheless coincides with the given
one for most practical STGs.7 The ILP problem in [3] can easily be modified to match the more accurate Definition 12,
see also [9].
The algorithm starts for every output x with the set including the so-called syntactical triggers of x and x itself, and
iteratively adds signals until it is a CSC support for x , which is checked with the ILP problem mentioned above. Since
the original STG has CSC, this algorithm is always successful. An advantage is therefore that this method produces
components with CSC.
After that, for every output signal the original STG is projected onto the corresponding CSC support: the other
signals are considered as dummies, and as far as possible these dummies and redundant places are removed much as
in our decomposition algorithm. If the resulting component still contains dummies, then [private communication]: the
reachability graph is generated and viewed as a finite automaton with dummies regarded as the empty word. Now the
automaton is made deterministic with well-known methods, which in particular remove all λ-labelled edges. Finally,
we can regard this automaton as an STG again, which has the edges of the automaton as transitions.
The projection part is similar to our algorithm, the difference occurs where backtracking is performed: the method
of [1,3] uses some form of backtracking when determining the CSC support as described above – our algorithm uses
backtracking when the contraction of a dummy signal is not possible.
The CSC-support algorithm produces components (Ci )i∈I with the following properties, which we use for the proof
of Theorem 13. Actually, Item 2 allows components with more than one output signal, making our result stronger.
(1) Every component is deterministic.
(2) The signal set of every Ci is a CSC support for each output signal of Ci .
(3) ∀i ∈ I : L(Ci ) = L(N)↓i .
7 In [3], S is called CSC support for the output x if for all reachable markings M1, M2 with M1[v〉〉M2 and code change(S, v) = 0 one has:
M1[x±〉〉 ⇔ M2[x±〉〉. This definition is equivalent to ours for a reversible STG, i.e. if MN is reachable from every reachable marking.
In the last item, L(N)↓i denotes the projection of L(N) onto the signals of Ci , i.e. all signal transitions s± for
which s ∉ Sigi are removed from the strings in L(N). Note that this item is not equivalent to L(||i∈I Ci ) = L(N). An
example for a decomposition satisfying these properties can be found in Appendix A. We can now prove that (Ci )i∈I
is a correct decomposition according to Definition 2.
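The projection onto the signals of a component can be sketched in a few lines of Python, again assuming traces are lists of strings such as `"x+"` (the encoding and both function names are illustrative assumptions):

```python
def project(trace, signals):
    """Keep exactly those signal edges whose signal belongs to `signals`."""
    return [edge for edge in trace if edge[:-1] in signals]


def project_language(language, signals):
    """Project every trace of a finite set of traces."""
    return {tuple(project(trace, signals)) for trace in language}
```

Item (3) compares each component with such a projection separately, which is weaker than comparing the language of the whole composition with L(N).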
Theorem 13 (Correctness of the CSC-support Algorithm). Let N be an STG and (Ci )i∈I be given as above. Then,
(Ci )i∈I is correct w.r.t. N. The algorithm of [3] is correct.
Proof. By the above, the second claim is a corollary of the first. Let C = ||i∈I Ci . We define a relation B between the
markings of N and C by
(M, (Mi )i∈I ) ∈ B ⇔ ∃w : MN [w〉〉M ∧ ∀i ∈ I : MCi [w↓i 〉〉Mi
where (Mi )i∈I denotes the disjoint union of the Mi , i.e. a marking of ||i∈I Ci .
We will show that B is an STG-bisimulation.
(1) Obviously fulfilled for w = λ.
(2) Let (M, (Mi )i∈I ) ∈ B. Therefore ∃w : MN [w〉〉M ∧∀i ∈ I : MCi [w↓i 〉〉Mi . Since there are no internal signals,
we do not have to consider (N3) and (C3).
(N1): Let a ∈ InN and M[a±〉〉Mˆ . This implies wa± ∈ L(N) and therefore ∀i ∈ I : (wa±)↓i ∈ L(Ci ). If a ∉ InC
we are done; otherwise it follows from the determinism of the components that every Ci with a ∈ Ini can fire a±:
there is only one transition sequence v with l(v) = w↓i and one sequence v′ with l(v′) = w↓i a±; obviously v is a
prefix of v′ and reaches Mi , and therefore Mi [a±〉〉Mˆi .
This holds for every component with a ∈ Ini and therefore (Mi )i∈I [a±〉〉(Mˆi )i∈I where Mˆi = Mi if a ∉ Ini , and
by definition of B we get (Mˆ, (Mˆi )i∈I ) ∈ B.
(N2): Analogous to (N1), since we do not have to consider internal signals.
(C2): Let x ∈ Out j and M j [x±〉〉. Therefore, w↓ j x± ∈ L(C j ) = L(N)↓ j which implies that there exists a
w′ ∈ (SigN ±)∗ with MN [w′〉〉M ′[x±〉〉 and w↓ j= w′↓ j∈ (Sig j±)∗. Since Sig j is a CSC-support for x ∈ Out j and
obviously code change(Sigj, w) = code change(Sigj, w′), we conclude that M[x±〉〉. Applying (N2) gives the desired
result: (Mi )i∈I [x±〉〉.
(C1): Let x ∈ OutC and (Mi )i∈I [x±〉〉(Mˆi )i∈I . If x± is produced by component j , we get M j [x±〉〉; then our
considerations for (C2) imply M[x±〉〉Mˆ . By definition of B: (Mˆ, (Mˆi )i∈I ) ∈ B. □
7. Implementation according to Dill and Carmona/Cortadella
As mentioned at the end of Definition 2, the simplest possible ‘decomposition’ of a specification N1 would consist
of only one component N2, such that no hiding is needed.8 We write N1 SV N2 and say that N1 is SV-implemented
by N2, if N2 is correct w.r.t. N1. Since the other two implementation relations introduced below require In1 = In2 and
Out1 = Out2, we assume further that N1 SV N2 only if these equations hold; this is no real restriction since missing
signals can be added easily to N2.9
We have already shown that SV is a preorder, and in this section we will compare it to two other implementation
relations. Recall that we require STGs to be deterministic.
The second implementation relation Dill is based on the notion of prefix-closed trace structures, defined by Dill
[7]. Trace structures do not have internal signals, and therefore we will restrict ourselves in Theorem 17 to STGs for
which this holds, too.
The third implementation relation CC is based on the notion of I/O-compatibility ([3]) which defines when an
STG works safely together with another STG, which can be considered as the environment. Since no implementation
relation is defined in [3], we have decided to base one on I/O-compatibility in the spirit of Dill. Since I/O-compatibility
requires the STGs to be livelock-free (see below), we restrict ourselves to STGs with this property in Theorem 20.
To define and study Dill , we first repeat the basic definitions of [7].
8 If one wants to hide some signals, one can directly do so in N2; the external hiding of Definition 2 was introduced for inter-component
communication, cf. the discussion before the definition.
9 Output signals can be added formally to Out2; for every missing input signal a, the transitions a+, a− resp., can be added such that one of
them is always activated and consistency is preserved, e.g. as a ring with two transitions.
Definition 14 (Prefix-Closed Trace Structures). A prefix-closed trace structure is a tuple T = (In, Out,S,F). In is
the set of input signals, Out the set of output signals. We define Sig := In ∪Out. S,F ⊆ Sig±∗ are the sets of success
traces, failure traces respectively. We define the set of possible traces P := S ∪ F . S and P must be prefix-closed,
and the trace structure has to be receptive, i.e. P ·In± ⊆ P . □
A trace structure10 represents an asynchronous circuit, of which the traces are partial executions. Success traces
do not lead to malfunction of the circuit, while failure traces do. Obviously, prefix-closedness is a sound requirement,
because a circuit has to perform all prefixes of v before it performs v. Receptiveness means that every possible trace
can be extended by an input signal, i.e. there are no restrictions for the environment, but such an extension can turn a
success trace into a failure trace. In the original definition of Dill, traces are sequences over a set of signals instead of
signal transitions, which is obviously not a real change.
There are two problems with prefix-closed trace structures. First, they can have inherent nondeterminism in the
form of traces v ∈ S ∩ F , i.e. traces which could lead to malfunction or not. Second, there can exist success traces s
which can be extended by a sequence w of only output signals to a failure trace f = sw, i.e. the malfunction of the
circuit is caused by the circuit itself and cannot be avoided by the environment.
These problems can be corrected by removing traces such as s and v from the success set and adding them to the
failure set. The resulting trace structures are called canonical prefix-closed trace structures.
Definition 15 (Canonical Prefix-Closed Trace Structures). A trace structure T = (In, Out,S,F) is called canonical
if S ∩ F = ∅, F/Out± ⊆ F (see footnote 11) and F ·Sig± ⊆ F . □
For a canonical trace structure, the failure set is completely determined by the success set: F = ((S·In±) −
S)·(Sig±)∗, i.e. failure traces are success traces which are extended with an ‘unexpected’ input and possibly with
additional signal edges. It is therefore sufficient to give (In, Out,S). Clearly, a deterministic STG N without internal
signals can be transformed into the canonical trace structure (In, Out, L(N)), and for the rest of this section we will
identify an STG with its corresponding canonical trace structure.
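The characterisation F = ((S·In±) − S)·(Sig±)∗ yields a direct membership test: a trace is a failure exactly if some prefix extends a success trace by an input edge and thereby leaves S. A minimal Python sketch, assuming S is given as a finite prefix-closed set of tuples of strings such as `"a+"` (the encoding and the function name are illustrative assumptions):

```python
def is_failure(trace, success, inputs):
    """Test whether `trace` lies in ((S . In+-) - S) . (Sig+-)*.

    success: prefix-closed set of tuples (the success traces S)
    inputs:  set of input signal names
    """
    trace = tuple(trace)
    for k in range(1, len(trace) + 1):
        prefix, edge = trace[:k - 1], trace[k - 1]
        unexpected_input = (
            prefix in success
            and edge[:-1] in inputs
            and trace[:k] not in success
        )
        if unexpected_input:
            # Every extension of a failure trace is a failure trace again.
            return True
    return False
```

A trace that leaves S by an output edge is neither a success nor a failure trace in this sketch; it is simply not a possible trace.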
Dill then defines when a system will work well in some environment, and what he requires from an implementation
to work well in all environments in which the specification works well. In the sense of this implementation relation,
each trace structure is equivalent to a canonical one, such that we can restrict attention to these. Finally, Dill
gives a characterisation of his implementation relation for canonical trace structures, and we use this to define the
implementation relation Dill .
Definition 16. Let T1 and T2 be two canonical trace structures. T1 is Dill-implemented by T2, T1 Dill T2, if In1 = In2,
Out1 = Out2, F2 ⊆ F1 and P2 ⊆ P1. □
Since STGs without internal signals can be identified with canonical trace structures, we can apply Dill to such
STGs.
Theorem 17. For deterministic STGs without internal signals we have SV ⊆ Dill .
Proof. Let N1, N2 be two STGs with N1 SV N2. Due to the definition of SV we have In1 = In2 =: In and
Out1 = Out2 =: Out. Furthermore, there exists an STG-bisimulation B between the markings of N1 and N2.
(1) F2 ⊆ F1. Since Fi ·Sigi± ⊆ Fi , it is sufficient to show that all minimal (regarding prefix) failure traces of N2
are also failure traces of N1. Let f ∈ F2 be a minimal failure trace of N2. Clearly, f = sa± with a ∈ In and s ∈ S2.
If s ∈ F1, so is f and we are done. If otherwise s ∈ S1, the existence of B implies by (N1) and (N2) that
MN1 [s〉〉M1 and MN2 [s〉〉M2 (observe that there are no internal signals) with (M1, M2) ∈ B. M1[a±〉〉 would imply
by (N1) M2[a±〉〉 and f ∈ S2, therefore ¬M1[a±〉〉 and f ∈ F1.
If s ∈ S2, let s = s′b ± s′′ with b ∈ Sig and s′ the longest prefix of s with s′ ∈ S1. As above, the existence of
B implies that MN1 [s′〉〉M1 and MN2 [s′〉〉M2. Since s′b± ∈ S2, M2[b±〉〉. Therefore, b cannot be an output signal,
because (C1) would imply that M1[b±〉〉. Hence, b ∈ In and s′b± ∈ F1 implying f ∈ F1.
10 We will frequently omit ‘prefix-closed’.
11 Let X and Y be two sets of strings; the quotient X/Y is defined as {x | ∃y ∈ Y : xy ∈ X}. Actually, F/Out± ⊆ F is equivalent to F/(Out±)∗ ⊆ F.
Fig. 6. Counterexample for the proof of Theorem 21. Panels: N1 (signals a, b, x, y) and N2 (signals a, x).
(2) P2 ⊆ P1. Let p ∈ P2 be a trace of N2. If p ∈ F2, using (1) we get p ∈ F1 ⊆ P1. So assume that p ∈ S2. If
p ∈ S1 we are done, if not let p = p′a ± p′′ with p′ the longest prefix of p with p′ ∈ S1. As above, p′ ∈ S2 and (C1)
implies a ∈ In and therefore p′a± ∈ F1 and p ∈ F1 ⊆ P1. □
We come now to the comparison of SV and CC . As mentioned above we define CC in the spirit of Dill where
now a system works well in an environment when both are I/O-compatible, defined as follows.
Definition 18 (Livelock-Free and I/O-Compatibility). (1) An STG is livelock-free if there is no reachable marking
which enables an infinite sequence of internal edges.
(2) Let N1 and N2 be two livelock-free STGs with In1 = Out2, In2 = Out1 and Int1 ∩ Int2 = ∅. N1 and N2 are I/O-
compatible, denoted N1  N2, if there is a relation R between the markings of N1 and N2 such that (i = 1, 2):
(IO1) (MN1 , MN2 ) ∈ R
(IO2) Receptiveness
(a) If (M1, M2) ∈ R and M1[x±〉〉M ′1 with x ∈ Out1, then M2[x±〉〉M ′2 with (M ′1, M ′2) ∈ R.
(b) vice versa for M2
(IO3) Internal Progress
(a) If (M1, M2) ∈ R and M1[u±〉〉M ′1 with u ∈ Int1, then (M ′1, M2) ∈ R.
(b) vice versa for M2
(IO4) Deadlock-freeness
(a) If (M1, M2) ∈ R and {a ∈ Sig1 | M1[a±〉〉} ⊆ In1, then {a ∈ Sig2 | M2[a±〉〉} ⊄ In2.
(b) vice versa for M2.
The idea is that the two circuits described by N1 and N2 are working together without failures or deadlocks, one
producing the signals which are received by the other one as inputs. (IO2) requires that outputs must be matched
immediately – without preceding internal signals – by the other circuit, and (IO3) means that the components can
produce internal signals unobserved by the other. (IO4) forbids deadlocks, i.e. at least one circuit must activate an
output or internal signal, and because they are livelock-free, eventually an output must be produced.
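The conditions (IO1)–(IO4) can be checked mechanically for a given candidate relation R. The following is a minimal sketch under several assumptions that are not part of the definition: each STG is replaced by a finite deterministic transition system whose labels are plain signal names (rising and falling edges are not distinguished), livelock-freeness is not checked, and (IO4) is read as "if one side enables only inputs, the other side must enable some non-input signal". The class `System` and all state and signal names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class System:
    init: str
    inputs: set
    outputs: set
    internals: set
    trans: dict  # state -> {signal: next_state}; deterministic by construction

    def enabled(self, state):
        return set(self.trans.get(state, {}))

    def step(self, state, signal):
        return self.trans.get(state, {}).get(signal)


def check_side(a, b, rel):
    """Check (IO2)(a), (IO3)(a) and (IO4)(a) with `a` as the first system."""
    for sa, sb in rel:
        for sig, sa_next in a.trans.get(sa, {}).items():
            if sig in a.outputs:
                # (IO2): an output of `a` must be accepted immediately by `b`
                sb_next = b.step(sb, sig)
                if sb_next is None or (sa_next, sb_next) not in rel:
                    return False
            elif sig in a.internals and (sa_next, sb) not in rel:
                # (IO3): an internal step of `a` is not observed by `b`
                return False
        # (IO4): both sides waiting for inputs only would be a deadlock
        if a.enabled(sa) <= a.inputs and b.enabled(sb) <= b.inputs:
            return False
    return True


def io_compatible_via(n1, n2, rel):
    """Return True if `rel` satisfies (IO1)-(IO4) for n1 and n2."""
    if (n1.init, n2.init) not in rel:  # (IO1)
        return False
    flipped = {(s2, s1) for s1, s2 in rel}
    return check_side(n1, n2, rel) and check_side(n2, n1, flipped)


# Hypothetical handshake: the sender emits req and waits for ack.
sender = System("s0", {"ack"}, {"req"}, set(),
                {"s0": {"req": "s1"}, "s1": {"ack": "s0"}})
receiver = System("r0", {"req"}, {"ack"}, set(),
                  {"r0": {"req": "r1"}, "r1": {"ack": "r0"}})
# A receiver that never accepts req violates (IO2).
deaf = System("r0", {"req"}, {"ack"}, set(), {"r0": {}})

rel = {("s0", "r0"), ("s1", "r1")}
print(io_compatible_via(sender, receiver, rel))
print(io_compatible_via(sender, deaf, {("s0", "r0")}))
```

The sketch only verifies a relation that is handed to it; it does not search for one.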
Definition 19 (of CC ). Let N1 and N2 be two livelock-free deterministic STGs. We write N1 CC N2 if for all
livelock-free and deterministic STGs N , N1  N implies N2  N .
The proof of the following theorem can be found in Appendix B.
Theorem 20. For deterministic livelock-free STGs: SV ⊆ CC.
Finally, we prove that the inclusions of Theorems 17 and 20 are strict.
Theorem 21. For deterministic STGs without internal signals (which are therefore livelock-free) we have: (1)
CC ⊄ SV, (2) Dill ⊄ SV.
Proof. Consider the counterexample in Fig. 6. To keep it simple we do not consider signal edges, which makes no
difference for the purpose of this proof.
(1) Obviously, ¬N1 SV N2, because in the initial marking N1 activates the output y which is not activated in N2,
violating Condition (N2).
On the other hand N1 CC N2: consider an STG N with N  N1. In the initial state, N must be ready to receive
the signals x and y as inputs due to condition (IO2). After x (y resp.) occurred, N1 only activates the input a (b resp.)
and (IO4) implies that N has to activate an output. This output can only be a (b resp.) because any other signal would
violate condition (IO2). Therefore the reachability graph of N is bisimilar (see [13]) to the one of N1. It is therefore
sufficient to show N̄1  N2, where N̄1 denotes the mirror of N1, i.e. the STG which is exactly the same as N1 but
with inputs and outputs exchanged.
In the initial state, N2 only activates the output x which is expected by N̄1. Then N̄1 activates the output a which
is expected by N2. Observe that it is not relevant that N̄1 activates the input y in the initial state. In this example, it is
essential that N1 has a dynamic conflict between outputs.
(2) The Universal Do-Nothing Module U is a system which accepts all inputs and produces no output. The
corresponding canonical trace structure U over (In, Out) is given by SU = (In±)∗ for arbitrary In and Out.
Clearly, ¬N2 SV U . On the other hand, for any STG N it is valid that N Dill U ([7]): by Definition 15 we get
FU = ((SU ·In±) − SU )·(Sig±)∗ = ∅·(Sig±)∗ = ∅ ⊆ FN . Consider now a trace p ∈ PU = (In±)∗. If p ∈ SN , we
are done. Otherwise, let p = p′a±p′′, such that p′ is the longest prefix of p with p′ ∈ SN. Since a is an input, p′a±
and therefore p are also failure traces.
It is essential for this counterexample that Dill requires an implementation to produce only allowed outputs, but
does not require it to produce all of them. □
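The step showing that the Universal Do-Nothing Module has no failure traces can be spelled out as follows. This is only a restatement of the computation in part (2) of the proof, using S_U = (In±)∗ and the failure-set formula quoted there from Definition 15.

```latex
\begin{align*}
S_U \cdot \mathit{In}^{\pm}
  &= (\mathit{In}^{\pm})^{*} \cdot \mathit{In}^{\pm}
   \subseteq (\mathit{In}^{\pm})^{*} = S_U, \\
(S_U \cdot \mathit{In}^{\pm}) - S_U &= \emptyset, \\
F_U &= \bigl((S_U \cdot \mathit{In}^{\pm}) - S_U\bigr) \cdot (\mathit{Sig}^{\pm})^{*}
     = \emptyset \cdot (\mathit{Sig}^{\pm})^{*} = \emptyset \subseteq F_N .
\end{align*}
```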
8. Conclusion
We have generalised the correctness definition of [16,15] to decompositions of STGs with internal signals and
proven that speed-independent CSC-solving as performed by PETRIFY is correct. We have shown that the new
correctness is preserved in a top-down decomposition, and this result has a number of consequences: now we can
use step-wise decomposition in the algorithm of [16,15] to improve efficiency as it is described in [14], and we know
that this algorithm in combination with speed-independent CSC-solving gives correct results. Applying the correctness
definition to compare two STGs, we get an implementation relation, and consequences of our result are that this is a
preorder and, with a small restriction, a precongruence for parallel composition, relabelling and hiding.
As another application of the correctness definition, we have shown that a decomposition method based on integer
linear programming [3] is correct. It remains an open problem whether a related method in [18] is correct: while
the first method checks on the original STG to be decomposed whether a set of signals is a CSC-support and in the
positive case removes the other signals, the related method removes some signals and checks CSC on the remaining
STG. There are examples where the remaining STGs have CSC, but the decomposition is – also intuitively – clearly
incorrect.12 This does not show that the method in [18] is incorrect, since there is the additional assumption that the
so-called trigger signals of each output are kept; this assumption is therefore essential for correctness, and the proof
of the latter might be difficult.
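The Muller-C-element example of footnote 12 can be replayed with a few lines of code. This is a minimal sketch, assuming the usual next-state function of a C-element (the output rises when both inputs are high, falls when both are low, and otherwise holds); the function names are hypothetical.

```python
def c_element_next(a, b, x):
    """Next output of a C-element: rises if a = b = 1, falls if a = b = 0, else holds."""
    return (a & b) | (x & (a | b))


def reduced_component_next(a, x):
    """Output of the component obtained by removing b: x simply follows a."""
    return a


# After a+ alone (b still low) the specification keeps x low ...
spec_x = c_element_next(1, 0, 0)
# ... while the reduced component already raises x.
reduced_x = reduced_component_next(1, 0)
print(spec_x, reduced_x)
```

The reduced component has CSC, since a and x simply alternate, yet it produces x+ in a situation where the original specification does not.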
Finally, we compared our implementation relation with the one derived from the notion of I/O-compatibility in [2]
and the one defined by Dill. We have shown that our implementation relation is strictly stronger than the latter ones.
As described in the discussion of the new correctness definition, we allow signals to be hidden on the level of the
parallel composition to enable internal inter-component communication. A future research topic is to enhance our
decomposition algorithm in such a way that these signals are introduced appropriately.
Acknowledgement
Mark Schaefer is supported by the DFG-project ‘STG-Dekomposition’ Vo615/7-1.
Appendix A. Examples for CSC-solving and Theorem 13
In this section, we give an example for CSC-solving of a component by input-proper event insertions. Doing so we
show how the corresponding STG-bisimulations are combined to get one for the new decomposition fulfilling CSC.
Consider again the VME bus controller from Fig. 1, which is shown with all places in Fig. A.1; their names are
needed later on for the STG-bisimulations. Recall that VME itself does not fulfill CSC.
12 Consider a Muller-C-element [12] with inputs a and b and output x; removing b gives a cycle where a and x alternate; this single component
clearly has CSC, but it generates x without waiting for b.
Fig. A.1. VME bus controller. (Figure not reproduced; the STG has the transitions dsr+, dsr-, dtack+, dtack-, lds+, lds-, ldtack+, ldtack-, d+, d- and the places p1 to p11.)
Fig. A.2. Decomposition of VME bus controller into two components C1 and C2. (Figure not reproduced; C1 has the transitions dsr+, dsr-, lds+, lds-, ldtack+, ldtack-, d+, d- and the places p1 to p9; C2 has the transitions dtack+, dtack-, d+, d- and the places p1 to p4.)
Fig. A.3. STG-bisimulation B for the decomposition of Fig. A.2. Elements have the form: (Marking of VME, Marking of C1, Marking of C2). (Arrows and their ∗ labels are not reproduced.) The elements of B are:
({p3, p6}, {p2, p4}, {p4})
({p3, p7}, {p2, p5}, {p4})
({p3, p4}, {p2, p3}, {p4})
({p1, p6}, {p2, p4}, {p1})
({p1, p7}, {p2, p5}, {p1})
({p1, p4}, {p2, p3}, {p1})
({p2, p6}, {p1, p4}, {p1})
({p2, p7}, {p1, p5}, {p1})
({p2, p4}, {p1, p3}, {p1})
({p5}, {p6}, {p1})
({p9}, {p9}, {p1})
({p11}, {p8}, {p2})
({p10}, {p8}, {p3})
({p8}, {p7}, {p3})
In Fig. A.2 a possible decomposition of VME into two components C1 and C2 is shown, which was generated by
our tool DESIJ [14]. The components C1 and C2 are correct decompositions (with empty hiding) of VME due to the
STG-bisimulation B shown in Fig. A.3. An arrow between elements b1, b2 ∈ B denotes b1 ∈ B ⇒ b2 ∈ B due to
Definition 2; the arrow is labelled with ∗ if item (N1) was applied, and it is unlabelled if item (N2), (C1) resp. were
applied.
C1 also has a CSC conflict, which can be resolved, yielding C′1 in Fig. A.4, by adding one new internal signal csc
in the form of two new events csc+, csc− as described in Definition 7. C′1 is a correct implementation of C1 due to
STG-bisimulation B′ shown in Fig. A.5 for the following reasons.
Let Ĉ1 be the result of the first event insertion with the corresponding marking relation M1; by Theorem 10, Ĉ1
is a correct implementation of C1 and in the proof it is shown that M1 is the corresponding STG-bisimulation. For
the second event insertion, we get analogously that M2 is an STG-bisimulation for Ĉ1 and C′1, and in the proof
of Theorem 3, it is shown that B′ = M1 ◦ M2 is an STG-bisimulation for C1 and C′1. The STG-bisimulation B′′
(Fig. A.6) for the correct decomposition (C′1, C2) of VME, which fulfills CSC, can be derived from B and B′ as
described in this proof, too.
Fig. A.4. Component C′1 obtained with PETRIFY from C1 by input-proper insertion of the new internal signal csc according to Definition 7. (Figure not reproduced; C′1 has the transitions of C1 together with csc+ and csc-, and the places p1 to p9, p′1 and p′6.)
Fig. A.5. STG-bisimulation B′ for C1 and C′1. (Figure not reproduced.) The elements of B′ are:
({p2, p4}, {p2, p4})
({p1, p4}, {p1, p4})
({p2, p5}, {p2, p5})
({p1, p5}, {p1, p5})
({p2, p3}, {p2, p3})
({p1, p3}, {p1, p3})
({p1, p3}, {p′1})
({p6}, {p6})
({p9}, {p9})
({p8}, {p8})
({p7}, {p7})
({p7}, {p′6})
In Fig. A.5, the two elements b′1, b′2 of B′ not belonging to the identity relation over [MC1〉〉 are shaded. In Fig. A.6,
the elements of B′′ not belonging to B are shaded; they correspond to b′1 and b′2.
The STG VME can also be used as an example for the correctness of the ILP decomposition approach from
Section 6. In order to apply this method to VME, one first has to solve CSC, which results in the STG VME′ in
Fig. A.7. Consider now the decomposition (C′1, C2): clearly, both components are deterministic and have CSC. With
some reflection, one can also see that their languages are L(VME′) projected onto Sig′1, Sig2 resp. Hence, (C′1, C2)
fulfills the conditions stated before Theorem 13, which therefore implies that (C′1, C2) is a correct decomposition of
VME′.
This can also be seen as follows: Theorem 10 implies that VME′ is implemented by VME, of which (C′1, C2)
is a correct decomposition (see above and Fig. A.6); applying Theorem 3 yields that (C′1, C2) is also a correct
decomposition of VME′.
Appendix B. Proofs
Proof of Theorem 3.1. Let $C = (\|_{i\in I} C_i)/H_C$, $C_K = (\|_{k\in K} C_k)/H_K$ and $C' = (\|_{i\in I'} C_i)/H$, where $H := H_C \cup H_K$.
Without loss of generality assume $I = \{1, 2, \ldots, |I|\}$, $j = |I|$ and $K = \{|I|+1, |I|+2, \ldots, |I|+|K|\}$.
We will write $\mathit{Out}_i$ for $\mathit{Out}_{C_i}$ etc.
First, we show that the parallel composition of $(C_i)_{i\in I'}$ is defined. Obviously, $\mathit{Loc}_i \cap \mathit{Loc}_{i'} = \emptyset$ for different $i, i'$ with
either $i, i' \in I \setminus \{j\}$ or $i, i' \in K$, because $\|_{i\in I} C_i$ and $\|_{k\in K} C_k$ are defined. Therefore let $k \in K$, $i \in I \setminus \{j\}$; then
$\mathit{Loc}_k \cap \mathit{Loc}_i = (\mathit{Int}_k \cup \mathit{Out}_k) \cap \mathit{Loc}_i = (\mathit{Int}_k \cap \mathit{Loc}_i) \cup (\mathit{Out}_k \cap \mathit{Loc}_i) = \emptyset \cup (\mathit{Out}_k \cap \mathit{Loc}_i)$ by the assumption about $\mathit{Int}_k$, and
$\mathit{Out}_k \cap \mathit{Loc}_i \subseteq (\mathit{Out}_j \cup H_K) \cap \mathit{Loc}_i = (\mathit{Out}_j \cap \mathit{Loc}_i) \cup (H_K \cap \mathit{Loc}_i) \subseteq (\mathit{Loc}_j \cap \mathit{Loc}_i) \cup (H_K \cap \mathit{Loc}_i) = \emptyset$ by the
assumption about $H_K$ and because $\|_{i\in I} C_i$ is defined.
For $i, i'$ as above, $\mathit{Int}_i \cap \mathit{In}_{i'} = \emptyset$. Let therefore $i, k$ be as above, then $\mathit{In}_k \cap \mathit{Int}_i \subseteq \mathit{In}_j \cap \mathit{Int}_i = \emptyset$ and $\mathit{Int}_k \cap \mathit{In}_i = \emptyset$
by the assumptions.
Fig. A.6. STG-bisimulation B′′ for VME and (C′1, C2). (Figure not reproduced.) The elements of B′′ are:
({p3, p6}, {p2, p4}, {p4})
({p3, p7}, {p2, p5}, {p4})
({p3, p4}, {p2, p3}, {p4})
({p1, p6}, {p2, p4}, {p1})
({p1, p7}, {p2, p5}, {p1})
({p1, p4}, {p2, p3}, {p1})
({p2, p6}, {p1, p4}, {p1})
({p2, p7}, {p1, p5}, {p1})
({p2, p4}, {p1, p3}, {p1})
({p2, p4}, {p′1}, {p1})
({p5}, {p6}, {p1})
({p9}, {p9}, {p1})
({p11}, {p8}, {p2})
({p10}, {p8}, {p3})
({p8}, {p7}, {p3})
({p8}, {p′6}, {p3})
Fig. A.7. VME′: VME with CSC. (Figure not reproduced; the STG has the transitions dsr+, dsr-, dtack+, dtack-, lds+, lds-, ldtack+, ldtack-, d+, d-, csc+, csc-.)
Next, we show the requirements for the sets of output and input signals.
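\[
\begin{aligned}
\mathit{Out}_{C'} &= \bigcup_{i\in I'} \mathit{Out}_i \setminus H
 = \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{Out}_i \cup \bigcup_{k\in K} \mathit{Out}_k\Bigr) \setminus H \\
&= \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{Out}_i \cup \Bigl(\bigcup_{k\in K} \mathit{Out}_k \setminus H_K\Bigr)\Bigr) \setminus H_C \\
&= \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{Out}_i \cup \mathit{Out}_j\Bigr) \setminus H_C
 = \bigcup_{i\in I} \mathit{Out}_i \setminus H_C \subseteq \mathit{Out}_N ,
\end{aligned}
\]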
where the third equality holds by the second assumption on HK .
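The chain of equalities for the output signals can be replayed on a small instance. This is only a sketch on made-up data: all signal names and index values are hypothetical, and the two assumptions it relies on (the visible outputs of the refinement of C_j coincide with Out_j, and H_K hides no output of the other components) are our reading of the assumptions on H_K used in the proof.

```python
def union(sets):
    """Union of a collection of sets."""
    return set().union(*sets)


def outputs_original(out_others, out_j, h_c):
    """Out_C: outputs of all C_i with i in I, minus the hidden signals H_C."""
    return (union(out_others) | out_j) - h_c


def outputs_refined(out_others, out_refinement, h_c, h_k):
    """Out_C': outputs of C_i (i != j) and of all C_k, minus H = H_C u H_K."""
    return union(list(out_others) + list(out_refinement)) - (h_c | h_k)


# Hypothetical toy instance.
out_others = [{"x"}, {"y"}]        # Out_i for i in I \ {j}
out_j = {"z", "w"}                 # Out_j of the component that is refined
out_refinement = [{"z", "h"}, {"w"}]  # Out_k for k in K
h_k = {"h"}                        # hidden inside the refinement of C_j
h_c = {"y"}                        # hidden in the original decomposition

# Assumed reading of the conditions on H_K (see the lead-in).
assert union(out_refinement) - h_k == out_j
assert not (h_k & union(out_others))

print(outputs_refined(out_others, out_refinement, h_c, h_k)
      == outputs_original(out_others, out_j, h_c))
```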
For the input signals, we have
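\[
\mathit{In}_C = \bigcup_{i\in I} \mathit{In}_i \setminus \bigcup_{i\in I} \mathit{Out}_i \subseteq \mathit{In}_N
\quad\text{and}\quad
\bigcup_{k\in K} \mathit{In}_k \setminus \bigcup_{k\in K} \mathit{Out}_k \subseteq \mathit{In}_j .
\]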
It follows that
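\[
\begin{aligned}
\mathit{In}_{C'} &= \bigcup_{i\in I'} \mathit{In}_i \setminus \bigcup_{i\in I'} \mathit{Out}_i
 = \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{In}_i \cup \bigcup_{k\in K} \mathit{In}_k\Bigr) \setminus \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{Out}_i \cup \bigcup_{k\in K} \mathit{Out}_k\Bigr) \\
&\subseteq \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{In}_i \cup \mathit{In}_j\Bigr) \setminus \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{Out}_i \cup \bigcup_{k\in K} \mathit{Out}_k\Bigr) \\
&\overset{(*)}{\subseteq} \bigcup_{i\in I} \mathit{In}_i \setminus \Bigl(\bigcup_{i\in I\setminus\{j\}} \mathit{Out}_i \cup \mathit{Out}_j\Bigr)
 = \mathit{In}_C \subseteq \mathit{In}_N .
\end{aligned}
\]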
The inclusion (∗) might fail if we had only (⋃k∈K Outk) \ HK ⊆ Out j . Observe that we do not need to consider
hiding here.
We proceed with the main part of this proof: the requirements for an STG-bisimulation between the markings of
N and C ′. Let B1 be the STG-bisimulation between the markings of N and C and B2 the one between the markings
of C j and CK . We define a relation B between the markings of N and C ′ as follows:
\[
(M_1, M_2, M_3) \in B \iff (M_1, M_2, M_j) \in B_1 \text{ and } (M_j, M_3) \in B_2 \text{ for some } M_j ,
\]
where $M_1$ denotes a marking of $N$, $M_2$ a marking of $C_1 \| \ldots \| C_{|I|-1}$, $M_j$ one of $C_j$ and $M_3$ a marking of
$C_{|I|+1} \| \ldots \| C_{|I|+|K|}$; thus $(M_2, M_j)$ can be regarded as a marking of $C$ and $(M_2, M_3)$ as one of $C'$ and we write
$(M_1, M_2, M_3)$ instead of $(M_1, (M_2, M_3))$ etc. We will show that $(C_i)_{i\in I'}$ is a correct decomposition of $N$ when hiding
$H$ due to STG-bisimulation $B$.
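The construction of B from B1 and B2 is a relational composition over the shared marking of C_j. A minimal sketch, assuming that markings are represented by hashable values (here made-up strings); the function name `compose` and the sample relations are hypothetical.

```python
def compose(b1, b2):
    """B = {(m1, m2, m3) | (m1, m2, mj) in b1 and (mj, m3) in b2 for some mj}."""
    return {
        (m1, m2, m3)
        for (m1, m2, mj) in b1
        for (mj_other, m3) in b2
        if mj == mj_other  # the marking of C_j links the two relations
    }


# Hypothetical relations: b1 relates markings of N with markings of C,
# b2 relates markings of C_j with markings of its refinement C_K.
b1 = {("n0", "c0", "j0"), ("n1", "c1", "j1")}
b2 = {("j0", "k0"), ("j1", "k1"), ("j1", "k1_internal")}

for element in sorted(compose(b1, b2)):
    print(element)
```

One marking of C_j may be related to several markings of the refinement, for instance before and after an internal step; each of them yields its own element of B.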
(1): Obviously fulfilled by definition of B.
(2): Let (M1, M2, M3) ∈ B due to (M1, M2, M j ) ∈ B1 and (M j , M3) ∈ B2 for some marking M j of C j .
(N1): Let a ∈ InN and M1[a±〉〉M̂1.
1. Let a ∈ InC′ ⊆ InC (see above) and therefore (B1) (M2, Mj)[a±〉〉(M̂2, M̂j) and (M̂1, M̂2, M̂j) ∈ B1 for some
(M̂2, M̂j).
(a) If a ∉ Inj we get Mj = M̂j and immediately (M2, M3)[a±〉〉(M̂2, M3) with (M̂1, M̂2, M3) ∈ B.
(b) a ∈ Inj with Mj[a±〉〉M̂j implies (B2) either a ∈ InK and M3[a±〉〉M̂3, (M̂j, M̂3) ∈ B2 for some M̂3 or
a ∉ InK and (M̂j, M3) ∈ B2.
In the first case, we get (M2, M3)[a±〉〉(M̂2, M̂3) with (M̂1, M̂2, M̂3) ∈ B.
In the latter one we get (M2, M3)[a±〉〉(M̂2, M3) with (M̂1, M̂2, M3) ∈ B.
2. Let a ∉ InC′. There are two reasons for this:
(a) a ∉ InC. Then, (M̂1, M2, Mj) ∈ B1 and by definition (M̂1, M2, M3) ∈ B.
(b) a ∈ Inj, but a ∉ Ini for i ≠ j and a ∉ InK (a belongs only to Cj). Therefore, a ∈ InC and
(M2, Mj)[a±〉〉(M2, M̂j) with (M̂1, M2, M̂j) ∈ B1, since a cannot be a signal of any other component. a ∉ InK
implies (M̂j, M3) ∈ B2 and by definition (M̂1, M2, M3) ∈ B.
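(N2): Let $x \in \mathit{Out}_N$ and $M_1[x\pm\rangle\rangle\hat{M}_1$.
Then (B1) $(M_2, M_j)[vx\pm\rangle\rangle(\hat{M}_2, \hat{M}_j)$ and $(\hat{M}_1, \hat{M}_2, \hat{M}_j) \in B_1$ for some $(\hat{M}_2, \hat{M}_j)$ and $v \in (\mathit{Int}_C\pm)^*$. Let
$v' = vx\pm = v_1 v_2 \ldots v_n$ with $v_i \in \mathit{Sig}_C\pm$ for $i = 1, \ldots, n$ and
\[
(M_2, M_j) = (M_2^0, M_j^0)[v_1\rangle\rangle(M_2^1, M_j^1)[v_2\rangle\rangle(M_2^2, M_j^2) \ldots (M_2^{n-2}, M_j^{n-2})[v_{n-1}\rangle\rangle(M_2^{n-1}, M_j^{n-1})[v_n\rangle\rangle(M_2^n, M_j^n) = (\hat{M}_2, \hat{M}_j).
\]
We will show by induction over $m \in \{0, \ldots, n\}$ that
\[
\exists\, w_m \in (\mathit{Int}_{C'}\pm)^*\{x\pm, \lambda\},\ M_3^m :\ (M_2, M_3)[w_m\rangle\rangle(M_2^m, M_3^m)
\ \wedge\ w_m{\downarrow}_{\mathit{Ext}_{C'}} = (v_1 \ldots v_m){\downarrow}_{\mathit{Ext}_C}
\ \wedge\ (M_j^m, M_3^m) \in B_2 .
\]
Observe that the case $m = n$ proves our claim.
For $m = 0$ let $M_3^0 = M_3$ and $w_0 = \lambda$. By assumption $(M_j^0, M_3^0) \in B_2$.
Now let $m < n$ and $(M_2, M_3)[w_m\rangle\rangle(M_2^m, M_3^m)$, $w_m{\downarrow}_{\mathit{Ext}_{C'}} = (v_1 \ldots v_m){\downarrow}_{\mathit{Ext}_C}$,
$(M_j^m, M_3^m) \in B_2$ and $(M_2^m, M_j^m)[v_{m+1}\rangle\rangle(M_2^{m+1}, M_j^{m+1})$. We distinguish several cases, where in items (3)–(5) we
have either $v_{m+1} \in H_C$ or $v_{m+1} = v_n = x\pm$:
1. $v_{m+1} \in \mathit{Int}_i$ for $i \in I \setminus \{j\}$ ⇒ $M_j^{m+1} = M_j^m$, $M_3^{m+1} = M_3^m$, $w_{m+1} = w_m v_{m+1}$ and $(M_j^{m+1}, M_3^{m+1}) \in B_2$.
2. $v_{m+1} \in \mathit{Int}_j$ ⇒ $M_2^{m+1} = M_2^m$, $M_3^m[w'_m\rangle\rangle M_3^{m+1}$, $w'_m \in (\mathit{Int}_K\pm)^*$, $w_{m+1} = w_m w'_m$ and $(M_j^{m+1}, M_3^{m+1}) \in B_2$.
3. $v_{m+1} \in \mathit{Out}_i$ for $i \in I \setminus \{j\}$ and $v_{m+1} \notin \mathit{In}_j$: ref. item (1).
4. $v_{m+1} \in \mathit{Out}_i$ for $i \in I \setminus \{j\}$ and $v_{m+1} \in \mathit{In}_j$: (N1) implies
(a) $v_{m+1} \in \mathit{In}_K$ ⇒ $M_3^m[v_{m+1}\rangle\rangle M_3^{m+1}$ with $(M_j^{m+1}, M_3^{m+1}) \in B_2$ and $w_{m+1} = w_m v_{m+1}$.
(b) $v_{m+1} \notin \mathit{In}_K$ ⇒ $M_3^{m+1} = M_3^m$ with $(M_j^{m+1}, M_3^{m+1}) \in B_2$ and $w_{m+1} = w_m v_{m+1}$.
5. $v_{m+1} \in \mathit{Out}_j$: this implies $M_3^m[w'_{m+1} v_{m+1}\rangle\rangle M_3^{m+1}$ with $w'_{m+1} \in (\mathit{Int}_K\pm)^*$, $(M_j^{m+1}, M_3^{m+1}) \in B_2$ and $w_{m+1} = w_m w'_{m+1} v_{m+1}$.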
(N3): Let u ∈ IntN and M1[u±〉〉M̂1. Therefore (M2, Mj)[v〉〉(M̂2, M̂j) with (M̂1, M̂2, M̂j) ∈ B1 and v ∈
(IntC±)∗ for some (M̂2, M̂j). At this point the proof can be continued analogously to the previous item.
(C1): Let x ∈ OutC′ and (M2, M3)[x±〉〉(M̂2, M̂3).
1. If x ∈ OutK \ HK = Outj it follows that Mj[vx±〉〉M̂j, (M̂j, M̂3) ∈ B2 for some M̂j and v ∈ (Intj±)∗.
Let Mj[v〉〉M′j[x±〉〉M̂j; then (M2, Mj)[v〉〉(M2, M′j) and by (C3) M1[w1〉〉M′1 with w1 ∈ (IntN±)∗ and
(M′1, M2, M′j) ∈ B1 for some M′1.
Since (M2, M′j)[x±〉〉(M̂2, M̂j) (where M2[x±〉〉M̂2 or M̂2 = M2), we get by (C1) for B1 that M′1[w2x±〉〉M̂1
and (M̂1, M̂2, M̂j) ∈ B1 for some M̂1 and w2 ∈ (IntN±)∗. Altogether, we get that M1[w1w2x±〉〉M̂1 with
w1w2 ∈ (IntN±)∗ and (M̂1, M̂2, M̂3) ∈ B.
2. If x ∉ OutK \ HK = Outj, there exists an m ∈ I \ {j} such that x ∈ Outm ⊆ OutC.
(a) x ∉ Inj implies that neither M3 nor Mj are changed when firing x±: M̂3 = M3 and (M2, Mj)[x±〉〉(M̂2, Mj);
we get directly (B1) M1[wx±〉〉M̂1, (M̂1, M̂2, Mj) ∈ B1 for some M̂1 and w ∈ (IntN±)∗, and by definition of
B: (M̂1, M̂2, M̂3) ∈ B.
(b) If x ∈ Inj then Mj[x±〉〉M̂j by (C2) for B1 and by (N1)
i. x ∉ InK, M̂3 = M3 and (M̂j, M3) ∈ B2.
ii. x ∈ InK and (M̂j, M̂3) ∈ B2.
In both cases, (M2, Mj)[x±〉〉(M̂2, M̂j) and M1[wx±〉〉M̂1 with w ∈ (IntN±)∗ and (M̂1, M̂2, M̂j) ∈ B1, hence
(M̂1, M̂2, M̂3) ∈ B.
(C2): Let x ∈ Outm, m ∈ I′ and (M2, M3)|Pm [x±〉〉.
1. x ∈ Outm for some m ∈ I \ {j} and M2|Pm [x±〉〉. Then (B1) (M2, Mj)[x±〉〉 and therefore (M2, M3)[x±〉〉,
because either x ∉ Inj and x ∉ Sigk for k ∈ K or x ∈ Inj, and by (N1) for B2 either M3[x±〉〉 or x ∉ Sigk, k ∈ K.
2. x ∈ Outm for some m ∈ K and M3|Pm [x±〉〉. Hence, M3[x±〉〉 by (C2) for B2.
(a) If x ∈ OutK, then Mj[v〉〉M′j[x±〉〉 by (C1) for B2 for some v ∈ (Intj±)∗ and M′j. Since Cj can fire
its internal signals without changing the state of the other components, we get (M2, Mj)[v〉〉(M2, M′j) and
(M′1, M2, M′j) ∈ B1 for some M′1. Applying (C2) for B1 we get (M2, M′j)[x±〉〉 and therefore (M2, M3)[x±〉〉,
too.
(b) If x ∈ IntK, i.e. x ∈ HK, then M3[x±〉〉 gives immediately that (M2, M3)[x±〉〉.
(C3): Let u ∈ IntC′ and (M2, M3)[u±〉〉(M̂2, M̂3).
1. If u ∈ Inti for i ∈ I \ {j}, then M̂3 = M3, (M2, Mj)[u±〉〉(M̂2, Mj) and by (C3) for B1: M1[v〉〉M̂1, v ∈ (IntN±)∗
and (M̂1, M̂2, Mj) ∈ B1 for some M̂1 and therefore (M̂1, M̂2, M̂3) ∈ B.
2. If u ∈ Intk for k ∈ K, then M̂2 = M2 and by (C3) for B2: Mj[v〉〉M̂j, v ∈ (Intj±)∗ and (M̂j, M̂3) ∈ B2 for
some M̂j. Let v = v1v2 . . . vn, vi ∈ Intj± and $M_j = M_j^0[v_1\rangle\rangle M_j^1 \ldots M_j^{n-1}[v_n\rangle\rangle M_j^n = \hat{M}_j$. By (C3) for B1 we get
that $M_1 = M_1^0[w_1\rangle\rangle M_1^1[w_2\rangle\rangle \ldots M_1^{n-1}[w_n\rangle\rangle M_1^n = \hat{M}_1$ with wi ∈ (IntN±)∗ and $(M_1^m, \hat{M}_2, M_j^m) \in B_1$ for every
m = 0, . . . , n. In particular, M1[w1 . . . wn〉〉M̂1, (M̂1, M̂2, M̂j) ∈ B1 and therefore (M̂1, M̂2, M̂3) ∈ B.
3. If u ∈ Outi ∩ HC for i ∈ I \ {j}:
(a) u ∈ Inj. Then (M2, Mj)[u±〉〉(M̂2, M̂j) and by (C3) for B1: M1[v〉〉M̂1, v ∈ (IntN±)∗ and (M̂1, M̂2, M̂j) ∈ B1;
(N1) for B2 implies either M3[u±〉〉M̂3 or u ∉ InK and M̂3 = M3; in both cases (M̂j, M̂3) ∈ B2 and it follows
that (M̂1, M̂2, M̂3) ∈ B.
(b) u ∉ Inj. Analogous to item (1).
4. If u ∈ Outk ∩ HC for k ∈ K, then M3[u±〉〉M̂3 and by (C1) for B2 Mj[vu±〉〉M̂j and (M̂j, M̂3) ∈ B2 for
v ∈ (Intj±)∗. Furthermore, (M2, Mj)[vu±〉〉(M̂2, M̂j) and by (C3) for B1 we get M1[w〉〉M̂1, w ∈ (IntN±)∗ and
(M̂1, M̂2, M̂j) ∈ B1 (with M̂2 = M2 if u ∉ Inm for m ∈ I \ {j}). It follows that (M̂1, M̂2, M̂3) ∈ B.
5. If u ∈ Outk ∩ HK for k ∈ K, then by (C3) for B2: Mj[v〉〉M̂j, v ∈ (Intj±)∗ and (M̂j, M̂3) ∈ B2 and M̂2 = M2.
Thus, (M2, Mj)[v〉〉(M̂2, M̂j) and by (C3) for B1: M1[w〉〉M̂1, w ∈ (IntN±)∗ and (M̂1, M̂2, M̂j) ∈ B1. Therefore
by definition of B, (M̂1, M̂2, M̂3) ∈ B. □
Proof of Theorem 20. Let N1 SV N2 due to STG-bisimulation B and let N be an arbitrary STG with N  N1 due
to relation R. We will show that N  N2 due to relation R′ = R ◦ B. In the following, (M, M2) ∈ R′ therefore
implies the existence of some marking M1 of N1 with (M, M1) ∈ R and (M1, M2) ∈ B.
(IO1) Obviously, (MN, MN2) ∈ R′.
(IO2) Receptiveness
(a) Let (M, M2) ∈ R′ and M[x±〉〉M̂ for x ∈ OutN. This implies x ∈ InN1 and M1[x±〉〉M̂1 with
(M̂, M̂1) ∈ R; furthermore (N1) for B implies M2[x±〉〉M̂2 with (M̂1, M̂2) ∈ B. By definition of R′,
we also have (M̂, M̂2) ∈ R′.
(b) Let (M, M2) ∈ R′ and M2[x±〉〉M̂2 for x ∈ OutN2. This implies by (C1) for B: M1[v〉〉M′1[x±〉〉M̂1,
v ∈ (IntN1±)∗ and (M̂1, M̂2) ∈ B. By (IO3)(b) for R we know that (M, M′1) ∈ R, and by (IO2)(b) for R we
get M[x±〉〉M̂ and (M̂, M̂1) ∈ R; hence (M̂, M̂2) ∈ R′.
(IO3) Internal Progress
(a) Let (M, M2) ∈ R′ and M[u±〉〉M̂ for u ∈ IntN. We get immediately (M̂, M1) ∈ R and (M̂, M2) ∈ R′.
(b) Let (M, M2) ∈ R′ and M2[u±〉〉M̂2 for u ∈ IntN2. By (C3) for B we get M1[v〉〉M̂1, (M̂1, M̂2) ∈ B and
v ∈ (IntN1±)∗. Repeated application of (IO3)(b) for R then implies (M, M̂1) ∈ R and (M, M̂2) ∈ R′.
(IO4) Deadlock-freeness
(a) Let (M, M2) ∈ R′ and {a | M[a±〉〉} ⊆ InN. If M2[u±〉〉 for some u ∈ IntN2 we are done. So assume that
this is not the case.
By (IO4)(a) for R, we have that {s | M1[s±〉〉} ⊄ InN1. Let v ∈ (IntN1±)∗ be a maximal sequence (w.r.t.
prefix) such that M1[v〉〉M̂1; v exists because N1 is livelock-free. (IO3)(b) for R implies (M, M̂1) ∈ R and
(N3) for B implies (M̂1, M2) ∈ B by assumption. As above (IO4)(a) implies {s | M̂1[s±〉〉} ⊄ InN1; since v
is maximal, there exists a signal x ∈ OutN1 with M̂1[x±〉〉 and by (N2) and assumption we get: M2[x±〉〉.
(b) Let (M, M2) ∈ R′ and {a | M2[a±〉〉} ⊆ InN2 (∗).
This implies that no output signal is enabled at M1, since otherwise a local signal would be enabled at
M2. As in the previous item, let v ∈ (IntN1±)∗ be a maximal sequence (w.r.t. prefix) such that M1[v〉〉M̂1.
Then (∗) and (N3) for B imply (M̂1, M2) ∈ B, and (IO3)(b) for R implies (M, M̂1) ∈ R. As above, in
M̂1 no output signal is enabled and since v is maximal, no internal signal either, i.e. {a | M̂1[a±〉〉} ⊆ InN1
and due to (IO4)(b): {a | M[a±〉〉} ⊄ InN. □
References
[1] Josep Carmona, Structural methods for the synthesis of well-formed concurrent specifications, Ph.D. Thesis, Universitat Politècnica de
Catalunya, 2003.
[2] J. Carmona, J. Cortadella, Input/output compatibility of reactive systems, in: Formal Methods in Computer-Aided Design, FMCAD 2002,
Portland, USA, in: Lect. Notes Comp. Sci., vol. 2517, Springer, 2002, pp. 360–377.
[3] J. Carmona, J. Cortadella, ILP models for the synthesis of asynchronous control circuits, in: Proc. of the IEEE/ACM International Conference
on Computer Aided Design, 2003, pp. 818–825.
[4] T.-A. Chu, Synthesis of self-timed VLSI circuits from graph-theoretic specifications, in: IEEE Int. Conf. Computer Design ICCD ’87, 1987,
pp. 220–223.
[5] J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno, A. Yakovlev, Petrify: A tool for manipulating concurrent specifications and
synthesis of asynchronous controllers, IEICE Transactions on Information and Systems, E80-D 3 (1997) 315–325.
[6] J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno, A. Yakovlev, Logic Synthesis of Asynchronous Controllers and Interfaces,
Springer, 2002.
[7] D. Dill, Trace Theory for Automatic Hierarchical Verification of Speed-Independent Circuits, MIT Press, Cambridge, 1988.
[8] J. Ebergen, Arbiters: An exercise in specifying and decomposing asynchronously communicating components, Science of Computer
Programming 18 (1992) 223–245.
[9] F. Garc-Vall, J.M. Colom, Structural analysis of signal transition graphs, in: Petri Nets in System Engineering, 1997.
[10] A. Kondratyev, M. Kishinevsky, A. Taubin, Synthesis method in self-timed design. Decompositional approach, in: IEEE Int. Conf. VLSI and
CAD, 1993, pp. 324–327.
[11] V. Khomenko, M. Koutny, A. Yakovlev, Logic synthesis for asynchronous circuits based on Petri net unfoldings and incremental sat, in:
M. Canada Kishinevsky, Ph. Darondeau (Eds.), ACSD 2004, 2004, pp. 16–25.
[12] D.E. Muller, W.S. Bartky, A theory of asynchronous circuits, in: Proceedings of an International Symposium on the Theory of Switching,
Harvard University Press, 1959, pp. 204–243.
[13] R. Milner, Communication and Concurrency, Prentice Hall, 1989.
[14] M. Schaefer, W. Vogler, R. Wollowski, V. Khomenko, Strategies for optimised STG decomposition, in: Proc. ACSD’06, 2006.
[15] W. Vogler, B. Kangsah, Improved decomposition of signal transition graphs, Fundamenta Informaticae 76 (2006) 161–197.
[16] W. Vogler, R. Wollowski, Decomposition in asynchronous circuit design, in: J. Cortadella, et al. (Eds.), Concurrency and Hardware Design,
in: Lect. Notes Comp. Sci., vol. 2549, Springer, 2002, pp. 152–190.
[17] A. Yakovlev, M. Kishinevsky, A. Kondratyev, L. Lavagno, M. Pietkiewicz-Koutny, On the models for asynchronous circuit behaviour with
or causality, Formal Methods in System Design 9 (1996) 189–233.
[18] T. Yoneda, H. Onda, C. Myers, Synthesis of speed independent circuits based on decomposition, in: ASYNC 2004, 2004, pp. 135–145.
