Synthesis of Speed Independent Circuits from STG-unfolding Segment by Semenov A et al.
Synthesis of Speed Independent Circuits from
STG-unfolding Segment

Alex Semenov, Alexandre Yakovlev
Department of Computing Science
University of Newcastle
Newcastle upon Tyne, NE1 7RU, England

Enric Pastor, Marco A. Peña, Jordi Cortadella
Department of Computer Architecture
Universitat Politechnica de Catalunya
08071 Barcelona, Spain
Abstract
This paper presents a novel technique for the synthesis of speed-independent circuits.
It is based on a partial order representation of the state graph called the STG-unfolding
segment. The new method uses an approximation technique to speed up the synthesis
process. The method is illustrated on the basic implementation architecture.
Experimental results demonstrating its efficiency are presented and discussed.
1 Introduction
The problem of synthesis of speed-independent circuits from their Signal Transition Graph
(STG) specifications has been approached by many researchers. Several tools exist today,
such as SIS [12], Assassin [15], Forcage [3] and Petrify [2], which are capable of synthesising
circuits of moderate size. With the exception of Forcage, all tools use some form of State
Graph (SG) representation to obtain truth tables of the implementation logic. Petrify uses
Binary Decision Diagrams (BDDs) to represent the SG symbolically and can thus synthesise
circuits from larger descriptions. Forcage, on the other hand, uses Change Diagrams (a partial
order model) to derive an implementation but is restricted to specifications without choice.
Construction of the SG hits available computational limits due to state explosion. A
structural method in [8] can implement STGs while avoiding exhaustive state exploration. It uses
the concurrency relation between transitions of the STG to obtain an initial approximation of
the implementation. If this approximation does not satisfy the correctness criteria, then
iterative refinement is performed, where the procedure uses State Machine (SM) decompositions.
Although powerful, this method has a drawback: it is restricted to free-choice specifications.
The main goal of this work is to develop a method for implementing STGs that cannot be
synthesised by the above techniques due to the large size of their SG. The way to achieve this
goal is analogous to the one in [8]: it draws upon relations at the event-based, rather
than state-based, description level. This method will, however, be free from the limitations
of [8].

This work was supported in part by the SERC grant No. GR/J 52327 and ESPRIT ACiD-WG Nr. 21949.
Collaboration between University of Newcastle and Universitat Politechnica de Catalunya was supported by
British-Spanish joint research programme (Acciones Integradas) between the British Council and Spanish
Ministry of Education and Culture, grant number MDR(1996/97)1159.
The solution to this problem is found in the use of a partial order approach, already known
to have given positive results in STG verification. It is based on an implicit representation of
the SG in the form of a finite STG-unfolding segment [11]. It was shown in [11] that such a segment
can often be built for those examples where the construction of the SG fails. While the segment
is being constructed it is also verified for correctness. Thus, after the verification stage is
completed, an implementation can be derived from an already built STG-unfolding segment.
Two approaches are possible within the new synthesis method: exact and approximate.
The former obtains an implementation equivalent to that derived from the SG. At the end
of the synthesis procedure this approach produces an implementation by recovering binary
states from the segment (similar to the approach of [6]). Although it benefits from the
unfolding methodology, which restricts the set of states that need to be examined for each signal, the
exact approach may suffer from exponential explosion of states. To combat the complexity,
the latter approach uses the concurrency relation to obtain an initial approximation and then to refine
the approximated implementation. The structural method of [8] works on the STG level,
assuming that two transitions are concurrent if they can ever fire simultaneously. In
contrast, our method works with a partial run of the behaviour specified by the STG. Thus it
is possible to pin-point exactly when any two transitions become concurrent. This local
information gives a more accurate initial approximation and a more precise refinement.
The aim of this paper is to suggest and illustrate the synthesis of speed-independent circuits
from the STG-unfolding segment built for their specifications. The method is illustrated
on the atomic complex gate per signal architecture. The paper is organised as
follows. Section 2 describes the general approach to synthesis of speed-independent circuits and
the atomic complex gate per signal architecture. Section 3 introduces new notions
required for the synthesis from the STG-unfolding segment. Section 4 describes and illustrates
the suggested method. Results showing performance and comparing the new method with
existing ones are presented in Section 5.
2 Synthesis of Speed-Independent circuits
2.1 General synthesis approach
We assume that the reader is familiar with the basics of Petri net theory [9]. A marked
Petri net (PN) is a tuple $N = \langle P, T, F, m_0 \rangle$ where $P$ and $T$ are non-empty sets of places
and transitions, respectively, $F$ is a flow relation and $m_0$ is an initial marking. A Signal
(Transition) Graph (STG) [10, 1] is a tuple $G = \langle N, A, L \rangle$ (a labelled PN) where $N$ is a
marked PN, $A$ is a set of signals and $L : T \to \{+, -\} \times A$ is a labelling function. STGs
are a special case of labelled PNs, used for low level descriptions of asynchronous circuits.
The set of transition labels represents changes of signals: $+a_i$ (for up) and $-a_i$ (for down).
The notation $*a_i$ indicates a transition labelled with a change of $a_i$ regardless of the direction of
this change.
Conventionally, to obtain an implementation for an STG $G$, a corresponding SG is built.
The SG $S$, also called State Transition Diagram (STD), is derived by constructing the
reachability graph (representing all reachable markings) of the underlying PN and then assigning
a binary code $v_i$ to each vertex $s_i$. The binary codes must be assigned consistently, i.e.:
 • every arc is labelled with exactly one signal transition, and
 • for each pair of states $s_1$ and $s_2$ connected with an arc labelled with $*a_i$ the following
   is true:
   - $v_1[i] = 0$ and $v_2[i] = 1$ if $* = +$
   - $v_1[i] = 1$ and $v_2[i] = 0$ if $* = -$
Once a consistent state assignment has been performed, truth tables are obtained for each output
signal and an implementation is produced. The process of obtaining a truth table depends
on the implementation architecture chosen (for this particular signal).
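The consistency condition above can be checked mechanically. The following is a minimal sketch, assuming the SG is available as a list of arcs `(v1, label, v2)` with binary codes given as bit strings; the function name `is_consistent` and this data layout are illustrative assumptions, not part of the original method. The check that no other bit changes along an arc is our reading of "exactly one signal transition per arc".

```python
def is_consistent(arcs, signals):
    """Check consistent state assignment on a list of SG arcs.

    arcs: iterable of (v1, label, v2); v1 and v2 are bit strings and
    label is a signal transition such as "+a" or "-b".
    signals: string giving the bit order of the codes, e.g. "abc".
    """
    for v1, label, v2 in arcs:
        sign, name = label[0], label[1:]
        i = signals.index(name)
        before, after = ("0", "1") if sign == "+" else ("1", "0")
        # The labelled signal must switch in the stated direction ...
        if v1[i] != before or v2[i] != after:
            return False
        # ... and (assumption) no other bit may change along the arc.
        if v1[:i] + v1[i + 1:] != v2[:i] + v2[i + 1:]:
            return False
    return True
```

For instance, the arcs `("000", "+a", "100")` and `("100", "+b", "110")` pass the check, whereas an arc `("000", "+a", "110")` fails because two bits change at once.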
Correctness criteria for synthesis of speed-independent circuits can be divided into general
correctness criteria and architecture-specific correctness criteria. The former are behavioural
properties of an STG, which characterise an STG as implementable "in principle". In
addition to the consistent state assignment, they also include:
 • Boundedness, which guarantees that the behaviour specified by an STG can be
   implemented as a finite size circuit;
 • Semi-modularity (also called "output signal persistency"), which implies that excited
   output signals cannot be disabled by some input signal change and thus cause a hazard.
The latter group of properties is usually checked during the actual logic synthesis process.
Violations of these are generally referred to as coding conflicts and indicate that although the STG is
implementable "in principle", there may exist some binary state associated with different
markings which makes them indistinguishable at the circuit level. The Complete State Coding
(CSC) condition introduced in [1] requires any two states with equal binary codes to have the
same set of excited output signals. It was proved in [1] that STGs satisfying the CSC property
are implementable as speed-independent circuits.
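The CSC condition lends itself to a direct check on an explicit state list. The sketch below is only an illustration under assumed inputs: each state is a hypothetical triple `(marking, code, excited_outputs)`, and `csc_conflicts` is a name introduced here, not taken from any of the cited tools.

```python
from collections import defaultdict


def csc_conflicts(states):
    """Return the binary codes that violate Complete State Coding.

    states: iterable of (marking, code, excited_outputs), where
    excited_outputs is a set of excited output signal transitions.
    A code is in conflict if two states share it but differ in their
    sets of excited output signals.
    """
    by_code = defaultdict(set)
    for _marking, code, excited in states:
        by_code[code].add(frozenset(excited))
    return sorted(code for code, sets in by_code.items() if len(sets) > 1)
```

An empty result means that every pair of states with equal codes agrees on its excited outputs, which is exactly the CSC requirement stated above.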
An implementation is obtained by building a cover function. A boolean function with
a variable corresponding to each signal is said to cover a state $s_j$ if it evaluates to
TRUE when the variables have values equal to the elements of the binary code $v_j$ assigned
to $s_j$. A function $C$ covering a set of states $\{s_i\}$ is called a cover function (or simply a cover) for
this set of states. Each term of the cover is called a cube, as it may cover several states in
the state space.
A cover is not required to be exact, i.e. to cover only the states in $\{s_i\}$. An exact cover can be
obtained explicitly from the binary codes of the states. However, if a cover is obtained differently
(e.g. using an oracle), it may cover some other states. For example, the methods described in
[14, 8] use structural information to obtain covers. Such a cover is called an approximated cover,
and needs to be checked for correctness. There are different requirements for correctness of
covers according to the implementation architecture chosen.
The following three architecture types are normally considered:
 • Atomic complex gate per signal implementation;
 • Atomic complex gate per excitation function implementation;
 • Atomic complex gate per excitation region implementation.
The first architecture can be considered as a basic type. The other two aim at reducing
the size of customised complex gates. In these architectures it is assumed that the output
signal is implemented using a memory element. The Set and Reset excitation functions for
this memory element are implemented as atomic complex gates (the former) or a network
of atomic complex gates (the latter). Depending on which memory element is used, the
implementations are divided into i) Standard C-element implementation, which uses a Muller
C-element as the memory element, and ii) RS-latch implementation, where an RS-latch is
used.
To demonstrate the novel technique we chose the atomic complex gate per signal
architecture. Our method can be easily adapted to the other architectures. The remainder of
this section overviews the synthesis strategy for the chosen architecture type. Note that the
limits of this paper also do not allow us to consider lower level logic decomposition, usually
known as technology mapping.
2.2 Atomic complex gate per signal implementation
This is a basic architecture for speed-independent circuits studied in [1, 5]. The circuit
is implemented as a network of atomic gates. Each gate uniquely implements one output
signal. Its boolean function can be represented as a Sum-Of-Products (SOP) or a
Sum-Of-Functions (SOF). An example of such a gate is shown in Figure 1. Each gate is allowed to be
sequential (a latch), i.e. to contain an internal feedback with zero delay. The delay between its
internal "ANDing" and "ORing" parts is also assumed to be negligible. The gate depiction
is used to denote the implemented boolean function, as the actual implementation is resolved
at the transistor level.
Two sets of reachable states are distinguished in the SG, the on-set $On(a_i)$ and the off-set
$Off(a_i)$, which include all states in which the value of the output signal $a_i$ is implied to be
TRUE and FALSE, respectively. The remaining (unreachable) subset of combinations of
the boolean values of signals forms the Don't care set (DC-set).
The implementation is derived by building the on-set¹. Each state can be represented by
a term which has $|A|$ variables, each corresponding to one and only one signal $a_i$. The term
becomes TRUE only when the values of the variables are equal to those in the binary code
assigned to the state. The cover $C$ for implementation is obtained from the terms included
in the on-set. The DC-set can be used for optimising the size of $C$. This is done in standard
minimisation tools, such as Espresso [12].
The synthesis for this architecture is illustrated in Figure 1 for the STG shown in
Figure 1(b). Suppose that an implementation of the signal $b$ is required. The on-set of $b$ is
found as: $On(b) = \{(p_2, p_3), (p_3, p_5), (p_2, p_6, p_8), (p_5, p_6, p_8), (p_7, p_8), (p_4)\}$. The cover function
$C(b)$ is obtained as: $C(b) = a\bar{b}\bar{c} + ab\bar{c} + a\bar{b}c + abc + \bar{a}bc + \bar{a}\bar{b}c = a + c$. The DC-set in the example
in Figure 1(c) is empty so no further minimisation can be done.
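The simplification of the on-set cover to a + c can be confirmed by exhaustive evaluation. The snippet below is a small sanity check rather than a synthesis step; it takes the six on-set codes (signal order abc) from the state graph of the example and compares the sum of minterms against a + c on all eight input combinations. The helper names are illustrative.

```python
from itertools import product

# On-set of signal b in the example, as binary codes in signal order abc.
ON_B = {"100", "110", "101", "111", "011", "001"}


def exact_cover(a, b, c):
    """Sum of minterms: TRUE exactly on the codes listed in ON_B."""
    return f"{a}{b}{c}" in ON_B


def simplified_cover(a, b, c):
    """The minimised cover a + c."""
    return bool(a or c)


# Compare both functions on every combination of the three signals.
agree = all(
    exact_cover(a, b, c) == simplified_cover(a, b, c)
    for a, b, c in product((0, 1), repeat=3)
)
print(agree)
```

Because the DC-set is empty here, the two functions have to agree everywhere, including on the off-set codes 000 and 010.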
Obtaining exact covers usually means that all states in the on- or off-set must be known.
An approximation algorithm produces approximated covers of the on- and off-sets.
Therefore, in this implementation architecture, covers of the on- and off-sets must satisfy the following
condition:
Denition 2.1 Two covers C

On
(a
i
) and C

Off
(a
i
) are said to be correct i C

On
(a
i
) and
1
Here and further, for simplicity, it is assumed that the on-set is constructed. Usually, the simplest from
the on- and o-sets is chosen for implementation.
4
a1
an
ao
+b
-a
-b
-c
+c
+a +c
+b
p1
p2 p4p3
p5 p6
p7
p8
p9
p2p3
100
p3p5
110
p2p6p8
101
p5p6p8
111
p7p8
011
p4
001
000
p1
p9
010
abc
+a +c
+c
+c
+b
+b
-a
-b
-c
+b
(a) (b) (c)
Figure 1: An example of an STG and a corresponding SG.
C

Off
(a
i
) cover On(a
i
) and Off (a
i
) respectively and C

On
(a
i
)  C

Off
(a
i
)  DC-set. 2
If the covers do not satisfy the above condition, then the approximation is too loose and
needs to be refined. If, on the other hand, the covers are exact but still intersect outside the
DC-set, then this STG has a CSC problem. In this case it should be corrected by changing the
specification, e.g. by inserting additional signals.
3 Slices in STG-unfolding segment
3.1 STG-unfolding segment
Analysis of STGs using the STG-unfolding segment was studied elsewhere [11]. An STG-unfolding
segment is a tuple $G' = \langle T', P', F', L' \rangle$ where $T'$, $P'$ and $F'$ are sets of transitions, places
and the flow relation, respectively, and $L'$ is a labelling function which labels each element
of $G'$ as an instance of an element of $G$. $G'$ is a partial order obtained from an STG $G$ by the
process of its unfolding, which starts from the initial marking. The unfolding process uses the
structural properties of the constructed partial order to determine the relations of conflict,
concurrency and precedence between instances. These relations are used to decide where to
instantiate the next element. The following key notions were introduced in [4]:
 • The minimal set of transitions needed to fire $t'$, including $t'$, is called the local configuration of
   $t'$ and is denoted as $\lceil t' \rceil$.
 • The set of place instances reached by firing all transitions in $\lceil t' \rceil$ is called the postset of $\lceil t' \rceil$
   and is denoted as $\lceil t' \rceil^{\bullet}$. Mapping a postset onto places of the original STG is called the
   final state of $\lceil t' \rceil$ and gives a marking of the original STG.
 • Any non-conflicting and transitively closed set of transitions of $T'$ is called a configuration
   $C$. The postset of a configuration, denoted as $C^{\bullet}$, is found from the postsets of the
   transitions comprising it.
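The local configuration of an instance is simply the set of its causal predecessors together with the instance itself. A minimal sketch of this computation follows, assuming the segment is stored as two hypothetical dictionaries: `preset` maps a transition instance to the place instances it consumes, and `producer` maps a place instance to the transition instance that produced it (or `None` for initially marked places). Neither the names nor the representation come from the paper.

```python
def local_configuration(t, preset, producer):
    """Return the local configuration of transition instance t.

    preset[t]   : place instances consumed by t
    producer[p] : transition instance that produced place p,
                  or None if p is marked initially
    The result contains t and all transitions that must fire before it.
    """
    config, stack = set(), [t]
    while stack:
        u = stack.pop()
        if u in config:
            continue
        config.add(u)
        for p in preset.get(u, ()):
            q = producer.get(p)
            if q is not None:
                stack.append(q)
    return config
```

For a made-up fragment in which `+a'` produces two places consumed by `+b'` and `+c'`, whose outputs both feed `-a'`, the call `local_configuration("-a'", preset, producer)` returns all four instances.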
The unfolding algorithm examines only states reached through firing of an instance $t'$
excited by a minimal set of causes. It is based on the fact that no new information about
the behaviour of the system can be obtained once the states start repeating. Thus the
algorithm constructs no new instances after any instance $t'_c$ whose firing reaches an already
examined state. Transition instance $t'_c$ is called a cutoff transition of the unfolding.
In contrast to PN-unfolding [4], the STG-unfolding takes into account the specific signal
interpretation of PN transitions and keeps track of the binary codes reached by transition
firing. However, it still examines only a subset of all reachable states of STG $G$ and thus is
more efficient than SG analysis for a vast number of examples.
Each instance $t'$ of the STG-unfolding segment is assigned a binary code $\phi_{\lceil t' \rceil}$ which is
reached by firing the transitions in $\lceil t' \rceil$. Similar to its postset, the binary code corresponding to
a configuration $C$ is calculated from the codes $\phi_{\lceil t' \rceil}$ of the transitions comprising it. It was shown in [11]
that all states of the SG are represented in the STG-unfolding segment as postsets of some
configuration. For each instance $t'$ labelled with signal transition $*a_i$, a set of transitions
$next(t')$ is defined as the set of instances labelled with $*a_i$ reachable from $t'$ without any
intermediate transitions of $a_i$. The set $first(a_i)$ is the set of transitions of $a_i$ first reached from
the beginning of the segment. A special transition, called the initial transition, is introduced in
the unfolding to represent the initial state of the STG. This transition, denoted as $\bot$, has a
postset which maps onto the initial marking $m_0$ and has an assigned binary code $\phi_{\lceil \bot \rceil}$ equal
to the initial binary state $v_0$ of the STG.
It was demonstrated in [11] that an STG-unfolding segment can only be constructed for
an STG specification satisfying the boundedness and consistent state assignment criteria. The
last general correctness criterion, semi-modularity, can be checked on the STG-unfolding
segment in linear time.
3.2 Cuts
To represent a state of the SG we define a cut. A cut of an STG-unfolding segment is a maximal
set of concurrent places $p' \in P'$. Each cut $c$ of an STG-unfolding segment thus represents
some reachable marking of the original STG. A sequence relation is defined between two cuts:
$c_1 \preceq c_2$ if $\forall p'_i \in c_2, \exists p'_j \in c_1 : p'_j \preceq p'_i$. For each instance $t'$ the following four types of cuts
are found.
 • A minimal excitation cut $c^{min}_e(t')$, which represents a state at which $t'$ first becomes
   enabled.
 • A minimal stable cut $c^{min}_s(t')$, which represents a state which is reached by firing of $t'$.
 • A maximal excitation cut $c^{max}_e(t')$, which represents a state from which, in a correct
   STG, no advancement can be made unless $t'$ is fired.
 • A maximal stable cut $c^{max}_s(t')$, which represents a state which is reached after firing of
   $t'$ and from which firing of any transition leads to a state enabling the next change of the
   signal $a_i$ labelling $t'$.
Each instance $a'_i$ of the STG-unfolding segment uniquely identifies $c^{min}_e(a'_i)$ and $c^{min}_s(a'_i)$ and
the sets of $c^{max}_e(a'_i)$ and $c^{max}_s(a'_i)$. Thus each instance identifies states bounding the subset
of the on-set (or off-set) of $a_i$ which is found for this particular instance.
[Figure 2: An example of an STG-unfolding segment and illustration of slices and cuts. Panels (a), (b), (c).]
3.3 Slices
To represent a (connected) set of states we introduce the notion of a slice of the STG-unfolding
segment. A slice of an STG-unfolding segment is a set of cuts $S = \langle c^{min}, C^{max} \rangle$ defined by
a min-cut of the slice, $c^{min}$, and a set of max-cuts, $C^{max}$, such that $\forall c_i \in S$ the following
is true: $c^{min} \preceq c_i$ and $\exists c^{max}_j \in C^{max} : c_i \preceq c^{max}_j$. No two cuts in the set of max-cuts are
sequential.
In other words, a slice is defined between one min-cut and a set of max-cuts. Every
cut in between the min-cut and a max-cut is encapsulated in the slice $S$. Furthermore, for
any two cuts $c_i$ and $c_j$ encapsulated by $S$, if $c_i \preceq c_j$, then all cuts between $c_i$ and $c_j$ are
also encapsulated by $S$. Since each cut represents some state in the SG, for any two states
$s_i$ and $s_j$ represented as sequential cuts in a slice, all states on any path from $s_i$ to $s_j$ are
also represented as cuts encapsulated in $S$. The number of cuts in the set of max-cuts
corresponds to the number of configurations (non-conflicting runs of the STG) which include
the configuration producing the min-cut. The elements of the STG-unfolding segment, i.e. places
and transitions, bounded by instances in the min-cut and max-cuts are said to belong to the
slice.
For any STG, a slice represents a subset of the reachable states found in its SG, bounded by
the cuts defining the slice. As discussed earlier, the synthesis of speed-independent circuits is based
on finding subsets of reachable states. Therefore, slices of the STG-unfolding segment can
be used to identify and represent these subsets.
Cuts and slices are illustrated in Figure 2. Consider a cut $c = (p'_7, p'_8)$ in Figure 2(b).
This cut is a minimal excitation cut for the transition $-c'$ and is a minimal stable cut for $+b'$.
Another cut, $c = (p'_2, p'_6, p''_8)$, is a maximal stable cut for transition instance $+a'$. At the same
time this is a maximal excitation cut for the instance $+b''$. This example also illustrates the
relations between cuts. Intuitively, if a transition $a'_i$ causes $a'_j$, then the minimal stable
cut of $a'_i$ is the minimal excitation cut of $a'_j$ and vice versa.
Slice $S_1 = \langle (p'_1), \{(p'_7, p'_8)\} \rangle$ encapsulates the cut $c = (p'_4)$ (Figure 2(c)). Another slice $S_2$ is
defined between a min-cut $(p'_2, p'_3)$ and a set of max-cuts $\{(p'_5, p'_6, p''_8)\}$ and includes all cuts
between them. It is also possible to define a slice between $(p'_2, p'_3)$ and $\{(p'_3, p'_5), (p'_2, p'_6, p''_8)\}$.
In this case the slice will include all cuts but the one enabling $-a'$. This slice, therefore, represents
all states at which signal $a$ is stable at "1".
Each cut is produced by some configuration of the STG-unfolding segment. Hence, the
binary codes of the SG states represented by cuts encapsulated in a particular slice can be
recovered by examining its cuts.
4 Synthesis from STG-unfolding segment
4.1 Obtaining exact covers
First, consider the problem of synthesis from the STG-unfolding segment $G'$ by finding exact
covers for the on-(off-)set. To implement an output signal of an STG as an atomic gate, its
on-set² is required. Since its SG is represented as an STG-unfolding segment, the problem is
to find a set of slices in this segment which represents all states in the on-set, i.e. an on-set
partitioning of $G'$ for $a_i$.
To define each slice we need to identify a min-cut and a set of max-cuts. Of all instances
in the STG-unfolding segment only instances of $+a_i$ can change the corresponding
element of the binary code to "1". Furthermore, for each instance $+a'_i$ its minimal excitation cut
$c^{min}_e(+a'_i)$ represents the first state at which $+a'_i$ becomes excited. Any cut at which $+a'_i$
is excited or $a_i$ is stable at "1" must be sequential to $c^{min}_e(+a'_i)$. A special case is the initial
transition $\bot$ of $G'$. If in the initial state of the STG the corresponding bit of the binary code is
"1", then the set $first(a_i)$ will consist of down instances $-a'_i$. In this case, the minimal
stable cut of $\bot$ is the first cut from which this slice can be defined. Thus the set of minimal
cuts, which is used to define a set of slices, is taken as the set of minimal excitation cuts of
instances $+a'_i$ and the minimal stable cut of $\bot$, if the signal $a_i$ is at "1" in the initial state.
Thus a set of transitions, called entry transitions, is identified on the STG-unfolding segment
which includes all instances of $+a_i$ and may include $\bot$ if $a_i$ is at "1" in the initial state.
For a complete definition of each slice we need to determine its set of max-cuts.
The minimal excitation cut of any instance $-a'_i$ represents the first state at which $-a'_i$
becomes excited. This cut belongs to the off-set.
For each instance $+a'_i$ the slice must be bounded by a set of cuts which can be reached
from the min-cut without exciting $-a_i$. The slice is bounded by the maximal excitation cuts
of immediate predecessors of $next(+a'_i)$, i.e. cuts at which an immediate predecessor of a
transition from $next(+a'_i)$ is the only transition to fire. This is the furthest state to which
advancement of the system can be made from $+a'_i$ without enabling $-a_i$. In the case of the
initial transition the set of max-cuts for the first slice is chosen using $first(a_i)$.
Due to the unfolding algorithm, a particular configuration may contain no instances of
$-a_i$. This may happen if the configuration contains a cutoff transition, or simply leads to a
deadlock. In this case the cut reached by such a configuration bounds the slice.
Consider synthesising signal $b$ from the example in Figure 1. The on-set partitioning
of the segment is shown in Figure 3. There are two instances $+b'$ and $+b''$ and one
instance $-b'$. Thus there are two slices $S^1_{On}(+b') = \langle (p'_4), \{(p'_7, p'_8)\} \rangle$ and $S^2_{On}(+b'') =
\langle (p'_2, p'_3), \{(p'_5, p'_6, p''_8)\} \rangle$ representing states from the on-set and one slice $S_{Off} = \langle (p'_9), \{(p''_1)\} \rangle$.
Once the slices are defined, the set of states represented by these slices is found: $On_1(b) =
\{001, 011\}$ and $On_2(b) = \{100, 101, 110, 111\}$. The on-set cover is obtained from the slices as
$C_{On} = On_1(b) \cup On_2(b) = \{001, 011, 100, 101, 110, 111\}$ which after standard boolean
transformation gives $C_{On} = \{1--, --1\} = a + c$. If the off-set implementation were chosen,
then the cover would be $C_{Off} = \{010, 000\} = \bar{a}\bar{c}$.

² Off-set if an off-set implementation was chosen. In this case instances of $-a_i$ should be considered.

[Figure 3: Illustration of synthesis from the STG-unfolding segment, showing the slices
$S^1_{On}(+b')$, $S^2_{On}(+b'')$ and $S_{Off}$ on the segment.]
4.2 Deriving cover approximation from STG-unfolding segment
The synthesis procedure described in the previous subsection suffers from one drawback. If
many concurrent transitions belong to a slice, then obtaining the binary codes for all cuts
will suffer from exponential explosion of states. To combat this, an approximation method is
suggested.
Two types of states can be identified in the on-set of signal $a_i$: those at which $+a_i$ is
excited and those at which $a_i$ is stable at "1". The former set is traditionally called the excitation
region (ER) and the latter the quiescent region (QR) of $+a_i$. A set of states at which a particular
place $p_l$ is marked is called the marked region (MR) of this place. It was pointed out in [7]
that a cover for any set of states can be found as an intersection of covers for places which
are marked at each state. Thus a set of states at which a particular transition is excited can
be found as an intersection of the MRs of its preceding places. However, at the unfolding level
the instances of transitions are known. Furthermore, the minimal excitation cut $c^{min}_e(a'_i)$
for each instance $a'_i$ indicates where this instance first becomes enabled.
Any state reachable from $c^{min}_e(a'_i)$, preserving the excitation of $a'_i$, can only be reached
by firing transitions which are concurrent to $a'_i$. If a signal transition instance $a'_j$ is
concurrent to $a'_i$, then its corresponding element in the binary code may take
both values "0" and "1". A cover approximation $C^*_e(a'_i)$ is found from the binary code $\phi$ assigned
to the cut $c^{min}_e(a'_i)$. Literals corresponding to signals whose instances belong to $S_e(a'_i)$ and
are concurrent to $a'_i$ are substituted by "-" (don't care). Approximation reduces the
number of literals in the cover $C^*_e(a'_i)$ and increases the number of combinations covered by $C^*_e(a'_i)$.
However, such an approximation guarantees that no marking at which $a'_i$ is excited is lost.
Furthermore, for a CSC-compliant STG, $C^*_e(a'_i)$ will only cover those reachable states where
$a'_i$ is excited.
For example, consider calculation of $C^*_e(+d')$ for the instance $+d'$ in Figure 4(a). The
binary code corresponding to its minimal excitation cut $c^{min}_e(+d') = (p'_2, p'_3, p'_4)$ is found
from the binary code of its local configuration $\lceil +d' \rceil$ as $\phi = \{1000000\}$ (the order of signals
is abcdefg). There are four signals $\{b, c, e, f\}$ whose instances belong to the slice and are
concurrent to $+d'$. Thus the ER cover approximation for $+d'$ will be $C^*(+d') = \{1--0--0\} = a\bar{d}\bar{g}$.

[Figure 4: Illustration of cover approximation and refinement. Panels (a), (b), (c).]
The rest of the states in the on-set, which are represented as cuts encapsulated by
$S_{On}(a'_i)$, can be approximated by taking cover approximations for the MRs of places belonging
to $S_{On}(a'_i)$ and sequential to the entry transition of the slice.
For each place $p'_l$ its MR approximation cover $C^*_{mr}(p'_l)$ is obtained from the binary code
$\phi_{\lceil t'_k \rceil}$ assigned to its preceding transition. Similar to the ER approximation, any marking at
which $p'_l$ is marked can only be reached by firing transitions concurrent to $p'_l$. Thus literals
corresponding to signals whose instances belong to $S_{On}(a'_i)$ and are concurrent to $p'_l$ are
replaced by "-".
An MR cover approximation for a particular place $p'_l$ will cover all states at which $p'_l$ is
marked together with any other concurrent place $p'_j$. Thus only a mutually non-concurrent subset of
places belonging to $S_{On}(a'_i)$ needs to be considered. A set of such places is called an approximation
set $P'_a$. Furthermore, an MR cover approximation must not cover markings enabling instances
$t'_j \in next(a'_i)$. Thus, the MR cover approximation for any place $p'_l$ that is an input to such an
instance is found as $C(p'_l) = \sum C^*_{t'_k}(p'_l)$, where $C^*_{t'_k}(p'_l)$ is a cover approximation found
for $p'_l$ with a set of concurrent signal
instances excluding an instance $t'_k$ immediately preceding $t'_j$. To reduce the size of MR cover
approximations, it is also convenient to choose $P'_a$ so that it includes one input place from
each instance in $next(a'_i)$. The cover approximation for each slice $S_{On}(+a'_i)$ representing
the states from the on-set of signal $a_i$ is therefore calculated as:
$$C^*_{On}(a'_i) = C^*_e(+a'_i) + \sum_{p'_l \in P'_a} C^*_{mr}(p'_l)$$
where $C^*_e(+a'_i)$ may be empty if the entry transition of $S_{On}(+a'_i)$ is the initial transition of
the segment.
Consider approximation of the on-set cover for signal $+a'$ shown in Figure 4(b). The slice
representing states from the on-set is found as $S_{On}(+a') = \langle (p'_1), \{(p'_7, p'_8, p'_9), (p'_6, p'_8, p'_{10}),
(p'_5, p'_9, p'_{10})\} \rangle$. To approximate the states represented by this slice an approximation set is chosen
as $P'_a = \{p'_4, p'_7, p'_{10}\}$. The initial values for the MR cover approximations for places $p'_4$
and $p'_7$ are found using $\phi$ of their predecessors $+a'$ and $+d'$ respectively. Both places have the same
set of concurrent instances of other signals. Their MR cover approximations are found as
$C^*_{mr}(p'_4) = \{1--0--0\} = a\bar{d}\bar{g}$ and $C^*_{mr}(p'_7) = \{1--1--0\} = ad\bar{g}$. Place $p'_{10}$, on the
other hand, is an input to $-a' \in next(+a')$. Therefore its MR cover approximation is found
as $C(p'_{10}) = C^*_{f'}(p'_{10}) + C^*_{e'}(p'_{10}) = \{1--1-01\} \cup \{1--10-1\} = ad\bar{f}g + ad\bar{e}g$. There is
only one state in the ER of $+a'$, which is covered by the cover $C^*(+a') = \{0000000\} = \bar{a}\bar{b}\bar{c}\bar{d}\bar{e}\bar{f}\bar{g}$.
The cover approximation representing the on-set of $a$ is found as
$C^*_{On}(a) = \bar{a}\bar{b}\bar{c}\bar{d}\bar{e}\bar{f}\bar{g} + a\bar{d}\bar{g} + ad\bar{g} + ad\bar{f}g + ad\bar{e}g$ (in SOP form).
4.3 Cover renement
Due to the approximated nature of the covers, an on-set cover found from the STG-unfolding
segment may implement an incorrect function. Indeed, if a output signal is implemented
using an on-set cover approximation which covers a state belonging to the o-set, then the
output will change to \1" where it is suppose to be \0". Thus cover approximations obtained
using the algorithm described before need to be checked. To check cover correctness both
on- and o-set cover approximations are required.
Suppose that both approximated covers for the on- and off-set of $a_i$ were obtained.
Suppose also that their intersection is non-empty. For the covers to be correct, their
intersection may only belong to the DC-set. However, to find the DC-set all codes in both
the on-set and the off-set must be known. Therefore, to ensure the covers implement the
logic functions correctly we check a stronger condition: approximated covers for the on- and
off-set are said to be correct if their intersection is empty. The approximation produces
semi-optimised covers. Exact covers have an empty intersection by construction. Therefore,
if the covers' intersection is non-empty, then they need to be refined until their intersection
becomes empty, possibly restoring the exact covers. Thus the use of a stronger condition
only affects the quality of optimisation rather than the correctness of covers. If after complete
refinement the on- and off-set covers still intersect, then the STG has a CSC problem and
cannot be implemented without changes to the specification. Correct refined covers can be
optimised with the DC-set using any known minimisation technique, such as Espresso.
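A minimal sketch of the stronger correctness check is given below, assuming covers are stored as lists of cubes over 0, 1 and -. The function names and the two small covers are hypothetical and serve only to illustrate the condition; they are not taken from the benchmarks.

```python
def cube_and(x, y):
    """Intersect two cubes; return None when they disagree on some signal."""
    out = []
    for p, q in zip(x, y):
        if p == "-":
            out.append(q)
        elif q == "-" or p == q:
            out.append(p)
        else:
            return None  # one cube requires 0, the other requires 1
    return "".join(out)

def cover_intersection(on_cover, off_cover):
    """All non-empty pairwise intersections of cubes from the two covers."""
    result = []
    for x in on_cover:
        for y in off_cover:
            z = cube_and(x, y)
            if z is not None:
                result.append(z)
    return result

def covers_correct(on_cover, off_cover):
    """Stronger condition from the text: the covers must not intersect at all."""
    return not cover_intersection(on_cover, off_cover)

# Hypothetical covers over four signals.
on_cover = ["1--0", "01-1"]
print(covers_correct(on_cover, ["00--", "1-11"]))  # True
print(cover_intersection(on_cover, ["-1-1"]))      # ['01-1']
```

A non-empty result identifies the pairs of cubes that have to be refined, which is the information the refinement loop needs.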
Pseudo-code of the algorithm for deriving covers for the on- and off-sets is shown in Figure 5.
The initial on- and off-set cover approximations are found as described in the previous
subsection. If the intersection of the approximated covers is not empty, then these covers are
refined. Only the concurrency relation was used for finding the approximated covers; other
relations between transitions concurrent to $a'_i$ were ignored. The general idea behind
refinement is that some of the information about the cover is restored using these relations.
Covers are refined until "they are good enough", i.e. until their intersection becomes empty.
The on- and off-set covers' intersection may become non-empty due to the approximation of
the MR cover for some places in the approximation set. These MR cover approximations may
intersect with the ER cover approximations of some instances of the opposite signal transition.
In this case only the cover approximations for these places (but not for all places in the
approximation set) and for the instance of the opposite signal transition need to be refined.
Further, the set of signals $Sig$ which cause the intersection is also known. These are exactly
those signals whose value is undefined in one of the cubes $B \in C^*$. Thus we need to consider
the problem of refining a cover approximation for an element $x'$ of the STG-unfolding
segment with a set of signals $Sig$.
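As a small illustration of how $Sig$ can be read off a cube, the sketch below treats a cube as a string over 0, 1 and - and returns the signals whose value is undefined. The encoding and the helper name `offending_signals` are assumptions made for this illustration.

```python
def offending_signals(cube, signals):
    """Return the signals whose value is undefined ("-") in the cube."""
    return [name for value, name in zip(cube, signals) if value == "-"]

# Cube "d and not e" over the signal order abcde, i.e. ---10.
sig = offending_signals("---10", "abcde")
print(sig)  # ['a', 'b', 'c']
```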
To restore some of the relations a refining set $P'_r$ is chosen. This set is constructed
from non-concurrent places belonging to the slice $S_{On}(a'_i)$ such that $\forall p'_k \in P'_r : x' \parallel p'_k$.
Furthermore, the set is chosen so that, for at least one signal $a_j$ from $Sig$, for each of its
instances $t'_k \in S_{On}(+a'_i)$ one of the successors of $t'_k$ is in $P'_r$. Thus each refining step will
refine at least one signal from $Sig$. A refined cover $C^*_{new}(x')$ is obtained from the old
approximation as: $C^*_{new}(x') = C^*(x') \cdot [\sum C^r_{mr}(p'_k)],\ p'_k \in P'_r$. Cover $C^r_{mr}(p'_k)$ is a
restricted MR cover for $p'_k$ where only those literals are set to "-" whose instances
$t'_l \parallel p'_k$ belong to $S_{On}(a'_i)$ and are successors of $x'$.

for each implementable signal $a_i$ do
    Find set of on-slices $S^j_{On}$
    for each slice $S^j_{On}$ do
        Find approximation set $P'_a$
        $C^*_{On} = C(t'_e) + [\sum C^*_{mr}(p'_l) : p'_l \in P'_a]$
    end do
    Find set of off-slices $S^k_{Off}$
    for each slice $S^k_{Off}$ do
        Find approximation set $P'_a$
        $C^*_{Off} = C_k(t'_e) + [\sum C^*_{mr}(p'_l) : p'_l \in P'_a]$
    end do
    /* initial approximations found */
    while $C^*_{On} \cdot C^*_{Off} \neq \emptyset$ do
        for each $C^*_{mr}(p'_l)$ and $C_k(t'_e)$ such that $C^*_{mr}(p'_l) \cdot C_k(t'_e) \neq \emptyset$ do
            Find the set of offending signals $Sig$
            Choose offending signal $a_j$ from $Sig$
            Find refining set $P'_r$ for $p'_l$ w.r.t. $a_j$
            $C^*_{new}(p'_l) = C^*_{mr}(p'_l) \cdot [\sum C^r_{mr}(p'_k) : p'_k \in P'_r]$
            Find refining set $P'_r$ for $t'_e$ w.r.t. $a_j$
            $C^*_{new}(t'_e) = C^*(t'_e) \cdot [\sum C^r_{mr}(p'_k) : p'_k \in P'_r]$
        end do
    end do
end do

Figure 5: Algorithm for deriving on- and off-set cover approximations from STG-unfolding
segment
Informally, at each step the refinement procedure restores the marking component of
reachable states represented by the slice. It finds a set of places which can be marked
together with each already partially restored marking. The cover function is then changed
to reflect the fact that the partially restored markings now include the places found. Thus in the
end, when the procedure terminates, the covers correspond to fully restored markings and
cover only states with these marking components.
Since each step refines the value of at least one variable and the set of signals is finite,
the refinement procedure will terminate in a finite number of steps, producing an exact cover
for the states represented by the slice $S_{On}(a'_i)$.
Consider a fragment of the STG-unfolding segment shown in Figure 4(c). Suppose that the
on-set cover approximation $C^*_{On}$, found with approximation set $P'_a = \{p'_1, p'_3, p'_5, p'_8\}$, intersects
with $C^*_{Off}$ for some signal. Suppose also that a cube $B = d\,\bar{e}$, which is an MR cover
approximation of place $p'_5$, causes this non-empty intersection. The set of offending signals
is found as $Sig = \{a, b, c\}$. Let $a$ be the signal chosen for refinement. Its only instance
which should be used in refinement is $-a'$. A refinement set is chosen as $P'_r = \{p'_2, p'_4, p'_7, p'_9\}$.
                                          PUNT ACG                        Other tools
Benchmark             Sigs  UnfTim  SynTim  EspTim  TotTim  LitCnt  Petrify      SIS   LitCnt
imec-master-read.csc    18    0.39   73.56    3.05   77.00      83   125.66   630.52       69
nowick.asn               7    0.02    0.26    0.69    0.97      17     1.44     0.51    20/17
nowick                   6    0.02    0.17    0.38    0.57      15     1.10     0.23       14
par_4.csc               14    0.03    1.12    2.48    3.63      36    12.31   168.55       36
sis-master-read.csc     14    0.16    4.53    1.09    5.78      48    27.09   130.66       48
tsbmSIBRK               25    0.44   37.64    4.62   42.70      72   299.90   141.51       72
pn_stg_example           6    0.01    0.19    1.57    1.77      19     4.20     6.84       19
forever_ordered          8    0.03    0.31    1.12    1.46      20     5.24     8.81       16
alloc-outbound           9    0.05    0.32    0.48    0.85      16     1.75     1.53       16
mp-forward-pkt          20    0.02    0.34    0.47    0.83      17     1.50     0.22       17
nak-pa                  10    0.02    0.37    0.57    0.96      20     2.28     0.29       20
pe-send-ifc             17    0.12    1.91    0.50    2.53      68    19.50     1.16    75/72
ram-read-sbuf           11    0.02    0.48    0.58    1.08      25     3.28     0.26       22
rcv-setup                5    0.02    0.06    0.17    0.25       8     0.72     0.14        8
sbuf-ram-write          12    0.04    0.80    0.64    1.48      23     4.04     0.38       23
sbuf-read-ctl.old        8    0.03    0.36    0.47    0.86      15     1.29     0.19       15
sbuf-read-ctl            8    0.02    0.22    0.47    0.71      15     0.99     0.16       15
sbuf-send-ctl            8    0.02    0.37    0.49    0.88      19     1.95     0.21       19
sbuf-send-pkt2           9    0.02    0.49    0.48    0.99      19     2.16     0.23       19
sbuf-send-pkt2.yun       9    0.04    0.58    0.45    1.07      31     3.43     0.26       31
sendr-done               4    0.02    0.02    0.19    0.23       6     0.33     0.14        6
Total                  228    1.72  124.10   20.96  146.78     592   520.16  1092.77  580/574

Table 1: Experimental results
Consider calculation of the restricted MR cover approximation for $p'_2$. The only instance
which can be used in the approximation is $+e'$, as the other concurrent instances, $+a'$ and $+d'$,
precede $p'_5$. Thus $C^r_{mr}(p'_2) = \{1001{-}\}$ (the order of signals is $abcde$). Similarly, MR cover
approximations are found for the other places in $P'_r$. The refined cover approximation is thus
found as: $C^*_{new}(p'_5) = \{{-}{-}{-}10\} \cap [\{1001{-}\} \cup \{1101{-}\} \cup \{1111{-}\} \cup \{0111{-}\}] = a\,\bar{c}\,d\,\bar{e} + b\,c\,d\,\bar{e}$.
The resulting cover is an exact cover of MR for place $p'_5$. Note that if simply the MR cover
approximation $C^*_{mr}(p'_2) = \bar{b}\,\bar{c}$ were chosen for $p'_2$ (or for any other place from $P'_r$), then
refinement would not refine $a$.
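The refinement step of this example can be replayed with a few lines of code. The sketch below assumes cubes are encoded as strings over 0, 1 and - in the signal order abcde; the function names `cube_and`, `refine` and `minterms` are introduced here for illustration only.

```python
from itertools import product

def cube_and(x, y):
    """Intersect two cubes over {0, 1, -}; return None if they are disjoint."""
    out = []
    for p, q in zip(x, y):
        if p == "-":
            out.append(q)
        elif q == "-" or p == q:
            out.append(p)
        else:
            return None
    return "".join(out)

def refine(cover, restricted):
    """Product of a cover with the union of restricted MR cubes."""
    cubes = []
    for x in cover:
        for y in restricted:
            z = cube_and(x, y)
            if z is not None and z not in cubes:
                cubes.append(z)
    return cubes

def minterms(cover):
    """Expand a cover into the set of fully specified codes it contains."""
    codes = set()
    for cube in cover:
        choices = ["01" if v == "-" else v for v in cube]
        codes.update("".join(bits) for bits in product(*choices))
    return codes

old_cover = ["---10"]                               # MR approximation of p'5
restricted = ["1001-", "1101-", "1111-", "0111-"]   # cubes from the formula above
new_cover = refine(old_cover, restricted)
print(new_cover)  # ['10010', '11010', '11110', '01110']

# The two-cube SOP form quoted in the text covers exactly the same codes.
sop = ["1-010", "-1110"]
print(minterms(new_cover) == minterms(sop))  # True
```

After the step, signal a is defined in every cube, which is what the text means by refining a.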
5 Experimental results
The method suggested in this paper was implemented on the basis of the unfolding tool
"PUNT". Experiments are divided into two major series.
The goal of the first series was to demonstrate the quality of the proposed method.
Results of the synthesis procedure, tested on a set of benchmarks, are shown in Table 1. The
table presents the time breakdown (in seconds) for synthesising a speed-independent circuit from
its STG specification in the atomic complex gate per signal architecture ("PUNT ACG").
Columns "UnfTim", "SynTim" and "EspTim" show the times taken to construct the
STG-unfolding segment, derive the on- and off-set covers and apply Espresso to optimise the covers,
respectively. Column "TotTim" shows the total time taken to synthesise a particular circuit.
For comparison, the same set of benchmarks was synthesised using two known tools, Petrify
and SIS. Their timings are grouped under "Other tools". Literal count (columns
"LitCnt") was used as a measure of the quality of the new synthesis method. The literal
count shows the total number of literals in the obtained covers of the final implementations. The
number of signals (column "Sigs"), which influences the complexity of the specification and its
behavioural representation, is also given for each specification.
As can be observed, the synthesis technique based on the STG-unfolding segment
produces implementations comparable with those produced by other tools. The timing
results show that our technique compares favourably to Petrify. It is also comparable with
SIS on the benchmarks with a low signal count and it becomes increasingly better with the
growth of the signal count. These results show that for small benchmarks, the overheads
of constructing the STG-unfolding segment and traversing it may outweigh the time spent on
constructing a small reachability graph with an efficient implementation. Using a stronger
correctness condition for approximated covers may produce a slightly worse implementation
due to the fact that the DC-set is partitioned.

[Figure 6 plot: time in seconds (logarithmic axis, 10 to 10,000) against the number of signals (5 to 50), with curves for SIS, Petrify and PUNT and a separate point labelled cfpp.]
Figure 6: Experimental results for Muller pipeline.
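The aggregate comparison can be checked directly against the "Total" row of Table 1. The sketch below only does the arithmetic; the variable names are arbitrary, and the three per-benchmark rows are copied from the table as (Sigs, PUNT TotTim, SIS time).

```python
# Total synthesis times in seconds, taken from the "Total" row of Table 1.
totals = {"PUNT": 146.78, "Petrify": 520.16, "SIS": 1092.77}

# Ratio of each other tool's total time to the PUNT total.
ratio = {tool: t / totals["PUNT"] for tool, t in totals.items() if tool != "PUNT"}
for tool, r in ratio.items():
    print(f"{tool}: {r:.2f}x the PUNT total")

# A few individual rows: (number of signals, PUNT TotTim, SIS time).
rows = {
    "nowick": (6, 0.57, 0.23),
    "imec-master-read.csc": (18, 77.00, 630.52),
    "tsbmSIBRK": (25, 42.70, 141.51),
}
punt_faster = {name: punt < sis for name, (_, punt, sis) in rows.items()}
print(punt_faster)
```

On the totals PUNT is roughly 3.5 times faster than Petrify and about 7.4 times faster than SIS, while on a small benchmark such as nowick SIS is still the quicker tool.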
The second series of experiments shows the feasibility of the new method on a set of
scalable examples. The Muller pipeline was chosen to illustrate the ability of the new method.
Experimental results are shown in Figure 6. As can be observed, existing tools soon choke on
the size of the specification, either running out of memory or taking a prohibitively long time.
Both SIS and Petrify exhibit doubly exponential growth of the time taken. The first dependency
is due to the state space explosion, the second is due to the exponential complexity of the
exact synthesis process used in both tools. In addition, we synthesised a Counterflow pipeline
specification [13] which has 34 signals. Of the existing tools, only Petrify was able to
synthesise it, taking more than 24 hours. At the same time PUNT was able to synthesise
it in under 2 hours, thus giving an order of magnitude gain in speed. This is shown on the
graph as a circled dot.
6 Conclusions
In this paper we presented a new method for synthesis of speed-independent circuits. Our
approach is based on the STG-unfolding segment. It uses the segment as a model from which
an implementation is obtained. As the size of the STG-unfolding segment is often smaller
than the size of the SG, it is possible to synthesise specifications of larger sizes. In addition,
due to the smaller size of the semantic model, the implementation can be achieved faster on
a number of moderate sized examples. We demonstrated applicability of our method on an
existing set of benchmarks.
Future development of this method can be directed towards exploring heuristics for the
refinement procedure. In addition, this method can be adapted to other implementation
architectures. In this case, the approximation will be targeted at obtaining the excitation
functions for memory elements. Furthermore, the method can be enhanced by accommodating
checks for weaker correctness conditions for approximated covers. It is possible to find a set
of slices of the STG-unfolding segment where a particular cover becomes TRUE. Thus if an
approximated on-set cover becomes TRUE within slices of the off-set cover approximation
(and vice versa), then further refinement is required. Conversely, if the covers' intersection
is non-empty but they do not become TRUE within the slices of the opposite cover, then
their intersection belongs to the DC-set. This makes it possible to check the cover correctness
condition introduced in Section 2 and to obtain covers with better literal counts.
References
[1] T.A. Chu. Synthesis of Self-Timed VLSI Circuits from Graph-theoretic Specifications.
PhD thesis, MIT, 1987.
[2] Jordi Cortadella, Michael Kishinevsky, Alex Kondratyev, Luciano Lavagno, and Alex
Yakovlev. Petrify: a tool for manipulating concurrent specifications and synthesis of
asynchronous controllers. In Proc. of the 11th Conf. Design of Integrated Circuits and
Systems, Barcelona, Spain, November 1996. To appear.
[3] M. Kishinevsky, A. Kondratyev, A. Taubin, and V. Varshavsky. Concurrent Hardware:
The Theory and Practice of Self-Timed Design. John Wiley and Sons, London, 1993.
[4] K.L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, Boston, 1993.
[5] T.H.-Y. Meng, R.W. Brodersen, and D.G. Messerschmitt. Automatic synthesis of
asynchronous circuits from high-level specifications. In IEEE Transactions on
Computer-Aided Design, volume 8(11), pages 1185-1205, 1989.
[6] T. Miamoto and S. Kumagai. An efficient algorithm for deriving logic functions of
asynchronous circuits. In Proc. of the Second International Symposium on Advanced Research
in Asynchronous Circuits and Systems (ASYNC'96), pages 30-35, Aizu-Wakamatsu,
Fukushima, Japan, March 1996.
[7] Enric Pastor and Jordi Cortadella. An efficient unique state coding algorithm for signal
transition graphs. In Proc. of the IEEE International Conference on Computer Design,
pages 174-177, USA, October 1993.
[8] Enric Pastor, Jordi Cortadella, Alex Kondratyev, and Oriol Roig. Structural methods
for the synthesis of speed-independent circuits. In Proc. European Design and Test
Conference (EDAC-ETC-EuroASIC), pages 340-347, Paris (France), March 1996.
[9] Wolfgang Reisig. Petri Nets, An Introduction. Springer-Verlag, 1985.
[10] L.Ya. Rosenblum and A.V. Yakovlev. Signal graphs: from self-timed to timed ones. In
Proceedings of International Workshop on Timed Petri Nets, Torino, Italy, July 1985,
pages 199-207.
[11] A. Semenov and A. Yakovlev. Event-based framework for verification of high-level
models of asynchronous circuits. Technical Report 487, University of Newcastle upon
Tyne, 1994.
[12] E.M. Sentovich et al. SIS: A system for sequential circuit synthesis. Memorandum
No. UCB/ERL M92/41, University of California, Berkeley, 1992.
[13] A. Yakovlev. Designing control logic for counterflow pipeline processor using Petri nets.
Technical Report 522, University of Newcastle upon Tyne, 1995.
[14] C. Ykman-Couvreur, Bill Lin, Gert Goossens, and Hugo De Man. Synthesis and
optimization of asynchronous controllers based on extended lock graph theory. In Proc.
European Design and Test Conference (EDAC-ETC-EuroASIC), pages 512-517,
February 1993.
[15] Ch. Ykman-Couvreur, B. Lin, and H. DeMan. ASSASSIN: A synthesis system for
asynchronous control circuits. Reference manual, IMEC, 1995.
