Methodology and tools for state encoding in asynchronous circuit synthesis by Cortadella, Jordi et al.
Methodology and tools for state encoding 
in asynchronous circuit synthesis * 
Jordi Cortadella, Univ. Politecnica de Catalunya, Barcelona, Spain 
Michael Kishinevsky, Alex Kondratyev, The University of Aizu, Japan 
Luciano Lavagno, Politecnico di Torino, Italy
Alex Yakovlev, University of Newcastle upon Tyne, United Kingdom 
Abstract 
This paper proposes a state encoding method for asynchronous
circuits based on the theory of regions. A region in a Transition
System is a set of states that “behave uniformly” with respect to
a given transition (value change of an observable signal), and is
analogous to a place in a Petri net. Regions are tightly connected
with a set of properties that must be preserved across the state
encoding process, namely: (1) trace equivalence between the original
and the encoded specification, and (2) implementability as a
speed-independent circuit. We build on a theoretical body of work that
has shown the significance of regions for such property-preserving
transformations, and describe a set of algorithms aimed at
efficiently solving the encoding problem. The algorithms have been
implemented in a software tool called petrify. Unlike many
existing tools, petrify represents the encoded specification as an
STG, and thus allows the designer to be more closely involved in
the synthesis process. The efficiency of the method is demonstrated
on a number of “difficult” examples.
1 Introduction 
In the last decade, Signal Transition Graphs (STGs) [7, 1] have
attracted much of the attention of the asynchronous circuit design
community due to their inherent ability to capture the main
paradigms of asynchronous behaviour: causality, concurrency and
data-dependent and non-deterministic choice. STGs are Petri nets
whose events are interpreted as signal transitions of a modeled
circuit. The STG model, exactly like “classical” Flow Table models,
may require some state signals to be added to those initially
specified by the designer. Adding those state signals is commonly
referred to as solving the Complete State Coding (CSC) problem.
Since [1] a number of different techniques have been proposed to
solve the CSC problem. The first totally general method, described
in [8], used an algorithm whose complexity practically precluded
any optimization, and produced only one, often suboptimal, solution.
The most recent method [9] is based on the concept of an excitation
region for a signal transition (a set of states in which a signal is
enabled to change its value). It has been able to improve on [8]
by adopting a coarser granularity in the exploration of the solution
space. This coarser granularity has a price, though: as we will show
in Section 6, there are a number of STGs which could
not be solved by that method (nor by previous ones, mainly due
to the large number of states), unless changes in the specification
(e.g., reductions in concurrency) are allowed. Moreover, the authors
could not characterize the class of STGs for which their method was
guaranteed to find a solution.
*This work has been partially supported by grant CICYT TIC 95-0419 (J. Cortadella),
EPSRC visiting fellowship GWJ78334 (M. Kishinevsky), MURST project
“VLSI architectures” (L. Lavagno), and EPSRC grant GWJ52327 (A. Yakovlev).
33rd Design Automation Conference®
Permission to make digital/hard copy of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed for
profit or commercial advantage, the copyright notice, the title of the publication and
its date appear, and notice is given that copying is by permission of ACM, Inc. To
copy otherwise, to republish, to post on servers or to redistribute to lists, requires
prior specific permission and/or a fee.
DAC 96 - 06/96 Las Vegas, NV, USA
©1996 ACM 0-89791-779-0/96/0006..$3.50
Our approach differs from the previous work in the area, because
it is based on the notion of regions of states, which is more general
than, albeit related to, that of excitation regions (an excitation region
is a specific intersection of regions). By exploring a broader design
space than [9], we can thus solve a larger number of problems,
and potentially reach better solutions especially in terms of circuit
performance. For example, our approach can efficiently trade off
logic complexity with execution speed, by changing the level of
parallelism with which state signal transitions are inserted. On the
other hand, our search space is still reduced with respect to [8], and
thus we can claim better control on the quality of the solution.
This paper is organised as follows. Section 2 provides some 
theoretical background (the interested reader is referred to [2] for 
the details). Sections 3 and 4 define the idea of property-preserving 
event insertion and apply it to solving the CSC problem. Sections 5 
and 6 describe implementation aspects and experimental results. 
2 Theoretical background 
2.1 Transition systems and Petri nets
Figure 1: A TS (a), the corresponding PN (b), its RG (c)
Informally, a TS ([6]) can be represented as an arc-labeled directed
graph. A simple example of a TS without cycles is shown in
Figure 1,a. A TS is called deterministic if for each state $s$ and each
label $a$ there can be at most one state $s'$ such that $s \xrightarrow{a} s'$. A TS is
called commutative if whenever two actions can be executed from
some state in any order, then their execution always leads to the
same state, regardless of the order.
A Petri Net is a quadruple $N = (P, T, F, m_0)$, where $P$ is a finite
set of places, $T$ is a finite set of transitions, $F \subseteq (P \times T) \cup (T \times P)$
is the flow relation, and $m_0$ is the initial marking. A transition
$t \in T$ is enabled at marking $m_1$ if all its input places are marked.
An enabled transition $t$ may fire, producing a new marking $m_2$ with
one less token in each input place and one more token in each output
place ($m_1 \xrightarrow{t} m_2$). A PN expressing the same behavior as the TS
from Figure 1,a is shown in Figure 1,b.
The set of all markings reachable in $N$ from the initial marking
$m_0$ is called its Reachability Set. A net is called safe if no more
than one token can appear in a place in any reachable marking. The
graph with vertices corresponding to markings of a PN and with
an arc $(m_1, m_2)$ in the graph if and only if $m_1 \xrightarrow{t} m_2$ for some $t \in T$ is called
its Reachability Graph (RG). One can easily check that the RG in
Figure 1,c derived for the PN from Figure 1,b is isomorphic to the
TS (Figure 1,a).
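The enabling and firing rules above translate directly into a breadth-first construction of the reachability graph. The following Python sketch assumes a safe net, so that a marking can be encoded as the set of marked places; the function name `reachability_graph`, the `pre`/`post` dictionaries and the example net are illustrative assumptions and do not reproduce the net of Figure 1.

```python
from collections import deque


def reachability_graph(pre, post, m0):
    """Build the reachability graph of a safe Petri net.

    pre[t] / post[t] are the sets of input / output places of transition t.
    A marking is a frozenset of marked places (valid only for safe nets).
    Returns (set of reachable markings, set of arcs (m1, t, m2)).
    """
    seen = {m0}
    arcs = set()
    queue = deque([m0])
    while queue:
        m = queue.popleft()
        for t in pre:
            if pre[t] <= m:  # t is enabled: all its input places are marked
                # Firing removes a token from each input place and adds one
                # to each output place.
                m2 = frozenset((m - pre[t]) | post[t])
                arcs.add((m, t, m2))
                if m2 not in seen:
                    seen.add(m2)
                    queue.append(m2)
    return seen, arcs


# Hypothetical net: a and b fire concurrently, then c joins them.
pre = {
    "a": frozenset({"p1"}),
    "b": frozenset({"p2"}),
    "c": frozenset({"p3", "p4"}),
}
post = {
    "a": frozenset({"p3"}),
    "b": frozenset({"p4"}),
    "c": frozenset({"p5"}),
}
markings, arcs = reachability_graph(pre, post, frozenset({"p1", "p2"}))
print(len(markings), len(arcs))
```

For nets that are not safe, markings would have to be multisets (for example a `collections.Counter`), and the construction need not terminate.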
2.2 Regions and Excitation Regions 
Let $S_1$ be a subset of the states of a TS, $S_1 \subseteq S$. If $s \notin S_1$
and $s' \in S_1$, then we say that transition $s \xrightarrow{a} s'$ enters $S_1$. If
$s \in S_1$ and $s' \notin S_1$, then transition $s \xrightarrow{a} s'$ exits $S_1$. Otherwise,
transition $s \xrightarrow{a} s'$ does not cross $S_1$. A region is a subset of
states with which all transitions labeled with the same event $e$ have
exactly the same “entry/exit” relation. This relation will become
the predecessor/successor relation in the Petri net.
Let us consider the TS shown in Figure 1. The set of states
$r_3 = \{s_2, s_3, s_6\}$ is a region, since all transitions labeled with $a$ and
with $b$ enter $r_3$, and all transitions labeled with $c$ exit $r_3$. On the
other hand, $\{s_2, s_3\}$ is not a region since transition $s_1 \xrightarrow{b} s_3$ enters
this set, while another transition also labeled with $b$, $s_4 \xrightarrow{b} s_6$, does
not.
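The region condition can be checked mechanically by classifying every transition of an event as entering, exiting or not crossing a candidate set. The sketch below is a minimal illustration in Python; the TS is a hypothetical one, chosen only so that the two sets discussed above behave as stated (Figure 1 itself is not reproduced here), and the names `crossing_kinds` and `is_region` are assumptions of this example.

```python
# Hypothetical TS given as (source, event, target) triples.
TS = [
    ("s1", "a", "s2"), ("s1", "b", "s3"),
    ("s2", "c", "s4"), ("s3", "c", "s5"),
    ("s4", "b", "s6"), ("s5", "a", "s6"),
    ("s6", "c", "s7"),
]


def crossing_kinds(ts, subset, event):
    """Return the set of entry/exit relations that `event` has with `subset`."""
    kinds = set()
    for s, e, s2 in ts:
        if e != event:
            continue
        if s not in subset and s2 in subset:
            kinds.add("enter")
        elif s in subset and s2 not in subset:
            kinds.add("exit")
        else:
            kinds.add("nocross")
    return kinds


def is_region(ts, subset):
    """A set is a region if every event has exactly one kind of relation."""
    events = {e for _, e, _ in ts}
    return all(len(crossing_kinds(ts, subset, e)) == 1 for e in events)


print(is_region(TS, {"s2", "s3", "s6"}))  # a, b enter; c exits
print(is_region(TS, {"s2", "s3"}))        # b both enters and does not cross
```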
A region $r$ is a pre-region of event $e$ if there is a transition labeled
with $e$ which exits $r$. A region $r$ is a post-region of event $e$ if there is
a transition labeled with $e$ which enters $r$. The sets of all pre-regions
and post-regions of $e$ are denoted with $^\circ e$ and $e^\circ$ respectively.
While regions in a TS are related to places in the corresponding
PN, an excitation region for event $a$ is a maximal set of states in
which transition $a$ is enabled. Therefore, excitation regions are
related to transitions of the PN. A set of states is called an excitation
region for event $a$ (denoted by $ER_j(a)$) if it is a maximal connected
set of states such that for every state $s \in ER_j(a)$ there is a transition
$s \xrightarrow{a}$. Since any event $a$ can have several separated ERs, an index $j$ is
used for the distinction between different connected occurrences of $a$
in the TS. In the TS from Figure 1,a there are two excitation regions
for event $a$: $ER_1(a) = \{s_1\}$ and $ER_2(a) = \{s_5\}$. Similarly to
ERs, we define switching regions as connected sets of states reached
immediately after the occurrence of an event.
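Excitation regions can be obtained as the connected components of the set of states in which an event is enabled. The Python sketch below assumes that "connected" means connected through TS arcs between enabled states, ignoring arc direction; the TS is a hypothetical example, not the one of Figure 1, and `excitation_regions` is a name introduced for illustration.

```python
# Hypothetical TS given as (source, event, target) triples.
TS = [
    ("s1", "a", "s2"), ("s1", "b", "s3"),
    ("s2", "c", "s4"), ("s3", "c", "s5"),
    ("s4", "b", "s6"), ("s5", "a", "s6"),
    ("s6", "c", "s7"),
]


def excitation_regions(ts, event):
    """Return the maximal connected sets of states in which `event` is enabled."""
    enabled = {s for s, e, _ in ts if e == event}
    # Undirected adjacency restricted to the enabled states (assumption).
    adj = {s: set() for s in enabled}
    for s, _, s2 in ts:
        if s in enabled and s2 in enabled:
            adj[s].add(s2)
            adj[s2].add(s)
    regions, visited = [], set()
    for start in sorted(enabled):
        if start in visited:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(adj[u] - component)
        visited |= component
        regions.append(component)
    return regions


print(excitation_regions(TS, "a"))
```

Switching regions could be computed the same way, starting from the set of target states of the event instead of its source states.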
3 Property-preserving event insertion 
Event insertion is informally seen as an operation on a TS which
selects a subset of states, splits each state in it into two states and
creates, on the basis of these new states, an excitation and switching
region for a new event. Figure 2 shows the chosen insertion scheme,
analogous to that used by most authors in the area, in the three main
cases of insertion with respect to the position of the states in the
insertion set $ER(x)$ (entrance to, exit from or inside $ER(x)$).
Figure 2: Event insertion scheme
State signal insertion must also preserve the speed-independence
of the original specification, which is required for the existence of a
hazard-free asynchronous circuit implementation.
An event $a$ of a TS $A$ is said to be persistent in a subset $S'$
of states of $S$ iff $\forall s_1 \in S', b \in E : [s_1 \xrightarrow{a} \wedge (s_1 \xrightarrow{b} s_2) \in T] \Rightarrow s_2 \xrightarrow{a}$.
An event is said to be persistent if it is persistent
in $S$. For a binary encoded TS, determinism, commutativity and
output event persistency guarantee speed-independence of its circuit
implementation. Formally, we say that an insertion state set $ER(x)$,
in a TS $A'$ obtained from a deterministic and commutative TS $A$ by
inserting event $x$, is a speed-independence preserving subset (SIP-set)
iff (1) for each $a \in E$, if $a$ is persistent in $A$, then it remains
persistent in $A'$, and (2) $A'$ is deterministic and commutative.
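The persistency condition can be checked by a single pass over the transitions of the TS. The sketch below is an illustration under one stated assumption: the firing of $a$ itself is not counted as disabling $a$ (that is, only events $b \neq a$ are examined), which is the usual reading of persistency. The function names and the two small TSs are hypothetical.

```python
def enabled_events(ts):
    """Map every state to the set of events enabled in it."""
    enabled = {}
    for s, e, s2 in ts:
        enabled.setdefault(s, set()).add(e)
        enabled.setdefault(s2, set())
    return enabled


def is_persistent(ts, a, subset=None):
    """Check that no other event disables `a` in the states of `subset`.

    ts is a list of (source, event, target) triples; subset=None means
    all states. Assumption: transitions labeled with `a` itself are skipped.
    """
    enabled = enabled_events(ts)
    for s1, b, s2 in ts:
        if subset is not None and s1 not in subset:
            continue
        if b != a and a in enabled[s1] and a not in enabled[s2]:
            return False
    return True


# Concurrency diamond: a and b can fire in any order.
diamond = [("s0", "a", "s1"), ("s0", "b", "s2"),
           ("s1", "b", "s3"), ("s2", "a", "s3")]
# Choice: firing b disables a.
choice = [("s0", "a", "s1"), ("s0", "b", "s2")]

print(is_persistent(diamond, "a"), is_persistent(choice, "a"))
```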
The following two properties of insertion sets, based on theory
developed in [2], link together the notions of TS regions and SIP-sets
and provide a rationale for our approach.
Property 3.1
(P1) If $r$ is a region in a commutative and deterministic TS, then $r$
is an SIP-set.
(P2) If $r$ is an excitation region of an event $a$ in a commutative and
deterministic TS and $a$ is persistent in $r$, then $r$ is an SIP-set.
(P3) If $r_1$ and $r_2$ are pre-regions of the same event in a commutative
and deterministic TS, $r_1 \cap r_2$ is connected and all exit events of
$r_1 \cap r_2$ are persistent, then $r_1 \cap r_2$ is an SIP-set.
These properties suggest that the good candidates for insertion sets 
should be sought on the basis of regions and their intersections 
(while the approach of [9] could exploit only case P2). Since any 
disjoint union of regions is also a region, this gives an important 
corollary that nice sets of states can be built very efficiently, from 
“bricks” (regions) rather than “sand” (states). 
4 Solving Complete State Coding 
A Signal Transition Graph (STG, [1, 7]) is a Petri net labeled with
up and down transitions of a set of signals (denoted by $x+$ and $x-$
for signal $x$ respectively).
A necessary condition for STG implementability is consistent
labeling. Informally, this means that in every firing sequence from
the initial marking, rising and falling transitions alternate for each
signal. In other words, each marking can be uniquely labeled with a
vector of signal values. Once consistency is ensured, Complete State
Coding (CSC) becomes necessary and sufficient for the existence
of a logic circuit implementation. A consistent STG satisfies the
CSC property if for every pair of states $s, s'$ of the associated TS,
such that $v(s) = v(s')$ (where $v(s)$ is the vector of signal values
labeling $s$), the set of non-input transitions enabled in
both is the same.
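The CSC property can be tested by comparing every pair of states that share a binary code. The following Python sketch is a minimal illustration; the dictionaries `code` and `enabled`, the function name `csc_conflicts` and the four example states are assumptions of this example. Since two states with the same code that enable the same signal necessarily enable the same transition of it, the sketch compares enabled non-input signals.

```python
from itertools import combinations


def csc_conflicts(code, enabled, non_inputs):
    """Return the pairs of states that violate CSC.

    code[s]    : tuple of signal values labeling state s
    enabled[s] : set of signals enabled in state s
    non_inputs : set of non-input signals
    """
    conflicts = []
    for s, t in combinations(sorted(code), 2):
        same_code = code[s] == code[t]
        same_outputs = (enabled[s] & non_inputs) == (enabled[t] & non_inputs)
        if same_code and not same_outputs:
            conflicts.append((s, t))
    return conflicts


# Hypothetical states over signals (a, b); a is an input, b is a non-input.
# State names follow the convention "value followed by * if enabled".
code = {"1*1": (1, 1), "1*1*": (1, 1), "00*": (0, 0), "0*0*": (0, 0)}
enabled = {"1*1": {"a"}, "1*1*": {"a", "b"}, "00*": {"b"}, "0*0*": {"a", "b"}}

print(csc_conflicts(code, enabled, non_inputs={"b"}))
```

The pair with code 00 is not reported because b is enabled in both states; only the enabling of the input a differs.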
Assume that the set of states $S$ in a TS is partitioned into two
subsets which are to be encoded by means of an additional signal
to solve some CSC conflicts. Let $r$ and $\bar{r} = S - r$ denote the
blocks of such a partition. In order to implement such an encoding,
we need to insert appropriate transitions of the new signal in the
border states between the two subsets.
In this paper we shall consider the so-called exit border (EB) of
a partition block $r$, denoted by $EB(r)$, which is informally a subset
of states of $r$ with transitions exiting $r$. We call $EB(r)$ well-formed
if there are no transitions leading from states in $EB(r)$ to states in
$r - EB(r)$.
Consider the example in Figure 3 (enabled signals have their
value followed by * in the signal label). State pair (1*1, 1*1*) has
a CSC conflict, assuming that signal a is input and b is non-input,
and so do (1’1, 1*1*) and (0*1, 01*) (while (00*, 0*0*) does not,
because b is enabled in both). The partition $r = r_2$, $\bar{r} = r_2'$
separates all conflicting pairs, and can thus be tentatively used to
solve the conflicts. The borders, in this case, are denoted by the
shaded areas. If they are selected as excitation regions for the new
signal y, we obtain the TS (c). Note that some border states are
conflicting. This means that the new TS will still have secondary
CSC problems, that must be solved by iterating the procedure (the
proof of convergence is given in [2]).
Figure 3: Illustration of event insertion
Note that we need each new signal $x$ to orderly cycle through
states in which it has value 0, 0*, 1 and 1*. We can formalize
this requirement with the notion of I-partition ([8] used a similar
definition).
Given a TS $TS = (S, T, E, s_{in})$, an I-partition is a partition of
$S$ into four blocks: $S^0$, $S^1$, $S^+$ and $S^-$. $S^0$ ($S^1$) defines the states
in which $x$ will have the value 0 (1). $S^+$ ($S^-$) defines $ER(x+)$
($ER(x-)$). For a consistent encoding of $x$, the only allowed events
crossing boundaries of the blocks are the following:
$S^0 \rightarrow S^+ \rightarrow S^1 \rightarrow S^- \rightarrow S^0$, $S^+ \rightarrow S^-$ and $S^- \rightarrow S^+$ (the latter two would
cause a persistency violation, though).
The problem of finding an I-partition is reduced to finding a
bipartition of $S$. Each block $b$ of $S$ induces a bipartition $\{b, \bar{b}\}$ ($\bar{b} = S \setminus b$).
Given a block $b$, an I-partition can be calculated by defining
$S^+$ and $S^-$ with the following recursion:
1. $\{s \in b \mid \exists\, s \rightarrow s' \wedge s' \in \bar{b}\} \subseteq S^+$
   $\{s \in \bar{b} \mid \exists\, s \rightarrow s' \wedge s' \in b\} \subseteq S^-$
2. $[s \in S^+ \wedge s' \in b \wedge s \rightarrow s'] \Rightarrow s' \in S^+$
   $[s \in S^- \wedge s' \in \bar{b} \wedge s \rightarrow s'] \Rightarrow s' \in S^-$
and finally $S^0 = b - S^+$ and $S^1 = \bar{b} - S^-$. The sets of states defined
by condition 1 correspond to the smallest “legal” exit border of $b$ with
respect to $\bar{b}$ ($EB(b)$). The additional states of condition 2 define
the smallest well-formed EBs. We will denote by $MWFEB(b)$ the
minimal well-formed EB of $b$.
  bricks = calculate_all_bricks()
  frontier = good_blocks = {the best FW bricks}
  repeat /* heuristic search */
    new_frontier = ∅
    for each bl ∈ frontier do
      for each br ∈ bricks adjacent to bl do
        new_bl = bl ∪ br
        if cost(new_bl) < cost(bl) then
          good_blocks = good_blocks ∪ {new_bl}
          new_frontier = new_frontier ∪ {new_bl}
    frontier = select the best FW blocks from new_frontier
  until new_frontier = ∅
  return the best block in good_blocks
Figure 4: Heuristic search to find a block for event insertion
The set of candidates explored by our encoding algorithm will 
be restricted to be an I-partition by construction. We proved in [2]  
that the method is complete, in that it can solve CSC for any safe, 
consistent, output-persistent STG. 
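The I-partition construction of this section amounts to two fixed-point computations, one per block of the bipartition. The Python sketch below is a minimal illustration, assuming the convention that $b$ is the block in which the new signal has value 0 (so that $S^+$ is the minimal well-formed exit border of $b$ and $S^-$ that of its complement); the names `mwfeb` and `i_partition` and the ring-shaped example are hypothetical.

```python
def mwfeb(block, other, arcs):
    """Minimal well-formed exit border of `block` with respect to `other`."""
    # Condition 1: states of `block` with a transition leading into `other`.
    border = {s for s, s2 in arcs if s in block and s2 in other}
    # Condition 2: close the border under successors that stay in `block`.
    changed = True
    while changed:
        changed = False
        for s, s2 in arcs:
            if s in border and s2 in block and s2 not in border:
                border.add(s2)
                changed = True
    return border


def i_partition(states, arcs, b):
    """Return (S0, S_plus, S1, S_minus) for block b of the state set."""
    b = set(b)
    b_bar = set(states) - b
    s_plus = mwfeb(b, b_bar, arcs)
    s_minus = mwfeb(b_bar, b, arcs)
    return b - s_plus, s_plus, b_bar - s_minus, s_minus


# Hypothetical TS: a ring of six states, with event labels omitted.
states = ["s0", "s1", "s2", "s3", "s4", "s5"]
arcs = {("s0", "s1"), ("s1", "s2"), ("s2", "s3"),
        ("s3", "s4"), ("s4", "s5"), ("s5", "s0")}

s0_blk, s_plus, s1_blk, s_minus = i_partition(states, arcs, {"s0", "s1", "s2"})
print(s0_blk, s_plus, s1_blk, s_minus)
```

In this example the new signal would rise in s2 and fall in s5, which is the insertion with minimum concurrency for the chosen block.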
5 A heuristic-search strategy to solve CSC 
The main algorithm for the insertion of one state signal is as follows: 
1. Generate a set of I-partitions that preserve speed independence
2. Estimate the cost of the generated I-partitions
3. Select the best I-partition
4. Increase the concurrency of the inserted signal
Initially, all bricks of the TS are calculated by (1) obtaining all
minimal regions of the TS and (2) calculating all possible
intersections of pre-/post-regions of the same event. Since the number of
pre- and post-regions of an event is usually small, an exhaustive
generation is feasible.
The best block for event insertion is obtained as the union of
adjacent bricks (Figure 4). At each iteration of the search, a frontier of FW
(frontier width, a parameter trading off solution quality versus time)
“good” blocks is kept. Each block is enlarged by adjacent bricks
and the new obtained blocks are considered candidates for the next
iteration only if they are “better”, according to the cost function,
than their ancestors. The final block for insertion is calculated as
the union of best disconnected blocks. A greedy block merging
approach guided by the cost function is used.
Given a block $b$, $S^+$ and $S^-$ are initially calculated as the
MWFEB of $b$ and $\bar{b}$ respectively. This leads to a solution with
minimum concurrency of the inserted event. Concurrency can be
increased by enlarging $S^+$ and/or $S^-$ ([5]). In our approach, after
having calculated the best configuration for event insertion, $S^+$ and
$S^-$ are greedily enlarged by adding bricks that are adjacent to them.
The enlargement is only accepted if the new configuration improves
the cost of the solution. The following factors are considered in the
cost function for the insertion of signal $x$ (in order of priority):
• $ER(x+)$ and $ER(x-)$ must be SIP blocks.
• The insertion of $x$ must not modify the specification of the
environment (e.g., $x$ cannot be inserted before input events).
• The number of solved CSC conflicts must be maximized.
• The estimated complexity of the circuit must be minimized.
benchmark | places | trans. | signals | states | CPU
master-read | 37 | 26 | 18 | 18856 | 927
benchmark | states
adfast | 44
par16
pipe8 | 24
pipe16
ASSASSIN | petrify
area | CPU | area | CPU
390 | 0.4 | 294 | 10.5
Table 1: Results for STGs with a large number of states
In the current implementation, the complexity of the circuit is
approximated by the sum of the number of trigger signals for each
ER. Each trigger signal labels one of the transitions which enter an
ER and corresponds to a fan-in signal in the implementation. More
accurate estimations are foreseen for future implementations.
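The frontier-based search of Figure 4 can be sketched in a few lines of Python. This is an illustration only: the cost function and the adjacency test are passed in as parameters because their actual definitions in petrify are not given here, and the toy bricks, `adjacent` and `cost` used in the example are hypothetical.

```python
def find_block(bricks, cost, adjacent, fw):
    """Heuristic search for an insertion block, in the style of Figure 4.

    bricks   : list of frozensets of states
    cost     : function block -> number (lower is better)
    adjacent : function (block, brick) -> bool
    fw       : frontier width
    """
    frontier = sorted(bricks, key=cost)[:fw]
    good_blocks = set(frontier)
    while frontier:
        new_frontier = set()
        for bl in frontier:
            for br in bricks:
                if not adjacent(bl, br):
                    continue
                new_bl = bl | br
                # Keep an enlarged block only if it beats its ancestor.
                if cost(new_bl) < cost(bl):
                    good_blocks.add(new_bl)
                    new_frontier.add(new_bl)
        frontier = sorted(new_frontier, key=cost)[:fw]
    return min(good_blocks, key=cost)


# Toy example: states are integers on a line, bricks are pairs of neighbours.
bricks = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]


def adjacent(bl, br):
    """Hypothetical adjacency: the brick adds states next to the block."""
    return not br <= bl and any(abs(x - y) == 1 for x in bl for y in br)


def cost(block):
    """Hypothetical cost: prefer blocks of exactly four states."""
    return abs(len(block) - 4)


best = find_block(bricks, cost, adjacent, fw=2)
print(sorted(best), cost(best))
```

The search terminates because every block added to the frontier has a strictly lower cost than its ancestor and the number of possible unions of bricks is finite. A larger `fw` explores more candidates per iteration at the price of run time.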
6 Experimental results 
The region-based approach presented in this paper has been
integrated in petrify, a tool for the synthesis of Petri nets [3]. We
have used several benchmarks that no other automatic tool, such as
SIS or ASSASSIN, has been able to solve. Some of them are even
difficult to solve manually by expert designers. Our approach has
succeeded in all of them.
One of the most important features of the CSC algorithm
implemented in petrify is the capability of managing extremely large
state graphs generated from STGs with high concurrency. Two
factors are essential for this capability: (1) the symbolic representation
and manipulation of the state graph by means of Ordered Binary
Decision Diagrams, and (2) the exploration of blocks of states at the
level of regions rather than states. Table 1 presents the CPU times
(in seconds on a SPARCSTATION 20) required to satisfy CSC for
some examples with a vast state space, which cannot be solved in a
reasonable amount of memory or time by SIS or ASSASSIN.
Table 2 reports the results obtained with petrify in comparison
with the ones obtained by ASSASSIN ([5]). The quality of the
results is comparable to those obtained by ASSASSIN. Even with
the estimation of logic performed in petrify, ASSASSIN can
still offer slight improvements in a few examples. This means that
an estimation of logic based on only trigger signals is not accurate
enough.
7 Conclusions 
In this paper we have presented a method and associated algorithms
for solving state coding problems by means of state signal insertion.
Our main target here was solving the Complete State Coding problem,
one of the fundamental issues in asynchronous circuit synthesis from
Signal Transition Graphs. We believe that our approach to (1)
Transition System partitioning, (2) new signal insertion, and (3)
reconstruction of the model in Petri net form, based on the concept
of region of states, will prove useful in solving other problems
in asynchronous circuit synthesis. In particular, the technology
mapping problem for speed-independent circuits ([4]) can be cast
in this form.
nak-pa 
ram-read-sbuf 
sbuf-ram-write
sbuf-read-ctl 
mux2 
postoffice 
duplicator 
specseq4 
seqmix 
seq8 
trcv-bm 
tsend-bm 
ircv-bm 
mod4_counter
master-read
mmu
mr0
mr1
mmu0
mmu1
par_4
divider8 
vme2int 
combuf2 
total 
36 
58 
15 
99 
58 
20 
20 
20 
36 
44 
41 
44 
16 
1882 
174 
302 
190 
174 
82 
628 
18 
74 
11 
406 
764 
244 
1386 
1094 
294 
236 
324 
480 
826 
1010 
842 
648 
726 
698 
1008 
912 
886 
700 
506 
848 
1014 
270 
17658 
0.2 
0.7 
0.0 
3.0 
1.0
0.1 
0.1 
0.1 
0.4 
0.6 
0.6 
0.4 
0.1 
607.7 
10.6 
40.0 
17.9 
8.4 
1.8 
206.4 
0.4 
0.8 
0.2 
902.8 
406 
300 
244 
1774 
800 
294 
236 
324 
480 
824 
962 
1042
648 
750 
132 
626 
650 
610 
514 
506 
914 
938 
262 
16364 
6.0 
23.9 
1.4 
142.2 
0.0 
5.9 
6.2 
7.5 
37.8 
56.5 
0.0 
64.3 
0.0 
75.1 
51.9 
153.6 
23.0 
48.4 
45.0 
88.0 
18.7 
44.4 
3.7 
927.4 
Table 2: Experimental results compared with ASSASSIN 
References 
[1] T.-A. Chu. On the models for designing VLSI asynchronous digital
systems. Integration: the VLSI journal, 4:99-113, 1986.
[2] J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno, and
A. Yakovlev. A region-based theory for state assignment in
asynchronous circuits. Technical Report 95-2-006, University of Aizu,
Japan, October 1995.
[3] J. Cortadella, M. Kishinevsky, L. Lavagno, and A. Yakovlev.
Synthesizing Petri nets from state-based models. In Proceedings of the
International Conference on Computer-Aided Design, November 1995.
[4] A. Kondratyev, M. Kishinevsky, B. Lin, P. Vanbekbergen, and
A. Yakovlev. Basic gate implementation of speed-independent circuits.
In Proceedings of the Design Automation Conference, 1994.
[5] Bill Lin, Chantal Ykman-Couvreur, and Peter Vanbekbergen. A general
state graph transformation framework for asynchronous synthesis. In
Proceedings of the European Design Automation Conference
(EURO-DAC), pages 448-453. IEEE Computer Society Press, September 1994.
[6] M. Nielsen, G. Rozenberg, and P.S. Thiagarajan. Elementary transition
systems. Theoretical Computer Science, 96:3-33, 1992.
[7] L. Y. Rosenblum and A. V. Yakovlev. Signal graphs: from self-timed
to timed ones. In International Workshop on Timed Petri Nets, 1985.
[8] P. Vanbekbergen, B. Lin, G. Goossens, and H. De Man. A generalized
state assignment theory for transformations on Signal Transition Graphs.
In Proceedings of the International Conference on Computer-Aided
Design, pages 112-117, November 1992.
[9] C. Ykman-Couvreur and B. Lin. Optimised state assignment for
asynchronous circuit synthesis. In Proc. Second Working Conf. on
Asynchronous Design Methodologies, London, May 1995.
