Automatic synthesis of fast compact self-timed control circuits by Stevens, Kenneth & Coates, Bill




Kenneth S. Stevens*
Hewlett-Packard Laboratories
Palo Alto, CA USA
*Department of Computer Science
University of Calgary
Calgary, Alberta T2N 1N4
Canada
ABSTRACT
We present a tool called MEAT which has been designed to automatically synthesize transistor
level, CMOS, self-timed control circuits. MEAT has been used to specify and synthesize self-timed
circuits for a fully self-timed 300,000 transistor communication coprocessor. The design is specified using
finite state machines which permit burst-mode inputs. Burst-mode is a limited form of MIC (multiple
input change) signalling. The primary goal of MEAT is to produce fast and compact circuits. In order to
achieve this goal, MEAT implementations permit timing assumptions which can be verifiably supported
at the physical implementation level, and result in significant improvements in speed and area of the
design. Since MEAT has been used for large designs, we have also been forced to make the algorithms
efficient. The result is a tool which is efficient, is easy for today's hardware designers to use since the
specification is based on the commonly used finite state machine control model, and synthesizes CMOS
transistor implementations that are self-timed, fast and compact. The paper presents a description of
the tool, the nature of the algorithms used, and examples of its use.
1 Introduction
Three major constraints (speed of operation, size, and design time) must be considered with any large
chip design, be it a commercial product or a laboratory prototype. As integrated circuit technology
improves, the amount of logic that can be placed on a VLSI circuit increases quadratically, and the
speed of the components increases linearly. Without improvement in design methodology, the
design time of circuits will increase at least quadratically. In addition, synchronous design techniques
are facing critical design and performance difficulties as circuit complexity escalates due to a number of
primary factors:
•  It is becoming extremely costly to manage quadratically narrowing clock skew requirements.
•  An increasingly disproportionate amount of power must be budgeted to the global clock lines. For
example, the DEC Alpha CPU [7] uses 17 watts of the chip's massive 30 watt power budget to
drive the clock.
•  Incremental performance improvements in a synchronous design are extremely costly due to the
global nature of the synchronous timing model.
One clear alternative is to adopt an asynchronous design style. Many proponents would also argue
that asynchronous circuits are inherently faster since they are controlled by locally adaptive timing rather
than the usual global worst-case clock frequency constraints. While we believe that this claim has merit,
we feel that in general it is misleading. Asynchronous circuits usually require more components to
implement the same function. This may result in longer wires, increased area, and reduced performance.
Performance is lost in synchronous systems where there is a significant difference between the average
and the worst case operation delay since the clock period has to accommodate the worst case. A simple
example of this is seen in ripple carry arithmetic circuits, where operational delay is dominated by the
carry propagation times and where the average carry only propagates a short distance. In this simple
case, it is relatively inexpensive to adopt more sophisticated carry chain circuits in order to narrow
the average to worst case difference. When compared to a well tuned synchronous design where this
difference is small, a functionally equivalent asynchronous implementation may actually run slightly
slower. Hence the performance advantage of asynchronous circuits, while often valid, must be analyzed
carefully. In practice the speed of a design is more dependent on the quality of the design and the
fabrication process. Our experience has been that, for large designs, it is easier to achieve the necessary
quality using an asynchronous style than it is in the synchronous discipline, primarily due to the speed
and simplicity obtained from localized communication and control. The speed of asynchronous circuits
has been demonstrated to be on par with that of synchronous versions [12,14].
The lack of a global clock in asynchronous designs inherently eliminates the clock skew and
disproportionate clock power budget problems. The down-side is that asynchronous circuit implementations
must be hazard free which inherently requires additional logic and more careful design. Fortunately these
issues need to be considered only at the lowest level and therefore do not become intractable concerns
as the design becomes complex.
Perhaps the clearest advantage that asynchronous designs have over the synchronous approach
is functional modularity. Asynchronous design modules inherently keep time to themselves, whereas
the timing model intrinsic to synchronous methods applies globally. The result is that a change to a
synchronous module often requires concomitant changes in the other system modules. Asynchronous
system modules connect to other system modules through a functional interface which encodes the
temporal constraints. These interfaces impose sequencing constraints on the interconnected modules
and free them from the need to operate consistently in a global timing model [17]. Hence a change in
some module which significantly changes the module's performance but not its interface will not require
changes in other modules to maintain the functional consistency of the system. As system complexities
escalate, the need to produce designs as composable modules becomes mandatory. Design time and
design performance have become equally critical success factors. In addition the ability to reuse modules
of previous designs can be an important way to save on the design time of subsequent efforts. Reuse of
asynchronous modules is extremely simple while adapting synchronous modules to a new global timing
model is potentially very costly.
Still, after a half century of synchronous design momentum there is much to inhibit a change to
asynchronous design:
1. For board level designs, there is little in the way of an asynchronous component selection.
2. Complex designs require sophisticated electronic CAD tool kits, and these tools have been
constructed to support synchronous design styles.
3. There is a huge pool of experienced synchronous designers who have demonstrated their ability to
produce complex working systems. A significant style change will be painful.
The first problem can be bypassed if the design is an IC rather than a board. Recently there have
been several attempts which, like MEAT, represent a start on a solution to the second problem. The third
problem is significant and will likely take considerable time to solve completely, but today's hardware
designers are relying on increasingly sophisticated synthesis tools in their CAD suite to synthesize
the implementation from a design specification. If an asynchronous synthesis tool permitted a similar
specification style then the change would be less traumatic. This has been our approach with MEAT.
Hardware designers are used to thinking in terms of finite state machines for control and separate
datapath formulation as their design entry specification. It is our view that asynchronous and synchronous
datapath design techniques are quite similar, whereas the implementation strategy for the asynchronous
control components, i.e. the finite state machines, must satisfy some additional constraints. Most of
this additional burden is handled by the MEAT tools. The result is that an experienced synchronous
circuit designer will notice a very minor conceptual shift in order to use MEAT in the creation of an
asynchronous design.
MEAT is by no means complete or totally original. MEAT is best viewed as an ensemble of
methods for asynchronous finite state machine synthesis, many of which were created elsewhere but
modified to suit our needs in the implementation of MEAT. Hence the name MEAT for Modified Ensemble
Asynchronous Tool.
All asynchronous design styles are fundamentally concerned with the synthesis of hazard free
circuits. To avoid subsequent confusion, we use the following terms:
•  Self-timed and asynchronous are general terms used to describe any circuit that is not synchronous
and therefore exhibits hazard free behavior under some conditions. We use them synonymously.
•  Delay-insensitive circuits exhibit hazard free behavior with arbitrary delays assigned to both wires
and gates.
•  Speed-independent circuits exhibit hazard free behavior with arbitrary gate delays but assume zero
delay wires.
There are a large number of rather different design styles in today's asynchronous design
community. One partition of design styles can be based on the type of asynchronous circuit target: locally
clocked [23,11,6], delay-insensitive [15,2,33,20], or various forms of single- and multiple-input change
circuits [31]. Yet another distinction could be made on the nature of the control specification: graph
based [24,18,34,4], programming language based [15,2,33,1], or finite state machine based [23,11]. For
the finite state machine based styles, there is a further distinction that can be made based on the method
by which state variables are assigned [13,29]. The design style space is large and each design style has
its own set of merits and demerits. It is worthwhile to note that virtually all of the design styles focus
on the design of the control path of the circuit since there is little to distinguish the asynchronous and
synchronous datapath design styles.
The methods which produce delay-insensitive circuits, while not perfect [16], are the most tolerant
of variations in device and wire delays. This tolerance improves the probability that a properly designed
circuit will continue to function under variations in supply voltage, temperature, and process parameters.
We chose to slightly expand the domain of timing assumptions which must remain valid to retain
hazard free implementation since this permits higher performance implementations at the expense of
reduced operational tolerance. Our view is motivated by the reality that our designs have to meet
certain performance requirements. For any given layout and fabrication process, we have models which
predict the speeds of the wires and transistors for the desired operational window. We also know the
percentage of error that can be tolerated in those predictions. We could not live with arbitrary delays
for performance reasons and therefore it seems impractical to assume arbitrary delays in order to ensure
hazard free operation of the circuits. The approach taken in MEAT has therefore been to ensure hazard
free operation under sets of timing assumptions that can be verified as being within acceptable windows
of fabrication and operational tolerance.
Compiled implementations based on programming-language-like specifications [15,2,33,1], while
elegant and robust, suffer in performance because they are presently compiled into intermediate library
modules rather than into optimized transistor networks. The module of greatest concern is the C-element.
C-elements are common circuit modules in asynchronous circuits and eliminating them completely is
unlikely. However it has been our experience over the past decade that C-elements are similar to
the proverbial GOTO statements in programming languages, i.e. too many of them are indications
of serious trouble. C-elements are stylized latches and as such are synchronization points. Too much
synchronization reduces parallelism and performance. Our design style does not use C-elements for finite
state machine implementations, although our designs do use C-elements sparingly in interface circuits
such as arbiters.
In order to achieve the necessary hazard free asynchronous finite state machine (AFSM)
implementation, it is necessary to place constraints on how its inputs are allowed to change. The most
common is the single input change or SIC constraint [31]. SIC circuits inherently require state
transitions after each input variable transition. In cases where the next interesting behavior is in response
to multiple input changes, the circuit response will be artificially slow, either due to too many state
transitions or due to the external arbiters required to sequence the multiple inputs. Multiple input
change or MIC circuit design methods have been developed [31,5] but either required input restrictions
or implementation techniques that were unsuitable for our purposes. As a result, we developed a design
style that we call burst-mode which permits a certain style of multiple input change. Our burst-mode
implementation method does not require performance inhibiting local clock generation or flip-flops.
During the development of MEAT, we were fortunate to have Steve Nowick, a member of David
Dill's Stanford University research group, spend two summers with us. He incorporated David Dill's
verifier [8] into the tool kit, and modified the verifier to accommodate our burst-mode timing model and
our timing assumption based, performance oriented design style. Subsequently the HP and Stanford
efforts have had a substantial coupling. In particular, the burst-mode influence can be seen in the
work of Ken Yun and Steve Nowick [23,22]. The more theoretically oriented Stanford work has pointed
out some serious oversights in our early MEAT algorithms and has influenced our approach to hazard
removal.
This paper presents the MEAT synthesis tool, which has proven its ability to greatly reduce
design time while also generating compact, high-performance, self-timed circuits. MEAT allows the
designer to specify the logical operation of asynchronous control components as a finite state machine.
MEAT synthesizes a verifiably correct, hazard free implementation of the design to produce a complex
gate CMOS transistor level schematic. A complex gate is a fully complementary CMOS function which
implements the sum of products equations that describe the implementation. MEAT has been used to
develop a control intensive multicomputer communication chip called the Post Office [27]. The Post
Office contains 300,000 transistors and has an area of 11 x 8.3 mm in the 1.2 micron MOSIS CMOS
process.
The remainder of the paper describes the nature of the design specification and the MEAT algorithms,
and presents several design issues that exemplify our design style.
2 MEAT - a Tool for Control Circuit Synthesis
The MEAT tools are fast enough that alternative design options can be explored. The designer is
freed from the task of understanding the underlying transformations required to produce hazard-free
asynchronous circuits. Asynchronous circuits are specified for MEAT as a burst-mode Mealy state
machine. This style of specification provides a powerful way to encapsulate concurrency, communication,
and synchronization in an accurate and easily understood form. The input specification is compiled into
a set of CMOS complex gates. The result is an implementation which is efficient both in terms of speed
and area.
State flow diagrams are used to model the behavior of state machines implemented using MEAT.
They provide an intuitive method for defining control functionality, and are similar to the flow charts
and state diagrams that are commonly taught in multiple disciplines today [9]. A finite state machine is
modeled as a directed graph, where the nodes represent states and arcs represent transitions between
states. Each arc is labelled with the set of input firings which trigger the transition and an associated
set of output firings. These state diagrams can easily represent parallelism and synchronization, and are
reasonably compact when compared to other graphical specification methods.
MEAT state diagrams allow a constrained form of MIC operation, which we refer to as
burst-mode. When a state change is triggered by a conjunction of input signal transitions (an input burst),
these signals are allowed to change in any order and at any time. Allowing MIC operation simplifies the
definition of synchronization operations and tends to more closely match the designer's mental model of
the hardware. Presently MEAT does not contain a state graph editor so a textual specification format
is used. The more natural graphical state machine description may be trivially mapped to the textual
version: each arc in the state diagram is mapped to a single statement in the text file, which indicates
the source and destination states along with the associated input and output bursts.
Burst-mode state diagrams are reasonably compact when compared to Petri nets, M-nets, STGs,
and other graphical representations. These diagrams work well for transition (2 cycle) or level-mode (4
cycle) signalling protocols. Figure 1 shows an example of an STG (a), enhanced STG (b), and
burst-mode state diagram (c) for an asynchronous D flip-flop. In this paper we assume positive logic, hence
a↑ corresponds to a high transition on signal a. In the textual version a↑ is represented simply as a and
a↓ as a~. The corresponding textual entry version for MEAT is:
:fsm Asynch-Flip-Flop        ;name of FSM for documentation.
:in (D Clk)                  ;list of input variables.
:out (Q)                     ;list of output variables.
:init-in ()                  ;initial value of inputs, default zero.
:init-state 0                ;initial state, default is State 0.
:state 0 (Clk~) 0 ()         ;specification of state transitions:
:state 0 (D * Clk) 1 (Q)     ; format is <current state> <input burst>
:state 1 (Clk~) 1 ()         ;           <next state> <output burst>
:state 1 (D~ * Clk) 0 (Q~)

Figure 1: Sample Flip-Flop Specifications
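To make the mapping from text to arcs concrete, the following Python sketch reads a specification in the format shown above and returns one tuple per :state statement. It is an illustration only and not part of MEAT: the function names (parse_spec, parse_burst, paren_list) and the dictionary layout are assumptions made for this example, and only the keywords that appear in the flip-flop listing are handled.

```python
import re

SPEC = """\
:fsm Asynch-Flip-Flop        ;name of FSM for documentation.
:in (D Clk)                  ;list of input variables.
:out (Q)                     ;list of output variables.
:init-in ()                  ;initial value of inputs, default zero.
:init-state 0                ;initial state, default is State 0.
:state 0 (Clk~) 0 ()
:state 0 (D * Clk) 1 (Q)
:state 1 (Clk~) 1 ()
:state 1 (D~ * Clk) 0 (Q~)
"""

# :state <current state> (<input burst>) <next state> (<output burst>)
STATE_RE = re.compile(r":state\s+(\S+)\s+\(([^)]*)\)\s+(\S+)\s+\(([^)]*)\)")


def parse_burst(text):
    """Map 'D~ * Clk' to {'D': 0, 'Clk': 1}: the level each signal moves to."""
    burst = {}
    for token in text.replace("*", " ").split():
        if token.endswith("~"):
            burst[token[:-1]] = 0  # falling transition
        else:
            burst[token] = 1       # rising transition
    return burst


def paren_list(line):
    """Return the whitespace-separated names inside the parentheses."""
    return line[line.index("(") + 1 : line.rindex(")")].split()


def parse_spec(spec):
    """Hypothetical reader for the textual format; returns a plain dict."""
    fsm = {"name": None, "inputs": [], "outputs": [], "init_state": "0", "arcs": []}
    for raw in spec.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        match = STATE_RE.match(line)
        if match:
            src, ins, dst, outs = match.groups()
            fsm["arcs"].append((src, parse_burst(ins), dst, parse_burst(outs)))
        elif line.startswith(":fsm"):
            fsm["name"] = line.split()[1]
        elif line.startswith(":in "):
            fsm["inputs"] = paren_list(line)
        elif line.startswith(":out "):
            fsm["outputs"] = paren_list(line)
        elif line.startswith(":init-state"):
            fsm["init_state"] = line.split()[1]
    return fsm


fsm = parse_spec(SPEC)
```

Each arc comes back as (source state, input burst, destination state, output burst), which is the information the text says a single statement carries.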
The first automated task performed by MEAT is to generate a primitive flow table [31] from the
textual FSM specification. This is a two-dimensional array structure which captures, in a more detailed
form, the behavior represented by the state diagram. Each row of this table represents a node in the
state diagram; each column represents a unique combination of input signals. Each entry in the table
thus represents a position in the possible state space of the FSM.
For each entry, the value of the output signals and the desired next state may be specified. If a
next-state value is the same as that of the current row, the state machine is said to be in a stable state. If
the next-state value specifies a different row, the table entry represents an unstable state. A simple way
of understanding the flow table is to note that horizontal movement within a row represents changes in
the values of input signals, while vertical movement within a column represents a state transition. All of
our specifications are given in normal form, that is, each unstable entry in the table must lead directly
to a stable state.
Each allowed input burst will result in a particular path through the FSM state-space, starting at
the stable entry where the burst begins. Other entries in the same row may be visited during the course
of the input burst. In order for MIC behavior to be correctly represented, it must be guaranteed that
the circuit will remain stable in the initial row until the input burst is complete. This is an important
point and is a cornerstone of the burst-mode methodology. In essence, any minterm formed from input
variables which can be reached during the course of an input burst must be covered by a stable entry in
the flow table. The minterm defined by the completion of the burst will correspond to an unstable state
which will cause a transition to the target row and fire the output burst.
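The stability requirement can be illustrated with a short Python sketch that enumerates the input minterms visited by one burst. This is not MEAT code; the function name burst_entries and the signal names a, b and c are hypothetical. Every proper subset of the burst's transitions yields a minterm that must be a stable entry in the source row, and the minterm with all transitions applied is the unstable exit entry.

```python
from itertools import combinations


def burst_entries(start, burst):
    """Enumerate the input minterms touched by one input burst.

    start: dict signal -> 0/1 at the stable entry where the burst begins.
    burst: dict signal -> value the signal has once the burst is complete.
    Returns (names, stable, exit_entry); vectors follow the order of names.
    """
    names = sorted(start)
    changing = [s for s in sorted(burst) if start[s] != burst[s]]

    def as_vector(values):
        return tuple(values[n] for n in names)

    stable = set()
    # Proper subsets only: the full set of transitions completes the burst.
    for k in range(len(changing)):
        for subset in combinations(changing, k):
            partial = dict(start)
            partial.update({s: burst[s] for s in subset})
            stable.add(as_vector(partial))

    final = dict(start)
    final.update(burst)
    return names, stable, as_vector(final)


# Hypothetical burst: a and b both rise while c stays high.
names, stable, exit_entry = burst_entries({"a": 0, "b": 0, "c": 1},
                                          {"a": 1, "b": 1})
```

For this two-signal burst the source row needs three stable entries (neither signal changed, only a changed, only b changed), and the fourth minterm is the exit point.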
The output burst, if any, may occur concurrently with the state change, or can be constrained
to happen after the state change has occurred. To allow the flexibility for the later synthesis stages to
choose either option, signals in the output burst are labeled as don't cares in the unstable exit state of
the flow table. Since all state transitions are STT (Single Transition Time), the monotonicity of output
voltage changes is guaranteed, regardless of whether the value of a given transitioning output in an
unstable entry is mapped to logic level zero or one.
Any entry in the flow table not reachable by any allowed sequence of input bursts is labeled as a
don't care and can take on any value for the outputs or next-state values. As in the case of output bursts
discussed above, it is not immediately evident which values will lead to the simplest circuit. Therefore,
the assignment of specific values to the don't care entries is deferred for as long as possible. The inclusion
of these don't cares can significantly simplify state reduction and boolean minimization, and also lead
to more compact circuits.
The next step in the design process is to attempt to reduce the number of rows in the flow table by
merging selected sets of two or more rows into one while retaining the specified behavior. This involves
first calculating the set of maximal compatible states. The set of maximal compatibles consists of the
largest sets of state rows which can be merged, which are not subsets of any other such set. There may
be various valid combinations of the maximal compatibles that can be chosen to produce a reduced table
with the same behavior.
This is essentially the well-known state-reduction problem; unfortunately complications are
introduced due to the MIC nature of the input bursts. "Traditional" methods normally apply only to SIC
circuits, and when used for our burst-mode specifications may produce hazards in the final implementation.
Nowick et al. [21] have developed the modifications necessary to the state-reduction and
subsequent synthesis steps to guarantee that the resulting implementation will be hazard-free under
burst-mode conditions. These modifications are not presently incorporated into MEAT. Currently we use a
verifier [8] on the synthesized implementation. The verifier has been modified to operate with explicit
timing assumptions. Hazards detected in the implementation are then reviewed to see if the circuit
would exhibit correct behavior under reasonable delay assumptions. If these assumptions fall within
acceptable bounds of fabrication and operational constraints then the timing assumption is entered into
the verifier. If an unacceptable assumption is required then the circuit is fixed either by manual repair
or by modifying the state-machine specification. The manual repair usually involves the addition of
an appropriate inverter chain to delay the race critical path.
The final choice of minimized states is an example of the binate covering problem. There are three
constraints on this choice. First, and obviously, only compatible states may be combined (compatibility
constraint). Second, each state in the original design must be contained in at least one of the reduced
states (completeness constraint). Third, selecting certain sets of states to be merged may imply that
other states must also be merged (closure constraint). Grasselli and Luccio [10] have developed a tabular
method for determining a closed cover of states, which is also in the process of being incorporated into
MEAT. At present, MEAT requires the user to manually determine and enter a state covering. If any
of the necessary constraints are not satisfied, MEAT will inform the user that the covering is invalid.
A new flow table representing the behavior of the minimized FSM is then generated by merging
the specified rows of the original flow table. It should be noted that minimizing the number of states
does not always simplify the hardware or increase performance. However, a reduced state
machine can result in fewer state variables, which in most cases does indeed result in a smaller and faster
implementation.
A set of state variables must then be assigned to uniquely identify each row of the reduced flow
table. These state variables are used as feedback signals in the final circuit. In contrast to synchronous
control logic design, state codes may not be randomly assigned, but must be carefully chosen to prevent
races. The MEAT state assignment algorithm is based on a method developed by Tracey [29]. The
Tracey algorithm has the advantage that it produces STT state assignments, which minimize delay in
the implementation. In cases where two or more state variables must change value when transitioning
to a new state, all variables involved are allowed to change concurrently, or race. It must be guaranteed
that the outcome of the race is independent of the order in which the state variables actually transition
in order to produce a non-critical race which exhibits correct asynchronous operation. Several valid
assignments may be produced, and each will be passed to the next stage for evaluation. Each state
assignment will result in a unique implementation.
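The condition that makes a race non-critical can be checked mechanically. The following Python sketch tests a candidate assignment against the usual statement of Tracey's condition: within one input column, any two transitions that end in different states must be separated by at least one state variable that holds one constant value throughout the first transition and the opposite constant value throughout the second. It is a checker written for illustration, not MEAT's assignment algorithm, and the names (critical_races, the states A to D, the column label "x") are hypothetical.

```python
from itertools import combinations


def critical_races(codes, transitions_by_column):
    """Return pairs of transitions that a state assignment fails to separate.

    codes: state -> tuple of 0/1 state-variable values.
    transitions_by_column: column -> list of (source, destination) pairs;
        a stable state can be listed as (s, s).
    """
    nbits = len(next(iter(codes.values())))
    violations = []
    for col, transitions in transitions_by_column.items():
        for (s1, d1), (s2, d2) in combinations(transitions, 2):
            if d1 == d2:
                continue  # same destination: no separation needed
            first, second = {s1, d1}, {s2, d2}
            separated = any(
                len({codes[s][i] for s in first}) == 1
                and len({codes[s][i] for s in second}) == 1
                and codes[s1][i] != codes[s2][i]
                for i in range(nbits)
            )
            if not separated:
                violations.append((col, (s1, d1), (s2, d2)))
    return violations


# Two transitions, A -> B and C -> D, share the hypothetical column "x".
COLUMN = {"x": [("A", "B"), ("C", "D")]}

# The first variable is 0 across A -> B and 1 across C -> D: race-free.
safe = critical_races({"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}, COLUMN)

# Both variables change in both transitions: the race may end in the wrong row.
unsafe = critical_races({"A": (0, 0), "B": (1, 1), "C": (0, 1), "D": (1, 0)}, COLUMN)
```

In the unsafe assignment the transition from A (00) to B (11) can pass through 01 or 10, which are the codes of C and D, so the final state would depend on which variable switches first.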
After state codes are assigned, the next synthesis stage computes a canonical sum of products
boolean expression for each output and state variable. A modified Quine-McCluskey minimization
algorithm is used. The resulting expression includes all essential prime implicants, and possibly other
prime implicants and additional terms necessary to produce a covering free of logic hazards. It may be
possible for each output or state variable to be specified using several alternate minimal equations. The
large number of don't care entries typically present in the flow table causes the standard algorithm to be
rather inefficient and increases the likelihood that more than one minimal expression will be found. The
MEAT implementation contains optimizations for don't care dominant functions. Each possible solution
is given a heuristic "weight" that indicates the expected speed and area cost of implementation using
complex CMOS gates. When multiple state assignments have been produced in the previous step, the
total weight of each unique SOP (sum of products) equation is then used to choose between the various
instantiations.
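As a reminder of how the first phase of Quine-McCluskey minimization treats don't cares, the Python sketch below generates prime implicants by repeatedly merging cubes that differ in one literal. It is the textbook procedure only: MEAT's modifications, its don't care optimizations, the selection of a hazard-free cover and the weighting heuristic are not described in enough detail here to reproduce, and the function names are hypothetical. Cubes are strings over '0', '1' and '-', with the first character as the most significant variable.

```python
from itertools import combinations


def combine(a, b):
    """Merge two cubes that differ in exactly one specified literal."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None


def prime_implicants(on_set, dc_set, nvars):
    """Textbook prime implicant generation with don't cares.

    on_set, dc_set: iterables of minterm numbers; nvars: variable count.
    Don't cares take part in merging, but a prime that covers only
    don't cares is dropped at the end.
    """
    cubes = {format(m, f"0{nvars}b") for m in set(on_set) | set(dc_set)}
    primes = set()
    while cubes:
        merged, used = set(), set()
        for a, b in combinations(sorted(cubes), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= cubes - used  # cubes that could not grow are prime
        cubes = merged

    def covers(cube, minterm):
        bits = format(minterm, f"0{nvars}b")
        return all(c in ("-", b) for c, b in zip(cube, bits))

    return {p for p in primes if any(covers(p, m) for m in on_set)}


# Majority function of three variables: on-set minterms 3, 5, 6 and 7.
majority = prime_implicants([3, 5, 6, 7], [], 3)

# One required minterm (01) plus one don't care (11) merge into a single cube.
with_dc = prime_implicants([1], [3], 2)
```

The majority function yields three primes, each with one variable eliminated, and in the second call the don't care lets the single required minterm grow into a one-literal cube.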
The minimized equations produced in the previous step are then used to automatically generate
transistor netlists, suitable for simulation, representing complex CMOS gates. An interface to the
Electric [25] design system is used to automatically produce a schematic diagram to help guide the
layout process, which unfortunately has not yet been automated. The complementary nature of CMOS
n-type and p-type devices is exploited to generate a single, complex, static gate through simple function
preserving transformations. These transformations can increase performance while reducing the area
and device count. As a SOP equation is folded into a complex gate, the number of logic levels required
to generate the output can be reduced. If the function is too large to be implemented as a single module,
it can easily be broken up into a tree of complex gates with 2 or more logic levels, but better overall
performance [28]. Typical state machine implementations have response times between 3 and 5 2-input
NAND gate delays.
Our complex gate design generates negative logic outputs (low voltage levels for asserted signals).
A convention of positive logic levels is assumed for all signals external to the state machine, requiring
that the outputs be inverted. This is a feature for performance reasons as the gain of the inverter can be
used as a driver to increase signal strength and reduce rise and fall times. When outputs need to drive
a large load, a buffer tree can be used.
All state machines also require a reset signal to place the storage logic into the correct initial
state. Storage in these state machines is implemented via the state variables. If a single complex gate is
used to generate the output, the state storage is reset by NOR-ing the output with the reset line. For
complex gate trees, a resettable NAND gate is used. Although the performance of the NOR gate is not
optimal, the load on the feedback lines is local to the state machine and typically small so a large gain
is not required.
3 Design Issues and Examples
Figure 1 essentially shows how AFSM designs are specified using MEAT. Rather than presenting a series
of more complex designs which will show roughly the same thing, we will present a number of design
vignettes which illustrate interesting points in the design space, and an example of MEAT usage.
3.1 A Story about C-element Design
During a 1986 course on asynchronous circuits taught by Ivan Sutherland and Bob Sproull, the discussion
turned to the design of the common C-element. At that time, the standard C-element consisted of a
2-high stack, followed by an inverter. This element had the problem that it was a dynamic gate. If the
two inputs remained at different voltage levels for long enough the C-element's state would be lost and
cause an invalid output transition. This was clearly unacceptable for general asynchronous applications.
Static versions of the circuit were created by including a weak "trickle charge" inverter to maintain
correct voltage on the internal node in the absence of it being directly driven by the 2-high stack.
The trickle charge inverter was a problem for several reasons. First, it reduced the performance
of the circuit. When the internal node c (in Figure 2a) needed to be flipped to a different voltage, the
trickle inverter would be actively driving the circuit one way, while the 2-high stack was actively driving
it another way. The 2-high stack needed to charge the node, as well as dissipate the current supplied
from the trickle inverter. This caused increased power consumption due to the existence of a DC path
between the power rails during a state change. Secondly, the inherent gain of an inverter is greater than
the gain of a 2-high stack. This design requires the 2-high stack to overpower the inverter to flip the
state of the device. Unless the drive of the 2-high stack is significantly greater than the inverter, the
node becomes susceptible to noise problems which could result in hazards. This gain difference can only
be overcome by reducing the size of the inverter and increasing the size of the 2-high stack. Hence the
sizing of the components becomes critical. Increasing the size of the 2-high stack slows the circuit by
requiring additional input drive. Decreasing the width and increasing the length of the inverter reduces
the reliability of the inverter and the portability to other processes.
After the day's discussion, we spent several hours attempting to come up with a better C-element
design which eliminated the trickle inverter, yet did not add significant complexity to the component.
Ultimately a design was found which was compact and efficient. This design has been widely used at a
number of sites. It required 4 more transistors than the trickle charge design. However, the
2-high stack could be of optimally sized transistors and there was no fight to drive the internal node c.
Although this circuit was larger, and the inputs drive twice the number of devices, it was significantly
faster than the original design and avoided the power consumption, noise, portability, and function
problems of the old design.
Several years later, curiosity led us to see what MEAT would produce for a C-element. The
exact same circuit was produced by MEAT in an instant. MEAT generated equations for the circuit
shown in Figure 2b and the back-end schematic generated the equivalent but optimized version shown
in Figure 2c.
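The C-element function c = ab + ac + bc can be checked with a small behavioral model. The following Python sketch is illustrative only: the function names are hypothetical and are not part of MEAT. It evaluates the next-state equation and confirms the defining behavior of a C-element, namely that the output follows the inputs when they agree and holds its previous value when they differ.

```python
from itertools import product


def c_element_next(a: int, b: int, c: int) -> int:
    """Next value of node c for inputs a and b: c = ab + ac + bc."""
    return (a & b) | (a & c) | (b & c)


def check_c_element() -> bool:
    """Compare the equation against the behavioral definition for all 8 cases."""
    for a, b, c in product((0, 1), repeat=3):
        # Inputs agree: output follows them. Inputs differ: output holds c.
        expected = a if a == b else c
        if c_element_next(a, b, c) != expected:
            return False
    return True


if __name__ == "__main__":
    print(check_c_element())
```

Because the feedback term c appears in the equation itself, no separate keeper device is modeled here; this is only a logic-level view and says nothing about transistor sizing or drive strength.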
3.2 Using Burst-Mode to Increase Performance
Burst-mode assumes that inputs and outputs are generated as discrete sets, or bursts. In general, this
violates delay-insensitive and speed-independent assumptions. For example, assume that an input burst
has completed, and the resulting output burst causes several outputs to be generated. One of the outputs
could be generated before the others. This output can be received by a destination module which could
in turn generate an output which is fed back as an input to the original module even before the rest of the
outputs have been generated. This violates burst-mode operation as the next input burst has occurred
before the previous output burst has completed. Burst-mode assumes that all outputs in the burst must
be generated before the environment can respond to the output burst, or computation interference may
occur. The cases where computation interference can occur can be flagged and checked by circuit timing
analysis.

a) Trickle inverter C-element b) Complex gate for c = ab + ac + bc
Figure 2: C-Elements, Hand Optimized Matched by MEAT
MEAT's burst-mode MIC model is similar to the fundamental mode assumptions for traditional
SIC AFSM designs. Namely, we assume that once an input burst has arrived, the AFSM will settle in a
stable state before the next burst can arrive. If this assumption cannot be met, then external arbitration
will be required to enforce the assumption.
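The burst-mode requirement that every output of a burst be generated before the environment responds can be phrased as a simple timing check. The sketch below is a hypothetical illustration, not MEAT's timing analysis: the function name, the signal names, and the delay figures are all assumptions made for the example. It flags any output whose earliest environment response could arrive before the slowest output of the same burst has been generated.

```python
def interference_risks(output_times, response_delays):
    """Return outputs whose feedback could arrive before the burst completes.

    output_times: output name -> time at which that output is generated.
    response_delays: output name -> minimum delay before the environment
        can feed a new input back in response to that output. Outputs that
        are missing from this mapping are assumed never to cause a response.
    """
    burst_done = max(output_times.values())
    risky = []
    for name, generated_at in sorted(output_times.items()):
        delay = response_delays.get(name)
        if delay is not None and generated_at + delay < burst_done:
            risky.append(name)
    return risky


if __name__ == "__main__":
    # Hypothetical numbers in arbitrary time units.
    times = {"out_a": 1.0, "out_b": 3.0}
    delays = {"out_a": 1.5, "out_b": 2.0}
    print(interference_risks(times, delays))
```

With these made-up numbers, a response to out_a could arrive at time 2.5, before out_b is generated at time 3.0, so out_a is the case a timing analysis would need to examine.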
If an input burst changes an internal state variable, speed-independent operation will generally
require the state variable to stabilize before the output can be changed. Performance can be improved
if outputs can change concurrently with state changes. MEAT accomplishes this by making the transitioning
output a don't care in the unstable exit point of a row in the flow table. This places a priority
on logic minimization, but usually will produce a circuit which can generate an output concurrent with
state changes. The fundamental mode assumption guarantees that the AFSM is ready to accept the next
input burst when it arrives, as the state variable transition has completed and the logic has stabilized.
Unger has shown that it is possible to weaken this fundamental mode assumption [30], although his
method is not presently incorporated into MEAT.
3.3 When Speed-Independent Circuits Fail: The Isochronous Fork
Ideally all asynchronous circuits should be designed as delay-insensitive modules. However, performance
requirements may force one to make weakening assumptions about circuit behavior. Many of these
assumptions are realistic, as physical devices and wires do not require unbounded delays to generate and
propagate signals. However, care must be used to assure that the circuit complies with these assumptions
under all operating conditions or the design will be unsafe and costly failures may occur.
Simplifying assumptions are best exploited when they are constrained to a fixed-extent physical
domain, as is the case with AFSM modules. Hierarchical composition of these modules can then proceed
conforming to delay-insensitive rules since all of the external interfaces should be designed to avoid timing
assumptions. Inside an AFSM, the relative delay of wires and gates can be more easily controlled,
analyzed, and modified as the constraints are all local. When these timing assumptions apply outside
an individual module then the entire system must be analyzed to assure compliance with the timing
assumption set. At this point there is little to distinguish the circuit from a synchronous one.
A common performance and synthesis assumption made by many asynchronous circuit designers
is that of speed-independence. The assumption that wire delay is zero leads to the isochronous fork
assumption. This implies that multiple devices driven by a single component react to the signal change
at approximately the same time. This model works well for situations where the transistors are slow
and the paths are fast. Unfortunately this model becomes less valid as IC technology progresses and is
certainly suspect even today.
Furthermore, whenever the rise or fall time of an isochronous fork is greater than the switching
delay of any physical device, failure may occur due to variances in switching thresholds. Noise, long wires,
and high-capacitance paths exacerbate the problem. Within a particular AFSM module, this problem
can be managed successfully but between modules it is difficult. Martin [14] and Van Berkel [32] have
both described circuit failures due to paths which did not behave in an isochronous fashion. Both failures
were the result of using C-elements in module interfaces. C-elements inherently contain an isochronous
fork. Namely the output of the C-element will be an output of the module as well as being fed back
locally to maintain the C-element's state.

Figure 3: State Machine Generation (MEAT state machine logic blocks)
The philosophy we have used in the MEAT tool and in the design of our circuits is to remove
isochronous forks from external interfaces. MEAT state machines are broken into the partitions shown
in Figure 3. Our philosophy is that we would rather increase the cost and difficulty of designing modules
if it can simplify the composition of systems. Timing assumptions are always easier to analyze and fix
in a small, local cell rather than across a series of modules. Systems are hard to design and low-level
modules are relatively easy. If by making the module design harder, it becomes easier to do the inherently
complex task then the overall difficulty is reduced.
The trigger box has two functions. First, high capacitance inputs (inputs with a slow rise time) will
be passed through an inverter or Schmitt trigger. This will reduce the load on the input line, which can
increase circuit performance. It also results in crisp rise and fall times of signals internal to the AFSM.
Secondly, when an unasserted input signal is required by the state or output boxes, the trigger box will
invert that signal. Each input will have its inverted and uninverted signal shared among all function
blocks in the state machine to eliminate hazards and create a smaller implementation. The isochronous
forks created by sharing the inverters are easily controlled within the AFSM domain. Components within
a particular AFSM are physically close. Hence wire delays of the internal signals and the trigger box
delay are normally insignificant.
The driver block is used to generate positive output voltage levels and to increase the signal
strength when the output is heavily loaded. Circuit performance is enhanced since it is sized to drive
its output load appropriately. Isochronous forks in MEAT will only exist when a state variable is used
directly as an output. In such cases, the output can be buffered by one or two inverters to assure the
fork is isolated within the AFSM. While this decreases the performance of the circuit, the module can
function in a delay-insensitive manner and can be safely used without analyzing its load in a broader
context.
This design style has been tested continuously over the last five years. We have designed several
large asynchronous circuits which have generally worked the first time, merely using simulators to verify
correct composition of the modules. This experience has led to a high confidence factor in
the method.
3.4 An AFSM example
In order to illustrate exactly what MEAT does, we will transcribe an actual synthesis run using MEAT
to create a Post Office state machine called the SBUF-SEND-CTL. The behavior is initially specified
as a burst-mode AFSM as shown in Figure 4. This example is taken from the suite of Post Office state
machines publicly available for use by other researchers [26,23].
The specification of sbuf-send-ctl from Figure 4 is textually entered for MEAT as follows:
:fsm sbuf-send-ctl
:in (Deliver Begin-Send Ack-Send) ;list of input variables
:out (Latch-Addr IdleBAR Send-Pkt) ;list of output variables
state 0 (Deliver)













state 6 (Deliver~ * Ack-Send)
7 (Send-Pkt~ * Latch-Addr)
state 7 (Ack-Send~)
2 0
The following is a transcript from a MEAT session. The specification resulted in a single implementation
with two state variables.
> (meat "sbuf-send-ctl.data")
Max Compatibles: ((0 5) (1 2 7) (3 4) (6))
Enter State set: '((0 5) (1 2 7) (3 4) (6))
SOP for "Y1":
18: DELIVER + Y1*BEGIN-SEND~
SOP for "Y0":




30: ACK-SEND + BEGIN-SEND + Y0 + Y1
SOP for SEND-PKT:
12: Y0*BEGIN-SEND~
HEURISTIC TOTAL FOR THIS ASSIGNMENT: 100
The implementation can then be verified for hazard-free operation by the verifier. The verifier
reads the specification and implementation. For this example, the state variables and outputs generated
by MEAT are implemented as two-level AND/OR logic. Each signal is generated independently of the
others. Only direct inputs are shared, so the same inverted signal in different output logic blocks will
use separate inverters. Separate inverters will result in verification errors in the burst-mode
speed-independent analysis. In this example, the begin-send signal is shared by Y1 and send-pkt. The two
inverters are merged and the output is forked to both logic blocks. This implementation is then verified.
Figure 5: Complex CMOS Gate for sbuf-send-ctl Y0
The verifier points out a d-trio hazard [31] which is removed by adding an inverter to change the
sequencing of begin-send into the Y0 logic. The implementation is then verified as hazard free as
follows:
> (verifier-read-fsm "sbuf-send-ctl.data")
Max Compatibles: ((0 5) (1 2 7) (3 4) (6))
Enter State set: '((0 5) (1 2 7) (3 4) (6))
> (setq *impl* (merge-gates '(1 11) *impl*))
> (verify-module *impl* *spec*)
10 20 30 40 50
Error: Implementation produces illegal output.
> (setq *impl* (connect-inverter 10 6 *impl*))
> (verify-module *impl* *spec*)
10 20 30 40 50 60 70 79 states.
The canonical SOP equations generated by MEAT are then transformed into complex gates
for implementation. The CMOS circuit for Y0 is shown in Figure 5. The complex gates are then
manually implemented using the Electric [25] layout editor. The physical layout is then simulated with
COSMOS [3] to check for layout errors. Cooperating sets of state machine cells are interconnected to
form larger modules, integrating clocked datapath logic when necessary.






Figure 6: Hazard removal from “Sendr-Done” state machine (panels: logic with d-trio hazard; d-trio hazard removed)
Figure 6 shows a static d-trio or nonessential function hazard which is found in some of the state
machines produced by MEAT. D-trio hazards are fundamental and cannot be removed in every case,
but they will be detected by the verifier. In this case the hazard occurs because the input burst resulted
in an internal state change while the output burst contained no transition for the Done signal. The
d-trio hazard in this example can produce a static 1-hazard on the Done signal. The input burst is
perceived by the Done output logic after the state change burst, thereby creating the hazard.
The WS signal of the logic with the d-trio also contains an isochronous fork. If we ignore the
potential threshold deviations then timing analysis shows that the physical behavior will not exhibit the
hazard. However, this circuit cannot be included in a system without analyzing the driver, load, and
stray capacitance on the WS input or errors will result.
By modifying the trigger logic in the Sendr-Done state machine shown in Figure 6, we can
eliminate both the d-trio hazard and the external isochronous fork. This incurs no performance penalty. The
WS signal to the Done logic remains delayed by a single inverter, while the WS signal to the state logic
becomes double inverted rather than fed directly into the logic from the input.
The double inversion has the effect of enforcing correct sequencing of the order of arrival of the WS
signal to the Done logic. Transitions on WS will always be perceived by the Done logic before changes
in the state variable, resulting in hazard-free circuit operation. Transitions are ordered such that the
assertion of the state variable is not critical to the performance of the circuit, so the double inversion of
WS into the state logic has no deleterious effect.
3.6 When MIC Circuits Cannot Be Designed: The NAKing Arbiter
MEAT state graphs must be unambiguous and deterministic. Nondeterministic behavior inside a state
graph is not allowed as it can result in metastability. However, the operation of a state machine may
be nondeterministic if a mutual exclusion element (ME) is used to order the arrival of two or more
concurrent inputs into the state machine. MEs are analog devices, and are the only external device
that may be required to implement control functions using the MEAT methodology. They are easily
fabricated in most VLSI technologies, requiring 12 transistors in CMOS.

Figure 7: Naking Arbiter SIC State Machine Specification.
When multiple edges exit a single state, there must be at least one pair of mutually exclusive
signals for all pairs of edges exiting the state [19]. If there is no pair of mutually exclusive signals for
all pairs of edges then the state machine can only operate in single input change (SIC) mode for those
signals. This has been referred to as the semi-modularity property [4].
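The condition on edges exiting a state can be expressed as a pairwise check. The Python sketch below is one possible reading of that condition and is not taken from MEAT: it assumes each edge is described by the set of input signals in its burst, and that the designer supplies the pairs of signals the environment guarantees to be mutually exclusive. The function and signal names are hypothetical.

```python
from itertools import combinations


def unseparated_edge_pairs(edges, exclusive_pairs):
    """Return pairs of exit edges that share no mutually exclusive signals.

    edges: edge name -> set of input signal names in that edge's burst.
    exclusive_pairs: iterable of 2-tuples of signals assumed (by the
        designer) never to be asserted together by the environment.
    """
    exclusive = {frozenset(pair) for pair in exclusive_pairs}
    failing = []
    for name1, name2 in combinations(sorted(edges), 2):
        separated = any(
            frozenset((x, y)) in exclusive
            for x in edges[name1]
            for y in edges[name2]
        )
        if not separated:
            failing.append((name1, name2))
    return failing


if __name__ == "__main__":
    # Hypothetical state with three exit edges.
    exits = {"e1": {"X"}, "e2": {"Y"}, "e3": {"Y", "Z"}}
    print(unseparated_edge_pairs(exits, [("X", "Y")]))
```

In this made-up example, edges e2 and e3 have no mutually exclusive signal pair between them, so under the reading above the state could only be operated in SIC mode for those signals.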
Arbiters are inherently nondeterministic circuits which cannot be directly implemented as an
AFSM. The Naking Arbiter of Figure 7 is an SIC state machine. Since the environment permits the R1
and R2 signals to arrive concurrently, these signals pass through a sequencer before entering the state
machine. A sequencer consists of a set of ME gates, AND gates, and latches, with an input to enable
the next transition. Sequencers are nondeterministic and rather expensive to build in terms of size and
speed.
4 Summary
The goal in the development of the MEAT tool was to generate fast, compact, efficient circuits. Showing
the excellent performance that can be achieved with asynchronous designs is an important part of
forwarding this technology to the general circuit design community. While experienced asynchronous
designers understand that there are more benefits in the asynchronous approach than speed, it is clear
that the dominant metric in evaluating circuit design styles in the commercial arena is performance. Our
Post Office design was no exception; as long as the circuit was fast nobody cared how we did it except
us. We view this as a sad reality, since it relegates the impact of the conceptual elegance of asynchronous
circuits to the academic community.
Building a large, fully self-timed circuit has resulted in many insights. The need for synthesis
and analysis tools that compare with those available to the synchronous design community is of primary
importance. We hope that MEAT is a step in the direction of attracting more broad based interest.
We have publicly offered both the MEAT tool and many of the Post Office state machines to the IC
CAD design community in hopes that others will improve on this step. The need for more robust circuit
behavior and for higher performance levels is ubiquitous.
MEAT, like any CAD tool, is incomplete. The back-end only produces schematics. Manual layout
is prohibitively time consuming. Some form of automatic layout is necessary unless we abandon the
complex gate approach in order to take advantage of standard cell and technology mapping approaches.
Automatic layout is a difficult task and should also include automatically sized transistors for the
performance needs of the design. Using standard cells will result in some lost performance but the synthesis
task is easier. We are investigating both options. There are other performance oriented factors that
should be included. As a design is passed down through the different stages of MEAT, some information
is lost. The complexity of the algorithms and the simplicity of the circuits could be improved by
preserving some of this information. State graphs lack the formalisms required to analyze compositions
of these circuits for safety, liveness, deadlock, and other properties. We are currently investigating a
process calculus as a means of specifying and generating MEAT state graphs as well as proving correct
operation and construction. MEAT also needs to be connected to existing CAD tools. An example is
the connection to a timing analyzer so that the timing assumptions can be automatically analyzed for
compliance. Since datapath design is similar to that of synchronous designs, we need to integrate the
MEAT capability into an existing CAD framework. Presently, too much designer interaction is required
to traverse the seams separating MEAT and other pieces of our tool environment.
Approximately a fifth of the Post Office control path design was done manually, and the rest
was done using MEAT. The automated part of the design took one-fourth the amount of design time
and was virtually error free. Those errors were corrected when Steve Nowick pointed out a flaw in
our minimization algorithms. Our design style has proven to be a very natural transition for existing
hardware designers, primarily since it is based on traditional finite state machine control. Our synthesis
techniques have generated compact, high-performance circuits that work, and the complexity of the
synthesis algorithms has proven to be viable for large designs.
References
[1] Erik Brunvand and Robert Sproull. Translating Concurrent Programs into Delay-Insensitive Circuits.
In IEEE International Conference on Computer Aided Design: Digest of Technical Papers,
pages 262-265. IEEE Computer Society Press, 1989.
[2] Steven M. Burns and Alain J. Martin. The Fusion of Hardware Design and Verification, chapter
Synthesis of Self-Timed Circuits by Program Transformation, pages 99-116. Elsevier Science
Publishers, 1988.
[3] Carnegie-Mellon University. User's Guide to COSMOS.
[4] Tam-Anh Chu. On the models for designing VLSI asynchronous digital systems. Technical Report
MIT-LCS-TR-393, MIT, 1987.
[5] Henry Y. H. Chuang and Santanu Das. Synthesis of multiple-input change asynchronous machines
using controlled excitation and flip-flops. IEEE Transactions on Computers, C-22(12):1103-1109,
December 1973.
[6] A. L. Davis. The Architecture of DDM1: A Recursively Structured Data-Driven Machine. Technical
Report UUCS-77-113, University of Utah, Computer Science Dept., 1977.
[7] Digital Equipment Corporation, Maynard, MA. Alpha Architecture Handbook, 1992.
[8] David Dill. Trace Theory for Automatic Hierarchical Verification of Speed-Independent Circuits. An
ACM Distinguished Dissertation. MIT Press, 1989.
[9] William I. Fletcher. An Engineering Approach to Digital Design. Prentice-Hall, 1980.
[10] A. Grasselli and F. Luccio. A Method for Minimizing the Number of Internal States of Incompletely
Specified Sequential Networks. IEEE TEC, June 1965.
[11] A. B. Hayes. Stored State Asynchronous Sequential Circuits. IEEE Transactions on Computers,
C-30(8), August 1981.
[12] A. B. Hayes. Self-Timed IC Design with PPL's. In R. E. Bryant, editor, Third Caltech Conference on
Very Large Scale Integration, pages 257-274, Rockville, Maryland, 1983. Computer Science Press,
Inc.
[13] Lee A. Hollaar. Direct implementation of asynchronous control units. IEEE Transactions on
Computers, C-31(12):1133-1141, December 1982.
[14] A.J. Martin, S.M. Burns, T.K. Lee, D. Borkovic, and P.J. Hazewindus. “The Design of an
Asynchronous Microprocessor”. In C.L. Seitz, editor, Advanced Research in VLSI: Proceedings of the
Decennial Caltech Conference on VLSI, pages 351-373. MIT Press, 1989.
[15] Alain Martin. Compiling Communicating Processes into Delay-Insensitive VLSI Circuits.
Distributed Computing, 1(1):226-234, 1986.
[16] Alain Martin. The Limitations to Delay-Insensitivity in Asynchronous Circuits. In William J. Dally,
editor, Sixth MIT Conference on Advanced Research in VLSI, pages 263-278. MIT Press, 1990.
[17] C. Mead and L. Conway. Introduction to VLSI Systems. McGraw-Hill, 1979. Chapter 7.
[18] Teresa Meng. Synchronization Design for Digital Systems. Kluwer Academic, 1990.
[19] R.E. Miller. Switching Theory, II: Sequential circuits and machines. Wiley, 1965. Chapter 10.
[20] Charles E. Molnar, Ting-Pien Fang, and Frederick U. Rosenberger. Synthesis of Delay-Insensitive
Modules. In Henry Fuchs, editor, Chapel Hill Conference on Very Large Scale Integration, pages
67-86. Computer Science Press, 1985.
[21] S. M. Nowick and D. L. Dill. Synthesis of asynchronous state machines using a local clock. In 1991
IEEE International Conference on Computer Design: VLSI in Computers and Processors. IEEE
Computer Society, 1991.
[22] S. M. Nowick, K. Y. Yun, and D. L. Dill. Practical asynchronous controller design. In 1992
IEEE International Conference on Computer Design: VLSI in Computers and Processors. IEEE
Computer Society, 1992.
[23] Steven M. Nowick and David L. Dill. Automatic synthesis of locally-clocked asynchronous state
machines. In 1991 IEEE International Conference on Computer-Aided Design. IEEE Computer
Society, 1991.
[24] S.S. Patil. Coordination of asynchronous events. Technical Report TR-72, MIT Project MAC, June
1970.
[25] Steven M. Rubin. Computer Aids for VLSI Design. VLSI Systems. Addison-Wesley, 1987.
[26] L. Lavagno, K. Keutzer, and A. Sangiovanni-Vincentelli. Synthesis of Verifiably Hazard-Free
Asynchronous Control Circuits. Technical Report UCB/ERL M90/99, Univ. of California at Berkeley,
November 1990.
[27] Kenneth S. Stevens, Shane V. Robison, and A.L. Davis. “The Post Office - Communication
Support for Distributed Ensemble Architectures”. In Proceedings of 6th International Conference on
Distributed Computing Systems, pages 160-166, May 1986.
[28] Ivan E. Sutherland and Robert F. Sproull. Logical effort: Designing for speed on the back of an
envelope. In Carlo H. Sequin, editor, Proceedings of the 13th Conference on Advanced Research in
VLSI, pages 1-16. UC Santa Cruz, March 1991.
[29] J. H. Tracey. Internal state assignments for asynchronous sequential machines. IEEE Transactions
on Electronic Computers, EC-15:551-560, August 1966.
[30] S. H. Unger. A Building Block Approach to Unclocked Systems. In Proceedings of the 26th HICSS
Conference, January 1993. To appear.
[31] S.H. Unger. Asynchronous sequential switching circuits. Wiley-Interscience, 1969.
[32] C. H. van Berkel. Beware the Isochronic Fork. Technical Report Nat. Lab Rep. UR 003/91, Philips
Research Laboratories, January 1991.
[33] C. H. (Kees) van Berkel. Handshake circuits: an intermediary between communicating processes
and VLSI. PhD thesis, Technical University of Eindhoven, May 1992.
[34] Peter Vanbekbergen, Francky Catthoor, Gert Goossens, and Hugo De Man. Optimized synthesis of
asynchronous control circuits from graph-theoretic specifications. In International Conference on
Computer-Aided Design. IEEE Computer Society Press, 1990.
