CFSIM: A concurrent compiled-code functional simulator for hopCP by Akella, Venkatesh & Gopalakrishnan, Ganesh
VENKATESH AKELLA
GANESH GOPALAKRISHNAN
UUCS-92-002
Department of Computer Science
University of Utah
Salt Lake City, UT 84112, USA
January 22, 1992
Abstract
Control intensive ICs pose a significant challenge to the users of formal methods in designing hardware.
These ICs have to support a wide variety of requirements including synchronous and asynchronous
operations, polling and interrupt-driven modes of operation, multiple concurrent threads of execution,
complex computations, and programmability. In this paper, we illustrate the use of formal methods in the
design of a control intensive IC called the "Intel 8251" Universal Synchronous/Asynchronous
Receiver/Transmitter (USART), using our formal hardware description language 'hopCP'. A feature of hopCP
is that it supports communication via asynchronous ports (distributed shared variables writable by exactly
one process), in addition to synchronous message passing. We show the usefulness of this combination of
communication constructs. We outline static analysis algorithms to determine safe usages of asynchronous
ports, and also to discover other static properties of the specification. We discuss a compiled-code
concurrent functional simulator called CFSIM, as well as the use of concurrent testers for driving CFSIM.
The use of a semantically well specified and simple language, and the associated analysis/simulation tools,
helps conquer the complexity of specifying and validating control intensive ICs.
Formal Aspects of VLSI Research Group
University of Utah, Department of Computer Science
VENKATESH AKELLA (akella@cs.utah.edu)
GANESH GOPALAKRISHNAN (ganesh@bliss.utah.edu)
Dept. of Computer Science
University of Utah
Salt Lake City, Utah 84112
Abstract. We present a hardware description language (HDL) called hopCP for writing system-level specifications 
of hardware, and a simulation environment called CFSIM to validate hopCP behavioral specifications. HopCP is a 
process-oriented language based on the Communicating Sequential Processes (CSP) paradigm with a few extensions: 
computations are specified in a purely functional style, and a restricted form of distributed shared variables is provided 
to support asynchronous value communication. HopCP addresses high level protocol modeling as well as low level 
signaling details. A simulator in the CFSIM environment is generated by compiling a hopCP description into a 
Concurrent ML (CML) module. This CML module can then be compiled using the native code generator of Standard 
ML (SML) and executed. Since the CFSIM implementation is based on a polymorphically typed functional language 
(SML) and a concurrent language (CML), strong type-checking is automatically carried out before the simulation 
begins. CML offers a simple and extensible scheme for realizing hopCP’s constructs. The compiled nature of the 
simulator guarantees much more efficiency over interpreted simulators. Support for design verification and circuit 
synthesis is also provided in the same framework. The implementation of CML as well as our experience with it are 
described.
1 Introduction
Of the tools that are used to design large VLSI systems, functional simulators play a major 
role. Functional simulators are used for debugging high level hardware descriptions of the system 
being designed. Designing effective functional simulation tools for large and complex VLSI systems 
is a non-trivial task. In the very same VLSI system, one is forced to describe, and simulate,
both high level control/algorithmic aspects as well as low level signaling details. Most system 
level descriptions of VLSI are large, and therefore require high efficiency. Flexibility is required 
in the way the simulator is implemented so that HDL extensions can be easily accommodated, or 
experimented with. Strong type checking before simulation is another desirable feature to avoid 
simulating descriptions that have a type conflict. Last, but not the least, effective ways have to be 
developed to select simulation vectors, apply them, and observe the simulation results.
We present a hardware description language (HDL) hopCP for writing system-level specifications
of hardware, and a simulation environment called CFSIM to validate hopCP behavioral specifications.
HopCP is a multi-paradigm HDL. Computational aspects of hardware behavior are specified
in a purely functional style, and the communication and synchronization aspects are described
explicitly in a process-oriented style like CSP. hopCP is equipped with constructs to model hardware
phenomena like busy waiting, asynchronous message passing and broadcast, and barrier
synchronization directly. It has a simple operational semantics [2]. It addresses high level protocol
modeling as well as low level signaling details.
We describe a simulation environment called CFSIM to validate hopCP behavioral specifications. 
A simulator in the CFSIM environment is generated by compiling a hopCP description into a 
Concurrent ML (CML) module. This CML module can then be compiled using the native code 
generator of Standard ML (SML) and executed. Since the CFSIM implementation is based on 
a polymorphically typed functional language (SML) and a concurrent language (CML), strong 
type-checking is automatically carried out before the simulation begins. CML offers a simple 
and extensible scheme for realizing hopCP’s constructs. The compiled nature of the simulator 
guarantees much more efficiency over interpreted simulators. We have developed effective simulation 
schemes using tester processes in which the simulator is driven by well chosen vectors that try to 
establish high level properties. Finally, support for design verification and circuit synthesis is also 
provided in the same framework.
This paper deals with the design and implementation of CFSIM. Simulation based design vali­
dation is not new. Hardware simulation can be performed at the device level (e.g. SPICE), switch 
level with timing (e.g. IRSIM) and without timing (e.g. COSMOS), register transfer level (e.g. 
ISPS), or the functional level (e.g. the VHDL behavioral simulator). Whereas low-level simulators 
serve the purpose of modeling the underlying physics of devices, and reduce the time and expense 
needed to build the part and test it, behavioral simulators help debug the designer's own understanding
of the hardware, and also help correct inconsistencies between two levels of an HDL description.
System-level behavioral specification of integrated circuits is often characterized by synchronous 
and asynchronous operations, polling and interrupt-driven modes of behavior, multiple concurrent 
threads of execution, and complex computation. Simulating such specifications using scalar inputs 
(i.e. vectors of 0's and 1's) to obtain output waveform traces is not a very satisfactory approach
because of the large number of scenarios that have to be simulated. CFSIM allows the designer
to write tester processes that can animate the environment of the process being simulated. For
example, if a communications chip $C$ with a send and a receive channel is being simulated, two
tester processes $T_1$ and $T_2$ can be written, one to continuously send messages into the send channel,
and another to continuously receive messages from the receive channel. $T_1$ and $T_2$ can then be
run in parallel with $C$, thereby getting the effect of concurrently sending messages and reading
messages from $C$. This effect is virtually impossible to achieve using traditional scalar simulation.
This proved to be quite valuable, for example, in debugging a communications chip (the Intel
8251) described in [3]. The testers that we wrote actually proved to be very readable and succinct
specifications of the system being debugged.
Overview of CFSIM
A hopCP behavioral specification has three components: (i) the set of user-defined functions 
which capture the computational aspects of the module, (ii) the set of communication ports of 
the module, and (iii) the HFG (hopCP Flow Graph). The hopCP Flow Graph is a concurrent
state-transition system (analogous to a Petri net) which serves as the intermediate representation for
hopCP specifications.
The CFSIM simulation environment consists of tools to compile the hopCP specifications into 
executable CML (Concurrent ML) [6] source code. CML is a concurrent extension of Standard ML 
of New Jersey which supports first class synchronous operations. The major steps in CFSIM are 
translating the user-defined functions in a hopCP specification into SML function definitions, the
communication ports into CML channels, and the HFG into a set of communicating threads in CML.
The crux of this translation process is to preserve the semantics of hopCP. This requires building 
abstractions in CML to support hopCP constructs. For example, we built abstractions to support 
hopCP style barrier synchronization, shared variables and compound actions (restricted fork-join 
construct). A compiled simulator in CFSIM can be driven by tester processes, as explained in 
section 5.
Salient Features of CFSIM
CML is a strongly-typed, polymorphic, and higher-order concurrent language that facilitates 
building several concurrency abstractions. It is very easy to build abstractions to simulate hardware- 
specific constructs. The strong-typing facilitates debugging hopCP specifications for semantic errors 
like illegal port connections and inconsistent (and incorrect) usage of variables. The other advan­
tages of the proposed simulation environment include the ability to simulate concurrency accurately 
and its efficiency with respect to time and space. CML capitalizes on the continuation-passing style 
implementation of the SML of New Jersey compiler [4] and supports concurrency with very little 
overhead [6]. In addition, the preemptive scheduling feature of CML makes CFSIM fairly responsive 
during interactive usage.
Organization of the Paper
In the next section we will illustrate hopCP with an example and bring out its salient features. 
In section 3 we will introduce CML briefly (sufficient to understand this paper) and motivate its 
advantages. Section 4 deals with the design and implementation of CFSIM while section 5 deals 
with an illustration of CFSIM to validate a pipelined stack specification in hopCP. We conclude 
by highlighting the principal advantages of CFSIM and presenting a brief sketch of our plans to 
extend CFSIM.
2 hopCP
Overview of hopCP
hopCP is a notation for describing concurrent-state transition systems based on a functional 
language augmented with features to express synchronous and asynchronous value communication. 
The basic unit of description is a MODULE which consists of a set of communication ports, an
optional set of user-defined functions, and a behavioral description called hopCP Flow Graph (or
HFG). An HFG consists of a set of states, a set of actions and a set of transitions. States in hopCP
are (control, data) state pairs where control states are like finite-state machine (FSM) states, and 
data states capture the contents of internal storage locations. An action in hopCP is either a 
communication action or the evaluation of an expression. There are three types of communication 
actions:
1. Data Query and Data Assertion: These involve value communication and synchronization. For
example, p?x, called data query, denotes synchronizing on input port p and receiving a value
denoted by x, while p!e, called data assertion, denotes synchronization on the output port p and
sending the value denoted by expression e.
2. Synchronous Control Actions: These involve only synchronization and no value communication.
For example, p? denotes an input synchronization action on input port p while p! denotes an
output synchronization on output port p.
3. Assignment Actions: Assignment actions provide asynchronous communication via shared 
variables. For example, a := e is an assignment action which involves writing the value 
denoted by expression e into the shared variable denoting the asynchronous port a.
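To make the contrast with synchronous actions concrete, the following is a minimal Python sketch of an asynchronous port, assuming the single-writer rule stated in the report's abstract. The class name `AsyncPort`, its methods, and the use of a module name to identify the writer are all hypothetical illustration choices, not part of hopCP or CFSIM.

```python
import threading

class AsyncPort:
    """One-place shared variable: a single owner writes, anyone reads without waiting."""

    def __init__(self, initial, owner):
        self._value = initial
        self._owner = owner            # the only module allowed to assign (assumption)
        self._lock = threading.Lock()

    def assign(self, writer, value):
        # Mirrors `a := e`: the write completes without any reader taking part.
        if writer != self._owner:
            raise PermissionError(f"{writer} is not the writer of this port")
        with self._lock:
            self._value = value

    def read(self):
        # Readers never synchronize with the writer; they see the latest value.
        with self._lock:
            return self._value
```

Unlike p?x and p!e, neither `assign` nor `read` blocks waiting for a partner, and reading does not consume the value.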
A transition $tr \in Transition$ is a triple $(pre(tr), act(tr), post(tr))$ where $pre(tr)$ denotes a set
of states called the precondition of the transition, $post(tr)$ denotes a set of states called the postcondition
of the transition, and $act(tr)$ denotes the action of the transition. The execution semantics of an
HFG are similar to that of a Petri net. Let $tr \in Transition$; if $tr$ is enabled (i.e. execution reaches
$pre(tr)$) then the system performs the action $act(tr)$ and the execution reaches $post(tr)$. Note that
no notion of clocks or time is associated with the performance of the action $act(tr)$. Also
note that if more than one $tr \in Transition$ is enabled, they can perform their respective actions
concurrently. Next we will illustrate hopCP with an example.
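The firing rule above can be sketched as a small token game. This is a Python illustration under simplifying assumptions: states are plain strings, data states are omitted, and the names `Transition`, `enabled` and `fire` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set

@dataclass(frozen=True)
class Transition:
    pre: FrozenSet[str]         # precondition: states that must all be reached
    act: Callable[[], None]     # action performed when the transition fires
    post: FrozenSet[str]        # postcondition: states reached afterwards

def enabled(marking: Set[str], transitions: List[Transition]) -> List[Transition]:
    """Transitions whose whole precondition is contained in the current marking."""
    return [t for t in transitions if t.pre <= marking]

def fire(marking: Set[str], tr: Transition) -> Set[str]:
    """Perform act(tr) and move execution from pre(tr) to post(tr)."""
    if not tr.pre <= marking:
        raise ValueError("transition is not enabled")
    tr.act()
    return (marking - tr.pre) | tr.post
```

When `enabled` returns several transitions, they may fire in any order or concurrently; the sketch fires them one at a time only for simplicity.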
Specification o f a Pipelined Stack in hopCP
Figure 1 shows the structural specification of the pipelined stack as an interconnection of three modules 
namely, the address register (AddrReg), the memory unit (Memory) and the controller (CTRL).
Figure 1: Structural Description of Pipelined Stack (external ports shown: push, pop, reset, top)
MODULE AddrReg
TYPE
addrType : vector 16 of bit;
SYNCPORTS





AREG [cs] <= (ld?inp -> addr := inp -> AREG [inp])
| (inc? -> (addr := (cs + 1)) -> AREG [cs + 1])
| (dec? -> (addr := (cs - 1)) -> AREG [cs - 1])
END
Let us examine the hopCP specification of the address register in detail. addrType is a user-defined
type. ld, inc and dec are the three input synchronous channels. Communication on these channels
is via handshake. addr is declared as a shared variable (asynchronous port). The user-defined
function definition section is empty. The behavior section describes a state-transition system or
HFG shown in figure 2. Here the notation a/b means replace (update) b with a.
Figure 2: hopCP Flow Graph (HFG) for the Address Register
The address register module exhibits alternate (or conditional) behavior. AREG is the initial
control state and cs is the initial datapath state of the module. In the initial state, the module has
three modes of behavior: the address register can be loaded with a new value via the ld channel,
or the address register can be incremented or decremented by corresponding commands on the inc
and dec channels. The current value of the address register is always available on the asynchronous
port addr.
The specification of the memory system is shown in figure 3. In the top-level state, denoted by 
MEM, the module can accept a read or write request. A write request comes with the corresponding 
data to be written, denoted by din and the address is denoted by the asynchronous port addr 
(which is shared by the address register and the memory modules). Note that the write operation 
is captured by an expression action update(s,addr,din) where update is a predefined operation
in hopCP. The read operation is more interesting. The result of the read operation is not delivered
immediately; instead the module proceeds to an intermediate state, denoted by MEM_AUX, wherein
it has the capability of entertaining another read or write operation while delivering the result of
the previous read. The output operation is done via a data assertion, namely, dout!(index(s,a)),
which informally means: output the value denoted by the expression index(s,a) on the output 
synchronous channel dout. The specification of the controller is notationally similar, and is shown 
in the appendix for completeness.
MODULE Memory 
TYPE
word : vector 16 of bit; 







(MEM [ms] <= (write?din -> MEM [update(ms,addr,din)])
| (read? -> MEM_AUX [ms,addr]);
MEM_AUX [s,a] <= (write?din -> dout!(index(s,a)) -> MEM [update(s,addr,din)])
| (read? -> dout!(index(s,a)) -> MEM_AUX [s,addr]))
END
Figure 3: Specification of a Pipelined Memory in hopCP
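The pipelining in Figure 3 can be sketched as a small Python state machine. This is only an illustration under stated assumptions: the class `PipelinedMemory` is hypothetical, the address is passed as an argument instead of being read from the asynchronous port addr, and a read of an unwritten address yields `None`.

```python
class PipelinedMemory:
    """MEM / MEM_AUX behavior: a read result is delivered on the next request."""

    def __init__(self):
        self.store = {}       # datapath state (ms or s in the specification)
        self.pending = None   # address a latched in MEM_AUX, or None when in MEM

    def _deliver(self):
        # dout!(index(s,a)): emit the previous read's result, if one is outstanding.
        if self.pending is None:
            return None
        value = self.store.get(self.pending)
        self.pending = None
        return value

    def write(self, addr, din):
        out = self._deliver()      # old result goes out before the store changes
        self.store[addr] = din     # update(s, addr, din); next state is MEM
        return out

    def read(self, addr):
        out = self._deliver()
        self.pending = addr        # next state is MEM_AUX [s, addr]
        return out
```

A short usage trace: after `write(0, 7)` and `read(0)`, the value 7 appears only as the return value of the following `write` or `read`, which is the overlap the specification describes.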
The example above illustrated most of the features of hopCP like synchronous and asynchronous 
value communication, expression actions, alternate behavior (or guarded commands) and concur­
rency (by the fact that address register and memory could operate independently). Some of the 
features of hopCP that could not be illustrated with the simple pipelined stack example are as 
follows:
• Compound Actions: A transition could be annotated with a tuple of actions $a_1, a_2, \ldots, a_n$
instead of a single action $a$ as was the case in the above example. The actions $a_1, a_2, \ldots, a_n$
could be data queries, data assertions, input control actions, output control actions or assignment
actions, with the restriction that all $a_i$ and $a_j$ ($i \neq j$) should be non-interfering, i.e., no two
$a_i$ and $a_j$ should use the same channel or try to update the same variable. The execution
of the system via a compound action is analogous to that of the cobegin/coend statement of
concurrent programming languages.
• Multiway Rendezvous: Multiway rendezvous is said to occur when there is more than one agent
willing to perform an input operation (data query) corresponding to a given output operation
(data assertion). The semantics of multiway rendezvous in hopCP subsumes the broadcast (point
to multipoint communication) style of communication and barrier synchronization (alignment
of time). Multiway rendezvous is a powerful construct which facilitates the specification of a
wide variety of concurrent algorithms very naturally.
In the next section we will introduce CML and bring out its salient features.
3 Concurrent ML
CML is a high-level, high-performance language for concurrent programming. It is derived by 
augmenting SML with concurrency primitives. CML inherits all the good features of SML like 
higher-order functions, strong static typing, polymorphism, datatypes and pattern matching, ex­
ception handling and state-of-the-art module facility. CML capitalizes on the continuation-passing 
style implementation of the SML of New Jersey compiler and is very efficient [6]. The salient
features of CML include: (i) a high-level model for concurrency with dynamic creation of threads and
typed channels, and CSP style synchronous communication based on a distributed memory model,
(ii) events (or synchronous operations) as first-class values, which facilitates building new
abstractions tailored to specific applications, (iii) integrated I/O support, (iv) preemptive
scheduling to guarantee responsiveness, and (v) practicality: it has been tested in a variety
of large-scale projects like eXene (a multi-threaded interface to the X protocol), distributed ML
and distributed Nuprl implementations. These features make CML a natural choice for the hopCP
simulator.
4 Design of CFSIM
In this section we will present the details of the algorithm underlying CFSIM. We first outline 
the major steps in the algorithm and then illustrate the simulation of each construct of hopCP in 
CML.
Algorithm underlying CFSIM
Let $M_1, M_2, \ldots, M_n$ be the submodules of a hopCP module $M$.
Initialization:
Let $M = M_1 \parallel M_2 \parallel \cdots \parallel M_n$, $s_i$ = synchronous ports in $M_i$, $a_i$ = asynchronous ports in $M_i$, and
$f_i$ = user-defined functions in $M_i$ ($s_i$, $a_i$, $f_i$ could be empty).
Modules $M_1 \ldots M_n$ are parsed to extract the synchronous ports $s_i$, asynchronous ports $a_i$ and
user-defined functions $f_i$. The simulation environment is pre-loaded with the function definitions
before commencing the simulation. Synchronous ports $s_i$ are analyzed to detect multiway
rendezvous. If a port does not have exactly one writer (outputting module) an error message is issued.
If a port has more than one receiver, it is marked as a barrier channel, and is implemented by a
Barrier abstraction. Ports with exactly one producer and one receiver are implemented directly by
the channel() construct in CML. Asynchronous ports (or shared variables) are simulated by
the structure AsyncBarrier and its associated operations. The details of the implementation of
Barrier and AsyncBarrier will be presented later in this section.
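The port analysis in the initialization step can be sketched as follows. This is a Python illustration, not CFSIM's code: the function `classify_ports`, its dictionary inputs, and the labels it returns are hypothetical. It applies the one-writer rule literally, and labelling a port with no receiver as "unsynchronized" is an assumption borrowed from the later discussion of unsynchronized actions.

```python
def classify_ports(writers, readers):
    """Map each synchronous port to how it would be implemented.

    writers / readers: dict from port name to the list of modules that
    output on / input from that port.
    """
    kinds = {}
    for port in sorted(set(writers) | set(readers)):
        n_writers = len(writers.get(port, []))
        n_readers = len(readers.get(port, []))
        if n_writers != 1:
            # "If a port does not have exactly one writer an error message is issued."
            raise ValueError(f"port {port!r} has {n_writers} writers, expected exactly 1")
        if n_readers > 1:
            kinds[port] = "barrier"          # multiway rendezvous
        elif n_readers == 1:
            kinds[port] = "channel"          # plain synchronous channel
        else:
            kinds[port] = "unsynchronized"   # no partner inside the design (assumption)
    return kinds
```

For instance, with made-up connectivity in which one module writes `dout` and two modules read it, `dout` is classified as a barrier channel.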
Decomposition:
Let $H$ denote the composite HFG denoted by the module $M$. Decompose $H$ into $S_1, S_2, \ldots, S_m$
where $S_1, S_2, \ldots, S_m$ are sequential HFGs. A sequential HFG is one in which all transitions are
of the form $(S, a, S')$ where $|S| = |S'| = 1$. In other words, a sequential HFG is one in which
every action has exactly one predecessor state and one successor state. Each sequential HFG $S_i$ is
translated into a CML thread. Execution-wise, sequential HFGs are not strictly sequential. They
could spawn new threads which could execute in parallel. An example of such a scenario is the
implementation of compound actions. A separate thread is spawned to execute each component of
a compound action. This will be illustrated in an example later in this section.
Figure 4: A Simple Sequential HFG
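The fork-join behavior of a compound action can be sketched in Python as below. The helper `compound_action` is hypothetical and uses operating-system threads, whereas CFSIM spawns CML threads and joins them over temporary channels.

```python
import threading

def compound_action(*components):
    """Run each component action in its own thread and wait for all of them."""
    results = [None] * len(components)

    def run(i, action):
        results[i] = action()

    threads = [threading.Thread(target=run, args=(i, a))
               for i, a in enumerate(components)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()          # the enclosing sequential HFG resumes only after every join
    return results
```

The enclosing thread stays sequential from the outside: it continues to its successor state only after all component threads have finished.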
Translating Sequential HFGs:
Each sequential HFG is translated into a CML thread which consists of a set of mutually recursive
function definitions (one function for every transition).
Let $((S, [x_1, x_2, \ldots, x_n]), a, (T, [e_1, e_2, \ldots, e_j]))$ be a transition in a sequential HFG. This is
translated into a function definition in CML as follows: the control state name $S$ (in the
precondition of the transition) becomes the name of the function, the datapath variables $x_1, x_2, \ldots, x_n$
become the formal parameters of the function, the action $a$ becomes the body of the function, and
the postcondition of the transition is translated into a function call, with the control state name $T$
being the name of the function and the expressions $e_1, e_2, \ldots, e_j$ being the actual parameters.
Figure 4 shows a simple sequential H F G  and the CML code fragment below illustrates the 
translation scheme outlined above.
fun S x1 x2 ... xn = ( body a; T e1 e2 ... ej)
and
T y1 y2 ... yj = ( body b; S g1 g2 ... gn)
Here body a and body b denote the CML code implementing actions a and b. Generating CML
code corresponding to a particular hopCP action will be discussed next.
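The translation scheme above amounts to a simple text generator. The Python sketch below emits SML-style source text from a list of transitions; the functions `emit_function` and `emit_thread` and the tuple layout are hypothetical, and the action body is treated as an opaque string.

```python
def emit_function(src, params, body, dst, args, first=True):
    """Emit one clause: state name -> function name, datapath variables -> parameters."""
    keyword = "fun" if first else "and"        # later clauses join the mutual recursion
    formal = " ".join(params) or "()"          # an SML function needs at least one argument
    actual = " ".join(args) or "()"
    return f"{keyword} {src} {formal} = ({body}; {dst} {actual})"

def emit_thread(transitions):
    """Emit the mutually recursive definitions for one sequential HFG."""
    lines = [emit_function(*t, first=(i == 0)) for i, t in enumerate(transitions)]
    return "\n".join(lines)

example = emit_thread([
    ("S", ["x1", "x2"], "body_a", "T", ["e1"]),
    ("T", ["y1"], "body_b", "S", ["g1", "g2"]),
])
print(example)
```

The tail call to the successor state's function is what lets the generated thread loop through the control states without an explicit state variable.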
Translating Actions
Expression actions are directly compiled into CML code since they have a similar evaluation
semantics. The rest of the action categories are translated as follows:
• Data Query: A data query p?x is implemented as a receive or an accept operation on a
synchronous channel p in CML. Note that we have declared p as a synchronous channel in
STEP 1 (Initialization). For example, $((S, [x_1, x_2, \ldots, x_n]), p?x, (T, [e_1, e_2, \ldots, e_j]))$ will be simulated as
fun S x1 x2 ... xn = let
val x = accept p
in
T e1 e2 ... ej
end
Here we assumed a 2-way rendezvous on channel p. The implementation of a multiway 
rendezvous will be illustrated later.
• Data Assertion: A data assertion p!e is implemented as a transmit or a send operation on a
synchronous channel p in CML. For example, the transition $((S, [x_1, x_2, \ldots, x_n]), p!e, (T, [e_1, e_2, \ldots, e_j]))$
will be simulated as
fun S x1 x2 ... xn = (send(p,e); T e1 e2 ... ej)
Translating Special Constructs:
There are three special constructs in hopCP which do not have direct counterparts in CML.
These are implemented as follows:
• Compound Actions:
Compound actions are implemented by spawning a new thread to implement each component
of the compound action and waiting for the completion of all the threads. For example, the
transition $((S, [x_1, x_2, \ldots, x_n]), (a_1, a_2), (T, [e_1, e_2, \ldots, e_j]))$ has a compound action $a$ with $a_1$
and $a_2$ as its components. This will be simulated as follows:
fun S x1 x2 ... xn = let
val c__54 = channel ()
val c__55 = channel ()
fun s__56 x1 x2 ... xn = (body a1; send(c__54,0))
fun s__57 x1 x2 ... xn = (body a2; send(c__55,0))
in
(spawn (fn () => s__56 x1 x2 ... xn);
spawn (fn () => s__57 x1 x2 ... xn);
let
val _ = accept c__54
val _ = accept c__55
in
T e1 e2 ... ej
end)
end
s__56 and s__57 are the new threads spawned to implement a1 and a2. c__54 and c__55 are
temporary channels to implement the synchronization between the threads s__56 and s__57.
Note that a1 and a2 could be performed concurrently.
• Asynchronous Ports:
(*






val mChannel : '1a -> '1a mchan
val newPort : 'a mchan -> 'a CML.event
val multicast : ('a mchan * 'a) -> unit
end (* ASYNCMULTICAST *)
Asynchronous ports are implemented by a one-place buffer abstraction whose signature is 
shown above. The operation mChannel creates a new asynchronous port, the operation
multicast is used to write a value on the asynchronous port and the operation newPort
is used to read from the asynchronous port. The value on the asynchronous port can be read 
without synchronizing. The assignment action in hopCP, a := e is implemented by the 
CML code fragment
let
val a = AsyncBarrier.mChannel 0 (* initialized to 0 *)
val a__9 = AsyncBarrier.newPort a
in
AsyncBarrier.multicast(a,e)
end
• Multiway Rendezvous: Multiway rendezvous is implemented by an abstraction which imple­
ments a busmaster. The producer checks in the value to the busmaster and waits for the 
acknowledgement; the receivers wait for a value from the busmaster and an acknowledgement 
to proceed further; the busmaster receives the value from the producer and transmits it to 
all the receivers and then sends an acknowledgement to all the receivers and the producer. 
This implements both broadcast and barrier synchronization. If only broadcast is desired, the 
receivers need not wait for a final acknowledgement. They can proceed as soon as they receive 
the value from the busmaster. The code implementing the barrier abstraction is omitted to 
conserve space.
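Since the barrier code is omitted, the following Python sketch shows one way the busmaster protocol described above could be realized. It is not the authors' abstraction: the class `BusMaster`, its queues and the helper `demo` are hypothetical, and the explicit "ready" check-in by receivers is an added assumption that stands in for the blocking behavior of synchronous CML channels.

```python
import queue
import threading

class BusMaster:
    """Barrier-style multiway rendezvous: one producer, n receivers."""

    def __init__(self, n_receivers):
        self.n = n_receivers
        self.value_in = queue.Queue()                       # producer -> busmaster
        self.ready = queue.Queue()                          # receivers announce arrival
        self.value_out = [queue.Queue() for _ in range(n_receivers)]
        self.ack = [queue.Queue() for _ in range(n_receivers + 1)]  # last is the producer's

    def serve_once(self):
        value = self.value_in.get()        # producer checks in its value
        for _ in range(self.n):            # wait until every receiver has arrived
            self.ready.get()
        for box in self.value_out:         # transmit the value to all receivers
            box.put(value)
        for box in self.ack:               # then acknowledge receivers and producer
            box.put(None)

    def produce(self, value):
        self.value_in.put(value)
        self.ack[self.n].get()             # proceed only after the acknowledgement

    def receive(self, i):
        self.ready.put(i)
        value = self.value_out[i].get()
        self.ack[i].get()                  # omit this wait for broadcast-only use
        return value

def demo(value, n_receivers):
    """Run one rendezvous round and return what each receiver obtained."""
    bus = BusMaster(n_receivers)
    got = [None] * n_receivers

    def rx(i):
        got[i] = bus.receive(i)

    threads = [threading.Thread(target=bus.serve_once),
               threading.Thread(target=bus.produce, args=(value,))]
    threads += [threading.Thread(target=rx, args=(i,)) for i in range(n_receivers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got
```

No participant proceeds until the busmaster has heard from the producer and from every receiver, which gives the alignment-in-time property of barrier synchronization.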
Translating Control Structure:
The control structure of hopCP is simulated in CML in the following manner: the sequencing
construct (->) is implemented explicitly by the SML sequencing operator (;) or implicitly by a
let construct. The choice construct or guarded commands (denoted by |) is implemented directly
by the non-deterministic choose combinator of CML. The || construct is simulated by spawning
independent threads for each sequential HFG as discussed above.
Unsynchronized Actions:
Some of the actions in a hopCP description do not have corresponding partners for rendezvous.
In a hardware scenario these correspond to the external inputs and outputs. In CFSIM,
unsynchronized inputs are directed to the standard input (i.e. keyboard) and unsynchronized outputs
are directed to the standard output. This feature was found to be extremely useful in simulating
concurrent specifications, because unsynchronized inputs could be used to arrest the progress of
the system and obtain the effect of single-stepping through the behavior.
Salient Features of CFSIM
• Efficiency: The size of the simulator is proportional to the number of transitions in the
HFG, which is usually small because we initially eliminate the || operator by decomposing the
HFGs into sequential HFGs. The simulation is fast because the CML code is compiled and
directly executed, as opposed to being interpreted, which is common with most
simulators. For example, it took less than 90 seconds (with garbage collection) to simulate the
operation of a communication chip (Intel 8251) with more than 160 states and 6 concurrent
processes, on a Sparc IPC with 24 MB of memory. The operation involved approximately
32000 synchronizations and more than 60000 function calls.
• Static Checks: Since we translate the HFGs into CML source code and execute them in the
Standard ML environment, most static errors, such as inconsistent types of variables,
name clashes, and undefined variables and function names, are detected during compilation.
This is facilitated by the strong typing offered by Standard ML.
• Interactive: CFSIM generates interactive simulators wherein the user can step through the
execution of the module by controlling the input to the system. This is facilitated by
directing all the unsynchronized output actions in an HFG to the standard output and the
unsynchronized input actions to the standard input. This is extremely useful for debugging
asynchronous circuits where time is continuous. An unsynchronized action could be used to
discretize the behavior.
• Flexibility: hopCP is an evolving language. It is being continuously modified to cater to new 
application scenarios. Since CFSIM is based on a higher-order applicative language with 
a sophisticated module facility, it is very easy to build abstractions to simulate the new 
constructs of hopCP and support the evolution of the HDL.
5 Tester Processes and Design Validation
In this section we will describe how hopCP specifications can be validated using CFSIM. Val­
idation of specifications via CFSIM involves two phases: (i) identification of interesting modes of 
behavior of the system, and (ii) construction of high-level simulation vectors, which enable the cho­
sen mode of behavior. These are now illustrated on the pipelined stack specification introduced 
earlier.
5.0.1 Tester Modules and Modes of Behavior
A mode of behavior of a system is a partial order of control/data related actions on the system
that has a well defined and intuitive outcome. In terms of their overall execution effect, modes of
behavior correspond to expressions involving many operations of the system. For example, if one were to
view the module $M$ being specified in hopCP as an abstract data type, the various axioms which
algebraically characterize $M$ each describe their own modes of behavior. For example, let $P$ be the
pipelined stack and $x \in VAL$; then $top(push(P, x)) = x$ is an axiom of the stack datatype which $P$
should satisfy; the corresponding mode of behavior is the partial order of actions captured by the
following tester process:
TESTPUSH[] <= usergive?x -> push!x -> top?y -> ( (x=y) -> EXECUTIONOK[]
| (not(x=y)) -> ERROR [] )
There are several other axioms which a pipelined stack should satisfy. Each of these axioms could
be encoded as a mode of behavior.
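The way an algebraic axiom turns into a self-checking tester can be sketched as follows. This Python illustration uses a plain in-memory `Stack` as a hypothetical stand-in for the pipelined stack, and `run_testpush` plays the role of TESTPUSH with the user-supplied value passed as an argument.

```python
class Stack:
    """Hypothetical stand-in for the system under test."""

    def __init__(self):
        self._items = []

    def push(self, x):
        self._items.append(x)

    def top(self):
        return self._items[-1]

def run_testpush(stack, x):
    """Check the axiom top(push(P, x)) = x and report the verdict."""
    stack.push(x)              # push!x
    y = stack.top()            # top?y
    return "EXECUTIONOK" if x == y else "ERROR"
```

The tester itself decides the verdict, so the designer does not have to inspect the raw values exchanged with the system under test.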
A tester module for a hopCP specification H is a hopCP description of a module which interacts
with the system being tested (i.e., H) and guides the execution of H along a chosen mode of
behavior. A tester module can be viewed as an interface between the user and the system under
test. It receives inputs from the user (via the CFSIM interactive environment) and provides
the necessary stimulus to the system; it then receives the responses from the system and channels
them back to the user.
Testers can be written in such a manner that they check the results of the simulation themselves
and give a "yes" or "no" answer. This is why algebraic axioms (such as the one used above) are
a good source for generating testers. Self-checking testers spare the designer from having to
manually read and certify voluminous simulation outputs.
Illustrating Tester Modules
Let us consider validating the pipelined push operation in our example. To recapitulate, the
memory module used in the pipelined stack example can entertain new read/write requests
while the current read is still in progress. It delivers the result of read_i when it is processing
read_(i+1) or write_(i+1), where i is the operation number. The controller pipelines the push operation
by issuing a write request to the Memory module; while the Memory is being updated, the controller can
entertain the next operation.
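The deferred delivery of read results described above can be sketched as follows. This is a Python illustration under stated assumptions, not the hopCP memory specification: the class `PipelinedMemory`, its method names, the dictionary-backed store, and the default value 0 for unwritten locations are all hypothetical. Only the rule that the result of read_i appears while operation i+1 is being accepted comes from the text.

```python
class PipelinedMemory:
    """Hypothetical model: a read's result is delivered when the next request arrives."""

    def __init__(self):
        self._store = {}        # address -> value (assumed storage model)
        self._pending = None    # result of a read still "in progress", if any

    def _deliver(self):
        """Return the pending read result (or None) and clear it."""
        result, self._pending = self._pending, None
        return result

    def write(self, addr, value):
        delivered = self._deliver()      # result of the previous read, if any
        self._store[addr] = value
        return delivered

    def read(self, addr):
        delivered = self._deliver()      # result of the previous read, if any
        self._pending = self._store.get(addr, 0)
        return delivered


if __name__ == "__main__":
    mem = PipelinedMemory()
    mem.write(1, 10)
    mem.read(1)                  # read_i starts; nothing is delivered yet
    print(mem.write(2, 20))      # processing write_(i+1) delivers read_i
```

Each request returns whatever the previous read produced, which is one simple way to model the overlap between consecutive operations.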
A tester module called TEST, which validates this operation, is shown below:
MODULE TEST
TYPE
word : vector 16 of bit;
SYNCPORT




(TEST [] <= reset! -> TEST_AUX [];
TEST_AUX [] <= push!1 -> push!0 -> pop! -> top! -> next? -> TEST [])
END
The tester module starts in an initial state denoted by TEST, where it issues a reset operation
to the pipelined stack module. It then issues two push operations followed by a pop and a top.
The next operation is a dummy, unsynchronized action that facilitates the
simulation process by providing a means to interact with the system through the keyboard. It
would not be necessary if the tester module were written as a finite process.
The CML code generated by CFSIM for the behavior part of the pipelined stack and the tester
is presented in the appendix. Executing the code in the CML environment validates the pipelined push
operation of the stack. The executable CML code is quite compact and takes less than a second to
validate the operation in question.
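To make the expected outcome of the TEST sequence concrete, the sketch below replays reset, push 1, push 0, pop, top against a simple sequential Python model. It is not the CML code generated by CFSIM, and it ignores concurrency and pipelining. The function `run_tester`, the dictionary memory, and the choice to read the location currently addressed by the register are assumptions. The per-request action sequences (ld, inc then write, dec, read) follow the controller specification in the appendix.

```python
def run_tester(requests):
    """Replay (operation, value) requests; return the values produced by 'top'."""
    addr = 1            # address register; the appendix spawns AREG with 1
    memory = {}         # assumed memory model: address -> value
    outputs = []
    for op, value in requests:
        if op == "reset":
            addr = 0                            # reset? -> ld!0
        elif op == "push":
            addr += 1                           # push?v -> inc! ...
            memory[addr] = value                # ... -> write!v
        elif op == "pop":
            addr -= 1                           # pop? -> dec!
        elif op == "top":
            outputs.append(memory.get(addr))    # top? -> read!
        else:
            raise ValueError(f"unknown operation: {op}")
    return outputs


if __name__ == "__main__":
    test_sequence = [
        ("reset", None), ("push", 1), ("push", 0), ("pop", None), ("top", None),
    ]
    print(run_tester(test_sequence))
```

Under these assumptions the final top yields 1, the value pushed first, since the pop moves the address back past the location holding 0.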
5.1 Discussion on Tester Processes and Design Validation
The stack example shown in this paper is fairly straightforward and does not illustrate the full
power of the tester process in the design validation task. The concept of tester processes could be
extremely useful in the validation of hopCP specifications of control-intensive integrated circuits
like the Intel 8251 USART [3], which exhibit concurrency and complex protocols. Each of the three
submodules in the USART had over 50 states and 30 transitions. Simulating all possible interactions
of such a specification is neither efficient nor meaningful. Generating simulation vectors and
understanding the simulation output for such specifications is quite challenging. Tester processes
and the concept of identifying modes of behavior enable one to filter out most of the irrelevant
details and focus on the input/output relationships for the particular facet of the behavior being
simulated. This is achieved by making the tester process more intelligent than a simple sequence of
inputs and outputs. The full hopCP specification language can be used to write the tester modules.
This gives us the ability to write testers which have conditional behavior and concurrency.
In fact, the tester used to simulate the Intel 8251 USART was itself a collection of three concurrent
processes capturing the behavior of the CPU, the serial receiver and the serial transmitter.
One apparent drawback of our approach is the necessity to use an independent tester for each
mode of behavior. In general, for complex circuits there could be scores of modes of behavior, so
a naive approach to building tester processes could be inefficient. In practice, one need not do
that: something similar to conventional fault simulation could be employed. This
approach was used extensively in simulating the USART specification, where a single tester module was
programmed with appropriate control words to check for more than one mode of behavior.
Finally, some of the attractive features of this style of validation process which could be
investigated further are:
• Testers are specified in the same HDL. This opens up several promising avenues for further
research, such as extracting BIST (Built-In Self-Test) hardware by synthesizing the tester module
just like the rest of the hopCP specification, and including DFT (Design For Testability) ideas
in the specification, i.e., writing a specification which, when synthesized, becomes more easily
testable.
• Tester modules provide a systematic and elegant approach to functional simulation, since
most of the details of the simulation are buried within the tester module. The user does not
have to deal directly with the specification.
• By expressing the tester module in hopCP and simulating it via CFSIM, we are actually
validating our system in a truly concurrent environment.
6 Conclusions and Future Work
We introduced a concurrent HDL called hopCP for system-level specification of VLSI circuits.
hopCP is characterized by process-oriented features to specify communication and synchronization,
a functional sublanguage to specify computation, and distributed shared variables. We then
presented a simulation environment for hopCP called CFSIM. CFSIM is a compiled-code concurrent
functional simulator and is obtained by translating the intermediate representation of hopCP
specification (HFGs) into CML source code. Several advantages of using a strongly typed,
polymorphic, higher-order applicative language for behavioral simulation were pointed out. Finally, we
introduced a notion of tester processes to support efficient simulation and preliminary validation
of hopCP specifications.
CFSIM is efficient and flexible enough to cater to the simulation requirements of an evolving
HDL like hopCP. Currently, CFSIM is being extended in two ways. First, a high-level synthesis system
based on hierarchical refinement of actions is being implemented [1]. CFSIM currently simulates
specifications at the level of rendezvous (or handshakes); it is being extended to simulate
circuits at the level of signal transitions so that it can be used as a simulator during the synthesis
phase. This will be done by treating every wire as an asynchronous port and implementing every
signal transition as a pair of assignment actions on the corresponding asynchronous port. The
second extension planned for CFSIM is to incorporate the ability to verify simple timing constraints.
This will be done by extending hopCP to specify timing constraints as shown in [5] and using the
waitUntil construct of CML.
Acknowledgements: We wish to thank John Reppy of Cornell University for providing CML and
the SML of New Jersey implementation team for providing and supporting SML.
References
1. Akella, V., and Gopalakrishnan, G. Hierarchical Action Refinement: A Methodology
for Compiling Asynchronous Circuits from a Concurrent HDL. In Proceedings of the Tenth
International Symposium on Computer Hardware Description Languages and their Applications,
Marseille, France (Apr. 1991).
2. Akella, V., and Gopalakrishnan, G. hopCP: A Concurrent Hardware Description Language.
Tech. Rep. UUCS-91-021, Department of Computer Science, University of Utah, Oct.
1991.
3. Akella, V., and Gopalakrishnan, G. Specification and Validation of a USART in hopCP.
Tech. rep., Department of Computer Science, University of Utah, 1991. In preparation; available
upon request from the authors.
4. Appel, A. W. Compiling with Continuations. Cambridge Univ. Press, 1992.
ISBN 0-521-41695-7.
5. Nestor, J. A., and Thomas, D. E. Behavioral Synthesis with Interfaces. In Proceedings of
the International Conference on Computer-Aided Design (Nov. 1986), pp. 112-115.
6. Reppy, J. H. CML: A Higher-order Concurrent Language. In ACM SIGPLAN '91 Conference
on Programming Language Design and Implementation (June 1991).
7 Appendix
MODULE CTRL 
TYPE
word : vector 16 of bit; 
addrType : vector 16 of bit;
SYNCPORTS




(PCTRL [] <= (reset? -> ld!0 -> PCTRL [])
           | (push?v1 -> inc! -> write!v1 -> PCTRL_AUX [])
           | (pop? -> dec! -> PCTRL [])
           | (top? -> read! -> PCTRL []) ;
PCTRL_AUX [] <= (reset? -> ld!0 -> PCTRL [])
              | (push?v2 -> inc! -> write!v2 -> PCTRL_AUX [])
              | (pop? -> dec! -> PCTRL [])
              | (top? -> read! -> PCTRL []))
hopCP Specification of the Pipelined Stack Controller
fun pstack () =
let
val addr = AsyncBarrier.mChannel 0
val addr__49 = AsyncBarrier.newPort addr
val top = channel () val pop = channel ()
val push = channel () val reset = channel ()
val write = channel () val ld = channel ()
val read = channel () val inc = channel ()
val dec = channel () val next = channel ()
val dout = channel ()
fun s__21 v1 = (( send (write,v1); PCTRL_AUX () ))
and
s__28 () = (( send (read,0); PCTRL () ))
and
s__25 () = (( send (dec,0); PCTRL () ))
and
s__22 v1 = (( send (inc,0); s__21 v1 ))
and
s__19 () = (( send (ld,0); PCTRL () ))
and
PCTRL () = sync(choose [wrap (receive reset, fn (_u_) =>( s__19 () )),
wrap (receive push, fn (v1) => (s__22 v1 )),
wrap (receive pop, fn (_u_) =>( s__25 () )),
wrap (receive top, fn (_u_) =>( s__28 () ))])
and
PCTRL_AUX () = sync(choose [wrap (receive reset, fn (_u_) =>( s__31 () )),
wrap (receive push, fn (v2) => (s__34 v2 )),
wrap (receive pop, fn (_u_) =>( s__37 () )),
wrap (receive top, fn (_u_) =>( s__40 () ))])
and
s__31 () = (( send (ld,0); PCTRL () ))
and
s__34 v2 = (( send (inc,0); s__33 v2 ))
and
s__37 () = (( send (dec,0); PCTRL () ))
and
s__40 () = (( send (read,0); PCTRL () ))
and
s__33 v2 = (( send (write,v2); PCTRL_AUX () ))
fun TEST () = (( send (reset,0); TEST_AUX () ))
and
TEST_AUX () = (( send (push,1); s__47 () ))
and
s__47 () = (( send (push,0); s__46 () ))
and
s__46 () = (( send (pop,0); s__45 () ))
and
s__45 () = (( send (top,0); s__44 () ))
and
s__44 () = (CIO.print("Waiting for Input on Channel next? \n");
let
val _ = input_int(sync(CIO.input_line std_in))
in
CIO.print("Synchronized on channel next?\n");TEST_AUX ()
end)
fun MEM ms = sync(choose [ wrap (receive write, fn (din) =>
(MEM (let




wrap (receive read, fn (_u_) =>
(MEM_AUX ms (let





MEM_AUX s a =
sync(choose [wrap (receive write, fn (din) => (s__14 s a din)),
wrap (receive read, fn (_u_) =>( s__16 s a ))])
and
s__14 s a din = ( (CIO.print (" Output on Channel dout!"^Integer.makestring(s)^"\n");
MEM (let





s__16 s a = ((CIO.print (" Output on Channel dout!"^Integer.makestring(s)^"\n");
MEM_AUX s (let




fun AREG cs = sync(choose [wrap (receive ld, fn (inp) => (s__3 cs inp )),
wrap (receive inc, fn (_u_) =>( s__5 cs )),
wrap (receive dec, fn (_u_) =>( s__8 cs ))])
and
s__3 cs inp = ((AsyncBarrier.multicast(addr,inp); AREG inp ))
and
s__5 cs = ((AsyncBarrier.multicast(addr,(cs + 1)); AREG (cs + 1) ))
and
s__8 cs = ((AsyncBarrier.multicast(addr,(cs - 1)); AREG (cs - 1) ))
in
spawn (fn () => PCTRL () );
spawn (fn () => TEST () );
spawn (fn () => MEM 0 );
spawn (fn () => AREG 1 );
()
end;
CFSIM Generated CML code for the Pipelined Stack Example 
(Memory || CTRL || AddReg || TEST)
