Using logic programming and coroutining for electronic CAD  by Bieker, Ulrich & Neumann, Andreas
USING LOGIC PROGRAMMING AND
COROUTINING FOR ELECTRONIC CAD
ULRICH BIEKER AND ANDREAS NEUMANN
We show how an extended Prolog can be exploited to implement different
electronic CAD tools. Starting with a computer hardware description
language (CHDL), several problems like digital circuit analysis, simulation,
test generation, and code generation for programmable microprocessors are
discussed. For that purpose the MIMOLA (machine independent
microprogramming language) system MSS (MIMOLA hardware design system)
is presented. It is shown that logic programming techniques have several
advantages, especially in the area of integrated circuit design. One of the
main advantages is the small code size, which translates to easy
maintenance. We make extensive use of two main features of standard Prolog
and constraint logic programming, i.e., backtracking and the coroutining
mechanism, to express Boolean constraints.
1. INTRODUCTION 
Due to the increasing complexity of digital circuits, the design process is supported 
by design tools covering a wide range of problems like synthesis, simulation, verifi- 
cation, test generation, microcode generation, placement, routing, etc. For readers 
not familiar with VLSI design, we first describe the design subtasks before describ- 
ing our work. Many of these subtasks are of high complexity, e.g., test generation 
is even NP-complete. Therefore, electronic CAD systems, commonly written in 
imperative languages, consist of a very large amount of source code. Maintenance,
portability, and adaptability are problems. We will describe significant software
engineering advantages by the use of Prolog for these problems.
Address correspondence to Ulrich Bieker, University of Dortmund, Department of Computer
Science, D-44221 Dortmund, Germany or Andreas Neumann, University of Trier, Department
of Computer Science, D-54286 Trier, Germany. E-mail: bieker@ls12.informatik.uni-dortmund.de;
neumann@ti.uni-trier.de.
Received July 1994; accepted June 1995.
THE JOURNAL OF LOGIC PROGRAMMING
© Elsevier Science Inc., 1996 0743-1066/96/$15.00
655 Avenue of the Americas, New York, NY 10010 SSDI 0743-1066(95)00099-6
MIMOLA [2] is a computer language with Pascal-like constructs. It supports 
design, test, simulation, and programming of digital computers and is integrated 
into the CAD system MSS [18, 19]. MIMOLA, influenced by other hardware de- 
scription languages like VHDL [14], allows structural and behavioral descriptions 
of circuits. Originally the complete system was written in Pascal, but beginning 
with MIMOLA 4.0, we started to redesign tools using Prolog. 
Using the extended Prolog system ECLIPSE [10], new concepts to solve problems 
in the area of digital circuit design have been found. For example, coroutining, 
which allows the user to express a condition under which a call to a specified 
goal will be delayed, is a very useful mechanism to avoid unnecessary backtracking 
during simulation, test, and code generation. 
Several approaches to digital circuit design using logic programming have been 
presented [7, 9, 11, 12, 24-26], most of them concentrating on the gate level or 
even lower levels of abstraction. Only a few contributions consider higher levels of
abstraction in the context of logic programming [8, 13, 16, 20, 23].
In this paper we describe the use of Prolog for a very high level of abstraction. 
A very elegant simulator, based on a hardware description language and a suitable 
Prolog circuit representation based on trees, is presented. The simulator is able to 
simulate a processor together with a given microprogram. We also present a concept
to generate microcode for a given hardware structure that can be used to test the 
processor. The part of the MSS system concerning this paper is shown in Figure 1. 
FIGURE 1. System overview. A hardware description (MIMOLA or VHDL) is translated by
the front-end into a circuit tree; the circuit analyzer derives the circuit info; the
retargetable self-test program compiler produces microcode, stimuli, and
initializations; the simulator produces a simulation log.
Starting with a circuit given as a MIMOLA or VHDL RT-level hardware de- 
scription, a tree-based Prolog circuit representation is generated by a front-end 
compiler. Afterwards, a circuit analyzer creates a circuit info file that can be used 
as input for the code generator. A retargetable compiler maps a program onto the 
given hardware, resulting in a microprogram and a set of external stimuli patterns 
for the primary inputs. Finally the generated program can be simulated together 
with the circuit description, an initialization file for registers and memories, and 
the set of stimuli. 
In what follows we first introduce a small processor to be used as an example 
throughout the paper. We continue with the simulator concept based on three
levels of abstraction, followed by a section describing the circuit analyzer. In the 
last section we generate a load instruction as a typical example of microcode gen- 
eration. 
2. SIMPLECPU: A SMALL EXAMPLE PROCESSOR
Figure 2 shows SIMPLECPU, a small programmable microprocessor consisting of
eight modules. The SIMPLECPU controller (shaded area) consists of a program
counter, an instruction memory, an incrementer, and a multiplexer. A 16 × 4
register file, a 4-bit ALU, a second multiplexer, and a clock are also part of the
CPU structure. The register file and the program counter are connected to the
clock (not shown) and control signals are denoted by "c" followed by an index
range.
FIGURE 2. SIMPLECPU. Block diagram showing the controller with the instruction
memory, the datapath, the primary input, and the primary output.
MIMOLA hardware descriptions contain register transfer modules, their
behavior and their interconnections. For instance, the 4-bit ALU is specified in
MIMOLA as follows:
MODULE ALU (IN a, b : (3:0); IN ctr : (1:0);
            OUT result : (3:0); OUT condition : (0:0));
BEHAVIOR IS CONBEGIN
    result <- CASE ctr OF
        0: a;
        1: b;
        2: a+b;
        3: a-b;
    END AFTER 1;
    condition <- CASE ctr OF
        0: a=0;
        1: b=0;
        2: a+b=0;
        3: a-b=0;
    END AFTER 0;
CONEND;
CONBEGIN and CONEND denote a concurrent block, containing two case
expressions as assignments to the outputs. In MIMOLA, the default data type is the
bit vector. Its index range is denoted as (high-bit : low-bit), i.e., the ALU has two
4-bit data inputs a and b, a 4-bit output result, a 1-bit output condition, and a
2-bit control input ctr selecting the ALU function.
Using MIMOLA as the input language, we generated a tree-based Prolog
intermediate format in two steps. First MIMOLA is transformed into TREEMOLA
[5, 6], an intermediate language of the MSS. The second step is done by a converter
written in Prolog, which leads to a circuit representation as a list of module
descriptions. Every module consists of a list of connections, a list of storing cells and
a behavior tree as shown in Figure 3 for a part of the ALU. Such a tree is easily
represented by a Prolog structure. The list of connections contains information
about inputs and outputs of the module and interconnections to other modules.
Every signal is represented by a logic variable and this variable also occurs in the
behavior tree when the signal is referenced. If signals are instantiated elsewhere,
this leads to an immediate signal propagation to all modules using this signal.
FIGURE 3. Behavior tree. 
3. SIMULATION OF A CHDL
The implemented simulator is based on three levels of abstraction: the built-in op- 
erators, an interpreter for the behavior of a single component, and an event driven 
simulator for a circuit together with the microcode. Especially for the implemen- 
tation of the operators, we made extensive use of the coroutining concept of the 
ECLIPSE language. 
3.1. Implementation of Operators
In order to interpret a hardware description language, an implementation of its 
built-in operators is necessary. These range from logic primitives to complex arith- 
metic operators. They are represented as Prolog predicates, which mainly have to 
meet the following criteria: 
1. The operators must work bidirectionally, so that they can also be used for 
backward simulation of a circuit. 
2. They should work deterministically, i.e., subsequent backtracking steps do
not produce the same solution again. This is especially important for backward
simulation, because the mapping of an operator is not necessarily
uniquely reversible. Certain backtracking alternatives have to be pruned to
avoid duplicate solutions.
3. The computation must be, at least at the operator level, data driven, i.e., 
the application of an operator to unbound variables is propagated symboli- 
cally as a delayed goal, until the instantiation of the variables is absolutely 
unavoidable. In this way, the number of backtracking steps is reduced. 
The third point is achieved by using the coroutining mechanism of the ECLIPSE 
language, which allows the programmer to specify conditions, under which the 
execution of a goal shall be delayed, depending on the bindings of its parameters. 
Whenever a variable occurring in one of these is bound, either to a value or another 
variable, the goal will be enabled, and the delay conditions will be checked again. 
At the end of a simulation the set of all delayed constraints must be consistent. 
There should be a constraint solver¹ that finds contradictions and, if possible,
solutions for variable bindings. Since such a constraint solver is rather complex,
there should only be a few types of constraints. It would be sufficient to consider a
minimal complete set of operators, but for efficiency reasons we used a set containing 
AND, OR, XOR, and NOT. The Prolog code for those operators is divided into 
delay clauses and program clauses, e.g., the logical AND is implemented as and/3, 
with X and Y as input parameters and Z as output parameter: 
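% Delay while nothing can be concluded: both inputs are distinct unbound
% variables and the output is either unbound or 0.
delay and(X,Y,Z) if var(X), var(Y), var(Z), X \== Y.
delay and(X,Y,Z) if var(X), var(Y), Z == 0, X \== Y.
% One input is bound: pass it to and1/3 as the first argument.
and(X,Y,Z) :- nonvar(Y), !, and1(Y,X,Z).
and(X,Y,Z) :- nonvar(X), !, and1(X,Y,Z).
% Output is 1, or both inputs are the same variable.
and(X,X,X).
and1(0,_,0).
and1(1,X,X).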
¹ We developed a Boolean constraint handler using the constraint handling rules (CHR) of the
ECLIPSE system.
The delay clauses cover the case when the two input parameters are distinct un- 
bound variables and the output parameter is either unbound or zero. In these cases 
it is impossible to draw any conclusion, so the call to the predicate is delayed. The 
program clauses use the commutativity of the logical AND. The first two of them 
deal with the case when one of the inputs is bound, and invoke and1/3 with the
bound input as the first argument. For the third clause there are, due to the delay 
clauses, only two possibilities left: either the output is 1, which forces the inputs to 
take the same value, or the two inputs are identical variables, to which the output 
will be bound, too. The auxiliary predicate and1/3 expects its first input to be
instantiated. If it is bound to a 0, the result must be 0; if it is 1, the output is 
identical to the second input. 
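The delaying behavior described above can be approximated in Prolog systems that do not offer ECLIPSE delay clauses. The following is a minimal sketch, assuming SWI-Prolog and its when/2 coroutining predicate; the name and3/3 is hypothetical, and the code is an illustration rather than the implementation used in the MSS.

```prolog
:- use_module(library(when)).

% and3(X, Y, Z): Z is the logical AND of X and Y (assumed name, not from the paper).
% The goal suspends itself while no conclusion can be drawn.
and3(X, Y, Z) :-
    (   nonvar(Y) -> and1(Y, X, Z)        % Y is known
    ;   nonvar(X) -> and1(X, Y, Z)        % X is known
    ;   X == Y    -> Z = X                % identical inputs: output equals them
    ;   Z == 1    -> X = 1, Y = 1         % output 1 forces both inputs to 1
    ;   nonvar(Z) ->                      % output 0: wait for an input
        when((nonvar(X) ; nonvar(Y) ; ?=(X, Y)), and3(X, Y, Z))
    ;   % everything unbound: wait for any binding
        when((nonvar(X) ; nonvar(Y) ; nonvar(Z) ; ?=(X, Y)), and3(X, Y, Z))
    ).

% and1(+Known, ?Other, ?Out): the first argument must be instantiated.
and1(0, _, 0).
and1(1, X, X).
```

For example, the query and3(X, Y, Z), X = 1, Y = 1 leaves Z bound to 1, while and3(X, Y, 0) simply suspends until one of the inputs becomes known.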
The more complex operators are based on these four logical primitives, e.g., a 
full adder can be defined as follows: 
halfadd(In1, In2, Sum, Cout) :-
    and(In1, In2, Cout),
    xor(In1, In2, Sum).
fulladd(In1, In2, Cin, Sum, Cout) :-
    halfadd(In1, In2, Sum1, Carry1),
    halfadd(Cin, Sum1, Sum, Carry2),
    or(Carry1, Carry2, Cout).   % the carry out is the OR of both half-adder carries
Of course the set of operators is not restricted to single bit operations, but for 
each of them there is also a version for bit strings, which are represented as lists. 
On top of these there are built arithmetic operators like addition and multiplication 
and string manipulation operators like shifting and concatenating. 
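To illustrate how bit-string operators can be layered on the single-bit primitives, here is a minimal sketch of a ripple-carry adder over MSB-first bit lists. It assumes plain truth tables for the primitives instead of the coroutined and/3, xor/3, and or/3, and the names and_bit/3, or_bit/3, xor_bit/3, and add_bits/4 are hypothetical.

```prolog
% Single-bit primitives as plain truth tables (no coroutining in this sketch).
and_bit(0, 0, 0).  and_bit(0, 1, 0).  and_bit(1, 0, 0).  and_bit(1, 1, 1).
or_bit(0, 0, 0).   or_bit(0, 1, 1).   or_bit(1, 0, 1).   or_bit(1, 1, 1).
xor_bit(0, 0, 0).  xor_bit(0, 1, 1).  xor_bit(1, 0, 1).  xor_bit(1, 1, 0).

halfadd(A, B, Sum, Cout) :-
    and_bit(A, B, Cout),
    xor_bit(A, B, Sum).

fulladd(A, B, Cin, Sum, Cout) :-
    halfadd(A, B, S1, C1),
    halfadd(Cin, S1, Sum, C2),
    or_bit(C1, C2, Cout).

% add_bits(?As, ?Bs, ?Sums, ?Cout): lists of equal length, most significant bit first.
% The recursion reaches the least significant bit first and hands the carry upward.
add_bits([], [], [], 0).
add_bits([A|As], [B|Bs], [S|Ss], Cout) :-
    add_bits(As, Bs, Ss, Cin),
    fulladd(A, B, Cin, S, Cout).
```

Because every primitive is a relation, the same predicate also runs backward: add_bits(As, [0,0,0,1], [0,0,0,0], C) yields As = [1,1,1,1] and C = 1, found by backtracking instead of by delayed goals.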
3.2. Interpretation of a Behavior Tree 
For the interpretation of the behavior tree of a module it is necessary to model the 
context, i.e., the contents of memory cells and input signals at a given time. A 
signal is now represented as a sorted binary tree, with a time and a value mark at 
each node. Readers familiar with logic programming recognize this as a common 
dictionary. Updates on a signal are realized by the following predicate: 
sigValue((T, Val, _Before, _After), T, Val) :- !.
sigValue((Time, _Val, Before, _After), T, Val) :-
    T < Time, sigValue(Before, T, Val).
sigValue((Time, _Val, _Before, After), T, Val) :-
    T > Time, sigValue(After, T, Val).
Lookups are realized in a similar way, except that if there is no entry for the
specified time, the most recent earlier entry must be found, because the signals are
assumed to hold their value.
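A lookup on such an open-ended signal tree might be sketched as follows. This is an assumption-laden illustration: sig_lookup/3 is a hypothetical name, the tree layout (Time, Val, Before, After) is taken from sigValue/3, and "holding" is read as "the latest entry whose time is not after T". sigValue/3 is repeated so that the block is self-contained.

```prolog
% Update predicate from the text; unbound subtrees are the open leaves.
sigValue((T, Val, _Before, _After), T, Val) :- !.
sigValue((Time, _Val, Before, _After), T, Val) :-
    T < Time, sigValue(Before, T, Val).
sigValue((Time, _Val, _Before, After), T, Val) :-
    T > Time, sigValue(After, T, Val).

% sig_lookup(+Tree, +T, -Val): value of the latest entry with time =< T.
% Fails if the signal has no entry at or before T. Never extends the tree.
sig_lookup(Tree, T, Val) :-
    sig_lookup(Tree, T, none, found(Val)).

sig_lookup(Tree, _T, Best, Best) :-
    var(Tree), !.                           % open leaf: return best candidate so far
sig_lookup((Time, V, Before, After), T, Best, Result) :-
    (   T =:= Time -> Result = found(V)
    ;   T < Time   -> sig_lookup(Before, T, Best, Result)
    ;   sig_lookup(After, T, found(V), Result)   % this entry is a candidate
    ).
```

After sigValue(S, 5, a), sigValue(S, 2, b), sigValue(S, 9, c), the query sig_lookup(S, 7, V) gives V = a, since the entry at time 5 is still held at time 7.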
Input ports of a module and memory cells are represented by port descriptions, 
which are nothing more than lists of such signals. The contents of memory cells of a 
circuit are held in a dictionary. This dictionary is stored in a binary tree structure 
similar to the signal tree. In this case the search key is an atom consisting of an 
identifier and a list of values. The identifier can be the name of a register or a pair 
consisting of a memory name and an address. The values are port descriptions.
FIGURE 4. Program counter behavior tree, with the corresponding Prolog term
at_up(input(clk), load(pc, input(in), delay(1))).
The comparison predicates in the clauses of the concerned lookup predicate must 
then be replaced by the standard term order comparators. 
The interpreter itself has only three parameters: the behavior tree, a time frame, 
and the dictionary with all global values in it, and is implemented inductively on 
the structure of the behavior tree. Such a tree normally consists of some concur- 
rent statements, which may contain nested expressions. For the interpretation of 
statements, a behavior tree and the corresponding representation as a Prolog term
are shown in Figure 4. The figure shows the statement behavior tree for loading 
the program counter when the clock rises. 
Calling the interpreter with this statement tree will invoke one of the following 
clauses: 
interpret(at_up(ClkExp, Statement), Time, Dictionary) :-
    Time1 is Time-1,
    interpret_exp(ClkExp, Time, Dictionary, [1]),
    interpret_exp(ClkExp, Time1, Dictionary, [0]),
    !,
    interpret(Statement, Time, Dictionary).
interpret(at_up(_Clk, _Stmnt), _Time, _Dictionary).
If the calls to interpret_exp/4 are successful, the interpreter calls itself with the 
load statement as an argument. This call will relate to the following clause, which 
adds the specified delay factor to the current time and enters the value of the input
expression into the port description of the program counter, which is taken from 
the dictionary: 
interpret(load(RegId, Expr, delay(Delay)), Time, Dictionary) :-
    lookup(RegId, Dictionary, PartPort),
    interpret_exp(Expr, Time, Dictionary, Value),
    NewTime is Time+Delay,
    portValue(PartPort, NewTime, Value).
Note that the delay structure in the behavior tree is distinct from the coroutining 
built-in with the same name. Other constructs like conditional or case statements, 
writing to an output port, concurrent nodes, etc., are implemented similarly. 
FIGURE 5. Incrementer behavior tree, with the corresponding Prolog term
output(out, incr(input(in))).
The interpreter for expressions has one more argument for returning the value
of an expression. Except for this, its structure is the same [consider the behavior of
the program counter incrementer (Figure 5)]. The interpreter clause for the output
tree, which is very similar to that for the load statement, will call interpret_exp/4
with the incr subexpression, invoking the following clause:
interpret_exp(incr(Expr), Time, Dictionary, Value) :-
    interpret_exp(Expr, Time, Dictionary, ArgValue),   % evaluate the argument first
    incr(ArgValue, Value).                             % then apply the operator to its value
The method is first to evaluate the arguments of an operator and then to apply 
it to the results. The dictionary is needed here only for the read expression, which is 
evaluated as the value of a storage. More complex expressions like the conditional 
or case construct are implemented in the same way. 
3.3. An Event Driven Simulator 
The task of the simulator is to simulate the behavior of a circuit, given the initial 
state of the storage and the values of the primary inputs for the considered time 
interval. The circuit consists of a set of modules with a specified behavior which 
are interconnected by some signals. In an event driven simulator, an event is a 
pair consisting of a time and a module behavior. All events yet to be simulated 
are held in a queue, which is initialized at the start of the simulation by all events 
that are involved by the change of a primary input, the toggle of a clock, or the 
initialization of a register or memory. A new event for a module is generated if and 
only if at least one of its input signals or one of its storing cells has changed due to
simulation of a former event. Thus the execution of one event is the following: 
doOneEvent(Modul, Time, Dictionary, NewEvents) :-
    Modul = (Name, Behavior, Connections, Stores),
    interpret(Behavior, Time, Dictionary),
    storeEvents(Stores, Name, Time, Events1),
    newEvents(Connections, Time, Events2),
    append(Events1, Events2, NewEvents).
newEvents([], _, []).
newEvents([(Mod, Signals) | RestCons], Time, [(Mod, ChangeTime) | RestEvents]) :-
    lastChange(Signals, ChangeTime),
    ChangeTime > Time, !,
    newEvents(RestCons, Time, RestEvents).
newEvents([_ | RestConnections], Time, Events) :-
    newEvents(RestConnections, Time, Events).
The predicate storeEvents/4 is similar to newEvents/3, but checks the storage 
of the module for changes, and if any change is detected, generates an event for the 
same module. Note that the event queue must be sorted and allow no duplicates. 
Moreover, there must be a kind of priority for the order of simulation of two events
with the same time, a feature which was omitted here.
The simulator itself is now defined as follows:
simulate_circuit(CircuitName, MaxTime) :-
    ... % get the circuit information
    doClocks(Clocks, MaxTime, ClockEvents),
    doInPorts(Stimuli, InPorts, InputEvents),
    initStores(InitLines, Dictionary, InitEvents),
    mergeEvents(ClockEvents, InputEvents, InitEvents, Events),
    doAllEvents(Events, Dictionary, MaxTime).
doAllEvents([], _, _).
doAllEvents([(_, Time) | _], _, MaxTime) :-
    Time > MaxTime.
doAllEvents([Event | RestEvents], Dict, MaxTime) :-
    doOneEvent(Event, Dict, NewEvents),
    merge(NewEvents, RestEvents, EventsAfter),
    doAllEvents(EventsAfter, Dict, MaxTime).
The predicate merges the new events after each step with the remaining ones
from the queue and calls itself recursively with the result, until the queue is empty
or the maximum time is reached.
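The requirement that the event queue be sorted and free of duplicates can be met very compactly if events are represented as Time-Module pairs. That representation is an assumption of this sketch (the clauses above use (Module, Time) tuples), and merge_events/3 is a hypothetical name, not the merge/3 used by the simulator.

```prolog
% merge_events(+NewEvents, +Queue, -Merged)
% Events are Time-Module pairs (assumed representation). sort/2 orders terms by
% the standard order, i.e., by Time first, and removes duplicate events.
merge_events(NewEvents, Queue, Merged) :-
    append(NewEvents, Queue, All),
    sort(All, Merged).
```

For instance, merge_events([3-alu, 1-pc], [1-pc, 2-im], M) gives M = [1-pc, 2-im, 3-alu]. Events with equal times are ordered here by module name only; the module priorities needed for a correct order are not modeled in this sketch.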
For simulating a circuit together with a microprogram, one only has to specify
the code as initialization to different lines of the instruction memory and start the
simulator. Consider the following example program for SIMPLECPU:
Example 3.1.
PROGRAM sum_up IS
VAR x : nibble;
BEGIN
    x := 1;
    REPEAT x := x+pi; UNTIL x = 0;
    STOP;
After initializing a variable x with 1, a loop adds x to the value of the primary
input until x is zero. The microcode shown in Table 1 consists of five instructions.
IM 0 denotes the memory content of the instruction memory with address 0.
With the primary input constantly set to [0,1,0,1], the simulation of this program
passed 149 events, which took 0.83 s of CPU time on a SPARC 10 workstation.
Note that every event means simulation of a complete behavior tree.
TABLE 1. Example microprogram.
Bits   19  18:15  14:9    8:7  6:5  4  3:0
IM 0   0   0001   XXXXXX  00   01   0  0000
IM 1   X   XXXX   XXXXXX  00   XX   1  XXXX
IM 2   1   XXXX   XXXXXX  00   10   0  0000
IM 3   X   XXXX   000001  10   00   1  0000
IM 4   X   XXXX   XXXXXX  00   XX   1  XXXX
4. CIRCUIT ANALYSIS
4.1. Simulator Priorities 
When simulating a circuit, it is necessary to give priorities to different modules con- 
cerning the order in which to simulate two events at the same time. The reason for 
this is the causal dependencies between components, which are connected without 
delay. This priority can be compared to the Δ-delay of VHDL. The intention is
that an event may be simulated only when all events its inputs depend on have been
considered before, i.e., the priority of a module is the maximum of the priorities of
all its predecessors incremented by 1. Assume that we have already computed a
priority list of triples (Mod, Prio, Preds), where Preds is a list of pairs (Mod,Prio'),
so that every occurrence of a module in the whole structure has its Prio component
bound to the same variable. Now, for each element of the priority list, we only have
to compute the maximum priority in the predecessor list and bind the priority to
this value incremented by 1:
% max/3 is delayed until both values to be compared are known.
delay max(A,B,M) if var(A).
delay max(A,B,M) if var(B).
max(A,B,A) :- A > B, !.
max(A,B,B).
% maxPriority(Preds, Max0, Max): Max is the largest priority in Preds, at least Max0.
maxPriority([], Max, Max).
maxPriority([(_, Prio) | Rest], Max0, Max) :-
    max(Prio, Max0, Max1),
    maxPriority(Rest, Max1, Max).
setPriorities([]).
setPriorities([(_Mod, Prio, Preds) | Rest]) :-
    maxPriority(Preds, 0, MaxPrio),
    plus(MaxPrio, 1, Prio),
    setPriorities(Rest).
In standard Prolog this would lead to difficulties because we could not compare 
unbound variables. This is easily resolved by delaying the max/3 predicate. If there 
are no critical races in the circuit, i.e., there are no cyclic dependencies, there must 
be at least one module whose predecessor list is empty, so it will get priority 1. 
This will wake up at least one other max goal and so on, so that all priorities will 
be computed correctly. If there is a cycle, then a conflict occurs and an error must 
be raised. Such a conflict can easily be detected by checking for delayed goals by 
a system call. Note that the plus/3 predicate must also be delayed, which is done 
automatically by ECLIPSE. 
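The same data-driven computation can be sketched without ECLIPSE delay clauses. The version below assumes SWI-Prolog and uses when/2 to suspend both the maximum and the increment; the names max_d/3, max_priority/3, and set_priorities/1 are hypothetical and the code is only an illustration of the idea described above.

```prolog
:- use_module(library(when)).

% max_d(A, B, M): M is the maximum of A and B, computed once both are known.
max_d(A, B, M) :-
    when((nonvar(A), nonvar(B)), M is max(A, B)).

max_priority([], Max, Max).
max_priority([(_, Prio)|Rest], Max0, Max) :-
    max_d(Prio, Max0, Max1),
    max_priority(Rest, Max1, Max).

% set_priorities(+List): List holds (Module, Prio, Preds) triples that share
% one Prio variable per module, as assumed in the text.
set_priorities([]).
set_priorities([(_Mod, Prio, Preds)|Rest]) :-
    max_priority(Preds, 0, MaxPrio),
    when(nonvar(MaxPrio), Prio is MaxPrio + 1),   % delayed increment
    set_priorities(Rest).
```

With Ps = [(c, Pc, [(a,Pa),(b,Pb)]), (b, Pb, [(a,Pa)]), (a, Pa, [])], the call set_priorities(Ps) binds Pa, Pb, and Pc to 1, 2, and 3 regardless of the order of the list. For a cyclic structure the priorities simply stay unbound, so in this sketch a cycle can be detected by testing the Prio variables with var/1 afterwards.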
4.2. Microcode Preparation 
To prepare code generation, several tasks are done by the circuit analyzer. The 
main task is to generate a lot of facts describing special characteristics of a given 
circuit to reduce the complexity of code generation. Application of some facts is 
shown in the following section. Table 2 gives an overview of some generated facts,
but due to the lack of space, not all generated facts can be considered in detail. In
the following we want to describe these facts and the methods to generate them.
TABLE 2. Some selected facts, generated by circuit analysis.
Fact/arity       Arguments
transparent/3    Module name, list of inputs, list of outputs
path/3           Source, destination, path
incrementPC/2    Delayed goal, path
jump/1           Path
Examples:
transparent(alu,
    [[D,C,B,A], [0,0,0,0], [1,0]],
    [[D,C,B,A], [Condition]]).
path(im, reg, [
    (im, [[_,_,_,_,_,_]], [[0,D,C,B,A,_,_,_,_,_,_,_,_,0,1,_,_,_,_,_]]),
    (mux, [[_,_,_,_], [D,C,B,A], [0]], [[D,C,B,A]]),
    (alu, [[_,_,_,_], [D,C,B,A], [0,1]], [[D,C,B,A],[_]]),
    (reg, [[_,_,_,_], [D,C,B,A], [_], [_]], [[_,_,_,_]])]).
incrementPC(incr([F,E,D,C,B,A], [L,K,J,I,H,G]), [
    (inc, [[F,E,D,C,B,A]], [[L,K,J,I,H,G]]),
    (pcmux, [[_], [0,0], [L,K,J,I,H,G], [_,_,_,_,_,_]], [[L,K,J,I,H,G]]),
    (pcreg, [[L,K,J,I,H,G], [_]], [[F,E,D,C,B,A]])]).
jump([
    (im, [[_,_,_,_,_,_]], [[_,_,_,_,_,F,E,D,C,B,A,0,1,_,_,_,_,_,_,_]]),
    (pcmux, [[_], [0,1], [_,_,_,_,_,_], [F,E,D,C,B,A]], [[F,E,D,C,B,A]]),
    (pcreg, [[F,E,D,C,B,A], [_]], [[_,_,_,_,_,_]])]).
One of these facts is transparent/3, denoting an identity mapping from one input
to at least one output, so that the module becomes "transparent." That means that
with a special control code, the considered module is able to pass one input to one
output. Most of these facts might be found at multiplexer modules. On the other
hand, the transparent/3 example of Table 2 shows a possibility to switch input
a of the SIMPLECPU ALU to the output result, i.e., the signal list [D,C,B,A] is
switched. This is done by unifying input b with the neutral element [0,0,0,0], to
perform an identity mapping for the selected operator. The binary control code
c(6:5) = [1,0] selects the add operator of the concerned ALU.
How can we generate a transparent/3 fact? Using the interpreter and the operators
defined in Section 3, this task is easy to solve. The basic idea is to unify one
module input with one module output and to perform an interpretation step for
this module. The interpretation step has to lead to an instantiation of some inputs
for the following reasons:
1. Choosing a control code to select an operation that is able to perform an
identity mapping (e.g., c(6:5) = [1,0] to select ALU addition).
2. If necessary, choosing a neutral element for the selected operation (some
operations do not need a neutral element, e.g., a multiplexer or the ALU operation
selected by the control code c(6:5) = [0,0] to switch input a to the output
result).
A successive selection of all operations performed by a module is done by
backtracking. Afterwards, the selected operation has to be executed symbolically, holding
the input port to be switched as the list of variables. Execution of the selected
operation Op is done by the clause findTransparent/4. The lists library predicate
checklist/2 succeeds if var/1 succeeds for every element of SwitchPort, ensuring
that the selected input is switched to the selected output for all possible values of
SwitchPort. Finally, we assert the generated fact.
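findTransparent(Module, Op, InPorts, OutPorts) :-
    member(SwitchPort, InPorts),             % choose an input port ...
    member(SwitchPort, OutPorts),            % ... and unify it with an output port
    Operation =.. [Op, InPorts, OutPorts],
    call(Operation),                         % execute the operation symbolically
    checklist(var, SwitchPort),              % the switched bits must stay unbound
    assert(transparent(Module, InPorts, OutPorts)),
    fail.                                    % failure driven loop: try all alternatives
findTransparent(_, _, _, _).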
The fact considered next is path/3, describing a path from a source module to
a destination module, possibly through certain other modules which are able to
perform an identity mapping. A fact path/3 is a triple with parameters Source,
Destination, and Path. Path is a list of triples (module name, list of inputs, list
of outputs). The first element of the list is the source module, whereas the last
element is the destination module. All modules between source and destination are
able to switch an input to an output by the use of transparent/3. A path/3 fact
contains all control codes, i.e., signals which have to be 0 or 1 to switch the Path.
As source and destination, only sequential modules, i.e., modules that are able to
store a value, are considered. Additionally, modules able to yield a constant, e.g.,
a decoder, can serve as a source. The example given in Table 2 shows a Path from
the instruction memory im through the multiplexer mux and the alu to the register
file reg. Therefore binary control codes c(19) = [0] for the multiplexer and c(6:5)
= [0,1] to switch [D,C,B,A] via the alu are selected. [D,C,B,A] is the list of values
connected by this path.
A simplified version of the predicate generating path/3 facts is findPath/3. The 
first clause terminates the search of a path if Destin is a direct successor of Source. 
In the second clause we try to find a path through a module Next, which has 
to be a successor of the current Source and must be switched into a transparent 
mode. Afterwards, a recursive search with Next as source is started. A lot of 
implementation details are omitted, e.g., the check to prevent entering a cycle and 
the complete circuit representation. 
findPath(Source, Destin, [Source, Destin]) :-
    successor(Source, Destin).
findPath(Source, Destin, [Source | RestPath]) :-
    successor(Source, Next),
    transparent(Next, _Inputs, _Outputs),
    findPath(Next, Destin, RestPath).
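The cycle check that is omitted from findPath/3 could be added with a list of already visited modules. The following self-contained sketch assumes a toy circuit with a feedback edge and a simplified transparent/1 fact without port lists; find_path/3, find_path/4, and the successor/2 facts are hypothetical and only serve to illustrate the idea.

```prolog
% Toy circuit (assumed): im -> mux -> alu -> reg, plus a feedback edge alu -> mux.
successor(im, mux).
successor(mux, alu).
successor(alu, reg).
successor(alu, mux).

% Modules that can be switched into a transparent mode (ports omitted here).
transparent(mux).
transparent(alu).

% find_path(+Source, +Destin, -Path): Path is a list of module names.
find_path(Source, Destin, Path) :-
    find_path(Source, Destin, [Source], Path).

find_path(Source, Destin, _Visited, [Source, Destin]) :-
    successor(Source, Destin).
find_path(Source, Destin, Visited, [Source|RestPath]) :-
    successor(Source, Next),
    \+ memberchk(Next, Visited),          % do not enter a module twice
    transparent(Next),
    find_path(Next, Destin, [Next|Visited], RestPath).
```

The query find_path(im, reg, P) has the single solution P = [im, mux, alu, reg]; without the visited list, backtracking would follow the feedback edge from alu to mux indefinitely.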
A frequent subtask of microcode generation is to increment the program counter. 
Therefore we generate a symbolic increment instruction where the address is un- 
bound. The real address will be instantiated at the end of code generation. For 
that reason we generate a delayed goal, so that the code generator is able to bind 
these addresses to real values with respect to certain constraints. As a consequence
of that an increment instruction incrementPC/2 is a pair, containing a delayed goal 
that performs the increment operation and a Path from the output of the program 
counter to the input of the program counter. In the generated Path, two occurrences 
of the program counter are avoided by omitting the program counter as source. Path 
is a list of triples as described above. The given example of Table 2 shows the unique 
solution to increment the program counter pcreg for the example processor. There- 
fore the binary control code c(8:7) = [0,0] is selected for the multiplexer pcmux. 
[F,E,D,C,B,A] is the current state of the program counter, whereas [L,K,J,I,H,G] 
will be the next state. The delayed goal incr([F,E,D,C,B,A],[L,K,J,I,H,G]) denotes 
the operation to be executed at the end of code generation. 
A further subtask of code generation is to perform unconditional jumps, i.e., to 
move a constant value into the program counter without consideration of a condition
from the arithmetic unit. Therefore, jump/1 is simply a fact denoting a Path from 
a sequential source module to the program counter. SIMPLECPU has only one 
possibility to perform such an unconditional jump by selecting c(8:7) = [0,1] as 
control code for the multiplexer pcmux as shown in Table 2. The new symbolic 
jump address [F,E,D,C,B,A] originates from the instruction memory im. 
The facts incrementPC/2 and jump/1 are mainly generated by the use of path/3
and transparent/3. Using failure driven loops (see, e.g., findTransparent/4), all
possible solutions of the described facts are generated and asserted. 
We conclude this section by enumerating some additional facts not considered 
here: 
1. constant/3 denotes a module that is able to yield a constant as output (e.g., 
a decoder). 
2. conditionalJump/2 denotes a conditional jump version, i.e., a conditional path
to the program counter. 
3. noload/3 denotes a micro instruction, indicating that the contents of a register 
or memory must not change to prevent side effects. 
We have tested the circuit analyzer with several examples. One of them is PRIPS, 
a coprocessor with a RISC-like instruction set, which provides data types and
instructions supporting the execution of Prolog programs. The structure consists of
50 register transfer modules. A complete circuit analysis took 77 s, leading to 1/2 
MB of facts. 
5. CODE GENERATION 
A microcode generator is a tool for mapping algorithms to predefined hardware 
structures by generating the required binary code. If such a compiler is target 
independent, i.e., the programmable microprocessor is an input of the compiler, 
we call this method retargetable compilation [21, 22]. The original intention for 
this work is to generate self-test microcode, i.e., microcode that is able to perform 
a test for programmable microprocessors. The following example describes code 
generation for a variable assignment, called load instruction. 
Assume that the assignment reg[0] := 1, used as the first instruction of the
simulation example, is to be generated, i.e., we want to load register 0 of the register
file with 1.
The binary values are [0,0,0,0] for the address and [0,0,0,1] for the data to be loaded. 
After a justification step has driven necessary values for the load instruction to the 
inputs of the register file, the following three values have to be generated: 
address = [0,0,0,0]; data = [0,0,0,1]; c(4:4) = [0]. 
TABLE 3. Binary code for reg[0] := 1.
Bits   19   18:15   14:9     8:7   6:5   4   3:0
Code   0    0001    XXXXXX   00    01    0   0000
Having unified the input ports of the register file with these values, we have to 
perform a backward simulation to search for modules that are able to yield the con- 
stants. This module is usually the programmable instruction memory or a decoder. 
Backward simulation is in general nondeterministic, and therefore the backtracking and
bidirectionality of Prolog are advantageous. 
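The nondeterminism of backward simulation can be illustrated with a minimal sketch. It is written in Python, with a generator standing in for Prolog backtracking, and is not the authors' implementation. The multiplexer wiring, the module names reg, dec, and im, and the rule that a decoder yields only one-hot constants are all hypothetical assumptions made for illustration.

```python
# Hypothetical wiring: multiplexer select value -> module driving that input.
MUX_INPUTS = {0: "reg", 1: "dec", 2: "im"}

def can_yield_constant(module, value):
    """Assumed capabilities of each module kind (illustrative only)."""
    if module == "im":        # programmable instruction memory: any constant
        return True
    if module == "dec":       # decoder: assumed to produce one-hot patterns only
        return sum(value) == 1
    return False              # a register's content is not known in advance

def justify(value):
    """Yield every (select, module) choice that drives `value` to the mux output.

    Each yielded item corresponds to one alternative that Prolog would
    reach on backtracking.
    """
    for select, module in MUX_INPUTS.items():
        if can_yield_constant(module, value):
            yield select, module

solutions = list(justify((0, 0, 0, 1)))
print(solutions)
```

Under these assumptions the constant [0,0,0,1] has two justifications (via the decoder or via the instruction memory), whereas a value such as [0,0,1,1] leaves only the instruction memory.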
In our example, the control code c(4:4) and the address = c(3:0) are direct 
predecessors of the instruction memory. It is more difficult to have the data loaded, 
because we have to pass the value [0,0,0,1] through certain modules. However, with 
the use of the path/3 facts generated before, the problem is easy to solve. The path/3 
fact shown in Table 2 gives all information to generate a solution for the required 
data transfer. Table 3 shows the resulting binary code. If this instruction is part of 
a complete microprogram, additional tasks could be done concurrently. The address 
for the next instruction has to be determined, which could be done by incrementing 
the program counter by c(8:7) = [0,0]. Alternatively, a jump or a conditional jump
could be performed, leading to values for the 6-bit jump address c(14:9). For this purpose,
the facts jump/1 and conditionalJump/2 are used, whereas incrementPC/2 is used
to increment the program counter. 
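How the individual field values combine into the instruction word of Table 3 can be sketched as follows. This is a minimal Python illustration, not the code generator itself. The helper encode and the dictionary layout are hypothetical. Reading bits 18:15 as the data field is an assumption based on the value 0001 in Table 3, and the roles of bit 19 and bits 6:5 are not described in this section, so their values are simply copied from the table.

```python
WIDTH = 20  # instruction width of simplecpu, bits 19..0

def encode(fields, width=WIDTH):
    """Place bit-string fields, keyed by (hi, lo), into a word.

    Bits that no field assigns remain 'X' (don't care).
    """
    word = ["X"] * width
    for (hi, lo), bits in fields.items():
        if len(bits) != hi - lo + 1:
            raise ValueError(f"field {hi}:{lo} expects {hi - lo + 1} bits")
        for offset, bit in enumerate(bits):
            pos = hi - offset              # bit index, most significant first
            word[width - 1 - pos] = bit    # string index 0 holds bit 19
    return "".join(word)

load_reg0 = {
    (19, 19): "0",     # copied from Table 3; role not described here
    (18, 15): "0001",  # assumed data field: the value to be loaded
    (8, 7): "00",      # c(8:7) = [0,0]: increment the program counter
    (6, 5): "01",      # copied from Table 3; role not described here
    (4, 4): "0",       # control code c(4:4)
    (3, 0): "0000",    # register file address c(3:0)
}

code = encode(load_reg0)
print(code)
```

The jump address field c(14:9) is left unassigned, so it stays at XXXXXX, as in Table 3. A jump instruction would add an entry for (14, 9) instead.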
At the end of code generation the microprogram has to be bound to real addresses 
of the instruction memory. We perform global scheduling while concurrently com- 
pacting and binding the code. Here we make extensive use of linear constraints over 
the integer domain. In this way it is possible to exploit the parallelism of the target 
processor (e.g., in VLIW architectures). At the beginning we unify the symbolic 
address of the first instruction with the start address, e.g., 0. Delayed goals like
incr/2 are awakened and this leads to a successive binding of concerned addresses. 
The process is supported by a labelling procedure. The resulting microprogram can 
be simulated by the simulator described above. We would like to extend the code 
generator to handle pipelined architectures. 
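The address binding step can be pictured with a small emulation of delayed goals. The sketch below is in Python and only mimics the coroutining behaviour. The class Var and the reading of incr/2 as "second address = first address + 1, suspended until the first address is known" are assumptions made for illustration, not the definitions used in the code generator.

```python
class Var:
    """Stand-in for a logic variable: unbound until bind() is called."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.waiters = []  # goals suspended on this variable

    def bind(self, value):
        if self.value is not None:
            if self.value != value:
                raise ValueError(f"{self.name}: conflicting bindings")
            return
        self.value = value
        waiters, self.waiters = self.waiters, []
        for goal in waiters:  # wake every goal that was delayed on this variable
            goal()

def incr(a, b):
    """Hypothetical incr/2: b = a + 1, delayed while a is still unbound."""
    def goal():
        b.bind(a.value + 1)
    if a.value is not None:
        goal()
    else:
        a.waiters.append(goal)

# Four instructions in sequence, all addresses symbolic at first.
addrs = [Var(f"A{i}") for i in range(4)]
for a, b in zip(addrs, addrs[1:]):
    incr(a, b)

before = [v.value for v in addrs]  # nothing is bound yet
addrs[0].bind(0)                   # unify the first address with the start address
after = [v.value for v in addrs]
print(before, after)
```

Binding the first address wakes the first suspended goal, which binds the second address and thereby wakes the next goal, so the whole chain is resolved by a single unification.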
6. EXPERIMENTAL RESULTS 
The tools described above have been applied to several target structures. Table 4 
gives information about the example circuits: simplecpu, as shown in Section 2, 
demo [2], prips [1] and mano [17]. The number of RTL components, the width 
of the microinstruction controller, and the width of the datapath are given. The 
results shown here indicate that the tools can be applied even to realistic structures. 
All times are measured on a Sparc 10 workstation. 
The times shown in Table 5 were obtained by simulating a simple loop, such as
the program mentioned in Section 3. Each event corresponds to the simulation of a complete
behavior tree (an RT event). The original MIMOLA simulator (written in Pascal for
an earlier, more restricted version of MIMOLA) simulates on average about 300 
RT events/s. Although the Prolog simulator is slower by a factor of 3-5, its main 
advantages are the support for backward and symbolic simulation. These features 
are important for test generation, e.g., to justify signals. 
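The throughput figures of Table 5 follow from its event counts and CPU times. The short Python sketch below recomputes them and compares each with the 300 RT events/s quoted for the original simulator. The comparison is only indicative, since the Pascal simulator handles an earlier, more restricted version of MIMOLA.

```python
# (events, CPU seconds) per circuit, copied from Table 5
runs = {
    "simplecpu": (149, 0.83),
    "demo": (1394, 25.05),
    "prips": (1003, 21.99),
    "mano": (478, 4.3),
}
PASCAL_RATE = 300.0  # RT events/s quoted for the original simulator

rates = {name: events / cpu for name, (events, cpu) in runs.items()}
slowdowns = {name: PASCAL_RATE / rate for name, rate in rates.items()}
mean_slowdown = sum(slowdowns.values()) / len(slowdowns)

for name in runs:
    print(f"{name:10s} {rates[name]:7.1f} events/s  slowdown {slowdowns[name]:.1f}x")
print(f"mean slowdown {mean_slowdown:.1f}x")
```

The per-circuit ratios vary with circuit size, and their mean is roughly 4, in line with the quoted factor of 3-5.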
TABLE 4. Example circuits.
Circuit     RTL modules   Instruction width (bits)   Datapath width (bits)
simplecpu   10            20                         4
demo        16            84                         16
prips       50            83                         32
mano        21            50                         16
TABLE 5. Simulation CPU times.
Circuit     Events   CPU time (s)   Events/s
simplecpu   149      0.83           179.5
demo        1394     25.05          55.6
prips       1003     21.99          45.6
mano        478      4.3            111.1
TABLE 6. Circuit analysis times.
Circuit     Generated facts   CPU time (s)
simplecpu   26                0.56
demo        61                2.96
prips       415               77.03
mano        131               11.85
The results shown in Table 6 were measured for the microcode preparation phase
of Section 4.2. We can see that for larger circuits, a large number of facts is generated
by the circuit analyzer. For this purpose, the given hardware was analyzed and the
microoperations that can be executed by the hardware were extracted. 
7. CONCLUSION 
We have described how logic programming and coroutining can be exploited for 
some tools in the MIMOLA hardware design system. A simulator for structural 
hardware models, described in a hardware description language, has been presented. 
The simulator consists of 2700 lines of code, whereas the original Pascal simulator 
has about four times more lines of code. Most of the new simulator can be used 
bidirectionally and symbolically, which is very important for code and test gener- 
ation. Using coroutining to express certain constraints, many backtracking steps 
can be avoided. The circuit analyzer consists of 2200 lines of code, whereas a
comparable C++ implementation [15] has about 10,000 lines. The circuit analyzer 
cooperates with a retargetable self-test program compiler [4, 27]. 
The original simulator was very difficult to maintain. The time to develop VLSI 
tools using logic programming is much shorter than for imperative languages. On 
the other hand, software written in standard Prolog is slower. With the new concept 
of constraint logic programming [3] this disadvantage becomes smaller, because this
technique leads to a significant reduction of unnecessary backtracking steps. 
Additionally, a tool to generate schematics for structural hardware models has 
been implemented in Prolog. 
This work was supported by the DFG, the German research foundation. 
REFERENCES 
1. Albrecht, C., Bashford, S., Marwedel, P., Neumann, A., and Schenk, W., The Design 
of the PRIPS Microprocessor, 4th EUROCHIP Workshop on VLSI Training, Toledo, 
Spain, September 1993, pp. 254-259. 
2. Bashford, S., Bieker, U., Harking, B., Leupers, R., Marwedel, P., Neumann, A., and 
Voggenauer, D., The MIMOLA Language--Version 4.1, Technical Report, Computer 
Science Dept., University of Dortmund, September 1994. 
3. Benhamou, F. and Colmerauer, A. (eds.), Constraint Logic Programming: Selected
Research, MIT Press, Cambridge, MA, 1993. 
4. Beckmann, R., Bieker, U., and Markhof, I., Application of Constraint Logic Pro- 
gramming for VLSI CAD Tools, First Int. Conference Constraints in Computational 
Logic, Lecture Notes in Computer Science, Vol. 845, Springer, Berlin, pp. 183-200. 
5. Bieker, U., On the Semantics of the TREEMOLA Language, Version 4.0, Report 
435, Computer Science Dept., University of Dortmund, 1992. 
6. Beckmann, R., Schenk, W., Pusch, D., and Joehnk, R., The TREEMOLA Lan- 
guage Reference Manual, Version 4.0, Report 391, 2nd ed., Computer Science Dept., 
University of Dortmund, 1991. 
7. Clocksin, W. F., Logic Programming and Digital Circuit Analysis, J. Logic Program- 
ming 4:59-82 (1987). 
8. Cheng, G., Tsui, C., Pyo, I., Huang, I., Koh, Y., Su, C., Liu, S., Pan, K., Wu, S., 
Chen, H., and Despain, A., A Full-Range Design Automation System for Instruction
Set Processors, in: First International Conference on the Practical Applications of 
Prolog, London, April 1992. 
9. Dincbas, M., Simonis, H., and Van Hentenryck, P., Solving Large Combinatorial 
Problems in Logic Programming, J. Logic Programming 8:75-93 (1990). 
10. ECLIPSE 3.4 User Manual, ECRC Common Logic Programming System, ECRC 
GmbH, Munich, Germany, 1994. 
11. Gullichsen, E., Heuristic Circuit Simulation Using PROLOG, Integration, VLSI J. 
3:283-318 (1985). 
12. Horstmann, P. W., Automation of the Design for Testability Using Logic Program- 
ming, Dissertation, University of Missouri, 1983. 
13. Huang, I. and Despain, A. M., High Level Synthesis of Pipelined Instruction 
Set Processors and Back-End Compilers, in: 29th Design Automation Conference, 
1992. 
14. Design Automation Standards Subcommittee of the IEEE, Draft Standard VHDL 
Language Reference Manual, IEEE Standards Department, 1992. 
15. Leupers, R., Instruction Set Extraction from Programmable Structures, in:
EURO-DAC, Grenoble, September 1994. 
16. Lichtenstein, Y., Welham, B., and Gupta, A., Time Representation in Prolog Circuit 
Modelling, in: 3rd Annual Conference on Logic Programming, Edinburgh, Springer,
Berlin, 1991, pp. 78-93. 
17. Morris Mano, M., Computer System Architecture, 3rd ed., Prentice-Hall, Englewood 
Cliffs, NJ, 1993. 
18. Marwedel, P., The MIMOLA Design System: Tools for the Design of Digital Pro- 
cessors, in: Proc. 21st Design Automation Conference, 1984, pp. 587-593. 
19. Marwedel, P., Matching System and Component Behavior in MIMOLA Synthesis 
Tools, in: Proc. EDAC 1990, 1990. 
20. Neill, M. D., Jani, D. D., Cho, C. H., and Armstrong, J. R., BTG: A Behav- 
ioral Test Generator, Computer Hardware Description Languages and their Appli- 
cations, in: Proceedings of the IFIP WG 10.2 Ninth Int. Symposium on Computer 
Hardware Description Languages and their Applications, Washington, June 1989, 
pp. 347-360. 
21. Nowak, L. and Marwedel, P., Verification of Hardware Descriptions by Retargetable 
Code Generation, in: 26th Design Automation Conference, Las Vegas, June 1989, 
pp. 441-447. 
22. Nowak, L., Graph Based Retargetable Microcode Compilation in the MIMOLA De- 
sign System, in: 20th Annual Workshop on Microprogramming (Micro-20), 1987, pp. 
126-132. 
23. Reintjes, P. B., A Set of Tools for VHDL Design, in: Logic Programming, Proc. of 
the Eighth Int. Conference, 1991, pp. 549-562. 
24. Simonis, H., Test Generation Using the Constraint Logic Programming Language 
CHIP, in: Proceedings of the 6th International Conference on Logic Programming, 
Lisboa, Portugal, June 1989, pp. 101-112. 
25. Simonis, H., Nguyen, N., and Dincbas, M., Verification of Digital Circuits Using 
CHIP, in: G. J. Milne (ed.), The Fusion of Hardware Design and Verification, North- 
Holland, Amsterdam, 1988, pp. 421-442. 
26. Svanaes, D. and Aas, E. J., Test Generation through Logic Programming, Integra- 
tion, VLSI J. 2:49-67 (1984). 
27. Bieker, U. and Marwedel, P., Retargetable Self-Test Program Generation Using
Constraint Logic Programming, in: Proc. 32nd Design Automation Conference, San
Francisco, June 1995, pp. 605-611. 
