Silicon Compilation by Johannsen, David Lawrence
Silicon Compilation 
Thesis by 
David Lawrence Johannsen 
in Partial Fulfillment of the Requirements 
for the Degree of 
Doctor of Philosophy 
California Institute of Technology 
Pasadena, California 
1981 
(submitted May 13, 1981) 
© 1981 
David Lawrence Johannsen 
All Rights Reserved 
to my wife and son 
Acknowledgments 
The author expresses his grateful appreciation to Jesus Christ for His patient 
guidance and many blessings. Special acknowledgments also go to my wife, Janet, 
for her encouragement and support. I would also like to thank my parents and 
sisters for their contributions to my development. 
I wish to extend my thanks to some of the people who worked with me at Caltech: 
to Carver Mead, for his many insights and his sense of humor, 
to Ron Ayres, for his friendship and aid, and USC-Information Sciences Institute, 
to Jim Tobias, Jeff Sandeen, and John Wawrzynek, for their participation and 
support,
and to Chuck Seitz, Ivan Sutherland, and the Caltech Silicon Structures Project. 
Abstract 
Modern integrated circuits are among the most complex systems designed by man. 
Although we have seen a rapid increase in fabrication technology, traditional 
design methodologies have not evolved at a rate commensurate with the increasing 
design complexity potential. These circuit design methodologies fail when applied 
to Very Large Scale Integrated (VLSI) circuit design. This thesis proposes a new 
design methodology which manages the complexity of VLSI design, allowing 
economical generation of correctly functioning circuits. 
Cost is one measurement of a design methodology's value. A good design 
methodology rapidly and efficiently translates high level system specifications into 
working parts. Traditional techniques partition the translation process into many 
steps; each design tool is focused upon one of these design steps. This partitioning 
precludes the consideration of global constraints, and introduces a literal explosion 
of data being transferred between design steps. The design process becomes 
error-prone and time consuming. 
The technique of silicon compilation presented in this thesis automatically 
translates from high level specifications into correct geometric descriptions. In this 
approach, the designer interacts at a high level of abstraction, and need not be 
concerned with lower levels of detail, facilitating exploration of alternate system 
architectures. Furthermore, since the implementation is algorithmically generated, 
chip descriptions can be made correct by construction. Finally, the user is given 
technology independence, because the high level specification need not require 
knowledge of fabrication details. This flexibility allows the user to take advantage 
of technology advances. 
This thesis explores various aspects of silicon compilation, and presents a prototype 
compiler, Bristle Blocks. The methodology is demonstrated through the design of 
several chips. The practicality of the methodology results from the concern for 
efficiency of the design process and of the chip designs produced by the system. 
Table of Contents 
Acknowledgements 
Abstract 
List of Illustrations 
Introduction 
Similarities between Software and Silicon Compilers 
Differences between Software Compilation and Silicon Compilation 
Design of a Silicon Compiler 
Part I. Design Methodologies 
Chapter 1: The Design Tool Space 
1.1: Flexibility 
1.2: Specification 
1.5: Efficiency 
1.6: Conclusions 
Chapter 2: Hand Design 
2.1: Management of Complexity 
2.2: Chip Planning: The Floorplan 
2.3: The Slicing Floorplan 
2.4: Global versus Local Optimization 
2.5: Conclusions 
Chapter 3: Imbedded Languages 
3.1: ICLIC 
3.2: Parameterized Cells 
3.3: Conclusions 
Chapter 4: Chip Assemblers 
4.1: Cell Composition 
4.2: Power Routing 
4.3: Composition Methods 
4.3.1: Cell Abutment 
4.3.2: Cell Stretching 
4.3.3: River Routing 
4.3.4: One-sided General Interconnect 
4.3.5: Four-sided General Interconnect 
4.4: Conclusions 
Chapter 5: A Simple Silicon Compiler 
5.1: The Floorplan 
5.2: Chip Assembler 
5.3: The Compiler 
5.4: Compiler Extensions 
5.4.1: Optimizers 
5.4.2: Generators and Parsers 
5.4.3: Examiners 
5.5: Conclusions 
Part II. Bristle Blocks 
Chapter 6: Introduction to Bristle Blocks 
Chapter 7: The Bristle Blocks Input Language 
7.1: Field Declarations 
7.3.7: Variable Timing Equations 
7.3.8: Decode Operations 
7.3.9: Sources 
7.3.10: Destinations 
7.4: Comments and Macros 
Chapter 8: How Bristle Blocks Works 
8.1: Parse Input 
8.2: Generate Instruction Decoder Functions 
8.3: Build Datapath Core 
8.4: Add Buffers and PLSRs 
8.5: Add Instruction Decoder 
8.6: Add Pads 
8.7: Conclusions 
Chapter 9: Bristle Blocks Examples 
9.1: Lamp Dimmer 
9.2: Random Tune Generator 
9.3: Frequency Scaler Chip 
9.4: SOLC Chip 
9.5: A Microprogrammed Microprocessor 
Chapter 10: The History of Bristle Blocks 
10.1: The Past 
10.2: The Future 
Appendices 
Appendix 1: ICLIC Reference Guide 
Appendix 2: Imbedded Language Example 
Appendix 3: River Routers 
Appendix 4: The RLC Compiler 
Appendix 5: Bristle Blocks Elements 
A5.1: Registers 
A5.2: Simple Arithmetic Elements 
A5.3: Arithmetic/Logic Units 
A5.4: Ports 
A5.5: Constants 
A5.6: Barrel Shifters 
A5.7: Bus Precharge Elements 
A5.8: Random Simple Elements 
A5.9: Compound IR Elements 
A5.10: Compound Output Port Elements 
A5.11: Compound Swapping Elements 
A5.12: Compound CAM Elements 
List of Illustrations 
OM2 Datapath Chip Block Diagram 
Full Chip Logical Floorplan 
Datapath Logical Floorplan 
Single Bit-Slice Logical Floorplan 
ALU Logical Floorplan 
ALU Physical Floorplan 
Bit-Slice Physical Floorplan 
Datapath Physical Floorplan 
Full Chip Physical Floorplan 
OM-2 Datapath Chip Mask Set 
Floorplan with no Preferred Order 
The Three Slicing Cell Floorplans 
Slicing a Chip 
ICLIC Primitives 
Shift Register Circuit 
Shift Register Layout 
Parameterized Layout 
Two Cell Instances 
Six Shift Register Layouts 
Graph of Size Possibilities 
Five Shift Array Candidates 
Power Line Conventions 
Horizontal Power Connections 
Alignment of Vertical Boxes 
Positioning Horizontal Box 
Completed Power Connections 
Hierarchically Sharing Boxes 
Cell Abutment 
Cell Stretching 
River Routing 
General Interconnection 
Four Sided Interconnection 
Immediate Interconnect 
Delayed Interconnect 
RLC Floorplan 
Pulse Synchronizer Circuit 
Layout of Pulse Synchronizer 
Examples of Redundant Inverters 
Operation of GET INVERT 
Three Frequency Dividers (Transformed) 
Multiple Representations 
Simulation Plot 
5-1: Block Diagram of a Generalized Datapath 
5-2: Bristle Blocks Logical Floorplan 
5-3: Bristle Blocks Physical Floorplan 
Complete Register Cell 
Sample Chip Register Instances 
Sample Chip Core Layout 
Sample Chip Buffers 
Sample Chip PLSRs 
Sample Chip Decoder 
Sample Chip Pads 
Lamp Dimmer System 
Lamp Dimmer Chip Block Diagram 
Lamp Dimmer Chip Bounding Box Plot 
Random Tune System Block Diagram 
A Note Object 
Frequency Scaler Block Diagram 
OM System Block Diagram 
Controller Register Transfer Diagram 
Datachip Block Diagram 
Bounding Box Comparisons 
General Purpose Register Block Diagram 
Shifter Loop Block Diagram 
Connector Pairs 
Computing New Path 
Jogging the Path of a Wire 
Moving Lower Cell 
River Route Comparison 
River Routing to Pads 
Unfolding the Box 
Constraining Jogs 
Mapping Wires into Box 
Erroneous Wire Wrap-Around 
Non-Independent Wires 
Boundary Interference 
Box Route 
Hexagon Route 
Introduction 
A circuit qualifies as a VLSI circuit when a single designer, using manual design 
methods, requires more than one lifetime to complete the design. 
Using the above criterion, we are currently at the threshold of VLSI: chips are 
beginning to take more than a person's lifetime to design. Furthermore, industry 
experts project that things will get much worse [18]. As fabrication technology 
advances and the density of circuitry increases, so will the functionality of the 
chips. As the functionality increases, the complexity of the design will increase 
exponentially due to the interactions among the circuit elements, and therefore the 
design time will also increase exponentially. If we are going to exploit these 
density increases, we will have to change the way in which we design circuits. 
Not too many years ago, software design engineers (programmers) were faced with 
a very similar problem. The computer technology was advancing and giving the 
programmers ever increasing memory space. Exploiting this tremendous increase in 
machine capability, program complexity grew rapidly, resulting in exponential 
increases in design and implementation time, due to interaction between pieces of 
the program. One very successful tool that developed was the higher-level 
language and compiler. 
Software compilers allow programmers to specify their designs more by intent, on a 
semantic level, rather than on an implementation level, freeing programmers from 
the concerns of many nitty-gritty implementation details. Programmers also enjoy 
the ability to rapidly modify their programs. The compiler handles the tedious and 
error-prone task of translating the high-level semantic definition of the program to 
its low-level implementation. 
If the techniques and concepts used in creating software compilers were used in 
creating hardware compilers, or silicon compilers, the VLSI designer would be freed 
from the complexity of low-level implementation details and could spend more 
time on interesting tasks like product definition and algorithm research. 
Similarities between Software Compilers and Silicon Compilers 
Both software compilers and silicon compilers have the same basic goal: hiding 
implementation details from the user, allowing the user to work at an algorithmic 
or behavioral level. When writing a program in a high-level language, the user 
does not want to know exact physical addresses where his code is being placed, nor 
does he care how the compiler allocates registers, nor even what the instruction set 
is on the target machine. Were he to be bogged down with this incredible volume 
of implementation details, software costs would be many times higher than they 
are today. In exactly the same way, a user designing with a silicon compiler 
language does not need to know the exact physical locations where the compiler is 
placing his circuitry, the mask layers that the compiler uses, or even the design 
rules required by the fab line. 
Compilers maintain datatype consistency. Most of the modern software languages 
require data to be typed: the user must specify what kind of data is stored in the 
program variables. The compiler can then check to make sure that the variables are 
used correctly. If the user attempts to store an INTEGER value into a REAL variable, 
the compiler can either notify the user of an improper assignment, or convert the 
INTEGER data into the REAL format before completing the assignment. In like 
manner, silicon compilers can verify the usage of cells and their interconnections. 
If the output of one cell, which is valid during one clock phase, is connected to a 
second cell which samples that signal during a different clock phase, the compiler 
can either notify the user of the timing error or take corrective action on its own. 
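The clock-phase check described above can be sketched in a few lines. This is only an illustration of the idea, not the checking performed by any compiler in this thesis: the `Port` record, its `phase` field, and the two-phase clocking are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical description of one cell terminal."""
    cell: str
    name: str
    phase: int  # clock phase during which the signal is valid (output) or sampled (input)


def check_connection(output, input_):
    """Return a list of timing-error messages for one output-to-input wire."""
    errors = []
    if output.phase != input_.phase:
        errors.append(
            f"{output.cell}.{output.name} is valid during phase {output.phase} "
            f"but {input_.cell}.{input_.name} samples during phase {input_.phase}"
        )
    return errors


# Example terminals (names are invented for illustration).
alu_out = Port("alu", "result", phase=1)
bus_in = Port("bus", "in", phase=1)
reg_in = Port("reg0", "load", phase=2)

for message in check_connection(alu_out, reg_in):
    print("timing error:", message)
```

A compiler that takes corrective action on its own could, instead of reporting the message, insert a latch between the two terminals; that step is left out of the sketch.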
The outputs of compilers are correct by construction: no one takes every output of a 
FORTRAN compiler and runs a design rule checking program on it to verify that the 
assembly code correctly implements the source specification. Similarly, the output 
of a working silicon compiler is guaranteed correct. The resulting designs do not 
have to be run through DRC programs to verify the correct design of the chip. This 
is very important as we approach VLSI, where the time and cost to perform a full 
DRC are astronomical. 
Compilers allow for continuous modification of the design. When the 
specifications for a software program are changed, the program is modified to 
reflect the change. Rarely is the entire program scrapped and written anew, as 
might be the case if the program were written in machine code. It is quite natural 
in the software world for programs to go through many revisions as new 
discoveries are made. In hardware design, machine specifications are in a constant 
state of flux, but radical changes to the design are unaffordable if large portions of 
the design must be redone, as is usually the case if the design consists of geometric 
primitives. Compilers make changes affordable, giving the designer freedom to 
explore many design strategies. 
Compilers work with templates. Software compilers use templates for each of the 
basic language constructs. For instance, the IF-THEN-ELSE statement 
    IF <expression> THEN <statement1> ELSE <statement2> FI 
always compiles into the machine code 
            evaluate <expression> 
            branch to label1 if false 
            <statement1> 
            jump to label2 
    label1: <statement2> 
    label2: 
In a similar manner, silicon compilers have templates, called 'floorplans', for the 
constructs in the language. These floorplans explicitly state the wiring 
management for the chip and describe how the pieces of the chip connect together. 
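The software half of this analogy, template expansion for IF-THEN-ELSE, can be sketched as follows. The pseudo-instruction spellings (`evaluate`, `branch-if-false`, `jump`) and the helper names are assumptions chosen for the illustration; they do not correspond to any particular machine or compiler.

```python
import itertools

# Source of fresh label names; a real compiler would scope this per compilation.
_label_counter = itertools.count(1)


def new_label():
    """Return a label name that has not been used before."""
    return f"label{next(_label_counter)}"


def compile_if(expression, then_code, else_code):
    """Expand IF-THEN-ELSE into a flat list of pseudo-instructions.

    then_code and else_code are already-compiled instruction lists; the
    template only supplies the fixed branching skeleton around them.
    """
    else_label = new_label()
    end_label = new_label()
    return (
        [f"evaluate {expression}", f"branch-if-false {else_label}"]
        + then_code
        + [f"jump {end_label}", f"{else_label}:"]
        + else_code
        + [f"{end_label}:"]
    )


code = compile_if("x > 0", ["y := 1"], ["y := 2"])
print("\n".join(code))
```

The skeleton is fixed and only the holes vary, which is the sense in which a floorplan fixes the wiring arrangement of a chip while the user's specification fills in the cells.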
Differences between Software Compilation and Silicon Compilation 
The first major difference between the tasks of software compilers and silicon 
compilers concerns the dimensionality of the result. Software compilers generate a 
one-dimensional result. The location, or address, of any resulting machine word is 
a single number. Silicon compilers generate a two-dimensional result. The location 
of any resulting primitive device is given as both an X and a Y coordinate. 
To be completely general, silicon compilers would have to do box-packing which, 
for large designs, becomes impractical. To avoid this problem, silicon compilers 
make use of the hierarchy of the specification, equating the physical hierarchy 
with the logical hierarchy as specified by the user and directed by the floorplan of 
the compiler. 
The second major difference between software compilation and silicon compilation 
concerns the communication between various pieces in the design. In software, the 
GOTO and CALL instructions provide linkage between the various modules. For all 
practical purposes, the communication cost between any two points in memory is 
constant, regardless of the relative positions of the two points. In silicon, wires 
provide the linkage between the modules. The cost to communicate between two 
points is directly related to the relative positions of the points: the further apart the 
points are located, the more area and time/power are required to implement the 
communication. In addition, the wire cannot be routed arbitrarily across the chip 
because it must avoid other modules and wires that might be in its path. In fact, a 
communication path may be impossible due to obstacles on the chip. A software 
GOTO has no obstacles; it doesn't have to dodge a certain set of words in memory, 
and it doesn't modify every word that it has to pass over. 
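A small sketch makes the contrast concrete. It assumes a chip modelled as a coarse grid in which occupied squares cannot be crossed, and uses a breadth-first search to find the shortest rectilinear wire. The grid model and the function name are assumptions for illustration only, not a router described in this thesis.

```python
from collections import deque


def route_length(grid, start, goal):
    """Shortest rectilinear wire length from start to goal, or None if blocked.

    grid[y][x] is True where a module or wire already occupies the square.
    Points are (x, y) tuples.
    """
    height, width = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (x, y), dist = queue.popleft()
        if (x, y) == goal:
            return dist
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and not grid[ny][nx] and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append(((nx, ny), dist + 1))
    return None  # no path exists: the connection cannot be made


open_chip = [[False, False, False] for _ in range(3)]
detour_chip = [[False, True, False],
               [False, True, False],
               [False, False, False]]
blocked_chip = [[False, True, False],
                [False, True, False],
                [False, True, False]]

print(route_length(open_chip, (0, 0), (2, 0)))     # direct wire
print(route_length(detour_chip, (0, 0), (2, 0)))   # wire must go around the obstacle
print(route_length(blocked_chip, (0, 0), (2, 0)))  # no wire is possible
```

The same pair of endpoints costs more once an obstacle intervenes and becomes unreachable once the obstacle spans the chip, whereas a GOTO between two addresses costs the same whatever lies between them.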
It is interesting that this second problem, the GOTO, can be solved using the same 
technique as the solution to the dimensionality problem: through the hierarchy. 
Using a hierarchical description of the chip, communication between modules can 
only occur in well-defined manners between objects on a single level of the 
hierarchy or between two adjacent levels of the hierarchy. The floorplan of the 
compiler can guarantee these two types of communication, and hence the silicon 
compiler can compile any chip which can be specified in the compiler's language. 
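The restriction can be written as a simple predicate over hierarchical module names. In this sketch, modules are named by slash-separated paths such as `chip/datapath/alu`, and "a single level" is read as "children of the same parent". Both the naming scheme and that reading are assumptions for the example.

```python
def parent_of(path):
    """Parent of a hierarchical module name, or None for the top level."""
    return path.rsplit("/", 1)[0] if "/" in path else None


def connection_allowed(a, b):
    """Allow a wire only between siblings or between a module and its parent."""
    if a == b:
        return False
    siblings = parent_of(a) == parent_of(b)
    adjacent_levels = parent_of(a) == b or parent_of(b) == a
    return siblings or adjacent_levels


print(connection_allowed("chip/datapath/alu", "chip/datapath/reg"))  # same level
print(connection_allowed("chip/datapath", "chip/datapath/alu"))      # adjacent levels
print(connection_allowed("chip/datapath/alu", "chip/decoder/pla"))   # crosses subtrees
```

A connection rejected by the predicate is not lost. It has to be expressed as a chain of permitted connections up and down the hierarchy, each of which the floorplan knows how to wire.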
Design of a Silicon Compiler 
This thesis explores some of the possibilities available using the silicon compilation 
concept. The first section reviews various design techniques and weighs the 
potentials of design tools. Some of the newer concepts are discussed in detail, with 
practical examples to illustrate the techniques. The second section documents a 
working silicon compiler and discusses the design tradeoffs and experience gained 
using the compiler. The work presented in this thesis was performed by the author 





Chapter 1: The Design Tool Space 
The first integrated circuit masks were designed completely 'by hand'. All of the 
geometric features were drawn directly on the film used to expose the working 
plates. The designer was completely free to lay out any circuit desired, but every 
single shape had to be drawn on the film. As digitally controlled film plotters were 
developed, the design style incorporated computers to drive the plotters. The masks 
were still designed on mylar, but then the designs were 'digitized': the geometric 
shapes were described to the computer. The computer then drove the film plotter to 
make the actual masks [2]. 
Once the computer-film plotter team was introduced, it was noticed that the final 
goal of the layout designer was not necessarily to generate the actual masks, but to 
build the data description in the computer's memory. Methods were developed to 
enter the geometric shapes directly into the computer, rather than working on 
mylar and then digitizing the result. Interactive graphics systems allowed the user to 
design the circuits directly on a CRT screen and get instant visual feedback of the 
computer's perception of the design [1][7][8][33]. 
The design task using interactive graphics was still formidable. The user had to 
verify that every geometric shape met all of the process design rules. The 
fabrication processes were in a constant state of flux, so the design rules frequently 
changed. Due to the long design cycle for large chips, these chips could not use the 
state-of-the-art technology. Work then proceeded to divorce the design rules from 
the design. If the design could be specified independent of the design rules, and if a 
program could then convert this technology-independent layout description into 
the actual artwork, chips could make use of the most advanced technology, and as 
the technology advanced, the new masks could be generated from the old designs. 
This design technique was called the 'sticks' approach [21][24][27][28][32][34]. 
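The separation of design from design rules can be illustrated with a toy one-dimensional example: the symbolic design records only the left-to-right order and layer of a row of vertical wires, and a small program assigns coordinates from a rule table. The rule values below are invented, and only neighbouring wires are checked against each other, so this is a sketch of the idea rather than a description of any actual sticks system.

```python
def place_columns(layers, min_width, min_spacing):
    """Assign left-edge x coordinates to a left-to-right sequence of wires.

    layers       -- layer names in symbolic (topological) order
    min_width    -- minimum wire width per layer
    min_spacing  -- minimum spacing keyed by frozenset of the two layers
                    (missing entries mean the layers may abut)
    """
    positions = []
    x = 0
    previous = None
    for layer in layers:
        if previous is not None:
            gap = min_spacing.get(frozenset({previous, layer}), 0)
            x += min_width[previous] + gap
        positions.append(x)
        previous = layer
    return positions


# Two hypothetical processes; the numbers are made up for illustration.
OLD_WIDTH = {"metal": 3, "poly": 2}
OLD_SPACING = {frozenset({"metal"}): 3, frozenset({"poly"}): 2}
NEW_WIDTH = {"metal": 2, "poly": 1}
NEW_SPACING = {frozenset({"metal"}): 2, frozenset({"poly"}): 1}

symbolic_design = ["metal", "metal", "poly"]  # unchanged between processes
old_layout = place_columns(symbolic_design, OLD_WIDTH, OLD_SPACING)
new_layout = place_columns(symbolic_design, NEW_WIDTH, NEW_SPACING)
print(old_layout, new_layout)
```

The symbolic design is written once; regenerating the artwork for a new process amounts to rerunning the placement with a different rule table.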
These design methodologies are fundamentally 'Geometric' techniques. The basic 
atoms of the design are graphical objects, and an object in the design, like a register, 
is a collection of graphical objects. As these graphics techniques were being 
developed, a totally different approach using 'Procedural' cell design techniques 
was being explored. In the procedural methodology, the cells are described as a 
program which, when executed, would generate the description of the layout 
directly in the computer memory. 
The first step towards procedural cell design was to develop languages for 
describing artwork. As these special purpose languages were used, it was noticed 
that there was a need for higher level programming constructs, which led to the 
development of an 'Imbedded Language'. Rather than design a special purpose 
language from scratch, routines for generating the geometric shapes were written 
in a high level programming language. The designer in using an imbedded language 
has the full power of the high level language to describe the layout [4][19][31]. 
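As a rough modern analogue of the imbedded-language idea, the sketch below uses ordinary Python functions and a loop to generate the boxes of a parameterized row of cells. The `Box` record, the layer names, the dimensions, and the leaf cell itself are all hypothetical; only the technique of calling geometry-generating routines from a general-purpose language is taken from the text.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """One rectangle of mask geometry (hypothetical representation)."""
    layer: str
    x: int
    y: int
    width: int
    height: int


CELL_PITCH = 8  # assumed horizontal distance between adjacent stages


def stage_cell(x, y):
    """Hypothetical leaf cell: three boxes placed relative to (x, y)."""
    return [
        Box("diffusion", x + 1, y, 2, 10),
        Box("poly", x, y + 4, 6, 2),
        Box("metal", x, y + 9, 8, 3),
    ]


def shift_register(stages):
    """Generate the layout of an n-stage register by repeating the leaf cell."""
    boxes = []
    for i in range(stages):
        boxes.extend(stage_cell(i * CELL_PITCH, 0))
    return boxes


layout = shift_register(4)
print(len(layout), "boxes; second stage starts with", layout[3])
```

Because the layout is the result of running a program, the number of stages is simply an argument. Loops, conditionals, and subroutines of the host language take the place of manual repetition.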
When chips are designed using the procedural approach, a large collection of 
subroutines are written to implement each of the low-level units of the design. To 
complete the design task, these pieces must be glued together. The tedious and 
error-prone wiring task can be done by a program. An imbedded language system 
with automatic wiring generators and cell management systems becomes a 'Chip 
Assembler'. In the chip assembler, the designer is interconnecting a series of 
macro-modules. The user designs the low level layouts, while the program 
generates the wiring that puts together the chip [6][29]. 
Extending this approach one step further, the high-level features of a chip class can 
be hardwired into the design system. To design a particular chip in this class of 
chips, the user need only specify the unique features of the chip, allowing the 
program to automatically generate the remainder of the chip. These systems are 
called 'Silicon Compilers', reminiscent of software compilers used to write programs 
[5][12][13]. 
We have mentioned six basic design approaches here. Each of these systems has 
advantages for certain design requirements and disadvantages for others. We will 
discuss some of the design requirements here to put a perspective upon the design 
style used throughout this thesis. 
1.1: Flexibility 
One comparison of the design tools that can be made is that of flexibility: How 
flexible are the design tools? Perhaps the first type of flexibility that comes to 
mind has to do with the architectures of chips. Can we design any type of chip 
with a particular design tool, or does the tool restrict the kinds of designable chips? 
The most restrictive design tool presented here is the Silicon Compiler. The Silicon 
Compiler accepts a formal, high-level language specification of the chip to be 
implemented. Hence, the kinds of chips compilable with a particular compiler are 
limited to those expressible in the language. If you can express the chip in the 
input language, you can compile the chip. Chip Assemblers are more flexible than 
compilers. With an assembler, the user fuses collections of macro-modules to form 
the chip. The number and complexity of the macro-modules, along with the 
communication costs, are the primary limiting factors on chip architecture. Still 
more floorplan-flexible are the Sticks and Interactive Graphics systems. These only 
limit the geometric primitives available in the design, and restrict the cell 
boundaries to be rectangles. Finally, hand design and imbedded language systems 
offer the most flexibility. It is possible to design any designable chip 
with these two systems. 
The above discussion talked about the absolute limits of each of the design systems. 
On the other hand, there are practical limitations to each of the tools. Perhaps the 
biggest limitations are design time and the notorious complexity issues. While it is 
theoretically possible to design any chip with the hand design methodology, the 
implementation time may be astronomical. Similarly, the design complexity may be 
so large that it is virtually impossible to design a provably correct chip in a 
reasonable time. Systems like silicon compilers, however, may be able to design 
these chips in a very short time. 
Another flexibility measure has to do with technology dependence. The silicon 
technology is always advancing. Are the tools able to take advantage of 
technology advancements? It is here that the Sticks design systems really shine. 
Since the system performs all of the design rule dependent operations, the user 
designs 'technology free' designs. This does not mean that the user is not aware of 
the CMOS/NMOS differences, but the user does not need to see differences in 
various NMOS processes. Each of the other design systems requires work to modify 
an existing design to make use of new technology. For silicon compilers and chip 
assemblers, the cell libraries have to be redesigned for the new design rules. For the 
other systems, the entire chip must be redesigned when the technology is modified. 
A third flexibility might be called specification flexibility. When the 
specifications for the chip change, how much work is it to redesign the chip? 
During the design of virtually every chip, ways are found to make the chip better. 
Another design team may change the environment of the chip, or the chip designers 
may discover a hidden cost during the implementation of the chip. In any case, it 
may be very desirable to 'start all over' and re-implement the chip. In many cases, a 
redesign of the chip may require starting from scratch, and this cost of scrapping 
the whole design may be prohibitive. If the company has invested several 
man-years in the design, the design modifications are not economically feasible. 
With a silicon compiler, however, a company invests man-days, not man-years, into 
a design. With this small investment into the design, redesigns are virtually free. 
The designer may quickly and easily explore many design tradeoffs, which are 
impossible with the other design techniques. 
1.2: Specification 
Part of the process of designing a chip may be thought of as specification 
translations. The design team is given an input specification of a chip, usually a 
functional specification, and must produce an output specification of the chip. This 
output chip specification may be a large drawing of the chip, as in hand design, or 
the specification may be a data structure in a computer's memory, as in graphics and 
sticks systems, or the specification may be a program in a high-level language, as in 
the compiler, assembler, and imbedded language systems. The fundamental task of 
the design team is to translate a specification in the input language to a 
specification in the output language. This translation process may be accomplished 
in one step, or in many steps with different design groups performing each step in 
the translation. 
One would like to match the output language as closely as possible to the language 
used by the design team. If the design team is working with logic equations, one 
would like the output language to be logic equations; if the design team worked 
with Register Transfer (RT) equations, the optimal output language would be an RT 
language. This language match is desirable for two reasons. First, the design 
specification would be intuitive, so that the designers can easily express their 
intent in the language. An expression in the language could be easily understood by 
the designers. Second, the designers can directly produce the output specification as 
the chip is being designed. 
If the design language is intuitive, a great majority of the design errors can be 
avoided. If the specification is non-intuitive, it is difficult for the designers to 
catch design errors in the chip specification. Of the six design tools mentioned, the 
imbedded language systems are perhaps the least intuitive. The user wishes to 
implement a function. Short of that, the user wishes to describe the picture of a 
circuit to implement the function. In imbedded language systems, the user writes a 
program to generate a picture to implement the function. Things are better with 
the hand design, graphics, and sticks tools. With these tools, the user directly 
generates the picture to implement the function. The most intuitive systems, 
however, are the assemblers and compilers. Here, the user describes the function, 
which is the desired quantity. 
When the designers are directly designing in the output language, the chip 
specification is complete when the designers have finished implementing the 
function. If the designers' specification has to go through translations or 
re-specifications, there is a greater probability of errors. Therefore, the output 
language should closely match the design language used by the designers. 
The specification language can enforce design correctness. With a suitable 
specification language, it is impossible to generate most design errors because the 
language does not allow for the specification of errors. For example, the Sticks 
design system does not allow the user to design circuits with dimensional design 
rule violations because the Sticks language does not permit the specification of 
dimensional information (except transistor sizes). Since the user can not specify 
the spacing between two metal features, it is impossible for the user to design 
circuits with metal-to-metal spacing violations. Similarly, with assemblers and 
compilers, it may be impossible for the user to generate chips with timing errors or 
logical interconnection errors simply because the input languages do not permit 
specification of these errors. 
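The Sticks example can be illustrated with a minimal sketch, written in Python 
rather than in the language of any system discussed here. The rule values and the 
helper name assign_tracks are assumptions made for illustration only. The caller 
names a layer and an ordering of wires, but has no way to state a coordinate, so a 
spacing violation cannot be expressed. 

```python
# Assumed, illustrative design rules in arbitrary length units.
MIN_WIDTH = {"metal": 3, "poly": 2}
MIN_SPACING = {"metal": 3, "poly": 2}


def assign_tracks(layer, names):
    """Place parallel wires on one layer, in the order given.

    The caller supplies only a layer and wire names, never coordinates.
    Returns a dict mapping each name to its (low_edge, high_edge) pair.
    """
    width = MIN_WIDTH[layer]
    pitch = width + MIN_SPACING[layer]
    return {name: (i * pitch, i * pitch + width) for i, name in enumerate(names)}


tracks = assign_tracks("metal", ["vdd", "bus_a", "gnd"])
# Gap between the first two wires, derived entirely from the rule table.
gap = tracks["bus_a"][0] - tracks["vdd"][1]
print(gap >= MIN_SPACING["metal"])
```

Because every position is derived from the rule table, changing the rules changes 
the layout but can not introduce a spacing error. 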
1.3: Composition 
The design of the low level cells, which comprise 80% of the chip area, typically 
takes less than 10% of the design time, with the rest of the design time consumed 
by the interconnection of these low level cells. Most of the design errors occur in 
these interconnections, also. Low level cells are small, self-contained units that the 
designers can completely understand while the cell is being designed, while the 
interconnection cells are large, global units which are impossible to fully 
understand. A good design system should aid in the composition of cells. 
There are two sides to composition systems. On one hand, the system should aid in 
the generation of interconnect geometry where needed. If the design tool can 
automatically generate the interconnection geometry, a great deal of the 
interconnection design work can be performed by the machine. On the other hand, the 
system should verify that the interconnection is correct. The verifications may 
assure that electrical and timing constraints are met. At a higher level, the 
composition system may verify that the logical constraints are met, and that signals 
are used correctly. 
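As a rough sketch of the verification side, a composition system might check each 
connection against properties recorded on the ports. The Port fields and the 
check_connection helper below are hypothetical, and the single "phase" attribute is 
only a stand-in for whatever timing convention a real system would record. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    cell: str
    name: str
    direction: str  # "in" or "out"
    phase: int      # assumed clock phase on which the signal is valid


def check_connection(driver, receiver):
    """Return a list of problems found when wiring driver to receiver."""
    problems = []
    if driver.direction != "out":
        problems.append(f"{driver.cell}.{driver.name} is not an output")
    if receiver.direction != "in":
        problems.append(f"{receiver.cell}.{receiver.name} is not an input")
    if driver.phase != receiver.phase:
        problems.append("driver and receiver use different clock phases")
    return problems


good = check_connection(Port("alu", "result", "out", 1), Port("reg", "d", "in", 1))
bad = check_connection(Port("alu", "result", "out", 1), Port("shift", "q", "out", 2))
print(good, bad)
```

The point of the sketch is only that the check operates on the declared interface 
of each cell, not on the artwork inside it. 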
Currently, very few of the design tools have any concept of cell composition. 
The chip assembler and silicon compiler have squarely faced the issue of chip 
composition and interconnection verification. In the other systems, it is difficult to 
see how a composition system can be added. 
Closely related to the composition aspect of the chip design system is the 
hierarchical philosophy of the system. The hierarchy of virtually all design tools is 
recursive. You are either dealing with a composition cell or with a leaf cell. All 
composition cells look and act the same. This means that the same design tool can be 
used to design every composition cell on the chip. Unfortunately, very few 
hierarchies exhibit a recursive nature. 
In human hierarchies, large companies, for instance, one sees several levels of 
management directing the operation of the companies. Other than the fact that 
people fill each of the positions in a company tree, there is little similarity in each 
of the positions. The tasks of the vice presidents are very different from the tasks 
of the section managers. The chairman-of-the-board's job relates little to a project 
manager's job. A person well suited to one of these jobs can not in general fill 
another person's job. 
Similarly, we have hierarchies in our design systems. At a low level, the user 
may be dealing with polygons. At higher levels, the user is dealing with flip-flops, 
registers, ALUs, microprocessors, then complete systems. A microprocessor is not 
the same sort of object as a register. One does not design a 68000 the way one 
designs a static D-flip-flop. 
Most existing design tools are recursive systems. At the highest level of design, the 
user is still drawing boxes and polygons. The primitives of the design system are 
still the graphics primitives, rather than being data buses, registers, or 
microprocessors. The silicon compiler is a hierarchical system, but not necessarily a 
recursive system. The system knows the difference between an inverter and an 
ALU. Any of the other design tools except hand design can make use of 
non-recursive hierarchies, yet none of these systems currently takes advantage of 
hierarchies of specification primitives. 
1.4: Verification 
As VLSI becomes a reality, the verification issue must be squarely faced. In present 
design systems, verification is done by analysing the graphics primitives which 
comprise the mask sets. All information regarding the structure of the design has 
been thrown away. This is like writing a program to analyse the object file 
produced by a FORTRAN compiler to verify that the compiler is operating correctly. 
With VLSI chips, it is impractical to perform verification checks by analysing the 
artwork. 
Instead, we must guarantee correctness by construction. If we generate correct 
layouts, we do not have to verify the artwork. We need only verify our methods 
for constructing the layouts. The task of verifying our construction methods is 
much simpler than verifying artwork. Our construction procedures take a well 
defined input language; we need only verify that every legal input produces correct 
output. To verify artwork, we must be prepared to accept any input, including 
tricks-of-the-trade. With the graphics systems and imbedded languages, the input 
language is a direct specification of the artwork, so our verification task is by 
definition the task of verifying the artwork. Hence, it will become impossible to 
verify VLSI designs produced by the current graphics and imbedded language tools. 
The assembler and compiler, however, take an input language which is far more 
concise than an artwork specification. Hence, we have a hope of verifying designs 
produced by these systems. 
Another side of verification has to do with the capturing of intent. When the 
designer designs a cell, the designer has an intent about how the cell is to be used. 
To properly use a cell, the user must know this intent and meet the restrictions of 
the intent. If the user exceeds the limits of the intent, the cell will not function 
properly. In design systems which consider a cell to be nothing more than artwork, 
this intent information must be captured in cell documentation, since it can not be 
captured with the cell. Users of the cell must check the documentation and 
manually verify that the cell is being used properly. The procedural design systems 
do not restrict the concept of a cell to just the artwork. The designer writes a 
program to generate the artwork. The designer can add additional code to the 
program to capture additional intent. This documentation is kept with the cell. In 
addition, the cell itself can verify that it is being used properly. 
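A cell that carries its own intent might look like the following sketch. The cell 
program make_inverter, the IntentError exception, and the 0.5 pF limit are all 
hypothetical; they only illustrate a procedural cell refusing to be instantiated 
outside the range its designer had in mind. 

```python
class IntentError(ValueError):
    """Raised when a cell is asked to operate outside its designer's intent."""


def make_inverter(load_pf, max_load_pf=0.5):
    """Hypothetical cell program returning a description of an inverter.

    The designer's intent (an assumed maximum load) is stored with the cell
    and is checked every time the cell is instantiated.
    """
    if load_pf < 0:
        raise IntentError("load capacitance cannot be negative")
    if load_pf > max_load_pf:
        raise IntentError(
            f"inverter intended for loads up to {max_load_pf} pF, got {load_pf} pF"
        )
    return {
        "cell": "inverter",
        "load_pf": load_pf,
        "intent": {"max_load_pf": max_load_pf},
    }


cell = make_inverter(0.2)
try:
    make_inverter(2.0)
    misuse_caught = False
except IntentError:
    misuse_caught = True
print(cell["intent"], misuse_caught)
```

The limit travels with the cell program, so a user can not instantiate the cell 
without the check being run. 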
1.5: Efficiency 
When one speaks of design efficiency, one usually refers to measures of chip area, 
chip speed, and chip power consumption. Given a chip specification and infinite 
time, one would expect hand design and imbedded language systems to produce the 
most optimal chips, followed by interactive graphics systems, sticks systems, chip 
assemblers, and finally silicon compilers. These latter design systems have area 
penalties due to fixed floorplans and geometric primitives. 
One rarely has infinite time, however, in which to implement a chip. An 
approximation of the ideal chip must be made. For instance, with hand design 
methods, one spends a lot of time planning alternate architectures and 
approximating the design costs for various approaches. Once the range of design 
candidates is narrowed, detailed design can begin. As the detailed design nears 
completion, many of the design approximations may be found to have been 
erroneous. Due to the large investment in the design, a redesign of the chip is 
seldom feasible. As a result, the final chip may be non-optimal in the ideal sense, 
but may be fairly good from a practical standpoint. 
With the more inefficient design systems, chips can be implemented much more 
rapidly than with hand design systems (otherwise these other systems would not 
exist). Because of this reduced design time, it becomes affordable to iterate the 
design. When these design approximations are found to be in error, the chip may be 
redesigned. With the possibility of design iterations, dramatic architecture 
variations can be explored. Chips resulting from architecture modifications may 
well have very large performance advantages over the original hand designed chip. 
In highly complex systems, the system organization has a much greater effect upon 
performance than the details of low level cells. Thus, even though the resulting 
chip is known to be less optimal than a hand-design of the same architecture, the 
chip is more optimal than the hand designed chip since the hand designed chip 
would not be implemented in the new architecture. 
1.6: Conclusions 
There are many design techniques in use today. Each of these systems caters to a 
particular design style. They have various limitations on design capabilities, and 
they have different aids for the designer. As technology advances towards VLSI, 
our design requirements are going to change. We will require fundamentally new 
design principles. Although Silicon Compilers may have undesirable restrictions on 
the types of chips we design, they provide design capabilities that are impossible to 
achieve with our present day tools. They have the potential to implement in an 
hour what current design techniques implement in a lifetime. Machine 
architecture tradeoffs can be explored in an almost interactive environment. Design 
verification can be performed at a level previously unattainable. For these reasons, 
and others, this thesis explores the realm of the Silicon Compiler. 
Chapter 2. Hand Design 
The fundamental task of designing a VLSI circuit is to manage the complexity of the 
design. Even modest chips designed today have several million rectangles and 
hundreds of thousands of devices. Unless the management of the design complexity 
is squarely faced when the design process is begun, the implementation of the chip 
may become an impossible task. Fortunately, techniques exist which successfully 
aid in the management of complexity. In this chapter, we will discuss methods for 
managing complexity and chip planning. 
2.1: Management of Complexity 
We can observe complexity management principles being applied in almost every 
area of life. We can use these same techniques for designing large integrated 
circuits. Three of these techniques will be examined. One technique is the use of 
conventions, a second is partitioning of the design, and a third involves abstraction 
of data. 
Examples of the first complexity tool, conventions, are readily observable in daily 
life. Traffic signals are a successful convention in our modern world. If everyone 
agrees to abide by the restrictions implied by traffic signals, a much more complex 
and inefficient system of maintaining road safety can be avoided. Traffic laws and 
Law Enforcement Officers assure us that (almost) everyone agrees to the 
convention. In VLSI design systems, conventions can be made with regard to 
functional partitioning or timing relationships. If the designer faithfully adheres 
to these conventions, he may feel confident that the design will operate correctly. 
If there are circumstances where the designer feels that the conventions should not 
be followed, he will have to take extra steps to verify that the circuit will still 
operate correctly. 
The second design aid is partitioning. Rather than solving a large problem all at 
once, the problem can be broken into smaller, separable pieces each of which is 
easier to solve than the original problem. This process may again be repeated for 
each of these new, smaller problems, until we are left with simple problems that 
are straightforward to solve. If we have properly partitioned the problem, each of 
the solutions can be combined to solve the whole problem. To allow each of these 
separate solutions to be used together, we must design and specify an interface 
between the pieces. For example, in software programming, a large program is 
broken into several subroutines. To assure proper operation of the collection of 
subroutines, guidelines concerning register and memory usage, data structures, 
calling conventions, and parameter types are developed and adhered to. 
The third aid to handling complexity is data abstraction. There are (at least) two 
branches of physics dealing with objects in motion. In Classical Mechanics, 
everything is very deterministic, and we treat objects like air pucks as indivisible, 
uniform objects. If we look very closely at our air pucks and how they interact, we 
find that Classical Mechanics does not precisely describe the observable 
interactions. We use Quantum Mechanics when we need these precise equations. If 
we look closer still, we find discrepancies between the physically observable events 
and the calculated Quantum Mechanical events: Quantum Mechanics does not 
completely define how our air pucks work. In both of these cases, we know that 
our theories and formulas are wrong, and yet we can still profit by using them. In 
each of these fields, approximations are made. We do not look at each of the 
individual subatomic particles which compose an air puck. Instead, we abstract 
this incredibly large amount of data into a fairly simple model. Similarly, for VLSI 
design, we do not have to examine every single geometric primitive within a region 
of the chip when designing the neighboring regions. Almost every function 
implemented on a chip, certainly every function of reasonable size, requires two 
areas of silicon: The first is a private area over which this function has exclusive 
rights, and the second is an interface area, where external signals connect to the 
function. To use this function, no knowledge of the private area is required. We 
can abstract the function to an interface and a 'black box', and hence have less 
information externally required to use the function. By imposing suitable design 
conventions, the interface area can be a small percentage of the total function area, 
which greatly reduces the data requirements. 
These three techniques of complexity management are used in the cell concept of 
VLSI design. A cell's layout is defined to be a rectangular area of silicon with the 
geometric shapes required to implement the cell's function contained completely 
within the rectangular limits and an interface area limited to the perimeter of the 
area. Along this perimeter, there are 'ports' or 'terminals', where external signals 
may connect to this cell. No external cells or geometric shapes may extend within a 
cell's rectangular limit, the minimum bounding box (MBB) of the cell. 
An important by-product of the cellular design approach concerns data sharing: if a 
function is replicated on a chip (or across many chips), the layout which 
implements the function can be shared between the various instances. The function 
is converted to silicon once, and the resulting pattern can be used many times, thus 
factoring the design cost of the chip. This layout sharing is identical to the use of 
subroutines for code sharing in software programming. 
The internal structure of a cell's layout consists of combinations of primitive 
geometric shapes and instances of other cells. This recursive nature of the cell 
definition allows us to hierarchically design chips. Rowson [25] has defined two 
types of cells: Leaf cells and Composition cells. Leaf cells contain only geometric 
primitives, no references to other cells. Composition cells contain only references 
to other cells, no geometric primitives. We will use this Leaf cell definition, but 
we will allow our Composition cells' layouts to contain geometric primitives. 
Adding geometry to composition cells is done for conceptual ease; simple 
transformations convert from one form to the other. 
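The cell definitions above can be sketched as a small data structure. This is an 
illustration under assumed names (Cell, bounding_box, to_pure_composition), not the 
representation used by any tool described in this thesis. It shows a shared leaf 
layout, a composition cell that also holds geometry, and one possible version of the 
"simple transformation" that moves that geometry into a new leaf subcell. 

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


@dataclass
class Cell:
    name: str
    shapes: List[Rect] = field(default_factory=list)
    # Each instance is (subcell, x_offset, y_offset); subcells are shared objects.
    instances: List[Tuple["Cell", int, int]] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.instances

    def bounding_box(self) -> Rect:
        """Minimum bounding box (MBB) of all shapes and placed subcells."""
        boxes = list(self.shapes)
        for sub, dx, dy in self.instances:
            x0, y0, x1, y1 = sub.bounding_box()
            boxes.append((x0 + dx, y0 + dy, x1 + dx, y1 + dy))
        if not boxes:
            raise ValueError(f"cell {self.name!r} is empty")
        return (
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        )


def to_pure_composition(cell: Cell) -> Cell:
    """Move a composition cell's own shapes into a new leaf subcell.

    Only the given cell is converted; its subcells are left unchanged.
    """
    if cell.is_leaf() or not cell.shapes:
        return cell
    glue = Cell(cell.name + "_geometry", shapes=list(cell.shapes))
    return Cell(cell.name, instances=[(glue, 0, 0)] + list(cell.instances))


bit = Cell("bit", shapes=[(0, 0, 10, 20)])
pair = Cell("pair", shapes=[(0, 20, 20, 22)], instances=[(bit, 0, 0), (bit, 10, 0)])
pure = to_pure_composition(pair)
print(pair.bounding_box(), pure.bounding_box())
```

Both instances in pair refer to the same bit object, which mirrors the layout 
sharing described above: the leaf is designed once and placed twice. 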
2.2: Chip Planning: The Floorplan 
The arrangement of subcells within a composition cell can have a dramatic effect 
upon the size and performance of the chip. To aid the user in composing cells, the 
notion of a 'floorplan' has been developed. A floorplan is the blueprint which 
indicates topologically how the subcells fit together to form the complete cell. The 
floorplan also shows the wiring strategy used in the cell. Floorplans are invaluable 
aids for top-down chip planning. The relative size and placement of the major 
subdivisions of a cell are quickly visualized. The communication costs for various 
arrangements can also be determined. 
To illustrate the use of floorplanning, we will discuss the planning for the OM2 
datapath chip [15][16]. A functional block diagram of the datapath chip is shown 
in figure 2-1. At the highest chip level, we needed a chip with three bi-directional 
Input/Output ports. These were to communicate with the datapath of the chip. One 
of these ports was to be mainly a control port, which brought the instruction word 
into the decoders. The other two ports were data ports, connected to the internal 
data buses. In addition to the datapath, we also required some flag logic, and 
additional control input pads. Our primary data flow was to run horizontally 
through the chip and the primary control flow was to run vertically. 
[Block diagram; labels recoverable from the figure: Control Port, Flag Logic, Control Inputs] 
Fig. 2-1: OM2 Datapath Chip Block Diagram 
Figure 2-2 shows the high level floorplan for the chip. We have the two data ports 
on the west and east edges of the chip, the datapath in the center, the literal port 
and flags to the north, and the control input pads on the south. The sizes of the 
various boxes were estimated, considering the functions of each element, which 
completes the planning of the highest level composition cell layout. 
We can now decompose the subcells within the global floorplan. The datapath 
section was to be composed of data processing elements and instruction decoders. 
The decoders were to take the microcontrol bits entering the chip and drive each of 
the processing elements' control lines as a function of the input. The instruction 
decoder was broken into two sections. One section was placed above the processing 
elements, the other was placed below the elements. This was done because the cell 
size estimates showed the processing elements to be much narrower than the full 
[Floorplan; labels recoverable from the figure: Literals, Flags, Data/Control, Data, Datapath] 
Fig. 2-2: Full Chip Logical Floorplan 
decoder. Buffers were placed between the decoder and datapath. These buffers 
synchronized the decoder's signals, and satisfied the electrical requirements of 
driving large datapath loads from weak decoder outputs. Figure 2-3 shows the 
floorplan of the datapath section of the chip. 
Decomposing the processing element section, we needed a register array, a barrel 
shifter, and an ALU. To relieve much of the register bottleneck, we had two data 
buses running between the individual elements. Each of the registers in the array 
could read or write to either bus. The shift array read data from the bus and drove 
the ALU multiplexer. The ALU could read data from either of the buses or from the 
shifter. The ALU and shifter were chained together to speed up the multiply and 
divide operations. The ALU's output could drive either bus. The buses also 










Fig. 2-3: Datapath Logical Floorplan 
Fig. 2-4: Single Bit-Slice Logical Floorplan 
We can further examine the ALU. The ALU was built of five leaf cells. The first 
section was the two input registers. The second section was the logic for 
computing carry propagate, carry kill, and carry generate. The third section was 
the actual carry chain. The fourth section computed the ALU output as a function 
of the carry input and the carry propagate signals. The final section was the output 
registers. Figure 2-5 shows this layout. 
[Floorplan; labels recoverable from the figure: Carry Out, Input, Logic, Carry, Output, Carry In] 
Fig. 2-5: ALU Logical Floorplan 
At this point, we began designing the leaf cells. For example, we could see that the 
input registers received data from the left side of the cell and drove data out the 
right side of the cell. The two data buses ran through the cell, but were not used by 
the cell. Control lines for the cell ran vertically through the cell. Once the ALU 
cells were designed, we could draw the physical floorplan for the ALU. In figure 
2-6, we have the subcells shown to scale. We have also shown the layer for each 




[Floorplan; labels recoverable from the figure: Data Bus 1, Data Bus 2, Carry, Result, Output, Input, Logic] 
Fig. 2-6: ALU Physical Floorplan 
As the remaining datapath elements were designed, the bit-slice physical floorplan 
took shape (fig. 2-7). The control and data lines in the register array had different 
layer conventions than the other processing elements, as shown in the figure. 
Similarly, the datapath floorplan (fig. 2-8) and finally the entire chip floorplan 
[Floorplan; labels recoverable from the figure: Control; Data = Metal, Control = Poly; Data = Poly, Control = Metal] 
Fig. 2-7: Bit-Slice Physical Floorplan 
[Floorplan; labels recoverable from the figure: Decoder, Buffers, Processing Elements, Buffers, Decoder; Outputs, Diffusion, Control] 
Fig. 2-8: Datapath Physical Floorplan 
(fig. 2-9) were completed in the same manner. The final chip layout is shown in 
figure 2-10. Much of the regularity of the design was due to the use of 
floorplanning. Due to the regularity of the design and the completeness of the 
planning, the chip was designed in nine man-months. 
[Floorplan; labels recoverable from the figure: Literals, Flags, Pads] 
Fig. 2-9: Full Chip Physical Floorplan 
2.3: The Slicing Floorplan 
Specific chip architectures have related floorplans which are suitable for those 
particular chip structures. To build general purpose design tools, we would like to 
have general models for cells and floorplans. We can then build the tools to take 
advantage of the resulting floorplans. 
In top-down design, we take a description of a large unit, and decompose this unit 
into simpler, smaller units. Each of these units can be similarly decomposed. This 
decomposition process continues until all of the descriptions can be easily 
Fig. 2-10: OM2 Datapath Chip Mask Set 
implemented. We then work bottom-up, fusing these lower-level implementations 
to form implementations for each of these larger units. When we reach the highest 
chip level, we have an implementation of the chip. We want our general purpose 
floorplan to model the top-down design, bottom-up implementation style of design. 
Our complexity management strategy uses rectangular cells. Our general-purpose 
floorplan will therefore use rectangular cells. To perform top-down design, we 
need to provide the capability of decomposing cells. To decompose a cell, we will 
divide the given rectangular cell into smaller rectangular regions. To perform 
bottom-up fusions of cells, we need to interconnect each of the subcells to form the 
implementation of the given cell. 
Completely general 'glue' between the cells would allow transistors to be added in 
the interconnections between the cells. Allowing transistors between cells is 
usually an example of local optimization, rather than global optimization, and the 
specification and verification of these 'glue' circuits can introduce many errors into 
the design. Therefore, for our general model, we will restrict all transistors to Ue 
within subcells, and only allow wiring to fuse the subcells together. 
If we allow completely general subdivision of a cell into subcells, we may have no 
preferred order of composition. With preferred composition orders, we can achieve 
more optimal circuit layouts. Without preferred composition orders, we can not 
determine the optimum design for a particular cell until every other cell on the 
chip has been designed. Hence, we can never achieve an optimum design, although 
we can approach optimal designs by iterating the design many times. Figure 2-11 
shows a rectangular arrangement of cells that does not have a preferred order. Not 
only can we not determine a good order for cell generation, but we can not 
determine a good order for routing the wires in the four wiring channels. 
A floorplan that does have a preferred composition order is the Slicing floorplan. A 
slicing floorplan has the following definition. 
A Slicing Cell is either 
1) A Leaf cell, 




Fig. 2-11: Floorplan with no Preferred Order 
Figure 2-12 shows the three possible types of slicing cells. Due to the recursive 
nature of this definition, we have the capability of designing a rather large 










[Fig. 2-12; panel labels recoverable from the figure: b) Horizontal, c)] 




For the systems described in this thesis, we will use the Slicing floorplan as the 
floorplan model. While other floorplans can use these same techniques, the 
mechanics of building the tools may be more difficult, and the examples may not be 
as clear as Slicing examples. 
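The preferred composition order of a slicing floorplan can be illustrated with a 
small sketch. Since the full definition is not reproduced here, the sketch assumes 
that a composite slicing cell is either a horizontal row or a vertical stack of 
slicing cells; the class names, the sizes, and the omission of wiring area are all 
assumptions made for illustration. 

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    name: str
    width: int
    height: int


@dataclass
class Slice:
    name: str
    direction: str  # "horizontal": children side by side; "vertical": stacked
    children: List["SlicingCell"]


SlicingCell = Union[Leaf, Slice]


def size(cell: SlicingCell) -> Tuple[int, int]:
    """Bottom-up (width, height) of a slicing cell, ignoring wiring area."""
    if isinstance(cell, Leaf):
        return cell.width, cell.height
    sizes = [size(child) for child in cell.children]
    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    if cell.direction == "horizontal":
        return sum(widths), max(heights)
    if cell.direction == "vertical":
        return max(widths), sum(heights)
    raise ValueError(f"unknown direction {cell.direction!r}")


def composition_order(cell: SlicingCell) -> List[str]:
    """Post-order walk: every subcell appears before the cell containing it."""
    if isinstance(cell, Leaf):
        return [cell.name]
    order: List[str] = []
    for child in cell.children:
        order.extend(composition_order(child))
    order.append(cell.name)
    return order


core = Slice("core", "vertical", [Leaf("decoder", 80, 15), Leaf("elements", 80, 40)])
chip = Slice("chip", "horizontal",
             [Leaf("left_port", 10, 60), core, Leaf("right_port", 10, 60)])
print(size(chip), composition_order(chip))
```

Because the structure is a tree, a post-order walk always yields an order in which 
each cell is composed only after all of its subcells are known, which is exactly 
what the arrangement in figure 2-11 lacks. 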




[Sequence of floorplans; panel labels recoverable from the figure: Cut 3, Cut 4, Cut 5, Cut 6, Cut 7, Cut 8] 
Fig. 2-13: Slicing a Chip 
2.4: Global versus Local Optimization 
In Small Scale Integration (SSI) and Medium Scale Integration (MSI) designs, much 
time was devoted to performing 'Local' optimizations on the circuits. This was 
because the entire circuit was usually considered 'Local'. For LSI, and certainly for 
VLSI, the situation has changed. No longer is the entire chip design considered a 
'Local' design. Where time was previously spent performing local optimizations, 
time must now be spent performing global optimizations. As our designs increase in 
size, we must depend more upon global design optimization. Local optimizations can 
actually hurt the design from a global point of view. 
For example, logic design in discrete or TTL design regimes involves 'logic 
minimization', which actually means transistor minimization. Much effort was 
spent reducing the number of transistors required to implement functions, because 
transistors were the expensive part of the design; the wires were free. In silicon 
design, the majority of the chip area is devoted to wiring. The actual area for the 
transistors is less than 40% of the total chip area for 'good' designs. In many 
instances, the transistors are free; they are placed under the wires. 
Using wire-wrap boards, the designer has completely arbitrary interconnectability: 
any pin of any chip can be connected to any other pin of any other chip, regardless 
of where the chips are positioned on the board and independent of any other 
interconnections. In silicon, it is very inexpensive to interconnect shared edges of 
adjacent cells if the connections are well correlated (in approximately the same 
order in each of the cells). Almost any other circumstance, however, costs a great 
deal. Wires can not be arbitrarily drawn across the chip because wires can not 
arbitrarily cross other wires or cross through cells. Wires consume a great deal of 
the chip area. 
A final contrast between TTL design and VLSI design is the difference in wire 
loading. In TTL design, the chips are each capable of easily driving fairly long 
wires (from one side of the board to the other). In silicon, however, the wires can 
add a large amount of loading to devices, which slows down the operation of the 
circuit. A small gate can not drive a wire from one corner of the chip to the other in 
a short period of time. Hence, in VLSI design, circuits which must communicate 
must be fairly close together, adjacent if possible. 
Each of these points argues for global planning. The communication costs for VLSI 
are the dominating cost of the design. Global optimization of the communication 
paths can provide greater performance increase than local optimization of each 
circuit on the chip. 
2.5: Conclusions 
As we move towards VLSI, we can not continue to design chips as we have in the 
past. The complexity of design causes the design cost to rise exponentially. Our 
design tools must be designed to cope with the complexity of the design. We have 
seen some techniques which aid in the complexity management issues, and we have 
seen some planning tools which will aid in the global optimization of designs. The 
following chapters discuss tools which are built upon the techniques presented in 
this chapter. 
Chapter 3: Imbedded Languages 
When using the cellular design approach, quite often it is the case that a family of 
similar cells is developed. Each cell instance within this family shares most of the 
characteristics common to the family, but has its own personalization which 
distinguishes it from the others. For example, a group of cells may be designed 
where the cells perform the same function but each consumes a different amount of 
power. Another example might be a collection of similar cells where the aspect 
ratio of the cells varies among members. 
Purely graphical systems dictate that each member of the group be completely 
designed, because graphic systems emphasize the differences between cells. For a 
small family of cells, this is not a problem, but for a large collection, perhaps 
containing many independent variables, it is not practical to design the entire 
family. In these cases, users would copy and edit the cell which most closely 
approximates the required cell. 
If constructs for specifying conditional circuitry were added to a graphics language, 
a designer could specify a cell which represents a family of cells, using the 
conditional operators to distinguish between members of the family. To use those 
conditional constructs, methods would be added to allow parameter passing to the 
cells, so that these parameters could participate in the evaluation of the 
conditionals. It would also require the use of expression evaluators, so that the 
parameters could be operated upon as the conditionals were evaluated. Looping 
constxucts would be very handy for generating arrays and vectors of cells. 
By the time these features were added to a graphics system, the system would no 
longer be a high level graphics system but a low level programming language. 
Rather than add these complexities to a simple graphics system, we might add 
graphics primitives to existing programming languages. Using this new approach, 
cells can easily be designed which are parametrized and which actually 
generate a whole family of cell instances depending upon the parameters passed into 
the cell. 
Another advantage of designing classes of cells has to do with the binding of design 
decisions. In standard graphics designs, virtually all of the design parameters must 
be bound before the cells are designed. Using the software programming approach, 
the exact parameter values are not needed, but rather a range of acceptable values is 
required. The cells can then be designed to produce correct layouts over these 
ranges. When the actual parameter values are known, these cell programs are 
called with the appropriate values and the layout is generated. The design can 
proceed before the details are completely known. 
A third advantage of designing families of cells has to do with the granularity or 
size of cells. With graphics approaches, cells usually contain 10 to 100 primitive 
components, or transistors. This limitation is brought about by limitations on CRT 
terminals and on the ability of the human mind to design large circuits. These 
small cells are assembled to form the chips. Rarely are configurations of these small 
cells stored as a large cell in the library, because the large cells are 
exact physical elements which cannot be changed. Inefficiencies in a particular 
instantiation are usually not tolerated, so the large cell is redesigned in each context 
with the minor variations that each context requires. With the software approach, 
these large cells can be parametrized to vary the arrangements of smaller cells and 
remove the inefficiencies in the layout. Thus large cells can be designed and saved 
in libraries and still yield efficient layouts. 
These software languages are referred to as Imbedded Languages. The constructs 
for generating graphics primitives are imbedded in a previously existing 
programming language. There are two classes of imbedded languages: translation 
based languages and data structure based languages. The translator imbedded 
languages output the graphic primitives as they are encountered during the 
execution of the program. Data structure imbedded languages build up a data 
structure representing the entire chip as the program is executed. Once this data 
structure is built, the graphic primitives are output. The latter approach allows 
programs to modify the design after it is generated, while the former approach 
forbids such modification. Imbedded Languages exist in several languages at 
Caltech: ICLIC, written in ICL [4]; LAP, written in Simula [19]; Clap, written in C; 
and others, written in Pascal, Fortran [31], and Basic. The language presented here is 
ICLIC, which is an example of a data structure imbedded language. 
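The difference between the two classes can be sketched in a few lines of Python. This is only an illustration: the class names, the `box` call, and the textual output format are hypothetical and are not taken from ICLIC or any of the systems named above.

```python
class Translator:
    """Translation based: each primitive is written out as soon as it is met."""

    def __init__(self):
        self.output = []

    def box(self, layer, x0, y0, x1, y1):
        # The primitive is emitted immediately and cannot be revisited.
        self.output.append(f"BOX {layer} {x0} {y0} {x1} {y1}")


class Builder:
    """Data structure based: primitives are held until the design is complete."""

    def __init__(self):
        self.shapes = []

    def box(self, layer, x0, y0, x1, y1):
        self.shapes.append([layer, x0, y0, x1, y1])

    def emit(self):
        # Output happens only once the whole structure has been built.
        return [f"BOX {s[0]} {s[1]} {s[2]} {s[3]} {s[4]}" for s in self.shapes]


def strip(target, offset):
    """A tiny 'cell program' that works with either kind of back end."""
    target.box("RED", offset, 0, offset + 2, 10)


translated = Translator()
strip(translated, 0)

built = Builder()
strip(built, 0)
built.shapes[0][0] = "BLUE"   # a later program modifies the finished design
```

With the builder, the layer change is visible in `built.emit()`; with the translator, the red box has already been written and no later pass can alter it.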
3.1: ICLIC 
ICLIC is a series of functions and datatypes defined within ICL to allow the user to 
describe integrated circuits. ICLIC was written in ICL by Ron Ayres and Maureen 
Stone. Integrated circuit descriptions are ultimately geometrical regions, so the 
primitive constructs in ICLIC are representations of simple geometrical shapes. The 
most primitive shape is the BOX, which we will define to be all points on the plane 
whose x-coordinates are between the lower and upper x limits of the box 
(inclusive) and whose y-coordinates are between the lower and upper y limits of 
the box. The following ICL code defines the BOX datatype and a function TO which 
aids the user in generating a box: 
TYPE BOX= [LOW,HIGH:POINT]; 
DEFINE TO(A,B:POINT)=BOX: [LOW: A MIN B  HIGH: A MAX B] ENDDEFN 
POINT is a pre-defined ICL datatype which has two real values labeled X and Y. The 
MIN and MAX functions are defined for POINTs to work coordinate-wise: the MIN 
of two points has an X value which is the minimum of the two points' X values and 
a Y value which is the minimum of the two points' Y values. A user may generate a 
box in one of the following two manners: 
VAR B1,B2=BOX; 
B1:= TO(3#4,10#12); 
B2:= 3#12 \TO 10#4; 
The two boxes are identical because the point values are sorted. A second primitive 
geometrical region is a polygon. A polygon is defined to be the set of all points in 
the plane which lie 'inside' the line segments which comprise the edges of the 
polygon. We can represent the polygon in the computer's memory as a list of points 
which are the vertices of the edge segments. The following code declares the type 
POLYGON: 
TYPE POLYGON= {POINT}; 
Here we have declared that a polygon is an arbitrary list of points. To generate a 
triangle, the following constructions can be used: 
VAR P1,P2=POLYGON; 
P1:= {3#2; 10#4; 8#12}; 
P2:= {3#2; 10#.+2; .-2#.+8}; 
The second example makes use of the relative-point feature in ICL. The final 
primitive geometric region used in ICLIC is a WIRE. Formally speaking, a WIRE is 
the set of all points which lie within a fixed distance from any point on a given 
series of line segments. The collection of line segments is called the 'path' of the 
wire and the discrimination distance is called the 'radius' of the wire. This formal 
definition of a wire requires that circular arcs be present in the boundary of the 
wire. ICLIC approximates a formal WIRE by 'squaring off' all of the round edges. A 
WIRE can be defined in ICL by stating: 
TYPE WIRE= [WIDTH:REAL PATH:{POINT}]; 
An example of a wire might be 
VAR W=WIRE; 
W:=[WIDTH:2 PATH:{3#3;7#.;.#5;10#8;14#.}]; 
[Figure: the example polygon {3#2; 10#4; 8#12} and the example wire with PATH: {3#3; 7#.; .#5; 10#8; 14#.}] 
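The behaviour of TO and of the relative-point notation can be checked with a small Python sketch. It is an illustration under stated assumptions, not a rendering of ICL itself: the names `Point`, `Box`, `to`, `rel`, and `resolve` are hypothetical, and the reading of "." as "the previous point's coordinate, plus an optional offset" is inferred from the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Box:
    low: Point
    high: Point


def to(a, b):
    """Build a box from two opposite corners, sorting coordinate-wise
    in the same way as A MIN B and A MAX B."""
    return Box(Point(min(a.x, b.x), min(a.y, b.y)),
               Point(max(a.x, b.x), max(a.y, b.y)))


def rel(offset=0):
    """Stand-in for '.' (assumed meaning: previous coordinate plus offset)."""
    return ("rel", offset)


def resolve(path):
    """Expand a list of (x, y) entries in which either coordinate may be
    absolute (a number) or relative (a value produced by rel())."""
    points = []
    for x, y in path:
        # Assumption: a relative first point is taken relative to the origin.
        prev = points[-1] if points else Point(0, 0)
        new_x = prev.x + x[1] if isinstance(x, tuple) else x
        new_y = prev.y + y[1] if isinstance(y, tuple) else y
        points.append(Point(new_x, new_y))
    return points


b1 = to(Point(3, 4), Point(10, 12))
b2 = to(Point(3, 12), Point(10, 4))          # corners given in another order

p1 = [Point(3, 2), Point(10, 4), Point(8, 12)]
p2 = resolve([(3, 2), (10, rel(2)), (rel(-2), rel(8))])

wire_path = resolve([(3, 3), (7, rel()), (rel(), 5), (10, 8), (14, rel())])
```

Under these assumptions `b1` and `b2` compare equal, and the relative form `p2` expands to the same three vertices as `p1`.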
We now have representations for the primitive features of our imbedded language. 
We would like to be able to talk about features in general, not just BOX-features, 
POLYGON-features, and WIRE-features. To do this, we need a new datatype which 
can be either a BOX, POLYGON, or WIRE. A datatype of this form is called a 'variant' 
in ICL. We declare a datatype RG (for ReGion) which can be any one of the three 
primitives: 








If we have a variable of type RG, we can assign to it a BOX, POLYGON, or WIRE. 
Unfortunately, we can only describe single features with this datatype. Usually IC 
masks contain more than just one geometric primitive. We can extend the 
definition of an RG to contain the possibility of many primitives by adding the 
following line to the definition of an RG: 
UNION= {RG} 
This now states that an RG may also be an arbitrary list of RGs. In addition to many 
features in an IC design, there are also many 'layers' to an IC specification. To 
discriminate between layers, the features have a color associated with the region. 
We can incorporate this possibility in our definition of RG by adding this line: 
COLOR= [PAINT:RG WITH:SCALAR(RED,BLUE,GREEN,BLACK,YELLOW)] 
Finally, we may wish to reposition previously declared regions. In almost every 
instance, we are describing features relative to a local origin rather than in absolute 
chip coordinates. Once these sub-pieces are generated, we would like to reposition 
them into the absolute chip coordinates (or to a higher level local coordinate 
system). The primary repositioning operations we would like to perform are 
translation, rotation, and mirroring. These operations can all be represented as a 
transformation matrix which should be applied to all coordinates of the region to be 
displaced. By adding the matrix displacement case to the RG definition, we can 
arbitrarily reposition, mirror, rotate, and scale subcells. The following case is added 
to the RG definition: 
DISPLACE= [DISPLACE:RG BY:MATRIX] 
and the MATRIX datatype is declared with: 
TYPE MATRIX= [A, B, C, 
D, E, F: REAL]; 
We have now completely described the RG datatype definition. With this datatype, 
we can represent the features of integrated circuit masks. 
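To make the recursive structure of RG concrete, the following Python sketch models the same cases and walks a region to collect its transformed points. It is an approximation under several assumptions that are not stated in the text: the matrix is read as x' = A*x + B*y + C and y' = D*x + E*y + F, an inner color overrides an outer one, and a wire contributes only its path points (its width is ignored). All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    low: tuple
    high: tuple


@dataclass
class Polygon:
    points: list


@dataclass
class Wire:
    width: float
    path: list


@dataclass
class Color:
    paint: object      # any region
    color: str


@dataclass
class Displace:
    region: object     # any region
    by: tuple          # (A, B, C, D, E, F)


IDENTITY = (1, 0, 0, 0, 1, 0)


def apply(m, p):
    """Assumed convention: x' = A*x + B*y + C, y' = D*x + E*y + F."""
    a, b, c, d, e, f = m
    return (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)


def compose(outer, inner):
    """Matrix equivalent to applying `inner` first and `outer` second."""
    a, b, c, d, e, f = outer
    a2, b2, c2, d2, e2, f2 = inner
    return (a * a2 + b * d2, a * b2 + b * e2, a * c2 + b * f2 + c,
            d * a2 + e * d2, d * b2 + e * e2, d * c2 + e * f2 + f)


def vertices(rg, m=IDENTITY, color=None):
    """Yield (color, point) pairs in chip coordinates. A Python list of
    regions plays the role of the UNION case."""
    if isinstance(rg, list):
        for sub in rg:
            yield from vertices(sub, m, color)
    elif isinstance(rg, Color):
        yield from vertices(rg.paint, m, rg.color)
    elif isinstance(rg, Displace):
        yield from vertices(rg.region, compose(m, rg.by), color)
    elif isinstance(rg, Box):
        (x0, y0), (x1, y1) = rg.low, rg.high
        for p in [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]:
            yield (color, apply(m, p))
    elif isinstance(rg, Polygon):
        for p in rg.points:
            yield (color, apply(m, p))
    else:  # Wire: path points only, width ignored in this sketch
        for p in rg.path:
            yield (color, apply(m, p))


def mbb(rg):
    """Bounding box of the points produced by vertices()."""
    xs, ys = zip(*(p for _, p in vertices(rg)))
    return (min(xs), min(ys)), (max(xs), max(ys))


cell = Color(Box((0, 0), (2, 1)), "RED")
# Rotate by 90 degrees, then translate by (10, 5).
placed = Displace([cell], (0, -1, 10, 1, 0, 5))
```

Because the displacement is carried down the tree rather than applied eagerly, the same `cell` value can be placed many times without being copied.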
The datatype definitions presented above are an approximation of the actual ICLIC 
datatype definitions. Appendix 1 lists a more complete description of the datatype 
and functions defined for both generating and examining layouts. The primary 
difference between the definitions presented here and those in the appendix has to 
do with capturing the minimum bounding box (MBB) of the layouts. The MBB of a 
layout is a very useful quantity, and for efficiency, the layout datatype in ICLIC is 
MRG, which stands for Minimum bounding box with ReGion. 
3.2: Parametrized Cells 
To illustrate the design of parametrized cells, let us consider the task of designing a 
shift register cell. The shift register circuit we will implement is shown in figure 
3-2. This circuit consists of a pair of inverters with a transmission gate connecting 
them and a transmission gate connecting the input to the first inverter. We can 
design the layout of the shift register as shown in figure 3-3. To design this 
layout, we have computed the expected power requirements and aspect ratio of the 
cell. 
As we use our shift register cell in various places in several chips, we may find that 
the power requirements in some cases differ from the power requirements of our 
original cell, so we must design a new cell for these new uses. In other instances, 
[Figure labels: Clock 1, Clock 2] 
Fig. 3-3: Shift Register Layout 
must redesign our cell to fit these new requirements. 
Each time we must redesign the cells, we increase the chances of errors in the 
design. We also proliferate cell instances in our database, and we must expend the 
effort to document the new cell. 
In our shift register example, it is very easy to mathematically describe our cell 
layout as a function of the power requirements. In figure 3-4 we show the layout 
Fig. 3-4: Parametrized Layout 
requirements of the cell. Given this description of the shift register layout, we can 
now generate a new cell every time we compute a new power requirement. In fact, 
we can write a little program which will generate this new cell for us. 
DEFINE SHIFT_REGISTER_CELL(POWER:REAL)=MRG: 
BEGIN VAR WIDTH,LENGTH,TOP=REAL; 
DO LENGTH:= 2/POWER MAX 4; 
WIDTH:= 32/LENGTH MAX 2; 
TOP:= LENGTH+15 MAX 20; 
GIVE { WIRE(RED, {0#0;.#TOP}); 
WIRE(GREEN, {-3#10; 4#.}); 
WIRE(RED, {6#6; 11+WIDTH#.}); 
WIRE(GREEN, {11#0;.#TOP}); 

WIRE(BLUE, {0#0; 15+WIDTH#.}); 
WIRE(BLUE, {0#TOP; 15+WIDTH#.}); 
GCB\AT {12#0;.#TOP}; 
GRCBD\AT 5#9; 
GRCBU\AT 12#11 }\AT {0#0;15+WIDTH#0} 
END 
ENDDEFN 
Figure 3-5 shows the results of calling this program with power requirements of 
1/2 and 1/8. Rather than call these two specific sets of geometrical primitives 
cells, why not call the program the cell? We can then call these layouts instances of 
the cell. Each instance of the layout may have a completely different set of 
geometrical primitives, yet they are all instances of the one cell. 
POWER= 1/8 POWER= 1/2 
Fig. 3-5: Two Cell Instances 
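The arithmetic at the head of the cell program can be traced for the two power values of figure 3-5. The Python sketch below mirrors only the three assignments (LENGTH, WIDTH, TOP) and the 15+WIDTH offset used in the final placement; it assumes that MAX binds more loosely than division and addition, so that `2/POWER MAX 4` means the larger of 2/POWER and 4. The function name and the key "pitch" are hypothetical.

```python
def shift_register_dimensions(power):
    """Dimensions derived from the power requirement, following the
    assignments in SHIFT_REGISTER_CELL (precedence of MAX is assumed)."""
    length = max(2 / power, 4)       # LENGTH:= 2/POWER MAX 4
    width = max(32 / length, 2)      # WIDTH:= 32/LENGTH MAX 2
    top = max(length + 15, 20)       # TOP:= LENGTH+15 MAX 20
    return {
        "length": length,
        "width": width,
        "top": top,
        "pitch": 15 + width,         # x offset used when the half-cell is repeated
    }


low_power = shift_register_dimensions(1 / 8)
high_power = shift_register_dimensions(1 / 2)
```

Under this reading, the low power instance is tall and narrow (length 16, width 2) while the high power instance is short and wide (length 4, width 8), which is the trade-off the two instances in figure 3-5 illustrate.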
We can design this cell program before we know the actual power requirements of 
the cell. In our original approach, we had to compute the power requirements 
before we could begin the layout design. With this programmatical approach, we 
can estimate reasonable ranges we are willing to accept as power requirements, and 
design our cell before we know all of the implementation details. 
Let's continue to parametrize our shift register cell. We could have designed our 
cell with many different aspect ratios. Depending upon how and where this shift 
register is used, the optimal aspect ratio for the cell changes. Figure 3-6 shows 6 
different aspect ratios for the shift register cell. The first aspect ratio is 
approximately square, and takes the least area. In some cases, however, the 
horizontal space is more costly than the vertical space, so we might wish to use a 
narrower cell, even though the cell takes more vertical area, as in the second type 
of layout. The third and fourth layouts were designed so that vertical space is at an 
absolute minimum, while the fifth and sixth layouts use an absolute minimum 
amount of horizontal space. Each of these layouts is parametrized with respect to 
the power requirements. We can now write a cell program which is parametrized 
in terms of both power requirements and aspect ratio. When we call the shift 
register cell, the program chooses the layout which most closely matches our 
desired aspect ratio, and generates the corresponding layout. 
Fig. 3-6: Six Shift Register Layouts 
Rarely are single shift register bits used. In most cases, a whole string of bits is 
required. In standard approaches, one does not think of a shift register row as a 
single primitive cell; rather, a single bit is the primitive cell, and the user must 
interconnect each of the cells into a row for each shift register row needed. This is 
done because there is much variability in the requirements of the shift row (such 
as power, aspect ratio, number of bits). Because of this variability, fixed cells are 
usually not helpful. Since we are designing programmable cells, we can program 
this additional variability into the cell. 
For instance, the user might wish to state how many bits are to be in the shift 
register, so the program could generate the whole row. The user may also wish to 
have more than one shift register. Some area can be saved by placing two identical 
shift registers side-by-side. When long shift register chains are used, the chip area 
is long and thin. By folding the long shift registers, the chip area becomes more 
square, usually a desirable option. So let's design our shift register cell to take these 
parameters: number of bits wide, number of bits long, power per bit, and desired 
area. Our cell will take the power and compute cell sizes for the six different aspect 
ratios. It will then produce a single row and attempt to fold the row to match the 
desired area. Finally the cell will return a layout of the entire array, using the 
single bit layout and folding factors that will produce the best layout. 
What if none of the possible implementations of the shift array fit within the 
desired area? Rather than having the program flag an error or abort the design, we 
may tell the program how to choose the next best area. For instance, we may like to 
state that the desired area is 500 by 800 lambda, but if nothing fits, the x size is 
free to grow while the y size must not get larger than 800 lambda. Or, we may say 
that the area should be 400 by 400, and if nothing fits, the instantiation with the 
smallest area should be used. To allow these possibilities, we will add one more 
parameter to our cell program which is a weight factor: if none of the 
instantiations fit, we will compute an excess cost for all prospective candidates by 
summing the x oversize times the x-coordinate of the weight and the y oversize 
times the y-coordinate of the weight. We select the candidate with the lowest 
excess cost. 
The ICL code for our cell is listed in appendix 2. The organization of the code is as 
follows. We have routines for generating single bits of the shift register, named 
SHIFT1_CELL through SHIFT6_CELL. These routines are parametrized in terms of 
pullup transistor size, pulldown transistor size, and power line widths (PU=pullup 
length, PD=pulldown width, SP=width of single row power line, DP=width of 
double row power line, and HP=width of half-row power line). These return the 
layout for a single bit of the shift array. Next we have routines which generate 
rows of these single bits (SHIFT1_ROW through SHIFT6_ROW) plus a routine which 
turns these rows into a complete array (FINISH). These routines are also 
parametrized in terms of total power, number of bits, and folding factor (TP=width 
of total power line, NR=number of bits per row, RB=number of rows in each shift 
register, NB=number of shift registers, NL=number of bits in the last row of 
each register, and TB=total number of bits in each shift register=NR*(RB-1)+NL). The 
functions SHIFT1_ARRAY through SHIFT6_ARRAY simply generate the entire array. 
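The relation TB=NR*(RB-1)+NL fixes the folding once the total number of bits and the number of bits per row are chosen. As a small worked example in Python (the function name `fold` is hypothetical, and the appendix code may compute these quantities differently):

```python
def fold(total_bits, bits_per_row):
    """Return (RB, NL): the number of rows in one folded shift register and
    the number of bits in its last row, so that TB = NR*(RB-1) + NL."""
    rows = -(-total_bits // bits_per_row)            # ceiling division
    last = total_bits - bits_per_row * (rows - 1)    # bits left for the final row
    return rows, last


rb, nl = fold(100, 7)      # a 100-bit register folded at 7 bits per row
```

Here a 100-bit register folded at 7 bits per row needs 15 rows, the last holding 2 bits, since 7*14+2 = 100. When the row length divides the total evenly, the last row is simply a full row.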
Given these shift array functions, we would like a routine which determines the 
area of each possible shift array. The SIZE function will return the area of a 
candidate and a routine which, when executed, will generate that candidate. We 
don't want to generate the actual layouts of every candidate to select the best layout 
because this would take a lot of space in the computer's memory plus it would take 
a long time. Instead, SIZE computes what the size would be, and generates a 
function reference which we may execute if the candidate is selected as best. The 
function SHIFT_CELL, which is the function a user calls, checks many candidates 
and selects the one best fitting the user's description. The best candidate is 
determined by the following algorithm: 
If there are candidates whose x and y values are less than the 
desired size, the one whose x and y values are closest (sum of 
squares) is chosen. 
If no candidates fit, a weight is determined for each candidate, 
and the candidate with the smallest weight is used. The weight 
is determined as follows: 
    If the x value is less than the desired x value, use 0; 
    otherwise use the difference between the actual and 
    desired x values. 
    Multiply this number by the x weight and square the result. 
    Similarly, compute the y weight. 
    The total weight is the sum of the x and y weights. 
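The selection rule, together with the idea of returning a size and a deferred generator for each candidate, can be sketched in Python as follows. This is one reading of the algorithm, not the appendix code: "closest" is taken to mean closest to the desired size, ties go to the first candidate examined, and the candidate names and sizes are purely illustrative.

```python
def choose(candidates, desired, weight):
    """candidates: list of ((x, y), make) pairs, where make is a function
    that generates the layout only when called."""
    dx, dy = desired
    wx, wy = weight

    # Candidates that fit: pick the one closest to the desired size.
    fitting = [c for c in candidates if c[0][0] < dx and c[0][1] < dy]
    if fitting:
        return min(fitting,
                   key=lambda c: (dx - c[0][0]) ** 2 + (dy - c[0][1]) ** 2)

    # Nothing fits: weight the oversize in each dimension and square it.
    def excess(c):
        x, y = c[0]
        return (max(0, x - dx) * wx) ** 2 + (max(0, y - dy) * wy) ** 2

    return min(candidates, key=excess)


built = []   # records which layouts were actually generated


def candidate(name, size):
    def make():
        built.append(name)
        return f"layout of {name}"
    return (size, make)


candidates = [candidate("A", (535, 991)),
              candidate("B", (875, 523)),
              candidate("C", (539, 839))]

size, make = choose(candidates, (500, 800), (1, 1))
layout = make()          # only the winning candidate is ever generated
```

With equal weights the third candidate wins, since it is only slightly oversize in both directions. Setting the x weight to zero frees the x dimension and selects the wide candidate instead; setting the y weight to zero selects the tall one.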
The remaining functions in the listing (GRAPH and TABLE) produce a graph and 
tabular listing of the candidate sizes. These are useful if a designer wishes to see all 
of the candidate sizes for any particular size of shift array. 
This parametrized cell is used as follows. The designer determines the number of 
bits in the array. For our example, we require 4 shift registers of 100 bits each. 
We would like these to be fairly low power, so our power requirements will be 
1/8. Due to chip area constraints, we would like the array to be approximately 500 
by 800 lambda. We can get a tabular listing of possible candidates by entering ICL 
and typing the command 
TABLE(4,100,.125,23,1000#1500); 
The first parameter is the number of shift registers (4), the second is the number of 
bits per register (100), the third is the power requirement (.125), the fourth states 
what the maximum number of folds in the shift register should be (23), and the 
last parameter states the maximum size we want listed in the table. Concerning this 
last parameter, the program will generate all possible candidates meeting the other 
parameters, but will only list those candidates whose x dimensions are less than the 
given x limit (1000) and whose y dimensions are less than the y limit (1600). ICL 
will print the following table: 
CLASS:3 ROWS/BIT:15 
CLASS:3 ROWS/BIT:17 
CLASS:3 ROWS/BIT:19 
CLASS:4 ROWS/BIT:11 
CLASS:4 ROWS/BIT:13 
CLASS:4 ROWS/BIT:15 
CLASS:5 ROWS/BIT:3 
CLASS:5 ROWS/BIT:5 

SIZE:535.#991. 
SIZE:452.#1095. 
SIZE:452.#1133. 
SIZE:875.#523. 
SIZE:665.#731. 
SIZE:539.#839. 
SIZE:455.#1147. 
None of the candidates fit into the area we requested, but there are five entries in 
the table which are the approximate size we require. We could also make a plot 
showing these candidate sizes by using the GRAPH function, which takes the same 
Fig. 3-7: Graph of Size Possibilities (horizontal axis: X Dimension) 
This graph is shown in figure 3-7. To actually create the layout, we would call the 
SHIFT_CELL function. The following code generates five separate shift register 
arrays differing in the area costs. The desired area for all arrays is 500#800. The 
first array requires that the y dimension is fixed while the x dimension is free to 
vary. The second array fixes the x dimension and allows the y dimension to grow. 
The third array allows x and y to vary equally. The fourth array has the x 
dimension costing a bit more than the y dimension, but both are free to vary. The 
final array uses more power, but fits within the 500#800 space requirement. 
Figure 3-8 shows the metal2 layer for each of these arrays. 
VAR ARRAY1, ARRAY2, ARRAY3, ARRAY4, ARRAY5=MRG; 




ARRAY5:=SHIFT_CELL(4,100,.25,500#800,1#1); 
ARRAY1 ARRAY2 ARRAY3 ARRAY4 ARRAY5 
Fig. 3-8: Five Shift Array Candidates 
3.3: Conclusions 
We have seen the description of an imbedded language system and how this system 
can be used to construct integrated circuit layouts. We have also seen the benefits 
of using imbedded languages to design chips. 
One of the advantages of imbedded language systems is that they allow the user to 
design a whole family of cell layouts at one time. Based upon parameters given to 
the cell program, the program will compute and generate the correct layout for the 
particular usage of the cell. This emphasizes the similarities between members of 
the cell family. This also reduces the number of cells in the cell libraries: The cell 
program is saved rather than the many cell instances. 
The cell parameters typically refer to behavioral information, not geometrical 
information. Our shift register is parametrized in terms of number of bits and 
power requirements, not inter-cell spacings and transistor sizes. This allows for 
partitioning of the design. The user of the cell thinks in terms of parameters 
interesting to him, and he does not have to know the details of the cell 
implementation. 
Along these same lines, parametrized cells delay the binding of design decisions. 
The shift register program was implemented before the power or area requirements 
were known. Also, since this cell is now flexible, the entire chip layout can be 
designed before the power requirements are known. When the requirements 
change, a few simple parameter changes will completely correct the layout. 
The task of making design decisions is also aided with parametrized cells and chips. 
When the cells are parametrized in the manner presented here, the user can alter 
the design parameters and actually see the effects these decisions have upon the 
design. The designer does not have to guess; the actual results can be seen. 
Parametrized cells tend to encompass much larger functions than fixed cells. 
Parametrized cells are usually a complete function, whereas fixed cells tend to be 
rather small pieces of layouts which must be combined to construct a function. 
Since fixed cells cannot reconfigure themselves depending upon how they are 
used, large fixed cells are not frequent, since it is rare that a large function will be 
used identically in many places. Parametrized cells can reconfigure themselves, so 
similar uses of a function can efficiently use the same cell. 
With imbedded languages, we are not designing chips as purely graphical data. We 
have the freedom to add additional information to our cells, information which can 
further aid the design process. In the next chapter, we explore some of these 
possibilities. 
Chapter 4: Chip Assemblers 
In the previous chapters, we have reviewed methods for generating leaf cells, 
which is only the first step to designing a chip. To complete the design of a chip, 
we need to generate the composition cells which interconnect the leaf cells. The 
task of interconnecting leaf cells is much harder than the generation of the leaf 
cells. The leaf cells are typically small, self-contained units which can be 
completely defined. Composition cells, on the other hand, deal with global 
information, and are fairly large, complex assemblies at the higher levels of the 
chip hierarchy. In this chapter we will explore some of the tools which can aid in 
the interconnection of the leaf cells [29]. 
4.1: Cell Composition 
There are three phases of generating composition cells. The first phase deals with 
the specification of the interconnection between the cells: how should the cells be 
wired together? The second phase deals with the generation of the geometrical 
primitives required to interconnect the cells. The final phase deals with 
verification: was the interconnection specification correct, or did we just short VDD 
and GROUND? 
Each interconnection methodology presents unique constraints upon these three 
phases of cell composition. In some interconnection strategies, the interconnection 
specification is implied by the cells themselves, freeing the user from the task of 
writing an interconnection list. Other techniques do not require wires to perform 
the interconnection, so the generation phase may be trivial. 
Every interconnection methodology, however, should have a checking phase. Most 
of the errors in chip design have to do with erroneous interconnection of modules, 
virtually all of which would be caught by the checking phase of the 
interconnection. By TYPEing the connections to a cell, one can later verify that the 
connector was connected to the proper signal. For instance, one would not like to 
connect two outputs together. By adding this information to the layout 
representation, it can easily be verified that outputs do not connect to other 
outputs. 
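A check of this kind is straightforward once each connector carries a type. The Python sketch below flags any signal driven by more than one output; the tuple layout, the kind names "input" and "output", and the function name are assumptions made for illustration and do not describe the representation used in the systems of this chapter.

```python
from collections import defaultdict


def multiply_driven(connections):
    """connections: iterable of (signal, cell, connector, kind) tuples,
    where kind is 'input' or 'output'. Returns the signals that are
    driven by more than one output, with the offending connectors."""
    drivers = defaultdict(list)
    for signal, cell, connector, kind in connections:
        if kind == "output":
            drivers[signal].append(f"{cell}.{connector}")
    return {s: d for s, d in drivers.items() if len(d) > 1}


netlist = [
    ("DATA", "A", "OUT", "output"),
    ("DATA", "B", "OUT", "output"),   # second driver on the same signal
    ("DATA", "C", "IN", "input"),
    ("CLK", "GEN", "PHI1", "output"),
    ("CLK", "A", "CLOCK", "input"),
]

errors = multiply_driven(netlist)
```

The same table of typed connectors could support other checks, such as signals with no driver at all, without any change to the layout data.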
For the composition systems presented in this chapter, we will assume that the chip 
floorplan is a slicing type floorplan, as presented in Chapter 2. 
4.2: Power Routing 
Power signals are special signals in integrated circuits. They cannot be routed as 
ordinary data signals, due to the finite resistance and current limits of the wires. 
Therefore, a strategy should be developed to deal specifically with power wires. 
The first requirement that one might state about power lines is that they should 
always run in metal from very close to the transistor terminals to the edge of the 
chip. With two polarities of power lines in NMOS design, this means that some 
planning must be done before the cell design is begun. Without this planning stage, 
the power requirements may be impossible to satisfy. 
We can analyse the structure of NMOS design to develop a general model of power 
routing [ 14]. In specific cases, special purpose power routing schemes are used, but 
in the general case, the following power routing scheme has been shown to produce 
close to optimal designs. We define a cell to have not only a rectangular outline, 
but also to have a VDD terminal in the North-East corner of the cell and a Ground 
terminal in the South-West corner of the cell. The cell must also contain power 
consumption information, so that the power lines can be made of the appropriate 
width. 
We will place the following conventions upon the definition of the VDD and 
Ground points. To properly connect power to the cell, we need to touch the VDD 
point with a metal VDD box and to touch the Ground point with a metal Ground 
box. We are free to run Ground lines anywhere along the bottom edge of the cell, 
up to the Ground point, or we may run metal Ground lines anywhere along the left 
edge of the cell, up to the Ground point. Similar statements can be made about 
running VDD lines. Figure 4-1 illustrates these conventions. The first example has 
the power lines running horizontally while the second example routes the lines 
vertically. 
We may define a datatype CELL which encapsulates the information needed for 
























The CELL has a name, layout, the two power points, and a power consumption 
variable. We will represent the power consumption by a REAL number which 
indicates the effective conductance (reciprocal of resistance) of the internal 
circuitry. A pre-defined procedure WIDTH converts this conductance into the 
minimum wire width needed to supply the required power. 
We have stated that we will use the slicing floorplan for our chips. As shown in 
Chapter 2, this means that all possible chip floorplans can be implemented as a 
hierarchy of binary cell fusions. If we write routines which will properly 
interconnect two cells in any legal configuration, we will be able to route the 
power for any slicing chip whose cells use the two-point power convention. We 
may recall that there are precisely two legal configurations of two cells in the 
slicing floorplan: one cell may be to the right of the other, or one cell may be above 
the other. We will call these two orientations HORIZONTAL and VERTICAL, 
respectively. 
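The reduction of a slicing floorplan to binary fusions can be sketched as a short recursive routine. The Python below is a simplification under stated assumptions: cells are reduced to a bounding size and a conductance, no space is reserved for the power boxes or other wiring, and the names `Cell`, `fuse`, and `assemble` are hypothetical. The one property taken from the text is that the conductances of the two cells add, as in WIDTH(left.POWER+right.POWER).

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    width: float
    height: float
    power: float      # effective conductance of the internal circuitry


def fuse(a, b, orientation):
    """Fuse two cells; b is to the right of a (HORIZONTAL) or above a (VERTICAL)."""
    if orientation == "HORIZONTAL":
        width, height = a.width + b.width, max(a.height, b.height)
    elif orientation == "VERTICAL":
        width, height = max(a.width, b.width), a.height + b.height
    else:
        raise ValueError(f"unknown orientation: {orientation}")
    # The composition cell must supply both halves, so conductances add.
    return Cell(f"({a.name} {orientation[0]} {b.name})", width, height,
                a.power + b.power)


def assemble(plan):
    """plan is either a Cell or a triple (orientation, first_plan, second_plan)."""
    if isinstance(plan, Cell):
        return plan
    orientation, first, second = plan
    return fuse(assemble(first), assemble(second), orientation)


alu = Cell("ALU", 40, 30, 0.5)
reg = Cell("REG", 20, 30, 0.25)
rom = Cell("ROM", 60, 10, 0.125)

chip = assemble(("VERTICAL", ("HORIZONTAL", alu, reg), rom))
```

Because every composition is itself a cell with the same fields, the routine that fuses two leaf cells also fuses two compositions, which is what allows a single pair of fusion routines to cover the whole chip.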
Let us consider the horizontal case. Given two cells that have already been given 
appropriate relative positions, how do we connect the power lines? Figure 4-2 
gives an example of how this might be done. In the figure, we route boxes from 
the power points to a larger power box which is a suitable distance from the two 
cells. The widths of the two vertical power boxes connected to the left cell are W1, 
which is equal to WIDTH(left.POWER). Similarly, the right cell's power box 
widths are W2, which is WIDTH(right.POWER). The widths of the large power 
boxes are W3, which is WIDTH(left.POWER+right.POWER). Thus, the vertical 
boxes are wide enough to supply power to one of the cells, while the horizontal 
boxes are wide enough to supply power to both cells. 
It may be noticed that the layout of the power boxes shown in figure 4-2 is fairly 
inefficient. We will now produce more efficient routings. Consider the vertical 
VDD box of either cell. We have its lower right corner touching the power point of 
the cell. If we had the lower left corner touch the power point, the power box may 
extend past the cell's bounding box if the power requirement is large, as shown in 
figure 4-3a. If we always lined the right edge of the power box with the cell's 
bounding box, the box would never extend past the cell's bounding box, but the 
power box may not touch the power point, as shown in figure 4-3b. To efficiently 
align the power box, we need to examine the power box width and power point 
location. If the power box width is less than the distance from the power point to 
the cell's bounding box, we will align the box to the power point (fig. 4-3c). If the 
power box width is greater than this distance, we align the power box to the cell's 
bounding box (fig. 4-3d). 
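The alignment rule reduces to a single comparison per box. The Python sketch below computes the left edge of the vertical VDD box, and, by mirror image, of the vertical Ground box at the opposite corner. The function and parameter names are hypothetical; only the comparison itself follows the rule just described.

```python
def vdd_box_left_edge(mbb_high_x, vdd_x, width):
    """Left edge of the vertical VDD box for a cell whose bounding box ends
    at mbb_high_x and whose VDD point is at vdd_x."""
    if mbb_high_x - width < vdd_x:
        # Box is wider than the gap: align its right edge to the bounding box.
        return mbb_high_x - width
    # Box is narrower than the gap: align its left edge to the power point.
    return vdd_x


def gnd_box_left_edge(mbb_low_x, gnd_x, width):
    """Mirror-image rule for the Ground box at the lower left corner."""
    if mbb_low_x + width > gnd_x:
        # Box is wider than the gap: align its left edge to the bounding box.
        return mbb_low_x
    # Box is narrower than the gap: align its right edge to the power point.
    return gnd_x - width


narrow = vdd_box_left_edge(mbb_high_x=100, vdd_x=90, width=4)
wide = vdd_box_left_edge(mbb_high_x=100, vdd_x=90, width=25)
```

In both cases the resulting box touches the power point and stays inside the cell's bounding box: the narrow box spans 90 to 94, and the wide box spans 75 to 100.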
Next, let us consider the position of the horizontal power box. In figure 4-2, it was 
Fig. 4-3: Alignment of Vertical Boxes 
geometry within the cells. On the other hand, our power routing convention states 
that we may run any VDD boxes we wish above the cell, as long as the box stays 
above the VDD point. Hence, what we might do is lower the VDD box until it just 
rests upon either of the two VDD points, whichever is higher. Figure 4-4 shows 
the only possible situations. If the left VDD point is above the right VDD point, the 
horizontal box rests upon the left's VDD point. Similarly, if the right's point is 
higher, the box rests upon the right's VDD point. If both points have the same 
height, the box rests on both. Notice in the first case that the left's vertical power 
box is not required, since the horizontal box completely overlaps the area where the 
vertical box would be. The second case does not require the right power box, and 
the third case requires neither. 
Left Higher Right Higher Same Height 
Fig. 4-4: Positioning Horizontal Box 
To complete the routing of the VDD lines, we need to determine where the VDD 
point for the composition cell should be. The definition of the power point is that 
we may route any VDD boxes to the right or above the specified point. The x 
component of the point can be determined solely from the right cell. The right
cell's VDD point stated where we could run boxes over the right cell. This same x
coordinate can be used for the composition cell. For the y coordinate, we need to
examine both the left and the right cells, but again, they have given us acceptable
values for running horizontal wires. We need only satisfy both cells' requirements.
This is done by using the larger of the two cells' VDD point y values. Since the left
cell's VDD point x is always less than the right cell's VDD point x, we can state that
the new VDD point is simply the maximum of the left and right cells' VDD points.
The analysis of the VDD boxes can be used to analyze the Ground boxes, with
appropriate sign changes. We can now code the routine for horizontally fusing two
cells.
DEFINE HORIZONTAL_POWER_BOXES(L,R:CELL NAME:QS)=CELL:
BEGIN VAR W1,W2,W3,X1,X2,X3,X4,VDDY,GNDY=REAL;
DO W1:=WIDTH(L.POWER);
W2:=WIDTH(R.POWER);
W3:=WIDTH(L.POWER+R.POWER);
X1:=MBB(L.LAYOUT).HIGH.X;
X1:= IF X1-W1<L.VDD.X THEN X1-W1 ELSE L.VDD.X FI;
X2:=MBB(R.LAYOUT).HIGH.X;
X2:= IF X2-W2<R.VDD.X THEN X2-W2 ELSE R.VDD.X FI;
X3:=MBB(L.LAYOUT).LOW.X;
X3:= IF X3+W1>L.GND.X THEN X3 ELSE L.GND.X-W1 FI;
X4:=MBB(R.LAYOUT).LOW.X;
X4:= IF X4+W2>R.GND.X THEN X4 ELSE R.GND.X-W2 FI;
VDDY:= L.VDD.Y MAX R.VDD.Y;
GNDY:= L.GND.Y MIN R.GND.Y;





BOX(BLUE,X1#VDDY\TO X2+W2#VDDY+W3);
BOX(BLUE,X3#GNDY-W3\TO X4+W2#GNDY);
IF VDDY\IS_CLOSE_TO L.VDD.Y THEN
IF VDDY\IS_CLOSE_TO R.VDD.Y THEN NIL
ELSE BOX(BLUE,X2#R.VDD.Y\TO X2+W2#VDDY+W3) FI
ELSE BOX(BLUE,X1#L.VDD.Y\TO X1+W1#VDDY+W3) FI;
IF GNDY\IS_CLOSE_TO L.GND.Y THEN
IF GNDY\IS_CLOSE_TO R.GND.Y THEN NIL
ELSE BOX(BLUE,X4#GNDY-W3\TO X4+W2#R.GND.Y) FI
ELSE BOX(BLUE,X3#GNDY-W3\TO X3+W1#L.GND.Y) FI}
VDD: L.VDD MAX R.VDD
GND: L.GND MIN R.GND
POWER: L.POWER+R.POWER]
Figure 4-5 shows the resulting layout. The routine for vertical fusion is similar to
the horizontal routine. 
Fig. 4-5: Completed Power Connections 
One final observation. We have located the VDD point and Ground point within the
cell boundaries. Also, when we produced the composition cell, we kept these points
well within the boundaries of the new cell. Why was this done? To conserve area.
The higher levels in the chip hierarchy can share this power channel with the
route done at this level. If another cell were added to the left of our composition
cell, a larger power box would overlap the horizontal power box drawn for this
composition cell. Overlapping the boxes does not cause problems, because the larger
power box is wide enough to supply power for all three of the cells. Figure 4-6
contrasts the layout produced when the power points are inside the cell to the
layout produced when the power points are at the corners of the cells.
Interior Power Points Exterior Power Points 
Fig. 4-6: Hierarchically Sharing Boxes 
4.3: Composition Methods 
We will now look at some of the data line interconnection philosophies, and notice
what requirements are made upon the three phases of cell composition.
4.3.1: Cell Abutment 
The simplest interconnection philosophy is that of cell abutment. In this style of 
composition, interconnection between cells is accomplished merely by abutting the
two cells [24][27]. It is assumed that the interconnection points of the two cells
are in precisely the correct position so that simple abutment properly connects each
pair of ports. Figure 4-7 illustrates this concept. Here we wish to join cells A and
B, with A 'to the left' of B. Given the bounding box information from the two cells,
we can automatically position the two cells to get the interconnection. The 





Fig. 4-7: Cell Abutment 
DEFINE ABUTT_HORIZONTAL(A,B:CELL NAME:NAME)=CELL:
DO B::=\AT A.LAYOUT\MBB\LR - B.LAYOUT\MBB\LL;
GIVE HORIZONTAL_POWER_BOXES(A,B,NAME)
ENDDEFN
DEFINE ABUTT_VERTICAL(A,B:CELL NAME:NAME)=CELL:
DO B::=\AT A.LAYOUT\MBB\UL-B.LAYOUT\MBB\LL;
GIVE VERTICAL_POWER_BOXES(A,B,NAME)
ENDDEFN
DEFINE AT(C:CELL P:POINT)=CELL:
ENDDEFN





DEFINE LL(B:BOX)=POINT: B.LOW MIN B.HIGH
DEFINE UR(B:BOX)=POINT: B.LOW MAX B.HIGH
DEFINE LR(B:BOX)=POINT: UR(B).X # LL(B).Y





These abutment routines will handle the composition of two cells. Notice that the
specification phase is trivial: we only specify which two cells to fuse, and in which
order. Similarly, the generation phase is trivial: we need only position one cell
relative to the other, then call our power box routines. On the other hand, we have
done no verification of the design. We have no idea whether the implied
connection locations of the two cells line up. This little piece of checking, if
rigorously applied at all levels of the design, will catch most of the design errors.
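To make the generation phase concrete, the following Python sketch mirrors the corner functions and the displacement used for horizontal abutment: the right cell is moved so that the lower left corner of its bounding box lands on the lower right corner of the left cell's bounding box. The `Point` and `Box` types and the function names are assumptions made for this illustration, not part of the ICL system.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


class Box(NamedTuple):
    low: Point
    high: Point


def ll(b: Box) -> Point:
    """Lower left corner: component-wise minimum of the two corners."""
    return Point(min(b.low.x, b.high.x), min(b.low.y, b.high.y))


def ur(b: Box) -> Point:
    """Upper right corner: component-wise maximum of the two corners."""
    return Point(max(b.low.x, b.high.x), max(b.low.y, b.high.y))


def lr(b: Box) -> Point:
    """Lower right corner: x from the upper right, y from the lower left."""
    return Point(ur(b).x, ll(b).y)


def horizontal_abut_offset(a: Box, b: Box) -> Point:
    """Displacement that places b immediately to the right of a."""
    return Point(lr(a).x - ll(b).x, lr(a).y - ll(b).y)


def translate(b: Box, d: Point) -> Box:
    """Move a box by the displacement d."""
    return Box(Point(b.low.x + d.x, b.low.y + d.y),
               Point(b.high.x + d.x, b.high.y + d.y))
```

After `translate(b, horizontal_abut_offset(a, b))`, the two bounding boxes share a vertical edge and their bottoms are aligned. Nothing here checks that any connectors actually meet.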
To add the verification system to the existing cell system would require a large
program that would analyse the layout portions of the two cells, extracting the
circuit information. The program would then have to verify that the composition
of the two circuits is still a valid circuit. This is a very awkward way of determining
the port configuration of a cell. This is like writing a software program which
examines a core dump to see if all subroutine linkages are correct. A more logical
approach would be to have the user specify the intended port configuration of the
low-level cells. This information is trivial for the user to specify, since he has to
generate this information for the cell documentation. Rather than keeping the port
information in the cell documentation, we will keep the information with the cell
in machine-readable form, and use it to verify the composition of the cells.
What sorts of information would we need in the ports of a cell? Obvious data are
location and layer. To aid the user in examining a cell, we may want to add a name
to each connector. These names could convey the intent of the signal. We would
also like to know if a connector was an input, output, or bidirectional signal. With
this information, we can verify that inputs connect to outputs, and that
bidirectional signals connect to bidirectional signals. These three types of signals are
not inclusive, but they will suffice to illustrate the point. In addition to the
direction of the signal, we would also like to know when the signal is valid. Even
if we have connected an output to an input, if the output is only valid when the
clock is high and the input only samples when the clock is low, we have a design
error. We will add timing information to the connectors. Using a simplified
two-phase clock model, we can have signals valid during PHI-1, PHI-2, or always
valid. Finally, we would like to know if we have connected an incredibly large
load on a frail driver. For the purposes of this discussion, we will model the load
and drive capabilities of connectors by REAL numbers. When we connect two
connectors, we wish that the sum of the drives exceeds the sum of the loads. The
















CONNECTORS= { CONNECTOR };
If we add a CONNECTORS component to our cell definition, our cell designers can
append this connector information directly to the other information about the cell.
Due to the implied conventions regarding connectors, we know that all connectors
must lie on the perimeter of the cell, and that the connectors cannot be on the metal
layer (because the power boxes may run in metal).
To complete the connector addition to our data structures, there are a few routines
which must be modified. When we move a cell with the AT routine, we must also
move the connection points. Secondly, when we abut two cells, we must verify
that the connectors line up and have the proper characteristics. Finally, we must
extend connectors so that they lie on the perimeter of the new cell. When we add
the power boxes, the boxes may extend the bounding box of the cell. If this
happens, our connectors will no longer be on the perimeter of the cell. We check,
therefore, and if a connector no longer lies on the perimeter, we will move the
connector and draw a wire of the appropriate color from the old to new points.
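The checks applied to a pair of abutting connectors might look like the following Python sketch. The field names, the string encodings of direction and phase, and the exact timing rule (an always-valid producer satisfies any consumer; otherwise the phases must match) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Connector:
    name: str
    position: float   # coordinate along the shared edge
    layer: str
    direction: str    # "in", "out", or "bidir"
    phase: str        # "PHI1", "PHI2", or "ALWAYS"
    load: float
    drive: float


def check_pair(a: Connector, b: Connector) -> List[str]:
    """Return the list of problems found when a and b are abutted."""
    errors = []
    if a.position != b.position:
        errors.append("position")
    if a.layer != b.layer:
        errors.append("layer")
    directions = {a.direction, b.direction}
    if directions == {"in", "out"}:
        producer, consumer = (a, b) if a.direction == "out" else (b, a)
        if producer.phase != "ALWAYS" and producer.phase != consumer.phase:
            errors.append("timing")
    elif directions == {"bidir"}:
        if a.phase != b.phase:
            errors.append("timing")
    else:
        errors.append("direction")
    # The sum of the drives should exceed the sum of the loads.
    if a.drive + b.drive <= a.load + b.load:
        errors.append("drive")
    return errors
```

An output valid only during PHI-1 that abuts an input sampling during PHI-2 yields `["timing"]`, which is the design error described in the text.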
This cell abutment technique is a very layout-efficient interconnection technique.
Since the interconnection requires no area, the interconnection is as efficient as
possible. On the other hand, this is not a very general technique. The only time
when cells abut is when they were designed to abut, which makes for a very rigid
system. If any of the cells change, several neighboring cells may also have to be
changed. One would use abutment in special cases, when the set of cells is small
and well defined.
4.3.2: Cell Stretching 
A second composition methodology is very similar to the cell abutment approach.
Suppose that we wish to simply abut two cells, but the connectors are not at the
same positions. To avoid generating wires to perform the interconnection, we need
to convert the original cells into cells which can simply abut, which means we
need to arrange the connectors to be in the same positions. This is done by cell
stretching. Consider figure 4-8. Here we have two cells whose connectors are in
the same order, on the same mask layers, but in different positions. To align the 'A'
connectors, we need to increase the distance between the bottom of the right cell
and connector 'A'. We cannot decrease the distance between the bottom of the left
cell and connector 'A' because presumably the left cell was designed to have these
distances minimized. Hence, we stretch out the right cell as shown in figure 4-8b.
Next, we need to align the 'B' connectors. We stretch out the left cell, as shown in
fig. 4-8c. This process continues until all of the connectors have the same
positions, at which point we can call the abut routines to connect the cells.
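The alignment process can be expressed as a single pass over the connectors from bottom to top. Since a cell may only be stretched, never compressed, each gap between successive aligned connectors is the larger of the two cells' gaps. The Python sketch below assumes connector heights are measured from the bottom of each cell and listed bottom to top; the function name is hypothetical.

```python
from typing import List, Sequence


def stretch_to_align(left: Sequence[float],
                     right: Sequence[float]) -> List[float]:
    """Return the common connector heights after stretching both cells."""
    if len(left) != len(right):
        raise ValueError("cells must have the same number of connectors")
    aligned = []
    prev_left = prev_right = prev_aligned = 0.0
    for l, r in zip(left, right):
        # Each cell keeps at least its original gap below this connector.
        gap = max(l - prev_left, r - prev_right)
        prev_aligned += gap
        aligned.append(prev_aligned)
        prev_left, prev_right = l, r
    return aligned
```

With `left = [2, 5]` and `right = [4, 6]`, the result is `[4, 7]`: the left cell is stretched below 'A', and the right cell is stretched between 'A' and 'B'. Appending each cell's total height as a final entry shows how the stretched pair can end up taller than either original cell.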
Initial    A Aligned    B Aligned    Final
Fig. 4-8: Cell Stretching 
This approach has the interconnection program reaching inside the subcells,
modifying the layout, to perform the interconnection. External stretching is a very
dangerous thing to do: by arbitrarily modifying a cell's layout, the electrical
properties of the cell will change, and the cell may cease to function. Rather, one
should design the cell to respond to requests to stretch. The system would ask the
cell to move a connector, and the cell would be responsible for generating the new
layout. In this manner, the cell can monitor changes in the performance of the
circuitry, and correct for the cell stretching.
It may seem that this approach is wasteful, because cells are deliberately expanded
to take more room, creating a larger chip. In actual fact, smaller chips can result
from stretching. The space lost at the low level by stretching may be more than
compensated for globally because the wiring cells are not needed. Similarly,
stretching may increase the loads on some signal lines, so it would seem that
performance would suffer. On the other hand, the routing required between cells
degrades the performance of those wires. So stretching the cells may actually
increase the performance of the system from a global standpoint, even though local
performance has suffered. Finally, by stretching two cells to fit, the resulting
layout might be much greater in the stretch direction than either of the two
original cells, as the example in figure 4-8 shows. These arguments illustrate the
dangers of arbitrarily stretching cells, but there are well-defined cases where
stretching does pay off.
4.3.3: River Routing 
In the cell-stretching interconnection scheme, we fused cells with connectors in
the same order but in different positions. We stretched the cells so that the
connectors were in the same positions. Alternatively, we can draw wires to
perform the interconnection. Since the two sets of connectors are in the same order,
the wires that we draw do not have to cross. A routing between cells where wires
do not cross is called a 'River Route'. Figure 4-9 shows a river route between two
cells. A very simple algorithm for generating a river route follows. Draw wires
from each connector on the left cell over one unit. Then, as long as all connectors
are not in the proper position to connect to the right cell, draw wires from the new
connector positions up or down, coming as close to the final height as possible
without getting too close to neighboring wires. This process of moving to the side
one unit, then approaching the desired height, continues until all wires are at the
appropriate positions. Once this is done, the two cells can be fused using the
standard abutment routine.
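The simple algorithm just described can be sketched as follows. This Python version is an illustration under stated assumptions, not the router of Appendix 3: wires are listed bottom to top, both the starting and target heights already respect the minimum spacing, and in each column a wire may move only as far as the previous position of the neighbor in its direction of travel allows. All names are hypothetical.

```python
from typing import List, Sequence


def river_route(start: Sequence[float], target: Sequence[float],
                spacing: float = 1.0) -> List[List[float]]:
    """Return the wire heights in each successive column of the route."""
    n = len(start)
    if n != len(target):
        raise ValueError("both cells need the same number of connectors")
    for seq in (start, target):
        if any(seq[i + 1] - seq[i] < spacing for i in range(n - 1)):
            raise ValueError("connectors are out of order or too close")
    current = list(start)
    goal = list(target)
    columns = [list(current)]
    while current != goal:
        nxt = []
        for i, (y, t) in enumerate(zip(current, goal)):
            if t > y:
                # Moving up: stay clear of the wire above.
                limit = current[i + 1] - spacing if i + 1 < n else t
                nxt.append(min(t, limit))
            elif t < y:
                # Moving down: stay clear of the wire below.
                limit = current[i - 1] + spacing if i > 0 else t
                nxt.append(max(t, limit))
            else:
                nxt.append(y)
        if nxt == current:
            raise RuntimeError("no progress; inputs violate the assumptions")
        columns.append(nxt)
        current = nxt
    return columns
```

Each entry of the result corresponds to one step 'to the side'. The number of columns is therefore a measure of the channel width consumed by the route.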
Fig. 4-9: River Routing 
The river routing scheme is topologically identical to the stretching and abutment
schemes. Because of this, the interconnection requirements are similar to the
requirements of the other schemes. We do not need to specify the interconnection
list, because this information is implied from the cells. We have mentioned one
algorithm for generating the interconnection wires. Finally, the interconnection is
verified using the simple abutment routine.
The river routing interconnection scheme is more generally useful than either
stretching or abutting, since the connector positions are free to move without
drastically affecting the cell size or performance. The connectors are still restricted
to being in the same order and on a single mask layer. River routers are useful in
chip assemblers, however, because there are cases where the connectors are in the
proper order and on the proper layers, but not at the proper positions. For example,
if the user connects buffers to each connector on a particular side of a cell, the
buffer cell can be designed to have the appropriate number of connectors in the
correct order so that the cells can be river-routed together.
There are several schemes for improving and generalizing the river route process. 
Appendix 3 discusses river routes in some detail. 
4.3.4: One-sided General Interconnect 
In each of the wiring methodologies presented above, the connectors of the two
cells were required to be in the proper order on the proper layers. For general
purpose cell composition, such is not the case. For the connectors to satisfy these
requirements, both cells would have been designed with the interface specification
known, so that the connectors can be put in the proper locations. This means that
the wiring is done inside the cells! The user has to do the wiring by hand. There is
also a one-to-one correspondence between connectors of the two cells, which is a
serious limitation on the interconnectability of cells.
A more general interconnection scheme would permit arbitrary interconnections
between the signals on adjacent edges of cells [5]. The user would specify the
interconnections as net-lists, which are lists of connectors to be connected together.
Using this style of interconnection, the user is required to specify the
interconnection information, whereas the previously presented methods implied
the interconnection information.
An example of a general interconnection is shown in figure 4-10. We no longer 
restrict the connectors' layers or positions. We do not require that there be the same 
number of connectors on the two cells. The only requirement is that the 
interconnections between two cells have the connectors on the edges between the 
cells. 
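A net-list specification and the one remaining requirement can be illustrated with a short Python sketch. The representation chosen here (a net as a list of cell-name and connector-name pairs, and a table of the connectors found on each cell's facing edge) is a hypothetical encoding used only for this example.

```python
from typing import Dict, List, Set, Tuple

Net = List[Tuple[str, str]]   # (cell name, connector name) pairs


def off_edge_connectors(nets: List[Net],
                        shared_edge: Dict[str, Set[str]]
                        ) -> List[Tuple[str, str]]:
    """List every connector in the net-list that is not on the shared edge.

    shared_edge maps a cell name to the names of the connectors on the
    edge of that cell which faces the other cell.
    """
    return [(cell, name)
            for net in nets
            for (cell, name) in net
            if name not in shared_edge.get(cell, set())]
```

An empty result means the specification can be handled by a one-sided interconnect. Note that a net may join more than two connectors, and the two cells need not have the same number of connectors.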
An advantage of this interconnection technique is that the design of the chip can
easily be partitioned. The two cells can be designed by independent design teams
given only a functional specification of the interface between the cells. Also, if a
cell is redesigned, the interconnection program is re-run with the original
specification and the new composition cell is complete.
One of the disadvantages of this technique is that the user has to specify the
interconnection between the two cells. This can be a fairly large specification if
there are many connectors on the cells. Also, the possibility of errors requires
Fig. 4-10: General Interconnection 
checking of the specification. Signal typing will catch most of the dumb mistakes,
but many of the logical errors can only be caught by checking the specifications.
Another disadvantage is that this style of interconnection consumes more chip area 
than the other approaches. Because of these disadvantages, one would like to use 
the stretching and river routing techniques where they logically fit, and reserve 
the general interconnection schemes for the remaining routes. 
4.3.5: Four-sided General Interconnect 
In the one-sided general interconnector, we require that all interconnections
between adjacent cells use connectors on the shared edge of the cells. While this
technique may be useful in many circumstances, there are times when the
connectors do not lie between the cells. Figure 4-11 shows a route which connects
to signals on the North and South edges of the cells, in addition to the shared edges
of the cells. This style of interconnection is termed 'Four-sided interconnect', since
the connectors may be on any of the four sides of a cell [5].
There exists a technique which converts the four-sided interconnect problem into a
series of one-sided interconnections. This means that the four-sided
interconnection can be as time and area efficient as the one-sided interconnect, but
that the generality of the four-sided interconnect can be capitalized upon. In figure
4-12, we show three steps in the fusion of cells. In this figure, we perform all of
Fig. 4-11: Four Sided Interconnection 
the interconnections at one level before moving to the next level. We perform an
Immediate fusion of the two cells. When we do this, some of our interconnecting
wires must route out of the channel between the two cells. For example, in
figure 4-12a, some of the wires route on the east sides of the two cells, which is
the channel between cells in figure 4-12b. The first two cells have taken channel
area from the next higher level. This higher level channel route cannot share the
area used in this lower level route. If, instead of routing outside the channel, we
only routed inside the channel, but kept a list of incomplete connections, we could
share the channels for the various levels in the hierarchical fusion. In figure 4-13,
we show the same interconnection, but with the Delayed technique. We have only
routed in the channel, but kept the incomplete routes with the composition cell.
When we go to fuse this cell to neighboring cells, we add these incomplete routes to
the routes required by the new interconnection and route all of the wires in the























Fig. 4-13: Delayed Interconnect 
4.4: Conclusions 
In VLSI design, the design of the glue which interfaces cells is considerably harder
than the design of the cells themselves. Much effort has gone into building systems
to aid in the construction of the cells, but the interconnect problem has largely been
ignored. In this chapter we have seen several techniques for fusing cells together.
A Chip Assembler which contains these interconnectors would greatly aid in the
design of large chips.
We have also introduced checking into the design of chips. Rather than analysing
the results of a chip design to verify the interconnection, we design layouts that are
correct by construction. The analysis style of verification becomes impractical as
chip sizes and densities increase. We must move to the synthesis technique of




Chapter 5: A Simple Silicon Compiler
To illustrate the concepts involved in silicon compilation, this chapter will develop 
a simple yet complete compiler. This compiler may be called the Random Logic 
Compiler: it is designed to compile TTL-style circuits. Following a discussion of the 
floorplan for this particular compiler, we will see the code for the chip assembler 
and silicon compiler. After this, we will explore some of the possible extensions 
which allow higher-level user specification of the design. 
A silicon compiler is a program which translates a high-level, behavioral chip
specification into the 'machine language' of silicon design: a set of VLSI masks. The
foundation of a silicon compiler is an Imbedded Language system. Within the
imbedded language, the structure of the compiler's floorplan is designed. The
floorplan is the logical and physical arrangement of circuitry that the compiler 
generates. Given this structure and the graphics language, procedures are written 
which generate the 'cells' or circuits to be used on the chips. These cells can take 
parameters and perform calculations as the layout is generated. These cells also 
generate logical information, such as the list of connection points, in addition to the 
actual physical information that describes the design. The user specification is used 
to provide the parameter values for the cell procedures. The compiler links these 
sublayouts together to complete the chip. 
5.1 The Floorplan 
The floorplan limits the capabilities of any compiler. The more limited or fixed the
floorplan, the smaller the class of compilable chips; the more relaxed or generalized
the floorplan, the broader the class. On the other hand, the more specific the
compiler, the more specialized it can be for a particular design style, which has
two-fold benefits: the resulting layouts are usually more optimized, and the
specifications for any particular chip are very concise.
For our example compiler, we want to generate arbitrary interconnections of NAND, 
NOR, and INVERT gates. These gates will be positioned horizontally in a single row, 
as illustrated in figure 5-1. The power lines will run along the top and bottom of 
the row, signal lines will run horizontally between the power lines, and the gates 




Fig. 5-1: RLC Floorplan 
Since we are not restricting the number of gates, nor the interconnection
possibilities, component locations cannot be fixed to exact physical locations. For
instance, the location of the upper power line cannot be fixed since the power line
width is related to the power consumed by the circuit, which is a function of the
number of gates in the circuit. Hence, unless we arbitrarily limit the number of
gates, we cannot state where the upper power line should be for all designs. These
positions can, however, be parametrized in terms of global variables. For our
compiler, the variable 'YVDD' will be set to the y-coordinate for the center of the
VDD line. All of our cells will be designed to use 'YVDD' when referring to features
associated with the VDD line, allowing us to position this line after we know how
many gates are needed in the circuit. Similarly, 'YGND' will be the y-coordinate for
the center of the ground line, and 'POWER' will be the width of the power lines.
In addition to the physical aspects of the floorplan as described above, we will need
conventions for communication of information between the cells and the compiler.
There is some information that the compiler needs which the cells compute, and
there is some information that the cells need which the compiler computes. In our
logic gate compiler, the procedures which generate each type of gate know where
the inputs and outputs of the gates should connect relative to the cell's origin,
while the compiler knows the origins for each cell. If the compiler were required
to compute the connection locations, the compiler would be tied to specific cell
implementations. One could not change a cell without having to change the
compiler as well, and verification of the changes would be a formidable task. For
the same reason, local cells should not have to generate information that belongs in
the compiler.
In the logic gate compiler, there are two bilateral communication paths that are
needed: the compiler gives each cell the x-coordinate of its origin, while the cells
report their width to the compiler, so that the compiler can compute the next
origin; the compiler assigns vertical position for each interconnection wire, but the
cells must give the endpoints of the wires based on where the wire connects inside
the cell. The first communication, involving cell origins, is done by direct
parameter passing. The gate procedures are passed a REAL number which the
procedures use for a horizontal origin. Each gate returns its width by setting a
global variable CWIDTH. The second communication, for interconnection positions,
is done through instances of a data type called PHYSICAL_WIRE. PHYSICAL_WIREs
receive y-values from the compiler. The gates can inspect this information in the
PHYSICAL_WIREs to determine which channel the wire uses. The gates may pass
x-values to the PHYSICAL_WIREs so that the wires will extend to the proper
horizontal positions.
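The two communication paths can be illustrated with a small Python sketch. Everything in it is hypothetical: the `PhysicalWire` class, the `inverter` gate and its dimensions (8, 2, and 10 units) are invented for the example, and the gate returns its new left edge as a value instead of setting a global such as CWIDTH.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PhysicalWire:
    height: float                  # y-value, assigned by the compiler
    left: float = float("inf")     # narrowed by the gates
    right: float = float("-inf")

    def connect(self, x: float) -> None:
        """Extend the wire so that it reaches horizontal position x."""
        self.left = min(self.left, x)
        self.right = max(self.right, x)


def inverter(inp: PhysicalWire, out: PhysicalWire, x: float) -> float:
    """Hypothetical gate placed with its right edge at x.

    The gate knows where its own ports are; it reports them to the wires
    and returns the x-coordinate of its left edge.
    """
    inp.connect(x - 8)
    out.connect(x - 2)
    return x - 10


def assemble(gates: List[Tuple[PhysicalWire, PhysicalWire]]) -> float:
    """Place gates right to left, starting at x = 0; return the left edge."""
    x = 0.0
    for inp, out in gates:
        x = inverter(inp, out, x)
    return x
```

After all gates have been placed, each wire's `left` and `right` fields span exactly the ports it must reach, so the interconnection wires can be drawn without the assembler knowing anything about the inside of a gate.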
5.2 Chip Assembler 
Having defined the conventions of the compiler, the cell generation routines may be 
written. The following code gives the implementation routines for the logic gate 
compiler: 
TYPE PHYSICAL_WIRE= [HEIGHT,LEFT,RIGHT:REAL NAME:QS];
PHYSICAL_WIRES= { PHYSICAL_WIRE };
VAR YVDD,YGND,PWIDTH,CWIDTH=REAL;
DEFINE CONNECT(WIRE:PHYSICAL_WIRE X:REAL):
@(WIRE).LEFT::= MIN X;
@(WIRE).RIGHT::= MAX X;
ENDDEFN
DEFINE PULLUP(OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
DO CONNECT(OUTPUT,X-2);
GIVE {BOX(RED,X-15#0\TO X-5#5);
BOX(YELLOW,X-15#-2.\TO X-5#9);
WIRE(GREEN,2,{X-13#YVDD;.#3;X-8#.;.#.-5;.+5#.;.#OUTPUT.HEIGHT});
GCB\AT {X-12#YVDD;X-2#OUTPUT.HEIGHT};
GRCBU\AT X-7#-1.}
ENDDEFN
DEFINE NAND(INPUTS:PHYSICAL_WIRES OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
BEGIN VAR IN=PHYSICAL_WIRE; NUMBER=INT; X2=REAL;
DO NUMBER:= +1 FOR IN $E INPUTS;;
X2:=X-10-2*NUMBER;
DO CONNECT(IN,X2); FOR IN $E INPUTS;
CWIDTH:=X2-5;
GIVE {GCB\AT X-8#YGND;
BOX(GREEN,X2+3#YGND-2\TO X-7#-1.);
{COLLECT {RCB\AT X2#IN.HEIGHT;
WIRE(RED,2,{X2#IN.HEIGHT;X-6#.})} FOR IN $E INPUTS;};
PULLUP(OUTPUT,X)}
END
ENDDEFN
DEFINE NOR(INPUTS:PHYSICAL_WIRES OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
BEGIN VAR IN=PHYSICAL_WIRE;
DO DO CONNECT(IN,X-16); FOR IN $E INPUTS;
CWIDTH:=X-24;




WIRE(GREEN,2,{X-8#YGND+PWIDTH/2+8;.#-2.});
{COLLECT {RCB\AT X-18#IN.HEIGHT;
WIRE(RED,2,{X-15#IN.HEIGHT+1;X-11#.;.#.+5});
WIRE(GREEN,2,{X-20#IN.HEIGHT+4;X-8#.})}
FOR IN $E INPUTS;};
PULLUP(OUTPUT,X)}
DEFINE INVERT(INPUTS:PHYSICAL_WIRES OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:




GIVE {GCB\AT X-8#YGND;
BOX(GREEN,X-5#YGND-2\TO X-7#-1.);
RCB\AT X-12#IN.HEIGHT;








Fig. 5-2: Pulse Synchronizer Circuit 
At this point, we have routines for implementing NAND, NOR, and INVERT gates. We
can assemble chips by generating the required PHYSICAL_WIREs, initializing
parameters in each wire, calling the appropriate gate functions, collecting the
resulting cells, and drawing the interconnection wires. The following example
illustrates how one could use our chip assembler for designing a 'pulse
synchronizer'. Figure 5-2 gives the logic diagram of the circuit. This code will




CWIDTH:=0;
VAR WIRES=PHYSICAL_WIRES; WIRE=PHYSICAL_WIRE;
WIRES:={[HEIGHT:-8. LEFT:-999999. RIGHT:-999999.];
[HEIGHT:-17. LEFT:-999999. RIGHT:-999999.];
[HEIGHT:-26. LEFT:-999999. RIGHT:-999999.];
[HEIGHT:-35. LEFT:-999999. RIGHT:-999999.];
[HEIGHT:-44. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-53. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-17. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-44. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-17. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-52. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-35. LEFT: 999999. RIGHT: 999999.];
[HEIGHT:-17. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-8. LEFT: 999999. RIGHT:-999999.];
[HEIGHT:-8. LEFT: 999999. RIGHT: 999999.]};
VAR RESULT=MRG;
RESULT: =NANO ( !lJl RES (111 I , l.11 RES [143, Cl-JI DTH}; 
RESULT: :=\UNION NANOCIWIRES(3J;WIRES(SJ! ,WIRES[llJ,CWIDTHI: 
f\ESUL T: : =\UN l ON NANO ( !l-1 I RES (81 ; 1.J IRES [l 0J ; kl I RES [l lJ I , ~JI RES [8J , Cl-I IDTH I ; 
RESULT:: =\UNI ON NANO ( {l.JJ RES (51; 1-11 RES [9]; ~JI RES (13] l , ~II RES [10], CW OTHI; 
RESULT: : =\UN I ON NANO ( fl.JI FlES [GJ : t.J I ~1ES [71 ;I.I I RES [131 I , ~I IRES [8J , GJI DTH > ; 
RESULT: :=\UNION NANOC {lJ!RES(3J 1,ll!RES[7J ,GJ!OTH>; 
FlESUL T:: =\UNI ON NANO< fl.II RES (11 J; ~II RES [131 ! , ~II RES Cl2J, Gil DTHl: 
RESULT:: =\UNI ON NANO ( l~ll RES [4J: l-11 RES (5J; ~JI RES 021 l, ~II RES [131, CW OTH); 
RESULT:: =\UNI ON NANO ( ll~I RES [21: ~!IRES C5l l ,l-ll RES [5J, GllDTHl; 
RESULT: : =\UN ION NANO ( ll-1 I RES (1 J ; ~I I RES (51 J , kl I RES (5 J , GI I OTH l ; 
RESULT::=\UNION {COLLECT WIRE(BLUE,3,{CWIDTH MAX WIRE.LEFT # WIRE.HEIGHT;
0 MIN WIRE.RIGHT#.})
FOR WIRE $E WIRES;};
RESULT::=\UNION {BOX(BLUE,CWIDTH+3#YVDD-3\TO 4#YVDD+(POWER-3 MAX 2));
BOX(BLUE,CWIDTH-1#YGND+2-POWER\TO 0#YGND+2)};
PLOT(RESULT,'Q'\AIF);
This example shows how our chip assembler has raised the level of user
specification away from the low-level wires and boxes, yet there are still many
implementation details left for the user to specify. Moreover, this specification is not in a
form conceptually clear for the user. The designer will make many specification
errors, and these errors will be very difficult to locate, because of the obscure
nature of the specification language.
Fig. 5-3: Layout of Pulse Synchronizer
5.3 The Compiler
It is rather clumsy to generate chips in the assembler form given above. The user
must constantly be concerned with implementation details, and design errors are
common. If the implementation details could be hidden from the user so that the
user could design with a higher level description, the design task would be easier
and many errors would be eliminated. We will generate new data structures that
allow us to describe the chip in a more functional manner, without the physical
details, and write a program which will handle the physical concerns, given one of
these new data structures. The following section of code lists both the data
structures and the compiler:





SIGNAL_WIRES= { SIGNAL_WIRE };
GATE= [INPUTS:SIGNAL_WIRES
OUTPUT:SIGNAL_WIRE
TYPE:GATE_TYPE
INDEX:INT];








DEFINE PHYSICAL(SW:SIGNAL_WIRE)=PHYSICAL_WIRE: SW.PHYSICAL ENDDEFN
DEFINE PHYSICAL(SWS:SIGNAL_WIRES)=PHYSICAL_WIRES:
BEGIN VAR S=SIGNAL_WIRE;




BEGIN VAR SWS=SIGNAL_WIRES; H=INT; G=GATE; S=SIGNAL_WIRE;
DEFINE SORT(SWS:SIGNAL_WIRES)=SIGNAL_WIRES:
BEGIN VAR OUT=SIGNAL_WIRES; W=SIGNAL_WIRE; I,J,K=INT;
DO OUT:=NIL;






FOR W $E SWS; && FOR J FROM 1 BY 1; DO




OUT::= SWS[K] <$;
SWS[K-]:=SWS[K+1-];
DEFINE DRAW_WIRE(LEFT:INT):
BEGIN VAR W=SIGNAL_WIRE; I=INT;
IF THERE_IS W.VLEFT>LEFT FOR W $E SWS;&& FOR I FROM 1 BY 1;




DRAW_WIRE(W.VRIGHT); FI
FOR G $E C.GATES;&& FOR H FROM 1 BY 1; DO @(G).INDEX:=H; END
FOR S $E C.SIGNALS; DO
@(S).VLEFT:= IF DEFINED(S.TO)
THEN S.FROM.INDEX MIN MIN G.INDEX FOR G $E S.TO;
ELSE S.FROM.INDEX FI;
@(S).VRIGHT:= S.FROM.INDEX MAX MAX G.INDEX FOR G $E S.TO;;
END
FOR S $E C.INPUTS; DO @(S).VLEFT:=0; END
FOR S $E C.OUTPUTS; DO @(S).VRIGHT:=999999; END
SWS:=C.SIGNALS\SORT;
WHILE DEFINED(SWS);&& FOR H FROM 1 BY 1; DO DRAW_WIRE(-1); END
END
ENDDEFN
DEF I NE SETUP _DitlENS IONS lC: CHIP): 
BEGIN VAR G-GATE;S=SlGNAL~-llRE;H=REAL; 
POl-lER:= lJIDTH<+.25 FOR G SE C.GATES;l 11AX 4; 
YGND:= -3.*(MAX S.VHEIGHT FORS SE C.SIGNALS;J-4-POWER/2; 




BEGIN VAR S=SIGNAL_WIRE;

FOR S $E C.INPUTS; DO
@(S).PHYSICAL.LEFT:=-933993.;
END





DEFINE DRAW_CELLS(C:CHIP)=MRG:
BEGIN VAR X=REAL; G=GATE; M=MRG;

FOR G $E REVERSE(C.GATES);)
END
ENDDEFN
DEFINE DRAW_WIRES(C:CHIP)=MRG:
BEGIN VAR S=SIGNAL_WIRE; LEFT,RIGHT=REAL;
DO LEFT:=CWIDTH+5;
RIGHT:=-2.;




FOR S $E C.SIGNALS;
EACH_DO @(S.PHYSICAL).LEFT::= MAX LEFT;
@(S.PHYSICAL).RIGHT::= MIN RIGHT;;;
BOX(BLUE,CWIDTH+3#YVDD-POWER/2\TO 4#YVDD+POWER/2);
BOX(BLUE,CWIDTH-1#YGND-POWER/2\TO 0#YGND+POWER/2))
DEFINE LOAD(S:SIGNAL_WIRE)=REAL:
BEGIN VAR G=GATE; T=SIGNAL_WIRE;
(+ CASE G.TYPE OF
NOR: 1
INVERT: 1
NAND: +1 FOR T $E G.INPUTS;

BEGIN VAR M=MRG; G=GATE; S=SIGNAL_WIRE;
DO CWIDTH:=0;
PACK(C);
SETUP_DIMENSIONS(C);
INITIALIZE_WIRES(C);






There are two basic datatypes defined here: SIGNAL_WIRE and GATE. These are
abstract representations for PHYSICAL_WIREs and instances of the gates. There is an
additional datatype, CHIP, which holds references to all of the gates and wires
which comprise the chip. The COMPILE function consumes a CHIP and produces an
MRG, which is the ICLIC representation for layout. COMPILE calls five procedures.
The first assigns horizontal channels to each of the interconnection wires. The
second procedure computes the values for the global positioning variables. The
third procedure initializes the PHYSICAL_WIREs. The fourth procedure calls each of
the gate cells. The final procedure draws the actual interconnection wires. 
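The channel assignment performed by the first procedure can be pictured as a left-edge packing: wires are ordered by their left end, and each channel is filled with wires that begin strictly to the right of the last wire placed in it. The following Python sketch illustrates that idea only. It is not a transcription of the ICL listing; the function name assign_channels and the (left, right) span representation are assumptions made for the illustration, and left ends are assumed to be non-negative gate indexes.

```python
def assign_channels(spans):
    """Pack wire spans into horizontal channels, left edge first.

    spans maps a wire name to (left, right), the first and last gate
    index the wire must reach. Returns a dict: wire name -> channel number.
    """
    # Hypothetical stand-in for a sorting step: order wires by left edge.
    pending = sorted(spans.items(), key=lambda item: item[1][0])
    channel_of = {}
    channel = 0
    while pending:
        channel += 1
        edge = -1          # rightmost position already used in this channel
        leftover = []
        for name, (left, right) in pending:
            if left > edge:            # wire starts past everything placed so far
                channel_of[name] = channel
                edge = right
            else:                      # overlaps; retry in the next channel
                leftover.append((name, (left, right)))
        pending = leftover
    return channel_of


spans = {"a": (0, 3), "b": (4, 6), "c": (2, 5), "d": (6, 8)}
channels = assign_channels(spans)
```

In this small example wires a and b share one channel while c and d share a second, since d begins at the position where b ends and the comparison is strict.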
We now have a program which will take an abstract structure representing the
behavioral definition of a chip and generate the layout. To facilitate the 
construction of these abstract chip specifications, support routines may be designed. 
The following code provides routines for modifying this data structure, followed 
by routines for generating this data structure. 
VAR CHIP=CHIP;
DEFINE EQ(A,B:GATE)=BOOL: MACRO-10('LSPEQS')
DEFINE EQ(A,B:SIGNAL_WIRE)=BOOL: MACRO-10('LSPEQS')
DEFINE LINK_INPUT(G:GATE S:SIGNAL_WIRE):
@(S).TO::= G <$;
@(G).INPUTS::= S <$;
ENDDEFN




DEFINE UNLINK_INPUT(G:GATE S:SIGNAL_WIRE):
BEGIN VAR Q=GATE; R=SIGNAL_WIRE;
@(S).TO:={COLLECT Q FOR Q $E S.TO;WITH -(Q\EQ G);};
@(G).INPUTS:={COLLECT R FOR R $E G.INPUTS;WITH -(R\EQ S);};
END
ENDDEFN




DEFINE ELIMINATE(G:GATE):
BEGIN VAR Q=GATE;
CHIP.GATES:={COLLECT Q FOR Q $E CHIP.GATES;WITH -(Q\EQ G);};
END
ENDDEFN
DEFINE ELIMINATE(S:SIGNAL_WIRE):
BEGIN VAR R=SIGNAL_WIRE;
CHIP.SIGNALS:={COLLECT R FOR R $E CHIP.SIGNALS;WITH -(R\EQ S);};
CHIP.INPUTS:={COLLECT R FOR R $E CHIP.INPUTS;WITH -(R\EQ S);};
CHIP.OUTPUTS:={COLLECT R FOR R $E CHIP.OUTPUTS;WITH -(R\EQ S);};
END
ENDDEFN
DEFINE FUSE(A,B:SIGNAL_WIRE):
BEGIN VAR G=GATE; C=CHAR; S=SIGNAL_WIRE;
IF DEFINED(B.FROM) ! THERE_IS S\EQ B FOR S $E CHIP.INPUTS; THEN
IF DEFINED(A.FROM) ! THERE_IS S\EQ A FOR S $E CHIP.INPUTS; THEN HELP;
ELSE @(A).INPUT:=B.INPUT;
G:=B.FROM;
IF DEFINED(G) THEN
UNLINK_OUTPUT(G,B);
LINK_OUTPUT(G,A); FI FI FI
IF THERE_IS S\EQ B FOR S $E CHIP.OUTPUTS; THEN CHIP.OUTPUTS::= A <$; FI







LET QS BECOME SIGNAL_WIRE BY
BEGIN VAR S=SIGNAL_WIRE;
IF THERE_IS S.NAME\EQ QS FOR S $E CHIP.SIGNALS; THEN S
ELSE DO S:=[NAME:QS];
CHIP.SIGNALS::= S <$;
GIVE S FI
END;
DEFINE NEW_SIGNAL=SIGNAL_WIRE: SC((CHIP.SIGNAL_COUNT::=+1;))
DEFINE SET(S:SIGNAL_WIRE G:GATE): LINK_OUTPUT(G,S); ENDDEFN
LET GATE BECOME SIGNAL_WIRE BY
BEGIN VAR S=SIGNAL_WIRE;




DEFINE INPUT(QS:QS): CHIP.INPUTS::= QS <$; ENDDEFN
DEFINE OUTPUT(QS:QS): CHIP.OUTPUTS::= QS <$; ENDDEFN
ENDDEFN




DEFINE NEW_GATE(SWS:SIGNAL_WIRES TYPE:GATE_TYPE)=GATE:
BEGIN VAR GATE=GATE; SW=SIGNAL_WIRE;
DO GATE:=[INPUTS:SWS TYPE:TYPE];
CHIP.GATES::= GATE <$;




DEFINE NAND(SWS:SIGNAL_WIRES)=GATE: NEW_GATE(SWS,NAND)
DEFINE NOR(SWS:SIGNAL_WIRES)=GATE: NEW_GATE(SWS,NOR)
DEFINE INVERT(SW:SIGNAL_WIRE)=GATE: NEW_GATE({SW},INVERT)
DEFINE AND(SWS:SIGNAL_WIRES)=GATE: SWS\NAND\INVERT
DEFINE OR(SWS:SIGNAL_WIRES)=GATE: SWS\NOR\INVERT
DEFINE NAND(A,B:SIGNAL_WIRE)=GATE: NAND({A;B})
DEFINE NOR(A,B:SIGNAL_WIRE)=GATE: NOR({A;B})
DEFINE AND(A,B:SIGNAL_WIRE)=GATE: AND({A;B})










To specify the function of a chip, we call these new procedures. To start the
description of a chip, we call NEW_CHIP, which initializes the system. Next, we
enter the logical equations by calling the SET function. We then state which
signals are inputs or outputs of the chip by calling the INPUT or OUTPUT
procedures. Finally, we call the FINISH routine, which completes the linking of
various portions of the description. Signal wires are identified by enclosing their
names in single quotes. Logical equations are specified by calling the NAND, NOR,
AND, OR, and INVERT functions. To specify the 'pulse synchronizer' from above, the
following code could be used:
NEW_CHIP;
SET('ENABLE',NAND('SET',NAND('ENABLE','RESET')));
SET('COMP',NAND('CLOCK','X'));
SET('X',NAND({NAND({INVERT('CLOCK');'ENABLE';'Y'}); 




SET('OUT',INVERT('COMP'));
INPUT('SET');INPUT('RESET');INPUT('CLOCK');INPUT('MODE');
OUTPUT('OUT');OUTPUT('COMP');
FINISH; 
Notice how concise this description is compared to the description required for the 
chip assembler. In addition, this description is more natural for the designer, which 
assures fewer specification errors. In the compiler, we referred to signal wires by
name, whereas in the assembler we used indexes into a global list. The compiler
allows us to work with more of our own semantics, and to include more of this
semantics in the chip description. 
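The idea of referring to wires by name, and of letting a nested gate call stand in for the wire it drives, can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the RLC code: the Chip class, its method names, and the '_n1'-style names given to unnamed wires are all hypothetical, and only the NAND and INVERT builders are shown.

```python
import itertools


class Chip:
    """Hypothetical stand-in for the CHIP data structure."""

    def __init__(self):
        self.signals = {}    # wire name -> record with its driver and readers
        self.gates = []
        self.inputs = []
        self.outputs = []
        self._unnamed = itertools.count(1)

    def wire(self, operand):
        """Coerce a wire name or a gate into a signal record."""
        if isinstance(operand, dict):          # a gate: make sure it drives a wire
            if operand["output"] is None:
                self.set(f"_n{next(self._unnamed)}", operand)
            return operand["output"]
        return self.signals.setdefault(
            operand, {"name": operand, "from": None, "to": []})

    def gate(self, kind, *operands):
        g = {"type": kind, "inputs": [], "output": None}
        for operand in operands:
            s = self.wire(operand)
            g["inputs"].append(s)
            s["to"].append(g)
        self.gates.append(g)
        return g

    def nand(self, *operands):
        return self.gate("NAND", *operands)

    def invert(self, operand):
        return self.gate("INVERT", operand)

    def set(self, name, g):
        """Name the wire driven by gate g (the role SET plays in the text)."""
        s = self.wire(name)
        s["from"] = g
        g["output"] = s


chip = Chip()
chip.set("ENABLE", chip.nand("SET", chip.nand("ENABLE", "RESET")))
chip.set("COMP", chip.nand("CLOCK", "X"))
chip.set("OUT", chip.invert("COMP"))
chip.inputs += ["SET", "RESET", "CLOCK"]
chip.outputs += ["OUT", "COMP"]
```

Only three of the equations from the example above are entered here; the inner NAND of the ENABLE equation receives an automatically named wire, which is one plausible way to realize a coercion from gate to signal wire.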
5.4 Compiler Extensions 
There is a major difference between the assembler and compiler specifications of a 
chip. With the assembler, we write a program which contains the specification of 
the chip; with the compiler, we generate a data structure which contains this 
information. The data structure representation limits our design capabilities since 
the data structure is not as general as a programming language, but there is an 
advantage to data structure representations: we can write programs to modify, 
generate, or examine our chip specification. 
In the RLC, we may wish to perform logic minimization upon a set of equations to 
reduce the number of gates required to implement those equations. Programs of
this class are called Optimizers, which are discussed in section 5.4.1. In addition, 
the user may wish to specify the equations using mathematical notation, letting the 
program translate this formal mathematical notation into the appropriate data 
structures. Section 5.4.2 shows examples of these Generators and Parsers. Our data 
structure contains more information than strictly a layout. The user may wish to 
examine this information. In RLC, the user may wish to simulate the circuit. Such 
programs are called Examiners, which are discussed in section 5.4.3. 
These extensions have been added to the compiler presented above. Appendix 3




Through the several levels of chip design (architecture, block, logic, gate, etc.), 
much thought is devoted to optimizing the design. Many of the optimizations are
algorithmic in nature: a formula or program can be stated which will apply the
optimization to the design. Since our compiler's input is a data structure, we can
design programs which will operate on the input data in attempts to produce more
optimal chips.
One optimization we might consider is the removal of unnecessary inverters. When
using predefined cells, the user may need to invert a signal before connecting to an
input of the cell, only to have the signal re-inverted by a gate within the cell. One,
perhaps both, of the inverters are superfluous and can be removed. We can design
an optimization program which scans for series inverters and removes the
unnecessary inverters. Figure 5-4 illustrates this process. In the first example,
both polarities of the signal are required, in which case the second inverter is the
only unnecessary inverter. The second example shows a case where the signal is
inverted twice, but the intermediate signal is never used, in which case both
inverters can be removed. The following routines are used to perform this 
optimization. 
DEFINE GET_INVERT(S:SIGNAL_WIRE)=SIGNAL_WIRE:
BEGIN VAR T=SIGNAL_WIRE; G=GATE;
IF S.FROM.TYPE=INVERT THEN
GIVING S.FROM.INPUTS[1]

ELIMINATE(S); FI
EF THERE_IS G.TYPE=INVERT FOR G $E S.TO; THEN G.OUTPUT

BEGIN VAR G=GATE; S,T=SIGNAL_WIRE;
FOR G $E CHIP.GATES; WITH G.TYPE=INVERT; WITH DEFINED(G.OUTPUT); DO
S:=G.OUTPUT;















Fig. 5-4: Examples of Redundant Inverters 
The GET_INVERT function is used to efficiently invert a signal. Figure 5-5 depicts
the various conditions tested by GET_INVERT. In the first case, the inversion of a
signal (marked by the '*') is required. The signal does not come from an INVERTER,
and no INVERTERs connect to this signal. In this case, an INVERTER is added to the 
circuit and its output (marked by the '**') is returned. In the second case, the 
original signal does not come from an INVERTER, but an INVERTER does connect to 
this signal, in which case the output of the INVERTER is used. In the third case, the 
signal comes from an INVERTER and is used other places, in which case the input of 
the INVERTER is used. In the final case, the signal comes from an INVERTER, and 
the signal is not used in other gates, in which case the INVERTER can be eliminated 
and its input signal returned. 
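The four cases just described can be sketched in Python as follows. This is an illustration under stated assumptions rather than the ICL routine: the Signal, Gate, and Netlist classes and the name get_invert are hypothetical, and the sketch ignores whether a signal is a chip output, which a real implementation would presumably have to check before eliminating an inverter.

```python
class Signal:
    def __init__(self, name):
        self.name = name
        self.source = None   # gate driving this signal
        self.sinks = []      # gates reading this signal


class Gate:
    def __init__(self, kind, inputs, output):
        self.kind, self.inputs, self.output = kind, list(inputs), output


class Netlist:
    def __init__(self):
        self.gates = []
        self.fresh = 0

    def new_signal(self):
        self.fresh += 1
        return Signal(f"_n{self.fresh}")

    def add_gate(self, kind, inputs, output):
        gate = Gate(kind, inputs, output)
        output.source = gate
        for signal in inputs:
            signal.sinks.append(gate)
        self.gates.append(gate)
        return gate

    def remove_gate(self, gate):
        for signal in gate.inputs:
            signal.sinks.remove(gate)
        gate.output.source = None
        self.gates.remove(gate)


def get_invert(net, s):
    """Return a signal carrying the inverse of s, reusing inverters if possible."""
    driver = s.source
    if driver is not None and driver.kind == "INVERT":
        original = driver.inputs[0]
        if not s.sinks:                 # final case: s is unused, drop its inverter
            net.remove_gate(driver)
        return original                 # third case: reuse the inverter's input
    for gate in s.sinks:
        if gate.kind == "INVERT":       # second case: an inverter already reads s
            return gate.output
    inverted = net.new_signal()         # first case: add a new inverter
    net.add_gate("INVERT", [s], inverted)
    return inverted
```

Calling get_invert twice on the same plain signal adds only one inverter, since the second call falls into the second case and returns the existing inverter's output.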
Given the GET_INVERT function, the REMOVE_INVERTERS function is





Fig. 5-5: Operation of GET_INVERT
to the 'GET_INVERT' of the input.
Other optimizers in the RLC remove redundant gates (for instance two NAND gates
whose inputs are identical), attempt to replace NAND gates with NOR gates if the
gate count would be reduced, and vice versa, and merge NAND gates whenever
possible. The optimizers presented so far look only at the logical specification of
the chip and attempt to produce a more optimal logical specification by reducing the
number of gates. Other optimizers look at wire lengths and gate loads to perform
electrical optimizations on the design. These optimizers do not change the functional
specification of the chip, merely the realization of that specification. This frees the
designer from many of the design constraints while composing the functional
specification of the chip.
5.4.2 Generators and Parsers 
The input to the RLC is a data structure containing the functional specification of 
the chip. We have presented routines which allow the user to directly generate
these data structures. On the other hand, we can write programs which generate 
these data structures for us. One such program might be a parser which accepts 
mathematical equations and produces proper RLC input for implementing those 
equations. With such a parser, our pulse synchronizer could be specified as 
follows. 
DEFINE PULSE_SYNCHRONIZER(INPUTS:SET,RESET,CLOCK,MODE
OUTPUTS:OUT,COMP
LOCALS:ENABLE,X,Y):
ENABLE= SET & (ENABLE & RESET)
COMP= CLOCK & X
X= (-CLOCK & ENABLE & Y) & (ENABLE & Y & X) & COMP
Y= ENABLE & MODE & (COMP & Y)
OUT= -COMP
ENDDEFN
The parser which accepts this mathematical notation is listed with the RLC
compiler in appendix 3. 
We might also write programs that generate the data structures for us. These 
programs specialize in the construction of certain classes of circuits. For instance, 
we might like a program that produces divide-by-n circuits. We would call the
program, passing the divisor n, along with an input and output signal, and the 
program would generate the circuitry for the counter. The following code is in fact 
the program for producing divide-by-n logic. 
DEFINE DFLOP(DATA,CLOCK,OUT,BAR:SIGNAL_WIRE):
BEGIN VAR X1,X2,X3,X4=SIGNAL_WIRE;
X1:=NEW_SIGNAL;
X2:=NEW_SIGNAL;











DEFINE COUNTER(N:INT IN,OUT:SIGNAL_WIRE):
BEGIN VAR FW=FW; TOGGLE,NEXT,Q,QBAR,D=SIGNAL_WIRE; OUTPUT=SIGNAL_WIRES;
OUTPUT:=NIL;
FW:=N-1\FW;
IF N<2 THEN HELP; FI
WHILE FW<>Ll01; DO
Q:=NEW_SIGNAL;
QBAR:=NEW_SIGNAL;
D:=NEW_SIGNAL;









OUTPUT::= IF FW BIT 0 THEN QBAR ELSE Q FI <$;





The following input generates three dividers, with ratios of 5, 3, and 25. 
NEW_CHIP;





A plot of the layout is shown in figure 5-6. The layout has been transformed to fit
the page better. 
This technique of building procedures within the compiler to aid in the generation 
of the compiler input is very powerful. The user can build his own environment 
within the compiler. With a handful of routines similar to this, the user can 
quickly and easily design new chips or experiment with multiple implementations 









Our chip specification is an abstract representation of the chip, containing only
functional information. As such, it is not particularly tied to any technology or set
of design rules. There are a very few routines which actually convert the data
structure to physical layouts. The majority of the RLC code is independent of the
physical implementation. Therefore, by modifying the few physical routines, we
can generate output for a new technology.
This concept can easily be included in the RLC through the use of ICL's suspendable
functions. A datatype TECHNOLOGY is defined which includes all of the technology
dependent information. The user may generate several technology variables, which
allow him to generate masks for any of these technologies. Figure 5-7 shows eight
different implementations of the pulse synchronizer. Some of the 'technologies' are
merely pictures, and not meant to be actual mask layouts.
[Figure 5-7 panels: NMOS, NMOS Sticks, Metal2 NMOS, Metal2 Sticks]












Fig. 5-7: Multiple Representations (cont.) 
With this capability, the user may design a chip before the technology is available. 
When the technology is available, the masks can be generated. Also, if designs are
archived by saving the data structure rather than the mask sets, the designs can be 
updated to new technologies quickly. 
The user may also wish to simulate his circuit. Again, since we have an abstract 
representation of the circuit, it is a simple matter to simulate the chip. In RLC, we 
generate a new data structure from the chip specification data structure. This new
data structure contains the information required to simulate the chip. The
following input constructs the simulation data structure for the pulse synchronizer
and plots the result of the simulation, as shown in figure 5-8. 
MAKE_SIMULATOR;
CLOCK([PHASE:500 HIGH:1000 LOW:1000 VALUE:FALSE INPUT:'CLOCK']);
WAVEFORM([VALUE:TRUE DELTAS:{200;7000;8000;21000;22000} INPUT:'RESET']);
WAVEFORM([VALUE:FALSE DELTAS:{4000;5000;15000;17000;2'1000;25000}
INPUT:'SET']);
WAVEFORM([VALUE:FALSE DELTAS:{12000;26000} INPUT:'MODE']);
RUN(30000);
Simulation terminated at time=30000.
PLOT({'CLOCK';'MODE';'SET';'RESET';'OUT';'COMP';'X';'Y';'ENABLE'},
' Q' \A IF, . 005 l ; 
[Simulation plot: waveforms of CLOCK, MODE, SET, RESET, OUT, COMP, X, Y, and ENABLE]
Fig. 5-8: Simulation Plot
A very important advantage of having the simulation driven from precisely the
same chip description data structure is that we are guaranteed that the simulator is 
simulating the circuit that the layout generators produce. If the simulator required 
a different specification than the layout producers, the user would manually have 
to verify that the specifications matched (plus he would have twice as much typing 
to do). 
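To make the point concrete, a simulator can walk the same list of gates that the layout routines consume. The following Python sketch is a deliberately simplified model, not the RLC simulator: it ignores timing altogether and simply re-evaluates every gate until no signal changes. The function name simulate and the (type, inputs, output) tuple format are assumptions made for the illustration.

```python
def simulate(gates, values, max_passes=100):
    """Settle a gate network to a stable state.

    gates  : list of (kind, input_names, output_name) tuples
    values : dict of known signal values (chip inputs and any stored state)
    Returns a new dict holding the settled value of every signal.
    """
    values = dict(values)
    for _ in range(max_passes):
        changed = False
        for kind, inputs, output in gates:
            bits = [values.get(name, False) for name in inputs]
            if kind == "NAND":
                new = not all(bits)
            elif kind == "NOR":
                new = not any(bits)
            elif kind == "INVERT":
                new = not bits[0]
            else:
                raise ValueError(f"unknown gate type: {kind}")
            if values.get(output) != new:
                values[output] = new
                changed = True
        if not changed:
            return values
    raise RuntimeError("network did not settle (possible oscillation)")


# The ENABLE latch of the pulse synchronizer: ENABLE = NAND(SET, NAND(ENABLE, RESET)).
latch = [("NAND", ["ENABLE", "RESET"], "N1"),
         ("NAND", ["SET", "N1"], "ENABLE")]
after_set = simulate(latch, {"SET": False, "RESET": True})
held = simulate(latch, {**after_set, "SET": True})
after_reset = simulate(latch, {**held, "RESET": False})
```

Because the gate list is the only description of the circuit, there is no second specification that could drift out of agreement with the one used for layout.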
5.5 Conclusions
In this chapter we have seen the basics of a silicon compiler. The Random Logic 
Compiler is a very simple compiler, yet it illustrates the techniques and advantages 
of using silicon compilers. 
Virtually the only disadvantage of using a silicon compiler is the restriction of the 
floorplan. The only chips that may be designed are those that fit the floorplan, and 
forcing a chip into a given floorplan may lead to inefficiencies. On the other hand, 
the floorplan aids the user in specifying his chip, and helps in the verification of 
the design. To ease the floorplan restrictions, several compilers will be designed, 
each one finely-tuned for generating one class of chip or portion of a chip. 
One of the major advantages of using a silicon compiler is that the user can work in
his own language. We have seen with the parsers that the user writes logic 
equations. Logic equations are natural to the user, and the functional specification 
is typically given in terms of logic equations. When the user completes the 
functional specification of the chip, the chip can be generated immediately. 
With this rapid specification-to-layout cycle, the user can explore many of the 
design tradeoffs that would otherwise be impossible. When a decision must be 
made, the user can try several alternatives and quickly see the accurate cost of each 
possibility. This can dramatically shorten the functional design cycle, and the 
resulting chip can be significantly more optimal than a similar chip whose 
functional specification was virtually frozen before the physical layout was begun. 
The user can extend the language. Every working group develops its own language 
for intercommunication. Similarly, software designers develop subroutine libraries 
for commonly used routines. In the same manner, users may extend the language of
the silicon compiler, adding constructs and procedures which allow a more 
efficient communication of the chip specifications. 
Compilers give us technology independence. Just as FORTRAN is available on many 
machines, and programs written in FORTRAN are portable between installations, 
silicon compilers allow designs to be portable across technologies. When the 
technology changes, the code generation routines are rewritten, but the user need 
never see the change. The old design specifications are still valid, and can quickly 
generate masks in the new technology. 
The silicon compiler gives us three guarantees: there will be no design rule 
violations in the generated artwork, the circuit will correctly perform the specified 
function, and multiple representations of the circuit indeed represent the same 
circuit. These capabilities and guarantees give the silicon compiler fantastic 





Chapter 6: Introduction to Bristle Blocks 
As the cost of VLSI integrated circuit design increases, the desirability of automated 
circuit design programs grows. Previous automated circuit design systems have 
evolved from the TTL gate technology, and focus attention upon the logic equation 
specification of the design [5][9][10][11][23][26]. None of these tools have
confronted the problem of generating efficient designs in the VLSI technology. In
VLSI design, the communication network is the expensive portion of the design,
whereas in TTL design the communication network is essentially free and the
components are expensive. TTL design optimization focuses upon the reduction of 
the number of components at the expense of increased interconnections. Hence, 
TTL-based design systems yield undesirable results when applied to the design of 
VLSI circuits. 
The Bristle Blocks system addresses the central issues of VLSI design. By adhering 
to a wiring strategy which optimizes communication, designs are generated which 
compare favorably with hand designs in terms of area and performance. This 








[Figure 6-1 labels: Control; Off-Chip Communication]
Fig. 6-1: Generalized Datapath Block Diagram 
The wiring structure implemented in Bristle Blocks is that of a datapath, which
supports Register Transfer (RT) operations. Figure 6-1 is the block diagram of a
datapath. A datapath may consist of several data processing elements, such as
Arithmetic/Logic Units (ALUs) and shifters, and storage nodes (registers or latches),
interconnected by data buses. The datapath elements are controlled by a
microcontrol word decoder. The microcontrol word is an arbitrarily long series of
binary logic values which describe the current operation of the datapath. Portions
of the microcontrol word may be driven by datapath elements, while the remainder
of the logic value sources are external to the datapath. Given a list of data
processing elements and a behavioral description of the register transfer operations
to be performed, Bristle Blocks will compile a datapath and control logic layout
which implements those operations. 
For any preliminary specification of a chip, there may be many structures which 
can be used to implement the specifications. The datapath structure is one which
can be used to implement a variety of functions. In chapter 9 we see examples of 
pipelined chips, signal processing chips, general purpose computing chips, and 
special application chips implemented in Bristle Blocks. 
Although Bristle Blocks is general purpose in nature, restrictions are imposed upon the designs by
the physical floorplan and the logical and temporal schema of Bristle Blocks. One 
restriction is that all of the data processing elements be of the same width. This 
means that all registers and ALUs, for instance, contain the same number of bits. 
Another major restriction is that complex instruction sequencing is implemented in 
a very inefficient manner. 
Fig. 6-2: Bristle Blocks Logical Floorplan
The logical block diagram of Bristle Blocks is shown in figure 6-2. There is a single
row of data processing elements with a limit of two data buses running past any
element. There can be more than two data buses on a chip by placing a gap in one of
the two busing channels. The two busing channels are referred to as the 'Upper Bus
Channel' and the 'Lower Bus Channel', and the buses in those channels are referred
to as the 'Upper Bus' and the 'Lower Bus'. These two buses are designed into each of 
the data processing elements, which does limit the number of buses in the system. 
However, by designing these buses into the cells rather than externally routing the 
bus wires, considerable chip area is saved. 
[Figure 6-3 labels: Buffers; Testability Shift Register; Instruction Decode; Pads]
Fig. 6-3: Bristle Blocks Physical Floorplan
The physical floorplan of Bristle Blocks is very similar to the logical block diagram.
The physical floorplan is shown in figure 6-3. The datapath elements are
horizontally abutted in the order they are encountered in the user's specification.
The buffers, testability shift register, and the instruction decoder are placed below 
the datapath core. Finally, pads are placed around the perimeter of the chip. 
Bristle Blocks uses the two-phase clocking scheme presented in Mead and Conway 
[20]. Each of the data buses transfers data from the source register to the
destination register(s) when the PHI 1 clock is high. To improve the performance
of the chip, these buses are precharged during PHI 2, so that the source registers
need only pull appropriate bus lines low. If the registers are asked to refresh their
internal values, refreshing will occur during PHI 2. The processing elements have
the opposite timing conventions. The carry chains and other internal nodes are
precharged during PHI 1, and the computations occur during PHI 2. Output
registers are loaded during PHI 2. With this timing scheme, data can be transferred
into an ALU's input registers and the ALU can load its output register in one PHI 1 -
PHI 2 clock cycle. 
The control line buffers isolate the instruction decoding from the datapath core 
control lines. Each buffer samples an instruction decoder output during one clock 
phase, and drives its control line on the opposite clock phase. This delay allows the 
instruction decoder and datapath core to operate in parallel, and eliminates race 
conditions in the instruction decoder. The bus transfer controls, which are active 
during PHI 1, are driven by microcode conditions existing during the previous PHI 
2. Similarly, all ALU operations are decoded during PHI 1 and are then performed
during the next PHI 2.
When both system clocks are low, the control line buffers dynamically latch the 
values which will drive each control line. If the two testability clocks are strobed, 
each buffer will transfer its value to its righthand neighbor. The leftmost buffer 
gets its new value from the testability input pad, and the rightmost buffer transfers 
its value to the testability output pad. By repeatedly strobing the two testability
clocks, the user can examine the state of each control line buffer's latch, and can set 
each latch to new values. The instruction decoder can be tested by examining the 
state of the testability vector, and the datapath core can be tested by setting the 
testability vector to specific values and observing the results.
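The behavior of the testability shift register can be modeled in a few lines of Python. The sketch below is only a behavioral illustration under stated assumptions: the chain is a list ordered from the leftmost to the rightmost buffer, one call to the hypothetical function strobe stands for one strobe of the pair of testability clocks, and scan assumes that exactly one new value is supplied per buffer.

```python
def strobe(chain, scan_in):
    """One strobe: every buffer passes its value to its right-hand neighbor.

    Returns the new chain and the bit that left through the output pad.
    """
    scan_out = chain[-1]
    return [scan_in] + chain[:-1], scan_out


def scan(chain, new_values):
    """Read out the whole chain while loading new_values in its place."""
    observed = []
    # The bit destined for the rightmost buffer must enter first.
    for bit in reversed(new_values):
        chain, out = strobe(chain, bit)
        observed.append(out)
    # The rightmost buffer's value leaves first, so reverse to get left-to-right order.
    return chain, observed[::-1]


chain, observed = scan([1, 0, 0], [0, 1, 1])
```

After as many strobes as there are buffers, the old control values have all been observed at the output pad and the new values are in place.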
The remaining chapters describe Bristle Blocks in greater detail. Chapter 7
documents the input specifications accepted by the parser. Chapter 8 describes how
Bristle Blocks generates a layout from a specification. Chapter 9 presents several
examples of chips compiled by Bristle Blocks. Finally, Chapter 10 presents the 
history of Bristle Blocks, and proposes a new Bristle Blocks system. 
Chapter 7: The Bristle Blocks Input Language 
The Bristle Blocks Input Language is a formal language which allows for the 
specification of datapath chips. There are four pieces of information needed by 
Bristle Blocks to compile a chip: the name of the chip, the width of the datapath, 
the data processing elements needed in the datapath, and the structure of the 
microcontrol word. The name is used to identify the datapath, since many 
datapaths may reside in the system at any one time. The datapath width is required, 
since Bristle Blocks can generate datapaths of arbitrary width. In fact, many times 
the difference between a 16-bit chip specification and a 32-bit chip specification is 
only this single number. The microcontrol word is described to facilitate the 
specification of element operations. The data processing elements are listed in the 
order they are to appear in the final layout. As these elements are listed, parameter 
values are given which define how each element is to behave. 
The input parser for Bristle Blocks converts all lower case letters to upper case, so 
the input may be typed in either style. All examples presented here will use 
strictly upper case to improve the readability of the text. The parser recognizes the 
following tokens:
<ID>     Identifiers, which are a single letter followed by
         an arbitrarily long sequence of letters, digits,
         or underscores. Examples: A Hi_There x49 R202
<MASK>   Masks, which are composed of X, 1, and 0
         characters. These are used to indicate which
         bits in the datapath are to be operated upon.
         The number of characters in the mask must be
         equal to the datapath width. Examples for 8-bit
         wide datapaths: X11001XX 00001111 X10X101X
<INT>    Integers, which are composed of an arbitrarily
         long, non-empty set of digits. Examples:
         1 32424134 0080
<BLANK>  Blank characters. All spaces, tabs, carriage
         return, and line-feed tokens are ignored by the
         parser.
<OTHER>  Any other character. Any character which can
         not be interpreted as a token by the above
         definitions becomes a token of this type.
         Examples: { +
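The token classes above can be sketched as a small Python scanner. This is an illustration, not the Bristle Blocks parser. In particular, the text does not say how a mask such as 00001111 is told apart from an integer, so the sketch assumes that any run of X, 1, and 0 characters whose length equals the datapath width is a mask; the function name tokenize is likewise hypothetical. Keywords such as NAME come out as <ID> tokens here.

```python
import re


def tokenize(text, width):
    """Split text into (token class, text) pairs, dropping blanks.

    width is the datapath width, which fixes the length of a mask.
    """
    pattern = re.compile(
        rf"(?P<MASK>[X10]{{{width}}}(?![A-Z0-9_]))"   # assumed rule: width-long run of X/1/0
        r"|(?P<ID>[A-Z][A-Z0-9_]*)"                    # letter, then letters/digits/underscores
        r"|(?P<INT>[0-9]+)"                            # non-empty run of digits
        r"|(?P<BLANK>\s+)"                             # spaces, tabs, CR, LF
        r"|(?P<OTHER>.)"                               # any other single character
    )
    # Lower case is folded to upper case before scanning, as the text describes.
    return [(m.lastgroup, m.group())
            for m in pattern.finditer(text.upper())
            if m.lastgroup != "BLANK"]


tokens = tokenize("name Sample 8;\n X11001XX { +", 8)
```

With an 8-bit datapath, X11001XX is classified as a mask, while x49 and 0080 fall through to the identifier and integer rules.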
The following rules state the syntax for a Bristle Blocks input file. 
<CHIP>  ::= <NAME> <BODY> END
<NAME>  ::= NAME <ID> <INT> ;
<BODY>  ::= <DECLARATION>
        ::= <BODY> <DECLARATION>
These rules state that a <CHIP>, which is the grammar accepted by Bristle Blocks, is 
composed of a <NAME>, followed by a <BODY>, followed by the token 'END'. A 
<NAME> is the token 'NAME', followed by an <ID>, followed by an <INT>, followed
by the token ';'. A <BODY> is either a single <DECLARATION>, or it is a <BODY>
followed by a single <DECLARATION>. This recursive definition for <BODY> states
that a <BODY> can be any arbitrarily long, non-empty set of <DECLARATION>s. An
example of a <CHIP> might be
NAME SAMPLE 8;
...
END
where we have represented the <BODY> by '...'. The <ID> in the <NAME> is the
name of the chip, while the <INT> is the width of the datapath. We can see here 
that the name of the chip is SAMPLE and that the datapath is 8 bits wide. 
The <DECLARATION>s are specifications of datapath elements, and the description of 
the microcontrol word. The following sections define the syntax and semantics of
<DECLARATION>s.
7.1: Field Declarations 
To specify the functioning of a datapath element, the user must be able to state 
microcode conditions associated with each operation of the element. For example, if
the element is to increment an internal value, the user must state when the
incrementation is to occur. This is done by describing the states of the microcode
inputs which should cause this operation to occur. This microcode condition
specification is called an EQUATION. The user therefore gives the EQUATIONs
associated with the elements' functions when specifying the datapath.
To facilitate the specification of these EQUATIONs, the microcode inputs, or control
word, can be broken into FIELDs, so that the EQUATIONs become pairs of FIELDs
with associated values. When the microcode inputs corresponding to each FIELD
have the associated value, the EQUATION becomes TRUE, and the element performs 
the desired operation. 




<DECLARATION> ::= FIELD <FIELD_DECL>
<FIELD_DECL>  ::= <FIELD_SPEC> , <FIELD_DECL>
              ::= <FIELD_SPEC> ;
Informally, these rules state that fields are declared by the keyword 'FIELD' 
followed by an arbitrary, non-empty set of field specifications, each separated by 
commas, followed by a semi-colon. Field declarations may occur anywhere in the 
datapath specification, but the fields must be declared before they are used. 
One form of a field specification is the field name followed by numbers indicating 
which bits of the microcontrol word compose the field. For instance, a field 
specification might be 
REG_SELECT<l,3,21> 
This specification has declared a new field, named REG_SELECT, which is bits 1, 3,
and 21 of the microcontrol word. In most instances, fields contain contiguous bits, 
so a shorthand can be used: if two of the integers in the list of bits are separated by a 
colon instead of a comma, all of the integers between and including these two 




Bits cannot be repeated in a single field. Therefore, this specification is in error:
SHIFT_CONST<1,2,1,3,2>
On the other hand, using the shorthand notation, if the second integer equals the
first, no error occurs:
A_SOURCE<3,7,9:9>
Fields may have bits in common. For instance, the following three fields all share
bits 3, 4, and 5 of the microcontrol word, but notice that the third field uses the
bits in reverse order:
FIELD_1<1:5>,FIELD_2<3:7>,FIELD_3<8,5:3>
To aid the use of macros in the field specifications, simple arithmetic operations
upon the integers in the bit specifications are needed. Therefore, each of the integers
in the bit specifications can be replaced by a simple expression involving addition and
subtraction.
FIELD_X<1,4-2:7+3-5>
At times, one would like to describe a field not as a collection of specific 
microcontrol word bits, but rather as a subfield of a previously declared field. This 
can be specified as follows: 
FIELD FIELD_A<3,5,2,8,4,7>,FIELD_B=FIELD_A<4:2>;
Here, FIELD_A is declared to be six randomly ordered bits in the microcontrol word.
FIELD_B is bits 4 through 2 of FIELD_A, which corresponds to bits <8,2,5> of the
microcontrol word. Additionally, one might like to specify a field which is a
concatenation of existing fields. This is done as follows:
FIELD A<2:4>,B<8:6>,C= A & B<2>;
Here, A is bits 2, 3, and 4, while B is bits 8, 7, and 6. Field C contains all the bits of
A and the second bit of B, so C contains bits 2, 3, 4, and 7. One final word about 
field specifications: each field name must be an identifier, which is a letter 
followed by an arbitrary string of letters, digits, and underscores. These rules 















<BITSPEC> ::= < <BITSPEC1>
<BITSPEC1> ::= <INTSPEC> : <INTSPEC> <BITSPEC2>
<BITSPEC1> ::= <INTSPEC> <BITSPEC2>

<INTSPEC> ::= <INTSPEC> + <INT>
<INTSPEC> ::= <INTSPEC> - <INT>
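The field-specification rules of this section (bit lists, the colon shorthand, simple arithmetic, subfields, and concatenation) can be sketched in a few lines of Python. This is an illustrative sketch only, not the Bristle Blocks implementation: the function names eval_intspec, expand_bitspec, and subfield are assumptions introduced here, and each function takes the text found between the angle brackets as a string.

```python
def eval_intspec(text):
    """Evaluate an integer expression built from + and -, such as '7+3-5'."""
    total, sign, digits = 0, 1, ""
    for ch in text.replace(" ", "") + "+":   # trailing '+' flushes the last integer
        if ch.isdigit():
            digits += ch
        elif ch in "+-":
            if not digits:
                raise ValueError("missing integer in " + repr(text))
            total += sign * int(digits)
            sign, digits = (1 if ch == "+" else -1), ""
        else:
            raise ValueError("unexpected character " + repr(ch))
    return total


def expand_bitspec(text):
    """Expand a bit list such as '1,4-2:7+3-5' into an ordered list of bit numbers."""
    bits = []
    for item in text.split(","):
        if ":" in item:
            first, last = (eval_intspec(part) for part in item.split(":"))
            step = 1 if last >= first else -1      # '5:3' counts downwards
            bits.extend(range(first, last + step, step))
        else:
            bits.append(eval_intspec(item))
    if len(set(bits)) != len(bits):
        raise ValueError("bit repeated in field: " + text)
    return bits


def subfield(parent_bits, text):
    """Select positions (1-based) of a previously declared field."""
    return [parent_bits[i - 1] for i in expand_bitspec(text)]


# Concatenation (the '&' operator) is plain list concatenation in this sketch.
field_a = expand_bitspec("2:4")
field_b = expand_bitspec("8:6")
field_c = field_a + subfield(field_b, "2")
```

Representing a field as an ordered list of microcontrol-word bit numbers keeps reversed ranges and shared bits straightforward, since order and overlap between fields are both preserved.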
7.2: Microcode Equations
To specify the operations for many of the datapath elements, the user declares 
EQUATIONs, which associate values with fields. When the microcontrol words 
associated with the fields have the specified value, the EQUATION is TRUE, and the 
datapath element performs its operation. 




















<EQUATION1> ::= <EQUATION1> OR <EQUATION2>
<EQUATION1> ::= <EQUATION2>
<EQUATION2> ::= <EQUATION2> AND <EQUATION3>
<EQUATION2> ::= <EQUATION3>
<EQUATION3> ::= ( <EQUATION1> )
<EQUATION3> ::= <ID> = <BITS>
<EQUATION3> ::= IF <EQUATION1> THEN <EQUATION1>
                ELSE <EQUATION1> FI






In the simplest case, the EQUATION would state that a single field have one specific 
value. Given the field declaration 
FIELD SELECT<1:3>,ENABLE<4:5>,0P<6:8>; 
an EQUATION might be 
SELECT=IXO 
This states that the first bit of SELECT should be high and the third bit should be 
low. The state of the second bit of SELECT does not matter. Notice that the high and 
low specifications are the letters I and O, not the digits 1 and 0. The SELECT field is
three bits long, therefore the value to be associated with that field must be three 
bits long. 
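As a minimal sketch of what it means for a field to "have" a value such as IXO, the following Python function compares a pattern with the current state of the field's bits. The function name matches and the representation of the field state as a string of '0' and '1' characters (first field bit first) are assumptions made for illustration.

```python
def matches(pattern, value):
    """Return True when the field bits in `value` satisfy the I/O/X `pattern`.

    pattern -- letters 'I' (must be high), 'O' (must be low), 'X' (don't care)
    value   -- characters '1' and '0', one per field bit, first bit first
    """
    if len(pattern) != len(value):
        raise ValueError("the value must be as long as the field")
    for want, got in zip(pattern, value):
        if want not in "IOX":
            raise ValueError("pattern letters must be I, O, or X")
        if want == "I" and got != "1":
            return False
        if want == "O" and got != "0":
            return False
    return True


# SELECT=IXO: first bit high, second bit ignored, third bit low.
select_is_ixo = matches("IXO", "110")
```

The length check mirrors the requirement that a three-bit field be given a three-letter value.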
A more general equation might state that several fields have fixed values. Given 
the field declaration from above, the following example shows use of the AND 
function. 
SELECT=IXO AND ENABLE=XI 
Here we require the second bit of the ENABLE field to be high in addition to the 
value required in the SELECT field. The AND function is practically free in terms of 
chip area, so the use of AND is welcomed and encouraged. 
To allow more than one value to be associated with a field, an OR function is 
required. If we had written the equation as 
SELECT=IXO OR ENABLE=XI 
then the equation would be TRUE when either SELECT=IXO or ENABLE=XI or both. 
The OR function does cost some area in the instruction decoder, so some care should 
be exercised in its use. The OR functions will apply after all of the AND functions: 
we say that OR has a lower precedence than AND. Therefore,
SELECT=IXX AND ENABLE=XI OR SELECT=XXO AND ENABLE=OX
will group as
(SELECT=IXX AND ENABLE=XI) OR (SELECT=XXO AND ENABLE=OX)
rather than as
SELECT=IXX AND (ENABLE=XI OR SELECT=XXO) AND ENABLE=OX
To get the second grouping, the parentheses must be used. 
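The precedence rule can be illustrated with a small recursive-descent parser in Python. This is a sketch under stated assumptions, not the Bristle Blocks parser: it handles only AND, OR, parentheses, and FIELD=VALUE terms, and the names tokenize, parse, parse_or, parse_and, and parse_term are hypothetical. The parse tree is returned as nested tuples.

```python
import re

# One token: a parenthesis, the keywords AND / OR, or a FIELD=VALUE term.
TOKEN = re.compile(r"\s*(\(|\)|AND\b|OR\b|[A-Z_][A-Z0-9_]*=[IOX]+)")


def tokenize(text):
    text, tokens, pos = text.strip(), [], 0
    while pos < len(text):
        found = TOKEN.match(text, pos)
        if not found:
            raise ValueError("cannot tokenize at position %d" % pos)
        tokens.append(found.group(1))
        pos = found.end()
    return tokens


def parse_or(tokens):
    # OR binds loosest, so it is handled by the outermost rule.
    left = parse_and(tokens)
    while tokens and tokens[0] == "OR":
        tokens.pop(0)
        left = ("OR", left, parse_and(tokens))
    return left


def parse_and(tokens):
    left = parse_term(tokens)
    while tokens and tokens[0] == "AND":
        tokens.pop(0)
        left = ("AND", left, parse_term(tokens))
    return left


def parse_term(tokens):
    if not tokens:
        raise ValueError("unexpected end of equation")
    token = tokens.pop(0)
    if token == "(":
        inner = parse_or(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("missing close parenthesis")
        return inner
    if "=" in token:
        return tuple(token.split("="))        # (field name, I/O/X value)
    raise ValueError("unexpected token " + token)


def parse(text):
    tokens = tokenize(text)
    tree = parse_or(tokens)
    if tokens:
        raise ValueError("unexpected token " + tokens[0])
    return tree
```

Because parse_or calls parse_and for each operand, every run of AND terms is grouped before any OR is applied, which is the grouping described in the text.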
To invert the polarity of an equation, the NOT function is used. The following 
equation is TRUE unless SELECT=IXX and ENABLE=XI. 
NOT( SELECT=IXX AND ENABLE=XI )
The parentheses are required. Notice that the following two specifications are not
equivalent.
NOT( SELECT=IOO )
SELECT=OII
The first equation will go TRUE if SELECT<1> is low OR if SELECT<2> is high OR if
SELECT<3> is high, whereas the second equation will go TRUE only when
SELECT<1> is low AND SELECT<2> is high AND SELECT<3> is high.
Other equations can use IF ... THEN ... ELSE ... FI constructs. One might say 
IF SELECT=IXO THEN ENABLE=XI AND OP=OIO ELSE OP=IXX FI
This equation is TRUE if SELECT=IXO and ENABLE=XI and OP=OIO, or if
NOT(SELECT=IXO) and OP=IXX. Each of the IF, THEN, and ELSE clauses may be any of
the equations specified up to this point, including other IF ... THEN ... ELSE ... FIs. One
caution, however: The IF ... THEN ... ELSE ... FI equation can take a relatively large 
area in the instruction decoder. One should not include equations of this form with 
reckless abandon. 
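The meaning of the equation constructs can be summarized by a small evaluator. The sketch below is an assumption-laden illustration in Python, not the instruction-decoder generator: equations are written as nested tuples (a representation invented here), field states are strings of '0' and '1', and the names matches and evaluate are hypothetical.

```python
def matches(pattern, value):
    """True when each bit agrees with its I (high), O (low), or X (ignored) letter."""
    return len(pattern) == len(value) and all(
        want == "X" or (want == "I") == (got == "1")
        for want, got in zip(pattern, value)
    )


def evaluate(equation, fields):
    """Evaluate a tuple-encoded equation against a dict of field states."""
    kind = equation[0]
    if kind == "MATCH":                       # ("MATCH", field, pattern)
        return matches(equation[2], fields[equation[1]])
    if kind == "NOT":
        return not evaluate(equation[1], fields)
    if kind == "AND":
        return evaluate(equation[1], fields) and evaluate(equation[2], fields)
    if kind == "OR":
        return evaluate(equation[1], fields) or evaluate(equation[2], fields)
    if kind == "IF":                          # ("IF", condition, then, else)
        branch = equation[2] if evaluate(equation[1], fields) else equation[3]
        return evaluate(branch, fields)
    raise ValueError("unknown equation kind " + repr(kind))


# IF SELECT=IXO THEN ENABLE=XI AND OP=OIO ELSE OP=IXX FI
example = ("IF",
           ("MATCH", "SELECT", "IXO"),
           ("AND", ("MATCH", "ENABLE", "XI"), ("MATCH", "OP", "OIO")),
           ("MATCH", "OP", "IXX"))
```

In this encoding the IF form is equivalent to (condition AND then) OR (NOT condition AND else), which suggests why it can be comparatively expensive in decoder area.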
Each of the equation constructs presented so far deals with variable equations,
equations that depend on microcode inputs. Other equations may have fixed values,
such as always being low. Fixed equations may have one of five values: ALWAYS,
NEVER, VDD, GND, and PAD. In the ALWAYS case, the equation will always be 
TRUE; in the NEVER case, the equation will always be FALSE. In the VDD and GND 
cases, the control line is tied directly to the appropriate power line. In the PAD case, 
a pad will be added to the chip, and this control line will be the sole signal which 
depends upon that pad's value. 
7.3: Parameters
The datapath elements are parametrized cells. They consume parameters specifying 
the configuration required for the particular instance of the cell and produce the 
corresponding layout. There are several kinds of parameters used in the Bristle 
Blocks cells. The first form of parameter is an EQUATION, where the equation 
specifies when a certain operation should occur. Another type of parameter is a 
REGISTER_SPECIFICATION, which describes a register, for example, the input
register for an incrementer. A third parameter is an integer. For Bristle Blocks,
integers are restricted to positive, usually non-zero values. A fourth kind of
parameter is a FIELD, which might indicate a shift constant, for instance. Another
parameter type is an OUTPUT, which is used to drive a signal from a datapath
element to either an output pad or into the instruction decoder. A sixth parameter
type is a MASK, which is used to specify which bits in the datapath are being 
operated upon. A DECODE parameter is used to decode a field into one of many 
instructions. Finally, SOURCE and DESTINATION parameters are used to connect bits 
from the datapath to bits in the instruction decoder. Each of these parameter types 
will be discussed in more detail, with examples.
There is a uniform syntax for specifying each of the elements in the datapath. The 
first token is an identifier specifying the class of the element, and the second token 
is always an identifier which is the name of that element. For example, 
REGISTER PC ... ;
ALU ALU ... ;
Here we have a REGISTER named PC and an ALU named ALU. Following the name is a 
list of keywords and parameter values. The keywords are a function of the element 
class. REGISTERs have one set of keywords, while the ALU has a different set. Some
of the parameters are required, others are optional. The cell documentation lists the 
parameter keywords, types, and requirement status for each of the element classes. 
The following rules define the syntax for calling a datapath element: 
<DECLARATION> ::= <ID> <ID> ;
<DECLARATION> ::= <ID> <ID> <PARAMS>
<PARAM> ::= <ID> <DECODE>
<PARAM> ::= <ID> <DESTS>
<PARAM> ::= <ID> <EQUATION>
<PARAM> ::= <ID> <ID>
<PARAM> ::= <ID> <INT>
<PARAM> ::= <ID> <MASK>
<PARAM> ::= <ID> <OUT>
<PARAM> ::= <ID> <REG_SPEC>
<PARAM> ::= <ID> <SOURCES>
<PARAM> ::= <ID> <VAR_EQUATION>
<PARAMS> ::= <PARAM> <PARAMS>
<PARAMS> ::= <PARAM>
7.3.1: Equations
One of the Bristle Blocks elements is a bus precharge unit. This cell will precharge 
the upper data bus when its PRECHARGE parameter is high. The PRECHARGE 
parameter is an EQUATION, but the parameter is optional. If the user does not 
specify the parameter value, the cell will use a default value which always 
precharges the bus. The documentation of the cell reflects these characteristics: 
Element: PRECHARGE_UPPER
Required Parameters: NONE
Optional Parameters:
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
The type of the element is PRECHARGE_UPPER. There are no required parameters
and one optional parameter, which is of type EQUATION. The default value for the 
parameter is ALWAYS. One might use this element as follows. 
FIELD PCHG<1>;
PRECHARGE_UPPER CELL_TO_PRECHARGE_UPPER_BUS PRECHARGE:PCHG=I; 
The element is of type PRECHARGE_UPPER. The name of this particular 
upper-bus-precharger is CELL_TO_PRECHARGE_UPPER_BUS. The one and only
parameter for this cell has the keyword PRECHARGE. The user has specified that
the bus is to precharge whenever the PCHG field is high. The following code uses
the default value for the PRECHARGE parameter:
PRECHARGE_UPPER CELL_TO_PRECHARGE_UPPER_BUS;
7.3.2: Register Specifications 
A second common parameter type is REGISTER_SPECIFICATION, or REG_SPEC. A
REG_SPEC describes a register that can be used as an input or output register of a datapath
element. For example, an ADDER has two input registers and an output register.
The user specifies how the register should interface to the data buses. Equations 
may be given to control the reading or writing of the two buses. Additionally, the 
register can be made to refresh its internal value, or load with a predetermined 
(fixed) constant. The syntax of a REG_SPEC is
<REG_SPEC> ::= <REG_SPEC1> ]
<REG_SPEC1> ::= <REG_SPEC1> , <ID> : <REG_VAL>
<REG_SPEC1> ::= [ <ID> : <REG_VAL>
<REG_VAL> ::= <EQUATION>
<REG_VAL> ::= <MASK>
The keywords for a REG_SPEC are READ_UPPER, READ_LOWER, WRITE_UPPER, WRITE_LOWER,
REFRESH, SUGGEST, and VALUE. These are all EQUATIONs except VALUE,
which is a MASK. When SUGGEST is TRUE, the VALUE is loaded into the register (Xs 
in the mask indicate bits of the register that are not modified by the suggest 
operation). For example, 
NAME EXAMPLE 8;
FIELD REG_OP<1:3>;
.... [READ_UPPER: REG_OP=IOO,
WRITE_UPPER: REG_OP=IOI,
READ_LOWER: REG_OP=IIO,
WRITE_LOWER: REG_OP=III,
REFRESH: ALWAYS,
SUGGEST: REG_OP=OII,
VALUE: XIIXOOXX] ....
When the REG_OP field is IOO, this register will take the value from the upper bus
and store it in its internal node. When REG_OP=IOI, the register drives the upper bus
with the data contained in its internal node. Similar functions occur with the
lower bus. The register refreshes its internal value every cycle. When REG_OP=OII,
the second and third bits of the register are set high, while the fifth and sixth bits
are set low. The remaining bits are not modified. All of the parameters in the
REG_SPEC are optional. Also, none of the read or suggest equations should be TRUE when
either of the write equations is TRUE, because the data buses could be loaded with
garbage. Unfortunately, the compiler cannot verify that these equations are
exclusive, because the various register equations may be driven by
independent sets of control bits. The correctness of these equations must be ensured
in the software.
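For the restricted case in which every equation of a register is a single value of one shared field, a designer could check exclusivity by brute force outside the compiler. The Python sketch below assumes exactly that restricted case; the names matches and conflicts, and the dictionaries of keyword-to-pattern pairs, are illustrative inventions and not part of Bristle Blocks.

```python
from itertools import product


def matches(pattern, value):
    """True when each bit agrees with its I (high), O (low), or X (ignored) letter."""
    return all(want == "X" or (want == "I") == (got == "1")
               for want, got in zip(pattern, value))


def conflicts(reads, writes, width):
    """List every field value for which a read/suggest and a write are both TRUE."""
    found = []
    for bits in product("01", repeat=width):
        value = "".join(bits)
        active_reads = [key for key, pat in reads.items() if matches(pat, value)]
        active_writes = [key for key, pat in writes.items() if matches(pat, value)]
        if active_reads and active_writes:
            found.append((value, active_reads, active_writes))
    return found


# The REG_OP example from the text: reads and suggest versus the two writes.
reads = {"READ_UPPER": "IOO", "READ_LOWER": "IIO", "SUGGEST": "OII"}
writes = {"WRITE_UPPER": "IOI", "WRITE_LOWER": "III"}
problems = conflicts(reads, writes, 3)
```

The enumeration grows as two to the power of the field width, so it is only practical for the short fields used in examples like this one, and it says nothing about equations driven by independent control bits.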
To illustrate the use of a REG_SPEC, consider an INCREMENTER. The documentation
for an INCREMENTER is
Element: INCREMENTER
Required Parameters: 












The INCREMENTER takes the data from its input register, adds one to this value, and 
stores it in the output register when the LOAD equation is true. If the
OUTPUT_REGISTER parameter is not specified, the INCREMENTER will store the value into its
input register. The following code shows two incrementers, INC1 and INC2. INC1
has only a single register; INC2 has separate input and output registers.
NAME INCREMENTER_EXAMPLE 8;
FIELD RESET<1>,OP<2:3>;
INCREMENTER INC1
INPUT_REGISTER: [ WRITE_UPPER: OP=OI,
SUGGEST: RESET=I,
VALUE: OOOOOOOO ],
LOAD: ALWAYS;
INCREMENTER INC2
INPUT_REGISTER: [ READ_UPPER: OP=XI,
REFRESH: ALWAYS,
SUGGEST: OP=IO,
VALUE: OOOOXXXX ],




When RESET is high, the first incrementer clears its value. The value in this first 
incrementer is incremented every cycle. When the OP field equals OI, INC1 writes
its value onto the upper bus.
The second incrementer always increments its input value and stores it in its output
register. When the OP field is OO, the input register for INC2 does not load with a
new value, so effectively no operation is done by INC2. When the OP field is OI, the
input register is reading from the bus, while INC1 is writing to the bus, so this
operation is a transfer from INC1 to INC2. When OP is IO, the input register
suggests, but only the four most significant bits are altered: they are cleared. When 
OP is II, the input register is also reading from the upper bus, but the output 
register is writing to the bus, so this operation transfers data from the output of 
INC2 back to the input. 
7.3.3: Integers 
The third parameter type is that of Integer. Integers in Bristle Blocks must be 
positive, and usually must be non-zero, although they may have leading zeros. An 
element which takes an integer as a parameter is the STACK element. The 



















Default: [REFRESH: ALWAYS]
Default: [REFRESH: ALWAYS]
Default: ALWAYS 
The STACK is implemented as a TOP register followed by DEPTH-1 MIDDLE registers, 
followed by a BOTTOM register. Between adjacent pairs of registers lies circuitry
for transferring data between the registers. When the PUSH equation is TRUE, data
in the TOP register transfers into the first MIDDLE register as data from the first
MIDDLE register is transferred into the second MIDDLE register, etc. The POP control
performs the inverse operation. The following STACK has depth 6: 





POP: OP=II,





When OP=OO, the TOP register reads data from the upper bus, overwriting what
used to be on the top of the stack. When OP=OI, the data on the top of the stack
writes to the upper bus, but the stack does not POP. When OP=IO, the stack does a
PUSH, and the register loads from the bus. When OP=II, the stack POPs, but the TOP
register does not write to the upper bus. The stack cannot perform a POP operation
on the same cycle that the register is writing to a bus because the bus will be
written with garbage. It is acceptable to read from a bus while the stack is doing a PUSH,
however. Also, the stack should not do both a PUSH and a POP at the same time,
unless the depth of the stack is 1. For longer stacks, registers in the middle of the
stack would be loaded from their two neighbor registers at the same time, so
garbage would appear in these registers. For a stack of depth 1, however, there are 
only two registers (the TOP and the BOTTOM registers), so a simultaneous PUSH and 
POP will do a swap of the two register values, as illustrated in the following 
example. 
NAME STACK_TEST_2 8;
FIELD OP<1:4>;
STACK SWAPPER
TOP: [READ_UPPER: OP=IOXX,
WRITE_UPPER: OP=IIXX,
REFRESH: ALWAYS,
SUGGEST: OP=IIII,
VALUE: OOOOOOOO],
BOTTOM: [READ_UPPER: OP=XXIO,
WRITE_UPPER: OP=XXII AND NOT(OP=IIXX),
REFRESH: ALWAYS,
SUGGEST: OP=IIII,






The following table lists the operations performed by this stack. 
OP    Operation                      OP    Operation
OOOO: No Change                      IOOO: Load TOP from bus
OOOI: Copy from TOP to BOTTOM        IOOI: Push into TOP
OOIO: Load BOTTOM from bus           IOIO: Load both TOP and BOTTOM
OOII: Read BOTTOM to bus             IOII: BOTTOM goes to TOP and bus
OIOO: Copy from BOTTOM to TOP        IIOO: Read TOP to bus
OIOI: Swap TOP and BOTTOM            IIOI: TOP goes to BOTTOM and bus
OIIO: Push into BOTTOM               IIIO: TOP goes to BOTTOM and bus
OIII: BOTTOM goes to TOP and bus     IIII: Clear TOP and BOTTOM
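The PUSH and POP behavior of the STACK element, including the depth-1 swap, can be modeled with a short Python class. This is a behavioral sketch only: the class name Stack, the cycle method, and the load_top argument are assumptions made for illustration, and the model further assumes that TOP keeps its value on a PUSH unless it is loaded from a bus, and that BOTTOM keeps its value on a POP.

```python
class Stack:
    """TOP, DEPTH-1 MIDDLE registers, then BOTTOM, held in a list (index 0 = TOP)."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.regs = [0] * (depth + 1)

    def cycle(self, push=False, pop=False, load_top=None):
        """Advance one clock cycle with the given control equations TRUE."""
        old = list(self.regs)
        if push and pop:
            if len(old) != 2:
                # MIDDLE registers would load from both neighbors at once.
                raise ValueError("simultaneous PUSH and POP is only safe at depth 1")
            self.regs = [old[1], old[0]]          # TOP and BOTTOM swap
        elif push:
            self.regs = [old[0]] + old[:-1]       # every value moves one place down
        elif pop:
            self.regs = old[1:] + [old[-1]]       # every value moves one place up
        if load_top is not None:
            self.regs[0] = load_top               # TOP reads from a bus


swapper = Stack(1)
swapper.regs = [1, 2]
swapper.cycle(push=True, pop=True)
```

Reading the old register values before writing any new ones mimics the simultaneous transfer between neighboring registers within a single cycle.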
7.3.4: Fields
Parameters of type FIELD are used to specify shift constants or bit selects. For 
example, a 16-bit datapath may have a shifter capable of shifting data left from 0 to
15 places in one cycle. A 4-bit field can specify the size of the shift in this case.
For a 32-bit datapath, however, the shifter can shift between 0 and 31 places in one
cycle, which requires a 5-bit field to specify the shift constant. The
SIMPLE_SHIFTER element is one example of an element which requires a field to supply the




Keyword: MOST_SIGNIFICANT_WORD
Keyword: LEAST_SIGNIFICANT_WORD
Keyword: OUTPUT_REGISTER
Keyword: SHIFT_CONSTANT Type: FIELD
Keyword: LOAD Type: EQUATION
Optional Parameters: NONE
One might use this shifter as follows.
NAME SHIFT_TEST 16;
FIELD REG_SELECT<1:3>,SHIFT_CONST<4:7>;
SIMPLE_SHIFTER SHIFTER
OUTPUT_REGISTER: [WRITE_UPPER: REG_SELECT=IXX],
MOST_SIGNIFICANT_WORD: [READ_UPPER: REG_SELECT=XXI, REFRESH: ALWAYS],




7.3.5: Outputs
Signals like the carry output of an adder come from the datapath, and may go either
to pads or into the instruction decoder. If the signal goes to a pad, Bristle Blocks 
will add an output pad to the chip and connect the pad to the control wire. If the
signal goes to the instruction decoder, it is treated like any other microcontrol word
bit, and so can modify the operation of the datapath. The syntax for specifying the
operation of an output is
<OUT> ::= <ID>
<OUT> ::= <ID> BIT <INT>
<OUT> ::= PAD
<OUT> ::= UNUSED
In the first case, the output specification is a field name. The control line from the 
datapath element will drive the first bit of the field. In the second case, a field 
name and an index are given. The index indicates which bit of the field will be 
driven by the datapath element. The specification of 'PAD' states that the control 
line should connect to an output pad. The 'UNUSED' option indicates that the 
control line should not connect to anything. This is equivalent to not specifying 
the parameter. In the register example, the incrementer element was seen to have a
parameter with keyword CARRY_OUT. This parameter is of type OUTPUT.
Augmenting the register example to include connection of the carry output signal 
to a pad, we get the following code: 
NAME OUTPUT_EXAMPLE 8;
FIELD RESET<1>,OP<2:3>;
INCREMENTER INC1
INPUT_REGISTER: [ WRITE_UPPER: OP=OI,
SUGGEST: RESET=I,
VALUE: OOOOOOOO ],
LOAD: ALWAYS,
CARRY_OUT: PAD;
INCREMENTER INC2
INPUT_REGISTER: [ READ_UPPER: OP=XI,
REFRESH: ALWAYS,
SUGGEST: OP=IO,
VALUE: OOOOXXXX ],
OUTPUT_REGISTER: [ WRITE_UPPER: OP=II ],




Each of the incrementers' carry outputs will go to pads. 
7.3.6: Masks 
MASKs are used to indicate which bits of the datapath are to be affected by a 
particular operation. Recall in the register example that one of the incrementers'
input registers had a suggest value of OOOOXXXX. This indicates that the four most
significant bits should be set low, while the four least significant bits are
left unchanged. Notice that the length of the MASK is required to be the same as
the width of the datapath, since each character in the MASK represents one bit in 
the datapath. The first bit in the MASK is associated with the most significant bit 
in the datapath, while the last bit in the MASK is associated with the least 
significant bit. 
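The effect of a MASK on a datapath word can be sketched with integer bit operations. The Python below is an illustration under stated assumptions: the function names mask_to_ints and apply_mask are hypothetical, the datapath word is modeled as a non-negative integer, and the first MASK character is taken as the most significant bit, as the text describes.

```python
def mask_to_ints(mask):
    """Convert an I/O/X mask into (keep, force_high) integers, MSB first."""
    keep, force_high = 0, 0
    for letter in mask:
        keep <<= 1
        force_high <<= 1
        if letter == "X":
            keep |= 1            # bit is left unchanged
        elif letter == "I":
            force_high |= 1      # bit is set high
        elif letter != "O":      # 'O' clears the bit: neither kept nor forced
            raise ValueError("mask letters must be I, O, or X")
    return keep, force_high


def apply_mask(word, mask, width):
    """Apply a suggest-style MASK to a datapath word of the given width."""
    if len(mask) != width:
        raise ValueError("the MASK must be as long as the datapath is wide")
    keep, force_high = mask_to_ints(mask)
    return (word & keep) | force_high


cleared_high_nibble = apply_mask(0xA7, "OOOOXXXX", 8)
```

Splitting the mask into a keep word and a force-high word means that one AND and one OR suffice for any combination of I, O, and X.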
7.3.7: Variable Timing Equations 
For almost every control line in Bristle Blocks, we can state precisely which clock 
phase should enable the line. Registers always write to a bus during PHI_1; ALUs
always operate during PHI_2. However, for the input and output ports, we cannot
say what the timing requirements are, for these are dictated by off-chip concerns.
Hence, the control lines driving the ports must have the flexibility of changing the
timing information. <VAR_EQUATION>s are <EQUATION>s with the capability of
having this modified timing. 
<VAR_EQUATION> ::= <EQUATION>
<VAR_EQUATION> ::= <EQUATION> [ <VAR_TIMING> ]
<VAR_TIMING> ::= CLOCKED PHI_1
<VAR_TIMING> ::= CLOCKED PHI_2
<VAR_TIMING> ::= NOT CLOCKED
We see that a VAR_EQUATION may be a standard equation, in which case the timing
takes the default clock phase, or an equation followed by one of the three timing
specifications. The VAR_EQUATION may take on PHI_1 or PHI_2 as the enabling
clock, or may asynchronously drive the control line directly. To see an example of
these variable equations, consider an output port. The documentation for the 







Type: EQUATION Variable Timing 
NAME OUTPUT_TEST 8;
OUTPUT_PORT PORT_1
REGISTER: [REFRESH:ALWAYS];
OUTPUT_PORT PORT_2
REGISTER: [REFRESH:ALWAYS],
DRIVE: PAD;
OUTPUT_PORT PORT_3
REGISTER: [REFRESH:ALWAYS],
DRIVE: PAD [CLOCKED PHI_1];
PRECHARGE_BOTH PCHG;
END
The first port always drives the pads. The second port drives the pads during PHI_2
only when its input (coming from an input pad) was high during the previous
PHI_1. The third port drives the pads during PHI_1 only when its input (coming from a
different input pad) was high during the previous PHI_2.
7.3.8: Decode Operations 
The Arithmetic-Logic Unit (ALU) is an example of a cell which can perform a wide 
variety of operations, but which has relatively few control lines. The particular 
operation performed by the ALU depends upon the state of several control lines. It is 
very difficult to specify the operation of the ALU in terms of its control lines. One
naturally thinks of the specification of the ALU operation in terms of operations 
like ADD and SUBTRACT. A DECODE parameter specifies how a field should be 
decoded to perform the appropriate operations. For example, the following is a 





Keyword: INPUT_B
Keyword: OUTPUT_1
Keyword: DECODE









SUB_W_BORROW
INCREMENT_A
















Keyword: CARRY_OUT
Keyword: CARRY_INTO_MSB
Keyword: MSB
Keyword: ZERO
Keyword: WRITE_OUTPUT_1





















When the ALU_OP field has the value OO, the ALU will perform an addition
operation, while an ALU_OP of OI will cause a subtraction. ELSE can be used as the
last case in the decode, which can save effort in a large, sparse decode.




Another shorthand available allows several field values to be associated with one 
operation, using the BITSPEC construction. 
.... DECODE: ALU_OP
0=> ADD
2=> AND
<1,3>=> OR
The formal syntax for DECODE parameters is
<DECODE> ::= <DECODE> <DECODE1>
<DECODE> ::= <ID> <DECODE1>
<DECODE1> ::= <BITSPEC> => <ID>
<DECODE1> ::= ELSE => <ID>
<DECODE1> ::= <INT> => <ID>
This states that a DECODE is an <ID>, which is the field to decode, followed by a list 
of associations. Each association ties a field value or values to an operation. If there 
are some field values which are not associated with operations, a DON'T CARE is 
assumed. If the decoded field ever contains one of these values, the operation 
performed is unspecified and unguaranteed. 
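A DECODE parameter amounts to a table from field values to operation names. The Python sketch below builds such a table; the function name build_decode, the use of None to stand for DON'T CARE, and the choice to reject a value listed twice are all assumptions made for illustration rather than documented Bristle Blocks behavior.

```python
def build_decode(width, cases, else_op=None):
    """Build a dict from every value of a `width`-bit field to an operation.

    cases   -- list of (values, operation); `values` is one integer or a list
    else_op -- operation for all remaining values, or None for DON'T CARE
    """
    table = {}
    for values, operation in cases:
        for value in ([values] if isinstance(values, int) else values):
            if not 0 <= value < 2 ** width:
                raise ValueError("value %d does not fit the field" % value)
            if value in table:
                raise ValueError("value %d is decoded twice" % value)
            table[value] = operation
    for value in range(2 ** width):
        table.setdefault(value, else_op)
    return table


# DECODE: ALU_OP  0=> ADD  2=> AND  <1,3>=> OR   (two-bit field assumed)
alu_table = build_decode(2, [(0, "ADD"), (2, "AND"), ([1, 3], "OR")])
```

Filling the unlisted entries explicitly makes the DON'T CARE cases visible, which is where the unspecified behavior mentioned above would arise.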
7.3.9: Sources
In Bristle Blocks datapaths, we have data lines running horizontally and control 
lines running vertically. There are times, however, when one would like to turn 
data lines into control lines. For example, flags from a register leave the register as 
data, but should enter the instruction decoder as control lines. The lines have to 
'turn the corner'. Another example would be an instruction register. The 
instruction register is loaded with data, the operation to be performed, and it must 
communicate this data to the decoder. Bristle Blocks needs to know which bits of 
the register should connect to which inputs of the instruction decoder or to which 
pads. A parameter of type SOURCE conveys this information. 
In the simplest case, a SOURCE parameter is a list of bit index and instruction bit 
pairs. For example, 
{ 1 => FLAG ; 2 => ENABLE }
indicates that bit 1 (the most significant bit) of the register in question connects to 
the FLAG field, which must be a field containing only one bit. Similarly, the second 
bit of the register connects to the ENABLE field, again a single-bit field. To connect 
to multiple-bit fields, the BITSPEC shorthand is used:
{ <1:4> => OPCODE }
Here, the OPCODE field must be a four-bit field, which is driven from the four
most significant bits in the register. 
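The mapping expressed by a SOURCE parameter can be sketched as a function from register contents to driven field values. In this Python illustration the name drive_fields and the list-of-pairs encoding of the SOURCE list are assumptions; the register is a string of '0' and '1' characters whose first character is bit 1, the most significant bit.

```python
def drive_fields(register, sources):
    """Return the field values driven by a register through a SOURCE list.

    register -- string of '0'/'1', bit 1 (most significant) first
    sources  -- list of (bit_indices, field_name) pairs
    """
    driven = {}
    for bit_indices, field_name in sources:
        for index in bit_indices:
            if not 1 <= index <= len(register):
                raise ValueError("bit %d is outside the register" % index)
        driven[field_name] = "".join(register[index - 1] for index in bit_indices)
    return driven


# { 1 => FLAG ; 2 => ENABLE }
flags = drive_fields("10110010", [([1], "FLAG"), ([2], "ENABLE")])
# { <1:4> => OPCODE }
opcode = drive_fields("10110010", [([1, 2, 3, 4], "OPCODE")])
```

The number of bit indices in each pair fixes the width the destination field must have, matching the single-bit and four-bit requirements stated in the text.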
One element that uses SOURCES is a DATA_TO_CONTROL element. This element will
function as an instruction register. Data in the register can drive bits in the
instruction decoder. The documentation for this element is 
Element: DATA_TO_CONTROL 
Required Parameters: 
Keyword: REGISTER Type: REGISTER 
Keyword: MAP Type: SOURCES 
Optional Parameters: NONE 
One might use this element as follows. 
NAME IR_TEST 8;
FIELD FROM<1:4>, TO<5:8>;
INPUT_PORT INSTRUCTION_PORT
REGISTER: [WRITE_UPPER: FROM=OOOO],
LOAD: ALWAYS;
DATA_TO_CONTROL INSTRUCTION_REGISTER
REGISTER: [READ_UPPER: TO=OOOO,
SUGGEST: NOT(TO=OOOO),
VALUE: OOOOOOOO],
MAP: { <1:4> => FROM; <5:8> => TO };
INCREMENTER PC
INPUT_REGISTER: [READ_UPPER: TO=OOOI,
REFRESH: ALWAYS,
WRITE_LOWER: FROM=OOOO],
LOAD: FROM=OOOO;
OUTPUT_PORT ADDRESS
REGISTER: [READ_LOWER: FROM=OOOO, REFRESH:ALWAYS];
PRECHARGE_BOTH PCHG;
END
This example is a portion of the Fetch/Execute section of a simple microprocessor.
The Instruction Register drives the TO and FROM fields of the microcontrol word.
Notice that if the TO field is not OOOO, the instruction register suggests to OOOOOOOO for the
next cycle. The OOOOOOOO operation causes data in the Instruction Port to be loaded
into the instruction register, and the PC value increments. Thus, after every
instruction which does not write into the instruction register, the instruction 
register automatically loads with the FETCH instruction. 
Sources can also specify that certain bits in a register should connect to pads. If 
many bits from a register are connecting to pads, OUTPUT_PORTs should be used, but
if only a few bits connect to the decoder and a few connect to pads, a
DATA_TO_CONTROL register can be used. Pads are indicated by the token 'PAD' in place of a








<SINGLE_SOURCE> ::= <BITSPEC> => <ID>
<SINGLE_SOURCE> ::= <BITSPEC> => PAD
<SINGLE_SOURCE> ::= <INTSPEC> => <ID>
<SINGLE_SOURCE> ::= <INTSPEC> => PAD
<SOURCE> ::= <SINGLE_SOURCE> ; <SOURCE>
<SOURCE> ::= <SINGLE_SOURCE> }
<SOURCES> ::= { <SOURCE>
The SOURCE parameters indicate how to turn data lines into control lines. The 
inverse operation is also useful: turning control lines into data lines, which allows 
equations from the instruction decoder to load into registers, to be used in the 
datapath during later cycles. The format for specifying a DESTINATION parameter is 
very similar to the SOURCE parameter format. 
<DEST> ::= <EQUATION1> => <INT> ; <DEST>
<DEST> ::= <EQUATION1> => <INT> }
<DESTS> ::= { <DEST>
Informally, a DESTINATION parameter is a list of EQUATIONs with associated bit
indices. The following example illustrates calls of this type. The documentation




Keyword: REGISTER Type: REGISTER 
Keyword: MAP Type: DESTS
Keyword: LATCH Type: EQUATION 
Optional Parameters: NONE 
NAME DESTINATION_EXAMPLE 4;
FIELD INPUT<1:2>;
CONTROL_TO_DATA DECODE
REGISTER: [WRITE_UPPER: ALWAYS],
MAP: { INPUT=OO => 1; INPUT=OI => 2; INPUT=IO => 3; INPUT=II => 4 },
LATCH: ALWAYS;
PRECHARGE_BOTH PCHG;
END
The LATCH equation states that the register should be loaded from the DESTINATION 
parameter values every clock cycle. The DESTINATION parameter in this example 
'decodes' the value of the input field: When the field has value 0, the most
significant bit of the register will be the only bit with a high value; when the field
has value 1, the next most significant bit will be the high bit, etc. Any bits of the
register not specified in the DESTINATION parameter will be unaffected by the 
LATCH signal. 
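The behavior of a DESTINATION parameter under LATCH can be sketched as follows. This Python illustration makes several assumptions: the names matches and latch are hypothetical, each equation is restricted to a single FIELD=VALUE test, and the register is a string of '0' and '1' characters with bit 1 (the most significant bit) first.

```python
def matches(pattern, value):
    """True when each bit agrees with its I (high), O (low), or X (ignored) letter."""
    return all(want == "X" or (want == "I") == (got == "1")
               for want, got in zip(pattern, value))


def latch(register, dests, fields):
    """Load the listed register bits from their equations; leave the rest alone.

    dests  -- list of (field_name, pattern, bit_index) triples
    fields -- dict from field name to its current '0'/'1' state
    """
    bits = list(register)
    for field_name, pattern, bit_index in dests:
        bits[bit_index - 1] = "1" if matches(pattern, fields[field_name]) else "0"
    return "".join(bits)


# MAP: { INPUT=OO => 1; INPUT=OI => 2; INPUT=IO => 3; INPUT=II => 4 }
decode_map = [("INPUT", "OO", 1), ("INPUT", "OI", 2),
              ("INPUT", "IO", 3), ("INPUT", "II", 4)]
one_hot = latch("0000", decode_map, {"INPUT": "01"})
```

With the four equations of the example, exactly one bit is high after each latch, which is the one-of-four decode described in the text.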
7.4: Comments and Macros
In addition to the language constructs presented above, the Bristle Blocks parser has 
two meta-commands: comments and macros. These constructs are not part of the 
formal language definition, but are processed by the parser before the formal 
language is parsed. 
Comments consist of all characters between double-quote characters. The parser 
removes the double-quotes and all characters between them, before tokenizing the 
input stream. This allows comments to be inserted anywhere, even in the middle of 
an identifier or number.
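Comment removal as described here is a single pass over the character stream. The following Python sketch uses a hypothetical function name, strip_comments, and treats an unterminated comment as an error, which is an assumption since the text does not say how that case is handled.

```python
def strip_comments(text):
    """Delete every double-quoted comment, including the quote characters."""
    kept, in_comment = [], False
    for ch in text:
        if ch == '"':
            in_comment = not in_comment     # a quote opens or closes a comment
        elif not in_comment:
            kept.append(ch)
    if in_comment:
        raise ValueError("unterminated comment")
    return "".join(kept)


joined = strip_comments('REG"program counter"ISTER PC;')
```

Because the comment is removed before tokenizing, the two halves of the identifier in the usage line join into a single token.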
Macros are simple text-replacement facilities which reduce the amount of typing 
required to specify a design. They also aid in the reduction of errors. A macro has a
name, a set of parameters, and a body. When the macro is instantiated, the body is 
inserted into the character stream. Any parameter values to the macro are inserted 
in the text where the parameters occur in the macro body. An example of macro 
usage follows. 
MACRO TEST1(ABC,DEF)
% THIS IS A TEST: ?ABC? IS PARAMETER 1,
?DEF? IS PARAMETER 2.
%
/*TEST1(HI THERE,XYZZY)*/
We have defined the macro TEST1. This macro takes two parameters, which are
identified as ABC and DEF. The body of the macro consists of all characters between
the percent signs. Within the macro body, tokens within question marks refer to
instantiations of parameter values. For instance, the value of the first macro
parameter is inserted in the macro body where the characters ?ABC? occur.
Following the macro definition, we have a macro instantiation. The characters /*
signify the start of a macro call, while */ indicates the end of the call. Between
these indicators, we have the macro name and parameter values. Here we are
stating that the macro TEST1 is to be called, with the first parameter set to the
characters 'HI THERE' and the second parameter set to the characters 'XYZZY'. The 
above macro definition and instantiation is identical to the following text. 
THIS IS A TEST: HI THERE IS PARAMETER 1, 
XYZZY IS PARAMETER 2. 
Macro parameters may be given default values. The following example gives 
default values for the first and third parameters of the macro. 
MACRO TEST2(P1/123,P2,P3/HI MOM) % IF ?P2?=?P1? THEN WRITE('?P3?'); FI %
/*TEST2(453,231,WHAT?)*/
/*TEST2(,X,)*/
/*TEST2(,X)*/
/*TEST2()*/
These four macro instantiations will expand into the following text. 
IF 231=453 THEN WRITE('WHAT?'); FI
IF X=123 THEN WRITE('HI MOM'); FI
IF X=123 THEN WRITE('HI MOM'); FI
IF =123 THEN WRITE('HI MOM'); FI
In the first example, we specified values for all three parameters, which were 
inserted into the text. In the second example, we let the first and third parameters 
take their default values. This was done by not specifying a value for the 
parameters. In the third example, we terminated the parameter list after the
second parameter, so the third parameter again took on its default value. In the 
final example, the parameter list was empty, so every parameter took on its default 
value. Since the second parameter did not have a default value, an empty set of 
characters was used. 
Parameter values consist of all characters up to but not including the first comma or
close parenthesis. There are times, however, when one would like to pass these
characters in as parameter values. To allow this, parameter values or default values
may be enclosed in percent signs. For example,
/*TEST2(,X,%ILLEGAL CONDITION, PLEASE TRY AGAIN%)*/
produces the following text.
IF X=123 THEN WRITE('ILLEGAL CONDITION, PLEASE TRY AGAIN'); FI
Macros are instantiated before the parser tokenizes the input, in the same manner as
comments are removed. This allows identifiers to be 'split' across macro
instantiations: part of an identifier or number is generated outside of a macro 
instantiation, while the remainder is generated by the macro. Macro instantiations 
may nest, and macro definitions may instantiate other macros. Macros must be 
defined before they are instantiated. 
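The macro rules above (positional parameters, default values, and percent-sign quoting) can be sketched as a small text-replacement routine. The Python below is an illustration, not the Bristle Blocks preprocessor: the names split_params, Macro, and expand_calls are assumptions, macro definitions are constructed directly rather than parsed from MACRO statements, and only one non-nested pass of expansion is performed.

```python
import re

PARAM = re.compile(r"\?([A-Za-z][A-Za-z0-9_]*)\?")
CALL = re.compile(r"/\*\s*([A-Za-z][A-Za-z0-9_]*)\s*\((.*?)\)\s*\*/", re.S)


def split_params(text):
    """Split call arguments on commas, except inside %...% quoting."""
    parts, current, quoted = [], [], False
    for ch in text:
        if ch == "%":
            quoted = not quoted
        elif ch == "," and not quoted:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


class Macro:
    def __init__(self, params, body):
        self.params = params      # list of (name, default); "" when no default
        self.body = body

    def expand(self, arg_text):
        given = split_params(arg_text)
        values = {}
        for i, (name, default) in enumerate(self.params):
            supplied = i < len(given) and given[i] != ""
            values[name] = given[i] if supplied else default
        # Substitute every ?NAME? in one pass so inserted text is not rescanned.
        return PARAM.sub(lambda m: values.get(m.group(1), m.group(0)), self.body)


def expand_calls(text, macros):
    """Replace each /*NAME(args)*/ call with the expanded macro body."""
    return CALL.sub(lambda m: macros[m.group(1)].expand(m.group(2)), text)


macros = {"TEST2": Macro([("P1", "123"), ("P2", ""), ("P3", "HI MOM")],
                         "IF ?P2?=?P1? THEN WRITE('?P3?'); FI")}
expanded = expand_calls("/*TEST2(,X)*/", macros)
```

A fuller version would repeat the expansion until no calls remain, since the text allows macros to nest, and would need a more careful argument scanner for quoted values that contain a close parenthesis.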
A macro definition is treated like a declaration to the parser. A formal statement of 
macro definition syntax is presented here. Rules which use :*= instead of ::= do not
allow arbitrary insertion of blanks. 
<DECLARATION> : : : = 
dlACRO _HEADER> : : = 
d'lACRO_HEADERl > .. -.. -
<~lACRO _HEADERl> .. -.. -
<ilACRO _HEAOERl> : : = 
<PARAf'1 _DECL> : : ::::: 
<PARAf'l_DECL> : : == 
<PARAf'l_DEF AULT> : ~·:= 
<PARAl1_DEF AULT> : ,·,= 
d1ACRO_BODY> : ,·,= 









<flACRO_HEADERl > <PARAM_DECL> 
<ID> 
<ID> <PARAM_DEFAUL T> 




<MACRO_BODYl > <MACRO_BOOY _ELEl1ENT> 
.. -.. -
al 1-characters-unti 1-%-or-? 
? <ID> ? 
Chapter 8: How Bristle Blocks Works 
In this chapter, we will discuss the operations performed by Bristle Blocks. We 
will use the following chip specification as our example. This chip may be thought 
of as a datapath for a simple control processor. We have eight internal registers and 
an ALU, along with an input port and an output port. Figure 8-1 gives a block 
diagram representation of the circuit. The input specification to Bristle Blocks is 
listed here. 
NAME SAMPLE 8; 
FIELD REG_SELECT<1:3>,ENABLES<4:6>,ALU_OP<7:9>; 
INPUT_PORT INPUT LOAD: ALWAYS, REGISTER: [WRITE_LOWER: ALWAYS]; 
MACRO ADDRESS(ADDR) 
%REGISTER R?ADDR? OPTIONS: [READ_UPPER: REG_SELECT=?ADDR? AND ENABLES=XX1, 
WRITE_UPPER: REG_SELECT=?ADDR? AND ENABLES=XX0, 
REFRESH:ALWAYS];% 
/*ADDRESS(000)*/ 
/*ADDRESS(001)*/ 
/*ADDRESS(010)*/ 
/*ADDRESS(011)*/ 
/*ADDRESS(100)*/ 
/*ADDRESS(101)*/ 
/*ADDRESS(110)*/ 
/*ADDRESS(111)*/ 
PRECHARGE_BOTH PCHG; 
ALU ALU 
INPUT_A: [READ_LOWER: ALWAYS], 
INPUT_B: [READ_UPPER: ENABLES=1XX, REFRESH:ALWAYS], 
OUTPUT_1: [WRITE_UPPER: ENABLES=XX1], 
DECODE: ALU_OP 
0=> OR 












Fig. 8-1: Sample Chip Block Diagram 




The first step taken by Bristle Blocks is to parse the user input, determining the 
elements and element configurations needed for the chip. The parser's output will 
be a series of function calls which, when evaluated, will generate the chip's layout. 
In our example, we see that the name of the chip is SAMPLE and that the datapath 
width is 8. This name will be kept with the chip, to identify the current chip from 
other chips that may reside in the system. This name is also used to compute file 
names. For instance, the CIF file name will be SAMPLE.CIF, and the log file, which 
lists the testability vector and pad order, will be SAMPLE.CPL. The file names 
adhere to the DEC-10 conventions, which limit file names to six alphanumeric 
characters, the first of which should be a letter. In our example, the name SAMPLE 
is an acceptable file name. In other examples, the chip's name may not be 
acceptable, so Bristle Blocks computes a file name which bears a strong resemblance 
to the given name. 
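The text does not say how the substitute file name is computed. One plausible rule, shown purely as an assumption, is to keep the alphanumeric characters, drop any leading non-letters, and truncate to six characters. The function name `dec10_file_name` and the fallback stem `CHIP` are hypothetical.

```python
def dec10_file_name(chip_name, extension):
    """Derive a six-character, letter-first file name that resembles chip_name."""
    kept = [c for c in chip_name.upper() if c.isascii() and c.isalnum()]
    while kept and not kept[0].isalpha():
        kept.pop(0)                       # the first character should be a letter
    stem = "".join(kept[:6]) or "CHIP"    # hypothetical fallback for empty names
    return stem + "." + extension
```

Under this rule SAMPLE is left unchanged, while a longer name such as RONS_CHIP would map to RONSCH.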
The datapath width is used for determining how many bits to place in each register 
and each processing element. In addition, for elements like the barrel shifter, the 
number of control lines for the element is a function of the datapath width. 
The next line of text in the sample file contains the micro-control word 
specification. The user states that the micro-control word will be nine bits long, 
and that this word can be thought of as three non-overlapping fields. The 
REG_SELECT field will be used to address one of the registers in the datapath, the 
ENABLES field will be used to control the transfer of data across the internal data 
bus, and the ALU_OP field will control the operation of the ALU. 
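As a rough illustration of how such a field declaration can be interpreted, the following Python sketch derives the word width, checks that the fields do not overlap, and tests a condition such as ENABLES=XX1 against a word. The helper names are hypothetical, and treating bit 1 as the leftmost character of the word is an assumption.

```python
# Field layout copied from the sample FIELD line: name -> (first bit, last bit).
FIELDS = {"REG_SELECT": (1, 3), "ENABLES": (4, 6), "ALU_OP": (7, 9)}

def word_width(fields):
    """The micro-control word is as wide as the highest bit used by any field."""
    return max(last for _, last in fields.values())

def fields_overlap(fields):
    """Return True if any bit position is claimed by more than one field."""
    used = set()
    for first, last in fields.values():
        bits = set(range(first, last + 1))
        if used & bits:
            return True
        used |= bits
    return False

def field_matches(word, fields, name, pattern):
    """Compare one field of a bit string against a 0/1/X pattern (X = don't care)."""
    first, last = fields[name]
    bits = word[first - 1:last]
    return len(bits) == len(pattern) and all(
        p == "X" or p == b for p, b in zip(pattern, bits))
```

For the word 011001101, `field_matches(word, FIELDS, "ENABLES", "XX1")` is true because bits 4 through 6 are 001.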
Following the micro-control word specification, the input port is declared. This is 
an element of type INPUT_PORT, which the user chose to call 'INPUT'. This element 
is to 'ALWAYS' load its register from its pads, and its register is to 'ALWAYS' drive 
the lower data bus. The timing conventions presented in Chapter 6 state that the 
port unit actually loads data from the pads to its register every PHI_2, and that its 
register drives the lower bus every PHI_1. We will use the lower bus to transfer 
data from the input port to the ALU. The parser will generate a call to an internal 
function called PORT_IN. One parameter to this function gives the equations to 
drive a register; another parameter is the equation to control loading from the pads. 
The register has only one equation, which controls writing the lower bus, and 
is set to PHI_1. The load parameter is set to PHI_2, since the port always loads from 
the pad, independent of the micro-control word. 
Next, the user wants to specify the register array. This register array is composed 
of eight registers which function almost identically. To save typing, and to reduce 
the possibility of specification errors, the user uses a macro. The macro takes one 
parameter, which is the address of the register, and generates the specification for 
that register. The MACRO name is ADDRESS, and the single parameter's name is 
ADDR. The macro call /*ADDRESS(abc)*/ will generate the text 
REGISTER Rabc OPTIONS: [READ_UPPER: REG_SELECT=abc AND ENABLES=XX1, 
WRITE_UPPER: REG_SELECT=abc AND ENABLES=XX0, 
REFRESH:ALWAYS]; 
Following this macro definition, the user calls the macro eight times, passing the 
eight register addresses. When this macro is expanded, the parser will see eight 
register specifications, so will generate eight calls to the internal REGISTER 
function. These registers each have three equations: reading the upper bus, writing 
the upper bus, and refreshing. The bus read/write operations occur during PHI_1, 
while the refresh occurs during PHI_2. 
After the registers are specified, the user adds the bus precharge element. The buses 
in Bristle Blocks are dynamic. They are precharged during PHI_2, and transfer data 
during PHI_1. To write on the bus, a datapath element pulls the bits low to write a 
zero, or leaves the bus alone to write a one. 
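The precharge convention means that the value seen on a bus is the bitwise AND of everything written to it. A minimal behavioral sketch, with the hypothetical function name `bus_transfer` and bits modeled as Python integers, is:

```python
def bus_transfer(width, drivers):
    """Model one precharge/transfer cycle of a dynamic bus.

    drivers is a list of bit lists, one per element writing the bus.
    """
    bus = [1] * width                 # precharge phase: every bit is pulled high
    for value in drivers:             # transfer phase: writers may only pull low
        for i, bit in enumerate(value):
            if bit == 0:
                bus[i] = 0
    return bus
```

With a single writer the bus simply takes that writer's value. With no writer at all the bus reads as all ones.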
Following this, the user specifies an ALU, whose name is ALU. Following the three 
ALU register specifications, the user gives the operations performed by the ALU. 
The ALU has 13 control lines which are used to determine the operation done by the 
ALU. To perform an ADD operation, these 13 lines must be set in particular states, 
while a SUBTRACT operation requires different states on these wires. Rather than 
having the user specify these states, the parser allows the user to specify the 
operations. Here, the user has specified that the ALU should perform an ADD when 
the ALU_OP field is 101, and a SUBTRACT when the field is 011. The other operations 
are seen in the input text. The parser must convert this operation-wise 
specification into a control-line-wise specification before calling the internal ALU 
function. This conversion will be discussed in section 8.2. 
Following the ALU, the user specifies the output port, and then the END. When the 
END is reached, the parser will have collected 12 function calls to internal datapath 
element procedures, along with the description of the micro-control word and 
datapath width. Before these function calls can be made, the parser must generate 
the instruction decoder functions. 
8.2: Generate Instruction Decoder Functions 
The instruction decoder used in Bristle Blocks is nothing more than a series of NOR 
gates, as shown in figure 8-2. Each NOR gate drives one of the control lines, based 
upon the states of its input lines. Given a structure like this, only 
very uninteresting decodes can be performed. The NOR gates can be thought of as 
actually AND gates, if all the microcode inputs were negated. Thus, we could only 
perform AND functions in the instruction decoder. To allow the inclusion of OR 
functions in the decoder, we allow some of the NOR gates to drive new decoder 
inputs, rather than driving control lines. Figure 8-3 shows some of these NOR 
gates. We can now perform OR functions in the decoder, although the OR functions 
cost more both in area and in time than the AND functions. In fact, we use this 
technique to generate the complements of microcode inputs. The user may state 
that an equation is dependent upon a microcode input being low rather than high, 
in which case a single-input NOR gate is used to generate the complement of the 
actual input signal. 
Fig. 8-2: NOR Gate Decoder 
Fig. 8-3: Decoder with Minterm Gates 
Each of the microcode equations passed to the internal element functions in Bristle 
Blocks must be the NOR form of equations. Hence, the parser must convert all 
non-NOR functions to NOR functions by declaring these new 'microcode inputs' and 
specifying the NOR gates which will drive these inputs. We convert an AND 
function to a NOR function simply by complementing all of the input signals. 
Therefore, 
a AND b AND -c 
becomes 
NOR(-a, -b, c) 
An OR function is converted to a NOR function by inverting the output, which is 
done with another NOR gate. Therefore, 
a OR b OR -c 
becomes 
NOR(NOR(a,b,-c)) 
An IF ... THEN ... ELSE ... FI function is converted to NOR functions by realizing that 
IF a THEN b ELSE c FI 
can be stated as 
AND(a,b) OR AND(-a,c) 
or as 
NOR(NOR(NOR(-a,-b),NOR(a,-c))) 
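These three rewrites can be checked exhaustively over all input combinations. The short Python check below is only a verification aid and is not part of Bristle Blocks. The helper names `nor`, `inv`, and `identities_hold` are made up for this sketch.

```python
from itertools import product

def nor(*inputs):
    """A NOR gate: output is 1 only when every input is 0."""
    return int(not any(inputs))

def inv(x):
    """Complement of a single bit."""
    return 1 - x

def identities_hold():
    """Compare each original form with its NOR form for every value of a, b, c."""
    for a, b, c in product((0, 1), repeat=3):
        and_form = int(a and b and inv(c))
        or_form = int(a or b or inv(c))
        if_form = b if a else c
        if and_form != nor(inv(a), inv(b), c):
            return False
        if or_form != nor(nor(a, b, inv(c))):
            return False
        if if_form != nor(nor(nor(inv(a), inv(b)), nor(a, inv(c)))):
            return False
    return True
```

The single-input `nor` in the OR and IF forms plays the role of the extra inverting gate described in the text.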
Similarly, the decode functions, as in the ALU unit, are converted into NOR 
functions. In our example, we wish to perform the following decode. 
ALU_OP 
0 0 0 
0 0 1 
0 1 0 
0 1 1 
1 0 0 
1 0 1 
1 1 0 
1 1 1 
OUTPUTs 
1 1 1 0 0 0 0 1 0 
1 1 0 0 0 0 1 1 0 
1 0 0 0 0 1 1 1 0 
1 0 0 1 0 0 1 0 0 
0 1 1 0 1 0 0 1 0 
0 1 1 0 0 0 0 1 0 
0 0 0 0 1 1 1 1 0 









The parser converts this to the following code: 
FIELD NEW_FUNCTION<n>; 
NEW_FUNCTION := ALU_OP=0X1; 
OUTPUT_1 := ALU_OP=0XX; 
OUTPUT_2 := ALU_OP=X0X; 











Once these conversions are completed, all of the equations are in the NOR form, 
which can easily be implemented in the instruction decoder. We have effectively 
widened the microcontrol word, and we also have a list of equations which drive 
these extra microcontrol word inputs. 
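One way to picture the decode conversion: for each output column, collect the ALU_OP codes for which the output is 1 and ask whether a single 0/1/X pattern covers exactly those codes. If so, one NOR gate suffices. Otherwise the output needs an OR of several patterns and therefore a virtual input. The sketch below is an assumption about how such a test could be written, not the parser's actual algorithm, and `cube_for` is a hypothetical name.

```python
def cube_for(codes, nbits):
    """Return a 0/1/X pattern covering exactly `codes`, or None if none exists."""
    codes = set(codes)
    pattern = ""
    for i in range(nbits):
        column = {code[i] for code in codes}
        # A bit that is constant across all codes is kept; otherwise it is X.
        pattern += column.pop() if len(column) == 1 else "X"
    # The pattern is exact only if it covers no codes beyond those requested.
    if 2 ** pattern.count("X") != len(codes):
        return None
    return pattern
```

For example, the codes 000, 001, 010 and 011 collapse to 0XX, which matches the form of the OUTPUT_1 equation above.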
8.3: Build Datapath Core 
At this point, we know the width of the datapath, the equations for each of the 
control lines and virtual microcontrol word inputs, plus we have the 12 datapath 
element functions. We are set to generate the layout for the core of the datapath. 
The datapath core consists of the actual registers and ALUs, without the control line 
buffers or instruction decoder. 
Before we actually generate the layouts, we need to determine the physical sizes of 
the datapath bits. In Bristle Blocks, we chose global optimization, in which all 
datapath bits have the same physical height, over local optimization, which would 
require routing between cells. Figure 8-4 illustrates the two possible alternatives. 








Fig. 8-4: Comparison of Stretching and Routing 
In the first case, we would route between the cells. The electrical properties of the 
individual cells would not change, but we would degrade the bus signals, and we 
would take horizontal chip area for the routing. In the second case, we would 
stretch the cells so that the interfaces match, and the cells could plug together with 
no routing. Horizontal area is saved at the cost of some vertical area. Additionally, 
the control line signals would degrade, and the electrical properties of the cells 
could change. After an analysis of the situation, it was determined that the best 
approach would be to stretch the cells. But rather than externally stretching the 
cells, which would play havoc with the electronics, we design the cells to accept 
stretching parameters, so that the cell generates the stretched layout. In this way, 
the cell can monitor the stretching and alter its geometries to preserve the 
electronics. The cell may also select one circuit topology from several potential 
topologies, depending upon the physical size of the datapath. 
Fig. 8-5: Single Bit Floorplan 
The first step, therefore, is to determine the stretch point values. Figure 8-5 shows 
the floorplan of a single bit of the datapath. We have chosen to position the Ground 
line which runs through the cell along the x-axis. Above this line we have the 
upper bus, at a y-coordinate of 'Y1', and a VDD line, at a y-coordinate of 'Y3'. 
Similarly, we have the lower bus and VDD line below the Ground line, at 
y-coordinates of 'Y2' and 'Y4'. Finally, the width of the power lines is 
POWER_WIDTH. Each datapath element function can be thought of as an object, as in 
object-oriented programming. To determine the stretch point values, we just ask 
each function what its minimum requirement is for each spacing. We take the 
largest of each of these values as the spacing between stretch points. In addition, 
we ask each element for its power consumption. By summing the power 
consumptions, we can determine the necessary width of the power lines. In figure 
8-6 we show the register cell stretching itself to match the requirements of the 
system, while figure 8-7 shows the stack cell. The stack cell uses an alternate 
layout when the stretching is great enough. 
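The negotiation described here amounts to a maximum over the elements for each spacing and a sum for the power. A minimal Python sketch follows. The dictionary layout, the name `negotiate`, and the numbers in the usage note are all hypothetical.

```python
SPACINGS = ("Y1", "Y2", "Y3", "Y4")

def negotiate(elements):
    """Ask every element for its minimum spacings and its power consumption.

    Each element is a dict with a "min" mapping (spacing name -> minimum value)
    and a "power" entry. The largest request wins for each spacing, and the
    power figures are summed so the power line width can be chosen.
    """
    stretch = {s: max(e["min"][s] for e in elements) for s in SPACINGS}
    total_power = sum(e["power"] for e in elements)
    return stretch, total_power
```

For instance, a register asking for Y1 = 10 and an ALU asking for Y1 = 14 would both be laid out with Y1 = 14, so the register stretches to match.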
After we have computed the stretch point values, we can call the individual 
element functions, requesting the layouts. These functions will examine the 
stretch points, along with the parameters passed from the user's specification, to 
determine which layout to use. For example, we have used several types of 
registers in the sample chip, yet there is only one register function. The following 
register configurations have been requested. 
1) Write Lower 
2) Read and Write Upper, and Refresh 
3) Read Lower 
4) Read Upper and Refresh 
5) Write Upper 
Fig. 8-6: Stretching Register Cell 
Fig. 8-7: Stretching Stack Cell 
A register cell layout which performs all five of the basic operations required here 
is shown in figure 8-8. If we were to use this layout in all of the locations that 
require a register, we would be wasting a lot of space, since each of these functions 
requires area. On the other hand, we do not wish to design 31 different registers, 
one for each of the possible configurations. What we do is design one register as a 
program which computes the appropriate layout from the functions required. 
Figure 8-9 shows the five resulting layouts needed by our sample chip. 
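The figure of 31 is the number of non-empty subsets of the five basic operations. The idea of one register program serving all of them can be sketched as follows. The sub-block name `STORAGE_CELL` and both function names are hypothetical, and the real cell computes geometry rather than a list of names.

```python
from itertools import combinations

OPERATIONS = ("READ_UPPER", "WRITE_UPPER", "READ_LOWER", "WRITE_LOWER", "REFRESH")

def register_layout(requested):
    """Return the sub-blocks needed for one register instance."""
    unknown = set(requested) - set(OPERATIONS)
    if unknown:
        raise ValueError("unknown register options: %s" % sorted(unknown))
    # Only the circuitry for the requested operations is included.
    return ["STORAGE_CELL"] + [op for op in OPERATIONS if op in requested]

def configurations():
    """Enumerate every non-empty combination of the five operations."""
    return [combo
            for n in range(1, len(OPERATIONS) + 1)
            for combo in combinations(OPERATIONS, n)]
```

A register that only writes the lower bus, such as the one in the input port, then carries a single option block instead of all five.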
Fig. 8-8: Complete Register Cell 
As the cell is computing the layout, it is very easy to add information about 
connection points: where the control lines interface to the datapath core. If this 
information were not captured with the layout, some program would have to 
determine these positions later, which means that this program would have to have 
intimate knowledge of each datapath cell, and would have to duplicate much of the 
Fig. 8-9: Sample Chip Register Instances 
computation performed by the cells. Instead, we choose to add information to the 
layout datastructure to indicate where the control lines connect. In fact, we need to 
know more than just position. What are these connection points to connect to? We 
have told the datapath elements which microcode equations drive each line, and the 
datapath element knows what style of buffer is required for each line. By adding 
this information to the connection point, the buffer program can generate the 
buffers by looking at the datapath core layout, and the instruction decoder program 
can generate its layout by looking at the result of the buffer program. 
Once we have generated the layouts for each datapath element, we may abut the 
elements and finish the datapath core. To simplify the abutment procedure, we 
have defined the following conventions regarding the left and right edge 
characteristics of datapath cells: All geometry within a cell must have positive 
x-coordinates. All geometric primitives must be at least half the minimum design 
rule spacing from x=O. For instance, a diffusion feature must be at least 1.5 lambda 
from the edge of the cell. We can state the width of the cell as being the minimum 
x-coordinate that is at least half of the minimum design rule spacing from all 
geometric features of the cell. Therefore, if a diffusion edge has the largest x 
coordinate, the cell's width is 1.5 lambda beyond that coordinate. If we place the 
first datapath element at the origin, and displace all other elements by the widths of 
all elements to their left, we will have no design rule violations between cells. 
Notice that the two data buses and the power buses do not enter into the width 
calculation, for these lines must connect between cells. The layouts produced in 
this manner are large. Most of the elements communicate with the buses with 
diffusion connections. We will therefore allow a cell to place a diffusion-to-metal 
feedthrough on either edge of a cell, to connect to either bus. If a neighbor cell also 
connects to the bus, they will both share the same feedthrough. 
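The width and placement conventions can be expressed compactly. In this sketch only the 1.5 lambda half-spacing for diffusion comes from the text. The function names and the representation of a feature as a (layer, largest x) pair are assumptions, and other layers would need their own entries in the table.

```python
# Half of the minimum design rule spacing, in lambda, per layer.
HALF_SPACING = {"diffusion": 1.5}

def cell_width(features, half_spacing=HALF_SPACING):
    """Smallest x that clears every feature by half the design rule spacing.

    features is a list of (layer, largest x-coordinate) pairs. The buses and
    power lines are left out because they must connect between cells.
    """
    return max(x_max + half_spacing[layer] for layer, x_max in features)

def abut(widths):
    """Place the first cell at the origin and each later cell after its left neighbors."""
    offsets, x = [], 0.0
    for width in widths:
        offsets.append(x)
        x += width
    return offsets
```

Because every cell also keeps its geometry half a spacing away from x=0, two abutted cells are always separated by at least one full design rule spacing.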
Once we have completed the abutment process, we have finished the datapath core. 
Figure 8-10 shows the layout of the core for our sample chip. In addition to the 
layout, we have connection points along the lower edge of the layout for 
connecting to the buffers, and we have connection points on the left and right edges 
for connecting to pads. 
Fig. 8-10: Sample Chip Core Layout 
8.4: Add Buffers and PLSRs 
Given a datapath core, we need to add buffers to each of the control lines. These 
buffers latch values from the instruction decoder during one clock phase and drive 
the control lines on the other clock phase. These buffers satisfy the electrical 
constraints of driving large loads from the weak signals of the decoder. They also 
satisfy the timing constraints by allowing the instruction decoder and datapath to 
work in parallel rather than in series. This parallelism allows the chips to run 
faster, and removes the possibility of race conditions. 
To facilitate the testing of chips, we would like to independently test the 
instruction decoder and the datapath. If we had to test the two units together, it 
could take a fantastically long time to verify that the chip functions correctly. By 
splitting the testing task into testing the two pieces in isolation, one can hope to 
test the chip completely. We do this by adding Parallel-Load Shift Registers 
(PLSRs) between the decoder and the datapath. As it turns out, the circuitry 
required by the buffers and the PLSRs has a lot in common. If we therefore design 
the PLSRs into the buffers, we can save a lot of area. The buffer routine adds the 
output driver of the buffers, while the PLSR routine adds the remainder of the 
buffers and the PLSRs. 
The datapath core tells us which buffers it needs on which control lines, since this 
information is present in the connection points. We can arrange our buffer 
program to generate the buffer layouts in the same order as the connection points, 
so that we may river route between the buffers and the core. To generate the 
buffers, we need to compute the positions of the individual buffers. If we take the 
positions of the connection points as a first approximation, we will generate buffers 
which are as close as possible to the wires they drive. Given this first 
approximation, we move any buffers which are too close to neighboring buffers. 
We continue to shift buffer positions until none of the buffers overlap, and then 
we river route to the core. Figure 8-11 shows the buffer program's output for the 
sample chip. 
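The placement step can be approximated by a single left-to-right sweep: each buffer is centered on its connection point unless that would overlap the buffer to its left, in which case it is pushed right. This is a simplified stand-in for the iterative shifting described above, under the assumption that the connection points are already sorted by x. The name `place_buffers` is hypothetical.

```python
def place_buffers(targets, widths):
    """Return buffer center positions that do not overlap.

    targets are the x-coordinates of the connection points (sorted ascending)
    and widths are the corresponding buffer widths.
    """
    positions = []
    right_edge = float("-inf")
    for x, width in zip(targets, widths):
        left = max(x - width / 2, right_edge)   # shift right only when needed
        positions.append(left + width / 2)
        right_edge = left + width
    return positions
```

Keeping the buffers in the same order as the connection points is what allows the wires to be river routed without crossings.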
Fig. 8-11: Sample Chip Buffers 
Some of the buffers drive control lines independent of the microcontrol word. 
Whenever the user specifies ALWAYS, NEVER, VDD, or GND for a control line 
function, that control line does not connect to the instruction decoder. Because 
some control lines may not connect to the instruction decoder, and because the 
buffers may have to shift positions to avoid overlaps, we put connection points on 
the buffer layouts. The PLSR program does not have to compute the positions of the 
buffer inputs; this information is given in the connection points. In fact, the type 
of the PLSR which needs to connect to each control line can be deduced from 
information in these connection points. The PLSR program operates in the same 
manner as the buffer program, positioning and shifting the individual circuits to 
avoid overlaps. Figure 8-12 shows the PLSR circuit and river route. 
Fig. 8-12: Sample Chip PLSRs 
8.5: Add Instruction Decoder 
After the buffers and PLSRs are added to the core, we are ready to add the 
instruction decoder. The PLSR cells have connection points which state position 
and microcode equations. In addition, we have the microcode equations for the 
virtual control word inputs from the OR functions and the DECODE functions. We 
generate the instruction decoder layout in three steps. The first step initializes the 
decoder. The second step adds the virtual input NOR gates and connections to pads. 
The third step packs the wires to conserve chip area. 
To initialize the decoder, we add the NOR gates which drive the PLSRs. These NOR 
gates are inserted in the column closest to the PLSR which is to be driven. Next, we 
add the NOR gates to generate the virtual equations. These NOR gates are driving the 
inputs of other NOR gates. We may potentially have a NOR gate driving a wire 
which extends the whole width of the chip and drives many NOR gates. If this 
were to be allowed, the instruction decoder would run very slowly. To avoid these 
long delays, we will limit the loading which we will add to a NOR gate. As we are 
scanning across the decoder, if we notice the load getting too great, we will 
terminate the line and regenerate the signal where it is needed. To do the scanning, 
we need to sort the virtual inputs before adding the NOR gates. We sort the list so 
that equations in the list only depend on equations occurring later in the list and 
not on any equations earlier in the list. We then take equations off the list in order, 
adding the NOR gates to the decoder as we go. When we have finished adding the 
virtual equations, the equations remaining in the list are in fact the actual 
microcontrol word inputs, so we connect each of these to wires which will connect 
to pads. 
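The ordering requirement on the virtual inputs is a topological sort. A sketch using Python's standard `graphlib` module is given here. The function name `sort_equations` and the equation names in the usage note are illustrative assumptions, and the load-limiting step is not modeled.

```python
from graphlib import TopologicalSorter

def sort_equations(depends_on):
    """Order equations so each depends only on entries later in the list.

    depends_on maps an equation name to the set of names it reads.
    Names that appear only as dependencies are real micro-control inputs.
    """
    # static_order() yields dependencies first, so the list is reversed.
    order = list(TopologicalSorter(depends_on).static_order())
    order.reverse()
    return order
```

With a control line CTRL that reads OR_1, where OR_1 reads NEW_FUNCTION and ALU_OP, and NEW_FUNCTION reads ALU_OP, the sorted list ends with ALU_OP, which would be wired to pads.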
When we have completely generated the instruction decoder, we pack the wires to 
save space. The packed instruction decoder for the sample chip is shown in figure 
8-13. 
Fig. 8-13: Sample Chip Decoder 
8.6: Add Pads 
When the instruction decoder has inputs which come from pads, it adds wires to 
the edge of the cell. To the ends of these wires, it attaches connection points which 
will tell the pad router of the existence of the wire and of the type of pad required. 
Similarly, the datapath elements have previously generated connection points 
calling for pads. Based upon this information, along with power consumption 
information, the pad router can add the pads to the chip. If this datapath is to be a 
complete chip, the pad router can be called, which completes the chip. If this 
datapath is to be a portion of a chip, the datapath can be used as is, and the 
connection points are available to aid in the interfacing to the datapath. 
The pad router gathers all connection points which connect to pads. It then 
determines how many pads are needed, and can tell what types of pads are required. 
It places the pads and 'Rota-Routes' them as described in the River Router appendix. 
This Rota-Route shifts the pads around the chip in an attempt to minimize the wire 
lengths. The box river router is then called to route wires to the pads, and the chip 
is complete. Figure 8-14 shows the pad layout for the chip. 
8.7: Conclusions 
This chapter has described how Bristle Blocks builds a chip from the user 
specification. It can be seen that much of the task is geared toward this particular 
style of chip. This focus upon the floorplan does restrict the capabilities of the 
compiler to a very specific class of chip. On the other hand, this also allows Bristle 
Blocks to compile highly optimized chips, and it also relieves the user of a lot of 
specification, since much of the specification can be implied from the structure of 
Bristle Blocks chips. 
Fig. 8-14: Sample Chip Pads 
Chapter 9: Bristle Blocks Examples 
In this chapter, we will show several chips designed by Bristle Blocks. These 
examples will not only reinforce the language aspects of Bristle Blocks, but also 
illustrate how the design methodology is impacted by a silicon compiler, even one 
with a limited floorplan. 
9.1: Lamp Dimmer 
The lamp dimmer chip is a variation of a chip designed by Ron Ayres. Ron wanted a 
chip which he could use to control the brightness of a lamp. A diagram of Ron's 
setup is shown in figure 9-1. Several of these lamp control chips would be 
connected to a small processor via a serial bus. The processor could send commands 
to the lamp chips over this bus. The commands would be to select a particular 
device by its address or to set a device's lamp brightness to a given value. The lamp 
chips would drive Triacs, which controlled each lamp's power supply. 
Fig. 9-1 (labels: Processor, Terminal, Serial Communication Bus) 




A block diagram of the lamp dimmer is shown in figure 9-2. We have an 8-bit 
shift register which reads the serial data from the command bus and drives the 
6-bit data bus and 2-bit instruction bus. The data bus can load into the address 
register for modifying the device's bus address. The data bus is also compared with 
the address register to determine whether the microprocessor is selecting this 
device. Finally, the data bus can load into the value register, which holds the 
current lamp brightness value. 
Fig. 9-2: Lamp Dimmer Chip Block Diagram 
To drive the Triac, we need to convert the data in the value register to a time. For a 
bright lamp, we want to pulse the Triac soon after the zero crossings of the AC line 
current. Conversely, for a very dim lamp, we should trigger the Triac just before 
the zero crossings. We will convert the data value to a time by comparing the value 
register's contents with the contents of a counter. The counter will be reset at the 
zero crossings of the AC current, and will be clocked so that the counter reaches full 
count just before the next zero crossing. 
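The value-to-time conversion can be modeled behaviorally. The sketch below assumes that a larger value means a brighter lamp and that the Triac fires when the counter equals the complement of the value. Neither detail is stated in the text, so both are assumptions, as are the function names.

```python
def firing_tick(value, bits=6):
    """Counter value at which the Triac is pulsed (assumed: brighter fires earlier)."""
    full_count = (1 << bits) - 1
    if not 0 <= value <= full_count:
        raise ValueError("value does not fit in the register")
    return full_count - value

def triac_pulses(value, half_cycles=2, bits=6):
    """List (half cycle, counter) pairs at which the Triac is triggered."""
    target = firing_tick(value, bits)
    fired = []
    for half in range(half_cycles):
        for counter in range(1 << bits):   # the counter is reset at each zero crossing
            if counter == target:
                fired.append((half, counter))
    return fired
```

Under these assumptions the maximum value fires at count 0, immediately after the zero crossing, and a value of 0 fires at full count, just before the next crossing.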
The 2-bit instruction bus drives the control logic section of the chip. The EXECUTE 
pin is used to indicate when the instruction bus and data bus contain valid data. 
When EXECUTE is high, the 2-bits are decoded as follows. An instruction of 00 
initializes every device to its initial address. This initial address is read from a 6-bit 
input port, which is hard wired on each chip to a unique number. When the 
instruction is 01, the processor is selecting a new device. Each chip compares the 
data bus value to its address value and, if they match, the chip becomes enabled. 
When the instruction is 10, all selected devices will load their address registers 
from the data bus, allowing the processor to change the address of any device. 
Finally, when the instruction is 11, all selected devices will load their value 
register from the data bus. 
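The four instructions can be summarized in a small behavioral model. This is a sketch of the command protocol as described, not of the chip's circuitry. The class and attribute names are hypothetical, and a device is assumed to be deselected whenever a select command carries a different address.

```python
class LampChip:
    """Behavioral model of one lamp dimmer device on the serial bus."""

    def __init__(self, hardwired_address):
        self.hardwired_address = hardwired_address   # the 6-bit input port
        self.address = None
        self.selected = False
        self.value = 0

    def execute(self, op, data):
        """Apply one command while EXECUTE is high."""
        if op == 0b00:                       # initialize every device
            self.address = self.hardwired_address
        elif op == 0b01:                     # select the device whose address matches
            self.selected = (data == self.address)
        elif op == 0b10 and self.selected:   # load a new address
            self.address = data
        elif op == 0b11 and self.selected:   # load the brightness value
            self.value = data
```

Every chip on the bus sees every command, so the model is driven by calling `execute` on each device in turn with the same operation and data.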
From this description, we can start mapping the chip specification into Bristle 
Blocks. Since most of the data widths are 6 bits, we will set the chip width to 6. 
We can also state what the microcontrol word looks like. We need a bit to state 
whether this chip is the selected chip. We call this the ACTIVE bit. Next, we need 
two bits which contain the current instruction from the shift register, which we 
call the OP bits. The SYNC bit clears the counter. This signal goes high at each 
zero-crossing of the AC line. EXECUTE is an input which states when the 
instruction and data values are valid. We also need a data input, which we call 
INPUT. Our specification to Bristle Blocks now looks like this. 
NAME RONS_CHIP 6; 
PRECHARGE_BOTH PCHG; 
FIELD ACTIVE<1>,OP<2:3>,SYNC<4>,EXECUTE<5>,INPUT<6>; 
We define macros for each of the four basic instructions executed by the lamp 
dimmer chip. We have also defined a macro NOT_INITIALIZE which is true for any 
instruction but INITIALIZE. 
MACRO INITIALIZE() % OP=00 AND EXECUTE=1 % 
MACRO SELECT() % OP=01 AND EXECUTE=1 % 
MACRO LOAD_ADDR() % OP=10 AND EXECUTE=1 AND ACTIVE=1 % 
MACRO SET_VALUE() % OP=11 AND EXECUTE=1 AND ACTIVE=1 % 
MACRO NOT_INITIALIZE() % NOT(OP=00) AND EXECUTE=1 % 
We can now list the datapath elements we require for this chip. These elements are 
the command shift register, the initial address port, the address register and 
comparison unit, the value register and comparison unit, and the counter. 
The command shift register must be 8-bits long, but our datapath is only 6-bits 
wide. However, we can think of the 8-bit register as a 6-bit register followed by a 
2-bit register. The 6-bit register will contain the data portion of the command 
when the ENABLE bit is TRUE, at which time the 2-bit register is holding the 
operation portion of the command. The 6-bit register is simply a LEFT_RIGHT_SHIFTER, 
while the 2-bit register is a SHIFTIN~R, since we need to access the 
register's value in the instruction decoder. These two elements are specified by 
FIELD MSB<7>; 
LEFT_RIGHT_SHIFT 
INPUT_REGISTER: 
SHIFT_LEFT: 

SHIFT_LEFT: 
SHIFT_RIGHT: 
DATA 










The data in the 6-bit register should write to the lower bus for every operation 
except INITIALIZE. This shifter always shifts right, never left. The input comes 
from the INPUT pin, and the output drives a new microcode bit called MSB. This bit 
supplies the input data for the 2-bit shifter. The 2-bit shifter always shifts left, 
never right, and we feed the input of the shifter from the MSB bit, which is the 
output of the 6-bit shifter. The first and second bits in this shifter drive the OP 
field of the microcontrol word. 
The next element we would like to design is the initial address port. This element 
is an input port which should transfer its data to the address register during an 
INITIALIZE instruction. The data input shift register does not write the lower bus 
during INITIALIZE, so we can have the input port drive the lower bus. The 
specification for the input port is simply 
INPUT_PORT FIXED_ADDRESS 
REGISTER: [WRITE_LOWER: /*INITIALIZE*/], 
LOAD: ALWAYS; 
This element always loads its internal register from the pads, and drives the lower 
bus during the INITIALIZE instruction. 
The address register and comparison unit must contain a latch to save the device 
address, a comparator to compare the device address to the select address, and a 
mechanism for saving the result of the comparison. To maintain the comparison 
result, we can either have a single bit latch for holding the value, or we may have a 
register to hold the select address and continuously perform the comparison. In 
Bristle Blocks, all registers have the same width as the datapath, so a 1-bit register 
takes as much area as a 6-bit register. Therefore, we choose to have a register for 
the select address and we will continuously compare the address and select 
registers. To compare two registers' values, we use a subtracter with a 'value 
checker' on its output. This unit will compute the difference between its two 
input values, then compare this difference to a fixed constant. With a fixed 
constant of 0, this element's RESULT will be TRUE when the two input values are 
equal. 
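The comparison described above can be sketched in a few lines of Python. This is only an illustration of the subtract-then-check idea, not the Bristle Blocks element itself; the function name and the modulo-64 truncation to the 6-bit datapath are assumptions made for the sketch.

```python
WIDTH = 6                       # datapath width of the lamp dimmer chip
MASK = (1 << WIDTH) - 1         # keeps results within WIDTH bits (assumption)

def subtract_and_check(input_a: int, input_b: int, value: int = 0) -> bool:
    """Return True when (input_a - input_b), truncated to WIDTH bits, equals value."""
    difference = (input_a - input_b) & MASK
    return difference == (value & MASK)

device_address = 0b010110
# With the fixed constant left at 0, the result is TRUE only for equal inputs.
print(subtract_and_check(device_address, 0b010110))
print(subtract_and_check(device_address, 0b010111))
```

With the constant fixed at 0, the check reduces to an equality test, which is how the address checker uses it.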
SUBTRACTER_WITH_VALUE_CHECK ADDRESS_CHECKER
VALUE: 000000,
RESULT: ACTIVE,
INPUT_A: [READ_LOWER: /*INITIALIZE*/ OR /*LOAD_ADDR*/,
REFRESH: ALWAYS],
INPUT_B: [READ_LOWER: /*SELECT*/,
REFRESH: ALWAYS],
LOAD: ALWAYS;
The INPUT_A register is the device address register. This register reads data from 
the lower bus during the INITIALIZE instruction, and during the LOAD ADDRESS 
instruction if the chip is currently the selected device. The INPUT_B register contains 
the select address. This register reads the lower bus during the SELECT operation. 
Next, we will specify the value register. This register should load from the lower 
bus during the SET_VALUE instruction. The contents of this register should be 
available for comparison with the counter's value. We can use the upper bus for 
this transfer. Since there are no other transfers on the upper bus, we can simply 
drive the upper bus from the register every clock cycle. 
REGISTER VALUE
OPTIONS: [READ_LOWER: /*SET_VALUE*/,
REFRESH: ALWAYS,
WRITE_UPPER: ALWAYS];
Finally, we need to specify the counter and comparison unit. In the chip 
specification, we stated that we wish to compare the data in the value register to 
the value in a counter. This counter is reset at the zero-crossing of the AC current, 
and simply increments each clock cycle. The clock cycle for the chip is adjusted so 
that the counter overflows at the next zero-crossing of the AC current. Rather than 
having an incrementer and a comparison unit, we can have a decrementer which is 
initialized to the value in the VALUE register at the zero crossing, and simply 




INPUT_REGISTER: [READ_UPPER: SYNC=1 AND NOT(/*SET_VALUE*/),
READ_LOWER: SYNC=1 AND /*SET_VALUE*/],
LOAD: ALWAYS,
CARRY_OUT: PAD;
We use the upper bus to transfer data from the VALUE register to the decrementer 
at the zero crossings. Problems arise when the zero crossing occurs during the same 
clock cycle as a SET_VALUE instruction, because the VALUE register would be 
loading from one bus while driving the other. The register documentation in 
chapter 7 states that simultaneous read and write operations cause garbage data to 
be driven onto the written bus. Therefore, the decrementer reads its data from the 
lower bus if SYNC is high and a SET_VALUE instruction is being executed. 
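The trade of an incrementer plus comparator for a preset decrementer can be checked with a small simulation. The sketch below is hypothetical: the function names are ours, and we assume the event of interest is the counter reaching the target (or zero) a given number of clock cycles after the zero crossing. The last function restates the bus choice described above.

```python
from typing import Optional

def fire_with_comparator(value: int, cycles: int) -> Optional[int]:
    """Count up from zero and report the first cycle where the counter equals value."""
    counter = 0
    for cycle in range(cycles):
        if counter == value:
            return cycle
        counter += 1
    return None

def fire_with_decrementer(value: int, cycles: int) -> Optional[int]:
    """Preset the counter to value and report the first cycle where it reaches zero."""
    counter = value
    for cycle in range(cycles):
        if counter == 0:
            return cycle
        counter -= 1
    return None

def decrementer_source(sync: bool, set_value: bool) -> Optional[str]:
    """Bus the decrementer reads at a zero crossing (None when SYNC is low)."""
    if not sync:
        return None
    return "lower" if set_value else "upper"
```

Under these assumptions both counters mark the same clock cycle for every 6-bit value, so the decrementer needs no separate comparison unit.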
We have now described each of the elements for the lamp dimmer chip. We need 
only decide the order in which the elements should be placed in the datapath, since 
the order of implementation will be the order in which we specify the elements. The order 
does not matter a great deal, although the port cells are more efficient at either of 
the two ends of the datapath. We will place the fixed address input port on the left 
end of the chip. The complete specification for the lamp dimmer chip is listed here. 
NAME RONS_CHIP 6;
PRECHARGE_BOTH PCHG;
FIELD ACTIVE<1>,OP<2:3>,SYNC<4>,EXECUTE<5>,INPUT<6>,MSB<7>;
MACRO INITIALIZE() % OP=00 AND EXECUTE=1 %
MACRO SELECT() % OP=01 AND EXECUTE=1 %
MACRO LOAD_ADDR() % OP=10 AND EXECUTE=1 AND ACTIVE=1 %
MACRO SET_VALUE() % OP=11 AND EXECUTE=1 AND ACTIVE=1 %
MACRO NOT_INITIALIZE() % NOT(OP=00) AND EXECUTE=1 %
INPUT_PORT FIXED_ADDRESS
REGISTER: [WRITE_LOWER: /*INITIALIZE*/],
LOAD: ALWAYS;
LEFT_RIGHT_SHIFT
INPUT_REGISTER:
SHIFT_LEFT:
SHIFT_RIGHT:
INPUT:
MSB:




















INPUT_A: [READ_LOWER: /*INITIALIZE*/ OR /*LOAD_ADDR*/,
REFRESH: ALWAYS],




OPTIONS: [READ_LOWER: /*SET_VALUE*/,
REFRESH: ALWAYS,
WRITE_UPPER: ALWAYS];
DECREMENTER OUTPUT
INPUT_REGISTER: [READ_UPPER: SYNC=1 AND NOT(/*SET_VALUE*/),
READ_LOWER: SYNC=1 AND /*SET_VALUE*/],
LOAD: ALWAYS,
CARRY_OUT: PAD;
END
Bristle Blocks compiled the layout for this chip in 1.8 minutes. The chip 
dimensions were 78.9 mil by 102.4 mil, and the chip consumed 2.6 ma. Figure 9-3 
shows the bounding boxes for the various sections of the chip. 
9.2: Random Tune Generator 
The Player chip was designed to play pseudo-random melodies. The system block 
diagram is shown in figure 9-4. External to the player chip is an EPROM memory 
chip which contains the melody algorithm. Using the algorithm in the ROM, the 
player chip computes a square wave signal. This square wave is multiplied by the 
note amplitude to generate an 8-bit output value. The output value is converted to 
an analog voltage by a Digital-to-Analog Converter (DAC). 
The melody algorithm is stored as an object-oriented data structure in 
the melody ROM. The ROM is organized as 256 note 'objects'. Each object 
specifies a note, containing a duration, amplitude, and frequency, along with 
potential future notes. A note object is graphically illustrated in figure 9-5. When 
the player chip is playing a note, it generates a square wave with the specified 
duration, amplitude, and frequency. When the given note has finished, the player 
chip will follow one of the four next-note pointers to find the next note. This 
















Fig. 9-4: Random Tune System Block Diagram 
CONTROL_TO_DATA_AND_BACK IR
TO_CONTROL: {<1:3>=> OP; <4:6>=> PAD},
LOAD: OP=0XX OR TIME=1,
REGISTER: [SUGGEST:TIME=0 AND OP=1XX],
TO_DATA: {IF OP=011 THEN RND=XX1 ELSE OP=0X0 FI => 3;
IF OP=011 THEN RND=X1X ELSE OP=001 OR OP=010 FI => 2;
OP=011 => 1;
IF OP=011 THEN RND=XX1 ELSE OP=0X0 FI => 6;
IF OP=011 THEN RND=X1X ELSE OP=001 OR OP=010 FI => 5;
OP=011 => 4};
The TO_CONTROL parameter specifies that bits 1-3 drive the OP field, while bits 4-6 
drive pads, as stated above. The register is loaded from the instruction decoder 
when the OP field equals 0XX or when TIME=1. When the OP field's MSB is low, 
the chip is reading in the note parameters, so the sequencer increments the OP field 
value. When the final parameter is read, the OP field is loaded with 100, 101, 110, or 
111, depending upon the next note to be played. The sequencer then waits until the 
TIME field goes high, indicating that the note has finished playing. 
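The sequencing behavior just described can be written out as a small next-state function. The Python sketch below is our reading of the TO_DATA expressions for bits 1-3, under the assumptions that bit 1 is the most-significant bit of OP and that the register holds its value whenever it is not loaded; the function names are hypothetical.

```python
def to_data(op: int, rnd: int) -> int:
    """Evaluate the three TO_DATA expressions that feed OP bits 1-3 (bit 1 = MSB)."""
    # bit 3: IF OP=011 THEN RND=XX1 ELSE OP=0X0 FI
    bit3 = (rnd & 0b001) != 0 if op == 0b011 else op in (0b000, 0b010)
    # bit 2: IF OP=011 THEN RND=X1X ELSE OP=001 OR OP=010 FI
    bit2 = (rnd & 0b010) != 0 if op == 0b011 else op in (0b001, 0b010)
    # bit 1: OP=011
    bit1 = op == 0b011
    return (int(bit1) << 2) | (int(bit2) << 1) | int(bit3)

def next_op(op: int, rnd: int, time: int) -> int:
    """Load from the decoder when OP=0XX or TIME=1; otherwise hold (assumption)."""
    if op < 0b100 or time:
        return to_data(op, rnd)
    return op

# Walk through one note: read four parameters, pick a next note, wait, restart.
op = 0b000
trace = [op]
for time in (0, 0, 0, 0, 0, 1):
    op = next_op(op, rnd=0b010, time=time)
    trace.append(op)
print([format(state, "03b") for state in trace])
```

The trace steps through 000, 001, 010, 011, then jumps to 1XX with the low bits taken from RND, holds there while TIME is low, and returns to 000 once TIME goes high.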
The pseudo-random number generator uses a shift register with feedback logic. 
The feedback logic computes the shifter input value as a function of the current 
shift register data. With an appropriate feedback function, the random number 
stream repeats every 255 cycles, which is the maximal cycle length attainable 
using this form of generator. The RESET2 input, which comes from a pad, will 
clear the shift register. This input allows the user to alter the random number 
sequence. Without providing this reset, the system may only produce one fixed 
melody if the random number shift register always initializes with the same value 










RND=100 OR RND=010 OR RND=001 OR RND=111,
[SUGGEST: RESET2=1, VALUE:00000000, REFRESH:OP=1XX];
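The claim that an 8-bit shift register with feedback can cycle through 255 states is easy to check in software. The sketch below does not reproduce the feedback function of the player chip; it assumes a common XOR tap set for 8-bit registers (taps 8, 6, 5, 4), chosen only because it is known to give a maximal-length sequence. In this XOR form the all-zero state maps to itself, so the sketch starts from a nonzero seed.

```python
def lfsr_step(state: int) -> int:
    """Advance an 8-bit shift register one step using XOR feedback (assumed taps)."""
    feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    return (state >> 1) | (feedback << 7)

def cycle_length(seed: int) -> int:
    """Number of steps before the register returns to its starting state."""
    state, steps = lfsr_step(seed), 1
    while state != seed:
        state = lfsr_step(state)
        steps += 1
    return steps

print(cycle_length(0b00000001))
```

Starting from any nonzero seed, the register visits every nonzero 8-bit state before repeating, which is the 255-cycle maximum mentioned above.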
The ROM interface is fairly straightforward. An output port supplies the upper 8 
address bits for the ROM. These bits select which note object is the active note. 
This register is loaded with a new value when the chip begins to play a new note. 
The register is cleared when RESET1 is high, which allows the user to reinitialize 
the melody. An input port reads the data from the ROM. This port always drives 
the data onto the lower bus. The Bristle Blocks specification of these two ports is 
shown here. 
OUTPUT_PORT SEGMENT







REGISTER: [WRITE_LOWER: ALWAYS];
The frequency divider is implemented as a 16-bit down-counter. This counter is 
initialized to the frequency value read from the ROM. The counter then decrements 
once each clock cycle. When the counter's data reaches zero, the frequency divider 
is reinitialized to the frequency value, and the square wave output changes sign. 
The 16-bit counter is implemented as a pair of 8-bit decrementers. Both 
decrementers decrement their values each clock cycle. If the least-significant 
word's value does not cause a carry, the most-significant value is reset to its 
pre-decremented value. In effect, the most-significant word is not decremented 
unless the least-significant word caused a carry. When both decrementers have a 
carry output, both counters are set to the frequency value and the square wave 
changes sign. The frequency divider is specified as follows. 
SWAPPING_DECREMENTER FREQUENCY_LOW
ACTIVE: [SUGGEST:NEVER],
BACKUP: [READ_LOWER:OP=011, REFRESH:ALWAYS],
RESTORE: FREQ=11,
LOAD: ALWAYS,
CARRY_OUT: FREQ BIT 1;
REGISTER FREQUENCY_HIGH OPTIONS:[WRITE_UPPER:ALWAYS,
READ_LOWER:OP=010,
REFRESH:ALWAYS];
SWAPPING_DECREMENTER FREQUENCY_HIGH_DEC
ACTIVE: [READ_UPPER:FREQ=11 OR OP=011],
BACKUP: [READ_UPPER:FREQ=11 OR OP=011, REFRESH:ALWAYS],
LOAD: ALWAYS,
CARRY_OUT: FREQ BIT 2,
RESTORE: FREQ=X0,
SAVE: FREQ=X1;
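The paired-decrementer scheme can be compared against a plain 16-bit down-counter in software. The following Python sketch is an illustration only: it assumes that a decrementer raises its carry output when it is decremented from zero, and the function names are ours.

```python
def step_pair(high: int, low: int, freq: int):
    """One clock of two 8-bit decrementers; returns (high, low, toggled)."""
    low_next = (low - 1) & 0xFF
    high_next = (high - 1) & 0xFF
    low_carry = low == 0          # assumed carry condition: decrement from zero
    high_carry = high == 0
    if low_carry and high_carry:
        # Both words carried: reload the frequency value and flip the square wave.
        return freq >> 8, freq & 0xFF, True
    if not low_carry:
        high_next = high          # restore the pre-decremented high word
    return high_next, low_next, False

def step_flat(count: int, freq: int):
    """Reference model: a single 16-bit down-counter; returns (count, toggled)."""
    if count == 0:
        return freq, True
    return count - 1, False

freq = 0x0123
high, low, count = freq >> 8, freq & 0xFF, freq
matches = True
for _ in range(2000):
    high, low, toggled_pair = step_pair(high, low, freq)
    count, toggled_flat = step_flat(count, freq)
    matches = matches and toggled_pair == toggled_flat and ((high << 8) | low) == count
print(matches)
```

Under these assumptions the pair behaves exactly like one 16-bit counter, including the cycles on which the square wave changes sign.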
Next, we need a timer. The timer is preset to the note duration. The timer's value is 
decremented when the TEMPO input is high. When the timer's value becomes zero, 
TIME becomes high, and the next note is played. 
DECREMENTER TIMER
INPUT_REGISTER: [READ_LOWER:OP=000, REFRESH:TEMPO=0],
LOAD: TEMPO=1,
CARRY_OUT: TIME;
To generate the output value, we need to multiply the square wave by the note 
amplitude. As it turns out, square waves have only two values: +1 and -1. When 
the square wave is high, the output value is just the note amplitude, and when the 
square wave is low, the output value is the inverse of the note amplitude. Our 
output section has a swapping output port. The two registers are loaded with the 
amplitude and the inverse amplitude when the note parameters are read. Each time 
the frequency divider produces a carry output, the data in these two registers swap 
places. The output pads are driven with the data contained in one of these registers. 

















[READ_LOWER: OP=001, SUGGEST:ALWAYS],
[READ_UPPER: OP=010, SUGGEST:ALWAYS],
OP=1XX AND FREQ=11,
OP=1XX AND FREQ=11;
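The swapping output port replaces a multiplier with a register exchange. A minimal Python sketch of the idea follows; it assumes that the 'inverse' amplitude is the 8-bit bitwise complement, which the text does not state explicitly, and the function name is hypothetical.

```python
from typing import List

def swapping_output(amplitude: int, carries: List[bool]) -> List[int]:
    """Return the value driven on the output pads for each clock cycle."""
    active = amplitude & 0xFF
    backup = ~amplitude & 0xFF    # inverse amplitude (bitwise complement, assumption)
    samples = []
    for carry in carries:
        if carry:                 # frequency divider carried: swap the two registers
            active, backup = backup, active
        samples.append(active)
    return samples

samples = swapping_output(0x30, [False, True, False, True])
print([hex(sample) for sample in samples])
```

Each carry from the frequency divider flips the output between the two stored values, so no arithmetic is needed while the note is playing.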
The complete chip specification is listed next. Bristle Blocks compiled the chip in 
3.67 CPU minutes, and the final chip size is 140 by 154 mil. The chip consumes 59 
ma. of power at 5 volts. 
NAME PLAYER 8;
FIELD OP<1:3>,RND<4:6>,TIME<7>,FREQ<8:9>,RESET1<10>,TEMPO<11>,RESET2<12>;
OUTPUT_PORT SEGMENT
REGISTER: [READ_LOWER: OP=1XX AND TIME=1,
SUGGEST: RESET1=1,
VALUE: 00000000];
INPUT_PORT DATA





SHIFT_LEFT:
INPUT:
REGISTER:
{<1,7,8>=>RND},
OP=000,
NEVER,
RND=100 OR RND=010 OR RND=001 OR RND=111,
[SUGGEST: RESET2=1, VALUE:00000000, REFRESH:OP=1XX];
CONTROL_TO_DATA_AND_BACK IR
TO_CONTROL: {<1:3>=> OP; <4:6>=> PAD},
LATCH: OP=0XX OR TIME=1,
REGISTER: [SUGGEST:TIME=0 AND OP=1XX],
TO_DATA: {IF OP=011 THEN RND=XX1 ELSE OP=0X0 FI => 3;
IF OP=011 THEN RND=X1X ELSE OP=001 OR OP=010 FI => 2;
OP=011 => 1;
IF OP=011 THEN RND=XX1 ELSE OP=0X0 FI => 6;
IF OP=011 THEN RND=X1X ELSE OP=001 OR OP=010 FI => 5;
OP=011 => 4};
SWAPPING_DECREMENTER FREQUENCY_LOW
ACTIVE: [SUGGEST: NEVER],
BACKUP: [READ_LOWER:OP=011, REFRESH:ALWAYS],
RESTORE: FREQ=11,
LOAD: ALWAYS,
CARRY_OUT: FREQ BIT 1;
REGISTER FREQUENCY_HIGH OPTIONS: [WRITE_UPPER:ALWAYS,
READ_LOWER:OP=010,
REFRESH:ALWAYS];
















OR OP=011],
OR OP=011, REFRESH:ALWAYS],




INPUT_A: [SUGGEST: ALWAYS, VALUE:00000000],
INPUT_B: [READ_LOWER: OP=001],
OUTPUT_REGISTER: [WRITE_UPPER: OP=010],
LOAD: ALWAYS;
PRECHARGE_BOTH PRECHARGE;
SWAPPING_OUTPUT_PORT OUTPUT
ACTIVE: [READ_LOWER: OP=001, SUGGEST:ALWAYS],
BACKUP: [READ_UPPER: OP=010, SUGGEST:ALWAYS],
RESTORE: OP=1XX AND FREQ=11,
SAVE: OP=1XX AND FREQ=11;
END
9.3: Frequency Scaler Chip 
Jeff Sandeen, employed by Hewlett-Packard, Colorado Springs, was on temporary 
assignment to Caltech when he designed the frequency scaler (FRESCA) chip. The 
chip specification presented here is a slight modification of Jeff's design. Jeff 
wanted a chip which scales the frequency of an input waveform. The chip would 
accept a binary waveform, and generate a new binary waveform with the 
frequency scaled, but with the duty factor of the output wave as close as possible to 
the input wave's duty factor. 
The chip counts the number of clock cycles that occur while the input waveform is 
high, and the number of clock cycles occurring while the input signal is low. The 
sum of these two numbers is the period of the input signal. These two numbers are 
multiplied by one user-supplied constant, and divided by another constant, to 
generate two output period numbers. The output generator sets the output high for 
the number of clock cycles indicated by the scaled high period value, then sets the 
output low for the number of clock cycles indicated by the scaled low period value. 
Rather than perform a multiply and divide on the chip, Jeff used incremental 
techniques to achieve the same results. Rather than incrementing a value during 
the high period and multiplying this by one of the scaling factors, we accumulate 
the scaling factor over the high period. We do the divide and decrement by repeated 
subtractions. The simplified block diagram of the FRESCA chip is shown in figure 
9-6. The input section computes the high and low periods, scaled by one of the two 
scale parameters. The storage section stores these two values. The output unit 
computes the output signal, using the period values from the storage section and the 
other scale parameter. Finally, the state section computes when various signals 
change. 
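The incremental technique can be illustrated with a short Python sketch. It accumulates one scale parameter once per input clock cycle and then counts how many times the other parameter can be subtracted. The function and parameter names are hypothetical, and the sketch simply drops the remainder, whereas the chip carries extra elements to correct for round-off.

```python
def scaled_period(cycles: int, multiplier: int, divisor: int) -> int:
    """Scale a period by multiplier/divisor using only additions and subtractions."""
    accumulator = 0
    for _ in range(cycles):           # input section: add once per clock cycle
        accumulator += multiplier
    output_cycles = 0
    while accumulator >= divisor:     # output section: subtract once per clock cycle
        accumulator -= divisor
        output_cycles += 1
    return output_cycles

print(scaled_period(100, 3, 4))       # a 100-cycle high period scaled by 3/4
```

The result equals the integer part of cycles*multiplier/divisor, so neither a multiplier nor a divider is required in the datapath.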
Some additional complexity has been added to the simplified block diagram to 
correct for round-off errors during the counting processes. The SAVE_D_BAR and 
TO_OUTPUT elements are the elements added to improve the counting accuracy. Bristle 
Blocks compiled the FRESCA chip in 3.0 minutes. The chip size was 124 by 177 mil, 
and the chip consumed 68 ma. at 5 volts. The Bristle Blocks specification for the 
chip is shown here. 
NAME FRESCA 16;
MACRO CONST1() % 000000000000XXXX %
MACRO CONST2() % 111111111111XXXX %




Input Storage Output State 








Fig. 9-6: Frequency Scaler Block Diagram 
DELTA_IN<5>,





REGISTER: [REFRESH: ALWAYS,
SUGGEST:LOAD=0X,
VALUE: /*CONST1*/,
WRITE_LOWER: DELTA_IN=0],
MAP: { DATA=XXX1 => 16
DATA=XX1X => 15
DATA=X1XX => 14 ,
DATA=1XXX => 13 },
LATCH: LOAD=0X;
CONTROL_TO_DATA SAVE_D_BAR
REGISTER: [REFRESH: ALWAYS,
SUGGEST:LOAD=X0,
VALUE: /*CONST2*/,
WRITE_LOWER: DELTA_IN=1],
MAP: { DATA=XXX0 => 16
DATA=XX0X => 15
DATA=X0XX => 14 ,
DATA=0XXX => 13 },
LATCH: LOAD=X0;
ADDER INPUT
INPUT_A: [READ_UPPER: DELTA_IN=0,
READ_LOWER: DELTA_IN=1],
INPUT_B: [READ_LOWER: DELTA_IN=0],










[READ_UPPER: DELTA_IN=1 AND IN=0 AND
NOT(DELTA_OUT=0 AND OUT=0),
REFRESH: ALWAYS,
WRITE_LOWER: DELTA_OUT=0 AND OUT=0];
[READ_UPPER: DELTA_IN=1 AND IN=1 AND
NOT(DELTA_OUT=0 AND OUT=1),
REFRESH: ALWAYS,
WRITE_LOWER: DELTA_OUT=0 AND OUT=1];
PRECHARGE_AND_BREAK_UPPER GAP2;
"State Section"
CONTROL_TO_DATA_AND_BACK STATE
REGISTER: [REFRESH:ALWAYS],
LATCH: ALWAYS,
TO_CONTROL: { 1=> PAD ;
2=> OUT ;
3=> OLD_IN ;
4=> DELTA_IN },





IF DELTA_OUT=0 THEN OUT=0 ELSE OUT=1 FI => 2 ;
IN=1 => 3 ;
IF IN=1 THEN OLD_IN=0 ELSE OLD_IN=1 FI => 4 };
[READ_LOWER: DELTA_OUT=0, REFRESH: ALWAYS],
[READ_UPPER: DELTA_OUT=1,
WRITE_UPPER: DELTA_OUT=0],
ALWAYS;













CONTROL_TO_DATA SAVE_D
REGISTER: [REFRESH: ALWAYS,
SUGGEST:LOAD=X0,
VALUE: /*CONST1*/,
WRITE_LOWER: LOAD=X1],
MAP: { DATA=XXX1 => 16
DATA=XX1X => 15
DATA=X1XX => 14 ,




9.4: SDLC Chip 
John Wawrzynek, a member of Caltech's Silicon Structures Project (SSP), was 
interested in building a synchronous, serial communication chip, similar to IBM's 
Synchronous Data Link Control chip, or the synchronous portion of Intel's 8251A 
USART chip. He found that each of these chips had undesirable 'features' because 
the chip designers wanted a 'universal' chip. John realized that with a silicon 
compiler, chips can be optimized to their application, rather than being 'general 
purpose' in nature. 
The SDLC chip is designed to be used with an 8-bit microprocessor. The chip 
contains both a transmit and receive buffer, along with a status/command register. 
The microprocessor interface consists of an 8-bit data port, a read (RD) line, a write 
(WR) line, and a control/data (C_DBAR) line. The system interface consists of a reset 
(RESET) line, transmit clock signal (TXC), and receive clock signal (RXC), along 
with the standard power and clock signals. The network interface consists of the 
transmit data (TX) line and the receive data line (RX). 
Upon RESET, or when the microprocessor sets bit 3 in the status/command register, 
the receiver enters the HUNT mode. In HUNT mode, the receiver circuitry attempts 
to match each 8-bit window in the incoming bit stream, scanning for the SYNC 
character, which is fixed as 10000001. When the sync character is received, the 
SDLC chip terminates HUNT mode and begins assembling characters. 
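HUNT mode amounts to sliding an 8-bit window along the received bit stream. The Python sketch below illustrates this under the assumption that the first bit received is the leftmost bit of the SYNC pattern; the function name is ours and the chip's actual shift direction is not modeled.

```python
from typing import Iterable, Optional

SYNC = 0b10000001

def hunt(bits: Iterable[int]) -> Optional[int]:
    """Return the index of the first bit after a SYNC match, or None if none is found."""
    window = 0
    for index, bit in enumerate(bits):
        window = ((window << 1) | bit) & 0xFF    # shift the new bit into the window
        if index >= 7 and window == SYNC:
            return index + 1
    return None

stream = [0, 1, 1] + [1, 0, 0, 0, 0, 0, 0, 1] + [1, 0]
print(hunt(stream))
```

Character assembly would begin at the returned position, which is the first bit following the sync character.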
Upon RESET, the SDLC chip will transmit SYNC characters until data is written into 
the transmitter buffer. Additionally, whenever a character has finished being 
transmitted, and the transmitter buffer is not full, the SYNC character will be 
transmitted. 
The Bristle Blocks code for the SDLC chip is listed here. Bristle Blocks compiled the 
chip in 2.4 minutes, and the resulting chip size was 95 by 148 mils. The chip 
consumed 36 ma. of power. 
NAME SDLC 8;
MACRO SYNC() % 10000001 %
FIELD RESET<1>,RD<2>,WR<3>,C_DBAR<4>,RXC<9>,TXC<10>,TDONE<5>,RDONE<11>,
TXBUF_FULL<6>,RXBUF_FULL<7>,HUNT_MODE<8>,IS_SYNC<12>,RX<13>;
IO_PORT DATA
OUTPUT_REGISTER: [READ_UPPER: RD=1,




CONTROL_TO_DATA_AND_BACK STAT_CMD
REGISTER: [READ_UPPER: WR=1 AND C_DBAR=1,




TO_CONTROL: { 1=> RXBUF_FULL; 2=> TXBUF_FULL; 3=> HUNT_MODE },
TO_DATA: { RDONE=1 OR RXBUF_FULL=1 AND RD=0
OR RXBUF_FULL=1 AND C_DBAR=1 => 1;
WR=1 AND C_DBAR=0 OR TDONE=0 AND TXBUF_FULL=1 => 2;






SHIFT_RIGHT:
SHIFT_LEFT:
MAP:
[READ_UPPER: WR=1 AND C_DBAR=1,
WRITE_LOWER: TDONE=1,
REFRESH: ALWAYS];
[READ_LOWER: TDONE=1 AND TXBUF_FULL=1,









OPTIONS: [WRITE_UPPER: RD=1 AND C_DBAR=0,
READ_LOWER: RDONE=1,
REFRESH: ALWAYS];
SHIFTER_WITH_VALUE_CHECK R
REGISTER: [WRITE_LOWER: RDONE=1,
SHIFT_RIGHT:






















INPUT_REGISTER: [SUGGEST: RESET=1,
VALUE: 00000001,
END







In another application, the same basic function was required, but due to processor 
overhead time, FIFOs were required on the transmit and receive buffers. In the 
following listing, 8-word deep FIFOs have been added to the two buffers. The 
compile time for this new chip was 6.67 CPU minutes, the chip size was 222 by 
199 mils, and the power requirements were 1 O~ ma. 
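Before the listing, it may help to see how a FIFO can be tracked with a pair of one-hot pointers that rotate through the buffer registers. The Python sketch below is a hedged illustration of that general idea, not a transcription of the chip logic: the class and method names are hypothetical, and the full and empty conventions are assumptions. Only the reset value 10000000 and the wrap from 00000001 back to the top are taken from the pointer elements in the listing.

```python
class OneHotFifo:
    """FIFO indexed by two one-hot pointers that shift right and wrap around."""

    def __init__(self, depth: int = 8):
        self.depth = depth
        self.slots = [None] * depth
        self.in_ptr = 1 << (depth - 1)     # reset value 10000000
        self.out_ptr = 1 << (depth - 1)

    def _rotate(self, ptr: int) -> int:
        # Shift right; the bit leaving position 00000001 re-enters at the top.
        return (ptr >> 1) | ((ptr & 1) << (self.depth - 1))

    def empty(self) -> bool:
        return self.in_ptr == self.out_ptr

    def full(self) -> bool:
        # Assumed convention: full when one more write would catch the read pointer.
        return self._rotate(self.in_ptr) == self.out_ptr

    def push(self, word) -> None:
        if self.full():
            raise OverflowError("FIFO full")
        self.slots[self.in_ptr.bit_length() - 1] = word
        self.in_ptr = self._rotate(self.in_ptr)

    def pop(self):
        if self.empty():
            raise IndexError("FIFO empty")
        word = self.slots[self.out_ptr.bit_length() - 1]
        self.out_ptr = self._rotate(self.out_ptr)
        return word
```

With this convention an 8-register buffer holds at most seven words, since equal pointers are reserved to mean empty.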
NAME SDLC2 8;
MACRO SYNC() % 10000001 %
FIELD RESET<1>,RD<2>,WR<3>,C_DBAR<4>,RXC<9>,TXC<10>,TDONE<5>,RDONE<11>,
TXBUF_FULL<6>,RXBUF_FULL<7>,HUNT_MODE<8>,IS_SYNC<12>,RX<13>,
TXRA<14:21>,TXWA<22:29>,RXRA<14:21>,RXWA<22:28>;
IO_PORT DATA
OUTPUT_REGISTER: [READ_UPPER: RD=1,
WRITE_UPPER: WR=1,
REFRESH: ALWAYS],
LOAD: WR=1,
DRIVE: RD=1;
CONTROL_TO_DATA_AND_BACK STAT_CMD
REGISTER: [READ_UPPER: WR=1 AND C_DBAR=1,




TO_CONTROL: { 1=> RXBUF_FULL; 2=> TXBUF_FULL; 3=> HUNT_MODE },
TO_DATA: { RDONE=1 AND
(RXRA=1XXXXXXX AND RXWA=XXXXXXX1 OR
RXRA=X1XXXXXX AND RXWA=1XXXXXXX OR
RXRA=XX1XXXXX AND RXWA=X1XXXXXX OR
RXRA=XXX1XXXX AND RXWA=XX1XXXXX OR
RXRA=XXXX1XXX AND RXWA=XXX1XXXX OR
RXRA=XXXXX1XX AND RXWA=XXXX1XXX OR
RXRA=XXXXXX1X AND RXWA=XXXXX1XX OR
RXRA=XXXXXXX1 AND RXWA=XXXXXX1X) => 1;
WR=1 AND C_DBAR=0 AND
(TXRA=1XXXXXXX AND TXWA=XXXXXXX1 OR
TXRA=X1XXXXXX AND TXWA=1XXXXXXX OR
TXRA=XX1XXXXX AND TXWA=X1XXXXXX OR
TXRA=XXX1XXXX AND TXWA=XX1XXXXX OR
TXRA=XXXX1XXX AND TXWA=XXX1XXXX OR
TXRA=XXXXX1XX AND TXWA=XXXX1XXX OR
TXRA=XXXXXX1X AND TXWA=XXXXX1XX OR
TXRA=XXXXXXX1 AND TXWA=XXXXXX1X)
OR TXBUF_FULL=1 => 2;
IS_SYNC=0 AND HUNT_MODE=1 => 3;
NOT(RXRA=1XXXXXXX AND RXWA=1XXXXXXX OR
RXRA=X1XXXXXX AND RXWA=X1XXXXXX OR
RXRA=XX1XXXXX AND RXWA=XX1XXXXX OR
RXRA=XXX1XXXX AND RXWA=XXX1XXXX OR
RXRA=XXXX1XXX AND RXWA=XXXX1XXX OR
RXRA=XXXXX1XX AND RXWA=XXXXX1XX OR
RXRA=XXXXXX1X AND RXWA=XXXXXX1X OR
RXRA=XXXXXXX1 AND RXWA=XXXXXXX1) => 4;
NOT(TXRA=1XXXXXXX AND TXWA=1XXXXXXX OR
TXRA=X1XXXXXX AND TXWA=X1XXXXXX OR
TXRA=XX1XXXXX AND TXWA=XX1XXXXX OR
TXRA=XXX1XXXX AND TXWA=XXX1XXXX OR
TXRA=XXXX1XXX AND TXWA=XXXX1XXX OR
TXRA=XXXXX1XX AND TXWA=XXXXX1XX OR
TXRA=XXXXXX1X AND TXWA=XXXXXX1X OR
TXRA=XXXXXXX1 AND TXWA=XXXXXXX1) => 5 },
LATCH: ALWAYS;
MACRO TXBUFREG(NAME,ADR)
% REGISTER TXBUF_?NAME?
OPTIONS: [READ_UPPER: WR=1 AND C_DBAR=0 AND TXRA=?ADR?,
WRITE_LOWER: TDONE=1 AND TXWA=?ADR?,











SHIFT_RIGHT: WR=1 AND C_DBAR=0,
MAP: {<1:8> => TXRA},
REGISTER: [SUGGEST: RESET=1,
VALUE: 10000000],
INPUT: TXRA=00000001;
SHIFTING_IR TXWRITE_POINTER
SHIFT_LEFT: NEVER,
SHIFT_RIGHT: TDONE=1,
MAP: {<1:8> => TXWA},
REGISTER: [SUGGEST: RESET=1,
VALUE: 10000000],
INPUT: TXWA=00000001;
SHIFTING_IR T
REGISTER: [READ_LOWER: TDONE=1 AND TXBUF_FULL=1,





MAP: { 8=> PAD };
PRECHARGE_AND_BREAK_LOWER LOWER_CHARGE;
PRECHARGE_BOTH BOTH_CHARGE;
MACRO RXBUFREG(NAME,ADR)
% REGISTER RXBUF_?NAME?
OPTIONS: [WRITE_UPPER: RD=1 AND C_DBAR=0 AND RXRA=?ADR?,
READ_LOWER: RDONE=1 AND RXWA=?ADR?,











SHIFT_RIGHT: RD=1 AND C_DBAR=0,
MAP: {<1:8> => RXRA},
REGISTER: [SUGGEST: RESET=1,
VALUE: 10000000],
INPUT: RXRA=00000001;
SHIFTING_IR RXWRITE_POINTER
SHIFT_LEFT: NEVER,
SHIFT_RIGHT: RDONE=1,
MAP: {<1:8> => RXWA},
REGISTER: [SUGGEST: RESET=1,
VALUE: 10000000],
INPUT: RXWA=00000001;
SHIFTER_WITH_VALUE_CHECK R
REGISTER: [WRITE_LOWER: RDONE=1,
SHIFT_RIGHT:











INPUT_REGISTER: [SUGGEST: RESET=1 OR HUNT_MODE=1,
VALUE: 00000001,
SHIFT_LEFT:


















9.5: A Microprogrammed Microprocessor 
In this next example, we will see how a silicon compiler allows the user to explore 
alternate system architectures. We will design a microprogrammed microprocessor 
system, similar to the OM2 [15][16] system designed at Caltech. The basic 
architectural plan of the OM system is shown in figure 9-7. We have a datapath 
chip, which contains the scratchpad registers and ALU for the system, a microcode 
controller, which generates microcode addresses, and a microcode memory, which 
contains the instruction code for the machine. Surrounding these three modules are 
application dependent peripheral circuits. The basic system communicates with the 

















Fig. 9-7: OM System Block Diagram 
We will begin by designing a controller chip. The controller provides microcode 
addresses. We need a register to hold the current microcode Program Counter 
(mPC). The usual operation of the controller will be to sequence through a series of 
microcode words, so the mPC will need an incrementer. If we use an adder instead 
of an incrementer, we can perform relative microcode branches. Under normal 
operation, one input to the adder can be set to the value 1, so that the adder 
performs the increment operation. To branch, we merely load this adder input with 
the offset. To do a jump, we can force new data into the mPC register. By including 
a small stack on the chip, we can have subroutines in our microcode. 
Based upon these desires, we can design a Register Transfer (RT) level diagram of 
the datapath, as shown in figure 9-8. We have drawn each of the registers and 
transfer paths. The transfer paths have been labeled to aid in the description of the 
chip operation. The upper bus is used to transfer the new mPC to the PORT unit, 
which drives the address lines of the microcode memory (note: the least-significant 
address is connected to the PHI-2 clock line, so that two words are read from the 
microcode memory every clock cycle). Since we want a new mPC value each clock 
cycle, the A control line should be high every clock cycle, and one of C, D, or E 
should also be high. The mPC latch, which is one of the adder input registers, 
should also be loaded every clock cycle, so the B control line is always high, too. 
For normal operation, we want to increment the mPC value each clock cycle, so the 
OFFSET register should normally contain a value of 1, and the NEW_mPC register, 
which is the adder output, should normally drive the upper bus. Therefore, the L 
control, which loads the OFFSET register with 1, and the C control lines should 
normally be high. To perform a branch, we want to load the OFFSET register with 
the data in the IN_PORT. This transfer is done by enabling the F and J control lines. 
To do a jump, we wish to directly transfer the IN_PORT data into the mPC, so we 
enable the E control line instead of the C control line. 
Fig. 9-8: Controller Register Transfer Diagram 
The STACK unit allows calls and returns in the microcode. To perform a CALL 
operation, we need to push the NEW_mPC value onto the stack and load the mPC 
from the IN_PORT. This operation requires setting the G, I, N, and E control lines 
high. To perform a RETURN operation, we simply pop the top value off the STACK 
and into the mPC. Setting D and M high will perform this transfer. 
We have described five operations performed by the controller chip, which means 
that a 3-bit microcode field is required to specify the operation. We can have up to 
eight operations specified by the 3-bit field, so we can add three more instructions 
to the controller's repertoire without impacting the microcode cost. If we can 
perform these new operations with the existing controller hardware, these new 
instructions are virtually free. One operation we may wish to have is a SAVE 
operation, which will push new data onto the STACK. This operation allows us to 
store a jump address in the controller chip several clock cycles before the jump is to 
occur. When the time comes to jump, the RETURN instruction will transfer the 
jump address to the mPC. We may like to use the two remaining instructions as 
loop control operations. One of the operations would be used at the start of the loop, 
the other at the end. The form of loop we will implement is a DO loop. The DO 
instruction will push the NEW_mPC value on the stack, and the ENDDO instruction 
will move the top-of-stack value into the mPC. 
To allow conditional operations, there will be a condition input to the chip. If the 
condition is TRUE (i.e. the pin is high), the instructions will be executed as stated 
above. If the condition is FALSE, the normal operation, which increments the 
current mPC value, will be executed. If the ENDDO instruction is executed when 
the condition is FALSE, we will say that an UNDO instruction is executed, which 
causes the controller to 'fall out' of the loop. We will increment the mPC value and 
discard the top value on the STACK. 
The following table summarizes the driving functions for each of the control lines. 
Operation  Condition  Active Control Lines  Optional Active Controls
---------  ---------  --------------------  ------------------------
NOP        TRUE       A,B,L,C               G,H,J
           FALSE      A,B,L,C               G,H,J
JUMP       TRUE       A,B,E                 L,F,G,H,J
           FALSE      A,B,L,C               G,H,J
CALL       TRUE       A,B,E,L,G,I,N
           FALSE      A,B,L,C               G,H,J
RETURN     TRUE       A,B,D,M               L,F,G,H,J
           FALSE      A,B,L,C               G,H,J
BRANCH     TRUE       A,B,C,F,J
           FALSE      A,B,L,C               G,H,J
SAVE       TRUE       A,B,C,L,I,N,J
           FALSE      A,B,L,C               G,H,J
DO         TRUE       A,B,C,L,G,I,N
           FALSE      A,B,L,C               G,H,J
ENDDO      TRUE       A,B,D                 G,H,J,L,F
(UNDO)     FALSE      A,B,L,C,M,H,K
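The table can also be read as a lookup from operation and condition to a set of control lines. The Python sketch below encodes only the required (non-optional) lines as we read them from the table; the dictionary and function names are hypothetical.

```python
NORMAL = frozenset("ABLC")               # plain increment of the mPC
UNDO = frozenset("ABLCMHK")              # ENDDO with the condition FALSE

TRUE_LINES = {
    "NOP":    frozenset("ABLC"),
    "JUMP":   frozenset("ABE"),
    "CALL":   frozenset("ABELGIN"),
    "RETURN": frozenset("ABDM"),
    "BRANCH": frozenset("ABCFJ"),
    "SAVE":   frozenset("ABCLINJ"),
    "DO":     frozenset("ABCLGIN"),
    "ENDDO":  frozenset("ABD"),
}

def active_lines(operation: str, condition: bool) -> frozenset:
    """Control lines that must be high for an operation and condition input."""
    if condition:
        return TRUE_LINES[operation]
    return UNDO if operation == "ENDDO" else NORMAL

print(sorted(active_lines("CALL", True)))
print(sorted(active_lines("JUMP", False)))
```

Eight operations fit the 3-bit microcode field, and every operation except ENDDO degenerates to the normal increment when the condition is FALSE.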
The translation of this chip description into the Bristle Blocks specification 
language is straightforward. The Bristle Blocks input is listed here. Bristle Blocks 
compiled the chip in 4.06 CPU minutes, the chip area was 171 by 195 mil, and the 
power requirements were 88.8 ma. 
NAME CONTROLLER 16;
FIELD OP<1:3>,CONDITION<4>,LOAD<5>,DRIVE<6>;




INPUT_A: [READ_UPPER: ALWAYS],
INPUT_B: [READ_LOWER: /*BRANCH*/, SUGGEST: NOT(/*BRANCH*/),
VALUE:0000000000000001],
LOAD: ALWAYS,
OUTPUT_REGISTER:
[WRITE_UPPER: /*NOP*/ OR /*BRANCH*/ OR /*SAVE*/ OR /*DO*/,
WRITE_LOWER: NOT(/*UNDO*/ OR /*BRANCH*/ OR /*SAVE*/)];
PRECHARGE_BOTH PCHG;
STACK STACK
DEPTH: 16,
TOP: [WRITE_UPPER: /*RETURN*/ OR /*ENDDO*/,
WRITE_LOWER: /*UNDO*/,




OUTPUT_REGISTER: [WRITE_UPPER: /*CALL*/ OR /*JUMP*/,
WRITE_LOWER: /*BRANCH*/ OR /*SAVE*/,
READ_LOWER: /*UNDO*/,
REFRESH: ALWAYS],
LOAD: LOAD=1,
DRIVE: DRIVE=1;
We can experiment to see how the stack size affects the area and power 
requirements of the chip. After compiling controllers with stack depths of 8 and 
12, and interpolating and extrapolating the results, the power requirements were 
found to be approximately 28.8 + 3.75*depth ma. and the width of the chip was 
found to be approximately 83 + 5.5*depth mils. 
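These two linear fits are easy to evaluate for other stack depths. The short Python sketch below simply restates the formulas above; the function names are ours.

```python
def power_ma(depth: int) -> float:
    """Approximate controller power in ma. for a given stack depth."""
    return 28.8 + 3.75 * depth

def width_mils(depth: int) -> float:
    """Approximate chip width in mils for a given stack depth."""
    return 83 + 5.5 * depth

for depth in (8, 12, 16):
    print(depth, round(power_ma(depth), 2), width_mils(depth))
```

At a depth of 16 the fits give 88.8 ma. and 171 mils, which agrees with the figures quoted for the compiled controller.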
Fig. 9-9: Datachip Block Diagram 
Next, we can design the datachip for the microprogrammed processor. We need two 
bi-directional data ports, some general purpose registers, a fixed constant source, a 
shifter, and an Arithmetic/Logic Unit (ALU). A block diagram of the proposed chip 
is shown in figure 9-9. Each of the registers in the chip communicates with two 
data buses. We can assign a unique bus address for each of the registers. We can 
decode the microcode to allow two transfers per clock cycle. There are 16 data 
sources for each bus, and 15 data sinks (due to the constant value). Hence, we can 
decode a 16-bit microcode word as four 4-bit address fields. One address specifies 
the upper bus (A bus) source, another specifies the destination. We decode the two 
lower bus (B bus) addresses in the same manner. 
PHI-1 Microcode Word Decode
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|           |           |           |           |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
  A Source    A Dest.     B Source    B Dest.
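Splitting the PHI-1 word into its four address fields is a matter of shifting and masking. The Python sketch below assumes that the leftmost field in the diagram occupies the most-significant bits of the word; the function and key names are hypothetical.

```python
def decode_phi1(word: int) -> dict:
    """Split a 16-bit PHI-1 microcode word into four 4-bit bus addresses."""
    return {
        "a_source": (word >> 12) & 0xF,   # upper bus (A bus) source
        "a_dest":   (word >> 8) & 0xF,    # upper bus destination
        "b_source": (word >> 4) & 0xF,    # lower bus (B bus) source
        "b_dest":   word & 0xF,           # lower bus destination
    }

print(decode_phi1(0x2310))
```

Each field selects one of 16 bus addresses, which matches the 16 data sources available on each bus.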
The left and right data ports are implemented as single-register IO_PORTs. The left 
port's register is assigned bus address 2, while the right port's register is assigned 
bus address 3. The load and drive controls for these ports come directly from a 
microcode field of the PHI-2 microcode word. When the least-significant bit of the 
PORT field is high, the right port will drive from its internal register to the pads. 
The next bit controls when the right port reads from the pads into the internal 
register. The two high-order bits of the PORT field control the left port. 
PHI-2 Microcode Word Decode
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\|   PORT    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                                      |  |  |  |
Left Port Load  ----------------------/  |  |  |
Left Port Drive -------------------------/  |  |
Right Port Load ----------------------------/  |
Right Port Drive ------------------------------/
It is useful to have a source of constant data in the datapath. Besides giving us a 
known value, a constant 'register' does not read data from the bus. Hence, we have 
an unassigned bus destination address. If we do not wish to perform a transfer on 
one of the two data buses, we can 'transfer' into this non-existing register. We 
must choose what our two constant values will be. To aid in the generation of 
masks and shift operations, the upper bus constant will be 0 and the lower bus 
constant will be -1. 
We will use a barrel shifter for the shift element. The MASKED SHIFTER has 
registers for the input most-significant word and least-significant word, along with 
an output register and mask register. With the masked writing capabilities, we can 
do field extractions and field insertions. We have a 4-bit shift constant field in the 
microcode, along with a two-bit field specifying how to load under mask. If the two 
mask bits are low, the shifter does not write into its output register. If both bits are 
high, the shifter directly loads its output register. If the lower bit of the mask op 
field is high, the shifter writes into the output register bits whose corresponding 
mask register bits are low. If the upper bit of the mask op field is high, the shifter 
writes into the output register bits whose corresponding mask register bits are 
high. 
PHI-2 Microcode Word Decode 
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|   SHIFT   |\\\\\\\\\\\|MASK |\\\\\|   PORT    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                         |  |
                         0  0   No Write
                         0  1   Write where mask is low
                         1  0   Write where mask is high
                         1  1   Write every bit
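The four mask operations can be read as a bitwise select between the old output register and the shifted data. The following Python sketch is a behavioral illustration only: the 16-bit width comes from the chip specification, while the function name `masked_write` and its argument names are assumptions made for this example.

```python
WIDTH = 16
ALL_ONES = (1 << WIDTH) - 1


def masked_write(output, shifted, mask, mask_op):
    """Return the new shifter output register after one masked write.

    mask_op is the two-bit MASK field: the lower bit enables writing where
    the mask register bit is low, the upper bit enables writing where the
    mask register bit is high.
    """
    select = 0
    if mask_op & 0b01:
        select |= ~mask & ALL_ONES  # positions whose mask bit is low
    if mask_op & 0b10:
        select |= mask & ALL_ONES   # positions whose mask bit is high
    # Keep the old bits where select is 0, take the shifted bits where it is 1.
    return (output & ~select & ALL_ONES) | (shifted & select)
```

With a mask of 0x00FF, operation 10 inserts the low byte of the shifted data into the output register (a field insertion), while operation 01 replaces everything except that byte.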
For the ALU, we use the Bristle Blocks ALU_WITH_FLAGS element. This element has 
two input registers, either one or two output registers, and a flag register. We will 
use both output registers. The CARRY, MSB, and ZERO flags from the flag register 
will drive pads, so that external circuitry can sense the state of the flags. To allow 
external conditions to modify the ALU operations, we will have a condition input 
which drives the ALU operation decode. The ALU portions of the PHI-2 microcode 
word are listed here. 
PHI-2 Microcode Word Decode 
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|   SHIFT   |    ALU    |MASK |LOAD |   PORT    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
              |  |  |  |        |  |
              |  |  |  |        |  \--- Load ALU output register 2
              |  |  |  |        \------ Load ALU output register 1
              0  0  0  0   Divide Step*
              0  0  0  1   Increment A
              0  0  1  0   Subtract with Borrow*
              0  0  1  1   Subtract
              0  1  0  0   Add with Carry*
              0  1  0  1   Add
              0  1  1  0   Decrement A
              0  1  1  1   Negate A
              1  0  0  0   Multiply Step*
              1  0  0  1   Select A/B*
              1  0  1  0   OR
              1  0  1  1   AND
              1  1  0  0   A
              1  1  0  1   XOR
              1  1  1  0   TEST
              1  1  1  1   Complement A
* indicates that the operation performed is a function of the condition input. 
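The effect of the condition input can be sketched in Python. The table rows below are copied from the DECODE clause of the chip specification in this section; the interpretation of `ALU_OP = ALU & CONDITION` as a 5-bit value with CONDITION as the low-order bit is an assumption, although it agrees with the way codes 0 and 1 split the Divide Step operation. The names `DECODE_ROWS`, `ALU_DECODE`, and `alu_operation` are illustrative.

```python
# (first ALU_OP code, last ALU_OP code, operation), from the DECODE clause.
DECODE_ROWS = [
    (0, 0, "SUBTRACT"), (1, 1, "ADD"), (2, 3, "INCREMENT_A"),
    (4, 4, "SUBTRACT"), (5, 5, "SUB_W_BORROW"), (6, 7, "SUBTRACT"),
    (8, 8, "ADD"), (9, 9, "ADD_W_CARRY"), (10, 11, "ADD"),
    (12, 13, "DECREMENT_A"), (14, 15, "NEGATE_A"),
    (16, 16, "SETA"), (17, 17, "ADD"), (18, 18, "SETA"), (19, 19, "SETB"),
    (20, 21, "OR"), (22, 23, "AND"), (24, 25, "SETA"),
    (26, 27, "XOR"), (28, 29, "TEST"), (30, 31, "SETCA"),
]

# Expand the ranges into one entry per 5-bit ALU_OP code.
ALU_DECODE = {
    code: op
    for first, last, op in DECODE_ROWS
    for code in range(first, last + 1)
}


def alu_operation(alu_field, condition):
    """Map the 4-bit ALU field and the condition input to an ALU operation.

    Assumption: ALU_OP is the ALU field followed by the condition bit.
    """
    alu_op = ((alu_field & 0xF) << 1) | (condition & 1)
    return ALU_DECODE[alu_op]
```

For the starred entries the two condition values select different primitive operations (for example, a Divide Step subtracts or adds), while for the unstarred entries both codes decode to the same operation.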
Finally, we add the general purpose registers. We have four free bus addresses, so 
we will add four registers to the chip. The Bristle Blocks specification for this chip 
is listed here. A chip enable input has been added to the chip specification. When 
chip enable is low, none of the registers' contents will be modified. 
NAME DATAPATH 16; 
FIELD A_SOURCE<1:4>,A_DEST<5:8>,B_SOURCE<9:12>,B_DEST<13:16>, 
      ENABLE<17>,SHIFT_CONST<1:4>,ALU<5:8>,MASK<9:10>,LOAD<11:12>, 
      PORT<13:16>,CONDITION<18>,ALU_OP= ALU & CONDITION; 
MACRO ADDR(ADDR) 
  % [READ_UPPER: A_DEST=?ADDR? AND ENABLE=1, 
     READ_LOWER: B_DEST=?ADDR? AND ENABLE=1, 
     WRITE_UPPER: A_SOURCE=?ADDR?, 
     WRITE_LOWER: B_SOURCE=?ADDR?, 
     REFRESH: ALWAYS ] % 
IO_PORT LEFT_PORT 
  OUTPUT_REGISTER: /*ADDR(0010)*/, 
  LOAD: PORT=1XXX AND ENABLE=1, 
  DRIVE: PORT=X1XX AND ENABLE=1; 
REGISTER R12 OPTIONS:/*ADDR(1100)*/; 
REGISTER R13 OPTIONS:/*ADDR(1101)*/; 
REGISTER R14 OPTIONS:/*ADDR(1110)*/; 
REGISTER R15 OPTIONS:/*ADDR(1111)*/; 
ROM_PAIR C0 
  LEFT_ENABLE: A_SOURCE=0000, LEFT_UPPER:0000000000000000, 
  RIGHT_ENABLE:B_SOURCE=0000, RIGHT_LOWER:1111111111111111; 
PRECHARGE_BOTH PCHG; 
MASKED_SHIFTER SHIFTER 
  MOST_SIGNIFICANT_WORD: /*ADDR(1000)*/, 
  LEAST_SIGNIFICANT_WORD: /*ADDR(1001)*/, 
  OUTPUT_REGISTER: /*ADDR(1010)*/, 
  MASK_REGISTER: /*ADDR(1011)*/, 
  SHIFT_CONSTANT: SHIFT_CONST, 
  LOAD_IF_0: MASK=X1 AND ENABLE=1, 
  LOAD_IF_1: MASK=1X AND ENABLE=1; 
ALU_WITH_FLAGS ALU 
  INPUT_A: /*ADDR(0100)*/, 
  INPUT_B: /*ADDR(0101)*/, 
  OUTPUT_1: /*ADDR(0110)*/, 
  OUTPUT_2: /*ADDR(0111)*/, 
  FLAGS: /*ADDR(0001)*/, 
  LOAD_FLAGS: LOAD=1X AND ENABLE=1, 
  WRITE_OUTPUT_1: LOAD=1X AND ENABLE=1, 
  WRITE_OUTPUT_2: LOAD=X1 AND ENABLE=1, 
  TO_CONTROL: [<1,2,9>=>PAD], 
  DECODE: ALU_OP 
    <0> => SUBTRACT 
    <1> => ADD 
    <2,3> => INCREMENT_A 
    <4> => SUBTRACT 
    <5> => SUB_W_BORROW 
    <6,7> => SUBTRACT 
    <8> => ADD 
    <9> => ADD_W_CARRY 
    <10,11> => ADD 
    <12,13> => DECREMENT_A 
    <14,15> => NEGATE_A 
    <16> => SETA 
    <17> => ADD 
    <18> => SETA 
    <19> => SETB 
    <20,21> => OR 
    <22,23> => AND 
    <24,25> => SETA 
    <26,27> => XOR 
    <28,29> => TEST 
    <30,31> => SETCA; 
IO_PORT RIGHT_PORT 
  OUTPUT_REGISTER: /*ADDR(0011)*/, 
  LOAD: PORT=XX1X AND ENABLE=1, 
  DRIVE: PORT=XXX1 AND ENABLE=1; 
END 
This chip was compiled in 6.2 minutes, resulting in a chip whose area was 203 by 
276 mil, and whose power consumption was 96 ma. 
The microcode writers were unsatisfied with the limited number of general 
purpose registers. There were only four registers in the original chip specification 
that were not used by the data processing elements, although one of the ALU output 
registers could be used if the user never loaded the register from the ALU. The system 
designers, on the other hand, wished to keep the microcode width at 16 bits, which 
presented an addressing problem: how can we address more registers in the 
datapath? Four schemes were pursued which led to an increased register count in 
the data chip. 
The first scheme involved rearranging the PHI-1 microcode word. Instead of 
having 4-bit addressing for both the A and B buses, we tried having 5-bit addresses 
for the A bus and 3-bit addresses for the B bus. We would limit the number of 
registers which could communicate across the lower bus and at the same time 
increase the number of registers which can use the A bus. With this technique, we 
were able to add 16 more registers to the chip. The chip area increased to 229 by 
272 mil, and the power consumption rose to 126 ma. The specification for this new 
chip is listed here. 
NAME DATAPATH2 16; 
FIELD A_SOURCE<1:5>,A_DEST<6:10>,B_SOURCE<11:13>,B_DEST<14:16>, 
      ENABLE<17>,SHIFT_CONST<1:4>,ALU<5:8>,MASKS<9:10>,LOAD<11:12>, 
      PORT<13:16>,CONDITION<18>,ALU_OP= ALU & CONDITION; 
MACRO ADDR_BOTH(ADDR) 
  % [READ_UPPER: A_DEST=00?ADDR? AND ENABLE=1, 
     READ_LOWER: B_DEST=?ADDR? AND ENABLE=1, 
     WRITE_UPPER: A_SOURCE=00?ADDR?, 
     WRITE_LOWER: B_SOURCE=?ADDR?, 
     REFRESH: ALWAYS ] % 
MACRO ADDR_A(ADDR) 
  % [READ_UPPER: A_DEST=?ADDR? AND ENABLE=1, 
     WRITE_UPPER: A_SOURCE=?ADDR?, 
     REFRESH: ALWAYS ] % 
IO_PORT LEFT_PORT 
  OUTPUT_REGISTER: /*ADDR_BOTH(111)*/, 
  LOAD: PORT=1XXX AND ENABLE=1, 
  DRIVE: PORT=X1XX AND ENABLE=1; 
REGISTER R1 OPTIONS:/*ADDR_BOTH(001)*/; 
REGISTER R2 OPTIONS:/*ADDR_BOTH(010)*/; 
REGISTER R3 OPTIONS:/*ADDR_BOTH(011)*/; 
REGISTER R15 OPTIONS:/*ADDR_A(01111)*/; 
REGISTER R16 OPTIONS:/*ADDR_A(10000)*/; 
REGISTER R17 OPTIONS:/*ADDR_A(10001)*/; 
REGISTER R18 OPTIONS:/*ADDR_A(10010)*/; 
REGISTER R19 OPTIONS:/*ADDR_A(10011)*/; 
REGISTER R20 OPTIONS:/*ADDR_A(10100)*/; 
REGISTER R21 OPTIONS:/*ADDR_A(10101)*/; 
REGISTER R22 OPTIONS:/*ADDR_A(10110)*/; 
REGISTER R23 OPTIONS:/*ADDR_A(10111)*/; 
REGISTER R24 OPTIONS:/*ADDR_A(11000)*/; 
REGISTER R25 OPTIONS:/*ADDR_A(11001)*/; 
REGISTER R26 OPTIONS:/*ADDR_A(11010)*/; 
REGISTER R27 OPTIONS:/*ADDR_A(11011)*/; 
REGISTER R28 OPTIONS:/*ADDR_A(11100)*/; 
REGISTER R29 OPTIONS:/*ADDR_A(11101)*/; 
REGISTER R30 OPTIONS:/*ADDR_A(11110)*/; 
REGISTER R31 OPTIONS:/*ADDR_A(11111)*/; 
ROM_PAIR C0 
  LEFT_ENABLE: A_SOURCE=00000, LEFT_UPPER:0000000000000000, 
  RIGHT_ENABLE:B_SOURCE=000, RIGHT_LOWER:1111111111111111; 
PRECHARGE_BOTH PCHG; 
MASKED_SHIFTER SHIFTER 
  MOST_SIGNIFICANT_WORD: /*ADDR_A(01000)*/, 
  LEAST_SIGNIFICANT_WORD: /*ADDR_A(01001)*/, 
  OUTPUT_REGISTER: /*ADDR_A(01010)*/, 
  MASK_REGISTER: /*ADDR_A(01011)*/, 
  SHIFT_CONSTANT: SHIFT_CONST, 
  LOAD_IF_0: MASKS=X1 AND ENABLE=1, 
  LOAD_IF_1: MASKS=1X AND ENABLE=1; 
ALU_WITH_FLAGS ALU 
  INPUT_A: /*ADDR_BOTH(100)*/, 
  INPUT_B: /*ADDR_A(01100)*/, 
  OUTPUT_1: /*ADDR_BOTH(101)*/, 
  OUTPUT_2: /*ADDR_A(01101)*/, 
  FLAGS: /*ADDR_A(01110)*/, 
  LOAD_FLAGS: LOAD=1X AND ENABLE=1, 
  WRITE_OUTPUT_1: LOAD=1X AND ENABLE=1, 
  WRITE_OUTPUT_2: LOAD=X1 AND ENABLE=1, 
  TO_CONTROL: [<1,2,9>=>PAD], 
  DECODE: ALU_OP 
    <0> => SUBTRACT 
    <1> => ADD 
    <2,3> => INCREMENT_A 
    <4> => SUBTRACT 
    <5> => SUB_W_BORROW 
    <6,7> => SUBTRACT 
    <8> => ADD 
    <9> => ADD_W_CARRY 
    <10,11> => ADD 
    <12,13> => DECREMENT_A 
    <14,15> => NEGATE_A 
    <16> => SETA 
    <17> => ADD 
    <18> => SETA 
    <19> => SETB 
    <20,21> => OR 
    <22,23> => AND 
    <24,25> => SETA 
    <26,27> => XOR 
    <28,29> => TEST 
    <30,31> => SETCA; 
IO_PORT RIGHT_PORT 
  OUTPUT_REGISTER: /*ADDR_BOTH(110)*/, 
  LOAD: PORT=XX1X AND ENABLE=1, 
  DRIVE: PORT=XXX1 AND ENABLE=1; 
END 
Another proposed method for increasing the number of datapath registers was to add 
backup registers, similar to the alternate register set in the Zilog Z80 chip. We 
would have backup registers for each of the four general purpose registers, and 
when a swap instruction was executed, the register pairs would swap data values. 
For this method to work, we need a bit to indicate when to swap. We can free up 
one PHI-2 bit if we only have one ALU output register. The load bit for that 
register can then be used as the SWAP bit. The area for this new chip is 220 by 280 
mil, and the power consumption is 114 ma. 
NAME DATAPATH3 16; 
FIELD A_SOURCE<1:4>,A_DEST<5:8>,B_SOURCE<9:12>,B_DEST<13:16>, 
      ENABLE<17>,SHIFT_CONST<1:4>,ALU<5:8>,MASKS<9:10>,LOAD<11>,SWAP<12>, 
      PORT<13:16>,CONDITION<18>,ALU_OP= ALU & CONDITION; 
MACRO ADDR(ADDR) 
  % [READ_UPPER: A_DEST=?ADDR? AND ENABLE=1, 
     READ_LOWER: B_DEST=?ADDR? AND ENABLE=1, 
     WRITE_UPPER: A_SOURCE=?ADDR?, 
     WRITE_LOWER: B_SOURCE=?ADDR?, 
     REFRESH: ALWAYS ] % 
MACRO SWAP(ADDR) 
  % SWAPPING_REGISTERS R?ADDR? 
      LEFT: [REFRESH: LOADS=XXX0, 
             READ_UPPER: A_DEST=?ADDR? AND ENABLE=1, 
             READ_LOWER: B_DEST=?ADDR? AND ENABLE=1, 
             WRITE_UPPER: A_SOURCE=?ADDR?, 
             WRITE_LOWER: B_SOURCE=?ADDR?], 
      RIGHT: [REFRESH: LOADS=XXX0], 
      RIGHT_TO_LEFT: SWAP=1, 
      LEFT_TO_RIGHT: SWAP=1; % 
IO_PORT LEFT_PORT 
  OUTPUT_REGISTER: /*ADDR(0010)*/, 
  LOAD: PORT=1XXX AND ENABLE=1, 
  DRIVE: PORT=X1XX AND ENABLE=1; 
/*SWAP(1100)*/ 
/*SWAP(1101)*/ 
/*SWAP(1110)*/ 
/*SWAP(1111)*/ 
ROM_PAIR C0 
  LEFT_ENABLE: A_SOURCE=0000, LEFT_UPPER:0000000000000000, 
  RIGHT_ENABLE:B_SOURCE=0000, RIGHT_LOWER:1111111111111111; 
PRECHARGE_BOTH PCHG; 
MASKED_SHIFTER SHIFTER 
  MOST_SIGNIFICANT_WORD: /*ADDR(1000)*/, 
  LEAST_SIGNIFICANT_WORD: /*ADDR(1001)*/, 
  OUTPUT_REGISTER: /*ADDR(1010)*/, 
  MASK_REGISTER: /*ADDR(1011)*/, 
  SHIFT_CONSTANT: SHIFT_CONST, 
  LOAD_IF_0: MASKS=X1 AND ENABLE=1, 
  LOAD_IF_1: MASKS=1X AND ENABLE=1; 
ALU_WITH_FLAGS ALU 
  INPUT_A: /*ADDR(0100)*/, 
  INPUT_B: /*ADDR(0101)*/, 
  OUTPUT_1: /*ADDR(0110)*/, 
  FLAGS: /*ADDR(0001)*/, 
  LOAD_FLAGS: LOAD=1 AND ENABLE=1, 
  WRITE_OUTPUT_1: LOAD=1 AND ENABLE=1, 
  TO_CONTROL: [<1,2,9>=>PAD], 
  DECODE: ALU_OP 
    <0> => SUBTRACT 
    <1> => ADD 
    <2,3> => INCREMENT_A 
    <4> => SUBTRACT 
    <5> => SUB_W_BORROW 
    <6,7> => SUBTRACT 
    <8> => ADD 
    <9> => ADD_W_CARRY 
    <10,11> => ADD 
    <12,13> => DECREMENT_A 
    <14,15> => NEGATE_A 
    <16> => SETA 
    <17> => ADD 
    <18> => SETA 
    <19> => SETB 
    <20,21> => OR 













REGISTER R13 OPTIONS:/*ADDR(0111)*/; 
IO_PORT RIGHT_PORT 
  OUTPUT_REGISTER: /*ADDR(0011)*/, 
  LOAD: PORT=XX1X AND ENABLE=1, 
  DRIVE: PORT=XXX1 AND ENABLE=1; 
END 
A simpler proposal was to share the shifter and ALU input registers, thereby freeing 
up two bus addresses. Since it is difficult to physically share the registers, we can 
share the registers in a logical sense: ALU input register A and shifter MSW register 
will have the same bus destination address, but only the ALU register will write the 
bus. Whenever a transfer is made to the ALU input register, the shifter register 
will also load. Whenever a transfer is made from the ALU input register, only the 
ALU register will write the bus. This chip has an area of 209 by 272 mil, and a 
power consumption of 100 ma. 
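The logical sharing described above can be pictured with a small behavioral model in Python. This is only a sketch of the bus protocol as stated in the text: the class and method names are hypothetical, and nothing here reflects how Bristle Blocks itself represents the registers.

```python
class SharedInputRegisters:
    """ALU input A and shifter MSW answering to one bus destination address."""

    def __init__(self):
        self.alu_input_a = 0
        self.shifter_msw = 0

    def bus_write(self, value):
        # A transfer *to* the shared address loads both registers.
        self.alu_input_a = value
        self.shifter_msw = value

    def bus_read(self):
        # A transfer *from* the shared address is driven by the ALU
        # register only; the shifter register never writes the bus.
        return self.alu_input_a


pair = SharedInputRegisters()
pair.bus_write(0x1234)
```

Because both registers always receive the same data, one address serves where two were needed before, at the cost of no longer being able to load the two registers with different values.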
NAME DATAPATH4 16; 
FIELD A_SOURCE<1:4>,A_DEST<5:8>,B_SOURCE<9:12>,B_DEST<13:16>, 
      ENABLE<17>,SHIFT_CONST<1:4>,ALU<5:8>,MASKS<9:10>,LOAD<11:12>, 
      PORT<13:16>,CONDITION<18>,ALU_OP= ALU & CONDITION; 
MACRO ADDR(ADDR) 
  % [READ_UPPER: A_DEST=?ADDR? AND ENABLE=1, 
     READ_LOWER: B_DEST=?ADDR? AND ENABLE=1, 
     WRITE_UPPER: A_SOURCE=?ADDR?, 
     WRITE_LOWER: B_SOURCE=?ADDR?, 
     REFRESH: ALWAYS ] % 
MACRO HALF(ADDR) 
  % [READ_UPPER: A_DEST=?ADDR? AND ENABLE=1, 
     READ_LOWER: B_DEST=?ADDR? AND ENABLE=1, 
     REFRESH: ALWAYS ] % 
IO_PORT LEFT_PORT 
  OUTPUT_REGISTER: /*ADDR(0010)*/, 
  LOAD: PORT=1XXX AND ENABLE=1, 
  DRIVE: PORT=X1XX AND ENABLE=1; 
REGISTER R8 OPTIONS:/*ADDR(1000)*/; 
REGISTER R9 OPTIONS:/*ADDR(1001)*/; 
REGISTER R12 OPTIONS:/*ADDR(1100)*/; 
REGISTER R13 OPTIONS:/*ADDR(1101)*/; 
REGISTER R14 OPTIONS:/*ADDR(1110)*/; 
REGISTER R15 OPTIONS:/*ADDR(1111)*/; 
ROM_PAIR C0 
  LEFT_ENABLE: A_SOURCE=0000, LEFT_UPPER:0000000000000000, 
  RIGHT_ENABLE:B_SOURCE=0000, RIGHT_LOWER:1111111111111111; 
PRECHARGE_BOTH PCHG; 
MASKED_SHIFTER SHIFTER 
  MOST_SIGNIFICANT_WORD: /*HALF(0100)*/, 
  LEAST_SIGNIFICANT_WORD: /*HALF(0101)*/, 
  OUTPUT_REGISTER: /*ADDR(1010)*/, 
  MASK_REGISTER: /*ADDR(1011)*/, 
  SHIFT_CONSTANT: SHIFT_CONST, 
  LOAD_IF_0: MASKS=X1 AND ENABLE=1, 
  LOAD_IF_1: MASKS=1X AND ENABLE=1; 
ALU_WITH_FLAGS ALU 
  INPUT_A: /*ADDR(0100)*/, 
  INPUT_B: /*ADDR(0101)*/, 
  OUTPUT_1: /*ADDR(0110)*/, 
  OUTPUT_2: /*ADDR(0111)*/, 
  FLAGS: /*ADDR(0001)*/, 
  LOAD_FLAGS: LOAD=1X AND ENABLE=1, 
  WRITE_OUTPUT_1: LOAD=1X AND ENABLE=1, 
  WRITE_OUTPUT_2: LOAD=X1 AND ENABLE=1, 
  TO_CONTROL: [<1,2,9>=>PAD], 
  DECODE: ALU_OP 
    <0> => SUBTRACT 
    <1> => ADD 
    <2,3> => INCREMENT_A 
    <4> => SUBTRACT 
    <5> => SUB_W_BORROW 
    <6,7> => SUBTRACT 
    <8> => ADD 
    <9> => ADD_W_CARRY 
    <10,11> => ADD 
    <12,13> => DECREMENT_A 
    <14,15> => NEGATE_A 
    <16> => SETA 
    <17> => ADD 
    <18> => SETA 
    <19> => SETB 
    <20,21> => OR 
    <22,23> => AND 
    <24,25> => SETA 
    <26,27> => XOR 
    <28,29> => TEST 
    <30,31> => SETCA; 
IO_PORT RIGHT_PORT 
  OUTPUT_REGISTER: /*ADDR(0011)*/, 
  LOAD: PORT=XX1X AND ENABLE=1, 
  DRIVE: PORT=XXX1 AND ENABLE=1; 
END 
The final proposal was to add a stack to the chip. We would again have to remove 
one of the ALU output registers to free up a control bit for the POP line. This stack 
pushes data whenever the top register is written to, and pops data whenever the 
POP signal is high. The top of stack register can be read independent of whether the 
stack POPs or not. For an 8-deep stack, the chip area is 231 by 279 mil and the 
power consumption is 129 ma. 
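The push and pop rules can be made concrete with a small behavioral model in Python. The text specifies only when the stack pushes and pops; what happens on overflow and what value enters at the bottom on a pop are not stated, so the choices below (the deepest entry is lost, a zero is shifted in) are assumptions, as are the class and method names.

```python
class RegisterStack:
    """Behavioral model of a fixed-depth stack addressed through its top register."""

    def __init__(self, depth=8):
        self.cells = [0] * depth  # cells[0] is the top-of-stack register

    def write_top(self, value):
        # Writing the top register pushes: older entries move down one
        # place and the deepest entry is lost (assumed overflow behavior).
        self.cells = [value] + self.cells[:-1]

    def read_top(self):
        # Reading the top register does not change the stack by itself.
        return self.cells[0]

    def pop(self):
        # POP high: entries move up one place and a zero enters at the
        # bottom (assumed fill value).
        self.cells = self.cells[1:] + [0]


stack = RegisterStack()
for value in (1, 2, 3):
    stack.write_top(value)
```

Since reading is separate from popping, a microcode word can read the top of stack on a bus and decide independently, through the POP bit, whether that value is discarded.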
NAME DATAPATH5 16; 
FIELD A_SOURCE<1:4>,A_DEST<5:8>,B_SOURCE<9:12>,B_DEST<13:16>, 
      ENABLE<17>,SHIFT_CONST<1:4>,ALU<5:8>,MASKS<9:10>,LOAD<11>,POP<12>, 
      PORT<13:16>,CONDITION<18>,ALU_OP= ALU & CONDITION; 
MACRO ADDR(ADDR) 
  % [READ_UPPER: A_DEST=?ADDR? AND ENABLE=1, 
     READ_LOWER: B_DEST=?ADDR? AND ENABLE=1, 
     WRITE_UPPER: A_SOURCE=?ADDR?, 
     WRITE_LOWER: B_SOURCE=?ADDR?, 
     REFRESH: ALWAYS ] % 
IO_PORT LEFT_PORT 
  OUTPUT_REGISTER: /*ADDR(0010)*/, 
  LOAD: PORT=1XXX AND ENABLE=1, 
  DRIVE: PORT=X1XX AND ENABLE=1; 
REGISTER R12 OPTIONS: /*ADDR(1100)*/; 
REGISTER R13 OPTIONS: /*ADDR(1101)*/; 
REGISTER R14 OPTIONS: /*ADDR(1110)*/; 
REGISTER R15 OPTIONS: /*ADDR(1111)*/; 
ROM_PAIR C0 
  LEFT_ENABLE: A_SOURCE=0000, LEFT_UPPER:0000000000000000, 
  RIGHT_ENABLE:B_SOURCE=0000, RIGHT_LOWER:1111111111111111; 
PRECHARGE_BOTH PCHG; 
MASKED_SHIFTER SHIFTER 
  MOST_SIGNIFICANT_WORD: /*ADDR(1000)*/, 
  LEAST_SIGNIFICANT_WORD: /*ADDR(1001)*/, 
  OUTPUT_REGISTER: /*ADDR(1010)*/, 
  MASK_REGISTER: /*ADDR(1011)*/, 
  SHIFT_CONSTANT: SHIFT_CONST, 
  LOAD_IF_0: MASKS=X1 AND ENABLE=1, 
  LOAD_IF_1: MASKS=1X AND ENABLE=1; 
ALU_WITH_FLAGS ALU 
  INPUT_A: /*ADDR(0100)*/, 
  INPUT_B: /*ADDR(0101)*/, 
  OUTPUT_1: /*ADDR(0110)*/, 
  FLAGS: /*ADDR(0001)*/, 
  LOAD_FLAGS: LOAD=1 AND ENABLE=1, 
  WRITE_OUTPUT_1: LOAD=1 AND ENABLE=1, 
  TO_CONTROL: [<1,2,9>=>PAD], 
  DECODE: ALU_OP 
    <0> => SUBTRACT 
    <1> => ADD 
    <2,3> => INCREMENT_A 
    <4> => SUBTRACT 
    <5> => SUB_W_BORROW 
    <6,7> => SUBTRACT 







































IO_PORT RIGHT_PORT 
  OUTPUT_REGISTER: /*ADDR(0011)*/, 
  LOAD: PORT=XX1X AND ENABLE=1, 
  DRIVE: PORT=XXX1 AND ENABLE=1; 
END 
The following table summarizes the results of the datachip modification 
experiments. 
             Number of                 Size (mil)    Power 
Name         Free Registers            x      y      (ma) 
---------    ----------------------    ---    ---    ----- 
DATAPATH     4                         203    276     96 
DATAPATH2    20                        229    272    126 
DATAPATH3    5 with 4 backups          220    280    114 
DATAPATH4    6                         209    272    100 
DATAPATH5    4 with 8-deep stack       231    279    129 
Figure 9-10 shows the bounding boxes for each of these chips. Given this 
comparison data, the microcode designers and the fabrication engineers can haggle 
over the design specs. 
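One way to weigh these alternatives is to compare bounding-box areas rather than the two dimensions separately. The Python sketch below uses the dimensions and power figures quoted in the prose of this section; the names `CHIPS`, `area_sq_mil`, and `growth_percent` are illustrative, and treating bounding-box area as the cost measure is an assumption of this example rather than a claim made in the text.

```python
# name: (x in mil, y in mil, power in ma), as quoted in this section.
CHIPS = {
    "DATAPATH": (203, 276, 96),
    "DATAPATH2": (229, 272, 126),
    "DATAPATH3": (220, 280, 114),
    "DATAPATH4": (209, 272, 100),
    "DATAPATH5": (231, 279, 129),
}


def area_sq_mil(name):
    """Bounding-box area of a chip in square mils."""
    x, y, _power = CHIPS[name]
    return x * y


def growth_percent(name, baseline="DATAPATH"):
    """Area increase of a chip relative to the baseline chip, in percent."""
    return 100.0 * (area_sq_mil(name) / area_sq_mil(baseline) - 1.0)


for chip in CHIPS:
    print(f"{chip:10s} {area_sq_mil(chip):6d} sq mil  {growth_percent(chip):5.1f}%")
```

On these figures the logically shared registers of DATAPATH4 cost the least area per register gained, while the stack of DATAPATH5 is the most expensive option in both area and power.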
Fig. 9-10: Bounding Box Comparisons (DATAPATH, DATAPATH2, DATAPATH3, DATAPATH4, DATAPATH5) 
Later that afternoon, the members of the market staff came by, expressing a desire 
for combining the controller and datachip onto a single chip. Additionally, the 
width of the microcode was to be narrowed from 24 bits to 16 bits. One of the two 
bi-directional buses could also be eliminated. Using a handy text editor, the 
controller specification was merged with one of the datapath specifications. Bristle 
Blocks compiled the new chip in 7 minutes. The chip size was 244 by 246 mil, and 
the power consumption was 128 ma. The source code for the combined chip is 
listed here. 
NAME COMBINED 16; 
FIELD ADDRESS<1:3>,A_SOURCE<4:7>,A_DEST<8:11>,B_SOURCE<12:14>, 
      B_DEST<15:16>,SHIFT_CONST<1:4>,SHIFT_LD<5:6>,ALU<7:10>, 
      PORT<13:16>,CONDITION<17>,ALU_OP=ALU & CONDITION; 
MACRO NOP()     % ADDRESS=000 OR CONDITION=0 % 
MACRO JUMP()    % ADDRESS=001 AND CONDITION=1 % 
MACRO CALL()    % ADDRESS=010 AND CONDITION=1 % 
MACRO RETURN()  % ADDRESS=011 AND CONDITION=1 % 
MACRO BRANCH()  % ADDRESS=100 AND CONDITION=1 % 
MACRO SAVE()    % ADDRESS=101 AND CONDITION=1 % 
MACRO DO()      % ADDRESS=110 AND CONDITION=1 % 
MACRO ENDDO()   % ADDRESS=111 AND CONDITION=1 % 
MACRO UNDO()    % ADDRESS=111 AND CONDITION=0 % 
MACRO REG_A(ADR) 
  % READ_UPPER: A_DEST=?ADR?, 
    WRITE_UPPER: A_SOURCE=?ADR?, 
    REFRESH:ALWAYS % 
MACRO REG_B_OUT(AADR,BADR) 
  % [/*REG_A(?AADR?)*/, WRITE_LOWER:B_SOURCE=?BADR?] % 
MACRO REG_B_IN(AADR,BADR) 
  % [/*REG_A(?AADR?)*/, READ_LOWER:B_DEST=?BADR?] % 
MACRO REG_A_ONLY(ADR) 
  % [/*REG_A(?ADR?)*/] % 
MACRO PORT_OUT() % PORT=000X % 
MACRO PORT_IN() % NOT(/*PORT_OUT*/) % 
OUTPUT_PORT ADDRESS 
  REGISTER: [READ_UPPER: /*NOP*/ OR /*DO*/ OR /*BRANCH*/ OR /*SAVE*/, 
             READ_LOWER: /*JUMP*/ OR /*CALL*/ OR /*RETURN*/ OR /*ENDDO*/]; 
ADDER NEW_PC 
  INPUT_A: [READ_UPPER: /*NOP*/ OR /*DO*/ OR /*BRANCH*/ OR /*SAVE*/, 
            READ_LOWER: /*JUMP*/ OR /*CALL*/ OR /*RETURN*/ OR /*ENDDO*/], 




  OUTPUT_REGISTER: [WRITE_UPPER: ALWAYS]; 
STACK PC_STACK 
  DEPTH: 8, 
  TOP: [READ_UPPER: /*CALL*/ OR /*DO*/, 
        READ_LOWER: /*SAVE*/, 
        WRITE_LOWER: /*RETURN*/ OR /*ENDDO*/], 
  PUSH: /*CALL*/ OR /*DO*/ OR /*SAVE*/, 
  POP: /*RETURN*/ OR /*UNDO*/; 
PRECHARGE_AND_BREAK_UPPER PCHG1; 
REGISTER LINK OPTIONS: 
  [WRITE_LOWER: /*JUMP*/ OR /*CALL*/ OR /*BRANCH*/ OR /*SAVE*/, 
   WRITE_UPPER: A_SOURCE=0001, READ_UPPER: A_DEST=0001]; 
PRECHARGE_AND_BREAK_LOWER PCHG2; 
PRECHARGE_BOTH PCHG3; 
ALU_WITH_FLAGS ALU 
  INPUT_A: /*REG_B_IN(1000,01)*/, 
  INPUT_B: /*REG_A_ONLY(1001)*/, 
  OUTPUT_1: /*REG_B_OUT(1010,100)*/, 
  FLAGS: /*REG_A_ONLY(1011)*/, 
  LOAD_FLAGS: NOT(ALU=1111), 
  WRITE_OUTPUT_1: NOT(ALU=1111), 
  TO_CONTROL: [<1,2,9>=>PAD], 
  DECODE: ALU_OP 
    <0> => SUBTRACT 
    <1> => ADD 
    <2,3> => INCREMENT_A 
    <4> => SUBTRACT 
    <5> => SUB_W_BORROW 
    <6,7> => SUBTRACT 
    <8> => ADD 
    <9> => ADD_W_CARRY 
    <10,11> => ADD 
    <12,13> => DECREMENT_A 
    <14,15> => NEGATE_A 
    <16> => SETA 
    <17> => ADD 
    <18> => SETA 
    <19> => SETB 
    <20,21> => OR 
    <22,23> => AND 
    <24,25> => SETA 
    <26,27> => XOR 
    <28,29> => TEST 
    <30,31> => DONT_CARE; 
MASKED_SHIFTER SHIFTER 
  MOST_SIGNIFICANT_WORD: /*REG_A_ONLY(0100)*/, 







  /*REG_B_OUT(0110,101)*/, 
  /*REG_A_ONLY(0111)*/, 




  LEFT_ENABLE: A_SOURCE=0000, 
  RIGHT_ENABLE:B_SOURCE=110, 
  LEFT_UPPER: 0000000000000000, 
  RIGHT_LOWER: 1111111111111111; 
REGISTER R2_REG 
  OPTIONS: [WRITE_UPPER: A_SOURCE=0010, 
            READ_UPPER: A_DEST=0010, 
            REFRESH: ALWAYS, 
            READ_LOWER: B_DEST=00 AND NOT(B_SOURCE=000), 
            WRITE_LOWER: B_SOURCE=000]; 
REGISTER R13 OPTIONS: /*REG_B_OUT(1101,001)*/; 
REGISTER R14 OPTIONS: /*REG_B_OUT(1110,010)*/; 
REGISTER R15 OPTIONS: /*REG_B_OUT(1111,011)*/; 
IO_PORT RIGHT_PORT 
  OUTPUT_REGISTER: [WRITE_UPPER: A_SOURCE=0011, 
                    READ_UPPER: A_DEST=0011, 







  WRITE_LOWER: 
  /*PORT_IN*/, 
  /*PORT_OUT*/; 
The new system still requires external logic associated with the microcode and 
external circuitry for the condition select operations. This external circuitry does 
provide system flexibility, but it also adds to system complexity. A final proposed 
system includes on-chip circuitry for providing strobe signals and condition select 
operations. 
NAME COMPLETE 16; 
FIELD ADDRESS<1:3>,A_SOURCE<4:7>,A_DEST<8:11>,B_SOURCE<12:14>, 
      B_DEST<15:16>,SHIFT_CONST<1:4>,SHIFT_LD<5:6>,ALU<7:10>, 
      PORT<13:16>,CONDITION<17>,ALU_OP=ALU & CONDITION,STROBES<18:24>, 
      RESET<25>,FLAGS<26:28>,EXTERNAL<29:31>; 
MACRO NOP()     % ADDRESS=000 OR CONDITION=0 % 
MACRO JUMP()    % ADDRESS=001 AND CONDITION=1 % 
MACRO CALL()    % ADDRESS=010 AND CONDITION=1 % 
MACRO RETURN()  % ADDRESS=011 AND CONDITION=1 % 
MACRO BRANCH()  % ADDRESS=100 AND CONDITION=1 % 
MACRO SAVE()    % ADDRESS=101 AND CONDITION=1 % 
MACRO DO()      % ADDRESS=110 AND CONDITION=1 % 
MACRO ENDDO()   % ADDRESS=111 AND CONDITION=1 % 
MACRO UNDO()    % ADDRESS=111 AND CONDITION=0 % 
MACRO REG_A(ADR) 
  % READ_UPPER: A_DEST=?ADR?, 
    WRITE_UPPER: A_SOURCE=?ADR?, 
    REFRESH: ALWAYS % 
MACRO REG_B_OUT(AADR,BADR) 
  % [/*REG_A(?AADR?)*/, WRITE_LOWER:B_SOURCE=?BADR?] % 
MACRO REG_B_IN(AADR,BADR) 
  % [/*REG_A(?AADR?)*/, READ_LOWER: B_DEST=?BADR?] % 
MACRO REG_A_ONLY(ADR) 
  % [/*REG_A(?ADR?)*/] % 
OUTPUT_PORT ADDRESS 
  REGISTER: [READ_UPPER: /*NOP*/ OR /*DO*/ OR /*BRANCH*/ OR /*SAVE*/, 
             READ_LOWER: /*JUMP*/ OR /*CALL*/ OR /*RETURN*/ OR /*ENDDO*/]; 
ADDER NEW_PC 
  INPUT_A: [READ_UPPER: /*NOP*/ OR /*DO*/ OR /*BRANCH*/ OR /*SAVE*/, 
            READ_LOWER: /*JUMP*/ OR /*CALL*/ OR /*RETURN*/ OR /*ENDDO*/, 
            SUGGEST: RESET=1, 
            VALUE:0000000000000000], 




  OUTPUT_REGISTER: [WRITE_UPPER: ALWAYS]; 
STACK PC_STACK 
  DEPTH: 8, 
  TOP: [READ_UPPER: /*CALL*/ OR /*DO*/, 
        READ_LOWER: /*SAVE*/, 
        WRITE_LOWER: /*RETURN*/ OR /*ENDDO*/], 
  PUSH: /*CALL*/ OR /*DO*/ OR /*SAVE*/, 
  POP: /*RETURN*/ OR /*UNDO*/; 
PRECHARGE_AND_BREAK_UPPER PCHG1; 
REGISTER LINK OPTIONS: 
  [WRITE_LOWER: /*JUMP*/ OR /*CALL*/ OR /*BRANCH*/ OR /*SAVE*/, 
   WRITE_UPPER: A_SOURCE=0001, READ_UPPER: A_DEST=0001]; 
PRECHARGE_AND_BREAK_LOWER PCHG2; 
ALU_WITH_FLAGS ALU 
  INPUT_A: /*REG_B_IN(1000,01)*/, 
  INPUT_B: /*REG_A_ONLY(1001)*/, 
  OUTPUT_1: /*REG_B_OUT(1010,100)*/, 
  FLAGS: /*REG_A_ONLY(1011)*/, 
  LOAD_FLAGS: NOT(ALU=1111), 
  WRITE_OUTPUT_1: NOT(ALU=1111), 
  TO_CONTROL: [<1,2,9>=>FLAGS], 
  DECODE: ALU_OP 
    <0> => SUBTRACT 
    <1> => ADD 
    <2,3> => INCREMENT_A 
    <4> => SUBTRACT 
    <5> => SUB_W_BORROW 
    <6,7> => SUBTRACT 
    <8> => ADD 
    <9> => ADD_W_CARRY 
    <10,11> => ADD 
    <12,13> => DECREMENT_A 
    <14,15> => NEGATE_A 
    <16> => SETA 
    <17> => ADD 
    <18> => SETA 
    <19> => SETB 
    <20,21> => OR 
    <22,23> => AND 
    <24,25> => SETA 
    <26,27> => XOR 
    <28,29> => TEST 
    <30,31> => DONT_CARE; 
MASKED_SHIFTER SHIFTER 
  MOST_SIGNIFICANT_WORD: /*REG_A_ONLY(0100)*/, 
  LEAST_SIGNIFICANT_WORD: /*REG_B_IN(0101,10)*/, 
  OUTPUT_REGISTER: /*REG_B_OUT(0110,101)*/, 
  MASK_REGISTER: /*REG_A_ONLY(0111)*/, 
  SHIFT_CONSTANT: SHIFT_CONST, 
  LOAD_IF_0: SHIFT_LD=X1, 





  RIGHT_LOWER: 1111111111111111; 
REGISTER R0_REG 




            READ_LOWER: B_DEST=00 AND NOT(B_SOURCE=000), 
            WRITE_LOWER: B_SOURCE=000]; 
  OPTIONS: [WRITE_LOWER: B_SOURCE=111, 
            READ_LOWER: B_DEST=11, 
            REFRESH: ALWAYS]; 
REGISTER R13 OPTIONS: /*REG_B_OUT(1101,001)*/; 
REGISTER R14 OPTIONS: /*REG_B_OUT(1110,010)*/; 
REGISTER R15 OPTIONS: /*REG_B_OUT(1111,011)*/; 
PRECHARGE_AND_BREAK_LOWER PCHG3; 
REGISTER LINK2 
  OPTIONS: [READ_UPPER: A_DEST=0011, 
            WRITE_UPPER: A_SOURCE=0011, 
            REFRESH: ALWAYS, 
            READ_LOWER: PORT=01XX OR PORT=0X1X OR PORT=0XX1, 
            WRITE_LOWER: PORT=11XX]; 
PRECHARGE_BOTH PCHG4; 
CONTROL_TO_DATA_AND_BACK STROBES 
  REGISTER: [SUGGEST: RESET=1, VALUE:0000000000000000], 
  LATCH: ALWAYS, 
  TO_CONTROL: [<1:7>=> STROBES; <8>=> CONDITION; <9:14>=> PAD], 
  TO_DATA: [STROBES=1XXXXXX AND NOT(PORT=0111) OR PORT=1111=>1; 
LOWER_ROM C1 





IO_PORT PORT 
    STROBES=1XXXXXX AND NOT(PORT=0111) OR PORT=1111=>9; 
    STROBES=X1XXXXX AND NOT(PORT=0111) OR PORT=1110=>2; 
    STROBES=X1XXXXX AND NOT(PORT=0111) OR PORT=1110=>10; 
    STROBES=XX1XXXX AND NOT(PORT=0111) OR PORT=1101=>3; 
    STROBES=XX1XXXX AND NOT(PORT=0111) OR PORT=1101=>11; 
    STROBES=XXX1XXX AND NOT(PORT=1000) OR PORT=1100=>4; 
    STROBES=XXX1XXX AND NOT(PORT=1000) OR PORT=1100=>12; 
    STROBES=XXXX1XX AND NOT(PORT=1000) OR PORT=1011=>5; 
    STROBES=XXXX1XX AND NOT(PORT=1000) OR PORT=1011=>13; 
    STROBES=XXXXX1X AND NOT(PORT=1000) OR PORT=1010=>6; 
    STROBES=XXXXX1X AND NOT(PORT=1000) OR PORT=1010=>14; 
    STROBES=XXXXXX1 AND NOT(PORT=1000) OR PORT=1001=>7; 
    STROBES=XXXXXX1 AND NOT(PORT=1000) OR PORT=1001=>15; 
    CONDITION=00 OR 
    CONDITION=01 AND FLAGS=1XX OR 
    CONDITION=10 AND EXTERNAL=1XX OR 
    CONDITION=11 AND 
    IF SHIFT_CONST=1XXX THEN 
      SHIFT_CONST=1001 AND FLAGS=0XX OR 
      SHIFT_CONST=1010 AND FLAGS=XX0 OR 
      SHIFT_CONST=1011 AND FLAGS=X0X OR 
      SHIFT_CONST=1100 AND EXTERNAL=0XX OR 
      SHIFT_CONST=1101 AND EXTERNAL=X0X OR 
      SHIFT_CONST=1110 AND EXTERNAL=XX0 OR 









      SHIFT_CONST=0001 AND FLAGS=1XX OR 
      SHIFT_CONST=0010 AND FLAGS=XX1 OR 
      SHIFT_CONST=0011 AND FLAGS=X1X OR 
      SHIFT_CONST=0100 AND EXTERNAL=1XX OR 
      SHIFT_CONST=0101 AND EXTERNAL=X1X OR 
      SHIFT_CONST=0110 AND EXTERNAL=XX1 OR 
      SHIFT_CONST=0111 AND NOT(FLAGS=0X0) FI => 8]; 
  PORT=0001, VALUE:1111111111110000; 
  PORT=0010, VALUE:1111000011110000; 
  PORT=0011, VALUE:1111111100000000; 
  PORT=0100, VALUE:1110001110000000; 
  PORT=0101, VALUE:1111110000000000; 
  PORT=0110, VALUE:1111111111010000; 
  OUTPUT_REGISTER: [WRITE_LOWER: PORT=0111, 
                    READ_LOWER: PORT=11XX, 
                    REFRESH: STROBES=000XXXX], 
  LOAD: NOT(STROBES=000XXXX), 
  DRIVE: NOT(STROBES=XXX0000); 
END 
Chapter 10: A History of Bristle Blocks 
This chapter provides a brief overview of the Bristle Blocks project. The major 
results of a number of experiments are stated, and the motivation behind various 
design decisions is given. Finally, a description is given of what the next version 
of Bristle Blocks may be like. 
10.1: The Past 
Bristle Blocks was born out of the OM project [15][16][17]. The OM2 datapath chip 
was designed in nine months, three of which were spent designing the low level 
cells, and the remaining six of which were spent interconnecting all of the pieces. 
The chip was designed using a special purpose programming language, PAL [2]. A 
picture of the finished mask set is shown in chapter 2, figure 2-10. 
There were many lessons learned from the OM project. The more dramatic (and 
painful) lessons dealt with the limited expressibility of the language, the 
complexity of the global interconnect versus the simplicity of leaf cell design, and 
the limited expressibility of a purely graphical design system. 
The PAL artwork language is a special purpose drafting language. The purpose of 
the language is to describe simple line drawings or printed circuit board layouts. 
There are relatively few standard programming language constructs. It is virtually 
impossible to design a parametrized cell in such a language, and there is little hope 
for designing automatic routing programs with such a system. Because PAL had such 
limited power, while textual cell descriptions proved powerful, imbedded languages were 
developed. The first imbedded language developed at Caltech was ICLIC, written by 
Ron Ayres and Maureen Stone in the ICL language [4]. Soon thereafter, Bart 
Locanthi programmed LAP in Simula [19]. 
The complexity issue of global interconnect had two manifestations in the OM 
project. The first was that the layout of the final portion of the chip took much 
longer than the design of the majority of the chip area, even though much time was 
spent planning the global structure of the chip. The leaf cells were small layouts, 
which could easily be plotted on small sheets of paper. The entire function of each 
cell could be grasped as the cell was being designed. The control structure, on the 
other hand, was a very large cell, so that it was difficult to make detailed plots of 
the entire cell. The cell was hard to design because of the many timing and logical 
function details which had to be included in the cell. The second manifestation of 
the global interconnect complexity appeared when the chip was tested. It was in 
the global interconnections that all of the design errors were encountered. There 
were two timing errors, one logical error, and one design rule error in the 
interconnections. The first timing error set the chip speed at 2.5 MHz, one quarter 
of the intended operating speed. The second error caused the flag circuitry to 
become inoperative. The logical error was not fatal: the polarity of one of the 
control input pins was negated. The design rule error was the major design error. 
Six of the highest level wires ever so slightly missed their proper connection 
positions on the instruction decoder. They were off by less than 0.2% of their total 
length. For 5000 micron long wires, however, this small error, which is invisible 
on all cell plots, caused six of the control input bits to be shorted to ground. None of 
these errors arose because the global interconnection task for any 
particular signal was difficult; they arose because there were so many signals to be 
interconnected that the specific details were forgotten. 
The third lesson learned from OM was that cells are more than just layout. There is 
documentation information about the cells that is just as important as the layout 
information. The design system which was used to create OM only allowed for the 
specification of geometric information, although I was able to add a block diagram 
description of the OM2 datapath chip to the system. As a designer, it was very 
frustrating not being able to add a little more information to the cells' descriptions. 
Even if additional information could be added to the cell, there was no way to access 
that information later. With the new design tools that have been developed, there 
has been a gradual increase in the flexibility of the cell data representation, so that 
additional designer intent can be encapsulated with the design. 
When the OM2 datapath chip design errors were found, there was a strong 
motivation to develop better design tools: to cast away nine months of effort 
because of a few tiny implementation details is not an easy thing to do. The process 
was begun of designing programs to aid in the design of integrated circuits. 
The first routine implemented was a simple, monochromatic river router. There 
were several places in the datapath chip where a river router could be used to 
interconnect cells. Although there were no design errors in the datapath's 
hand-designed river routes, the generation of the 500 interconnection wires between two 
of the cells was not a pleasant task. 
The second routine to be implemented was an instruction decode generator. In the 
datapath chip, the instruction decoder was implemented as a collection of 42 
incredibly tiny cells. These cells measured 7 lambdas by 14 lambdas, and were used 
to tile large portions of the chip. The instruction decoders required close to 20,000 
function calls, each of which required an absolute chip position parameter. This 
tedious and error prone task was accomplished without design error. However, the 
design was fixed for one particular chip instance, and if there were any change in
the chip specification, this entire decoder would have to be re-implemented. An 
instruction decoder generator was written to automatically produce calls to cells 
very similar to the cells used in the datapath chip. Data structures were defined in 
ICL to describe the instruction decoder operations, which became the input
parameters to the generator. When this programming task was completed, a chip 
designer could rapidly generate an instruction decoder from a functional 
description of the decoder operation, plus positional information for the outputs of 
the decoder. 
The next step in automating the design of chips was to add the timing information 
to the decoder routing, so that the buffers and decoder could automatically be added 
to the datapath. It was at this same time that Ron Ayres presented some fascinating 
news of his Programmed Logic Array (PLA) compiler, RELAY [5]. He pointed out 
some very obvious ideas which helped crystallize the Bristle Blocks framework. A 
short description of RELAY will be presented here. 
Ron Ayres is a software computer scientist. He had a mathematical description of a
chip he wanted implemented, yet he did not know how to design integrated
circuits. He built a programming system that let him describe his formal,
mathematical, chip descriptions. The system accepted a hierarchy of synchronous 
logic equations, and would allow the designer to alter the hierarchy of the logic 
while preserving the function of the description. The designer could simulate the 
operation of the chip at any time to verify the correctness of the specification. Ron 
then met with a student in the LSI design course, and they composed a simple model 
of a PLA and of an interconnect algorithm. Ron added these models to his system,
which allowed him to quickly see what a set of logic equations would look like 
when implemented in PLAs. He could observe the physical impact of editing the
logic hierarchy. Finally, Ron borrowed a PLA generator and wrote an actual 
interconnect procedure. With these two routines, Ron was able to generate 
complete chip layouts from logic equation specifications. 
To illustrate the form of RELAY input, the following cell examples will be given. 
These examples are not meant to teach the reader how to design chips with RELAY, 









Fig. 10-1: General Purpose Register Block Diagram 
The first cell is a General Purpose Register (GPR). A block diagram of this register is 
shown in figure 10-1. The register will load data from the IN pin when LOAD and 
ENABLE are TRUE. When ENABLE is TRUE, EOUT will be set to the value contained 
within the register, and when ENABLE is FALSE, EOUT will be set to the value of 
EIN. The RELAY specification for the GPR register is listed here.
VAR GPR=LL;
BEGIN VAR DATA,IN,LOAD,ENABLE,EIN,EOUT=BIT;
   DATA:=NEW_BIT;
   IN:=NEW_BIT;
   LOAD:=NEW_BIT;
   ENABLE:=NEW_BIT;
   EIN:=NEW_BIT;
   EOUT:=NEW_BIT;
   GPR:=
     [EXTERNALS: [IN_PINS: {EIN\NAMED 'EIN';
                            ENABLE\NAMED 'ENABLE';
                            LOAD\NAMED 'LOAD';
                            IN\NAMED 'IN'}
                  OUT_PINS: {EOUT\NAMED 'EOUT'}]
      RELATIONS: {EOUT\EQU [IF:ENABLE THEN:DATA ELSE:EIN];
                  DATA\NEXT [IF:LOAD\AND ENABLE THEN:IN ELSE:DATA]}];
END
We have declared GPR to be of type LL, which stands for Logic Level, the RELAY 
cell. Internal to a GPR, we have the following signals: DATA, IN, LOAD, ENABLE,
EIN, and EOUT. We have declared the port characteristics of the GPR cell, and given 
the logic equations relating the signals within the GPR. 
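The two GPR relations can be restated as a small behavioral model. The sketch below is in Python rather than RELAY, and the class and method names are hypothetical; it also assumes that NEXT denotes the value a signal takes after the next clock, which the listing does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class GPR:
    """One-bit register following the two GPR relations (illustrative only)."""
    data: bool = False  # the stored bit, DATA in the listing

    def eout(self, enable: bool, ein: bool) -> bool:
        # EOUT \EQU [IF:ENABLE THEN:DATA ELSE:EIN]
        return self.data if enable else ein

    def step(self, in_bit: bool, load: bool, enable: bool) -> None:
        # DATA \NEXT [IF:LOAD \AND ENABLE THEN:IN ELSE:DATA]
        # Assumption: NEXT means "value after the next clock".
        if load and enable:
            self.data = in_bit


reg = GPR()
reg.step(in_bit=True, load=True, enable=False)  # LOAD without ENABLE: value is held
held = reg.data
reg.step(in_bit=True, load=True, enable=True)   # LOAD and ENABLE: IN is stored
loaded = reg.data
```

With ENABLE FALSE the register simply passes EIN through to EOUT, which is what lets several such cells be chained as in the GPR_PAIR cell.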
We can now define a cell which uses two of these GPR cells. This GPR_PAIR cell has
a SELECT input which is used to select which GPR cell is being addressed. 
VAR GPR_PAIR=LL;
BEGIN VAR L,R=NAMED_LOGIC_LEVEL; IN,LOAD,SELECT,ENABLE=BIT;
   L:=GPR\NEW;
   R:=GPR\NEW;
   IN:=NEW_BIT;
   LOAD:=NEW_BIT;
   SELECT:=NEW_BIT;
   ENABLE:=NEW_BIT;
   GPR_PAIR:=

                  OUT_PINS: {L\S 'EOUT'\NAMED 'EOUT'}]
                 {L\S 'EIN'\EQU R\S 'EOUT';
                  L\S 'IN'\EQU IN;
                  R\S 'IN'\EQU IN;
                  L\S 'LOAD'\EQU LOAD;
                  R\S 'LOAD'\EQU LOAD;
                  L\S 'ENABLE'\EQU SELECT\AND ENABLE;
                  R\S 'ENABLE'\EQU NOT(SELECT)\AND ENABLE}];
In the same manner, we can define a few new register cells. The GPRO cell is 
similar to the GPR cell, except that the data contained within the cell is also 
available as a port. The GPRI cell is used as an interface cell, a shared register 
between two processors, for instance. When one processor writes into the cell, the
second processor notices the effect in its corresponding interface cell. 
VAR GPRO=LL;
BEGIN VAR DATA,IN,LOAD,ENABLE,EIN,EOUT=BIT;
   DATA:=NEW_BIT;
   IN:=NEW_BIT;
   LOAD:=NEW_BIT;
   ENABLE:=NEW_BIT;
   EIN:=NEW_BIT;

     [EXTERNALS: [IN_PINS: {EIN\NAMED 'EIN';
                            ENABLE\NAMED 'ENABLE';
                            LOAD\NAMED 'LOAD';
                            IN\NAMED 'IN'}
                  OUT_PINS: {EOUT\NAMED 'EOUT';
                             DATA\NAMED 'DATA'}]
      RELATIONS: {EOUT\EQU [IF:ENABLE THEN:DATA ELSE:EIN];
                  DATA\NEXT [IF:LOAD\AND ENABLE THEN:IN ELSE:DATA]}];
VAR GPRI=LL;
BEGIN VAR DATA,IN,LOAD,ENABLE,EIN,EOUT,DIN=BIT;
   DATA:=NEW_BIT;
   IN:=NEW_BIT;
   LOAD:=NEW_BIT;
   ENABLE:=NEW_BIT;
   EIN:=NEW_BIT;
   EOUT:=NEW_BIT;
   DIN:=NEW_BIT;
   GPRI:=
     [EXTERNALS: [IN_PINS: {EIN\NAMED 'EIN';
                            ENABLE\NAMED 'ENABLE';
                            LOAD\NAMED 'LOAD';
                            IN\NAMED 'IN';
                            DIN\NAMED 'DATA_IN'}
                  OUT_PINS: {EOUT\NAMED 'EOUT';
                             DATA\NAMED 'DATA_OUT'}]
      RELATIONS: {EOUT\EQU [IF:ENABLE THEN:DIN ELSE:EIN];
                  DATA\NEXT [IF:LOAD\AND ENABLE THEN:IN ELSE:DATA]}];
END
As a final example, a shift register loop is described. Externally, the shifter appears 
like a GPR cell, except that shift input signals are included in the interface of the 
cell. The top cell communicates with a series of short shift registers, each of which 
is composed of a series of bits. Hence, the shifter is a hierarchy of shift bits, as 
shown in figure 10-2. 
VAR LOOP_BIT=LL;
BEGIN VAR LIN,RIN,LSHIFT,RSHIFT,OUT=BIT;
   LIN:=NEW_BIT;
   RIN:=NEW_BIT;
   LSHIFT:=NEW_BIT;
   RSHIFT:=NEW_BIT;
   OUT:=NEW_BIT;
   LOOP_BIT:=

                  OUT_PINS: {OUT\NAMED 'OUT'}]
                 {OUT\NEXT [IF:LSHIFT THEN:RIN
                            ELSE: [IF:RSHIFT THEN:LIN
                                   ELSE:OUT]]}];
END
"A LOOP ROW IS A STRING OF N LOOP BITS, ALL PROPERLY CONNECTED."
DEFINE LOOP_ROW(N:INT)=LL:
BEGIN VAR LOOP_BITS=NAMED_LOGIC_LEVELS; L,R,B1,BN=NAMED_LOGIC_LEVEL;










                  OUT_PINS: {B1\S 'OUT'\NAMED 'LOUT';
                             BN\S 'OUT'\NAMED 'ROUT'}]
      RELATIONS: {FOR {L;R} $C LOOP_BITS; COLLECT
                    {L\S 'RIN'\EQU R\S 'OUT';
                     R\S 'LIN'\EQU L\S 'OUT';
                     R\S 'LSHIFT'\EQU B1\S 'LSHIFT';
                     R\S 'RSHIFT'\EQU B1\S 'RSHIFT'}}
      GUTS: LOOP_BITS]
"A LOOP LOOKS MUCH LIKE A GPRO EXTERNALLY, BUT IT CONTAINS AN
M*N+1 BIT SHIFT REGISTER. EXTERNALLY, IT DOES HAVE THE RSHIFT AND
LSHIFT SIGNALS."
DEFINE LOOP(M,N:INT)=LL:
BEGIN VAR LOOPS=NAMED_LOGIC_LEVELS; L,R,B1,BN=NAMED_LOGIC_LEVEL;
          DATA,IN,LOAD,ENABLE,EIN,EOUT,LSHIFT,RSHIFT=BIT;
DO DATA:=NEW_BIT;
   IN:=NEW_BIT;
   LOAD:=NEW_BIT;
   ENABLE:=NEW_BIT;
   EIN:=NEW_BIT;
   EOUT:=NEW_BIT;
   LSHIFT:=NEW_BIT;
   RSHIFT:=NEW_BIT;
   B1:=LOOP_ROW(M);
   LOOPS:={COLLECT B1\NEW REPEAT N;};
   B1:=LOOPS[1];
   BN:=LOOPS[N];

                  OUT_PINS: {EOUT\NAMED 'EOUT';
                             DATA\NAMED 'DATA'}]
      RELATIONS: {EOUT\EQU [IF:ENABLE THEN:DATA ELSE:EIN];
                  DATA\NEXT [IF:LOAD\AND ENABLE THEN:IN
                             ELSE: [IF:LSHIFT THEN:BN\S 'ROUT'
                                    ELSE: [IF:RSHIFT THEN:B1\S 'LOUT'
                                           ELSE:DATA]]];
                  B1\S 'LIN'\EQU DATA;
                  BN\S 'RIN'\EQU DATA;
                  FOR {L;R} $C LOOPS; COLLECT
                    {L\S 'RIN'\EQU R\S 'LOUT';
                     R\S 'LIN'\EQU L\S 'ROUT';
                     R\S 'LSHIFT'\EQU LSHIFT;
                     R\S 'RSHIFT'\EQU RSHIFT};
                  B1\S 'LSHIFT'\EQU LSHIFT;
END


















Fig. 10-2: Shifter Loop Block Diagram 
These examples illustrate the design of leaf cells and composition cells. Each cell 
(LL) contains an interface specification (EXTERNALS), an interconnection 
specification (RELATIONS), and a subcell specification (GUTS). Leaf cells do not
have any GUTS, only EXTERNALS and RELATIONS. Composition cells have values in 
all three areas. 
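The three-part shape of a cell can be pictured as a record with an optional list of subcells. The following Python sketch is only an illustration: the class `LogicLevel`, its field layout, and the text form of the relations are hypothetical, and the GPR_PAIR pin list is inferred from its VAR declaration rather than taken from the listing.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogicLevel:
    """Hypothetical stand-in for an LL cell: EXTERNALS, RELATIONS, GUTS."""
    name: str
    externals: Dict[str, List[str]]   # e.g. {"IN_PINS": [...], "OUT_PINS": [...]}
    relations: List[str]              # equations kept as plain text in this sketch
    guts: List["LogicLevel"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # Leaf cells have no GUTS, only EXTERNALS and RELATIONS.
        return not self.guts

    def depth(self) -> int:
        # Height of the cell hierarchy rooted at this cell.
        return 1 + max((g.depth() for g in self.guts), default=0)


gpr = LogicLevel(
    name="GPR",
    externals={"IN_PINS": ["EIN", "ENABLE", "LOAD", "IN"], "OUT_PINS": ["EOUT"]},
    relations=["EOUT EQU IF ENABLE THEN DATA ELSE EIN",
               "DATA NEXT IF LOAD AND ENABLE THEN IN ELSE DATA"],
)

# Composition cell: two GPR subcells (pin list inferred, not from the listing).
gpr_pair = LogicLevel(
    name="GPR_PAIR",
    externals={"IN_PINS": ["IN", "LOAD", "SELECT", "ENABLE"], "OUT_PINS": ["EOUT"]},
    relations=["L.EIN EQU R.EOUT"],
    guts=[gpr, gpr],
)
```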
The first version of Bristle Blocks was completed in December, 1978. Version one 
produced small datapath chips, in a variety of representations. The compiler 
produced the NMOS artwork, along with a stick diagram, transistor diagram, logic 
diagram, and block diagram of the chip. In all later versions of Bristle Blocks, the 
capability of multiple representations (even multiple technologies) has been an 
integral part of the system, although the datapath cells were designed to produce 
only layouts, due to the press of time. 
In the two and a half years since the first running of Bristle Blocks, there have been 
several areas of improvement upon the basic system. Work has been done on the 
Virtual Memory (VM) system, which greatly increased the compilable chip size. 
Many of the algorithms, like the river router and instruction decode generator, 
have been improved and tested. User interfaces have been added to allow 
non-specialists to use the system. Finally, the variety of datapath elements has 
increased, improving the efficiency and flexibility of Bristle Blocks. 
To provide efficient generation of artwork, Bristle Blocks cells were designed to be 
programs, rather than data structures. If the cells were data structures, the user 
would be limited to designing cells expressible in the data structure. Since the user
is allowed to write programs for the cells, the user is only limited by the
expressibility of the language Bristle Blocks is written in (ICL). ICL allows much
greater expressibility than a simple data structure would allow, so that the user
can design very flexible cells. 
Unfortunately, the PDP-10 computer has a very small address space, with only 18 
bits for addresses. In current versions of ICL, programs are not swappable to the 
disk, although data structures can be swapped to the disk. Since data structures are 
swappable, we can have a very large effective address space by saving the 
information contained in the data structures in a disk file. The system can read this 
information as it is required, and when the data is no longer needed, the data can be
written back into the file. With swapping, we could effectively have a much larger
address space if our cells were data structures.
To make use of swapping, yet still retain the power of cells as programs, a 
compromise was made. Most cells have a lot of relatively fixed, or constant, layout. 
The fixed portion of the cell can be stored in a data structure, and thereby can be 
swapped to a disk file. The variable portions of the cell can be kept as a program. 
The cells compute the variable portions of the layout and swap in the fixed layout 
sections. Partitioning the cells in this manner does add to the complexity of the 
compiler and of the cells, but users of the system never see the additional 
complexity. 
To free up as much code space as possible, we need to have as much of the cells as 
possible represented in the swappable data structure. To this end, the data 
structures used in Bristle Blocks allow the representation of simple variations in 
the layout and connection points. In many cases, the actual code required by a cell 
simply checks the user's parameters and swaps in the cell implementation from the 
data file, which is called the Virtual Memory (VM) file. 
The following data structure definitions describe the structures used in the current 
version of Bristle Blocks. 
The first primitive user-defined datatype in Bristle Blocks is called the
STRETCH_POINT. The name refers to the common use of the datatype, although a better
name would probably be VARIABLE. The data structures refer to these STRETCH_POINTs
using the ID number as a name. To stretch a layout, the appropriate
STRETCH_POINT's value is modified, and the layout is effectively changed.









STRETCH_POINTS= { STRETCH_POINT };
VAR STRETCH_POINTS_VALID=BOOL; 
The NAME component of STRETCH_POINTs holds the user's names for the
STRETCH_POINTs. The system looks through the global STRETCH_POINT list to convert a
name to a STRETCH_POINT. The ID is the internal identification assigned by Bristle
Blocks to the STRETCH_POINTs. The remaining components are used to compute the value
of a STRETCH_POINT. The XFRM component may contain an algorithm for
computing a STRETCH_POINT's value: a STRETCH_POINT's value may depend upon
other STRETCH_POINT values. The FRESH component states whether the FINAL
component holds the actual value of the STRETCH_POINT. Whenever a
STRETCH_POINT's value is modified, all of the STRETCH_POINTs in the system have their
FRESH value set FALSE. When computing a STRETCH_POINT's value, the FRESH
component is examined. If FRESH is TRUE, the FINAL component holds the value. If
FRESH is FALSE, the system recomputes the final value. The FINAL value is set to
the INITIAL value, and the FRESH component is set TRUE. The XFRM is then
evaluated, and the resulting value is stored in the FINAL component.
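The FRESH/FINAL bookkeeping described above amounts to a cache that is invalidated on every modification. A minimal Python sketch follows. It assumes that "modifying a value" means changing the INITIAL component and that an XFRM can be modeled as a function of the whole table; both are assumptions, and the names `StretchPoint` and `StretchTable` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class StretchPoint:
    name: str
    id: int
    initial: float
    xfrm: Optional[Callable[["StretchTable"], float]] = None
    final: float = 0.0
    fresh: bool = False


class StretchTable:
    """Global list of stretch points, looked up by user name."""

    def __init__(self) -> None:
        self.points: Dict[str, StretchPoint] = {}

    def add(self, name, initial, xfrm=None) -> StretchPoint:
        sp = StretchPoint(name, len(self.points) + 1, initial, xfrm)
        self.points[name] = sp
        return sp

    def set(self, name, value) -> None:
        # Any modification marks every stretch point stale.
        self.points[name].initial = value
        for sp in self.points.values():
            sp.fresh = False

    def value(self, name) -> float:
        sp = self.points[name]
        if sp.fresh:
            return sp.final
        # Recompute: FINAL := INITIAL, FRESH := TRUE, then evaluate XFRM.
        sp.final = sp.initial
        sp.fresh = True
        if sp.xfrm is not None:
            sp.final = sp.xfrm(self)
        return sp.final


table = StretchTable()
table.add("Y1", 10)
# Y2 stays at least 5 above Y1, but never below its own initial value.
table.add("Y2", 12, xfrm=lambda t: max(t.value("Y2"), t.value("Y1") + 5))
first = table.value("Y2")
table.set("Y1", 3)
second = table.value("Y2")
```

Because FRESH is set TRUE before the XFRM runs, an XFRM that refers to its own stretch point sees the INITIAL value instead of recursing. This is one plausible reading of why the steps are ordered as they are, not something the text states.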
COORDINATES are used to express equations in the system. A COORDINATE may 
state, for example, that a certain feature be positioned with a Y-coordinate of 5 
lambdas above the 'Y1' STRETCH_POINT. This equation is stated as follows.
'Y1'\P 5
The datatypes associated with COORDINATEs are listed here. 
TYPE COORDINATE= EITHER
                   INTEGER= INT
                   STRETCH= STRETCH_POINT
                   OP=      [OP:COORDINATE_OP A,B:COORDINATE]
                   NEGATE=  COORDINATE
                   IF=      [REL:IF_RELATION C,A,B:COORDINATE]
                 ENDOR;
     COORDINATES= { COORDINATE };
     COORDINATE_OP= SCALAR(ADD,SUB,MUL,DIV,MIN,MAX);
     IF_RELATION= SCALAR(ZERO,NZERO,NEG,NNEG,POS,NPOS,EVEN,ODD);
In the simplest case, a COORDINATE may be an INTeger. A STRETCH_POINT may also
be a COORDINATE. A COORDINATE may be a simple function of two other
COORDINATES: A OP B, where A and B are coordinates and OP is either ADD, SUB,
MUL, DIV, MIN, or MAX. A COORDINATE may be the negation (NEGATE) of another
COORDINATE, and finally, a COORDINATE may be an IF ... THEN ... ELSE ... FI equation:
the C COORDINATE is compared with relation REL. If the comparison is TRUE, the
value of A is returned. Otherwise, the value of B is returned.
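A COORDINATE is in effect a small expression tree. The Python sketch below shows how such a tree might be evaluated. The class names are hypothetical, and three points are assumptions: STRETCH_POINT values are supplied by a plain dictionary, DIV is integer division, and the \P operator in 'Y1'\P 5 corresponds to the ADD operation.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Stretch:
    name: str


@dataclass
class Op:
    op: str
    a: "Coordinate"
    b: "Coordinate"


@dataclass
class Negate:
    c: "Coordinate"


@dataclass
class If:
    rel: str
    c: "Coordinate"
    a: "Coordinate"
    b: "Coordinate"


Coordinate = Union[int, Stretch, Op, Negate, If]

OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "MUL": lambda a, b: a * b,
    "DIV": lambda a, b: a // b,   # assumption: integer division
    "MIN": min,
    "MAX": max,
}

RELS = {
    "ZERO": lambda c: c == 0, "NZERO": lambda c: c != 0,
    "NEG": lambda c: c < 0,   "NNEG": lambda c: c >= 0,
    "POS": lambda c: c > 0,   "NPOS": lambda c: c <= 0,
    "EVEN": lambda c: c % 2 == 0, "ODD": lambda c: c % 2 == 1,
}


def evaluate(c: Coordinate, env: Dict[str, int]) -> int:
    """Evaluate a coordinate tree against current stretch point values."""
    if isinstance(c, int):
        return c
    if isinstance(c, Stretch):
        return env[c.name]
    if isinstance(c, Op):
        return OPS[c.op](evaluate(c.a, env), evaluate(c.b, env))
    if isinstance(c, Negate):
        return -evaluate(c.c, env)
    if isinstance(c, If):
        chosen = c.a if RELS[c.rel](evaluate(c.c, env)) else c.b
        return evaluate(chosen, env)
    raise TypeError(f"not a coordinate: {c!r}")


# 'Y1'\P 5: five lambdas above the 'Y1' stretch point.
above_y1 = Op("ADD", Stretch("Y1"), 5)
# Clamp Y1 at zero: IF Y1 is NEG THEN 0 ELSE Y1.
clamped = If("NEG", Stretch("Y1"), 0, Stretch("Y1"))
```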
Using the above definition of COORDINATE, definitions for wires, boxes (VBOXES), 
and polygons (SSXY) were defined. These primitives were not associated with mask 
layers, but all the primitives of a single layer (within a single cell), were collected 
into a single MASK_LAYER.
TYPE XY_PAIR= [X,Y:COORDINATE];
     SXY= { XY_PAIR };
     SSXY= { SXY };
     WIRE= [WIDTH:INT PATH:SSXY];
     WIRES= { WIRE };
     VBOX= [LOW,HIGH:XY_PAIR];
     VBOXES= { VBOX };
     MASK_LAYER= [COLOR:COLOR WIRES:WIRES BOXES:VBOXES POLYS:SSXY];
     MASK_SET= { MASK_LAYER };
     DMASK_SET= a swappable version of MASK_SET ;
A collection of MASK_LAYERs formed a MASK_SET, which was the complete set of
geometric primitives for a particular representation. The PICTURE datatype 
described one representation. 
TYPE VIEW= SCALAR(LAYOUT,STICKS,TRANS,BLOCK,LOGIC);
     PICTURE= [VIEW:VIEW MASKS:DMASK_SET];
     PICTURES= { PICTURE };
The next set of datatype definitions described connection points. Connection points 
could be kept with the artwork, swapped out in the disk file. Connection points 
contain a name, positions, signal direction (into or out of the cell), connection type 
(control connection, pad connection, etc.), buffer type or pad type, connection edge 
(north, south, east, or west), timing information, layer information, and the















SCALAR(CONTROL,PAD,CONDITION,....)
SCALAR(PHI_1,PHI_2,P1MUX,P2MUX,P1INV,P2INV,VDD,GND,
       BUFIN,BUFOUT,BUFINV)
SCALAR(IN,OUT,DOWN,IO,IO_DOWN,ENABLE,OUT_ENABLED,












Next, we have the BLOCK definition. A BLOCK is the basic cell in Bristle Blocks. It 
contains a name, some layout information (pictures), calls to subblocks, connection 
points, and a bounding box. Recall that many of the BLOCKs for a particular chip 
are computed by a program. These BLOCKs have enough flexibility, however, that 
many of the datapath cells can be represented as BLOCKs rather than programs. 











BLOCKS= { BLOCK };
DBLOCK= a swappable version of BLOCK ; 
Lastly, we have definitions for CALLs. A CALL is a reference to a subBLOCK. 












V:[I:XY_PAIR N:COORDINATE]]
WITH_MASKS= [C:CALL M:MASK_MAKERS]
PASS_MASKS= [C:CALL N:SI]
MASKED= [C:CALL N:INT]
IF= [REL:IF_RELATION C:COORDINATE A,B:CALL]
ENDOR; 







MASK_MAKERS= { MASK_MAKER };
The first four types of CALLs are fairly straightforward. A STRING CALL places a
subCALL at each point in the list of XY_PAIRs (SXY). A VECTOR CALL evaluates the
V.N COORDINATE to determine an iteration count. The V.I XY_PAIR specifies a step
distance. The VECTOR CALL will return a row of the subCALLs, each offset from
the previous instance by the step distance. The total number of instances in the row
is given by the iteration count. The CALLS CALL allows a BLOCK to refer to several
subBLOCKs. The next three types of CALLs specify masks. Each of the iteration
type CALLs (STRING, VECTOR, CALLS) can be masked: only a few of the specified
subCALLs will be returned. The WITH_MASKS CALL adds masks to a global list of
masks. PASS_MASKS reorders the masks in the global list, and MASKED extracts one
mask from the list and applies the mask to the subCALL. Finally, the IF CALL
returns one of its subCALLs depending upon the correspondence of its COORDINATE
and relation (similar to the IF type COORDINATE).
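The iteration CALLs can be pictured as functions that return placement offsets for a subCALL. The Python sketch below covers STRING, VECTOR, and masking. The function names are hypothetical, the first VECTOR instance is assumed to sit at the origin, and a mask is modeled as a list of booleans, which is only a guess at what a MASK_MAKER produces.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def string_call(points: Sequence[Point]) -> List[Point]:
    """STRING: one instance of the subCALL at each listed point."""
    return list(points)


def vector_call(step: Point, count: int, origin: Point = (0, 0)) -> List[Point]:
    """VECTOR: `count` instances, each offset from the previous one by `step`."""
    x0, y0 = origin
    dx, dy = step
    return [(x0 + i * dx, y0 + i * dy) for i in range(count)]


def masked(placements: Sequence[Point], mask: Sequence[bool]) -> List[Point]:
    """Keep only the placements whose mask entry is True."""
    return [p for p, keep in zip(placements, mask) if keep]


row = vector_call(step=(7, 0), count=4)
every_other = masked(row, [True, False, True, False])
```

In this picture a decoder-style array is a VECTOR of cells with a mask selecting which positions actually receive a subcell.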
These datatype definitions were arrived at through many iterations and trials.
They are not as general or easy to use as straight procedural cells, but they sufficed 
with the implementation restrictions that existed. 
10.2: The Future 
In the future, there are four areas of improvement needed in Bristle Blocks. The 
first area has to do with the implementation concessions using the current ICL 
implementation. Secondly, the floorplan of Bristle Blocks needs to have a greater 
flexibility, which would allow more efficient implementations of many datapath
chips. Thirdly, more work has to be done with the simulation aspects of the chips.
Finally, I need to address the user specification issues. What languages are suitable
for the specification of Bristle Blocks chips? 
The main implementation concession in the current Bristle Blocks programs has to 
do with the address space limitations. Because of the 18 bit limit, the datapath cell 
programs have had a split personality. Portions of cells are data structures kept in 
disk files, while the remaining portions exist as programs compiled into the Bristle 
Blocks system. In the new ICL system, code is swappable, so that the cells can be 
entirely represented as programs without exceeding the address space of the 
machine. 
The second improvement to Bristle Blocks modifies the floorplan of the compiler. 
In addition to allowing a greater number of buses in the datapath, I would like to 
add greater flexibility in the instruction decode portion of the chip. The most 
logical way to enhance the instruction decoder is to perform a fusion of Bristle 
Blocks with the RELAY compiler, allowing the user to design chips which are 
hierarchical compositions of register transfer units and finite state machines. The 
datapath portion of the compiler would generate the efficient register transfer 
circuitry and the PLA portion of the compiler would generate the random logic and 
state machine mechanisms. The proposed compiler will interconnect the various 
datapaths and PLAs using a hierarchical general interconnection system. 
Thirdly, I need simulation procedures in Bristle Blocks. Each version of Bristle 
Blocks has had hooks for linking simulators to the compiler, both register transfer 
simulators and timing simulators. Due to the press of time, these simulators have 
only been dreams. When I have the added flexibility of the Bristle Blocks/RELAY 
fusion, simulation will become a very important aspect of the design. I do not plan 
to do electrical model simulations of the entire chip. The simulation will be 
performed in much the same manner as the layouts are generated. Since the user
provides a very high level specification of the design in the well defined design 
language, RT simulations and timing information can be generated directly from the 
high level specification, without having to generate the artwork and examine the 
resulting layout. 
Finally, I need to develop languages for specifying Bristle Blocks chips which also 
capture the random logic/state machine information. These languages should feel 
natural to the designer, so that the designer can easily express his desires, and so 
that the user can intuitively grasp the meaning of expressions in the language. A 
lower bound exists on the information content required in a chip specification.




Appendix 1: ICL Summary and ICLIC Reference Guide 
This appendix summarizes some of the language features of ICL and lists the ICLIC 
functions used in this thesis for describing integrated circuit layouts. For a more
detailed description of ICL, refer to the ICL appendix of Ron Ayres' thesis [3]. A 
more complete description of ICLIC is given in the ICLIC manual [4].
A 1.1: ICL Summary 
For the purposes of understanding the code examples presented in this thesis, ICL is 
very similar to PASCAL, with the following exceptions. 
Pointers: ICL makes use of pointers in its memory management scheme, like 
PASCAL. However, the pointers are implicit in ICL, whereas the user must 
explicitly state when pointers are to be used in PASCAL. 
Strings: ICL does not have a mechanism for building arrays. Instead, ICL allows 
the user to build strings. Most languages allow text strings to be arbitrarily long. 
In ICL, the user may build structures which are arbitrarily long strings of any 
desired datatype. Strings are generated in ICL by enclosing the string elements in 
curly brackets, {}. The elements of the string are separated by semicolons. 
Elements can be appended to the front of an existing string using the <$ operator,
and elements can be appended to the end of an existing string using the $> operator.
The $$ operator concatenates two strings. Elements of a string can be examined by
indexing into the string. The ith element of string S is accessed by writing S[i].
The tail of a string (all elements from a specified index to the end of the string) is
accessed by writing S[i-]. Quantifiers can be used to sequentially access elements
in a string without indexing into the string. 
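For readers more familiar with list-based languages, the string operators behave much like list operations. The Python analogy below is a sketch only: the helper names are hypothetical, the operand order of <$ and $> is assumed, and indexing is assumed to start at 1 (suggested by uses such as LOOPS[1] and LOOPS[N] in chapter 10, but not stated here).

```python
from typing import List, TypeVar

T = TypeVar("T")


def prepend(x: T, s: List[T]) -> List[T]:
    """Analogue of  x <$ s : a new string with x at the front."""
    return [x] + s


def append(s: List[T], x: T) -> List[T]:
    """Analogue of  s $> x : a new string with x at the end."""
    return s + [x]


def concat(a: List[T], b: List[T]) -> List[T]:
    """Analogue of  a $$ b : concatenation of two strings."""
    return a + b


def element(s: List[T], i: int) -> T:
    """Analogue of  S[i] , assuming indices start at 1."""
    return s[i - 1]


def tail(s: List[T], i: int) -> List[T]:
    """Analogue of  S[i-] : elements from index i to the end."""
    return s[i - 1:]


s = concat(prepend(1, [2, 3]), append([4], 5))
```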
Record Generation: ICL has record constructs similar to PASCAL's. There are 
differences between the record generation processes of the two languages. In 
PASCAL, one must explicitly request a chunk of memory for the record, then
sequentially fill each component of the record. In ICL, one never requests chunks of
memory. Instead, one merely specifies the record template with the desired values 
for each component. 
Points: POINT is a basic datatype in ICL, just like integers and reals. A POINT 
contains two real values, which are usually interpreted as X and Y coordinates of a 
point in two-space. Points are generated using the binary operator #. 3#4 is the 
point whose x-coordinate is 3 and whose y-coordinate is 4. The x-coordinate of a
POINT P is accessed by writing P.X.
Polymorphic Functions: In ICL, the user can specify any number of functions 
(procedures) with the same name. There is no ambiguity if the set of input
parameters and return parameters uniquely determine the proper function to apply. 
For instance, the user may have a WRITE(INTeger) function, a WRITE(REAL) 
function, and a WRITE(CHARacter) function. For each call to a WRITE function, ICL 
selects the appropriate function based upon the parameter types. If the user writes 
WRITE(5), the WRITE(INTeger) routine is called; if the user writes WRITE(5.), the 
WRITE(REAL) routine is called. 
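The WRITE example can be imitated in Python with `functools.singledispatch`. This is only an analogy: Python selects the implementation at run time from the type of the first argument alone, whereas ICL, as described above, resolves the call from the full set of input and return parameter types. The function name `write` and the returned strings are hypothetical.

```python
from functools import singledispatch


@singledispatch
def write(value) -> str:
    # Fallback when no WRITE has been declared for this type.
    raise TypeError(f"no WRITE defined for {type(value).__name__}")


@write.register
def _(value: int) -> str:
    # Plays the role of WRITE(INTeger).
    return f"INT {value}"


@write.register
def _(value: float) -> str:
    # Plays the role of WRITE(REAL).
    return f"REAL {value}"


@write.register
def _(value: str) -> str:
    # Plays the role of WRITE(CHARacter).
    return f"CHAR {value}"


as_int = write(5)    # dispatches on int
as_real = write(5.)  # dispatches on float
```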
Coercions: In most languages, there are predefined arithmetic coercions. If the 
user assigns an INTeger value to a REAL variable, the compiler automatically calls a 
routine which translates INTegers to REALs. In ICL, the user may declare coercions 
between any datatypes. ICL will implicitly apply coercions to satisfy datatype 
requirements. 
Infix Operators: Math operators, such as + and -, are infix operators: one writes 
A + B rather than +(A,B). Binary function definitions (functions which take two 
parameters and return one value) typically do not use infix format: f(A,B), not A f 
B. In ICL, any binary function may use the infix format when the function name is 
preceded by the \ operator. f(A,B) can be written A \f B.
Quantifiers: Virtually every language has constructs for generating loops in the 
program control flow. These loops may be arithmetic loops (FOR loops) or 
conditional loops (WHILE loops or REPEAT loops). In addition to these standard loop 
generators (quantifiers), ICL has mechanisms for sequencing through strings (FOR 
element $E string;). ICL also has unary and binary operators which apply to 
quantifiers. The && operator forces two loops to iterate together; the !! operator 
steps one quantifier for each iteration of the other quantifier. Unary operators may 
eliminate some iterations of the quantifier or perform some actions before each 
iteration of the quantifier. 
Suspendable Functions: The suspendable function mechanism in ICL allows 
the user to assign function call references to variables. A reference to function X 
may be assigned to the variable Y by writing Y:= //X\\;. Later, function X may be
invoked by writing <*Y*>.
A 1.2: ICLIC Reference Guide 
The datatype definitions used in ICLIC are listed here: 
TYPE SP= { POINT };
     WIRE= [WIDTH:REAL PATH:SP];
     RG= EITHER
           POLY=   SP
           WIRE=   WIRE
           BOX=    BOX
           UNION=  MRGS
           MATRIX= [DISPLACE:MRG BY:MATRIX]
           POINT=  [DISPLACE:MRG BY:POINT]
           COLOR=  [COLOR:MRG WITH:COLOR]
           DISK=
         ENDOR;
     MRG= [RG:RG VMBB:BOX .....];
     MRGS= { MRG };
     COLOR= SCALAR(RED,BLUE,GREEN,YELLOW,BLACK,GLASS,BROWN,VIOLET,BURIED);
     MATRIX= [A,B,C,
              D,E,F:REAL];
These definitions declare that SP (String of Points) is an indefinite list of points, a 
WIRE contains a width and a path, and a BOX is two points. An RG (ReGion) may 
either be a POLYgon, represented by an SP, a WIRE, a BOX, an arbitrary list of MRGs, 
an MRG whose points are transformed, a displaced MRG, an MRG with an associated 
COLOR, or other types which are not used in this thesis. An MRG contains an RG 
along with a Virtual bounding box and other internal data. 




DEFINE AT(M:MRG P:POINT)=MRG:
DEFINE ROT(M:MRG ANGLE:REAL)=MRG:
DEFINE MIRX(M:MRG)=MRG:
DEFINE MIRY(M:MRG)=MRG: ....
DEFINE PAINTED(M:MRG C:COLOR)=MRG:








The TO function takes two points and makes a box. AT takes an MRG and a POINT
and generates a new MRG identical to the first MRG with all features displaced by
the amount specified by the point. ROT takes an MRG and a REAL and generates a
new MRG identical to the first but rotated counterclockwise the number of degrees
specified by the REAL. Similarly, MIRX and MIRY mirror about the X and Y axes,
respectively. PAINTED applies the given COLOR to the given MRG, and UNION takes
two MRGs and merges them. To generate an array of identical MRGs, the following
routine can be used:
TYPE ARRAY_OF_DOTS= [IX,IY:REAL NX,NY:INT];
DEFINE AT(M:MRG A:ARRAY_OF_DOTS)=MRG: ENDDEFN
IX and IY specify the distance between columns and rows, and NX and NY specify 
the number of columns and rows. To easily generate colored geometric primitives, 
the following routines have been defined: 
DEFINE WIRE(C:COLOR W:REAL P:SP)=MRG:
DEFINE WIRE(C:COLOR P:SP)=MRG:
DEFINE BOX(C:COLOR S:BOX)=MRG:
DEFINE POLYGON(C:COLOR SP:SP)=MRG: ....





The second wire function does not require a width parameter: it uses the default 
width for the given color. The DISK function configures the MRG so that it can 
swap to a disk file with the virtual memory system in ICL. The color interpretation





polysilicon











second layer metal 
buried contacts 











Green-to-Blue feedthrough (Green-Contact-Blue)
Red-Contact-Blue
Butting contact, Red 'Up' (Green-Red-Contact-Blue-Up)
Butting contact, Red 'Left' 
Butting contact, Red 'Down' 




Global variables and routines: 
LAMBDA=REAL
The basic dimension for describing layouts
WIDTH(REAL)=REAL
The width of metal wire required to supply power to the given number
of squares of pullup. For example, to supply 100 minimum size
inverters whose pullups are each 1/4 squares wide, the metal wire
should be WIDTH(100*(.25)) wide.
WIDTH(COLOR)=REAL
The default width of features for the given layer
SPACING(COLOR,COLOR)=REAL
The spacing between feature edges of the two colors
CENTER_TO_CENTER(COLOR,COLOR)=REAL
The center-to-center spacing for wires of default sizes on the two
layers
O_LDAD=REAL 
The capacitive load for the minimum 
LOAD(COLOR,BOX)=REAL
The capacitive load for the box
LOAD(WIRE)=REAL
The capacitive load for the wire
There are routines for input/output of MRGs: 
PLOT(PICTURE,PLOTTER);
where PICTURE may be one of:
an MRG
AIF(file-name)
AIF(file-name,list_of_colors)







AIF(file-name)
AIF(file-name,list_of_colors)
MBB(MRG)=BOX                the minimum bounding box of the MRG
CIF2_OUT(MRG,file-name);    produces a CIF file
CIF2_IN(file-name)=MRG      reads a CIF file
Appendix 2: Imbedded Language Example 
The code listed here generates the parameterized shift register cell presented in 
chapter 3. There are several parameters used in the routines below. The following 














The length of the pullup transistor in lambda
The length of the pulldown transistor in lambda
The width of a power line which supplies half a
row of cells
The power line width for a whole row
The power line width for two rows
The power line width for the entire array
The number of shift register bits in a row
The number of rows for each shift register (always an
odd number, which indicates how many times the
long shift register is folded)
The number of shift registers in the array
The number of bits in the last row of the shift register
The total number of bits in each shift register
The first set of routines generate a single bit of the shift register. There are six 
routines: each generates layouts with one of the six aspect ratios. The first five cell 
layouts generate only one layout for the cell, but the last generates different 
layouts for adjacent bits. By alternating the two layouts, the total array size is less. 
For this reason, the SHIFT_CELL datatype is defined, which can contain two MRGs.
The first five routines only use the ODD component of the SHIFT_CELL, while the
last routine uses both. 
TYPE SHIFT_CELL= [EVEN,ODD:MRG];
DEFINE SHIFT1_CELL(PU,PD,DP:REAL)=SHIFT_CELL:
BEGIN VAR CVDD=REAL;
DO CVDD:=8+DP/2 MAX 4+PU;




t.JIRE(REO, 14#3;.#.5;1#-2.5;.#-3.l l; 
WIRE (GREEN, l0t/0; 1. Sfl.; 4#-2. 5;. #-4. l l; 
~JI RE <RED, !5. 511-8.; 9. 5#-12.; 15#. I J; 
l.JIRE !GREEN, {8#-13. -DP/2;. #-14.; 12#-8.;. #CVOO; 8#. I J; 
IF P0>=3 
THEN POLYGON!GREEN, fGfl-15. ;8+PO#.; .#-9. ;9.5#. ;G#-12.5}) 
ELSE NIL FI; 
BOX<RE0,9#2\TO 15#2+PUl; 
BOX<YELLOW,9#0\TO 15#4+PUJl\AT 10#0;14#01; 





DEFINE SHIFT2_CELL(PU,PD,DP:REAL)=SHIFT_CELL:
BEGIN VAR CVDD=REAL;
DO CVDD:=11+DP/2+PU;






I.JI RE CREO, 14tlG+PU;. #. 5; 1#-2. 5;. #-3. ! l; 
I.JI RE <GREEN, 10#0; 1. 5#.; 4#-2. 5;. #-6. ! J; 
LJI RE <RED. 14. 5#-10.: 8. 5#-14.; 12#. I); 
UI RE CGREEN, 15#-14. -OP/2;. #-14.; 10#-10.;. #CVODl); 
IF P0>=3 
THEN POLYGON !GREEN, 15#-16.; 6+PD#.;. #-12.;. -1#. +1; 
7.5#. ;5#-13.Sl) 
ELSE NIL FI; 
BOX<RE0,7#2\TO 13#2+PUl; 
BOXCYELLOW,7#.5\TO 12.5#4+PUll\AT 10#0;12#0!: 
WIRECVIOLET, 14#6+PU;.#12.5+PUll; 
WIRECVIOLET, 116#6+PU;.#PU-.5l lll 
DEFINE SHIFT3_CELL(PU,PD,DP:REAL)=SHIFT_CELL:




l-IIRECREO, 15#4; .#-2.1 l; 
tJ I RE (GREEN, 10110: 10#. ; . #3 l } ; 
l.JI RE <RED, 114#5:. #8: 20#.;. #-1. l l; 
LJIRECGREEN, f17#-l.-OP/2; .#1! l: 
WIRECGREEN, 128+PU#0;23#.;.#5:23+PU#.;.#9+0P/2ll; 
IF PD>2 THEN BDX!GREEN,113#0\TO 23#POJ ELSE NIL Fl; 
BOX!RED,25#2\TO 26+PU#8l: 
BOXCYELLOW,23#2\TO 25+PU#10l!\AT 10#0;26+PU#01; 
WIRE !VIOLET, 14#4;. #9+0P /21 ) ; 
WIRE !YI OLET, !30+PU#4;. #-1. -OP/21) l J 
DEFINE SHIFT4_CELL(PU,PD,SP,DP:REAL)=SHIFT_CELL:
BEGIN VAR M=MRG;
DO M:=IRCBCB\AT 1#4; 
GRCBR\AT {8#4;21#41; 
GCB\AT fl3#-l.-SP/2;19+PU#9+SP/21; 
WIRE<REO, 111#4; .tl7;l71J.; .#-2. l l; 
!JI RE <GREEN, 114#-1. -SP/2;. #0; 20#.;. #5; 20+PU#.;. #9+SP/21); 
IF PD>2 THEN BOXCGREEN,13#-1.\TO 21#P0-1J ELSE NIL FI; 
BOX!RE0,22#2\TO 23+PU#8l; 
BOX!YELLOW,20#2\TO 23+PU#l0ll; 




WIRE <RED, !2#4;. #-2. l}; 
WIREIGREEN, l0#0;7#.;.#3l l; 
l.JIRE<REO, 115#-5.-SP;.#.+1.5;21#.+2.5;22#.I ); 
WIREIGREEN, 120#0;.#-5.-SPIJ; 
1-JIRE <GREEN, !33#-2. -SP;. #0; 2G+PU#. I) I l 
DEFINE SHIFT5_CELL(PU,PD,SP,DP:REAL)=SHIFT_CELL:
BEGIN VAR M=MRG;Y1,Y2=REAL;
DO Yl:=-20+SP MAX 2G+PD; 





1-J I RE <RED, 10#13; . #21. 5: 2#23. 5; . #24+PDI l ; 
POLYGON (GREEN. 12#20:5#.; .#23+PD;-2.tl.; .11241}; 
l-l 1 RE (GREEN, 1-1. #23+PO; • #Yl +1; 0#. ; • #Y2-1; 4#. l } ; 
BOX<RE0,-3.llY1+3\TO 3#Yl+PU+3l; 
BOX(YELLOW,-3.#Yl+l\TO 3#Yl+PU+5ll; 
GIVE £000: IM\AT 7#0; 
ENO 
ENDOEFN 
t1\t1 IRY\A T 15#0; 
GCB\AT 111#18; .#Y21; 
RCBCB\AT 17#-17.;14#-10.I; 
l-JIRE !GREEN, 10#-1.; .#.5;5#5.5; .#9! l; 
III RE (GREEN, 1811-1.;. ti. 5; 14#5. 5;. #31); 
IJIRE<REO, IG#-15. ;3#.; .#.5;0#3.51 l; 
l-JIRE <RED, 113#-9.; 11#.;. ti. 5: 8#3. 51 l l l 
DEFINE SHIFT6_CELL(PU,PD,SP,DP,HP:REAL)=SHIFT_CELL:
BEGIN VAR M,ML=MRG;Y1,Y2=REAL;
DO Y1:= 23+HP MAX 29+PD;




IJ I RE !RED, 10#1 G; . #24. 5; 2#25. 5; . 1127 +POI J ; 
POLYGON (GREEN. l 2#23: 5#. ; • #25+PO; -2. ti. ; . #271 ) ; 
1-llRE (GREEN. l-1. #25+PD:. t/Yl+l: 0#.;. #Y2-l; 4#. I); 






~IIRE (GREEN. 13#-10.; .#-13. l l; 
I.JI RE !RED. 17#-17. : • #-4. I ) : 
WIRE(VIOLET, 10#10;2.5#-3.ll; 
l.JIRE (VIOLET, 13#-28.; 5. 5#-18.1 >I; 







WIRE<REO, 1-2.#9;3#.; .1171 l; 
~JIRE<REO, 11#-27.;5#.;.#-25.lll 
EVEN: lf'l\t1 I RY\A T 8#0; 
M\ROT 180\AT 11#-18.; 
tlL\AT 8110; 
t.JI RE <RED, !10#9; 5#. ; • 1171 l ; 
WIRE <RED, !13#-27.; 8#.;. #-25.1 l l l 
We would like a series of routines which would take the shift register bits from 
the routines above and generate complete arrays. As one might expect, much of the 
work in generating these arrays is independent of which type of aspect ratio one is 
using, and as one might also expect, there are some differences. Therefore, we have 
a routine FINISH which contains the code which can be common for each of the 
different cell types and individual routines for generating the type-specific data. 
The datatype SHIFT_ROW was created to contain the information which must be 
transferred between each of the SHIFTn_ROW routines and the FINISH routine. 
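As a rough illustration of how the layouts held in a SHIFT_ROW might be stacked for one folded register, the sketch below assigns a layout name to each row. It is written in Python rather than ICL, the function name `row_roles` is hypothetical, and the placement rule (ALT on every second row, MIDDLE between FIRST and LAST) is one reading of the FINISH listing, not a statement of the original implementation.

```python
def row_roles(rb: int) -> list[str]:
    """Return the layout used for each row of one folded shift register.

    rb is the number of rows per register (RB in the text); it is assumed
    to be odd, as the parameter list states.
    """
    if rb < 1 or rb % 2 == 0:
        raise ValueError("rb must be a positive odd number")
    if rb == 1:
        return ["ONLY"]              # an unfolded register uses the ONLY layout
    roles = []
    for row in range(rb):
        if row == 0:
            roles.append("FIRST")
        elif row == rb - 1:
            roles.append("LAST")
        elif row % 2 == 1:
            roles.append("ALT")      # rows that run in the reverse direction
        else:
            roles.append("MIDDLE")
    return roles


print(row_roles(5))
```

Under this reading a register with RB rows contains RB/2 ALT rows and RB/2-1 MIDDLE rows (integer division), which matches the replication counts that appear in FINISH.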
TYPE SHIFT_ROW= [FIRST,MIDDLE,LAST,ALT,ONLY:MRG
                 DOWN,UP,TOP,BOTTOM,REDGE:REAL];
DEFINE FINISH(R:SHIFT_ROW RB,NB:INT TP:REAL)=MRG:
BEGIN VAR CTC,TOP,BOTTOM=REAL;M=MRG;
DO CTC:=R.UP-R.DOWN;
   TOP:=R.UP+R.TOP;
   BOTTOM:=R.DOWN-R.BOTTOM;
   M:=IF RB>1 THEN
        {R.FIRST;
         IF RB>4 THEN
            R.MIDDLE\AT 0#2*CTC\AT [NX:1 NY:RB/2-1 IY:2*CTC]
         ELSE NIL FI;
         R.LAST\AT 0#CTC*(RB-1);
         R.ALT\AT [NX:1 NY:RB/2 IY:2*CTC]}
      ELSE R.ONLY FI;
GIVE {M\AT [NX:1 NY:(NB+1)/2 IY:2*CTC*RB];
      M\MIRX\AT 0#2*RB*CTC+2*R.DOWN\AT [NX:1 NY:NB/2 IY:2*CTC*RB];
      WIRE(VIOLET,{-TP+1.5#BOTTOM+1.5;-3.#.;
                   .#CTC*(RB*NB-1)+TOP-1.5;-TP+1.5#.});

END
ENDDEFN
DEFINE SHIFT1_ROW(PU,PD,DP,TP:REAL NR,RB,NL:INT)=SHIFT_ROW:
BEGIN VAR M,P,R=MRG;LEDGE,REDGE,CVDD,CTC=REAL;
DO LEDGE:=IF RB>1 THEN 7 ELSE 0 FI;
   REDGE:=28*NR+LEDGE+3;
   CVDD:=8+DP/2 MAX 4+PU;
   CTC:=12+DP/2+CVDD;
   M:=SHIFT1_CELL(PU,PD,DP).ODD;
   P:={BOX(BLUE,-1.#-12.-DP\TO REDGE-3#-12.);
       BOX(BLUE,3#8\TO REDGE+1#CVDD+DP/2);
       WIRE(VIOLET,{-3.#3.5;REDGE-3#.});
       WIRE(VIOLET,{3#-3.5;REDGE+3#.})};









FIRST: {R;P;WIRE(GREEN,{1-TP#0;LEDGE#.})}
MIDDLE: {R;P}
LAST: {M\AT LEDGE#0\AT [IX:28 NX:NL NY:1];
       P;WIRE(GREEN,{28*NL+LEDGE#0;REDGE+TP-1#.})}
ALT: {R\ROT 180\AT REDGE#2*CVDD;
      P\MIRX\AT 0#2*CVDD;
      WIRE(GREEN,{4#2*CVDD;-1.#.;.#2*CTC;LEDGE#.});
      WIRE(GREEN,{REDGE-4#0;.+5#.;.#2*CVDD;REDGE-LEDGE#.})}
ONLY: {R;P;WIRE(GREEN,{1-TP#0;LEDGE#.});
       WIRE(GREEN,{REDGE-3#0;REDGE+TP-1#.})}]
DEFINE SHIFT2_ROW(PU,PD,DP,TP:REAL NR,RB,NL:INT)=SHIFT_ROW:
BEGIN VAR M,P,R=MRG;LEDGE,REDGE,CVDD,CTC=REAL;







I.JI RE (VIOLET, 1-3. llPU+12. S;REDGE-3#. I); 
WIRE<VIOLET, 1311PU-.5:REOGE+3#.lll; 








FIRST: IR; P; WI RE (GREEN, ll-TPll0; LEDGE#. I) I 
tl!DDLE: IR;PI 
LAST: 111\AT LEDGEt/0\AT [!X:21+ NX:NL NY:lJ; 
P: 1.JIRE (GREEN, 12r1·::NL+LEDGE#0; REOGE+ TP-1#. l l l 
ALT: IR\ROT 180\AT REDGE#2)·,CVOO; 
P\MIRX\AT 0112*CVOO; 
IJI RE (GREEN, 14#2i'rCVOO; -1. It.:. t/2i•rCTC; LEDGEt/. l); 
l-1 I RE (GREEN. IREDGE-41/B: • +5#. ; • #2,-,cvoo; REDGE-LEDGEll. I ) J 
ONLY: IR; P: l-lI RE (GREEN, 11- TP#0; LEDGE#. I l ; 
LHRE <GREEN, !REDGE-3110;REOGE+TP-l#. l I I l 
DEFINE SHIFT3_ROW(PU,PD,DP,TP:REAL NR,RB,NL:INT)=SHIFT_ROW:
BEGIN VAR M,P,R=MRG;LEDGE,REDGE,CTC=REAL;
DO LEOGE:-IF RB>l THEN PU-5 MAX 1 ELSE 1 FI: 
REDGE:-(52+2*PUl*NR+ IF RB>l THEN 2+ABS<PU-5l ELSE 2 FI; CTC:=l0+DP; 
ti: =SHI FT3_CELL <PU, PO, DP>. ODO; 
P:=IBOX<BLUE,-1.11-1.-DP\TO REDGE-311-1.); 
BOX<BLUE,3#9\TO REDGE+lt/S+DPJ; 
I.I I RE (VIOLET, !-3. #S+DP /2; REDGE-3#. l l ; 
WIRE!VIDLET, 13#-1.-DP/2;REOGE+3#.lll; 






FIRST: IR:P;l.JIRE<GREEN, ll-TP#0;LEOGEll.I ll 




LAST: IM\A T LEDGEll0\A T [IX: 52+2)·,PU NX: NL NY: 1 l ; 
P:WIRECGREEN, 1152+2*PUl*NL+LEDGE#0;.#4;REDGE+TP-l#.;.#01 JI ALT: IR\ROT 180\AT REDGE#l8+0P; 
P\MIRX\AT 0#18+DP; 
l-Jl RE CGREEN, 16-PU f1AX 0#18+0P; -1. #.;. #2)·,CTC; LEDGE#. l); 
1-J!RECGREEN, IREDGE-CG-PU f1AX 0l#0;REOGE+l#.;.#18+0P; 
REOGE-LEDGE#. I l J 
ONLY: IR; P: 1-1 I RE (GREEN, 11-TP#0; LEDGE#. l); 
WIREIGREEN, IREDGE-1#0;.#4;REDGE+TP-l#.;.#01JJJ 







WI RE (VI OLE T, 1-3. #4: REOGE-3#. I J ; 
WIREIVIOLET, 13#-G.-SP;REOGE+3#.JJl; 








FIRST: IR: P; I.JI RE (GREEN, ll-TP#0; 4#. l) I 
MIDDLE: IR:PI 
LAST: U-1\AT 4#0\AT [IX:2G+PU NX:NL NY:lJ; 
P:WIREIGREEN, 1125+PUl*NL+4#0;.#4;REDGE+TP-l#.;.#0J JI ALT: iR\ROT 180\AT REOGE#l8+SP: 
P\~IIRX\AT 0#18+SP: 
I-JI RE CGREEN, 111#l8+SP; -1. #. : . #2,·,CTC; 4#. I ) : 
lJI RE I GREEN, IREOGE-11#0;. +10#. ; . #18+SP; REDGE-4#. J ) I ONLY: !R; P: ~JI RE I GREEN, 11-TP#0; 4#. l J ; 
WI RE <GREEN, IREDGE-11#0;. #4; REOGE+ TP-1#.;. #01 I I J 
DEFINE SHIFT5_ROW(PU,PD,SP,DP,TP:REAL NR,RB,NL:INT)=SHIFT_ROW:
BEGIN VAR M,P,R=MRG;REDGE,CTC,Y1,Y2=REAL;
DO REOGE:=16*NR+2; 
Yl:~ 2l+SP MAX 27+PO; 
Y2:= Yl+ 19+0P/2 MAX G+PLJ); 
CTC:=Y2+1G; 
fl: =SHI FTS_CELL CPU, PO, SP, OPJ. ODD: 
P:=IBOXIBLUE,-1.#18\TO REDGE-3#Yl-3l; 
BOXIBLUE,3#Y1+9\TO REDGE+l#Y2+0P/2l; 
l-JI RE IV I OLET, I -3. #-1 G. : REDGE-3#. l l ; 
lJIRE !VIOLET, 13#-9. ;REDGE+3#. l I I; 
R:=M\AT -2.#1\AT [IX:lG NX:NR NY:lJ; 





FIRST: IR;P;WIRECGREEN, ll-TP#0;-2.tt.l JI 
MIDDLE: IR;Pl 





ALT: IR\ROT 180\AT REDGE#2,·,Y2; 
P\fll RX\AT 0/./2,·:Y2; 
[.JI RE !GREEN, 14#2,·:Y2: -5. #.:. #2,·:CTC; -2. #.I I; 
WIRECGREEN, IREDGE-4#0;.+10#.;.#2*Y2:REDGE+2#.l)l ONLY: IR; P: l.JJ RE !GREEN. 11- TP/10; -2. ti. I l; 
l.JIREIGREEN, IREDGE-4#0;REDGE+TP-l#. I) I J 
DEFINE SHIFT6_ROW(PU,PD,SP,DP,TP,HP:REAL NR,RB,NL:INT)=SHIFT_ROW:
BEGIN VAR ME,MO,P,R=MRG;REDGE,CTC,Y1,Y2=REAL;
DO REOGE:=8*NR+18.5; 
Y1:=23+HP MAX 23+PD; 
YZ:=Yl+ 13+SP/2 MAX G+PUI; 
CTC: "'z,·,Y2+18; 





W RE !BLUE, IREOGE-5115; 5#. I ) ; 
l.JIRE !BLUE, !5#-24. ;REDGE-5#.; .#-33.1); 
BCB\AT 15#5;REOGE-5#-33.I; 
lJIRE (VIOLET I !-3. #5; 5#.1); 
I-JI RE ! VIOLET, !REOGE-5#-33. ; RE OGE +3#. I ) l ; 
R:=lflO\AT 11.5#0\AT UX:15 NX: !NR+ll/2 NY:lJ; 
llE\AT 11.5#0\AT UX:lG NX:NR/2 NY:lJ I; 







FIRST: IR; P: t.JJ RE I GREEN, 11- TP#0; 11. 5#. I I l 
11 !DOLE: !R: Pl 
LAST: lflO\AT 11.51/0\AT f!X:15 NX: !NL+ll/2 NY:lJ; 
flE\AT ll.5f10\AT flX:lG NX:NL/2 NY:lJ; 
P ;l.J I RE I GREEN, !8,·:NL+ll. 5#0: REDGE+ TP-1#. I l l 
ALT: IR\ROT 180\AT REOGE#2i":Y2; 
P\MIRX\AT 0#2*Y2; 
t.JIRE!GREEN, !Gt/2,·:Y2;2#.; .#2>":CTC;ll.5#.l I; 
l.JIRE !GREEN, IREOGE-5#0;. +2#.;. #2,·:Y2; REDGE-11. 5#. Ill ONLY: IR;P:~JIRE!GREEN, ll-TP#0;11.5#.ll; 
t.JIREIGREEN, IREDGE-14#0;REOGE+TP-l#.l llJ 
Each shift array function is now trivial: each calls its corresponding SHIFTn_ROW 
function and the FINISH function. Note that each of the SHIFTn_ROW functions 
requires only a subset of the total list of parameters, whereas the SHIFTn_ARRAY 
functions accept all of the parameters even though they do not use all of them. This 
is done so that other programs do not have to be aware of the differences in the 
parameter requirements. 
DEFINE SHIFT1_ARRAY(PU,PD,SP,DP,TP,HP:REAL NR,RB,NB,NL:INT)=MRG:
   FINISH(SHIFT1_ROW(PU,PD,DP,TP,NR,RB,NL),RB,NB,TP)
ENDDEFN
DEFINE SHIFT2_ARRAY(PU,PD,SP,DP,TP,HP:REAL NR,RB,NB,NL:INT)=MRG:
   FINISH(SHIFT2_ROW(PU,PD,DP,TP,NR,RB,NL),RB,NB,TP)
ENDDEFN
DEFINE SHIFT3_ARRAY(PU,PD,SP,DP,TP,HP:REAL NR,RB,NB,NL:INT)=MRG:
   FINISH(SHIFT3_ROW(PU,PD,DP,TP,NR,RB,NL),RB,NB,TP)
ENDDEFN
DEFINE SHIFT4_ARRAY(PU,PD,SP,DP,TP,HP:REAL NR,RB,NB,NL:INT)=MRG:
   FINISH(SHIFT4_ROW(PU,PD,SP,DP,TP,NR,RB,NL),RB,NB,TP)
ENDDEFN
DEFINE SHIFT5_ARRAY(PU,PD,SP,DP,TP,HP:REAL NR,RB,NB,NL:INT)=MRG:
   FINISH(SHIFT5_ROW(PU,PD,SP,DP,TP,NR,RB,NL),RB,NB,TP)
ENDDEFN
DEFINE SHIFT6_ARRAY(PU,PD,SP,DP,TP,HP:REAL NR,RB,NB,NL:INT)=MRG:
   FINISH(SHIFT6_ROW(PU,PD,SP,DP,TP,HP,NR,RB,NL),RB,NB,TP)
ENDDEFN
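The uniform-signature idea can be sketched outside ICL as follows. This is a minimal Python illustration, not the original code: the row and finish functions are hypothetical stand-ins that only record which parameters they received, and only two of the six classes are shown.

```python
def shift1_row(pu, pd, dp, tp, nr, rb, nl):
    # Hypothetical stand-in: a real row routine would build geometry.
    return {"kind": 1, "used": (pu, pd, dp, tp, nr, rb, nl)}


def shift4_row(pu, pd, sp, dp, tp, nr, rb, nl):
    # Hypothetical stand-in for a row routine that also needs SP.
    return {"kind": 4, "used": (pu, pd, sp, dp, tp, nr, rb, nl)}


def finish(row, rb, nb, tp):
    # Hypothetical stand-in for FINISH: just bundles its inputs.
    return {"row": row, "rb": rb, "nb": nb, "tp": tp}


def shift1_array(pu, pd, sp, dp, tp, hp, nr, rb, nb, nl):
    # sp and hp are accepted but ignored, so callers need not know that.
    return finish(shift1_row(pu, pd, dp, tp, nr, rb, nl), rb, nb, tp)


def shift4_array(pu, pd, sp, dp, tp, hp, nr, rb, nb, nl):
    # hp is accepted but ignored.
    return finish(shift4_row(pu, pd, sp, dp, tp, nr, rb, nl), rb, nb, tp)


# Every class can be driven from one table with one argument tuple.
ARRAYS = {1: shift1_array, 4: shift4_array}
params = (4.0, 2.0, 3.0, 4.0, 6.0, 3.0, 8, 3, 2, 5)
layouts = {cls: make(*params) for cls, make in ARRAYS.items()}
```

The caller passes the same ten values to every class; each wrapper forwards only what its row routine needs.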
To choose between the various possible cell types and configurations, we need to 
know the sizes of all arrays. Since we want to try many configurations but will 
use only one, we do not want to perform the expensive computation of 
generating the arrays until we know which one we want. The SIZE function takes 
the pertinent parameters and computes what the array size would be if we were to 
actually generate that array. This computation is very cheap both in terms of time 
and memory space. The SIZE function returns a POINT whose x coordinate is the 
horizontal size of the array and whose y coordinate is the vertical size. The SIZE 
function also returns a Suspendable Function. The suspendable function is 
generated inside the //: \\ characters. This function is not executed, but is a 
freeze-dried function call. In this usage, all of the parameters for the call to the 
SHIFTn_ARRAY functions are evaluated, but the SHIFTn_ARRAY function is not 
called. At any time in the future we may, if we wish, actually perform the 
function call and receive the resulting layout. The datatype SHIFT_MAKER is our 
freeze-dried function call, and SHIFT_RESULT is the datatype which SIZE returns, 
containing both the array size and the suspendable function. 
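A suspended call of this kind can be approximated in Python with `functools.partial`, which binds the arguments now and defers the call. The sketch below is an analogy only: `shift_array`, the bits-per-row rule, and the size formulas are illustrative assumptions, not the thesis's SIZE computation.

```python
from functools import partial

calls = []  # records every time the expensive generator actually runs


def shift_array(cls, nr, rb, nb):
    # Hypothetical stand-in for an expensive SHIFTn_ARRAY layout generator.
    calls.append((cls, nr, rb, nb))
    return f"layout(class={cls}, nr={nr}, rb={rb}, nb={nb})"


def size(nb, tb, cls, rb):
    """Return (estimated size, suspended generator) without building geometry."""
    nr = -(-tb // rb)                          # assumed: bits per row, rounded up
    width = 28 * nr + (10 if rb > 1 else 3)    # illustrative numbers only
    height = 30 * nb * rb                      # illustrative numbers only
    maker = partial(shift_array, cls, nr, rb, nb)  # arguments bound, call deferred
    return (width, height), maker


estimate, maker = size(nb=2, tb=24, cls=1, rb=3)
print(estimate, len(calls))   # the size is known, yet no layout has been built
layout = maker()              # only now does the generator run
```

Many candidates can be sized this way, and only the winner's `maker` is ever invoked.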
TYPE SHIFT_MAKER=//MRG\\;
     SHIFT_RESULT=[SIZE:POINT SS:SHIFT_MAKER];
DEFINE SIZE(NB,TB:INT POWER:REAL CLASS,RB:INT)=SHIFT_RESULT:
BEGIN VAR PU,PD,SP,DP,TP,HP=REAL;NR,NL=INT;
DO PU:=2./POWER MAX 15./3.;





TP:=WIDTH(TB*NB*POWER);
HP:=WIDTH(POWER*NR/2);
GIVE IF CLASS=l THEN 
ENO 
ENOOEFN 
[SIZE: 28*NR+ IF RB>l THEN 10 ELSE 3 FI +2*TP # 
((8+0P/2 MAX 4+PUl+l2+DP/21*NB*RB+OP 
SS://:SHIFTl_ARRAY[PU,PD,SP,OP,TP,HP,NR,RB,NB,NLJ\\J 
EF CLASS=2 THEN 
£SIZE: 24*NR+ IF RB>l THEN 9 ELSE 4 Fl +2*TP # 
125+DP+PUl*RB*NB+OP 
SS://:SHIFT2_ARRAY[PLJ,PD,SP,OP,TP,HP,NR,RB,NB,NLJ\\J 
EF CLASS=3 THEN 
[SIZE: C52+2*PUl*NR+ IF RB>l THEN 2+ABSCPU-6l ELSE 2 FI +2*TP # 
( l 0+0P I ,·:RB,·,NB+OP 
SS://:SHIFT3_ARRAY[PLJ,PD,SP,OP,TP,HP,NR,RB,NB,Nll\\J 
EF CLASS=4 THEN 
[SIZE: C2S+PUJ*NR+l5+2*TP # C20+2*SPl*RB*NB+SP 
SS://: SHIFT4_ARRAY CPU.PO, SP, OP, TP, HP, NR, RB, NB, NU\ \J 
EF CLASS=5 THEN 
(SIZE: 16*NR+2+2*TP # 
ELSE 
<16+<2l+SP MAX 27+PDl+<9+0P/2 MAX S+PUll*RB*NB+OP/2+2 
SS://:SHIFTS_ARRAY[PU,PO,SP,OP,TP,HP,NR,RB,NB,NLJ\\J 
[SIZE: 8*NR+l8.5+2*TP # 
12*1C23+HP MAX 29+POl + CS+SP/2 MAX S+PUlJ+181*RB*NB+SP 
SS://:SHIFT6_ARRAYCPU,PD,SP,OP, TP,HP,NR,RB,NB,NLJ\\J FI 
The SHIFT_CELL function is our actual shift cell. We call it passing the number of 
shift registers required, the number of bits per register, the power requirements, 
the desired area, and the oversize costs. This function generates several candidates 
by calling the SIZE function and returns the array which best matches the desired 
size. If there are candidates which fit within the desired area, the one with the 
closest match to the area is chosen. If no candidates fit, the amount of oversize 
in both x and y for each candidate is multiplied by the weights, and the candidate 
with the smallest resulting cost is used. 
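The selection rule just described can be sketched in Python as a pairwise comparison folded over the candidates. Sizes, limits, and weights are (x, y) tuples; the helper names `fits`, `oversize_cost`, `best`, and `choose` are hypothetical, and treating the weighted oversize as a vector length is an assumption about how the two axes are combined.

```python
import math


def fits(size, limit):
    # A candidate fits when it is smaller than the limit in both x and y.
    return size[0] < limit[0] and size[1] < limit[1]


def oversize_cost(size, limit, weight):
    # Weighted oversize, clamped at zero per axis, combined as a vector length.
    dx = max(0.0, (size[0] - limit[0]) * weight[0])
    dy = max(0.0, (size[1] - limit[1]) * weight[1])
    return math.hypot(dx, dy)


def best(a, b, limit, weight):
    """Return the better of two candidate sizes."""
    if fits(a, limit):
        # Among fitting candidates, prefer the one closest to the limit.
        if fits(b, limit) and math.dist(b, limit) < math.dist(a, limit):
            return b
        return a
    if fits(b, limit):
        return b
    # Neither fits: prefer the smaller weighted oversize.
    if oversize_cost(a, limit, weight) < oversize_cost(b, limit, weight):
        return a
    return b


def choose(candidates, limit, weight):
    chosen = candidates[0]
    for candidate in candidates[1:]:
        chosen = best(chosen, candidate, limit, weight)
    return chosen


print(choose([(100, 100), (180, 190), (250, 150)], (200, 200), (1, 1)))
```

Raising the x weight makes horizontal oversize more expensive, which can flip the choice between two candidates that both exceed the limit.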
DEFINE SHIFT_CELL(NB,TB:INT POWER:REAL SIZE,WEIGHT:POINT)=MRG:
BEGIN VAR I,J=INT;
   DEFINE BEST(A,B:SHIFT_RESULT)=SHIFT_RESULT:
      IF A.SIZE<SIZE THEN
         IF (B.SIZE<SIZE)&(DIST(B.SIZE,SIZE)<DIST(A.SIZE,SIZE))
         THEN B ELSE A FI
      EF B.SIZE<SIZE THEN B
      EF ABS(((A.SIZE-SIZE)\SCALED_BY WEIGHT) MAX 0#0) <
         ABS(((B.SIZE-SIZE)\SCALED_BY WEIGHT) MAX 0#0) THEN A
      ELSE B FI
   ENDDEFN
   <*(\BEST SIZE(NB,TB,POWER,I,J) FOR I FROM 1 TO 6; !!
                                  FOR J FROM 1 TO 21 BY 2;).SS*>
END
ENDDEFN
When the user has specific size requirements for the shift array, a direct call on the 
SHIFT_CELL function is used. Most of the time, however, the user can make 
tradeoffs of chip area between various units. In these cases, the user may wish to 
see the sizes of the various candidates. The GRAPH function will plot a graph of all 
candidates within a maximum size limit, while the TABLE function prints a table of 
this same data. Given this information, the user can see what the possible areas are 
for the arrays, which will aid in the planning of other circuit sizes. These 
functions take the number of shift registers, the number of bits per register, the 
power required, the maximum number of folds used (although the SHIFT_CELL as 
written always uses a maximum of 21), and the maximum candidate size which 
filters the output. 
DEFINE GRAPH(NB,TB:INT POWER:REAL N:INT MAX:POINT)=MRG:
BEGIN VAR M=MRG;CLASS,RB=INT;P,O=POINT;SPS=SPS;SP=SP; 
OD SPS:=ICOLLECT 
!COLLECT SIZE !NB, TB, POllER, CLASS, RB) • SIZE 
FOR RB FROf1 1 TO N BY 2; I 
FOR CLASS FROM 1 TO G;I; 
P:= MAX IF P<MAX THEN P ELSE 0#0 Fl FOR (Pl SE SPS;; 
SPS:=!COLLECT 
!COLLECT Q.X*500/P.X # Q.Y*500/P.Y FOR Q SE SP;I 
FOR SP SE SPS:I; 
GIVE ICOLLECT lJIRE!BLUE,0, !COLLECT 0 FOR 0 SE SP;WITH 0<500#500;1) 
FOR SP $E SPS;; 
END 
ENODEFN 
COLLECT !COLLECT M\AT Q FOR Q SE SP;WITH 0<500#500;1 
FOR SP IE SPS;&& FOR M IE 
! BOX{RED.-5.#-5.\TO 5µ5) ; 
POLYGON IRED, f-5.#-4.;5#.;0#41 J ; 
POLYGON !RED, !0#5;-3. #-4.: 5#2. 5; -5. #.; 3#-4. l J ; 
!L-JIRE IRE0,0, !5#5;-5.#-5. l J ;l.JIRE<RE0,0, !-5.#5;5#-5. l) J 
POLYGON <RED, 15#0; 0#5: -5. #0; 0#-5. J J ; 
POLYGON !RED, 12115: -2. fl..; -5. #2;. #-2.; -2. #-5.; 2#.; 5#-2.;. #21 > 
WIREIGREEN,0, 10#500;0#0;500#01}; 
SC{P.XJ\PAINTED BLACK\SCALED_BY 2#2\AT 500#10; 
SC<P.YJ\PAINTED BLACK\SCALED_BY 2#2\AT 10#472; 
'NB:'llSC<NB)\PAINTEO BLACK\SCALEO_BY 2#2\AT 500#130; 
'TB:'llSC<TBl\PAINTED BLACK\SCALEO_BY 2#2\AT 500#30; 
'PQl.JER: 'USC <PQl.JERl \PAINTED BLACK\SCALEO_BY 2#2\AT 500#50! 
DEFINE TABLE(NB,TB:INT POWER:REAL N:INT MAX:POINT):
BEGIN VAR CLASS,RB:JNT;P:POINT; 









Appendix 3: River Routers 
This appendix discusses the design of a river router and illustrates some of the 
extensions which augment the usefulness of river routers. River routers are used 
to interconnect the connectors along the adjacent edges of two cells. The following 
conditions must be met:
1) There must be a one-to-one mapping between connectors of
   the two cells.
2) Corresponding connectors must be on the same mask layer.
3) Each set of connectors must satisfy the design rules for
   minimum width wires.
4) Adjacent connector pairs on dependent mask layers must
   not cross.
The first condition simply states that the two sets of connectors be of the same 
length. We will connect the first connector of one list to the first connector of the 
other list; the second connectors will be interconnected, etc. The second condition 
assures us that we can route a single wire between the two connectors without 
changing mask layers. The third condition assures us that we can indeed route 
wires to all the connectors without violating the design rules. The fourth 
condition assures us that we do not have to cross wires. If wires had to cross, we 
would have to change layers, and we do not wish to change layers with our wires 
(see condition 2). Dependent layers are layers that produce undesirable side-effects 
when wires cross. For instance, in NMOS design, when diffusion and polysilicon 
cross, a transistor is formed. Hence, diffusion and polysilicon are dependent layers. 
On the other hand, the metal layer is independent of polysilicon and diffusion since 
metal wires may freely cross wires of these other layers. Notice that every layer is 
dependent with itself: If two wires of the same layer cross, they short together. 
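Three of these conditions are easy to check mechanically before routing. The following Python sketch tests conditions 1, 2, and 4 for two connector lists; condition 3 (design rule spacing) is left out. The function name, the layer names, and the class table are assumptions made for illustration, with polysilicon and diffusion sharing a class as the NMOS discussion above suggests.

```python
# Dependence classes for NMOS as described above: diffusion and polysilicon
# interact, metal does not. The numeric class ids are arbitrary.
LAYER_CLASS = {"diffusion": 0, "polysilicon": 0, "metal": 1}


def river_routable(top, bottom):
    """top and bottom are lists of (x, layer); entry i of one pairs with entry i of the other."""
    if len(top) != len(bottom):                           # condition 1
        return False
    if any(t[1] != b[1] for t, b in zip(top, bottom)):    # condition 2
        return False
    # Condition 4: within a dependence class, pairs keep the same left-to-right order.
    by_class = {}
    for (tx, layer), (bx, _) in zip(top, bottom):
        by_class.setdefault(LAYER_CLASS[layer], []).append((tx, bx))
    for pairs in by_class.values():
        pairs.sort()
        if any(p[1] >= q[1] for p, q in zip(pairs, pairs[1:])):
            return False
    return True


top = [(0, "metal"), (10, "polysilicon"), (20, "diffusion")]
bottom = [(30, "metal"), (5, "polysilicon"), (15, "diffusion")]
print(river_routable(top, bottom))
```

In the example the metal pair crosses both other pairs, which is acceptable because metal is in a different class.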
Based upon these conditions, there are a few properties of river routes which can be 
used. One of these properties has already been mentioned: the interconnection 
between two connectors will be a single wire on a single mask layer. A second 
property is that independent layers can be routed independently. We have noticed 
that, in NMOS, metal wires can arbitrarily cross polysilicon or diffusion wires. 
Therefore, we can route all of the metal wires as a group, and then route all of the 
polysilicon and diffusion wires as a group. This also allows connector pairs to cross, 
provided the connector pairs are on independent layers. 
We can also divide the routing task for each set of dependent layers into groups. 
We will define a group to be all adjacent connector-pairs on dependent layers 
which route in the same direction. Using figure A3-1 as an example, we see that 
the first three connector pairs have wires slanting to the left as we go from top to 
bottom. The next three connectors slant to the right, and the final three connectors 
slant to the left. We can divide the connector pairs into groups and route each 
group independently. This is possible because each wire drawn will only move 
horizontally in one direction, towards its destination. We can also route these 
independent groups as if they were dependent. This allows us to separate the 
connectors into two groups: those that tend to the left and those that tend to the 
right (any wires which need only be vertical can belong in either group). 
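The grouping can be illustrated with a short Python sketch. Connector pairs are reduced to (from_x, to_x) tuples listed left to right; `direction_runs` finds the adjacent runs that share a direction and `two_groups` merges them into the two final groups. Both names are hypothetical, and placing vertical wires in the left group is an arbitrary choice that the text permits.

```python
from itertools import groupby


def direction(pair):
    from_x, to_x = pair
    if to_x < from_x:
        return "left"
    if to_x > from_x:
        return "right"
    return "vertical"


def direction_runs(pairs):
    """Split connector pairs, ordered left to right, into runs of equal direction."""
    return [(d, list(run)) for d, run in groupby(pairs, key=direction)]


def two_groups(pairs):
    """Merge the runs into one left-going and one right-going group."""
    left = [p for p in pairs if direction(p) != "right"]   # vertical wires join the left group here
    right = [p for p in pairs if direction(p) == "right"]
    return left, right


# Three pairs slanting left, three slanting right, three slanting left.
pairs = [(10, 2), (14, 6), (18, 10), (22, 30), (26, 34), (30, 38),
         (50, 42), (54, 46), (58, 50)]
print([(d, len(run)) for d, run in direction_runs(pairs)])
```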
Fig. A3-1: Connector Pairs 
Another property we will use is that each wire depends only upon one other wire in 
the route: its adjacent neighbor in its direction of travel. If every wire maintains 
proper distance from its neighbor in its direction of travel, we will not have design 
rule violations between wires. We will use this property to determine the order of 
routing wires. In the left-going group, we will route the left most wire first, 
followed by the next-to-the-left most wire, etc. We will route the right-going 
wires starting with the right most wire. The first wire in each group will move 
directly over to its destination connector's x coordinate and wait. The second wire 
in each group now only needs to avoid this one wire as it heads toward its 
destination. In a like manner, each wire will only have to consider the previous 
wire as it is generating its path. 
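The routing order, and the single neighbor each wire has to respect, can be written down as a small Python sketch. Pairs are (from_x, to_x) tuples and `routing_plan` is a hypothetical name. The sketch always reports the previously routed wire of the group as the neighbor; whether that neighbor is actually close enough to matter is a separate check.

```python
def routing_plan(pairs):
    """Return [(wire, neighbour)] in routing order.

    neighbour is the wire routed just before in the same group, or None for
    the first wire of a group.
    """
    # Left-going (and vertical) wires: leftmost first.
    left = sorted((p for p in pairs if p[1] <= p[0]), key=lambda p: p[0])
    # Right-going wires: rightmost first.
    right = sorted((p for p in pairs if p[1] > p[0]), key=lambda p: p[0], reverse=True)
    plan = []
    for group in (left, right):
        previous = None
        for wire in group:
            plan.append((wire, previous))
            previous = wire
    return plan


for wire, neighbour in routing_plan([(10, 2), (14, 6), (22, 30), (26, 34)]):
    print(wire, "avoids", neighbour)
```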
Fig. A3-2: Computing New Path: a) Left Neighbor, b) Offset, c) New Path
The final property we will use is that the design rule spacing between wires is 
uniform in both directions. This allows us to compute the majority of a wire's path 
by simply shifting the points from the previous path. In figure A3-2a we see the 
path of one wire. If we shift the points of this path over in x and down in y, each 
time by the minimum design rule spacing for the two layers in question, we have 
the path of the wire which is as close to the given wire as possible. Given this new 
path, we need only fix the ends of this path to have the route for the next wire. 
We will remove any points of this path which lie beyond the destination and will 
append segments to the front of the wire which connect to the starting connector 
(fig. A3-2c). This efficiently generates each wire given the neighbor's wire. As we 
have already stated, the first wire is trivial to implement: we move from the 
initial connector over to the final connector's x coordinate, then down. We can 
now prove that each wire only draws in one direction. The first wire draws only 
in one direction, as shown in the previous statement. The central portion of every 
wire follows its neighbor's path until the destination coordinate is reached. Hence, 
once the central portion of the wire is reached, the wire only heads in the direction 
of its destination. To complete the proof, we must show that the start of the wire 
does not move in the opposite direction. The end of the shifted portion of the wire 
is at minimum design rule spacing from the neighbor's wire. For the initial 
segment of the wire to run in the opposite direction, the starting connector must be 
closer to the neighbor's wire than design rules allow. This is a violation of 
condition 3. Therefore, every wire draws in one direction, which completes the 
inductive proof. Given this, we can then prove that the extent of a wire is limited 
by the x coordinates of its two connectors. If the wire ever extended beyond one of 
the two connectors, it could never connect to the connector since it would have to 
change directions. Therefore, wire extents are limited, and we can separate the 
routes into groups. 
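The shift, trim, and extend steps can be sketched in Python for a left-going wire. This is an illustration under stated assumptions rather than the thesis code: points are (x, y) tuples with y decreasing downward, `top` is the y coordinate of the upper cell edge, the function name and parameters are hypothetical, and the final drop to the lower cell is not included.

```python
import math


def next_left_path(prev_path, start, dest_x, top, space):
    """Path for a left-going wire routed just to the right of prev_path."""
    # 1. Shift the neighbour's points right and down by the spacing, dropping
    #    any stub above the cell edge and anything beyond the destination.
    path = [(x + space, y - space) for x, y in prev_path
            if y <= top and x + space >= dest_x]
    # 2. Connect the starting connector to the front of the shifted path.
    lead = [start]
    if not math.isclose(start[1], top, abs_tol=1e-9):
        lead.append((start[0], top))       # stub from inside the cell to its edge
    if path and path[0][0] < start[0]:
        lead.append((path[0][0], top))     # run along the edge to the shifted path
    path = lead + path
    # 3. Finish at the destination's x coordinate.
    last_x, last_y = path[-1]
    if not math.isclose(last_x, dest_x, abs_tol=1e-9):
        path.append((dest_x, last_y))
    return path


neighbour = [(10, 0), (2, 0)]              # first wire: straight across at the cell edge
print(next_left_path(neighbour, start=(14, 0), dest_x=8, top=0, space=3))
```

With an empty neighbor path the same function produces the trivial first wire: from the connector to the cell edge, then across to the destination's x coordinate.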
The following code is the basic river router routine. We will discuss the Forbidden 
Zones later; for now, assume that they are identity functions. The RIVER_NODE 
contains the coordinates of the two connectors and the common color of the 
connectors. The river routing routine returns a RIVER_RETURN, which contains the 
layout and the height of the completed route. The river 
routing routine calls a routine to route the individual sets of dependent layers. This 
routine, GROUP_ROUTE, also returns a RIVER_RETURN, but it additionally uses the 
DONE component. The DONE component contains all of the river nodes at the end of 
the route. Since we cannot state how tall the river route will be until the route is 
completed, we do not know how long to make each of the final wire segments until 
we have finished the rest of the route. We put each of the nodes into the DONE 
component when we are finished jogging them, and we look at these nodes after all 
the wires are jogged to add the final segments to the wires. 
TYPE RIVER_NODE= [FROM,TO:POINT COLOR:COLOR];
     RIVER_NODES= { RIVER_NODE };
     RIVER_RETURN= [LAYOUT:MRG HEIGHT:REAL DONE:RIVER_NODES];
     RIVER_RETURNS= { RIVER_RETURN };
     FORBIDDEN_ZONE= //WIRE(SP,REAL,COLOR)\\;
DEFINE SORT(OLD:RIVER_NODES)=RIVER_NODES:
BEGIN VAR NEW=RIVER_NODES;N1,N2=RIVER_NODE;I,J=INT;
DO NEW:=NIL;
   WHILE DEFINED(OLD); DO
      N1:=OLD[1];
      I:=1;
      (FOR N2 $E OLD[2-];&& FOR J FROM 2 BY 1;)WITH N2.FROM.X>N1.FROM.X;







      NEW::= N1<$;
      OLD[I]:=NIL;
DEFINE GROUP_ROUTE(LIST:RIVER_NODES MIN,TOP,BOT:REAL
                   FZ1,FZ2:FORBIDDEN_ZONE)=RIVER_RETURN:




BEGIN VAR P=POINT; 
[PATH:LAST_PATH WIDTH:LOW]:=<*FZ1*>(LAST_PATH,LOW,C);
L2::= WIRE(C,(<*FZ2*>(LAST_PATH,LOW,C)).PATH) <$;
COUNT::=+1;
IF COUNT>40 THEN 














LAST_COLOR:=RED;
FOR N $E LIST; DO
   IF N.FROM.X<N.TO.X THEN RIGHT::= N <$;
   ELSE SPACE:=CENTER_TO_CENTER(N.COLOR,LAST_COLOR);
        LAST_COLOR:=N.COLOR;
        IF N.TO.X-SPACE>=LAST_NODE.FROM.X THEN
           LAST_PATH:={N.FROM;
                       IF N.FROM.Y\IS_CLOSE_TO TOP
                       THEN NIL ELSE .#TOP FI;
                       N.TO.X#.};
        ELSE
           LAST_PATH:={COLLECT P+SPACE#-SPACE
                       FOR P $E LAST_PATH;
                       WITH (P.Y=<TOP)&(P.X+SPACE>=N.TO.X);};
           IF LAST_PATH[1].X<N.FROM.X THEN
              LAST_PATH::={N.FROM;
                           IF N.FROM.Y\IS_CLOSE_TO TOP
                           THEN NIL ELSE .#TOP FI;
                           LAST_PATH[1].X#.}$$;
           ELSE LAST_PATH::= N.FROM<$; FI
           P:=REVERSE(LAST_PATH)[1];
           IF -(P.X\IS_CLOSE_TO N.TO.X)
           THEN LAST_PATH::= $> N.TO.X#P.Y; FI







        LAST_NODE:=N; FI END
LAST_NODE:=[FROM: 838393#383839 COLOR:RED];
FOR N $E RIGHT; DO
   SPACE:=CENTER_TO_CENTER(LAST_COLOR,N.COLOR);
   LAST_COLOR:=N.COLOR;
   IF N.TO.X+SPACE=<LAST_NODE.FROM.X THEN
      LAST_PATH:={N.FROM; IF N.FROM.Y\IS_CLOSE_TO TOP
                  THEN NIL ELSE .#TOP FI;N.TO.X#.};
   ELSE LAST_PATH:={COLLECT P-SPACE#SPACE FOR P $E LAST_PATH;
                    WITH (P.Y=<TOP)&(P.X-SPACE=<N.TO.X);};
        IF LAST_PATH[1].X>N.FROM.X THEN
           LAST_PATH::={N.FROM;
                        IF N.FROM.Y\IS_CLOSE_TO TOP
                        THEN NIL ELSE .#TOP FI;
                        LAST_PATH[1].X#.}$$;
        ELSE LAST_PATH::= N.FROM<$; FI
        P:=REVERSE(LAST_PATH)[1];
        IF -(P.X\IS_CLOSE_TO N.TO.X)
        THEN LAST_PATH::= $> N.TO.X#P.Y; FI







IF COUNT>0 THEN L1::= DISK(L2) <$; FI
GIVE [LAYOUT:DISK(L1) HEIGHT:LOW DONE:DONE]
END
ENDDEFN
DEFINE RIVER_ROUTE(LIST:RIVER_NODES MIN,TOP,BOT:REAL
                   FZ1,FZ2:FORBIDDEN_ZONE)=RIVER_RETURN:
BEGIN VAR N=RIVER_NODE;LISTS=RIVER_RETURNS;CLASS=INT;LOW=REAL;
          R=RIVER_RETURN;
DO LISTS:=NIL;
   WHILE DEFINED(LIST); DO
      CLASS:=CLASS(LIST[1].COLOR);
      LISTS::= GROUP_ROUTE({COLLECT N FOR N $E LIST;
                            WITH N.COLOR\CLASS=CLASS;},MIN,TOP,BOT,
                           FZ1,FZ2) <$;
      LIST:={COLLECT N FOR N $E LIST;WITH N.COLOR\CLASS<>CLASS;};
   END
   LOW:= MIN R.HEIGHT FOR R $E LISTS;;






FOR [OONE:!NlJ SE LISTS;}} 
The RIVER_ROUTE routine takes the list of connector pairs and routes between them. 
The route is assumed to be horizontal. To generate vertical routes, the connector 
positions can be rotated 270 degrees and the resulting layout rotated 90 degrees. 
The MIN parameter is used to specify a minimum width for the route. We cannot 
state a maximum width for the route, but we may wish to state a minimum width for 
the route. (For example, we may wish to run some horizontal metal wires over the 
route, so we would require the route to be tall enough to allow all of the metal 
wires to fit between the cells.) In some cases, the connectors do not lie on the 
perimeter of the cell, but rather lie inside the cell's boundary. To connect to the 
point, we either have to examine the entire set of geometry contained in the cell or 
we have to have conventions for connecting to the cell. We will use the 
convention that if a point lies within the cell boundary, we may draw a minimum 
width wire from the connector straight to the edge of the cell. The TOP and BOT 
parameters indicate the boundaries of the two cells. If a node's FROM point has a Y 
value greater than TOP, a wire is drawn from the point straight down to TOP, before 
the river route begins. Similarly, if the TO point has a Y value less than BOT, a wire 
is drawn. The FZ1 parameter is used to jog the wire, and the FZ2 parameter is used 
to translate the wire. These operations are discussed later. 
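The convention for connectors that lie inside a cell can be sketched in Python as follows. The function name and the returned tuple are hypothetical; points are (x, y) tuples with y increasing upward, so the upper cell lies above `top` and the lower cell below `bot`.

```python
def connector_stubs(frm, to, top, bot):
    """Return (stub_above, stub_below, route_start, route_end).

    Each stub is a two-point vertical wire, or None when the connector does
    not lie inside its cell. The river route itself runs between route_start
    and route_end.
    """
    stub_above = stub_below = None
    start, end = frm, to
    if frm[1] > top:                 # FROM connector lies inside the upper cell
        start = (frm[0], top)
        stub_above = [frm, start]
    if to[1] < bot:                  # TO connector lies inside the lower cell
        end = (to[0], bot)
        stub_below = [end, to]
    return stub_above, stub_below, start, end


print(connector_stubs(frm=(14, 2), to=(8, -20), top=0, bot=-18))
```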
The RIVER_ROUTE routine takes all of the connectors and separates them into 
groups, based upon the color of the connectors. The CLASS routine in ICLIC is used 
to determine the dependence of the layers. Dependent layers have the same class. 
RIVER_ROUTE calls GROUP_ROUTE with all of the connectors in each class. Once 
GROUP_ROUTE has been called for each group, RIVER_ROUTE determines the height 
of the route, extends all of the wires, and returns the layout. 
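The separation by class follows a simple loop: take the class of the first remaining connector, pull out everything in that class, and repeat. A Python sketch of that loop is shown below. The class table and layer names are assumptions for illustration, and `route_height` assumes that each group reports its height as a y coordinate that becomes more negative as the route grows taller.

```python
# Hypothetical class table: dependent layers share a class id.
LAYER_CLASS = {"diffusion": 0, "polysilicon": 0, "metal": 1}


def split_by_class(nodes):
    """nodes are (from_x, to_x, layer); returns one list per dependence class."""
    remaining = list(nodes)
    groups = []
    while remaining:
        cls = LAYER_CLASS[remaining[0][2]]
        groups.append([n for n in remaining if LAYER_CLASS[n[2]] == cls])
        remaining = [n for n in remaining if LAYER_CLASS[n[2]] != cls]
    return groups


def route_height(group_heights):
    # The overall route must reach as far down as the deepest group.
    return min(group_heights)


nodes = [(0, 5, "metal"), (10, 4, "polysilicon"), (20, 30, "metal"), (25, 12, "diffusion")]
print(split_by_class(nodes))
```

Each group would then be routed on its own, and every wire extended to the common height.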
GROUP_ROUTE routes all of the wires which slope to the left first, then it routes all 
of the wires which slope to the right. For each wire, it determines the design rule 
spacing between this wire and the previous wire. It then checks to see if the 
previous wire is outside the range of the current wire, in which case it can 
immediately draw the current wire connecting directly to its desired location. If 
the previous wire was in range, all of the points in the previous wire are diagonally 
shifted by the design rule spacing, and the two ends of the wire are adjusted to fit 
the TO and FROM points of the current wire. Given the current wire, the ADD 
routine is called. The ADD routine passes the wire to the first FORBIDDEN_ZONE, 
which may jog the wire. The result of the jogs becomes the official path of the 
wire, which the neighboring wires must avoid. This is also passed to the second 
FORBIDDEN_ZONE, which may arbitrarily map the wire from the river route 
coordinate system to the chip coordinate system. For standard river routes, these 
two FORBIDDEN_ZONEs are identity functions. The following code facilitates 
calling standard river routes. 
DEFINE WIRE=FORBIDDEN_ZONE:
   //(SP:SP R:REAL C:COLOR) [PATH:SP]\\
ENDDEFN
DEFINE IDENTITY=FORBIDDEN_ZONE:
   //(SP:SP R:REAL C:COLOR) [WIDTH:R PATH:SP]\\
ENDDEFN
DEFINE RIVER_ROUTE(LIST:RIVER_NODES MIN,TOP,BOT:REAL)=RIVER_RETURN:
   RIVER_ROUTE(LIST,MIN,TOP,BOT,IDENTITY,WIRE)
ENDDEFN
This new RIVER_ROUTE routine does not require the two FORBIDDEN_ZONEs, but 
uses the two default routines. 
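The two-stage handling of each wire, with pass-through defaults, can be sketched in Python. The names `identity_zone`, `wire_zone`, `add_wire`, and `shift_to_chip` are hypothetical; the sketch only shows that the first zone's result becomes the path later wires must avoid, while the second zone affects only what is drawn.

```python
def identity_zone(path, low, color):
    # Default first zone: no jogging, spacing unchanged.
    return path, low


def wire_zone(path, low, color):
    # Default second zone: route and chip coordinate systems coincide.
    return path


def add_wire(path, low, color, fz1=identity_zone, fz2=wire_zone):
    """Return (official_path, low, drawn_path) for one wire."""
    official_path, low = fz1(path, low, color)    # may jog the wire
    drawn_path = fz2(official_path, low, color)   # may map to chip coordinates
    return official_path, low, drawn_path


def shift_to_chip(dx, dy):
    # Example second zone: translate the route into chip coordinates.
    def zone(path, low, color):
        return [(x + dx, y + dy) for x, y in path]
    return zone


path = [(14, 0), (8, 0)]
print(add_wire(path, -5.0, "RED"))
print(add_wire(path, -5.0, "RED", fz2=shift_to_chip(100, 200)))
```

Keeping the official path separate from the drawn path means a coordinate mapping never disturbs the spacing computed for neighboring wires.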
The first FORBIDDEN_ZONE is used to jog the wires. Due to global concerns, there 
may be obstacles to the river route. The FORBIDDEN_ZONEs allow the user to specify 
a routine which will modify the path of a wire in the river router. When the river 
router wants to route a wire through one of these obstacles, the user's routine may 
deflect the path of the wire. In figure A3-3a we see a wire which runs through an 
obstacle. The wire's path may be deflected to lie outside the obstacle (fig. A3-3b), 
and the river router will route all future wires to the new path (fig. A3-3c). 
Fig. A3-3: Jogging the Path of a Wire: a) Wire inside obstacle, b) Wire 'pushed' out, c) Route continues
We will define obstacles to be a collection of colored points. For an Upper Left 
obstacle we state that if a wire path begins to the right of the point, the path may 
not contain any points above and to the left of the obstacle. Figure A3-3 illustrated 
an Upper Left obstacle. Similarly, we may have Upper Right obstacles. These two 
sets of obstacles can be used to describe features of the upper cell which must be 
avoided in the river route. We would also like to avoid features of the lower cell. 
We cannot, however, just 'push' the wires outside of the obstacle points, as we did 
for the upper obstacles. If we did push the wires, they would run into neighboring 
wires. Instead, we push the lower cell down so that the wire path lies outside the 
obstacle, as shown in figure A3-4. 
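The Upper Left rule can be stated as a small predicate. The Python sketch below checks only path vertices, which is sufficient when every segment is horizontal or vertical; the function name is hypothetical, y is taken to increase upward, and the strictness of the comparisons is an assumption.

```python
def violates_upper_left(path, obstacle):
    """True if a path that begins to the right of the obstacle point has a
    vertex above and to the left of it."""
    ox, oy = obstacle
    if path[0][0] <= ox:          # the rule applies only to paths starting right of the point
        return False
    return any(x < ox and y > oy for x, y in path)


obstacle = (10, -2)
straight = [(14, 0), (8, 0), (8, -10)]             # cuts through the forbidden corner
jogged = [(14, 0), (14, -2), (8, -2), (8, -10)]    # pushed down to the obstacle's level
print(violates_upper_left(straight, obstacle), violates_upper_left(jogged, obstacle))
```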
Fig. A3-4: Moving Lower Cell: a) Obstacle crossing wire, b) Obstacle moved down
The following COLOR_LIMIT datatypes are used to describe the obstacles, and the 
LIMIT function will move a path (SP) to remain outside the obstacles. 
TYPE COLOR_LIMIT= [COLOR:COLOR LIMITS:SP];
     COLOR_LIMITS= { COLOR_LIMIT };
DEFINE LIMIT(SP:SP LOW:REAL COLOR:COLOR UL,UR,LL,LR:COLOR_LIMITS)=WIRE:
BEGIN VAR CL=COLOR_LIMIT;P,Q=POINT;X1,X2=REAL;W=WIRE;
DO X1:=SP[1].X;
   X2:=REVERSE(SP)[1].X;
   IF X1<X2 THEN
      IF THERE_IS CL.COLOR=COLOR FOR CL $E UR;
      THEN SP:=RCLIP(SP,CL.LIMITS,X1,X2); FI
      IF THERE_IS CL.COLOR=COLOR FOR CL $E LL;
      THEN LOW::= MIN LMOVE(SP,CL.LIMITS,X1,X2); FI
   ELSE IF THERE_IS CL.COLOR=COLOR FOR CL $E UL;
        THEN SP:=LCLIP(SP,CL.LIMITS,X2,X1); FI
        IF THERE_IS CL.COLOR=COLOR FOR CL $E LR;
        THEN LOW::= MIN RMOVE(SP,CL.LIMITS,X2,X1); FI FI




The LIMIT function takes the current path (SP) and computes a new path 
(result.PATH). Since this routine may need to push the lower cell down, we must 
also return the new separation of the cells. The LOW input parameter is the 
previous spacing. We return the new spacing in the WIDTH component of the 
result. The LIMIT function also requires the wire's color, and the list of obstacles. 
The routine determines whether the line slopes to the left or right, and calls the 
appropriate CLIP and MOVE routines. The CLIP routines are used for the upper 
limits to jog the wires, while the MOVE routines are used for the lower limits to 
move the lower cell. The MOVE and CLIP routines are listed here. 
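As a rough illustration of the dispatch just described, the following Python sketch mirrors the structure of LIMIT. It is not the ICL code of this appendix: the `clip` and `move` callables, the dictionary layout of the obstacle lists, and the tuple return value are all assumptions made for the example.

```python
def limit(path, low, color, obstacles, clip, move):
    """Sketch of the LIMIT dispatch (assumed data layout, not the ICL original).

    path      : list of (x, y) points, first point = start of the wire
    low       : previous separation of the cells
    obstacles : dict with keys 'UL', 'UR', 'LL', 'LR'; each value maps a
                color to the list of obstacle corner points for that color
    clip/move : hypothetical stand-ins for the CLIP and MOVE routines
    """
    x1, x2 = path[0][0], path[-1][0]
    if x1 < x2:
        # Path starts at the left: upper-right and lower-left obstacles apply.
        upper, lower, lo_x, hi_x = "UR", "LL", x1, x2
    else:
        # Path starts at the right: upper-left and lower-right obstacles apply.
        upper, lower, lo_x, hi_x = "UL", "LR", x2, x1

    corners = obstacles[upper].get(color)
    if corners is not None:
        path = clip(upper, path, corners, lo_x, hi_x)   # jog the wire

    corners = obstacles[lower].get(color)
    if corners is not None:
        low = min(low, move(lower, path, corners, lo_x, hi_x))  # lower the cell

    return low, path
```

The pair returned here plays the role of the WIDTH and PATH components of the result described above.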
DEFINE LMOVE(PATH,CORNERS:SP LX,HX:REAL)=REAL:
BEGIN VAR MIN=REAL;P,Q=POINT;
DO MIN:=999999;





IF THERE_IS Q.X>=P.X FOR Q $E PATH; THEN MIN::= MIN Q.Y-P.Y; FI
DEFINE RMOVE(PATH,CORNERS:SP LX,HX:REAL)=REAL:
BEGIN VAR MIN=REAL;P,Q=POINT;
DO MIN:=999999;





IF THERE_IS Q.X=<P.X FOR Q $E PATH; THEN MIN::= MIN Q.Y-P.Y; FI
DEFINE LCLIP(PATH,CORNERS:SP LX,HX:REAL)=SP:
BEGIN VAR Y=REAL;P,Q=POINT;NEW=SP;FLAG=BOOL;
DO Y:=PATH[1].Y;
FOR P $E CORNERS;WITH (P.X>LX)&(P.X=<HX)&(P.Y<Y); DO
FOR Q $E PATH;
FIRST_DO NEW:={Q};
FLAG:=Q.X=<P.X;;
OTHER_DO IF Q.X>P.X THEN NEW::= Q <$;
EF Q.Y<P.Y THEN
IF FLAG THEN NEW::= Q.X#P.Y <$; FI
FLAG:=FALSE;
NEW::= Q <$;
EF -FLAG THEN NEW::= {P;P.X#Q.Y} $$;FLAG:=TRUE; FI;
FINALLY_DO IF FLAG THEN NEW::= LX#P.Y <$; FI;







DEFINE RCLIP(PATH,CORNERS:SP LX,HX:REAL)=SP:
BEGIN VAR Y=REAL;P,Q=POINT;NEW=SP;FLAG=BOOL;
DO Y:=PATH[1].Y;
FOR P $E CORNERS;WITH (P.X>=LX)&(P.X<HX)&(P.Y<Y); DO
FOR Q $E PATH;
FIRST_DO NEW:={Q};
FLAG:=Q.X>=P.X;;
OTHER_DO IF Q.X<P.X THEN NEW::= Q <$;
EF Q.Y<P.Y THEN
IF FLAG THEN NEW::= Q.X#P.Y <$; FI
FLAG:=FALSE;
NEW::= Q <$;
EF -FLAG THEN NEW::= {P;P.X#Q.Y} $$;FLAG:=TRUE; FI;
FINALLY_DO IF FLAG THEN NEW::= HX#P.Y <$; FI;






The MOVE routines look through the list of obstacles (CORNERS) for points which 
lie within the limits of the wire (PATH). For each obstacle point within the wire's 
limits, the routine computes the offset required to move the lower cell. The largest 
offset is returned by the routine. The CLIP routines take each obstacle which lies 
within the span of the wire, and move all wire points which lie inside the 
obstacle. 
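The MOVE computation can be sketched in Python as follows. Only one line of the LMOVE body survives in the listing above (the comparison of Q.X with P.X and the running MIN of Q.Y-P.Y), so the range test on `lx` and `hx` and the overall loop structure are assumptions, and `lmove` is a hypothetical name for the illustration.

```python
def lmove(path, corners, lx, hx):
    """Sketch of a MOVE routine (assumed structure, not the ICL original).

    For every obstacle corner whose x lies within the wire's span [lx, hx],
    find the first path point at or to the right of the corner and record
    the vertical offset between that path point and the corner.  The
    smallest (most negative) offset is returned, starting from the same
    large sentinel value used in the listing.
    """
    best = 999999.0
    for px, py in corners:
        if not (lx <= px <= hx):          # assumed span test
            continue
        q = next(((qx, qy) for qx, qy in path if qx >= px), None)
        if q is not None:
            best = min(best, q[1] - py)
    return best
```

A caller would then combine the result with the previous spacing, as LIMIT does with `LOW::= MIN LMOVE(...)`.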
Fig. A3-5: River Route Comparison. a) With LIMIT Function; b) Without LIMIT Function
To use this LIMIT routine in the river router, we need only compute the obstacles 
and pass this routine as the first FORBIDDEN_ZONE. In figure A3-5, we show a river 
route that uses the LIMIT routine and one that does not. The route that uses 
LIMIT can route some of the wires inside the cell's boundary, while the route that 
does not use LIMIT must remain outside of the cell's boundary. In many cases, the 
program can compute these obstacles, so that more efficient routes can be used. 
Fig. A3-6: River Routing to Pads 
Another interesting use of the river router is to route wires to pads. In figure A3-6, 
we show a cell surrounded by pads. Between the cell and the pads, we need to route 
wires. A river route could be used, except for one thing: a river route is a single 
channel, whereas the pad route routes around a box. 
Fig. A3-7: Unfolding the Box 
We can still use the river router, if we can convert the box route into a channel 
route, perform the river route, then convert the result back into a box route. In 
figure A3-7, we show the mapping from a box route to the linear route and back. 
We cut the box into four trapezoids and unfold the box into a single strip. The 
shaded portions of the strip are cut out of the river route when the trapezoids are 
folded back into a box. Because the shaded portions are removed, we can not have 
any wires jogging inside the shaded regions. For this reason, the suspendable 
functions are called FORBIDDEN_ZONEs: it is forbidden for the wires to jog inside the 
shaded regions. We will write a procedure, TRAPEZOID, which will constrain wires 
to jog outside of these forbidden zones. Figure A3-8 shows two cases of wires 
which jogged inside these shaded areas and were pushed outside of the region. In 
the following code, we describe TRAPEZOIDs as a left point and a right point, along 
with a left slope (SLEFT) and a right slope (SRIGHT). The TRAPEZOID function takes 
a series of these trapezoids and assures that each wire lies outside of the trapezoid. 
Notice that here we have reversed the polarity of the trapezoids. These trapezoids 
are the shaded regions; no corners may exist within the trapezoid. 
Fig. A3-8: Constraining Jogs. a) Wires jog in Forbidden Zone; b) Jogs external to Forbidden Zone
TYPE TRAPEZOID= [LEFT,RIGHT,SLEFT,SRIGHT:POINT CENTER:REAL];




DEFINE TRAPEZOID(SP:SP LOW:REAL COLOR:COLOR TS:TRAPEZOIDS)=WIRE:
BEGIN VAR T=TRAPEZOID;P,Q=POINT;X1,X2,R=REAL;NEW=SP;FLAG=BOOL;
DO X1:=SP[1].X;
X2:=REVERSE(SP)[1].X;
X1#X2:= (X1 MIN X2) # (X1 MAX X2);
FOR T $E TS;WITH (T.CENTER>X1)&(T.CENTER<X2); DO
NEW:={SP[1]};
FLAG:=FALSE;
FOR {P;Q} $C SP; DO
IF Q\INSIDE T THEN
IF -FLAG THEN
FLAG:=TRUE;
WHILE Q\INSIDE T; DO
R:= IF Q.X<P.X THEN P\RIGHT_EDGE T
ELSE P\LEFT_EDGE T FI;
NEW::= R#Q. Yd; 
P: =R#P. Y-<~'(TRAPEZOID_JOG~·r> {COLOR>; 





LOW::= MIN Q.Y;
END FI




GIVE [WIDTH:LOW PATH:SP]
END
ENDDEFN
DEFINE INSIDE(P:POINT T:TRAPEZOID)=BOOL:
IF P.X=< P\LEFT_EDGE T THEN FALSE ELSE P.X< P\RIGHT_EDGE T FI
ENDDEFN
DEFINE RIGHT_EDGE(P:POINT T:TRAPEZOID)=REAL:
T.RIGHT.X-T.SRIGHT.X*(T.RIGHT.Y-P.Y)/T.SRIGHT.Y+TRAPEZOID_EDGE
ENDDEFN 
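The edge test in the listing above can be restated in Python as a small sketch. RIGHT_EDGE follows the formula shown; the LEFT_EDGE listing did not survive, so `left_edge` below is an assumed mirror image of it, and the `margin` parameter is a stand-in for the global TRAPEZOID_EDGE.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]

@dataclass
class Trapezoid:
    left: Point     # corner at the left end of the trapezoid's base
    right: Point    # corner at the right end of the base
    sleft: Point    # direction (slope vector) of the left edge
    sright: Point   # direction (slope vector) of the right edge

def right_edge(p: Point, t: Trapezoid, margin: float = 0.0) -> float:
    """x position of the right edge at the height of p (formula from the listing)."""
    return t.right[0] - t.sright[0] * (t.right[1] - p[1]) / t.sright[1] + margin

def left_edge(p: Point, t: Trapezoid, margin: float = 0.0) -> float:
    """Assumed mirror image of right_edge; the original listing is missing."""
    return t.left[0] - t.sleft[0] * (t.left[1] - p[1]) / t.sleft[1] - margin

def inside(p: Point, t: Trapezoid, margin: float = 0.0) -> bool:
    """True when p lies strictly between the two edges (a forbidden jog position)."""
    return left_edge(p, t, margin) < p[0] < right_edge(p, t, margin)
```

With a base from (0, 0) to (10, 0) and edges that spread outward going down, a point at height -2 is forbidden between x = -2 and x = 12, which is the region a jog would be pushed out of.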






We use the second FORBIDDEN_ZONE in the river router to map the wire from the 
river route coordinate system to the chip coordinate system. We can use this 
function to map from the linear strip into the box. In the following section of code, 
we have a datatype REGION which describes one of the four regions of the route. 
For each region, we have the trapezoid in the linear space which corresponds to one 
section of the box. Additionally, we have transformations from the chip 
coordinates to the linear coordinates and back. If we transform the connectors' 
locations by the MAP_TO matrix, the new locations correspond to locations along the 
strip. When we transform each point in the wire paths by the MAP_FROM matrix, 
the resulting path has the correct coordinates for the chip coordinate system. We 
may need to add points to the wires when mapping them back to the chip coordinate 
system. In figure A3-9a, we show a route in the river route coordinate system. The 
wire travels from one trapezoid to another, which is valid since the wire does not 
jog within the shaded area. If we just transformed the four points in the wire, we 
would get the layout shown in figure A3-9b, which has one wire cutting across 
our cell. We need to add a point on the edge between the two trapezoids when we 
do the mapping, resulting in the layout shown in figure A3-9c. 
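To make the two transformations concrete, here is a Python sketch of a MAP_TO / MAP_FROM pair for one side of the box. It assumes, as the REGION listing suggests, that each side is rotated flat and shifted to x = GROUP * 100000 on the strip; `make_maps` and `GROUP_SPACING` are names invented for the illustration.

```python
import math

GROUP_SPACING = 100000.0   # strip offset per region, as in the listing

def make_maps(ul, ur, group):
    """Return (map_to, map_from) for the box side running from ul to ur.

    map_to   : chip coordinates  -> strip (river route) coordinates
    map_from : strip coordinates -> chip coordinates
    """
    angle = math.atan2(ur[1] - ul[1], ur[0] - ul[0])
    offset_x = group * GROUP_SPACING

    def rotate(p, a):
        c, s = math.cos(a), math.sin(a)
        return (p[0] * c - p[1] * s, p[0] * s + p[1] * c)

    def map_to(p):
        # Move ul to the origin, rotate the side flat, shift to the group's slot.
        x, y = rotate((p[0] - ul[0], p[1] - ul[1]), -angle)
        return (x + offset_x, y)

    def map_from(q):
        # Exact inverse: undo the shift, rotate back, move the origin to ul.
        x, y = rotate((q[0] - offset_x, q[1]), angle)
        return (x + ul[0], y + ul[1])

    return map_to, map_from
```

For a side running straight up from (0, 0), a connector at (0, 4) lands at x = 200004 on the strip when `group` is 2, and `map_from` returns it to (0, 4).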
Fig. A3-9: Mapping Wires into Box. b) Erroneous Transformation; c) Extra point added
The REGION function takes two corner points and two slopes and computes the 
corresponding region. The REGIONS function takes the wire in the river route's 
coordinates and computes the path in the chip's coordinates, adding the points 
where needed. 
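The point-adding step can be sketched in Python. This is an illustration under assumptions rather than a transcription: regions are held in a zero-indexed list, `Region` carries only the fields used here, and the output is built in forward order.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float]
GROUP_SPACING = 100000.0   # strip offset per region, as in the listing

@dataclass
class Region:
    map_from: Callable[[Point], Point]  # strip coordinates -> chip coordinates
    corner: Point                       # chip-space corner shared with the next region
    slope: Point                        # direction of the seam leaving that corner

def map_wire(path: List[Point], regions: List[Region]) -> List[Point]:
    """Map a strip-space wire into chip space, adding a point on every seam crossed."""
    def group(p: Point) -> int:
        return round(p[0] / GROUP_SPACING)

    i = group(path[0])
    out = [regions[i].map_from(path[0])]
    for p in path[1:]:
        j = group(p)
        # Seams crossed between region i and region j, in travel order.
        seams = range(i, j) if i < j else range(i - 1, j - 1, -1)
        for k in seams:
            r = regions[k]
            out.append((r.corner[0] - r.slope[0] * p[1],
                        r.corner[1] - r.slope[1] * p[1]))
        out.append(regions[j].map_from(p))
        i = j
    return out
```

When consecutive points stay in one region no seam point is added, so a wire confined to a single trapezoid is simply transformed point by point.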
TYPE REGION= [INSIDE:TRAPEZOID MAP_TO,MAP_FROM:MATRIX
CORNER,SLOPE:POINT MINX,MAXX:REAL];
REGIONS= { REGION };
DEFINE REGION(UL,UR,LL,LR:POINT GROUP:INT)=REGION:
BEGIN VAR A=REAL;
DO A:=(UR-UL)\ANGLE;
GROUP::=*100000;
GIVE [INSIDE:[LEFT:GROUP#0 RIGHT:GROUP#0+((UR-UL)\ROTATED_BY -A)
SLEFT:LL\ROTATED_BY -A SRIGHT:LR\ROTATED_BY -A]
MAP_TO:DISPLACEMENT(GROUP#0)\ROTATED_BY -A\AT -UL
MAP_FROM:DISPLACEMENT(UL)\ROTATED_BY A\AT -(GROUP#0)




END
ENDDEFN
DEFINE REGIONS(SP:SP BOT:REAL COLOR:COLOR RGS:REGIONS)=WIRE:
BEGIN VAR NEW=SP;P=POINT;I,J,K=INT;
DO I:=FIXR(SP[1].X/100000);
NEW:={SP[1]\AT RGS[J].MAP_FROM};
FOR P $E SP[2-]; DO
J:=FIXR(P.X/100000);
IF I<J THEN
DO NEW::= RGS[K].CORNER-RGS[K].SLOPE*P.Y <$;
FOR K FROM I TO J-1;
EF J<I THEN
DO NEW::= RGS[K].CORNER-RGS[K].SLOPE*P.Y <$;
FOR K FROM I-1 TO J; FI
NEW::= P\AT RGS[J].MAP_FROM <$;
I:=J;
END




There are a few other concerns before we have completed the box router. First, 
consider figure A3-10. We have a wire that starts on the NORTH and ends on the 
WEST. In the river route space, this wire extends from the far right to the far left, 
shorting out every other wire in the route. To solve this, we may move the WEST 
trapezoid to be to the right of the NORTH trapezoid, but then we would have the 
same problem with WEST/SOUTH wires. Instead, we may have a second WEST 
region, W', which is to the right of the NORTH region. We have two WEST regions 
now. NORTH/WEST wires use W', while WEST/SOUTH wires use the original 
WEST region. WEST/WEST wires can use either region. 
Fig. A3-10: Erroneous Wire Wrap-Around 
Unfortunately, this causes another problem. We now have two independent WEST 
regions in the river route space, but there is only one WEST region in the chip 
space. In figure A3-11, we show two wires, one a SOUTH/WEST wire, the other a 
WEST/NORTH wire. Since these are in the independent regions of the river route, 
they independently route, which causes trouble in the chip space. What we need to 
do is to make the two WEST regions independent. We have noticed above that wires 
can be routed independently if they run in opposite directions. The two wires in 
figure A3-11 run in the same direction, so they are not independent. We will make 
a new SOUTH region, S', to the right of W', and move the wire AB into the W'/S' 
regions. We continue this process until the left-most wire in the river route runs 
in the opposite direction of the right-most wire. (We can also stop the circulation 
of wires when the wire spans do not overlap.) We must check for the condition 
that all wires run in the same direction, and signal an error if this occurs. 
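The circulation described here can be sketched in Python. The sketch reduces each wire to its pair of strip x coordinates and assumes the whole strip repeats with a fixed `period`; the function name, the `period` parameter, and the exception type are choices made for the example. The cap of 1000 iterations mirrors the bound used in the ROTO_ROUTE listing.

```python
def circulate(wires, period, max_steps=1000):
    """Rotate wires until the left-most and right-most run in opposite directions.

    wires  : list of (from_x, to_x) pairs, ordered left to right on the strip
    period : length of one full trip around the box in strip coordinates
    """
    wires = list(wires)
    for _ in range(max_steps):
        first, last = wires[0], wires[-1]
        first_runs_right = first[0] < first[1]
        last_runs_right = last[0] < last[1]
        if first_runs_right != last_runs_right:
            return wires                      # the two ends are now independent
        # Move the left-most wire one full period to the right (into W', S', ...).
        wires = wires[1:] + [(first[0] + period, first[1] + period)]
    raise ValueError("circular route: all wires run in the same direction")
```

If every wire runs the same way, no rotation can ever succeed, and the loop bound turns that condition into the error the text calls for.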
Fig. A3-11: Non-Independent Wires 
Another potential problem occurs near the edges of the trapezoids. Given two 
neighboring trapezoids, the adjacent edges in the river route coordinates represent 
the same line in the chip coordinates. Wires jogging close to these lines may short 
together in the chip space while quite far apart in the river space, as shown in 
figure A3-12. To combat this problem, we just bloat the trapezoids by half the 
maximum design rule spacing. This assures that wires remain far enough apart. 
Fig. A3-12: Boundary Interference 
The remaining code describes the connection points for box routes, which need to 
know the side on which the connector resides. Also, the routines for implementing 
the route are listed. The NORMALS routine is used for generating the trapezoids 
given the outline of the cell. The OUTSIDE routine is used to invert the polarity of 
the trapezoids. The first ROTO_ROUTE function is used to reorder the pads to shorten 
the wire lengths. The final ROTO_ROUTE routine is the river router which routes 
around the outside of the cell. Figure A3-13 shows a river route around a 
rectangular cell, while figure A3-14 shows a river route around a hexagonal cell. 
TYPE CONNECT2= [FROM,TO:POINT COLOR:COLOR FEDGE,TEDGE:INT];





BEGIN VAR NORMALS=SP;P,Q=POINT;
DO NORMALS:= {COLLECT (Q-P)\NORMAL FOR {P;Q} $C SP;};
NORMALS:=REVERSE(NORMALS);
NORMALS[2-]:=REVERSE(NORMALS[2-]);










FOR IP;OI SC RGS;I 
ENO 
ENDDEFN 
DEFINE ROTO_ROUTE(RNS:RIVER_NODES J:INT)=RIVER_NODES:
BEGIN VAR RN=RIVER_NODE;TO=SP;CGF,CGT=REAL;P=POINT;
DO TO:={COLLECT RN.TO FOR RN $E RNS;};
CGF:=+ RN.FROM.X FOR RN $E RNS;;
CGT:=+ RN.TO.X FOR RN $E RNS;;
\JHILE ABS CCGF-CGTl >. ss~·d; 00 
END 
IF CGT>CGF THEN 




ELSE TO:=T0[2-JI> TO[lJ.X+J#TO[lJ.Y; 
CGT::=+J; FI 




FOR RN SE RNS;&& FOR P SE TO;} 
DEFINE ROTO_ROUTECCS:CONNECT2S MIN,TOP,BOT:REAL OUTLINE:SP ROTO:BOOLl= 
RI VER_RETURN: 





FOR IPp·:Ol SC OUTLINEUOUTLINE;&& 
FOR IR;*SI SC NORMALSSSNORMALS;&& 
FOR I FROM 1 BY l;l; 
J:=+l FOR P SE OUTLINE:: 
RNS:=ICOLLECT 
[FROM: C. FRDr1\A T 
RGS[C.FEDGE+ IF C.TEDGE<l THEN J ELSE 0 FIJ.MAP_TO. 
TO:C.TD\AT RGSCC.TEDGE+ IF C.TEDGE<l THEN J ELSE 0 FIJ .f1AP_TO 
COLOR:C.COLORJ 
FOR CSE CS;l; 
WHILE RNS[1].FROM.X>RNS[2].FROM.X; DO RNS:=RNS[2-] $> RNS[1]; END
J::=*100000;
IF ROTO THEN RNS::=\ROTO_ROUTE J; FI
RN:=REVERSE(RNS)[1];
WHILE (RN.FROM.X<RN.TO.X)=(RNS[1].FROM.X<RNS[1].TO.X);&&
FOR I FROM 1 TO 1000; DO
RN:=RNS[1];
RN.FROM.X::=+J;
RN.TO.X::=+J;
RNS:=RNS[2-] $> RN;
END
IF I>=1000 THEN WRITE('ROTO_ROUTE: CIRCULAR');CRLF;HELP; FI





Fig. A3-13: Box Route 
Fig. A3-14: Hexagon Route 
Appendix 4: The RLC Compiler 
This appendix contains the complete code listings for the Random Logic Compiler 
described in Chapter 5. In some cases, Chapter 5 used approximations for the data 
structures and routines, so there may be a few differences between the code implied 
by Chapter 5 and the code listed here. 
The PHYSICAL_WIRE datatypes are defined as shown in Chapter 5. In addition, we 
declare a type GATE_PRODUCER which is a suspendable function. The input and 
output parameters for this function match the requirements of the NAND, NOR, and 
INVERT functions. We will use instances of this datatype to refer to virtual 
routines for generating the gate layouts. The user at any time may reassign new 
routines to these variables, which will modify the layout produced. 
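The idea of holding gate generators in reassignable variables can be illustrated with a short Python sketch. It is only an analogy to the suspendable functions used here: the `current` dictionary, `place_nand`, and the two stub producers are hypothetical, and the stubs return labelled tuples instead of layout.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class PhysicalWire:
    height: float
    left: float
    right: float
    name: str

# A gate producer takes the input wires, the output wire and an x position,
# and returns some layout (here just a list standing in for an MRG).
GateProducer = Callable[[List[PhysicalWire], PhysicalWire, float], list]

def nmos_nand(inputs, output, x):
    return [("nmos_nand", len(inputs), x)]       # stub layout

def logical_nand(inputs, output, x):
    return [("logical_nand", len(inputs), x)]    # stub layout

# The "virtual routine": callers go through this table, never a fixed function.
current = {"NAND": nmos_nand}

def place_nand(inputs, output, x):
    return current["NAND"](inputs, output, x)
```

Reassigning `current["NAND"] = logical_nand` changes what every later call to `place_nand` produces, without touching the code that places gates.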
TYPE PHYSICAL_WIRE= [HEIGHT,LEFT,RIGHT:REAL NAME:QS];
PHYSICAL_WIRES= { PHYSICAL_WIRE };
GATE_PRODUCER= //MRG(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\;
We can now define routines for generating gates in a number of technologies. We 
will have global variables set to one group of these functions, which indicate the 
current technology. Currently, we support NMOS, 2-layer metal NMOS, CMOS, and 
2-layer metal CMOS. In addition to these actual technologies, we have a few 
pseudo-technologies: NMOS sticks, 2-layer metal NMOS sticks, Logic diagrams, and 
NMOS transistor diagrams. The gate producing functions for these technologies are 
listed here. 
DEFINE NMOS_PULLUP(OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
DO CONNECT(OUTPUT,X-2);
POWER::=+.25;
GIVE {BOX(RED,X-15#0\TO X-5#6);
BOX(YELLOW,X-15#-2.\TO X-5#9);
WIRE(GREEN,2,{X-13#YVDD;.#3;X-8#.;.#.-5;.+5#.;.#OUTPUT.HEIGHT});
GCB\AT {X-12#YVDD;X-2#OUTPUT.HEIGHT};
GRCBU\AT X-7#-1.}
ENDDEFN
DEFINE NMOS_NAND(INPUTS:PHYSICAL_WIRES
OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
BEGIN VAR IN=PHYSICAL_WIRE;NUMBER=INT;X2=REAL;
DO NUMBER:= +1 FOR IN $E INPUTS;;
X2:=X-10-2*NUMBER;
DO CONNECT(IN,X2); FOR IN $E INPUTS;
CWIDTH:=X2-5;
GIVE {GCB\AT X-8#YGND;
BOX(GREEN,X2+3#YGND-2\TO X-7#-1.);
COLLECT {RCB\AT X2#IN.HEIGHT;
WIRE(RED,2,{X2#IN.HEIGHT;X-6#.})} FOR IN $E INPUTS;;
NMOS_PULLUP(OUTPUT,X)}
END
ENDDEFN
DEFINE NMOS_NOR(INPUTS:PHYSICAL_WIRES
OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
BEGIN VAR IN=PHYSICAL_WIRE;
DO DO CONNECT(IN,X-16); FOR IN $E INPUTS;
CWIDTH:=X-24;
GIVE {GCB\AT X-19#YGND;
WIRE(GREEN,2,{X-20#YGND;.# MAX IN.HEIGHT FOR IN $E INPUTS;+4});
WIRE(GREEN,2,{X-8# MIN IN.HEIGHT FOR IN $E INPUTS;+4;.#-2.});
COLLECT {RCB\AT X-16#IN.HEIGHT;
WIRE(RED,2,{X-15#IN.HEIGHT+1;X-11#.;.#.+5});
WIRE(GREEN,2,{X-20#IN.HEIGHT+4;X-8#.})}
FOR IN $E INPUTS;;
NMOS_PULLUP(OUTPUT,X)}
END
ENDDEFN
DEF I NE NllOS_I NYERT I INPUTS: PHYS l CAL_W l RES 
OUTPUT:PHYSICAL_WlRE X:REALl=MRG: 









l.JlRE !RED,2, IX-12f/IN.HEIGHT;X-6#. l l; 
NMOS_PULLUPIOUTPUT,Xll 
DEFINE METAL2 NANOCINPUTS:PHYSICAL WIRES 
- OUTPUT:PHYSlCAL=WIRE X:REALl=MRG: 
BEGIN VAR IN=PHYSICAL_WIRE;NUMBER=INT;X2=REAL; 
DO NUMBER:=+l FOR IN SE INPUTS;; 
X2:= -14. MIN -8.-2*NUMBER; 
DO CONNECTCIN,X+X2+2l; FOR IN SE INPUTS; 
CONNECTCOUTPUT,X+X2+9); 
Cl.JI OTH: =X+X2; 
POl.IER:: =+. 25; 




. BCB\AT 12tl-l.;9#0UTPUT.HEIGHTI; 
l.JIRE!YIOLET,3, !2#-1. ;9#.; .#OUTPUT.HEIGHT!); 
BOXCRE0,0#0\TO 11#6); 
BOX<YELLOW,0#-2.\TO 11#8); 
l.JIRE!GREEN,2, 18#YVDD; .#3;3#.; .#-2. ;6#.; .#YGNDI); 
BOXCGREEN,5#YGND-2\TO 5+2*NUMBER#-4.); 
COLLECT IRCB\AT 2#1N.HEIGHT; 
W lRE <RED, 2, 12# IN. HEIGHT; 6+2~·,NUMBER#. l ) I 
FOR IN SE INPUTS;l\AT X+X2#0 
DEFINE METAL2 NOR(lNPUTS:PHYSICAL WIRES 
- OUTPUT: PHYS I CAL)J I RE X: REAL} =MRG: 
BEGIN VAR IN=PHYSICAL_WIRE; 









l~I RE CV IDLET, 3, 112#-1.;. #OUTPUT.HEIGHT}); 
BOXCRE0,3#0\TO 14#6); 
BOXCYELLOW,3#-2.\TO 14#8); 
WIRECGREEN,2, ll#YGNO;.# MAX IN.HEIGHT FOR IN SE INPUTS;-41 l; 
WIRE<GREEN,2, 113# MIN IN.HEIGHT FOR IN SE INPUTS;-4; .#0!); 
l~IRE CGREEN, 2, !6#YVOO;. #3; 11#.;. #0l l; 
COLLECT IRCB\AT 5#IN.HEIGHT; 
WIRE<RE0,2, 16#IN.HEIGHT-1;10#.;.#.-5Jl; 
1.-JIRE !GREEN, 2, ll#IN. HEIGHT-4; 13#. J} l 
FOR IN SE INPUTS:l\AT X-17#0 
DEFINE METAL2 INVERTCINPUTS:PHYSICAL WIRES 
- OUTPUT: PHYS I CAL)JI RE X: REAU =MAG: 











1.-1 IRE CV I OLET, 3, !2#-1.; 9#.;. #OUTPUT. HEI GHTI}; 
BOXCRE0,0#0\TO 11#51; 
BDX<YELLOW,0#-2.\TO 11#8); 
[,JI RE <GREEN, 2, !8#YVOO; . #3; 311. ; • 11-2. ; 6#. ; . #YGNOl } ; 
BOX<GREEN,5#YGND-2\TO 7#-4.J; 
RCB\AT 2#IN.HEIGHT; 
WIRECRE0,2, 12#IN.HEIGHT;8#.J ll \AT X-14#0 
DEF I NE LOG I CAL_NANO CI NPUTS: PHYS I CAL_l-l I RES 
OUTPUT:PHYSICAL_WIRE X:REALl=MRG: 
BEGIN VAR I N=PHYSI CAL_l.-IIRE: NUMBER= I NT; Y=REAL; 
DO NUMBER:= +l FOR IN SE INPUTS;; 
DO CONNECTCIN,X+Yl; FOR IN SE INPUTS;&& 
FOR Y FROM 10./NUMBER-25. BY 20./NUMBER; 
CONNECT<OUTPUT,Xl: 
CWI DTH: =X-30; 





-5. #15; -5. #01); 
WIRECGREEN,0, 1-15.#31; .#33;0#.; .#OUTPUT.HEIGHT! l; 




FOR IN SE INPUTS;&& FOR Y FROM 10./NUMBER-25. BY 20./NUMBER;; 
OUTPUT.NAME\PAINTEO REO\ROT 90\AT -8.#371\AT X#0 
DEFINE LOGICAL_NOR<INPUTS:PHYSICAL_WIRES 
OUTPUT: PHYS I CAL_l.J I RE X: REAL l =MRG: 
BEGIN VAR I N=PHYS I CAL_l~JI RE; NUt1BER= I NT; Y =REAL; 
DO NUMBER:= +1 FOR IN SE INPUTS;; 
DO CONNECT<IN,X+Yl; FOR IN SE INPUTS;&& 
CONNECT<OUTPUT,Xl; 
CWIDTH: =X-30: 
FOR Y FROM 10./NUMBER-25. BY 20./NUMBER; 










COLLECT WIRE<GREEN,0, IY#IF Y<-21. THEN !25+Yl/4 
EF Y<-17. THEN 1+.7*(21+Yl/4 
EF Y<-13. THEN 1.7 
EF Y<-9. THEN 1-.7*(9+Yl/4 
ELSE 1-(9+Yl/4 FI;.#IN.HEIGHTI l 
FOR IN SE INPUTS;&& FOR Y FROM 10./NUMBER-25. BY 20./NUMBER;; 
OUTPUT.NAME\PAINTED REO\ROT 90\AT -8.#37!\AT X#0 
DEFINE LOGICAL INVERT<INPUTS:PHYSICAL_WIRES 
OUTPUT: PHYS I CAL_l.J I RE X: REAU =MRG: 












WIRE<GREEN,0, 1-15.#31; .#33;0#.; .#OUTPUT.HEIGHT! l; 
WIRE(GREEN,0, 1-15.#0;.#IN.HEIGHTlJ; 
DUTPUT.NAME\PAINTED RED\ROT 90\AT -8.#37!\AT X#0 
DEFINE CMOS NAND(INPUTS:PHYSICAL WIRES 
- OUTPUT:PHYSICAL=WIRE X:REALl=MRG: 
BEGIN VAR I N=PHYS I CAL_W IRE; MN=REAL; 
DO DO CONNECT(IN,X-13J;CONNECT(IN,X-2ll; FOR IN SE INPUTS; 
CDNNECT<DUTPUT,X-2l; 
CWIDTH:=X-33; 
MN:= MIN IN.HEIGHT FOR IN SE INPUTS;; 
GIVE !COLLECT IRCB\AT IX-2l#IN.HEIGHT;X-13#.I; 
WIRE<RE0,2, IX-22#IN.HEIGHT-1; .-4#.; .#.-5!}; 





FOR IN SE INPUTS;; 
GCB\AT !X~28#YVOO;X-16#YV00-7;.+6#.;.+6#.;X-2#0UTPUT.HEIGHT; 
X-10#YCNDI; 
l.JIRE (BLUE, 3, !X-lG#YVD0-7;. +12#. l l; 
WI RE (GREEN, 2, !X-29#YVDO;. #t1N-4l l; 
l.JIRE(GREEN,2, IX-17#t1N-4; .#YV00-7!); 
WIRE(GREEN,2, !X-9#YVD0-7;.#YCNDll; 
WJRE(CREEN,2, IX-3#YVDD-7: .#OUTPUT.HEIGHT!}; 
BOX(YELLOW,X-32#YGND-3\TO X-13#YV00-4ll 
DEF I NE Cl10S_NOR <INPUTS: PHYS I CAL_W I RES 
OUTPUT:PHYSICAL_WIRE X:REALl=MRG: 
BEGIN VAR I N=PHYS I CAL_l.J I RE; MX=REAL; 
DO DO CONNECT(IN,X-13l;CONNECT(IN,X-2ll; FOR IN SE INPUTS; 
CONNECTCOUTPUT,X-2l; 
CWIOTH: =X-33; 
MX:= MAX IN.HEIGHT FOR IN SE INPUTS;; 
GIVE !COLLECT !RCB\AT !X-21#IN.HEIGHT;X-13#.l; 
ENO 
ENDOEFN 
t.JIRE(RE0,2, !X-22#IN.HEIGHT-l; .-4#.; .#.-5! l; 
l.JIRE !RE0,2, !X-12#IN.HEIGHT; .+5#. l l; 
WIRE(GREEN,2, !X-29#IN.HEIGHT-4;X-17#.l )} 
FOR IN SE INPUTS;; 
GCB\AT IX-28#YGNO;X-16#YGN0+7; .+6#.; .+6#, ;X-2#0UTPUT.HEIGHT; 
X-10#YVDDI ; 
WI RE (BLUE, 3, !X-1G#YCND+7;. +12#. l l; 
WIRE (GREEN,2, IX-29#YGNO;. #MX-41); 




DEFINE CMOS INVERTCINPUTS:PHYSICAL_WIRES 
OUTPUT:PHYSICAL_WIRE X:REALl=MRG: 
BEGIN VAR IN=PHYSICAL_WIRE; 




GIVE fRCB\AT X-12#IN.HEIGHT; 
END 
ENDOEFN 
WI RE (RED, !X-6#1 N. HEIGHT; X-11#. ; . #YVOD+ll } ; 
GCB\AT IX-2#DUTPUT.HEIGHT;X-7#YGND;.#YVDD-7;.#YVOO;X-15#YVOD-71; 







BEGIN VAR IN=PHYSICAL_WIRE;MN=REAL; 
DO DO CONNECT<IN,X-9l;CONNECTCIN,X-17l; FOR IN SE INPUTS; 
CONNECTCOUTPUT,X-2l: 
Cl.JIOTH: =X-25; 
nN:= mN IN.HEIGHT FOR IN SE INPUTS;; 
GIVE !COLLECT !RCB\AT IX-91/IN.HEIGHT;X-17#.l; 
END 
ENDDEFN 
WIRE <RED, 2, !X-8#IN. HEIGHT-1; X-4#.;. #. -5}); 
~JIRE<RED,2, !X-18#IN.HEIGHT-l;X-23#.l); 
I.JI RE <GREEN, 2, !X-1#! N. HE! GHT -4; X-13#. l ) l 




l.JIRE (BLUE,3, IX-14#YVD0-7;X-20#. I l; 
~IIRE(VIOLET,3, IX-14#YVOD-7:X-2#.: .#OUTPUT.HEIGHT!); 
WIRE<GREEN,2, !X-l#YVOO; .#llN-41); 
WIRECGREEN,2, IX-13#YVD0-8;.#MN-4l); 
WI RE CGREEN, 2, !X-2l#YVOD-8;. #YGNDI): 
BOX(YELLOW,X+l.5#MN-7\TO X-17#YV00+2)l 
DEF I NE CMOS_2_NOR (INPUTS: PHYS I CAL_~JI RES 
OUTPUT: PHYS I CAL_W.J RE X: REAL) =MRG: 
BEGIN VAR I N=PHYS I CAL_JJ! RE; 11X=REAL; 
DO DO CONNECTCIN,X-9l;CONNECT<IN,X-17); FOR IN SE INPUTS; 
CONNECT(OUTPUT,X-2l; 
Cl.JIDTH: =X-25; 
MX:= MAX IN.HEIGHT FOR IN UE INPUTS;; 
GIVE !COLLECT !RCB\AT IX-9#IN.HEIGHT;X-17#.l; 
END 
ENDDEFN 
l.JIRE<RE0,2, !X-8#IN.HEIGHT-l;X-4#.; .#.-51); 
WIRE <RED, 2, IX-18#I N. HE! GHT-1; X-23#. l); 
WIRE (GREEN, 2, IX-l#IN. HEIGHT-4; X-13#. l) l 




WI RE <BLUE, 3, !X-14#YGN0+7: X-20#. I l; 
I.JI RE (VIOLET, 3, IX-14#YGN0+7; X-2#.;. #OUTPUT. HE I GHTl ) ; 
l.JIRE <GREEN,2, IX-l#YGNO; .#MX-41); 
WIRE (GREEN, 2, IX-13#YGN0+8;. #f1X-4l); 
WJRE<GREEN,2, IX-2l#YGN0+8;.#YV00l); 
BOXCYELLOW,X-17#YGND+4\TO X-23.5#YVOD+2ll 
DEF I NE Cl"10S~2_I NVERT ( l NPUTS: PHYS I CAL_W I RES 
OUTPUT:PHYSICAL_WIRE X:REALl=MRG: 
BEGIN VAR IN=PHYSICAL_WlRE; 
DO IN:=INPUTS[lJ: 
CONNECT< IN, X-9); 
CONNECT(OUTPUT,X-2l; 
CWIOTH: =X-15; 
GI VE !RCB\AT X-9#1 N. HEIGHT; 
ENO 





WI RE <BLUE, 3, !X-10#YVDD-7; X-2#. l l; 
WIRE{VIOLET,3, IX-2#YVDD-7;.#0UTPUT.HEIGHTll; 
WIRE (GREEN, 2, !X-3#YGND+l;. #YVOD-8!); 





VAR SWW=REAL; "STICKS WIRE WIDTH"
SCON=MRG; "STICKS_CONTACT"
SWW:=.25;
SCON:=BOX(BLACK,-1.#-1.\TO 1#1);
DEFINE NMOS_STICKS_NAND(INPUTS:PHYSICAL_WIRES
OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
BEGIN VAR IN=PHYSICAL_WIRE;
DO DO CONNECT(IN,X-10); FOR IN $E INPUTS;
CONNECT(OUTPUT,X-2);
CWIDTH:=X-12;
GIVE {COLLECT {SCON\AT X-10#IN.HEIGHT;
WIRE(RED,SWW,{X-10#IN.HEIGHT;.#.-4;X-4#.})}
FOR IN $E INPUTS;;
WIRE(GREEN,SWW,{X-6#YGND+2;.#YVDD-2});
WIRE(GREEN,SWW,{X-6#0;X-2#.;.#OUTPUT.HEIGHT});
WIRE(RED,SWW,{X-6#0;X-10#.;.#8;X-2#.});
BOX(YELLOW,X-8#4\TO X-4#8);
SCON\AT {X-8#YGND+2;.#0;.#YVDD-2;X-2#OUTPUT.HEIGHT}}
END
ENDDEFN
DEF r NE NMOS_ST r CKS._NOR {INPUTS: PHYS I CAL_W I RES 
OUTPUT:PHYSICAL_WIRE X:REALJ=MRG: 
BEGIN VAR IN=PHYSICAL_WIRE; 
DO DO CONNECT<IN,X-10}; FOR IN SE INPUTS; 
CONNECT<OUTPUT,X-21; 
CWIDTH: =X-16; 
GIVE !COLLECT !SCON\AT X-10#IN.HEIGHT; 
END 
ENDDEFN 
WIRElREO,S~JW, lX-10#IN.HElGHT; .#.-8l l: 
WIRE (GREEN, SWW, !X-14#I N. HEIGHT -4; X-G#. l) l 
FOR IN SE INPUTS;; 
WI RE (GREEN, S~JW, !X-l 4#YGND+2; . # MAX IN. HEIGHT FOR IN SE INPUTS; -4 l } ; 
WIRECGREEN,SWW, IX-G# MIN IN.HEIGHT FOR IN SE INPUTS;-4;.#YVDD-21 }; WIRE<GREEN,SWW, !X-8#0;X-2#.;.#0UTPUT.HEIGHTJJ; 




















BEGIN VAR I N=PHYS I CAL_~! l RE; 
DO DO CONNECT<IN,X-101; FOR IN SE INPUTS; 
· CONNECT <OUTPUT, X-2); 
CWIDTH: =X-12; 
GIVE !COLLECT ISCON\AT X-10#IN.HEIGHT; 
END 
ENDDEFN 
WI RE <RED, Sl-!W, !X-10#1 N. HEIGHT;.#, -4; X-4#. l J l 
FOR IN SE INPUTS;; 
WIRE<GREEN,SWW, IX-6#YGN0+2;.#YVD0-2J); 
WIRECVIOLET,SWW, IX-6#0;X-2#.;.#0UTPUT.HEIGHTll; 





BEGIN VAR IN=PHYSICAL_~JIRE; 
DO DO CONNECT<IN,X-10); FOR IN SE INPUTS; 
CONNECT<OUTPUT,X-2l; · 
CWIDTH: =X-16; 
GIVE !COLLECT ISCON\AT X-10#IN.HEIGHT; 
ENO 
ENDDEFN 
WIRE<RED,SWW, IX-10#IN.HEIGHT; .#.-6)); 
WIRE <GREEN, SWW, !X-14#1 N. HE I GHT-4; X-6#. l )} 
FOR IN SE INPUTS;; 
WIRE<GREEN,SWW, IX-14#YGN0+2;.# MAX IN.HEIGHT FOR IN SE INPUTS;-4l l; 
WIRE(GREEN,S~!W, !X-6# MIN IN.HEIGHT FOR IN SE INPUTS;-4; .#YVDD-2l); 
WIRE (VIOLET, SW~J. !X-5#0; X-2#.;. #OUTPUT. HE I GHTJ ) ; 
WIRE <RED, Sl.JW, !X-6#0; X-10#.;. #6; X-2#. I J; 
BOX(YELLOW,X-8#4\TO X-4#81; 
SCON\AT IX-14#YGND+2;X-6#0;.#YVD0-2;X-2#0UTPUT.HEIGHTIJ 
DEFINE METAL2 STICKS INVERT<INPUTS:PHYSICAL WIRES 
- - OUTPUT:PHYSICAL-WIRE X:REALl=MRG: 









~!IRE <VIOLET, St.J~I. IX-6#0; X-2#.;. #OUTPUT. HEI GHTJ l; 
WI RE <RED, SWl..J, IX-6#0; X-10#.;. #6; X-2#. l); 
BOX<YELLOW,X-8#4\TO X-4#8); 
SCON\AT !X-10#IN.HEIGHT;X-6#YGND+2;.#0;.#YV00-2;X-2#0UTPUT.HEIGHTll 
VAR TRANQ,TRANGND,TRANPULL=MRG;
TRANQ:={WIRE(BLACK,0,{0#0;2#.});
WIRE(BLACK,0,{2#-2.;.#2});
WIRE(BLACK,0,{2.5#-3.;.#3});
WIRE(BLACK,0,{2.5#2;4#.});
WIRE(BLACK,0,{2.5#-2.;4#.})};
TRANGND:={WIRE(BLACK,0,{-2.#0;2#.});
WIRE(BLACK,0,{-1.2#-.8;1.2#.});
WIRE(BLACK,0,{-.4#-1.6;.4#.})};
TRANPULL:={TRANQ\AT -4.#6;
WIRE(BLACK,0,{-4.#6;.#2;0#.});
WIRE(BLACK,0,{0#0;.#4});
WIRE(BLACK,0,{0#8;.#14});
WIRE(BLACK,0,{-.6#12;0#14;.6#12})};
DEFINE TRANS_NAND(INPUTS:PHYSICAL_WIRES
OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
BEGIN VAR IN=PHYSICAL_WIRE;SR1,SR2=SR;R,S=REAL;
DO DO CONNECT(IN,X-10); FOR IN $E INPUTS;
CONNECT(OUTPUT,X-2);
SR2:={COLLECT IN.HEIGHT FOR IN $E INPUTS;};
SR1:=NIL;
WHILE DEFINED(SR2); DO
S:= MAX R FOR R $E SR2;;
SR1::= S <$;
SR2:={COLLECT R FOR R $E SR2;WITH R<>S;};
END
CWIDTH:=X-12;
GIVE {COLLECT {TRANQ\AT X-10#IN.HEIGHT-4;
WIRE(BLACK,0,{X-10#IN.HEIGHT;.#.-4})}
FOR IN $E INPUTS;;
COLLECT WIRE(BLACK,0,{X-6#R-2;.#S-6})
FOR {R;S} $C YGND+2 <$ SR1 $> 6;;
TRANGND\AT X-6#YGND;
TRANPULL\AT X-6#0;
WIRE(BLACK,0,{X-6#0;X-2#.;.#OUTPUT.HEIGHT})}
END
ENDDEFN
DEFINE TRANS_NOR(INPUTS:PHYSICAL_WIRES
OUTPUT:PHYSICAL_WIRE X:REAL)=MRG:
BEGIN VAR IN=PHYSICAL_WIRE;
DO DO CONNECT(IN,X-10); FOR IN $E INPUTS;
CONNECT(OUTPUT,X-2);
CWIDTH:=X-16;
GIVE {COLLECT {TRANQ\ROT 270\AT X-10#IN.HEIGHT;
WIRE(BLACK,0,{X-14#IN.HEIGHT-4;.+2#.});
WIRE(BLACK,0,{X-8#IN.HEIGHT-4;.+2#.})}
FOR IN $E INPUTS;;
TRANGND\AT X-14#YGND;
WIRE(BLACK,0,{X-14#YGND;.# MAX IN.HEIGHT FOR IN $E INPUTS;-4});
WIRE(BLACK,0,{X-6# MIN IN.HEIGHT FOR IN $E INPUTS;-4;.#0;X-2#.;
.#OUTPUT.HEIGHT});
TRANPULL\AT X-6#0}
END
ENDDEFN
DEFINE TRANS INVERT!INPUTS:PHYSICAL_WIRES 
OUTPUT:PHYSICAL_WIRE X:REALl=MRG: 






GIVE ITRANQ\AT X-10#IN.HEIGHT-4; 
END 
ENDOEFN 
l-IIRE <BLACK, 0, IX-10#1 N. HEIGHT;.#. -41}: 
WIRE !BLACK, 0, IX-G#YGND:. #IN. HEIGHT-Gl}: 
TRANGNO\AT X-G#YGNO; 
TRANPULL\AT X-8#0; 
WIRE <BLACK, 0, IX-G#I N. HEIGHT -2;. #0; X-2#.;. #OUTPUT. HE I GHTl } l 
In addition to the gate producing routines, each technology requires a function 
which will draw the final signal wires on the chip. This wire function accepts a 
chip as the input parameter and produces an MRG as the output parameter. Since we 
want the user to be able to change the wire drawing routine at will, this too will be 
a global variable which is a suspendable function. The following type declaration 
declares the type. The wire drawing routines are listed after the type declaration. 
TYPE CHIP_TO_MRG= //MRG(CHIP)\\;
DEFINE NMOS_WIRES(C:CHIP)=MRG:
BEGIN VAR S=SIGNAL_WIRE;LEFT,RIGHT,PWIDTH=REAL;
DO LEFT:=CWIDTH+5;
RIGHT:=-2.;
PWIDTH:=WIDTH(POWER) MAX 4;




FOR S $E C.SIGNALS;
EACH_DO @(S.PHYSICAL).LEFT::= MAX LEFT;
@(S.PHYSICAL).RIGHT::= MIN RIGHT;;;
BOX(BLUE,CWIDTH+3#YVDD-3\TO 4#YVDD+(PWIDTH-3 MAX 2));
BOX(BLUE,CWIDTH-1#YGND+2-PWIDTH\TO 0#YGND+2)}
DEF I NE METAL2_WIRES (C: CHI Pl =f1RG: 
BEGIN VAR S=SIGNAL_WIRE;LEFT,RIGHT,PWIDTH=REAL; 
DO LEFT:=CWIDTH+2; 
RIGHT:=-5.; 
PWIDTH:=WIOTH(POWERl MAX 4; 




FOR S SE C.SIGNALS; 
EACH_OO @(S.PHYSICAL}.LEFT::= MAX LEFT; 
@(S.PHYSICAL}.RJGHT::= MIN RIGHT;;; 
BOX<BLUE,CWIOTH#YV00-3\TO l#YVOO+<PWIOTH-3 MAX 2ll; 
80X!BLUE,CWIOTH-4#YGND+2-PWIDTH\TO -3.#YGN0+2)J 
DEFINE LOGICAL_WIRES<C:CHIP>=MRG: 
BEGIN VAR S=SIGNAL_wIRE;LEFT,RIGHT=REAL; 
DO LEFT:=CWIOTH-2; 
RIGHT:=5; 
GIVE !COLLECT WIRE<GREEN,0, IS.PHYSICAL.LEFT#S.PHYSICAL.HEIGHT; 
S.PHYSICAL.RIGHT#.J} 




EACH_DO @{S.PHYSICALl.LEFT::= MAX LEFT; 
@{S.PHYSICALl.RIGHT::= MIN RIGHT;;} 
DEFINE CMOS_WIRESCC:CH!Pl=MRG: 
BEGIN VAR S=SIGNAL_WIRE;LEFT,RIGHT=REAL; 
DO LEFT:=CWIDTH+l2; 
RIGHT:=-2.; 




FOR S SE C.SIGNALS; 
EACH_OO @CS.PHYSICALl.LEFT::= MAX LEFT; 
@{S.PHYSICALl.RIGHT::= MIN RIGHT;;; 
WIRE CBLUE, 4, !CW IDTH+4#YVOO; 2#. l l ; 
WIRE<BLUE,4, !CWIOTH#YGN0;-2.#.l Jl 
DEFINE CMOS_2_WIRES{C:CHIPl=MRG: 
BEGIN VAR S=SIGNAL_WIRE;LEFT,RIGHT=REAL; 
DO LEFT:=CWIDTH+8; 
RIGHT: =-2.; 




FOR S SE C.SIGNALS; 
EACH_DO @{S.PHYSICALJ.LEFT::= MAX LEFT; 
@CS. PHYSlCAU. RIGHT::= MIN RIGHT;;; 
WIRE CB LUE, 4, !CW I 0 TH+S#YVOO; 2#. l l ; 
WIRECBLUE,4, fCWIDTH#YGN0;-2.#.l)l 
DEFINE NMOS_STICKS_WIRESCC:CHIPJ=MRG: 
BEGIN VAR S=SIGNAL_WIRE;LEFT,RIGHT=REAL; 
DO LEFT:=CWIDTH+2; 
RIGHT:=-2.; 




FOR S SE C.SIGNALS; 
EACH_OO @CS.PHYSICALl.LEFT::= MAX LEFT; 
@CS.PHYSICALJ.RIGHT::= MIN RIGHT;;; 
WI RE CBLUE, SWW, ILEFT#YV00-2: 2#. l ) ; 
WIRE<BLUE,SWW,!CWIOTH-2#YGN0+2;RIGHT#.JJJ 
"METAL-2 sticks uses the NMOS_STICKS_WIRES routine" 
DEF I NE TRANS_W I RES CC: CH IP) =11RG: 
BEGIN VAR S=SIGNAL_~JIRE;LEFT ,RIGHT =REAL; 
DO LEFT:=CWIDTH+2; 
RIGHT:=-2.; 




FOR S SE C.SIGNALS; 
EACH_DO @{S.PHYSICALJ.LEFT::= MAX LEFT; 
@CS.PHYSICALl.RIGHT::= MIN RIGHT;;! 
In addition to the wire drawing routine, each technology has a routine for 
initializing the global coordinates and a routine for calculating wire positions in the 
wiring channel. The first routine requires the chip as an input parameter, and 
produces no output. The second routine takes a channel index, an INTEGER, and 
returns the channel position, a REAL. These two routines will be CHIP_CONSUMERs 
and INT_TO_REALs. 
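As a small worked example of these two kinds of routine, the Python sketch below restates the CMOS versions using the constants that appear in the CMOS_SETUP and CMOS_WIRE_HEIGHTS listings, as far as the scanned text can be read. Returning the rail positions as a tuple (instead of setting the globals YVDD and YGND) and passing the channel heights as a plain list are simplifications made for the example.

```python
def cmos_setup(vheights):
    """Sketch of CMOS_SETUP: place the power rails from the deepest channel used.

    vheights : channel index (VHEIGHT) of every signal wire on the chip
    returns  : (yvdd, ygnd)
    """
    yvdd = 0.0
    ygnd = -9.0 * max(vheights) - 19.0
    return yvdd, ygnd

def cmos_wire_height(i):
    """Sketch of CMOS_WIRE_HEIGHTS: y position of wiring channel i."""
    return -5.0 - 9.0 * i
```

With channels 1 to 3 in use, the wires sit at y = -14, -23 and -32, and the ground rail is placed at y = -46, below the lowest channel.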
TYPE CHIP_CONSUMER= //(CHIP)\\;
INT_TO_REAL= //REAL(INT)\\;
DEFINE NMOS_SETUP(C:CHIP):
BEGIN VAR S=SIGNAL_WIRE;
YGND:= -8.*(MAX S.VHEIGHT FOR S $E C.SIGNALS;)-6;
YVDD:=9;
END
ENDDEFN
"METAL2 uses NMOS_SETUP"
"LOGICAL uses NMOS_SETUP"
DEFINE CMOS_SETUP(C:CHIP):
BEGIN VAR S=SIGNAL_WIRE;
YVDD:=0;
YGND:=-9.*(MAX S.VHEIGHT FOR S $E C.SIGNALS;)-19;
END
ENDDEFN
"CMOS_2 uses CMOS_SETUP"
DEFINE NMOS_STICKS_SETUP(C:CHIP):
BEGIN VAR S=SIGNAL_WIRE;
YGND:= -10.*(MAX S.VHEIGHT FOR S $E C.SIGNALS;)-6;
YVDD:=12;
END
ENDDEFN
"METAL2_STICKS uses NMOS_STICKS_SETUP"
DEFINE TRANS_SETUP(C:CHIP):
BEGIN VAR S=SIGNAL_WIRE;
YGND:= -10.*(MAX S.VHEIGHT FOR S $E C.SIGNALS;)-8;
END
ENDDEFN
DEFINE NMOS_WIRE_HEIGHTS<I:INT>=REAL: 
"METAL2 uses NMOS_WIRE_HEIGHTS" 
"LOGICAL uses NMOS_WIRE_HEIGHTS" 
ENDDEFN 
DEFINE CMOS_WIRE_HEIGHTS(I:INT)=REAL: -5.-9*I ENDDEFN
"CMOS_2 uses CMOS_WIRE_HEIGHTS"
DEFINE NMOS_STICKS_WIRE_HEIGHTS(I:INT)=REAL: ENDDEFN
"METAL2_STICKS uses NMOS_STICKS_WIRE_HEIGHTS"
DEFINE TRANS_WIRE_HEIGHTS(I:INT)=REAL: 5-10*I ENDDEFN
The final technology dependent routines in RLC concern wire packing and gate 
sorting. For each technology, we may desire to have a routine which will pack the 
wires in the wiring channel. Similarly, we may desire sorting routines which sort 
the gates to achieve higher performance or smaller area. These routines are similar 
to the SETUP routines: They require a CHIP as an input parameter, and they return 
no output data. 
With these considerations in mind, we can declare a TECHNOLOGY datatype which 
contains all of the technology-dependent information. We can define new 
technologies and add them to the technology list at any time, and can then output 





WIRE_HEIGHT:
VDD,GND:
NAME:
GATE_PRODUCER
CHIP_TO_MRG




TECHNOLOGIES= { TECHNOLOGY };
VAR TECHNOLOGIES= TECHNOLOGIES;
VAR NMOS,METAL2,LOGICAL,CMOS,CMOS2,NMOS_STICKS,METAL2_STICKS,TRANSISTOR= TECHNOLOGY;
NMOS:=[NAND://:NMOS_NAND(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
NOR://:NMOS_NOR(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
INVERT://:NMOS_INVERT(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
WIRES://:NMOS_WIRES(CHIP)\\
PACK://:NMOS_PACK_2(CHIP)\\
SORT://:NO_SORT(CHIP)\\
SETUP://:NMOS_SETUP(CHIP)\\




METAL2:=[NAND://:METAL2_NAND(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
NOR://:METAL2_NOR(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
INVERT://:METAL2_INVERT(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
WIRES://:METAL2_WIRES(CHIP)\\








LOGICAL:=[NAND://:LOGICAL_NAND(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
NOR://:LOGICAL_NOR(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
INVERT://:LOGICAL_INVERT(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
WIRES://:LOGICAL_WIRES(CHIP)\\
PACK://:NMOS_PACK_2(CHIP)\\
SORT://:NO_SORT(CHIP)\\
SETUP://:NMOS_SETUP(CHIP)\\
WIRE_HEIGHT://:NMOS_WIRE_HEIGHTS(INT)\\
NAME:'LOGICAL'];
CMOS:=[NAND://:CMOS_NAND(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
NOR://:CMOS_NOR(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\









CMOS2:=[NAND://:CMOS_2_NAND(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
NOR://:CMOS_2_NOR(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
INVERT://:CMOS_2_INVERT(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
WIRES://:CMOS_2_WIRES(CHIP)\\
PACK://:NMOS_PACK_2(CHIP)\\
SORT://:NO_SORT(CHIP)\\




NAME:'CMOS2'];
NMOS_STICKS:=
[NAND://:NMOS_STICKS_NAND(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
NOR://:NMOS_STICKS_NOR(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
INVERT://:NMOS_STICKS_INVERT(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\




WIRE_HEIGHT://:NMOS_STICKS_WIRE_HEIGHTS(INT)\\
VDD:2.
GND:-2.









SETUP://:NMOS_STICKS_SETUP(CHIP)\\





NOR://:TRANS_NOR(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\
INVERT://:TRANS_INVERT(PHYSICAL_WIRES,PHYSICAL_WIRE,REAL)\\








DEFINE NMOS=MRG: COMPILE(CHIP,NMOS) ENDDEFN





DEFINE CMOS2=MRG: COMPILE(CHIP,CMOS2) ENDDEFN




COMPILE(CHIP,METAL2_STICKS) ENDDEFN
COMPILE(CHIP,TRANSISTOR) ENDDEFN
DEFINE PUT_NMOS: PUT(CHIP,NMOS); ENDDEFN
DEFINE PUT_METAL2: PUT(CHIP,METAL2); ENDDEFN
DEFINE PUT_LOGICAL: PUT(CHIP,LOGICAL); ENDDEFN
DEFINE PUT_CMOS: PUT(CHIP,CMOS); ENDDEFN
DEFINE PUT_CMOS2: PUT(CHIP,CMOS2); ENDDEFN
DEFINE PUT_NMOS_STICKS: PUT(CHIP,NMOS_STICKS); ENDDEFN
DEFINE PUT_METAL2_STICKS: PUT(CHIP,METAL2_STICKS); ENDDEFN
DEFINE PUT_TRANSISTOR: PUT(CHIP,TRANSISTOR); ENDDEFN
TECHNOLOGIES:= {NMOS; METAL2; LOGICAL; CMOS; CMOS2; NMOS_STICKS; METAL2_STICKS;
TRANSISTOR};
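The technology records above bundle per-technology routines as function-valued fields, so the compiler can call them without knowing which technology is in use. A minimal Python sketch of that idea follows. The names `Technology`, `register`, `one_wire_per_channel`, and `compile_chip` are illustrative assumptions and are not part of RLC; only the sort-then-pack calling order and the `5-10*I` height rule are taken from the listing.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Chip = dict  # hypothetical stand-in for the CHIP record


@dataclass
class Technology:
    name: str
    wire_height: Callable[[int], float]  # channel index -> y coordinate
    pack: Callable[[Chip], None]         # takes a chip, returns nothing
    sort: Callable[[Chip], None]         # takes a chip, returns nothing


def no_sort(chip: Chip) -> None:
    """Sorter that leaves the gate order untouched."""


def one_wire_per_channel(chip: Chip) -> None:
    """Trivial packer (an assumption): every signal gets its own channel."""
    for channel, signal in enumerate(chip["signals"], start=1):
        signal["channel"] = channel


TECHNOLOGIES: Dict[str, Technology] = {}


def register(tech: Technology) -> None:
    """Add a technology to the table; this can be done at any time."""
    TECHNOLOGIES[tech.name] = tech


register(Technology(
    name="TRANSISTOR",
    wire_height=lambda i: 5.0 - 10.0 * i,  # mirrors TRANS_WIRE_HEIGHTS
    pack=one_wire_per_channel,
    sort=no_sort,
))


def compile_chip(chip: Chip, tech: Technology):
    """Call the technology's routines through the record, sort before pack."""
    tech.sort(chip)
    tech.pack(chip)
    return [(s["name"], tech.wire_height(s["channel"])) for s in chip["signals"]]
```

Adding a new technology then amounts to one more `register` call; nothing in `compile_chip` has to change.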
Now that we have our basic technologies defined, we will present the data
structure definitions for representing the chip. These definitions, which follow
the definitions in Chapter 5, represent the wires and gates of the chip. In addition,
the definition for the CHIP datatype is given. The DCHIP type is a swappable CHIP,
which means that an instance of type DCHIP can be swapped into the virtual
memory by the system. ICL allows the user to specify which datatypes are
swappable, because the user can do a much better job of describing conceptual units
than a program can.
TYPE SIGNAL_WIRE= [FROM:GATE
TO:GATES
NAME:QS









RINDEX:REAL];
GATES= { GATE };
GATE_TYPE= SCALAR(NAND,NOR,INVERT);
CHIP= [GATES:GATES
SIGNALS:SIGNAL_WIRES
SIGNAL_COUNT:INT
NAME,DESCRIPTION:QS];
DCHIP= PRIVATE DISK_NODE;
VAR YVDD,YGND,POWER,CWIDTH=REAL;
CHIP=CHIP;
LET DCHIP BECOME CHIP BY MACRO-10('INCOR$')
DEFINE DISK(C:CHIP)=DCHIP: MACRO-10('DSKIZ$')
DEFINE MODIFIED(D:DCHIP): MACRO-10('DMOD$')
DEFINE PUT(D:DCHIP N:QS):




DEFINE PUT(C:CHIP): PUT(DISK(C),C.NAME); ENDDEFN
DEFINE GET(N:QS)=DCHIP:
BEGIN LET GLS24 BECOME DCHIP BY MACRO-10('IDENT$') GET(N,'DCHIP 1/2/81')
END
ENDDEFN
This next section of code is the actual compiler, which closely follows the code in
Chapter 5. Because of the similarity of the code, no additional comments will be
given here.
DEFINE PHYSICAL(SW:SIGNAL_WIRE)=PHYSICAL_WIRE: SW.PHYSICAL ENDDEFN
DEFINE PHYSICAL(SWS:SIGNAL_WIRES)=PHYSICAL_WIRES:
BEGIN VAR S=SIGNAL_WIRE;
{COLLECT S\PHYSICAL FOR S $E SWS;}
END
ENDDEFN
DEF I NE INPUTS <C: CH IP l =SI GNAL_~l I RES: 
BEGIN VAR S=S I GNAL_~I I RE: 




BEGIN VAR S=S I GNAL_LH RE; 
!COLLECT S FOR S SE C.SIGNALS;WITH S.OUTPUT;l 
END 
ENDOEFN 
DEFINE CONNECT(WIRE:PHYSICAL_WIRE X:REAL):
@(WIRE).LEFT::= MIN X;
@(WIRE).RIGHT::= MAX X;
ENDDEFN
DEFINE INITIALIZE_WIRES<C:CHIP T:TECHNOLOGYl: 
BEGIN VAR S=SI GNAL_m RE; 




@<Sl.PHYSlCAL:=[LEFT:lF S.INPUT THEN -999999. ELSE 999999 FI 
RIGHT:IF S.OUTPUT THEN 999999 ELSE -999999. FI 
HEIGHT: <~'( T. W IRE_HE I GHfo> CS. VHE I GHTl 
NAME:S.NAMEJ; 
DEFINE DRAW_CELLS(C:CHIP T:TECHNOLOGYl=MRG: 
BEGIN VAR X=REAL;G=GATE; 
!COLLECT <* CASE G. TYPE OF 
NOR: T. NOR 
NANO: T.NANO 
INVERT: T.INVERT 





BEGIN VAR G=GATE;T=SIGNAL_WIRE; 
(+ CASE G. TYPE OF 
NOR: 1 
INVERT: 1 
NANO: +1 FORT SE G.INPUTS; 
ENDCASE FOR G SES. TO;hQ_LOAD + 
LOAD!BLUE,WIDTH(BLUEl,S.PHYSICAL.RIGHT-S.PHYSICAL.LEFTJ ENO 
ENDDEFN 
DEFINE COMPILE(C:CHIP T:TECHNOLOGYJ=MRG: 
BEGIN VAR M=MRG; 
DO CWIOTH: =0; 
POWER:=0; 
<1'< T. SORT.1·r> (CJ; 
<~·,T. PACK,·:> (CJ; 






DEFINE PUT<C:CHIP T:TECHNOLOGYJ: 
BEGIN VAR ~1=t1RG;G=GATE;S=SIGNAL_WIRE; 
M:=COMPILE(C,Tl; 








PORTS: !COLLECT CNAf'lE: IS.NAf1EI 
AT: I { [COLOR: BLUE 
EDGE: lJEST 
AT:S.PHYSICAL.LEFT#S.PHYSICAL.HEIGHTJll LOAD: LOAD!Sll 
FORS SE C\INPUTS;; 
COLLECT [NAME: IS.NAME! 
AT: I I [COLOR: BLUE 
EDGE: EAST 
AT:S.PHYSICAL.RIGHT#S.PHYSICAL.HEIGHTJll ORIVE:l 
LOAD: LOAD (SJ J 
FORS SE C\OUTPUTS;JJ\OISK,C.NAMEl; 
DEFINE EQ(A,B:GATE)=BOOL: MACRO-10('LSPEQ$')
DEFINE EQ(A,B:SIGNAL_WIRE)=BOOL: MACRO-10('LSPEQ$')
DEFINE LINK_INPUT(G:GATE S:SIGNAL_WIRE):
@(S).TO::= G <$;
@(G).INPUTS::= S <$;
ENDDEFN
DEFINE LINK_OUTPUT(G:GATE S:SIGNAL_WIRE):
@(G).OUTPUT:=S;
@(S).FROM:=G;
ENDDEFN
DEFINE UNLINK_INPUT(G:GATE S:SIGNAL_WIRE):
BEGIN VAR Q=GATE;R=SIGNAL_WIRE;
@(S).TO:={COLLECT Q FOR Q $E S.TO;WITH -(Q\EQ G);};
@(G).INPUTS:={COLLECT R FOR R $E G.INPUTS;WITH -(R\EQ S);};
END
ENDDEFN





BEGIN VAR O=GATE; 




BEGIN VAR R=S I GNAL_l-J I RE; 
CHIP.SIGNALS:=!COLLECT R FOR R SE CHIP.SIGNALS;WITH -{R\EQ Sl;l; 




IF DEFINEOCA.VINVERTJ THEN @CA.VINVERTJ.VINVERT:=NIL; FI 




DEF I NE FUSE CA. B: SI GNAL_l-JI REJ: 
BEGIN VAR G=GATE:C=CHAR: 
IF DEFINEO<B.FROf1) !B. INPUT THEN 
IF DEFINED CA.FROM) !A.INPUT THEN HELP; 
ELSE @(Al.INPUT:=B.INPUT; 
G: =B. FROf1; 
IF DEFINED<Gl THEN 
UNLINK_DUTPUT!G,Bl; 
LINK_OUTPUT<G,Al; Fl FI FI 
IF ALWAYS C\DIGIT FOR CSE A.NAME; THEN @!Al.NAME:=B.NAME; FI 
IF DEFlNED<B.VINVERTl THEN VINVERTCA,B.VINVERTl; FI 
@CAJ.OUTPUT::=!B.OUTPUT; 




ELI 111 NATE <Bl; 
END 
ENDDEFN 
LET OS BECOME SIGNAL_~JIRE BY 
BEGIN VAR S=SIGNALJJIRE; 
IF THERE_IS S.NAME\EQ QS FDR S SE CHIP.SIGNALS; THEN S 
ELSE DO S:-[NAME:QSJ; 
CHIP.SIGNALS::• S <S; 
GIVE S FI 
ENO; 
DEF I NE NHJ_S I GNAL =SI GNAL_W I RE: SC ( !CH IP.SI GNAL_COUNT: : =+1; ) ) 
DEFINE SET<S:SIGNAL_WIRE G:GATE>: LINK_OUTPUT!G,S>; ENDOEFN 
LET GATE BECOME SIGNAL_WIRE BY 
BEGIN VAR S=S I GNAL_~J I RE: 




DEFINE INPUT(S:SIGNAL_WIRE>: @<S>.INPUT:=TRUE; ENDDEFN 
DEFINE OUTPUT<S:SIGNAL_WIREI: @(S).OUTPUT:=TRUE; ENDOEFN 
DEFINE lNPUTS!SQS:SQS): 
BEGIN VAR QS=QS; 




BEGIN VAR QS=QS; 
DO OUTPUT<QS); FOR QS SE SOS; 
ENO 
ENODEFN 
DEFINE NAME<OS:QS): CHIP.NAME:=OS; ENODEFN 
DEFINE DESCRIPTION<OS:QSI: CHIP.DESCRIPTION:=OS; ENDOEFN 




DEF I NE NEW_GATE (St.JS: SI GNAL_W I RES TYPE: GATE_ TYPE) =GATE: BEGIN VAR GATE=GATE;SW=SIGNAL_WIRE; 
DO GATE:=[TYPE:TYPEJ; 
CHIP.GATES::= GATE <S; 




DEFINE NAND(SWS:SIGNAL_WIRES)=GATE: NEW_GATE(SWS,NAND) ENDDEFN
DEFINE NOR(SWS:SIGNAL_WIRES)=GATE: NEW_GATE(SWS,NOR) ENDDEFN
DEFINE INVERT(SW:SIGNAL_WIRE)=GATE: NEW_GATE({SW},INVERT) ENDDEFN
DEFINE AND(SWS:SIGNAL_WIRES)=GATE: SWS\NAND\INVERT




DEFINE NAND(A,B:SIGNAL_WIRE)=GATE: NAND({A;B}) ENDDEFN
DEFINE NOR(A,B:SIGNAL_WIRE)=GATE: NOR({A;B}) ENDDEFN
DEFINE AND(A,B:SIGNAL_WIRE)=GATE: AND({A;B}) ENDDEFN
DEFINE OR(A,B:SIGNAL_WIRE)=GATE: OR({A;B}) ENDDEFN
DEFINE XOR(A,B:SIGNAL_WIRE)=GATE:
BEGIN VAR C=SIGNAL_WIRE;
DO C:=NAND(A,B);
GIVE NAND(A\NAND C,B\NAND C)
END
ENDDEFN
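The XOR definition above builds exclusive-or from four NAND gates, sharing the intermediate signal C. As a quick check of that construction, the following Python sketch evaluates the same gate structure on boolean values. The function names are illustrative only; the sketch models logic values, not the GATE and SIGNAL_WIRE records.

```python
def nand(*inputs: bool) -> bool:
    """An n-input NAND: low only when every input is high."""
    return not all(inputs)


def xor(a: bool, b: bool) -> bool:
    """Same structure as the XOR listing: C is shared by two NANDs."""
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))
```

Enumerating the four input combinations shows that the result is high exactly when the two inputs differ.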
The following code lists the optimizers defined in RLC. These optimizers look at
the logical structure of the chip, replacing gates while preserving functionality.
The GET_INVERT function is a utility function which generates the inverse of its
input signal, using an existing inverter if one exists. The REMOVE_INVERTERS
function removes extra inverters from the chip's logic. REMOVE_REDUNDANCIES
looks for redundant gates, removing those which do not add to the functionality of
the chip. The DE_MORGAN function will convert a NAND gate into a NOR
implementation and turn a NOR gate into a NAND implementation. This function is
used by REMOVE_NANDS and REMOVE_NORS, which eliminate all instances of their
respective gates. REMOVE_NANDS is used to turn a NAND circuit into a NOR circuit,
while REMOVE_NORS does the inverse transformation. DE_MORGAN_COST is a
function which computes the relative cost of a NAND or NOR gate in the chip's logic
equations. The DE_MORGAN function calls this cost routine to determine which
gates to replace. If the cost of converting a particular gate into its dual gate is
negative, which means we would use fewer gates to implement the chip, the
DE_MORGAN function will perform the transformation. The UNIQUE_INPUTS function
removes extra inputs to NAND and NOR gates. If a particular gate has more than one
connection to a signal, all but one of those connections are removed. Finally, the
MERGE function moves signals which connect to strings of NAND or NOR gates.
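The De Morgan transformation described above rests on the identity that a NAND equals an inverted NOR of the inverted inputs, and vice versa. The Python sketch below checks that identity and gives a simplified version of the cost bookkeeping: an inverter that must be added counts +1, and one that becomes removable counts -1. The function `de_morgan_cost` and its argument layout are assumptions made for illustration; the listing's routine also considers chip outputs and declared inverse signals, which are folded into the `inverse_available` flag here.

```python
def nand(*xs: bool) -> bool:
    return not all(xs)


def nor(*xs: bool) -> bool:
    return not any(xs)


def nand_via_nor(*xs: bool) -> bool:
    """NAND rebuilt from a NOR: invert every input, then invert the output."""
    return not nor(*(not x for x in xs))


def nor_via_nand(*xs: bool) -> bool:
    """NOR rebuilt from a NAND in the same way."""
    return not nand(*(not x for x in xs))


def de_morgan_cost(output_has_inverter: bool, output_fanout: int, inputs) -> int:
    """Net change in gate count if a gate is replaced by its dual.

    inputs is a list of (driven_by_inverter, inverse_available, fanout)
    tuples, one per input signal (a simplified, hypothetical encoding).
    """
    if not output_has_inverter:
        cost = 1        # a new output inverter is needed
    elif output_fanout == 1:
        cost = -1       # the existing output inverter cancels and disappears
    else:
        cost = 0
    for driven_by_inverter, inverse_available, fanout in inputs:
        if inverse_available:
            continue    # the inverted signal already exists
        if driven_by_inverter:
            cost += -1 if fanout == 1 else 0  # input inverter may be dropped
        else:
            cost += 1   # a new input inverter is needed
    return cost
```

A negative result corresponds to the case in the text where the dual implementation uses fewer gates.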
DEF I NE GET _INVERT !S: SI GNALJ.JI REl =SI GNAL_WI RE: 
BEGIN VAR T=SIGNAL_~lIRE:G=GATE; 
IF S.FROM.TYPE=INVERT THEN 
GIVING S.FROM.INPUTSCll 







EF OEFINED!S.VINVERT) THEN S.VJNVERT 
EF THERE_IS G.TYPE=INVERT FOR G SES.TO; THEN G.OUTPUT 
ELSE INVERTCS> FI 
ENO' 
ENDDEFN 
DEFINE REMOVE INVERTERS: 
BEGIN VAR G=GATE;5,T=SIGNAL_WIRE; 
FOR G SE CHIP.GATES;WITH G.TYPE=INYERT;WITH DEFINEO<G.OUTPUT>; DO 
S:=G.OUTPUT; 








DEFINE REMOVE REDUNDANCIES: 
BEGIN VAR G,Gl,G2=GATE;LIST=GATES;S,Sl,S2=SIGNAL_WIRE;I=INT; 
LI ST: =NIL; 
((FOR Gl SE CHIP.GATES;&& FOR I FROM 2 BY l;lWITH OEFINEO<Gl.OUTPUTl; 
! ! FOR G2 SE CHIP.GATESCI-J;WITH DEFINEDIG2.0UTPUT>;> 
WITH IF Gl.TYPE<>G2.TYPE THEN FALSE 
END 
ELSE AUlAYS THERE_! S Sl \EQ S2 FOR S2 SE G2. INPUTS; 
FOR Sl SE Gl.INPUT5~ & 
ALWAYS THERE_IS S2\EQ Sl FOR Sl SE Gl.INPUTS; 
FOR 52 SE G2.INPUTS; FI; DO 
S2:=G2.0UTPUT; 
Sl:=Gl.OUTPUT; 




LIST::= G2 <S; 
FUSE <Sl, 52!; 
DO ELIMINATE<GJ: FOR G SE LIST: 
END 
ENDDEFN 
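REMOVE_REDUNDANCIES treats two gates as redundant when they have the same type and the same set of input signals, and then fuses their outputs. A minimal Python sketch of that test is given below. It assumes the gates are listed in topological order and represents each gate as a `(type, inputs, output)` tuple; both are simplifications introduced here, and the returned `alias` map stands in for the FUSE step.

```python
def remove_redundancies(gates):
    """Drop gates that duplicate an earlier gate's type and input set.

    gates: list of (gate_type, input_names, output_name) tuples.
    Returns (kept_gates, alias), where alias maps each removed output
    to the surviving signal that now carries the same value.
    """
    seen = {}    # (type, frozenset of inputs) -> surviving output name
    alias = {}
    kept = []
    for gate_type, inputs, output in gates:
        # Rewire inputs that referred to an already fused signal.
        inputs = [alias.get(s, s) for s in inputs]
        key = (gate_type, frozenset(inputs))
        if key in seen:
            alias[output] = seen[key]   # fuse this output into the earlier one
        else:
            seen[key] = output
            kept.append((gate_type, inputs, output))
    return kept, alias
```

Using a set for the comparison matches the listing's two-way "every input of one appears in the other" check, so the order of a gate's inputs does not matter.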
DEFINE DE MORGAN<G:GATEl: 
BEGIN- VAR TYPE=GATE_TYPE;S,T=SlGNAL_wlRE;SWS=SlGNAL_WIRES;N=GATE; 





IF DEFINED<TYPEJ THEN 
Sl~S: =NIL; 
FORS SE G.INPUTS; DO 
UNLINK_INPUTCG,Sl: 
SWS::= S\GET_INVERT <S; 
ENO 




IF THERE_IS G.TYPE=INVERT FOR G SES.TO; 
THEN T:=G~OUTPUT; 











ELSE LINK_OUTPLJT(INVERT(N),S); FI FI 
DEFINE REMOVE_NANDS: 
BEGIN VAR G=GATE: 
CHIP.GATES:=REVERSEICHIP.GATESl; 





BEGIN VAR G=GATE; 
CHIP.GATES:=REVERSEICHIP.GATESJ; 





BEGIN VAR S-SIGNALJJIRE;N=GATE; 
IF G.TYPE=NAND ! G.TYPE=NOR THEN 
IF NEVER N.TYPE=INVERT FOR N SE G.OUTPUT.TO; THEN 1 
EF -(DEFINEDIG.OUTPUT.T0[2-Jl !G.OUTPUT.OUTPUTJ THEN -1 . ELSE 0 FI + 
+ IF DEFINEO(S.VINVERTJ THEN 0 
EF S.FROM.TYPE=INVERT THEN 
IF DEFINEDCS.T0[2-JJ !S.OUTPUT THEN 0 ELSE -1 FI 
EF THERE_IS N.TYPE=INVERT FOR N SES.TO; THEN 0 ELSE 1 FI 
FORS SE G.INPUTS; 




BEGIN VAR G=GATE: 
CHIP.GATES:=REVERSE(CHIP.GATESJ; 





IFS.OUTPUT THEN FALSE ELSE -DEFINEO<S.TOC2-J) FI 
ENDDEFN 
DEFINE UNIQUE_INPUTS: 
BEGIN VAR G=GATE;Sl,S2=SlGNAL_WIRE;I=INT;SWS=SlGNAL_WIRES; 
FOR G SE CHIP.GATES; DO 
SWS:=NIL; 
FOR Sl SE G.INPUTS;&& FOR I FROM 2 BY 1; DO 
IF THERE_IS S2\EQ Sl FOR S2 SE G.INPUTS[l-l; & 











BEGIN VAR LIST=GATES;G,H,I=GATE;S,T,U=SIGNAL_WIRE; LIST: =NIL; 
IFOR GS[ CHIP.GATES;WITH DEFINED<G.OUTPUTJ;WITH G.TYPE=NANO!G.TYPE=NOR;J ! ! <FORS SE G.INPUTS:WITH (l:=S.FROM;J.TYPE=INVERT; 
END 
W TH S \UN IOUE_DES TI NAT ION; 
WITH IH:=IT:=S.FROM~INPUTS[ll;l.FROM;J.TYPE=G.TYPE; WITH T \UN IOUE_DES TI NAT ION; l DO 








ELI MI NATE <Tl; 
ELI MI NATE £SJ: 
LIST::= IH;ll SS; 
DO ELIMINATEIGJ; FOR G SE LIST; 
ENO 
ENDDEFN 
The ANNOTATE function is used to label plots. All of the input and output signals of 
the chip have their names drawn on the plot. This function has the technology as a 
parameter, so that any technology's layout can be annotated. 
DEFINE ANNOTATE(C:CHIP T:TECHNOLOGYJ=MRG: 
BEGIN VAR M=MRG;LENGTH=INT;S=SIGNAL_WIRE;X=REAL;SCALE=PDINT; DO M:=COMPILEIC,Tl; 
LENGTH:= MAX LENGTHIS.NAMEJ FORS SE C\INPUTS;; X:=CWIOTH-30*LENGTH/7.-4; 
SCALE:=. 8,·, (<,·, T. WI RE_HE I GHfo> Ill -<1': T. WIRE_HE I GHT ic> (2) l 114. id ltlll ; GIVE IM; 
END 
ENDDEFN 
COLLECT S.NAME\SCALED_BY SCALE\AT XtlS.PHYSICAL.HEIGHT-2.5 \PAINTED VIOLET FORS SE C\INPUTS;; COLLECT S.NAME\SCALED_BY SCALE\AT 5#S.PHYSICAL.HEIGHT-2.5 \PAINTED VIOLET FORS SE C\OUTPUTS;l 
To allow the use of macro definitions, RLC allows the user to expand a previously
declared CHIP into the current chip. The user specifies the set of interconnections
via a set of SIGNAL_VALUEs, each of which states which signal of the current CHIP
connects to a port of the expanding CHIP. The inputs of the expanding CHIP may
be tied to TRUE or FALSE signals. The FIXED_HIGH and FIXED_LOW routines
eliminate these fixed value signals, and in doing so may eliminate the gates they
connect to. The EXPAND function takes a CHIP_INSTANCE, which states which CHIP
to expand and how to interconnect the signals, and adds the equations to the current
chip.
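The effect of tying an input to a fixed value can be illustrated with a small Python sketch. Given a gate type, its current number of inputs, and the constant applied to one input, the hypothetical function `tie_input` reports what the gate reduces to: a constant output (the case in which the listing eliminates the gate and propagates the fixed value onward), an inverter, or the same gate with one input fewer. The return encoding is an assumption made for this example.

```python
def tie_input(gate_type: str, n_inputs: int, value: bool):
    """Simplify a gate after one of its n_inputs inputs is tied to value.

    Returns ("CONST", output_value) when the gate's output becomes fixed,
    otherwise (new_gate_type, remaining_input_count).
    """
    if gate_type == "INVERT":
        return ("CONST", not value)
    # A low input forces a NAND high; a high input forces a NOR low.
    controlling = (gate_type == "NAND" and not value) or \
                  (gate_type == "NOR" and value)
    if controlling:
        return ("CONST", gate_type == "NAND")
    remaining = n_inputs - 1          # the tied input simply drops out
    if remaining == 0:
        # NAND of all-high inputs is low; NOR of all-low inputs is high.
        return ("CONST", gate_type == "NOR")
    if remaining == 1:
        return ("INVERT", 1)          # a one-input NAND or NOR is an inverter
    return (gate_type, remaining)
```

For example, tying one input of a two-input NAND high leaves an inverter, which is the type change performed in the FIXED_HIGH listing when one input remains.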




SIGNAL_VALUE= [NA!1E:QS FROM:POSSIBLE_SIGNALJ; 
SIGNAL_VALUES= f SIGNAL_VALUE l; 
CHIP_INSTANCE= [CHIP:CHIP NAME:QS VALUES:SIGNAL_VALUESl; 
DEFINE FIXEO_HIGH{S:SIGNAL_WIREJ: 
BEGIN VAR G=GATE;T=SIGNAL_WIRE;J=INT; 
DEFINE ZAPCG:GATEJ: 
T:=G.OUTPUT: 
UNLI NK_OUTPUT<G, Tl ; 
ELIMINATE <Gl; 
FI XEO_LDW <Tl ; 
ENDOEFN 
FOR G SE S. TO; OD 
UNLINK_INPUT<G,SJ; 
CASE G.TYPE OF 
INVERT: ZAPCGJ; 
NANO: J:=+l FORT SE G.INPUTS;; 
IF J=0 THEN ZAPIGJ; 
EF J=l THEN @CG).TYPE:=INVERT; FI 















FOR G SES.TO; 00 
UNLINK_INPUTCG,SJ; 
CASE G.TYPE OF 
INVERT: ZAPCGJ; 
NOR~ J:=+l FORT $E G.INPUTS;; 
IF J=0 THEN ZAP(GJ; 
EF J=l THEN @(GJ.TYPE:=INVERT; FI 








DEFINE COPY<C:CHIP N:OSl=CHIP: 
BEGIN VAR CHIP=CHIP;G,H=GATE;S, T=SIGNAL_WIRE; !=INT; 
DO DO @CGJ.INDEX:=I; FOR G SEC.GATES;&& FOR I FROM 1 BY 1; 
DO @CSJ.VHEIGHT:=I; FORS SEC.SIGNALS;&& FOR I FROM 1 BY l; 
CHIP:= [GATES: !COLLECT [TYPE:G. TYPEJ FOR G SE C.GATES; l 
SIGNAL_COUNT: C.SIGNAL_COUNT 
SIGNALS: {COLLECT [NAME: NUS. NAf1EJ FOR S SE C. SIGNALS; l 
NAME: C.NAME 
DESCRIPTION: C.DESCRIPTIONJ; 
FOR G SE CHIP.GATES;&& FOR H SE C.GATES; DO 
END 
IF OEFINEDCH.OUTPUT) THEN 
LINK_OUTPUTCG,CHIP.SIGNALSfH.OUTPUT.VHEIGHTJ); FI 
DO LINK_INPUTCG,CHIP.SIGNALS(S.VHEIGHTJ); FORS SE H.INPUTS; 
DD IF DEFINED<S.VINVERT> 
THEN @(Tl.VINYERT:=CHIP.SIGNALSfS.VINVERT.VHEIGHTJ; FI 










CHIP.GATES:=REFRESHCCHIP.GATES SS a.GATES}; 
CHIP.SIGNALS:=REFRESH{CHIP.SIGNALS SS a.SIGNALS>; 
FOR SY SEC.VALUES; DO 
END 
N:=C.NAME $$SY.NAME; 
S:=IF THERE_IS S.NAME=N FORS SE a.SIGNALS; THENS ELSE NIL FI; 
PS:=SV.FROM; 
CASE PS OF 
VAR: FUSE CPS, SJ; 
FIXED: IF PS THEN HIGH ELSE LOW FI ::= S cS; 
ENDCASE 
FORS SE HIGH; DO FIXED_HIGH{S); ENO 
FORS SE LOW; DO FIXED_LOWCS); ENO 
END 
ENDDEFN 
When the chip expanders and chip optimizers have been used upon a chip, the logic 
equations of the chip are changed, although the function of the chip has remained 
constant. To allow the user to see what the new logic equations are, the UNPARSE 
function is used. This function displays the logic of the chip in the same format as 










l~RITE('OEFINE 'SSCHIP.NAMEU' ('J; 
IF THERE_IS SW.INPUT FOR SW SEC.SIGNALS; THEN 
WRITEl'INPUTS:'J; 




ELSE B:=FALSE; FI 
IF THERE_IS SW.OUTPUT FOR SW SE C.SIGNALS; THEN 
IF B THEN WRITE!' '); FI 
WRITEl'OUTPUTS:'J; 




IF THERE_IS SW\LOCAL FOR SW SE C.SIGNALS; THEN 
IF B THEN WRITE!' 'J; FI 
WRITEl'LOCALS:'J; 
FOR SW SE C. SIGNALS; WITH SW\LOCAL; OTHER_DO WRITE <' , ' ) ; ; 
00 WRITECSW.NAMEJ; . 
ENO FI 
WRITEC'J:'J;CRLF; 
FOR SW SEC.SIGNALS; WITH SW.OUTPUT SW\LOCAL; DO UNPARSECSW); END 
WRITEC'ENDDEFN'J;CRLF; 
DEFINE UNPARSElSW:SIGNAL_WIREJ: 




DEFINE UNPARSE(SW:SIGNAL_WIRE B:BOOLJ: 
BEGIN VAR S=SIGNAL_WIRE;G=GATE; 
END 
ENDDEFN 
IF -8 & ISW.INPUT!LDCALISWJ !SW.OUTPUT) THEN WRITE<SW.NAMEJ; 
ELSE G:=SW.FROM; 
CASE G. TYPE OF 
INVERT: WRITEl'-'J;UNPARSECG.INPUTSClJ,FALSEJ; 
NANO: IF -8 THEN WRITE<' ('J; FI 
FDR SSE G.INPUTS;OTHER_DO WRITE!' & ');; 
DO UNPARSE<S,FALSEJ; 
END 
IF -B THEN WRITE(')'); FI 
NOR: IF -B THEN WRITE(' ('J; FI 
FOR S SE G. INPUTS; OTHER_DO WRITE (' 'J;; 
DO UNPARSECS,FALSEl; 
ENO 
IF -8 THEN WRITE(')'); FI 
ENDCASE FI 
DEFINE UNPARSE: UNPARSE<CHIP>; ENDDEFN 
In conjunction with the unparsing of logic equations, the user might like a quick 
summary of the size of the chip. This allows the user to judge the usefulness of 
various optimizations which can be applied to the chip. The STATS function will 
list the area of the chip, the number of gates, and the number of wiring channels, as 
a function of the technology and the current chip. 
DEFINE STATS<T:TECHNOLOGY}: 
BEGIN VAR G=GATE;S=SIGNAL_WIRE; 
CRLF; 
WRITE('Technology:'I; 





WRITEC'Number of gates: '); 
WRITE<+l FOR G SE CHIP.GATES;J; 
TAB; 
WRITEC'N~mber of channels:'); 




As mentioned above in the technology definition, we have routines to pack the
interconnection wires. The packing routines attempt to have wires share channels,
so that the number of channels (and the size of the chip) is minimized. There are
two packers presented here. The first, NMOS_PACK_1, does not 'know' about the
internals of a cell. It assumes that every wire which connects to a cell consumes
the channel for the entire width of the cell. This packer is more general for new
technologies. The second packer, NMOS_PACK_2, knows enough about the internals
of the cells to allow the output wire to share a channel with one of the input wires,
under certain circumstances. Since this packer knows about the implementation of
cells, it is not as general as the first packer, but it does a better job of packing the
wires for the currently defined technologies.
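The idea of letting wires share a channel can be sketched with a greedy left-to-right packing of horizontal intervals. The Python below is a simplified illustration, not a transcription of either packer: each wire is reduced to a `(left, right)` extent in gate-index units, and a channel is filled by repeatedly taking the next wire that starts strictly to the right of the last wire placed in it. The function name and data layout are assumptions.

```python
def pack_wires(wires):
    """Assign each wire to a channel so wires in a channel never overlap.

    wires: dict mapping wire name -> (left, right) extent.
    Returns dict mapping wire name -> channel number (1-based).
    """
    remaining = sorted(wires.items(), key=lambda item: item[1][0])
    channel_of = {}
    channel = 0
    while remaining:
        channel += 1
        edge = float("-inf")          # right end of the last wire placed
        leftover = []
        for name, (left, right) in remaining:
            if left > edge:           # fits after the previous wire
                channel_of[name] = channel
                edge = right
            else:
                leftover.append((name, (left, right)))
        remaining = leftover          # try these again in the next channel
    return channel_of
```

With the extents `a=(0,3)`, `c=(2,5)`, `b=(4,6)`, `d=(6,9)`, wires `a` and `b` share the first channel and `c` and `d` share the second, so four wires need only two channels.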
DEFINE SORTISWS:SIGNAL_WIRESl=SIGNAL_WIRES: 
BEGIN VAR OUT=SIGNAL_l.JIRES;W=SIGNAL_WIRE; I ,J,K=INT; 
DO OUT:=NIL; 
WHILE DEFINEDCSWSI; DO 
I: =-1; 
FOR W SE SWS;&& FOR J FROM 1 BY l; DO 












BEGIN VAR Sl-JS=S I GNAL_~I I RES; H= I NT; G=GATE; S=S I GNAL_~J I RE; 
DEFINE DRAW_WIRE!LEFT:INTJ: 
BEG! N VAR W=SI GNAL_l~IRE: I= I NT; 




@(~JJ. VHEIGHT: =H; 
DRAW_WIRECW.VRIGHTJ; FI 
FOR G SEC.GATES;&& FOR H FROM 1 BY 1;00 @(GJ.INDEX:=H; END 
FOR S SE C.SIGNALS; DO 
@ISJ.VLEFT:= IFS.INPUT THEN 0 
EF DEF I NED CS. TOJ 
THEN S.FROM.INOEX MIN MING.INDEX FOR G SES.TO; 
ELSE S.FROM.INDEX FI; 
@(SJ.VRIGHT:= IFS.OUTPUT THEN 999999 
ELSE S.FROM.INOEX MAX MAX G.INDEX FOR G SES. TO; 
END 
SWS:=C.SIGNALS\SORT; 




1 BY l; DO DRAW_WIREl-ll; END 
BEGIN VAR Sl.JS=S I GNAL_~l I RES; H=I NT; G=GATE; S=SI GNAL_wI RE; 
DEFINE DRAl.J_l,JJRE !LEFT: INTJ: 
BEGIN VAR l,J=SIGNAL_WIRE; !=INT; 
FI; 






FOR G SEC.GATES;&& FOR H FROM 1 BY 2;DO @CGJ.INDEX:=H; END FOR S SE C.SIGNALS; DO 
@CSJ.VLEFT:= IFS.INPUT THEN 0 
EF DEFINEDCS.TOJ 
THEN S.FROM.!NDEX+l MIN MING.INDEX FOR G SES.TO; 
ELSE S.FROM.INDEX+l Fl; 
@CSJ.VRIGHT:= IFS.OUTPUT THEN 999999 
ELSE S.FROM.INDEX+l MAX MAX G.INOEX FOR G SES.TO; FI; END 
SWS:=C.SIGNALS\SORT; 
WHILE DEFINEOISWSJ;&& FOR H FROM 1 BY l; DO DRAW_WIRE(-ll; END 
ENO 
ENDOEFN 
In addition to the packers, we have sorters. The sorters may reorder the gates in
attempts to minimize wire lengths or minimize the number of wiring channels.
The first 'sorter', NO_SORT, does nothing. The SMALL_SORT routine rebuilds the chip
from left to right, each time adding the gate which will add the fewest wiring
channels. This is a local optimization, which means that it will not necessarily
(and, in fact, rarely) produce the smallest chip. The RELAXATION_SORT is an
iterative routine. Each time it is executed, it 'averages' each gate's position. For
each gate, the routine averages the indexes of the gate's input and output gates. It
then sorts the gates by these averages. Presumably, if this routine is executed a
few times, gates will tend to be near the gates they connect to.
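One pass of the relaxation idea can be sketched in Python as follows. Each gate's new sort key is the average of the positions of the gates driving its inputs and the gates loading its output; a chip input counts as position 0 and a chip output as position N, one past the last gate, as in the listing. The function name and the dictionaries used to describe connectivity are assumptions made for this example.

```python
def relaxation_pass(order, drivers, loads, chip_outputs):
    """Reorder gates by the average position of their neighbours.

    order:        list of gate names in their current left-to-right order
    drivers[g]:   for each input of g, the driving gate, or None for a chip input
    loads[g]:     gates fed by g's output
    chip_outputs: set of gates whose output is a chip output
    """
    n = len(order) + 1
    index = {gate: i for i, gate in enumerate(order, start=1)}

    def average(gate):
        terms = [0 if d is None else index[d] for d in drivers[gate]]
        terms += [index[load] for load in loads[gate]]
        if gate in chip_outputs:
            terms.append(n)           # pull output gates toward the right edge
        return sum(terms) / len(terms)

    # sorted() is stable, so gates with equal averages keep their order.
    return sorted(order, key=average)
```

For a three-gate chain `a -> b -> c` initially placed as `c, a, b`, a single pass restores the order `a, b, c`; repeated passes may be needed on larger circuits.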
DEFINE NO_SORT(C:CHIP): NOTHING; ENDDEFN
DEFINE SMALL_SDRT!C:CHIPJ: 
BEGIN DEFINE ACTI YES !SL-IS: SI GNAL_WIRES GS: GATES> =SI GNAL_WIRES: 
BEGIN VAR G=GATE;S,T=SIGNAL_WIRE; 
!COLLECT S FDR S SE SWS;WITH 
IF S.OUTPUT THEN TRUE 
ELSE THERE_I5 <G.OUTPUT\EQ S ~ 
ENO 
ENOOEFN 
THERE_I5 T\EQ 5 FORT SE G.INPUTS;l 
FDR G SE GS; FI;l 
DEFINE UNIQUE!Sl,S2:5IGNAL WIRE5l=51GNAL WIRES: 
BEGIN VAR A,B=51GNAL}l!RE; -





IF NEVER A\EQ B FOR 8 SE S2; THEN 52::= A <S; FI 
VAR ACTIVE,Ll,L2=SIGNAL_WIRE5;DLD,NEW=GATES;Sl,S2=SIGNAL_L-lIRE; G,Gl=GATE;l,J,K,L=INT; 
OLD: =C. GA TES: 
NEW:=NIL; 
ACTIVE:=ICOLLECT Sl FOR Sl SE CHIP.SIGNALS; WITH Sl.INPUT;l; 
WHILE DEFINED<OLDl: DO 
I:=999999; 
FOR G SE OLD;&& FOR J FROM 1 BY l; DO 
L2:=ACTIVES!UNIQUE!G.OUTPUT<SG.INPUTS,ACTIVE), 
END 
{COLLECT Gl <FDR Gl SE OLD;&& FORK FROM 1 BY l;l 
WITH K<>J;ll; 
K:=+l FOR Sl SE L2;; 
IF K<I THEN 
I: =K: 
L:=J; 
L1: =L2; FI 








BEGIN VAR DLO,NEW=GATES;G,H=GATE;5=51GNAL_WIRE;I,N=INT;R=REAL; OLD:=C.GATES; 
N:= +1 FOR G SE OLD;+l: 
FOR G SE OLD;&& FOR I FROM l BY l; 00 @{Gl.INDEX:=l; END 
FOR G SE OLD:DO @IGl.RINDEX:= 
I+ IFS.INPUT THEN 0 ELSE S.FROM.INDEX FI FORS SE G.INPUTS; + 
IF G.OUTPUT.OUTPUT THEN N ELSE 0 FI + 
+ H.INOEX FOR H SE G.OUTPUT.TO;I/ 
1+1 FORS SE G.INPUTS; + +1 FOR H SE G.OUTPUT.TO; + 
IF G.OUTPUT.OUTPUT THEN 1 ELSE 0 FIJ; END 
NHI: =NI Li 
WHILE DEFINED<OLDJ; DO 
R: =-1.; 











Next, we have the parser. The parser accepts a series of function definitions and 
generates a CHIP for each function. The following input is an example of the 
parser's input. 
DEFINE DFLOP(INPUTS:DATA,CLOCK,RESET,SET OUTPUTS:OUT,BAR LOCALS:X1,X2,X3):
X3 = X2 & RESET & DATA
X2 = X1 & CLOCK & X3
X1 = RESET & CLOCK & ( X3 & SET & X1 )
BAR = OUT & X2 & RESET
OUT = BAR & X1 & SET
OUT--.-BAR 
ENDDEFN
DEFINE EQ(INPUTS:A,B,CIN OUTPUTS:COUT):
COUT = (A & -B)*(-A & B)*CIN
ENDDEFN
DEFINE GE(INPUTS:A,B,CIN OUTPUTS:COUT):
COUT = (-A & B)*CIN
ENDDEFN
DEFINE COUNTER(INPUTS:RESET,EI,CLOCK OUTPUTS:OUT,BAR,EO):
<DFLOP(DATA:BAR SET:.TRUE.






<DFLOP(DATA:DATA SET:.TRUE. RESET:.TRUE. CLOCK:SHIFT OUT:VALUE)>
<DFLOP(DATA:VALUE SET:.TRUE. RESET:.TRUE.
CLOCK:(-LOAD_VAL!-SHIFT) OUT:VAL)>
<DFLOP(DATA:VALUE SET:.TRUE. RESET:.TRUE.
CLOCK:(-LOAD_ADR!-SHIFT) OUT:ADR)>
<COUNTER(RESET:SYNC EI:CNTI CLOCK:SHIFT OUT:COUNT EO:CNTO)>
<GE(A:COUNT B:VAL CIN:ONI COUT:ONO)>
<EQ(A:ADR B:VALUE CIN:EQI COUT:EQO)>
ENDDEFN
This input will produce five CHIPs in the virtual memory. The final two CHIPs 
have expansions of the previously defined CHIPs. This parser will accept characters 
from a character string, a data file, or from the terminal. There is a file INCLUDE 
feature which uses the ICL metalanguage syntax: /*READ file;*/. 
DEFINE PROOUCERISC:SC>=CHAR_PROOUCER: 
II ISC;l IF OEFINEDISCJ THEN GIVING SCUJ DO SC:=SC(2-J: END ELSE THE_CHARl0l FI \\ 
ENODEFN 
DEFINE FILE_PROOUCERIFILE:FILE_SCl=CHAR_PROOUCER: 




BEGIN VAR C=CHAR; 
DO C:=F\INPUT; 





PRODUCERS= I CHAR_PROOUCER I; 
PUSHED_SC=SC; 
TOKEN=QS;-














DO IJHILE DO VERIFYl!'OEFINE'; !THE_CHAR1261Jl,'Definition'); 
GIVE TOKEN='DEFINE'; 




BEGIN VAR C=CHAR; 
IF OEFINEO!PUSHED_SC} THEN GIVING PUSHED_SCClJ 
00 PUSHED_SC:=PUSHEO_SCl2-J; END 
EF -DEFINED!PRODUCER) THEN THE_CHAR<26l 
ELSE DO C: =<~·,PRODUCER»:> \UPPERCASE; 
ENO 
ENOOEFN 




GIVE C FI 
DEFINE GET_A_CHAR2=CHAR: 
BEGIN VAR C=CHAR; 





WHILE GET_A_CHARl<>'"'; DO NOTHING; END 
C:=GET_A_CHARl; FI 
DEFINE GET A CHAR=CHAR: 
BEGIN - VAR C~CHAR; 






IF C='*' THEN [:=METALANGUAGE; 
ELSE PUSHED_SC:=IC!; 
C:='/': FI FI 
DEFINE IS_BLANK(C:CHAR)=BOOL: C\IN_SET {SPACE;TAB;CR;LF} ENDDEFN
DEFINE IS_ID_CHAR(C:CHAR)=BOOL: LETTER(C) ! DIGIT(C) ! (C='_') ENDDEFN
DEFINE GET_TOKEN=OS: 
BEGIN VAR C=CHAR;SC=SC; 
DO WHILE <C:=GET A CHAR:l\IS BLANK: DO NOTHING; ENO 
IF C\IS_ID_CHAR-THEN -
SC:= IC!; 
~JHILE CC:=GET_A_CHAR;l\IS_ID_CHAR; DO SC::= C <I; END 
PUSHED_SC::= C <I; 
SC:=REVERSE(SCl; 











DEFINE VERIFY(SQS:SQS B:QS):
BEGIN VAR QS,TOKEN=QS;
TOKEN:=GET_TOKEN;
IF NEVER QS=TOKEN FOR QS $E SQS; THEN ERROR(B); FI
END
ENDDEFN
DEFINE VERIFY(Q:QS): VERIFY({Q},Q); ENDDEFN
DEFINE CHECK_TOKEN(SQS:SQS)=BOOL:
BEGIN VAR QS,TOKEN=QS;
DO TOKEN:=GET_TOKEN;
GIVE IF THERE_IS QS=TOKEN FOR QS $E SQS; THEN TRUE
ELSE DO PUSHED_SC::= TOKEN $$; GIVE FALSE FI
END
ENDDEFN
DEFINE CHECK_TOKEN(Q:QS)=BOOL: CHECK_TOKEN({Q})
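The token routines above rely on one-token lookahead: CHECK_TOKEN reads a token and, if it is not one of the expected values, pushes it back so that the next read sees it again. A minimal Python sketch of that pattern follows. The `Tokenizer` class is an illustration only; it pushes back whole tokens, whereas the listing pushes the token's characters back onto PUSHED_SC, and it returns `None` at end of input instead of an end-of-file character.

```python
class Tokenizer:
    """Splits text into identifiers and single-character symbols."""

    def __init__(self, text: str):
        self.chars = list(text.upper())  # input is folded to upper case
        self.pushed = []                 # tokens waiting to be re-read

    def get_token(self):
        if self.pushed:
            return self.pushed.pop()
        while self.chars and self.chars[0].isspace():
            self.chars.pop(0)            # skip blanks
        if not self.chars:
            return None                  # end of input
        c = self.chars.pop(0)
        if c.isalnum() or c == "_":
            word = c
            while self.chars and (self.chars[0].isalnum() or self.chars[0] == "_"):
                word += self.chars.pop(0)
            return word
        return c                         # any other character is its own token

    def check_token(self, *expected) -> bool:
        """Consume the next token only if it is one of the expected values."""
        token = self.get_token()
        if token in expected:
            return True
        if token is not None:
            self.pushed.append(token)    # put it back for the next reader
        return False
```

A parser can then write `if t.check_token("&"): ...` to test for an optional operator without losing the token when the test fails.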
DEFINE METALANGUAGE=CHAR: 
BEGIN VAR SC=SC;C=CHAR; 
DEFINE FILE_DOES_NOT_EXIST(SC:SCJ=SC: 
DO CALF; 
WRITE!'Fi le ·ssscss 
ENDDEFN 





DEF I NE r·1E TA HELP: 
CRLF; 





SC: =GET TOKEN; 
IF CHECK_TOKEN('.'} THEN SC::= SS '.'<SGET_TOKEN; 
ELSE SC::= SS '.RLC'; FI 
IF GET_A_CHAR2<>';' THEN METAHELP; FI 
IF GET _A_CHAR2 <>' ,., ' THEN MET AHELP; FI 
IF GET _A_CHAR2 <>' I' THEN f1E T AHELP; FI 
LJHILE IF DEFINEO(SCJ THEN -EXISTS<FILE_SC::SC> ELSE FALSE FI; 
DO SC: :. =\FI LE_OOES_NOT _EX I ST; ENO 
IF DEFINEO!SCJ THEN 














NAi1E {SS NEST FOR NEST SE REVERSE <NESTING_LEVELS); SS NAf1E); 
NESTING_LEVELS:: = <NM1Etb' _' l <S; 
GET_HEADER: 
WHILE GET_TOKEN<>'ENDDEFN'; DO 
IF TOKEN='DEFJNE' THEN GET_DEFINITION; 
EF TOKEN=!THE_CHAR{2Gll THEN 
CRLF; 
WRITE<'End of file encountered inside DEFINE'); 
CRLF: 
HELP; 
EF TOKEN='<' THEN GET_CALL: 












BEGIN VAR GROUP,SIG=QS;SQS=SQS; 
VER I FY {' (' J ; 
WHILE 00 VERIFY( l'INPUTS';'OUTPUTS';'LOCALS';'J'I ,'Signal type'!; 
GIVE TOKEN<>')'; 
DO GROUP:=TOKEN; 
VER I FY (' : ' } ; 
END 
SOS:=ICOLLECT GET_TOKEN UNTIL -CHECK_TOKEN<','l;l; 
IF GROUP=' INPUTS' THEN INS::= SOS SS; 
INPUTS <SOS l : 
EF GROUP='OUTPUTS' THEN OUTS::= SOS SS; 
OUTPUTS {SQSl; 
ELSE LOCALS::= SOS SS; FI 




IF CHECK_TOKENC'. ') THEN 
DO YERIFY(!'TRUE';'FALSE'l,'.TRUE. or .FALSE.'); 
GIVE GIVING TOKEN='TRUE' 
DO VERIFY<'. 'l; END 
ELSE GET_RHSl FI 
ENDDEFN 
DEFINE GET_CALL: 
BEGIN VAR NAf1E, NEST, SI G=OS; C=CHI P; SV=SI GNAL_VALUES; 
S=SIGNAL_WIRE;I=INT; 
IF CHECK_TOKEN<'@'l THEN 
NAME:=GET_TOKEN: 
ELSE NAl1E: =GET_ TOKEN; 
IF DEFINED<NESTING_LEVELSJ THEN 
IF THERE_IS 
SS NEST FOR NEST SE REVERSE(NESTING_LEVELSCI-Jl; 
SS NAME \VM EXISTS AS 'DCHIP 1/2/81' 
FOR I FROM 1-TO l++l FOR NEST SE NESTING_LEYELS;; 
THEN NAME::= SS NEST 
FOR NEST SE REVERSE<NESTING_LEVELS[J-JJ;SS; FI FI FI 
IF NAME\Yt1_EXISTS_AS 'DCHIP 112/81' THEN 
CALL_NUl"lBER: : =+l; 
C: =GET <NAME>; 
SV:=NIL; 
VER I FY (' (' } : 
WHILE <SIG:=GET_TOKEN;}<>'l'; DO 
IF THERE_IS S.NAME-SIG FORS SE C.SIGNALS;WITH S.INPUT!S.OUTPUT; THEN VERIFY(':'!; 




WRITE<'Chip 'SSNAMESS' does not have a port named 'SSSIGJ; CALF; 
HELP; FI 
EXPAND<CCHIP:C NAME:' .. 'SSSC!CALL_NUMBERJ VALUES:SVJ); ELSE CALF; 
END 
ENDDEFN 




BEGIN VAR 0=05; 
IF THERE_IS 0=05 FOR Q SE OUTSSSLOCALS; THEN 
VER I FY ( {' =' ; '"''I , 'Equation' J : 
IF TOKEN='=' THEN FUSE(QS,GET_RHSlJ; ELSE VINVERT<OS,GET_TOKEN}; FI ELSE CALF: 
END 
ENODEFN 




BEGIN VAR S=S I GNAL_l.J I RE; 
DO S: =GET _RHS2; 




DEFINE GET_RHS2=SIGNAL_WIRE:
BEGIN VAR SWS=SIGNAL_WIRES;
DO SWS:={GET_RHS3};
WHILE CHECK_TOKEN('!'); DO SWS::= GET_RHS3 <$; END
GIVE IF DEFINED(SWS[2-]) THEN NOR(SWS) ELSE SWS[1] FI
END
ENDDEFN
DEFINE GET_RHS3=SIGNAL_WIRE:
BEGIN VAR SWS=SIGNAL_WIRES;
DO SWS:={GET_RHS4};
WHILE CHECK_TOKEN('+'); DO SWS::= GET_RHS4 <$; END
GIVE IF DEFINED(SWS[2-]) THEN OR(SWS) ELSE SWS[1] FI
END
ENDDEFN
DEFINE GET_RHS4=SIGNAL_WIRE:
BEGIN VAR SWS=SIGNAL_WIRES;
DO SWS:={GET_RHS5};
WHILE CHECK_TOKEN('&'); DO SWS::= GET_RHS5 <$; END
GIVE IF DEFINED(SWS[2-]) THEN NAND(SWS) ELSE SWS[1] FI
END
ENDDEFN
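The GET_RHS routines form a recursive-descent parser with one routine per precedence level: `!` (NOR) binds loosest, then `+` (OR), then `&` (NAND), with unary `-` (INVERT) and parentheses at the bottom. Each level collects all operands joined by its operator into a single multi-input gate. The Python sketch below applies the same layering to evaluate an expression directly on boolean values. It assumes that `*` denotes AND at the level between `&` and `-`, which is consistent with the EQ example but is not visible in the surviving listing, and it omits the IF/THEN/ELSE form.

```python
import re


def evaluate(text: str, env: dict) -> bool:
    """Evaluate an RLC-style logic expression over the signal values in env."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z_0-9]*|[!+&*()\-]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def chain(op, operand, combine):
        # Collect operand (op operand)* and build one multi-input gate.
        values = [operand()]
        while peek() == op:
            take()
            values.append(operand())
        return combine(values) if len(values) > 1 else values[0]

    def nor_level():
        return chain("!", or_level, lambda v: not any(v))

    def or_level():
        return chain("+", nand_level, any)

    def nand_level():
        return chain("&", and_level, lambda v: not all(v))

    def and_level():
        return chain("*", unary, all)   # assumption: '*' is AND

    def unary():
        if peek() == "-":
            take()
            return not primary()
        return primary()

    def primary():
        token = take()
        if token == "(":
            value = nor_level()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return value
        return bool(env[token])

    result = nor_level()
    if pos != len(tokens):
        raise SyntaxError("unexpected token " + tokens[pos])
    return result
```

Note that `A & B & C` is treated as one three-input NAND rather than as nested two-input NANDs, mirroring the way each GET_RHS level gathers its operands into a list before creating the gate.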
DEFINE GET_RHS5=SIGNAL_WIRE:
BEGIN VAR SWS=SIGNAL_WIRES;
DO SWS:={GET_RHS6};
WHILE CHECK_TOKEN('*'); DO SWS::= GET_RHS6 <$; END




IF CHECK_TOKENC'-'} THEN INVERT!GET_RHS7} ELSE GET_RHS7 FI 
ENDOEFN 
DEFINE GET RHS7=SIGNAL WIRE: 
BEGIN - VAR a.x-os~S.IF=SIGNAL_WIRE;ALL,IFS=SIGNAL_WIRES; 
DEF I NE POSSIBLE (St.JS: SI GNAL_~l 1 RES) : 
. BEGIN VAR P =POSS IBLE_S I GNAL; 
P:,,,,GET_POSSIBLE; 
CASE P OF 
FIXED: IF P THEN ALL::= NAND<SWSl <S; FI 




DO IF CHECK_TOKEN!' (') THEN 
S: =GET _RHSl: 
VER I FY {' ) ' ) : 
EF CHECK_TOKEN{'JF'l THEN 
IF: =GET _RHSl; 
ALL:=NIL; 
VERIFY ('THEN' l; 
POSSIBLE ((!Fl l; 
IFS:=IGET_INVERT!IFll; 




VER I FY C' THEN' } ; 
POSSIBLE!IFSS>IFl; 
IFS::= GET_INVERT!lF> <S; 
POSS IBLE <IFS l ; 
VER I FY ( ' F I ' l : 





IF THERE_IS X=Q FOR X SE INSUOUTSHLOCALS; THEN S:=O: 
ELSE CRLF: 
I-JR I TE ('There i s no s i gna I named 'UQ) ; 
CRLF; 
HELP; FI FI 
There is a tau-model simulator built into RLC. The MAKE_SIMULATOR function
will take the current CHIP and construct a SIM_CHIP, which is the simulator
representation of the chip. The user then defines the pulse trains which drive the
input wires, using the CLOCK and WAVEFORM functions. Following this, the
RUN(time) function is called, which actually runs the simulation from t=0 to
t=time. RUN will initialize all of the nodes in the circuit. In some cases, such as
cross-coupled circuits, RUN will ask the user whether a node should be initialized
high or low. Once the simulation is complete, the user may plot waveforms of any
of the nodes using the PLOT functions. The simulator saves the waveforms of each
node so that many plots can be generated from a single simulation run.
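The initialization step can be illustrated with a small Python sketch that propagates known values through the gates until nothing more can be decided. A NAND output is known high as soon as any input is known low, and known low only when every input is known high; NOR is the dual. Nodes that remain undecided, as in a cross-coupled latch with both inputs inactive, are the ones about which the user would have to be asked. The function name and the `(type, inputs, output)` gate tuples are assumptions made for this example.

```python
def initialize(gates, known):
    """Propagate known node values; report nodes that stay undetermined.

    gates: list of (gate_type, input_names, output_name) tuples.
    known: dict of node name -> bool for the nodes set by the user.
    Returns (values, unresolved_output_names).
    """
    values = dict(known)
    changed = True
    while changed:
        changed = False
        for gate_type, inputs, output in gates:
            if output in values:
                continue
            ins = [values.get(name) for name in inputs]  # None means unknown
            result = None
            if gate_type == "INVERT":
                if ins[0] is not None:
                    result = not ins[0]
            elif gate_type == "NAND":
                if any(v is False for v in ins):
                    result = True
                elif all(v is True for v in ins):
                    result = False
            elif gate_type == "NOR":
                if any(v is True for v in ins):
                    result = False
                elif all(v is False for v in ins):
                    result = True
            if result is not None:
                values[output] = result
                changed = True
    unresolved = sorted({g[2] for g in gates} - set(values))
    return values, unresolved
```

For two cross-coupled NAND gates with both external inputs high, neither output can be derived and both are reported as unresolved; driving one external input low resolves the pair.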
TYPE SIM_GATE= UNPUTS:Sil1_lJIRES 
OUTPUT: SI ll_IJ I RE 
TYPE: GATE_ TYPE 
GATE:GATEJ; 
SIM_GATES= I Slll_GATE l; 
s I M_W IRE= rNM1E: as 
FROl1: Slll_GATE 





SIM_WIRES= I SlM_WlRE l; 
SIM_CHIP= [WIRES:Slt1_WIRE5 GATE5:SIM_GATE5J; 
VAR SIM_CHlP=SlM_CHlP; 
DEF l NE MAKE~S I ~!ULA TOR: 
BEGIN VAR G=GATE; I= l NT; S, T =5 I GNAL_~II RE; 0=5 l M_GATE; 
DO mfGJ.INOEX:=l ; FOR G SE CHIP.GATES;&& FOR I FROM 1 BY l; 
SIM_CHIP: =[GATES: !COLLECT CTYPE:G. TYPE GATE:GJ FOR G SE CHIP.GATES; l J; 
DO @<Sl.VHEIGHT:=l; FORS SE CHIP.SIGNALS;&& FOR I FROM 1 BY l; 
SIM_CHIP.WIRES:=ICOLLECT 
lNM1E: S. NAME 
FROl1: SI M_CH IP. GA TES CS. FROll. I NOEXJ 
TO: !COLLECT Slf1_CHIP.GATE5[G.INDEXJ FOR G SE S.TO;l 
WIRE:S 
SET:FALSE 
TAU: +CASE G.TYPE OF 
NOR: 1 
INVERT: 1 
NANO: +1 FORT SE G.INPUTS; 
ENOCASE FOR G SES. TO;J 





FOR Q SE SIM_CHIP.GATES; 
END 
ENDDEFN 




EVENTS= I EVENT l ; 
TIME_SLOT= [TJME:REAL EVENTS:EVENTSJ; 





INVERT _SIMULATION= GATE_S I f1ULAT I ON; 
ABORT_SIMULATION= BOOL; 
DEFINE CLEAR_SlllULATION: 
BEGIN VAR S=SIM_WIRE; 
T lt1E_Ll NE: =NIL; 
TIME:=0: 
DO @1SJ.TRACE:=Nll; 




BEGIN VAR G=SIM_GATE; 
FOR G SE GS; DO 
CASE G.TYPE OF 
INVERT: <,·:INVERT_SIMULATION..-,> (Gl: 
NOR: <-::NOR_Sl\1ULATION,·,> (Gl: 






CASE E OF 
WIRE: @(EJ.VALUE:=E.NEW; 
@{El.TRACE::= TIME# IF E.VALUE THEN 1 ELSE 0 FI <I; 





DEFINE SIMULATE<T:TIME SLOTJ: 
BEGIN VAR E=EVENT: 
Tl ME: = T. T IME: 










DO Til1E_LINE:=TIME_LINEC2-J; END>; 
DEFINE EQ<A,B:SIM_WIRE>=BOOL: MACR0-10C'LSPEQ$'} 
DEFINE EQ(A,B:EVENTl=BOOL: 
CASE A OF 
WIRE: CASE B OF 






DEFINE HOLD_UNTIL<E:EVENT R:REAL>: 
BEGIN VAR TS=Til1E_SLOT;I=INT;V=EVENT; 
I: =0; 
IF OEFINED<TIME_LINEJ THEN 
FOR TS SE Tl 11E_LI NE; WI TH TS. T H1E-EPS I LON=<R; && FOR I FROI! 1 BY 1; DO 
IF TS.TIME\IS_CLOSE_TO R THEN 
END 
IF NEVER E\EQ V FOR V SE TS.EVENTS; THEN 
@(TSJ.EVENTS::= E <$; FI 
I:=..:.l; FI 
IF 1>0 THEN TIME_LINEU+l-J:=[Tlt1E:R EVENTS: !Ell 
<S TIME_LINEU+l-J; 
EF I=0 THEN TIME_LINE::= [TIME:R EVENTS: !EJJ <S; FI 
ELSE TIME_LINE:= I [T!ME:R EVENTS: !El JJ; FI 
END. 
ENDOEFN 
DEFINE HOLD(E:EVENT R:REAL): HOLD_UNTIL(E,TIME+R); ENDDEFN
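HOLD_UNTIL keeps the simulator's time line as a list of time slots in increasing time order. An event scheduled for a time that already has a slot is added to that slot unless it is already present, and otherwise a new slot is inserted at the proper position. The Python sketch below shows one way to maintain such a list with the standard `bisect` module. The `TimeLine` class, the `EPSILON` tolerance value, and the use of plain strings as events are assumptions made for illustration.

```python
import bisect

EPSILON = 1e-9  # assumed tolerance for treating two times as equal


class TimeLine:
    def __init__(self):
        self.times = []    # sorted slot times
        self.events = []   # events[i] holds the events for times[i]
        self.now = 0.0

    def hold_until(self, event, when: float) -> None:
        """Schedule event at absolute time when."""
        i = bisect.bisect_left(self.times, when - EPSILON)
        if i < len(self.times) and abs(self.times[i] - when) <= EPSILON:
            if event not in self.events[i]:   # do not schedule duplicates
                self.events[i].append(event)
        else:
            self.times.insert(i, when)        # open a new slot in order
            self.events.insert(i, [event])

    def hold(self, event, delay: float) -> None:
        """Schedule event relative to the current simulation time."""
        self.hold_until(event, self.now + delay)

    def run(self):
        """Pop slots in time order and return the (time, event) pairs fired."""
        fired = []
        while self.times:
            self.now = self.times.pop(0)
            for event in self.events.pop(0):
                fired.append((self.now, event))
        return fired
```

In a full simulator the events would be wire updates or scheduled procedures that may call `hold` again while `run` is in progress; the loop above handles that case because it re-examines the list after each slot.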
TYPE GATE_EVALUATOR=//BOOLCSIM_WIRESJ\\; 
DEFINE GATE_SIMULATOR<G:SIM_GATE GE:GATE_EVALUATORJ: 
BEGIN VAR R=BOOL; 
R:=<*GE*>(G.INPUTSl; 





DEFINE NAND(WS:SIM_WIRES)=BOOL:
BEGIN VAR S=SIM_WIRE;
THERE_IS -S.VALUE FOR S $E WS;
END
ENDDEFN
DEF I NE NOR ( WS: SI M_W IRES) =BOOL: 
BEGIN VAR S=SIM_WIRE; 











BEGIN VAR W=SIM_WIRE;Q=SIM_GATE;
DEFINE INIT(B:BOOL):
PRESET(G.OUTPUT,B);
DO INITIALIZE(Q); FOR Q $E G.OUTPUT.TO;
ENDDEFN
IF -G.OUTPUT.SET THEN
CASE G.TYPE OF
INVERT: IF G.INPUTS[1].SET THEN
INIT(-G.INPUTS[1].VALUE); FI
NAND: IF THERE_IS W.SET & -W.VALUE FOR W $E G.INPUTS; THEN
INIT(TRUE);
EF ALWAYS W.SET & W.VALUE FOR W $E G.INPUTS; THEN
INIT(FALSE); FI
NOR: IF THERE_IS W.SET & W.VALUE FOR W $E G.INPUTS; THEN
INIT(FALSE);
EF ALWAYS W.SET & -W.VALUE FOR W $E G.INPUTS; THEN
INIT(TRUE); FI
ENDCASE FI
END
ENDDEFN
DEFINE INITIALIZE:
BEGIN VAR G=SIM_GATE;W=SIM_WIRE;
DO INITIALIZE(G); FOR G $E SIM_CHIP.GATES;
IF THERE_IS -W.SET FOR W $E SIM_CHIP.WIRES; THEN
WRITE('/NInitialize node '$$W.NAME$$'. High(1) or Low(0)?');
PRESET(W,GET_RESPONSE('10')='1');
INITIALIZE; FI
END
ENDDEFN
DEFINE RUN(T:REAL):
BEGIN VAR SW=SIM_WIRE;
TIME:=0;
HOLD_UNTIL(//ABORT_SIMULATION:=TRUE;\\,T);
INITIALIZE;
SIMULATE;
CRLF;
WRITE('Simulation terminated at time=');
WRITE(TIME);
CRLF;
FOR SW $E SIM_CHIP.WIRES; DO




DEFINE PRESET(W:SIM_WIRE V:BOOL):
@(W).VALUE:=V;
@(W).NEW:=V;
@(W).SET:=TRUE;
@(W).TRACE:={0#IF V THEN 1 ELSE 0 FI};
ENDDEFN
DEFINE PRESET(N:QS V:BOOL):
BEGIN VAR W=SIM_WIRE;









BEGIN VAR QS=QS;




BEGIN VAR QS=QS;
DO PRESET(QS,FALSE); FOR QS $E SQS;
END 
ENDDEFN 
TYPE CLOCK= [PHASE,HIGH,LOW:REAL VALUE:BOOL WIRE:SIM_WIRE INPUT:QS];
WAVEFORM= [VALUE:BOOL DELTAS:SR WIRE:SIM_WIRE INPUT:QS];
DEFINE NEXT_CLOCK(C:CLOCK):
@(C.WIRE).VALUE:=(C.VALUE::=-;);
@(C.WIRE).TRACE::= TIME#IF C.VALUE THEN 1 ELSE 0 FI <$;
SIMULATE(C.WIRE.TO);
HOLD(//:NEXT_CLOCK(C)\\,IF C.VALUE THEN C.HIGH ELSE C.LOW FI);
ENDDEFN
DEFINE CLOCK(C:CLOCK):
BEGIN VAR W=SIM_WIRE;
IF THERE_IS W.NAME\EQ C.INPUT FOR W $E SIM_CHIP.WIRES; WITH W.WIRE.INPUT;
THEN PRESET(W,C.VALUE);
C.WIRE:=W;
HOLD_UNTIL(//:NEXT_CLOCK(C)\\,C.PHASE);
ELSE CRLF;
WRITE('There is no input named');








@(W.WIRE).TRACE::= TIME # IF W.VALUE THEN 1 ELSE 0 FI <$;
SIMULATE(W.WIRE.TO);
@(W).DELTAS:=W.DELTAS[2-];
IF DEFINED(W.DELTAS) THEN
HOLD_UNTIL(//:NEXT_WAVEFORM(W)\\,W.DELTAS[1]); FI
ENDDEFN
DEFINE WAVEFORM(W:WAVEFORM):
BEGIN VAR I=SIM_WIRE;








WRITE('There is no input named');
WRITE(W.INPUT);
CRLF;
HELP; FI
DEFINE PLOT(PATH:SP NAME:QS START,SCALEX:REAL)=MRG:
BEGIN VAR P,Q=POINT;
!NAME\PAINTED RED\SCALEO_BY .s~·dl#ll; 
WIRE<BLUE,0,PATH[ll.X#PATH[ll.Y*7 <$ $$ IQ.X*SCALEX#P.Y*7;.#0.Y*7l 
FOR !P;Ol $C PATH;l\AT START#0l 
END 
ENDDEFN 
DEFINE PLOT(SQS:SQS PLT:SIZABLE_COLOR_PLOTTER SCALEX:REAL):
BEGIN VAR QS=QS;X,Y=REAL;SW=SIM_WIRE;
X:=8* MAX LENGTH(SC::QS) FOR QS $E SQS; + 4;
PLOT(MRG:: {COLLECT IF THERE_IS SW.NAME\EQ QS FOR SW $E SIM_CHIP.WIRES;
THEN PLOT(SW.TRACE,SW.NAME,X,SCALEX)\AT 0#Y
ELSE NIL FI FOR QS $E SQS;&&FOR Y FROM 0 BY -12.;} ,PLT);
END
ENDDEFN
DEFINE PLOT(SQS:SQS PLT:SIZABLE_COLOR_PLOTTER):
PLOT(SQS,PLT,1);
ENDDEFN
Finally, the RLC has a Run Time System (RTS) which interacts with the user. The
user types commands to the RTS, which then calls the appropriate routine. We
want the user to be able to add new routines (such as sorters or packers) at any
time, just as new technologies can be added. This requires the use of suspendable
functions. We will name these functions, so users may call them by name. The
NAMED_SS datatype holds functions which require no parameters, while
NAMED_CHIP_CONSUMERs hold functions which require a CHIP as their single input
parameter. We then define global lists of these functions, and assign the existing
routines to the lists.
TYPE NAMED_SS= [NAME:QS FUNCTION:SS];
NAMED_SSS= { NAMED_SS };
NAMED_CHIP_CONSUMER= [NAME:QS CONSUMER:CHIP_CONSUMER];
NAMED_CHIP_CONSUMERS= { NAMED_CHIP_CONSUMER };
VAR OPTIMIZERS= NAMED_SSS;
SORTERS,PACKERS= NAMED_CHIP_CONSUMERS;
OPTIMIZERS:=
{ [NAME:'REMOVE_INVERTERS' FUNCTION://:REMOVE_INVERTERS\\] ;
[NAME:'REMOVE_REDUNDANCIES' FUNCTION://:REMOVE_REDUNDANCIES\\] ;
[NAME:'REMOVE_NANDS' FUNCTION://:REMOVE_NANDS\\] ;
[NAME:'REMOVE_NORS' FUNCTION://:REMOVE_NORS\\] ;
[NAME:'DE_MORGAN' FUNCTION://:DE_MORGAN\\] ;
[NAME:'UNIQUE_INPUTS' FUNCTION://:UNIQUE_INPUTS\\] ;
[NAME:'MERGE' FUNCTION://:MERGE\\] };
PACKERS:=
{ [NAME:'NMOS_PACK_1' CONSUMER://:NMOS_PACK_1(CHIP)\\] ;
[NAME:'NMOS_PACK_2' CONSUMER://:NMOS_PACK_2(CHIP)\\] };
SORTERS:=
{ [NAME:'SMALL_SORT' CONSUMER://:SMALL_SORT(CHIP)\\] ;
[NAME:'NO_SORT' CONSUMER://:NO_SORT(CHIP)\\] ;
[NAME:'RELAXATION_SORT' CONSUMER://:RELAXATION_SORT(CHIP)\\] };
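The idea of named, suspendable routines kept in global lists can be sketched in Python. This is only an illustration under stated assumptions: the `Chip` class, the `find` helper, and the `REVERSE_SORT` routine are hypothetical and do not come from the RLC source; only the list names and the NAME/FUNCTION/CONSUMER fields mirror the listing above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chip:                      # hypothetical stand-in for the RLC CHIP type
    gates: List[str]


@dataclass
class NamedSS:                   # a named routine that takes no parameters
    name: str
    function: Callable[[], None]


@dataclass
class NamedChipConsumer:         # a named routine that takes a CHIP
    name: str
    consumer: Callable[[Chip], None]


def find(entries, name):
    """Return the first entry whose name matches exactly (hypothetical helper)."""
    for entry in entries:
        if entry.name == name:
            return entry
    raise KeyError(name)


log = []
OPTIMIZERS = [NamedSS("REMOVE_INVERTERS", lambda: log.append("REMOVE_INVERTERS"))]
SORTERS = [
    NamedChipConsumer("NO_SORT", lambda chip: None),
    NamedChipConsumer("SMALL_SORT", lambda chip: chip.gates.sort()),
]

# A new routine can be appended at any time without touching the dispatcher.
SORTERS.append(
    NamedChipConsumer("REVERSE_SORT", lambda chip: chip.gates.sort(reverse=True))
)
```

Because the dispatcher only looks routines up by name, adding a sorter or packer is a matter of appending one record to the appropriate list.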
The following section is the RLC run time system. The user types commands to the 
RTS, which then calls the appropriate routine. When typing a command, the user
need only type enough to make the command unambiguous. Question marks can be 







BEGIN VAR GO=BOOL;
GO:=TRUE;
WHILE GO; DO
JRST (MENU('?', {'GET chip' ; 'PUT chip' ; 'READ file' ; 'PARSE input' ;
'SIMULATE' ; 'EDIT logic' ; 'PLOT chip' ; 'FILE plot' ;
'SORT gates';'DIRECTORY';'UNPARSE';'STATS';'QUIT'} ))
0=> CRLF;





















BEGIN VAR V=VM_DIRECTORY_ELEMENT;
CRLF;
V:=MENU('CHIP name?','*','OCHIP 1/2/81');










BEGIN VAR FILE=SC;
CRLF;
FILE:=GET_SC('Enter file name',CR);
WHILE IF DEFINED(FILE) THEN -EXISTS(FILE) ELSE FALSE FI; DO
CRLF;
WRITE('The file '$$FILE$$' does not exist. ');
FILE:=GET_SC('Enter new file name',CR);
END




BEGIN VAR SC=SC;
CRLF;
SC:=GET_SC('Enter RLC source:',BELL);
IF DEFINED(SC) THEN PARSE_SC(SC); FI
END
ENDDEFN
DEFINE RTS_SIMULATE:CRLF; ENDDEFN
DEFINE RTS_EDIT:




WHILE GO; DO
I:=MENU('EDIT>', {'DONE' ; COLLECT NSS.NAME FOR NSS $E OPTIMIZERS;} );




ELSE <*OPTIMIZERS[I-1].FUNCTION*>; FI
CRLF;
VAR RLC_MPICTURE=MPICTURE;
DEFINE RTS_STATS:
BEGIN VAR T=TECHNOLOGY;I=INT;
CRLF;
I:=MENU('Enter Technology:', {COLLECT T.NAME FOR T $E TECHNOLOGIES;});
CRLF;




BEGIN VAR T=TECHNOLOGY;I=INT;
CRLF;
I:=MENU('Enter Technology:', {COLLECT T.NAME FOR T $E TECHNOLOGIES;});
CRLF;
IF I>0 THEN
RLC_MPICTURE:=COMPILE(CHIP,TECHNOLOGIES[I]);
RTS_PLOTTER; FI
END
ENDDEFN
DEFINE RTS_FILE:
BEGIN VAR SC=SC;
CRLF;
SC:=GET_SC('Enter AIF file name:',CR);
CRLF;






BEGIN VAR I=INT;SC=SC;
END
ENDDEFN








SC:=GET_SC('Enter file name:',CR);







BEGIN VAR I=INT;NCC=NAMED_CHIP_CONSUMER;
CRLF;
I:=MENU('Sort routine?', {COLLECT NCC.NAME FOR NCC $E SORTERS;});





Appendix 5: Bristle Blocks Elements 
The following elements are available for use in Bristle Blocks. The type of each
element is given, followed by the required and optional parameters for an element of
the given type.
A5.1: Registers 
There are four basic styles of registers in Bristle Blocks. The first type is the
standard scratchpad register. It may read data from or write data to the two data
buses. Its internal value may refresh, and it may load with a constant. The second
type of register acts like the scratchpad register, but its value may be driven into
the instruction decoder. The third register type acts like the scratchpad register, but
it may also load selected bits from the instruction decoder. The fourth register type
is a combination of the second and third types: the register may drive the instruction
decoder, and the register may load from the instruction decoder. In the second and
fourth types, the LATCH parameter controls the loading of the register, which
occurs during PHI_2.
(1) Element: REGISTER
Required Parameters:
Keyword: OPTIONS Type: REGISTER
Optional Parameters: NONE
(2) Element: DATA_TO_CONTROL
Required Parameters:
Keyword: REGISTER Type: REGISTER
Keyword: MAP Type: SOURCES
Optional Parameters: NONE
(3) Element: CONTROL_TO_DATA
Required Parameters:
Keyword: REGISTER Type: REGISTER
Keyword: MAP Type: DESTS
Keyword: LATCH Type: EQUATION
Optional Parameters: NONE
















A5.2: Simple Arithmetic Elements 
There are four simple arithmetic elements in Bristle Blocks: Incrementers,
Decrementers, Adders, and Subtracters. The incrementer and decrementer each have
an input register and an optional output register. If the output register is specified,
the output of the incrementer/decrementer will load the register. If the output
register is not specified, the incrementer/decrementer will load the input register.
The LOAD equation states when the load should occur. The carry output is
available, if desired, to drive the instruction decoder or an output pad.
The adder and subtracter have two input registers and an optional output register.
If the output register is specified, the results of the operation are stored in the
output register. If the output register is not specified, the result of the operation is
stored in the INPUT_A register. For the subtracter, INPUT_B is subtracted from
INPUT_A. The LOAD equation again controls when the register is to be loaded. The
LATCH equation transfers data from the input registers into internal nodes, and this
happens during PHI_1. The user may specify a carry input and may use the carry
output. Notice that these carry signals are inverted.
(5) Element: INCREMENTER
Required Parameters:
Keyword: INPUT_REGISTER Type: REGISTER
Keyword: LOAD Type: EQUATION
Optional Parameters:
Keyword: OUTPUT_REGISTER Type: REGISTER
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
Keyword: CARRY_OUT Type: OUTPUT
(6) Element: DECREMENTER
Required Parameters:
Keyword: INPUT_REGISTER Type: REGISTER
Keyword: LOAD Type: EQUATION
Optional Parameters:
Keyword: OUTPUT_REGISTER Type: REGISTER
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
Keyword: CARRY_OUT Type: OUTPUT
(7) Element: ADDER
Required Parameters:
Keyword: INPUT_A







Keyword: CARRY_IN_BAR
(8) Element: SUBTRACTER
Required Parameters: 






























Default: ALWAYS
Default: ALWAYS 
Default: NEVER 
There are three versions of ALUs in Bristle Blocks. The differences have to do with
the flag logic. In the first case, the flags are valid during the PHI_2 that the ALU is
operating, so they may control an operation occurring during the next PHI_1. In the
second case, these flags may load a flag register, which sits on the buses like any
other register. The flag bits from this register may drive the instruction decoder. The
third type of ALU has a complex flag unit that allows selectable
loading/testing/modifying of any bit in the flag register.
Each of the ALUs has two input registers and either one or two output registers.
Equations control when the two output registers are to be loaded from the ALU. In
addition, the flags from the ALU are immediately available in the instruction
decoder, or to pads. The carry output and carry in to the MSB are inverted polarity
logic. Overflow is detected by exclusive-oring these two output signals.
Additionally, the MSB and the ZERO flag are available.
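The overflow rule stated above (exclusive-or of the carry output and the carry into the MSB) can be checked with a small Python model. This is a sketch, not the Bristle Blocks circuit: the function name, the dictionary keys, and the default 8-bit width are assumptions made for illustration. Since both carries are inverted, their exclusive-or equals that of the true-polarity signals.

```python
def add_with_flags(a, b, carry_in=0, width=8):
    """Model an adder that reports inverted carries, overflow, MSB and ZERO."""
    mask = (1 << width) - 1
    low_mask = mask >> 1                      # every bit except the MSB
    a &= mask
    b &= mask
    # Carry generated by the low width-1 bits, i.e. the carry into the MSB.
    carry_into_msb = ((a & low_mask) + (b & low_mask) + carry_in) >> (width - 1)
    total = a + b + carry_in
    carry_out = total >> width
    result = total & mask
    carry_out_bar = 1 - carry_out             # inverted polarity, as in the text
    carry_into_msb_bar = 1 - carry_into_msb
    return {
        "result": result,
        "carry_out_bar": carry_out_bar,
        "carry_into_msb_bar": carry_into_msb_bar,
        "overflow": carry_out_bar ^ carry_into_msb_bar,
        "msb": result >> (width - 1),
        "zero": int(result == 0),
    }
```

For example, adding 1 to 0x7F in an 8-bit datapath carries into the MSB without carrying out, so the model reports overflow.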
There are several operations which the ALUs will perform. The basic arithmetic
operations are ADD, SUBTRACT, SUBTRACT_REV, NEGATE_A, and NEGATE_B. The
subtract operation subtracts INPUT_B from INPUT_A, while subtract reversed does
the opposite. Each of these operations assumes there is no carry (or borrow) input.
Corresponding to each of these operations is an operation which forces a carry or
borrow on the input. These operations are ADD_W_CARRY, SUB_W_BORROW,
SUBR_W_BORROW, NEG_A_W_BORROW, and NEG_B_W_BORROW, respectively. Similarly, the
increment/decrement operations are available: INCREMENT_A, INCREMENT_B,
DECREMENT_A, and DECREMENT_B. These operations force a carry or borrow input.
The operations which assume no carry or borrow input are just SETA, SETB, SETA,
and SETB, respectively.
There are operations which set the output of the ALU to a constant value or to one
of the input values. These operations are SETZ (or ZERO), SETO (or ONES), SETA,
SETB, SETCA, and SETCB. SETA sets the ALU output to be the value in the INPUT_A
register, while SETCA sets the output to be the complement of this value.
Additionally, the ALU can do AND and OR operations on either the input data or its
complement. These operations are AND, ANDCA, ANDCB (or TEST), ANDC (or
NOR), OR, ORCA, ORCB, and ORC (or NAND). The basic AND and OR functions perform
the obvious operation. The -CA suffix indicates that the operation is performed
using the complement of the INPUT_A value, while -CB indicates that the
complement of the INPUT_B value is used. -C indicates that complements of both
input values are used. The exclusive-or operations are also available: XOR and EQV
(or XNOR).
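The suffix convention for the logical operations can be illustrated with a small Python table. This is a sketch under assumptions: the function name `alu_logic` and the default width are hypothetical; only the operation names and their meanings are taken from the text above.

```python
def alu_logic(op, a, b, width=8):
    """Evaluate one of the constant, set, or logical ALU operations."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    ca = ~a & mask                # complement of INPUT_A
    cb = ~b & mask                # complement of INPUT_B
    table = {
        "SETZ": 0, "SETO": mask,
        "SETA": a, "SETB": b, "SETCA": ca, "SETCB": cb,
        "AND": a & b, "ANDCA": ca & b, "ANDCB": a & cb, "ANDC": ca & cb,
        "OR": a | b, "ORCA": ca | b, "ORCB": a | cb, "ORC": ca | cb,
        "XOR": a ^ b, "EQV": ~(a ^ b) & mask,
    }
    return table[op]
```

By De Morgan's laws, ANDC (both inputs complemented, then ANDed) is the NOR of the inputs, and ORC is their NAND, which is why the text lists those names as synonyms.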
The ALU can perform single bit left shift operations: SHIFT_A, SHIFT_B,
SHIFT_A_W_LSB, and SHIFT_B_W_LSB. The SHIFT_A and SHIFT_B operations shift a zero
into the least significant bit, while the remaining operations shift a one into the LSB.
The remaining operations include MASK operations and Find-First-One (or zero).
The MASK_AB and MASK_BA instructions are used to generate masks. With the
MASK_AB operation, the ALU output will be high between the least significant high
bit in A and the next high bit in B, and between the next high bit in A and the next
high bit in B, etc. High bits in A generate carries while high bits in B kill the carry.
The FF0_A instruction produces an output which is low in every bit position except
the first low bit in A. This is the Find First Zero in A instruction. Similarly, the
FF1_A, FF0_B, and FF1_B instructions exist.
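The find-first operations can be sketched in Python. One assumption is made loudly here: "first" is taken to mean the least significant matching bit, which the text does not state explicitly. The function names and default width are also hypothetical.

```python
def ff1(a, width=8):
    """Find First One: one-hot output marking the lowest high bit of a.

    Assumes 'first' means least significant; returns 0 if a has no high bit.
    """
    mask = (1 << width) - 1
    a &= mask
    return a & -a & mask          # two's-complement trick isolates the lowest 1


def ff0(a, width=8):
    """Find First Zero: one-hot output marking the lowest low bit of a."""
    mask = (1 << width) - 1
    return ff1(~a & mask, width)
```

If the hardware instead searches from the most significant end, the same structure applies with the bit order reversed.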
The DONT_CARE instruction is also listed. This operation states that the particular
instruction is an undefined opcode, so Bristle Blocks can fill this with any
instruction.
(9) Element: ALU
Required Parameters: 
Keyword: INPUT_A 



















INCREMENT_B









Keyword: OUTPUT_2
Keyword: PRECHARGE
Keyword: CARRY_OUT_BAR
Keyword: CARRY_INTO_MSB_BAR
Keyword: MSB
Keyword: ZERO
Keyword: WRITE_OUTPUT_1

























Type: EQUATION
MASK_BA
SUBR_W_BORROW
NEG_B_W_BORROW
DECREMENT_B







Default: ALWAYS
Default: ALWAYS
Default: NEVER 
(10) Element: ALU_WITH_FLAGS
Required Parameters:
Keyword: INPUT_A 
Keyword: INPUT_B 
















Keyword: OUTPUT_2
Keyword: PRECHARGE
Keyword: CARRY_OUT_BAR
Keyword: CARRY_INTO_MSB_BAR
Keyword: MSB
Keyword: ZERO




















SHIFT_B_W_LSB
FF1_B




















Default: ALWAYS
Default: NEVER
Default: [REFRESH:ALWAYS]
Default: NEVER 
This element is similar to the ALU element, with the addition of a flag register. The
flag register will load from the ALU when the load flags equation is true. Bit 1 of
the register loads with the carry output, bit 2 loads with the MSB, bit width/2 + 1
loads with the zero flag, and bit width loads with the LSB. If the datapath width is 8,
bit 5 loads with zero and bit 8 loads with LSB. The remaining bits are unaltered by the
load flags control. The to control specification allows these flag bits to drive lines
of the instruction decoder.
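The bit assignment described above depends only on the datapath width, which a short Python sketch makes explicit. The function name and the flag labels used as dictionary values are illustrative assumptions; the bit numbers follow the text, which counts bits from 1.

```python
def flag_register_layout(width):
    """Map flag-register bit numbers (1-based, as in the text) to flag names."""
    return {
        1: "CARRY_OUT",
        2: "MSB",
        width // 2 + 1: "ZERO",
        width: "LSB",
    }
```

For an 8-bit datapath this gives bits 1, 2, 5, and 8, matching the example in the text; all other bits are left unaltered by a flag load.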



























































SHIFT_B_W_LSB
FF1_B









































Keyword: WRITE_OUTPUT_1

















In addition to the operations available with the standard ALU, this ALU includes a
wide variety of flag operations. The FLAGS register holds the values of the flags,
the MASK register may select which of the FLAGS register's bits should load, and
the FLAG_ACCUMULATOR register is used to accumulate flag values. A function
block (see #29 in section A5.8) exists between the FLAGS register and the
FLAG_ACCUMULATOR to implement the flag accumulations. The LOAD_ALL operation
loads all flags from the ALU into the FLAGS register. The LOAD_MASKED operation
only loads those bits whose corresponding MASK register bits are high.
TEST_SELECTED will load the FLAG bit (MSB of FLAGS) with the FLAGS bit selected by the
FLAG_SELECT field. SET_SELECTED will set the bit which FLAG_SELECT indicates,
and CLR_SELECTED will clear that bit. CMP_SELECTED complements the selected bit.
LOAD_SELECTED transfers from the FLAG bit to the selected bit.
The bits in the flag register have the following values. The MSB is carry out, the
next bit is carry into the MSB, the next bit is MSB, the next bit is overflow, the
next is greater than or equal, the next is higher, the next is greater than, the next is
zero, the next is the value of OLD_FLAG (an optional input), and the LSB is LSB. Bits
10-15 are not used. This element can only be used in datapaths that are
16 bits wide.
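The flag operations named above can be modelled as simple bit manipulations. The following Python sketch is illustrative only: the function signature, the numbering of the select field (0 = least significant bit), and the treatment of the FLAG bit as bit 15 of a 16-bit FLAGS register are assumptions, not details confirmed by the text.

```python
WIDTH = 16
FULL = (1 << WIDTH) - 1
FLAG_BIT = 1 << (WIDTH - 1)       # assumed: the FLAG bit is the MSB of FLAGS


def flag_op(op, flags, alu_flags=0, mask=0, select=0):
    """Return the new FLAGS value after one flag operation."""
    bit = 1 << select             # bit chosen by the FLAG_SELECT field
    if op == "LOAD_ALL":
        new = alu_flags
    elif op == "LOAD_MASKED":     # load only where the MASK register is high
        new = (flags & ~mask) | (alu_flags & mask)
    elif op == "TEST_SELECTED":   # copy the selected bit into the FLAG bit
        new = (flags | FLAG_BIT) if flags & bit else (flags & ~FLAG_BIT)
    elif op == "SET_SELECTED":
        new = flags | bit
    elif op == "CLR_SELECTED":
        new = flags & ~bit
    elif op == "CMP_SELECTED":
        new = flags ^ bit
    elif op == "LOAD_SELECTED":   # copy the FLAG bit into the selected bit
        new = (flags | bit) if flags & FLAG_BIT else (flags & ~bit)
    else:
        raise ValueError(op)
    return new & FULL
```

TEST_SELECTED and LOAD_SELECTED are mirror images: one moves a selected bit into the FLAG bit, the other moves the FLAG bit back out.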
A5.4: Ports 
The port units are used for data communication with off-chip circuitry. The
INPUT_PORT has a register which will load data from off chip when the LOAD equation is
TRUE. The OUTPUT_PORT will always drive the data in its register off chip unless
the DRIVE equation is present, in which case the port only drives when the
equation is TRUE. The IO_PORT incorporates features of both the input ports and the
output ports. When the LOAD equation is TRUE, the off chip data are loaded into the
input register. When the DRIVE equation is TRUE, data in the output register are
driven off chip. If the INPUT_REGISTER is not specified, the port will have only a
single register, which is used for both types of data transfer.
In each of these ports, the LOAD and DRIVE equations have variable timing, which
means that the timing requirements of the control line buffers may be given by the
user. These operations will occur during PHI_2 by default, but the user may state
that either PHI_1 timing or asynchronous timing should be used. Each of these ports has
an optional mask, which can be used to indicate which bits of the register(s)
actually connect to pads. Bits of a register which do not connect to a pad will be
unaffected by a LOAD operation.
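The effect of the optional mask on a LOAD can be written as one line of bit arithmetic. The sketch below is a behavioural illustration in Python; the function name and argument names are hypothetical, and a mask bit of 1 is assumed to mean "this register bit connects to a pad".

```python
def port_load(register, pads, mask, width=8):
    """Return the register value after a LOAD through a pad mask.

    Bits where mask is 1 (assumed: connected to a pad) take the pad value;
    all other bits keep their old register value.
    """
    full = (1 << width) - 1
    mask &= full
    return (register & ~mask & full) | (pads & mask)
```

With an all-ones mask the whole register is replaced by the pad data, which corresponds to a port declared without a mask.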





Type: EQUATION Variable Timing
Optional Parameters:
Keyword: MASK Type: MASK 


























The ROM (Read Only Memory) functions in Bristle Blocks are used to drive constant
data onto the data buses. The value(s) contained in these ROMs can drive each bit of
the data bus(es) high or low or not affect the value on the bus. The enable
functions control the gating of the fixed value onto the bus. The LOWER_ROM
function drives the lower data bus, the UPPER_ROM function drives the upper data
bus, while the ROM and ROM_PAIR functions drive both buses. The ROM_PAIR
function is logically equivalent to two ROM functions, but requires less chip area.
(15) Element: LOWER_ROM
Required Parameters:
Keyword: VALUE Type: MASK
Keyword: ENABLE Type: EQUATION
Optional Parameters: NONE





















(18) Element: ROM_PAIR
















The barrel shifters are capable of performing multiple-bit shifts in a single clock
cycle. These shifters have two input words: the Most Significant Word (MSW) and
the Least Significant Word (LSW). The output register may load from almost any
contiguous set of bits in the combined MSW-LSW register. The shift constant
indicates how many bits from the most significant end of the LSW are to appear in
the output, with the remaining bits coming from the least significant end of the
MSW. The width of the shift constant field must be at least log base two of the
datapath width. In the SIMPLE_SHIFTER, the user specifies registers for the MSW,
LSW, and the output, along with the shift constant and a LOAD equation, which
controls the loading of the output register. The MASKED_SHIFTER has an additional
mask register which can be used to control the loading of the output register. The
two load signals, LOAD_IF_0 and LOAD_IF_1, specify the polarity of the mask bits.
When the LOAD_IF_0 line is high, the only bits of the output register that are loaded
from the shift operation are those bits whose corresponding mask bits are low.
Similarly, the LOAD_IF_1 line controls loading the output register's bits whose
corresponding mask bits are high. If both control lines are high, all of the output
register bits are loaded. The BARREL_SHIFTER does not have an explicit MSW
register or LSW register. Instead, two input registers are provided, along with
circuitry which multiplexes various values into the MSW and LSW of the shifter.
The MSW can be loaded from either of the two input registers or from the constants
0, 1, -1, and -2. The LSW can be loaded from either of the two input registers or
from the constants 0 and -1. Given these possibilities, any of the arithmetic or
logical shifts and rotates can be performed with the shifter. The following table
lists the MSW and LSW values for the various OPERATIONs of the BARREL_SHIFTER.
Operation      MSW   LSW
ROTATE_A       A     A
ROTATE_B       B     B
SHIFT_AB       A     B
SHIFT_BA       B     A
SLA            A     0
SLB            B     0
SRA_LOGICAL    0     A
SRB_LOGICAL    0     B
UNARY          0     -1
UNARY_BAR      -1    0
SRA_ZERO       0     A     (see SRA_LOGICAL)
SRB_ZERO       0     B     (see SRB_LOGICAL)
SRA_ONE        -1    A
SRB_ONE        -1    B
DECODE         1     0
DECODE_BAR     -2    -1
The most significant bits of the two input registers are available to drive the
instruction decoder, which is useful for computing the sign-extension constants
for arithmetic shifts. The BARREL_SHIFTER also has a mask register.
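The shifter behaviour described in this section can be modelled in a few lines of Python. This is a behavioural sketch under assumptions: the function names, the argument order, and the default 8-bit width are hypothetical; the bit selection follows the description of the shift constant, and the MSW/LSW choices follow the operation table.

```python
def barrel_shift(msw, lsw, k, width=8):
    """Take k bits from the top of LSW and width-k bits from the bottom of MSW."""
    if not 0 <= k < width:
        raise ValueError("shift constant out of range")
    full = (1 << width) - 1
    msw &= full                   # constants such as -1 and -2 become bit patterns
    lsw &= full
    high = (msw & ((1 << (width - k)) - 1)) << k
    low = lsw >> (width - k)
    return high | low


# MSW and LSW sources for some of the tabulated BARREL_SHIFTER operations.
OPERANDS = {
    "ROTATE_A": lambda a, b: (a, a),
    "SHIFT_AB": lambda a, b: (a, b),
    "SLA": lambda a, b: (a, 0),
    "SRA_LOGICAL": lambda a, b: (0, a),
    "UNARY": lambda a, b: (0, -1),
    "DECODE": lambda a, b: (1, 0),
    "DECODE_BAR": lambda a, b: (-2, -1),
}


def barrel_op(op, a, b, k, width=8):
    msw, lsw = OPERANDS[op](a, b)
    return barrel_shift(msw, lsw, k, width)


def masked_load(old, shifted, mask, load_if_0, load_if_1, width=8):
    """MASKED_SHIFTER load: update only the output bits the mask polarity selects."""
    full = (1 << width) - 1
    select = (mask & full if load_if_1 else 0) | (~mask & full if load_if_0 else 0)
    return (old & ~select & full) | (shifted & select)
```

In this model a shift constant of k rotates or shifts left by k; a logical right shift by n therefore corresponds to SRA_LOGICAL with k = width - n. DECODE produces a one-hot word with bit k set, and UNARY produces k low-order ones.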







Optional Parameters: NONE 
































































The bus precharge elements are used to precharge the data buses. Each of the data
processing elements in Bristle Blocks (except for the ROM cells) only drives the data
buses low. To transmit a high value, the data processing elements do not affect the
bus, assuming that the bus originally had every bus line high. In order to transmit
data, therefore, the buses must be precharged. These elements precharge one or
both of the buses during PHI_2. The data buses can be used to store data from one
cycle to the next, if the clocks run fast enough, and if no other element writes on
the bus. The first three elements simply precharge the buses. The remaining two
functions not only precharge the bus, but they 'break' the bus. The bus to the left is
terminated, and a new bus begins to the right (this new bus must be precharged by
a different bus precharge element). This allows Bristle Blocks to compile chips
with more than two data buses, although only two data buses may pass any
element.
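The precharge discipline amounts to a wired-AND: every line starts high and any element may pull it low. A minimal Python model, with hypothetical function names and an assumed 8-bit bus, shows why a single writer transmits its value unchanged.

```python
def pull_down_mask(value, width=8):
    """An element transmitting `value` pulls low exactly the lines whose bit is 0."""
    return ~value & ((1 << width) - 1)


def bus_value(pull_downs, width=8):
    """Resolve a precharged bus given each element's pull-down mask."""
    bus = (1 << width) - 1        # precharged: every line high
    for mask in pull_downs:
        bus &= ~mask              # lines can only be discharged, never raised
    return bus
```

With no writer the bus simply holds its precharged (all high) value, and with two simultaneous writers the result is the bitwise AND of their values, which is why only one element should normally write a bus in a given cycle.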
(22) Element: PRECHARGE_LOWER
Required Parameters: NONE
Optional Parameters:
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
(23) Element: PRECHARGE_UPPER
Required Parameters: NONE
Optional Parameters:
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
(24) Element: PRECHARGE_BOTH
Required Parameters: NONE
Optional Parameters:
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
(25) Element: PRECHARGE_AND_BREAK_LOWER
Required Parameters: NONE
Optional Parameters:
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
(26) Element: PRECHARGE_AND_BREAK_UPPER
Required Parameters: NONE
Optional Parameters:
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
A5.8: Random Simple Elements
There are a few simple elements which do not fit in the categories presented above. 
These elements are described here. 








Type: EQUATION Default: ALWAYS 
The BUS_CAM element will monitor data flow across the lower bus. When the
sampled data matches the fixed value wired into the CAM, the output signal will go
high. The LATCH equation controls the sampling of the bus. The VALUE mask
states the comparison value for the CAM. When all the bus bits corresponding to 0
bits in the mask are low and when all the bus bits corresponding to 1 bits in the
mask are high, the output signal goes high.
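The match condition can be sketched in Python. The representation of the VALUE mask is an assumption: it is written here as a string, most significant bit first, using '0', '1', and 'X' for a position that is not compared. The text only describes the 0 and 1 positions, so the don't-care character is a guess at how unlisted positions behave.

```python
def cam_match(bus, value_mask):
    """Return True when every '0'/'1' position of the mask agrees with the bus."""
    width = len(value_mask)
    for i, ch in enumerate(value_mask):
        bit = (bus >> (width - 1 - i)) & 1    # mask string is MSB first
        if ch == "1" and bit != 1:
            return False
        if ch == "0" and bit != 0:
            return False
    return True
```

A mask made entirely of compared positions gives an exact equality test, while a mask with uncompared positions matches a whole family of bus values.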
(28) Element: CAM 
Required Parameters: 
Keyword: REGISTER Type: REGISTER 
Keyword: VALUE Type: MASK 
Keyword: OUTPUT Type: OUTPUT 
Optional Parameters: NONE 
This element is similar to the BUS_CAM, except that the CAM monitors the value
contained in its register. Whenever the register's value matches the CAM's value,
the output signal goes high. There is no LOAD signal, since the CAM always
monitors the register's value.
(29) Element: FUNCTION_BLOCK
Required Parameters: 




































The FUNCTION_BLOCK element is used to perform boolean operations between
values. The function block takes data from the two input registers, and can store
data into the INPUT_A register and the OUTPUT register. The FALSE_FALSE (FF),
FALSE_TRUE (FT), TRUE_FALSE (TF), and TRUE_TRUE (TT) lines control the function
of the element. If the FF line is high, all bits of the output which correspond to low
bits in both input registers will be high. Similarly, the TT line controls the output
bits corresponding to high bits in both registers. If TF is high, all output bits which
correspond to high bits in INPUT_A and low bits in INPUT_B will be high. The FT
control is similar to the TF control. An alternative statement of the
FUNCTION_BLOCK operation is that each pair of input bits selects which control line
drives the corresponding output bit. For example, if the MSB of INPUT_A is high and
the MSB of INPUT_B is low, the MSB of the output will be the value of the TRUE_FALSE
control. If TT, TF, and FT are high and FF is low, the function block performs an OR
operation, while if TT is the only high control, an AND function is performed. The
PRECHARGE equation controls the loading of data from the input registers to
internal nodes.
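The "each pair of input bits selects a control line" view translates directly into a per-bit truth-table lookup. The Python sketch below is illustrative: the function name, argument order, and default width are assumptions, and FT is taken to mean INPUT_A low with INPUT_B high, by analogy with the TF description.

```python
def function_block(a, b, ff, ft, tf, tt, width=8):
    """Each (A bit, B bit) pair selects which control line drives the output bit."""
    controls = {(0, 0): ff, (0, 1): ft, (1, 0): tf, (1, 1): tt}
    out = 0
    for i in range(width):
        pair = ((a >> i) & 1, (b >> i) & 1)
        if controls[pair]:
            out |= 1 << i
    return out
```

The four control lines are the truth table of the function, so all sixteen two-input boolean functions are available; for instance FT and TF high together give exclusive-or.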
(30) Element: LEFT_RIGHT_SHIFT
Required Parameters:





















The LEFT_RIGHT_SHIFT element is a bi-directional, single-bit shifter. When the
SHIFT_LEFT control is high, the data in the INPUT_REGISTER are shifted one bit
toward the MSB and loaded into the OUTPUT_REGISTER. If the OUTPUT_REGISTER is
not specified, the data are loaded into the INPUT_REGISTER. The LSB of the output
register is loaded with the value of the INPUT equation. The SHIFT_RIGHT control
shifts data toward the LSB, with the MSB receiving data from INPUT. The
PRECHARGE equation loads the input register's data into internal nodes. The MSB of
the input register is available to drive the instruction decoder.
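The two shift directions can be summarized in a short Python model. The function name and arguments are hypothetical, and the behaviour when both controls are high is not described in the text, so the sketch rejects that case rather than guessing.

```python
def left_right_shift(value, shift_left, shift_right, input_bit, width=8):
    """Single-bit shift; the vacated bit is filled from the INPUT equation."""
    full = (1 << width) - 1
    value &= full
    if shift_left and shift_right:
        raise ValueError("behaviour with both controls high is not specified")
    if shift_left:                # toward the MSB, LSB takes INPUT
        return ((value << 1) & full) | (input_bit & 1)
    if shift_right:               # toward the LSB, MSB takes INPUT
        return (value >> 1) | ((input_bit & 1) << (width - 1))
    return value
```

Feeding the MSB output back through the INPUT equation would turn the left shift into a rotate, which is one use of the MSB being available to the instruction decoder.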

















Default: [REFRESH:ALWAYS]
Default: [REFRESH:ALWAYS]
Default: ALWAYS 
The STACK element implements a stack in the datapath. The stack consists of a
TOP register followed by DEPTH-1 MIDDLE registers, followed by a BOTTOM
register. Between adjacent register pairs lies circuitry for transferring data between
the registers. When the PUSH control is TRUE, data are moved away from the TOP
register: the TOP register's data load the first MIDDLE register, while the first
MIDDLE register's data load the second MIDDLE register, etc. When the POP
control is TRUE, data are moved towards the TOP register. The PUSH and POP
controls should not both be high, nor should POP be high while the TOP register is
writing onto a data bus.
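The register chain can be modelled as a list that shifts in one direction on PUSH and the other on POP. This Python sketch makes two assumptions that the text does not settle: the TOP register keeps its value during a PUSH until something else loads it, and the BOTTOM register keeps its value during a POP. The class and method names are hypothetical.

```python
class DatapathStack:
    """TOP register, DEPTH-1 MIDDLE registers, then a BOTTOM register."""

    def __init__(self, depth):
        self.regs = [0] * (depth + 1)   # index 0 is TOP, last index is BOTTOM

    @property
    def top(self):
        return self.regs[0]

    @top.setter
    def top(self, value):               # models a bus write into TOP
        self.regs[0] = value

    def push(self):
        # Every register copies its neighbour on the TOP side; BOTTOM's old data is lost.
        self.regs[1:] = self.regs[:-1]

    def pop(self):
        # Every register copies its neighbour on the BOTTOM side.
        self.regs[:-1] = self.regs[1:]
```

A typical push sequence is therefore "PUSH, then load TOP from a bus", and a pop exposes the previously pushed value in TOP.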
A5.9: Compound IR Elements 
The following cells combine the DATA_TO_CONTROL circuitry with another simple
element function. The DATA_TO_CONTROL function is useful for implementing
Instruction Registers (IR) because the function of an IR is to turn data values into
control values. In the INCREMENTING_IR example, the IR's data can be incremented.
Alternatively, one may think of the incrementer's output driving the instruction
decoder. The operation of each of these units can be found by comparing the
functions of the DATA_TO_CONTROL element (2) and the simple element which is
fused with the IR.


























Type: EQUATION
Type: OUTPUT 
(34) Element: SHIFTING_IR
Required Parameters: 























Keyword: BACKUP Type: REGISTER 
Keyword: SAVE Type: EQUATION 
Keyword: REFRESH Type: EQUATION
Keyword: RESTORE Type: EQUATION 





Default: [REFRESH:ALWAYS]
Default: [REFRESH:ALWAYS]
Default: NEVER
Default: ALWAYS
Default: NEVER 
This element is a depth=1 stack. One of the registers (ACTIVE) is connected to the
IR, the other (BACKUP) is a backup register. SAVE moves the data from ACTIVE to
BACKUP, RESTORE moves the data from BACKUP to ACTIVE, and if both are high,
the two registers swap values.










****** see (2) and (12)
Type: MASK
Type: REGISTER Default: [REFRESH:ALWAYS]
A5.10: Compound Output Port Elements 
In the same manner as section A5.9 presented IR compounds with various elements, 
this section lists output ports (13) fused with other simple elements.
(37) Element: INCREMENTING_PORT
Required Parameters:
Keyword: LOAD Type: EQUATION
Optional Parameters:
Keyword: DRIVE Type: EQUATION Variable Timing
Keyword: MASK Type: MASK
Keyword: REGISTER Type: REGISTER Default: [REFRESH:ALWAYS]
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
Keyword: CARRY_OUT Type: OUTPUT
****** see (5) and (13)
(38) Element: DECREMENTING_PORT
Required Parameters:
Keyword: LOAD Type: EQUATION
Optional Parameters:
Keyword: DRIVE Type: EQUATION
Keyword: MASK Type: MASK
Keyword: REGISTER Type: REGISTER
Keyword: PRECHARGE Type: EQUATION
Keyword: CARRY_OUT Type: OUTPUT
****** see (6) and (13)












****** see (7) and (13)
Type: EQUATION 


















(40) Element: SWAPPING_OUTPUT_PORT
Required Parameters:
Keyword: ACTIVE Type: REGISTER
Optional Parameters:
Keyword: BACKUP Type: REGISTER
Keyword: SAVE Type: EQUATION
Keyword: REFRESH Type: EQUATION
Keyword: RESTORE Type: EQUATION
Keyword: DRIVE Type: EQUATION
Keyword: MASK Type: MASK
****** see (31) and (13), also section A5.11











In the same manner as section A5.9 presented IR compounds with various elements,
this section lists swapping registers fused with other simple elements. Swapping
registers are effectively a depth=1 stack. One of the registers (ACTIVE) is connected
to the simple element with which the swapper is compounded, the other (BACKUP)
is a backup register. SAVE moves the data from ACTIVE to BACKUP, RESTORE moves
the data from BACKUP to ACTIVE, and if both are high, the two registers swap
values.
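The SAVE and RESTORE controls can be modelled as two independent, simultaneous transfers, which is what makes the swap fall out when both are high. The Python function below is a behavioural sketch with hypothetical names; it assumes both transfers read the old register values before either is written.

```python
def swap_step(active, backup, save, restore):
    """Return (new_active, new_backup) after one cycle of SAVE/RESTORE."""
    new_backup = active if save else backup      # SAVE: ACTIVE -> BACKUP
    new_active = backup if restore else active   # RESTORE: BACKUP -> ACTIVE
    return new_active, new_backup
```

Because each new value is computed from the old pair, asserting both controls exchanges the two registers rather than duplicating one of them.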











Type: EQUATION
Type: EQUATION 
Default: NEVER 
Default: ALWAYS
Default: NEVER 
This element is just a pair of swapping registers. 



























Default: [REFRESH:ALWAYS]




























































(44) Element: SWAPPING_DECREMENTER
Required Parameters:
Keyword: LOAD
Keyword: ACTIVE
Optional Parameters: 
Keyword: PRECHARGE 













































Default: [REFRESH:ALWAYS]




Default: [REFRESH:ALWAYS]




In the same manner as section A5.9 presented IR compounds with various elements, 
this section lists CAM registers (28) fused with other simple elements. 












Keyword: CARRY_IN_BAR



































Keyword: CARRY_OUT_BAR
Keyword: LATCH
Keyword: CARRY_IN_BAR








































Keyword: PRECHARGE Type: EQUATION 
Keyword: CARRY_OUT Type: OUTPUT
****** see (28) and (5)
Default: ALWAYS































Keyword: CARRY_OUT_BAR
Keyword: LATCH
Keyword: CARRY_IN_BAR



















(50) Element: SHIFTER_WITH_VALUE_CHECK
Required Parameters:
Keyword: SHIFT_LEFT Type: EQUATION
Keyword: SHIFT_RIGHT Type: EQUATION
Keyword: VALUE Type: MASK
Keyword: RESULT Type: OUTPUT
Optional Parameters:
Default: [REFRESH:ALWAYS]
Default: ALWAYS
Default: ALWAYS
Default: NEVER
Keyword: INPUT Type: EQUATION Default: NEVER
Keyword: MSB Type: OUTPUT
Keyword: PRECHARGE Type: EQUATION Default: ALWAYS
Keyword: REGISTER Type: REGISTER Default: [REFRESH:ALWAYS]
****** see (28) and (30)
A5.13: Random Compound Elements 
The remaining two elements are the SHIFTING ACCUMULATOR and the INCREMENTER
DECREMENTER. The SHIFTING ACCUMULATOR is a two-register adder (7) with a
left-right shifter (30) on the input/output register. The INCREMENTER
DECREMENTER is a back-to-back two-register INCREMENTER (5) and
DECREMENTER (6). When the LOAD_DEC line is high, the incrementer input
register is loaded with one less than the value in the decrementer input register.
When the LOAD_INC line is high, the decrementer input register is loaded with one
more than the value in the incrementer input register.
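
The register transfers just described can be illustrated with a small behavioral model. This is only a sketch written in Python, not the element's actual implementation. The class name, the `clock` method, the `width` argument, the wrap-around at the register width, and the handling of both load lines being high at once are all assumptions made for illustration. The carry outputs and precharge signals are not modeled.

```python
class IncrementerDecrementer:
    """Hypothetical behavioral model of the two input registers only."""

    def __init__(self, width=8, inc_input=0, dec_input=0):
        # Assumption: registers are `width` bits wide and wrap around.
        self.mask = (1 << width) - 1
        self.inc_input = inc_input & self.mask
        self.dec_input = dec_input & self.mask

    def clock(self, load_dec=False, load_inc=False):
        # Sample both registers first, so that if both load lines are
        # high each register sees the other's old value (assumption).
        inc_old, dec_old = self.inc_input, self.dec_input
        if load_dec:
            # LOAD_DEC high: incrementer input <- decrementer input - 1
            self.inc_input = (dec_old - 1) & self.mask
        if load_inc:
            # LOAD_INC high: decrementer input <- incrementer input + 1
            self.dec_input = (inc_old + 1) & self.mask
        return self.inc_input, self.dec_input


unit = IncrementerDecrementer(width=4, inc_input=3, dec_input=9)
print(unit.clock(load_dec=True))  # (8, 9)
print(unit.clock(load_inc=True))  # (8, 9)
```

After LOAD_DEC the incrementer register holds one less than the decrementer register, so a following LOAD_INC writes back the value the decrementer register already held.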
(51) Element: SHIFTING ACCUMULATOR
Required Parameters:
Keyword: SHIFT_LEFT Type: EQUATION
Keyword: SHIFT_RIGHT Type: EQUATION
Keyword: ACCUMULATOR Type: REGISTER
Keyword: LOAD Type: EQUATION
Optional Parameters:
Keyword: INPUT Type: EQUATION Default: NEVER
Keyword: MSB Type: OUTPUT
Keyword: PRECHARGE_2 Type: EQUATION Default: ALWAYS
Keyword: INPUT Type: REGISTER Default: [REFRESH:ALWAYS]
Keyword: PRECHARGE_1 Type: EQUATION Default: ALWAYS
Keyword: CARRY_OUT_BAR Type: OUTPUT
Keyword: LATCH Type: EQUATION Default: ALWAYS
Keyword: CARRY_IN_BAR Type: EQUATION Default: NEVER
(52) Element: INCREMENTER DECREMENTER
Required Parameters:
Keyword: LOAD_DEC Type: EQUATION
Keyword: INC_INPUT Type: REGISTER
Keyword: DEC_INPUT Type: REGISTER
Keyword: LOAD_INC Type: EQUATION
Optional Parameters:
Keyword: PRECHARGE_DEC Type: EQUATION Default: ALWAYS
Keyword: CARRY_OUT_DEC Type: OUTPUT
Keyword: PRECHARGE_INC Type: EQUATION Default: ALWAYS
Keyword: CARRY_OUT_INC Type: OUTPUT
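
One way to read a parameter listing such as that of element (52) is as a table of keywords, each with a type, a required or optional status, and possibly a default. The sketch below encodes that table in Python and fills in defaults for omitted optional parameters. It is not the compiler's own mechanism. The `Param` class and the `resolve` function are hypothetical names, and the rule that an omitted optional keyword simply takes its listed default is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Param:
    """One line of an element's parameter listing (hypothetical encoding)."""
    keyword: str
    kind: str                      # EQUATION, REGISTER, OUTPUT, ...
    required: bool
    default: Optional[str] = None  # None when the listing gives no default


# Parameter listing of element (52), transcribed from the text.
INCREMENTER_DECREMENTER = [
    Param("LOAD_DEC", "EQUATION", True),
    Param("INC_INPUT", "REGISTER", True),
    Param("DEC_INPUT", "REGISTER", True),
    Param("LOAD_INC", "EQUATION", True),
    Param("PRECHARGE_DEC", "EQUATION", False, "ALWAYS"),
    Param("CARRY_OUT_DEC", "OUTPUT", False),
    Param("PRECHARGE_INC", "EQUATION", False, "ALWAYS"),
    Param("CARRY_OUT_INC", "OUTPUT", False),
]


def resolve(spec, supplied):
    """Check supplied keywords against a listing and apply defaults."""
    unknown = set(supplied) - {p.keyword for p in spec}
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    resolved = {}
    for p in spec:
        if p.keyword in supplied:
            resolved[p.keyword] = supplied[p.keyword]
        elif p.required:
            raise KeyError(f"missing required parameter {p.keyword}")
        elif p.default is not None:
            # Assumption: an omitted optional keyword takes its default.
            resolved[p.keyword] = p.default
    return resolved


# The values on the right are placeholder names, not real signals.
args = resolve(INCREMENTER_DECREMENTER, {
    "LOAD_DEC": "A", "INC_INPUT": "R1", "DEC_INPUT": "R2", "LOAD_INC": "B",
})
print(args["PRECHARGE_DEC"])  # ALWAYS
```

Optional outputs with no listed default, such as CARRY_OUT_DEC, are left out of the result when the caller does not name them.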
A5.14: Summary

ACCUMULATOR WITH VALUE CHECK 
ADDER 
ADDER WITH VALUE CHECK 
ADDING PORT 
ALU 
ALU WITH FLAGS 




CONTROL TO DATA 
CONTROL TO DATA AND BACK 
DATA TO CONTROL
DECREMENTER

PRECHARGE AND BREAK LOWER

SUBTRACTER WITH VALUE CHECK 
SWAPPING DECREMENTER
SWAPPING INCREMENTER
SWAPPING INPUT PORT
SWAPPING IR

[1] Applicon
Applicon Users Manual
Applicon, Inc. Burlington, MA.
[2] Automation Technology
"Precision Artwork Language (PAL)"
Automation Technology, Inc. 1971
[3] Ayres, R.F.
A Language Processor and a Sample Language
Ph.D. Thesis (#2276)
California Institute of Technology, 1979
[4] Ayres, R.F.
"IC Design Under ICL, Version 1"
Caltech SSP Report #1388 (revised #4031)
California Institute of Technology, 1978
[5] Ayres, R.F.
"Silicon Compilation -- A Hierarchical Use of PLAs"
Proceedings of the 16th Design Automation Conference, 1979
[6] Buchanan, I.
Modelling and Verification in Structured Integrated Circuit Design
Ph.D. Thesis
University of Edinburgh, 1980
[7] Calma
GDS II Product Specification
Calma Interactive Graphics Systems, Sunnyvale, CA.
[8] Fairbairn, D.G. and Rowson, J.A.
"Interactive Integrated Circuit Design on a Small Computer"
Proceedings of 1st Conference on Computer Graphics
in CAD/CAM Systems, 1973
[9] Feller, A.
"Automatic Layout of Low-Cost Quick-Turnaround Random-Logic
Custom LSI Devices"
Proceedings of the 13th Design Automation Conference, 1976
[10] Friedman, T.D.
"Methods Used in an Automatic Logic Design Generator (ALERT)"
IEEE Transactions on Computers, C-18, July 1969, p. 593-614
[11] Herrick, W.V. and Sims, J.R.
"A Successful Automated IC Design System"
Proceedings of the 13th Design Automation Conference, 1976
[12] Johannsen, D.L.
"Bristle Blocks: A Silicon Compiler"
Caltech Conference on VLSI, 1979
[13] Johannsen, D.L.
"Bristle Blocks: A Silicon Compiler"
Proceedings of the 16th Design Automation Conference, 1979
[14] Johannsen, D.L.
"Hierarchical Power Routing"
Caltech SSP Report #2069
California Institute of Technology, 1978
[15] Johannsen, D.L.
OM2 LSI Chip
Caltech Part #986
California Institute of Technology, 1978
[16] Johannsen, D.L.
"OM2"
Caltech SSP Report #1111
California Institute of Technology, 1978
[17] Johannsen, D.L.
"Our Machine, A Microcoded LSI Processor"
Proceedings of the 11th Annual Microprogramming Workshop, 1978
[18] Lattin, W.
"VLSI Design Methodology: The Problem of the 80's
for Microprocessor Design"
Caltech Conference on VLSI, 1979
[19] Locanthi, B.
"LAP: A Simula Package for IC Layout"
Caltech SSP Report #1862
California Institute of Technology, 1978
[20] Mead, C.A. and Conway, L.
Introduction to VLSI Systems
Addison-Wesley Publishing, Reading, MA., 1980
[21] Mosteller, R.C.
REST -- Stick Diagram Editing System
Masters Thesis
California Institute of Technology, 1981
[22] Oestreicher, D.
"PLASYS: Final Report"
Caltech SSP Report #3655
California Institute of Technology, 1980
[23] Parker, A., et al.
"The CMU Design Automation System:
An Example of Automated Data Path Design"
Proceedings of the 16th Design Automation Conference, 1979
[24] Rowson, J.A. and Trimberger, S.
"Riot -- A Stupid Graphical Composition Tool"
Caltech SSP Technical Report #4142
California Institute of Technology, 1981
[25] Rowson, J.A.
Understanding Hierarchical Design
Ph.D. Thesis
California Institute of Technology, 1980
[26] Schorr, H.
"Computer-Aided Digital System Design and Analysis
Using a Register Transfer Language"
IEEE Transactions, Electronic Computers EC-13, Dec. 1964, p. 730-737
[27] Segal, R.
Structure, Placement and Modelling
Masters Thesis (#4029)
California Institute of Technology, 1980
[28] Sequin, C.
"STIF: A Proposal for a Structured Topological Interchange Format"
University of California, Berkeley, 1980
[29] Tarolli, G.
"Towards a Working VLSI CAD Tool: A Chip Assembler"
Caltech SSP Report #3131
California Institute of Technology, 1979
[30] Trimberger, S.
"Combining Graphics and Layout Language in a Single Interactive System"
Caltech SSP Technical Report #3794
California Institute of Technology, 1980
[31] Trimberger, S.
"Nick -- A FORTRAN Layout Language Package"
Caltech SSP Report #3485
California Institute of Technology, 1980
[32] Trimberger, S.
"The Proposed Sticks Standard"
Caltech SSP Technical Report #3880
California Institute of Technology, 1980
[33] Trimberger, S.
A Wire Oriented Mask Geometry Editor
Masters Thesis
California Institute of Technology, 1979
[34] Williams, J.D.
Sticks -- A New Approach to LSI Design
Masters Thesis
Massachusetts Institute of Technology, 1977
