Standard Transistor Array (STAR).  Volume 1:  Placement technique by Cox, G. W. & Carroll, B. D.
(NASA-CR-161289) STANDARD TRANSISTOR ARRAY (STAR), VOLUME 1: PLACEMENT TECHNIQUE Final Report (Auburn Univ.)
NASA CONTRACTOR 
REPORT 
NASA CR-161289 
STANDARD TRANSISTOR ARRAY (STAR)
Volume 1: Placement Technique
By G. W. Cox and B. D. Carroll
Electrical Engineering Department
Auburn University 
Auburn, Alabama 36830 
July 26, 1979 
Prepared for 
NASA - George C. Marshall Space Flight Center
Marshall Space Flight Center, Alabama 35812 
https://ntrs.nasa.gov/search.jsp?R=19810023843 2020-03-21T11:25:05+00:00Z
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
I. INTRODUCTION
II. BACKGROUND
III. THE LINEAR ORDERING-FOLDING (LOF) TECHNIQUE
IV. STAR PLACEMENT OPTIMALITY MEASUREMENT
V. THE CELL ARRANGEMENT PROGRAM FOR STAR (CAPSTAR)
VI. PERFORMANCE OF PROCEDURES
VII. CONCLUSION
REFERENCES
APPENDIX A
EQUATIONAL DEVELOPMENT
LIST OF TABLES
1. Test Circuit Parameters
LIST OF FIGURES
STAR CMOS Bulk-Metal Understructure
STAR Logic Cell
STAR Placement
Typical Polycell LSI Organization
STAR Cell Global Paths
Double-Level Metallization and Vias
STAR Cell Pin Range
Element Interconnections
Logic Circuit and Equivalent Graph Model
Alternate Graph Models for Net
Functional Flow of Clustering Procedure
Binary Tree Representation of Results of Clustering Step
Subtree Rotation
Functional Flow of Decomposition Procedure
Could Criteria Clustering Models
Uniform Cell Folding Methods for Infinite STAR
STAR Transistor Area Channel Model
Block-Oriented Folding (Block Depth = 1)
Block-Oriented Folding (Block Depth = 2)
Relation Between Utilization and Block Depth
Non-Uniform Cell Placement
STARs on Which a Given Cell Set Cannot be Placed
Dependence of "Foldability" on Linear Order
Functional Flow of Folding Procedure
STAR Channel Usage Estimation Concepts
Dependence of Routing on Internal Cell Structure
Example of Net with Multiple Crossings of Row or Column
CAPSTAR High-Level Flow
Functional Flow of a Fold Cycle
CAPSTAR Performance for Test Circuits
Effect of MAXSOL on Best Pre-PI Rating
Effect of IMPROVE on Best Post-PI Rating
Effect of RN on Best Post-PI Rating
Effect of CN on Best Post-PI Rating
Effects of the Number of STAR Rows and Columns on Placement Rating
Effect of STAR Area on Placement Rating
Effect of Block Depth on Average Pre-PI Ratings
CAPSTAR and PWI Performance
Example Input File
Data Entry Step Output
Output Shown for Single Cluster Formation
Clustering Step Summary Output
Linear Ordering Step Output
Wirecross Output
Folding Summary and Result Rating Output
Pictorial Placement Output
Pad Placement, Grid Coordinate Translation, and Barrier Construction Output
CAPSTAR Output File
I. INTRODUCTION 
The physical design problem has long been recognized as 
a critical aspect of the implementation of any system. In 
its most basic form, the solution of this problem consists 
of translating a conceptual specification of the system into 
a physical implementation. 
A number of problems may occur in the satisfaction of 
this translation process. The conceptual system may have 
been designed without regard to physical limitations which 
must be overcome in order to actually construct the system. 
The conceptual specification may be presented at a high, 
functional, level which is not easily translated to the 
detailed level required for construction. In addition, due 
to the complexity of the design, it may not be possible to
"solve" the physical design problem in an acceptable amount
of time. 
Perhaps in no other area do these problems occur more 
frequently than in the design of digital LSI (Large Scale 
Integration) circuitry. The digital logic designer has 
historically worked at a gate level, often with disregard 
for the physical considerations necessary for implementation 
of this design at the transistor level. Internal 
connections between the transistors composing a gate and 
connections between gates occur with such frequency that 
"good" physical solutions can easily resemble bowls of 
spaghetti. Size, weight, electrical noise, and heat 
dissipation requirements may be ignored at the conceptual 
level, yet become critical once circuit construction has
begun.
To assist in overcoming these inherent difficulties, 
the digital logic physical design problem has been divided
into three specific steps: partitioning, placement, and 
routing. The first of these steps, partitioning, involves 
the assignment of circuit elements to modules (or chips) 
subject to the size limitations of each module and the 
desired inter-module connectivity. The second step, 
placement, entails specification of the exact location for
each element in a module subject to some optimizing criteria
such as ease of interconnection. Once the position for each
element is specified, the third step, in which the element
interconnection paths are located (i.e., the circuit is
routed), is performed.
All three steps of the design procedure are of equal
importance to the realization of a physical circuit 
implementation, and algorithms have been developed for the 
satisfaction of each step. These existing techniques have 
been developed to meet the physical design requirements of 
particular LSI organizations and manufacturing strategies. 
Due to the wide variety of LSI development systems, certain 
techniques are more adaptable to one system than to another. 
Thus, while the general nature of many of the techniques is 
similar, specialization of physical design procedures to a 
single technology is the rule. 
In the remainder of this report, a procedure for
implementation of the placement step is presented. This 
procedure is oriented toward a new LSI technology developed 
at Marshall Space Flight Center. The technology is briefly 
described in the following section. Objectives and 
organization of the report are more explicitly stated 
at the end of this chapter. 
The Standard Transistor Array (STAR) 
Historically, a low-volume user requiring a 
special-purpose digital integrated circuit (IC) has been
forced to weigh the advantages of integration (such as
design secrecy, reliability gain and size reduction) against
the high costs associated with the development of a custom
IC. These high costs were imparted due to the nature of
custom integrated circuit development technology which is 
geared toward large quantity production. 
In the last several years, alternate methods of
development of special-purpose integrated devices, suited
for low-volume applications, have been achieved. The
popularity of these devices, known as semicustom IC's, is
reflected by the large number of manufacturers who
specialize in them [1].
These manufacturers have attacked the cost problem in 
several ways. Almost uniformly, the construction of the 
semicustom circuits consists of forming masks for the 
interconnection of a standard (perhaps pre-fabricated) 
understructure of transistors. This organization can be 
contrasted with that of the custom IC process, in which all 
active devices and interconnects are specialized to the 
application and require separate fabrication steps. 
A second factor contributing to cost reduction is the 
use of a standard cell approach. It was very quickly
realized that, if the internal connections of common digital
logic elements were pre-defined as cells and stored,
customization could be achieved by selecting an appropriate
set of cells, arranging them on the standard understructure
and interconnecting them as specified by the designer. The 
costs associated with physical design can thus be reduced 
from those required for transistor-level design to the costs 
for a cell-level implementation. This cost reduction is 
due, in great part, to the reduced size (number of elements) 
of the physical design problem which must be solved. 
A third area in which cost reduction techniques have 
been applied concerns the means by which the physical design 
problem is solved. Hand design, which might be suited for
custom technologies, must be abandoned in favor of more
cost-effective automated techniques. Even some computerized
procedures must be eliminated due to requirements for
excessive time, excessive storage, large computing
facilities, or specialized I/O devices.
Based on the need for semicustom integrated circuits
and on recognition of the factors discussed, Standard
Transistor ARray (STAR) [2] processing technology has been
developed at NASA's Marshall Space Flight Center facility.
This system uses a matrix of transistors as the
understructure and is supplied with a comprehensive library
of standard digital logic cells which are implemented by
means of a two-level metallization process. 
While the STAR organization is adaptable to many 
diverse technologies, during development of the system a 
bulk-metal CMOS (complementary MOS) technology is being
used. A sketch of the CMOS understructure is shown in Figure
1 with a typical cell shown in Figure 2. A sketch of a
typical STAR placement is shown in Figure 3. 
Figure 1. STAR CMOS Bulk-Metal Understructure
Figure 2. STAR Logic Cell (NOR): a. Cell Structure, b. Circuit Diagram
Figure 4. Typical Polycell LSI Organization
The STAR organization shown in these figures can be
contrasted with the more common polycell layout organization
shown in Figure 4. The most evident difference between
these organizations is the lack of routing channels between
cell rows in STAR. The paths for cell interconnection
(global paths) in the STAR organization are provided
internally to each cell (Figure 5). By performing vertical
routing on the first level, four global paths are made
available in the horizontal direction on each
double-transistor row (STAR row). In addition, vertical
passage through the cells is provided by a vertical global
path in each transistor column (STAR column). A STAR cell
which is n transistors in width supplies 4 horizontal and n
vertical paths for non-terminating interconnections. There
are ROWS x COLS of these horizontal and vertical path
segments in a STAR placement, where ROWS is the number of
STAR rows and COLS the number of columns. It should be
noted that metal paths between levels (vias, as illustrated
in Figure 6) are provided for the connection of horizontal
and vertical segments.
A second innovation present in the STAR organization is
indicated in Figure 7. External connections to a cell may 
enter from any side and can be joined to a cell bus at a 
number of points. This allows a greater degree of 
flexibility than the single cell entrance side and 
connection point provided in most technologies. 
The STAR processing methodology has been proven by the 
fabrication of a number of test circuits. Due to the lack 
of cell placement facilities, however, cell arrangement on 
the chips has been performed by manual techniques. In the 
following section, the objectives of this dissertation 
(which include description of automated STAR cell placement 
methods) will be stated. 
Figure 6. Double-Level Metallization and Vias (Top View, Side View)
Objectives and Organization 
The overall concern of the work on which this
report is based is the development of simple automated 
STAR cell placement procedures which will enhance the ease 
of placement routing. The primary objective of the 
report, then, is the description of the placement
techniques developed. 
A second objective of the report is to 
demonstrate that, by modification of the algorithms, a 
placement procedure adapted to use with traditional LSI 
organizations can be utilized with non-polycell technologies 
such as STAR. The final objective is to illustrate STAR 
placement optimality measurement techniques developed. 
In Chapter II, background information regarding the
placement problem and existing methods for its solution is
given. Chapter III describes the placement techniques
developed for use with STAR. Chapter IV contains a
discussion of several of the placement optimality 
measurement techniques developed. Chapter V details the 
integration of the various techniques into the final 
placement system. Placement system performance results are 
shown in Chapter VI. The final chapter presents a summary
of results and recommendations for future work.
Equational developments are shown in an Appendix to 
this report. 
II. BACKGROUND
The LSI cell placement problem is discussed in general 
in this chapter. The first section concerns problem 
definition, characteristics, and modelling. 
Previously-developed methods for problem solution are 
described at the end of the chapter. 
The Placement Problem 
As stated in the preceding chapter, solution of the 
placement problem involves identification of optimum 
locations for circuit elements within a module. A more
exact definition of the problem, as applied to LSI 
technology, is: 
The LSI placement problem consists of 
identification of the optimum arrangement of 
elements on the chip with respect to criteria 
defined on element interrelations. 
This statement of the problem, providing more flexibility 
than many prior definitions (such as that given by Hanan and 
Kurtzberg [3]) will now be discussed.
Discussion of Problem Statement 
A typical partitioned digital LSI circuit will consist
of logic devices (AND gates, OR gates, flip-flops, etc.) and
of pads (metal areas used for interconnection to off-chip
devices). Each logic device will internally consist of the
active and passive components required to perform the 
desired logic function in a given technology (TTL, PMOS, 
CMOS, etc.). 
It is thus possible to perform placement of the circuit
at either of two levels: component placement in which the
placement routine should assign optimum locations for each
resistor, transistor, etc., and gate placement (or cell
placement) in which the placement routine should form an 
optimum placement of the circuit components at a 
gate-description level. 
The "elements" referred to in the problem statement,
then, may be circuit constituents of various levels of 
internal complexity. While the level at which a placement 
routine must work is typically consistent for a single 
problem, a completely general placement routine should be 
able to perform efficiently at either level. 
A second item requiring consideration in the statement 
of the placement problem is the meaning of "element
interrelations". These relations are typically well-defined
for a given problem and are used as a basis for the 
calculation of the optimality of a placement. 
Figure 8. Element Interconnections (Connection, Net)
The most immediate element interrelation is that
specified by electrical connections between elements. In
the simplest case, shown at the top of Figure 8, each
connection joins only two elements. For this case, the
element interrelation is binary in nature and need only
specify whether a connection exists between each pair of 
elements. Almost universally, however, points on several 
elements are electrically common and a net is used for 
element interconnection. An example of this structure is 
also shown in Figure 8. The resultant element interrelation
can be modelled as a binary relation (as will be discussed
later in this chapter) but is most accurately represented as
a relation between elements and nets. For notational
convenience in the following sections, the terms "net" and
"interconnection" are used to indicate an electrical
connection between two or more elements. The term 
"connection" is used to describe a net between only two 
elements. 
A second possible interrelation occurs due to the 
variety of power-consumption characteristics of the circuit 
elements. Since the power consumed by an element is related 
to the heat dissipated by the element, it is possible to 
form "hot spots" on the chip by placing high-power elements 
in proximity to each other. If this is of concern in the 
chosen LSI technology, an interrelation may be specified 
which represents the power requirements of the elements. A 
properly-chosen optimality criterion can then be used to
discourage placements in which hot spots occur. 
While other interrelations are conceivable, the most 
commonly used is that based on element interconnections. 
This is due both to the fact that interconnection data is 
easily retrieved from a circuit description and to the 
assumption that this relation can be used to form optimality
measures for the most common criteria over all LSI
technologies. Further discussion of element interrelations
and optimality criteria is deferred to a later section.
The most disturbing (and, in practice, the most 
difficult to implement) feature of the placement problem 
statement is the specification of an optimum solution. As 
will be discussed in a later section, the placement problem 
cannot be solved in general in a realistic amount of time 
(at least by known methods). The only known methods for 
finding exact solutions of the placement problem (i.e.,
location of the optimum) involve investigation of the
complete problem solution space. Since even a small,
13-element placement problem has a solution space containing
over 6 billion (13!) placements, the derivation of an
optimum solution for any realistic problem requires an
unreasonable amount of time on existing computing systems. 
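The size of this solution space can be checked directly. The following minimal sketch assumes that each of the n elements occupies one of n distinct positions, so that the number of candidate placements is n!; the function name is illustrative only and does not come from the report.

```python
import math


def placement_count(n_elements: int) -> int:
    """Number of distinct arrangements of n elements over n positions (n!)."""
    return math.factorial(n_elements)


# Growth of the exhaustive search space with problem size.
for n in (5, 10, 13):
    print(n, placement_count(n))
```

For n = 13 the count exceeds 6 billion, which is why exhaustive search is ruled out even for small circuits.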
The most common placement problem solution methods, 
then, strive to identify either an optimal (optimum within a 
restricted region of the solution space) or near-optimum
(predictable as lying within a certain percent of the
optimum) solution. Placement techniques which can realize
these objectives have been developed and have been shown to
perform within computationally feasible time limits.
Several of these are described in a later section.
A final clarification of the placement problem
statement regards possibilities for element arrangement. In 
a number of LSI technologies, elements may be placed at any 
location on the chip as long as minimum inter-element 
distances are observed. In other organizations (gridded
technologies), of which STAR is an example, a grid is 
specified and element positions must be selected to align 
with the grid. Organizations of the first kind are 
attractive since they, in general, provide the most dense 
element packings. The existence of a grid, however, is a 
great aid to chip modelling and element positioning and 
tends to simplify placement routines.
A third organization, related to the gridded LSI 
organization, specifies "slots" on the chip into which the
elements are fitted. This technique is based on the
characteristics of printed-circuit boards and is very seldom
used for placement of elements in the active area of an LSI
circuit. Pad locations, however, are typically maintained at
fixed positions on the chip periphery and may be viewed as
slots into which the circuit pads are fitted.
Since the "goodness" of a placement is judged on the
basis of criteria defined on the element interrelations, 
these criteria must reflect all placement characteristics 
which are desired in the final element arrangement. Several 
desirable characteristics and the corresponding criteria are 
described in the following section. 
Optimality Criteria 
As previously outlined, the placement optimality 
criteria selected for use with a placement routine will
determine the extent to which the results of the routine
will possess desired characteristics. Among the 
characteristics which are most commonly used as the 
objectives of placement routines are: 
1. minimum required interconnection length, 
2. minimum required non-linear routing paths, 
3. minimum routing channel crowding, and 
4. maximum routing ease. 
Since the result of the placement step can be viewed as 
a foundation for the routing step, characteristic 4 is 
typically the most desired quality for the results of any
placement system. However, the translation of "maximum 
routing ease" into a quantitative measure which can be used 
for placement optimization is anything but straightforward. 
First, a multiplicity of routing techniques exist and 
placements which are "easily" routed by one may not be
routable by use of another. Second, even if a known
routing system is to be used, information regarding routing 
ease is sketchy - a placement can either be completely 
routed (easy routing) or not (hard routing) and differences 
between the "easy" and "hard" cases may be neither apparent 
nor consistent. 
The use of better-defined, more measurable 
characteristics has thus gained acceptance in modern 
placement systems. By far, the most popular characteristic 
used is characteristic 1 (minimum required interconnection 
length). The wide acceptance of this characteristic as the 
most important (often the only) driving force behind a 
placement system is based on several factors: 
1. total interconnection length is relatively easy to 
estimate given only the element placement, the net 
lists, and a presumed routing scheme, 
2. the criterion to be used in the placement procedure for 
optimality measurement is easily derived from the 
characteristic (for example, placement A is more nearly 
optimum than placement B if the total interconnection 
length of A is less than that of B), and
3. as noted by Hanan and Kurtzberg [3], an LSI placement
in which total interconnection length is optimized may 
be near-optimum in other respects, such as ease of 
routing. 
The rationale behind factor 3 may be summarized as 
follows: the existence of many long (relative to the chip 
size) interconnections increases routing difficulty by using 
a large fraction of the total available routing area and by 
forming blockages of other interconnections which would 
optimally use portions of the same paths. By minimizing
total interconnection length, many of the elements in a net 
are assigned locations in proximity to each other, thus 
reducing the average interconnection path length and 
(hopefully) the number of long paths, leading to increased 
routing ease. 
No known proof of the relationship between minimum 
total interconnection length and routing ease has been 
presented. However, the large number of existing placement 
routines which optimize with respect to this characteristic 
indicates satisfactory placement routine performance is 
obtainable. 
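The comparison described in factors 1 and 2 can be sketched as follows. This is a minimal illustration, not the report's method: it assumes a half-perimeter bounding-box estimate as the "presumed routing scheme", and the net names, element names, and grid coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, Tuple[int, int]]  # element name -> (column, row)
NetList = Dict[str, List[str]]          # net name -> elements on the net


def net_length(net: List[str], placement: Placement) -> int:
    """Estimated length of one net: half-perimeter of its bounding box
    (an assumed estimator, chosen here only for simplicity)."""
    xs = [placement[e][0] for e in net]
    ys = [placement[e][1] for e in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_length(nets: NetList, placement: Placement) -> int:
    """Total estimated interconnection length over all nets."""
    return sum(net_length(net, placement) for net in nets.values())


# Hypothetical circuit: one 2-element connection and one 3-element net.
nets = {"n1": ["a", "b"], "n2": ["b", "c", "d"]}
placement_a = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (1, 1)}
placement_b = {"a": (0, 0), "b": (3, 1), "c": (0, 1), "d": (2, 0)}

# Placement A is "more nearly optimum" than B if its total length is smaller.
print(total_length(nets, placement_a), total_length(nets, placement_b))
```

Only the placement and the net lists are needed for the estimate, which is what makes this characteristic convenient as a placement criterion.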
Characteristic 3 (minimum routing channel crowding) 
stems from the typical finite-capacity routing areas
(channels) available in gridded technologies and from the
desire for maximum chip density in non-gridded technologies.
Since interconnection paths occupy space which could
otherwise be used for placement of logic elements in the
non-gridded applications, the motivation behind this
characteristic for these technologies is immediate. 
In the gridded LSI organizations, locations and 
capacities (maximum number of routing paths) of routing 
channels are typically pre-defined. Since all inter-element 
connections must be routed by way of these channels, the 
placement formed must not force their overuse. 
Precise calculation of the channel usage requirements
for a placement of a given circuit is not feasible without
performance of the routing step. However, it is possible to
obtain simple estimates of channel crowding. Methods for
this estimation will be given in a later chapter. 
The second desirable characteristic (minimum non-linear 
routing paths) has its basis in a technique used in many LSI
technologies. In these organizations, all interconnection 
paths are composed of horizontal and vertical segments. The 
horizontal segments of all paths reside on one layer of the 
chip and the vertical segments on another layer. Path 
segments on the two levels are connected by vias through the 
insulation layer. 
The three levels (horizontal, vertical and insulation) 
are formed in different processing steps by use of different 
patterns (masks). At each point at which a via exists, 
strict alignment between the levels is required. Allowable 
level alignment tolerances vary inversely with the number of 
vias. Thus, with increasing vias, the difficulty of chip 
fabrication will, in general, increase. Since each 
non-linear interconnection path consists of both horizontal 
and vertical segments, the minimization of non-linear paths 
can be expected to aid in reduction of fabrication 
difficulties and thus, may be a placement technique 
objective. 
In the placement step, the number of non-linear routes 
can only be minimized by assuring either horizontal or
vertical alignment of connection points on each element. 
For many LSI technologies, in which only one tie point (pin) 
exists on an element for each incident net, the alignment 
problem is very difficult to solve. For other LSI
organizations, including that with which this dissertation
is concerned, a number of tie points exist for each 
interconnection to an element and element alignment can be 
achieved in a number of ways. For these latter 
organizations, the objective of minimizing the number of 
non-linear routes may be feasible.
In this section, various desirable characteristics of 
LSI placements have been presented and criteria for 
achieving the characteristics in the solution of the 
placement problem have been outlined. The following 
sections will discuss the placement problem with respect to
classical problems and will present modelling techniques 
commonly used for solution. 
Circuit Modelling and Problem Characterization 
Since the placement problem is computationally
inconvenient in many respects, it is fortunate that simple, 
yet powerful circuit modelling tools are available. As in 
many electrical problems, the most facile modelling 
techniques use circuit representations in the form of graphs 
or in forms derived therefrom. In the following paragraphs, 
modelling methods will be outlined and will be used to gain 
insight into the placement problem. 
Figure 9 shows a simple digital circuit and a graph 
model derivable from it. Several general characteristics of 
circuit models for the placement problem can be seen from 
this figure. The mapping of elements to nodes and element 
interconnections to edges is typical of logic circuit 
models. While other modelling applications (e.g.,
simulation) may require directed edges for preservation of 
signal flow sense, placement routines are, in general, 
exclusively concerned with the existence of connections 
between elements so that an undirected graph is 
satisfactory. 
While modelling of connections between elements (such 
as A - E) is immediate, the mapping of a higher-order net 
(the 3-element net F) onto the graph model is not. The
modelling process in the case of Figure 9 is such that the 
net F is represented as a complete graph on elements 4, 5, 
and 6. 
Figure 9. Logic Circuit and Equivalent Graph Model
Two major problems exist with this net-to-complete
graph model. First, since the number of edges in a complete
graph on n nodes is n(n - 1)/2,
the data required for element interconnection description
increases rapidly for circuits containing large nets. 
Second, modelling an n-element net as a complete graph on 
the n nodes implies a high degree of "connectedness" between 
the elements whereas the elements are actually only joined 
by one interconnection. Many existing placement algorithms 
tend to place tightly connected elements in clusters on the 
chip. The groupings formed, then, may be unjustly biased 
toward large nets at the sacrifice of smaller, equally 
important, nets. 
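Both problems can be seen in a short sketch of the net-to-complete-graph expansion. The function names below are illustrative and are not taken from the report; the element labels 4, 5, and 6 follow the example net F of Figure 9.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def clique_edges(net: Sequence[int]) -> List[Tuple[int, int]]:
    """Model a net as a complete graph: one edge per pair of elements."""
    return list(combinations(net, 2))


def clique_edge_count(n: int) -> int:
    """Number of edges in a complete graph on n nodes, n(n - 1)/2."""
    return n * (n - 1) // 2


# The 3-element net F on elements 4, 5 and 6 becomes three edges.
print(clique_edges([4, 5, 6]))

# Edge data grows quadratically with net size, so a single large net
# contributes far more edges than many 2-element connections.
for size in (2, 3, 10, 50):
    print(size, clique_edge_count(size))
```

A 10-element net is one electrical interconnection, yet under this model it carries the same edge count as 45 separate 2-element connections, which is the source of the clustering bias described above.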
Two alternate graph models for a 3-element net are 
shown in Figure 10. The net-to-star model shown in this
figure (proposed by Goldstein and Schweikert [5]) overcomes
the problems of the previously-described technique by
modelling the net as a star centered on an artificial vertex 
V. The net is then correctly represented with a single 
incidence on each element vertex. However, the connections 
between elements are inaccurately modelled as indirect 
(through the vertex V). Special handling of the net 
vertices in the placement routine is possible, but the 
requirement for recognizing two vertex types may 
unnecessarily complicate the placement routine. 
Figure 10. Alternate Graph Models for Net (Net-to-Star Model, Net-to-Chain Model)
The net-to-chain model shown in Figure 10 is,
conceptually, the simplest of the modelling techniques
shown. In this model, the 3-element net has been represented
as two edges forming a chain between the vertices. While
this is the most computationally convenient of the three
methods, it is important to realize that the flexibility 
inherent in the ordering of the elements in the net has been 
lost. For example, if the placement routine performs such 
that connected elements are adjacent in the horizontal 
direction, each of the 6 placements (123, 132, 213, 231, 
312, 321) should be equally acceptable. However, for the 
net-to-chain model, only (123) or (321) are accepted. Since 
others of the 6 possibilities might provide a more optimum 
total placement (due to connections to elements not in the 
net), the placement routine performance may be poor.
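The loss of ordering flexibility can be demonstrated with a small sketch of the star and chain models. The function names and the adjacency test are assumptions made for illustration: a linear order is taken to "satisfy" the chain model only when every chain edge joins elements in neighbouring positions.

```python
from itertools import permutations
from typing import Hashable, List, Sequence, Tuple


def star_edges(net: Sequence[Hashable], center: Hashable = "V") -> List[Tuple]:
    """Net-to-star model: one edge from an artificial vertex to each element."""
    return [(center, e) for e in net]


def chain_edges(net: Sequence[Hashable]) -> List[Tuple]:
    """Net-to-chain model: edges between consecutive elements of the net."""
    return list(zip(net, net[1:]))


def chain_satisfied(order: Sequence[Hashable], net: Sequence[Hashable]) -> bool:
    """True if every chain edge joins elements adjacent in the linear order."""
    pos = {e: i for i, e in enumerate(order)}
    return all(abs(pos[a] - pos[b]) == 1 for a, b in chain_edges(net))


net = [1, 2, 3]
print(star_edges(net))   # three edges, one incidence per element
print(chain_edges(net))  # two edges: (1, 2) and (2, 3)

# Of the 6 possible horizontal orders, the chain model accepts only two.
accepted = [p for p in permutations(net) if chain_satisfied(p, net)]
print(accepted)
```

The orders 132, 213, 231, and 312 keep all three elements mutually close, but the chain model rejects them because it fixes which pairs must be neighbours.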
The graph models shown in this section all map circuit 
elements onto vertices and connections onto edges. 
Alternate graph models are available which can provide 
excellent representations of a logic circuit for purposes 
such as simulation and routing. Since these models are only 
infrequently proposed for use with the placement problem, 
they will not be discussed here. Vancleemput and Linders
[6] have presented an excellent summary of the forms and
characteristics of these models. 
While the circuit graph model is useful for human
understanding, alternate forms derivable from the graph are
more suited to computer implementation. These forms are
typically matrices which specify the network structure.
Among the matrices most commonly used for the placement
problem are the incidence matrix, A, which specifies
edge-vertex adjacencies and the adjacency or connection
matrix, C, which describes vertex-vertex adjacencies (i.e.,
connected elements). Of these, the connection matrix
(defined by C(I, J) = the number of edges between vertices
I and J) is most common.
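As an illustration, the connection matrix can be built from an edge list as sketched below. Vertices are assumed to be numbered from 0, and the function name is hypothetical.

```python
def connection_matrix(n, edges):
    """Return C with C[i][j] = number of edges between vertices i and j.

    Assumes 0-based vertex numbers and edges between distinct vertices.
    """
    C = [[0] * n for _ in range(n)]
    for i, j in edges:
        C[i][j] += 1
        C[j][i] += 1   # the matrix is symmetric
    return C

# Two parallel edges between vertices 0 and 1, one edge between 1 and 2.
C = connection_matrix(3, [(0, 1), (0, 1), (1, 2)])
```

The n-by-n storage is allocated regardless of how few edges exist.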
Unfortunately, the matrix representations may not be 
computationally feasible for large networks since the C 
matrix grows with the square of the number of graph vertices 
(circuit elements) and the A matrix grows with the product 
of the number of vertices and edges. Because the matrices 
are typically sparse (a large fraction of the elements are
0) or symmetric (in the case of the C matrix), data
reduction techniques might be used to reduce size 
requirements. However, the main attractiveness of the matrix 
representation (fast access to circuit connection data) may 
be lost. 
An alternate method of circuit data representation, 
which requires significantly less storage, is the 
maintenance of lists detailing, for each circuit net, the
elements upon which the net is incident. While no explicit
net model is required for this technique, data convenience
and access speed are sacrificed by its use. This method has 
been adopted for use in the placement system which this 
report describes and further discussion will be 
deferred to a later section. 
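A minimal sketch of such a net-list representation follows. The net and element names are invented for illustration, and the actual data layout used by the placement system is not assumed here.

```python
from collections import defaultdict

# Hypothetical circuit: each net lists the elements it is incident upon.
nets = {"n1": ["a", "b", "c"], "n2": ["b", "d"], "n3": ["a", "d"]}

def nets_by_element(nets):
    """Invert the net lists: element -> names of the nets touching it."""
    index = defaultdict(list)
    for name, elements in nets.items():
        for e in elements:
            index[e].append(name)
    return dict(index)

def connected_elements(nets, element):
    """Elements sharing at least one net with `element`.

    Every net is scanned, which illustrates the slower access
    compared with a direct matrix lookup.
    """
    found = set()
    for elements in nets.values():
        if element in elements:
            found.update(elements)
    found.discard(element)
    return found
```

Storage grows with the total number of net incidences rather than with the square of the element count.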
As previously mentioned, some of the circuit graph 
models can be used to provide insight into the placement 
problem. In particular, by constructing the net-to-complete
graph model as described, the general logic circuit
placement problem is reduced to a problem in which a simple
relation (defined by the graph edges) exists between each
pair of vertices (elements). The reduced placement problem
then becomes the problem of optimally assigning elements to
chip positions with respect to criteria defined on relations
between each pair of cells, which is equivalent to the
classical quadratic assignment problem [3]. There are no
known methods for the solution of the quadratic assignment
problem which are computationally feasible with respect to
execution time. In fact, exact solution methods for this
problem are merely strategies for complete investigation of
the solution space. 
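The cost of complete investigation can be seen in a small sketch. It assumes a cost of the form sum of C(I,J) times the distance between the positions assigned to I and J, with a hypothetical 4-element circuit placed on a line of 4 slots; none of these names come from the report.

```python
from itertools import permutations

def placement_cost(C, D, assign):
    """Sum of C[i][j] * D[slot_i][slot_j] over all element pairs."""
    n = len(C)
    return sum(C[i][j] * D[assign[i]][assign[j]]
               for i in range(n) for j in range(i + 1, n))

def exhaustive_placement(C, D):
    """Examine all n! assignments of elements to positions."""
    n = len(C)
    best = min(permutations(range(n)),
               key=lambda p: placement_cost(C, D, p))
    return best, placement_cost(C, D, best)

# Hypothetical connection matrix and slot-to-slot distances on a line.
C = [[0, 0, 3, 0],
     [0, 0, 0, 1],
     [3, 0, 0, 2],
     [0, 1, 2, 0]]
D = [[abs(a - b) for b in range(4)] for a in range(4)]

best, cost = exhaustive_placement(C, D)
```

Only 24 assignments are examined here, but the count grows as n!, which is what rules out this approach for circuits of practical size.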
This apparent lack of promise in the search for 
efficient methods for the exact solution of the placement
problem explains the profusion of heuristic procedures which
have been developed for identification of near-optimum
solutions. A number of these procedures have proven
particularly successful and will be presented in the
following section. 
Prior Work 
As noted in the preceding section, numerous methods for 
approximate solution of the placement problem have been 
proposed. The intent in this section is to outline 
characteristics of various method classes and to present
existing solution methods in the framework of this 
classification. 
Classification of Techniques 
The placement techniques to be described might be 
classified in a number of ways. For convenience here, two 
major divisions of placement problem solution methods will 
be recognized. These are initial placement (IP) techniques
and placement improvement (PI) techniques. 
The IP class will consist of all techniques which form 
an element placement from an unplaced set of elements and 
interconnections. The PI class will contain methods which 
modify a given starting placement to produce a more nearly 
optimum placement. 
Since many composite placement systems can be 
constructed by following an IP technique with one or more PI 
techniques, no attempt will be made to detail these systems.
Initial Placement Techniques 
Conceptually, the simplest of the IP techniques is the 
Monte Carlo (or "shotgun") placement method [3]. In this
procedure, the circuit elements are randomly assigned to
locations on the chip on the basis of a uniform
distribution. The assignment procedure is repeated a large
number of times and the best placement found is retained.
As noted in [3], the performance of the procedure is poor
due to the extremely low probability of randomly selecting a
"good" placement from the solution space.
Also noted in [3] is the existence of a Monte Carlo
technique in which the probability with which an element is
assigned to a particular location is biased by the past
experience of "good" assignments for the element.
Performance of this technique is limited by the time 
required for distribution adjustment of each iteration. 
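The basic (unbiased) Monte Carlo procedure can be sketched as follows. For brevity the chip is assumed to be a single row of slots and the cost is taken as weighted total connection length. Both are simplifying assumptions, as is the use of a seeded generator for repeatability.

```python
import random

def total_length(C, slot_of):
    """Weighted connection length for elements placed in 1-D slots."""
    n = len(C)
    return sum(C[i][j] * abs(slot_of[i] - slot_of[j])
               for i in range(n) for j in range(i + 1, n))

def monte_carlo_placement(C, trials, seed=0):
    """Try `trials` uniformly random placements and keep the best."""
    rng = random.Random(seed)
    n = len(C)
    best, best_cost = None, float("inf")
    for _ in range(trials):
        slots = list(range(n))
        rng.shuffle(slots)            # uniform random assignment
        cost = total_length(C, slots)
        if cost < best_cost:
            best, best_cost = slots, cost
    return best, best_cost

# Hypothetical 4-element connection matrix.
C = [[0, 0, 3, 0],
     [0, 0, 0, 1],
     [3, 0, 0, 2],
     [0, 1, 2, 0]]
best, cost = monte_carlo_placement(C, trials=50)
```

The biased variant would replace the uniform shuffle with a distribution updated after each trial, which is where the additional time is spent.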
The pair-linking IP technique [7] represents a
considerable improvement over the Monte Carlo methods. In
this procedure, the most highly-connected pair of elements
is selected and placed on the chip to form the placement
nucleus. In each following iteration of the procedure, the 
unplaced element which is most connected to a placed element 
is selected and is placed as near as possible to its 
partner. 
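A minimal sketch of pair-linking is given below. It assumes a list of free (row, column) locations, Manhattan distance, and a nucleus seeded at the first location in the list; these choices, and all names, are illustrative and are not taken from [7].

```python
def dist(a, b):
    """Manhattan distance between two (row, column) locations."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def pair_linking(C, slots):
    """Place elements 0..n-1 of connection matrix C onto `slots`."""
    n = len(C)
    free = list(slots)
    pos = {}
    # Nucleus: the most highly connected pair of elements.
    i, j = max(((a, b) for a in range(n) for b in range(a + 1, n)),
               key=lambda p: C[p[0]][p[1]])
    pos[i] = free.pop(0)
    pos[j] = min(free, key=lambda s: dist(s, pos[i]))
    free.remove(pos[j])
    while len(pos) < n:
        # Unplaced element most connected to some placed element.
        u, partner = max(((u, p) for u in range(n) if u not in pos
                          for p in pos),
                         key=lambda t: C[t[0]][t[1]])
        pos[u] = min(free, key=lambda s: dist(s, pos[partner]))
        free.remove(pos[u])
    return pos

C = [[0, 0, 3, 0],
     [0, 0, 0, 1],
     [3, 0, 0, 2],
     [0, 1, 2, 0]]
pos = pair_linking(C, [(0, 0), (0, 1), (1, 0), (1, 1)])
```

Each step looks only at the single strongest link, which reflects the restricted view of global placement characteristics noted for this class of technique.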
A related technique, cluster development [7,8], begins
with a nucleus element which is positioned on the chip. On 
each succeeding iteration, the unplaced element which is 
most connected to elements in the nucleus is selected and is 
placed as near as possible to the center of the positions of 
the placed elements it is connected to. 
Due to their simplicity and relatively good performance 
characteristics, the pair-linking and cluster development 
techniques are among the most commonly-used IP routines. 
However, due to their somewhat restricted view of the global 
placement characteristics, actual application systems
utilizing these techniques almost always follow them with
one of the more powerful PI routines.
Branch and bound techniques [9,10] have been shown to
be capable of forming excellent solutions to the quadratic
assignment problem, and hence can be applied to the
placement problem. These techniques are the only commonly-
proposed methods which can be used to find an optimum
placement.
The branch and bound methods, in general, give a 
strategy for partitioning the solution space and for 
searching for the optimum in each partition. Lower bounds 
on the non-optimality (cost) of the solutions in each 
partition are computed and the search for the optimum in a
partition is terminated when the lower bound exceeds the
cost of some previous solution. An excellent description of
the process is given by Hanan and Kurtzberg [3].
While the bounding strategy eliminates the need for 
examination of many regions of the solution space, the time
requirements of the procedures are too great for
practical application. 
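The bounding idea can be illustrated with a small sketch. The lower bound used here is simply the cost of the pairs already placed, which is valid because no pair contributes a negative cost; the bounds of [9,10] are tighter and are not reproduced. The data and names are hypothetical.

```python
def branch_and_bound(C, D):
    """Exact minimum of sum C[i][j] * D[slot_i][slot_j] with pruning."""
    n = len(C)
    best = {"cost": float("inf"), "assign": None}

    def search(assign, used, cost):
        # Bound: the partial cost can only grow, so abandon this
        # partition once it cannot beat the best complete solution.
        if cost >= best["cost"]:
            return
        k = len(assign)               # next element to be placed
        if k == n:
            best["cost"], best["assign"] = cost, tuple(assign)
            return
        for slot in range(n):
            if slot in used:
                continue
            added = sum(C[k][i] * D[slot][assign[i]] for i in range(k))
            assign.append(slot)
            used.add(slot)
            search(assign, used, cost + added)
            assign.pop()
            used.remove(slot)

    search([], set(), 0)
    return best["assign"], best["cost"]

C = [[0, 0, 3, 0],
     [0, 0, 0, 1],
     [3, 0, 0, 2],
     [0, 1, 2, 0]]
D = [[abs(a - b) for b in range(4)] for a in range(4)]
assign, cost = branch_and_bound(C, D)
```

In the worst case every partition is still explored, which is consistent with the time requirements noted above.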
Modifications to the exact branch and bound procedure 
which allow the isolation of near-optimum solutions have 
been proposed by several authors (in particular, Gilmore
[11] and Hillier and Conners [12]). These approximate
branch and bound techniques can be utilized to significantly 
reduce the solution time required by the exact scheme. 
However, the complexity of the approximate methods remains 
high (on the order of the fourth power of the number of 
elements as compared to the second power for pair-linking 
and cluster development [3]) and time requirements may
remain prohibitively high. 
The final IP technique to be discussed will be referred 
to as the linear ordering-folding (LOF) technique. In this 
procedure, the placement problem is effectively divided into 
two parts: formation of a near-optimum one-dimensional 
placement (linear order) and "folding" of the linear order
onto the chip. 
This method has been used for a number of LSI 
technologies in which the final placement can conveniently 
be organized as rows of elements. Several of these
technologies use the MOS complex (or array) organization
[13,14,15] in which elements (typically, at the transistor
level) are serially interconnected to form the desired logic
function. The elements are arranged as required in the
linear order and a simple folding operation suffices to
arrange the order on the chip. Larsen [14] provides an
excellent discussion of the technology and possible layout 
techniques. 
A more general LSI organization in which the LOF 
technique has seen use is the polycell layout shown in the 
previous chapter. In this organization, elements (at the 
logic gate level) are arranged in back-to-back double rows 
and element interconnections are routed between the rows in 
interconnect channels. 
The LOF technique is easily adapted to this layout
organization since the elements can be placed in a
one-dimensional order and the rows can be easily obtained by
isolating appropriately-sized segments of the order and
arranging them on the chip. The use of LOF procedures to
obtain these organizations is reported by Mattison [16].
Similar techniques are used in the RCA-developed PRF program
[17].
The LOF procedure is attractive due to its relative 
simplicity and ease of adaption to certain LSI 
organizations. Application of LOF techniques to an alternate 
organization will be described in later sections. 
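The simple folding operation used for row-organized layouts can be sketched as below. Reversing alternate rows (a serpentine fold), so that cells adjacent in the linear order stay adjacent across a row boundary, is an assumption made for illustration and is not stated in the cited reports.

```python
def fold(order, row_length, serpentine=True):
    """Cut a linear order into rows of at most `row_length` cells."""
    rows = [order[i:i + row_length]
            for i in range(0, len(order), row_length)]
    if serpentine:
        # Reverse every second row so the order snakes down the chip.
        rows = [r[::-1] if k % 2 else r for k, r in enumerate(rows)]
    return rows

rows = fold(list("ABCDEFGH"), 3)
# rows == [['A', 'B', 'C'], ['F', 'E', 'D'], ['G', 'H']]
```

All of the placement quality comes from the linear order; the fold itself involves no optimization.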
Several existing IP methods have been outlined in this 
section. In the following section, a number of placement
improvement techniques are discussed. 
Placement Improvement Techniques 
Many existing PI procedures can be roughly categorized 
as interchange techniques. Among the simplest of these is 
the pair-wise interchange (PWI) placement improvement 
technique. In this procedure, a pair of elements in the 
placement is interchanged and the optimality of the 
resulting layout is calculated. If an improvement results, 
the new placement replaces the old. Each pair of elements 
is trial interchanged during an iteration and iterations 
continue until no further improvement is made or until a 
desired degree of optimality is obtained. 
As noted in [18], due to the large number of 
interchanges and optimality computations required during an 
iteration, the PWI technique is excessively time-consuming 
for large circuits. A variant to the basic PWI procedure 
which attempts to overcome this problem is the 
neighborhood-PWI (NPWI) technique. This routine limits the 
number of trial interchanges in an iteration by considering 
only element pairs which lie within a distance, D, of each
other. 
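Both PWI and NPWI can be expressed in one sketch, in which an optional distance limit restricts the trial interchanges. A one-row chip and a weighted-length cost are assumed for brevity, and the function and parameter names are hypothetical.

```python
def total_length(C, slot_of):
    """Weighted connection length for elements placed in 1-D slots."""
    n = len(C)
    return sum(C[i][j] * abs(slot_of[i] - slot_of[j])
               for i in range(n) for j in range(i + 1, n))

def pairwise_interchange(C, slot_of, max_dist=None):
    """PWI when max_dist is None; NPWI with neighborhood D = max_dist."""
    slot_of = list(slot_of)           # work on a copy
    n = len(slot_of)
    cost = total_length(C, slot_of)
    improved = True
    while improved:                   # one pass = one iteration
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                if (max_dist is not None
                        and abs(slot_of[i] - slot_of[j]) > max_dist):
                    continue          # outside the neighborhood
                slot_of[i], slot_of[j] = slot_of[j], slot_of[i]
                new_cost = total_length(C, slot_of)
                if new_cost < cost:
                    cost, improved = new_cost, True
                else:                 # no gain: undo the interchange
                    slot_of[i], slot_of[j] = slot_of[j], slot_of[i]
    return slot_of, cost

C = [[0, 0, 3, 0],
     [0, 0, 0, 1],
     [3, 0, 0, 2],
     [0, 1, 2, 0]]
start = [0, 1, 2, 3]
pwi_slots, pwi_cost = pairwise_interchange(C, start)
npwi_slots, npwi_cost = pairwise_interchange(C, start, max_dist=1)
```

The neighborhood limit reduces the number of cost evaluations per iteration, at the risk of stopping at a poorer placement.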
Use of PWI methods (or variants) in application 
environments has been reported in the BTL NOMAD system [8],
Raytheon's IPLACE [19], and in the Circuit Design System
developed at ADAGE, Inc. [20].
A much more sophisticated interchange routine is that 
reported by Steinberg [21]. This procedure achieves higher
performance than the PWI techniques by handling groups of 
elements rather than pairs. The groups for consideration 
are formed by partitioning the circuit elements into 
"maximal independent sets" which are the largest partitions 
that can be formed such that no two elements in a partition 
are connected.
Placement improvement proceeds by removing a maximal 
independent set from the placement and re-assigning the 
elements among the available locations. Since none of the 
removed elements are connected, the re-assignment problem is 
linear in nature and allows use of simple linear assignment 
techniques for solution. 
Placement technique comparison [18] has shown that 
Steinberg procedure performance lies below that of the NPWI 
technique for many problems. However, the algorithm has 
seen use in several application systems including the UNIVAC 
Automated Design System [22] and the Raytheon IPLACE
system [19]. 
A third general type of interchange technique is that 
typified by the Min-Cut Placement Algorithms described by 
Breuer [ 2 3 ] .  In these procedures, an imaginary cut-line 
divides the chip and elements are interchanged across it 
such that the number of interconnection paths crossing the 
cut line is minimized. On succeeding iterations, other cut 
lines are drawn and used for swapping (while recognizing the 
boundaries specified on preceding passes). The process is
one of successively refining an estimation of the optimum 
location for each element. 
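The quantity minimized at a single cut line, and a simple interchange pass across it, can be sketched as follows. This is a greedy illustration on hypothetical data and is not Breuer's algorithm; in particular, the sequencing of successive cut lines is not shown.

```python
def cut_size(nets, left):
    """Number of nets having elements on both sides of the cut line."""
    return sum(1 for net in nets
               if 0 < sum(e in left for e in net) < len(net))

def improve_cut(nets, left, right):
    """Swap element pairs across the cut while the cut size decreases."""
    left, right = set(left), set(right)
    best = cut_size(nets, left)
    improved = True
    while improved:
        improved = False
        for a in sorted(left):
            for b in sorted(right):
                trial = (left - {a}) | {b}
                c = cut_size(nets, trial)
                if c < best:          # accept the first improving swap
                    left, right = trial, (right - {b}) | {a}
                    best, improved = c, True
                    break
            if improved:
                break
    return left, right, best

nets = [["a", "b"], ["c", "d"], ["a", "b", "e"], ["c", "f"]]
left, right, cut = improve_cut(nets, {"a", "c", "e"}, {"b", "d", "f"})
```

Swapping pairs keeps the number of elements on each side of the cut constant.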
These algorithms have the added feature that they can 
act as IP techniques as well as PI (i.e., elements can be 
assigned to each side of a cut-line without specifically 
identifying their location). Breuer refers to one 
application system using this technique (PRANCE, by 
Automated Systems Inc.), although no data on system
performance is available. 
Relaxation PI techniques represent a radical departure 
from the interchange methods just described. These routines 
effectively model each element as a point source with the 
interconnections modelled as springs between point sources. 
Each point source (element), then, has forces applied to it 
in the directions of and proportional to the distances to 
all other elements to which it is connected. A target 
location for the element can then be identified as the 
location at which the forces on the element are zero. 
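With spring forces proportional to distance, the zero-force point is the connection-weighted centroid of the connected elements. The sketch below assumes (x, y) coordinates and a connection matrix C used as spring weights; the names are illustrative.

```python
def target_location(element, positions, C):
    """Point at which the spring forces on `element` sum to zero.

    The force is sum_j C[e][j] * (p_j - p), which vanishes at the
    weighted centroid p = sum_j C[e][j] * p_j / sum_j C[e][j].
    """
    others = [j for j in positions
              if j != element and C[element][j] > 0]
    total = sum(C[element][j] for j in others)
    if total == 0:
        return positions[element]     # unconnected: no force acts
    x = sum(C[element][j] * positions[j][0] for j in others) / total
    y = sum(C[element][j] * positions[j][1] for j in others) / total
    return (x, y)

C = [[0, 1, 3],
     [1, 0, 0],
     [3, 0, 0]]
positions = {0: (0, 0), 1: (4, 0), 2: (0, 4)}
target = target_location(0, positions, C)   # (1.0, 3.0)
```

The target is generally not a legal location, so a practical routine must still map it to an available position.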
In the simplest relaxation technique, the forces on 
each element are calculated in turn and the element is moved 
to its target location if the location is not occupied. 
While this might be satisfactory for sparsely-populated 
(small ratio of elements to element locations) chips, the 
probability of the target location not being occupied in a 
densely-populated chip is very low. 
To overcome this problem, alternate relaxation 
techniques have been proposed. One of these, force-directed 
relaxation [3,18], either selects an available location as 
near as possible to the target location for placement or 
displaces the element at the target location, which is then 
relocated in a similar manner. 
Variants of the force-directed relaxation techniques 
include the force-directed interchange technique [24] which 
uses force-directed concepts to identify profitable element 
interchanges in what is, otherwise, an interchange 
technique. 
Problems occur in the use of these techniques due to
the use of a point-source for modelling of the finite-size 
element. In particular, overlapping of elements in the final 
placement is possible and usual. Generally, then, 
post-processing routines are called on to eliminate element 
overlapping without destroying the relative placement 
formed. Typical of these post-processors is the EXPAND 
process described by Scanlon [25].
Relaxation techniques appear to be among the most 
powerful and efficient PI techniques available [15] and, not 
surprisingly, among the most popular. A number of reports 
of use of these techniques have been made, among them 
[8,25,26,27,28,29, and 30].
A discussion of the LSI cell placement problem and a 
number of methods for its solution have been presented in 
this chapter. A modification to the LOF placement technique
which allows handling of STAR-like structures is presented
in the next chapter.
III. THE LINEAR ORDERING-FOLDING (LOF) TECHNIQUE
The preceding chapters outlined the meaning and
characteristics of the LSI cell placement problem, methods
for its approximate solution, and the organization of a
semicustom LSI technology (STAR). The intent in this
chapter is to illustrate the development of cell placement
routines suitable for use with STAR technology.
In particular, the use of linear ordering-folding 
techniques for STAR cell placement will be described. The 
first section will deal with existing linear order formation 
techniques. In the second section, the development of the 
folding techniques to be utilized will be given. 
The Linear Ordering Procedure 
The STAR cell linear ordering problem can be considered 
as a special case of the STAR cell placement problem in 
which only one-dimensional placement is performed. 
Intuitively, this is a simpler problem. In fact, the
solution spaces of the two-dimensional problem and an
equivalent (same number of grid positions) one-dimensional
problem are equal in size and the difficulty of exact
solution of either problem is the same. 
The advantages of the linear ordering problem are due 
to the nature of the approximate placement problem solution 
techniques which, in general, have as their objective the 
location of connected cells as near as possible to each 
other. Since nearness in four directions can be achieved for 
the two-dimensional case (as opposed to two for the 
one-dimensional case), the nearness decision processes for
linear ordering are inherently less complex and near-optimum
(one-dimensional) solutions can be more quickly obtained.
While, conceivably, any process suited for approximate
solution of the two-dimensional placement problem can be
adapted to the linear ordering problem, the linear ordering
techniques proposed by Schuler and Ulrich [31] seem to hold
the most promise. 
These techniques efficiently achieve near-optimum (with 
respect to total interconnection length) one-dimensional 
solutions by a two-stage process. The first stage, 
clustering, combines pairs of interconnected cells or pads 
until all circuit elements are contained in one cluster. In
the second stage, decomposition, the clusters are located in
the one-dimensional placement and are iteratively decomposed
into their constituent cells.
In the following paragraphs, these processes are 
detailed. For convenience, the term "cluster" is used to 
describe any group of one or more combined circuit elements. 
The clustering process begins with identification of
the "most combinable" pair of circuit elements. This pair
is combined to form a cluster and the combination is noted
in a record of cluster formation (the CHR). The combined
cells are deleted from the set of clusters eligible for 
further combination and the new cluster is added. 
Succeeding iterations of the procedure identify the
"most combinable" cluster pair and form new clusters as
before. The clustering process terminates when only one 
cluster remains. The clustering procedure is illustrated in 
the flow diagram shown in Figure 11. 
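The clustering loop can be sketched as follows. The combinability measure used here, the number of connections between two clusters, is a deliberately simple stand-in and is an assumption; the criteria discussed later in this chapter are more refined. Elements are assumed to be numbered 0 to n-1 and new clusters receive the next unused number.

```python
def cluster(elements, C):
    """Combine clusters pairwise until one remains.

    Returns (chr_records, members): chr_records holds one (I, J, K)
    entry per combination, and members maps each cluster number to
    the circuit elements it contains.
    """
    members = {e: [e] for e in elements}
    cel = list(elements)              # clustering eligibility list
    chr_records = []
    next_id = max(elements) + 1

    def score(a, b):                  # stand-in combinability measure
        return sum(C[p][q] for p in members[a] for q in members[b])

    while len(cel) > 1:
        i, j = max(((a, b) for x, a in enumerate(cel)
                    for b in cel[x + 1:]),
                   key=lambda p: score(*p))
        k = next_id
        next_id += 1
        members[k] = members[i] + members[j]
        cel.remove(i)
        cel.remove(j)
        cel.append(k)
        chr_records.append((i, j, k))
    return chr_records, members

C = [[0, 0, 3, 0],
     [0, 0, 0, 1],
     [3, 0, 0, 2],
     [0, 1, 2, 0]]
chr_records, members = cluster([0, 1, 2, 3], C)
# chr_records == [(0, 2, 4), (3, 4, 5), (1, 5, 6)]
```

For n elements the record always holds n-1 combinations, one per internal node of the binary tree of clusters.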
The results of the clustering step can be visualized in 
the form of a binary tree such as that shown in Figure 12. 
The nodes of this tree represent the clusters and the 
branches show the cluster composition. The lower terminal 
nodes of the tree are the original circuit elements. 
The decomposition procedure can be easily 
conceptualized by consideration of this tree form. Each of 
the binary subtrees can be rotated about its root cluster 
into either of two configurations as shown in Figure 13. If 
each binary subtree is rotated into its more optimum 
configuration (relative to other subtrees), the implication
is that the optimality of the terminal node order is
improved. Since a binary tree with n terminal nodes contains
(n-1) proper binary subtrees, the number of optimality
comparisons required is (n-1).
Figure 11. Functional Flow of Clustering Procedure (Start; enter all elements in the Clustering Eligibility List (CEL); while more than one cluster remains in CEL: identify the most combinable pair in CEL, I and J; combine I and J to form cluster K; delete I and J from CEL and add K; add a record of the combination to the CHR)
Figure 13. Subtree Rotation (Configurations 1 and 2)
The decomposition process is implemented by simulating 
this subtree rotation. The process begins by placing the 
two constituent clusters of the final cluster formed in an 
arbitrary sequence in the linear order. 
Succeeding iterations of the procedure identify the 
latest-formed cluster in the linear order and replace it 
with its constituents. The optimality of each of the two
possible orientations of the constituents is calculated and
the more optimum configuration selected. The process 
terminates when all elements of the linear order consist of 
a single circuit element. The decomposition procedure is 
shown in the flow diagram of Figure 14. 
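A minimal sketch of the decomposition step follows. The cluster record and membership table are hard-coded from a hypothetical clustering run, and the orientation test uses a simplified length measure in which every item in the order occupies one position. Both are assumptions for illustration.

```python
def decompose(chr_records, members, C):
    """Expand clusters, latest-formed first, into a linear order."""
    parts = {k: (i, j) for i, j, k in chr_records}

    def conn(a, b):
        return sum(C[p][q] for p in members[a] for q in members[b])

    def length(order):
        return sum(conn(a, b) * (y - x)
                   for x, a in enumerate(order)
                   for y, b in enumerate(order) if y > x)

    # Constituents of the final cluster, in an arbitrary sequence.
    order = list(parts[chr_records[-1][2]])
    while any(c in parts for c in order):
        k = max(c for c in order if c in parts)   # latest-formed
        at = order.index(k)
        i, j = parts[k]
        a = order[:at] + [i, j] + order[at + 1:]
        b = order[:at] + [j, i] + order[at + 1:]
        order = a if length(a) <= length(b) else b
    return order

C = [[0, 0, 3, 0],
     [0, 0, 0, 1],
     [3, 0, 0, 2],
     [0, 1, 2, 0]]
chr_records = [(0, 2, 4), (3, 4, 5), (1, 5, 6)]
members = {0: [0], 1: [1], 2: [2], 3: [3],
           4: [0, 2], 5: [3, 0, 2], 6: [1, 3, 0, 2]}
order = decompose(chr_records, members, C)   # [1, 3, 2, 0]
```

One orientation comparison is made per cluster replaced, and cluster numbers increase with formation time, so the largest number in the order identifies the latest-formed cluster.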
Two optimality decision processes are required for 
performance of the linear ordering procedure. The first of 
these occurs in the clustering step when it is desired to 
identify the "most combinable" clusters. Schuler and Ulrich
propose a method by which the connectivity of a cluster pair
is evaluated relative to its connectivity to other clusters.
The pair with the highest relative connectivity is then
selected for combination. This criterion has the effect of
achieving a near-minimum of interconnections between
clusters at each clustering step and aids in producing
linear orders which are near-optimum with respect to total
interconnection length. Use of a similar technique is 
reported in [32]. 
Figure 14. Functional Flow of Decomposition Procedure (Start; identify constituents of the last-formed cluster and place them in the linear order; until all linear order elements are single cells: identify the constituents of the last-formed cluster K in the linear order; replace the cluster with the more optimal constituent order)
While the Schuler and Ulrich combination criteria 
perform well for many application environments, an alternate 
method, more suited to the STAR technology, is used in the 
linear ordering procedure developed. This combination
criterion, developed by J. Gould of the Marshall Space 
Flight Center, will now be described. 
The Gould combination criterion is based on a model 
which attempts to account for cluster size (sum of the 
constituent cell widths) in the minimization of average 
interconnection length. A cluster of size $w^2$ is modelled as
a square with side length w and all interconnections to the 
cluster are considered to emanate from the square (nets 
between cells in the same cluster have zero length). 
The escape distance, ED, is defined as the average 
horizontal and vertical distance that a connection must 
traverse in order to run from the inside to the outside of 
the square (cluster). From the model, ED can be easily
estimated as the sum of one-half the horizontal and one-half 
the vertical square dimension, or, the square root of the 
cluster size. 
It is now possible to consider three classes of nets on
two clusters, A (size $x^2$) and B (size $y^2$) and to estimate
the effects on interconnection length if the clusters are 
combined. Each of the three classes is illustrated in 
Figure 15. The first class, the non-connecting class, is 
that of an interconnection to A but not B. For this class, 
the ED for the net before combination is
$$ ED = x $$
and after is
$$ ED = \sqrt{x^2 + y^2} $$
Figure 15. Gould Criteria Clustering Models (Non-Connecting Case; Uniquely Connecting Case; Non-Uniquely Connecting Case; each shown before and after combination)
Interconnections from B are treated identically by 
interchanging x  and y. 
The second class, the uniquely connecting class,
contains those nets which run between A and B, but no other
cluster. The ED for this class before combination is
$$ 0.5x + 0.5y + 0.5y - 0.5x = y $$
and is 0 after combination.
For the third class, containing nets between A and B
which also run to other clusters, the total ED before
combination is
$$ x + y $$
and after combination, is
$$ \sqrt{x^2 + y^2} $$
For the pair of clusters A and B, then, the total ED
before combination is
$$ ED = \mathrm{NC1A}\,x + \mathrm{NC1B}\,y + \mathrm{NC2}\,y + \mathrm{NC3}\,(x + y) $$
where NC1A is the number of class 1 connections to A, NC1B
is the number of class 1 connections to B, and NC2 and NC3
are the number of class 2 and 3 connections between A and B.
The total ED if the clusters are combined is
$$ ED' = (\mathrm{NC1A} + \mathrm{NC1B} + \mathrm{NC3}) \sqrt{x^2 + y^2} $$
The improvement (reduction) in total escape distance
expected by combination of the clusters A and B is
$$ EDI = ED - ED' $$
The Gould procedure selects a cluster A, and for each
cluster B which is connected to it, computes the EDI for the
pair. The B which produces the maximum EDI is selected for
combination with A.
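The escape distance improvement can be evaluated as in the following sketch. It assumes that the combined cluster is modelled as a square whose area is the sum of the two cluster sizes, so that its escape distance is the square root of that sum; the function names and the candidate data are hypothetical.

```python
from math import sqrt

def escape_distance_improvement(size_a, size_b, nc1a, nc1b, nc2, nc3):
    """EDI = ED - ED' for combining clusters A and B.

    size_a, size_b: cluster sizes (sums of constituent cell widths).
    nc1a, nc1b: class 1 (non-connecting) nets on A and on B.
    nc2, nc3: class 2 and class 3 nets between A and B.
    """
    x, y = sqrt(size_a), sqrt(size_b)
    ed_before = nc1a * x + nc1b * y + nc2 * y + nc3 * (x + y)
    # Assumption: combined cluster is a square of area size_a + size_b.
    ed_after = (nc1a + nc1b + nc3) * sqrt(size_a + size_b)
    return ed_before - ed_after

def select_partner(size_a, candidates):
    """Pick the connected cluster B giving the maximum EDI.

    candidates maps a cluster name to (size_b, nc1a, nc1b, nc2, nc3).
    """
    return max(candidates,
               key=lambda b: escape_distance_improvement(
                   size_a, *candidates[b]))

edi = escape_distance_improvement(9, 16, 1, 1, 2, 1)   # 22 - 15 = 7
partner = select_partner(9, {"B1": (16, 1, 1, 2, 1),
                             "B2": (16, 1, 1, 0, 1)})
```

Class 2 nets contribute only to the ED before combination, so a candidate with many uniquely connecting nets is favored.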
If cluster B is selected such that its size is equal to
or greater than that of A, it can be seen that maximum
(positive or negative) improvement for each of the three
interconnection classes is attained when the size of B is
equal to that of A (y = x). Since all clusters must be
eventually combined, maximum improvement can be made by
selecting the smallest clusters early in the procedure when
other small clusters are available. The rule used for
selection of the cluster A, then, consists only of choosing
the smallest cluster available.
The second optimality decision in the linear ordering
procedure is required in the decomposition step. This
decision must isolate one of the two constituent orders
possible when a cluster is to be replaced.
The criterion used for this choice is based solely on
minimum interconnection length considerations. The number
of nets by which each of the constituents is connected to
the left and right of the target location is calculated and
the orientation is selected which minimizes the total
connection distance.
The result of the linear ordering procedure outlined in 
this section is a one-dimensional placement in which the 
total interconnection length is near-minimum (considering 
cell width). In the following section folding techniques 
which can be used to map the STAR cell linear order onto the 
STAR will be developed. 
The Folding Procedure 
As noted in a previous chapter, folding methodologies 
have been developed for various LSI organizations. For
these organizations, however, the required chip layout has 
made obvious the folding strategies required and, for the 
most part, the methods developed have been of an extremely 
simple nature. 
In this section, folding techniques suitable for use in 
STAR and STAR-like organizations are developed. While the 
complexity of these methods is greater than that of the 
simpler folding techniques, the desirable qualities of the 
folding procedures (i.e., speed and relative simplicity)
have not been sacrificed. 
Following introductory material detailing folding 
objectives, the methods will be presented in two parts.
First, folding techniques will be presented that are suited
to placement of circuits consisting of uniform-size cells.
Next, the more complex problem of folding of networks
containing cells of non-uniform size will be treated. Since
separate chapters are devoted to placement optimality
measurement and folding procedure performance, these items
will not be discussed in detail. Also, pad placement, which
is discussed in a later chapter, is ignored here.
Folding Objectives 
Briefly stated, the objective of the folding portion of
an LOF technique is to "map the linear order onto the chip
without disturbing the relative cell positions". The
simplistic nature of this statement is due to the intended
character of the LOF method in which the linear ordering
segment is to perform the "hard" work (i.e., relative cell
position assignment) and the folding segment merely to lay
out the chip in a pre-defined manner while obeying the
relations formed.
For STAR organizations, however, it is possible for the
folding segment, while regarding the specified linear order,
to increase the optimality of the final placement over that
present in the one-dimensional case. This improvement can
be achieved by use of folding strategies which decrease the
distance between connected cells or which place a higher
percentage of cells in juxtaposition than specified in the
linear order.
In general, the folding methodologies presented rely on 
the assumption that the linear order represents a 
near-optimum one-dimensional cell placement with respect to 
total interconnection distance. The folding procedures, 
then, are developed and justified on the basis of preserving
and augmenting the relations given by the linear order.
Finally, while optimality maximization is desired, the
fact that a primary objective of the folding technique is to
fit the cells onto the finite STAR cannot be ignored. Since
failure by the routines to form a STAR placement may require
the use of expensive manual placement techniques, the
probability that the folding procedure finds a placement if
one exists (regardless of optimality) should be acceptably
high.
Folding for Uniform-Width Cells 
Since cells in the STAR cell library are defined in a 
wide variety of widths, the probability of the occurrence of 
a random-logic custom STAR application in which all circuit
cells are the same size is extremely low. However, these
uniform-width networks are useful for illustration of
several folding concepts and are discussed in this section.
The STAR model used in this section is a reduced form
of the normal STAR grid structure. Horizontal grid lines
conform to the STAR rows and vertical grid lines correspond
to transistor columns 1, 1+W, 1+2W, ..., where W is the
uniform width of the cells. Cells are reduced to point 
sources at their left-hand end (with respect to orientation 
on the STAR). By restricting point source placement to only 
those positions at which a horizontal and a vertical grid 
line intersect, the gridded STAR placement is modelled as a 
more convenient slotted organization. For the purposes of 
this section, these simplifying approximations cause no 
serious loss of generality. 
This model is utilized in the following paragraphs as a 
medium for analysis of various STAR folding strategies. The 
major criterion to be used for measurement of the quality of 
a strategy is the minimization of the average distance in 
the STAR placement between the Ith and (I+k)th cells in the
linear order with the objective of minimizing total 
interconnection length and channel usage. Analyses of 
results of the linear ordering technique indicate that the 
majority of connections to a target cell in the order in a 
net-to-chain graph model of the linear order run to cells no 
farther removed than four cells from the target. The k in 
the statement above will thus be restricted to the range 1 
to 4. 
As a simple starting point, the problem of placement of 
a C-cell linear order on an infinite STAR will be treated. 
The two folding methods shown in Figure 16 are immediately
suggested.
Figure 16. Uniform Cell Folding Methods for Infinite STAR (Horizontal Alignment; Vertical Alignment)
For the horizontal alignment method, the average
distance between the Ith and (I+k)th cells is kW. This
distance in the vertical alignment method is k. Further, if
at any point the vertical alignment is modified so that a 
horizontal component appears, the average distance is 
increased from k .  Thus, from interconnection length 
considerations, the vertical alignment folding method is 
optimum (with rsspect to the linear order). 
From the standpoint of minimizing global channel 
crowding, however, the vertical alignment method may not be 
acceptable. To illustrate this fact, a method for estimating 
channel utilization (density) is now introduced. 
The desired utilization estimates should represent the
expected fraction of the total available horizontal and
vertical global channel area which is used in any region of
the STAR. The simple model of a STAR transistor area (the
intersection of a row with a transistor column) shown in
Figure 17 is used to quantify this area. If each of the
portions of a channel in a single transistor area is called
a channel segment, there are exactly 4mn horizontal and mn
vertical channel segments available in an m-row by n-column
portion of the STAR. For this case, 4mn will be called the
horizontal channel area and mn, the vertical channel area.
[Figure 17. STAR Transistor Area Channel Model. Labels: Vertical Channel Segment; Horizontal; STAR Row; STAR Transistor Column.]
The area occupied by a cell interconnection can also be
represented in terms of channel segments. While the area
occupied for the connection of cell I to cell J cannot be
less than the distance between the point sources
corresponding to I and J in the STAR placement model, more
accurate density calculations can be achieved by recognizing
that a partially-used channel segment cannot be re-utilized
and thus should be considered as completely used. An
interconnection between point sources a distance (d) apart,
then, should be charged with the spanning distance (d+1) to
account for channel usage in both terminal cells.
By use of the concepts above, channel utilization in an
m-row by n-column area of the STAR can be estimated by
$$U(H) = \frac{L}{4mn} \sum_{J} F(J)\, H(I,I+J)$$
$$U(V) = \frac{L}{mn} \sum_{J} F(J)\, V(I,I+J)$$
where U(H) and U(V) are the horizontal and vertical
utilizations, H(I,I+J) is the average horizontal spanning
distance between the Ith cell from the linear order and the
(I+J)th, V(I,I+J) is the corresponding vertical spanning
distance, F(J) is the fraction of the total number of
connections between cells a distance J apart in the linear
order, and L is the number of connections within the m-by-n
area.
It is now possible to estimate localized channel
utilization for the vertical alignment case. The STAR area
of interest is the C-row by W-column region in which the
cells have been placed. V(I,I+J) = J+1 and H(I,I+J) = 0 for
all cases. Then
$$U(V) = \frac{L}{CW} \sum_{J} F(J)\,(J+1)$$
As noted, the quantity L in this equation represents
the number of connections in a net-to-chain graph model of
the linear order. Now, the net-to-chain model of an n-cell
net is a spanning tree on n vertices and contains (n-1)
edges (connections). L may then be approximated by
$$L = (y - 1)N$$
where y is the average net size (in cells) and N is the
number of nets in the circuit.
To calculate y, the sum of the net sizes is divided by
the number of nets:
$$y = \frac{\text{sum of net sizes}}{N}$$
But the sum of the net sizes is exactly equal to the number
of pins (connection points) in the circuit, so
$$y = \frac{\text{total number of circuit pins}}{N} = \frac{zC}{N}$$
where z is the average number of pins per cell. Thus,
$$N = \frac{zC}{y}$$
and,
$$L = \frac{(y-1)\,zC}{y}$$
Then
$$U(V) = \frac{z}{W} \cdot \frac{y-1}{y} \sum_{J} F(J)\,(J+1)$$
The first term in this equation is the average pins per
unit cell width. An analysis of cells available in the STAR
cell library reveals that this ratio ranges from 0.25 to 1.
Accepting 1 as a worst-case (highest utilization) value,
$$U(V) = \frac{y-1}{y} \sum_{J} F(J)\,(J+1)$$
As y increases, this ratio of (y-1) to y approaches 1, so,
for the worst case,
$$U(V)_{wc} = \sum_{J} F(J)\,(J+1)$$
It is easily seen that this quantity is not
upper-bounded by 1. In fact, for the typical
(experimentally-derived) values,
F(1) = 0.5
F(2) = F(3) = F(4) = 0.1
the right-hand side of this equation becomes 2.2.
Equivalently stated, approximately 220% of the vertical
channel segments available in the placement area are
required for worst-case circuit characteristics.
Routing of the vertically-aligned placement is possible
by use of the unfilled area to the right of the cells (use
of an area 5W in width results in a vertical utilization
of 0.44). However, a placement which requires this type of
routing is hardly likely to be classified as
"easily-routed". The use of vertically or
horizontally-aligned folding is thus rejected for
application purposes.
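The arithmetic behind the 2.2 and 0.44 figures can be checked with a few lines of Python. This is a minimal sketch, assuming the worst-case vertical utilization reduces to the sum over J of F(J)(J+1), as the spanning distance V(I,I+J) = J+1 implies when the pins-per-unit-width ratio and the ratio of (y-1) to y are both taken as 1. The function name and the `width_in_cells` parameter are illustrative and do not come from the report.

```python
# Typical fractions quoted in the text: F[J] is the fraction of connections
# joining cells that lie J positions apart in the linear order.
F = {1: 0.5, 2: 0.1, 3: 0.1, 4: 0.1}


def worst_case_vertical_utilization(fractions, width_in_cells=1):
    """Sum of F(J) * (J + 1), spread over a routing region that is
    `width_in_cells` cell widths wide (an illustrative parameter)."""
    spanning = sum(f * (j + 1) for j, f in fractions.items())
    return spanning / width_in_cells


# Routing confined to the W-wide column that holds the cells.
u_aligned = worst_case_vertical_utilization(F)
# Routing allowed to spill into an area five cell widths wide.
u_spread = worst_case_vertical_utilization(F, width_in_cells=5)

print(round(u_aligned, 2), round(u_spread, 2))
```

Under these assumptions the first value exceeds 1, which is the over-subscription the text describes, and widening the routing region by a factor of five brings the estimate back below 1.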
An alternate folding strategy, called block-oriented 
folding, has been developed for use in the STAR cell 
placement problem. This technique can provide improved 
channel utilization characteristics over the simple methods 
presented in the previous section. In addition, the method 
allows recognition of the finite STAR size.
The block-oriented technique combines aspects of both
horizontal and vertical alignment. Vertically aligned
segments of the linear order are replicated horizontally
across the STAR to form a block. Blocks are then stacked
vertically. In the following discussion, the length of the
vertical segments within a block is referred to as the block
depth.
The simplest block-oriented folding method is typified 
in Figure 18. This, in fact, is the same folding structure 
used in the polycell organization described in an earlier 
chapter. The blocks in this layout are the horizontal rows 
of n cells, each. 
[Figure 18. Block-Oriented Folding (Block Depth = 1). Line legend: Folded Linear Order.]
For derivational convenience, in the remainder of this
section, only values of 1 and 2 will be substituted for J in
the calculation of the distance to the (I+J)th cell.
These average horizontal and vertical distances in this 
organization can be easily obtained as 
and 
The worst-case utilization figures for this 
organization are 
and 
Derivation of equations 3-1 and 3-2 is shown in Appendix A. 
[Figure 19. Block-Oriented Folding (Block Depth = 2). Line legend: Folded Linear Order.]
Equations 3-1 and 3-2 can be contrasted with
expressions for the same quantities in a different
organization (Figure 19). The average horizontal and
vertical distances for this layout are
$$H(I,I+1) = \frac{(n-1)(W+1)}{2n} \; ; \quad V(I,I+1) = \frac{n+1}{n}$$
and
The worst-case channel utilization figures (developed in
Appendix A) are
$$U(H)_{wc} = (0.088)\left[\left(1 - \frac{1}{n}\right)W - \frac{1}{n} + 1\right]$$
and 
As might be expected, a comparison of utilization 
between the first organization (block depth = 1) and the 
second (block depth = 2) reveals that for all applicable 
values of n, the horizontal usage of the block depth = 2 
layout is less than that of the layout in which the block 
depth is 1. The relation between the vertical utilizations 
is the reverse. 
Thus, the anticipated result for block-oriented layouts 
is that horizontal utilization decreases and vertical 
utilization increases with an increase in block depth. This
is, in fact, the case as shown in Figure 20. 
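The trend with block depth can be explored numerically by enumerating cell positions in a folded order. The sketch below is one reading of the block-oriented layout, not the report's own derivation: it assumes a serpentine path (down one column of the block, across, up the next), blocks that alternate left-to-right and right-to-left, an odd row length n, a spanning distance of d+1 for a nonzero separation d, and no charge in a direction where the separation is zero. The names `fold`, `span`, and `average_spans` are illustrative.

```python
from fractions import Fraction


def fold(i, n, depth, W):
    """Return (row, transistor column) of the i-th cell (0-based) for a
    block-oriented folding with n cells per row and the given block depth."""
    block, r = divmod(i, depth * n)
    step, offset = divmod(r, depth)          # column within block, place in column
    # Even columns of a block run downward, odd columns run back upward.
    row_in_block = offset if step % 2 == 0 else depth - 1 - offset
    # Even blocks run left to right, odd blocks right to left.
    col = step if block % 2 == 0 else n - 1 - step
    return block * depth + row_in_block, col * W


def span(d):
    """Spanning distance charged for a separation d (zero stays zero)."""
    return d + 1 if d else 0


def average_spans(n, depth, W, J):
    """Average horizontal and vertical spanning distances between the
    I-th and (I+J)-th cells, taken over one full two-block period."""
    period = 2 * depth * n
    h = v = 0
    for i in range(period):
        r1, c1 = fold(i, n, depth, W)
        r2, c2 = fold(i + J, n, depth, W)
        h += span(abs(c1 - c2))
        v += span(abs(r1 - r2))
    return Fraction(h, period), Fraction(v, period)


n, W = 5, 3
for depth in (1, 2, 3):
    print(depth, average_spans(n, depth, W, J=1))
```

For block depth 2 this enumeration agrees with the expressions (n-1)(W+1)/(2n) and (n+1)/n quoted for H(I,I+1) and V(I,I+1), and raising the depth lowers the horizontal average while raising the vertical one.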
The objective for a placement in which channel crowding 
is to be minimized must be to hold both horizontal and 
vertical channel usage as low as possible. Any other
criterion, such as minimization of the sum of the
utilizations in both directions, is apt to minimize density
in one direction at the sacrifice of the other. Since the
total interconnection length can be approximated as a linear
multiple of this utilization sum, it can be seen that the
minimization of channel usage in both directions provides a
more powerful optimization criterion than this more common
placement objective.
The problem of finding an optimal block-oriented 
folding of a linear order can thus be approached by 
identifying a block depth at which both horizontal and 
vertical channel usage are minimum. From Figure 20, it can
be seen that this is an impossible objective since U(H) is
minimized at high block depths and U(V) at low depths.
An approximate solution, then, can be obtained by
selection of a block depth at which both U(H) and U(V) are
as small as possible. A reasonable strategy would seem to
be the selection of the block depth at which U(H) is most
nearly equal to U(V) since, at this point, either increasing
or decreasing the block depth must worsen the utilization in
one direction. Unfortunately, a-priori location of this
point is difficult due to the unproportional dependence of
both U(H) and U(V) on cell width (W) and row length (n).
However, a method for solution can be suggested by 
noting the regularities present in the block-oriented 
structures and the resultant computational simplicity of 
folding. Since a layout for a single block depth can be 
generated quickly, and since the range of block depths is 
limited by the number of STAR rows, it is feasible to sweep 
the entire block depth range and to select the best
solution. This is the strategy used in the actual STAR
placement routine and will be described in a later section. 
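The sweep can be pictured as a simple search over a table of utilization estimates. The sketch below applies the "most nearly equal" criterion discussed above; the utilization figures are hypothetical values chosen only for illustration, and the function name `pick_block_depth` is not from the report. The actual routine rates complete placements, as described later, so this is an analogy rather than its implementation.

```python
def pick_block_depth(utilizations):
    """utilizations maps block depth -> (U_H, U_V). Return the depth at
    which horizontal and vertical utilization are most nearly equal."""
    return min(
        utilizations,
        key=lambda depth: abs(utilizations[depth][0] - utilizations[depth][1]),
    )


# Hypothetical sweep results: U(H) falls and U(V) rises with block depth.
sweep = {
    1: (0.90, 0.28),
    2: (0.61, 0.47),
    3: (0.52, 0.55),
    4: (0.47, 0.66),
}

best = pick_block_depth(sweep)
print(best, sweep[best])
```

Because the range of depths is bounded by the number of STAR rows, an exhaustive scan of this kind stays cheap.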
Several qualifications for the methods of this section
are required. First, it should be noted that the
conceivable range of unique folding strategies is
effectively unlimited. The use of the block-oriented
techniques presented here has been based on performance
comparisons with other folding methods and on the simplicity
of the procedure. While other strategies may be simpler or
produce more nearly optimum solutions, the block-oriented
methods have shown satisfactory performance and are in use
in the current STAR folding routine.
A second portion of the methods requiring clarification
regards modifications to the block-oriented technique which
might be necessary for certain STAR sizes. For example,
fewer than r rows might be available at the end of the STAR
for placement of the last block in a procedure in which the
block depth is r. The procedure can be easily modified to
sense the case in which fewer than r (say, s) rows remain
and to perform only s-block depth folding for the last
block.
More serious size constraints occur in the horizontal
direction. For the STAR models presented in this section,
the row length in cells (n) has been implicitly assumed to
be odd. An even n prevents the use of the normal folding
strategy by eliminating the possibility for correct matching
between the ends of the blocks. For sparsely-populated
STARs, it may be possible to preserve the folding pattern by
neglecting the final column of slots. For dense STARs, the
full row width must be used and the block connection
problems are ignored.
Finally, mention should be made of the possibility of a
circuit specification which cannot be fitted onto the
particular STAR requested. For the uniform cell case, this
event can be simply detected prior to cell placement by
comparison of the number of W-wide slots available with the
number of cells.
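The uniform-cell fit test amounts to a slot count. A minimal sketch, assuming each row offers as many whole W-wide slots as fit in its transistor columns; the function and parameter names are illustrative:

```python
def uniform_cells_fit(rows, cols, cell_width, num_cells):
    """True when the STAR offers at least one W-wide slot per cell."""
    slots_per_row = cols // cell_width   # partial slots cannot hold a cell
    return num_cells <= rows * slots_per_row


print(uniform_cells_fit(rows=3, cols=10, cell_width=4, num_cells=6))
print(uniform_cells_fit(rows=3, cols=10, cell_width=4, num_cells=7))
```

With 10 columns and width-4 cells, each row holds two slots, so three rows accommodate six cells but not seven.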
Folding techniques applicable to uniform cell STAR
placement have been presented in this section. The more 
general case, in which various cell sizes exist, will be 
discussed in the following section. 
Folding for Non-Uniform-Width Cells 
As was noted previously, the occurrence of a STAR 
application in which uniform cell sizes are requested is an
extremely low probability event. The most common STAR 
applications are those in which cells occur in many 
different widths. 
Several of the simplifying assumptions applied to the 
uniform cell problem cannot be justified for the non-uniform
cell case. In particular, the modelling of the STAR as a
slotted organization is not, in general, possible since this
model depended on division of the STAR into regions equal in
size to the uniform cell width. While modelling of slots
which are larger than the largest cell might be satisfactory
for a sparsely-populated, non-uniform problem, the STAR
space wasted by this approach would preclude solution of 
dense problems. Thus, the true gridded organization of the 
STAR must be recognized for the non-uniform cell case. 
However, the block-oriented folding techniques applied 
to the uniform cell problem can be adapted for use in the 
non-uniform case. The block-oriented technique is used to 
compute a base row for each cell to be placed. An alternate
row, adjacent to the base row, is also specified. The
alternate is selected to be the row which is in the
direction of the current vertical placement trend within the
block. The column locations to be occupied by a cell are
selected as the left-most available positions of the row for
left-to-right blocks (odd blocks) and the right-most
available positions for right-to-left blocks (even blocks).
A STAR placement formed in this manner is shown in
Figure 21.
As can be seen from this figure, it is possible that 
there is insufficient space on the desired row for placement 
of a cell (note cell 1:'. In this eventuality, placement is 
attempted on the alternate row. If this fails, the next
base row is selected. 
[Figure 21. Non-Uniform Cell Placement. Linear order 1 2 3 ... 20 folded into Block 1 and Block 2. Legend: Unused Transistor Area.]
The folding procedure must allow for the possibility of
being unable to completely place the cells in the linear
order. This situation can arise in two ways. First, it may
be impossible to fit the complete cell set on the STAR
selected. Three ways in which this may occur are noted in
Figure 22.
The first two cases shown can be simply isolated by
comparison of the cell widths and the STAR size. In the
third case shown, none of the possible cell-to-row
assignments can result in a complete placement.
This third case bears great similarity to the classical
bin packing problem (a special case of the scheduling
problem) in which a number of finite-length tasks are to be
assigned among several workers and it is desired to
determine if all the tasks can be completed in a given
amount of time. As noted by Graham [33], the only
techniques available for complete solution of this problem
involve exhaustive investigation of all possibilities.
A-priori knowledge of a cell "scheduling" problem, then, can
only be obtained by use of procedures which are on the same
order of difficulty as the complete placement problem. No
pre-folding tests for the existence of this problem are
performed. The assumption is that this is the cause of the
problem if the complete folding procedures (to be shown)
fail to identify a solution.
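The three failure cases can be illustrated with the cell set of Figure 22 (widths 5, 4, and 4). The sketch below is illustrative only and is not part of the folding procedure, which, as stated, performs no pre-folding test for the "scheduling" case. The two cheap checks are simple comparisons; the third enumerates every cell-to-row assignment, which is the exhaustive search the text refers to. The function names are hypothetical, and the 4-column STAR used for the first case is an assumed value.

```python
from itertools import product


def widest_cell_fits(widths, cols):
    """Case 1: the STAR must be at least as wide as the widest cell."""
    return max(widths) <= cols


def area_sufficient(widths, rows, cols):
    """Case 2: total cell width must not exceed the STAR area."""
    return sum(widths) <= rows * cols


def exhaustive_fit(widths, rows, cols):
    """Case 3: try every cell-to-row assignment (rows ** len(widths) cases)."""
    for assignment in product(range(rows), repeat=len(widths)):
        used = [0] * rows
        for width, row in zip(widths, assignment):
            used[row] += width
        if max(used) <= cols:
            return True
    return False


widths = [5, 4, 4]
print(widest_cell_fits(widths, cols=4))         # assumed narrow STAR
print(area_sufficient(widths, rows=2, cols=5))  # 13 transistors, 10 available
print(area_sufficient(widths, rows=2, cols=7))  # area alone looks adequate
print(exhaustive_fit(widths, rows=2, cols=7))   # yet no row assignment works
```

The search space grows as the number of rows raised to the number of cells, which is why detecting this case in advance costs about as much as attempting the placement itself.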
A second way in which failure of folding of a linear
order may arise is shown in Figure 23. Two linear orders
for a given set of cells are shown in this figure. The
first linear order cannot be folded at any block depth (BD).
However, the second order is easily folded. Thus, linear
order has a non-negligible effect on "foldability".
[Figure 22. STARs on Which a Given Cell Set Cannot be Placed. Cell set: 1 (width 5), 2 (width 4), 3 (width 4). Cases: STAR Narrower Than Widest Cell; Rows = 2, Cols = 5: Insufficient STAR Area; Rows = 2, Cols = 7: Cell 'Scheduling' Problem.]
[Figure 23. Dependence of 'Foldability' on Linear Order. Cell set: 1 (width 4), 2 (width 3), 3 (width 2), 4 (width 2), 5 (width 2), 6 (width 3). Rows = 3, Cols = 6. Linear Order = 1 2 3 4 5 6; BD = 3; Linear Order = 3 4 5 6 1 2.]
To take advantage of this fact, the procedures
developed for the STAR cell folding problem utilize limited
modification to the specified linear order in the event that
folding for a particular order cannot be performed.
This modification to the linear order is called
rotation and consists of splitting the linear order at a
boundary between two cells and reversing the order of the
two parts formed. For example, the linear order
1 2 3 4 5
can be rotated about the boundary between cells 3 and 4 to
form the new order
4 5 1 2 3 .
Rotation of a linear order has the effect of presenting
a different sequence of cell widths to the folding routine
and is performed in the hope that the modified order can be
folded to fit the STAR.
A rotation disrupts all interconnections which crossed
the rotation boundary in the original order. To keep this
disruption to a minimum, the cell boundaries to be used as
rotation boundaries are selected in reverse order of their
connection strength. In other words, the first rotation of
a linear order is performed at the boundary which is crossed
by the fewest connections. Succeeding rotations of the
original linear order are performed at boundaries with more
connection crossings. Thus, a placement formed by
folding an early rotation of the linear order should contain
relatively few disturbed connections.
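Rotation and the ordering of rotation boundaries can be sketched as follows. The representation is an assumption made for illustration: the linear order is a Python list, a boundary is identified by the number of cells that precede it, and connections are given as cell pairs. The function names and the sample connection list are hypothetical.

```python
def rotate(order, boundary):
    """Split `order` after its first `boundary` cells and swap the two parts."""
    return order[boundary:] + order[:boundary]


def rotation_boundaries(order, connections):
    """Return boundaries 1..len(order)-1, fewest crossing connections first."""
    position = {cell: index for index, cell in enumerate(order)}

    def crossings(boundary):
        # A connection crosses the boundary when its cells lie on opposite sides.
        return sum(
            1
            for a, b in connections
            if min(position[a], position[b]) < boundary <= max(position[a], position[b])
        )

    return sorted(range(1, len(order)), key=crossings)


order = [1, 2, 3, 4, 5]
connections = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]  # illustrative nets

print(rotate(order, 3))
print(rotation_boundaries(order, connections))
```

Rotating `[1, 2, 3, 4, 5]` at boundary 3 reproduces the example order 4 5 1 2 3, and rotating `[1, 2, 3, 4, 5, 6]` at boundary 2 gives the second linear order of Figure 23.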
A second operation which improves the probability of 
complete circuit placement is based on the characteristics 
of the block-oriented procedure when used with non-uniform 
cells. The nature of the procedure is such that "holes", or
unfilled portions of the STAR, may remain after processing
of a row has been completed. A cell occurring later in the
linear order may be small enough to fit the "hole" and, if
placed there, may relieve crowding conditions on the
remainder of the chip. Thus, in the event of folding
failure with block depth modification and rotation, a
lookback operation is performed. This operation scans a
limited number of preceding rows for "holes" large enough to
contain the cell to be placed. If any are found, the cell
is located there. If not, the normal base or alternate row
is used for placement.
The general flow of the non-uniform folding technique 
is shown in Figure 24. Data concerning the performance of 
this and other procedures shown here are presented in a 
later chapter. A unified placement system, which is based 
on the LOF techniques presented here, is described in
Chapter V.
[Figure 24. Functional Flow of Folding Procedure. Flowchart steps: START; Get Cell From Linear Order; Compute New Base Row; Try to Place Cell On Rows (Base Row - LBKD) Through (Base Row); Compute Alternate Row; Try to Place Cell On Alternate Row; Success.]
IV. STAR PLACEMENT OPTIMALITY MEASUREMENT
In the preceding chapter, general methods for 
generating a linear order of STAR cells and for folding the 
order onto the STAR have been presented. The performance of 
the final placement system depends on these methods and on 
the existence of a fast medium for gauging STAR placement
optimality (a placement rater). The development of this
optimality measurement technique is discussed in this 
chapter. 
The following section contains a description of 
criteria concerning optimality measurement. In the next two 
sections, methods for measuring two optimality
characteristics are discussed. The final section of this 
chapter describes a method by which nearness of a placement 
rating to the highest expected rating can be estimated. 
Criteria for STAR Placement Optimality Measurement 
The overall objective for the STAR placement system is 
to improve the ease with which a STAR placement can be 
routed. As is shown in the next chapter, the optimality of 
the placement produced by the STAR placement system depends, 
to a great extent, on the performance of the STAR placement
rating routine. 
This rating procedure should satisfy two basic
criteria: 
1. the procedure should form measures which are 
proportional to ease of placement routing, and 
2. the procedure should operate in as little time as 
possible so that many repetitions of the technique will
not produce an adverse effect on total system speed. 
As stated in a previous chapter, the objective of
placement to maximize routing ease is not easily expressed
in terms of measurable phenomena. Translation of this
objective into simple criteria is necessary prior to
development of optimality measurement techniques.
For the STAR placement system, two measurable 
quantities that have been selected for use in placement 
rating are channel usage and the fraction of linear routing 
paths. In the following section, methods for estimation of
STAR channel usage are presented. Estimation of the 
fraction of linear routes is discussed in a later section. 
Channel Usage Prediction 
The necessity for high-speed rating of STAR placements
precludes use of many conceivable "exact" channel usage
prediction techniques. The method presented here has been
developed to allow computational speed. The method is also 
valuable since it can be used to provide estimates of both 
horizontal and vertical usage in all areas of the STAR. A 
third advantage of the method is that n-cell nets can be
handled directly and need not be translated into cell-pair
connection equivalents.
The estimation procedure can be easily visualized by
consideration of the STAR structure. If, as shown in
Figure 25, at least one cell in net I lies to the left of
column J and at least one cell to its right, then it is
possible to state with certainty that at least one
horizontal channel segment in column J must be used for the
routing of net I. In a similar manner, it can be seen that
at least one vertical channel in row K must be used for net
I. Corresponding statements can be made for any of the rows
or columns which intersect the rectangle A (the minimum
routing boundary) in Figure 25.
In the actual routing of the placement, it is required
for the net to extend beyond the boundaries of this
rectangle in order to connect to the internal pins of the
cells. Exactly how far the net extends is determined by the
location of the net entry ranges of the cells at the
extremities of the rectangle. The cell pair shown in
Figure 26 illustrates this point.
The best (minimum channel usage) interconnection path
for these cells is the path obtained if the cells' pin
structures are such that the nearest ends of the cells can
be connected. The worst-case path is that required if the
farthest ends of the cells must be connected. It is then
possible to define the "average" routing of the connection
to be such that the number of channel segments charged in
each row or column of the STAR is exactly equal to the
average of the charges for the best and worst-case routings
in the corresponding columns.
[Figure 25. STAR Channel Usage Estimation Concepts. Labels: STAR Column; STAR Row K.]
[Figure 26. Dependence of Routing on Internal Cell Structure. Panels: Best-Case Routing; Worst-Case Routing; 'Average' Routing.]
The average charges for each net in the placement can 
be calculated and summed for each STAR row and column. The 
result of this summation is an estimate of the number of 
channel segments to be used in the routing of the placement 
in each STAR row and column. Since the number of channel
segments available in a row or column is fixed, the results
are easily presented as the expected fraction of the
segments to be filled in each row or column (i.e., the
channel utilization).
It should be noted that, in the calculation of 
horizontal charges, the four horizontal routing channels
available on each row are considered to be equivalent and 
equidistant from any other row. Thus, while the best and 
worst-case charges for the horizontal direction are 
different, they are the same for the vertical direction. 
Averaging, then, need only be performed for the horizontal 
segments of each net. 
It should also be noted that, regardless of net size, 
these simple usage calculations can be made based solely on
the cells located at the extremities of the routing
boundary. Thus, the assumption of a particular routing
structure (such as the minimum spanning tree) is not
necessary.
Two previously unmentioned aspects of this procedure
may adversely affect the accuracy of the calculations.
First, the implicit assumption that all routing of a net
will be conducted within the routing boundary does not
accurately reflect the performance of modern routing
techniques. While routing of a net within its boundary is a
good model for routing under uncrowded circumstances, as
channels within the boundary are filled, more and more
channel segments outside the boundary must be utilized for
the net path. However, the 'easiest' routing for typical
routers does reside within the routing boundary. Thus, the
density calculations reflect the interconnection of the
cells under the simplest conditions and can be used as
indicators of routing ease.
A more serious limitation of the procedure is its
failure to consider the characteristics of a net which must
cross either a STAR row or column more than once. An
example of this type of net is shown in Figure 27.
[Figure 27. Example of Net with Multiple Crossings of Row or Column. Labels: Column J; Orientation of Net Cells; Possible Routing Paths.]
Since nets of this type must increase the channel usage
in either the horizontal or vertical direction, the usage
figures obtained by use of the procedure discussed should be
recognized as optimistic views of the actual channel usage.
However, results of statistical analysis of typical STAR
application circuits indicate that neglecting these multiple
row or column crossings may not produce excessive
differences between the calculated and actual usage figures.
These statistics, shown in Chapter VI, indicate that average
nets contain between two and three cells and, because the
minimum possible net size is two, indicate that the majority
of nets are simple connections. Since, regardless of
relative cell position, it is always possible to route
between two cells without use of multiple crossings of a
single column or row, it can be seen that the channel usage
of the majority of circuit nets is correctly estimated.
The procedure used for calculation of the channel usage
estimates will now be given. In this procedure, BH(I) is
the best-case number of horizontal channel segments used in
column I, WH(I) the worst-case number of horizontal segments
used in the column and V(J) the number of vertical segments
used in row J. WIDTH(J) is the width in transistors of cell
J. The two figures, U(H) and U(V), represent the global
horizontal and vertical channel usage as described in a
preceding chapter.
The usage estimation procedure is
1. Form vectors S and R such that
   S(J) = STAR column containing the left-most transistor in
   cell J, and
   R(J) = STAR row containing cell J
   for all cells J in the placement.
2. For each net, I, in the circuit and for each cell, J,
   in net I, let
   LEFT(I) = MIN(S(J))
   RIGHT(I) = MAX(S(J))
   TOP(I) = MIN(R(J))
   BOTT(I) = MAX(R(J))
   S1(I) = MIN(S(J)+WIDTH(J)-1)
   S2(I) = MAX(S(J)+WIDTH(J)-1)
3. BH(K) = the number of nets I such that
   [(S1(I) ≤ K) AND (RIGHT(I) > K)]
   OR [(S1(I) < K) AND (RIGHT(I) ≥ K)]
4. WH(K) = the number of nets I such that
   [(LEFT(I) ≤ K) AND (S2(I) ≥ K)]
5. V(L) = the number of nets I such that
   [(TOP(I) ≤ L) AND (BOTT(I) > L)]
   OR [(TOP(I) < L) AND (BOTT(I) ≥ L)]
6. $$U(H) = \frac{1}{8 \cdot \text{ROWS} \cdot \text{COLS}} \sum_{K=1}^{\text{COLS}} \left( BH(K) + WH(K) \right)$$
   $$U(V) = \frac{1}{\text{ROWS} \cdot \text{COLS}} \sum_{L=1}^{\text{ROWS}} V(L)$$
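The six steps above translate almost line for line into a short routine. The following Python sketch is an interpretation rather than the CAPSTAR code: it reads the comparison operators as ≤ and ≥ where the scan is unclear, assumes U(V) is normalized by ROWS times COLS by analogy with U(H), and uses dictionaries keyed by cell name for S, R, and WIDTH. The function name and the sample placement are hypothetical.

```python
def estimate_usage(S, R, WIDTH, nets, ROWS, COLS):
    """S, R, WIDTH: dicts keyed by cell (1-based columns and rows).
    nets: list of lists of cells. Returns (U_H, U_V, BH, WH, V)."""
    BH = [0] * (COLS + 1)   # best-case horizontal segments; index 0 unused
    WH = [0] * (COLS + 1)   # worst-case horizontal segments
    V = [0] * (ROWS + 1)    # vertical segments per row
    for net in nets:
        left = min(S[j] for j in net)
        right = max(S[j] for j in net)
        top = min(R[j] for j in net)
        bott = max(R[j] for j in net)
        s1 = min(S[j] + WIDTH[j] - 1 for j in net)   # nearest right-hand end
        s2 = max(S[j] + WIDTH[j] - 1 for j in net)   # farthest right-hand end
        for k in range(1, COLS + 1):
            if (s1 <= k and right > k) or (s1 < k and right >= k):
                BH[k] += 1
            if left <= k <= s2:
                WH[k] += 1
        for row in range(1, ROWS + 1):
            if (top <= row and bott > row) or (top < row and bott >= row):
                V[row] += 1
    # Four horizontal channels per row, and best and worst cases averaged.
    u_h = sum(BH[k] + WH[k] for k in range(1, COLS + 1)) / (8 * ROWS * COLS)
    u_v = sum(V[1:]) / (ROWS * COLS)
    return u_h, u_v, BH, WH, V


# Hypothetical two-cell net on a 2-row by 8-column STAR.
S = {"A": 1, "B": 6}
R = {"A": 1, "B": 2}
WIDTH = {"A": 3, "B": 2}
u_h, u_v, BH, WH, V = estimate_usage(S, R, WIDTH, [["A", "B"]], ROWS=2, COLS=8)
print(BH[1:], WH[1:], V[1:], u_h, u_v)
```

In this example the best-case path covers columns 3 through 6, the worst-case path covers columns 1 through 7, and both rows are charged one vertical segment.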
Linear Routing Prediction 
The maximization of the fraction of STAR cell nets 
which can be routed without bends was selected as a 
placement objective both to increase routing ease (by 
providing simple routing paths) and to minimize the number 
of vias required for routing. 
Since, like exact channel usage prediction, exact 
linear routing prediction requires knowledge of internal 
cell pin structure, approximate methods for counting linear 
nets have been selected. 
The quantity which is actually measured is the number 
of potentially linear nets (i.e., the number of nets which
can be routed linearly if internal cell structures permit). 
This can be easily calculated as the sum of the number of 
nets in which all cells reside on the same row and the 
number of nets in which all cells share at least one STAR 
transistor column. 
These figures are available as by-products of the 
channel usage estimation procedure described previously. If
a net, I, is linear in the horizontal direction, then
TOP(I) = BOTT(I). If the net is linear vertically, then
S1(I) ≥ RIGHT(I). The fraction of the total number of
circuit nets which are potentially linear (FSN) can be used 
for an indication of optimality with respect to linear 
routing. 
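The two linearity tests can be expressed directly in terms of the per-net quantities defined for channel usage estimation. A minimal sketch, assuming the same dictionary-based S, R, and WIDTH representation as an illustration; the function name and the three-cell example are hypothetical:

```python
def fraction_potentially_linear(S, R, WIDTH, nets):
    """Fraction of nets whose cells all share a row or all share a column."""
    linear = 0
    for net in nets:
        top = min(R[j] for j in net)
        bott = max(R[j] for j in net)
        s1 = min(S[j] + WIDTH[j] - 1 for j in net)
        right = max(S[j] for j in net)
        same_row = top == bott            # linear in the horizontal direction
        shares_column = s1 >= right       # linear in the vertical direction
        if same_row or shares_column:
            linear += 1
    return linear / len(nets)


S = {"A": 1, "B": 6, "C": 2}
R = {"A": 1, "B": 2, "C": 2}
WIDTH = {"A": 3, "B": 2, "C": 4}
nets = [["A", "B"], ["A", "C"], ["B", "C"]]

fsn = fraction_potentially_linear(S, R, WIDTH, nets)
print(fsn)
```

Here the net joining A and C is potentially linear because the two cells overlap in columns 2 and 3, the net joining B and C is linear along row 2, and the net joining A and B is neither.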
Placement Quality Measurement 
While measurement of various placement characteristics, 
such as channel usage or fraction of linear nets, can be 
used as indices for the comparison of two placements of a 
circuit, neither gives an estimate of the nearness of a
placement to the optimum. 
If a large number of random placements are formed, each 
is rated, and the rating results are plotted in histogram 
format, then a normal curve is approximated. Standard 
probabilistic techniques can then be used to estimate the
fraction of all possible placements which would have ratings
lower than a particular placement (the placement quality).
The assumption is that the closer this fraction approaches
to one, the more nearly optimum is the placement.
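The quality estimate can be sketched with the normal distribution support in the Python standard library. The sketch assumes that a higher rating is better and that the ratings of random placements are adequately described by their sample mean and standard deviation; the sample ratings are hypothetical and the function name is illustrative.

```python
from statistics import NormalDist, mean, stdev


def placement_quality(random_ratings, rating):
    """Estimated fraction of placements rated lower than `rating`,
    using a normal curve fitted to ratings of random placements."""
    fitted = NormalDist(mean(random_ratings), stdev(random_ratings))
    return fitted.cdf(rating)


# Hypothetical ratings of a handful of random placements.
ratings = [40, 45, 50, 55, 60]

print(placement_quality(ratings, 50))
print(placement_quality(ratings, 70))
```

A placement rated at the sample mean receives a quality of one half, and the estimate approaches one as the rating moves several standard deviations above the mean.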
It should be emphasized that this procedure for 
estimating placement quality is based on no theoretical 
study of the placement problem, but rather on the nature of 
observable results. Quality is not used as a driving force 
for the STAR cell placement routines, but is presented to 
the user as additional output data. Further discussion of
placement quality is deferred to a later chapter. 
V. THE CELL ARRANGEMENT PROGRAM FOR STAR (CAPSTAR)
General techniques for the solution of the STAR cell 
placement problem and methods for placement rating have been
discussed in the preceding chapters. The incorporation of 
these techniques into a FORTRAN placement program (CAPSTAR) 
for use in an application environment will be discussed in 
this chapter. 
CAPSTAR was developed to act as an integral part of a
system of programs which solve the physical design problem 
for logic circuits to be implemented by use of STAR 
technology. As such, several portions of the program are 
involved with the formatting of input and output data
necessary for communication with other programs. These 
portions of CAPSTAR will not be discussed in this chapter. 
The features of CAPSTAR which will be presented here 
are those which deal with the previously-discussed placement 
and rating techniques and other functions necessary for 
high-speed identification of near-optimum STAR placements.
The following sections contain descriptions of high-level
program organization, database organization, LOF procedure
implementation, placement improvement techniques, placement
rating procedures, and a method for placement of circuit
pads.
A u s e r ' s  g u i d e  f o r  t h e  program d e s c r i b e d  h e r e  is  g i v e n  
i n  1351 .  The s o u r c e  l i s t i 1 1 g  of  t n e  program is shown i n  
[ 3 4 1  
Program Organization 
The high-level organization of CAPSTAR is illustrated 
in Figure 28. As shown in this figure, following the 
performance of the clustering and decomposition portions of 
the LOF technique, the folding procedure is used 
repetitively to generate a number of different placements. 
After a user-entered number of solutions (MAXSOL) has been 
formed, the best (highest-rated) IMPROVE of the placements 
are selected and are improved by means of a simple PI 
routine. The best improved placement is selected as the 
problem solution. 
As previously noted, the folding step solutions are 
formed using various block depths, rotations, and lookback 
distances. The procedure is designed so that the 
earliest-formed placements are usually the highest-rated 
placements that can be formed by folding. Due to the 
simplicity of the folding procedure, the time requirements 
for a large number of repetitions are usually not excessive. 
The variable MAXSOL, then, is typically picked to be a large 
number to allow the best IMPROVE to be selected from a 
relatively large sampling of placements. 
Figure 28. CAPSTAR High-Level Flow (flowchart, in order: Start; Perform Clustering Procedure; Perform Decomposition Procedure; Perform Folding Procedure, MAXSOL times; Select IMPROVE Best Placements; Perform PI Procedure, IMPROVE times; Select Best Improved Placement; Place Pads) 
A placement improvement routine is then performed on 
the set of IMPROVE best folded solutions. This routine, 
which is based on the neighborhood pair-wise interchange 
technique, is quite simple conceptually but tends to execute 
rather slowly. The variable IMPROVE, then, is usually 
selected to be a number in the range 3 to 5. 
It is possible to set IMPROVE to 1 to reduce execution 
time. However, experimental CAPSTAR runs have indicated 
that the best folding solution is often not the best 
solution after placement improvement, and sub-nominal 
placements may result by reduction of IMPROVE below the 
range noted above. 
Following improvement, the highest-rated placement is 
selected and pad placement is performed. The pad location 
process is deferred to this point in the program so that 
this relatively slow procedure need only be performed on one 
placement. 
After the pads are located, the appropriate output 
files are constructed and results are presented to the user. 
The program then terminates and control is passed to the 
successor program, a STAR placement router. 
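The flow just described (fold repeatedly, keep the IMPROVE best folded placements, improve each, and select the best result) can be sketched as below. This is only an outline under stated assumptions: `fold`, `improve` and `rate` are hypothetical callables standing in for the CAPSTAR segments, clustering, decomposition and pad placement are omitted, and all candidate placements are held in memory for simplicity.

```python
def capstar_flow(fold, improve, rate, maxsol, n_improve):
    """Sketch of the folding / improvement portion of the program flow.

    fold(i)    -> i-th folded placement, or None when folding is exhausted
    improve(p) -> improved version of placement p
    rate(p)    -> placement rating, higher is better
    """
    folded = []
    for i in range(maxsol):
        placement = fold(i)
        if placement is None:
            break
        folded.append(placement)
    # Keep the n_improve highest-rated folded placements.
    best = sorted(folded, key=rate, reverse=True)[:n_improve]
    improved = [improve(p) for p in best]
    # The highest-rated improved placement is taken as the solution.
    return max(improved, key=rate)
```

With stand-in callables in which the second-best folded placement improves more than the best one, calling the function with `n_improve=2` returns a better solution than `n_improve=1`, which mirrors the observation above about reducing IMPROVE to 1.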
A discussion of the high-level aspects of CAPSTAR has 
been presented in this section. In the following sections, 
a more detailed description of the CAPSTAR segments will be 
given. 
Database Organization and Storage 
CAPSTAR has been designed for execution on computing 
systems with limited available program and data storage 
areas. Thus, the minimization of data storage requirements 
is a high-priority objective of the program. 
In general, the data storage requirement for any 
placement program is dependent on the maximum circuit size 
intended for use. For the STAR placement problem, these 
restrictions have been established so that the largest 
circuit which can be handled by CAPSTAR is one consisting of 
1000 elements and 500 nets. 
If the data describing the interconnection structure of 
a circuit of this size is stored using the connection matrix 
(C), one million entries must be saved. If each entry is 2 
bytes (16-bit integer format) in length, almost 2M 
(1M = $2^{20}$) bytes of storage are required for this array 
alone. Storage of only the entries above the diagonal of 
this (symmetric) matrix would require 1M bytes, which is 
still in excess of the total storage available in many small 
computers. 
The C-matrix, then, has not been utilized in CAPSTAR. 
Circuit interconnection data is stored in a vector (NTC) 
which is a list of the elements in each net with 
single-entry delimiters between nets. The required length 
of this vector is a function of the maximum number of nets 
and the maximum average net size. With an upper bound 
of 5 elements on average net size, the required length of 
this vector is 3000 entries (2500 element entries plus 500 
delimiters), or less than 6K (1K = $2^{10}$) bytes of storage 
at 2 bytes per entry. 
This approach has the added advantage of removing the 
requirement for modelling nets as connections between cell 
pairs (as is necessary for the C-matrix organization). 
Thus, the processing time associated with modelling of the 
input network is not required. 
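A small sketch of the net list vector may help fix the idea. The delimiter value (0), the placement of one delimiter after each net, and the function names are assumptions made for illustration; the report states only that single-entry delimiters separate the nets.

```python
DELIM = 0  # assumed delimiter value; element numbers then start at 1

def build_ntc(nets):
    """Flatten a list of nets into one vector, one delimiter after each net."""
    ntc = []
    for net in nets:
        ntc.extend(net)
        ntc.append(DELIM)
    return ntc

def nets_from_ntc(ntc):
    """Recover the nets by scanning the vector for delimiters."""
    nets, current = [], []
    for entry in ntc:
        if entry == DELIM:
            nets.append(current)
            current = []
        else:
            current.append(entry)
    return nets

nets = [[1, 2, 3], [2, 4], [1, 4, 5, 6]]
ntc = build_ntc(nets)

# Worst-case length: 500 nets averaging 5 elements, plus 500 delimiters.
MAX_NTC = 500 * 5 + 500
```

Under these assumptions the worst-case vector holds 3000 entries, which at 2 bytes per entry stays below 6K bytes.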
While the placement problem for STAR can be solved by 
use of the net-to-cell mapping specified by the NTC vector, 
an alternate organization of the interconnection data is 
more facile for some portions of CAPSTAR. These portions 
(primarily, the clustering and decomposition portions of the 
LOF process) can be more simply structured if a cell-to-net 
specification of the circuit is available. A second vector 
(CTN) is provided for use by these segments. This vector 
is, basically, an ordered list of elements in the network 
which specifies, for each element, the nets incident to the 
element. 
The length of the CTN vector is the product of the 
maximum number of elements and the average number of nets 
per element. If the maximum average net size is 5 elements, 
a 500-net circuit contains no more than 2500 pins. The 
average number of nets per element in a 1000 element 
circuit, then, is 2.5. The length of the CTN vector can 
thus safely be set at 3008 entries. 
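The cell-to-net form can be derived from the net-to-cell form by a simple inversion, sketched below. The list-of-lists layout and the 1-based numbering of elements and nets are assumptions for the example; the report does not give the internal layout of the CTN vector.

```python
def build_ctn(nets, n_elements):
    """Invert net-to-cell lists into cell-to-net lists.

    Elements are assumed to be numbered 1..n_elements and nets
    1..len(nets). Entry k of the result lists the nets incident to
    element k + 1.
    """
    ctn = [[] for _ in range(n_elements + 1)]  # index 0 is unused
    for net_no, net in enumerate(nets, start=1):
        for element in net:
            ctn[element].append(net_no)
    return ctn[1:]

nets = [[1, 2, 3], [2, 4], [1, 4]]
ctn = build_ctn(nets, 4)

# Average nets per element at the stated limits: 2500 pins over 1000 elements.
avg_nets_per_element = (500 * 5) / 1000
```

The total number of entries is the same in both forms, since each pin appears once in each.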
For convenience in the decomposition segment, the 
clusters formed in the clustering step are treated as 
elements and are also entered in the CTN array. The actual 
working dimension of this array, then, is roughly twice the 
3008-entry figure stated, or approximately 12K bytes of 
storage at 2 bytes per entry. 
Speed of access to the data in the NTC and CTN vectors 
can be improved by supplying lists of pointers. For 
example, if the data for net I begins at location J in the 
NTC vector, the NTC pointer entry at location I would 
contain J. While improvement of overall processing speed 
might occur if this structure was maintained in all segments 
of CAPSTAR, the current version of the program uses pointer 
vectors only in the clustering step. 
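The pointer arrangement can be illustrated with a short sketch. The 0-based indexing, the delimiter value of 0, and the function names are assumptions for the example only.

```python
def build_pointers(ntc, delim=0):
    """Return start indices: pointers[i] is where net i begins in ntc."""
    pointers = []
    start = 0
    for index, entry in enumerate(ntc):
        if entry == delim:
            pointers.append(start)
            start = index + 1
    return pointers

def net_elements(ntc, pointers, i, delim=0):
    """Read net i directly, without scanning the preceding nets."""
    j = pointers[i]
    end = ntc.index(delim, j)  # first delimiter at or after position j
    return ntc[j:end]

ntc = [1, 2, 3, 0, 2, 4, 0, 1, 4, 5, 6, 0]
pointers = build_pointers(ntc)
```

Without the pointer list, locating net I requires a scan past the I preceding delimiters; with it, the starting location is obtained in a single lookup at the cost of one additional entry per net.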
Most other data structures used in the system, such as 
those containing the cluster formation history and cell 
width data, are small in comparison to the NTC and CTN 
vectors. However, the data structures which specify a STAR 
placement can be larger and will now be described. 
The gridded organization of the STAR leads to a matrix 
model for use in STAR cell placement. The size of this 
matrix is fixed by the number of rows and columns available 
in the largest STAR. At the time of this writing, the 
largest STAR is one consisting of 28 rows and 94 columns. 
The working dimensions of the STAR model matrix (chip) are 
thus set at 30 rows and 100 columns, requiring 3000 entries. 
The storage required for this array is less than 6K bytes at 
2 bytes per entry. 
An alternate form of the placement, specifying the row 
and column position of each element in the circuit, is also 
constructed. Since the maximum element count is 1000, this 
array contains 2000 entries. The total storage per 
placement is thus 5000 entries, or, approximately 10K bytes 
for 2 byte entries. 
While this storage requirement may not be excessive for 
a single placement, the CAPSTAR structure requires the 
storage of no fewer than IMPROVE placements. If the maximum 
IMPROVE is selected to be 10, almost 100K bytes of storage 
are required. Even if only the smaller alternate version of 
each placement is retained, the storage required is almost 
40K bytes. 
To alleviate this problem, CAPSTAR maintains these 
intermediate placements in a disk file rather than in main 
memory. The storage required is thus reduced to that for a 
single placement. A time penalty is incurred, however, due 
to the increased number of disk accesses required. 
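The placement storage figures quoted above can be checked with a few lines of arithmetic. The variable names are illustrative; the sizes are those stated in the text, with 1K taken as 1024 bytes.

```python
BYTES_PER_ENTRY = 2

chip_entries = 30 * 100        # STAR model matrix, 30 rows by 100 columns
position_entries = 2 * 1000    # row and column for each of 1000 elements

per_placement = (chip_entries + position_entries) * BYTES_PER_ENTRY
all_placements = 10 * per_placement                      # IMPROVE = 10
alternate_only = 10 * position_entries * BYTES_PER_ENTRY

K = 1024
print(per_placement / K, all_placements / K, alternate_only / K)
```

The three values come to roughly 9.8K, 97.7K and 39.1K bytes, which is consistent with the "approximately 10K", "almost 100K" and "almost 40K" figures in the text.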
A final consideration regarding CAPSTAR storage 
requirements should be noted. The program has been 
logically separated into functions so that physical 
separation of program parts can be facilitated. This may be 
useful in the event that the entire program requires an 
excessive amount of storage space. 
In the current version of CAPSTAR, a physical division 
has been made between the decomposition segment and the 
folding segment (see Figure 28). This division was found to 
achieve a significant reduction in required storage over the 
case in which the complete program was executed as a unit. 
Implementation of LOF Procedures 
The clustering and decomposition segments of CAPSTAR 
are organized as described in Chapter III. The result of 
these procedures is a linear order which is to be folded 
onto the STAR. The CAPSTAR implementation of this folding 
process will now be described. 
A fold cycle is defined to be the set of operations 
required to either successfully fold a linear order onto the 
STAR or to determine that possibilities for fold structure 
modification (block depth modification, rotation, etc.) have 
been exhausted. The logical organization of a fold cycle is 
shown in Figure 29. In this figure, FOLD (n,m,p) is defined 
as the folding operation performed at a block depth of n 
using linear order m with a lookback distance of p. 
On the first fold cycle performed, control is 
transferred to the primary entry point shown in Figure 29. 
If a placement is found in the cycle, it is rated and, if it 
is among the IMPROVE best solutions so far identified, is 
saved for further use. Control is returned to the fold 
cycle by use of the alternate entry point. Fold cycles are 
repeated until either failure is noted or MAXSOL solutions 
are found. 
Figure 29. Functional Flow of a Fold Cycle (flowchart with primary and alternate entry points, success and exit branches; notes: LBKD = Lookback Distance, LO = Linear Order, WLO = Working LO, BD = Block Depth) 
The result of the folding step is the set of IMPROVE 
best placements generated. These placements are passed to 
the succeeding CAPSTAR segment which iteratively improves 
them with respect to rating. This PI segment is described 
in the following section. 
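The retention of only the IMPROVE best solutions as fold cycles proceed can be sketched with a bounded min-heap. This is an illustrative structure, not the CAPSTAR mechanism (which, as noted earlier, keeps the intermediate placements in a disk file); the class and method names are hypothetical.

```python
import heapq

class BestPlacements:
    """Retain only the `capacity` highest-rated placements seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []   # min-heap of (rating, sequence, placement)
        self._count = 0   # tie-breaker so placements are never compared

    def offer(self, rating, placement):
        """Save the placement if it ranks among the best; report if kept."""
        entry = (rating, self._count, placement)
        self._count += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if rating > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)  # drop the current worst
            return True
        return False

    def ranked(self):
        """Placements ordered from highest to lowest rating."""
        return [p for _, _, p in sorted(self._heap, reverse=True)]
```

Each newly rated placement is compared only with the lowest-rated placement currently retained, so the cost of the bookkeeping remains small even when MAXSOL is large.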
Placement Improvement 
The use of simple placement improvement routines in the 
CAPSTAR system is based on two considerations: 
1. while excellent placements can be obtained by use of 
the repetitive folding procedure, it is almost always 
possible to identify some way in which they might be 
slightly improved, and 
2. the use of a PI routine which is driven directly by the 
CAPSTAR rating procedure can improve the placements in 
ways not easily obtained by the folding strategies. 
The PI techniques utilized in CAPSTAR are based on 
simple neighborhood PWI concepts. The routine consists of 
two segments. In the first of these, each row in the 
placement is trial interchanged with the rows that are 
within a given row-neighborhood (RN) of it. After each 
trial interchange, the placement is rated by use of the 
CAPSTAR placement rating facility. A trial interchange is 
accepted if the resulting placement has a higher rating. 
In the second segment, each cell in the placement is 
trial interchanged with the cells on its row which are within 
a given cell-neighborhood (CN) of it. As in the row 
interchange segment, a trial interchange is accepted only if 
the rating of the overall placement is improved. 
It should be noted that the set of cells occupying a 
single row in the original placement also occupy a single 
row in the final placement. The use of general NPWI 
routines, in which cells can be interchanged between rows, 
has been avoided in CAPSTAR. For a non-uniform-width 
structure like STAR, it may be impossible to interchange a 
pair of cells between rows since the space on a row left by 
the removal of one cell may not be enough for the placement 
of the other cell. Thus, for true NPWI routines, processing 
is required before each attempted interchange to determine 
if cell sizes permit swapping. 
In the simplified NPWI procedures used in CAPSTAR, no 
such processing is required since each row must fit the 
space occupied by any other and since a re-ordering of the 
cells within a row does not affect row length. The 
interchange iterations, then, proceed more quickly than in a 
true NPWI procedure. As will be shown in a later chapter, 
overall placement optimality is not sacrificed by use of 
this simplified procedure. 
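The two interchange segments can be sketched as a single pass over the placement. This is a simplified illustration under several assumptions: the placement is a list of rows, each a list of cell identifiers; `rate` is any callable returning the placement rating; and each pair within a neighborhood is tried once, in forward order. The actual CAPSTAR iteration order and stopping rule are not given in the text.

```python
def improve_placement(rows, rate, rn, cn):
    """One pass of row interchange followed by within-row cell interchange.

    rows : list of rows, each a list of cell identifiers (modified in place)
    rate : callable returning the placement rating (higher is better)
    rn   : row-neighborhood distance
    cn   : cell-neighborhood distance
    Returns the rating of the resulting placement.
    """
    best = rate(rows)

    # Segment 1: trial interchange of whole rows within rn of each other.
    for i in range(len(rows)):
        for j in range(i + 1, min(i + rn, len(rows) - 1) + 1):
            rows[i], rows[j] = rows[j], rows[i]
            trial = rate(rows)
            if trial > best:
                best = trial                         # accept the interchange
            else:
                rows[i], rows[j] = rows[j], rows[i]  # undo it

    # Segment 2: trial interchange of cells within the same row.
    for row in rows:
        for a in range(len(row)):
            for b in range(a + 1, min(a + cn, len(row) - 1) + 1):
                row[a], row[b] = row[b], row[a]
                trial = rate(rows)
                if trial > best:
                    best = trial
                else:
                    row[a], row[b] = row[b], row[a]
    return best
```

Because cells never move between rows, no width check is needed before a trial interchange, which is the saving noted above.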
Placement Rating 
As can be seen from the preceding discussion, the 
CAPSTAR placement rating routine is critical to the 
derivation of "good" placements. Much effort, then, has 
been devoted to the development of this segment. 
Four factors relating to desired characteristics of 
CAPSTAR placements are measured by this routine. These are: 
1. horizontal channel usage, 
2.  vertical channel usage, 
3. fraction of potentially linear routes, and 
4. distance of unused transistors from the horizontal STAR 
center. 
Factors 1 through 3 in this set have been previously 
discussed and U(H), U(V) and FSN are calculated as 
described. 
The fourth factor relates to the observation that the 
highest channel densities in a typical STAR placement occur 
toward the horizontal center of the array. Since 
transistors which are not included in any cell do not 
require any internal connections, the paths normally used 
for internal cell connection can (with care so as to assure 
electrical isolation) be allocated to global 
interconnections. If the unused transistors, then, are 
assigned positions near the STAR center, the number of 
effective global channel segments is increased in the 
densest area and increased routing ease should result. 
The process for measuring this factor involves summing 
the horizontal distance to the STAR center from all unused 
transistors to obtain the number ETD. A normalized distance 
figure, ETR, is then computed by 
$$\mathrm{ETR} = \frac{\mathrm{ETD}}{\mathrm{ETDWC}}$$
where ETDWC is the ETD which would be noted if the unused 
transistors were evenly distributed over the rows and packed 
at the row ends (i.e., ETDWC is the worst-case ETD). As 
shown in Appendix A, 
$$\mathrm{ETDWC} = \frac{\mathrm{MT}}{2}\left[\mathrm{COLS} - 1 - \frac{\mathrm{MT}}{2\,\mathrm{ROWS}}\right] \qquad (6\text{-}1)$$
where MT is the number of unused transistors in the 
placement. The normalized distance, ETR, lies between 0 and 1 
with higher values signifying less-desirable unused 
transistor placements. 
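A short sketch of the normalization follows. It reads equation (6-1) as ETDWC = (MT/2)[COLS - 1 - MT/(2 ROWS)]; the function names and the handling of the case with no unused transistors are assumptions made for the example.

```python
def etd_worst_case(mt, rows, cols):
    """Worst-case summed distance ETDWC, per the reading of equation (6-1)."""
    return (mt / 2) * (cols - 1 - mt / (2 * rows))

def etr(etd, mt, rows, cols):
    """Normalized unused-transistor distance ETR = ETD / ETDWC."""
    worst = etd_worst_case(mt, rows, cols)
    if worst == 0:
        return 0.0  # assumed convention when there are no unused transistors
    return etd / worst

# Example: 16 unused transistors on an 8-row, 24-column STAR.
worst = etd_worst_case(16, 8, 24)
ratio = etr(88, 16, 8, 24)
```

For this example ETDWC is 176, so a placement whose unused transistors sum to a distance of 88 receives an ETR of 0.5.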
Each of the four factors can thus be easily measured. 
However, comparison of placements on the basis of four 
different measurements is not straightforward. To provide 
the capability of simple placement comparison, the 
measurements should be combined into one overall placement 
rating. 
After consideration of many techniques for measurement 
combination, one of the simplest conceivable methods was 
selected. Each of the measures is translated into a fraction 
from 0 to 1 with undesired qualities producing higher 
fractions. For each measure, a weighting factor is 
calculated which indicates the importance of the measure to 
placement optimality. The weighting factors are then 
multiplied by the appropriate fractions and summed together. 
This sum is normalized (0 to 1) by dividing by the sum of 
the weighting factors. Subtracting this result from one 
produces the placement rating. 
At first consideration, for the STAR placement rating 
problem, U(H) and U(V) are equally important to placement 
optimality and should be assigned the same weighting 
factors. However, as will be recalled from a previous 
chapter, the objective is not to minimize the sum of U(H) 
and U(V), but to assure that both are as small as possible. 
The scheme adopted for this measurement combination 
technique is to define two variables, UW and UB, where 
$$\mathrm{UW} = \max\bigl(U(H),\, U(V)\bigr)$$
and 
$$\mathrm{UB} = \min\bigl(U(H),\, U(V)\bigr).$$
By assigning a high weighting factor to UW and a lower one 
to UB, CAPSTAR can be forced to always strive to minimize 
the worst measure. 
The rating for STAR placements (PR) is thus obtained as 
$$\mathrm{PR} = 1 - \frac{(\mathrm{UWWF})(\mathrm{UW}) + (\mathrm{UBWF})(\mathrm{UB}) + (\mathrm{FSNWF})(1 - \mathrm{FSN}) + (\mathrm{ETRWF})(\mathrm{ETR})}{\mathrm{UWWF} + \mathrm{UBWF} + \mathrm{FSNWF} + \mathrm{ETRWF}}$$
where XXWF is the weighting factor associated with the 
measure XX. 
This rating combination procedure lends itself to 
future addition of other rating criteria and re-evaluation 
of the relative importance of the factors. For present 
purposes, the weighting factors are set at (UWWF = 6, UBWF = 
2, FSNWF = 1, and ETRWF = 1). 
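The rating combination can be written out directly. The sketch below assumes that U(H), U(V), FSN and ETR have already been expressed as fractions between 0 and 1; the function and dictionary names are illustrative, and the weights are those quoted above.

```python
WEIGHTS = {"UW": 6, "UB": 2, "FSN": 1, "ETR": 1}

def placement_rating(u_h, u_v, fsn, etr, weights=WEIGHTS):
    """Combine the four measures into a single rating PR in [0, 1]."""
    uw = max(u_h, u_v)  # worse of the two channel usages
    ub = min(u_h, u_v)  # better of the two channel usages
    penalty = (weights["UW"] * uw
               + weights["UB"] * ub
               + weights["FSN"] * (1 - fsn)
               + weights["ETR"] * etr)
    return 1 - penalty / sum(weights.values())

# Example: U(H) = 0.5, U(V) = 0.25, FSN = 0.5, ETR = 0.5.
pr = placement_rating(0.5, 0.25, 0.5, 0.5)
```

For the example values the weighted penalty is 4.5 out of a possible 10, giving a rating of about 0.55. Exchanging the values of U(H) and U(V) leaves the rating unchanged, since only the worse and better of the two enter the sum.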
Pad Placement 
Once the highest-rated STAR placement has been 
identified, the pads specified in the circuit description 
are located on the chip periphery. For each STAR size, the 
possible pad locations are pre-specified in a disk file. 
The pad placement problem, then, consists of assigning each 
of the circuit pads to one of the pad locations. 
The assumption made is that no two pads are directly 
connected so that pad placement can be performed by use of a 
simple linear assignment procedure. The procedure assigns, 
to each pad, an optimum pad location based on nearness to 
cells directly connected to the pad. The best assignment 
over all pads is then selected and the pad is 
placed at the specified location. 
The procedure continues by selecting the best remaining 
assignment and placing the pad until all pads are placed. 
If the assigned location for a pad is occupied, a new 
optimum location is selected from those not filled and the 
procedure selects the best from the new set of 
assignments. 
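The assignment loop described above can be sketched as a greedy procedure. Several details are assumptions made for illustration: nearness is measured as the total Manhattan distance from a pad location to the cells directly connected to the pad, positions are (x, y) pairs, and ties are broken by location and then by pad name.

```python
def place_pads(pad_cells, locations):
    """Greedy pad assignment.

    pad_cells : dict mapping pad name -> list of (x, y) positions of the
                cells directly connected to that pad
    locations : list of (x, y) candidate pad locations on the periphery
    Returns a dict mapping pad name -> chosen location.
    """
    def cost(pad, loc):
        # Assumed nearness measure: total Manhattan distance to the cells.
        return sum(abs(loc[0] - x) + abs(loc[1] - y)
                   for x, y in pad_cells[pad])

    free = list(locations)
    unplaced = set(pad_cells)
    assignment = {}
    while unplaced:
        if not free:
            raise ValueError("more pads than pad locations")
        # Best free location for each unplaced pad, then the best pad overall.
        candidates = [
            (min((cost(pad, loc), loc) for loc in free), pad)
            for pad in sorted(unplaced)
        ]
        (_, loc), pad = min(candidates)
        assignment[pad] = loc
        free.remove(loc)
        unplaced.remove(pad)
    return assignment
```

In this form the best location for every unplaced pad is recomputed over the unfilled locations after each placement, which covers the case in which a pad's preferred location has already been taken.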
Experimental data indicates that this procedure 
achieves near-optimum pad placements with respect to the 
cell layout. Due to the simplicity and standard nature of 
this technique, pad placement will not be included in the 
discussion of CAPSTAR performance in the following chapter. 
VI. PERFORMANCE OF PROCEDURES 
The organization of a placement program for use with 
the STAR processing technology has been described in the 
preceding chapter. The performance of this program 
(CAPSTAR) is discussed in this chapter. 
There are two primary objectives to this performance 
analysis. First, it is desired to indicate that CAPSTAR can 
form near-optimum cell placements in a computationally 
feasible amount of time and that the time requirements and 
placement optimality can be influenced by certain program 
variables. 
Second, the validity of the CAPSTAR approach (and, by 
inference, the LOF procedure) is to be tested by its 
comparison to a more common placement technique. 
The first portion of this chapter will deal with 
CAPSTAR performance characteristics, alone. Several test 
circuits will be identified and results of CAPSTAR execution 
with various input parameter settings will be shown. In the 
latter part of the chapter, results of comparisons between 
CAPSTAR performance and that of the pair-wise interchange 
technique will be given. 
CAPSTAR Performance 
The operation of CAPSTAR has been verified by use of 6 
test circuits, TC1 through TC6. These circuits represent 
actual digital logic applications and have been selected as 
typical of STAR application circuits. Data describing the 
test circuits is shown in Table 1. The CAPSTAR test runs 
for these circuits were performed on an IBM 370/158 and 
program compilation was performed by means of the IBM 
FORTRAN-IV Level G compiler. 
Table 1 
Test Circuit Parameters 

CIRCUIT   CELLS   NETS   AVG WIDTH   AVG PINS/CELL   AVG CELLS/NET 
TC1          61     88         7.6            3.62            2.60 
TC2          20     25         8.4            3.15            2.52 
TC3          96    104         6.9            3.05            2.97 
TC4          24     35         7.4            3.21            2.60 
TC5          20     25         8.0            3.15            2.52 
TC6         305    440         7.6            3.62            2.60 

Six CAPSTAR input variables were modified during 
testing to study the effects of different values. These 
variables are MAXSOL, IMPROVE, RN, CN, ROWS and COLS, the 
functions of which have been described in previous sections. 
Unless otherwise noted, the values for these variables are 
set at MAXSOL=200, RN=CN=2, and IMPROVE=8. The default 
values for (ROWS, COLS) are (8,24) for circuits TC2, TC4, 
and TC5, (16,48) for TC3, (20,25) for TC1, and (28,46) for 
TC6. 
The results of the first test series are displayed in 
Figure 30. In this simple series, CAPSTAR was verified to be 
capable of placement of all test circuits in acceptable 
time. In the figure shown, it is interesting to note the 
correspondence between required processing time and 
intuitive circuit complexity. The exact relationship 
between input circuit complexity and CAPSTAR performance has 
not been established. 
A second type of result from the first test series is 
also illustrated in Figure 30. In this figure, the relative 
gain in placement rating that is provided by the placement 
improvement routine is illustrated. It should be noted at 
this point that no conclusions can properly be drawn from 
the comparison of the ratings of different circuits. The 
only methods of contrasting the ratings of two circuits 
must, in some way, include the optimum ratings for 
placements of each circuit. Since these, in general, are 
unknown, comparison of ratings of different circuits should 
not be attempted. 
Figure 30. CAPSTAR Performance for Test Circuits (vertical axis: Rating; legend includes Post-PI) 
The remainder of the test series to be discussed in 
this section deals with variation of CAPSTAR input parameters. 
Since similar results have been noted for all six test 
circuits, the results for these series will be presented for 
the circuit TC2 only. 
The results of test series 2, in which MAXSOL was 
varied, are shown in Figure 31. As implied in this figure, 
the highest-rated placements that are found are usually 
those formed early in the folding history. This is in line 
with the nature of the folding procedure which should 
produce the best placements first. However, the CAPSTAR 
operating philosophy has been to make MAXSOL at least 200 in 
order to allow selection of the IMPROVE best placements from 
as large a placement sampling as possible. The simplicity 
of the folding procedure allows this without incurring 
severe time penalties. 
Test series 3, the results of which are summarized in 
Figure 32, involved a study of the effects of IMPROVE on 
processing time and final placement rating. As indicated in 
this figure, the best post-improvement placement is 
generally obtained from among the best two to three folding 
solutions. In addition, the relative slowness of the NPWI 
improvement procedure causes severe execution-time penalties 
for a large IMPROVE. 
The fourth series of tests consisted of a study of 
CAPSTAR performance with respect to variation of the row 
neighborhood distance (RN). The curve shown in Figure 33 
summarizes the results of this series. As can be seen in 
this figure, the value of RN has very little overall effect 
on placement rating. This indicates that the rows of the 
placement produced by the folding step are very nearly in an 
optimal relative placement. This can be intuitively 
supported by considering that the nature of the folding 
operation is such that a cell in row J most likely has its 
neighbors from the linear order on rows (J+1) and (J-1). 
Thus, row J should be tightly linked to the surrounding rows 
and interchange with another row would not be apt to provide 
improvement. 
Figure 32. Effect of IMPROVE on Best Post-PI Rating (axes: Rating versus IMPROVE; curve: Best Post-PI Solution) 
Figure 33. Effect of RN on Best Post-PI Rating (vertical axis: Rating) 
The results of test series 5, in which the cell 
neighborhood distance (CN) was varied, are shown in 
Figure 34. It can be seen from this figure that the main 
power of the placement improvement routine lies in 
reorganization of the cells within a placement row rather 
than in row movement. However, only limited improvement is 
achieved by increasing CN above 2. Based on the results of 
this and the preceding test series, the nominal settings for 
RN and CN have been established as 2. 
Test series 6 was a study of the effects on placement 
rating when the size of the STAR is increased. Figure 35 
shows a portion of the results obtained by fixing the number 
of STAR columns and increasing the number of rows. As might 
be expected, the ratings of the resultant placements show 
significant improvement as the number of rows increases 
(i.e., as STAR density decreases). Similar results are shown 
in Figure 35 for an equivalent increase in the number of 
STAR columns with a constant number of rows. 
Figure 34. Effect of CN on Best Post-PI Rating (vertical axis: Rating) 
Figure 35. Effects of the Number of STAR Rows and Columns on Placement Rating (two panels: Rating versus Number of STAR Rows, 8 to 10; Rating versus Number of STAR Columns, 24 to 30) 
Figure 36 illustrates an interesting phenomenon 
observed from the results of test series 6. As shown in 
this curve, decreasing effect is produced by the placement 
improvement procedures as the STAR area is increased. The 
implication is that the folding strategy used is most 
effective (i.e., the folding solutions are more nearly 
optimum) for sparsely-populated STARs. This would be the 
expected result, since restrictions on STAR size tend to 
force more reliance on the optimality-reducing rotation and 
lookback operations. 
Test series 7 considered effects of block depth on the 
ratings of unimproved placements. The curve shown in 
Figure 37 displays the average pre-improvement rating of 477 
solutions for the TC2 circuit versus the block depths used 
in obtaining the solutions. The peak at block depth 4 in 
this figure indicates that, at this depth, the horizontal 
and vertical channel usages were approximately equal for the 
majority of the generated placements. 
The sharp upward trend at the maximum block depth is 
unexplained. Similar unpredicted peaks have been noted in 
the rating-versus-block depth characteristics of several of 
the test circuits. The existence of these peaks tends to 
reinforce the usefulness of completely sweeping the block 
depth range in the folding procedure. 
Figure 36. Effect of STAR Area on Placement Rating (axes: Rating versus STAR Area, Rows x Cols) 
An important exercise in the validation of CAPSTAR 
procedures has been the attempt to identify situations in 
which it is possible to place a circuit on a given-size 
STAR, but in which the CAPSTAR folding procedures fail to 
form a placement. Early CAPSTAR versions, in which the 
rotation and lookback operations were not used, exhibited 
numerous folding failures. Later versions, in which only 
one of the two operations appeared, produced folding failures 
for high chip densities (over 90% of the STAR used). 
Since the use of both operations was initiated, roughly 
5000 trial executions of CAPSTAR have been made. The test 
applications have ranged up to 95% chip density. During 
this time, no folding failures have been detected. 
While no validation certainties can be based on this 
limited testing, the current operating assumption is that no 
folding failures will occur for placement densities less 
than 95%. It is anticipated that future use of CAPSTAR in 
the application environment for which it is designed will 
allow establishment of this bound with more certainty. 
The preceding discussion has not treated the problem of 
determination of the nearness of CAPSTAR-produced placements 
to the optimum. As indicated previously, the only known 
methods of identifying optimum placements involve complete 
investigation of the solution space, which is computationally 
feasible for trivial cases only. 
A previously-noted method of estimating nearness of a 
placement t o  the optimum involves use of Monte Carlo 
techniques t o  form the d i s t r i b u t i o n  of r a t ings  of a l a rge  
number of placements of a c i r c u i t .  The d i s t r i b u t i o n  is 
t r ea ted  a s  a normal d i s t r i b u t i o n  and the expected f r a c t i o n  
of placements with t a t i n g s  lower than the placement of 
i n t e r e s t  is ca lcula ted  by standard techniques. 
A modified version of this procedure is implemented in
the CAPSTAR system as noted in the preceding discussion of
placement quality. The sample space is the set of MAXSOL
solutions produced by the folding step with the set of
IMPROVE improved placements. The qualities calculated for
the highest-rated placement of each test circuit range from
0.99707 for TC6 to 0.99918 for TC5. In other words, the best
solutions to each problem are expected to lie within 0.3% of
the optimum.
Due to the non-random methods used for formation of the
sample space and to the limited sample space size, the
CAPSTAR quality estimation procedure cannot be used for
performance validation. The use of random (Monte Carlo)
techniques has thus been undertaken to form large numbers of
random placements for the TC3 and TC4 circuits. Quality
estimation procedures performed using these data bases have
shown quality in excess of 0.999 for each circuit
(indicating ratings within 0.1% of optimum).
While the use of these methods for estimation of the
optimum is suspect for a number of reasons (among them, the
unproven Gaussian nature of the rating distribution), it
seems a reasonable approach for predicting the fraction of
placements with ratings lower than a given rating. Based on
experimental data, this fraction for the average CAPSTAR
placement is (conservatively) set at 0.98.
A final note regarding quality computation is in order.
Since methods of rating placements for routing ease are
imperfect, a technique for quality computation which is
based on placement ratings does not compute quality on the
basis of nearness to the optimum (most easily routed)
placement but, rather, on the nearness to the highest
expected rating. Properly, then, placement quality should
not be interpreted as "nearness to the optimum" but as
"nearness to the highest expected rating".
This section has dealt with CAPSTAR performance without
regard to other placement techniques. In the following
section, the performance of CAPSTAR will be contrasted with
that of the PWI placement improvement technique.
Comparison With PWI Techniques 
The second phase of the CAPSTAR testing procedure
involves comparison of the performance of the CAPSTAR
procedures with that of another, more commonly used
placement method.
The placement method selected for comparison to CAPSTAR
is a modified version of the Monte Carlo IP technique in 
combination with the PWI PI technique. These methods were 
selected due to their relative simplicity and the good 
placements which can be obtained if the PWI procedure is 
allowed to run to completion. 
The initial placement is generated by forming a dummy 
linear order (cells ordered by cell number) and by using the 
CAPSTAR folding section to form a STAR placement. The 
normal methodology used for cell numbering prevents this 
from being a true random initial placement. Cell numbers 
are usually assigned at the logic-diagram stage and cells 
(gates) which are drawn in the same area of the diagram are 
typically assigned numbers in the same range. Since these 
cells have a higher-than-random probability of being 
connected, the dummy linear order has a lower total 
interconnection length than a random linear order. Since 
the folding strategy preserves the linear order, the 
performance of the PWI routine might be expected to be 
slightly better than that obtained from a truly random 
start. 
After construction of the initial placement, the PWI 
routine is begun and proceeds in the manner outlined in 
Chapter II. The decision to accept or to reject a trial
interchange is made on the basis of a placement rating 
produced by the CAPSTAR rating facility. Only trial 
interchanges which result in an increased rating are
accepted. The PWI procedure terminates when no trial
interchange between any pair of cells results in placement
improvement.
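The acceptance and termination rules just described can be sketched as follows. This is an illustrative outline only: `pairwise_interchange` and `toy_rating` are hypothetical names, and the toy rating (negative total connection length on a single row) merely stands in for the CAPSTAR rating facility, whose details are not reproduced here.

```python
from itertools import combinations

def pairwise_interchange(placement, rate):
    """Try swapping every pair of cells; keep a swap only if it raises the
    rating. Stop when a complete pass over all pairs yields no improvement."""
    placement = list(placement)
    best = rate(placement)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(placement)), 2):
            placement[i], placement[j] = placement[j], placement[i]  # trial swap
            trial = rate(placement)
            if trial > best:
                best = trial      # accept: rating increased
                improved = True
            else:
                # reject: undo the swap
                placement[i], placement[j] = placement[j], placement[i]
    return placement, best

def toy_rating(order, nets=((0, 1), (1, 2), (2, 3), (3, 4))):
    """Hypothetical rating: negative total distance between connected cells."""
    pos = {cell: slot for slot, cell in enumerate(order)}
    return -sum(abs(pos[a] - pos[b]) for a, b in nets)

start = [0, 3, 1, 4, 2]
final, rating = pairwise_interchange(start, toy_rating)
print(final, rating)
```

Because only strictly improving interchanges are accepted, the rating rises monotonically and the loop must terminate, but the result is a local optimum with respect to single interchanges rather than a guaranteed global optimum.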
Four of the test circuits (TC1 through TC4) were
selected for use in the comparison procedure. The use of
TC6 was rejected due to the excessive amount of time
required to perform the PWI procedure for such a large
circuit.
CAPSTAR and PWI performance for the four circuits is
shown in Figure 38. As shown in this figure,
post-improvement CAPSTAR placement ratings are of the same
order as those of the PWI routine.
For very small circuits, such as TC2 and TC4, the time
costs associated with the PWI routine are slightly less than
those of CAPSTAR. For the handling of large circuits (TC3),
the time spent in comparison of all cell pairs quickly
forces the execution time for the PWI routine above that of
the CAPSTAR procedures.
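The growth in PWI execution time follows from the number of cell pairs examined. A single pass that tries every pair among $N$ cells evaluates the count below; the cell counts used in the worked values are hypothetical and are not those of the test circuits.

$$
P(N) = \binom{N}{2} = \frac{N(N-1)}{2}, \qquad P(20) = 190, \qquad P(200) = 19\,900 .
$$

A tenfold increase in cell count therefore multiplies the work per pass by roughly one hundred, before accounting for any increase in the number of passes needed to reach termination.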
As can be seen from the results for TC1, CAPSTAR
execution time can increase greatly when extremely dense
placements are required. In the case of TC1, over 92% of
the STAR area is occupied by cells. Because of this high
density, and the corresponding difficulty of fitting the
placement onto the STAR, over 11,800 placement attempts were
required by the CAPSTAR folding section in order to obtain
200 (MAXSOL) successful placements. If desired, the CAPSTAR
time requirements for these high-density cases can be
reduced by use of smaller values for MAXSOL (for TC1, a
MAXSOL value of 6 would produce the same final placement
rating).
Figure 38. CAPSTAR and PWI Performance
[Plot of Time (sec) for test circuits TC1, TC2, TC3, and TC4]
The results of the comparison procedure indicate that
the LOF-NPWI techniques used in CAPSTAR produce placements
with ratings roughly equal to those of the PWI process.
However, the execution time required for use of PWI
procedures on large circuits can, in general, be reduced by
use of the CAPSTAR techniques.
VII. Conclusion 
A strategy for the placement of digital logic cells for
the Standard Transistor Array (STAR) has been presented in
this dissertation. The placement procedures used are based
on the linear ordering-folding (LOF) technique, which has
been utilized in a number of simpler organizations.
Modifications to the usual folding methods which provide
minimization of interconnection channel crowding and which
allow placement of extremely dense layouts have also been
given. Methods for measurement of placement optimality have
been developed.
The organization of a program which implements the 
placement procedures has been shown and the results of 
program performance testing have been indicated. The 
program has been shown to produce cell placements which are 
comparable in optimality to those produced by an existing 
placement procedure. The execution time which can be saved 
by use of the new procedure for the placement of large
circuits has been noted. 
A number of areas for further study are indicated. 
Several of these are outlined in the following paragraphs.
A method for translation of "routing ease" into
criteria measurable during placement formation is required.
This method should be applicable for use with any technology
and routing strategy. Placement characteristics which lead
to ease of routing should be identified and methods for
measurement of the characteristics developed.
Improved methods for estimation of the nearness of a 
given placement to the optimum are needed. As noted 
previously, the Monte Carlo methods currently in use are
unsatisfactory in several respects. If these techniques are 
to be used in the future, the distribution of ratings of the 
sample set should either be proven to be normal or should be 
re-defined. In addition, the effects on the rating 
distribution of various rating techniques should be 
analyzed. 
Another future study could be devoted to analysis of
the characteristics of folding strategies other than the one
proposed here. It might be found that folding techniques
can be developed which optimize a placement with respect to
other criteria, or which better optimize with respect to
the criteria proposed. LOF technique performance could then
be increased and applicability broadened.
The LOF procedures have been shown to perform
relatively well for STAR-like structures when they are
followed by a simple placement improvement routine. Future
work in this area might incorporate studies of LOF technique
performance as an initial placement procedure when followed
by a more powerful PI method.
Achieving the goals listed above would aid development 
of highly effective placement routines for use with STAR and 
related technologies. In addition, the identification of 
other suitable placement improvement routines may reduce 
execution time with a corresponding reduction in semicustom 
IC development costs. 
REFERENCES 
[l] Franson,P. "Dont Overlook Semicustom ICs for your Next 
Design Project," EDN, Feb.5,1977, pp.72-76. 
-
[ 2 ]  Edge,T., "Semicustom and Custom Integrated Circuits and 
the Standard Transistor Array Radix (STAR) ," 
Electronics and Control Lab, George C. Marshall S ~ a c e  
Flight Center, RTOP: 506-18-31. 
[ 3 ]  Hanan,M. and J. Kurtzberq, "Placement Techniques," 
Chapter 5 in Design ~utomation of aigita: systems,  
-- 
M.A.Breuer , ed., Englewood Cliffs, NJ, Prentlce-iiall, 
[4] Deo,N., Graph Theory with Applications to Engin3erlnq 
Computer Science, Englewood Cliffs, NJ., Prentice-Hall, 
1974, p.32. 
151 Goldstein, A.J. and D.G. Schweikert, "A Proper Model 
for Testing the Planarity of Electrical Circuits," Bell 
System Technical Journal, va1.52, no.1, Jan,1973, 
FP. 135-142. 
[5] Van~1eemput~W.M. and J.G.Linders, "An Improved 
Graph-Theoretic Model for the Circuit Layout Problem," 
Proceedings of the 11th Design ~utomation Workshop, 
Denver, CO., June 17-19,1974, IEEE no. 74 CH0865-6C, 
[7] Kurtzberg,J.M., "Algorithms for Backplane Formation," 
Microelectronics in Large Systems, Spartan Books, 1965, 
PF 51-46 
[a) Shupe,C.F., "Automatic Component Placement in the NOMAD 
System," Proceedings of the 12th Design Automation 
Conference, Boston,MA, June 23-25,1975, IEEE no. 
75 CH0980-3C, pp.162-172. 
[9] Lawler,E.L. and D.Wood, "Branch m a  Bound Methods: A 
Survey," Journal of Operations Research, vo1.14, 1966, 
pp.699-719. 
[la] Little,J., et.al., "An Algorithm for the Travelling 
Salesman Problem," Journal of Operations Research, 
vol.11, Nov.1963, pp.972-989. 
[ll] Gilmore,P., "Optimal and Suboptimal Algorithms for the 
Quadratic Assignment Problem," SIAM Journal, vol .l0, 
no.2, JUner1962, pp.305-313. 
[121 Hillier ,F. and M.Conners, "Quadratic Assignment 
Problems and the Location of Indivisible Facilities," 
Management Science, vo1.13, no.1, Sept.,1966, pp.42-57. 
[I31 Weinberger,A., "Large Scale Integration of MGS Complex 
Logic: A layout M ;hod," IEEE Journal of Solid-State 
Circuits, vol.SC-2, no.4, Dec.,1967, pp.182-198. 
[ 14 I Larsen ,R., "Computer-Aided Preliminary Layout Design of 
Customized MOS Arrays," IEEE Transactions on Computers, 
vo1.C-20, no.5, May,1971, pp.512-523. 
[IS] Yoshizawa,H., H.Kawanishi, and K.Kani, "A Heuristic 
Procedure for Ordering MOS Arrays," Proceedings of the 
12th Dzsign Automation Conference, Boston,MA, June 
23-25,1975, IEEE no. 75 CH0980-3C, pp.384-393. 
[16] Mattison,R., "Design Automation of MOS Artwork," 
Computer, Jan.,1974, pp.21-27. 
[17] Feller ,A . ,  "Automatic Layout of Low-Cos t 
Quick-Turnaround Random-Logic Custom LSI Devices, I' 
Proceedings of the 13th Design Automation Conference , 
San Francisco, CA, June 28-30,1976, IEEE no. 
(181 Hanan,M., P.Wolff, and B.Agule, "A Study of Placement 
Techniques," Journal of - Oesign ~utomation and 
Fault-Tolerant Computing, vol.1, no.1, Oct.,1976, 
pp. 28-61. 
[19] Ciampi,P., "A System for Solution of the Placement 
~roblem," proceedings of the 12th Design Automation 
Conference, Boston,MA, June 23 - 25,19'/5 , IEEE no. 
75 CH0980-3C, pp.317-323. 
(28) Matthews,A., "A Human Engineered PCB Design System," 
Proceedings of the 14th Design Automation Conference, 
New Orleans,LA, June 20-22,1977, IEEE no. 
[211 Steinberg,L. "The Backboard Wiring Problem: A 
Placement Algorithm," SIAM Review, vo1.3, no.1, 
Jan.,1961, pp.37-50. 
[22] Harvey,J. "Automated Board Layout," Proceedings of the 
ACM-IEEE Design Automation Workshop, Dallas,TX, June 
- 28,1972, IEEE no. 72 CH0706-2-SC, pp.264-271. 
[23] Breuer,M.A., " A  Class of Min-Cut Placement Algorithms," 
Proceedings of the 14th Design Automation Conference, - 
New Or leans, LA, June 20-22,1977, IEEE no. 
[24] Hanan,M. and J.M.Kurtzberg, "A Review of the PLasemenc 
Problem and Quadratic Assignment Problems, " - SIAM 
Review, vo1.14, no.2, Apri:,1972, pp.324-342. 
[25] Scanlon,F., 'Automated Placement of FIulr i- ? ? r x ~ : : i ~  
Components," Proceedings of the SHARE-ACM-IEEE desiqn 
Automation Workshop, Atlantic City,NJ, June 28-38,1971 
pp.143-154. 
[261 Giugliano,A. and F.Bosisio, "Present and Future on 
P.C.B. Layout Desi~n Automation at SIT-Siemens , " 
proceedings-of the 12ti Design Autornation C~r~ference, 
Boston,MA, June 23-25,1975, IEEE no. 75 CH0980-3C, 
[271 Quinn,N., "The Placement Problem as Viewed from the 
- - -  
physics of classical Mechanics," Proceedings of the 
12th Design Automation Conference, Boston,MA, June 
23-25,1975, IEEE no. 75 CH0980-3C, pp.173-178. 
[261 Crocker,N., R.McGuffin, and A.Micklethwaite, "Automatic 
ECL LSI Design," Froceedings of the 14th Design 
Automation Conference, Net.$ Orleans,LA, June 20-22,1977, 
IEEE no. 77 CH1216-lC, pp.158-167. 
[29] Ueda,K., Y.Sugiyama, and K.Wada, "An Automatic Layout 
System for Masterslice LSI: MARC," IEEE Journal of 
Solid-State Circuits, vol.SC-13, no.5, Oct.,1978, 
pp.716-721. 
[30] Wilson,D. and R.Smith, "An Exper~mental Comparison cf 
Force Directed Placement ~echnlques," proceedings of 
the 11th Desiqn Automation Workshop, Denver,CO, June 
17-19,1974, IEEE no. 74 CHQ865-6C, pp.194-199. 
(311 Schuler,D. and E.Ulrich, "Clustering and Linear 
Placement," Proceedings of the ACM-IEEE Design 
Automation Workshop, Dallas,TX, June 26-28,1972, IEEE 
no. 7 2  CH0706-2-SC, pp.50-56. 
(321 COxrG. and B.Carrol1, "Placement Techniques for the 
Standard Transistor Array (STAR)," Technical Report, 
NASA Contract no. NAS8-31572, Electrical Engineering 
Dept., Auburn University, Auburn,AL, Sept.,1978. 
[33] Graham,R., "The Combinatorial Mathematics of 
Schjduling, " Scientific American, March,1978, 
pp.124-132. 
1341 Cox, G. and B. Carroll, "CAPSTAR Programmer's Guide," 
NASA Contract no. NAS8-31572, Electrical Engineering 
Dept., Auburn University, Auburn, AL, July, 1979. 
(351 Cox, G. and B. Carroll, "CAPSTAR User's Guide," .NASA 
Contract no. NAS8-31572, Electrical Engineering Dept., 
Auburn University, Auburn, A L ,  July, 1979. 
APPENDIX A
EQUATIONAL DEVELOPMENT
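Development of Equations 3-1 and 3-2
The area of interest is the entire STAR, so
$$U(H) = \frac{L}{4\,\mathrm{ROWS}\cdot\mathrm{COLS}}\left[\frac{(n-1)(W+1)}{n}\,F(1) + \text{[illegible]}\right]$$
$$U(V) = \frac{L}{\mathrm{ROWS}\cdot\mathrm{COLS}}\left[\frac{2}{n}\,F(1) + \frac{4}{n}\,F(2)\right]$$
Complete filling of the STAR is assumed. Then
$$\mathrm{COLS} = nW$$
and
[equation illegible]
where C is the number of cells.
Then
[equation illegible]
Using the same worst-case conditions as in Chapter 2,
[equation illegible]
Then, for F(1) = 0.5 and F(2) = 0.1,
$$U(H)_{wc} = (0.17)\left(1 - \frac{1}{\text{[illegible]}}\right) - (0.12)\left(\frac{1}{\text{[illegible]}}\right) + 1.5$$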
Development of Equations 3-3 and 3-4
Initially,
$$U(H) = \frac{L}{4\,\mathrm{ROWS}\cdot\mathrm{COLS}}\left[\frac{(n-1)(W+1)}{2n}\,F(1) + \text{[illegible]}\right]$$
$$U(V) = \frac{L}{\mathrm{ROWS}\cdot\mathrm{COLS}}\left[\frac{n+1}{n}\,F(1) + \frac{2n+1}{n}\,F(2)\right].$$
By the same reasoning as in the previous development,
$$\frac{L}{\mathrm{ROWS}\cdot\mathrm{COLS}} = \text{[illegible]},$$
for the worst case.
Then, for F(1) = 0.5 and F(2) = 0.1,
[equation illegible]
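The worst-case ETD occurs if the unused transistors lie as far
as possible toward the ends of each row. $\mathrm{ETD}_{WC}$, then, is the ETD
obtained when the unused transistors are evenly divided among the rows,
and evenly divided on a row between the two ends.
Then,
$$\mathrm{ETD}_{WC} = (2\,\mathrm{ROWS})\sum_{i=1}^{\mathrm{MT}/(2\,\mathrm{ROWS})}\left(\frac{\mathrm{COLS}}{2} - i\right)$$
$$\mathrm{ETD}_{WC} = (2\,\mathrm{ROWS})\left(\frac{\mathrm{MT}}{2\,\mathrm{ROWS}}\right)\left(\frac{\mathrm{COLS}}{2}\right) - (2\,\mathrm{ROWS})\sum_{i=1}^{\mathrm{MT}/(2\,\mathrm{ROWS})} i$$
$$\mathrm{ETD}_{WC} = \frac{\mathrm{MT}\cdot\mathrm{COLS}}{2} - (2\,\mathrm{ROWS})\left(\frac{\mathrm{MT}}{4\,\mathrm{ROWS}}\right)\left(\frac{\mathrm{MT}}{2\,\mathrm{ROWS}} + 1\right)$$
$$\mathrm{ETD}_{WC} = \frac{\mathrm{MT}\cdot\mathrm{COLS}}{2} - \frac{\mathrm{MT}}{2} - \frac{\mathrm{MT}^{2}}{4\,\mathrm{ROWS}}$$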
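The closed form for the worst-case ETD can be checked numerically against the summation it is derived from. The sketch below assumes the sum is taken over the MT/(2 ROWS) unused transistors at each of the 2 ROWS row ends, with term COLS/2 - i, and that MT divides evenly by 2 ROWS; the function names and the sample dimensions are hypothetical.

```python
from fractions import Fraction

def etd_wc_sum(rows, cols, mt):
    """Direct summation: 2*ROWS row ends, MT/(2*ROWS) unused transistors per end."""
    per_end = mt // (2 * rows)  # assumes MT is an exact multiple of 2*ROWS
    return 2 * rows * sum(Fraction(cols, 2) - i for i in range(1, per_end + 1))

def etd_wc_closed(rows, cols, mt):
    """Closed form: MT*COLS/2 - MT/2 - MT^2/(4*ROWS)."""
    return Fraction(mt * cols, 2) - Fraction(mt, 2) - Fraction(mt * mt, 4 * rows)

# Hypothetical array dimensions and unused-transistor counts (illustration only).
for rows, cols, mt in [(4, 40, 32), (3, 25, 12)]:
    print(rows, cols, mt, etd_wc_sum(rows, cols, mt), etd_wc_closed(rows, cols, mt))
```

Exact rational arithmetic is used so that odd values of COLS do not introduce rounding differences between the two forms.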
