Area penalty for sublinear signal propagation delay on chip by Vitányi, P.M.B. (Paul)
Centrum voor Wiskunde en Informatica
Centre for Mathematics and Computer Science 
Department of Computer Science/ Algorithmics & Architecture 
Paul M.B. Vitanyi 
Area penalty for sublinear signal 
propagation delay on chip 
(Preliminary Version) 
Report CS-R8514 Augustus 
The Centre for Mathematics and Computer Science is a research institute of the Stichting
Mathematisch Centrum, which was founded on February 11, 1946, as a nonprofit institution
aiming at the promotion of mathematics, computer science, and their applications. It is sponsored by
the Dutch Government through the Netherlands Organization for the Advancement of Pure
Research (Z.W.O.).
Copyright © Stichting Mathematisch Centrum, Amsterdam
Area Penalty for Sublinear Signal 
Propagation Delay on Chip 
(Preliminary Version) 
Paul M.B. Vitanyi 
Centre for Mathematics & Computer Science (CWI)
Kruislaan 413, 1098 SJ Amsterdam, The Netherlands 
Sublinear signal propagation delay in VLSI circuits carries a far greater penalty in wire area 
than is commonly realized. Therefore, the global complexity of VLSI circuits is more layout 
dependent than previously thought. This effect will be truly pronounced in the emerging 
wafer scale integration technology. We establish lower bounds on the trade-off between 
sublinear signalling speed and layout area for the implementation of a complete binary tree 
in VLSI. In particular, sublinear delay can only be realized at the cost of superlinear area. 
Designs with equal length wires either cannot be laid out at all, viz. for logarithmic delay, or
require such long wires in the case of radical delay (i.e., rth root of the wire length) that the
intended gain in speed is cancelled. Also, for wire length distributions commonly occurring
on chip, it appears that the requirements for sublinear signal propagation delay tend to
cancel the gain.
1980 Mathematics Subject Classification: 6p5, 94C99
CR Categories: B.7.0, F.2.3
Keywords & Phrases: very large scale integrated circuits (VLSI), wafer scale integration,
sublinear signal propagation delay, electronic principles, driving long wires, wire aspect
ratio, circuit topology, complete binary tree circuits, H-tree layout, layout area, time,
computational complexity and efficiency, actual wire length distributions, Rent's Rule
Note: This paper will be presented at the 26th Annual IEEE Symposium on Foundations of 
Computer Science, held at Portland, Oregon, USA, October 21-23, 1985. 
1. INTRODUCTION 
The aim of this paper is to correct a widespread misunderstanding. In the literature
on the theory of VLSI algorithms and complexity it is generally, but erroneously, held that
signal propagation delay logarithmic in the wire length can be achieved on chip at the cost of
an area overhead per wire which is linear (say 10%) in the area taken originally by the
wire. Since the area penalty in fact needs to be far greater (viz. quadratic in the wire length),
many known lower bounds on area × time or area × time² in the sublinear time range
are far too low, while some known upper bounds are actually false. Recall also that the
probability of a chip being flawed, due to the random defects introduced by the fabrication 
process, rises exponentially with its area. Therefore, even small area increases necessary to 
obtain faster signalling speed on chip may already be prohibitive. 
This paper addresses the basic model for VLSI computation. It has consequences for vir-
tually all work in the field which considers sublinear propagation delay versus layout area 
complexity. 
1.1. Background.
In current chips, synchronization requirements slow down the computation to a clocked 
switching time in the order of the delay in the longest wire. Thus, the overall efficiency of 
many very large scale integrated (VLSI) electronic switching circuits depends strongly on the 
signal propagation delay in long wires. As the minimal feature width continues to decrease 
into the submicron range this delay governs overall performance more and more. This
seems especially true for the emerging wafer scale integration technology, which
manufactures chips more than 4 inches across. Within a particular technology, the only way
currently known to obtain sublinear signal propagation delay (in the literature usually
logarithmic) in long wires is by fitting a hierarchy of driver transistors to long wires, as
suggested by [9]. The area cost of the ramp of driver transistors is then not more than part of
the area taken by the driven wire, while the width of the wire is generally, but erroneously,
assumed to remain the same unit width, depending on the minimal feature width of the
underlying technology. However, in [10] it appears that life is not so simple. Viz., the
logarithmic delay assumption is incompatible with the constant wire width assumption.
To achieve a propagation delay logarithmic in the length of the wire, as in [9], electronic 
considerations show [10] that all wires need have the same ratio between width and length, 
that is, the same aspect ratio. (Thus, in multilayered chips now being manufactured, the 
communication wires are grouped in metal layers according to length. In a layer with a 
group of longer wires those wires are proportionally wider and thicker.) We show how the 
driver-hierarchy method to obtain logarithmic signal propagation delay in [9] can also be 
used to obtain radical (rth root) delay with similar (but less pronounced) effects on the 
width of long wires. 
1.2. Outline of the Results 
First we treat the electronic background in somewhat more detail than [10]. As a
consequence, it turns out that the efficiency of VLSI circuits with sublinear propagation delay is
more layout sensitive than hitherto assumed. To demonstrate this, we analyse the effect of 
the sublinear delay requirement on a basic circuit: the complete binary tree topology. (For 
more intricate circuits requiring many more long wires, like n-cubes, butterfly networks, 
cube-connected-cycles networks, the area penalty for sublinear delay is far greater. For 
matrix networks the area penalty is obviously nil.) We show for logarithmic delay, with a
constant aspect ratio a for the wires and using c layers:
• Every layout of a complete binary tree with N nodes takes $\Omega(N \log^{a/6c} N)$ area*, so also the
H-tree layout of [9, 8].
• For synchronization it may be required that all wires have the same length. A layout for
a complete binary tree with all wires of equal length turns out to be impossible for large
enough N.
* We use the order-of-magnitude symbols as follows:
$g(n) \in O(f(n))$ if there exists a positive constant c such that $g(n) \le c\,|f(n)|$ for all but finitely many positive integers n.
$g(n) \in \Omega(f(n))$ if there is a positive constant c such that $g(n) \ge c f(n)$ for infinitely many positive integers n.
$\Theta(f(n)) = O(f(n)) \cap \Omega(f(n))$.
For radical delay, that is, delay according to the rth root of the wire length (using a constant 
number of layers) we find: 
• The layout area of a complete binary tree is bounded below by the product of the 
number of nodes N and an unbounded function of r. 
• For layouts of complete binary trees with equal length wires, the wire length needs to
increase exponentially with r. Therefore, the gain in speed is lost completely because of the
longer wires, while the required layout area nonetheless rises exponentially with r.
• We also briefly investigate the effect on natural wire length distributions. Using plausible
arguments and empirical data from actual chips, [5] determines what wire length
distributions tend to occur. For these distributions it appears that the gain of having sublinear
propagation delay is cancelled by the requirements on wire area. (Because the wires need
to be much longer, the faster signals take as long to traverse a particular wire as they did
before.) Worse, logarithmic delay may be impossible outright.
1.3. Related work.
Almost all work in the theory of computing of sublinear propagation delay VLSI models 
is related in some way to the present issue. E.g., apart from the fact that we require wide 
wires to obtain a sublinear propagation delay, we also need to insert the drivers to drive the 
long wires (cf. next section). In [9, 8, 10], it is shown that such drivers need an area
proportional to (say 10% of) the length of the wire. In [12] questions are treated concerning the
insertion of these drivers in given layouts with unit width wires before and after insertion.
Some effects of increasing wire width in layouts, or related issues, have been treated in
various contexts in, e.g., [13, 6].
In the computational models rampant in the literature, the assumptions concerning signal 
propagation delay in long wires range from constant delay (irrespective of the wire length) 
[17, 2, 3, 21, 15], via logarithmic [11, 19, 18] and linear delay [1, 4], to signal propagation
delay that is quadratic in the length of wires [1, 4, 16]. In all of these models the width (or
thickness) of wires is assumed to be a unit, depending on the minimal feature width of the
underlying technology, as seems suggested in [9, 8]. However, for sublinear signal
propagation delay this is a misunderstanding [10], the consequences of which are explored in the
present paper.
2. ELECTRONIC BASICS 
The time it takes a minimum transistor to drive a wire of length $L$, width $W$ and
thickness $H$ can be estimated as follows. The wire is assumed to have distance $D_l$ to
neighbouring layers and $D_w$ to other wires in the same layer. If $W_0$ is the minimal width of a wire in
the current technology, then the minimal transistor, consisting of a wire crossing, occupies
area $W_0^2$. The total time $T$ to drive a wire is approximated by:
$$T \approx R_t C_w + R_w C_w \tag{1}$$
where $R_t$ is the resistance of the minimum transistor, $R_w$ the resistance of the wire and $C_w$
its capacitance. Therefore, the total time $T$ can be thought of as the sum of the time $T_d$
needed to drive a zero resistance wire of capacitance $C_w$, and the time $R_w C_w$ needed to
transport the appropriate charge from a zero resistance source. Roughly, $T_d$ is the time
needed to transport the necessary charge through the bottleneck consisting of the switch
(the minimal transistor), and $R_w C_w$ is the time needed to distribute the charge appropriately
over the wire $w$. Since the resistance of a wire is proportional to its length and
inversely proportional to its cross section we have:
$$R_w = \rho_w \frac{L}{WH} \tag{2}$$
where $\rho_w$ is the resistivity of the considered wire material. The capacitance of a wire is
inversely proportional to the distance of its neighbouring wires and layers, and proportional
to the area of the side facing that neighbouring layer or wire:
$$C_w = 2\epsilon_w L \left( \frac{H}{D_w} + \frac{W}{D_l} \right) \tag{3}$$
where $\epsilon_w$ is a proportionality constant consisting of the product of the permittivity of free
space and the dielectric constant of the insulating material (usually SiO$_2$). Thus,
$$R_w C_w = 2\rho_w\epsilon_w \frac{L^2}{WH} \left( \frac{H}{D_w} + \frac{W}{D_l} \right). \tag{4}$$
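As a rough numerical illustration of (2), (3) and (4), the following Python sketch evaluates the wire resistance, the wire capacitance and their product. The function names and the unit parameter values used in the example call are assumptions made for illustration only; they do not describe any actual technology.

```python
def wire_resistance(length, width, height, rho_w):
    """Equation (2): R_w = rho_w * L / (W * H)."""
    return rho_w * length / (width * height)


def wire_capacitance(length, width, height, d_w, d_l, eps_w):
    """Equation (3): C_w = 2 * eps_w * L * (H / D_w + W / D_l)."""
    return 2.0 * eps_w * length * (height / d_w + width / d_l)


def wire_rc(length, width, height, d_w, d_l, rho_w, eps_w):
    """Equation (4): the product R_w * C_w, quadratic in L for fixed cross section."""
    return (wire_resistance(length, width, height, rho_w)
            * wire_capacitance(length, width, height, d_w, d_l, eps_w))


# Hypothetical unit parameters: doubling L with all other dimensions fixed
# multiplies R_w * C_w by four.
ratio = wire_rc(2.0, 1, 1, 1, 1, 1, 1) / wire_rc(1.0, 1, 1, 1, 1, 1, 1)
print(ratio)
```

With all dimensions other than $L$ held fixed, the product grows with the square of the length, which is the quadratic behaviour noted in the text.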
This suggests a signal propagation time quadratic in $L$. However, the resistance $R_t$ of the
minimum transistor dominates in (1) for the magnitudes of $L$ under consideration (smaller
than, say, 1 foot). We can decrease that term by fitting a larger driver transistor to the
wire. This transistor, in its turn, must be driven by the minimal transistor. Iterating this
scheme, cf. [9], we obtain a sequence of transistors, each of which is a factor $\alpha$
larger than the preceding one. The final transistor in the sequence should be large enough
to drive the wire in a sufficiently short time. (We can think of this scheme as a sequence of
switches where each switch serves to switch the next larger switch, and the largest switch in
the sequence controls the large channel through which the charge is transported to the
wire. Although the time to actually pass the appropriate charge from source to wire can be
made smaller by fitting a larger final driver transistor to the sequence, there seems to be no way
to get rid of the time needed to switch all transistors in between the smallest transistor and
the largest one.) The time to drive a driver with capacitance $C_2$ by a driver with smaller
capacitance $C_1$ is given by [9]:
$$T = \tau \frac{C_2}{C_1} \tag{5}$$
where $\tau$ is the time it takes a minimal transistor to charge the gate of another minimal
transistor. If $C_t$ is the capacitance of the minimal transistor then for a ramp of $r$ drivers:
$$\frac{C_w}{C_t} = \alpha^r \tag{6}$$
taking $T_d = r\alpha\tau$ time to charge the wire if it had no resistance. The capacitance of the
minimum transistor is given by
$$C_t = \frac{\epsilon_t W_0^2}{D_0} \tag{7}$$
where $D_0$ is the thickness of the gate insulator and $\epsilon_t$ is the product of the permittivity of
free space and the dielectric constant of the gate insulator. Thus we can drive a zero
resistance wire of capacitance $C_w$ through a sequence of $r$ drivers for fixed $\alpha$ in time:
$$T_d = \alpha\tau \log_\alpha \frac{C_w}{C_t} \tag{8a}$$
We can also use a ramp of $r$ drivers, for some fixed $r$, and choose $\alpha$ accordingly:
$$T_d = r\tau \left( \frac{C_w}{C_t} \right)^{1/r} \tag{8b}$$
From (1), (3), (4) and (8a) we obtain an expression for $T$:
$$T \approx \alpha\tau \log_\alpha \frac{C_w}{C_t} + 2\rho_w\epsilon_w \frac{L^2}{WH} \left( \frac{H}{D_w} + \frac{W}{D_l} \right) \tag{9a}$$
From (1), (3), (4) and (8b) we obtain:
$$T \approx r\tau \left( \frac{C_w}{C_t} \right)^{1/r} + 2\rho_w\epsilon_w \frac{L^2}{WH} \left( \frac{H}{D_w} + \frac{W}{D_l} \right) \tag{9b}$$
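The two ways of dimensioning the driver ramp can be compared numerically. The sketch below assumes that each stage costs $\alpha\tau$ by (5) and that the ramp of $r$ stages satisfies $\alpha^r = C_w / C_t$; the fixed-$r$ variant is obtained from that assumption by solving for $\alpha$. Function names and sample values are hypothetical.

```python
import math


def ramp_delay_fixed_alpha(c_w, c_t, alpha, tau):
    """Fixed stage ratio alpha: T_d = alpha * tau * log_alpha(C_w / C_t)."""
    return alpha * tau * math.log(c_w / c_t, alpha)


def ramp_delay_fixed_r(c_w, c_t, r, tau):
    """Fixed number of stages r: alpha = (C_w / C_t)**(1/r), T_d = r * alpha * tau."""
    alpha = (c_w / c_t) ** (1.0 / r)
    return r * alpha * tau


def direct_delay(c_w, c_t, tau):
    """No ramp: the minimal transistor drives the load directly, tau * C_w / C_t."""
    return tau * c_w / c_t


# Hypothetical load one million times the minimal gate capacitance.
load, gate, tau = 1.0e6, 1.0, 1.0
print(direct_delay(load, gate, tau))
print(ramp_delay_fixed_alpha(load, gate, math.e, tau))
print(ramp_delay_fixed_r(load, gate, 3, tau))
```

For a large capacitance ratio both ramps are far faster than driving the load directly; the fixed-$\alpha$ ramp grows logarithmically in the ratio and the fixed-$r$ ramp grows as its $r$th root.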
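It is therefore clear that the signal propagation time depends heavily on the various
dimensions and materials involved in the chip. The relation (8a) can be considered a borderline
case of (8b). In [10] it was observed that by keeping the derivatives, with respect to $L$, of
the two terms in the right-hand side of equations like (9a) and (9b) balanced:
$$\frac{\alpha\tau}{L \ln\alpha} \sim \rho_w\epsilon_w \frac{L}{WH} \left( \frac{H}{D_w} + \frac{W}{D_l} \right), \tag{10a}$$
$T$ grows logarithmically in $L$. With
$$\tau \left[ \frac{\epsilon_w D_0 L^{1-r}}{\epsilon_t W_0^2} \left( \frac{H}{D_w} + \frac{W}{D_l} \right) \right]^{1/r} \sim \rho_w\epsilon_w \frac{L}{WH} \left( \frac{H}{D_w} + \frac{W}{D_l} \right), \tag{10b}$$
$T$ grows as the $r$th root of $L$. Viz., from (9a) we obtain by assumption of equality (10a):
$$T \approx \frac{\alpha\tau}{\ln\alpha} \left\{ \ln \left[ \frac{\epsilon_w D_0 L}{\epsilon_t W_0^2} \left( \frac{H}{D_w} + \frac{W}{D_l} \right) \right] + 1 \right\} \tag{11a}$$
and from (9b) we obtain by assumption of equality (10b):
$$T \approx (r+1)\tau \left[ \frac{\epsilon_w D_0 L}{\epsilon_t W_0^2} \left( \frac{H}{D_w} + \frac{W}{D_l} \right) \right]^{1/r} \tag{11b}$$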
In the next section we establish the area penalty involved with a wire of given length L. 
Remark. Different ratios between the successive capacitances of the transistors in the ramp 
of drivers can be used to obtain, for instance, noninteger values for r in the formulas above. 
This issue is not addressed in the present paper. 
Without a ramp of driver transistors, having the minimum transistor drive the total wire
outright, we obtain from (1), (4), (5) and (7):
$$T \approx \tau \frac{C_w}{C_t} + C_w R_w = \left( \frac{\tau D_0}{\epsilon_t W_0^2} + \rho_w \frac{L}{WH} \right) 2\epsilon_w L \left( \frac{H}{D_w} + \frac{W}{D_l} \right) \tag{12}$$
The above analysis shows that we can reduce the signal propagation time by employing
new materials with more favorable characteristics, like Gallium-Arsenide and
Silicon-on-Sapphire technologies [22]. We can also change the size of the wires and the interwire
and interlevel separation. To increase signalling speed, extra layers with wider and thicker
wires are used for the long interconnect wires.
3. SUBLINEAR DELAY AND WIRE DIMENSIONS 
3.1. Logarithmic Delay and Constant Aspect Ratio 
Under assumption (10a) we can obtain a logarithmic signal propagation delay by, all
other things being equal, maintaining:
$$L^2 \left( \frac{1}{W D_w} + \frac{1}{H D_l} \right) = \text{constant}, \tag{13a}$$
rather than by just keeping $L^2$ proportional to $WH$ as in [10]. Keeping the interwire
distance proportional to the wire width, and the interlayer distance proportional to the wire
height, we observe that if $W$, $H$ and $L$ are kept in proportion a logarithmic propagation
delay is attained. (Note that we cannot reach this effect by keeping the wire width the
same but using very 'tall' wires or vice versa.) The aspect ratio of a wire is the quotient of its
width and length. To obtain a logarithmic signal propagation delay we thus need the fixed
constant aspect ratio following from (10a) and (13a) for all wires in the layout. In designing
such a high speed layout we therefore need to install drivers to drive the long wires and to
design all wires with a constant aspect ratio $a > 0$. Therefore, a wire of length $L$ in such a
layout has area $aL^2$. The area taken by the driver is linear in the length of the wire [10]:
the minimal transistor occupies area $W_0^2$, the next driver area $\alpha W_0^2$, and so on for $\log_\alpha L$
terms for an $L$-length wire. The total driver area for an $L$-length wire becomes
$W_0^2 (L-1)/(\alpha-1)$. This area is required at the lowest silicon layer of the chip; the long
interconnect wires are executed in the upper metal layers.
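The contrast between the linear driver area and the quadratic wire area can be checked with a small Python sketch. It sums the geometric series of driver areas stage by stage and compares the result with the closed form $W_0^2 (L-1)/(\alpha-1)$. The function names and the sample values of $L$, $\alpha$ and the aspect ratio are assumptions for illustration; the closed form is exact only when $L$ is a power of $\alpha$.

```python
def driver_area(length, alpha, w0=1.0):
    """Sum the areas W0^2 * alpha^k of the ramp stages, k = 0 .. log_alpha(L) - 1."""
    total = 0.0
    scale = 1.0
    while scale < length:
        total += w0 * w0 * scale
        scale *= alpha
    return total


def driver_area_closed_form(length, alpha, w0=1.0):
    """Closed form of the geometric series: W0^2 * (L - 1) / (alpha - 1)."""
    return w0 * w0 * (length - 1.0) / (alpha - 1.0)


def wire_area(length, aspect):
    """Area a * L^2 of a wire with constant aspect ratio a."""
    return aspect * length * length


# Hypothetical wire of length 1024 units, stage ratio 2, aspect ratio 0.1.
print(driver_area(1024.0, 2.0), driver_area_closed_form(1024.0, 2.0))
print(wire_area(1024.0, 0.1))
```

Even with a small aspect ratio, the wire area dominates the driver area for long wires, which is the point of this section.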
3.2. Radical Delay 
Under assumption (10b) we can obtain a signal propagation delay of the order of the $r$th
root of the length of the wire under a certain balancing of the dimensions
of the wire:
$$\frac{L^{2-\frac{1}{r}}}{WH} \left[ \frac{H}{D_w} + \frac{W}{D_l} \right]^{1-\frac{1}{r}} = \text{constant}. \tag{13b}$$
Call this type radical delay. (Together with the previous logarithmic delay this essentially
exhausts all possibilities for sublinear propagation delay obtainable by the method of [10].)
We assume that all dimensions (but for $L$) are scaled proportional to the same radical
fraction $L^{\alpha}$ of $L$ ($0 < \alpha < 1$). So each parameter $X \ne L$ in (13b) satisfies $X = a_X L^{\alpha}$, for some
constant $a_X$ depending only on $X$. Therefore, for fixed given $r$ equation (13b) determines $\alpha$
by:
$$r = \frac{1}{2(1-\alpha)} \tag{14}$$
(The logarithmic delay of (13a) is, in a certain sense, a limiting case of this radical delay.)
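The step from (13b) to (14) can be written out as follows. This is a worked sketch under the stated scaling assumption; the common scaling exponent is written $\alpha$ here, and the proportionality constants $a_W$, $a_H$, $a_{D_w}$, $a_{D_l}$ are names introduced only for this derivation.

```latex
\begin{align*}
W &= a_W L^{\alpha},\quad H = a_H L^{\alpha},\quad
D_w = a_{D_w} L^{\alpha},\quad D_l = a_{D_l} L^{\alpha}, \\
\frac{L^{2-\frac{1}{r}}}{WH}
  \left[\frac{H}{D_w}+\frac{W}{D_l}\right]^{1-\frac{1}{r}}
  &= \frac{L^{2-\frac{1}{r}-2\alpha}}{a_W a_H}
  \left[\frac{a_H}{a_{D_w}}+\frac{a_W}{a_{D_l}}\right]^{1-\frac{1}{r}}, \\
\text{constant in } L
  &\iff 2-\frac{1}{r}-2\alpha = 0
  \iff \alpha = 1-\frac{1}{2r}
  \iff r = \frac{1}{2(1-\alpha)}, \\
\text{wire area } WL &= a_W L^{1+\alpha} = a_W L^{2-\frac{1}{2r}}.
\end{align*}
```

The bracketed factor is independent of $L$ because numerator and denominator scale alike, so only the leading power of $L$ has to vanish.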
Hence, to obtain a propagation delay proportional to the $r$th root of the length $L$ of the
wire the dimensions need to be scaled proportional to $L^{1-1/2r}$ and a wire of length $L$ takes
area
$$\Omega(L^{2-\frac{1}{2r}}) \tag{15}$$
For $r = 1/2$ this yields the limiting worst-case quadratic delay with all dimensions (but $L$)
scaled proportional to constants like the minimum feature width. For $1/2 \le r \le 1$, however,
there is another way, by inserting repeaters (inverters) at constant intervals in the long
wires. This gives linear propagation delay ($r = 1$) at an area cost in repeaters that is only linear in
the length of the wire. Therefore, we are only interested in the case $r > 1$, that is,
sublinear propagation delay.
Note that the effect of the scaling to obtain sublinear propagation delay is not
confined to the area but extends to the height of the chip, since all dimensions need to be
scaled proportionally. So, the volume of a wire of length $L$ needs to be $\Omega(L^3)$ for logarithmic
propagation delay, by (13a), and $\Omega(L^{3-1/r})$ for an $r$th root of $L$ propagation delay, similar
to (15).
4. AREA, LENGTH AND TIME 
The area for a VLSI layout is expressed in A area units. The area unit is the square of
the basic length unit, which is the feature width of the underlying technology. This is
currently $4\cdot10^{-6}$ to $10\cdot10^{-6}$ meter and is expected to continue to decrease into the submicron
range in the near future.
• The area is taken to be the area of the smallest convex region enclosing the layout.
• There is a cross-over constant $c > 0$ such that no unit circle encloses points of more than $c$
different edges (wires) or nodes (components or transistors).
(In case we allow an unlimited amount of cross-over, we should consider the worst-case
'area × cross-over' product instead of the area. Effectively, we then consider
3-dimensional 'layered' chips, which is outside the scope of this paper.)
4.1. Time 
The execution time of a problem instance is the time elapsed between the entering of the 
first bit of the problem instance in the circuit and the leaving of the last bit of the answer 
from the circuit. In pipelined and especially in systolic computations [9] the period is impor-
tant. The period is the time elapsed between entering the first bit of a problem instance and 
the first bit of a next problem instance. In 'moving belt' type computation the period can 
be substantially less than the execution time. Below, the period of a (systolic) computation
appears to be more sensitive to the propagation delay assumption than the overall execution
time. If the signal propagation delay depends on the length of the wire the signal has
to traverse, then the minimax edge length in the layout will determine the period in a systolic
network. The minimax edge length (or wire length) e(.) of a layout for a given circuit is the
minimum over all layouts, implementing the circuit, of the length of the longest wire in
such a layout. See also [11].
4.2. Implementation Details of Area and Time 
We may assume that the circuits are laid out on a Manhattan grid. In [7] algorithms 
are presented to embed easily separated graphs efficiently in grids. The considerations 
below assume that the processing elements have unit area and the links between them have 
unit bandwidth. This view captures the underlying communication structure and leaves
the precise implementation free. For example, to communicate a word of k bits between two
processing elements, one can either use a link of bandwidth k and one cycle or use a link of 
bandwidth 1 and k cycles. Let each of the P processing elements actually fit in area U and 
let the total area used by the links normalized to bandwidth 1 be L. Let W be the 
bandwidth of the precise implementation, e.g., word-parallel or word-serial communication. 
Using methods of [7], a rough upper bound on the total actual chip area is given by 
$A_p \in O(PU)$ plus $A_w \in O(LW^2)$, where $A_p$ is the area taken by the processing elements
and $A_w$ is the area for the wires. We follow the normalization generally adhered to, so by a
layout area A we mean P + L, that is, we normalize the processing elements to unit area
and the wires to unit bandwidth. Concomitant with the assumption of unit bandwidth, we
also assume that each communication between processors concerns a unit (bit). Once the
bandwidth and sizes of messages are resolved, the estimates give a basis for the actual chip
area and the actual chip times.
5. TREE CIRCUIT AND SUBLINEAR DELAY 
Complete binary trees are basic ingredients for many circuits. They are exemplary for
the embedding of a large class of hierarchical circuits in layouts on silicon. Such circuits are
obvious candidates on which to test the effects of the wire area penalty for sublinear delay 
on chip. 
The described effect of sublinear delay on short-wire-length layouts like two-dimensional 
Manhattan circuits is nil because all wires have the same constant length in the layout, 
while on circuits like fast permutation networks (cube-connected cycles, perfect shuffle, 
butterfly) the effects are very pronounced, and perhaps disastrous, because necessarily there 
are many long wires in the layout [20]. Does the wire area penalty involved with sublinear 
signal delay imply a significant overall layout area penalty for that most significant exam-
ple in between: the complete binary tree? This depends on whether the layout for such a 
tree has long wires. 
Recall that the H-tree layout for a complete binary rooted tree [9] achieves area less
than 4N for an N-node complete binary tree, under the unit wire width assumption. Below we
give upper and lower bounds on the layout area for a complete binary tree circuit with (i)
logarithmic signal propagation delay and (ii) $r$th root signal propagation delay, according to the
model described in the previous section.
Proposition. Each layout of a complete binary tree with $N$ leaves (i.e., $2N-1$ nodes), using $A$
area, contains wires of length at least $e(N) = \sqrt{A}/(2\log N)$. Moreover, there is a path from the root
to a leaf of total length at least $\Omega(\sqrt{A})$.
Proof sketch. We can find two points $p$ and $q$ in the layout which are at least $\sqrt{A}$ units
apart. $p$ and $q$ can be nodes or locations on a wire. Going from $p$ to $q$ along the edges of the
tree cannot cause us to traverse more than $2\log N$ edges. Hence, there is a wire in the
layout of length at least $\sqrt{A}/(2\log N)$ units. (For instance, if $A \in O(N)$ then the minimax
edge length for a complete binary tree layout is $e(N) \in \Omega(\sqrt{N}/\log N)$.)
If $r$ is the node on the $\sqrt{A}$ length path between $p$ and $q$ which is nearest to the root, then
either $r - p$ or $r - q$ is an $\Omega(\sqrt{A})$-length subpath of a path from the root to a leaf. □
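The bound in the Proposition is easy to evaluate. The sketch below computes $\sqrt{A}/(2\log N)$ for a given layout area and number of leaves; the function name is hypothetical, and the logarithm is assumed to be base 2, as the depth of a complete binary tree suggests.

```python
import math


def min_longest_wire(area, n_leaves):
    """Lower bound sqrt(A) / (2 * log2(N)) on the longest wire in any layout."""
    return math.sqrt(area) / (2.0 * math.log2(n_leaves))


# Hypothetical example: a tree with 2**20 leaves laid out in area 4N.
n = 2 ** 20
print(min_longest_wire(4 * n, n))
```

Even for a linear-area layout the longest wire therefore grows almost as fast as the square root of the number of leaves.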
Definition. We denote by A (N) the minimal area to lay out a complete binary tree with N 
leaves under the different assumptions on wire area and cross over. 
5.1. Logarithmic Signal Propagation Delay. 
Let the area occupied by a wire of length $L$ be $aL^2$ for some constant $0 < a \le 1$.
Upper bound. We analyse the area occupied by an H-tree layout with $N$ leaves and no
overlap. This is an upper bound on $A(N)$. Recall the familiar picture of the H-tree layout
with constant width wires for complete binary trees as given in, for instance, [9]. Let
$N = 2^m$. Let the ratio between the lengths of the wires at two consecutive levels be $\alpha$. That
is, the quotient of the length of a level $k+1$ wire and the length of a level $k$ wire is $\alpha$,
$0 < \alpha < 1$, for all $0 \le k < m$ ($k = 0$ is root level and $k = m$ is leaf level). Considering that
layout, it is not too difficult to see that, with constant aspect ratio $a$,
$$\alpha^k \ge a\alpha^{k-1} + 2\alpha^{k+2}$$
for each level $k$ between 1 and $m$, suffices to lay out the H-tree compactly with no overlap of
wires and nodes. Consequently we obtain
$$0 < a \le \alpha(1-2\alpha^2),$$
and, for both $a, \alpha > 0$,
$$0 < \alpha < \frac{\sqrt{2}}{2} \quad \& \quad 0 < a \le \frac{2}{3\sqrt{6}}.$$
Note that in the limit for $\alpha \to \sqrt{2}/2$ we have that $a \to 0$. For given $a$ and $\alpha$ in the
appropriate ranges, an upper bound on the total wire area plus node area for the H-tree
layout is computed by:
$$A(N) \le a \sum_{k=0}^{\infty} 2^k \alpha^{2k-2m} + 2^{m+1} = \frac{a\alpha^{-2m}}{1-2\alpha^2} + 2^{m+1}.$$
Therefore,
$$A(N) \le C N^{-2\log_2\alpha} + 2N,$$
with $C = a/(1-2\alpha^2)$. From the relation between $a$ and $\alpha$ it follows that $C \le \alpha$ for
$0 < \alpha < \sqrt{2}/2$. Setting $a = (1-2\alpha^2)/2$, so $1/2 \le \alpha < \sqrt{2}/2$ and therefore
$0 < a \le 1/4$ and $C = 1/2$, yields
$$A(N) \le \tfrac{1}{2} N^{1-\log_2(1-2a)} + 2N.$$
Therefore, for each $\epsilon > 0$ there is an $\alpha < \sqrt{2}/2$ such that $A(N) \in O(N^{1+\epsilon})$. Since $a$ must
be greater than 0 this $\epsilon$ remains greater than 0 as well.
For $\alpha \to \sqrt{2}/2$ (so the aspect ratio $a \to 0$) the upper bound on $A(N)$ goes to:
$$\lim_{\alpha \to \sqrt{2}/2} a\,m\,\alpha^{-2m} + 2^{m+1} = aN\log N + 2N.$$
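The upper bound can be checked numerically. The following sketch sums the wire areas of the H-tree level by level, assuming leaf-level wires of unit length, $2^k$ wires at level $k$ and $2^{m+1}$ unit-area nodes, and compares the result with the closed-form bound $C N^{-2\log_2\alpha} + 2N$. Function names and the sample values of $m$, $\alpha$ and $a$ are assumptions for illustration.

```python
import math


def htree_area_sum(m, alpha, a):
    """Wire area a * sum_{k=0..m} 2^k * alpha^(2(k-m)) plus 2^(m+1) for the nodes."""
    wires = sum(a * 2 ** k * alpha ** (2 * (k - m)) for k in range(m + 1))
    return wires + 2 ** (m + 1)


def area_exponent(alpha):
    """Exponent -2 * log2(alpha) of N in the closed-form bound."""
    return -2.0 * math.log2(alpha)


def htree_area_bound(m, alpha, a):
    """Closed-form bound C * N^(-2 log2 alpha) + 2N with C = a / (1 - 2 alpha^2)."""
    n = 2 ** m
    c = a / (1.0 - 2.0 * alpha ** 2)
    return c * n ** area_exponent(alpha) + 2 * n


# Hypothetical values: m = 10, alpha = 0.6 and the largest admissible aspect ratio.
alpha = 0.6
a = alpha * (1.0 - 2.0 * alpha ** 2)
print(htree_area_sum(10, alpha, a), htree_area_bound(10, alpha, a))
print(area_exponent(alpha))
```

The exponent exceeds 1 for every $\alpha < \sqrt{2}/2$ and approaches 1 only as $\alpha$ approaches $\sqrt{2}/2$, where the admissible aspect ratio vanishes.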
Lower bound. Let $N = 2^m$, and let $c$ be the maximal amount of cross-over. We obtain a
lower bound on the layout area for a complete binary tree for any layout, so also the H-tree
layout. For each $i$, $1 \le i \le m$, $A(2^i)$ is the minimum layout area for a complete binary tree $T_i$
with $2^i$ leaves, under the current assumptions of $c$ cross-over and $a$ aspect ratio. Imagine,
for the sake of the argument, a (nonexistent) layout for $T_m$, such that each maximal subtree
determined by a node of $T_m$ takes minimal area. Selecting wires from the maximal length
paths in these subtrees, we sum their areas while taking care that each such wire is counted
only once. Let $T_i$ be a complete binary tree of $i+1$ levels with the root at level 0 and the
leaves at level $i$.
Claim 1. There is a path in $T_i$ of at most $2i$ edges and of length at least $\sqrt{A(2^i)}$ in the
layout. The sum of the areas of the wires in such a path exceeds $aA(2^i)/2ic$.
Proof of Claim. By arguments concerning the diameter of the smallest convex area
containing $T_i$ it is easy to see that there is a path of length $\sqrt{A(2^i)}$ in the layout with
$1 \le j_i \le 2i$ wires. The sum of the areas of the wires in such a path is therefore $\Omega(aA(2^i)/cj_i)$.
(Assume that the area of the wires in the path can be distributed over $c$ levels, so as to keep
$A(2^i)$ as small as possible.) □
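The area estimate in Claim 1 rests on the fact that, for wires of area $aL^2$, a path of fixed total length has least total wire area when its wires are of equal length. A minimal Python sketch of that step follows; the function names and the sample lengths are hypothetical, and the division by the cross-over constant $c$ is left out.

```python
def path_wire_area(lengths, a):
    """Total area of the wires on a path, each wire of length L having area a * L^2."""
    return sum(a * length * length for length in lengths)


def min_path_wire_area(total_length, n_wires, a):
    """Least possible total area: all n wires equal, giving a * total^2 / n."""
    return a * total_length ** 2 / n_wires


# Hypothetical path of total length 12 split over 3 wires, aspect ratio 1.
print(path_wire_area([4.0, 4.0, 4.0], 1.0))
print(path_wire_area([2.0, 4.0, 6.0], 1.0))
print(min_path_wire_area(12.0, 3, 1.0))
```

With total length $\sqrt{A(2^i)}$ and at most $2i$ wires this gives the $aA(2^i)/2i$ term of the claim before the cross-over factor is applied.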
Claim 2. Let $P$ be a path through $T_i$. At each level $j$, $2 \le j \le i$, there are at least 2 maximal
complete binary subtrees of $i-j+1$ levels with roots at level $j$, which have no node in
common with $P$. Moreover, the $2(i-1)$ complete binary subtrees concerned are pairwise
disjoint as well. (This is easy to verify from a simple picture.)
By Claims 1 and 2, we obtain a lower bound on the area $A(2^i)$ of $T_i$ by adding the
minimal possible area of a $\sqrt{A(2^i)}$-length path in $T_i$ to twice the sum, for $j$ ranging from 2
through $i$, of the areas of subtrees $T_{i-j}$:
$$A(2^i) > \frac{aA(2^i)}{2ci} + 2\sum_{j=2}^{i} A(2^{i-j}).$$
Unfolding this inequality we obtain
$$A(2^i) > \sum_{j=0}^{i-1} \frac{aF_j A(2^{i-j})}{2c(i-j)} + \frac{aF_i A(1)}{c}$$
with $F_j$ the $j$-th element of the Fibonacci-like sequence generated by the recurrence relation
$F_i = F_{i-1} + 2F_{i-2}$ with $F_0 = 1$ and $F_1 = 0$. Therefore,
$$F_i = \frac{2^i}{3} + \frac{2(-1)^i}{3}.$$
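The closed form of the Fibonacci-like sequence can be verified directly against the recurrence $F_i = F_{i-1} + 2F_{i-2}$ with $F_0 = 1$ and $F_1 = 0$. The function names in this Python sketch are hypothetical.

```python
def fib_like(n):
    """F_n from the recurrence F_i = F_(i-1) + 2 * F_(i-2), F_0 = 1, F_1 = 0."""
    if n == 0:
        return 1
    previous, current = 1, 0  # F_0, F_1
    for _ in range(n - 1):
        previous, current = current, current + 2 * previous
    return current


def fib_like_closed(n):
    """Closed form (2^n + 2 * (-1)^n) / 3; the numerator is always divisible by 3."""
    return (2 ** n + 2 * (-1) ** n) // 3


print([fib_like(n) for n in range(8)])
print([fib_like_closed(n) for n in range(8)])
```

The dominant term $2^i/3$ is what produces the factor $1/3$ in the constants of the bounds that follow.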
Substituting $A(2^i) = g(i)2^i$ in the inequality above yields
$$g(i) > \frac{a}{6c} \sum_{j=1}^{i} \frac{g(j)}{j} + \frac{a}{3c},$$
which is satisfied for $g(i) \in \Omega(i^{\epsilon})$ with $\epsilon > a/6c$. Hence,
$$A(N) \in \Omega(N \log^{a/6c} N).$$
Although crudely derived, considering only volume without considerations of placement and routing,
this lower bound is nonetheless nonlinear and reflects the necessary influence of both the aspect
ratio $a$ and the cross-over coefficient $c$. It seems likely that more sophisticated arguments for
fixed aspect ratio $a > 0$ will raise the lower bound to $\Omega(N^{1+\epsilon})$, matching the upper bound.
5.2. Radical Signal Propagation Delay 
For radical signal propagation delay proportional to the $r$th root of the length $L$ of the
wire, the area taken by the wire needs to be $aL^{2-1/2r}$, for some constant $a$, by (15).
Upper bound. The upper bound on $A(N)$ is determined by using the H-tree layout
(without overlap) again. Let $N = 2^m$. Let again the quotient between the lengths of the
wires at two consecutive levels $k+1$ and $k$ be $\alpha$, $0 < \alpha < 1$, with the root at level 0 and the
leaves at level $m$. Considering the H-tree layout, it is not too difficult to see that, with aspect
ratio $aL^{-1/2r}$ for wires of length $L$ ($0 < a < 1$),
$$\alpha^k \ge a\alpha^{(k-1)(1-\frac{1}{2r})} + 2\alpha^{k+2},$$
for each level $k$ between 0 and $m$, suffices to lay out the H-tree compactly with no overlap of
wires and nodes. Consequently we obtain
$$0 < a \le \alpha^{1+\frac{k-1}{2r}} (1-2\alpha^2)$$
and, for $a, \alpha > 0$,
$$0 < \alpha < \frac{\sqrt{2}}{2}, \qquad 0 < a < \left( \frac{2r+k-1}{12r+2k-2} \right)^{\frac{2r+k-1}{4r}} \left( 1 - \frac{4r+2k-2}{12r+2k-2} \right).$$
For $r \to \infty$ we obtain the upper bound on $a$ we saw above for the logarithmic delay case.
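The closed-form bound on $a$ is the maximum over $\alpha$ of $\alpha^{1+(k-1)/2r}(1-2\alpha^2)$, attained at $\alpha^2 = (2r+k-1)/(12r+2k-2)$. The sketch below compares that closed form with a plain grid search over $0 < \alpha < \sqrt{2}/2$. Function names, the grid resolution and the sample values of $r$ and $k$ are assumptions for illustration.

```python
def aspect_bound(alpha, r, k):
    """Right-hand side alpha^(1 + (k-1)/(2r)) * (1 - 2 alpha^2) of the constraint on a."""
    p = 1.0 + (k - 1) / (2.0 * r)
    return alpha ** p * (1.0 - 2.0 * alpha ** 2)


def max_aspect_bound(r, k):
    """Closed-form maximum, with x = alpha^2 = (2r + k - 1) / (12r + 2k - 2)."""
    x = (2.0 * r + k - 1) / (12.0 * r + 2.0 * k - 2.0)
    return x ** ((2.0 * r + k - 1) / (4.0 * r)) * (1.0 - 2.0 * x)


def grid_max(r, k, steps=20000):
    """Brute-force maximum of aspect_bound over a uniform grid in (0, sqrt(2)/2)."""
    top = 2.0 ** 0.5 / 2.0
    return max(aspect_bound(top * i / steps, r, k) for i in range(1, steps))


# Hypothetical sample: r = 2, level k = 3.
print(max_aspect_bound(2.0, 3), grid_max(2.0, 3))
# For very large r and k = 1 the value approaches 2 / (3 * sqrt(6)).
print(max_aspect_bound(1.0e9, 1), 2.0 / (3.0 * 6.0 ** 0.5))
```

The second comparison reflects the remark that the logarithmic delay bound is recovered as $r$ grows without limit.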
Note that again for $\alpha \to \sqrt{2}/2$ we have that $a \to 0$. For given $a$ and $\alpha$ in appropriate
ranges, that is,
$$2\alpha^{2-\frac{1}{2r}} < 1$$
and $a$ accordingly, the upper bound on $A(N)$, consisting of the total wire area plus node area
for the H-tree layout, is computed as follows:
$$A(N) \le a \sum_{k=0}^{\infty} 2^k \alpha^{(k-m)(2-\frac{1}{2r})} + 2^{m+1} = \frac{a\alpha^{-(2-\frac{1}{2r})m}}{1-2\alpha^{2-\frac{1}{2r}}} + 2^{m+1}.$$
Therefore,
$$A(N) \le C N^{-(2-\frac{1}{2r})\log_2\alpha} + 2N,$$
with $C = a/(1-2\alpha^{2-1/2r})$. From the constraints on $a$ and $\alpha$ it follows that $C \le \alpha$. Setting
$a = (1-2\alpha^{2-1/2r})/2$ we have $C = 1/2$. Therefore, for each $r \ge 1/2$ and $\epsilon > 0$ there is an
$\alpha < \sqrt{2}/2$ such that $A(N) \in O(N^{1+\epsilon})$.
For
$$2^{-\frac{2r}{4r-1}} \le \alpha < 2^{-\frac{1}{2}}$$
we leave the analysis to the reader.
For $\alpha \to \sqrt{2}/2$ (so $a \to 0$) the upper bound on $A(N)$ goes to:
$$\lim_{\alpha \to \sqrt{2}/2} a \sum_{k=0}^{m} 2^k \alpha^{(k-m)(2-\frac{1}{2r})} + 2^{m+1} = \frac{aN}{1-2^{-1/4r}} + 2N.$$
Lower bound. Let N = 2m. Let aL - l I 2r be the aspect ratio of the wires to obtain rth root 
radical signal propagation delay. Let c be the maximal amount of cross-over. For each i, 
o:s;;;;i :s;;;;m, let A (2i) be the minimum layout area for a complete binary tree T; with 2i leaves. 
Imagine, for the sake of the argument, a virtual layout for Tm, such that each maximal sub-
tree determined by a node of Tm takes minimal area. . Selecting wires from the maximal 
lengths paths in these subtrees, we sum their areas while taking care that each such wire is 
counted only once. 
Claim 1. Let $T_i$ be a complete binary tree of $i+1$ levels with the root at level 0 and the
leaves at level $i$. If the minimal layout area of $T_i$ is $A(2^i)$ then there is a path in $T_i$ of at
most $2i$ edges and of length at least $\sqrt{A(2^i)}$ in the layout. The area taken by this path
must therefore exceed
$$\frac{a}{c}\,(2i)^{-1+\frac{1}{2r}} A(2^i)^{1-\frac{1}{4r}} ,$$
where $aL^{-1/2r}$ is the aspect ratio of the wires and $c$ the maximal amount of cross-over.
Proof of Claim 1. By arguments concerning the diameter of the smallest convex area containing
$T_i$ it is easy to see that there is a path of length $\sqrt{A(2^i)}$ in the layout with
$1 \le j_i \le 2i$ wires. The area of L-length wires is $aL^{2-1/2r}$, and $r > 1$. The sum of the area of
the wires in such a path is therefore least if the path contains as many wires as possible,
that is, $2i$ wires, each of length $\sqrt{A(2^i)}/2i$. By distributing the area of the wires over $c$ levels we obtain the expression
in the claim. □
Claim 2. Let $P$ be a path through $T_i$. At each level $j$, $2 \le j \le i$, there are at least 2 maximal
complete binary subtrees of $i-j+1$ levels with roots at level $j$, which have no node in
common with $P$. Moreover, the $2(i-1)$ complete binary subtrees concerned are pairwise
disjoint as well. (This is easy to verify from a simple picture.)
So by Claims 1 and 2, we obtain a lower bound on the area $A(2^i)$ of $T_i$ by adding the
minimal possible area of a $\sqrt{A(2^i)}$-length path in $T_i$ to twice the sum, for $j$ ranging from 2
through $i$, of the areas of subtrees $T_{i-j}$:
$$A(2^i) > \frac{a}{c}\,(2i)^{-1+\frac{1}{2r}} A(2^i)^{1-\frac{1}{4r}} + 2\sum_{j=2}^{i} A(2^{i-j}) .$$
Unfolding this inequality we obtain
$$A(2^i) > \frac{a}{c}\sum_{j=0}^{i-1} F_j\,(2i-2j)^{-1+\frac{1}{2r}} A(2^{i-j})^{1-\frac{1}{4r}} + \frac{a F_i A(1)}{c}$$
with $F_j$ the j-th element of the Fibonacci-like sequence generated by the recurrence relation
$F_i = F_{i-1} + 2F_{i-2}$ with $F_0 = 1$ and $F_1 = 0$. Therefore,
$$F_i = \frac{2^i + 2(-1)^i}{3} .$$
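As a quick sanity check, the closed form can be compared with the recurrence. The Python sketch below does this directly; the function names are illustrative only.

```python
def fib_like(n):
    """Return [F_0, ..., F_n] for F_i = F_{i-1} + 2*F_{i-2}, F_0 = 1, F_1 = 0."""
    seq = [1, 0]
    for i in range(2, n + 1):
        seq.append(seq[i - 1] + 2 * seq[i - 2])
    return seq[: n + 1]

def closed_form(i):
    """Closed form (2^i + 2*(-1)^i) / 3; the numerator is always divisible by 3."""
    return (2**i + 2 * (-1) ** i) // 3

if __name__ == "__main__":
    print(fib_like(6))                         # [1, 0, 2, 2, 6, 10, 22]
    print([closed_form(i) for i in range(7)])  # same values
```

For large $i$ the term $2(-1)^i$ is negligible, so $F_i$ behaves like $2^i/3$.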
Substituting
$$A(2^i) = D(r,i)\,2^i$$
and changing the summation order in the inequality above yields
$$D(r,i) > \frac{a\,2^{-1+\frac{1}{2r}}}{3c}\sum_{j=1}^{i} j^{-1+\frac{1}{2r}} D(r,j)^{1-\frac{1}{4r}}\, 2^{-\frac{j}{4r}} + \frac{a}{3c} .$$
Since $D(r,j) \ge 1$ ($j > 0$), we can bound $D(r,i)$ below by:
$$D(r,i) > \frac{a\,2^{-1+\frac{1}{2r}}}{3c}\sum_{j=1}^{i} j^{-1+\frac{1}{2r}}\, 2^{-\frac{j}{4r}} + \frac{a}{3c} \approx \frac{a\,2^{-1+\frac{1}{2r}}}{3c}\,R(r) + \frac{a}{3c} , \qquad (*)$$
where the series converges to the unbounded function $R(r)$.
For boundary value r=0, we obtain from (*) that $D(0,i) \in O(1)$ suffices, which is witnessed
by the unit wire width H-tree. For the other extreme value of r the inequality (*)
yields:
$$\lim_{r \to \infty} D(r,i) \in \Omega(i^{a/6c}) ,$$
giving us the earlier derived lower bound for logarithmic delay. Considering only volume 
without considerations of placement and routing, this lower bound reflects the influence of 
the radical r, the aspect ratio a and the cross-over coefficient c. 
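To get a feel for how the series behind $R(r)$ depends on the radical, the Python sketch below evaluates partial sums of $\sum_j j^{-1+1/2r}\,2^{-j/4r}$ and the resulting right-hand side of (*). The function names, the truncation length and the sample values of $a$ and $c$ are assumptions made for illustration.

```python
def partial_R(r, terms):
    """Partial sum of sum_{j>=1} j^(-1 + 1/(2r)) * 2^(-j/(4r)), truncated after `terms` terms."""
    exponent = -1.0 + 1.0 / (2.0 * r)
    return sum(j**exponent * 2.0 ** (-j / (4.0 * r)) for j in range(1, terms + 1))

def d_lower_bound(r, a, c, terms=2000):
    """Right-hand side of (*) with R(r) replaced by a truncated partial sum."""
    prefactor = a * 2.0 ** (-1.0 + 1.0 / (2.0 * r)) / (3.0 * c)
    return prefactor * partial_R(r, terms) + a / (3.0 * c)

if __name__ == "__main__":
    # Hypothetical values a = 1, c = 2; the partial sum grows as r grows.
    for r in (1.0, 10.0, 100.0):
        print(r, partial_R(r, 2000), d_lower_bound(r, 1.0, 2.0))
```

Each term decays geometrically for fixed $r$, so the series converges, but the decay rate $2^{-1/4r}$ approaches 1 as $r$ grows, which is why the limit is unbounded in $r$.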
5.3. Execution Time and Period 
Let again A (N) denote the minimal area for the layout of a complete binary tree T with 
N = 2m leaves under the appropriate assumptions on wire area and cross-over. The period is 
computed from the minimax wire length 
$$e(N) \in \Omega(\sqrt{A(N)}/\log N) .$$
The execution time is the greatest sum of the delays along a path from the root to a leaf. The
delay in each wire of each path is at least that of the longest wire in that path. Therefore,
the execution time is at least $\log N$ times the delay in an $e(N)$ length wire. According to
the Proposition, the minimax edge length $e(N) \in \Omega(\sqrt{A(N)}/\log N)$, and therefore the
execution time is $\Omega(\sqrt{A(N)})$. Consequently, together with the respective minimal layout
areas $A(N)$ for the different propagation delays we obtain:
            Logarithmic                                                  Radical (r)
area        $\Omega(N \log^{a/6c} N) \cap O(N^{1-\log_2(1-2a)})$         $\Omega(D(r,i)N)$
period      $\Theta(\log N)$                                             $\Omega[(D(r,i)N)^{1/2r} \log^{-1/r} N]$
Table 1. Minimal area with unequal length wires. Here $a$ is the wire aspect ratio, $c$ is the
cross-over coefficient or number of layers, and radical propagation delay is taken as the rth root.
5.4. Layouts with Equal Length Wires 
If we want to synchronize then it may be preferable to have layouts with only equal length 
wires. Under the constant wire width assumption, the least such wire length for a layout of 
a complete binary tree is $N/\log^2 N$ with simultaneous least area of $\Omega(N^2/\log^2 N)$ [11].
Table 2 summarizes the effect of the requirement of equal length wires on layouts of a complete
binary N-node tree. The derivation is given below.
                  Logarithmic                        Radical (r)
area              $\Omega(aN^3 / c\log^4 N)$         $\Omega[(aN/c)\,(aN / 4c\log^2 N)^{4r-1}]$
period            -                                  $\Omega[(aN / 4c\log^2 N)^2]$
execution time    -                                  $\Omega[(aN / 4c\log^2 N)^2 \log N]$
Table 2. Minimal area with equal length wires, with $r$ the radical, $a$ the wire aspect ratio and $c$ the
cross-over coefficient or number of layers.
For logarithmic propagation delay with wires of constant aspect ratio $a$, viz. the left column
of the table, the lower bound on the area is obtained by determining the combined area
taken by $N$ wires of length $L \in \Omega(N/\log^2 N)$. However, under the requirement of unique
wire length $e(N)$ for all wires, cross-over number $c$ (number of layers) and an aspect ratio
$a$, the following relations have to hold for any complete binary N-node tree layout.
$$\frac{a\,e(N)^2 N}{c} \le A(N) \le 4\,e(N)^2 \log^2 N .$$
Viz., on the one hand the area must accommodate all wires, on the other hand the diameter
of the layout cannot exceed the length of the longest path ($2\log N$ edges). Therefore,
$$\frac{a}{c} \le \frac{4\log^2 N}{N} ,$$
and, for fixed constant $a$ and $c$ independent of $N$, the desired layout is impossible for large
enough $N$. Moreover, substituting the upper bound on $a/c$ in the lower bound on the
area then yields the [11] value $A(N) \in \Omega(N^2/\log^2 N)$ again, indicating that such a circuit
becomes impossible for already quite small $N$. Therefore, the period and execution time are not defined (-).
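To illustrate how quickly the constraint $a/c \le 4\log^2 N / N$ rules out layouts, the Python sketch below searches for the largest tree depth $m$ (with $N = 2^m$) that still satisfies it. Logarithms are taken base 2 here, which is an assumption; the function name and the sample ratios are illustrative.

```python
def largest_feasible_depth(a, c, max_depth=64):
    """Largest m with a/c <= 4*m^2 / 2^m (N = 2^m, log taken base 2), or None if no m qualifies."""
    best = None
    for m in range(1, max_depth + 1):
        if a / c <= 4.0 * m * m / 2.0**m:
            best = m
    return best

if __name__ == "__main__":
    # Hypothetical aspect ratio / layer count combinations.
    for a, c in ((0.1, 1), (1, 2), (10, 1)):
        print(a, c, largest_feasible_depth(a, c))
```

With $a/c = 0.1$ the search stops at $m = 12$, that is, $N = 4096$ leaves; larger trees violate the inequality because $4m^2/2^m$ decreases for $m \ge 3$.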
Similarly, for radical (rth root) propagation delay we have:
$$\frac{a\,e(N)^{2-\frac{1}{2r}} N}{c} \le A(N) \le 4\,e(N)^2 \log^2 N$$
and therefore
$$e(N) \ge \left[\frac{aN}{4c\log^2 N}\right]^{2r} .$$
Substituting $e(N)$ of the last displayed equation in the left hand term of the preceding
displayed inequality yields the lower area bound on $A(N)$ in Table 2. Note that the boundary
case $r = 1/2$ also yields the standard [11] value for $e(N)$ for unit width wires. The period
is $\Omega(e(N)^{1/r})$ and the execution time is $\Omega(e(N)^{1/r}\log N)$. Therefore, the effect of the longer
wires required for $O(L^{1/r})$ signal propagation delay for L-length wires eradicates the gain
over quadratic delay, while the area rises exponentially with $r$ nonetheless. Consequently, a
hierarchy of drivers is not a viable solution to speed up a tree implementation with equal
length wires. For such circuits with equal length wires a better solution is periodic
repeaters in long wires giving linear delay.
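The cancellation described above can be made concrete. The Python sketch below evaluates the lower bounds on wire length, area, period and execution time for several values of $r$, dropping all hidden constants and taking logarithms base 2 (both assumptions); the function name and sample parameters are illustrative.

```python
import math

def equal_length_radical_bounds(n, a, c, r):
    """Lower bounds (constants dropped) for equal-length wires with r-th root delay.

    Returns (wire length e, area, period, execution time)."""
    base = a * n / (4.0 * c * math.log2(n) ** 2)
    e = base ** (2.0 * r)                        # e(N) >= [aN / (4c log^2 N)]^(2r)
    area = (a * n / c) * base ** (4.0 * r - 1)   # (aN/c) * e^(2 - 1/(2r))
    period = e ** (1.0 / r)                      # equals base^2, independent of r
    return e, area, period, period * math.log2(n)

if __name__ == "__main__":
    n = 2**20  # hypothetical tree size, with a = c = 1
    for r in (0.5, 1.0, 2.0):
        print(r, equal_length_radical_bounds(n, 1.0, 1.0, r))
```

The period bound is the same for every $r$, while the area bound grows by a factor of `base**4` for each unit increase in $r$.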
6. WIRE LENGTH DISTRIBUTIONS 
Let $f: \mathbb{N} \to \mathbb{N}$, connected with a VLSI layout, be a wire length distribution function which
yields the number $f(i)$ of wires of length $i$ in the design.
Every VLSI layout must have a constant bounded fan-in and fan-out of wires for the
components (transistors). If the chip area is $A$, then the average maximal wire length $L_{max}$
can be estimated by the statistical formula [14] $L_{max} = KA^g$ with $K$ and $g$ constants which,
by rule of thumb, can be set to 1/2 each. A reasonable assumption therefore is that the maximal
wire length on a chip does not exceed
$$L_{max} = \sqrt{A} . \qquad (16a)$$
Consequently, the amount of wires in the layout is given by
$$\#\text{wires} = \sum_{i=1}^{\sqrt{A}} f(i) . \qquad (16b)$$
We now identify a common class of wire-length distributions for VLSI layouts. Firstly, it 
is argued that the requirement of logarithmic propagation delay favors such distributions. 
Secondly, other studies have shown such distributions to be likely for VLSI layouts on both 
theoretical and empirical grounds. 
6.1. Logarithmic Delay 
Recall, that to obtain a logarithmic signal propagation delay we need a fixed constant 
aspect ratio for all wires in the layout. In designing a high speed layout we therefore 
needed to install drivers to drive the long wires and to design all wires with constant aspect 
ratio. The area taken by such a driver is linear in the length of the wire. This area is 
required at the lowest silicon layer of the chip; the long interconnect wires are executed in 
the upper metal layers. If we double the length and width of the chip then the length of the 
longest wires and the area of their drivers doubles too. The area of the lowest layer, how-
ever, is quadrupled and can therefore accommodate at least double the amount of drivers. 
This allows us to add a new layer to place still longer wires. These longer wires come on 
higher levels where the wires are wider. If the wires on a level $k+1$ are a factor $\beta$ longer, wider
and taller than the wires on level $k$ then the maximal amount of wires $N_k$ on level $k$ satisfies
$N_k = N_0\beta^{-2k}$. This suggests that the number of wires $f(i)$ of length $i$ should decrease at
least as fast as $N_0 i^{-2}$. However, in actual chip layouts the number of long wires may
decrease less fast than this inverse square of the wire length; empirical wire length distributions
$f(i) = \lfloor c\,i^{-\lambda} \rfloor$ ($1 \le i \le L_{max}$) and $f(i) \approx 0$ ($i > L_{max}$) with $1.5 < \lambda < 2$ have been
reported in [5]. To achieve logarithmic propagation delay we can estimate and bound the
layout area occupied by the fattened wires as follows. Let C be the amount of area of the 
layout occupied by non-wire components such as transistors. Assuming that C is also the order 
of magnitude of the number of basic components like transistors or logic gates in the circuit 
we can reason as follows. Since the wires only serve to connect components we have 
$C \in O(\#\text{wires})$ in a connected layout. The components are assumed to have at most a limited
number $t$ of connections to attach wires, which we suppose to account also for the fan-in and
fan-out of the interconnect wires. Therefore $C \in \Omega(\#\text{wires})$ and consequently $C \in \Theta(\#\text{wires})$.
Since we are primarily interested in order of magnitude in the sequel, we are justified to use 
C interchangeably for the amount of area occupied by the non-wire components, the 
number of non-wire components and the number of wires. The maximal area occupied by 
the wires (and interwire distances) under (13a) is bounded by the available area: 
$$\sum_{i=1}^{\sqrt{A}} f(i)\,a\,i^2 \le A - C , \qquad (16c)$$
where $a$ is the constant quotient of width and length (the aspect ratio) of the connect wires
as required by (13a). Using a simple theoretical argument and an experimental study of
actual layouts, [5] develops the following wire length distribution relationship:
$$f(i) = \lfloor c\,i^{-\lambda} \rfloor \ (1 \le i \le L_{max}) \quad \text{and} \quad f(i) \approx 0 \ (i > L_{max}) \qquad (17)$$
for a normalization constant $c$ yet to be chosen. Here $L_{max}$ is a constant related to the size of
the array (rectangular chip) and the adequacy of the placement; and $\lambda$ is a constant
characteristic of the logic. Equation (17) is derived using "Rent's Rule" which states that
the average number of terminals per complex of $C$ elements (in units, modules, cards, gates
etc.) is $tC^p$, where $t$ is the number of connections per individual element and $p$ is the Rent
constant characteristic of the logic complex. The analysis goes by dividing a square array of
cells into 4 equal square arrays recursively down until the individual areas are the individual
elements of the original logic. On each level of the recursion the number of connections
crossing boundary lines is determined using Rent's rule. This shows that $\lambda \approx 3-2p$.
In [5] experimental results are given for some actual layouts placed using a hierarchical
placement program: layouts for high-speed logic where $p$ was found to be 0.75 and a layout
for a hand calculator chip with $p = 0.59$. For $1 \le \lambda < 3$, equation (17) is of the form of the
Pareto-Levy distribution; similar laws occur in contexts like word frequencies, noise in
transmission channels etc. For additional discussion see [5]. Let furthermore the network
be connected, so the maximal amount of area units $C$ available to place the components is
not greater than the number of wires plus 1. From (16c) and (17) we can estimate the
maximal figure for the normalization constant $c$. For $\lambda \ne 3$:
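$$c \le \frac{(A-C)(3-\lambda)}{a\,(A^{(3-\lambda)/2}-1)} , \qquad (18a)$$
and for $\lambda = 3$,
$$c \le \frac{2(A-C)}{a\log A} . \qquad (18b)$$
Consequently, for $\lambda \ne 1$ & $\lambda \ne 3$ by (16b):
$$C \le \sum_{i=1}^{\sqrt{A}} f(i) \le \frac{(A-C)(3-\lambda)(A^{(1-\lambda)/2}-1)}{a\,(1-\lambda)(A^{(3-\lambda)/2}-1)} , \qquad (19a)$$
and for $\lambda = 3$,
$$C \le \frac{(A-C)(A-1)}{aA\log A} . \qquad (19b)$$
For $\lambda = 1$,
$$C \le \sum_{i=1}^{\sqrt{A}} f(i) \le \frac{(A-C)\log A}{a\,(A-1)} . \qquad (19c)$$
(Note: for $\lambda < 1$ we obtain $c < 1$, resulting in $f(i) \approx 0$ also for small $i$, and $C$ a small constant.)
From the above analysis follows: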
Lemma 1. Let $f(i)$ be the wire length distribution function of a chip layout with area $A$. Let the
signal propagation delay be logarithmic and let therefore all wires in the layout have the same aspect
ratio. If $f(i) = \lfloor c/i \rfloor$ for some constant $c$ then the total number of wires, and similarly the total
number of gates and transistors, is $O(\log A)$.
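The logarithmic growth in Lemma 1 can be observed numerically. The Python sketch below computes the largest wire count permitted by the area budget (16c) for a distribution $c\,i^{-\lambda}$, under two simplifying assumptions that are not in the text: the floor in $f(i)$ is ignored, and the component area $C$ is dropped from the budget (so the result slightly over-estimates). The function name is illustrative.

```python
import math

def max_wires_constant_aspect(area, a, lam):
    """Wire count c_max * sum(i^-lam), where c_max saturates sum(c * i^-lam * a * i^2) <= area."""
    n = math.isqrt(area)  # longest wire L_max = sqrt(A), as in (16a)
    cost_per_unit_c = a * sum(i ** (2.0 - lam) for i in range(1, n + 1))
    c_max = area / cost_per_unit_c  # budget A instead of A - C (assumption)
    return c_max * sum(i ** (-lam) for i in range(1, n + 1))

if __name__ == "__main__":
    # lam = 1 is the case of Lemma 1; a = 1 is a hypothetical aspect ratio.
    for area in (10**4, 10**6, 10**8):
        print(area, max_wires_constant_aspect(area, 1.0, 1.0), math.log(area))
```

For $\lambda = 1$ the count stays within a small constant factor of $\log A / a$, in line with (19c), even though the area grows by four orders of magnitude.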
Since the number of components bounds the number of bits manipulated in each computation
step, Lemma 1 tells us that this number is very much layout dependent, and depends
in particular on the distribution of wire lengths. Consequently, even if the area $A$ is polynomial
in the binary size $N$ of a problem, under a layout and signal propagation delay as in
the lemma, the execution time, and also the period for systolic computation, will be $\Omega(N)$
since it takes at least $N/\log N$ stages just to scan all $N$ bits and for each such stage the
delay in the longest wires is $\Omega(\log N)$. In some designs, like trees for instance, the number
of long wires decreases faster with the wire length. For such $f$ the series in (16c) also converges
faster, and the maximal number of wires, and similarly the maximal number of components,
may rise to order $A$ (the area expressed in area units), under the logarithmic propagation
delay requirement notwithstanding the attendant constant aspect ratio for wires.
The constant wire width assumption. For comparison we give the analysis analogous to the
above under the constant wire width assumption. Then equations (16a) - (16b) stay the
same but equation (16c) becomes
$$\sum_{i=1}^{\sqrt{A}} f(i)\,i \le A - C . \qquad (20)$$
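Thus, for $f(i) = \lfloor c\,i^{-\lambda} \rfloor$ ($1 \le i \le \sqrt{A}$) and $f(i) \approx 0$ ($i > \sqrt{A}$) and with $A$, $C$ and $c$ as above
we obtain the following relations. For $\lambda = 1$:
$$c \le \frac{A-C}{\sqrt{A}-1} , \qquad C \le \frac{(A-C)\log A}{2(\sqrt{A}-1)} . \qquad (21)$$
For $\lambda \ne 1$ & $\lambda \ne 2$:
$$c \le \frac{(2-\lambda)(A-C)}{A^{(2-\lambda)/2}-1} , \qquad C \le \frac{(2-\lambda)(A-C)(A^{(1-\lambda)/2}-1)}{(1-\lambda)(A^{(2-\lambda)/2}-1)} . \qquad (22)$$
For $\lambda = 2$:
$$c \le \frac{2(A-C)}{\log A} , \qquad C \le \frac{2(A-C)(\sqrt{A}-1)}{\sqrt{A}\log A} . \qquad (23)$$
(Note: for $\lambda < 0$ we obtain $c < 1$.) For $\lambda > 0$ the maximal $C$ is in $\Omega(\sqrt{A})$. Thus: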
Lemma 2. Let $f(i)$ be the wire length distribution function of a chip layout with area $A$ under the
constant wire width assumption. If $f(i) = \lfloor c/i^{\lambda} \rfloor$ ($1 \le i \le \sqrt{A}$) and $f(i) \approx 0$ ($i > \sqrt{A}$) for a
constant $c$ then the maximal feasible number of wires in the layout, and similarly the maximal number of
gates and transistors in the layout, is $\Omega(\sqrt{A})$ for all $\lambda \ge 0$.
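For comparison with the constant aspect ratio case, the Python sketch below evaluates the maximal wire count under the constant wire width budget (20). As before, the floor in $f(i)$ is ignored and the component area $C$ is dropped from the budget; both are simplifying assumptions, and the function name is illustrative.

```python
import math

def max_wires_constant_width(area, lam):
    """Wire count c_max * sum(i^-lam), where c_max saturates sum(c * i^-lam * i) <= area."""
    n = math.isqrt(area)  # longest wire L_max = sqrt(A), as in (16a)
    c_max = area / sum(i ** (1.0 - lam) for i in range(1, n + 1))
    return c_max * sum(i ** (-lam) for i in range(1, n + 1))

if __name__ == "__main__":
    area = 10**6
    for lam in (0.0, 1.0, 2.0):
        print(lam, max_wires_constant_width(area, lam), math.sqrt(area))
```

For $\lambda = 0, 1, 2$ the count grows roughly like $2\sqrt{A}$, $\sqrt{A}\log A/2$ and $2A/\log A$ respectively, matching the order of magnitude of (21) to (23) and staying at or above $\sqrt{A}$ in every case.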
Recall that the quotient of the length and width of a wire is its aspect ratio. By the previ-
ous analysis, considering just the wire length distribution while leaving free the actual cir-
cuit topology, placement and routing in the layouts, attaining a logarithmic signal propa-
gation delay by changing constant wire width to constant aspect ratio for all wires in a 
layout can carry a surprisingly severe penalty. 
Theorem. Let the original layout area be $A$ and the original amount of wires in the layout be $C$. For
the wire length distribution $f(i) = \lfloor c\,i^{-1} \rfloor$ for $1 \le i \le \sqrt{A}$ and $f(i) \approx 0$ for $i > \sqrt{A}$, the change
from constant wire width to wires with a constant aspect ratio has the following effect.
(i) Retaining the original amount of wires $C$ and the original wire length distribution relative to the
layout area, that is, $f(i) = \lfloor c'i^{-1} \rfloor$ for $1 \le i \le \sqrt{A'}$ and $f(i) \approx 0$ for $i > \sqrt{A'}$ with the normalization
constant $c'$ set to its maximal value, exponentially increases the required area $A'$ over the original
area $A$.
(ii) Retaining the original area $A$ and the original wire length distribution, that is, $f(i) = \lfloor c'i^{-1} \rfloor$ for
$1 \le i \le \sqrt{A}$ and $f(i) \approx 0$ for $i > \sqrt{A}$ with the normalization constant $c'$ set to its maximum value,
reduces the maximal amount of wires in the layout, and therefore the proportionate amount of useful
components like gates and transistors, to $C' \in O(\log C)$.
(iii) Retaining the original amount of wires $C$ (or logic components) and the original area $A$ requires a
thorough change of wire length distribution to $f(i) = \lfloor c'i^{-\lambda} \rfloor$ for $1 \le i \le \sqrt{A}$ and $f(i) \approx 0$ for
$i > \sqrt{A}$ and the normalization constant $c'$ set to its maximum value. Each $\lambda \ge 2+\varepsilon$ for some small
$\varepsilon > 0$ depending only on $A$ and $a$ suffices. For a given network topology this entails a placement and
routing of the layout which may well be impossible in many cases.
Proof. Since we assume the circuit to be connected we have $A > A-C > A/2$ in the
various equations. We also assume $A \gg 1$.
(i) Equate expression (21) for $C$ with expression (19c) for $C$, with $A'$ substituted for $A$ in the
latter. This yields $\log A' \in \Omega(\sqrt{A})$.
(ii) Substitute $C'$ for $C$ in equation (21) and express $C'$ in terms of $C$ by eliminating $A$ from
the resulting equation and (19c).
(iii) Equate expression (21) for $C$ with expression (19a) for $C$ (expressions (19b) and (19c)
contradict (21)). The terms $(A-C)$ on both sides cancel each other. Solving for $\lambda$ yields
$\lambda = 2+\varepsilon(A,a) > 2$ with $\varepsilon(A,a) \to 0$ for $A \to \infty$ and $a$ constant. Every distribution with
exponent equal or larger than this $\lambda$ suffices. □
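Parts (i) and (ii) can be made slightly more explicit. The following LaTeX sketch works to leading order only, under the stated assumptions $A \gg 1$ and $A/2 < A-C < A$, so that $A-C$ is replaced by $A$ up to a constant factor; in part (ii) it takes the new wire count $C'$ from (19c) and the original count $C$ from (21). It is an illustration of the orders of magnitude, not a replacement for the proof.

```latex
% Leading-order forms of (21) and (19c), with A - C replaced by A:
%   constant wire width:    C \approx \tfrac{1}{2}\sqrt{A}\,\log A      (from (21))
%   constant aspect ratio:  C \approx \tfrac{1}{a}\log A                (from (19c))
\begin{align*}
\text{(i)}\quad  & \tfrac{1}{a}\log A' \approx \tfrac{1}{2}\sqrt{A}\,\log A
   \;\Longrightarrow\; \log A' \approx \tfrac{a}{2}\sqrt{A}\,\log A \in \Omega(\sqrt{A}), \\
\text{(ii)}\quad & C \approx \tfrac{1}{2}\sqrt{A}\,\log A
   \;\Longrightarrow\; \log C \approx \tfrac{1}{2}\log A + \log\log A - \log 2, \\
                 & C' \approx \tfrac{1}{a}\log A \le \tfrac{2}{a}\log C + O(1) \in O(\log C).
\end{align*}
```

In (i) the new area $A'$ is therefore exponential in $\sqrt{A}$, and in (ii) the wire count drops from roughly $\sqrt{A}\log A$ to roughly $\log A$.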
We observe that in case (i) of the Theorem the wires get so long that the logarithmic pro-
pagation delay turns out to yield about the same absolute time delay as in the original 
wires. In case (ii) of the Theorem matters are probably as bad because the bit capacity of 
the chip has been logarithmically reduced. Finally, in case (iii) of the Theorem the subject 
circuit topology may not have a layout with the required wire length distribution. 
For values of the exponent $\lambda > 1$ in the wire length distribution the analog of the Theorem
holds with polynomial relationships in cases (i) and (ii). For $\lambda > 3$, the number of long wires
decreases so fast with the wire length that it suffices that $A' \in \Theta(A)$ in (i), $C' \in \Theta(C)$ in (ii)
and nearly the same wire length distribution function suffices in (iii). However, $\lambda > 3$
implies a negative Rent constant $p$ since $\lambda \approx 3-2p$ in [5], and therefore layouts which do
not satisfy Rent's Rule. The reader is invited to analyse the relations for different values of
$\lambda$. We look at one more case, the wire length distribution with $\lambda = 2$, which is interesting
because it is associated with hierarchical designs. For $\lambda = 2$, the different parts of the
Theorem yield the following:
(i) $A' \in \Omega(A^2/\log^2 A)$.
(ii) $C' \in O(\sqrt{C\log C})$.
(iii) This requires a change to a new distribution function $f(i) = \lfloor c'i^{-\lambda} \rfloor$ ($1 \le i \le \sqrt{A}$) and
$f(i) \approx 0$ ($i > \sqrt{A}$) with the new normalization constant $c'$ set to its maximum value.
Each $\lambda \ge 3$ suffices.
Network topologies, which can be realized with constant width wires, may not be realiz-
able at all on a multilevel Manhattan grid geometry with all wires having the same aspect 
ratio. Even for network topologies which do have a layout with a constant aspect ratio for 
the wires, the Theorem shows that the increase in area can be so much that the 
amplification in length of the wires will nullify (or worse) the increase in speed due to a 
change from linear or square propagation delay to a logarithmic one. It therefore appears
that only circuits with suitable topologies (like the tree circuit in the previous section), for
which there are layouts with the considered wire length distributions with relatively large $\lambda$,
are proper candidates for an improvement of speed by a logarithmic signal propagation
delay.
Exercise. Do a similar analysis for radical delay. 
REFERENCES 
[1] Bilardi, G., M. Pracchi, and F.P. Preparata, "A critique and an appraisal of VLSI models of
computation," pp. 81-88 in VLSI Systems and Computations, ed. H.T. Kung, B. Sproull & G.
Steele, Springer-Verlag, Berlin (1981).
[2] Brent, R.P. and H.T. Kung, "The chip complexity of binary arithmetic," pp. 190-200 in
Proceedings 12th ACM Symposium on Theory of Computing (1980).
[3] Brent, R.P. and H.T. Kung, "The area-time complexity of binary multiplications," J. Ass.
Comp. Mach., vol. 28, pp. 521-534, 1981.
[4] Chazelle, B. and L. Monier, "A model of computation for VLSI with related complexity
results," pp. 318-325 in Proceedings 13th ACM Symposium on Theory of Computing (1981).
[5] Donath, W.E., "Wire length distribution for placement of computer logic," IBM J. Res. Develop.,
vol. 25, pp. 152-155, 1981.
[6] Leighton, F.T., Complexity Issues in VLSI. The MIT Press, 1983.
[7] Leiserson, C.E., Area Efficient VLSI Computation. The MIT Press, 1982.
[8] Mead, C. and M. Rem, "Cost and Performance of VLSI Computing structures," IEEE J. on
Solid State Circuits, vol. SC-14, pp. 455-462, 1979.
[9] Mead, C. and L. Conway, Introduction to VLSI Systems. Reading, Mass.: Addison-Wesley, 1980.
[10] Mead, C. and M. Rem, "Minimum propagation delays in VLSI," IEEE J. on Solid State
Circuits, vol. SC-17, pp. 773-775, 1982. Correction: Ibid, SC-19 (1984) 162.
[11] Paterson, M.S., W.L. Ruzzo, and L. Snyder, "Bounds on the minimax edge length for complete
binary trees," pp. 293-299 in Proceedings 11th ACM Symposium on Theory of Computing
(1981).
[12] Ramachandran, V., "On driving many long lines in a VLSI layout," pp. 369-378 in
Proceedings 23rd IEEE Symposium on Foundations of Computer Science (1982).
[13] Ramachandran, V., "Upper bounds for the area increase caused by local expansions in a
VLSI layout," pp. 163-179 in Advances in Computer Research (1984).
[14] Saraswat, K.C. and F. Mohammadi, "Effect of scaling of interconnections on the time delay of
VLSI circuits," IEEE J. of Solid State Circuits, vol. SC-17, pp. 275-280, 1982.
[15] Savage, J., "Planar circuit complexity and performance of VLSI algorithms," pp. 61-68 in
Proceedings CMU Conference on VLSI Systems and Computations, ed. H.T. Kung et al.,
Computer Science Press (1981).
[16] Seitz, Ch.L., "Ensemble architectures for VLSI - A survey and taxonomy," pp. 130-132 in
Proc. of MIT Conference on Advanced Research in VLSI, ed. P. Penfield, Jr., Artech House
(1982).
[17] Thompson, C.D., "A Complexity Theory for VLSI," Ph.D. Thesis, Dept. of Computer Science,
Carnegie-Mellon University, 1980.
[18] Thompson, C.D. and P. Raghavan, "On estimating the performance of VLSI circuits," Tech.
Rep. UCB/CSD 84/138, Computer Science Division (EECS), University of California, Berkeley,
September 1983.
[19] Thompson, C.D., "Fourier transforms in VLSI," IEEE Transactions on Computing, 1984.
[20] Ullman, J.D., Computational Aspects of VLSI. Rockville, Maryland: Computer Science Press,
1984.
[21] Vuillemin, J., "A combinatorial limit to the computing power of VLSI circuits," pp. 294-300
in Proc. 21st IEEE Symposium on the Foundations of Computer Science (1980).
[22] Yuan, H.-T., Y.-T. Lin, and S.-Y. Chiang, "Properties of silicon, sapphire, and semi-insulating
gallium arsenide substrates," IEEE J. of Solid State Circuits, vol. SC-17, pp. 269-274, 1982.
