1. Introduction
1.1. Background
The aim of this paper is to correct a widespread misunderstanding. In the literature on the theory of VLSI algorithms and complexity it is generally, but erroneously, held that signal propagation delay logarithmic in the wire length can be achieved on chip at the cost of an area overhead per wire which is linear (say 10%) in the area taken originally by the wire. Since the area penalty in fact needs to be far greater (viz. square in the wire length), many known lower bounds on area × time or area × time² in the sublinear time range are far too low, while some known upper bounds are actually false. Recall also that the probability of a chip being flawed, due to the random defects introduced by the fabrication process, rises exponentially with its area. Therefore, even small area increases necessary to obtain faster signalling speed on chip may already be prohibitive. This paper addresses the basic model for VLSI computation. It has consequences for virtually all work in the field which considers sublinear propagation delay versus layout area complexity.
• This work was supported by the Stichting Mathematisch Centrum.
In current chips, synchronization requirements slow down the computation to a clocked switching time in the order of the delay in the longest wire. Thus, the overall efficiency of many very large scale integrated (VLSI) electronic switching circuits depends strongly on the signal propagation delay in long wires. As the minimal feature width continues to decrease into the submicron range this delay governs overall performance more and more. This seems truly the case at the level of the emerging wafer scale integration technology, which manufactures chips of over 4 inches across. Within a particular technology, the only way currently known to obtain sublinear signal propagation delay (in the literature usually logarithmic) in long wires is by fitting a hierarchy of driver transistors to long wires, as suggested by [9]. The area cost of the ramp of driver transistors is then not more than part of the area taken by the driven wire, while the width of the wire is generally, but erroneously, assumed to remain the same unit width, depending on the minimal feature width of the underlying technology. However, in [10] it appears that life is not so simple: the logarithmic delay assumption is incompatible with the constant wire width assumption.
To achieve a propagation delay logarithmic in the length of the wire, as in [9], electronic considerations show [10] that all wires need to have the same ratio between width and length, that is, the same aspect ratio. (Thus, in multilayered chips now being manufactured, the communication wires are grouped in metal layers according to length. In a layer with a group of longer wires those wires are proportionally wider and thicker.) We show how the driver-hierarchy method to obtain logarithmic signal propagation delay in [9] can also be used to obtain radical (rth root) delay with similar (but less pronounced) effects on the width of long wires.
0272-5428/85/0000/0197$01.00 © 1985 IEEE
Outline of the Results
First we treat the electronic background in somewhat more detail than [10]. As a consequence, it turns out that the efficiency of VLSI circuits with sublinear propagation delay is more layout sensitive than hitherto assumed. To demonstrate this, we analyse the effect of the sublinear delay requirement on a basic circuit: the complete binary tree topology. (For more intricate circuits requiring many more long wires, like n-cubes, butterfly networks and cube-connected-cycles networks, the area penalty for sublinear delay is far greater. For matrix networks the area penalty is obviously nil.) We show for logarithmic delay, with a constant aspect ratio $a$ for the wires and using $c$ layers:
• Every layout of a complete binary tree with $N$ nodes takes $\Omega(N \log^{a/6c} N)$ area*, so also the H-tree layout of [9, 8].
• For synchronization it may be required that all wires have the same length. A layout for a complete binary tree with all wires of equal length turns out to be impossible for large enough $N$.
For radical delay, that is, delay according to the rth root of the wire length (using a constant number of layers) we find:
• The layout area of a complete binary tree is bounded below by the product of the number of nodes N and an unbounded function of r.
• For layouts of complete binary trees with equal length wires, the wire length needs to increase exponentially with r. Therefore, the gain in speed is lost completely because of the longer wires, while the required layout area rises exponentially with r nonetheless.
• We also briefly investigate the effect on natural wire length distributions. Using plausible arguments, and empirical data from actual chips, [5] determines what wire length distributions tend to occur. For these distributions it appears that the gain of having sublinear propagation delay is cancelled by the requirements on wire area. (Because the wires need to be much longer, the faster signals take as long to traverse a particular wire as they did before.) Worse, logarithmic delay may be impossible outright.
Related work.
Almost all work in the theory of computing on sublinear propagation delay VLSI models is related in some way to the present issue. E.g., apart from the fact that we require wide wires to obtain a sublinear propagation delay, we also need to insert the drivers to drive the long wires (cf. next section). In [9, 8, 10], it is shown that such drivers need an area proportional to (say 10% of) the length of the wire. In [12] questions are treated concerning the insertion of these drivers in given layouts with unit width wires before and after insertion. Some effects of increasing wire width in layouts, or related issues, have been treated in various contexts in, e.g., [13, 6].
* We use the order-of-magnitude symbols as follows: $\Theta(f(n)) = O(f(n)) \cap \Omega(f(n))$.
In the computational models rampant in the literature, the assumptions concerning signal propagation delay in long wires range from constant delay (irrespective of the wire length) [17, 2, 3, 21, 15], via logarithmic [11, 19, 18] and linear delay [1, 4], to signal propagation delay that is square in the length of wires [1, 4, 16]. In all of these models the width (or thickness) of wires is assumed to be a unit, depending on the minimal feature width of the underlying technology, as seems suggested in [9, 8]. However, for sublinear signal propagation delay this is a misunderstanding [10], the consequences of which are explored in the present paper.
Electronic Basics
The time it takes a minimum transistor to drive a wire of length $L$, width $W$ and thickness $H$ can be estimated as follows. The wire is assumed to have distance $D_l$ to neighbouring layers and $D_w$ to other wires in the same layer. If $W_0$ is the minimal width of a wire in the current technology, then the minimal transistor, consisting of a wire crossing, occupies area $W_0^2$. The total time $T$ to drive a wire is approximated by:
$$T = R_t C_w + R_w C_w \tag{1}$$
where $R_t$ is the resistance of the minimum transistor, $R_w$ the resistance of the wire and $C_w$ its capacitance. Therefore, the total time $T$ can be thought of as the sum of the time $T_d$ needed to drive a zero resistance wire of capacitance $C_w$, and the time $R_w C_w$ needed to transport the appropriate charge from a zero resistance source. Roughly, $T_d$ is the time needed to transport the necessary charge through the bottleneck consisting of the switch (the minimal transistor), and $R_w C_w$ is the time needed to distribute the charge appropriately over the wire $w$. Since the resistance of a wire is proportional to its length and inversely proportional to its cross section we have:
$$R_w = \rho_w \frac{L}{WH}$$
where $\rho_w$ is the resistivity of the considered wire material. The capacitance of a wire is inversely proportional to the distance of its neighbouring wires and layers, and proportional to the area of the side facing that neighbouring layer or wire:
$$C_w = \epsilon_w \left( \frac{WL}{D_l} + \frac{HL}{D_w} \right)$$
where $D_l$ and $D_w$ are the distances to neighbouring layers and to neighbouring wires, and $\epsilon_w$ is a proportionality constant consisting of the product of the permittivity of free space and the dielectric constant of the insulating material (usually SiO$_2$). It is therefore clear that the signal propagation time depends heavily on the various dimensions and materials involved in the chip. The relation (8a) can be considered a borderline case of (8b). In [10] it was observed that by keeping the derivatives, with respect to $L$, of the two terms in the right-hand side of equations like (9a) and (9b) balanced:
$T$ grows as the $r$th root of $L$. Viz., from (9a) we obtain by assumption of equality (10a):
and from (9b) we obtain by assumption of equality (10b):
In the next section we establish the area penalty involved with a wire of given length L. Remark. Different ratios between the successive capacitances of the transistors in the ramp of drivers can be used to obtain, for instance, noninteger values for r in the formulas above. This issue is not addressed in the present paper.
Without a ramp of driver transistors, having the minimum transistor drive the total wire outright, we obtain from (1), (4), (5) and (7):
This suggests a signal propagation time quadratic in $L$. However, the resistance $R_t$ of the minimum transistor dominates in (1) for the magnitudes of $L$ under consideration (smaller than, say, 1 foot). We can decrease that term by fitting a larger driver transistor to the wire. This transistor, in its turn, must be driven by the minimal transistor. Iterating this scheme, cf. [9], we obtain a sequence of transistors, of which each next one is a factor $\alpha$ larger than the preceding one. The final transistor in the sequence should be large enough to drive the wire in a sufficiently short time. (We can think of this scheme as a sequence of switches where each switch serves to switch the next larger switch, and the largest switch in the sequence controls the large channel through which the charge is transported to the wire. Although the time to actually pass the appropriate charge from source to wire can be made smaller by fitting a larger final driver transistor to the sequence, there seems to be no way to get rid of the time needed to switch all transistors in between the smallest transistor and the largest one.) The time to drive a driver with capacitance $C_2$ by a driver with smaller capacitance $C_1$ is given by [9]:
where $D_0$ is the thickness of the gate insulator and $\epsilon_t$ is the product of the permittivity of free space and the dielectric constant of the gate insulator. Thus we can drive a zero resistance wire of capacitance $C_w$ through a sequence of $r$ drivers for fixed $\alpha$ in time:
We can also use a ramp of $r$ drivers, for some fixed $r$, and choose $\alpha$ accordingly:
The above analysis shows that we can reduce the signal propagation time by employing new materials with more favorable characteristics, like Gallium-Arsenide and Silicon-on-Sapphire technologies [22]. We can also change the size of the wires and the interwire and interlevel separation. To increase signalling speed, extra layers with wider and thicker wires are used for the long interconnect wires.
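The two ways of using a driver ramp described above can be compared numerically. The following is a minimal sketch, assuming (in the spirit of [9]) that a stage driving a load `ratio` times its own capacitance costs `ratio * tau` time units. The function names, the unit time `tau` and the load ratio (wire capacitance divided by the capacitance of the minimal transistor) are illustrative assumptions, not quantities taken from the paper's equations.

```python
def ramp_stage_count(load_ratio, alpha):
    """Number of stages needed when each stage is `alpha` times the previous one."""
    stages, size = 0, 1.0
    while size < load_ratio:
        size *= alpha
        stages += 1
    return stages


def delay_fixed_ratio(load_ratio, alpha, tau=1.0):
    """Ramp delay for a fixed growth factor alpha (assumed cost alpha * tau per stage).

    The number of stages, and hence the delay, grows like log(load_ratio).
    """
    return ramp_stage_count(load_ratio, alpha) * alpha * tau


def delay_fixed_stages(load_ratio, r, tau=1.0):
    """Ramp delay with exactly r stages; alpha is chosen to fit the load.

    The delay grows like the r-th root of load_ratio.
    """
    alpha = load_ratio ** (1.0 / r)
    return r * alpha * tau


if __name__ == "__main__":
    for load_ratio in (2 ** 10, 2 ** 20):
        print(load_ratio,
              delay_fixed_ratio(load_ratio, 2.0),
              delay_fixed_stages(load_ratio, 2))
```

Under these assumptions, squaring the load doubles the delay of the fixed-ratio ramp, while the delay of the two-stage ramp grows with the square root of the load.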
From (1), (3), (4) and (8a) we obtain an expression for T.
From (1), (3), (4) and (8b) we obtain:
3. Sublinear Delay and Wire Dimensions
Logarithmic Delay and Constant Aspect Ratio
Under assumption (10a) we can obtain a logarithmic signal propagation delay by, all other things being equal, maintaining the relation (13a) between the wire dimensions, rather than by just keeping $L^2$ proportional to $WH$ as in [10]. Keeping the interwire distance proportional to the wire width, and the interlayer distance proportional to the wire height, we observe that if $W$, $H$ and $L$ are kept in proportion a logarithmic propagation delay is attained. (Note that we cannot reach this effect by keeping the wire width the same but using very 'tall' wires or vice versa.) The aspect ratio of a wire is the quotient of its width and length. To obtain a logarithmic signal propagation delay we thus need the fixed constant aspect ratio following from (10) and (13a) for all wires in the layout. In designing such a high speed layout we therefore need to install drivers to drive the long wires and to design all wires with a constant aspect ratio $a > 0$. Therefore, a wire of length $L$ in such a layout has area $aL^2$. The area taken by the driver is linear in the length of the wire [10]:
the minimal transistor occupies area $W_0^2$, the next driver area $\alpha W_0^2$, and so on for $\log_\alpha L$ terms for an $L$-length wire. The total driver area for an $L$-length wire becomes
$$W_0^2 \sum_{i=0}^{\log_\alpha L - 1} \alpha^i = W_0^2 \, \frac{L-1}{\alpha-1} \in O(L).$$
This area is required at the lowest silicon layer of the chip; the long interconnect wires are executed in the upper metal layers.
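The contrast between the linear driver area and the quadratic wire area can be checked with a short sketch. It assumes lengths measured in units of the minimal width, a ramp whose stage areas are 1, alpha, alpha^2, ... times the minimal transistor area, and an illustrative aspect ratio; the function names and default values are hypothetical.

```python
def driver_ramp_area(wire_length, alpha=2.0, w0=1.0):
    """Sum the stage areas w0^2, alpha*w0^2, ... over about log_alpha(wire_length) stages."""
    area, factor = 0.0, 1.0
    while factor < wire_length:
        area += factor * w0 ** 2
        factor *= alpha
    return area


def constant_aspect_wire_area(wire_length, aspect_ratio):
    """Area of a wire whose width is aspect_ratio times its length."""
    return aspect_ratio * wire_length ** 2


if __name__ == "__main__":
    for length in (1024, 2048, 4096):
        print(length,
              driver_ramp_area(length),
              constant_aspect_wire_area(length, 0.1))
```

Doubling the wire length roughly doubles the driver area but quadruples the area of the constant aspect ratio wire, which is the penalty the text is concerned with.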
Radical Delay
Under assumption (10b) we can obtain a signal propagation delay of the order of the $r$th root of the length of the wire under a certain balancing of the aspect ratio of the dimensions of the wire:
Call this type radical delay. (Together with the previous logarithmic delay this essentially exhausts all possibilities for sublinear propagation delay obtainable by the [10] method.) We assume that all dimensions (but for $L$) are scaled proportional to the same radical fraction $L^{\alpha}$ of $L$ ($0 \le \alpha < 1$). So each parameter $X \neq L$ in (13b) satisfies $X = a_X L^{\alpha}$, for some constant $a_X$ depending only on $X$. Therefore, for fixed given $r$ equation (13b) determines $\alpha$ by:
$$\alpha = 1 - \frac{1}{2r} \tag{14}$$
(The logarithmic delay of (13a) is, in a certain sense, a limiting case of this radical delay.) Hence, to obtain a propagation delay proportional to the $r$th root of the length $L$ of the wire the dimensions need to be scaled proportional to $L^{1-1/(2r)}$ and a wire of length $L$ takes area
$$aL^{2-1/(2r)} \tag{15}$$
For $r = 1/2$ this yields the limiting worst-case quadratic delay with all dimensions (but $L$) scaled proportional to constants like the minimum feature width. For $1/2 \le r \le 1$ however there is another way, by inserting repeaters (inverters) at constant intervals in the long wires. This gives linear propagation delay ($r = 1$) at an area cost in repeaters that is only linear in the length of the wire. Therefore, we are only interested in the case $r > 1$, that is, sublinear propagation delay.
Note that the effect of the scaling to obtain sublinear propagation delay is not confined to the area but extends to the height of the chip, since all dimensions need to be scaled proportionally. So, the volume of a wire of length $L$ needs to be $O(L^3)$ for logarithmic propagation delay, by (13a), and $O(L^{3-1/r})$ for an $r$th root of $L$ propagation delay, similar to (15).
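The exponents quoted above follow mechanically from the scaling rule that every dimension other than the length grows as the length raised to the power 1 - 1/(2r). The sketch below tabulates them with exact rational arithmetic; the function name and the dictionary keys are illustrative choices.

```python
from fractions import Fraction


def scaling_exponents(r):
    """Exponents of L for wire width/height, wire area and wire volume
    when the delay is to grow as the r-th root of the wire length L."""
    r = Fraction(r)
    if r < Fraction(1, 2):
        raise ValueError("r must be at least 1/2")
    dimension = 1 - 1 / (2 * r)          # width and height scale as L**dimension
    return {
        "dimension": dimension,
        "area": 1 + dimension,           # length times width: 2 - 1/(2r)
        "volume": 1 + 2 * dimension,     # length times width times height: 3 - 1/r
    }


if __name__ == "__main__":
    for r in (Fraction(1, 2), 1, 2, 4, 100):
        print(r, scaling_exponents(r))
```

As r grows the area exponent approaches 2 and the volume exponent approaches 3, which are the values stated for logarithmic delay.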
Area, Length and Time
The area for a VLSI layout is expressed in A area units.
The area unit is the square of the basic length unit, which is the feature width of the underlying technology. This is currently $4 \cdot 10^{-6}$ to $10 \cdot 10^{-6}$ meter and is expected to continue to decrease into the submicron range in the near future.
• The area is taken to be the area of the smallest convex region enclosing the layout.
• There is a cross-over constant c >0 such that no unit circle encloses points of more than c different edges (wires) or nodes (components or transistors).
(In case we allow an unlimited amount of cross-over, we should consider the worst-case 'area X cross-over' product instead of the area. Effectively, we then consider 3-dimensional 'layered' chips which is outside the scope of this paper.)
Time
The execution time of a problem instance is the time elapsed between the entering of the first bit of the problem instance in the circuit and the leaving of the last bit of the answer from the circuit. In pipelined and especially in systolic computations [9] the period is important. The period is the time elapsed between entering the first bit of a problem instance and the first bit of a next problem instance. In 'moving belt' type computation the period can be substantially less than the execution time. Below, the period of a (systolic) computation appears to be more sensitive to the propagation delay assumption than the overall execution time. If the signal propagation delay depends on the length of the wire the signal has to traverse, then the minimax edge length in the layout will determine the period in a systolic network. The minimax edge length (or wire length) $e(\cdot)$ of a layout for a given circuit is the minimum, over all layouts implementing the circuit, of the length of the longest wire in such a layout. See also [11].
Implementation Details of Area and Time
We may assume that the circuits are laid out on a Manhattan grid. In [7] algorithms are presented to embed easily separated graphs efficiently in grids. The considerations below assume that the processing elements have unit area and the links between them have unit bandwidth. This view captures the underlying communication structure. This leaves free the precise implementation. For example, to communicate a word of k bits between two processing elements, one can either use a link of bandwidth k and one cycle or use a link of bandwidth 1 and k cycles. Let each of the P processing elements actually fit in area U and let the total area used by the links normalized to bandwidth 1 be L. Let W be the bandwidth of the precise implementation, e.g., word-parallel or word-serial communication. Using methods of [7] , a rough upper bound on the total actual chip area is given by
, where $A_p$ is the area taken by the processing elements and $A_w$ is the area for the wires. We follow the normalization generally adhered to, so by a layout area $A$ we mean $P + L$; that is, we normalize the processing elements to unit area and the wires to unit bandwidth. Concomitant with the assumption of unit bandwidth, we also assume that each communication between processors concerns a unit (bit). Once the bandwidth and sizes of messages are resolved, the estimates give a basis for the actual chip area and the actual chip times.
Tree Circuit and Sublinear Delay
Complete binary trees are basic ingredients for many circuits. They are exemplary for the embedding of a large class of hierarchical circuits in layouts on silicon. Such circuits are obvious candidates on which to test the effects of the wire area penalty for sublinear delay on chip.
The described effect of sublinear delay on short-wire-length layouts like two-dimensional Manhattan circuits is nil because all wires have the same constant length in the layout, while on circuits like fast permutation networks (cube-connected cycles, perfect shuffle, butterfly) the effects are very pronounced, and perhaps disastrous, because necessarily there are many long wires in the layout [20]. Does the wire area penalty involved with sublinear signal delay imply a significant overall layout area penalty for that most significant example in between: the complete binary tree? This depends on whether the layout for such a tree has long wires.
Recall that the H-tree layout for a complete binary rooted tree [9] achieves area less than $4N$ for an $N$-node complete binary tree, under the unit wire width assumption. Below we give upper and lower bounds on the layout area for a complete binary tree circuit with (i) logarithmic signal propagation delay and (ii) $r$th root signal propagation delay, according to the model described in the previous section.
If $r$ is the node on the $\sqrt{A}$-length path between $p$ and $q$ which is nearest to the root, then either $r$-$p$ or $r$-$q$ is an $\Omega(\sqrt{A})$-length subpath of a path from the root to a leaf. □
Definition. We denote by $A(N)$ the minimal area to lay out a complete binary tree with $N$ leaves under the different assumptions on wire area and cross-over.
Logarithmic Signal Propagation Delay.
Let the area occupied by a wire of length $L$ be $aL^2$ for some constant $0 < a \le 1$.
Upper bound. We analyse the area occupied by an H-tree layout with $N$ leaves and no overlap. This is an upper bound on $A(N)$. Recall the familiar picture of the H-tree layout with constant width wires for complete binary trees as given in, for instance, [9]. Let $N = 2^m$. Let the ratio between the length of the wires at two consecutive levels be $\alpha$. That is, the quotient of the length of a level $k+1$ wire and the length of a level $k$ wire is $\alpha$, $0 < \alpha < 1$, for all $0 \le k < m$ ($k = 0$ is root level and $k = m$ is leaf level). Considering that layout, it is not too difficult to see that, with constant aspect ratio $a$,
for each level $k$ between 1 and $m$, suffices to lay out the H-tree compactly with no overlap of wires and nodes. Consequently we obtain a relation between $a$ and $\alpha$ and, for both $a, \alpha > 0$,
$$0 < \alpha < \frac{\sqrt{2}}{2} \quad \text{and} \quad 0 < a \le \frac{2}{3\sqrt{6}}.$$
Note that in the limit for $\alpha \to \sqrt{2}/2$ we have that $a \to 0$.
For given $a$ and $\alpha$ in the appropriate ranges, an upper bound on the total wire area plus node area for the H-tree layout is computed by:
$$\frac{\ldots}{1 - 2\alpha^2} + 2^{m+1}$$
Therefore,
with $C = a / (1 - 2\alpha^2)$. From the relation between $a$ and $\alpha$ it follows that $C \ge a$ for $0 < \alpha < \sqrt{2}/2$. Setting $a = (1 - 2\alpha^2)/2$, so $1/2 \le \alpha < \sqrt{2}/2$ and therefore $0 < a \le 1/4$ and $C = 1/2$, yields
Therefore, for each $\epsilon > 0$ there is an $\alpha < \sqrt{2}/2$ such that $A(N) \in O(N^{1+\epsilon})$. Since $a$ must be greater than 0 this $\epsilon$ remains greater than 0 as well.
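The role of the constant $C = a/(1 - 2\alpha^2)$ can be illustrated by summing the wire areas of an H-tree level by level. The sketch below assumes $2^k$ wires at level $k$, each of length $\alpha^k$ times a hypothetical root wire length `root_length`, and each of area $a$ times its squared length; these modelling choices and the function names are assumptions made for illustration only.

```python
def htree_wire_area(m, alpha, a, root_length=1.0):
    """Total area of the wires at levels 1..m under the stated assumptions."""
    return sum((2 ** k) * a * (alpha ** k * root_length) ** 2
               for k in range(1, m + 1))


def geometric_constant(alpha, a):
    """C = a / (1 - 2*alpha**2); only meaningful while 2*alpha**2 < 1."""
    if 2 * alpha ** 2 >= 1:
        raise ValueError("the level sums do not converge for alpha >= sqrt(2)/2")
    return a / (1 - 2 * alpha ** 2)


if __name__ == "__main__":
    alpha = 0.5
    a = (1 - 2 * alpha ** 2) / 2          # the choice made in the text, giving C = 1/2
    print(geometric_constant(alpha, a))
    for m in (5, 10, 20):
        print(m, htree_wire_area(m, alpha, a))
```

With $\alpha$ below $\sqrt{2}/2$ the level sums form a convergent geometric series bounded by $C$ times the squared root length; above that threshold each level contributes more area than the previous one.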
For $\alpha \to \sqrt{2}/2$ (so the aspect ratio $a \to 0$) the upper bound on $A(N)$ goes to:
Substituting $A(2^i) = g(i)i$ in the inequality above yields
and, for $a, \alpha > 0$, $0 < \alpha < \sqrt{2}/2$. For $r \to \infty$ we obtain the upper bound on $a$ we saw above for the logarithmic delay case. Note that again for $\alpha \to \sqrt{2}/2$ we have that $a \to 0$. For given $a$ and $\alpha$ in appropriate ranges, that is, and $a$ accordingly, the upper bound on $A(N)$, consisting of the total wire area plus node area for the H-tree layout, is computed as follows:
Crudely derived, considering only volume without considerations of placement and routing, this lower bound is yet nonlinear and reflects both the necessary influence of the aspect ratio $a$ and the cross-over coefficient $c$. It seems likely that more sophisticated arguments for fixed aspect ratio $a > 0$ will raise the lower bound to match the upper bound $O(N^{1+\epsilon})$.
for each level $k$ between 0 and $m$, suffices to lay out the H-tree compactly with no overlap of wires and nodes. Consequently we obtain
Radical Signal Propagation Delay
For radical signal propagation delay proportional to the $r$th root of the length $L$ of the wire, the area taken by the wire needs to be $aL^{2-1/(2r)}$, for some constant $a$, by (15).
Unfolding this inequality we obtain
Selecting wires from the maximal length paths in these subtrees, we sum their areas while taking care that each such wire is counted only once.
Unfolding this inequality we obtain
where the series converges to the unbounded function $R(r)$. For boundary value $r = 1/2$, we obtain from (*) that $D(1/2, i) \in O(1)$ suffices, which is witnessed by the unit wire width H-tree. For the other extreme value of $r$ the inequality (*) yields:
giving us the earlier derived lower bound for logarithmic delay. Considering only volume without considerations of placement and routing, this lower bound reflects the influence of the radical $r$, the aspect ratio $a$ and the cross-over coefficient $c$.
Execution Time and Period
Let again $A(N)$ denote the minimal area for the layout of a complete binary tree $T$ with $N = 2^m$ leaves under the appropriate assumptions on wire area and cross-over. The period is computed from the minimax wire length
The execution time is the greatest sum of the delays along a path from the root to a leaf. The delay in each wire of each path is at least that of the longest wire in that path. Therefore, the execution time is at least $\log N$ times the delay in an $e(N)$-length wire. According to the Proposition, the minimax edge length $e(N) \in \Omega(\sqrt{A(N)} / \log N)$, and therefore the execution time is bounded below accordingly in terms of $A(N)$. Consequently, together with the respective minimal layout areas $A(N)$ for the different propagation delays we obtain:
binary N-node tree layout.
Viz., on the one hand the area must accommodate all wires, on the other hand the diameter of the layout cannot exceed the length of the longest path ($2 \log N$ edges). Therefore,
Table 1. Minimal area with unequal length wires. Here $a$ is the wire aspect ratio, $c$ is the cross-over coefficient or number of layers, and radical propagation delay is taken as the $r$th root.
and, for fixed constant $a$ and $c$ independent of $N$, the desired layout is impossible for large enough $N$. Moreover, substituting the upper bound on $a/c$ in the lower bound on the area then yields the [11] value $A(N) \in \Omega(N^2 / \log^2 N)$ again, indicating that such a circuit becomes impossible for already quite small $N$. Therefore, the period and execution time are -. Similarly, for radical ($r$th root) propagation delay we have:
and therefore
Table 2. Minimal area with equal length wires, with $r$ the radical, $a$ the wire aspect ratio and $c$ the cross-over coefficient or number of layers.
For logarithmic propagation delay with wires of constant aspect ratio $a$, viz. the left column of the table, the lower bound on the area is obtained by determining the combined area taken by $N$ wires of length $L \in \Omega(N / \log^2 N)$. However, under the requirement of unique wire length $e(N)$ for all wires, cross-over number $c$ (number of layers) and an aspect ratio $a$, the following relations have to hold for any complete
Consequently, the amount of wires in the layout is given by
Substituting $e(N)$ of the last displayed equation in the left hand term of the preceding displayed inequality yields the lower area bound on $A(N)$ in Table 2. Note that the boundary case $r = 1/2$ also yields the standard [11] value for $e(N)$ for unit width wires. The period is $\Omega(e(N)^{1/r})$ and the execution time is $\Omega(e(N)^{1/r} \log N)$. Therefore, the effect of the longer wires required for $O(L^{1/r})$ signal propagation delay for $L$-length wires eradicates the gain over quadratic delay, while the area rises exponentially with $r$ nonetheless. Consequently, a hierarchy of drivers is not a viable solution to speed up a tree implementation with equal length wires. For such circuits with equal length wires a better solution is periodic repeaters in long wires giving linear delay.
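The impossibility of equal length wires under logarithmic delay can be illustrated with a rough feasibility check. The sketch below assumes that $N = 2^m$ wires, each of area $a \cdot e^2$, must fit in $c$ layers of a square whose side is the diameter bound $2e \log_2 N$; under that assumption the wire length $e$ cancels and the condition becomes $aN \le 4c \log_2^2 N$. The constants, the square region and the function names are illustrative assumptions rather than the paper's exact inequality.

```python
def equal_length_feasible(m, a, c):
    """Check a * N <= 4 * c * (log2 N)^2 for N = 2**m (assumed packing condition)."""
    n = 2 ** m
    return a * n <= 4 * c * m * m


def largest_feasible_depth(a, c, m_max=64):
    """Largest tree depth m <= m_max for which the assumed condition still holds."""
    best = 0
    for m in range(1, m_max + 1):
        if equal_length_feasible(m, a, c):
            best = m
    return best


if __name__ == "__main__":
    for a, c in ((0.1, 2), (0.01, 2), (0.1, 8)):
        print(a, c, largest_feasible_depth(a, c))
```

For any fixed $a$ and $c$ the left side grows exponentially in $m$ while the right side grows only quadratically, so the check fails beyond some modest depth.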
Radical (r) Logarithmic

Layouts with Equal Length Wires
If we want to synchronize then it may be preferable to have layouts with only equal length wires. Under the constant wire width assumption, the least such wire length for a layout of a complete binary tree is $N / \log^2 N$ with simultaneous least area of $\Theta(N^2 / \log^2 N)$ [11]. Table 2 summarizes the effect of the requirement of equal length wires on layouts of a complete binary $N$-node tree. The derivation is given below.
Since the number of components bounds the number of bits manipulated in each computation step, Lemma 1 tells us that
for a normalization constant $c$ yet to be chosen. Here $L_{max}$ is a constant related to the size of the array (rectangular chip) and the adequacy of the placement; and $\lambda$ is a constant characteristic of the logic. Equation (17) results are given for some actual layouts placed using a hierarchical placement program: layouts for high-speed logic where $p$ was found to be 0.75 and a layout for a hand calculator chip with $p = 0.59$. For $1 \le \lambda < 3$, equation (17) is of the form of the Pareto-Levy distribution; similar laws occur in contexts like word frequencies, noise in transmission channels etc. For additional discussion see [5]. Let furthermore the network be connected, so the maximal amount of area units $C$ available to place the components is not greater than the number of wires plus 1. From (16c) and (17) we can estimate the maximal figure for the normalization constant $c$. For $\lambda \neq 1, 3$:
For $\lambda = 1$, and for $\lambda = 3$,
We now identify a common class of wire-length distributions for VLSI layouts. Firstly, it is argued that the requirement of logarithmic propagation delay favors such distributions. Secondly, other studies have shown such distributions to be likely for VLSI layouts on both theoretical and empirical grounds.
Logarithmic Delay
Recall that to obtain a logarithmic signal propagation delay we need a fixed constant aspect ratio for all wires in the layout. In designing a high speed layout we therefore need to install drivers to drive the long wires and to design all wires with constant aspect ratio. The area taken by such a driver is linear in the length of the wire. This area is required at the lowest silicon layer of the chip; the long interconnect wires are executed in the upper metal layers. If we double the length and width of the chip then the length of the longest wires and the area of their drivers double too. The area of the lowest layer, however, is quadrupled and can therefore accommodate at least double the amount of drivers. This allows us to add a new layer to place still longer wires. These longer wires come on higher levels where the wires are proportionally wider and thicker. To achieve logarithmic propagation delay we can estimate and bound the layout area occupied by the fattened wires as follows. Let $C$ be the amount of area of the layout occupied by non-wire components such as transistors. Assuming that $C$ is also the order of magnitude of the number of basic components like transistors or logic gates in the circuit we can reason as follows.
Since the wires only serve to connect components we have $C \in O(\#\text{wires})$ in a connected layout. The components are assumed to have at most a limited number $t$ of connections to attach wires, which we suppose to account also for the fan-in and fan-out of the interconnect wires. Therefore $C \in \Omega(\#\text{wires})$ and consequently $C \in \Theta(\#\text{wires})$. Since we are primarily interested in order of magnitude in the sequel, we are justified in using $C$ interchangeably for the amount of area occupied by the non-wire components, the number of non-wire components and the number of wires. The maximal area occupied by the wires (and interwire distances) under (13a) is bounded by the available area:
where $a$ is the constant quotient of width and length (the aspect ratio) of the connect wires as required by (13a). Using a simple theoretical argument and an experimental study of actual layouts, [5] develops the following wire length distribution relationship:
and with $A$, $C$ and $c$ as above we obtain the following relations. For $\lambda = 1$:
Proof. Since we assume the circuit to be connected we have $A > A - C > A/2$ in the various equations. We also assume $A \gg 1$.
(i) Equate expression (21) for $C$ with expression (19c) for $C$, with $A'$ substituted for $A$ in the latter. This yields $\log A' \in \Omega(\sqrt{A})$.
(ii) Substitute $C'$ for $C$ in equation (
We observe that in case (i) of the Theorem the wires get so long that the logarithmic propagation delay turns out to yield about the same absolute time delay as in the original wires. In case (ii) of the Theorem matters are probably as bad because the bit capacity of the chip has been logarithmically reduced. Finally, in case (iii) of the Theorem the subject circuit topology may not have a layout with the required wire length distribution. For values of the exponent $\lambda > 1$ in the wire length distribution the analog of the Theorem holds with polynomial relationships in cases (i) and (ii). For $\lambda > 3$, the number of long wires decreases so fast with the wire length that it suffices that $A' \in \Theta(A)$ in (i), $C' \in \Theta(C)$ in (ii) and nearly the same wire length distribution function suffices in (iii). However, this number is very much layout dependent, and depends in particular on the distribution of wire lengths. Consequently, even if the area $A$ is polynomial in the binary size $N$ of a problem, under a layout and signal propagation delay as in the lemma, the execution time, and also the period for systolic computation, will be $\Omega(N)$ since it takes at least $N / \log N$ stages just to scan all $N$ bits and for each such stage the delay in the longest wires is $\Omega(\log N)$. In some designs, like trees for instance, the number of long wires decreases faster with the wire length. For such $f$ the series in (16c) also converges faster, and the maximal number of wires, and similarly the maximal number of components, may rise to order $A$ (the area expressed in area units), under the logarithmic propagation delay requirement notwithstanding the attendant constant aspect ratio for wires. Recall that the quotient of the width and length of a wire is its aspect ratio.
By the previous analysis, considering just the wire length distribution while leaving free the actual circuit topology, placement and routing in the layouts, attaining a logarithmic signal propagation delay by changing constant wire width to constant aspect ratio for all wires in a layout can carry a surprisingly severe penalty. $\lambda > 3$ implies a negative Rent constant $p$ since $\lambda = 3 - 2p$ in [5], and therefore layouts which do not satisfy Rent's Rule. The reader is invited to analyse the relations for different values of $\lambda$. We look at one more case, the wire length distribution with $\lambda = 2$, which is interesting because it is associated with hierarchical designs. For $\lambda = 2$, the different parts of the Theorem yield the following:
(ii) $C' \in O(\sqrt{C \log C})$.
(iii) This requires a change to a new distribution function $f(i) = \lfloor c' i^{-\lambda} \rfloor$ ($1 \le i \le L_{max}$) and $f(i) = 0$ ($i > L_{max}$) with the new normalization constant $c'$ set to its maximum value. Each $\lambda \ge 3$ suffices.
Network topologies which can be realized with constant width wires may not be realizable at all on a multilevel Manhattan grid geometry with all wires having the same aspect ratio. Even for network topologies which do have a layout with a constant aspect ratio for the wires, the Theorem shows that the increase in area can be so large that the resulting increase in wire length will nullify (or worse) the increase in speed due to a change from linear or square propagation delay to a logarithmic one. It therefore appears that only circuits with proper topologies (like the tree circuit in the previous section), for which there are layouts with the considered wire length distributions with relatively large $\lambda$, are proper candidates for an improvement of speed by a logarithmic signal propagation delay.
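The sensitivity to the exponent of the wire length distribution can be seen in a small numerical sketch. It assumes a distribution of the power-law form discussed above, with $c \cdot i^{-\lambda}$ wires of length $i$ for $1 \le i \le L_{max}$ (ignoring rounding), and compares the total wire area for unit width wires with the total for wires of constant aspect ratio $a$. The function name and the parameter values are illustrative assumptions.

```python
def total_wire_areas(lam, l_max, a, c=1.0):
    """Return (unit-width total, constant-aspect-ratio total) of the wire areas.

    A wire of length i has area i with unit width and a * i**2 with aspect ratio a.
    """
    unit_width = 0.0
    constant_aspect = 0.0
    for i in range(1, l_max + 1):
        count = c * i ** (-lam)          # assumed number of wires of length i
        unit_width += count * i
        constant_aspect += count * a * i * i
    return unit_width, constant_aspect


if __name__ == "__main__":
    for lam in (1.5, 2.0, 3.0, 4.0):
        for l_max in (100, 10000):
            print(lam, l_max, total_wire_areas(lam, l_max, a=1.0))
```

Under these assumptions, for an exponent of 2 the unit width total grows only logarithmically with the longest wire length while the constant aspect ratio total grows linearly; for exponents above 3 both totals stay bounded, in line with the remark that only distributions with a relatively large exponent tolerate the constant aspect ratio.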
Exercise. Do a similar analysis for radical delay.
