We present a new delay model for use in logic synthesis. A traditional model treats the area of a library cell as constant and makes the cell's delay a linear function of load. Our model is based on a different but equally fundamental linearity in the equation relating area, delay, and load: namely, we may keep a cell's delay constant by making its area a linear function of load. This allows us to technology map using a library with continuous device sizing, satisfies certain electrical noise and power constraints, and in certain cases is computationally simpler than a traditional model. We give results to support these claims. A companion paper [14] uses the computational simplicity to explore a wide search space of algebraic factorings in a mapped network.
Introduction. Most technology mapping algorithms for logic synthesis have been targeted at technologies with a limited number of cell sizes. A straightforward modeling technique will then model each library element with a unique cell, whose area is fixed and whose delay varies with output loading. A class of technology-mapping algorithms called tree-mapping [1, 2, 3] is well suited to such a model. Given a tree-structured network and a fixed cell library, tree-mapping algorithms run in time linear in the number of circuit nodes. They are also linear in the number of library cells, which is of course not a problem for these reasonably small libraries.
At the other end of the performance spectrum, full-custom design can achieve high device densities and clock speeds [13]. However, it requires, among other things, the ability to create gates of any desired size. This conceptually implies an unbounded number of library cells, and clearly precludes the direct use of a tree mapper, whose execution time is linear in the library size. One alternative is to approximate the continuous library with a discrete, near-continuous (and very large) cell library. However, this produces suboptimal results (since the library is still not continuous), and is also slow.
We propose a new model for continuously-sized CMOS gates. In this model, a cell's delay will be held constant. As the cell's load changes, the cell's size automatically grows exactly enough to hold delay constant, making its area a function (in fact a linear function) of load. This model will enable us to use a modified tree-mapping technology to efficiently produce continuously-sized netlists satisfying certain electrical noise and power constraints.
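The linear area-versus-load relationship can be illustrated with a small numerical sketch. It assumes a first-order RC gate model that is not spelled out in this section: a gate of size s has drive resistance `r_unit / s` and self-loading capacitance `c_self_unit * s`. The function and parameter names are hypothetical and chosen only for illustration.

```python
def gate_delay(size, load, r_unit=1.0, c_self_unit=1.0):
    """Delay under the assumed model: (r_unit / size) * (self load + external load)."""
    return (r_unit / size) * (c_self_unit * size + load)


def size_for_delay(target_delay, load, r_unit=1.0, c_self_unit=1.0):
    """Gate size that holds delay at target_delay while driving `load`.

    Solving the assumed delay equation for size gives a value that is
    directly proportional to the external load.
    """
    intrinsic = r_unit * c_self_unit  # delay of a gate driving only itself
    if target_delay <= intrinsic:
        raise ValueError("target delay must exceed the intrinsic delay")
    return r_unit * load / (target_delay - intrinsic)


if __name__ == "__main__":
    for load in (4.0, 8.0, 16.0):
        s = size_for_delay(3.0, load)
        print(load, s, gate_delay(s, load))
```

In this sketch, doubling the load doubles the required size while the delay stays at its target, which is the behavior the constant-delay model relies on.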
Our own application is for continuously-sized, full-custom designs. However, the delay model is also applicable to other methodologies, such as high-end standard cell, where there are many sizes of each cell. Essentially, it applies to any technology where cell sizing to obtain a desired delay is viable.
Constant-delay modeling has been used frequently in technology-independent algorithms. For example, Wang [8, pg. 167] proposed decomposing a network into bounded-fanin NAND gates, assigning a unit delay to each level of logic, and determining and restructuring critical regions with the resulting arrival times.
Singh [12, pp. 13-19] has measured the accuracy of various technology-independent delay models. He concluded that the unit-delay model on bounded-fanin gates was the most accurate. His speedup [8]
Figure 3 shows this graphically (the four plots lie on top of each other). Theorem 2 also explains the phenomenon observed in Figure 2. As a gate gets faster, its self-loading ratio increases. It thus takes more and more extra area to make it faster by the same constant delay increment Δt.
Electrical and power constraints.
We can also relate self-loading ratio to power. Self-loading capacitance implies work which is "useless" in that it is not charging external loads. Thus, keeping the self-loading ratio low implies a high ratio of useful work to self-loading work. Figure 3 tells us that when designing for minimum self-loading ratio, it is sufficient to consider a gate's delay only; its area may be whatever it needs to be for the proper delay, without affecting the ratio.
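The claim that the self-loading ratio depends on delay alone can be checked with a short derivation. This is a sketch under an assumed first-order RC model that is not spelled out in this section: a gate of size $s$ has drive resistance $R/s$ and self-loading capacitance $s C_s$, and drives an external load $C_L$. The symbols $R$, $C_s$, $C_L$, $s$, $d$, and $\rho$ are introduced here for illustration only.

```latex
\begin{align*}
d &= \frac{R}{s}\,\bigl(s C_s + C_L\bigr) = R C_s + \frac{R C_L}{s}
  && \text{(assumed delay model)} \\
s &= \frac{R C_L}{d - R C_s}
  && \text{(size at fixed } d \text{ is linear in } C_L\text{)} \\
\rho &\equiv \frac{s C_s}{C_L} = \frac{R C_s}{d - R C_s}
  && \text{(self-loading ratio depends on } d \text{ only)}
\end{align*}
```

Under these assumptions $\rho$ contains neither $s$ nor $C_L$, and it grows as $d$ approaches the intrinsic delay $R C_s$, which is consistent with faster gates having higher self-loading ratios.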
We can thus use Figure 3 to
Both the constant-area and constant-delay models relate area, load, and delay. Given a library with enough cells, either model can take any two of the three variables and predict the third, and the two are thus functionally equivalent in the limit case of densely-populated libraries. However, given the 3:1 range in cell delay and a nearly 100:1 range in cell area, our library can be far smaller than conventional ones, for a methodology allowing continuous sizing.
We incorporate our delay model within a tree-mapping technology mapper [1, 2]. Tree mappers can be used to optimize many different cost functions: e.g., area, delay, and area under a delay constraint.
Since our model essentially reverses the roles of area and delay, minimal-area tree mapping can now be done very similarly to existing min-delay algorithms (e.g., [2], ch. 2). Min-delay mapping is analogously done with an existing min-area algorithm (e.g., [1]).
Min-area mapping under a delay constraint is perhaps the most useful and difficult problem. There are several methods in the literature of dealing with it, e.g., [2, p. 22] and [4]. We focus on the two-pass algorithm in [4]. For the first pass, [4] chose a constant value K, and assumed that the expected load at all tree-internal nodes was equal to K. This reduced the delay of each cell to a constant, and allowed [4] to store simple (arrival time, area) pairs for the solutions at each node instead of piecewise-linear functions. The inaccuracies due to [4]'s simplified model were assumed to be minimal, and were heuristically adjusted on a later pass. We keep the first pass from [4] exactly. Our cells have their native constant delays. Their area is heuristically assumed constant and equal to the slope of their actual area vs. load line. We then eliminate the second pass of [4] altogether and, as mentioned, replace it by a sizing technique similar to [9, pg. 252].
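The bookkeeping behind storing (arrival time, area) pairs can be sketched as follows. This is an illustrative outline, not the implementation from [4] or from this paper: the function names are hypothetical, `cell_area` stands in for the slope of the cell's area vs. load line, and the exhaustive combination of child solutions is a simplification chosen for clarity.

```python
from itertools import product


def prune(points):
    """Keep only non-dominated (arrival, area) pairs, sorted by arrival."""
    kept = []
    for arrival, area in sorted(set(points)):
        # A later arrival is only worth keeping if it saves area.
        if not kept or area < kept[-1][1]:
            kept.append((arrival, area))
    return kept


def combine(cell_delay, cell_area, child_fronts):
    """Candidate solutions for one cell match, given its inputs' pruned fronts."""
    out = []
    for combo in product(*child_fronts):
        arrival = max((a for a, _ in combo), default=0.0) + cell_delay
        area = sum(ar for _, ar in combo) + cell_area
        out.append((arrival, area))
    return prune(out)


def best_under_constraint(front, required_time):
    """Smallest-area point of a pruned front that meets the required time."""
    feasible = [p for p in front if p[0] <= required_time]
    return feasible[-1] if feasible else None
```

Because each cell's delay is a constant, every candidate reduces to a single pair, and a pruned front is ordered so that area falls as arrival time rises. The last feasible point is therefore the cheapest one that meets the constraint.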
Results and Conclusions.
We have built a technology mapper using these ideas on top of SIS [11]. It is based on a tree mapper which minimizes area under a delay constraint (MADC), as described in Section 3, and followed by a simple device sizer based on [9, pg. 252]. We have then built both a constant-area library and a constant-delay library. The constant-area library has approximately 16 sizes for each gate type. The constant-delay library has two. We have taken several networks from a low-power, high-performance microprocessor currently in design, and processed them with our technology mapper using the constant-delay library.
For comparison purposes, we have then converted the results to the best possible equivalent using the constant-area library. At nodes which are speed-critical, we choose the next-larger cell size to reduce delay. At nodes which are bounded by the methodology's slowest possible delay, we likewise round up to the next larger cell to avoid violating electrical constraints. At other nodes, we round to the nearest legal discrete cell size. Columns 5 and 6 give the area results for each library. As expected, the constant-delay library used significantly less area (14.8%) than did the discrete library. This is partially because the constant-delay library was able to use exactly the smallest cell size on non-critical nodes, where the constant-area library had to use the next larger size. It is also due to the constant-delay library avoiding area overkill on critical nodes. Columns 7 and 8 give the total power expended for each circuit. They disregard switching probabilities and use a simple model where power $\propto CV^2$. Note that the power measurements are roughly in line with the area measurements.
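The rounding rule used for this comparison can be sketched in a few lines. The function name, the `round_up` flag, the tie-breaking toward the larger cell, and the clamping at the ends of the library are assumptions made for illustration; the text above specifies only the round-up and round-to-nearest cases.

```python
import bisect


def round_to_library(size, legal_sizes, round_up):
    """Map a continuous cell size onto a discrete library.

    legal_sizes must be sorted ascending. round_up is True for speed-critical
    nodes and for nodes already at the slowest allowed delay.
    """
    i = bisect.bisect_left(legal_sizes, size)  # first legal size >= size
    if i == len(legal_sizes):
        return legal_sizes[-1]  # assumption: clamp to the largest cell
    if i == 0 or round_up:
        return legal_sizes[i]
    lower, upper = legal_sizes[i - 1], legal_sizes[i]
    # Assumption: ties go to the larger cell.
    return lower if size - lower < upper - size else upper
```

For example, with legal sizes 1, 2, 4, and 8, a continuous size of 2.5 becomes 4 on a critical node and 2 elsewhere, which shows where the discrete library spends extra area.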
Finally, columns 9 and 10 compare delay-model complexity. Column 9 gives the total number of solution points used by our MADC mapper. Column 10 contrasts this to a min-delay piecewise-linear mapper such as in [2]. As mentioned in Section 2, the piecewise-linear mapper uses at least one solution point for each cell strength in the library at every node while calculating the minimum-delay solution. A true MADC solution such as proposed in [2, pg. 22] would be substantially more expensive still. As expected, the computational gain from the 3:1 range in delays vs. the 100:1 range in areas is substantial. We observe that it is so substantial that it enables a true MADC mapper to use 15% fewer solution points than a simpler min-delay mapper. This computational simplicity will be used to good effect in [14].
In conclusion, we have developed a new delay model. Our model keeps the delay of any cell constant by varying the cell's size in proportion to changes in its output load. We have shown the model to be both accurate and computationally efficient, and motivated it with circuit-integrity and power considerations. We have used it to give insight into previous technology-independent delay modeling, and demonstrated its use in technology mapping. A companion paper [14] uses its computational simplicity to explore a wide range of structurings of a mapped network.
