A new method is presented to determine the power dissipation and propagation-delay time of small logical blocks (micro-cells). This method is a combination of the RC-tree and the macro modeling methods. It is fast and accurate: three orders of magnitude faster than SPICE, with a maximal error of 10 percent. This method will be used in a performance-driven micro-cell generator for a sea-of-gates environment.
Introduction
With the growing integration of today's and future electronic circuits and the increasing possibilities of technology, it is becoming difficult to meet the specifications of the entire circuit. These specifications comprise not only a logical description but also performance constraints, among them operating speed, noise margins, power dissipation and area. This paper presents a layout-generation technique for a sea-of-gates environment that takes these specifications into account. It aims to use all the possibilities offered by today's technology in order to improve performance.
A sea-of-gates array consists of an array of core-cells [1]. A core-cell is the smallest repetitive element of this array, and consists of n- and p-transistors. A sea-of-gates environment is a semi-custom technique, which is based on a transistor array that is equal for all applications, and two or three metal layers that are different for each application. Figure 1a shows a part of a transistor array [1]. A circuit is realized by generating the interconnect layers. Figure 1b shows an implementation of an eight-input NAND. The main difference with a gate-array is the absence of separate interconnection areas. It therefore offers the advantages of semi-custom gate-arrays, such as quick turn-around times and low cost, without their disadvantages, such as low densities and medium performance. A micro-cell is defined as a small logical block, the most complex one being a full adder. It is realized in a sea-of-gates array and therefore consists of at least one core-cell with interconnect.
An elegant way to meet the performance specifications is to use performance-driven circuit-generation techniques. A macro-assembler employing logic synthesis techniques is used to decompose the global specifications into micro-cells. This decomposition must take the parasitic capacitances and resistances caused by the interconnect into account, since they influence the performance. From this, specifications for micro-cells can be deduced. These micro-cells are generated by a performance-driven micro-cell compiler (PMCC). The PMCC must also take the interconnections that cross the generated micro-cell into account.
In this paper, the development of a fast and accurate performance-analysis tool is discussed, which is part of the performance-driven micro-cell compiler.
2. Performance criteria
A basic question is which performance constraints are of interest in a PMCC environment. The maximum operational frequency, the total power dissipation and the total area of an entire chip are clearly very important aspects in chip design. The logic synthesis techniques must translate these global constraints into micro-cell constraints, resulting in a propagation delay, a power dissipation, and an area and shape at micro-cell level. Increasing the operational frequency increases the noise susceptibility, and noise margins are often compromised to improve speed. It is therefore very important to introduce a lower bound on the noise margins. The performance constraints for the PMCC are consequently chosen to be:
-Maximum propagation delay.
-Maximum power dissipation.
-Maximum area.
-Minimum noise margins.
The PMCC must be able to predict the performance of the generated micro-cells to verify if these constraints have been met.
2130/9110000r0576$01.00 © 1991 IEEE
The next point of interest is the tradeoff between accuracy and CPU-time. Since meeting the constraints can be seen as an optimization problem, an iterative approach is used for the PMCC; small changes, e.g. adding transistors in parallel or in series, or adding buffers, are implemented in the micro-cell until the constraints are met. This requires very fast iterations in order to prevent large CPU-times. The accuracy of the PMCC must be comparable to that of the logic synthesis technique, resulting in a required accuracy of 10 percent.
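The iterative approach described above can be sketched as a simple loop. This is a minimal illustration only: the `Cell` and `Constraints` fields, the toy `predict` model and the single "add one parallel transistor" move are hypothetical stand-ins, not the PMCC's actual predictor or transformations, and the noise-margin constraint is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Constraints:
    max_delay: float
    max_power: float
    max_area: float


@dataclass
class Cell:
    width: float  # hypothetical: number of parallel driving transistors


def predict(cell):
    """Toy stand-in for the fast predictor: returns (delay, power, area)."""
    delay = 4.0 / cell.width   # more parallel transistors switch faster ...
    power = 10.0 * cell.width  # ... but dissipate more power
    area = 2.0 * cell.width    # ... and occupy more area
    return delay, power, area


def meets(perf, c):
    """True when every predicted value is within its constraint."""
    delay, power, area = perf
    return delay <= c.max_delay and power <= c.max_power and area <= c.max_area


def optimize(cell, c, max_iter=20):
    """Apply small changes until the constraints are met or iterations run out."""
    for _ in range(max_iter):
        if meets(predict(cell), c):
            return cell
        cell = Cell(width=cell.width + 1.0)  # small change: one more parallel transistor
    return None  # no feasible cell found within max_iter


result = optimize(Cell(width=1.0), Constraints(max_delay=1.5, max_power=40.0, max_area=10.0))
print(result)
```

Because every iteration calls the predictor once, the cost of the loop is dominated by the prediction time, which is why a fast predictor matters more here than in a one-off verification run.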
Current performance prediction techniques
The three most important performance prediction methods in the literature are briefly discussed next.
The first method consists of circuit simulation techniques (SPICE and [2]). This method calculates analog waveforms and therefore offers high accuracy, but suffers from long CPU-times.
The second method is very often used in applications with fixed layout libraries, such as gate-arrays. It consists of macro modeling techniques [3]. The shape of the output waveform of each micro-cell (delay time and rise/fall time) is determined as a function of several parameters, usually the load capacitance and the input transition time. This method offers high accuracy (errors under 5 percent) and small computational times. Using macro modeling techniques in a performance-driven layout generator, however, results in a tremendous increase in the total number of parameters that determine the shape of the output waveform. This is because the layout is no longer fixed by a library: the layout parameters, such as capacitances and resistances of the interconnect of each micro-cell, must also be taken into account. Macro modeling is therefore not practical in our application.
The third method is very often used in performance-driven applications, such as full-custom transistor sizing algorithms. These applications use RC-tree calculations [4-5] to approximate each transistor by a linear RC network (fig. 2a-2b). Several models and calculation techniques with varying accuracy can be found [6]. The best methods offer an average error per micro-cell of 20 percent with regard to delay calculations. This method offers a high computing speed, but cannot be used to predict accurate power dissipation values. Table 1 shows the advantages and disadvantages of these three methods, together with the requirements of the PMCC.
The last column represents the practicability of these methods. Macro modeling is not practical for the PMCC, due to the large total number of equations.
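For reference, the Elmore delay that underlies most RC-tree calculations can be sketched as follows. The tree encoding, node names and element values are assumptions made for this illustration; they are not taken from the cited tools.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay at every node of an RC tree.

    parent[n]: parent node of n (the driver/root has parent None)
    res[n]:    resistance of the branch between parent[n] and n
    cap[n]:    capacitance from node n to ground (must include the root)
    """
    children = {n: [] for n in parent}
    root = None
    for n, p in parent.items():
        if p is None:
            root = n
        else:
            children[p].append(n)

    # Order the nodes so that each parent precedes its children.
    order, stack = [], [root]
    while stack:
        n = stack.pop()
        order.append(n)
        stack.extend(children[n])

    # Downstream capacitance: the node's own capacitance plus its whole subtree.
    down = dict(cap)
    for n in reversed(order):
        if parent[n] is not None:
            down[parent[n]] += down[n]

    # The delay accumulates R_branch * C_downstream along the path from the root.
    delay = {root: 0.0}
    for n in order[1:]:
        delay[n] = delay[parent[n]] + res[n] * down[n]
    return delay


# Hypothetical tree: drv -> a -> b, and a -> c (kOhm and pF give ns).
parent = {"drv": None, "a": "drv", "b": "a", "c": "a"}
res = {"a": 1.0, "b": 2.0, "c": 3.0}
cap = {"drv": 0.0, "a": 2.0, "b": 1.0, "c": 1.0}
print(elmore_delays(parent, res, cap))
```

The calculation needs only two passes over the tree, which explains the high computing speed of this class of methods; its accuracy is limited by how well a linear RC network represents a switching transistor.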
A new performance prediction technique
The new technique is a combination of the RC-tree and the macro modeling methods. It aims to combine the CPU-efficiency of the RC-tree method with the high accuracy of the macro modeling method. An increase in accuracy is achieved by using macro modeling and substituting only the non-switching transistors by RC networks. The latter has two consequences:
-First, the most important transistors (the switching ones) are not substituted by RC networks, resulting in an increase in delay accuracy.
-Second, it is also possible to take the short-circuit power dissipation into account, resulting in an increase in power accuracy.
The dissipated power in the case of full CMOS is given by:
The second basic structure is used when a transmission gate contains the switching transistors. This structure consists of a transmission gate with RC networks (fig. 3a). The non-switching full-CMOS micro-cell at the input of the transmission gate is modeled by RC network RC1. Transmission gates and the load capacitance at the output are modeled by RC network RC2.
The same basic structures are used to predict the propagation delay and power dissipation. The only differences are the R and C values of the RC networks, which are used to replace the non-switching transistors. The exact R and C values are determined by accurate SPICE simulations.
Accurate SPICE simulations are also used to determine the relations between the performance of these basic structures and their resistances, capacitances, transistor widths, transistor lengths and the input rise/fall time T_d. These relations are stored in a table. During a performance prediction session, table lookup operations are carried out instead of SPICE simulations.
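A table lookup of this kind can be sketched with interpolation between stored grid points. The sketch below is an assumption-laden illustration: it uses only two table axes (load capacitance and input rise/fall time) instead of the full parameter set, the interpolation scheme is our choice, and the table values are invented placeholders rather than SPICE results.

```python
from bisect import bisect_right


def interp2(xs, ys, table, x, y):
    """Bilinear interpolation in table[i][j], sampled at xs[i] and ys[j].

    Both axes must be sorted in ascending order and hold at least two points.
    Queries outside the grid are clamped to its border.
    """
    def locate(axis, v):
        v = min(max(v, axis[0]), axis[-1])                  # clamp to the grid
        i = min(bisect_right(axis, v) - 1, len(axis) - 2)   # lower cell index
        t = (v - axis[i]) / (axis[i + 1] - axis[i])         # position inside the cell
        return i, t

    i, tx = locate(xs, x)
    j, ty = locate(ys, y)
    low = table[i][j] * (1 - ty) + table[i][j + 1] * ty
    high = table[i + 1][j] * (1 - ty) + table[i + 1][j + 1] * ty
    return low * (1 - tx) + high * tx


# Placeholder table: delay as a function of load capacitance and input rise/fall time.
caps = [0.0, 1.0, 2.0]
times = [0.0, 2.0]
delay_table = [[1.0, 2.0],
               [2.0, 3.0],
               [3.0, 4.0]]
print(interp2(caps, times, delay_table, 0.5, 1.0))
```

Each lookup costs a binary search per axis and a handful of multiplications, independent of how long the underlying SPICE runs took when the table was generated.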
If this reduction is possible without losing performance data and the performance of these basic structures can be determined using a lookup table, then one can determine the performance of all reducible micro-cells.
The algorithm
The performance prediction method can be divided into four phases, which are explained in detail below.
Phase 1
The new performance prediction technique can only be applied to single stages. A stage is defined as a part of a circuit that has a single n-transistor and a single p-transistor (m1 and m2) as input, and no transistor gates in the data path of the stage (fig. 4a). This is not a severe limitation, since most micro-cells consist of one stage, and more complex micro-cells (e.g. a full adder) can easily be divided into several stages. Subsequently, each of these stages can be treated separately.
Phase 2
Assume, as most frequently occurs, that there is only a single switching n-transistor and a single switching p-transistor per stage (m1 and m2) and that this switching pattern results in a transient at the output. Other cases are not of interest. Replacing the non-switching transistors of figure 4a by RC networks results in an inverter with complex RC networks at the sources and drains of the n- and p-transistors (fig. 4b). This replacement is shown for transistor m3. Some RC-tree applications [5] use RC networks that are a function of the input and output rise/fall times. Simulations showed that the effects of these rise/fall times can also be modeled by two additional capacitors. Therefore, fixed RC networks with two additional capacitors (C1 and C2 of fig. 4c) are used to model these effects.
Parasitic capacitances and resistances of the interconnect can be easily included in these RC networks.
Phase 3
The complex RC networks of the inverter (fig. 4c) are simplified, using an Elmore-based method [7-9], to obtain simplified RC networks (fig. 4d). Finally, the capacitors connected to the power supply can be omitted, while the others can be replaced by a single capacitor, resulting in a simplified network (fig. 4e). This network retains enough properties of the original network to be used for accurate performance evaluation.
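One common Elmore-based simplification, shown here only as a sketch, collapses an RC ladder into a single resistor and capacitor that preserve the total capacitance and the Elmore time constant at the far end. Whether the method of [7-9] uses exactly this reduction is not stated in the text; the function name and the ladder encoding are assumptions.

```python
def reduce_ladder(sections):
    """Collapse an RC ladder into one equivalent (R, C) pair.

    sections: list of (r, c) pairs ordered from the driver to the far end;
    r is the series resistance of a section and c the capacitance to ground
    at its end. The result keeps the total capacitance and the Elmore time
    constant seen at the far end of the ladder.
    """
    total_c = sum(c for _, c in sections)
    if total_c == 0:
        raise ValueError("ladder has no capacitance to preserve")
    tau = 0.0
    upstream_r = 0.0
    for r, c in sections:
        upstream_r += r        # resistance between the driver and this capacitor
        tau += upstream_r * c  # contribution of this capacitor to the Elmore delay
    return tau / total_c, total_c


# Two identical sections: tau = 1*1 + 2*1 = 3, total C = 2, so R_eq = 1.5.
print(reduce_ladder([(1.0, 1.0), (1.0, 1.0)]))
```

The equivalent resistance is smaller than the sum of the series resistances because capacitors close to the driver are charged through only part of the ladder.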
Phase 4
The reduction applied in the previous phase results in two basic structures, one for power and one for delay predictions.
Table 2
8. Conclusions and future work
A new performance determination method has been presented, to be used in a performance-driven micro-cell compiler for a sea-of-gates environment. A C program has been written that is able to predict the power dissipation and the propagation delay of full-CMOS micro-cells. This program uses a lookup table with a size of 7 KBytes, which was generated in five CPU-hours.
The performance of a full adder is predicted more than three orders of magnitude faster than with SPICE. The maximal prediction error is 10 percent, in contrast to an average error of 20 percent for the RC-tree method.
Fig. 6b The power dissipation of a four-input NOR (horizontal axis: input rise/fall time).
Fig. 6d The overall power dissipation accuracy (axis: simulated power dissipation).
