In this paper, we propose a novel power macromodeling technique for high level power estimation based on power sensitivity. Power sensitivity defines the change in average power due to changes in the input signal specification. The contribution of this work is that we can use only a few points to construct a complicated power surface in the specification-space. With such a power surface, we can easily obtain the power dissipation under any distribution of primary inputs. The advantages of our technique are two-fold. First, the required parameters corresponding to each representative point can be efficiently obtained by only one symbolic power estimation run or by only one Monte Carlo based statistical power estimation process. This stems from the fact that power sensitivity can be obtained as a by-product of probabilistic or statistical power estimation runs. Second, the memory requirements for the macromodel are reduced to O(dn), where n is the number of primary inputs of a circuit and d is the number of representative points (d can be as small as 1 in some cases). Results on a number of benchmark circuits demonstrate the effectiveness of our technique.
Introduction
The increasing use of portable computing and communication systems makes power dissipation a critical parameter to be minimized during circuit and system design [1, 14]. Hence, there is a great need for tools to accurately estimate power dissipation at various levels of design abstraction.
Research on power estimation has started in earnest; however, most of the research concentrates on the logic level. In order to shorten design time and to reduce design iterations, we have to estimate the power dissipation at a high level of abstraction to ensure that the strict power requirement of a future design is satisfied. High level power estimation can be categorized into top-down methods and bottom-up methods. The top-down methods [8, 10] are not widely investigated because they require that a combinational circuit be specified as a Boolean function with no knowledge of the circuit structure or implementation. Bottom-up methods [7, 9, 11, 12, 13], on the other hand, have been exploited by several authors. However, one of the main concerns is how to develop a power macromodel for a module so that power dissipation can be easily obtained under any distribution of primary inputs.
Since power dissipation of a circuit strongly depends on the statistics of primary inputs, the relationship of power versus primary input probabilities (probability of a signal being logic ONE) and activities (probability of signal switching) is a complicated surface. Once such a surface is set up, power dissipation under any distribution of primary inputs can be easily obtained. However, to construct such a power surface, a huge number of discrete points are required. If one chooses $d$ representative values for the probability and activity of each primary input, the number of representative points in the specification-space can be $d^{2n}$ ($n$ is the number of primary inputs). This means that a symbolic or statistical power estimation process has to be repeated $d^{2n}$ times. For large circuits with a large number of inputs, such a process is obviously impractical due to the exponential growth of complexity. Moreover, the memory requirements grow with the number of stored points. This is also not affordable.
In this paper, we present a novel macromodeling technique based on power sensitivity [2, 3]. The basic idea of our technique is to use a limited number of representative points in the specification-space to approximate a complicated power surface. Power dissipation under any distribution of primary inputs is calculated by considering the representative points and power sensitivities. For each representative point, we need only four variables (probability, activity, power sensitivity to probability, and power sensitivity to activity) for each input. Thereby, the memory requirements are reduced to O(dn).
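The gap between the two storage costs can be made concrete with a short calculation. The sketch below is only illustrative: the function names are hypothetical, and it counts four stored values per primary input per representative point, as stated above, without counting the single nominal power value kept for each point.

```python
def grid_points(d: int, n: int) -> int:
    """Points needed when each of the 2n input statistics takes d values."""
    return d ** (2 * n)


def macromodel_values(d: int, n: int) -> int:
    """Values stored by the sensitivity-based macromodel.

    Four values per primary input (probability, activity and the two
    sensitivities) for each of the d representative points. The nominal
    power of each point would add d more values (not counted here).
    """
    return 4 * d * n


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from the benchmarks.
    for n in (2, 8, 32):
        print(n, grid_points(5, n), macromodel_values(10, n))
```

Even for modest n the exhaustive grid is out of reach, while the macromodel grows only linearly in both d and n.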
Preliminaries
This research was supported in part by DARPA (F33615-95-C-1625), NSF CAREER award (9501869-MIP), IBM, AT&T, and Rockwell.
Permission to make digital/hard copy of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. DAC 98, San Francisco, California ©1998 ACM 0-89791-964-5/98/06..$5.00
Power Dissipation in CMOS Logic Circuits
Among the three sources of power dissipation (switching current, short-circuit current, and leakage current), the switching power is dominant in current technology. Thus the average power for a CMOS circuit can be approximated by
$$\text{Power}_{avg} = \frac{1}{2} V_{dd}^2 \sum_{j \in \text{all nodes}} C(j)\, A(j)$$
where $V_{dd}$ is the supply voltage, $C(j)$ is the node capacitance, and $A(j)$ is the activity (probability of signal switching) at node $j$. If we assume that all primary inputs to the circuits under consideration switch only at the leading edge of the clock and that the circuits are delay-free, we can define the normalized activity, denoted by $a(x_j)$, as $A(x_j)/f$, where $f$ is the clock frequency. $C(j)$ is approximately proportional to the fanout at node $j$. Hence, we can define the normalized power dissipation measure $\Phi$ as
$$\Phi = \sum_{j \in \text{all nodes}} \mathrm{fanout}(j)\, a(j)$$
where $\mathrm{fanout}(j)$ is the fanout number at node $j$.
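As a small illustration of the normalized power measure, the following sketch sums fanout times normalized activity over the nodes of a circuit. The function name, the dictionary-based node representation, and the numbers are assumptions made for the example; they do not come from the benchmarks.

```python
from typing import Mapping


def normalized_power(fanout: Mapping[str, int],
                     activity: Mapping[str, float]) -> float:
    """Sum of fanout(j) * a(j) over all nodes j of the circuit."""
    return sum(fanout[node] * activity[node] for node in fanout)


if __name__ == "__main__":
    # Hypothetical three-node circuit: two inputs driving one gate output.
    fanout = {"x1": 1, "x2": 1, "y": 2}
    activity = {"x1": 0.5, "x2": 0.5, "y": 0.25}
    print(normalized_power(fanout, activity))
```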
Power Sensitivity
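To measure the effect of variations in primary input specifications on power dissipation, we define the power sensitivity to primary input activity, $\zeta_{a(x_i)}$, and the power sensitivity to primary input probability, $\zeta_{P(x_i)}$, as follows:
$$\zeta_{a(x_i)} = \frac{\partial \Phi}{\partial a(x_i)} = \sum_{j \in \text{all nodes}} \mathrm{fanout}(j)\, \frac{\partial a(j)}{\partial a(x_i)} \quad (1)$$
$$\zeta_{P(x_i)} = \frac{\partial \Phi}{\partial P(x_i)} = \sum_{j \in \text{all nodes}} \mathrm{fanout}(j)\, \frac{\partial a(j)}{\partial P(x_i)} \quad (2)$$
where $\Phi$ is the normalized power dissipation measure, $a(x_i)$ and $P(x_i)$ are the activity and probability of primary input $x_i$, respectively, and $a(j)$ is the activity of node $j$. For simplicity, we refer to $\partial a(j)/\partial a(x_i)$ and $\partial a(j)/\partial P(x_i)$ as activity sensitivities.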
Let $\theta(x_i)$ be $P(x_i)$ or $a(x_i)$. We define second order power sensitivities to primary input $x_i$ as
Power Macromodel
If we can get the power sensitivities to each primary input, we can compute power under any distribution of primary inputs by the following expansion: This means that only those higher order power sensitivities in which each $\theta(x_i)$ appears exactly once in the denominator will survive. This can be proved by induction.
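For the first order approximation, we can ignore the second and higher order power sensitivities to obtain power by
$$\text{Power} = \text{Power}_{nom} + \sum_{x_i \in \text{PI's}} \left[ \zeta_{a(x_i)}\, \Delta a(x_i) + \zeta_{P(x_i)}\, \Delta P(x_i) \right] \quad (6)$$
where $\text{Power}_{nom}$ is the power at the nominal input specification, and $\Delta a(x_i)$ and $\Delta P(x_i)$ are the deviations of the activity and probability of primary input $x_i$ from their nominal values. Thus, the power macromodel based on power sensitivities can be given by: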
Construction of Power Surface
Power dissipation of a CMOS circuit is heavily dependent on the primary input probabilities and activities. If we plot the relationship between average power and input specifications in terms of signal probability and activity (in this paper we call such a plot a power-specification plot), we will get a complicated surface. Since $2n$ dimensions ($n$ is the number of primary inputs) are required to represent a certain specification of primary inputs, the power-specification plot is a $(2n + 1)$-dimensional surface.
However, the process to obtain such a power-specification plot is nontrivial. A possible method is to use random number generators to generate a large number of distributions of primary inputs, and then to use a symbolic or statistical method to obtain the power dissipation corresponding to each such distribution, which is a representative point for the power surface in the specification-space. However, the effectiveness of this method strongly depends on the density of the chosen points. The more points one chooses, the more accurate the result. However, more points directly translate to longer CPU time. In practice, one must use a finite number of points to approximate a complex power surface. In order to obtain the power under a given specification of primary inputs, one has to approximate its power by the value of a simulated point. This means that for some inputs there exist differences between the actual values of the probabilities and activities and those corresponding to the representative points in the specification-space. In [2], it has been shown that for some circuits a small deviation in the statistics of some primary inputs may have a major effect on power dissipation. For those sensitive primary inputs, such approximations may put the estimated average power severely off the actual value. These errors can be effectively reduced by taking power sensitivities at different nominal values.
The basic idea of our method is to approximate a complex power surface with a number of small planes. First, we randomly choose $d$ different statistics of primary inputs. We then determine the average power and power sensitivities for each of the $d$ statistics. Given certain statistics of primary inputs, we use the nearest point in the specification-space as the nominal value of the needed power. Finally, the required power dissipation is obtained by equation (6). Consequently, the vicinity of a representative point is approximately characterized by a plane.
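The lookup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the class and function names are hypothetical, the nearest point is chosen by squared Euclidean distance over both probabilities and activities, and equation (6) is taken to be the nominal power plus each sensitivity multiplied by the deviation of the corresponding input statistic.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RepresentativePoint:
    """One simulated point of the power surface (field names are illustrative)."""
    prob: List[float]       # nominal probability P(x_i) of each primary input
    act: List[float]        # nominal activity a(x_i) of each primary input
    sens_prob: List[float]  # power sensitivity to P(x_i)
    sens_act: List[float]   # power sensitivity to a(x_i)
    power: float            # nominal power at this point


def squared_distance(pt: RepresentativePoint,
                     prob: Sequence[float], act: Sequence[float]) -> float:
    """Squared Euclidean distance in the specification-space (assumed metric)."""
    d_prob = sum((p - p0) ** 2 for p, p0 in zip(prob, pt.prob))
    d_act = sum((a - a0) ** 2 for a, a0 in zip(act, pt.act))
    return d_prob + d_act


def estimate_power(points: Sequence[RepresentativePoint],
                   prob: Sequence[float], act: Sequence[float]) -> float:
    """Pick the nearest representative point and apply the first-order correction."""
    nearest = min(points, key=lambda pt: squared_distance(pt, prob, act))
    correction = sum(s * (p - p0)
                     for s, p, p0 in zip(nearest.sens_prob, prob, nearest.prob))
    correction += sum(s * (a - a0)
                      for s, a, a0 in zip(nearest.sens_act, act, nearest.act))
    return nearest.power + correction
```

Each query costs one pass over the d stored points and one pass over the n inputs, so no further power estimation run is needed once the points have been characterized.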
To illustrate how our technique works, let us consider a 2-input AND gate $y = x_1 x_2$, where $x_1$ and $x_2$ are independent primary inputs, with output activity $a(y)$. Figure 2 gives the activity surface obtained by equation (6) with 36 representative points. It fits the actual surface quite well. It should be noted that for a circuit with $n$ inputs, the $(2n + 1)$-dimensional power surface is impossible to plot. Even for the 2-input AND, it is still impossible to plot the actual 5-dimensional activity surface when considering all possible probability values.
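The AND gate example can be worked through numerically. The sketch below rests on an assumption that is not spelled out in the text: each input is modeled as a stationary signal, so that the probability of an input being ONE in two consecutive cycles is $P(x) - a(x)/2$. With independent inputs this gives $a(y) = P(x_1)a(x_2) + P(x_2)a(x_1) - a(x_1)a(x_2)/2$, and the four activity sensitivities follow by differentiation. All function names are hypothetical.

```python
from typing import Dict, Sequence

KEYS = ("p1", "p2", "a1", "a2")


def and_activity(p1: float, p2: float, a1: float, a2: float) -> float:
    """Output activity of y = x1 AND x2 under the stationarity assumption."""
    return p1 * a2 + p2 * a1 - a1 * a2 / 2.0


def and_sensitivities(p1: float, p2: float, a1: float, a2: float) -> Dict[str, float]:
    """Partial derivatives of a(y) with respect to each input statistic."""
    return {"p1": a2, "p2": a1, "a1": p2 - a2 / 2.0, "a2": p1 - a1 / 2.0}


def first_order_activity(nominal: Sequence[float], target: Sequence[float]) -> float:
    """Plane approximation of a(y) around a nominal point (p1, p2, a1, a2)."""
    sens = and_sensitivities(*nominal)
    base = and_activity(*nominal)
    return base + sum(sens[k] * (t - n0) for k, n0, t in zip(KEYS, nominal, target))


if __name__ == "__main__":
    nominal = (0.5, 0.5, 0.2, 0.2)
    target = (0.5, 0.5, 0.3, 0.3)
    print(and_activity(*target), first_order_activity(nominal, target))
```

Because the expression is linear in each statistic taken separately, the plane is exact when only one statistic moves away from the nominal point, and the error when several move at once comes entirely from the cross terms.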
Efficient Techniques to Estimate Power Sensitivities
A naive approach to estimate power sensitivity would be to simulate a circuit to obtain the average power dissipation based on nominal values of primary input probabilities and activities, then assign a small variation to only one primary input and re-simulate the circuit. After all the primary inputs have been exhausted, power sensitivity can be obtained using $\Delta \text{Power}_{avg} / \Delta \theta(x_i)$. This naive method can be easily implemented. However, it involves $n + 1$ power estimation runs, so it is computationally expensive and impractical for large circuits with a large number of primary inputs. In this section we present two efficient techniques to estimate power sensitivities.
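The cost of the naive approach is easy to see in code. The following sketch is an illustration only: `estimate_power` stands for any power estimation routine (a hypothetical callable here), and the step size `delta` is an arbitrary choice.

```python
from typing import Callable, List, Sequence


def naive_sensitivities(estimate_power: Callable[[Sequence[float]], float],
                        nominal: Sequence[float],
                        delta: float = 1e-3) -> List[float]:
    """Finite-difference sensitivities using n + 1 power estimation runs."""
    base = estimate_power(nominal)          # one run at the nominal values
    sensitivities = []
    for i in range(len(nominal)):           # one extra run per primary input
        perturbed = list(nominal)
        perturbed[i] += delta               # vary only input i
        sensitivities.append((estimate_power(perturbed) - base) / delta)
    return sensitivities
```

Every entry of the result requires a full estimation run, which is what the symbolic and statistical techniques avoid.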
Symbolic Technique to Obtain Power Sensitivities
The basic idea of the symbolic technique to obtain power sensitivities is to express the activity of each internal node or primary output in terms of the probabilities and activities of primary inputs. After we obtain the exact expression for signal activity, we can easily compute the power sensitivities $\zeta_{a(x_i)}$ and $\zeta_{P(x_i)}$ by equations (1) and (2), and higher order power sensitivities by equation (4).
Note that accurate calculation of power sensitivity depends on whether we can accurately express the probability and activity of each internal node and primary output in terms of independent inputs. However, the sizes of the symbolic probability and activity expressions grow exponentially with the number of independent inputs. Consequently, we resort to circuit partitioning to trade off accuracy against computation time. The circuit partitioning technique is similar to the one given in [4].

Statistical Technique to Estimate Power Sensitivities
The basic idea of the statistical technique is to obtain sensitivities as a by-product of statistical power estimation processes. A more detailed derivation can be found in [3] .
The expected value of average power in a CMOS circuit can be expressed as follows:
$$E[\text{Power}] = \sum_{X_0, X_T} \text{Power}(X_0, X_T)\, P(X_0, X_T) \quad (8)$$
Results for Power Sensitivities
Results for power sensitivities indicate that, for some circuits, power is much more sensitive to some primary inputs than to others. A small activity variation of such highly sensitive primary inputs will result in a dramatic change of the average power. Consider circuit i6. All the primary inputs are assumed to have nominal probability and activity values of 0.5 and 0.26, respectively. The power sensitivities to activities of the 1st, 2nd, 60th, and 124th primary inputs are 237, 276, 133, and 30, respectively, while the power sensitivities of the other primary inputs are less than 4. If the activities of the 1st and 2nd primary inputs have a variation of ±0.05, results indicate that power can change by 30%. The naive method to compute power sensitivities described in the previous section serves as the reference for STOPS (symbolic technique to obtain power sensitivity) and STEPS (statistical technique to estimate power sensitivity). The average percent difference between the results obtained by STEPS and those obtained by the naive method is about 4% [3]. For STOPS, circuit partitioning may introduce an error of the order of 5-15% [2]. Both STOPS and STEPS can be several orders of magnitude faster than the naive method.
It should be noted that STEPS can handle different delay models for logic gates. Due to the delay of each logic gate, paths arriving at any internal gate may have different propagation delays. The delay mismatch of different paths can cause spurious transitions. This in turn increases power dissipation and power sensitivities. Another interesting result is that the relative values of power sensitivities to different primary inputs may change when considering non-zero delay. For circuit C3540, with the zero-delay model, the average power is more sensitive to the third primary input than to any other. However, with unit delay, the most sensitive primary input changes to the first one instead of the third.
Power under Any Distribution of Primary Inputs
After obtaining power sensitivities, we use equation (6) to compute power under different distributions of primary inputs. For simplicity, we assume that each primary input has a fixed probability value of 0.5. Table 1 shows the results for circuit C432 based on different nominal values for activity. Corresponding to each chosen nominal activity, all primary inputs are assumed to take this value. The column "macro" represents the power obtained by our technique with $\Delta a$ equal to 0.05 for each primary input, while column "sim" represents the power obtained by a Monte Carlo based simulation method with the activity value of each primary input equal to $(a_{nom} + \Delta a)$. The values shown in the column "%" are obtained by $(P_{macro} - P_{sim})/P_{sim}$, which stands for the percent difference between the results obtained by the two methods. The last row stands for the average percent difference. Results indicate that with the zero-delay model the percent difference never exceeds 3.0% and the average percent difference is only 0.65%. With the unit delay model, the average percent difference is 0.83% with the maximum percent difference no greater than 3.5%.
Table 2 gives the results for different benchmarks. In this case, each primary input of each simulated circuit has the same $a_{nom}$ of 0.25 and the same $\Delta a$ of 0.05. The values shown in columns "macro", "sim", and "%" are obtained in the same way as those shown in table 1. With the zero delay model, compared with the Monte Carlo based simulation, the average percent difference is only 0.5%.
Table 3: Average power with random $a$ and $\Delta a = 0.05$
In table 3, each primary input of a tested circuit has a randomly generated activity value between 0.05 and 0.95 while $\Delta a$ is still equal to 0.05. The column "macro" still represents the power obtained by equation (6) based on randomly generated possible activities as the nominal values. The maximum percent difference is under 1.0%.
Table 4 differs from table 3 in that we choose a relatively large value, 0.25, for $\Delta a$ for each primary input. Compared to the Monte Carlo based simulation method, the average percent difference is about 4%. This is encouraging: it demonstrates that even a relatively large activity change does not severely degrade the accuracy of our technique. Hence, the number of nominal activity values used to construct a power surface can be really small. The fewer the simulated points, the lower the CPU time and memory requirements.
Table 5 gives the results when both $a$ and $\Delta a$ are randomly generated with values between 0 and 1. This means that each primary input may have a different activity and a different $\Delta a$. For each primary input, however, there is a constraint $a + \Delta a < 1.0$. For the zero delay case, results obtained by our method match extremely well with those obtained by the Monte Carlo method. The maximum percent difference is only 0.2%. For the unit delay case, the average percent difference is still under 6% while the maximum difference is less than 10%.
Tables 6 and 7 give the results for a power surface approximated by ten randomly chosen points for each circuit, where tables 6 and 7 correspond to the zero delay model and the unit delay model, respectively. For each representative point, we randomly generate an activity value for each primary input, and then use STEPS to estimate the corresponding power and power sensitivities. Therefore, the power surface of a circuit is approximated by ten randomly generated points. In order to compare our technique with Monte Carlo based simulation, for each circuit, ten different distributions of primary inputs are randomly generated.
For distribution $T_k$, we first find the nearest point, that is, the representative point $i$ which minimizes $\sum_j \left( a(j) - a(j)_{nom}(i) \right)^2$, where $j$ ranges over all primary inputs and $i$ ranges over the ten representative points in the specification-space. Then, by equation (6), we use the corresponding power as the nominal value to compute the power corresponding to distribution $T_k$. The percent differences between the results obtained by the two methods are shown in columns 2 to 11 in tables 6 and 7. Column "Average" represents the average percent difference over the ten distributions. The results are promising. For each simulated circuit, our technique can yield 95% accuracy on average with only ten randomly chosen representative points.
We conclude this section with a few comments. First, the above results assume a fixed probability for each primary input. However, our technique is not limited to such an assumption. Second, the power dissipation values shown in the "macro" columns in the above tables are obtained by considering only first order power sensitivities. In order to improve accuracy, higher order power sensitivities can be included without much overhead, because power sensitivities can be obtained as by-products of the power estimation process. Third, we can further reduce memory requirements by storing only those parameters associated with very sensitive primary inputs. Finally, in some cases, the number of representative points used to construct the power surface can be reduced to ONE, which means only one symbolic run or only one statistical power estimation run is needed. For example, if the statistics of each primary input fall into a relatively small range, we can use the middle point of the range of each primary input as the nominal value to obtain all the necessary parameters: nominal average power, power sensitivities to activities, and power sensitivities to probabilities. Based on these parameters, power under any input distribution within the given range can be easily computed.
Conclusions
In this paper we have presented a novel power macromodel technique for high level power estimation based on power sensitivities. A key advantage of this work is that we can use a relatively small number of representative points to construct a complicated power surface which can then be used to determine the average power under any statistics of primary inputs. This makes the implementation of our technique very efficient. Another feature of our technique is that the memory requirements for the power macromodel can be reduced to O(n) (n is the number of primary inputs). The feasibility and effectiveness of our technique have been verified on a large number of benchmark examples.
