In this paper, we propose a novel power macromodeling technique for high level power estimation based on power sensitivity. Power sensitivity defines the change in average power due to changes in the input signal specification. The contribution of this work is that we can use only a few points to construct a complicated power surface in the specification-space. With such a power surface, we can easily obtain the power dissipation under any distribution of primary inputs. The advantages of our technique are two-fold. First, the required parameters corresponding to each representative point can be efficiently obtained by only one symbolic power estimation run or by only one Monte Carlo based statistical power estimation process. This stems from the fact that power sensitivity can be obtained as a by-product of probabilistic or statistical power estimation runs. Second, the memory requirements for the macromodel are reduced to $O(dn)$, where $n$ is the number of primary inputs of a circuit and $d$ is the number of representative points ($d$ can be as small as 1 in some cases). Results on a number of benchmark circuits demonstrate the effectiveness of our technique.
Introduction
The increasing use of portable computing and communication systems makes power dissipation a critical parameter to be minimized during circuit and system design [1, 14]. Hence, there is a great need for tools to accurately estimate power dissipation at various levels of design abstraction.
Research on power estimation has started in earnest; however, most of the research concentrates on the logic level. In order to shorten design time and to reduce design iterations, we have to estimate the power dissipation at a high level of abstraction to ensure that the strict power requirement of a future design is satisfied.
High level power estimation can be categorized into top-down methods and bottom-up methods. The top-down methods [8, 10] are not widely investigated because they require that a combinational circuit be specified as a Boolean function with no knowledge of the circuit structure or implementation. Bottom-up methods [7, 9, 11, 12, 13], on the other hand, have been exploited by several authors. However, one of the main concerns is how to develop a power macromodel for a module so that power dissipation can be easily obtained under any distribution of primary inputs.

This research was supported in part by DARPA F33615-95-C-1625, NSF CAREER award 9501869-MIP, IBM, AT&T, and Rockwell.
Since the power dissipation of a circuit strongly depends on the statistics of primary inputs, the relationship of power versus primary input probabilities (probability of a signal being logic ONE) and activities (probability of signal switching) is a complicated surface. Once such a surface is set up, power dissipation under any distribution of primary inputs can be easily obtained. However, to construct such a power surface, a huge number of discrete points are required. If one chooses $d$ representative values for the probability and activity of each primary input, the number of representative points in the specification-space can be $d^{2n}$ ($n$ is the number of primary inputs). This means that a symbolic or statistical power estimation process has to be repeated $d^{2n}$ times. For large circuits with a large number of inputs, such a process is obviously impractical due to the exponential growth of complexity. Moreover, the memory requirements grow with the number of stored points, which is also not affordable.

In this paper, we present a novel macromodeling technique based on power sensitivity [2, 3]. The basic idea of our technique is to use a limited number of representative points in the specification-space to approximate a complicated power surface. Power dissipation under any distribution of primary inputs is calculated by considering the representative points and power sensitivities. For each representative point, we need only four variables (probability, activity, power sensitivity to probability, and power sensitivity to activity) for each input. Thereby, the memory requirements are reduced to $O(dn)$.
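To make the gap between the two storage costs concrete, the following Python sketch compares the $d^{2n}$ points of an exhaustive grid with the four values per input kept for each of $d$ representative points. The function names and the sample values of `d` and `n` are illustrative assumptions, not figures from the experiments in this paper.

```python
def grid_points(d: int, n: int) -> int:
    """Number of points on a full grid: d values on each of the 2n axes."""
    return d ** (2 * n)


def macromodel_values(d: int, n: int) -> int:
    """Values stored by the macromodel: 4 per input for each of d points.

    The d nominal power values (one per point) are left out here, since
    they do not change the O(dn) growth.
    """
    return 4 * d * n


if __name__ == "__main__":
    d = 5  # hypothetical number of representative values / points
    for n in (2, 8, 32):  # hypothetical numbers of primary inputs
        print(f"n={n}: grid={grid_points(d, n)}, "
              f"macromodel={macromodel_values(d, n)}")
```

The grid count grows exponentially in $n$, while the macromodel count grows linearly, which is the motivation for the sensitivity-based approach.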
Preliminaries
2.1 Power Dissipation in CMOS Logic Circuits
Among the three sources of power dissipation (switching current, short-circuit current, and leakage current), the switching power is dominant in current technology. Thus the average power for a CMOS circuit can be approximated by

$$\mathrm{Power}_{avg} = \frac{1}{2} V_{dd}^2 \sum_{j \in \text{all nodes}} C_j A_j$$

where $V_{dd}$ is the supply voltage, $C_j$ is the node capacitance, and $A_j$ is the activity (probability of signal switching) at node $j$. If we assume that the circuits under consideration switch only at the leading edge of the clock and that the circuits are delay-free, we can define the normalized activity, denoted by $a(x_i)$, as $A(x_i)/f$, where $f$ is the clock frequency. $C_j$ is approximately proportional to the fanout at node $j$. Hence, we can define the normalized power dissipation measure as

$$\Phi = \sum_{j \in \text{all nodes}} \mathrm{fanout}_j \, a_j$$

where $\mathrm{fanout}_j$ is the fanout number at node $j$.
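As a small illustration of the two expressions in this subsection, the Python sketch below evaluates the average switching power and the fanout-weighted normalized power measure for a list of nodes. The function names, argument layout, and the sample numbers are assumptions made for this example only.

```python
from typing import Sequence


def average_power(vdd: float, caps: Sequence[float],
                  activities: Sequence[float]) -> float:
    """Switching power: 1/2 * Vdd^2 * sum over nodes of C_j * A_j."""
    if len(caps) != len(activities):
        raise ValueError("one capacitance and one activity per node")
    return 0.5 * vdd ** 2 * sum(c * a for c, a in zip(caps, activities))


def normalized_power(fanouts: Sequence[int],
                     norm_activities: Sequence[float]) -> float:
    """Normalized measure: sum over nodes of fanout_j * a_j."""
    if len(fanouts) != len(norm_activities):
        raise ValueError("one fanout and one activity per node")
    return float(sum(f * a for f, a in zip(fanouts, norm_activities)))


if __name__ == "__main__":
    # Hypothetical three-node circuit.
    print(normalized_power([2, 1, 3], [0.5, 0.25, 0.5]))
```

Replacing the capacitance by the fanout count is what allows the measure to be computed from the netlist alone, without technology data.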
Power Sensitivity
To measure the effect of the variations of primary input specifications on power dissipation, we define power sensitivity to primary input activity $\zeta_{a(x_i)}$ and power sensitivity to primary input probability $\zeta_{P(x_i)}$ as follows: [...] remainder $R_n$, the initial point being $(P(x_1)_{nom}, a(x_1)_{nom}, \ldots, P(x_n)_{nom}, a(x_n)_{nom})$ [5]. $\mathrm{Power}_{nom}$ is the average power dissipation based on nominal probabilities $P(x_i)_{nom}$ and activities $a(x_i)_{nom}$.
For higher order power sensitivities, for any primary input $x_i$, we have $\partial^m \mathrm{Power}_{avg} / \partial^k x_i = 0$ for any $m \ge 2$ and $k \ge 2$. This means that only those higher order power sensitivities in which each $x_i$ appears exactly once in the denominator will survive. This can be proved by induction.
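The vanishing of repeated derivatives can be checked numerically on the 2-input AND gate whose output activity expression is given later in this paper. The Python sketch below is only an illustration under that expression: it uses central finite differences (our choice, not the paper's method) to show that the second derivative with respect to a single variable is zero while the mixed derivative is not. Function names and the step size `h` are assumptions.

```python
def and_activity(p1: float, a1: float, p2: float, a2: float) -> float:
    """Output activity of a 2-input AND gate with independent inputs."""
    return p1 * a2 + p2 * a1 - 0.5 * a1 * a2


def second_pure_a1(p1: float, a1: float, p2: float, a2: float,
                   h: float = 0.01) -> float:
    """Central-difference estimate of the second derivative in a(x1)."""
    return (and_activity(p1, a1 + h, p2, a2)
            - 2.0 * and_activity(p1, a1, p2, a2)
            + and_activity(p1, a1 - h, p2, a2)) / h ** 2


def mixed_a1_a2(p1: float, a1: float, p2: float, a2: float,
                h: float = 0.01) -> float:
    """Central-difference estimate of the mixed derivative in a(x1), a(x2)."""
    return (and_activity(p1, a1 + h, p2, a2 + h)
            - and_activity(p1, a1 + h, p2, a2 - h)
            - and_activity(p1, a1 - h, p2, a2 + h)
            + and_activity(p1, a1 - h, p2, a2 - h)) / (4.0 * h ** 2)


if __name__ == "__main__":
    print(second_pure_a1(0.3, 0.2, 0.4, 0.2))  # repeated variable
    print(mixed_a1_a2(0.3, 0.2, 0.4, 0.2))     # each variable once
```

Because the expression is linear in each variable taken separately, only the mixed term, with coefficient $-\frac{1}{2}$, survives at second order.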
For the first order approximation, we can ignore the second or higher order power sensitivities to obtain power by

$$\mathrm{Power} = \mathrm{Power}_{nom} + \sum_{i \in \text{PI's}} \left( \zeta_{a(x_i)} \Delta a(x_i) + \zeta_{P(x_i)} \Delta P(x_i) \right) \qquad (6)$$

Thus, the power macromodel based on power sensitivities can be given by:

$$\mathrm{Power} = f\left( P(x_i), a(x_i), \zeta_{a(x_i)}, \zeta_{P(x_i)} \right) \quad \forall x_i \in \text{PI's} \qquad (7)$$

3.2 Construction of Power Surface
Power dissipation of a CMOS circuit is heavily dependent on the primary input probabilities and activities. If we plot the relationship between average power and input specifications in terms of signal probability and activity (in this paper we call such a plot a power-specification plot), we will get a complicated surface. Since $2n$ dimensions ($n$ is the number of primary inputs) are required to represent a certain specification of primary inputs, the power-specification plots are intrinsically $(2n+1)$-dimensional.
However, the process of obtaining such a power-specification plot is nontrivial. A possible method is to use random number generators to generate a large number of distributions of primary inputs, and then to use a symbolic or statistical method to obtain the power dissipation corresponding to each such distribution, which is a representative point for the power surface in the specification-space. However, the effectiveness of this method strongly depends on the density of the chosen points. The more points one chooses, the more accurate the result. However, more points directly translate to longer CPU time. In practice, one must use a finite number of points to approximate a complex power surface. In order to obtain the power under a given specification of primary inputs, one has to approximate its power by the value of a simulated point. This means that for some inputs there exist differences between the actual values of the probabilities and activities and those corresponding to the representative points in the specification-space. In [2], it has been shown that for some circuits a small deviation in the statistics of some primary inputs may have a major effect on power dissipation. For those sensitive primary inputs, such approximations may put the average power severely off the actual value. These errors can be effectively reduced by taking power sensitivities at different nominal values.
The basic idea of our method is to approximate a complex power surface with a number of small planes. First, we randomly choose $d$ different statistics of primary inputs. Then we determine the average power and power sensitivities for the $d$ different statistics of primary inputs. Given certain statistics of primary inputs, we use the nearest point in the specification-space as the nominal value of the needed power. Finally, the required power dissipation can be obtained by equation (6). Consequently, the vicinity of a representative point is approximately characterized by a plane.
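The lookup procedure described above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: the class and function names are hypothetical, and the nearest point is chosen by Euclidean distance over both probabilities and activities, which is an assumption made for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RepresentativePoint:
    prob: List[float]       # nominal P(x_i) for each primary input
    act: List[float]        # nominal a(x_i) for each primary input
    sens_prob: List[float]  # power sensitivity to P(x_i)
    sens_act: List[float]   # power sensitivity to a(x_i)
    power_nom: float        # average power at this nominal point


def distance(point: RepresentativePoint, prob: Sequence[float],
             act: Sequence[float]) -> float:
    """Euclidean distance in the specification-space (assumed metric)."""
    sq = sum((p - p0) ** 2 for p, p0 in zip(prob, point.prob))
    sq += sum((a - a0) ** 2 for a, a0 in zip(act, point.act))
    return math.sqrt(sq)


def estimate_power(points: Sequence[RepresentativePoint],
                   prob: Sequence[float], act: Sequence[float]) -> float:
    """First order estimate around the nearest representative point."""
    nearest = min(points, key=lambda pt: distance(pt, prob, act))
    correction = 0.0
    for i in range(len(prob)):
        correction += nearest.sens_act[i] * (act[i] - nearest.act[i])
        correction += nearest.sens_prob[i] * (prob[i] - nearest.prob[i])
    return nearest.power_nom + correction
```

Each stored point holds four values per input plus one nominal power, so $d$ points over $n$ inputs need on the order of $dn$ values.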
To illustrate how our technique works, let us consider a 2-input AND gate $y = x_1 x_2$, where $x_1$ and $x_2$ are independent primary inputs. We have activity $a(y) = P(x_1) a(x_2) + P(x_2) a(x_1) - \frac{1}{2} a(x_1) a(x_2)$. Figure 1 gives a simplified activity surface for the 2-input AND with $P(x_1) = 0.3$ and Figure 2 gives the activity surface obtained by equation (6) with 36 representative points. It fits the actual surface quite well. It should be noted that for a circuit with $n$ inputs, the $(2n+1)$-dimensional power surface is impossible to plot. Even for the 2-input AND, it is still impossible to plot the actual 5-dimensional activity surface when considering all possible probability values.
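The AND-gate example can be reproduced approximately with the Python sketch below. It is an illustration under stated assumptions: both input probabilities are fixed at 0.3 (the text states this value only for one input), the 36 representative points are placed on a regular 6 by 6 grid of activities, and the activities are swept over the range 0 to 0.6. The placement of the points behind Figure 2 is not given in the text, so the grid here is hypothetical.

```python
P1 = 0.3  # P(x1), the value quoted for Figure 1
P2 = 0.3  # P(x2): assumed here, the text does not state it
GRID = [0.05 + 0.1 * k for k in range(6)]  # assumed 6 x 6 = 36 nominal points


def and_activity(p1: float, a1: float, p2: float, a2: float) -> float:
    """Exact output activity of the 2-input AND gate."""
    return p1 * a2 + p2 * a1 - 0.5 * a1 * a2


def planar_estimate(a1: float, a2: float) -> float:
    """First order estimate from the nearest of the 36 grid points."""
    n1 = min(GRID, key=lambda g: abs(g - a1))
    n2 = min(GRID, key=lambda g: abs(g - a2))
    sens_a1 = P2 - 0.5 * n2  # derivative of a(y) with respect to a(x1)
    sens_a2 = P1 - 0.5 * n1  # derivative of a(y) with respect to a(x2)
    nominal = and_activity(P1, n1, P2, n2)
    return nominal + sens_a1 * (a1 - n1) + sens_a2 * (a2 - n2)


def max_abs_error(steps: int = 61) -> float:
    """Largest gap between exact and planar values over the swept range."""
    worst = 0.0
    for i in range(steps):
        for j in range(steps):
            a1 = 0.6 * i / (steps - 1)
            a2 = 0.6 * j / (steps - 1)
            err = abs(and_activity(P1, a1, P2, a2) - planar_estimate(a1, a2))
            worst = max(worst, err)
    return worst
```

For this gate the only neglected term is the mixed second order one, so the error of each plane is $\frac{1}{2} |\Delta a(x_1) \Delta a(x_2)|$, which with the assumed grid spacing of 0.1 stays at or below 0.00125.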
Efficient Techniques to Estimate Power Sensitivities
A naive approach to estimating power sensitivity would be to simulate a circuit to obtain the average power dissipation based on nominal values of primary input probabilities and activities, then assign a small variation to only one primary input and re-simulate the circuit. After all the primary inputs have been exhausted, power sensitivity can be obtained using $\Delta \mathrm{Power}_i / \Delta x_i$. This naive method can be easily implemented. However, it involves $n + 1$ power estimation runs, so it is computationally expensive and impractical for large circuits with a large number of primary inputs. In this section we present two efficient techniques to estimate power sensitivities.
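The naive procedure can be sketched in Python as below. The power estimator is replaced by a toy stand-in (the output activity of a 2-input AND gate), and the class name, function name, and step size `delta` are assumptions for illustration. The sketch perturbs one specification variable at a time, so it needs one run per variable plus one nominal run.

```python
from typing import Callable, List, Sequence, Tuple


def naive_sensitivities(power_fn: Callable[[Sequence[float]], float],
                        nominal: Sequence[float],
                        delta: float = 1e-3) -> Tuple[float, List[float]]:
    """Perturb one variable at a time and re-run the power estimate."""
    base = power_fn(nominal)  # one run at the nominal specification
    sens = []
    for i in range(len(nominal)):
        perturbed = list(nominal)
        perturbed[i] += delta  # vary only variable i
        sens.append((power_fn(perturbed) - base) / delta)
    return base, sens


class CountingPower:
    """Toy stand-in for a power estimator that counts its own runs."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, spec: Sequence[float]) -> float:
        self.calls += 1
        p1, a1, p2, a2 = spec
        return p1 * a2 + p2 * a1 - 0.5 * a1 * a2


if __name__ == "__main__":
    estimator = CountingPower()
    base, sens = naive_sensitivities(estimator, [0.3, 0.2, 0.4, 0.1])
    print(estimator.calls, base, sens)
```

With a real simulator in place of the toy function, every entry of the loop costs a full power estimation run, which is the expense the two techniques in this section avoid.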
Symbolic Technique to Obtain Power Sensitivities
The basic idea of the symbolic technique to obtain power sensitivities is to express the activity of each internal node or primary output in terms of the probabilities and activities of primary inputs. After we obtain the exact expression for signal activity, we can easily compute power sensitivities $\zeta_{a(x_i)}$ and $\zeta_{P(x_i)}$ by equations (1) and (2), and higher order power sensitivities by equation (4).
Note that accurate calculation of power sensitivity depends on whether we can accurately express the probability and activity of each internal node and primary output in terms of independent inputs. However, the sizes of the symbolic probability and activity expressions grow exponentially with respect to the number of independent inputs. Consequently, we resort to circuit partitioning to trade off accuracy against computation time. The circuit partitioning technique is similar to the one given in [4].
Statistical Technique to Estimate Power Sensitivities
The basic idea of the statistical technique is to obtain sensitivities as a by-product of statistical power estimation processes. A more detailed derivation can be found in [3].
The expected value of average power in a CMOS circuit can be expressed as follows: $E[\mathrm{Pwr}] =$ [...]
Experimental Results
We have implemented the two techniques to estimate power sensitivities in C under the Berkeley SIS environment. Based on the power sensitivities, the power macromodel has been implemented to accurately estimate power under any signal statistics.
Results for Power Sensitivities
Results for power sensitivities indicate that for some circuits power is much more sensitive to some primary inputs than others. A small activity variation of such highly sensitive primary inputs will result in a dramatic change of the average power.

The naive method to compute power sensitivities described in the previous section acts as a figure of merit for STOPS (symbolic technique to obtain power sensitivity) and STEPS (statistical technique to estimate power sensitivity). The average percent difference between the results obtained by STEPS and those obtained by the naive method is about 4% [3]. For STOPS, circuit partitioning may introduce an error of the order of 5-15% [2]. Both STOPS and STEPS can be several orders of magnitude faster than the naive method.
It should be noted that STEPS can handle different delay models for logic gates. Due to the delay of each logic gate, paths arriving at any internal gate may have different propagation delays. The delay mismatch of different paths can cause spurious transitions. This in turn increases power dissipation and power sensitivities. Another interesting result is that the relative values of power sensitivities to different primary inputs may change when considering non-zero delay. For circuit C3540, with the zero-delay model, the average power is more sensitive to the third primary input than to any other. However, with unit delay, the most sensitive primary input changes to the first one instead of the third.
Power under Any Distribution of Primary Inputs
After obtaining power sensitivities, we use equation (6) to compute the average power for each simulated circuit under a certain specification of primary inputs. The power sensitivities used in this section are obtained by STEPS. For [...]

Table 2 gives the results for different benchmarks. In this case, each primary input of each simulated circuit has the same $a_{nom}$ of 0.25 and the same $\Delta a$ of 0.05. The values shown in columns "macro", "sim", and "[...]" are obtained in the same way as those shown in table 1. With the zero delay model, compared with the Monte Carlo based simulation, the average percent difference is only 0.5% and the maximum [...]

For table 3, each primary input of a tested circuit has a randomly generated activity value between 0.05 and 0.95 while $\Delta a$ is still equal to 0.05. The column "macro" still represents the power obtained by equation (6) based on randomly generated possible activities as the nominal values. The maximum percent difference is under 1.0%.
The difference between table 3 and table 4 is that we choose a relatively large value of 0.25 for $\Delta a$ for each primary input. Compared to the Monte Carlo based simulation method, the average percent difference is about 4%. This is encouraging. It demonstrates that even a relatively large activity change does not significantly affect the accuracy of our technique. Hence, the number of nominal activity values used to construct a power surface can be really small. The fewer the simulated points, the lower the CPU time and memory requirements.

Table 5 gives the results when both $a$ and $\Delta a$ are randomly generated with values between 0 and 1. This means that each primary input may have a different activity and a different $\Delta a$. But for each primary input, there exists a constraint $a + \Delta a \le 1.0$. For the zero delay case, results obtained by our method match extremely well with those obtained by the Monte Carlo method. The maximum percent difference is only 0.2%. For the unit delay case, the average percent difference is still under 6% while the maximum difference is less than 10%.

Finally, to show the efficiency and accuracy of our technique, we report in tables 6 and 7 the results based on the power surface approximated by ten randomly chosen points for each circuit, where tables 6 and 7 correspond to the zero delay model and the unit delay model, respectively. For each representative point, we randomly generate an activity value for each primary input, and then use STEPS to estimate the corresponding power and power sensitivities. Therefore, the power surface of a circuit is approximated by ten randomly generated points. In order to compare our technique with Monte Carlo based simulation, for each circuit, ten different distributions of primary inputs are randomly generated. For distribution $i$, we first find the nearest point, which minimizes $\sqrt{\sum_j (a_j - a_{j,nom_i})^2}$, where $j$ varies over all primary inputs and $i$ varies over the ten representative points in the specification space. Then by equation (6) we use the corresponding power as the nominal value to compute the power corresponding to distribution $i$. The percent differences between the results obtained by the two methods are shown in columns 2 to 11 in tables 6 and 7. Column "Average" represents the average percent difference over the ten distributions. The results are promising. For each simulated circuit, our technique can yield 95% accuracy on average with only ten randomly chosen representative points.
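The comparison metric used in these tables can be sketched in Python as follows. The text does not spell out the formula, so this sketch assumes that the percent difference is the absolute gap between the macromodel value and the simulated value, relative to the simulated value. The function names and sample numbers are hypothetical.

```python
from typing import Sequence, Tuple


def percent_difference(macro: float, sim: float) -> float:
    """Assumed metric: absolute gap relative to the simulated value, in %."""
    return 100.0 * abs(macro - sim) / abs(sim)


def average_percent_difference(pairs: Sequence[Tuple[float, float]]) -> float:
    """Mean percent difference over (macro, sim) pairs, one per distribution."""
    diffs = [percent_difference(m, s) for m, s in pairs]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical (macro, sim) values for two input distributions.
    print(average_percent_difference([(95.0, 100.0), (102.0, 100.0)]))
```

Under this reading, an average percent difference of 5% corresponds to the 95% accuracy figure quoted above.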
We conclude this section with a few comments. First, the above results assume a fixed probability for each primary input. However, our technique is not limited to such an assumption. Second, the power dissipation values shown in the "macro" columns in the above tables are obtained by considering only first order power sensitivities. In order to improve accuracy, higher order power sensitivities can be included without much overhead, because power sensitivities can be obtained as by-products of the power estimation process. Third, we can further reduce memory requirements by storing only those parameters associated with very sensitive primary inputs. Finally, in some cases, the number of representative points used to construct the power surface can be reduced to one, which means only one symbolic run or only one statistical power estimation run is needed. For example, if the properties of each primary input fall into a relatively small range, we can use the middle point of each input's range as the nominal value to obtain all the necessary parameters: nominal average power, power sensitivities to activities, and power sensitivities to probabilities. Based on these parameters, power under any input distribution within the given range can be easily computed.
Conclusions
In this paper we have presented a novel power macromodel technique for high level power estimation based on power sensitivities. A key advantage of this work is that we can use a relatively small number of representative points to construct a complicated power surface which can then be used to determine the average power under any statistics of primary inputs. This makes the implementation of our technique very efficient. Another feature of our technique is that the memory requirements for the power macromodel can be reduced to $O(n)$ ($n$ is the number of primary inputs).
The feasibility and effectiveness of our technique have been verified on a large number of benchmark examples.
