With the shift to low-power IC design for personal computing and communication applications, designers' priorities turn to accurate and efficient estimation of power consumption in ICs. Traditional current and power estimation techniques based on SPICE-like simulation do not provide the necessary efficiency for such an application, and thus new approaches have recently been proposed.
In this, the first of a series of articles that reflect the new orientation of this column, Professor Farid Najm of the University of Illinois at Urbana-Champaign presents an overview of different techniques for estimating power consumption in large-scale IC designs. He also discusses computer-aided design tools to help in the task.
-The Editors
Continuing decreases in feature size and corresponding increases in chip density and operating frequency have made power consumption a major concern in VLSI design. Excessive power dissipation in integrated circuits not only discourages their use in a portable environment, but also causes overheating, which degrades performance and reduces chip lifetime. To control the temperature levels, designers have come up with specialized and costly packaging and heatsink arrangements. This, combined with the recently growing demand for low-power portable communications and computing systems, has created a need to limit the power consumption in many chip designs. Indeed, the Semiconductor Industry Association has identified low-power design techniques as a critical technological need [3].
Managing the power of an IC design adds to a growing list of problems that IC designers and design managers must contend with. Computer Aided Design (CAD) tools are needed to help with the power management tasks. Specifically, there is a need for CAD tools to estimate power dissipation during the design phase.
July 1994
In the popular CMOS and BiCMOS technologies, the chip components (gates, cells) do not draw steady state power supply current. Instead, they draw current only when they make a logic transition. While this is considered an attractive low-power feature of these technologies, it makes the power dissipation highly dependent on the switching activity inside these circuits. Simply put, a more active circuit will consume more power. This complicates the power estimation problem because the power becomes a moving target: it is dependent on the input pattern.
Thus, the simple and straightforward solution of estimating power by using a simulator is severely complicated. The input signals, or even their characteristics, are generally unknown during the design phase because they depend on the system (or chip) in which the chip (or functional block) will eventually be used. Furthermore, it is practically impossible to estimate the power by simulating the circuit for all possible inputs. Recently, several techniques have been proposed to overcome this problem. Many techniques use probabilities to describe the set of all possible logic signals, and then study the power resulting from the collective influence of all these signals. This formulation achieves a certain degree of pattern-independence that allows one to efficiently estimate and manipulate the power dissipation.
Describing the Problem
By power estimation, we generally refer to the problem of estimating the average power dissipation of a circuit. This is different from estimating the worst case instantaneous power, generally referred to as the voltage drop problem [4-6]. Chip heating and temperature are directly related to the average power.
We have already alluded to the most straightforward method of power estimation, namely simulation: perform a circuit simulation of the design and monitor the power supply current waveform. Subsequently, the average of the current waveform is computed and used to provide the average power. The advantages of this technique are mainly its accuracy and generality. It can be used to estimate the power of any circuit, regardless of technology, design style, functionality, architecture, and so on. The simulation results, however, are directly related to the specific input signals used to drive the simulator. Furthermore, complete and specific information about the input signals is required, in the form of voltage waveforms.
Hence we describe these simulation-based techniques as being strongly pattern-dependent.
The pattern-dependence problem is serious. Often, one estimates the power of a functional block when the rest of the chip has not yet been designed, or even completely specified. In these cases, very little may be known about the inputs to this functional block, and complete and specific information about its inputs would be impossible to obtain. Even if one is willing to guess at specific input waveforms, it may be impossible to assess whether such inputs are typical. Large numbers of input patterns would have to be simulated, and this can become computationally very expensive, and practically impossible for large circuits.
Most other (more efficient) power estimation techniques that have been proposed start out by simplifying the problem in three ways. First, it is assumed that the power supply and ground voltage levels throughout the chip are fixed, so that it becomes simpler to compute the power by estimating the current drawn by every subcircuit. Second, it is assumed that the circuit is built of logic gates and latches, and has the popular and well-structured design style of a synchronous sequential circuit (Fig. 1). In other words, it consists of latches driven by a common clock and combinational logic blocks whose inputs (outputs) are latch outputs (inputs). It is also assumed that the latches are edge-triggered, and that we have a CMOS or BiCMOS design technology in which the circuit draws no steady supply current. Therefore, the average power dissipation of the circuit can be broken down into (1) the power consumed by the latches and (2) that consumed by the combinational logic blocks. This provides a convenient way to decouple the problem and simplify the analysis. Finally, it is commonly accepted, in accordance with the results of [7], to consider only the charging/discharging current drawn by a logic gate, so that the short-circuit current during switching is neglected.
Consider Fig. 1. Whenever the clock triggers the latches, some of them make the transition to the on-state and draw power.
Another challenge has to do with independence, an issue that arises when signals are represented by probabilities. The reason for introducing probabilities is to solve the pattern-dependence problem. Instead of simulating the circuit for a large number of patterns and then averaging the result, one can simply compute (from the input pattern set, for instance) the fraction of cycles in which an input signal makes a transition (a probability measure). The designer can then use that information to estimate (somehow) how often internal nodes transition and, consequently, the power drawn by the circuit. Conceptually, this idea is shown in Fig. 2, which depicts both the conventional path of using circuit simulation and the alternative path of using probabilities. In a sense, one performs the averaging before, instead of after, running the analysis. Thus, a single run of a probabilistic analysis tool replaces a large number of circuit simulation runs, provided the designer can tolerate some loss of accuracy.
The results of the analysis will depend on the supplied probabilities. Thus, to some extent the process is still pattern-dependent. The user must supply information about the typical behavior at the circuit inputs in terms of probabilities. Since the user is not required to provide complete and specific information about the input signals, we call these approaches weakly pattern-dependent.
There are many ways of defining probability measures associated with the transitions made by a logic signal, whether it be at the primary inputs of the combinational block or at an internal node. We start with the following two definitions:
Definition 1 (signal probability): The signal probability P_s(x) at a node x is the average fraction of clock cycles in which the steady state value of x is a logic high.
Definition 2 (transition probability): The transition probability P_t(x) at a node x is the average fraction of clock cycles in which the steady state value of x is different from its initial value.
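As a hedged illustration of Definitions 1 and 2, the sketch below measures both probabilities from a recorded sequence of per-cycle steady state values. The function names and the sample trace are assumptions made for illustration; they do not come from any of the tools discussed in this article.

```python
def signal_probability(values):
    """Fraction of clock cycles whose steady-state value is logic high."""
    return sum(values) / len(values)


def transition_probability(values):
    """Fraction of cycles whose final value differs from the initial value.

    The initial value of a cycle is taken to be the steady-state value of
    the previous cycle, so a trace of N values yields N - 1 cycles.
    """
    changes = sum(a != b for a, b in zip(values, values[1:]))
    return changes / (len(values) - 1)


# Hypothetical steady-state values of one node over nine clock cycles.
trace = [0, 1, 1, 0, 1, 0, 0, 1, 1]
ps = signal_probability(trace)
pt = transition_probability(trace)
print(f"Ps = {ps:.3f}, Pt = {pt:.3f}")
```

Because only steady state values are recorded, neither number changes if the gate delays change, which is the property the definitions are meant to capture.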
[Table 1. Comparison of probabilistic techniques. Speed entries: Fast, Fast, Fast, Slow, Moderate.]
The signal probability is a relatively old concept that was first introduced to study circuit testability [9]. Note that both of these probability measures are unaffected by the circuit's internal delays. Indeed, they remain the same even if a zero-delay timing model is used. When this model is assumed, however, the toggle power is automatically excluded from the analysis. This is a serious shortcoming of some proposed techniques.
Consider, for example, a zero-delay model where the transition probabilities are computed. Then the power can be computed as:
$$P_{av} = \frac{1}{2T_c} V_{dd}^2 \sum_{i=1}^{n} C_i P_t(x_i) \qquad (1)$$
where V_dd is the supply voltage, T_c is the clock period, C_i is the total capacitance at node x_i, and n is the total number of nodes in the circuit. Since this assumes at most a single transition per clock cycle, it is actually a lower bound on the true average power.
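A minimal numerical sketch of this zero-delay estimate follows. It takes the power to be half the supply voltage squared, divided by the clock period, times the sum of each node capacitance weighted by its transition probability. The supply voltage, clock period, capacitances, and probabilities are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def zero_delay_power(vdd, clock_period, capacitances, transition_probs):
    """Average power assuming at most one transition per node per cycle."""
    switched = sum(c * pt for c, pt in zip(capacitances, transition_probs))
    return 0.5 * vdd ** 2 * switched / clock_period


# Hypothetical three-node circuit: 5 V supply, 20 ns clock period.
caps = [50e-15, 80e-15, 30e-15]   # node capacitances in farads
pts = [0.5, 0.25, 0.1]            # transition probabilities per cycle
power = zero_delay_power(5.0, 20e-9, caps, pts)
print(f"{power * 1e6:.2f} uW")
```

Any extra transitions a node makes within a cycle are invisible to this calculation, which is why the result should be read as a lower bound.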
Now consider the issue of signal independence. In practice, signals may be correlated so that, for instance, two of them may never be simultaneously high. It is computationally too expensive to compute these correlations, so the circuit input and internal nodes are usually assumed to be independent. We refer to this as the assumption of spatial independence. Another independence issue is whether the values of the same signal in two consecutive clock cycles are independent. If we assume them to be, then the transition probability can be easily obtained from the signal probability according to:
$$P_t(x) = 2 P_s(x) \left[ 1 - P_s(x) \right] \qquad (2)$$
We refer to this as the assumption of temporal independence.
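To see what the temporal independence assumption can cost, the sketch below compares the transition probability implied by independence, 2 P_s (1 - P_s), with the value actually measured on a trace. The clock-like trace is a deliberately extreme, hypothetical case in which consecutive values are fully correlated.

```python
def pt_from_ps(ps):
    """Transition probability implied by temporal independence."""
    return 2.0 * ps * (1.0 - ps)


# A clock-like signal toggles every cycle, so consecutive values are
# perfectly (negatively) correlated.
trace = [0, 1] * 50
ps = sum(trace) / len(trace)
measured_pt = sum(a != b for a, b in zip(trace, trace[1:])) / (len(trace) - 1)
print(ps, pt_from_ps(ps), measured_pt)
```

Here the signal probability is 0.5, so independence predicts a transition probability of 0.5, while the signal in fact switches in every cycle.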
Other recent power measures are based on the transition density formulation [10, 25]. The transition density at node x is defined as the average number of transitions per second at node x, denoted D(x). Formally:
Definition 3 (transition density): If a logic signal x(t) makes n_x(T) transitions in a time interval of length T, then the transition density of x(t) is defined as:
$$D(x) = \lim_{T \to \infty} \frac{n_x(T)}{T} \qquad (3)$$
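The density provides an effective measure of switching activity in logic circuits. If the density at every circuit node is made available, the overall average power dissipation in the circuit can be easily computed as:
$$P_{av} = \frac{1}{2} V_{dd}^2 \sum_{i=1}^{n} C_i D(x_i) \qquad (4)$$
where V_dd is the supply voltage and C_i is the total capacitance at node x_i.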
In a synchronous circuit with a clock period T_c, the relationship between transition density and transition probability is:
$$D(x) \geq \frac{P_t(x)}{T_c} \qquad (5)$$
where equality occurs in the zero-delay case. Thus, the transition probability can only give a lower bound on the transition density.
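The following sketch illustrates, under assumed numbers, how a glitch separates the transition density from the per-cycle transition probability. The event times, clock period, capacitance, and supply voltage are all hypothetical, and the density is estimated over a finite window rather than in the limit.

```python
def transition_density(event_times, duration):
    """Average number of transitions per second over the observed interval."""
    return len(event_times) / duration


def density_power(vdd, capacitances, densities):
    """Average power from node capacitances and transition densities."""
    return 0.5 * vdd ** 2 * sum(c * d for c, d in zip(capacitances, densities))


# Hypothetical node observed for 4 cycles of a 10 ns clock. In the second
# cycle it glitches (two extra transitions) before settling.
tc = 10e-9
events = [1e-9, 11e-9, 13e-9, 15e-9, 32e-9]   # transition times in seconds
density = transition_density(events, 4 * tc)

# Steady-state view: the value changes in cycles 1, 2 and 4 -> Pt = 3/4.
pt = 3 / 4
assert density >= pt / tc        # glitches only add transitions
power = density_power(5.0, [50e-15], [density])
print(density, pt / tc, power)
```

The two extra transitions in the second cycle are counted by the density but not by the transition probability, so a density-based power figure includes them while a zero-delay figure does not.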
Let P(x) denote the equilibrium probability [2] of a logic signal x(t), defined as the average fraction of time that the signal is high. Formally:
Definition 4 (equilibrium probability): If x(t) is a logic signal, then its equilibrium probability is defined as:
$$P(x) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{+T/2} x(t) \, dt \qquad (6)$$
In contrast to the signal probability, the equilibrium probability depends on the circuit internal delays since it describes the signal behavior over time, and not only its steady state behavior per clock cycle. In the zero-delay case, the equilibrium probability reduces to the signal probability.
Example BDD representation.
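As a rough illustration of the equilibrium probability, the sketch below computes the fraction of time a signal spends high over a finite observation window, which only approximates the long-run limit in the definition. The waveform representation (an initial value plus a sorted list of toggle times) is an assumption made for this example.

```python
def equilibrium_probability(initial_value, event_times, duration):
    """Fraction of the window [0, duration] during which the signal is high.

    The signal starts at initial_value and toggles at each time in
    event_times (sorted, in seconds).
    """
    high_time = 0.0
    value = initial_value
    previous = 0.0
    for t in list(event_times) + [duration]:
        if value:
            high_time += t - previous
        value = 1 - value
        previous = t
    return high_time / duration


# Hypothetical waveform: low at t = 0, toggling at 2, 5 and 9 ns, 10 ns window.
p = equilibrium_probability(0, [2e-9, 5e-9, 9e-9], 10e-9)
print(p)
```

Shifting the toggle times, as a change in gate delays would, changes the result even when the per-cycle steady state values stay the same.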
Evaluating the Techniques
Probabilistic Techniques
As previously mentioned, several power estimation approaches have been proposed that use probabilities in order to solve the pattern-dependence problem. In practice, all are applicable only to combinational circuits and require the user to specify typical behavior at the combinational circuit inputs. We will compare and contrast these techniques based on six criteria: (1) do they include consideration of the toggle power; (2) do they handle temporal correlation; (3) how complex is the required input specification; (4) do they provide the power consumed by individual gates; (5) do they handle spatial correlation; and (6) speed. We will discuss five different probabilistic approaches, for which the comparisons are shown in Table 1.
In the BDD of Fig. 3, the Boolean variables x_i are ordered, and each level in the BDD corresponds to a single variable. Each level may contain one or more BDD nodes at which one can branch in one of two directions, depending on the value of the relevant variable. For example, suppose that x_1 = 1, x_2 = 0, and x_3 = 1. To evaluate y, we start at the top node and branch to the right since x_1 = 1, then branch to the left since x_2 = 0, and finally branch to the right since x_3 = 1 to reach the terminal node "1." Thus, the corresponding value of y is 1.
In general, let y = f(x_1, ..., x_n). If f_{x_1} = f(1, x_2, ..., x_n) and f_{\bar{x}_1} = f(0, x_2, ..., x_n) are the cofactors of f with respect to x_1, then:
$$P(y) = P(x_1) P(f_{x_1}) + P(\bar{x}_1) P(f_{\bar{x}_1}) \qquad (7)$$
This equation shows how the BDD can be used to evaluate P(y). The two nodes that are descendants of y in the BDD correspond to the cofactors of f. The probability of the cofactors can then be expressed in the same way, in terms of their descendants. Thus a depth-first traversal of the BDD, with a post-order evaluation of P(·) at every node, is all that is required. This can be implemented using the "scan" function of the BDD package [36].
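The traversal can be sketched in a few lines. The code below is an illustrative stand-in, not the "scan" function of any BDD package: the diagram is a hand-built nested tuple, the function y = x1*x2 + x3 is a hypothetical example rather than the function of Fig. 3, and the input variables are assumed independent.

```python
from functools import lru_cache

# A BDD node is the terminal 0 or 1, or a tuple (variable, low, high) where
# low/high are the cofactors for variable = 0 / variable = 1.
X3 = ("x3", 0, 1)
X2 = ("x2", X3, 1)
Y = ("x1", X3, X2)          # hypothetical function y = x1*x2 + x3


def signal_probability(node, input_probs):
    """P(node = 1), assuming the input variables are independent."""
    @lru_cache(maxsize=None)
    def visit(n):
        if n in (0, 1):
            return float(n)
        var, low, high = n
        p = input_probs[var]
        # P(y) = P(x) * P(f | x = 1) + (1 - P(x)) * P(f | x = 0)
        return p * visit(high) + (1.0 - p) * visit(low)
    return visit(node)


probs = {"x1": 0.5, "x2": 0.5, "x3": 0.5}
print(signal_probability(Y, probs))
```

The cache ensures that a shared node such as X3 is evaluated once, which is what keeps the cost proportional to the number of BDD nodes rather than to the number of input combinations.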
Probabilistic Simulation (CREST)
This approach [20-22] requires the user to specify typical signal behavior at the circuit inputs using probability waveforms. A probability waveform is a sequence of values indicating the probability that the signal is high for certain time intervals, and the probability that it makes low-to-high transitions at specific time points. The transition times themselves are not random. This allows the computation of the average, as well as the variance, of the current waveforms drawn by the individual gates in the design in one simulation run. These average current waveforms can then be used to compute the average power dissipated in each gate, as well as the total average power.
Circuits & Devices
The Historical Perspective
The earliest proposed techniques for estimating power dissipation were strongly pattern-dependent and based on circuit simulation [11, 12]. One would simulate the circuit while monitoring the supply voltage and current waveforms, which are subsequently used to compute the average power. In addition to being strongly pattern-dependent, these techniques were too slow to be used on large circuits, which is where high power dissipation is a problem.
In order to improve performance, other simulation-based techniques were also proposed that were based on various kinds of timing, switch-level, and logic simulation [13-18]. These techniques generally assume that the power supply and ground voltages are fixed, and only the supply current waveform is estimated. While they are indeed more efficient than traditional circuit simulation, at the cost of some loss in accuracy, they remain strongly pattern-dependent.
In order to overcome the shortcomings of simulation-based techniques, other specialized approaches have been proposed, whose focus has been on combinational digital CMOS circuits embedded in a synchronous design environment. The use of probabilities to estimate power was first proposed in [19]. In this work, a zero-delay model was assumed and a temporal independence assumption was made so that the transition probabilities could be estimated using signal probabilities based on Eq. 2. Signal probabilities supplied by the user at the primary inputs are propagated into the circuit assuming spatial independence, and the power is computed based on Eq. 1. Since a zero-delay model was used, the toggle power was ignored.
A probabilistic power estimation approach that does compute the toggle power and does not make the zero-delay or temporal independence assumptions, called probabilistic simulation, was proposed in [20-22]. In this technique, the use of probabilities was expanded to allow the specification of probability waveforms. This approach assumed spatial independence, and was not restricted to synchronous circuits. Improvements on this technique were proposed in [23, 24], where the accuracy and the correlation handling were improved.
Another probabilistic approach was proposed in [25-27], where the transition density measure of circuit activity was introduced. An algorithm was also presented for propagating the transition density into the circuit. This approach does not make a zero-delay assumption and makes only the spatial independence assumption.
Yet another probabilistic approach was presented in [28], where binary decision diagrams (BDDs) [35] were used to take into account internal node correlations and toggle power, at the cost of increased computational effort. This approach can become computationally expensive, especially for circuits where toggle power is dominant.
In the above approaches, probabilistic information is directly propagated into the circuit. To perform this, special models for circuit blocks (gates) must be developed and stored in the cell library. In contrast, other techniques referred to
An example of a probability waveform is shown in Fig. 4. In this example, the signal is high with probability 0.5. It then transitions low-to-high with probability 0.2 at t1, and the probability of being high is 0.25 between t1 and t2, and so on. At every transition point, the signal may also make a high-to-low transition, the probability of which can be computed from the other probabilities specified in the waveform. Notice that at t1, the probability of 0.2 is not equal to the product of 0.5 and 0.25, which illustrates that temporal independence is not assumed. Such waveforms at the primary inputs are propagated into the circuit, and the corresponding probability waveforms are computed at all the nodes. The propagation algorithm is very similar to event-driven logic simulation with an assignable delay model. The only difference is that the simulation algorithm and simulation model for each gate deal with the probability of making a transition rather than the definite occurrence of a transition. The events are propagated one at a time, using an event-queue-based mechanism. Whenever an event occurs at the input to a gate, the gate makes a contribution to the overall average current that is being estimated, and generates an output event that is scheduled after some time delay. In the original implementation of CREST, a transistor-level netlist was used to compute the average current pulse and delay of every gate. The same can be achieved using gate-level models, provided they are precharacterized to estimate the current pulse and delay.
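The remark that the high-to-low probability follows from the other waveform values can be checked with a short calculation. The sketch assumes that 0.5 is the probability of being high just before t1 and 0.25 the probability of being high just after it; the function name is hypothetical and this is not CREST code.

```python
def high_to_low_probability(p_high_before, p_high_after, p_low_to_high):
    """P(high-to-low) at a transition point of a probability waveform.

    Conservation of probability gives
    p_high_after = p_high_before + p_low_to_high - p_high_to_low.
    """
    return p_high_before + p_low_to_high - p_high_after


# Values quoted for the first transition point of the example waveform.
p_hl = high_to_low_probability(0.5, 0.25, 0.2)
print(round(p_hl, 3))

# Under temporal independence the low-to-high probability would instead be
# P(low before) * P(high after).
independent_lh = (1 - 0.5) * 0.25
print(independent_lh)
```

Under these assumptions the high-to-low probability at t1 comes out as 0.45, and the independence-based low-to-high value of 0.125 differs from the specified 0.2, matching the observation in the text.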
Transition Density (DENSIM)
The average number of transitions per second at a node in the circuit has been called the transition density in [25-27], where an efficient algorithm is presented to propagate the density values from the inputs throughout the circuit. This was implemented in the program DENSIM, for which the required input specification is a pair of numbers for every input node, namely the equilibrium probability and the transition density. In this case, both signal values and signal transition times are random.
To see how the propagation algorithm works, recall the concept of Boolean difference: if y is a Boolean function that depends on x, then the Boolean difference of y with respect to x is defined as:
$$\frac{\partial y}{\partial x} = y|_{x=1} \oplus y|_{x=0} \qquad (8)$$
where ⊕ denotes the exclusive-or operation.
Simple test case circuit.
lay of every gate is a specified fixed constant. To illustrate, consider the circuit in Fig. 5 
The simplicity of this expression allows excellent CAD implementations. Given the probability and density values at the primary inputs of a logic circuit, a single pass over the circuit, using Eq. 9, gives the density at every node. In order to compute the Boolean difference probabilities, one must also propagate the equilibrium probabilities P(x) from the primary inputs throughout the circuit, using the same BDD algorithm for signal probability propagation described above.
As an example, consider the simple case of a 2-input logic AND gate: y = x_1 x_2. In this case, ∂y/∂x_1 = x_2 and ∂y/∂x_2 = x_1, so that:
$$D(y) = P(x_2) D(x_1) + P(x_1) D(x_2) \qquad (10)$$
In more complex cases, where f is a general Boolean function, binary decision diagrams can be used [25] to compute the Boolean difference probabilities.
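A small sketch of density propagation through a single gate is given below. It assumes the propagation rule has the form D(y) = sum over i of P(∂y/∂x_i) D(x_i) with independent inputs, which for a 2-input AND gate gives D(y) = P(x_2) D(x_1) + P(x_1) D(x_2). The Boolean difference probabilities are found by brute-force enumeration rather than with BDDs, so this is only practical for gates with a few inputs; all names and numbers are hypothetical and this is not DENSIM code.

```python
from itertools import product


def boolean_difference_prob(func, n_inputs, index, probs):
    """P(dy/dx_index = 1) by enumerating the other (independent) inputs."""
    total = 0.0
    others = [i for i in range(n_inputs) if i != index]
    for values in product([0, 1], repeat=len(others)):
        bits = [0] * n_inputs
        weight = 1.0
        for i, v in zip(others, values):
            bits[i] = v
            weight *= probs[i] if v else 1.0 - probs[i]
        bits[index] = 0
        y0 = func(*bits)
        bits[index] = 1
        y1 = func(*bits)
        if y0 != y1:              # output is sensitive to x_index
            total += weight
    return total


def output_density(func, probs, densities):
    """D(y) = sum_i P(dy/dx_i) * D(x_i), assuming independent inputs."""
    n = len(probs)
    return sum(boolean_difference_prob(func, n, i, probs) * densities[i]
               for i in range(n))


# 2-input AND gate with hypothetical input statistics.
and_gate = lambda a, b: a & b
d_y = output_density(and_gate, probs=[0.5, 0.25], densities=[2e6, 4e6])
print(d_y)
```

With these inputs the result is 0.25 * 2e6 + 0.5 * 4e6 = 2.5e6 transitions per second, in agreement with the AND-gate expression.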
Using a BDD
The technique proposed in [28] attempts to handle both spatial and temporal correlations by using a BDD to represent the suc-
Thus, in the above example, the probability that the first transition of z occurs is P(z(1) ⊕ z(2)) and the probability that the second transition occurs is P(z(2) ⊕ z(3)).
Once these XOR functions have been constructed, both of these probabilities can be computed from the BDD. The expected number of transitions at z in a clock cycle is, therefore, E[n_z(T_c)] = P(z(1) ⊕ z(2)) + P(z(2) ⊕ z(3)), and the transition density at z is D(z) = E[n_z(T_c)]/T_c.
Using a BDD to perform these tasks implicitly means that the BDD variables are assumed independent. In the above example, this means that the x_1 and x_2 terms are independent. Thus, while some temporal correlation between z(1) and z(2) is taken care of, no temporal correlation between y(1) and y(2) is possible. The reason is that temporal and spatial independence are effectively assumed at the primary inputs. Hence the use of the qualified term "internally" in Table 1.
Statistical Techniques
The idea behind these techniques is quite simple and appealing: simulate the circuit repeatedly, using some timing or logic simulator, while monitoring the power being consumed. Eventually, the power will converge to the average power, based on (3) and (4). The issues are how to select the input patterns to be applied in the simulations and how to decide when the measured power has converged close enough to the true average power. Normally, the inputs are randomly generated and statistical mean estimation techniques [38] are used to decide when to stop, which is essentially a Monte Carlo method. We will review the two main approaches that have been proposed, whose characteristics are compared in
Therefore, for a desired percentage error in the power estimate, and for a given confidence level (1 - α), we must simulate the circuit until This means that the number of required simulations is:
In practice, this technique was found to be very efficient. Typically, as few as 10 vectors may be enough to estimate the power of a large circuit with thousands of gates. But perhaps the most useful feature of this technique is that the user can specify the required accuracy and confidence level up-front.
Thus, it retains the accuracy of deterministic simulation-based approaches, while achieving speeds comparable to probabilistic techniques. It also does not require an assumption of independence for internal nodes; it only requires the primary inputs to be independent.
Perhaps the only disadvantage of this approach is that, while it provides an accurate estimate of the total power, it does not provide the power consumed by individual gates or small groups of gates. It would take many more transitions to estimate (with the same accuracy) the power of individual gates, because some gates may switch very infrequently.
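The flavor of such a Monte Carlo loop can be sketched as follows. The stopping rule used here is the textbook normal-approximation rule (stop once the confidence-interval half-width, relative to the sample mean, falls below the requested error); the published method may use a different statistic, such as a t-distribution. The toy sampling function merely stands in for a simulator run, and every name and number is an assumption.

```python
import random
from statistics import NormalDist, mean, stdev


def monte_carlo_power(sample_power, max_error=0.05, confidence=0.99,
                      min_samples=30, max_samples=100_000):
    """Draw power samples until the relative half-width of the confidence
    interval for the mean drops below max_error."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    samples = [sample_power() for _ in range(min_samples)]
    while len(samples) < max_samples:
        avg, dev = mean(samples), stdev(samples)
        if avg > 0 and z * dev / (avg * len(samples) ** 0.5) < max_error:
            break
        samples.append(sample_power())
    return mean(samples), len(samples)


# Toy stand-in for a simulator run: power measured over one interval in
# which each of 200 nodes switches independently with probability 0.3.
rng = random.Random(1)
def toy_sample():
    return sum(1e-6 for _ in range(200) if rng.random() < 0.3)

estimate, n_used = monte_carlo_power(toy_sample)
print(f"{estimate * 1e6:.1f} uW after {n_used} samples")
```

Because the total power is a sum over many nodes, its relative spread from sample to sample is small, so the loop stops after a few dozen samples. The same loop applied to a single rarely switching gate would need far more samples, which is the limitation noted above.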
Power of Individual Gates (MED)
This recent technique [32] is a modification of the McPower approach that provides both the total and individual-gate power estimates, with user-specified accuracy and confidence. One reason why designers estimate the power consumed by individual gates is to diagnose any high power problem and determine which part of the circuit consumes the most power. Estimating gate power is essentially equivalent to estimating the transition density at every node. Indeed, the implementation of this technique in the program MED provides the transition density at every gate output node, in addition to the total power. These density values can then be used to estimate circuit reliability [25].
Thus, $\eta_{min}\epsilon$ becomes an absolute error bound that characterizes the accuracy for low-density nodes. Although the percentage error for low-density nodes sharply increases as $\eta \to 0$, the absolute error remains relatively fixed. In fact, the absolute error bounds for low-density nodes are always less than the absolute error bounds for other nodes. Although these nodes require the longest time to converge, they have the least effect on circuit power and reliability. Therefore the above strategy reduces the execution time, with little or no penalty.
A weakness of this approach may be its speed (currently, a circuit with 16,000 gates requires about 2 hours on a SUN Sparc ELC). Further development may improve this performance.
Sequential Circuits
The main shortcoming of the above techniques is that they do not apply to sequential circuits. While the CREST approach can be used to simulate a circuit with feedback, the resulting loss of accuracy due to the independence assumption, especially when recursively applied in a feedback loop, renders the results quite meaningless. As for [28], although the title includes "sequential circuits," the authors assume that all states are equally probable, which is not true in practice.
There is one other technique to find the power dissipation in sequential circuits [33], but the proposed approach is too expensive because it exhaustively enumerates the circuit input states. The author proposes a heuristic in which this enumeration is not carried to completion, but does not provide any systematic way of deciding when to stop enumerating. Instead, the process is stopped at an arbitrary point.
Thus, the question of computing the latch output probabilities and densities directly from the sequential machine structure is still an open problem. We recommend that the user perform a long high level (RTL) simulation of the circuit to measure the required statistics at the latch output (with some confidence) and then apply one of the above methods to the combinational blocks based on that information.
Summary and Conclusions
Power estimation tools are required to manage the power consumption of modern VLSI designs during the design phase, so as to avoid a costly redesign process. Since average power dissipation is directly related to the average switching activity inside a circuit, it would not make sense to expect to estimate power without some information about the circuit input patterns. Yet this is what one would like to do in order to qualify a chip with a certain power rating that holds irrespective of the application. We have presented a number of power estimation techniques that are designed to alleviate this strong pattern-dependence problem.
It turns out that these techniques are weakly pattern-dependent, since the user is expected to supply some information on the typical behavior at the circuit inputs. This information is usually in the form of probability (average fraction of time that a signal is high) and density (average number of transitions per second). This information is usually much more readily available to designers than specific input patterns are. It is relatively easy for a designer to estimate average input frequencies, for example, by looking at test vector sets, or simply by assuming some nominal average frequency based on the clock frequency. The proposed techniques are effective ways of using this information to find the circuit power.
All these techniques use simplified delay models, so that they do not provide the same accuracy as, say, circuit simulation. But they are fast, which is very important because one is usually interested in the power dissipation of large designs. Within the limitations of the simplified delay models, some of these techniques, e.g., the statistical techniques, can be very accurate. In fact, the desired accuracy can be specified up-front. The other class of techniques, i.e., the probabilistic techniques, are not as accurate but can be faster. Two of the proposed probabilistic techniques use BDDs and achieve very good accuracy, but they can be slow and may not be feasible for larger circuits.
From an implementation standpoint, one major difference between probabilistic and statistical techniques is that statistical techniques can be built around existing simulation tools and libraries, while probabilistic techniques cannot. Typically, probabilistic techniques require specialized simulation models. In general, it is not clear that any one approach is best in all cases, but we feel that the second statistical approach (MED) offers a good mix of accuracy, speed, and ease of implementation. It may be that a combination of the different techniques can be used for different circuit blocks. 
