Abstract - A power estimation approach is presented in which blocks of consecutive vectors are selected at random from a user-supplied realistic input vector set and the circuit is simulated for each block starting from an unknown state. This leads to two (upper and lower) bounds on the desired power value which can be quite tight (under 10% difference between the two in many cases). As a result, the power dissipation is obtained by simulating only a fraction of the potentially very large vector set.
I. INTRODUCTION
Maximizing circuit speed and minimizing chip area used to be the only major concerns of VLSI designers. In recent years, power consumption of integrated circuits (ICs) has proved to be just as important a concern. Thus, VLSI designs nowadays emerge as a trade-off among three goals: minimum area, maximum speed, and minimum power dissipation.
Power dissipation is a major concern of the semiconductor industry. This is because excessive power dissipation causes overheating, which may lead to soft errors or permanent damage. It also limits battery life in portable equipment. Thus, there is a need to accurately estimate the power dissipation of an IC during the design phase. We should note that by power estimation we refer to the problem of average power estimation. This is different from the estimation of the worst case instantaneous power. Chip reliability and equipment lifetime are directly related to the average power.
Several approaches have been proposed for power estimation [1], especially for estimation at the gate-level. However, even at the gate-level, the problem is not yet completely solved. At least two open problems remain: 1. Accurate and fast estimation of the average power dissipated by individual gates, typically inside an optimization loop, and 2. Accurate and fast estimation of the total average power dissipation in large sequential circuits. The words "accurate" and "fast" are emphasized in both cases to indicate that existing techniques are either inaccurate and fast or accurate and slow. The fact that the first problem is not yet solved has been clearly illustrated in [2]. In this paper, we will argue and demonstrate that the second problem is also still open, and we offer a new method which provides accurate and fast estimation of the total average power of large sequential circuits.
Since the power is pattern-dependent, the average power dissipation of a circuit is not well-defined until a specific vector set is chosen. For combinational circuits, this may not be very critical, because different vector sets may dissipate approximately the same power, provided they have approximately equal values of switching activity. Thus, using a set of randomly generated vectors (with the right statistics) may be appropriate for these circuits. However, this does not hold for sequential circuits, because a real vector set (as opposed to a randomly generated, artificial vector set) may contain specific vector sequences that put the circuit in specific operational modes or sub-spaces of its large state space and, in different operational modes, the circuit may dissipate quite different values of power. All one has to do is think of all the many different operational modes of a large micro-processor. Thus, for sequential circuits, the power may be critically dependent on the specific vector sequences that occur during typical operation.
Most existing techniques of power estimation consider simply the average switching activity and signal probability of the input signals and use either static probability propagation methods [3] [4] [5] [6] or dynamic Monte Carlo simulation using randomly generated vectors [7, 8]. In either case, one runs the risk of taking the circuit into parts of its state space where it does not belong, i.e., into modes of operation that are unrealistic and may never be exercised in practice. When this happens, there is no guarantee that the estimated power has any relation to what the circuit will actually dissipate under typical operation.
To illustrate this problem, we have considered a number of sequential circuits and constructed two sets of input vectors for each. Both sets of vectors have the same switching activity and signal probability for each input node. However, in one vector set, the input signals were generated at random, without any correlation between them, and in the other non-zero correlations were considered, both in space (between pairs of bits in the same vector) and in time (between pairs of consecutive vectors). The intention is that these correlations would mimic to some degree the relationships that typically exist between signals, such as signals resulting from decoded instructions or general control signals. Note that these correlations are only the simplest kinds of correlation relations, because they do not model the temporal correlations that can exist in vector streams over several clock cycles. We emphasize this point to indicate that proposed approaches that use correlation coefficients [9] may be able to handle pairwise correlations between bits in a vector or between consecutive vectors, but cannot handle the variety of other input signal relations that can exist in sequential circuits. In other words, although sequence compaction methods [9] replace the realistic, long vector set with a smaller vector set that satisfies similar statistics, they still run the risk of taking the sequential circuit into illegal states. This is because they might introduce new vectors or vector sequences that take the sequential circuit into illegal states and thus result in wrong power estimates. Even with just these simple correlations applied, big differences are possible in the resulting power values, as shown in Table I where Pwr(uc) refers to power dissipated under the uncorrelated vector set and Pwr(co) is the power due to the correlated vector set. The error (Err) is measured as the difference between the two power values, divided by the power due to the correlated set.
Note that the power in both cases (uncorrelated and correlated inputs) was measured using the same simulator, so that the errors are due only to the presence of the additional correlations in one vector set but not in the other. It would seem, therefore, that the only truly accurate power estimation method for sequential circuits is to simulate the circuit for a specific realistic and typical vector set. We refer to such a vector set as the power vector set. If one has a power vector set which is short enough to simulate in its entirety, then this would certainly be the method of choice. However, in practice it is very hard (almost impossible) to specify a power vector set which is both short enough for simulation and long enough to cover all the interesting operational modes of a large sequential circuit. Micro-processor designers will usually agree that millions of vectors may be needed in order to satisfactorily exercise their large designs.
To solve this problem, we propose a method of power estimation that takes a (potentially very long) power vector set and provides an estimate of the total power by simulating only a fraction of the vector set. The vectors to be simulated are selected by repeatedly choosing blocks of consecutive vectors at random, until certain accuracy criteria are met. We call this a block-sampling approach. From the repeated simulations of the blocks, we collect statistics on the mean upper bound and mean lower bound for the power per block. Using standard Monte Carlo mean estimation techniques, the two means can be estimated with user-specified accuracy and confidence without having to simulate all blocks. The net effect is that only a fraction of the total vector set is simulated and accurate tight bounds on the total power are estimated, yielding a viable accurate power measure.
II. PROBLEM FORMULATION
Let $u_1, u_2, \ldots, u_m$ be the primary input nodes of a sequential logic circuit and let $x_1, x_2, \ldots, x_n$ be the present state lines. For simplicity of presentation, we assume that the circuit contains a single clock that drives a bank of edge-triggered flip-flops. On the falling edge of the clock, the flip-flops transfer the values at their inputs to their outputs. The inputs $u_i(k)$ and the present state values $x_i(k)$ determine the next state values $x_i(k+1)$ and the circuit outputs, where $k$ denotes the clock cycle, so that the circuit implements a finite state machine (FSM).
Suppose a power vector set is provided which consists of the input vectors $U(1), U(2), \ldots, U(M)$, where
$$U(k) = [u_1(k), u_2(k), \ldots, u_m(k)]$$
is the input vector applied during cycle $k$, and $M$ is the total number of vectors in the vector set. We assume that the initial state vector $X(1)$ is well-defined, so that there exists a well-defined resulting sequence of state vectors $X(1), X(2), \ldots, X(M)$, where
$$X(k) = [x_1(k), x_2(k), \ldots, x_n(k)].$$
The initial state need not be known; it only needs to be well-defined, i.e., not arbitrary or variable, in order for the power (due to this vector set) to be well-defined.
The total energy dissipated in the circuit in the $k$th cycle, denoted $e(k)$, is a function of $X(k-1)$, $X(k)$, $U(k-1)$, and $U(k)$. For $e(1)$, because $X(0)$ and $U(0)$ are not defined, we arbitrarily define $e(1) = 0$. Over a block of $K$ consecutive input vectors, starting at cycle $i$, the average power dissipated is (where $T$ is the clock period):
$$P_K(i) = \frac{1}{KT} \sum_{k=i}^{i+K-1} e(k) \tag{1}$$
If K is a constant, the same for any i, then the total power dissipation P (over the whole vector set) is given by:
The second equality is true because for any given cycle $k = k_0$, the energy $e(k_0)$ due to that cycle will occur in $K$ different blocks and therefore will be part of $K$ different terms $P_K(i)$. Note that for the last $K - 1$ blocks, for which $i > M - K + 1$, we take $e(k) = 0$ for $k > M$.
The same applies to the first $K - 1$ blocks, for which $e(k) = 0$ for $k = -K + 2, \ldots, 0$. This is required in order for the average power per block to be equal to the average power per cycle, leading to (2). If we now consider a probability experiment in which a block of vectors is chosen at random from the power vector set so that all blocks are equi-probable, then the average power per block becomes a random variable, denoted $\mathbf{P}_K$, which takes values in the set
$$\{P_K(-K+2), P_K(-K+3), \ldots, P_K(M)\}.$$
We will use bold font to denote random quantities. From (2), it becomes clear that the total power is the following mean or expected value:
$$P = E[\mathbf{P}_K] \tag{3}$$
where $E[\cdot]$ denotes the expected value operator. If $P^u_K(i)$ and $P^l_K(i)$ are upper and lower bounds on $P_K(i)$, respectively, then we can also talk about the random variables $\mathbf{P}^u_K$ (random upper-bound value) and $\mathbf{P}^l_K$ (random lower-bound value), so that $\mathbf{P}^l_K \le \mathbf{P}_K \le \mathbf{P}^u_K$, which leads to:
$$E[\mathbf{P}^l_K] \le P \le E[\mathbf{P}^u_K] \tag{4}$$
In the next section, we propose a practical method for estimating the two bounds in (4).
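The counting argument used above (each cycle belongs to exactly $K$ blocks once the ends are padded with zero-energy cycles) can be checked numerically. The sketch below is only an illustration: the per-cycle energies and the sizes `M`, `K`, `T` are made-up toy values, and `block_power` is a hypothetical helper name.

```python
def block_power(e, i, K, T):
    """Average power of the K-cycle block starting at cycle i.

    e maps cycle number -> energy; cycles outside the vector set
    (k < 1 or k > M) are padded with zero energy, as in the text.
    """
    return sum(e.get(k, 0.0) for k in range(i, i + K)) / (K * T)


M, K, T = 12, 4, 1.0                      # toy sizes, not from the paper
e = {k: float(k % 5) for k in range(1, M + 1)}
e[1] = 0.0                                # e(1) is defined as 0
starts = range(-K + 2, M + 1)             # every admissible block start

# Number of blocks that contain each cycle k = 1..M.
coverage = {k: sum(1 for i in starts if i <= k < i + K) for k in e}

# Summing the block powers counts every e(k) exactly K times, and the
# 1/K in each block average cancels that factor.
sum_of_block_powers = sum(block_power(e, i, K, T) for i in starts)
energy_per_period = sum(e.values()) / T
```

With these toy values every entry of `coverage` equals `K`, and `sum_of_block_powers` matches `energy_per_period` up to rounding.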
III. SINGLE BLOCK POWER ANALYSIS
If it were possible to obtain sample values of $P_K(i)$ for a sufficient number of values $i$, it would then be possible to estimate $P$ based on (3) to any desired accuracy (with some specified confidence) using traditional statistical methods of mean estimation. However, since the FSM state at the start of a block is unknown, this cannot be done. Instead, our approach is based on (4) and involves using mean-estimation techniques to find two bounds on the unknown power value.
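Briefly stated, we make $N$ random choices for the block start index $i$ (let these constitute a set of indices $I$) from which we compute by simulation $N$ sample values of each of the random variables $\mathbf{P}^u_K$ and $\mathbf{P}^l_K$. We then compute the two means:
$$\frac{1}{N}\sum_{i \in I} P^u_K(i) \quad \text{and} \quad \frac{1}{N}\sum_{i \in I} P^l_K(i) \tag{5}$$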
which we can use as bounds on the desired power value $P$, based on (4). It remains to describe how to perform the simulation in order to obtain $P^u_K(i)$ and $P^l_K(i)$, and discuss the behavior of $P^u_K(i)$ and $P^l_K(i)$ as a function of the vectors simulated. Furthermore, we need to describe how we choose values for $K$ and $N$. These topics are covered below and in the next section.
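The sampling step just outlined can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_block_bounds` is a hypothetical stand-in for the three-valued block simulation (it simply treats the first `warmup` cycles of every block as uncertain), and the energies, `warmup`, and `e_max` are invented values.

```python
import random


def toy_block_bounds(i, K, energies, warmup=3, e_max=2.0, T=1.0):
    """Hypothetical stand-in for simulating one block from an all-X state.

    Returns (lower, upper) bounds on the block's average power.
    """
    lo = hi = 0.0
    for offset in range(K):
        k = i + offset
        if not 1 <= k <= len(energies):
            continue                      # padded cycle: zero energy
        if offset < warmup:
            hi += e_max                   # state unknown: worst case for the upper bound
        else:
            lo += energies[k - 1]         # state known: both bounds see the true energy
            hi += energies[k - 1]
    return lo / (K * T), hi / (K * T)


def sample_mean_bounds(energies, K, N, rng):
    """Average the block bounds over N block starts drawn uniformly at random."""
    M = len(energies)
    lows, highs = [], []
    for _ in range(N):
        i = rng.randint(-K + 2, M)        # uniform over all block starts
        lo, hi = toy_block_bounds(i, K, energies)
        lows.append(lo)
        highs.append(hi)
    return sum(lows) / N, sum(highs) / N


rng = random.Random(1)
energies = [rng.uniform(0.0, 2.0) for _ in range(2000)]  # made-up per-cycle energies
lower, upper = sample_mean_bounds(energies, K=50, N=200, rng=rng)
```

Here `N` is fixed in advance for simplicity; in the method itself the number of samples is decided by a stopping criterion.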
A. Block Simulation
The simulation of a block of vectors is complicated by the fact that the state of the FSM at the beginning of that block is not known. Any wrong choice made for the state at that time can have the effect that the simulation of this block takes the FSM into states that never occur in practice. Therefore, we set the FSM to an all-X state (all state bits are in the unknown state) and perform three-valued gate-level simulation, with the values (0, 1, X). During the simulation of the block, we compute two bounds on the power due to that block. The upper (lower) bound is found by assuming that every signal transition containing an X value actually occurs with the X replaced by either a 0 or a 1, whichever leads to the larger (smaller) power dissipation for that transition. For instance, if the output of a gate makes an X → 1 transition, then it is assumed to be a 0 → 1 transition for purposes of computing the upper bound and a 1 → 1 transition for purposes of computing the lower bound. For purposes of continuing the simulation, the transition is kept as X → 1. Likewise, when the output of a gate makes an X → X transition, it is assumed to be a 0 → 1 (or 1 → 0) transition for purposes of computing the upper bound and a 0 → 0 (or 1 → 1) transition for purposes of computing the lower bound. For continuing the simulation, it is kept as an X → X transition. In this way, the true (unknown) signals in the circuit are guaranteed to be subsets of the simulated signals, and the true power for that block is guaranteed to be between the two resulting bounds $P^u_K(i)$ and $P^l_K(i)$. The reason that this method can be useful in practice is that in many cases, many of the X values become definite 0 or 1 values during three-valued simulation. In fact, we have found that sufficiently many X values become known that the two bounds resulting from the simulation of one vector block can be very close, close enough to constitute a viable measure of power.
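The rule for interpreting transitions that involve an X can be written as a small function. The sketch below is illustrative only; the function names and the string encoding of values ('0', '1', 'X') are assumptions made for this example.

```python
def transition_bounds(old, new):
    """Return (lower, upper) transition counts for one three-valued transition.

    Any transition touching an X may or may not be a real toggle, so it
    counts 0 toward the lower bound and 1 toward the upper bound.
    """
    if old == 'X' or new == 'X':
        return 0, 1
    return (1, 1) if old != new else (0, 0)


def count_bounds(waveform):
    """Accumulate transition-count bounds over a sequence of node values."""
    n_low = n_high = 0
    for old, new in zip(waveform, waveform[1:]):
        lo, hi = transition_bounds(old, new)
        n_low += lo
        n_high += hi
    return n_low, n_high


bounds = count_bounds("XX1010")   # X->X, X->1, then three definite toggles
```

The simulated value itself is left untouched; only the counters interpret the X optimistically or pessimistically.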
A related issue of importance at this point is the initializability of circuits. A circuit is said to be functionally initializable if, once implemented, it can always be initialized to a definite state. On the other hand, a circuit is said to be logically initializable if, when started from an all-X state (unknown initial state), there exists a vector sequence that can drive it into a definite state using three-valued logic simulation.
For logically initializable circuits, we have observed that simulating a few vectors typically takes the circuit from an unknown initial state (all Xs) to a known state. The word few is emphasized to indicate that although the actual number of vectors varies as a function of both the circuit and the vector stream simulated, it is typically a small fraction of the total vector stream.
We verified the above observation for all the circuits of the ISCAS-89 [12] benchmark circuits which are known to be logically initializable as given by [13]. This is illustrated with the histogram shown in Fig. 1. We will refer to the simulation of a specific circuit for a specific vector stream as one test case. Also, we will refer to the number of vectors simulated before the state of the circuit becomes known as the length of an initializing sequence. The histogram shows that of the 600 test cases (20 circuits, each simulated for 30 vector streams), 573 had an initializing sequence of less than 10 vectors and only 8 required an initializing sequence larger than 50 (these eight cases, which are not shown in the figure, are: 51, 55, 59, 73, 96, 109, 109, and 267). This leads to our claim that typically, simulating a logically initializable circuit for a few vectors is enough to take the circuit from an unknown initial state to a known state. We should point out that any type of simulation model may be used; the measured power will be as accurate as the simulation model. Because we are computing the total power of the circuit (and not the powers of individual gates), we find that a logic simulator with a good timing model is sufficient. In our implementation, every gate has a scalable delay value, depending on the output loading capacitance due to its drain capacitance and the MOSFET gate capacitance of the logic gates on the fanout branches. Although three-valued logic simulation is usually associated with zero-delay simulation, our simulator is actually a three-valued, event-driven, scalable-delay logic simulator. It is three-valued to account for unknown logic values since the initial state of the sequential circuit is unknown. Furthermore, it is event-driven and uses scalable delay (so that different gates can have different delays), so that the estimated power includes the power due to glitches.
Hence, if for two consecutive input vectors, one input of an AND gate is logic 1 while the other input undergoes an X → 1 transition at time $t$, then the output of the AND gate will undergo an X → 1 transition at time $t + D$, where $D$ is the delay of the gate. Then X is assumed to be 0 or 1 for purposes of computing the bounds on the number of transitions. However, for continuing the simulation, X is maintained as an X.
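The AND-gate example can be reproduced with a three-valued gate function and a single delayed output event. This is a minimal sketch: `and3` and `and_output_event` are hypothetical names, and a real event-driven simulator would manage an event queue rather than a single call.

```python
def and3(a, b):
    """Three-valued AND over '0', '1', 'X'."""
    if a == '0' or b == '0':
        return '0'            # a controlling 0 decides the output even if the other input is X
    if a == '1' and b == '1':
        return '1'
    return 'X'


def and_output_event(a_old, a_new, b, t, delay):
    """Output event caused by input a changing at time t while b is held.

    Returns (event_time, old_output, new_output), or None when the
    output value does not change.
    """
    old_out, new_out = and3(a_old, b), and3(a_new, b)
    if old_out == new_out:
        return None
    return t + delay, old_out, new_out


# One input at logic 1, the other going X -> 1 at t = 5.0 with gate delay D = 2.0.
event = and_output_event('X', '1', '1', t=5.0, delay=2.0)
```

The returned event keeps the X in the simulated waveform; the 0-or-1 interpretation is applied only when the transition counters are updated.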
Let $n^l_k(j)$ and $n^u_k(j)$ be lower and upper bounds on the number of logic transitions made by node $j$ in clock cycle $k$, respectively. These are computed during the simulation by simply considering that signal transitions involving an X value can be interpreted in two ways as explained above, with one way representing more actual transitions and another way representing fewer. Thus, for example, upon observing a 0 → X transition at node $j$, we would increment $n^u_k(j)$ (due to the 0 → 1 possibility) and not increment $n^l_k(j)$ (due to the 0 → 0 possibility). From this, the total energy dissipated in clock cycle $k$ is bounded by $e^l(k) \le e(k) \le e^u(k)$, where the energy bounds are computed as follows:
$$e^l(k) = \frac{1}{2} V_{dd}^2 \sum_j C_j \, n^l_k(j) \tag{6}$$
$$e^u(k) = \frac{1}{2} V_{dd}^2 \sum_j C_j \, n^u_k(j) \tag{7}$$
respectively, where $V_{dd}$ is the supply voltage, $C_j$ is the node capacitance, and the summations are taken over all gate/latch output nodes in the circuit. The reason for the 1/2 coefficient is that, on average, half the transitions will be low-to-high and the other half will be high-to-low. As pointed out above, one does not have to use this particular power model (6, 7), and any number of more accurate power modeling approaches can be used. All that is required is that the energy bounds $e^l(k)$ and $e^u(k)$ be computable during the simulation. In this work, this model was deemed sufficiently accurate in order to illustrate the feasibility of the approach. From this, the block upper/lower bound values $P^u_K(i)$ and $P^l_K(i)$ are computed in a way similar to (1), as follows:
$$P^u_K(i) = \frac{1}{KT} \sum_{k=i}^{i+K-1} e^u(k) \tag{8}$$
$$P^l_K(i) = \frac{1}{KT} \sum_{k=i}^{i+K-1} e^l(k) \tag{9}$$
To illustrate the process of block simulation, consider the circuit shown in Fig. 2. The circuit is simulated for the vectors shown, $v_0$, $v_1$, $v_2$, $v_3$, and $v_4$, starting from an unknown initial state. Note that after simulating vectors $v_0$ and $v_1$, the state of the circuit is completely known. Thus, the lower and upper bounds on the number of transitions at every node are equal for vectors $v_2$, $v_3$, and $v_4$. Considering node Z, this node undergoes an X → 0 transition when vectors $v_0$ and $v_1$ are simulated. Thus, the lower bound on the number of transitions for this node would be 0 (assuming 0 → 0) while the upper bound would be 1 (assuming 1 → 0). Therefore, the lower and upper bounds on the power value are initially different but then start converging to the same value after the state of the circuit becomes known.
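To make the step from transition counts to power bounds concrete, the sketch below turns per-node counts into per-cycle energy bounds and then into block power bounds. It assumes the usual switched-capacitance model (energy per transition of one half of the capacitance times the supply voltage squared); the supply voltage `vdd`, the node names, and all numbers are illustrative assumptions, not values from the paper.

```python
def energy_bounds(cap, n_low, n_high, vdd):
    """Per-cycle energy bounds from per-node transition-count bounds.

    cap, n_low, n_high are dicts keyed by node name.
    """
    e_low = 0.5 * vdd ** 2 * sum(c * n_low[j] for j, c in cap.items())
    e_high = 0.5 * vdd ** 2 * sum(c * n_high[j] for j, c in cap.items())
    return e_low, e_high


def block_power_bounds(cycle_bounds, T):
    """Average the per-cycle energy bounds over a block of K cycles."""
    K = len(cycle_bounds)
    p_low = sum(lo for lo, _ in cycle_bounds) / (K * T)
    p_high = sum(hi for _, hi in cycle_bounds) / (K * T)
    return p_low, p_high


cap = {'Z': 2.0, 'Y': 1.0}                         # made-up node capacitances
cycle1 = energy_bounds(cap, {'Z': 0, 'Y': 0}, {'Z': 1, 'Y': 1}, vdd=2.0)  # state still unknown
cycle2 = energy_bounds(cap, {'Z': 1, 'Y': 0}, {'Z': 1, 'Y': 0}, vdd=2.0)  # state known
p_low, p_high = block_power_bounds([cycle1, cycle2], T=1.0)
```

In the second cycle both counts agree, so that cycle contributes the same energy to both bounds, which is the mechanism that tightens the block bounds.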
B. Choice of Block Size, K
The choice of block size, K, can affect the tightness of the bounds in (4). This is because the larger the block, the more probable it is that more X values will be converted to definite 0 or 1 values during the simulation. On the other hand, K should not be too large because beyond some point there will typically be very little or no reduction in the number of X values. In our implementation, K was chosen empirically, by looking at a large number of simulations, and we found that a value of K = 500 is appropriate.
A typical plot for a circuit with around 16000 gates is shown in Fig. 3. In practice, say for a microprocessor design, the value of $K$ would probably have to depend on the instruction set and on the number of instructions that may be required to constitute meaningful processing tasks. In any case, the choice of $K$ will only affect the tightness of the bounds, not their correctness.
For a logically initializable circuit, in all cases that we observed (as pointed out above), the circuit state becomes completely known after a few vectors (see Fig. 1). Once that happens, the upper and lower bounds on energy per cycle ($e^u(k)$ and $e^l(k)$) become identical (equal to the true $e(k)$). Based on this, one can easily prove (see appendix) that from then on, the power bounds for that block are guaranteed to converge to the same value. This accounts for the observed tightness in our results. If the circuit is not logically initializable, then the circuit state may remain mostly unknown (most of the state bits remain X) and therefore the bounds may remain quite different, and not tight.
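A short sketch of why the bounds tighten, offered as an illustration rather than as the appendix proof. Assume the block bounds average the per-cycle energy bounds in the same way that (1) averages $e(k)$, and assume the state becomes known after the first $L$ cycles of the block (the symbol $L$, the length of the initializing sequence, is introduced only for this sketch). Then $e^u(k) = e^l(k)$ for every later cycle, so only the first $L$ cycles contribute to the gap:

$$P^u_K(i) - P^l_K(i) = \frac{1}{KT} \sum_{k=i}^{i+L-1} \bigl( e^u(k) - e^l(k) \bigr) \le \frac{L}{K} \cdot \frac{1}{T} \max_{i \le k < i+L} \bigl( e^u(k) - e^l(k) \bigr)$$

For a fixed $L$, the right-hand side shrinks in proportion to $1/K$, which is consistent with the observation that larger blocks give tighter bounds until the benefit levels off.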
C. Choice of Sample Size, N
The choice of sample size, $N$, affects the quality of the approximations in (5). It should be clear that the larger $N$ is, the better the approximation, but how large should $N$ be for a certain desired error tolerance? This is the classical problem of mean estimation in statistics. We will briefly review the mean-estimation procedure with reference to an arbitrary random variable $\mathbf{x}$ whose mean $E[\mathbf{x}]$ is to be estimated from $N$ sample values $x_1, \ldots, x_N$, using:
$$\mu_N = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{10}$$
which is what is done in (5). Basically, $\mathbf{x}$ corresponds to the average power per block and thus, by estimating its mean, the total average power dissipated in the circuit is estimated as given in (3). In our work, the start of a block is chosen completely at random every time, independently of all prior block positions, using a uniform random number generator that gives a value between $-K + 2$ and $M$. Hence, the values $x_1, \ldots, x_N$ are guaranteed to be samples of independent random variables. Furthermore, all blocks are of the same size, and thus, $x_1, \ldots, x_N$ are samples of identically distributed random variables.
C.1 Using the t-distribution
Therefore, $x_1, \ldots, x_N$ are samples of independent, identically distributed (iid) random variables. Thus, $\mu_N$ as given in (10) is a sample of a random variable called the sample mean [10], whose mean is equal to $E[\mathbf{x}]$ and whose variance is equal to $\sigma^2/N$, where $\sigma^2$ is the variance of $\mathbf{x}$. If the sample values $x_1, \ldots, x_N$ are taken from a normal population having the mean $E[\mathbf{x}]$ and the variance $\sigma^2$, then
$$t = \frac{\mu_N - E[\mathbf{x}]}{s_N / \sqrt{N}}$$
is the value of a random variable having the t-distribution with $\nu = N - 1$ degrees of freedom [11], where $s_N$ is the standard deviation of the observed $N$ data values $x_1, \ldots, x_N$. Consequently, with $(1 - \alpha)$ confidence, it follows that [11]
$$-t_{\alpha/2} \le \frac{\mu_N - E[\mathbf{x}]}{s_N / \sqrt{N}} \le t_{\alpha/2} \tag{11}$$
where $0 < \alpha < 1$ and where $t_{\alpha/2}$ is defined so that the area to its right under the t-distribution curve is equal to $\alpha/2$. The value of $t_{\alpha/2}$ for a given $\alpha$ can be easily found using standard statistical tables. As for $s_N$, it is measured as follows:
$$s_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu_N)^2 \tag{12}$$
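Therefore, with confidence $(1 - \alpha)$, we have:
$$\frac{|\mu_N - E[\mathbf{x}]|}{\mu_N} \le \frac{t_{\alpha/2} \, s_N}{\mu_N \sqrt{N}} \tag{13}$$
If $\epsilon_1$ is a small positive number, and if $N$ is large enough to achieve:
$$\frac{t_{\alpha/2} \, s_N}{\mu_N \sqrt{N}} \le \epsilon_1 \tag{14}$$
then $\epsilon_1$ places an upper bound on the relative error of the sample, with $(1 - \alpha)$ confidence:
$$\frac{|\mu_N - E[\mathbf{x}]|}{\mu_N} \le \epsilon_1 \tag{15}$$
This may also be expressed as the relative deviation from the mean $E[\mathbf{x}]$:
$$\frac{|\mu_N - E[\mathbf{x}]|}{E[\mathbf{x}]} \le \frac{\epsilon_1}{1 - \epsilon_1} = \epsilon \tag{16}$$
Here, $\epsilon > 0$ is defined as the user-specified error tolerance, and $\alpha$ (or $1 - \alpha$) is the user-specified confidence. Thus (14) provides a stopping criterion that determines when to stop sampling in order to yield the accuracy specified in (16) with confidence $(1 - \alpha)$.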
Notice that the required number of samples $N$ is not known a priori, but is determined only when (14) is first met. We should note here that it is not always the case that the sample values, $x_1, \ldots, x_N$, come from a normal population. In order to account for the case when the distribution of the random variable $\mathbf{x}$ is not normal, we make the following observation. Based on the Central Limit Theorem [11], the distribution of the sample mean approaches the normal distribution for large $N$. The minimum number of samples, $N$, to satisfy near-normality is typically about 30 [11]. Thus, if we define an N-sample as
$$y_k = \frac{1}{30} \sum_{j=30(k-1)+1}^{30k} x_j$$
where $x_{30(k-1)+1}, \ldots, x_{30k}$ are samples of a non-normal distribution, then $y_k$ is a sample of a random variable $\mathbf{y}_k$ whose distribution is near-normal. Consequently, one option in handling non-normal populations is to take the sample mean of 30 iid samples of the non-normal population, and consider it as one N-sample. Then, repeat the process to obtain as many N-samples as needed. This way, it is guaranteed that the obtained N-samples are samples of a near-normal distribution. Hence, we can apply the previous technique of stopping the simulation when the user-specified accuracy and error criteria are satisfied with some approximation. Note here that a minimum of 60 samples of the non-normal distribution are needed for convergence. This is because a minimum of 2 N-samples (or sample means) are needed to satisfy the accuracy and error criteria specified in (16).
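The grouping into N-samples described above amounts to taking means over consecutive batches of 30 raw samples. A minimal sketch, with `batch_means` as a hypothetical helper name and a trivially simple input chosen so the result is easy to check by hand:

```python
def batch_means(samples, batch=30):
    """Group raw samples into consecutive batches and return each batch mean.

    Leftover samples that do not fill a whole batch are ignored.
    """
    n_batches = len(samples) // batch
    return [
        sum(samples[b * batch:(b + 1) * batch]) / batch
        for b in range(n_batches)
    ]


raw = list(range(60))            # 60 raw samples give the minimum of 2 N-samples
n_samples = batch_means(raw)     # means of 0..29 and 30..59
```

Each entry of `n_samples` plays the role of one $y_k$, and the t-based stopping rule is then applied to these batch means rather than to the raw samples.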
C.2 An alternative approach
To avoid the minimum requirement of 60 samples, the following approximation proves useful. For large sample sizes (N larger than 30 or so), one may
where $\sigma^2$ is the variance of $\mathbf{x}$ and $s_N$ is the standard deviation of the observed $N$ data values $x_1, \ldots, x_N$, measured as given in (12). Because the distribution of the sample mean $\mu_N$ approaches the normal distribution for large $N$ (30 or more), it follows that with $(1 - \alpha)$ confidence [11]
where $0 < \alpha < 1$ and where $z_{\alpha/2}$ is defined so that the area to its right under the standard normal distribution curve is equal to $\alpha/2$. The value of $z_{\alpha/2}$ for a given $\alpha$ can be easily found using standard statistical tables. In other words, we simply take $N$ samples from the given distribution (whether normal or not). Then we simply check if (17) is satisfied, replacing
Here, $\epsilon > 0$ is defined as the user-specified error tolerance, and $\alpha$ (or $1 - \alpha$) is the user-specified confidence. We verified that replacing $\sigma/\sqrt{N}$ by $s_N/\sqrt{N}$ is indeed a valid approximation by implementing both of the above techniques and comparing the results.
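A sequential stopping rule in the spirit of this large-sample approach can be sketched as follows. Several details are assumptions made for the example rather than statements about the original method: the criterion is taken to be $z_{\alpha/2} \, s_N / (\mu_N \sqrt{N}) \le \epsilon_1$ by analogy with the t-based criterion, the tolerance is converted with $\epsilon_1 = \epsilon / (1 + \epsilon)$, the function name `estimate_mean` and its parameters are hypothetical, and the sample source is a toy uniform distribution.

```python
import math
import random
import statistics


def estimate_mean(draw, eps=0.05, alpha=0.05, min_samples=30, max_samples=100_000):
    """Draw samples until the normal-approximation criterion is met.

    draw is a zero-argument callable returning one positive sample.
    Returns (sample_mean, number_of_samples_used).
    """
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    eps1 = eps / (1.0 + eps)                                # assumed conversion
    xs = []
    while len(xs) < max_samples:
        xs.append(draw())
        n = len(xs)
        if n < min_samples:
            continue                     # too few samples for near-normality
        mu = statistics.fmean(xs)
        s = statistics.stdev(xs)         # uses the 1/(N-1) form
        if mu > 0 and z * s / (mu * math.sqrt(n)) <= eps1:
            return mu, n
    return statistics.fmean(xs), len(xs)


rng = random.Random(0)
mean, used = estimate_mean(lambda: rng.uniform(9.0, 11.0))
```

The mean and standard deviation are recomputed from scratch on every iteration to keep the sketch short; a running update would be the natural choice for long runs. In the block-sampling method the same check would be applied to both the upper-bound and the lower-bound samples.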
The above discussion is applicable for estimating both the mean upper and lower bounds on the power dissipation, $E[\mathbf{P}^u_K]$ and $E[\mathbf{P}^l_K]$.
IV. EXPERIMENTAL RESULTS
The technique proposed above has been implemented and tested on a number of sequential benchmark circuits. All the results to be presented were performed with 5% error-tolerance ($\epsilon = 0.05$) and 95% confidence ($\alpha = 0.05$). All the circuits were derived from the ISCAS-89 benchmark circuits [12], after mapping them to a gate library with delay and capacitance values typical of 0.5µ CMOS technology. We have restricted our results to the subset of the ISCAS-89 circuits that are known to be logically initializable. This is because according to [13], almost all circuits that are functionally initializable are also logically initializable and practical circuits will always be functionally initializable. Other than this, no special considerations were used in picking the circuits below. Only two circuits were shown in [13] to be functionally but not logically initializable and, on these two circuits, our method does not work very well (meaning that although the bounds are correct, they are not tight). While more circuits may need to be tested, this may mean that the method is best suited to circuits that are known to be logically initializable.
Because no input vector sets are available for these benchmarks, we have tried to mimic the correlations that exist in real vector sets by generating a long correlated vector set consisting of 100,000 vectors. The correlation coefficients were changed arbitrarily every M vectors where M is chosen randomly. That is, randomly pick a number $M_1$, then generate $M_1$ vectors with certain correlation coefficients (signal probability, spatial and temporal correlation factors), then randomly pick another number $M_2$, and generate $M_2$ vectors with different correlation coefficients. Repeat until the whole 100,000 vectors are generated. Thus, the statistical properties of the vectors vary widely depending on where they are in the 100,000 vector stream. For each circuit, we first estimated the power due to the whole 100,000 vectors by simulation, and then used our block sampling approach to estimate the power, with 5% error-tolerance.
In the first set of experiments, with results shown in Table II, we explored the tightness of the power bounds and the speed of convergence. For some details of these circuits (gate count, etc.), the reader is referred to Table I in section II. The table lists the power upper and lower bounds in mW, and the average of the two under the "Power" column. The tightness of the bounds was measured as the difference between them divided by their average, expressed as a percentage. The values illustrate that the bounds can be quite tight in most cases. The table also lists the number of cycles (i.e., vectors) that were required for convergence and the CPU time required. Note that the number of vectors required for convergence is different for different circuits. This illustrates the importance of having a convergence check (a stopping criterion). It would not be sufficient, for instance, to simulate all circuits for the same number of vectors. It is also notable that the required number of cycles is not necessarily larger for larger designs.
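The segment-wise generation of the test vector stream described above can be sketched as follows. This is only a rough illustration under simplifying assumptions: each segment gets a random signal probability `p` and a random per-bit `hold` probability that stands in for temporal correlation, spatial correlation between bits is not modeled, and the function name, `max_segment`, and the small sizes are all hypothetical.

```python
import random


def correlated_stream(total, width, rng, max_segment=5000):
    """Generate `total` vectors of `width` bits, changing statistics per segment."""
    vectors = []
    prev = [rng.randint(0, 1) for _ in range(width)]
    while len(vectors) < total:
        # Pick the next segment length (the M_1, M_2, ... of the text) at random.
        seg_len = min(rng.randint(1, max_segment), total - len(vectors))
        p = rng.random()       # signal probability for this segment
        hold = rng.random()    # chance a bit repeats its previous value
        for _ in range(seg_len):
            prev = [
                b if rng.random() < hold else int(rng.random() < p)
                for b in prev
            ]
            vectors.append(prev)
    return vectors


stream = correlated_stream(1000, 4, random.Random(0), max_segment=200)
```

Because `p` and `hold` are redrawn for every segment, the switching statistics differ from one part of the stream to another, which is the property the experiments rely on.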
Then, the power estimated by our block sampling approach (average of the two bounds) was compared to that computed by simulation of the whole 100,000 vector set. The results are shown in Table III, where the power values are expressed in mW. It is clear that the errors are very small and that all are below the specified 5% error tolerance. Table III also includes a column named "Compaction." This is the ratio of the total number of vectors simulated by the block sampling method to the total number of vectors (100,000) in the power vector set. For most of the circuits, it turns out to be enough to simulate around 15% of the total vector set. Note that this is the minimum number of cycles required for the approximations made in section III.C to be valid. The approximations hold if 30 samples or more are obtained. Thus, for our choice of block size $K = 500$, this would require the simulation of a minimum of 15000 cycles, which is 15% of the total 100,000 vectors. For some of the circuits, more samples are required but note that, in all the circuits, it was enough to simulate at most 18% of the total vector set. Thus, the net effect is that the power is estimated by simulating only a small fraction of the total vector set. This feature is essential for simulation of large sequential circuits. Effectively, an implicit compaction of the vector set has been achieved. The adjective "implicit" denotes the fact that this was done on the fly, during the simulation, rather than up-front. We feel that this is the only correct way of performing compaction, mainly because, as observed in relation to Table II, the number of vectors required for convergence depends very much on the special characteristics of the circuit and is not determined simply by signal statistics or by circuit size. In fact, as was pointed out above, sometimes smaller circuits will require more cycles to converge.
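The compaction figures quoted above follow from simple arithmetic on the block count, the block size, and the length of the vector set. The sketch below just restates that arithmetic; it counts every sampled block as a full block of simulated cycles, and the helper name `simulated_fraction` is hypothetical.

```python
def simulated_fraction(num_blocks, K, M):
    """Fraction of the vector set simulated when num_blocks blocks of K cycles are run."""
    return num_blocks * K / M


K, M = 500, 100_000
minimum = simulated_fraction(30, K, M)          # 30 blocks are needed at least
blocks_at_18_percent = 18 * M // (100 * K)      # block count behind the 18% figure
```

So the 15% floor corresponds to the 30-sample minimum, and the largest reported ratio of 18% corresponds to 36 blocks of 500 cycles.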
Looking at the last column of Table III, some readers may conclude that perhaps simply simulating the first 18% (or whatever the fraction may be, according to the compaction ratio) of the vectors in the long vector set, in the order in which they occur, may be enough. This is not correct because the first section of the vector set may be biased for some reason; it may, for instance, have much lower switching activity than the rest of the vector set. The random choice of the blocks from anywhere in the vector set is essential in order to guarantee that the result is representative of all the various modes of operation in the long vector set. Granted, the random sampling does not explore all the vectors, but it does explore enough of them, and in the right way, in order to provide the desired result with the specified accuracy and confidence.
V. CONCLUSION
We have proposed a simulation-based method for estimating the power dissipation of sequential circuits. The method works by sampling blocks of consecutive vectors from a user-supplied (potentially very long) power vector set and simulating them. Because the state of the circuit at the beginning of each block is unknown, we initialize the circuit to an all-X state and simulate it for one block using three-valued logic simulation. The simulator includes delay information, so that it does capture glitching activity.
The proposed method is very efficient in providing accurate results for logically initializable circuits; that is, circuits whose state becomes known after simulating a few vectors starting from an initial unknown (all-X) state. However, if one finds that, for a given circuit, the circuit state remains unknown, then it would seem that the only fall-back position is to do a full simulation starting from a known initial state.
The major advantage of the method is that the state of the sequential circuit is always guaranteed to be valid: the FSM never goes outside its valid state space. Thus, the estimated power corresponds to realistic typical circuit operation. Another advantage of the method is that only a fraction of the vectors (around 15% for the circuits tested) in the (potentially huge) power vector set needs to be simulated.
