Reliability assessment is an important part of the design process of digital integrated circuits. We observe that a common thread that runs through most causes of run-time failure is the extent of circuit activity, i.e., the rate at which its nodes are switching. We propose a new measure of activity, called the transition density, which may be defined as the "average switching rate" at a circuit node. Based on a stochastic model of logic signals, we also present an algorithm to propagate density values from the primary inputs to internal and output nodes. To illustrate the practical significance of this work, we demonstrate how the density values at internal nodes can be used to study circuit reliability by estimating (1) the average power & ground currents, (2) the average power dissipation, (3) the susceptibility to electromigration failures, and (4) the extent of hot-electron degradation. The density propagation algorithm has been implemented in a prototype density simulator. Using this, we present experimental results to assess the validity and feasibility of the approach. In order to obtain the same circuit activity information by traditional means, the circuit would need to be simulated for thousands of input transitions. Thus this approach is very efficient and makes possible the analysis of VLSI circuits, which are traditionally too big to simulate for long input sequences.
Introduction
A major portion of the design time of digital integrated circuits is dedicated to functional verification and reliability assessment. Of these two, reliability assessment is a more recent problem whose severity has steadily increased in proportion to chip density. As a result, CAD tools that evaluate the susceptibility of a design to run-time failures are becoming increasingly important.
Chip run-time failures can occur due to a variety of reasons, such as excessive power dissipation, electromigration, hot-electron degradation, voltage drop, aging, and others. In CMOS logic circuits, the rate at which node transitions occur is a good indicator of the circuit's susceptibility to run-time failures. For example, both power dissipation and electromigration in the power lines are directly related to the power supply current which, in CMOS, is non-zero only during transitions. Hot-electron degradation is related to the MOSFET substrate current, which, for CMOS, is also only significant during transitions. Thus, the rate at which node transitions occur, i.e., the extent of circuit activity, may be thought of as a measure of a failure-causing stress. However, there has traditionally been no way of quantifying this activity because logic signals are in general non-periodic and, thus, have no fixed switching frequency. This paper proposes a novel measure of activity that we call the transition density, along with a simulation technique to compute the density at every circuit node. The transition density may be defined as the "average switching rate"; a more rigorous definition will be given in section 2. Preliminary results of this work have appeared in [1].
To further motivate the notion of transition density, consider the problem of estimating the average power drawn by a CMOS gate. If the gate has output capacitance $C$ and generates a simple clock signal with frequency $f$, then the average power dissipated is $C V_{dd}^2 f$, where $V_{dd}$ is the power supply voltage. In general, since logic signals may not be periodic, the notion of frequency cannot be used. Instead, one may compute the power as follows. If $x(t)$ is the logic signal at the gate output and $n_x(T)$ is the number of transitions of $x(t)$ in the time interval $(-T/2, +T/2]$, then the average power is:
$$P_{av} = \lim_{T\to\infty} \frac{\frac{1}{2} C V_{dd}^2\, n_x(T)}{T} = \frac{1}{2} C V_{dd}^2 \lim_{T\to\infty} \frac{n_x(T)}{T}. \qquad (1.1)$$
In the next section we define the transition density to be the last limit term in (1.1).
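As a quick consistency check, the clock case can be worked out directly. The sketch below assumes that each transition accounts, on average, for an energy of $\frac{1}{2} C V_{dd}^2$, so that the average power is that energy times the limiting number of transitions per unit time; under this assumption the clock formula quoted above is recovered.

```latex
% A clock of frequency f makes two transitions per period,
% so over a long interval of length T it makes about 2 f T transitions.
\lim_{T\to\infty} \frac{n_x(T)}{T} = 2f
\quad\Longrightarrow\quad
P_{av} = \tfrac{1}{2}\, C V_{dd}^2 \cdot 2f = C V_{dd}^2 f .
```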
Naturally, one can approximate $\lim_{T\to\infty} n_x(T)/T$ by simulating the circuit for a "large enough" number of input transitions while monitoring $n_x(T)$ at every node. The ambiguity in the phrase "large enough" is precisely the problem with this traditional approach. It is impossible to determine a priori how long the simulation should be. Furthermore, long simulations of large circuits are very expensive. However, we will show that, if the transition densities at the circuit primary inputs are given, they can be efficiently propagated into the circuit to give the transition density at every internal and output node. In other words, we use the limits $\lim_{T\to\infty} n_x(T)/T$ at the circuit inputs to directly compute the corresponding limits inside the circuit.
The propagation algorithm involves a single pass over the circuit and computes the transition densities at all the nodes. It may be thought of as a simulation of the circuit in which one studies the density of its internal signals that correspond to input signals with specified densities; it has been implemented in a prototype density simulator, called DENSIM. In order to obtain the same circuit activity information by traditional means, the circuit must be simulated for thousands of input transitions. Thus this approach is very efficient and makes possible the analysis of VLSI circuits, which are traditionally too big to simulate for long input sequences.
It turns out to be highly beneficial, in terms of the theoretical results to be presented, to cast the problem in a stochastic (probability theory) setting. Thus, in the following two sections, we will start with definitions of idealized "logic signals," and then present a stochastic model of logic signals that is essential to the density propagation theorem. Based on these concepts, we then show in section 4 how the transition density can be efficiently propagated from inputs to outputs. In section 5, we study a number of practical applications of the density concept. Namely, we demonstrate how the density values at internal nodes can be used to estimate (1) the average power & ground currents, (2) the average power dissipation, (3) the susceptibility to electromigration failures, and (4) the extent of hot-electron degradation. Experimental results are presented in section 6, and section 7 contains a summary and conclusions.
Appendix A presents the existence proofs of the equilibrium probability and transition density, and appendix B presents a new application for Binary Decision Diagrams (BDDs) in computing the probability of a Boolean function.
Ideal Logic Signals
Let $x(t)$, $t \in (-\infty, +\infty)$, be a function of time that takes the values 0 or 1. We use such time functions to model logic signals in digital circuits. This ideal model neglects waveform details such as the rise/fall times, glitches, over/under-shoots, etc.
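Definition 1. The equilibrium probability of $x(t)$, to be denoted by $P(x)$, is defined as:
$$P(x) \triangleq \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{+T/2} x(t)\,dt. \qquad (2.1)$$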
The reason for the name "equilibrium probability" will become clear later on. It is easy to observe, however, that $P(x)$ is the fraction of time that $x(t)$ is in the 1 state. It is also the average value of $x(t)$, over all time. Thus, for instance, a 25% duty cycle clock signal, i.e., one that is high for 1/4th of its period, has $P(x) = 0.25$. The following proposition guarantees that the equilibrium probability is always well-defined.
Proposition 1. For a logic signal $x(t)$, the limit in (2.1) always exists.
Proof: See appendix A.
The discontinuity points of $x(t)$ represent transitions in the logic signal. Let $n_x(T)$ be the number of transitions of $x(t)$ in the time interval $(-T/2, +T/2]$.
Definition 2. The transition density of a logic signal $x(t)$, $t \in (-\infty, +\infty)$, is defined as:
$$D(x) \triangleq \lim_{T\to\infty} \frac{n_x(T)}{T}. \qquad (2.2)$$
The reason for the name "transition density" will become clear later on. It should be clear, however, that $D(x)$ is the average number of transitions per unit time. Thus, a 10 MHz clock signal has $D(x) = 20 \times 10^6$ transitions per second. The power of the $P(x)$ and $D(x)$ concepts is that they apply equally well to both periodic (clock) and non-periodic signals. In the remainder of this section, we study the existence of the limit in (2.2).
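To make the two definitions concrete, the following is a minimal sketch that estimates both quantities over a finite observation window from a list of transition times. The function name, its arguments, and the window length are illustrative assumptions; the limits in the definitions are only approximated by a finite window.

```python
def probability_and_density(initial_value, transition_times, t_start, t_end):
    """Finite-window estimates of P(x) and D(x) for a 0/1 signal.

    initial_value    -- value of the signal (0 or 1) at t_start
    transition_times -- sorted times in (t_start, t_end) at which the signal toggles
    """
    value = initial_value
    last_t = t_start
    time_high = 0.0
    for t in transition_times:
        if value == 1:
            time_high += t - last_t   # accumulate time spent in the 1 state
        value = 1 - value             # every transition toggles the signal
        last_t = t
    if value == 1:
        time_high += t_end - last_t
    span = t_end - t_start
    return time_high / span, len(transition_times) / span


# A 10 MHz clock with a 25% duty cycle, observed for 1000 full periods.
period = 1.0e-7
n_periods = 1000
times = []
for k in range(n_periods):
    times.append(k * period + 0.25 * period)   # falling edge
    if k + 1 < n_periods:
        times.append((k + 1) * period)          # next rising edge
p_est, d_est = probability_and_density(1, times, 0.0, n_periods * period)
```

For this clock the estimates come out close to 0.25 and to 20 million transitions per second; the small shortfall in the density is an edge effect of the finite window that shrinks as the window grows.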
The time between two consecutive transitions of $x(t)$ will be referred to as an inter-transition time. Let $\mu$ be the average value of all the inter-transition times of $x(t)$. Likewise, let $\mu_1$ ($\mu_0$) be the average of the high (low), i.e., corresponding to $x(t) = 1$ ($0$), inter-transition times of $x(t)$. It should be clear that $\mu = \frac{1}{2}(\mu_0 + \mu_1)$. In general, there is no guarantee of the existence of $\mu$, $\mu_0$, and $\mu_1$. If the number of transitions in positive time is finite, then we say that there is an infinite inter-transition time following the last transition, and $\mu = \infty$. A similar convention is made for negative time. We define $\mu_f$ to be the average of all the finite inter-transition times of $x(t)$. In general, there is also no guarantee of the existence of $\mu_f$.
Proof: See appendix A.
In order to guarantee that the density is always well-defined, we make the following basic assumption about every logic signal $x(t)$:
Basic Assumption: The average finite inter-transition time $\mu_f$ exists and is non-zero.
Logic signals for which this assumption does not hold are considered pathological, and are excluded from the analysis. It can be shown (see appendix A) that another (more stringent) sufficient condition for the existence of (2.2) is that there be a non-zero lower bound (however small) on the inter-transition times of $x(t)$. This condition is easily satisfied in all practical cases, so that our basic assumption is very mild indeed.
The Companion Process of Logic Signals
We will use bold font to represent random quantities. We denote the probability of an event $A$ by $P\{A\}$ and, if $\mathbf{x}$ is a random variable, we denote its mean (or expected value) by $E[\mathbf{x}]$ and its distribution function by $F_{\mathbf{x}}(a) \triangleq P\{\mathbf{x} \le a\}$. Let $\mathbf{x}(t)$, $t \in (-\infty, +\infty)$, be a stochastic process [2] that takes the values 0 or 1, transitioning between them at random transition times. Such a process is called a 0-1 process (see [3], pp. 38-39). A logic signal $x(t)$ can be thought of as a sample of a 0-1 stochastic process $\mathbf{x}(t)$, i.e., $x(t)$ is one of an infinity of possible signals that comprise the family $\mathbf{x}(t)$.
A stochastic process is said to be strict-sense stationary (SSS) if its statistical properties are invariant to a shift of the time origin [2]. Among other things, the mean $E[\mathbf{x}(t)]$ of such a process is a constant, independent of time, and will be denoted by $E[\mathbf{x}]$. It will be shown below that a logic signal is always a sample of a SSS 0-1 process.
Let $\mathbf{n}_{\mathbf{x}}(T)$ denote the (random) number of transitions of $\mathbf{x}(t)$ in $(-T/2, +T/2]$. If $\mathbf{x}(t)$ is SSS, then $E[\mathbf{n}_{\mathbf{x}}(T)]$ depends only on $T$, and is independent of the location of the time origin.
Proposition 3. If $\mathbf{x}(t)$ is SSS, then the mean $E[\mathbf{n}_{\mathbf{x}}(T)/T]$ is a constant, independent of $T$.
Proof: Let $t_1 < t_2 < t_3$ be three arbitrary points along the time axis. Let $\mathbf{n}_1$ be the number of transitions in $(t_1, t_2]$, $\mathbf{n}_2$ be the number of transitions in $(t_2, t_3]$, and $\mathbf{n}_3$ be the number of transitions in $(t_1, t_3]$. Then $\mathbf{n}_3 = \mathbf{n}_1 + \mathbf{n}_2$, and $E[\mathbf{n}_3] = E[\mathbf{n}_1] + E[\mathbf{n}_2]$. Let $T_1 = t_2 - t_1$ and $T_2 = t_3 - t_2$. Since $\mathbf{x}(t)$ is SSS, then $E[\mathbf{n}_1] = E[\mathbf{n}_{\mathbf{x}}(T_1)]$, $E[\mathbf{n}_2] = E[\mathbf{n}_{\mathbf{x}}(T_2)]$, and $E[\mathbf{n}_3] = E[\mathbf{n}_{\mathbf{x}}(T_1 + T_2)]$. Hence $E[\mathbf{n}_{\mathbf{x}}(T_1 + T_2)] = E[\mathbf{n}_{\mathbf{x}}(T_1)] + E[\mathbf{n}_{\mathbf{x}}(T_2)]$. Since this is true for arbitrary $T_1$ and $T_2$, it means that, in general, $E[\mathbf{n}_{\mathbf{x}}(T)] = kT$, where $k$ is a non-negative constant, which completes the proof.
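A constant-mean stochastic process $\mathbf{x}(t)$ is said to be mean-ergodic [2] if:
$$\lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{+T/2} \mathbf{x}(t)\,dt \overset{1}{=} E[\mathbf{x}],$$
where we have used the symbol "$\overset{1}{=}$" to denote convergence with probability 1. The reader is referred to [2], pp. 188-191, for a discussion of the different stochastic convergence modes. We reserve the symbol "$=$" to indicate convergence everywhere for random quantities. It will be shown below that a logic signal is always a sample of a SSS mean-ergodic 0-1 process.
Proof: Throughout, $\boldsymbol{\tau}$ denotes the random time shift that relates the companion process to the logic signal, $\mathbf{x}(t) = x(t + \boldsymbol{\tau})$. At $t = 0$, we have $E[\mathbf{x}(0)] = E[x(\boldsymbol{\tau})]$. An interesting property of $\boldsymbol{\tau}$ is that if $a$ is a constant then $a + \boldsymbol{\tau}$ has the same distribution as $\boldsymbol{\tau}$. Indeed, if $F_{a+\boldsymbol{\tau}}(t)$ is the distribution function of $a + \boldsymbol{\tau}$, then $F_{a+\boldsymbol{\tau}}(t) = P\{a + \boldsymbol{\tau} \le t\} = P\{\boldsymbol{\tau} \le t - a\} = 1/2 = F_{\boldsymbol{\tau}}(t)$. Therefore, since $t + \boldsymbol{\tau}$ and $\boldsymbol{\tau}$ are identically distributed, we have $E[x(t + \boldsymbol{\tau})] = E[x(\boldsymbol{\tau})]$, which means that $\mathbf{x}(t)$ is a constant-mean process with:
$$E[\mathbf{x}(t)] = E[\mathbf{x}(0)] = E[\mathbf{x}], \quad \text{for any time } t.$$
Let $R_a$ be a subset of the real line $R$ defined by $R_a \triangleq \{t \in R : x(t) = 1,\ x(t + a) = 1\}$. It is clear that $P\{x(\boldsymbol{\tau}) = 1,\ x(\boldsymbol{\tau} + a) = 1\} = P\{\boldsymbol{\tau} \in R_a\}$. Likewise, $P\{x(t + \boldsymbol{\tau}) = 1,\ x(t + \boldsymbol{\tau} + a) = 1\} = P\{t + \boldsymbol{\tau} \in R_a\}$. However, since $\boldsymbol{\tau}$ and $t + \boldsymbol{\tau}$ are identically distributed, the two probabilities $P\{\boldsymbol{\tau} \in R_a\}$ and $P\{t + \boldsymbol{\tau} \in R_a\}$ must be equal, which leads to:
$$P\{\mathbf{x}(t) = 1,\ \mathbf{x}(t + a) = 1\} = P\{\mathbf{x}(0) = 1,\ \mathbf{x}(a) = 1\} = P\{\boldsymbol{\tau} \in R_a\}, \quad \text{for any time } t. \qquad (3.9)$$
Consequently, the joint distribution of $\mathbf{x}(t)$ and $\mathbf{x}(t + a)$, i.e., $F_{\mathbf{x}(t),\mathbf{x}(t+a)}(x_1, x_2)$, is independent of $t$, and depends only on $a$, which makes $\mathbf{x}(t)$ wide-sense stationary (WSS) [2].
By extending this argument to $a_1, \ldots, a_n$, it follows that $F_{\mathbf{x}(t),\mathbf{x}(t+a_1),\ldots,\mathbf{x}(t+a_n)}(x_1, \ldots, x_n)$ is independent of $t$, and $\mathbf{x}(t)$ is strict-sense stationary (SSS).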
To prove mean-ergodicity, and in view of (3.3), it suffices to show that $E[\mathbf{x}] = P(x)$.
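Consider the random variable $\boldsymbol{\eta}_T \triangleq \frac{1}{T} \int_{-T/2}^{+T/2} \mathbf{x}(t)\,dt$. From (3.3) we have $\lim_{T\to\infty} \boldsymbol{\eta}_T = P(x)$, where this is convergence everywhere. Therefore $\lim_{T\to\infty} E[\boldsymbol{\eta}_T] = P(x)$. By linearity of the expected value operator, this can be rewritten:
$$\lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{+T/2} E[\mathbf{x}(t)]\,dt = P(x). \qquad (3.10)$$
But $E[\mathbf{x}(t)]$ is a constant. Therefore the left hand side of (3.10) is simply $E[\mathbf{x}]$, and mean-ergodicity follows, with $E[\mathbf{x}] = P\{\mathbf{x}(t) = 1\} = P(x)$.
To complete the proof, we will prove (3.7) by repeating the argument used for $\boldsymbol{\eta}_T$.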
By (3.4), the random variable $\mathbf{n}_{\mathbf{x}}(T)/T$ converges everywhere to $D(x)$. Therefore its mean must also converge to $D(x)$. Since, by proposition 3, its mean is a constant, independent of $T$, then (3.7) follows.
We are now in a position to comment on the names "equilibrium probability" for $P(x)$ and "transition density" for $D(x)$. For a 0-1 process, $P\{\mathbf{x}(t) = 1\} = E[\mathbf{x}(t)]$. Thus, by (3.3) and since $\mathbf{x}(t)$ is mean-ergodic, $P(x)$ is the constant probability that $\mathbf{x}(t) = 1$. The name "equilibrium probability" is inspired from the special case when the inter-transition times of a 0-1 process $\mathbf{x}(t)$ are independent exponentially distributed random variables. In that case, the process is the well-known two-state continuous-time Markov process (see [2], pp. 392-393) whose state probability tends to an equilibrium value for $t \to \infty$, at which time it becomes SSS (see [4], pp. 272-273).
By (3.7), $D(x)$ is the "expected average number of transitions per unit time," which we compactly refer to as "transition density." This name is inspired by the density of random Poisson points (see [2], page 58). If a large number of points are chosen on the time axis at random, then the "number of points in a given interval" is a random variable with a Poisson distribution whose density parameter is the "expected number of points per unit time." The points that we are concerned with in this paper are the time points at which transitions occur, but we make no assumption about their distribution. The remark about Poisson points is meant only to motivate the terminology.
Density Simulation
A digital circuit provides a mapping from the logic signals at its primary input nodes to those at its internal and output nodes. In the following, we use the term "internal nodes" to refer to the primary output nodes as well as other (proper) internal circuit nodes.
If we consider the companion process of each such logic signal, the circuit may be seen as mapping stochastic processes at its inputs to similar processes at its internal nodes. The statistics (such as density and probability) of the internal processes are completely determined by those at the primary inputs. In fact, we will demonstrate in this section that the density and probability of internal processes can be efficiently computed from those at the primary inputs.
We assume that the primary input processes are mutually independent. Therefore, since these inputs are individually SSS, they are also jointly SSS. In terms of the underlying logic signals $x(t)$, this assumption means that the signal values are not correlated, so that if one of them is 1, then the average fraction of time that another is 1 (or 0) is unaltered.
Given the density and probability values of the companion processes at the primary inputs, we will present an algorithm to propagate them into the circuit to derive the corresponding values at internal nodes. We consider the circuit to be an interconnection of logic modules, each representing a certain (combinational) Boolean function and possessing certain delay characteristics. The propagation of density and probability will then proceed on a per-module basis from primary inputs to primary outputs, a process that we refer to as density simulation.
Propagation through a single module
Consider a multi-input multi-output logic module $M$, whose outputs are Boolean functions of its inputs, as shown in Fig. 1. $M$ may be a single logic gate or a higher level circuit block. We assume that the inputs to $M$ are mutually independent companion processes. The validity of this assumption will be discussed in section 4.2.
We use a simplified timing model of circuit behavior, as follows. We assume that an input transition that does get transmitted to an output node is delayed by a propagation delay time of $\tau_p$. Different propagation delays may be associated with different input-output node pairs. Implicit in this model is the simplifying assumption that the propagation delay is independent of the values at other inputs of $M$.
In effect, we decouple the delays inside $M$ from its Boolean function description by introducing a special-purpose delay block to model the delays between every pair of input & output nodes, as shown in Fig. 2. The block $M'$ is a zero-delay logic module that implements the same Boolean function as $M$.
Since the input signals are SSS, then the output of the delay block has the same statistics as its input, and therefore has the same probability and density. As for the zero-delay module $M'$, we now consider the problem of propagating equilibrium probabilities and transition densities from its inputs to its outputs. Since $P(x) = P\{\mathbf{x}(t) = 1\}$ (by theorem 1) and $M'$ has zero delay, then the problem of propagating equilibrium probabilities through it is identical to that of propagating signal probabilities through logic circuits, which has been well-studied [5-9]. Since the internal structure of $M'$ is not known, the problem is actually even more generic than that, and can be expressed as "given a Boolean function $f(x_1, \ldots, x_n)$ and that each $x_i$ can be high with probability $P(x_i)$, what is the probability that $f$ is high?" Any number of published techniques can be used to solve this problem. However, we have chosen (for reasons that will become clear below) to investigate a new approach based on Binary Decision Diagrams (BDDs) [10, 11], which have recently become popular in the verification and synthesis areas. Appendix B describes how we use BDDs to compute the probability of a Boolean function.
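The question posed above can be illustrated with a minimal sketch based on a plain Shannon expansion over the inputs, one at a time. This is not the BDD-based algorithm of appendix B: it visits every input combination and is therefore exponential in the number of inputs, whereas a BDD shares common sub-functions. The function name and argument layout are assumptions made for illustration, and the inputs are assumed independent.

```python
def function_probability(f, probs, assignment=()):
    """Probability that f evaluates to 1, for independent inputs.

    f          -- Python callable taking one 0/1 argument per input
    probs      -- probs[i] is the probability that input i is 1
    assignment -- values already fixed for inputs 0 .. len(assignment)-1
    """
    i = len(assignment)
    if i == len(probs):
        # All inputs fixed: f is now a constant, 0 or 1.
        return 1.0 if f(*assignment) else 0.0
    p_i = probs[i]
    # Shannon expansion: P(f) = P(x_i) P(f | x_i = 1) + (1 - P(x_i)) P(f | x_i = 0)
    return (p_i * function_probability(f, probs, assignment + (1,))
            + (1.0 - p_i) * function_probability(f, probs, assignment + (0,)))


# Example: f = (a AND b) OR c with every input high half of the time.
p_f = function_probability(lambda a, b, c: (a and b) or c, [0.5, 0.5, 0.5])
```

For this example five of the eight equally likely input combinations make the function high, so the result is 0.625.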
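We consider next the density propagation problem. Recall the concept of Boolean difference: if $y$ is a Boolean function that depends on $x$, then the Boolean difference of $y$ with respect to $x$ is defined as:
$$\frac{\partial y}{\partial x} \triangleq y|_{x=1} \oplus y|_{x=0}, \qquad (4.1)$$
where $\oplus$ denotes the exclusive-or operation. A transition at $x$ causes a transition at $y$ when $\partial y/\partial x$ is 1, and does not when it is 0. Since the input processes are SSS, then $\partial y/\partial x$ is also SSS; in fact it is a companion process with equilibrium probability $P(\partial y/\partial x)$. We are now ready to prove the following: if the inputs $x_i(t)$, $i = 1, \ldots, n$, of a zero-delay logic module are independent companion processes with densities $D(x_i)$, then the densities at its outputs $y_j(t)$ are given by:
$$D(y_j) = \sum_{i=1}^{n} P\!\left(\frac{\partial y_j}{\partial x_i}\right) D(x_i). \qquad (4.2)$$
Proof: Counting the output transitions in an interval of length $T$ as those input transitions that occur while the corresponding Boolean difference is 1, and taking expected values, gives an expression for $E[\mathbf{n}_{y_j}(T)]$ which, dividing by $T$ and using (3.7), leads to the required result (4.2).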
If the Boolean difference is available, then evaluating $P(\partial y_j/\partial x_i)$ is no more difficult than evaluating the probability of a Boolean function knowing those of its inputs. Note that if $M$ is a 2-input AND gate, with inputs $x_1$ & $x_2$, and output $y$, then $P(\partial y/\partial x_1) = P(x_2)$. In more complex situations, the "compose" and "xor" functions of the BDD package [11] can be used to evaluate the Boolean difference using equation (4.1). The BDD-based algorithm given in the appendix for computing the probability of a Boolean function can then be used to compute $P(\partial y_j/\partial x_i)$.
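The per-module computation can be sketched as follows, assuming the propagation rule in which the output density is the sum of the input densities, each weighted by the probability of the corresponding Boolean difference, and taking the Boolean difference to be the exclusive-or of the two cofactors. Truth-table enumeration stands in here for the BDD operations, so the sketch is only practical for small modules; all function names are illustrative.

```python
from itertools import product


def probability(f, probs):
    """Probability that f is 1 for independent inputs, by truth-table enumeration."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        if f(*bits):
            weight = 1.0
            for b, p in zip(bits, probs):
                weight *= p if b else 1.0 - p
            total += weight
    return total


def boolean_difference(f, i):
    """Return a function that is true where toggling input i toggles f."""
    def diff(*bits):
        hi = bits[:i] + (1,) + bits[i + 1:]   # cofactor with x_i = 1
        lo = bits[:i] + (0,) + bits[i + 1:]   # cofactor with x_i = 0
        return bool(f(*hi)) != bool(f(*lo))   # exclusive-or of the cofactors
    return diff


def output_density(f, probs, densities):
    """Sum of input densities weighted by the Boolean-difference probabilities."""
    return sum(probability(boolean_difference(f, i), probs) * d
               for i, d in enumerate(densities))


# 2-input AND gate: the weight on x1 is P(x2), and vice versa.
d_and = output_density(lambda a, b: a and b, [0.5, 0.2], [3.0, 4.0])
# 2-input XOR gate: every input transition reaches the output.
d_xor = output_density(lambda a, b: a ^ b, [0.5, 0.2], [3.0, 4.0])
```

With these inputs the AND gate gives 0.2 x 3.0 + 0.5 x 4.0 = 2.6, and the XOR gate gives 3.0 + 4.0 = 7.0.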
Global propagation strategy
The assumption was made at the beginning of the previous sub-section that the inputs to a module are independent. Even though this is true at the primary inputs (as we have assumed), it may not be true for internal nodes. Circuit topologies that include reconvergent fanout and feedback will cause internal nodes to be correlated, and destroy the independence property. This problem is central to any circuit analysis based on a statistical representation of signals, and can usually be taken care of by using heuristics that trade off accuracy for speed [5-9].
Based on our previous experience with the propagation of probability waveforms [12], we have found that, if the modules are large enough so that tightly coupled nodes (such as in latches or small cells) are kept inside the same module, then the coupling outside the modules is sufficiently low to justify an independence assumption. While this does take into account the correlations inside a module, it may create inaccuracies because internal delays are lumped together. Furthermore, performance may be sacrificed because the BDDs can become too large. Section 6 will investigate this speed-accuracy trade-off.
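The single pass from inputs to outputs, and the effect of reconvergent fanout on it, can be sketched with a toy gate-level netlist. The sketch assumes zero delays, a netlist already listed in topological order, and only NOT, 2-input AND and 2-input OR gates, for which the Boolean-difference probabilities have simple closed forms; the netlist format and names are hypothetical and are not DENSIM's input format.

```python
def propagate(netlist, primary_inputs):
    """One pass over a topologically ordered netlist.

    netlist        -- list of (output_name, gate_type, input_names)
    primary_inputs -- dict: name -> (probability, density)
    Returns a dict with (probability, density) for every node, computed
    as if the inputs of each gate were independent.
    """
    stats = dict(primary_inputs)
    for out, gate, ins in netlist:
        if gate == "NOT":
            p, d = stats[ins[0]]
            stats[out] = (1.0 - p, d)
        elif gate in ("AND", "OR"):
            (p1, d1), (p2, d2) = stats[ins[0]], stats[ins[1]]
            if gate == "AND":
                # dy/dx1 = x2 and dy/dx2 = x1
                stats[out] = (p1 * p2, p2 * d1 + p1 * d2)
            else:
                # dy/dx1 = NOT x2 and dy/dx2 = NOT x1
                stats[out] = (1.0 - (1.0 - p1) * (1.0 - p2),
                              (1.0 - p2) * d1 + (1.0 - p1) * d2)
        else:
            raise ValueError("unsupported gate type: " + gate)
    return stats


inputs = {"a": (0.5, 1.0), "b": (0.5, 1.0), "c": (0.5, 1.0)}

# Tree-structured circuit: no reconvergent fanout, so independence holds.
tree = [("n1", "AND", ("a", "b")), ("n2", "NOT", ("c",)), ("y", "OR", ("n1", "n2"))]
tree_stats = propagate(tree, inputs)

# Reconvergent fanout: z = a AND (NOT a) is identically 0 with zero delays,
# but gate-level propagation treats "a" and "na" as independent.
reconvergent = [("na", "NOT", ("a",)), ("z", "AND", ("a", "na"))]
reconv_stats = propagate(reconvergent, inputs)
```

For the tree circuit the pass gives a probability of 0.625 and a density of 1.25 at `y`. For the reconvergent circuit it reports a probability of 0.25 and a density of 1.0 at `z`, although the zero-delay signal there never leaves 0; keeping both gates inside one module, whose Boolean function is the constant 0, would remove the error.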
Practical Applications
Once the density at every internal node has been computed, these values can be used in a post-processing step to investigate various circuit properties. We present here four different applications of the density concept in CMOS circuits.
Average power/ground bus currents
Consider the problem of computing the average current in the power or ground bus branches. We will consider only the case of the power bus, since that of the ground bus is identical.
A convenient approximation is to view the bus as an interconnection of lumped resistors, with lumped capacitors to ground, i.e., a linear RC network. Some nodes of this network are connected to the external $V_{dd}$ power supply, while others (referred to as contacts) are connected to the various circuit components, e.g. CMOS gates, drawing power supply current. Let $i_k(t)$, $k = 1, 2, \ldots, n$, be the various current waveforms that the circuit draws at these contact nodes. Let $i_j(t)$, $j = 1, 2, \ldots, m$, be the various current waveforms in the bus branches. The bus can now be viewed as a linear time-invariant (LTI) system whose outputs $i_j(t)$ are related to its inputs $i_k(t)$ by the convolutions:
$$i_j(t) = \sum_{k=1}^{n} h_{jk}(t) * i_k(t), \qquad (5.1)$$
where $h_{jk}(t)$ is the impulse response relating contact $k$ to branch $j$. Taking time-averages of both sides, and denoting the time-averages of $i_j(t)$ and $i_k(t)$ by $I_j$ and $I_k$, gives:
$$I_j = \sum_{k=1}^{n} \left( \int_{0}^{\infty} h_{jk}(t)\,dt \right) I_k, \qquad (5.2)$$
in which each integral is the DC gain from contact $k$ to branch $j$. In other words, if the time-averages of the contact currents are themselves applied at the contacts, and the bus is solved (i.e., simulated) as a resistive network (DC solution), the resulting branch currents are the required time-averages of the bus currents. To complete the solution, we will now express the time-average contact currents $I_k$ in terms of the transition densities inside the circuit.
Let $D(x)$ be the transition density at the output node $x$ of a CMOS gate that draws power supply current $i(t)$ whose time-average is $I$. Furthermore, let $C_n$ ($C_p$) be the total capacitance from $x$ to the ground (power) bus connection. These capacitances are the sum of (i) any lumped capacitance tied to the gate output, (ii) MOSFET drain and source capacitances in the gate output stage, and (iii) MOSFET gate capacitances in any logic gates driven by $x$. As such, they are related to both load capacitance and transistor strength. It has been established [13] that a good estimate of the supply current $i(t)$ can be obtained by looking only at its capacitive charging/discharging component. Since the charge drawn from the supply whenever the gate switches low-to-high (high-to-low) is $V_{dd} C_n$ ($V_{dd} C_p$), and since half of the transitions are in each direction, it follows that:
$$I = \frac{1}{2} V_{dd}\, C\, D(x), \qquad (5.3)$$
where $C = C_n + C_p$ is the total capacitance at the output node. Equations (5.3) and (5.2) provide an efficient technique for computing the average current in every branch of the bus, given the transition densities at all circuit nodes. It is significant that this requires only a single DC simulation of the resistive network representing the power bus; no transient simulation is required, and the bus capacitance is irrelevant.
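As an illustration of the post-processing step, the sketch below assumes that the average contact current of a gate is half the product of the supply voltage, the total output capacitance and the transition density, which is consistent with the power expression given in the next sub-section. To avoid a general resistive-network solver it also assumes a purely radial rail fed from one end, a hypothetical special case in which the DC branch currents follow from Kirchhoff's current law alone. All names and numerical values are made up for illustration.

```python
def contact_current(vdd, c_total, density):
    """Average supply current of a gate: 0.5 * Vdd * C * D (assumed form)."""
    return 0.5 * vdd * c_total * density


def radial_bus_branch_currents(contact_currents):
    """DC branch currents of a rail fed from one end.

    contact_currents[k] is drawn at the k-th tap, counted from the supply.
    Branch k (between tap k-1 and tap k) carries the current of tap k and
    of every tap further from the supply.
    """
    branch = []
    running = 0.0
    for i_k in reversed(contact_currents):
        running += i_k
        branch.append(running)
    branch.reverse()
    return branch


vdd = 5.0  # volts (illustrative)
# (total output capacitance in farads, transition density in transitions/s)
gates = [(100e-15, 2.0e7), (50e-15, 1.0e7), (200e-15, 5.0e6)]
contacts = [contact_current(vdd, c, d) for c, d in gates]
branches = radial_bus_branch_currents(contacts)
```

With these values the three contacts draw about 5 uA, 1.25 uA and 2.5 uA on average, so the branch nearest the supply carries about 8.75 uA, the next about 3.75 uA, and the last about 2.5 uA.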
Average power dissipation
As a direct consequence of the above results, it should be clear that the overall average power dissipation is given by $P_{av} = \frac{1}{2} V_{dd}^2 \sum_i C_i D(x_i)$, summing over all circuit nodes $x_i$.
Electromigration failures
Electromigration [14, 15] is a major reliability problem caused by the transport of atoms in a metal line due to the electron flow. Under persistent current stress, this can cause deformations of the metal, leading to either short or open circuits. The time-to-failure is a lognormally-distributed random variable. It is usually characterized by the median (or mean) time-to-failure (MTF) [15], which depends on the current density in the metal line. The models for MTF prediction under pulsed-DC or AC current stress are still controversial. Some recent models [16] predict that, at least under pulsed-DC conditions, the average current is sufficient to predict the MTF, as follows:
where $A$ is a parameter that does not depend on the current and $I$ is the average current. However, other recent studies [17] show that the situation is much more complicated. In any case, even if $I$ is not sufficient by itself to estimate the MTF, it represents a first-order approximation of the current stress in the wire. Thus equations (5.2) and (5.3), based on the transition density, provide the required average current values $I$, and help identify potential electromigration problems in the power/ground bus branches.
Hot electron degradation
As MOSFET devices are scaled down to very small dimensions, certain physical mechanisms start to cause degradation in the device parameters, causing major reliability problems. One such mechanism is the injection of "hot electrons" (or, in general, hot carriers) into the MOS gate oxide layer [14]. Trapping of these carriers in the gate insulator layer causes degradation in the transistor transconductance and/or threshold voltage.
It is widely accepted that the MOSFET substrate current is a good indicator of the severity of the degradation. In fact one can write an expression for the "age" of a transistor (i.e., how far it is down the degradation path) that has been operating for time $T$ as follows [18]:
In order to see how this can be used in a CMOS circuit, consider a MOSFET in a CMOS inverter whose output node is $x$. It can be shown that both $I_{sub}(t)$ and $I_{ds}(t)$ are non-zero only when the inverter is switching (this also holds for any CMOS gate). Whenever the inverter switches, it generates two current pulses $I_{sub}(t)$ and $I_{ds}(t)$. The pulses resulting from different switching events are identical except for a dependence on the rise/fall time at the inverter input. If one assumes a certain nominal rise/fall time at the input, then using (5.5) one can compute the incremental aging due to $0 \to 1$ and $1 \to 0$ transitions at the inverter output; call these $A_{lh}$ and $A_{hl}$. Then (5.5) may be written:
$$\mathrm{Age}(T) = (A_{lh} + A_{hl})\, \frac{n_x(T)}{2}. \qquad (5.6)$$
Degradation due to hot carriers takes years to manifest itself. In other words, $T$ and $n_x(T)$ are very large, which (using (2.2)) permits the approximation $n_x(T) \approx T\, D(x)$, and leads to:
$$\mathrm{Age}(T) \approx \frac{A_{lh} + A_{hl}}{2}\, T\, D(x). \qquad (5.7)$$
Thus, if CMOS gates are pre-characterized to estimate the incremental damage to their transistors due to a single output transition, then the transition density provides the means to predict transistor aging over extended time periods using (5.7).
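A minimal sketch of this use of the density is given below. It assumes that the accumulated age over a long time is the average per-transition damage, half the sum of the low-to-high and high-to-low increments, multiplied by the expected number of transitions, the density times the elapsed time. The increments and the density in the example are made-up numbers in arbitrary "age" units, not characterization data.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def transistor_age(a_lh, a_hl, density, duration):
    """Approximate age after `duration` seconds of operation.

    a_lh, a_hl -- incremental aging per low-to-high / high-to-low output transition
    density    -- transition density of the gate output (transitions per second)
    """
    expected_transitions = density * duration
    # Half of the transitions are low-to-high and half are high-to-low.
    return 0.5 * (a_lh + a_hl) * expected_transitions


# Hypothetical gate: ten years of operation at 10 million transitions per second.
age_10y = transistor_age(2.0e-18, 1.0e-18, 1.0e7, 10 * SECONDS_PER_YEAR)
```

The point of the sketch is that the only circuit-dependent quantity needed at run time is the density of the gate output; the per-transition increments come from a one-time characterization of the gate.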
Experimental Results
We have implemented this approach in a prototype density simulator, called DENSIM, that takes a description of a circuit in terms of its Boolean modules and gives the transition density at every node. It also accepts values for transition density and equilibrium probability at the primary inputs. Our current implementation is restricted to combinational (non-feedback) circuits. Every Boolean module should be an instance of a model from a simulation library built by a separate model compiler called MODCOM. MODCOM uses an input specification in the form of Boolean equations to build a BDD representation of the module outputs and the relevant Boolean differences, and stores this in a model file that DENSIM can use. We present below the results of a number of test cases that were used to investigate the accuracy and efficiency of this technique.
In order to assess the accuracy of the results, we have devised a test by which randomly-generated logic waveforms are fed to the circuit primary inputs and propagated into the circuit by logic simulation based on the BDDs. The logic simulator uses assignable non-zero delays, scaling them based on the fanout load at every module output. The input waveforms must have the same probability and density values given to DENSIM, and are generated as follows. Starting with $P(x)$ and $D(x)$ values, we solve for $\mu_0$ and $\mu_1$ from (2.3a, b). We then use (arbitrarily) an exponentially distributed random number generator to produce sequences of inter-transition times that have the means $\mu_0$ and $\mu_1$ (the theory presented above holds for any distribution of inter-transition times). Starting from arbitrary initial values, the waveforms are built using these sequences.
From the logic simulation results, we estimate the average number of transitions per unit time for every circuit node. For a large number of input transitions, this number should converge to the transition density, according to equation (2.2).
We also estimate the fraction of time that the signal spends in the high state and check if that converges to the equilibrium probability, in accordance with 2.1.
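The waveform-generation test can be sketched for a single signal as follows. The relations used to solve for the mean high and low times are an assumption here: the fraction of time high is taken as the mean high time over the sum of the mean high and low times, and the density as two transitions per such sum. Exponential dwell times are one arbitrary choice of distribution, and the function names and the fixed seed are illustrative.

```python
import random


def mean_times(p, d):
    """Mean low and high inter-transition times for a target P and D.

    Assumes P = mu1 / (mu0 + mu1) and D = 2 / (mu0 + mu1).
    """
    mu1 = 2.0 * p / d
    mu0 = 2.0 * (1.0 - p) / d
    return mu0, mu1


def simulate(p, d, n_transitions, seed=0):
    """Build a random 0/1 waveform and return its estimated (P, D)."""
    rng = random.Random(seed)
    mu0, mu1 = mean_times(p, d)
    value = 0          # arbitrary initial value
    t = 0.0
    time_high = 0.0
    for _ in range(n_transitions):
        # Exponentially distributed dwell time with the mean of the current state.
        dwell = rng.expovariate(1.0 / (mu1 if value else mu0))
        if value:
            time_high += dwell
        t += dwell
        value = 1 - value
    return time_high / t, n_transitions / t


p_est, d_est = simulate(p=0.3, d=2.0, n_transitions=200000)
```

With a few hundred thousand transitions the two estimates settle close to the target values of 0.3 and 2.0; with only a few hundred they can still be visibly off, which is the convergence issue that the comparison against logic simulation has to contend with.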
In the first few test cases to be presented, the modules were chosen to contain all reconvergent fanout. Thus all signals are independent and the results from DENSIM should agree exactly with those from logic simulation. We will then move on to other test cases where signal correlation does become an issue and will study the speed-accuracy trade-off involved.
As a first test case, consider a single logic module with 8 inputs and one output that implements the Boolean function: Z = ABFD + CFD + ABHD + CHD + ABFG + CFG + ABHG + CHG + AFE + ADE + CFE + CDE. Using input values of $P = 0.5$ ... The horizontal axis in this figure is the cpu time elapsed during the logic simulation run, and the vertical axis is the cumulative values of density and probability at the output node. The two horizontal dashed lines are the values of density and probability computed by DENSIM and the vertical dashed line (indicated by the arrow) shows the total cpu time required by the DENSIM run. The other vertical line indicates the cpu time required to observe 1000 logic transitions at node Z.
The second test case is the 4-bit ALU function generator SN54181 from the TI TTL data book. This circuit has 75 logic gates and is shown in Fig. 4.
If we consider the whole circuit as a single Boolean module, then the effects of all internal node correlations are taken care of, and the DENSIM results should, again, be exact. It takes ...
Figure 6. Results for node X of the ALU.
The preceding test cases show that, even for single-module circuits, computing the density values using DENSIM instead of traditional logic simulation is accurate, much faster, and avoids lengthy simulations involving thousands of logic transitions. This observation will be further reinforced by the results presented below.
Moving on to multi-module circuits, consider a 32-bit binary ripple adder. In this case, we chose the full-adders to be our Boolean modules. This again leads to a situation where all reconvergent fanout and signal correlation is inside the modules, and where DENSIM results should be exact. DENSIM takes only 0.46 cpu seconds (SUN), as opposed to the 5 minutes required for the logic simulation results to converge, as shown in Figs. 7 and 8, respectively.
An interesting feature of the result in Fig. 7 is the prolonged "flat" part of the curve around 1000 transitions. This illustrates the point made in the introduction that it is impossible to tell beforehand exactly when a logic simulation run should be terminated. In this case, if one were monitoring the density values from logic simulation with the intention of terminating the run when the density "converged to something," one might terminate the run at 1000 transitions, getting the wrong result.
We now move on to a consideration of the effects of signal correlation caused by reconvergent fanout. As pointed out in sub-section 4.2, one can accurately handle these effects by keeping all reconvergent fanout within the Boolean modules. However, since large BDDs are expensive to build and maintain, this can become impractical and leads to a speed-accuracy trade-off. To illustrate this point, we again consider the ALU circuit in Fig. 4. We partition the circuit into the 19 smaller modules shown in the figure and examine the resultant density values at all nodes that are module outputs. By comparing these to the values obtained from the single-module run on this circuit, we get the error histogram shown in Fig. 9. In this case there was a less than 29% loss in accuracy for a 15X gain in speed. For a further comparison, we ran a logic simulation on the ALU using its gate-level representation, and compared the resulting densities to those observed in the above 19-module run. The error histogram in this case is shown in Fig. 10. All but one of the densities are within 23%. The single point of poor agreement is at node AB, which is a reconvergent node for all four ALU outputs F0-F3.
Finally, we present some results obtained for the ISCAS-85 benchmark circuits [19]. In this case we used a "lowest-level partitioning" in which every logic gate was represented as a separate Boolean module. This provides the fastest, but potentially the least accurate, DENSIM run. The 10 ISCAS circuits, their sizes, and the total DENSIM cpu time on a CONVEX c240 are shown in Table 1.
The execution times are excellent, taking under 10 seconds even for the largest circuit. As for the accuracy, it becomes exceedingly difficult to assess for large circuits, because the BDDs become unacceptably large. Even though BDDs for these circuits have been built by other researchers, the BDDs that we require are much larger because they must include the Boolean function at every internal node as well as the output nodes, along with all the associated Boolean difference terms. Thus we are reduced to assessing the accuracy by obtaining a best-possible estimate of the densities from long logic simulation runs. Even then, it is practically impossible to examine the density plot for every internal node to determine whether the run was long enough for it to converge. Based on several test cases, however, we found that an average of 1000 transitions per input node seems to be enough to approximate most node densities. Such logic simulation runs were performed on all 10 circuits. To tabulate the results, we show the density values averaged over all circuit nodes in Table 2. The third column in the table also lists the total cpu time required on the CONVEX to finish the logic simulation run in each case. Even for the smallest circuits, such long simulation runs meant that hundreds of thousands of internal events had to be simulated. Comparing the execution times between Tables 1 and 2 clearly demonstrates the speed advantage of this approach (e.g., 5.67 sec. versus 8 hrs. 38 mins. for c6288). As for the average density values, the agreement is very good for c432 & c3540, acceptable for c880 & c2670, and poor for the other circuits. These results highlight the need to better account for signal correlation if one is to obtain consistently good results in the general case.
In general, the problem of estimating equilibrium probabilities, let alone transition densities, is NP-hard. As a result, no single efficient solution will work well in all cases. The partitioning strategy in general cases, and the speed-accuracy trade-off, are the focus of our continuing research efforts in this area.
Summary and Conclusions
To summarize, we have observed that a common thread that runs through most causes of run-time failure is the extent of circuit activity, i.e., the rate at which its nodes are switching. We have defined a new measure of circuit activity, called the transition density. Based on a stochastic model of logic signals, we have also presented an algorithm to propagate the density from the primary inputs to internal nodes. To illustrate the practical significance of these results, we have considered four ways in which the density values can be used to study circuit reliability by estimating (1) the average power & ground currents, (2) the average power dissipation, (3) the susceptibility to electromigration failures, and (4) the extent of hot-electron degradation. We have also presented experimental results that demonstrate the practical significance and power of this approach. We envision that the computation of density values inside the circuit can be used as a pre-processing step, and the resulting information applied to these and possibly other reliability problems.

another signal $x'(t)$ so that $x'(t) = x(t)$, for $t \ge t_0$, and $x'(t) = x(t_0 + (t_0 - t))$, for $t < t_0$, then $x'(t)$ has an infinity of transitions in both time directions and it can be shown that $D(x) = \frac{1}{2} D(x')$. Thus, the existence of $D(x)$ is covered by the general case of a signal with an infinity of transitions in both time directions, to be considered next.
In the general case of an infinity of transitions in both time directions, $x(t)$ cannot have an infinite inter-transition time, so that $\mu_f = \mu$. It will simplify the discussion below to refer to $\mu$ rather than $\mu_f$. Consider Fig. A1 where, for every $T$, $t_1$ is the time of the last transition of $x(t)$ before $-T/2$, $t_2$ is that of the first transition after $-T/2$, $t_3$ is that of the last transition before $+T/2$, and $t_4$ is that of the first transition after $+T/2$. There are $n_x(T)$ transitions between $-T/2$ and $+T/2$, including $t_2$ and $t_3$. Thus there are $n_x(T) - 1$ inter-transition time intervals between $t_2$ and $t_3$. Since $\lim_{T \to \infty} n_x(T) = \infty$,

(ii) If $\mu_0$ and $\mu_1$ exist, and $\mu = (\mu_0 + \mu_1)/2$ is non-zero, then $\mu_f$ exists and is non-zero and (2.2) exists. Existence of $\mu_0$ and $\mu_1$ also means that $x(t)$ has no infinite inter-transition times, so that $D(x) = 1/\mu$, and we directly get (A.3b):

$$D(x) = \frac{1}{\mu} = \frac{2}{\mu_0 + \mu_1} \tag{A.3b}$$
To obtain (A.3a), let $n_1(T)$ be the number of whole 1-pulses of $x(t)$ in $\left[-\frac{T}{2}, +\frac{T}{2}\right]$. It is easy to verify that $\left| n_1(T) - \frac{n_x(T)}{2} \right| \le 1$, which gives $\lim_{T \to \infty} \frac{n_1(T)}{T} = \frac{1}{2} D(x)$. Consider Fig. A2 where, for every $T$, $t_1$ is the time of the last $0 \to 1$ transition of $x(t)$ before $-T/2$, $t_2$ is that of the first $0 \to 1$ transition after $-T/2$, $t_3$ is that of the last $1 \to 0$ transition before $+T/2$, and $t_4$ is that of the first $1 \to 0$ transition after $+T/2$.

and the proof is complete.

In order to illustrate how mild the condition of proposition 2 is, one can prove another, more stringent, sufficient condition for the existence of $D(x)$, namely, that there exist a non-zero lower bound on the inter-transition times. The proof is as follows. Consider the auxiliary logic signal built as follows: it is 0 everywhere, except on intervals of width equal to this lower bound, centered at every transition time-point of $x(t)$, where it is 1. It is clear that

By proposition 1, and since the lower bound is non-zero, the density exists. This condition can be easily satisfied in all practical cases.
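As a numerical sanity check of two facts used in this appendix, the sketch below takes the simplest possible case, a strictly periodic signal that is high for `mu1` time units and low for `mu0` time units. It counts transitions in a window of length `T` to compare the density estimate with the reciprocal of the mean inter-transition time `(mu0 + mu1) / 2`, and it counts whole 1-pulses to check that they differ from half the transition count by at most 1. The function names and the chosen values are illustrative assumptions, not part of the original derivation.

```python
import math


def edge_times(mu0, mu1, T):
    """Rising and falling edge times inside [-T/2, +T/2] for a periodic signal
    that is high for mu1, then low for mu0, with a rising edge at t = 0."""
    period = mu0 + mu1
    half = T / 2
    k_lo = math.floor(-half / period) - 1
    k_hi = math.ceil(half / period) + 1
    ks = range(k_lo, k_hi + 1)
    rising = [k * period for k in ks if -half <= k * period <= half]
    falling = [k * period + mu1 for k in ks if -half <= k * period + mu1 <= half]
    return rising, falling


def density_estimate(mu0, mu1, T):
    """n_x(T) / T: number of transitions in the window divided by its length."""
    rising, falling = edge_times(mu0, mu1, T)
    return (len(rising) + len(falling)) / T


def whole_one_pulses(mu0, mu1, T):
    """n_1(T): 1-pulses whose rising and falling edges both lie in the window."""
    rising, _ = edge_times(mu0, mu1, T)
    return sum(1 for r in rising if r + mu1 <= T / 2)


mu0, mu1, T = 3, 1, 1000
mean_interval = (mu0 + mu1) / 2            # mean inter-transition time
rising, falling = edge_times(mu0, mu1, T)
n_x = len(rising) + len(falling)
n_1 = whole_one_pulses(mu0, mu1, T)

print(n_x / T, 1 / mean_interval)          # density estimate vs. 1 / mean interval
print(abs(n_1 - n_x / 2) <= 1)             # whole 1-pulses vs. half the transitions
```

For a finite window the estimate differs from the limit only through the edges near the window boundaries, so the gap shrinks as `T` grows.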
Appendix B Using BDDs for Probability Propagation
We will briefly review the concept of a Binary Decision Diagram (BDD) [10, 11] and then present a new application for BDDs as tools for computing the probability of a Boolean function. Consider the Boolean function $y = x_1 x_2 + x_3$, which can be represented by the BDD shown in Fig. B1. The Boolean variables $x_i$ are ordered, and each level in the BDD corresponds to a single variable. Each level may contain one or more BDD nodes at which one can branch in one of two directions, depending on the value of the relevant variable. For example, suppose that $x_1 = 1$, $x_2 = 0$, and $x_3 = 1$. To evaluate $y$, we start at the top node, branch to the right since $x_1 = 1$, then branch to the left since $x_2 = 0$, and finally branch to the right since $x_3 = 1$ to reach the terminal node "1". Thus the corresponding value of $y$ is 1.
The importance of the BDD representation is that it is canonical, i.e., it does not depend on the Boolean expression used to express the function. In our case, if the function were expressed as $y = x_3 + x_1 (x_2 + x_3)$, an equivalent representation, it would have the same BDD. BDDs have been found to be an efficient representation for manipulating Boolean functions, both in terms of memory and execution time. For example, checking if a Boolean function is satisfiable can be done in time that is linear in the number of variables.
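Canonicity can be seen on the running example with a small sketch. The builder below is a hypothetical illustration, not the BDD package of [11]: it constructs a reduced, ordered BDD by exhaustive Shannon expansion (exponential in the number of variables, so only suitable for tiny functions), sharing identical nodes through a table and dropping nodes whose two branches coincide. The second expression is written with parentheses as $x_3 + x_1 (x_2 + x_3)$, which is an assumed reading.

```python
class BddBuilder:
    """Tiny reduced ordered BDD builder; node ids 0 and 1 are the terminals."""

    def __init__(self, num_vars):
        self.num_vars = num_vars
        self.unique = {}  # (level, low id, high id) -> node id

    def _mk(self, level, low, high):
        if low == high:              # redundant test: skip the node
            return low
        key = (level, low, high)
        if key not in self.unique:   # share structurally identical nodes
            self.unique[key] = len(self.unique) + 2
        return self.unique[key]

    def build(self, f, assignment=()):
        """Return the id of the BDD node for f restricted by `assignment`."""
        level = len(assignment)
        if level == self.num_vars:
            return 1 if f(*assignment) else 0
        low = self.build(f, assignment + (0,))    # cofactor with variable = 0
        high = self.build(f, assignment + (1,))   # cofactor with variable = 1
        return self._mk(level, low, high)


builder = BddBuilder(3)
a = builder.build(lambda x1, x2, x3: (x1 and x2) or x3)
b = builder.build(lambda x1, x2, x3: x3 or (x1 and (x2 or x3)))
c = builder.build(lambda x1, x2, x3: x1 and x3)   # a different function

print(a == b, a == c, len(builder.unique))
```

Both expressions map to the same root node, while the different function gets a different one. In this representation a function is satisfiable exactly when its root is not the terminal 0.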
Let $y = f(x_1, \ldots, x_n)$ be a Boolean function. We will show that, given signal probabilities for the variables $x_i$, and given that these variables are independent random variables, the probability of the function $f$ can be obtained in time that is linear in the size of its BDD representation. By Shannon's expansion:

$$y = x_1 f_{x_1} + \bar{x}_1 f_{\bar{x}_1} \tag{B.1}$$

where $f_{x_1} = f(1, x_2, \ldots, x_n)$ and $f_{\bar{x}_1} = f(0, x_2, \ldots, x_n)$ are the cofactors of $f$ with respect to $x_1$. Since $x_1 \bar{x}_1 = 0$, then:

$$P(y) = P(x_1 f_{x_1}) + P(\bar{x}_1 f_{\bar{x}_1}) \tag{B.2}$$

Since the cofactors with respect to $x_i$ do not depend on $x_i$, and since all variables are independent, then:

$$P(y) = P(x_1) P(f_{x_1}) + P(\bar{x}_1) P(f_{\bar{x}_1}) \tag{B.3}$$
This equation shows how the BDD is to be used to evaluate $P(y)$. The two nodes that are descendants of $y$ in the BDD correspond to the cofactors of $f$. The probability of the cofactors can then be expressed in the same way, in terms of their descendants. Thus a depth-first traversal of the BDD, with a post-order evaluation of $P(\cdot)$ at every node, is all that is required. We have implemented this using the "scan" function of the BDD package [11].
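The traversal can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the "scan"-based implementation mentioned above: the `Node` type, the `probability` function, and the hand-built BDD for $y = x_1 x_2 + x_3$ (variable order $x_1, x_2, x_3$) are hypothetical names introduced here. Each node is evaluated once, after its children, by applying (B.3) to the variable it tests.

```python
from typing import Dict, NamedTuple, Union


class Node(NamedTuple):
    var: str       # variable tested at this node
    low: "BDD"     # child for var = 0 (cofactor with var = 0)
    high: "BDD"    # child for var = 1 (cofactor with var = 1)


BDD = Union[Node, bool]   # terminals are the Python booleans False / True


def probability(root: BDD, p: Dict[str, float]) -> float:
    """P(function = 1), assuming independent variables with P(var = 1) = p[var]."""
    memo: Dict[int, float] = {}   # node identity -> probability, so shared nodes are visited once

    def visit(node: BDD) -> float:
        if node is True:
            return 1.0
        if node is False:
            return 0.0
        key = id(node)
        if key not in memo:
            p1 = p[node.var]
            # Equation (B.3) applied at this node, children evaluated first.
            memo[key] = p1 * visit(node.high) + (1.0 - p1) * visit(node.low)
        return memo[key]

    return visit(root)


# Hand-built BDD for y = x1 x2 + x3; the x3 node is shared by two parents.
n3 = Node("x3", False, True)
n2 = Node("x2", n3, True)
n1 = Node("x1", n3, n2)

p_half = probability(n1, {"x1": 0.5, "x2": 0.5, "x3": 0.5})
p_mixed = probability(n1, {"x1": 0.2, "x2": 0.7, "x3": 0.1})
print(p_half, p_mixed)
```

For independent inputs the result can be cross-checked directly, since $P(x_1 x_2 + x_3) = 1 - (1 - P(x_1) P(x_2))(1 - P(x_3))$. The memo table is what keeps the cost linear in the number of BDD nodes rather than in the number of paths.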
