Abstract-With the increasing importance of run-time leakage power dissipation (around 55% of total power), it has become necessary to estimate it accurately, not only as a function of input vectors but also as a function of process parameters. Leakage power corresponding to the maximum vector serves as an upper bound for run-time leakage and is a measure of reliability. In this work, we address the problem of accurately estimating the probabilistic distribution of the maximum runtime leakage power in the presence of variations in process parameters such as threshold voltage, critical dimensions and doping concentration. Both sub-threshold and gate leakage current are considered. A heuristic approach is proposed to determine the vector that causes the maximum leakage power under the influence of random process variations. This vector is then used to estimate the lognormal distribution of the total leakage current of the circuit by summing up the lognormal leakage current distributions of the individual standard cells at their respective input levels. The proposed method has been effective in accurately estimating the leakage mean, standard deviation and probability density function (PDF) of ISCAS-85 benchmark circuits. The average errors of our method compared with near-exhaustive random vector testing for mean and standard deviation are 1.32% and 1.41%, respectively.
I. INTRODUCTION
Technology scaling into the deep nanometer geometries has seen an increase in process variability due to lithographic inaccuracies and this problem magnifies with every generation. Though the advantages of scaling are evident and it is necessary to continue with this trend, it is also important to consider variability data in order to make accurate estimations of performance measures. It has become highly important to accurately estimate performance measures such as timing, leakage power and noise in the design phase itself in order to increase the parametric yield after fabrication. It has been shown that a 30% variation in circuit frequency induced by parameter variations can result in nearly 20X variation in leakage power [1] . Variations in process parameters such as channel length (L ch ), threshold voltage (V th ), channel doping concentration (N ch ), gate oxide thickness (T ox ) and gate width (W gate ) are the major contributors to variability in timing and leakage power. It has also been shown that this dependence on process parameters is linear in the case of timing [2] and exponential in the case of leakage power [3] .
Substantial work has been done in the area of analysis, estimation and optimization of statistical static timing [2] , [4] , [5] , [6] , [7] , [8] but statistical leakage power measurement and optimization is still emerging [3] , [9] , [10] , [11] .
Moreover, statistical leakage power estimation has become increasingly important because of the exponential dependence of leakage on several process parameters. The traditional methods of leakage power estimation, such as nominal analysis, underestimate leakage power by a large margin whereas corner analysis, on the other hand, overestimates leakage power dissipation. As a result of under-estimation, there may be a failure in meeting the power yield and this may also cause reliability issues in chips that do meet the power yield during their life span. Over-estimation on the other hand may result in designing unnecessary guard bands and failure to meet timing specifications [12] . Therefore, it is extremely important to accurately estimate leakage power as a function of process variations.
The terms leakage power and leakage current are used interchangeably in this paper, since leakage power is the product of the leakage current, which is the variable term, and the supply voltage, which is treated as a constant in this work. Leakage current comprises several components [13], of which subthreshold and gate leakage are the prominent ones [14]. With leakage power contributing more than 55% of the total power in present day technologies (32nm and beyond) and this trend being predicted to only increase in future technologies [15], [16], accurate estimation methods that consider all the important types of leakage have become a necessity.
A chip can settle in an idle state for a considerable amount of time or can enter stand-by mode several times during its lifespan. It enters an idle state or stand-by mode with a different set of input vectors each time. In this scenario, it is not assured that the leakage power specifications are met each time. Excessive leakage power dissipation for a long period of time results in a drastic increase in the thermal profile of most chips during their operation and makes them susceptible to failure. Maximum leakage power estimation provides an upper bound and guarantees that the design constraints are met irrespective of the circuit input state [17] . Maximum leakage can also be used to find hot-spots in a physical design. It can also be used to estimate the worst case battery life of a portable device [17] .
Traditionally, leakage power was considered important only in the stand-by mode, whereas dynamic power was considered important in the active mode of operation. However, due to shrinking physical dimensions, the contribution of dynamic power to the total power has reduced with the growth of leakage power [18]. With its increasing importance, the leakage contribution during run-time has become a deciding factor in determining the maximum power bound for the reliability of a chip and serves as a measure of its ever increasing thermal profile. This leakage power is called runtime leakage [19] and is becoming as important as dynamic power dissipation [20]. Runtime leakage is vector dependent and changes each time the input vector changes. It has been well established that the maximum leakage of a circuit can be greater than the minimum leakage by a few orders of magnitude, and that both depend on the input vectors that produce them [21]. However, treating maximum leakage power as a function of the input vectors alone is no longer correct. In the presence of variability, the input vectors that cause the maximum or the minimum leakage current change. Hence, maximum leakage now depends not only on the input vectors but also on process parameter variations.
Though some work has been done in the area of statistical leakage power estimation [3] , [9] , [10] , [11] , no methods have been proposed so far to estimate the run-time maximum leakage power distribution as a function of both process variations and input vectors. The work presented in this paper addresses this problem and gives an approach to accurately estimate it. This method can also be extended to determine the minimum leakage vector which can be used as a sleep vector in standby mode.
II. MOTIVATION: DEPENDENCE OF LEAKAGE ON INPUTS IN THE PRESENCE OF VARIATIONS
Consider a CMOS inverter in 32nm technology, with nominal values of process parameters given by Table I [22]. This inverter has maximum leakage current when the input state is '0'. However, when the process parameters are changed to PMOS L ch = 1.837λ and NMOS L ch = 1.985λ with the other parameters at their nominal values, the maximum leakage inducing input is '1'. Moreover, when 500 Monte Carlo simulations vary the process parameters around their nominal values with the variability given by Table II [15], we observe that the mean leakage for input '1' is almost double the mean leakage for input '0'. The inverter changes its maximum leakage state because the I sub of the PMOS in its OFF state (input '1') exceeds that of the NMOS in its OFF state (input '0') when the channel length of the PMOS is less than the NMOS channel length. This is due to the increased sensitivity of PMOS sub-threshold current to channel length variations as compared to NMOS [23].
This example can be extended to a simple combinational circuit as shown in Figure 1 . In the absence of process variations, the maximum leakage state for this circuit is '01' whereas in the presence of variations, Monte Carlo simulations show that the input state for maximum leakage changes to '11'. This change in input vector changes the total leakage from INV (0) + INV (1) + AND (101) to 2*INV (1) + AND (001). Our experiments have shown that, for larger cells, the maximum leakage vector differs from the one calculated using the conventional method in cases where the effects of process variations dominate the bias voltage effects or where there is increased sensitivity to a process parameter.
This shows that the maximum leakage inducing input vector is highly dependent on the specific values of the process parameters. In this paper, we find the maximum leakage inducing vector in the presence of such process variations and use this vector to estimate the maximum leakage power distribution.
III. LEAKAGE POWER DEPENDENCE ON PROCESS PARAMETERS
Two components dominate leakage: sub-threshold (I sub ) and gate leakage (I gate ) currents. Sub-threshold leakage is the most dominant leakage mechanism occurring in the OFF state of a transistor and gate leakage is the most dominant leakage mechanism in its ON state [24]. Considering both these types of leakage and the interactions between them, the total leakage of a circuit can be approximated using (1).
I sub and I gate vary with variation in process parameters such as channel length (L ch ), threshold voltage (V th ), channel doping concentration (N ch ), gate oxide thickness (T ox ) and gate width (W gate ). The equation for sub-threshold leakage current is given by (2) [24]. The threshold voltage is given by (3) and the dependence of V th on L is given by (4) [11]. For the definition of variables, refer to [24].
The equation for gate leakage current is given by (5) [16] .
The result of 500 Monte Carlo simulations performed on an example ISCAS-85 benchmark circuit, C1355 with 546 gates (32nm technology using predictive models [22]), is shown in Figure 2. The nominal values for the process parameters that were varied are given by Table I. Typical variation data is shown in Table II [15]. Such significant variations in process parameters will induce large variations in performance and power.
It was observed in our experiments that the mean of the distribution is almost four times the nominal value of the leakage current for this benchmark circuit. When the parameters were individually varied in hspice and Monte Carlo simulations were performed, it was observed that the leakage current of this circuit depends on the process parameters through the relationships (6) and (7). This agrees with the analytical expressions and previous work in [3], [9] - [11].
Due to the dependence of leakage current represented by equations (6), (7), leakage current can no longer be represented by a single nominal or a corner case value. It is in fact a log-normal probabilistic distribution with mean µ which is significantly greater than the nominal value and has standard deviation D [3] .
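The gap between the nominal value and the mean follows directly from an exponential dependence on a Gaussian parameter. The following is a minimal sketch under a simplified, assumed model with a single parameter deviation x ~ N(0, sigma^2) and leakage I = I_nom * exp(-k * x); the function names and the values of k and sigma are hypothetical and are not taken from the experiments above.

```python
import math
import random


def lognormal_mean_ratio(k, sigma):
    """Analytical E[I] / I_nom for I = I_nom * exp(-k * x), x ~ N(0, sigma^2).

    ASSUMPTION: one Gaussian parameter with a purely exponential effect.
    """
    return math.exp(0.5 * (k * sigma) ** 2)


def monte_carlo_mean_ratio(k, sigma, samples=200_000, seed=1):
    """Estimate the same ratio by sampling the parameter deviation."""
    rng = random.Random(seed)
    total = sum(math.exp(-k * rng.gauss(0.0, sigma)) for _ in range(samples))
    return total / samples


if __name__ == "__main__":
    k, sigma = 1.0, 1.0  # hypothetical sensitivity and spread
    print("analytical ratio :", lognormal_mean_ratio(k, sigma))
    print("Monte Carlo ratio:", monte_carlo_mean_ratio(k, sigma))
```

In this toy model the ratio is always at least 1 and grows quickly with the product k * sigma, which is consistent with a mean that sits well above the nominal value.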
IV. MAXIMUM LEAKAGE VECTOR IN THE PRESENCE OF PROCESS VARIATIONS
Exact determination of the input vector that induces maximum (or minimum) leakage requires exhaustive simulations with all input vectors. Several approaches have been proposed to estimate this vector with reduced complexity focusing on sleep vectors for leakage minimization in standby mode [25] , [26] , [17] . Rao et al. [27] proposed an approach where the minimum (or maximum) leakage vector is determined by taking cell functionalities into account but the effects of process variations were not considered. This technique cannot be used in the presence of process variations, as demonstrated in Section II. In this work, we propose a heuristic to accurately estimate the maximum leakage vector considering both cell functionalities as well as process variations. The vector thus obtained is then used to determine the maximum sum leakage distribution of a circuit. This method can also be modified to find the minimum leakage vector and hence the minimum sum leakage distribution. Below, we adapt some of the definitions Rao et al. provided in [27] to accommodate process variations. The definitions are also supported by an illustrative example given by Figure  3 . Only the most dominant leakage state is considered for the purposes of illustration as opposed to the implemented algorithm which considers the top two maximum leakage states. This was done to keep the illustration simple. The notations used in the illustration are given by Table III .
A. Definitions
It is assumed that the circuit under consideration can be decomposed into standard cells. A graph is constructed with cells as the nodes and nets as the edges. The leakage value associated with each state of a standard cell is calculated as a weighted sum of the mean leakage current and the standard deviation of that particular state as given by (8) . The mean and standard deviation are determined by performing 500 Monte Carlo simulations by varying the process parameters as shown in Tables I and II for each input combination for each cell. The mean and standard deviation of leakage for an inverter and a two input OR gate are given by Table IV . It is seen in Figure 2 that the probability density function of a circuit has an increased average leakage value given by its mean and a spread represented by the standard deviation. When λ is chosen to be equal to 1, the heuristic chooses 
where, P i is the probabilistic cell leakage in input state i, µ i is the mean leakage of the cell in input state i, D i is the standard deviation of leakage in input state i, λ , the weighting factor is a real number and λ ∈ [0, 1]
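As an illustration of how the weighting factor can change the ranking of cell states, the sketch below assumes that the weighted sum takes the form P_i = µ_i + λ·D_i. That form is an assumption made for this example only, since (8) is not reproduced here, and the leakage numbers and function names are hypothetical.

```python
def probabilistic_leakage(mean, std, lam):
    """ASSUMED form of the weighted sum: P_i = mu_i + lam * D_i."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return mean + lam * std


def rank_states(stats, lam):
    """Return the cell input states sorted from highest to lowest P_i.

    stats maps an input state such as "01" to a (mean, std) pair taken
    from the pre-characterized cell data.
    """
    scored = {s: probabilistic_leakage(m, d, lam) for s, (m, d) in stats.items()}
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical (mean, std) values for a two-input cell.
    stats = {"00": (10.0, 4.0), "01": (11.0, 1.0), "10": (3.0, 1.0), "11": (2.0, 0.5)}
    print(rank_states(stats, 0.0))  # ranking by mean only
    print(rank_states(stats, 1.0))  # ranking by mean plus one standard deviation
```

With these made-up numbers, state "01" ranks first when only the mean is used, while state "00" ranks first once its larger spread is weighted in.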
Node Controllability: The controllability of a node in a circuit is defined as the minimum number of inputs that have to be assigned to specific values in order to force the node output to a specific state. Every node is assigned two values, CC0 (controllability to force the cell output to 0) and CC1 (controllability to force the cell output to 1).
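One possible way to compute CC0 and CC1 under this definition is sketched below. It assumes that every primary input has CC0 = CC1 = 1 and that the inputs of a gate do not share primary inputs (with reconvergent fanout the sums can overcount). The gate names and the function are illustrative and are not taken from the implementation.

```python
def controllability(gate_type, input_cc):
    """Return (CC0, CC1) of a gate output from its inputs' (CC0, CC1) pairs.

    ASSUMPTION: gate inputs depend on disjoint sets of primary inputs.
    """
    cc0s = [c0 for c0, _ in input_cc]
    cc1s = [c1 for _, c1 in input_cc]
    if gate_type == "AND":
        # One input at 0 forces the output to 0; all inputs at 1 force a 1.
        return min(cc0s), sum(cc1s)
    if gate_type == "OR":
        # All inputs at 0 force the output to 0; one input at 1 forces a 1.
        return sum(cc0s), min(cc1s)
    if gate_type == "INV":
        return cc1s[0], cc0s[0]
    if gate_type == "BUF":
        return cc0s[0], cc1s[0]
    raise ValueError(f"unsupported gate type: {gate_type}")


if __name__ == "__main__":
    PI = (1, 1)  # a primary input needs one assignment for either value
    and_out = controllability("AND", [PI, PI])
    print(and_out)                            # (CC0, CC1) of a 2-input AND
    print(controllability("INV", [and_out]))  # inverter driven by that AND
```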
Controllability List:
The constraints imposed on the primary input vector in order to force a node output to a specific state are defined as the controllability list. The two constraint lists associated with every node output are the CC0 list and the CC1 list.
For example, in Figure 3, the CC0 list of N3 would require either N1, N2 or PI2 to be '0'. We choose the net which has the least fanout. In case of equal fanout nodes, we pick a net randomly and set it to '0'. In this example we have chosen N1=0, which requires PI1=1, PI2=X and PI3=X on the primary input lines. Table V gives the CC0 and CC1 lists for all internal nets in the example circuit.
Probabilistic Worst Input Condition (PWIC):
The worst input condition for a cell represents the minimum number of primary inputs and their specific values that force the cell to its highest leakage in the presence of process variations. It was observed in our experiments that every gate has two dominant worst leakage states. The remaining states dissipate far less leakage compared to these two dominant states. It was also observed that when the top two worst leakage states were considered the accuracy of the algorithm was improved. Hence, we define Worst Input Conditions, PWIC1 and PWIC2 where PWIC1 yields the worst value for leakage and PWIC2 yields the second worst value.
For example, in the case of cell OR2, PWIC1=00 and PWIC2=01. To force the inputs of OR2 to PWIC1, N3 and N4 must be forced to '0'. This translates to CC0 of N3 & CC0 of N4 which is equal to 1XX & XX0 = 1X0. Table VI gives the PWIC1 constraints for all the cells in the example circuit.
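The combination of constraint lists used above (1XX & XX0 = 1X0) can be sketched as a position-wise intersection of ternary strings. The function name and the choice of returning None on a conflict are assumptions made for this illustration.

```python
def intersect(a, b):
    """Intersect two primary-input constraints written over {'0', '1', 'X'}.

    Returns the merged constraint, or None when the two constraints demand
    opposite values on the same primary input.
    """
    if len(a) != len(b):
        raise ValueError("constraints must cover the same primary inputs")
    merged = []
    for x, y in zip(a, b):
        if x == "X":
            merged.append(y)
        elif y == "X" or x == y:
            merged.append(x)
        else:
            return None  # conflicting requirement on this input
    return "".join(merged)


if __name__ == "__main__":
    print(intersect("1XX", "XX0"))  # CC0 of N3 combined with CC0 of N4
    print(intersect("1XX", "0XX"))  # incompatible constraints
```

A None result corresponds to a pair of requirements that cannot be satisfied by any single primary input vector.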
Probabilistic Worst Leakage Advantage (PWLA):
If the PWIC of a cell cannot be satisfied, the cell may settle into one of its low leakage states. The increase in leakage when a cell is forced to its PWIC can be quantified by a metric called Probabilistic Worst Leakage Advantage. PWLA is given by the difference in the leakage of the worst leakage state and the average of the low leakage states and can be represented using (9). Since we have chosen two dominant PWICs, we define PWLAs associated with each of them. Table VII shows the PWLAs for the standard cells in the illustration.
$$PWLA = P_{PWIC} - \mathrm{AVG}(P_{PLLS}) \qquad (9)$$
where $P_{PWIC}$ is the probabilistic cell leakage in the PWIC and $P_{PLLS}$ is the probabilistic cell leakage of the low leakage states.
For example, in the case of cell OR2,
$$PWLA_1 = P_{00} - 0.5\,(P_{11} + P_{10})$$
$$PWLA_2 = P_{01} - 0.5\,(P_{11} + P_{10})$$
where $P_{ij}$ is the probabilistic leakage when the cell is in state $ij$.
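A small sketch of (9) for the OR2 case follows. The probabilistic leakage values are hypothetical placeholders rather than characterized data, and the function name is illustrative.

```python
def pwla(leakage, pwic, low_states):
    """PWLA = P[PWIC] - average of P over the low leakage states, as in (9)."""
    avg_low = sum(leakage[s] for s in low_states) / len(low_states)
    return leakage[pwic] - avg_low


if __name__ == "__main__":
    # Hypothetical probabilistic leakage per input state of OR2.
    p_or2 = {"00": 10.0, "01": 8.0, "10": 2.0, "11": 4.0}
    low = ["11", "10"]
    print(pwla(p_or2, "00", low))  # PWLA1, for PWIC1 = 00
    print(pwla(p_or2, "01", low))  # PWLA2, for PWIC2 = 01
```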
Conflicting and Dominated Cells:
When the PWIC of a cell is satisfied, it will result in certain nodes in the circuit being forced to particular states because of the way the gates 
(1XX).
Cost Function: When the PWIC of a cell is satisfied, the PWICs of its conflicting cells are violated, while the PWICs of its dominated cells are satisfied. Therefore, the cost of satisfying the PWIC of a cell C i is calculated as given by Cost(C i ). As a special case, the costs of cells which have infeasible requirements for a primary input are assigned a large negative value, as they can never be forced to their PWIC. The costs of the standard cells in the illustration are given by Table IX.
For example, Cost(BUF) = PWLA(BUF)-PWLA(OR2). The costs associated with both the PWICs are calculated in our method.
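The sketch below shows one reading of this cost, inferred from the description and from the Cost(BUF) example: the PWLA of the cell, plus the PWLAs of its dominated cells, minus the PWLAs of its conflicting cells. This exact form is an assumption, the PWLA values are hypothetical, and negative infinity stands in for the "large negative value".

```python
INFEASIBLE_COST = float("-inf")  # stand-in for the "large negative value"


def cost(cell, pwla, dominated, conflicting, infeasible=frozenset()):
    """ASSUMED cost: own PWLA + dominated PWLAs - conflicting PWLAs."""
    if cell in infeasible:
        return INFEASIBLE_COST
    gain = pwla[cell] + sum(pwla[c] for c in dominated.get(cell, ()))
    loss = sum(pwla[c] for c in conflicting.get(cell, ()))
    return gain - loss


if __name__ == "__main__":
    pwla = {"BUF": 3.0, "OR2": 5.0, "INV2": 4.0}  # hypothetical values
    conflicting = {"BUF": ["OR2"]}                # BUF conflicts with OR2
    dominated = {}
    print(cost("BUF", pwla, dominated, conflicting))   # PWLA(BUF) - PWLA(OR2)
    print(cost("INV2", pwla, dominated, conflicting))  # no conflicts: own PWLA
```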
B. Heuristic to Determine the Maximum Leakage Vector
This section gives the outline of HeuristicMax, the heuristic proposed to determine the maximum leakage vector in the presence of process variations. The heuristic is given by Algorithm-1.
The complexity of determining the controllability and controllability lists in HeuristicMax was reduced by sorting the nets in increasing order of their depth. A cell at a higher level was processed only after all the cells at the lower levels had been processed. Sorting eliminated the need to traverse back all the way to the primary inputs, and hence the complexity was reduced.
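The depth-based ordering can be sketched as a levelization pass over the cell graph. The netlist representation (a dictionary from each cell to its drivers) and the function name are assumptions for this example; a recursive visit is used for brevity and would need to be made iterative for very deep circuits.

```python
def levelize(fanin):
    """Return cells sorted by increasing depth from the primary inputs.

    fanin maps each cell to the list of nodes driving it. Any node that is
    not a key of fanin is treated as a primary input with depth 0.
    """
    depth = {}

    def visit(node):
        if node not in fanin:
            return 0  # primary input
        if node not in depth:
            depth[node] = 1 + max((visit(d) for d in fanin[node]), default=0)
        return depth[node]

    for cell in fanin:
        visit(cell)
    # sorted() is stable, so cells at the same depth keep their input order.
    return sorted(fanin, key=lambda cell: depth[cell])


if __name__ == "__main__":
    # Hypothetical three-gate circuit with primary inputs a, b and c.
    fanin = {"G3": ["G1", "G2"], "G1": ["a", "b"], "G2": ["G1", "c"]}
    print(levelize(fanin))
```

Processing cells in this order means the controllability values of all drivers are already available when a cell is reached.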
In the illustrative example, INV-2 is selected and its input constraint -X1X is satisfied in the first iteration. The conflicting and dominated cell lists are updated and the new costs are determined for the remaining cells as given by Table X. In the second iteration OR2 is selected and its input constraint -1X0 is satisfied which finally defines the primary inputs as 110.
V. ESTIMATION OF TOTAL LEAKAGE CURRENT DISTRIBUTION
The leakage current distribution of a standard cell in state i can be represented by a log-normal distribution with mean µ i and standard deviation D i . This distribution has a corresponding normal distribution having mean m i and standard deviation σ i obtained by taking the natural logarithm of all the points in the log-normal distribution [28] .
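For reference, the textbook relations between the log-normal moments (µ_i, D_i) and the parameters (m_i, σ_i) of the underlying normal distribution are written out below. They are standard identities and are assumed here to coincide with the relations cited later in this section as (12)-(13).

```latex
\mu_i = \exp\!\left(m_i + \frac{\sigma_i^2}{2}\right), \qquad
D_i^2 = \left(e^{\sigma_i^2} - 1\right)\exp\!\left(2 m_i + \sigma_i^2\right)

\sigma_i^2 = \ln\!\left(1 + \frac{D_i^2}{\mu_i^2}\right), \qquad
m_i = \ln \mu_i - \frac{\sigma_i^2}{2}
```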
A. Sum of Log-normals
A standard cell in state i with Gaussian mean m i and Gaussian standard deviation σ i , has probability density function represented by (10) .
Given the probability density function of all the standard cells, we can determine the total leakage power distribution of a circuit. The sum of log-normals is not known to have a closed form. It can, however, be approximated using the Fenton-Wilkinson method of estimating the sum of several log-normal distributions [28]. Given f(x) of all the standard cells in the circuit, the sum leakage distribution is obtained by equating the first two moments, α1 and α2. Equation (11) gives the relationship between α1, α2, m i and σ i . Equations (12)- (13) give the relationship between µ i , D i , m i and σ i . The mean µ and variance D 2 of the sum log-normal distribution are given by (14)- (15) and the distribution function is given by (16).
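A minimal sketch of the moment-matching step is given below, assuming that the cell leakages are statistically independent so that means and variances add. The function names and the sample (mean, standard deviation) pairs are hypothetical; the formulas are the standard log-normal identities rather than a transcription of (11)-(16).

```python
import math


def lognormal_to_normal(mu, d):
    """Convert log-normal mean and std (mu, D) to normal parameters (m, sigma)."""
    sigma2 = math.log(1.0 + (d / mu) ** 2)
    m = math.log(mu) - 0.5 * sigma2
    return m, math.sqrt(sigma2)


def fenton_wilkinson(cells):
    """Approximate the sum of independent log-normals by a single log-normal.

    cells is a list of (mu_i, D_i) pairs, one per standard cell in its
    current input state. Returns (mu, D, m, sigma) of the matched sum.
    ASSUMPTION: the cell leakages are independent (no spatial correlation).
    """
    mu = sum(mu_i for mu_i, _ in cells)                 # first moment adds
    d = math.sqrt(sum(d_i ** 2 for _, d_i in cells))    # variances add
    m, sigma = lognormal_to_normal(mu, d)
    return mu, d, m, sigma


if __name__ == "__main__":
    cells = [(3.0, 4.0), (5.0, 3.0)]  # hypothetical per-cell (mean, std)
    print(fenton_wilkinson(cells))
```

The returned m and sigma define the log-normal PDF of the total leakage, while mu and D are its mean and standard deviation.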
B. Overall Approach
1. Calculate the maximum leakage vector using the heuristic described in Section IV.B.
2. Set the primary inputs of the circuit to the maximum leakage vector.
3. Forward-propagate the primary inputs and define the input states of all the gates in the circuit.
4. Estimate the probabilistic leakage distribution corresponding to the maximum vector using the method described in Section V.A.
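The forward-propagation step can be sketched as a single pass over a levelized netlist that records the input state seen by every gate. The netlist format, gate set and net names below are assumptions for illustration; the recorded states would then index the pre-characterized cell data used in Section V.A.

```python
GATE_FUNCS = {
    "AND": lambda bits: int(all(bits)),
    "OR": lambda bits: int(any(bits)),
    "INV": lambda bits: 1 - bits[0],
    "BUF": lambda bits: bits[0],
}


def gate_states(netlist, primary_inputs):
    """Forward-propagate a primary input vector and return each gate's input state.

    netlist is a list of (output_net, gate_type, input_nets) tuples sorted so
    that every gate appears after the gates driving it.
    """
    values = dict(primary_inputs)
    states = {}
    for out, gate_type, ins in netlist:
        bits = [values[n] for n in ins]
        states[out] = "".join(str(b) for b in bits)  # state used for lookup
        values[out] = GATE_FUNCS[gate_type](bits)
    return states


if __name__ == "__main__":
    # Hypothetical three-gate circuit, not the circuit of Figure 3.
    netlist = [
        ("N1", "INV", ["PI1"]),
        ("N2", "AND", ["N1", "PI2"]),
        ("N3", "OR", ["N2", "PI3"]),
    ]
    print(gate_states(netlist, {"PI1": 1, "PI2": 1, "PI3": 0}))
```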
VI. RESULTS
The approach described in this paper was implemented using C++ and was tested on ISCAS-85 benchmark circuits using predictive models for 32nm technology [22] . The nominal values for the process parameters that were varied are given by Table I . Typical variation data is shown in Table II . Vdd for 32nm technology is 0.9V. The cells were characterized using 500 Monte Carlo simulations in hspice.
A. Leakage Power Distribution Corresponding to the Maximum Leakage Vector in the Presence of Variations
This sub-section gives the results for the heuristic implemented to determine the leakage power distribution corresponding to the maximum leakage vector in the presence of process parameter variations. The results are given by Table XI. This method was compared against random vector testing using 100,000 random vectors, except for C17, which has 5 primary inputs and was exhaustively tested with all 2^5 = 32 input vectors. The average errors for mean leakage current and standard deviation were found to be 1.32% and 1.41%, respectively. Only the smaller benchmarks were verified against Monte Carlo simulations using 100 random vectors, due to the large run-time of Monte Carlo simulations. The results are shown in Table XII. In the presence of process parameter variations, the mean leakage was several times larger than that given by the nominal analysis, as expected. The comparison results are given in Table XIII. The mean leakage power obtained using our method was on average 3.4X greater than that obtained using the method in [27]. It was observed that when the top two worst leakage states were considered instead of one worst leakage state, the accuracy of the algorithm improved on average by 3.2% for the mean and 2.7% for the standard deviation.
B. Pessimistic Approach
A pessimistic approach was also implemented in which the maximum sampled value from the Monte Carlo simulations was used to calculate the maximum bound on leakage rather than the mean and standard deviation of the leakage profile. The results obtained using this approach are given by Table  XIV and the comparison between the pessimistic approach and the algorithm in [27] is given by Table XV.
C. Maximum Leakage Vector in the Presence of Variations
The maximum input vector obtained using the approach in [27] was used to calculate the leakage of the benchmark circuits in the presence of parameter variations. This leakage current was compared with the leakage current obtained using the vector computed by our heuristic which considers the effect of variations. The comparison results are given by Table XVI . The average error in the mean leakage current was found to be 7.7% when parameter variations were not considered for the determination of maximum leakage vector.
D. Complexity and Usage
The implemented heuristic is quadratic in complexity. Leakage estimation using exhaustive hspice Monte Carlo simulations to determine the maximum leakage vector and the leakage power associated with it takes a few hours to days, depending on the number of random vectors considered and the size of the benchmarks. Our method can estimate the same with less than 1.5% error in a matter of a few seconds or a few minutes, depending on the size of the benchmark. The runtime savings when compared to Monte Carlo simulations with a sweep value of 500 for 100 random vectors is given
HeuristicMax accepts an RTL description of the CMOS circuit. It is assumed that all the cells in the standard cell library have been pre-characterized for leakage mean and standard deviation using Monte Carlo hspice simulations and are available to HeuristicMax in a look-up table. The pre-characterization is a one-time effort. The value of λ is specified by the user. Using these inputs, HeuristicMax determines the maximum leakage vector and the leakage power associated with it. It can easily be used to accurately estimate the maximum leakage power of large circuits because its runtime is small compared to the large run-time of Monte Carlo simulations.
E. Determination of Minimum Leakage Power Vector
HeuristicMax can be modified to determine the minimum leakage vector in the presence of process variations. In this modified approach, a probabilistic best input condition (PBIC) is defined. PBIC puts the standard cell into its least leakage state. The penalty for a cell for not settling into its PBIC is defined as the probabilistic cell leakage penalty (PCLP), given by the difference between the average of the high leakage states and the average of the low leakage states. Conflicting and dominated cells are defined in a similar way as in HeuristicMax, but with reference to the PBIC. The calculation of costs is also similar to HeuristicMax but uses the PCLP instead of the PWLA. The modified heuristic to determine the minimum leakage vector tries to minimize the cost penalty and hence finds the leakage vector corresponding to the minimum leakage power.
VII. CONCLUSION
In this work, we have developed a heuristic to accurately estimate the maximum run-time leakage power bound as a function of both the input vectors and variations in process parameters. The implemented method has been effective in accurately estimating the PDF, mean and standard deviation of the total leakage current distribution of ISCAS-85 benchmark circuits, and the average errors when compared with exhaustive random vector testing for mean and standard deviation are 1.32% and 1.41%, respectively. The algorithm in [27] is found to under-estimate the leakage power by a factor of 3.4X as it does not consider the effect of parameter variations. In this work, process variations were considered to be random. The effects of spatial correlations can be included to increase the accuracy of this approach. Layout-level analysis would further help to determine the accuracy of the proposed estimation technique.
