# Statistical Mixed $V_t$ Allocation of Body-Biased Circuits for Reduced Leakage Variation

Jinseob Jeong, Seungwhun Paik, and Youngsoo Shin

Department of Electrical Engineering, KAIST, Daejeon 305-701, Korea

Abstract: Leakage current is susceptible to variation in transistor parameters and in environmental conditions such as temperature, which results in a wide spread in the leakage distribution. The spread can be reduced by employing body biasing: reverse body bias for dies that are too leaky and forward body bias for dies that are too slow. We investigate body biasing of mixed $V_t$ circuits. We show that conventional body biasing is limited in its ability to reduce the leakage variation of mixed $V_t$ circuits, because low- and high-$V_t$ devices do not track each other and their body biasing sensitivities differ. We present an alternative body biasing scheme that targets compensating the die-to-die variation of low $V_t$. Under this scheme, the within-die profiles of low- and high-$V_t$, which are needed for statistical allocation of mixed $V_t$, become wider and thus differ from the original ones. We present an analytical procedure to derive the new within-die profiles. Experiments with a 45-nm predictive model show that the spread in leakage distribution (ratio of maximum to minimum leakage) can be reduced to 4.5, as opposed to 9.4 with conventional body biasing on mixed $V_t$ circuits.

## I. INTRODUCTION

The supply voltage of CMOS circuits has been decreasing with technology scaling in order to manage active power consumption. However, as the supply voltage is reduced, the threshold voltage $(V_t)$ needs to be commensurately reduced to compensate for the increased circuit delay. This leads to an exponential increase in subthreshold leakage current, and it is not uncommon for leakage current to be responsible for almost half of total power consumption [1] in recent 90- and 65-nm CMOS technologies. The leakage current consists of many components, but subthreshold leakage takes the largest proportion in most technologies. Because of this exponential dependence, lowering $V_t$ makes leakage current more sensitive to variation of $V_t$. Worse, the variation of $V_t$ is increasing as device dimensions shrink. It is reported that the standard deviation of threshold voltage is about 6% of its mean in 180-nm technology, but about 11% in 65-nm technology [2]. The spread of threshold voltage can cause $20 \times$ variation in chip leakage [3], which makes it very hard to predict power consumption at an early stage of design.

The spread in leakage distribution can be significantly reduced by employing body biasing. Because the $V_t$ of a CMOS transistor is a function of its body-to-source potential, applying different amounts of body bias to different dies narrows the distribution of die-to-die (D2D) $V_t$ variation. Reverse body bias can be applied to dies that are too leaky to increase $V_t$, whereas forward body bias can be applied to dies that are too slow to decrease $V_t$.

The use of multiple threshold voltages has been proposed to suppress subthreshold leakage while minimizing the impact on the traditional synthesis-based design approach. Various algorithms have been proposed [4]-[6] for allocating multiple threshold voltages. In essence, they try to find as many gates as possible that can take advantage of high $V_t$ (and thus lower leakage), while ensuring that the critical path delay stays within the timing constraint. These algorithms are deterministic, because the electrical parameters they query, such as delay and power, are modeled at a fixed corner. A corner corresponds to a particular instance (e.g. best, nominal, or worst) of D2D variation, which has traditionally been considered the dominant source of variation. For example, a worst corner assumes that each parameter takes the value that is worst for circuit performance, simultaneously at all devices. This is very pessimistic, since it is very unlikely that all gates are at their worst performance at the same time. Because only those gates that do not violate timing even in this worst case can be assigned high $V_t$, the number of high-$V_t$ gates can become unnecessarily low.

As the amount of within-die (WID) variation has been increasing with technology scaling, statistical models of parameters have been introduced to overcome the pessimism of the conventional corner-based design approach. In statistical mixed $V_t$ allocation, the use of probability density functions (PDFs) that model WID variations of low- and high-$V_t$ at a particular corner [7] usually increases the number of gates mapped to high $V_t$, which helps further reduce leakage. Fig. 1(a) conceptually shows two WID PDFs taken at the worst corner of the D2D profiles, which are then used for statistical allocation of mixed $V_t$.

Although the use of a statistical model for mixed $V_t$ allocation can further reduce subthreshold leakage, it cannot reduce the spread of the leakage distribution itself. In this paper, we investigate statistical mixed $V_t$ allocation of body-biased circuits to suppress the spread of the leakage distribution while also achieving low leakage current. However, the conventional body biasing scheme, which targets matching the delay of circuits, cannot fully suppress leakage variation when applied to mixed $V_t$ circuits. For example, strong reverse body bias can be applied to die-A in Fig. 1(a), which is too leaky, to the extent that the critical path delay stays within the timing constraint. This can be implemented by monitoring the delay of a critical path replica [8]. However, for die-B in Fig. 1(a), which is also too leaky, the amount of reverse body bias that can be applied



Fig. 1. D2D and WID profiles of low- and high-$V_t$ devices (a) for corner-based statistical allocation of mixed $V_t$ and (b) after body bias that targets compensating D2D variation of low $V_t$.

will be very small because high $V_t$ is already too high, even though the leakage is dominated by low $V_t$, which is too low. Furthermore, the body-biasing sensitivities of high- and low-$V_t$ devices are different [9]. This is because more channel dopants are implanted in high-$V_t$ transistors, which naturally leads to a larger body effect coefficient. For the same body bias applied to both types of devices, the change in threshold voltage of a low-$V_t$ device is smaller than that of a high-$V_t$ one. Therefore, dies such as die-B, with their low $V_t$ at a low percentile point of the D2D profile and their high $V_t$ at a high percentile point, cannot take advantage of reverse body bias.

To reduce the spread in leakage distribution, and thus to achieve predictable leakage, we propose an alternative body biasing scheme that targets compensating the D2D variation of low $V_t$, as shown in Fig. 1(b). This can be implemented by employing a body tap, whose bias is appropriately set after manufacturing, or by using a controller that monitors the leakage of low-$V_t$ devices [10]. Under this body biasing scheme, the WID profiles of low- and high-$V_t$, which we need for statistical allocation of mixed $V_t$, become wider and thus differ from the original ones [11]. Specifically, for low-$V_t$ devices, instead of using the WID PDF at the $+3\sigma$ point of D2D, we take the one at the $-3\sigma$ point (die-A in Fig. 1(a)), and then derive a new PDF after applying body bias such that its mean is shifted to the mean of D2D (see Fig. 1(b)). This is because the PDF of die-A receives the largest amount of reverse body bias, which makes it the widest and thus the worst. The derivation is more complicated for high $V_t$. We take the WID PDF at some point in D2D (such as die-C) that receives the largest amount of reverse body bias such that its mean is shifted to the $+3\sigma$ point of the new D2D profile of high $V_t$ (refer to Fig. 1(b)). Once we derive the new WID PDFs, they are used for statistical allocation of mixed $V_t$. Experiments with a 45-nm predictive model show that the spread in leakage distribution can be effectively reduced to 4.5 with the proposed body biasing scheme, as opposed to 9.4 with conventional body biasing on mixed $V_t$ circuits.

The remainder of the paper is organized as follows. In the next section, we will briefly review statistical allocation of mixed  $V_t$ . In Section III, we present the analytical procedure to derive low- and high- $V_t$  WID PDFs, when we assume a body bias that can compensate low- $V_t$  D2D variation, which are then used for statistical mixed  $V_t$  allocation. The experimental flow and the results with several benchmark circuits are presented in Section IV, and we draw conclusions in Section V.

## II. STATISTICAL ALLOCATION OF MIXED $V_t$

Statistical static timing analysis (SSTA) differs from conventional static timing analysis (STA) mainly in that the delay of each gate is expressed as a PDF, which captures the WID variation. Signal arrival times and required arrival times are propagated as in STA, but all timing parameters are now expressed as PDFs instead of scalar values. The statistical allocation of mixed $V_t$ is performed while satisfying the timing constraint, which is set on a high percentile point (e.g. $+3\sigma$) of the delay PDF rather than on a deterministic critical path delay obtained by conventional STA.
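As a minimal sketch of the percentile-based constraint, assume a single path whose gate delays are independent normal random variables, so that the path delay is again normal with summed means and summed variances. The delay values and the names `path_delay_stats` and `meets_timing` are hypothetical; an actual SSTA engine must also handle the statistical maximum at reconvergent nodes and correlated delays.

```python
import math

def path_delay_stats(gate_delays):
    """Return (mean, std) of a path delay.

    gate_delays: list of (mean, std) pairs, one per gate, assumed
    independent and normally distributed (a simplifying assumption).
    """
    mean = sum(m for m, _ in gate_delays)
    std = math.sqrt(sum(s * s for _, s in gate_delays))
    return mean, std

def meets_timing(gate_delays, t_constraint, k_sigma=3.0):
    """Check the constraint at the mean + k_sigma * std point of the delay PDF."""
    mean, std = path_delay_stats(gate_delays)
    return mean + k_sigma * std <= t_constraint

# Hypothetical three-gate path; delays in picoseconds.
path = [(30.0, 3.0), (45.0, 4.0), (25.0, 0.0)]
mean, std = path_delay_stats(path)
print(mean, std, meets_timing(path, 115.0))
```

A deterministic STA would compare only a single corner delay against the constraint; here the comparison is made at the $+3\sigma$ point of the path delay PDF.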

The allocation algorithm [7] is based on the concept of the *sensitivity* of a gate, which determines the priority of the gate for allocation. The sensitivity reflects the change in leakage and timing of a gate if its $V_t$ were to change. Assume that all the gates in a netlist are initially low $V_t$. The sensitivity of a gate *i* is defined by [5]:

$$\xi_i = \frac{\Delta I}{\Delta D} S_i, \quad \text{where } \Delta I = I_{low-V_t} - I_{high-V_t}, \quad \Delta D = D_{high-V_t} - D_{low-V_t}, \tag{1}$$

where $S_i$ is the slack at the output of gate *i*. Note that all the variables in (1) are random variables, whose PDFs depend on the WID profiles of low- and high-$V_t$ devices. We approximate $\frac{|\Delta I|}{|\Delta D|}$ and $S_i$ as independent, which is usually the case, since the former concerns gate *i* alone while the latter is determined by multiple gates on the timing path, so any dependency on *i* is weak. Therefore, the sensitivity PDF can be calculated by performing convolution on the PDFs of $\frac{|\Delta I|}{|\Delta D|}$ and $S_i$. Once it is derived, the sensitivity itself is taken at a low percentile point (e.g. $-3\sigma$). After we compute the sensitivity of all the gates, we select the gate with the largest sensitivity and replace it with its high-$V_t$ version. The sensitivities of the gates affected by this allocation are then updated. The process



body effect equation for  $x_{sb}$ :


$$\mu_{l0} = x_{l0} + \gamma_l \left( \sqrt{\psi_l + x_{sb}} - \sqrt{\psi_l} \right), \tag{4}$$

where  $\gamma_l$  is the body effect coefficient of low  $V_t$  devices and  $\psi_l$  corresponds to the surface potential at threshold.

The variables $x_{sb}$ and $x_{l0}$ are not linearly related in (4), which makes deriving the distribution of $x_{sb}$ from a given distribution of $x_{l0}$ difficult. Fortunately, (4) can be approximated via Taylor series expansion about 0 (i.e. Maclaurin series) by the following equation with fairly good accuracy:

$$\mu_{l0} \approx x_{l0} + \frac{\gamma_l}{2\sqrt{\psi_l}} x_{sb}. \tag{5}$$
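The accuracy of this first-order approximation can be checked numerically. The sketch below compares the body effect term of (4) with its linearized form in (5); the values of the body effect coefficient and the surface potential are hypothetical placeholders, not parameters of the predictive model used in this paper.

```python
import math

def vt_shift_exact(x_sb, gamma, psi):
    """Threshold shift from the body effect term in (4)."""
    return gamma * (math.sqrt(psi + x_sb) - math.sqrt(psi))

def vt_shift_linear(x_sb, gamma, psi):
    """First-order Maclaurin approximation of the same term, as in (5)."""
    return gamma / (2.0 * math.sqrt(psi)) * x_sb

# Hypothetical device parameters (assumed for illustration only).
GAMMA_L = 0.2   # body effect coefficient, V^0.5
PSI_L = 0.8     # surface potential at threshold, V

# Negative x_sb models forward body bias, positive x_sb reverse body bias.
for x_sb in (-0.2, -0.1, 0.1, 0.2):
    exact = vt_shift_exact(x_sb, GAMMA_L, PSI_L)
    approx = vt_shift_linear(x_sb, GAMMA_L, PSI_L)
    print(f"x_sb={x_sb:+.2f} V  exact={exact * 1e3:+.2f} mV  linear={approx * 1e3:+.2f} mV")
```

With these assumed parameters the two forms agree to within a few percent for small body bias, and the gap grows as $|x_{sb}|$ increases.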

Because the two random variables are now linearly related, $x_{sb}$ also follows a normal distribution. Solving (5) for $x_{sb}$ gives $x_{sb} \approx \frac{2\sqrt{\psi_l}}{\gamma_l}\left(\mu_{l0} - x_{l0}\right)$, hence:

$$x_{sb} \sim N\left(0, \left(\frac{2\sqrt{\psi_l}}{\gamma_l}\sigma_{l0}\right)^2\right). \tag{6}$$

Since we apply the same body bias, which is now modeled by (6), to high- $V_t$  devices, the D2D variation of high- $V_t$ becomes different from the original one  $(x_{h0})$  due to body effect, which we now try to derive. The body effect equation of high  $V_t$  with body bias of  $x_{sb}$  is given by:

$$x_h = x_{h0} + \gamma_h \left( \sqrt{\psi_h + x_{sb}} - \sqrt{\psi_h} \right), \tag{7}$$

where  $\gamma_h$  is the body effect coefficient of high  $V_t$  devices,  $\psi_h$  corresponds to the surface potential at threshold, and  $x_h$ is a random variable that models D2D variation of high  $V_t$ in the presence of body bias  $x_{sb}$  (see Fig. 2(b)). We again approximate (7) via Maclaurin series expansion to yield:

$$x_h \approx x_{h0} + \frac{\gamma_h}{2\sqrt{\psi_h}} x_{sb}. \tag{8}$$

Since  $x_{h0}$  and  $x_{sb}$  are independent (recall that  $x_{sb}$  is derived from D2D distribution of low- $V_t$ , which we assume independent from that of high- $V_t$ ), it can be readily shown that  $x_h$  also follows a normal distribution given by:

$$\begin{aligned} x_h &\sim N\left(\mu_{h0}, \ \sigma_{h0}^2 + \left(\frac{\gamma_h}{\gamma_l}\sqrt{\frac{\psi_l}{\psi_h}}\sigma_{l0}\right)^2\right) \\ &= N\left(\mu_{h0}, \ \sigma_{h0}^2 + \sigma_{hs}^2\right) \\ &= N\left(\mu_h, \ \sigma_h^2\right), \end{aligned} \tag{9}$$

where  $\sigma_{hs}$  is a factor that determines the extra spread of D2D variation of high- $V_t$  after the proposed body bias. Note that it is a function of  $\sigma_{l0}$ , since the amount of body bias we apply



[Figure 2 labels: HVT D2D ($x_{h0}$) with points $\mu_{h0} - 3\sigma_{h0}$ and $\mu_{h0} + 3\sigma_{h0}$; LVT D2D ($x_{l0}$) with point $\mu_{l0} + 3\sigma_{l0}$; LVT D2D ($x_l$); Die-C; axis: die's mean $V_t$.]

Fig. 2. D2D and WID profiles of low- and high- $V_t$  devices (a) without body biasing and (b) after body biasing that compensates D2D variation of low  $V_t$ .

continues until there are no more gates that can take advantage of high  $V_t$ .
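The greedy loop described in this section can be sketched in a heavily simplified form. The sketch below is not the algorithm of [7]: it assumes a single timing path whose gates all share one slack, and it uses scalar values in place of the low-percentile points of the PDFs. The gate names and numbers are hypothetical.

```python
def allocate_high_vt(gates, slack):
    """Greedy high-Vt allocation on a single path (toy model).

    gates: dict name -> (delta_i, delta_d), the leakage saving and the
           delay penalty of swapping that gate from low Vt to high Vt.
    slack: timing slack shared by all gates on the path.
    Returns the gates swapped to high Vt, in order of selection.
    """
    remaining = dict(gates)
    chosen = []
    while remaining:
        # Only gates whose delay penalty fits in the current slack qualify.
        feasible = {g: v for g, v in remaining.items() if v[1] <= slack}
        if not feasible:
            break
        # Sensitivity as in (1): (delta_i / delta_d) * slack.
        best = max(feasible, key=lambda g: feasible[g][0] / feasible[g][1] * slack)
        chosen.append(best)
        slack -= remaining.pop(best)[1]  # the swap consumes slack
    return chosen

# Hypothetical gates: (leakage saving, delay penalty) in arbitrary units.
gates = {"g1": (4.0, 2.0), "g2": (3.0, 1.0), "g3": (5.0, 5.0)}
print(allocate_high_vt(gates, slack=4.0))
```

In this toy run `g3` is never swapped, because its delay penalty exceeds the available slack at every step. In the statistical setting each quantity is a PDF and the slack update requires re-running SSTA on the affected gates.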

## III. STATISTICAL ALLOCATION OF MIXED $V_t$ FOR BODY-BIASED CIRCUITS

As discussed in the previous section, we need the WID PDFs of low- and high-$V_t$ devices, which can then be used to derive the various PDFs for gate delay, arrival times, sensitivities, and so on. When we assume body biasing that compensates the D2D variation of low $V_t$, the spread of the WID PDF gets wider, since the short channel effect worsens with increasing body bias voltage. Furthermore, because body biasing sensitivity depends on the initial $V_t$ [9], the spread of the WID PDF is wider for high-$V_t$ devices than for low-$V_t$ ones, which must be taken into account.

### A. WID PDFs of Body-Biased Circuits Compensating Low $V_t$ D2D

The random variable  $x_{l0}$  that models D2D variation of low  $V_t$  (without body bias) is assumed to follow a normal distribution:

$$x_{l0} \sim N\left(\mu_{l0}, \ \sigma_{l0}^2\right), \tag{2}$$

where $\mu_{l0}$ is the mean (i.e. the threshold voltage corresponding to a perfect process) and $\sigma_{l0}$ denotes the standard deviation. Similarly, we define a random variable $x_{h0}$ that models the D2D variation of high $V_t$:

$$x_{h0} \sim N\left(\mu_{h0}, \ \sigma_{h0}^2\right). \tag{3}$$

We further assume that the two random variables  $x_{l0}$  and  $x_{h0}$ are independent, which is usually the case due to independent masks and processing steps for implementing low- and high- $V_t$ .

| Circuit | # Gates | % gates mapped to high $V_t$, conventional | % gates mapped to high $V_t$, our approach | Leakage max / min ($\mu$A), conventional | Leakage max / min ($\mu$A), our approach |
|---------|---------|------------------------------|--------------|-------------------------------|--------------------------|
| c432    | 278     | 58.6                         | 54.3         | 8.8 / 1.2                     | 10.4 / 2.3               |
| c499    | 653     | 41.4                         | 32.9         | 66.7 / 5.1                    | 43.1 / 10.3              |
| c880    | 409     | 76.5                         | 72.9         | 22.7 / 2.3                    | 15.3 / 3.6               |
| c1355   | 715     | 46.7                         | 48.7         | 33.6 / 4.7                    | 34.0 / 7.6               |
| c1908   | 771     | 64.6                         | 61.7         | 36.3 / 4.1                    | 31.9 / 6.6               |
| c2670   | 868     | 73.6                         | 70.4         | 40.9 / 4.3                    | 30.6 / 6.5               |
| c3540   | 1403    | 67.9                         | 65.7         | 82.0 / 8.4                    | 57.2 / 12.3              |

TABLE I. Experimental results on ISCAS benchmark circuits.

is determined by how much spread the original D2D variation of low- $V_t$  has. For 45-nm predictive model [12], where we assumed that both  $\sigma_{l0}$  and  $\sigma_{h0}$  are 20.0 mV,  $\sigma_{hs}$  was 22.7 mV, which yields  $\sigma_h$  of 30.2 mV, about 1.5 times more spread than the initial D2D variation of high  $V_t$ .
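The figures quoted above can be reproduced with a few lines of arithmetic. The sketch uses only the values stated in the text; the ratio `coeff` is simply the factor $\frac{\gamma_h}{\gamma_l}\sqrt{\frac{\psi_l}{\psi_h}}$ implied by those values, not a separately reported parameter.

```python
import math

SIGMA_L0_MV = 20.0   # D2D sigma of low Vt (from the text)
SIGMA_H0_MV = 20.0   # original D2D sigma of high Vt (from the text)
SIGMA_HS_MV = 22.7   # extra spread induced by body bias (from the text)

# Independent contributions add in variance, as in (9).
sigma_h_mv = math.hypot(SIGMA_H0_MV, SIGMA_HS_MV)
spread_ratio = sigma_h_mv / SIGMA_H0_MV

# Factor (gamma_h / gamma_l) * sqrt(psi_l / psi_h) implied by sigma_hs / sigma_l0.
coeff = SIGMA_HS_MV / SIGMA_L0_MV

print(f"sigma_h = {sigma_h_mv:.2f} mV, ratio = {spread_ratio:.2f}, coeff = {coeff:.3f}")
```

A factor greater than one reflects the stronger body effect of high-$V_t$ devices discussed in Section I.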

Once we know D2D PDF of low  $V_t$ , which is now an impulse, and that of high  $V_t$ , which now has more spread, we need to derive WID PDFs that we use for statistical allocation of mixed  $V_t$  as described in Section II. For low  $V_t$ , we first take a WID PDF of die-A and derive a new PDF with body bias, which we need to apply until its mean gets shifted to  $\mu_l$  (see Fig. 2(b)). This is because die-A, whose mean of WID PDF is at  $\mu_{l0} - 3\sigma_{l0}$ , receives the maximum amount of reverse body bias (RBB) and the larger RBB we apply the more spread we get due to short channel effect [11].

The situation is complicated for high  $V_t$  because the body bias we apply is a random variable  $(x_{sb})$  as opposed to a deterministic one for each point in D2D variation of low  $V_t$ . This implies that both die-B and die-C, for example, can end up with their means being shifted to the worst corner  $(\mu_h + 3\sigma_h)$  after body bias. We will eventually take die-B over die-C because it receives more RBB. Therefore, we are now interested in  $x_{h0}$  that can get shifted to  $\mu_h + 3\sigma_h$  after some amount of body bias in  $x_{sb}$  applied to it. Interestingly, this can be represented as another random variable following a normal distribution given by:

$$N\left(\frac{\mu_{h0}\sigma_{hs}^{2} + (\mu_{h0} + 3\sigma_{h})\sigma_{h0}^{2}}{\sigma_{hs}^{2} + \sigma_{h0}^{2}}, \ \frac{\sigma_{hs}^{2}\sigma_{h0}^{2}}{\sigma_{hs}^{2} + \sigma_{h0}^{2}}\right) = N(\mu_{ch}, \sigma_{ch}^{2}). \tag{10}$$

Its derivation is shown in Appendix A. From (10), we locate the one that will receive the maximum RBB (i.e. the die in  $\mu_{ch} - 3\sigma_{ch}$  point), take its original WID PDF, and derive a new PDF after body bias. The resulting WID PDF together with that of low- $V_t$  are submitted to statistical allocation of mixed  $V_t$ .
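As an illustration of (10), the sketch below evaluates $\mu_{ch}$ and $\sigma_{ch}$ and locates the die at $\mu_{ch} - 3\sigma_{ch}$, using the values reported in this paper (mean high $V_t$ of 0.35 V from Section IV, $\sigma_{h0}$ of 20 mV, and $\sigma_{hs}$ of 22.7 mV). The function name `conditional_high_vt` is hypothetical.

```python
import math

def conditional_high_vt(mu_h0, sigma_h0, sigma_hs, k=3.0):
    """Mean and std of x_h0 given x_h = mu_h0 + k * sigma_h, following (10)."""
    var_h0, var_hs = sigma_h0 ** 2, sigma_hs ** 2
    sigma_h = math.sqrt(var_h0 + var_hs)
    target = mu_h0 + k * sigma_h
    mu_ch = (mu_h0 * var_hs + target * var_h0) / (var_hs + var_h0)
    sigma_ch = math.sqrt(var_hs * var_h0 / (var_hs + var_h0))
    return mu_ch, sigma_ch

# All values in mV.
mu_ch, sigma_ch = conditional_high_vt(350.0, 20.0, 22.7)

# The die with the lowest original high Vt that still ends up at the
# worst corner needs the largest shift, i.e. the most reverse body bias.
max_rbb_die = mu_ch - 3.0 * sigma_ch

print(f"mu_ch = {mu_ch:.2f} mV, sigma_ch = {sigma_ch:.2f} mV, die at {max_rbb_die:.2f} mV")
```

Note that $\sigma_{ch}$ is smaller than both $\sigma_{h0}$ and $\sigma_{hs}$, since conditioning on the final position of the die narrows the range of original $V_t$ values that could have produced it.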

### B. Statistical Allocation of Mixed $V_t$

With derived WID PDFs of low- and high- $V_t$  devices, we characterize PDFs of delay and leakage of each gate in the library to capture the impact of WID variation as well as body effect as a consequence of the proposed body biasing scheme. These PDFs are fed to SSTA engine, which drives statistical allocation of mixed  $V_t$  as explained in the previous section.

## IV. EXPERIMENTAL RESULTS

We performed experiments on a set of circuits taken from the ISCAS benchmarks. Each circuit was synthesized with SIS [13] and mapped into a 45-nm gate library, which we built based on a predictive model [12]. The library consists of 30 cells: inverters, 2-input NOR gates, and 2-input, 3-input, and 4-input NAND gates, each in three different sizes and two different threshold voltages (i.e. low and high $V_t$). Technology mapping was done using a weighted sum of area and delay as the cost function. The means of the D2D variation of low- and high-$V_t$ nMOS devices were assumed to be 0.22 and 0.35 V, respectively, with $V_{dd}$ of 1.0 V.

In Table I, the third column shows the percentage of gates mapped to high $V_t$ using conventional statistical allocation of mixed $V_t$ [7], which we implemented in the SIS environment. The WID PDFs at the $+3\sigma$ corners of the low- and high-$V_t$ D2D PDFs (see Fig. 1(a) and Table II) were used for the allocation. The SSTA engine [14] used in the allocation routine was also implemented in SIS. For D2D variation, we assumed a standard deviation of 20 mV for both low- and high-$V_t$ devices. We assumed the same standard deviation for WID variation, which yields an overall standard deviation (i.e. $\sqrt{\sigma_{l0}^2 + \Sigma_{l0}^2}$ for low-$V_t$ devices, where $\Sigma_{l0}$ is the standard deviation of the WID PDF) of 28 mV. This is based on the 15% variation of nominal $V_t$ predicted in [2]. The fourth column shows the percentage of gates mapped to high $V_t$ when we used the two WID PDFs derived following the procedure in the previous section, which are shown in Table II.

Timing-critical paths are usually dominated by low-$V_t$ gates. However, some of the gates in these paths can take high $V_t$ in our approach, since the mean of low $V_t$ is smaller, as shown in Table II. The remaining paths, which consist of both low- and high-$V_t$ gates, may or may not benefit from our approach because the mean of low $V_t$ is smaller while that of high $V_t$ is larger. Since the proportion of gates in critical paths is small, the percentage of gates mapped to high $V_t$ in our approach (fourth column) is not significantly different from that of the conventional approach (third column).

For each circuit with the mixed $V_t$ allocation obtained in the third column, we simulate it with SPICE at nine different process corners (combinations of the best, nominal, and worst D2D corners of low and high $V_t$) at each of three different temperatures (25, 75, and 125°C), totaling 27 simulations, and list its maximum and minimum leakage in the fifth column.

TABLE II. Worst-corner within-die PDFs for the conventional approach and our approach.



For each simulation, body bias is carefully chosen such that the critical path of each circuit meets the timing constraint. Then, we repeat the simulation for each circuit, this time with the mixed $V_t$ allocation from the fourth column, which represents our approach. The sixth column lists the result. Note that the sixth column is the result of 9 simulations, because low $V_t$ has only one D2D corner under the body biasing scheme we used (see Fig. 2(b)). Comparing the two columns, it is readily seen that our approach clearly reduces the spread of leakage. For five circuits (c499, c880, c1908, c2670, and c3540), the maximum leakage was also reduced. For these circuits, the maximum leakage of the conventional approach occurs at the process corner where low $V_t$ is at its lowest while high $V_t$ is at its highest; this corner cannot benefit from reverse body biasing because high $V_t$ is already too high and is the bottleneck for timing.
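The simulation counts for the two approaches follow directly from enumerating the corner and temperature combinations. The sketch below is only a bookkeeping illustration; the label `"compensated"` for the single low-$V_t$ point is a hypothetical name.

```python
from itertools import product

D2D_CORNERS = ("best", "nominal", "worst")
TEMPERATURES_C = (25, 75, 125)

# Conventional approach: low- and high-Vt D2D corners vary independently.
conventional = list(product(D2D_CORNERS, D2D_CORNERS, TEMPERATURES_C))

# Proposed approach: body bias pins low Vt to a single D2D point, so only
# the high-Vt corner and the temperature vary.
proposed = list(product(("compensated",), D2D_CORNERS, TEMPERATURES_C))

print(len(conventional), len(proposed))
```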

The ratio of the maximum to minimum leakage (i.e. leakage variation) of the two approaches in Table I is compared in Fig. 3. We also plot the leakage variation of the conventional approach without body bias, which averages $12.7 \times$. This is reduced to $9.4 \times$ if body bias is employed to reduce leakage. Our approach yields $4.5 \times$ on average, consistently across circuits.
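The two body-biased averages can be recomputed from the entries of Table I. The sketch assumes that the reported averages are arithmetic means of the per-circuit max/min ratios; the $12.7 \times$ figure for the unbiased case cannot be recomputed this way because its per-circuit data are not listed in the table.

```python
# (max, min) leakage in uA from Table I: (conventional, our approach).
TABLE_I = {
    "c432":  ((8.8, 1.2),  (10.4, 2.3)),
    "c499":  ((66.7, 5.1), (43.1, 10.3)),
    "c880":  ((22.7, 2.3), (15.3, 3.6)),
    "c1355": ((33.6, 4.7), (34.0, 7.6)),
    "c1908": ((36.3, 4.1), (31.9, 6.6)),
    "c2670": ((40.9, 4.3), (30.6, 6.5)),
    "c3540": ((82.0, 8.4), (57.2, 12.3)),
}

def mean_ratio(index):
    """Average max/min leakage ratio; index 0 = conventional, 1 = our approach."""
    ratios = [row[index][0] / row[index][1] for row in TABLE_I.values()]
    return sum(ratios) / len(ratios)

conventional_avg = mean_ratio(0)
proposed_avg = mean_ratio(1)
print(f"conventional: {conventional_avg:.1f}x, our approach: {proposed_avg:.1f}x")
```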

The leakage of c880 from conventional approach at 27 different combinations of process and temperature is shown in Fig. 4. The same circuit with our approach at 9 different cases is also shown, indicating reduced variation of leakage.

Variation of both D2D and WID  $V_t$  has been increasing with technology scaling [2]. We conducted the same experiment



Fig. 4. Comparison of leakage variation of c880.



Fig. 5. Comparison of leakage variation with different process variation.

while we change the standard deviations of D2D ($\sigma_{l0}$ and $\sigma_{h0}$) and WID ($\Sigma_{l0}$ and $\Sigma_{h0}$) variation to see the leakage variation (ratio of the maximum to minimum leakage) of conventional body biasing and our approach on mixed $V_t$ circuits. Fig. 5 shows the result. For the conventional approach, leakage variation increases with increasing variation in $V_t$. This is because the percentage of gates mapped to high $V_t$ decreases and the variation of leakage itself increases. The variation of leakage stays relatively small in our approach, since the D2D variation of low $V_t$, which is the dominant source of leakage variation, is suppressed.

## V. CONCLUSION

Leakage is very susceptible to variation in process and environment, and is thus an important source of design unpredictability. Body bias is useful for alleviating leakage variability, but its use is limited in mixed $V_t$ circuits, which are common in low-leakage design. We have proposed a body biasing scheme that compensates the D2D variation of low $V_t$. An analytical procedure to derive the new WID profiles of low- and high-$V_t$ devices, which are then used for statistical allocation of mixed $V_t$, was presented. The leakage variation was reduced to $4.5 \times$ on average, compared to $9.4 \times$ and $12.7 \times$ for the conventional allocation scheme with and without reverse body bias, respectively. The maximum leakage of several circuits was significantly reduced as well.

## APPENDIX A

To derive the worst-case WID profile of high $V_t$, we need the PDF, represented by the random variable $x_{ch}$, of the dies that contribute to the worst-case WID profile of high $V_t$ in the presence of the body bias $(x_{sb})$ applied to compensate the D2D variation of low $V_t$. The PDF of $x_{ch}$ is obtained as a conditional probability density: the density of $x_{h0}$ at a value $x$, given that $x_h$ occurs at $\mu_{h0} + 3\sigma_h$:

$$f(x_{ch} = x) = f(x_{h0} = x \mid x_h = \mu_{h0} + 3\sigma_h). \tag{11}$$

It can be further expressed as follows:

$$\begin{aligned} f(x_{h0} = x \mid x_h = \mu_{h0} + 3\sigma_h) &= \frac{f(x_{h0} = x, \ x_h = \mu_{h0} + 3\sigma_h)}{f(x_h = \mu_{h0} + 3\sigma_h)} \\ &= \frac{f(x_{h0} = x) \cdot f(x_h = \mu_{h0} + 3\sigma_h \mid x_{h0} = x)}{f(x_h = \mu_{h0} + 3\sigma_h)}. \end{aligned} \tag{12}$$

The linear relation between the random variables $x_h$ and $x_{h0}$ follows from (8) in Section III. Using this relation, the conditional PDF of $x_h$ given $x_{h0}$ in the numerator of (12) can be expressed as a PDF of the random variable $x_{sb}$:

$$f(x_h = \mu_{h0} + 3\sigma_h \mid x_{h0} = x) = f\left(x_{sb} = \frac{2\sqrt{\psi_h}}{\gamma_h}(\mu_{h0} + 3\sigma_h - x)\right). \tag{13}$$

Combining (12) and (13), we obtain the PDF of $x_{ch}$ expressed in terms of the PDFs of $x_{h0}$ and $x_{sb}$, whose means and standard deviations are already known:

$$f(x_{ch} = x) = \frac{f(x_{h0} = x) \cdot f\left(x_{sb} = \frac{2\sqrt{\psi_h}}{\gamma_h}(\mu_{h0} + 3\sigma_h - x)\right)}{f(x_h = \mu_{h0} + 3\sigma_h)}. \tag{14}$$

Since $x_{h0}$ and $x_{sb}$ follow normal distributions and the denominator of (14) is a constant, $x_{ch}$ also follows a normal distribution. For a normal distribution, the PDF $f(x)$ is given by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \tag{15}$$

which is an exponential function whose exponent is quadratic in $x$ and parameterized by $\mu$ and $\sigma$. Using this form, the product of the two PDFs in (14) can again be written as a normal PDF, because the sum of two quadratics in $x$ is itself a quadratic in $x$; the mean and standard deviation of $x_{ch}$ are then readily derived from it. The simplified result is

$$\begin{aligned} &N\left(\frac{\mu_{h0}\left(\frac{\gamma_{h}}{\gamma_{l}}\sqrt{\frac{\psi_{l}}{\psi_{h}}}\sigma_{l0}\right)^{2} + (\mu_{h0} + 3\sigma_{h})\sigma_{h0}^{2}}{\left(\frac{\gamma_{h}}{\gamma_{l}}\sqrt{\frac{\psi_{l}}{\psi_{h}}}\sigma_{l0}\right)^{2} + \sigma_{h0}^{2}}, \ \frac{\left(\frac{\gamma_{h}}{\gamma_{l}}\sqrt{\frac{\psi_{l}}{\psi_{h}}}\sigma_{l0}\right)^{2}\sigma_{h0}^{2}}{\left(\frac{\gamma_{h}}{\gamma_{l}}\sqrt{\frac{\psi_{l}}{\psi_{h}}}\sigma_{l0}\right)^{2} + \sigma_{h0}^{2}}\right) \\ &= N\left(\frac{\mu_{h0}\sigma_{hs}^{2} + (\mu_{h0} + 3\sigma_{h})\sigma_{h0}^{2}}{\sigma_{hs}^{2} + \sigma_{h0}^{2}}, \ \frac{\sigma_{hs}^{2}\sigma_{h0}^{2}}{\sigma_{hs}^{2} + \sigma_{h0}^{2}}\right). \end{aligned} \tag{16}$$
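The closed form above can be cross-checked numerically. The sketch below multiplies the two normal PDFs of (14) on a grid, normalizes the product, and measures its mean and standard deviation. To avoid assuming values for $\gamma_h$ and $\psi_h$, it works with the $V_t$ shift $x_h - x_{h0}$ directly, which by (8) and (9) is zero-mean normal with standard deviation $\sigma_{hs}$; this differs from the $x_{sb}$ form of (14) only by a constant factor that cancels in the normalization. The grid size and span are arbitrary choices.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal PDF as in (15)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

def posterior_moments(mu_h0, sigma_h0, sigma_hs, target, n=20001, span=8.0):
    """Normalize f(x_h0 = x) * f(shift = target - x) on a grid and
    return the mean and std of the resulting density."""
    lo, hi = mu_h0 - span * sigma_h0, mu_h0 + span * sigma_h0
    step = (hi - lo) / (n - 1)
    xs = [lo + i * step for i in range(n)]
    w = [normal_pdf(x, mu_h0, sigma_h0) * normal_pdf(target - x, 0.0, sigma_hs) for x in xs]
    total = sum(w)
    mean = sum(x * wi for x, wi in zip(xs, w)) / total
    var = sum((x - mean) ** 2 * wi for x, wi in zip(xs, w)) / total
    return mean, math.sqrt(var)

# Values in mV taken from Sections III and IV.
MU_H0, SIGMA_H0, SIGMA_HS = 350.0, 20.0, 22.7
TARGET = MU_H0 + 3.0 * math.hypot(SIGMA_H0, SIGMA_HS)   # mu_h0 + 3 * sigma_h

mean_num, std_num = posterior_moments(MU_H0, SIGMA_H0, SIGMA_HS, TARGET)
print(f"numerical mean = {mean_num:.2f} mV, numerical std = {std_num:.2f} mV")
```

The numerical moments should agree with $\mu_{ch}$ and $\sigma_{ch}$ from (16) up to discretization error.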

## ACKNOWLEDGEMENT

This work was supported by Samsung Electronics.

## REFERENCES

- [1] J. Friedrich, B. McCredie, N. James, B. Huott, B. Curran, E. Fluhr, E. Chan, G. Mittal, D. Plass, Y. Chan, S. Chu, J. Ripley, H. Le, L. Clark, S. Taylor, J. Dilullo, and M. Lanzerotti, "Design of the Power6 microprocessor," in *Proc. IEEE Int'l Solid-State Circuits Conf.*, Feb. 2007, pp. 96–97.
- [2] C. Chiang and J. Kawa, Eds., Design for Manufacturability and Yield for Nano-Scale CMOS, Springer, 2007.
- [3] S. Borkar, T. Karnik, S. Narendra, A. Keshavarzi, and V. De, "Parameter variations and impact on circuits and microarchitecture," in *Proc. Design Automation Conf.*, June 2003, pp. 338–342.
- [4] L. Wei, Z. Chen, K. Roy, M. C. Johnson, Y. Ye, and V. K. De, "Design and optimization of dual-threshold circuits for low-voltage low-power applications," *IEEE Trans. on VLSI Systems*, vol. 7, no. 1, pp. 16–23, Mar. 1999.
- [5] T. Karnik, J. Tschanz, Y. Ye, L. Wei, S. Burns, V. Govindarajulu, V. De, and S. Borkar, "Total power optimization by simultaneous dual-Vt allocation and device sizing in high performance microprocessors," in *Proc. Design Automation Conf.*, June 2002, pp. 486–491.
- [6] M. Ketkar and S. S. Sapatnekar, "Standby power optimization via transistor sizing and dual threshold voltage assignment," in *Proc. Int'l Conf. on Computer Aided Design*, Nov. 2002, pp. 375–378.
- [7] A. Srivastava, D. Sylvester, and D. Blaauw, "Statistical optimization of leakage power considering process variations using dual-vth and sizing," in *Proc. Design Automation Conf.*, June 2004, pp. 773–779.
- [8] J. W. Tschanz, J. T. Kao, S. G. Narendra, R. Nair, D. A. Antoniadis, A. P. Chandrakasan, and V. De, "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 11, pp. 1396–1402, Nov. 2002.
- [9] Y. Yasuda, N. Kimizuka, Y. Akiyama, Y. Yamagata, Y. Goto, and K. Imai, "System LSI multi-Vth transistors design methodology for maximizing efficiency of body-biasing control to reduce Vth variation and power consumption," in *Proc. Electron Devices Meeting*, Dec. 2005, pp. 68–71.
- [10] T. Kuroda, T. Fujita, S. Mita, T. Nagamatsu, S. Yoshioka, K. Suzuki, F. Sano, M. Norishima, M. Murota, M. Kako, M. Kinugawa, M. Kakumu, and T. Sakurai, "A 0.9-V, 150-MHz, 10-mW, 4 mm$^2$, 2-D discrete cosine transform core processor with variable threshold-voltage (vt) scheme," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 11, pp. 1770–1779, Nov. 1996.
- [11] S. Narendra, D. Antoniadis, and V. De, "Impact of using adaptive body bias to compensate die-to-die Vt variation on within-die Vt variation," in *Proc. Int'l Symp. on Low Power Electronics and Design*, Aug. 1999, pp. 229–232.
- [12] W. Zhao and Y. Cao, "New generation of predictive technology model for sub-45nm design exploration," in *Proc. Int'l Symp. on Quality Electronic Design*, Mar. 2006, pp. 585–590.
- [13] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. K. Brayton, and A. Sangiovanni-Vincentelli, "SIS: a system for sequential circuit synthesis," May 1992, Tech. Rep. UCB/ERL M92/41.
- [14] J.-J. Liou, K.-T. Cheng, S. Kundu, and A. Krstic, "Fast statistical timing analysis by probabilistic event propagation," in *Proc. Design Automation Conf.*, June 2001, pp. 661–666.