In this paper we outline a transistor size optimization technique for logic circuits that takes into account BTI (Bias Temperature Instability) and process variations. We demonstrate the accuracy of our results with statistical analysis. Since variations have a large impact on the scaling process, dependable circuit designs should include a quantitative analysis if they are to become more reliable in the future. In this study we used an algorithm to prove that with our technique we efficiently lowered the timing margin of the logic path by 4.4% below the margin achieved by conventional techniques. We also observed that the lifetime of the optimized circuits extended without any area overhead.
Introduction
Bias Temperature Instability (BTI) is one of the most significant aging degradations of MOSFETs fabricated in a 65 nm process or later process nodes [1] . The threshold voltage, V th , increases with the operating time of the circuits. This degradation leads to timing violations in logic circuits and read/write failures in memories. Compensations such as adding timing margins are generally applied to circuit designs to cope with this problem.
There are two types of BTI. The first is negative BTI (NBTI) that appears in PMOSFETs, so named because V th of PMOSFETs increases over time when their gates are stressed by negative bias [2] . The second, positive BTI (PBTI) occurs on NMOSFETs when their gates are stressed by positive bias [3] . In CMOS inverters, either the NMOSFET or the PMOSFET is degraded during its operation. When input is high, the NMOS is degraded by PBTI and when low, PMOS is degraded by NBTI.
BTI originates from defects which trap and detrap carriers in the gate oxide [4] . Therefore, the amount of BTI-induced degradation of the MOSFETs is distributed statistically [5] . The timing margins become larger in the scaled process because the distribution of BTI variations changes drastically. BTI variation is also called "BTI variability" [6] , [7] .
The main sources of variability in scaled MOSFETs are 1) BTI resulting from random discrete-charges and 2) process variation. Process variation can originate from many sources including: random dopant fluctuation (RDF), line edge roughness (LER), and variations in oxide thickness [8] , [9] , [10] , [11] . If more reliable chips are to be produced, both BTI and process variations should be considered by chip designers [12] . kazutoshi.kobayashi@kit.ac.jp
In this paper, we propose a transistor size optimization technique for logic circuits that takes into account both BTI and process variations by considering stress conditions and circuit lifetime. Our aim is to introduce a design technique that overcomes these variations with minimum overhead. The three key features of our technique are: 1) we use variation distributions to provide statistically grounded results for the optimized lifetime delay, 2) we extend circuit lifetime and reduce the timing margin, and 3) our technique does not require any area overhead. Conventional studies of transistor size optimization fail to consider the statistical distributions of variations [13] , [14] even though this analysis is critical if these variations are to be predicted accurately. We believe this to be sub-optimal since variations have a larger impact in smaller devices. Incorporating this statistical component in our technique allowed us to produce accurate delay distributions in this study. Variations have a great impact on the scaling process and timing margins can become too large to handle in real designs. We believe it is necessary to reduce the variation effect within the constraints of a small area.
This paper is organized as follows: Section 2 provides an overview and discussion of the analytical circuit simulation methods for analyzing BTI and process variations. Section 3 outlines the proposed transistor size optimization technique considering the variations and the experimental results of the estimated circuit lifetime. Section 4 concludes this paper.
Proposed Methods for Analyzing BTI and Process Variations
In this section, we describe two analytical models for BTI and process variations, the circuit simulation method, and the delay analysis method.
© 2016 Information Processing Society of Japan
Analytical Model of BTI Variation
BTI-induced shifts of the threshold voltage ΔV th are determined by both the characteristics and states of the defects in the gate dielectric [15] . Table 1 shows the parameters used in calculating BTI-induced ΔV th . When the MOSFET has n defects, ΔV th at time t can be calculated by Eq. (1):
where j is the index of the defects (1 to n), and k j is the state of the jth defect. The variable k j becomes 0 or 1 when the jth defect captures or emits carriers, respectively. The probability mass function of n is shown in Eq. (2).
where N t is the expected value of n, given by Eq. (3) .
where L and W are the length and width of the transistor channel, respectively, and D is the defect density in the gate oxide. We assume D = 4 × 10^−3 /nm^2 [16] , [17] .
The probability density function of the V th step ν of a single defect is shown in Eq. (4) .
where η is the expected value of ν, given by Eq. (5) [18].
where s is the coefficient of η. We assume s = 9 × 10^3 mV·nm^2 [18] .
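The defect statistics above can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' implementation. It assumes that Eq. (2) is a Poisson distribution with mean N t = D·L·W, that Eq. (4) is an exponential distribution with mean η = s/(L·W), and that every defect is occupied with the same probability p_occ. These forms are consistent with the stated units of D and s, but they are assumptions, as are all function names.

```python
import math
import random

D = 4e-3   # defect density [1/nm^2], value quoted in the text
S = 9e3    # coefficient of eta [mV*nm^2], value quoted in the text

def expected_defects(length_nm, width_nm, density=D):
    """N_t = D * L * W (assumed form of Eq. (3))."""
    return density * length_nm * width_nm

def mean_step_mv(length_nm, width_nm, s=S):
    """eta = s / (L * W) in mV (assumed form of Eq. (5))."""
    return s / (length_nm * width_nm)

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for means of a few tens."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sample_delta_vth_mv(length_nm, width_nm, p_occ, rng):
    """One random Delta V_th [mV]: sum of the steps of the occupied defects."""
    n = sample_poisson(expected_defects(length_nm, width_nm), rng)
    eta = mean_step_mv(length_nm, width_nm)
    # ASSUMPTION: each defect is occupied independently with probability p_occ.
    return sum(rng.expovariate(1.0 / eta) for _ in range(n) if rng.random() < p_occ)
```

For a 45 nm × 240 nm device these assumptions give N t = 43.2 defects and η ≈ 0.83 mV per defect.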
Variables τ eh and τ ch represent the time constants of emission and capture when the gate is biased, respectively. Variables τ el and τ cl represent the time constants of emission and capture when the gate is zero-biased, respectively. The time constant τ el is assumed to be distributed log-normally from 10^−9 s to 10^9 s [19] , [20] . The relationships between τ el and the other time constants are as follows: τ ch = 0.01τ el , τ eh = 100τ el , τ cl = 100τ el [21] . The duty factor (DF) is shown in Eq. (6) , where f and t H are the frequency of the gate input signal and the time during which the input is high, respectively. Long-term P C is shown in Eq. (7) [22] .
where τ * e and τ * c are the effective emission and capture time constants, respectively. Note that Eq. (7) is valid only if the time constants are significantly larger than the period of the stress signal.
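To make the role of the duty factor concrete, the sketch below computes an occupancy probability for a single defect. It assumes a form often used for a two-state defect under periodic stress, in which the effective capture and emission rates are duty-factor-weighted averages of the biased and zero-biased rates. The exact Eq. (7) and the definitions of τ * e and τ * c may differ from this, and the function names are hypothetical.

```python
import math

def duty_factor(freq_hz, t_high_s):
    """DF = f * t_H: fraction of each period during which the input is high."""
    return freq_hz * t_high_s

def capture_probability(tau_el, df, t):
    """Probability that a defect with zero-bias emission time tau_el is occupied at time t."""
    # Relations between the time constants quoted in the text.
    tau_ch, tau_eh, tau_cl = 0.01 * tau_el, 100.0 * tau_el, 100.0 * tau_el
    # ASSUMPTION: effective rates are duty-factor-weighted averages.
    rate_c = df / tau_ch + (1.0 - df) / tau_cl   # 1 / tau*_c
    rate_e = df / tau_eh + (1.0 - df) / tau_el   # 1 / tau*_e
    steady = rate_c / (rate_c + rate_e)          # equals tau*_e / (tau*_c + tau*_e)
    return steady * (1.0 - math.exp(-(rate_c + rate_e) * t))
```

Under these assumptions a defect is almost always occupied in the long term when DF = 1.0 and is occupied only about 1% of the time when DF = 0.0.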
We show how ΔV th depends on stress time, duty factor and gate width. Figure 1 shows the stress time characteristics. The x and y axes denote ΔV th and the cumulative probability (CDF), respectively. Input assignments for the calculations are as follows: t = 10^5 , Table 2 shows the average value μ and the standard deviation σ of ΔV th . As can be seen, μ becomes larger with stress time. This is because some traps continue to capture carriers while stress is applied in BTI. The standard deviation σ remains mostly constant. Table 3 shows the average μ and the standard deviation σ of ΔV th . In this table, μ becomes larger with DF. Note that the threshold voltage remains constant when DF = 0.0. Table 4 shows μ and σ of ΔV th assuming the population has a normal distribution. Since σ increases when W decreases, this suggests that BTI variation has a larger impact on smaller devices.
The distribution of ΔV th has the following characteristics:
• The average of ΔV th increases with stress time and DF.
• The standard deviation of ΔV th decreases when W increases.
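Both trends follow from a short calculation under simplifying assumptions. As a sketch rather than the derivation used in this paper, assume a Poisson defect count with mean N t = D·L·W, exponentially distributed steps with mean η = s/(L·W), and a single occupancy probability P C shared by all defects. The occupied defects then form a compound Poisson sum, for which:

```latex
\begin{align}
\mu_{\Delta V_{th}} &= N_t \, P_C \, \eta = D \, s \, P_C, \\
\sigma_{\Delta V_{th}}^{2} &= N_t \, P_C \, \mathrm{E}[\nu^{2}]
  = 2 \, N_t \, P_C \, \eta^{2}
  = \frac{2 \, D \, s^{2} \, P_C}{L \, W}.
\end{align}
```

Under these assumptions the mean depends on stress time and DF only through P C , while σ scales as 1/√(L·W) and therefore decreases as W increases. With the quoted D and s, the product D·s is 36 mV.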
Analytical Model of Process Variations
Process variations tend to have a normal distribution [11] . Corner models are commonly used by chip designers to evaluate process variations. In this model, the average V th (μ) and the V th conditions μ + 3σ and μ − 3σ are defined as the typical, slow and fast conditions, respectively. We analyzed the delay characteristics of these three corners in timing designs.
In this paper, we assume the distribution of threshold voltage shifts caused by process variation has a normal distribution N PV as shown in Eq. (10) :
where, μ PV = 0 mV and σ PV = 10 mV [23] . Note that we assume μ PV and σ PV are constant among various gate widths.
Circuit Simulation with Variation-Aware Netlist
Variation-aware netlists are used to analyze the characteristics of BTI and process variations; we used them in our simulation method with the HSPICE circuit simulator. Predicted values of ΔV th at stress time t are applied to each MOSFET in the netlists. The values are calculated with the variation model from the parameters L, W, DF and t of each transistor in the netlist (more details follow in the next section). Figure 4 shows the variation-aware netlist created from the original netlist and the variation model for the simulation analysis.
In BSIM4, the threshold voltage parameter VTH0 of each MOSFET can be shifted by the parameter DELVTO [24] . The V th of a MOSFET shifts by the amount of DELVTO. In the example of the original netlist and the variation-aware netlist shown in Fig. 5 , ΔV th values of dvth_n(t1) and dvth_p(t1) are applied to the NMOS and the PMOS, respectively. Note that we need one variation-aware netlist for each t condition.
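Generating a variation-aware netlist can be sketched as a simple text transformation. The example below is an illustration under stated assumptions, not the tool used in this work: the instance names, model names (pch, nch) and shift values are hypothetical, each MOSFET is assumed to occupy a single line without continuation lines, and the sign convention of DELVTO for PMOSFETs should be checked against the simulator documentation.

```python
def apply_delvto(netlist_lines, shifts_v):
    """Return a copy of the netlist with delvto=<shift> appended to MOSFET lines.

    shifts_v maps an instance name (e.g. "MP1") to a signed shift in volts.
    Lines that are not listed MOSFET instances are copied unchanged.
    """
    out = []
    for line in netlist_lines:
        tokens = line.split()
        name = tokens[0] if tokens else ""
        if name[:1].upper() == "M" and name in shifts_v:
            line = f"{line.rstrip()} delvto={shifts_v[name]:.6g}"
        out.append(line)
    return out

# Hypothetical inverter netlist and hypothetical shifts at one stress time t1.
original = [
    "MP1 out in vdd vdd pch L=45n W=460n",
    "MN1 out in gnd gnd nch L=45n W=240n",
]
aged = apply_delvto(original, {"MP1": -0.020, "MN1": 0.015})
```

Calling the function once per stress time with the corresponding shifts yields one netlist per t condition.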
Delay Analysis with Variation of BTI and Process
This section describes a method to analyze circuit delay caused by BTI and process variations in logic circuits. Both variations follow a normal distribution, N BTI (μ BTI , σ BTI ) and N PV (μ PV , σ PV ). We consider variations that fall in the ±4σ region [25] . The parameters μ BTI and σ BTI are calculated using the model introduced in Section 2.1.
In this experiment we simulated the rise delay time of the 45 nm CMOS inverters. The simulation conditions were as follows:
The delay times are intervals between 0.5 V of the inputs and 0.5 V of the output. The output of inverters is connected to FO4 inverters. Note that we assume PMOSFETs and NMOSFETs have the same distributions for the purpose of simplification.
The dashed line in Fig. 6 shows the result of the delay analysis without the BTI variation. The x and y axes are the stress time t and the predicted delay, respectively. Note that the x axis is in log scale. We applied the BTI-induced ΔV th at the process corner μ PV + 3σ PV . The lifetime delay under this condition is taken as the worst-case delay. However, this result is too optimistic because the BTI variation should also be considered in the scaled process.
The dot-and-dash line in Fig. 6 shows the result of the delay analysis with the process corner μ PV + 3σ PV , to which the BTI-induced degradation of μ BTI + 3σ BTI is applied. The worst-case delay scenario is observed when both variations are at their worst. We consider these results pessimistic because such conditions rarely occur.
We propose a delay analysis method that uses the sum distribution of BTI and process variations to prevent overly pessimistic or optimistic predictions. The shifts of the threshold voltage based on the assumption that BTI and process variations have a normal distribution are shown in Eq. (11) .
Note that Eq. (11) is the sum of the distributions of both variations. The solid line in Fig. 6 shows the result of the delay analysis with the corner of the proposed model μ ΔVTH + 3σ ΔVTH . The condition is the worst case. The statistical results we have obtained from our proposed variation model will be used in the section that follows.
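The three corners compared in Fig. 6 can be contrasted with a short calculation. The sketch below assumes that the BTI and process shifts are independent normal variables, so that their sum has mean μ BTI + μ PV and variance σ BTI^2 + σ PV^2. The function name and the BTI values in the test case are hypothetical; only μ PV = 0 mV and σ PV = 10 mV come from the text.

```python
import math

def corners_mv(mu_bti, sigma_bti, mu_pv=0.0, sigma_pv=10.0, k=3.0):
    """Return the (process-only, both-worst, sum-distribution) corners in mV."""
    # Process corner with only the mean BTI shift (no BTI variation).
    process_only = (mu_pv + k * sigma_pv) + mu_bti
    # Both variations at their individual k-sigma corners.
    both_worst = (mu_pv + k * sigma_pv) + (mu_bti + k * sigma_bti)
    # ASSUMPTION: independent normal variables, so variances add.
    mu_sum = mu_bti + mu_pv
    sigma_sum = math.sqrt(sigma_bti ** 2 + sigma_pv ** 2)
    return process_only, both_worst, mu_sum + k * sigma_sum
```

For any positive σ BTI the sum-distribution corner lies between the other two, which is why it avoids both the optimistic and the pessimistic estimate.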
Transistor Size Optimization Considering BTI and Process Variations
In this section, we introduce a technique for transistor size optimization in logic circuits that takes into account BTI and process variations. We compared the results using our proposed technique with those obtained using conventional methods to ascertain any differences in delay degradation.
Our technique has three key features. First, the optimization is supported statistically by calculating the distributions of both BTI and process variations. Second, an extension in circuit lifetime and a reduction in timing margin are achieved by considering the stress conditions and lifetime characteristics. Third, our technique does not require any area overhead.
Size Optimization with Lifetime Delay Variation
We propose a transistor size optimization for logic gates that takes into account lifetime delay variation. The core concepts of the proposed transistor size optimization are:
• It considers lifetime delay, not just initial delay.
• It uses transistor size optimization to reduce lifetime delay variations.
We believe that timing margins for variations (BTI and process) should be applied to circuit designs. Figure 7 is a comparison of the timing margin in our proposed transistor size optimization and a conventional design.
In a conventional design, the transistor sizes of logic gates are optimized by considering the initial delay as the primary factor. The objective of the optimization is to minimize d RMS , the root mean square (RMS) of t dr (rise delay time) and t df (fall delay time) of the logic gates, as given in Eq. (14) .
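Taken literally, the root mean square of the two delays has the form below. This is the standard definition and is assumed, not confirmed, to match Eq. (14):

```latex
d_{RMS} = \sqrt{\frac{t_{dr}^{2} + t_{df}^{2}}{2}}
```

Because the delays are squared, the larger of the two dominates d RMS , so minimizing it tends to balance the rise and fall delays.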
However, BTI variation depends primarily on both transistor size and the stress conditions. The average of the variation increases as the stress time t and the DF increase. The standard deviation of the variation increases when the transistor size decreases.
The BTI-induced degradations of t dr and t df differ because the degradations of PMOSFETs and NMOSFETs are not equivalent. In this case, the lifetime RMS of a gate optimized for the initial delay is not the minimum. We determined the timing margins using the worst-case delay scenario over the circuit lifetime. Taking these lifetime delays into account allowed us to reduce the timing margin by optimizing the transistor size.
Figure 8 is an illustration of (a) NBTI and (b) PBTI on CMOS inverters. When DF = 0.0, t dr becomes larger because the PMOSFET is degraded by NBTI. The standard deviation of the NBTI variation decreases when the size of the PMOSFET increases, thereby suppressing the worst-case t dr degradation. Similarly, when DF = 1.0, the worst-case degradation of t df can be suppressed by enlarging the NMOSFET. We anticipate that the timing margin can be reduced by optimizing the transistor size for each stress condition.
However, there is a risk that the cell library will become too large if cells are designed with transistors in too many different sizes. Therefore, it is not practical to design cells for many DF conditions. We limited our conditions to DF = 0.0, DF = 0.5 and DF = 1.0 since the degradations of PMOSFETs and NMOSFETs are maximized when DF = 0.0 and DF = 1.0, respectively. The DFs of the nodes in typical circuits are non-uniformly distributed and DF = 0.0, 0.5 and 1.0 are more plausible [13] . Chip designers tend to use cells which have a DF condition close to the actual condition. The DF of each logic gate can be obtained from logic simulation.
Contrasting Size Optimization in Our Proposed Design with Conventional Techniques
We analyzed the lifetime delays of inverters and NAND gates optimized by the conventional initial-based condition and our proposed lifetime-based method introduced in Section 2. We optimized the sizes of the MOSFETs in a 45 nm CMOS process [23] . The size constraints are defined as follows:
L P = L N = 45 nm (15)
For the simulation, V dd = 1 V. A rectangular signal was applied to the input via a buffer (two inverters) and the output was connected to an FO4 inverter chain.
In the lifetime-based optimization, we analyzed the delays under three stress conditions: DF = 0.0, DF = 0.5 and DF = 1.0. When DF = 0.0, t dr increased the most due to BTI-induced degradation of the PMOSFET. This suggests that the optimized size of the PMOSFET is larger than indicated by the conventional initial-based method. Likewise, the NMOSFET should be large when DF = 1.0. Both PMOSFETs and NMOSFETs are degraded equivalently when DF = 0.5.
Figure 9 shows the RMSs of the initial-based and the lifetime-based optimized inverters. The x and y axes denote W P /W N and the delay RMS, respectively. In the case of initial-based inverters, the RMSs of the initial delays are evaluated. The initial-based optimized size is W P /W N : 460 nm/240 nm. In other words, the RMS is the smallest when the size is W P /W N : 460 nm/240 nm.
The optimized sizes of the inverters when DF = 1.0, DF = 0.5 and DF = 0.0 are W P /W N : 440 nm/260 nm, W P /W N : 460 nm/240 nm and W P /W N : 470 nm/230 nm, respectively. As expected, the optimized size of the PMOSFET when DF = 0.0 was larger than in the conventional method, and similarly the optimized size of the NMOSFET when DF = 1.0 was also larger than in the conventional method. The optimized size when DF = 0.5 was the same for our technique and the conventional method.
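The reported optima all satisfy W P + W N = 700 nm, which suggests that the search keeps the total width fixed. This is an inference from the numbers rather than a stated constraint. Under that assumption the optimization reduces to a one-dimensional sweep, sketched below. The delay model is a toy stand-in for circuit simulation, and every name and coefficient in it is hypothetical.

```python
import math

def d_rms(t_dr, t_df):
    """Root mean square of the rise and fall delays."""
    return math.sqrt((t_dr ** 2 + t_df ** 2) / 2.0)

def sweep_widths(delay_fn, total_nm=700, step_nm=10, min_nm=100):
    """Return (W_P, W_N, rms) minimizing the RMS delay at fixed total width.

    delay_fn(w_p, w_n) must return (t_dr, t_df); in practice it would wrap
    a circuit simulation at the stress condition of interest.
    """
    best = None
    for w_p in range(min_nm, total_nm - min_nm + 1, step_nm):
        w_n = total_nm - w_p
        rms = d_rms(*delay_fn(w_p, w_n))
        if best is None or rms < best[2]:
            best = (w_p, w_n, rms)
    return best

def toy_delays(w_p, w_n, k_p=2.0, k_n=1.0):
    # TOY MODEL: each delay is inversely proportional to the driving width.
    return k_p * 1e4 / w_p, k_n * 1e4 / w_n

best = sweep_widths(toy_delays)
```

Running the same sweep with initial delays and with lifetime delays for each DF condition would give the initial-based and lifetime-based sizes, respectively.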
Since the DF conditions of each input can differ, c^n cells should be designed, where c is the number of stress conditions and n is the number of gate inputs. We applied this principle to optimize the NAND2 gate shown in Fig. 10 . Figure 11 shows the RMSs of both the initial-based optimized NAND2 gates and the lifetime-based optimized NAND2 gates. The x and y axes are W P /W N and the delay RMS, respectively. These results reflect the RMSs of input B since it has delays that are more critical than those of input A. The initial-based optimized size is W P /W N : 410 nm/290 nm. The optimized sizes of the NAND2 gates in the lifetime conditions are shown in Table 5 . As expected, they show a trend similar to the inverters. The results when DF A = DF B = 0.0 and DF A = 1.0, DF B = 0.0 are the same. This is because the effects of the degradations of PMOS2 are dominant under these conditions.
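The growth of the cell count with the number of inputs can be made explicit by enumerating the per-input stress conditions. The sketch below is illustrative only, and the function name is hypothetical.

```python
from itertools import product

DF_CONDITIONS = (0.0, 0.5, 1.0)   # the three stress conditions used in the text

def cell_variants(num_inputs, conditions=DF_CONDITIONS):
    """All per-input DF assignments: len(conditions) ** num_inputs of them."""
    return list(product(conditions, repeat=num_inputs))

nand2_variants = cell_variants(2)   # 3 ** 2 = 9 combinations of (DF_A, DF_B)
```

Some combinations can share a size, as noted above for DF A = DF B = 0.0 and DF A = 1.0, DF B = 0.0, so the number of distinct layouts may be smaller than c^n.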
Estimating Path Delay with the Proposed Optimization
We estimated the lifetime delays of inverter chains which were optimized by the conventional initial-based and the proposed lifetime-based technique.
The simulated circuits are shown in Fig. 12 . Circuit stress is assumed to be DF = 0.0. The circuit optimized by the conventional technique consists of inverters in a single size. The circuit optimized by the proposed lifetime delay method consists of inverters in two sizes optimized under the condition that DF = 0.0 and 1.0. The other simulation conditions were the same as in Section 3.2.
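Two sizes suffice for an inverter chain because each stage inverts the signal, so the duty factor alternates between DF and 1 − DF from stage to stage. The sketch below illustrates this selection using the optimized widths reported in Section 3.2. The function names and the number of stages are hypothetical.

```python
# Optimized widths (W_P, W_N) in nm reported in the text for each DF condition.
INVERTER_SIZES_NM = {0.0: (470, 230), 1.0: (440, 260)}

def stage_duty_factors(input_df, num_stages):
    """DF seen at the input of each inverter stage; each stage inverts it."""
    dfs, df = [], input_df
    for _ in range(num_stages):
        dfs.append(df)
        df = 1.0 - df
    return dfs

def pick_sizes(input_df, num_stages):
    """Select the DF-matched inverter size for every stage of the chain."""
    return [INVERTER_SIZES_NM[df] for df in stage_duty_factors(input_df, num_stages)]
```

With DF = 0.0 at the chain input, the odd stages use the DF = 0.0 inverter and the even stages use the DF = 1.0 inverter.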
The initial delays in the conventional and proposed methods are 64.8 ps and 66.5 ps, respectively. The conventional circuit resulted in a shorter initial delay because it uses a technique that minimizes such initial delay. However, we expect the lifetime delay of our proposed circuit to be shorter. Figure 13 shows the estimated lifetime delay in the simulation circuits. The x and y axes represent the stress time and the predicted delay, respectively. Note that the x axis is log-scale. The estimated lifetime delay of our proposed method is 73.2 ps which is a 4.4% improvement over the 76.6 ps in the conventional method.
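The reported figures can be cross-checked with simple arithmetic. The snippet below uses only the four delay values given above. The variable names are illustrative, and the aging percentages are derived here rather than quoted from the paper.

```python
def percent_change(reference, value):
    """Signed change of value relative to reference, in percent."""
    return 100.0 * (value - reference) / reference

conv_initial, conv_life = 64.8, 76.6   # ps, conventional circuit
prop_initial, prop_life = 66.5, 73.2   # ps, proposed circuit

improvement = -percent_change(conv_life, prop_life)     # about 4.4 %
conv_aging = percent_change(conv_initial, conv_life)    # about 18.2 %
prop_aging = percent_change(prop_initial, prop_life)    # about 10.1 %
```

The proposed circuit starts 1.7 ps slower but degrades by roughly 10% over its lifetime instead of roughly 18%, which accounts for the shorter lifetime delay.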
Conclusion
Conventional lifetime-based designs for logic circuits tend not to consider BTI and process variations; this leads to overstating or understating the predicted delay degradation in circuits. In this report we propose a statistical size optimization technique for logic circuits that takes these variations into consideration. This statistical method produces more accurate results because it is based on the sum distributions of the variations. Although experimental, our results clearly show that circuit lifetime is extended and the timing margins are reduced by 4
