Abstract - Positive bias temperature instability (PBTI) is poised to cause significant degradation in nFETs as devices scale deep into the nanometer regime. It is commonly modeled by a power law fitted to the measured threshold voltage shift. For the first time, this paper shows that such models do not warrant PBTI prediction outside the stress conditions used for the fitting. The underlying cause of this failure is the error in the extracted power exponent. Based on an understanding of the different types of defects, we develop a robust as-grown-generation model and demonstrate its capability for accurate prediction of PBTI under both dc and ac conditions. The generation-induced degradation is found to play a key role. Analysis reveals that although PBTI is usually smaller than negative BTI (NBTI) within the typical test time window, it can exceed NBTI by the end of device lifetime.
I. INTRODUCTION
AGING has become a critical concern for CMOS technologies as scaling reaches the nanoscale regime [1]-[8]. Thorough examination and certification of reliable operation throughout the entire application lifetime are required during design. Emerging applications such as the Internet of Things or wearables usually require strict resiliency and long lifetimes [9]. For example, some biomedical applications require a lifetime of more than 50 years for medical implants. An accurate long-term lifetime prediction is the ultimate task of aging evaluation.
Bias temperature instability (BTI) is considered one of the important aging mechanisms. Extensive efforts have been made in investigating the negative BTI (NBTI) for pFETs. The recent use of multilayer gate materials, however, has led to considerable positive BTI (PBTI) for nFETs [1]-[8]. From the application perspective, it has been reported that PBTI can be the dominating reliability issue for field-programmable gate arrays [10] and ring oscillators [11]. Despite industry-wide characterization of various aspects of PBTI phenomena and general consensus regarding its empirical features [1]-[11], the detailed mechanism of the degradation is not fully understood. Charging of preexisting traps and/or generating new traps in the dielectric are considered to be the root of PBTI [12]. Due to the lack of a well-accepted PBTI model, the classic power law as described in (1) is widely used for lifetime prediction [13], where A, m, and n represent the prefactor, voltage, and time exponents, respectively
One set of typical PBTI results for different stress gate biases, V gst, and measurement delays is shown in Fig. 1(a). It is clear that PBTI depends on both V gst and the delay between stress and measurement. The time exponents are extracted and plotted in Fig. 1(b). When measured with 1-ms delay, the time exponent declines as voltage increases [1], making it impossible to predict the long-term PBTI under real use conditions [14].
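As a minimal sketch of how a time exponent is typically extracted from one stress trace, the snippet below fits a straight line in log-log space. It assumes the shift follows C·t^n at a fixed stress bias. The function name `fit_power_law` and the synthetic data are illustrative assumptions, not the fitting procedure or measurements of this work.

```python
import numpy as np

def fit_power_law(t, dvth):
    """Least-squares fit of dvth = C * t**n in log-log space.

    Returns (C, n): the prefactor at the given bias and the time exponent.
    """
    n, log_c = np.polyfit(np.log(t), np.log(dvth), 1)
    return float(np.exp(log_c)), float(n)

# Synthetic, illustrative trace only (not measured data).
t = np.logspace(-3, 3, 25)      # stress time in seconds
dvth = 5e-3 * t ** 0.2          # threshold voltage shift in volts
c_fit, n_fit = fit_power_law(t, dvth)
```

Repeating the fit for each V gst gives the exponent-versus-bias trend of the kind plotted in Fig. 1(b).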
The apparent time exponent is close to a constant when the measurement delay is minimized to 3 μs. However, the classic power law model extracted from these data failed to predict the PBTI even just 0.1 V below the lowest V gst used for model parameter extraction, as shown in Fig. 1(c). When fitting the measured ΔV th, the uncertainties for the time exponent reported in Fig. 1(d) [1]-[8] do not warrant prediction. There is a need for a test-proven method to characterize and model PBTI-induced degradation, enabling reliable prediction.
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
The central objective of this paper is to develop a model for long-term PBTI prediction under both ac and dc operation conditions. By separating different types of defects and understanding their kinetics, the proposed model provides excellent predictive capability. The model is then used to assess the long-term PBTI under real use conditions. It is found that the lifetime for dc PBTI can be overestimated by four decades by the model extracted from the filled symbols in Fig. 1(a), as shown in Fig. 1(c). In addition, although PBTI-induced degradation can be smaller than NBTI within the typical test time window, we will show that long-term PBTI can overtake NBTI.
This paper is organized as follows. The details of devices and experiments are described in Section II. Section III shows how different types of defects can be experimentally separated. Based on the extracted kinetics for each type of defect, the as-grown-generation (A-G) model for PBTI is proposed and validated under both dc and ac operation conditions in Section IV. Section V discusses the long-term PBTI-induced degradation under real use conditions and Section VI concludes this paper.
II. DEVICES AND EXPERIMENT
nFinFETs fabricated using a Hf-based high-k oxide stack and a metal gate are used to demonstrate the proposed model. An equivalent oxide thickness of 1 nm was obtained by adopting a thin TiN metal gate, inducing Si in-diffusion, and reducing the interfacial oxide thickness [15]. Fast I d-V g measurement within 3 μs on a Keysight B1530 is used in this paper [16]. The threshold voltage degradation is monitored by sensing at a constant I d of 500 nA × W/L around the threshold voltage. Unless otherwise specified, the temperature is 125°C.
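The constant-current sensing described above can be sketched as follows. This is an illustrative reading of the criterion, not the instrument routine used in this work: the helper `vth_constant_current`, the W/L value, and the synthetic I d-V g sweeps are all assumptions made for the example.

```python
import numpy as np

def vth_constant_current(vg, i_d, i_target):
    """Gate voltage at which the drain current equals i_target.

    Interpolates linearly in log10(Id), which suits the near-exponential
    Id-Vg characteristic around threshold. Assumes i_d increases with vg.
    """
    return float(np.interp(np.log10(i_target), np.log10(i_d), vg))

# Hypothetical geometry and synthetic sweeps (illustrative only).
w_over_l = 2.0
i_target = 500e-9 * w_over_l          # 500 nA x W/L criterion from the text
vg = np.linspace(0.0, 0.8, 81)

def synthetic_id(vg, vth, swing=0.08):
    # Toy exponential characteristic that crosses i_target at vg == vth.
    return i_target * 10.0 ** ((vg - vth) / swing)

vth_fresh = vth_constant_current(vg, synthetic_id(vg, 0.350), i_target)
vth_stressed = vth_constant_current(vg, synthetic_id(vg, 0.372), i_target)
delta_vth = vth_stressed - vth_fresh  # shift attributed to the stress
```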
Although PBTI is considered an electric-field-driven phenomenon [17], the tests in the literature were usually performed under a V gst that is held constant throughout the stress. The underlying assumption is that the total degradation, ΔV th, is much smaller than the applied voltage, and thus the change in electric field over the dielectric, E ox, during the stress can be neglected. Fig. 2 compares the PBTI degradation under the constant-V gst and constant-E ox conditions. The constant E ox is maintained by adding the ΔV th measured in the last step to V gst (time = 0). Although the difference in Fig. 2 is small initially, it becomes considerable as ΔV th increases for longer stress time. In this paper, tests were carried out under constant E ox, unless otherwise specified.
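The constant-E ox procedure amounts to a simple feedback loop: before each stress interval, the gate bias is raised by the shift measured in the previous step. The sketch below illustrates only this bookkeeping. The callback `stress_step` and the toy degradation rule are hypothetical stand-ins for the real stress-and-measure cycle.

```python
def run_constant_eox(vgst0, n_steps, stress_step):
    """Apply n_steps stress intervals while compensating the measured shift.

    stress_step(vgst, dvth) is a hypothetical callback that returns the
    total shift after one more interval at gate bias vgst.
    """
    dvth = 0.0
    history = []
    for _ in range(n_steps):
        vgst = vgst0 + dvth            # add the shift measured in the last step
        dvth = stress_step(vgst, dvth)
        history.append((vgst, dvth))
    return history

def toy_step(vgst, dvth, vth0=0.4):
    # Toy rule: each interval adds a shift proportional to the present overdrive.
    return dvth + 0.01 * (vgst - vth0 - dvth)

history = run_constant_eox(vgst0=1.4, n_steps=3, stress_step=toy_step)
```

With the compensation in place, the overdrive seen by the toy device stays at its initial value in every interval, which is the point of the constant-E ox scheme.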
III. DEFECTS UNDER PBTI STRESS
There is no consensus on the defects and mechanisms of PBTI. Some groups ascribe the degradation to filling preexisting electron traps, such as oxygen vacancies in the high-k layer [18]. Other groups proposed that stress-induced defect generation may also make a considerable contribution [13], [19], [20]. In order to model PBTI, it is important to separate different types of defects and to model the kinetics of each type separately. In the following, we will show that by separating different types of defects experimentally, an accurate PBTI model with predictive capability can be extracted.
A. Generated Defects: Characterization and Modeling
For NBTI, it is reported that generated defects (GDs) can depend on measurement conditions (e.g., the discharging time T disch and the discharge voltage V gdisch) [21]. This is also the case for the GDs induced by PBTI. One example is shown in Fig. 3: if a different T disch or V gdisch is used, the extracted GD kinetics vary. For NBTI, this is because not only as-grown traps (ATs), but also some GDs are discharged [21]. By applying the stress-discharging-recharging (SDR) technique, the discharged GDs are refilled, allowing all GDs to be captured. Using all GDs obtained in this way, a reliable power law is obtained, which is independent of measurement conditions [21].
In this paper, the SDR technique is applied for PBTI with the waveform shown in Fig. 4(a) . The details about this technique can be found in [21] . As shown in Fig. 4(b) , the dependence of GD extraction on V gdisch and T disch is removed with SDR.
To study the dependence on stress conditions, GDs were measured under different stress overdrive voltages, V gov = V gst − V th. As shown in Fig. 4(c), they are well described by the power law in (2). It is worth noting that the extracted GD has a V gov-independent time exponent of 0.32. This is larger than that extracted from the total ΔV th in Fig. 1(a) and most of the values reported in early works [1]-[8] in Fig. 1(d). It is also larger than the ∼0.2 reported for NBTI [21]. From a circuit point of view, the large time exponent for PBTI can impact the long-term reliability, as will be further elaborated in Section V.
The atomic structure of GD remains unknown and the electrical measurement used here does not give direct information on it. To test whether GDs are interface states, we study the subthreshold swing (SS) against stress time. An increase in SS is considered an indicator of interface state and/or border trap generation [7], [22]. As shown in Fig. 5(a), even with a substantial GD, there is little change in the SS. This indicates that the GDs are oxide traps, rather than interface states.
To further explore the defects responsible for GD, Fig. 5(b) shows that GD and stress-induced leakage current (SILC) have the same time exponent. This strong correlation supports that they originate from the same generation process. It is reported that the defects responsible for the intrinsic breakdown are the generated electron traps, rather than hole traps [23]. Hydrogenous species have been proposed to cause the generation [24] and one may speculate that the GD contains hydrogen. Whether it contains hydrogen before the generation is not known. It is commonly accepted that SILC and intrinsic breakdown are caused by the same types of defects, which are randomly distributed in the oxide [25]. For intrinsic breakdown, one may speculate that foreign elements are not needed in the structure before defect generation.
Fig. 6. (a) Discharge profiles of preexisting traps using the method in [26]. The traps were first charged under V gch = V g − V th = 0.3 V and the subsequent discharging was recorded to give the lowest set of symbols. V gch was then raised to charge the traps again, followed by discharge for the next set of symbols. This charge-discharge sequence was repeated until V gch reached 1.1 V, which corresponds to the highest set of symbols. Charging of (b) AT and (c) EAD. The energy level after charging does not change for AT, but lowers for EAD.
B. Preexisting Defects: Characterization and Modeling
Preexisting defects originate from imperfections in fabrication. By definition, their charging and discharging kinetics are not affected by the device's stress history. Therefore, they can be readily characterized by using heavily stressed devices. In such devices, a significant number of GDs have already been generated, so that there is little further generation during the subsequent preexisting defect characterization. This suppresses the interference of trap generation with the characterization.
To understand preexisting traps, their discharge properties are studied. For each set of symbols in Fig. 6(a) , traps were first charged under a constant V gch − V th and the highest point represents the charged level. The trapped charges were then progressively discharged by stepping V gdisch − V th in the negative direction and each more negative V gdisch − V th step gives one lower point. After completing one discharge sequence, a higher V gch − V th is applied to charge the traps to a higher level, followed by a new discharge sequence to give the next set of symbols in Fig. 6(a) .
When V gch − V th is low, the discharge profiles are independent of V gch − V th, i.e., they overlap well. However, when V gch − V th increases further, they deviate from each other and are higher for higher V gch − V th. This is because there are two different types of electron traps, as illustrated in Fig. 6(b) and (c). One type captures an electron without changing its energy level [Fig. 6(b)] and is referred to as ATs. In contrast, after capturing one electron, the energy level of the other type shifts downward from its ground/neutral state [Fig. 6(c)]. This type is referred to as energy alternating defects (EADs).
Under low V gch − V th, charging is dominated by ATs. Since their energy level does not change after charging, their discharge profiles overlap. Under high V gch − V th, however, both ATs and EADs are charged. As the charged EADs have lowered their energy levels, they do not discharge immediately when V gdisch − V th starts reducing from V gch − V th. This results in the higher discharge profiles for higher V gch − V th in Fig. 6(a). EADs can only be neutralized when their lowered charge states are reached under more negative V gdisch − V th. In contrast, the additional ATs charged under a higher V gch − V th will be discharged as soon as V gdisch − V th is lowered.
Fig. 7. (a) ... Using the lowest set of symbols as the base, these short lines were shifted downward until they all joined together to give a single profile over the whole voltage range, which gives the ATs. The details are given in [26] and [27]. (b) Example of separating EADs from ATs. After ATs were extracted, EADs were obtained by subtracting ATs.
Based on the above behavior, AT can be extracted by adding the additional discharge for two consecutive V gch − V th levels to the overlapping discharge profile, as illustrated in Fig. 7(a): starting with the discharge profile at the lowest V gch − V th, in which only ATs are involved, the additional discharging trace under the next charging level is shifted down to align the two curves at the last point of the lower curve. By following this procedure up to the highest V gch − V th, the distribution of AT is extracted for the whole voltage range. Once AT is known, EAD can then be extracted by subtracting AT, as shown in Fig. 7(b). AT dominates initially, while EAD follows a power law. Further details can be found in [26] and [27].
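The shift-and-join procedure above can be sketched in a few lines. This is a simplified reading of the text, not the method of [26] and [27]. It assumes each discharge profile is sampled from low to high V gdisch − V th and ends at its own charging level, and that charged EADs do not discharge over the newly added voltage segment. The function name and the synthetic profiles are hypothetical.

```python
import numpy as np

def extract_at(profiles):
    """Stitch discharge profiles into one AT profile.

    profiles: list of (v, dvth) pairs ordered by increasing charging level;
    v is ascending and its last point is the charging level of that profile.
    """
    v_at = np.array(profiles[0][0], dtype=float)
    at = np.array(profiles[0][1], dtype=float)
    for v, d in profiles[1:]:
        v = np.asarray(v, dtype=float)
        d = np.asarray(d, dtype=float)
        # Shift the new trace down so it meets the last point of the lower curve.
        offset = np.interp(v_at[-1], v, d) - at[-1]
        extra = v > v_at[-1]           # only the additional discharging trace
        v_at = np.concatenate([v_at, v[extra]])
        at = np.concatenate([at, d[extra] - offset])
    return v_at, at

# Synthetic profiles (illustrative only): AT = 10*v, plus a constant EAD offset.
def profile(vgch, ead):
    v = np.linspace(0.0, vgch, int(round(vgch / 0.1)) + 1)
    return v, 10.0 * v + ead

profiles = [profile(0.3, 0.0), profile(0.7, 1.0), profile(1.1, 3.0)]
v_at, at = extract_at(profiles)
# EAD at each charging level: charged level minus AT at the same voltage.
ead = [float(d[-1] - np.interp(v[-1], v_at, at)) for v, d in profiles]
```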
The kinetics for EAD and AT under different overdrive voltages are extracted and shown in Fig. 8(a) and (b) , respectively. ATs clearly saturate, confirming their "As-grown" nature. Empirically, the saturation level, AT sat , can be well modeled with (3) and its charging kinetics with (4) [28] AT
where p1, p2, τ, and γ are constants extracted from the test data. EADs follow a power law with
$$\mathrm{EAD} = g_2 \, V_{gov}^{m_2} \, t^{n_2} \qquad (5)$$
where g 2, m 2, and n 2 are fitting parameters. It is worth noting that, although both EAD and GD follow a power law, their time and voltage exponents are quite different (see Table I) and therefore they must be modeled separately.
As most circuits operate under ac condition, preexisting traps charge-discharge dynamically. The discharging is directly measured and shown in Fig. 9 . It can be described by the universal recovery curve in the following equation [29] :
where B and β are fitting parameters.
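For orientation, a form of the universal recovery curve that is often used in the BTI literature is r(ξ) = 1 / (1 + B ξ^β), where r is the fraction of the degradation that remains and ξ is the ratio of discharge time to stress time. The sketch below assumes this form. It is not confirmed to be the exact expression used in this paper, and the parameter values in the example are arbitrary.

```python
def remaining_fraction(t_disch, t_stress, b, beta):
    """Fraction of trapped charge remaining after discharging for t_disch.

    Assumes the commonly used universal-recovery form 1 / (1 + B * xi**beta)
    with xi = t_disch / t_stress; b and beta are fitting parameters.
    """
    xi = t_disch / t_stress
    return 1.0 / (1.0 + b * xi ** beta)

# Arbitrary illustrative parameters, not values from Table I.
no_recovery = remaining_fraction(0.0, 100.0, b=2.0, beta=0.2)
half_recovered = remaining_fraction(100.0, 100.0, b=1.0, beta=0.3)
```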
We further explored the apparent activation energy, Ea, of AT and EAD, as shown in Fig. 10. When compared with AT, EAD has a much larger Ea, suggesting a thermally activated process. One may speculate that the structure of EAD relaxes following trapping: capturing one electron could lead to rearrangement of the microscopic structure in terms of local bond length and angle, which in turn lowers the energy level [29], [30]. In contrast, the small change in the energy levels of AT after trapping indicates little structural relaxation. This, together with the different trapping kinetics in Fig. 8 and the different temperature dependence in Fig. 10, supports that ATs and EADs are different types of defects.
ATs and EADs are also clearly different from GDs: 1) there is no GD in fresh devices; 2) the majority of GDs will not discharge when ATs and EADs are neutralized [see Fig. 7(a)], indicating that GDs have deeper energy levels; and 3) they have different kinetics. These differences should be taken into account for atomistic modeling in the future.
IV. A-G MODEL AND VALIDATION

A. Model Construction
Based on the above knowledge of defects, an A-G model can be built with the architecture in Fig. 11. The model parameters for each defect are given in Table I. They were obtained through fitting the data of short V g-accelerated dc stress, as described in Section III. For a given set of inputs (V g, frequency, and duty factor), the total ΔV th equals the sum of all charged defects.
[Figure caption fragment] 1 kHz, duty factor = 0.5. For both dc and ac stresses, V gov = V gst − V th (time = 0) and V gst was a constant for each set of symbols, i.e., V gst was not adjusted for ΔV th. These test data were not used for fitting the model parameters.
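The summation at the core of the A-G model can be illustrated for the dc case with a short sketch. It assumes that both the GD and EAD components are power laws in V gov and time, as stated in Section III, and takes the charged AT level as a given number because the AT expressions are not reproduced here. All parameter values are hypothetical except the GD time exponent of 0.32 quoted in the text. The ac case, which needs the discharge kinetics, is not covered.

```python
def power_law(g, m, n, vgov, t):
    """Generic power law g * vgov**m * t**n for one defect component."""
    return g * vgov ** m * t ** n

def total_dvth_dc(vgov, t, at_charged, ead_params, gd_params):
    """Total shift as the sum of charged AT, EAD, and GD (dc stress only)."""
    ead = power_law(*ead_params, vgov, t)
    gd = power_law(*gd_params, vgov, t)
    return {"AT": at_charged, "EAD": ead, "GD": gd,
            "total": at_charged + ead + gd}

# Hypothetical (g, m, n) values, not the entries of Table I.
ead_params = (0.002, 2.0, 0.10)
gd_params = (0.003, 3.0, 0.32)

early = total_dvth_dc(1.0, 1.0, 0.001, ead_params, gd_params)
late = total_dvth_dc(1.0, 1e8, 0.001, ead_params, gd_params)
gd_share_early = early["GD"] / early["total"]
gd_share_late = late["GD"] / late["total"]
```

Because the GD component has the larger time exponent, its share of the total grows with time in this toy example.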
B. Model Validation for Predictive Capability
The proposed A-G model is of value only if it can predict the PBTI degradation at low use bias and longer time, outside the range used for fitting the model parameters. The constant-E ox paradigm is necessary for model parameter extraction, but circuits operate under the constant-V g condition. To test the predictive capability of the A-G model, PBTI degradation under several constant V gst conditions was measured. Specifically, a relatively long test time is used for the lowest V gst, which is limited by the minimum measurable degradation. ΔV th under constant V g is predicted by the model framework in Fig. 11 with (2)-(6). The excellent agreement between the measurement and prediction for both dc and ac PBTI validates the predictive capability of the A-G model, as shown in Fig. 12(a) and (b). The frequency and duty factor dependences can also be predicted well in Fig. 13(a) and (b). It is worth pointing out that this good agreement does not come from fitting. This is because the model parameter extraction is based on the data from short-term dc constant-E ox tests, while the test data under constant V gst in Figs. 12 and 13 were not used for the fitting. Indeed, the lowest V gov = 0.5 V in Fig. 12(a) is well outside the range of stress biases used to extract the model parameters in Fig. 4(c). Therefore, PBTI cannot be modeled reliably by simply fitting test data with a power law and a defect-based A-G model should be used.
V. IMPLICATION TO PRACTICAL DEVICE OPERATION
Based on the established A-G model, PBTI can be predicted under operation condition [solid lines in Fig. 1(c) ]. If the prediction is made from the classical power law fitted with the total V th [the filled symbols in Fig. 1(a) ], there is a clear gap between the measured data and the prediction [the lower line in Fig. 1(c) ]. In the short term, this gap may appear insignificant (∼2 mV), but it leads to an overestimation of lifetime by over four orders of magnitude.
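To see why a small short-term gap can translate into a large lifetime error, consider two power laws that pass through the same measured point but have different time exponents. The sketch below uses hypothetical numbers: a reference point, a failure criterion, an exponent of 0.20 for the total-shift fit, and the 0.32 reported here for GD. It is not an attempt to reproduce the four decades in Fig. 1(c).

```python
import math

def lifetime(t_ref, dvth_ref, n, dvth_crit):
    """Time at which a power law through (t_ref, dvth_ref) reaches dvth_crit."""
    return t_ref * (dvth_crit / dvth_ref) ** (1.0 / n)

# Hypothetical reference point and failure criterion (illustrative only).
t_ref, dvth_ref, dvth_crit = 1e3, 5e-3, 50e-3

life_small_n = lifetime(t_ref, dvth_ref, 0.20, dvth_crit)  # underestimated exponent
life_large_n = lifetime(t_ref, dvth_ref, 0.32, dvth_crit)  # larger exponent
overestimate_decades = math.log10(life_small_n / life_large_n)
```

In this example the two curves agree at the reference point, yet the smaller exponent predicts a lifetime nearly two decades longer; the ratio grows further as the failure criterion moves away from the measured range.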
The contribution from each type of defect under dc and ac real use conditions is assessed in Fig. 14(a)-(d). Fig. 14(a) shows the PBTI kinetics under the dc condition. ATs reduce slightly at longer times, because the surface potential varies when GD and EAD increase under a constant V g. Although ATs become insignificant by the end of device lifetime, they must be taken into account during the PBTI test, so that an accurate time exponent can be extracted for GDs and EADs. EADs are one of the major components at both short and long times for dc PBTI.
Due to their larger time exponent, GDs increase faster than EADs and overtake EADs when approaching 10 years. Quantitatively, Fig. 14(c) shows the relative contributions of each defect at 1 day, 2 months, and 10 years. By the end of 10 years, the contribution from GD increases to ∼60%.
The degradation kinetics under ac PBTI condition is shown in Fig. 14(b) . Compared with dc, PBTI under ac is smaller. This is mainly due to the reduction of EAD, which discharges effectively during V g = 0 phase. On the other hand, GDs are similar for both dc and ac. At 10 years, GDs contribute to over 80% of the total degradation, as shown in Fig. 14(d) .
In recent years, most test data have shown that PBTI is smaller than NBTI under the same electric field [31], [32]. The test time, however, is typically limited to less than 10^4 s. In order to compare the long-term NBTI and PBTI, we extracted and validated the A-G model for both of them from the same wafer. The predicted degradations under real use conditions are compared in Fig. 15. It is clear that when the test time is short (e.g., <10 ks), NBTI is indeed larger. At 10 years, however, PBTI overtakes NBTI by a factor of 1.5, because the time exponent of GD is ∼0.32 for PBTI and ∼0.2 for NBTI.
VI. CONCLUSION
In this paper, for the first time, we demonstrate that the common power law model extracted from ΔV th does not warrant predicting PBTI outside the test conditions used for fitting model parameters. An A-G model is proposed, which can predict PBTI not only under dc but also under ac conditions with different frequencies and duty factors. This excellent predictive capability is validated through comparison between the measured data and the prediction from the model. Further analysis based on the established model reveals that although PBTI can be smaller than NBTI within the typical test time window, it becomes larger toward the end of device lifetime due to its larger time exponent. The simplicity of the model and its parameter extraction make the proposed methodology favorable for future process qualification and circuit-level reliability evaluation.
ACKNOWLEDGMENT
The test samples were supplied by IMEC.
