The aging effect negative bias temperature instability (NBTI), which depends strongly on device history, has a direct impact on the design of integrated circuits. To make realistic predictions available in the design process, the simulation durations of existing history-aware models must be significantly reduced. Therefore, this paper presents a performance-oriented, yet accurate abstraction of the switching trap NBTI model. Evaluation results for various stress scenarios demonstrate very precise NBTI simulations and a major improvement over another performance-oriented model abstraction. The resulting simulation durations facilitate realistic aging predictions of larger components in a reasonable period of time.
INTRODUCTION
With the pace of Moore's law, industry is driving technology dimensions further towards the atomic regime. With this scaling, the technology has picked up more and more physical artifacts that influence the usage of such devices. The advent of various flavors of static currents, such as gate tunneling in 65nm and gate induced drain leakage in 45nm [9], also introduced an increasing susceptibility to process variations as well as electro-thermal coupling. Currently, it seems as if aging effects could become one of the main challenges for this decade. As with the static currents, there is not just one new physical phenomenon; there is rather a vast selection of aging mechanisms.
We can separate these aging effects into different classes. First, there are degradation effects, which slowly vary relevant process parameters over time. At high temperatures, degradation is dominated by negative bias temperature instability (NBTI), where chemical traps in the gate oxide can capture and emit charges, thus increasing the device's threshold voltage [4]. At lower temperatures hot carrier degradation (HCD) dominates, where fast (hot) carriers can get trapped in the oxide, again influencing the threshold voltage of the device [10]. From an abstract view, all degradation effects result in a change of power demand and path timing. As soon as the available slack within one path is exceeded, degradation will also lead to a timing failure.
ISLPED'14, August 11-13, 2014, La Jolla, CA, USA. ACM 978-1-4503-2975-0/14/08. http://dx.doi.org/10.1145/2627369.2627618.
Permanent failures are thus the second class of aging. Electro-migration describes a force, introduced by high current density, that can dislocate interconnect material at elevated temperatures, resulting in a total loss of connection. Time-dependent dielectric breakdown (TDDB) may occur in oxides that have collected a vast number of trapped oxide charges, which form a conductive path through the oxide and thus lead to a permanent device failure [3]. Finally, there are radiation-induced permanent failures such as single event latchup (SEL), each of which ultimately leads to the thermal destruction of a device as a result of ionizing radiation.
Research and industry are currently trying to develop tools and methodologies that help to cope with aging at all design levels, from system design, where parameter adaption and redundancy may be employed, down to devices, where direct reduction of the effects is the main focus. Our contribution to this challenge, as presented in this work, is a simple, yet accurate abstraction of the state-of-the-art description of NBTI [7]. It reduces the per-transistor description from over a thousand parameters to just three without losing details such as healing behavior or the response to time-varying stress as induced by system idle phases or techniques such as power gating. The resulting abstraction is more than three orders of magnitude faster, enabling either the simulation of the total lifetime of a device or the simulation of a huge system of such devices on a reasonable time scale.
Before we introduce the implementation details of our model in Sections III and IV, we present the recent state of the art in NBTI modeling. Afterwards, Section V presents direct comparisons between our model and the standard model using industrial characterization data directly from the author of [7] . Section VI finally concludes our work.
RELATED WORK
Early in the last decade, a first NBTI model, called the reaction-diffusion (RD) model, was proposed [2]. It explained NBTI in good agreement with the measurement data available. According to [2], NBTI is caused by passivation hydrogen atoms from the gate-oxide interface, which are freed under stress by hole trapping (reaction) and then drift through the oxide (diffusion), with potential annealing over a short annealing distance. Much later, it was shown that RD can explain neither the ultra-short-time behavior of NBTI nor the long-time behavior of recent devices, favoring the switching trap (ST) model as proposed by [4].
The recent ST-based NBTI model assumes huge numbers (thousands) of traps inside each device's oxide, each of which is formed between two adjacent SiO2 molecules. Each uncharged trap may capture an electron, and the charged trap may re-emit it, with potential barriers and thus related transition times in the order of milliseconds. Additionally, for a charged trap, the local crystalline structure of the oxide can swap, driving the trap over a potential barrier large enough to also explain the macroscopic transition times (seconds to months) observed in measurements. Depending on the exact configuration of the molecules, the potential barriers between the states may vary significantly, leading to characteristic capture times τC and emission times τE per trap. Figure 1 (left) shows such a distribution as a two-dimensional histogram. As can be seen, capture and emission times are weakly correlated: while both spread over several orders of magnitude, the capture and emission time of a trap are usually in the same order of magnitude, i.e. traps with a long capture time also tend to have long emission times. During ST simulations, the occupation probabilities of the existing traps are calculated based on given stress scenarios.
Besides the abstraction we propose, there have been several other approaches. [6] proposed a first abstraction of the NBTI conditions, lumping all transient NBTI defects into two components. Even though the model itself is too simplistic, the authors propose a useful gate-netlist reduction technique that eliminates nodes which are irrelevant for the circuit's timing correctness. In [5], an abstract model for system-level NBTI and HCD description is presented. It models NBTI with an extended RD model, called the composite model, and gives a good prediction of a full system's timing behavior under degradation. However, the composite model, like other closed-form solutions of the NBTI degradation, only handles constant stress over the entire system lifetime and does not support varying stress conditions within the device's history. Additionally, as it is still based on the RD model, it cannot be characterized with recent (ST-based) technology data.
Industrial tools for aging simulation are already available, such as RelExpert [1], which can accurately predict aging of a circuit under static conditions by simulating the fresh transistor netlist, computing per-device stress conditions and applying a per-device aging model. With the degraded netlist, the simulation at SPICE level is repeated to obtain power and timing figures of the system after stress. However, the accuracy of such methods depends on the employed aging model, which today is also bound to static stress conditions.
To the best of our knowledge, all existing NBTI models are either too slow for a full-chip and/or full-lifetime simulation, cannot handle the varying stress conditions that typically occur in most systems, or do not accurately follow silicon measurements.
ABSTRACTED DEFECT OCCUPATION
The initial model assumption, as already proposed by [7], is that the discrete traps, each of which can be either fully charged or fully uncharged, can be replaced by a continuous statistical process, which can also be described using RC circuits. This step is necessary when compacting the explicit traps into a trap distribution map as presented in Figure 1. Thus we assume a two-dimensional histogram ∆V th (τC, τE). The occupation state of all these traps over time can be described by a defect occupation P(τC, τE, t), which evolves according to Equation 1 if the system is stressed for a time ∆t and according to Equation 2 if the stress is removed for a time ∆t. The transient threshold damage due to NBTI over time can be computed as

$$\Delta V_{th}^{tran}(t + \Delta t) = \iint P(\tau_C, \tau_E, t + \Delta t) \cdot \Delta V_{th}(\tau_C, \tau_E) \, d\tau_C \, d\tau_E \tag{3}$$

Examples of P functions under different stress loads are visualized in Figure 2. As can be seen, occupied and unoccupied traps are separated by a monotone function, since trap charging and discharging always start at low values of τC and τE, respectively. For a wide range of stress scenarios this monotone function is typically almost rectangular. Figure 1 (right) also depicts this function and illustrates the connection between defect occupation and distribution histogram for a partially charged transistor state. The permanent threshold shift is modeled similarly to Equations 1 and 3, with the only difference that the histogram ∆V th perm (τC) and the defect occupation are only one-dimensional.
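The interplay of the occupation update and Equation 3 can be illustrated on a small discrete grid. The following is a minimal sketch, not the model of [7]: since Equations 1 and 2 are not reproduced in this text, it assumes first-order (RC-like) charging and discharging per trap bin, and the 3x3 map, the 1 mV bin values and all function names are hypothetical.

```python
import math

def stress_step(p, tau_c, dt):
    """Assumed RC-like charging: occupation moves towards 1 with time constant tau_c."""
    return 1.0 - (1.0 - p) * math.exp(-dt / tau_c)

def relax_step(p, tau_e, dt):
    """Assumed RC-like discharging: occupation decays towards 0 with time constant tau_e."""
    return p * math.exp(-dt / tau_e)

def transient_shift(occupation, dvth_map):
    """Discrete counterpart of Equation 3: sum over bins of occupation times bin shift."""
    return sum(p * dv
               for p_row, dv_row in zip(occupation, dvth_map)
               for p, dv in zip(p_row, dv_row))

# Hypothetical map: rows indexed by capture time, columns by emission time (seconds).
tau_c = [1e-3, 1.0, 1e3]
tau_e = [1e-3, 1.0, 1e3]
dvth_map = [[1.0] * 3 for _ in range(3)]   # 1 mV per bin, illustrative only
occ = [[0.0] * 3 for _ in range(3)]        # fresh device: no trap occupied

def apply(occ, stressed, dt):
    """Advance every bin by dt under stress or relaxation."""
    return [[stress_step(p, tau_c[i], dt) if stressed else relax_step(p, tau_e[j], dt)
             for j, p in enumerate(row)]
            for i, row in enumerate(occ)]

occ = apply(occ, True, 10.0)                 # 10 s of stress
after_stress = transient_shift(occ, dvth_map)
occ = apply(occ, False, 10.0)                # 10 s of relaxation
after_relax = transient_shift(occ, dvth_map)
print(after_stress, after_relax)
```

Under these assumptions, bins with short capture times fill first during stress and bins with short emission times empty first during relaxation, which is the behavior that produces the monotone boundary between occupied and unoccupied traps.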
The first step in developing a history-aware and efficient NBTI model is an abstraction of the switching trap transistor state using only a few parameters. In this way, the device history can be tracked by observing the change of the abstraction parameters. Naturally, the permanent threshold shift ∆V th perm and the transient threshold shift ∆V th tran are used as the first abstraction parameters. A third parameter H is used to characterize the slope within the defect occupation due to periodic stress (compare Figure 2 (top right and bottom left)). Since the gradient is an intrinsic feature of the ST model [8], H only describes the logarithmic temporal length (emission time) of the interval stated by Equation 4. Since this parameter characterizes the short-time healing ability of the system, it is called "healability" within this paper.
[Figure 2, caption fragment: ... (bottom left) and a worst case scenario with a very generic minutes stress - minutes relax - seconds stress - seconds relax - milliseconds stress sequence (bottom right).]
A conversion between the original defect occupation of the ST model and the abstract representation is mandatory for our approach. The calculation of the abstraction parameters based on the defect occupation is a straightforward process using Equations 3 and 4. However, the generation of a defect occupation based on the abstraction parameters is a complex operation with various degrees of freedom. Therefore, this conversion is the key point of our approach and will be evaluated extensively in Section V.
The generation process of the defect occupation as indicated in Figure 3 (top) is based on the monotone function described above. The one-dimensional occupation of permanent defects is reconstructed by increasing a virtual capture time (see Equation 1) until ∆V th perm has reached the intended value. The two-dimensional occupation of transient defects that occur under typical stress conditions cannot be reconstructed based on a single parameter. However, the virtual capture time of the permanent component is a very good estimate of the height of the defect occupation at the maximal emission time τE,max, since the maximal emission time is too high to be reached under normal operating conditions. At first, the virtual capture time is used for the complete range of emission time values and a slope characterized by H is added in the short-time regime. Afterwards, a virtual emission time (see Equation 2) is increased, while the slope is always restored, until ∆V th tran has reached the intended value. In this way, the parameter ∆V th tran also determines the position of the slope on the emission time scale. The result of the generation process is always a rectangular defect occupation, which may have an additional slope (see Figure 3 (bottom)). Error values of the recreated defect occupations for the examples in Figure 2 are shown in Table 1. A clear deviation only occurs in the short-time regime of the worst case scenario (compare Figure 2 (bottom right) and Figure 3 (bottom)). Naturally, such generic defect occupations cannot be characterized precisely using only three parameters. However, as long as different defect occupations that are characterized by the same parameter values produce nearly the same results in a subsequent ST simulation, this characterization is sufficient for our phase space approach.
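The reconstruction of the permanent component can be sketched as a one-dimensional root search. The sketch below is illustrative only: it assumes a first-order occupation 1 - exp(-t/τC) per bin in place of the unreproduced Equation 1, and it uses bisection in logarithmic time rather than a stepwise increase of the virtual capture time. The histogram values and function names are hypothetical.

```python
import math

def perm_shift(t_virtual, tau_c, dvth_perm):
    """Permanent shift for a virtual stress time, assuming P = 1 - exp(-t/tau_c) per bin."""
    return sum(dv * (1.0 - math.exp(-t_virtual / tc))
               for tc, dv in zip(tau_c, dvth_perm))

def find_virtual_time(target, tau_c, dvth_perm, t_lo=1e-9, t_hi=1e12, iters=200):
    """Bisect in log time for the virtual time that reproduces the target shift.

    Works because perm_shift grows monotonically with t_virtual.
    The target must lie between perm_shift(t_lo) and perm_shift(t_hi).
    """
    for _ in range(iters):
        t_mid = math.sqrt(t_lo * t_hi)          # geometric midpoint
        if perm_shift(t_mid, tau_c, dvth_perm) < target:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return math.sqrt(t_lo * t_hi)

# Hypothetical one-dimensional histogram: four bins of 1 mV each.
tau_c = [1.0, 10.0, 100.0, 1000.0]
dvth_perm = [1.0, 1.0, 1.0, 1.0]
t_virtual = find_virtual_time(2.0, tau_c, dvth_perm)
print(t_virtual, perm_shift(t_virtual, tau_c, dvth_perm))
```

The same idea would carry over to the transient component, where a virtual emission time is adjusted until ∆V th tran is matched, with the slope described by H restored after each adjustment.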
PHASE SPACE APPROACH
Instead of an NBTI simulation that is based on the transformation of the complete defect occupation, the phase space approach tracks the transformation of the three-parameter abstraction (see Section III). However, the ST model cannot easily be modified to rely directly on the abstraction parameters. Consequently, the new approach uses a precomputed phase space to perform an NBTI simulation by following an (interpolated) phase space trajectory. The phase space is defined by the abstraction parameters ∆V th perm, ∆V th tran and "healability" H. A flow chart of the general simulation method is shown in Figure 4. During a phase space based simulation, the abstraction parameters and additional input arguments (e.g. stress scenario) are used to interpolate a step on a phase space trajectory. The resulting values of ∆V th perm, ∆V th tran and H can be used for the next time step, which may have different values of the additional input arguments. In this way, the simulation simply follows the phase space trajectory as long as the additional input arguments do not change.
In order to construct the phase space (rf. Figure 5 ), the defect occupation is generated with the method described in Section III and the actual transformation of the abstraction parameters during a time step is directly calculated with the ST model. The phase space construction can be processed in parallel and has to be done only once for each transistor technology and geometry.
The step size of the three abstraction parameters defines the interpolation error of the phase space simulations as well as the construction duration. Since ∆V th perm, ∆V th tran and H are restricted to certain ranges, the phase space incorporates all system states and a trajectory cannot leave the area of precomputed values. It is possible to simulate NBTI-induced aging with the same time resolution as with the ST model. Every time step on a phase space trajectory will induce a small interpolation error and an error due to the three-parameter abstraction. However, the phase space approach facilitates the usage of longer simulation time steps, if there is a reasonable abstraction of the stress (e.g. signal probability) and the stress scenario changes only on macroscopic time steps (e.g. power gating or ambient temperatures). These longer time steps vastly reduce the simulation duration.
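A single step on an interpolated phase space trajectory can be sketched as a table lookup. The sketch assumes a regular grid and trilinear interpolation, which the text does not specify, and it replaces the ST model by a hypothetical stand-in function `toy_st_step` when filling the table. Clamping to the grid boundary is likewise an assumption.

```python
import bisect

def _locate(axis, x):
    """Return the lower cell index and the fractional position of x on a sorted axis."""
    x = min(max(x, axis[0]), axis[-1])            # clamp to the precomputed range
    i = min(bisect.bisect_right(axis, x) - 1, len(axis) - 2)
    return i, (x - axis[i]) / (axis[i + 1] - axis[i])

def phase_space_step(state, axes, table):
    """Trilinearly interpolate the precomputed next state for (perm, tran, H)."""
    (i, u), (j, v), (k, w) = (_locate(a, s) for a, s in zip(axes, state))
    result = [0.0, 0.0, 0.0]
    for di, wu in ((0, 1.0 - u), (1, u)):
        for dj, wv in ((0, 1.0 - v), (1, v)):
            for dk, ww in ((0, 1.0 - w), (1, w)):
                node = table[i + di][j + dj][k + dk]
                for n in range(3):
                    result[n] += wu * wv * ww * node[n]
    return tuple(result)

def toy_st_step(perm, tran, h):
    """Hypothetical stand-in for one ST-model time step under fixed stress inputs."""
    return (perm + 0.1, 0.9 * tran + 1.0, 0.5 * h)

# Hypothetical grid: perm and tran in mV, H in arbitrary units.
axes = ([0.0, 10.0, 20.0], [0.0, 10.0, 20.0], [0.0, 1.0, 2.0])
table = [[[toy_st_step(p, t, h) for h in axes[2]] for t in axes[1]] for p in axes[0]]

state = (5.0, 12.0, 0.5)
for _ in range(3):                                # follow the trajectory for three steps
    state = phase_space_step(state, axes, table)
print(state)
```

In a real flow one such table would exist per combination of additional input arguments (for example per stress scenario), and the simulation would switch tables whenever these arguments change.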
EVALUATION

Model setup
As the main reference to assess the accuracy of all other models, we use the model of [7], with all model parameters based on direct silicon measurement at 440 K temperature and 2.2 V supply voltage (accelerated aging conditions), obtained directly from the author of [7] (see Figure 1 (left)). We will refer to it as the full switching trap model (FST). Under different generic and realistic stress conditions, we used the FST model with a time resolution of 1 ms per step, lumping together the cycle-based activity statistics into milliseconds of 100% busy or 100% idle. Going from cycle base to milliseconds was mandatory for the evaluation; its drawbacks will be discussed later. Additionally, we re-implemented the model used in [6], which is closest to our own approach. We will refer to this model as the lumped switching trap model (LST). Even though it is not fully disclosed in [6], we tried to optimize its free model parameters for our evaluation by using the same silicon data. To improve the fairness of the comparison, we also increased the number of defects from 2 to 4 by introducing two additional permanent defects. In order to define the defect properties of LST, we divided the capture time scale of transient and permanent defects within the silicon data into two regions each. While the induced ∆V th of each lumped defect is simply a summation over all defects within the region, the new time constants are calculated using a weighted mean of all corresponding time constants (see Equation 5). The particular regions are chosen in order to attain optimal evaluation results in the scenario 1 day, 10Hz and 50% duty cycle (see Figure 6). Table 2 summarizes the parameters of the 4 defects.
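The lumping of one region into a single LST defect can be sketched as follows. Equation 5 is not reproduced in this text, so the weighting is an assumption: the sketch weights each time constant by the ∆V th contribution of its defect and uses a plain arithmetic mean. The tuple layout and the function name are hypothetical.

```python
def lump_defects(defects):
    """Collapse the (dvth, tau_c, tau_e) defects of one region into one lumped defect.

    dvth values are summed; time constants are averaged with dvth as the weight
    (assumed weighting, the original Equation 5 may differ).
    """
    total = sum(dvth for dvth, _, _ in defects)
    tau_c = sum(dvth * tc for dvth, tc, _ in defects) / total
    tau_e = sum(dvth * te for dvth, _, te in defects) / total
    return total, tau_c, tau_e

# Two hypothetical defects: (dvth in mV, tau_c in s, tau_e in s).
region = [(1.0, 1.0, 2.0), (3.0, 5.0, 10.0)]
print(lump_defects(region))
```

Because capture and emission times span several orders of magnitude, a mean taken over logarithmic time constants would be an equally plausible reading; the choice changes the lumped values noticeably.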
We call our model as described in Sections III and IV the phase space based model (PSB). Taking the FST as a reference, we pre-compute the phase space for a time step of 60 s. For the sake of comparison of the abstraction itself, we use a very fine mesh to avoid additional interpolation errors.
Stress scenarios
We defined 24 stress scenarios to evaluate different aspects of our PSB model. A continuous stress scenario gives the upper bound of the reachable threshold shift. 30s stress, 30s relax is an example of a perfectly rectangular defect occupation (see Figure 2 (top left)) that does not need the "healability" abstraction parameter. Worst case generically constructs the worst defect occupation we could reach by putting the system under a very generic minutes stress - minutes relax - seconds stress - seconds relax - milliseconds stress sequence. Nine scenarios represent typical frequency - duty cycle stress to make this work comparable with other publications. All these stress scenarios were applied for simulation times of 1 hour, 1 day and 1 week. Therefore, the PSB simulations comprise 60, 1440 and 10080 conversions between the defect occupation of FST and the abstract representation, respectively. Evaluation results of these scenarios are shown in Figure 6.
Additional evaluation results with a simulation time of 1 day are shown in Figure 7. There are 5 binary Markov graph scenarios, showing non-periodic, yet strictly characteristic signals with parameterized up-switching probability P01 and down-switching probability P10. In order to demonstrate short-time effects within these scenarios, a relaxation time of 0.1 s was added in additional examples. The corresponding frequency - duty cycle scenarios were also implemented as a reference. The quasi-frequency fq and the signal probability P1 can be computed from P01 and P10.
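A binary Markov stress signal and its characteristic quantities can be sketched as below. The closed-form expressions are the standard stationary results for a two-state Markov chain and are offered as an assumption about the omitted equation; the time step `dt`, the seed and the function names are hypothetical.

```python
import random

def signal_probability(p01, p10):
    """Stationary probability of the stressed state of a two-state Markov chain."""
    return p01 / (p01 + p10)

def quasi_frequency(p01, p10, dt):
    """Inverse of the mean period: mean off-run (dt/p01) plus mean on-run (dt/p10)."""
    return 1.0 / ((1.0 / p01 + 1.0 / p10) * dt)

def markov_stress(p01, p10, steps, seed=0):
    """Generate a 0/1 stress trace; one entry per time step."""
    rng = random.Random(seed)
    state, trace = 0, []
    for _ in range(steps):
        if state == 0 and rng.random() < p01:
            state = 1                     # switch stress on
        elif state == 1 and rng.random() < p10:
            state = 0                     # switch stress off
        trace.append(state)
    return trace

p01, p10, dt = 0.2, 0.1, 1e-3             # dt = 1 ms, matching the FST resolution
trace = markov_stress(p01, p10, 200_000)
print(signal_probability(p01, p10), quasi_frequency(p01, p10, dt),
      sum(trace) / len(trace))
```

With P01 twice as large as P10 the signal probability is 2/3, and with P01 five times as large it is 5/6, so such ratios would reproduce quasi duty cycles of about 67% and 83%.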
Finally, there are three tri-state Markov graphs, generating stress scenarios that are typical for devices with long idle times (e.g. from power gating). These systems remain for a macroscopic time (several minutes, as specified by P PG) either idle (P1 = 0) or active with given P01 and P10 values. For each scenario, two examples are implemented, one with an on-state and one with an off-state in the last macroscopic time step. These realistic scenarios are currently not supported by any model other than the three models evaluated here.
Result discussion
First, we note that PSB partially over- and underestimates the correct threshold shift in the scenarios of Figure 6, with a worst case error smaller than 4%. Neither the variation of simulation time nor that of signal frequency or duty cycle has a distinct impact on the accuracy of the simulation results. It is also worth mentioning that the error for the permanent part of the threshold damage is never larger than 0.05 mV. This can be explained by the modeling methodology: In the FST, permanent damage is represented by an occupation of traps with different capture times. Since there is no emission, the occupation distribution is directly determined by Equation 1 using a transistor lifetime that is reduced by all relaxation times. This behavior is directly modeled with the phase space based approach. For much longer simulation times, we may therefore see a slightly larger modeling error of the transient part, but the permanent part will still be almost perfectly modeled. The accuracy of the scenario with an almost rectangular defect occupation (30s stress, 30s relax) coincides with that of the other periodic stress scenarios, showing that the slope within the defect occupation of periodic stress examples is well abstracted by the healability.
Parameters of our implementation of LST were characterized to have minimal deviation for the scenario 1 day, 10Hz and 50% duty cycle. However, clear deviations up to -64% and +139% occur in other scenarios. LST shows the tendency to underestimate the threshold shift in scenarios with 10% duty cycle or long duration times and to overestimate ∆V th for 90% duty cycle or short simulation times. Since there is only a minor influence of the signal frequency on LST and almost no impact on PSB, it may be suggested that these results are also valid for real components, surely having much higher switching frequencies.
Much longer on and off times are possible in the binary Markov graph scenarios than in quasi frequency signals. However, differences in threshold voltages are only simulated for scenarios with a quasi duty cycle of 67% and 83% (see Figure 7). An additional relaxation time of 0.1 s is needed in these Markov graph scenarios to align ∆V th with the quasi frequency scenarios. This relaxation time corresponds to an off-time that is 4 or 10 times larger than the nominal value of the quasi frequency signal, respectively. Hence, this difference cannot be modeled with static NBTI models, such as [1] or [5], leading to an underestimation of the NBTI effect. Although LST operates close to the characterization example, simulation results of PSB are slightly more precise within these scenarios. The aforementioned difference between Markov graph and quasi frequency scenarios can also only be simulated by FST and PSB. While the effect of the additional relaxation time is only slightly underestimated by PSB, LST simulates almost no change in threshold voltage.
Both abstract models are able to simulate components with long off times, as induced by idle, active waiting, or power gating. This is a great advantage in comparison to static NBTI models ([1] or [5]). Examples having an off-state in the last macroscopic time step show that LST clearly overestimates the relaxation within such a large time step.
We implemented all three models in an interpreted language on a 64-core 2.3GHz AMD Opteron machine with 512GB memory. The very accurate FST model could compute 1 hour of device lifetime in about half an hour. Thus it would need a prohibitive 5 years to simulate a 10 year transistor lifetime. Both alternative models, the LST and our PSB, were over 3 orders of magnitude faster, simulating 1 hour of device lifetime in about one second. Thus a 10 year simulation becomes feasible, needing less than 2 days of computation time. The PSB model needs one phase space characterization run in advance, which has to be repeated for each transistor technology and geometry. Such a characterization needs some hours of computation time (36 hours for our very fine characterization for evaluation purposes).
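The runtime extrapolation can be checked with a few lines of arithmetic. The sketch takes the two measured rates from the text as rounded inputs (about half an hour and about one second of computation per simulated hour) and assumes 365-day years; the variable names are illustrative.

```python
import math

HOURS_PER_YEAR = 365 * 24
lifetime_hours = 10 * HOURS_PER_YEAR             # 10 year transistor lifetime

fst_hours_per_sim_hour = 0.5                     # FST: about half an hour per hour
fast_seconds_per_sim_hour = 1.0                  # LST and PSB: about one second per hour

fst_years = lifetime_hours * fst_hours_per_sim_hour / HOURS_PER_YEAR
fast_days = lifetime_hours * fast_seconds_per_sim_hour / 86400
speedup = fst_hours_per_sim_hour * 3600 / fast_seconds_per_sim_hour

print(fst_years, fast_days, speedup, math.log10(speedup))
```

With these rounded inputs the fast models need roughly one day for ten years of device lifetime, consistent with the stated bound of less than 2 days, and the speedup of 1800 corresponds to slightly more than three orders of magnitude.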
CONCLUSION
Several abstract NBTI models are already known, some of which show good agreement with silicon data when assessed for periodic signals with arbitrary duty cycles. For long and non-periodic off times, e.g. induced by components that are idle, actively waiting or even power gated, the NBTI model has to incorporate the complete device history. In these scenarios only the original switching trap model and our new model can accurately capture the transient healing effects for a wide range of signal probabilities and simulation times. Our model shows a moderate error in the transient part and a negligible error in the permanent part of NBTI and speeds up the simulation by more than three orders of magnitude. The model we presented focuses on the device description only and could easily be integrated into sophisticated simulation environments, such as RelExpert [1], the academic path pruning approach [6] or any other aging flow.
Our follow-up research will focus on support for varying temperature and voltage stress. Finally, we intend to include an accurate model in a fast simulation environment, enabling long-term, full-system assessment of NBTI and of the effect of optimization techniques such as power gating.
ACKNOWLEDGMENT
This work presents results achieved within the European ICT FP7 project MoRV (619234) and within a subcontract from Infineon in the European Catrene project RELY (CA 403). We would like to thank Infineon for their friendly cooperation.
