Abstract-With the downscaling of MOSFETs to nanometer dimensions, transistor electrical parameter variability is no longer produced only by variations of physical dimensions and doping profiles, which are present since device fabrication and remain static over time. Besides these time-zero variability factors, factors that lead to performance variability from one instant in time to another start playing a significant role. Random Telegraph Noise (RTN) is among these relevant time-dependent variability sources. In this work we extend the knowledge of the time-dependent random variability induced by RTN by providing a statistical model for the transistor threshold voltage jitter produced by RTN. The area scaling of the jitter is detailed and discussed, supporting designers in transistor sizing towards a more reliable design. Not only the jitter expected in a transistor is modeled, but also its variability among transistors that by design should be equal. Besides analytical modeling, Monte Carlo simulations are run. The simulations account for the charge carrier capture and emission events related to RTN, allowing the proper evaluation of the RTN-related jitter. The Monte Carlo simulations validate the analytical model and illustrate the area scaling of jitter and its variability.
INTRODUCTION
MOSFETs are employed in digital, analog and mixed-signal integrated circuit (IC) designs. Yield and reliability of ICs using MOSFETs depend on the variability of the transistor parameters and on noise. Both stochastic variability and noise scale inversely with area. On the other hand, cost increases with area, and increasing area may also increase capacitive load, which decreases performance and increases power consumption. For the designer to be able to find the adequate balance between cost, reliability and performance, adequate models are needed.
There are sources of variability that are time independent, present in the fresh (new) device due to imperfections in the fabrication process and the discrete nature of matter. These include line edge roughness, random dopant fluctuations, and metal (or poly) gate granularity. But there are also time-dependent (time-varying) sources of variability. Random Telegraph Noise (RTN) is among these time-varying sources of variability [1, 2].
There are well-established models for the time-zero variability. To a first order, time-zero parameter variability (parameter statistical standard deviation) is considered to be inversely proportional to the square root of the area [1]. Furthermore, in circuits that contain a large number of small area transistors, the impact of RTN-induced fluctuation is considered to increase when compared with the static variation caused by the manufacturing process [2].
In this work we discuss and detail the area dependence of the time dependent variability induced by RTN.
In digital circuits, the RTN chronological statistics, especially trap occupancy switching, have a direct impact on circuit performance and reliability, since degradations such as signal jitter happen when a trap switches state.
In this work, an analytical model to evaluate the RTN-induced threshold voltage jitter and its impact in digital circuits is presented. To the best of our knowledge, this is the first statistical model derivation of the RTN-induced threshold voltage jitter and its variability. Properly modeling variability is of paramount relevance, since not only the amplitude of RTN increases with area downscaling, but also the variability increases. Variability increases even faster than average amplitude, as shown below.
ANALYTICAL MODEL

A. Model Derivation
The alternate capture and emission of carriers at individual defect sites (traps) generates discrete fluctuations in the device current. These fluctuations, also called Random Telegraph Noise (RTN), are the main source of low-frequency noise in deep-submicron MOSFETs. Figure 1 shows an RTN time trace measured on a small area nMOSFET, originating from a single trap. The current is seen to alternate between a higher current state and a lower current state. The difference (fluctuation) in current between states is δID. In large area transistors a large number of traps is usually found, but with a reduced individual impact. Usually current fluctuations are measured. The current fluctuation δID may be translated into a threshold voltage fluctuation δVT [1, 3, 7, 11, 12].
This capture and emission of electrons by a charge trap may be modeled as a two-state fluctuation of the threshold voltage VT. If the trap is empty, we consider that the device is at a lower VT (hence higher current). If a charge carrier is trapped, we consider the device is at a higher VT (hence lower current). The difference in VT between the trapped and empty state of a single trap is then δVT. Considering the ith trap, δVTi is the VT fluctuation due to the ith trap. Please note that for traps that lead to a lower VT if occupied (and hence a higher VT if empty) the notation for the states may be exchanged, without any loss of generality or model validity.
The value of the VT fluctuation due to all traps at time t, here called ΔVT(t), is then evaluated as the sum of the contributions of all n traps found in a device:

$$\Delta V_T(t) = \sum_{i=1}^{n} S_i(t) \qquad (1)$$

For simplicity and consistency with noise, we make ⟨ΔVT(t)⟩=0 (noise is assumed to have an average value of zero). This also leads to a compact solution for the jitter and its variability. Zero average value is achieved by making the average value of the VT fluctuation due to each trap equal to zero, by writing Si(t) as

$$S_i(t) = \begin{cases} +\,\delta V_{Ti}\,P_i(0), & \text{trap } i \text{ occupied at time } t \\ -\,\delta V_{Ti}\,P_i(1), & \text{trap } i \text{ empty at time } t \end{cases} \qquad (2)$$

where Pi(1) is the probability of the ith trap being occupied and Pi(0) is the probability of the ith trap being empty:

$$P_i(0) = \frac{\tau_{Ci}}{\tau_{Ci}+\tau_{Ei}} \quad \text{and} \quad P_i(1) = \frac{\tau_{Ei}}{\tau_{Ci}+\tau_{Ei}} \qquad (3)$$

where τC and τE are the capture and emission time constants, respectively. Please note that this notation does not change the waveform of the RTN. It just removes the average value (DC component) of the waveform. The amplitude of the ΔVT(t) induced by the ith trap remains δVTi, since Pi(0)+Pi(1)=1. See Fig. 2, where this is exemplified. The average value (DC component) of the waveform contributes neither to jitter nor to noise. Hence, it is appropriate to remove it, which also facilitates the statistical analysis.
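The zero-mean two-state description can be checked numerically. The following is a minimal sketch, not the simulator used in this work: it assumes the levels +δVTi·Pi(0) for an occupied trap and −δVTi·Pi(1) for an empty trap, a fixed time step much shorter than both time constants, and a hypothetical function name `simulate_trap`.

```python
import random


def simulate_trap(delta_vt, tau_c, tau_e, n_steps, dt=1.0, seed=0):
    """Return a zero-mean two-state RTN trace S_i(t) for a single trap.

    Assumption: dt is much smaller than tau_c and tau_e, so dt/tau is a
    valid per-step switching probability.
    """
    rng = random.Random(seed)
    p_empty = tau_c / (tau_c + tau_e)      # P_i(0)
    p_occupied = tau_e / (tau_c + tau_e)   # P_i(1)
    occupied = rng.random() < p_occupied   # start from the stationary state
    trace = []
    for _ in range(n_steps):
        if occupied:
            if rng.random() < dt / tau_e:  # emission event
                occupied = False
        elif rng.random() < dt / tau_c:    # capture event
            occupied = True
        # DC component removed: the two levels still differ by delta_vt.
        level = delta_vt * p_empty if occupied else -delta_vt * p_occupied
        trace.append(level)
    return trace


if __name__ == "__main__":
    trace = simulate_trap(delta_vt=1.0, tau_c=20.0, tau_e=10.0, n_steps=200_000)
    mean = sum(trace) / len(trace)
    variance = sum(x * x for x in trace) / len(trace) - mean ** 2
    print("mean:", mean)
    print("variance:", variance, "expected:", (20 / 30) * (10 / 30))
```

For a long trace the sample mean approaches zero and the sample variance approaches δVTi²·Pi(0)·Pi(1), while the peak-to-peak amplitude stays equal to δVTi.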
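Then the variance of the threshold voltage fluctuation is evaluated starting from

$$\sigma_{\Delta V_T}^2 = \langle \Delta V_T(t)^2 \rangle - \langle \Delta V_T(t) \rangle^2 = \langle \Delta V_T(t)^2 \rangle \qquad (4)$$

For evaluating ⟨ΔVT(t)²⟩, we note that for a single trap

$$\langle S_i(t)^2 \rangle = \delta V_{Ti}^2\,P_i(0)\,P_i(1) = \delta V_{Ti}^2\,\frac{\beta_i}{(1+\beta_i)^2} \equiv A_i^2, \qquad \beta_i = \frac{\tau_{Ci}}{\tau_{Ei}} \qquad (5)$$

Since the traps switch independently and each Si(t) has zero mean, the cross terms vanish, and for a transistor with n traps

$$\sigma_{\Delta V_T}^2 = \sum_{i=1}^{n} A_i^2 \qquad (6)$$

whose expected value is

$$\langle \sigma_{\Delta V_T}^2 \rangle = \langle n \rangle \langle A_i^2 \rangle \qquad (7)$$

In the case of τCi=τEi (50% occupation probability) Ai will be equal to δVTi/2. As in the case of noise, this (τCi=τEi) is the situation in which a trap generates the largest jitter. In the case of τCi≠τEi, Ai will be smaller than δVTi/2. Equations (6) and (7) above give the jitter expected in a single transistor, i.e., the expected threshold voltage variation over time in a given transistor. It is also important to evaluate the variability of the expected (average) value of jitter among different transistors. Different transistors will show different jitter. Jitter variability may be evaluated using

$$\sigma^2_{\sigma^2_{\Delta V_T}} = \langle (\sigma_{\Delta V_T}^2)^2 \rangle - \langle \sigma_{\Delta V_T}^2 \rangle^2 \qquad (8)$$

It is evaluated as being

$$\sigma^2_{\sigma^2_{\Delta V_T}} = \langle n \rangle \langle A_i^4 \rangle \qquad (9)$$
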
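The intermediate steps behind equations (5), (6), (7) and (9) can be written out as follows. This is a sketch under stated assumptions: the zero-mean levels are taken as +δVTi·Pi(0) (occupied, probability Pi(1)) and −δVTi·Pi(1) (empty, probability Pi(0)), traps are assumed to switch independently of each other, and the last line uses the compound-sum variance identity, which reduces to ⟨n⟩⟨A⁴⟩ when the number of traps is Poisson distributed (Var[n] = ⟨n⟩), as assumed in the Monte Carlo simulations of this work.

```latex
\begin{align*}
\langle S_i \rangle
  &= P_i(1)\,\delta V_{Ti} P_i(0) - P_i(0)\,\delta V_{Ti} P_i(1) = 0 \\
\langle S_i^2 \rangle
  &= P_i(1)\,[\delta V_{Ti} P_i(0)]^2 + P_i(0)\,[\delta V_{Ti} P_i(1)]^2 \\
  &= \delta V_{Ti}^2\, P_i(0) P_i(1)\,[P_i(0) + P_i(1)]
   = \delta V_{Ti}^2\, P_i(0) P_i(1) = A_i^2 \\
\langle \Delta V_T^2 \rangle
  &= \sum_{i} \langle S_i^2 \rangle
   + \sum_{i \neq j} \langle S_i \rangle \langle S_j \rangle
   = \sum_{i=1}^{n} A_i^2 \\
\Big\langle \sum_{i=1}^{n} A_i^2 \Big\rangle
  &= \langle n \rangle \langle A^2 \rangle \\
\mathrm{Var}\Big[ \sum_{i=1}^{n} A_i^2 \Big]
  &= \langle n \rangle\, \mathrm{Var}[A^2] + \mathrm{Var}[n]\, \langle A^2 \rangle^2
   = \langle n \rangle \langle A^4 \rangle
   \quad \text{for } \mathrm{Var}[n] = \langle n \rangle
\end{align*}
```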
B. Area Scaling and Relation to 1/f noise
To detail the area scaling of $\langle \sigma_{\Delta V_T}^2 \rangle$ and $\sigma^2_{\sigma^2_{\Delta V_T}}$ we must look at the area dependence of <n> and Ai. These parameters are well studied in the literature. The number of traps is known to be proportional to area, <n>~WL. The average amplitude due to a trap is known to scale inversely with area, Ai~1/WL [1, 3, 7, 8]. From (7), the following relation between device area and average expected jitter value can then be written

$$\langle \sigma_{\Delta V_T}^2 \rangle = \langle n \rangle \langle A_i^2 \rangle \sim \frac{1}{WL} \qquad (10)$$

For the variation of jitter among devices, from (9) the following relation between device area and jitter variability among devices can be written

$$\sigma^2_{\sigma^2_{\Delta V_T}} \sim \frac{1}{(WL)^3} \qquad (11)$$

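Please note that then $\sigma_{\sigma^2_{\Delta V_T}} \sim 1/(WL)^{3/2}$, i.e., the standard deviation of the jitter among devices grows faster with area downscaling than its average value, which scales as 1/(WL). This means that with area scaling not only the expected jitter value increases, but also the variability of the expected jitter value increases. This increasing time-dependent random variability is a significant challenge for the device designer.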
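The two scaling laws can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses hypothetical normalized constants (`traps_per_area`, `amplitude_times_area`), gives every trap the same amplitude and τC = τE so that Ai = δVTi/2, and evaluates the mean as ⟨n⟩⟨A²⟩ and the variance as ⟨n⟩⟨A⁴⟩.

```python
import math


def jitter_statistics(area, traps_per_area=1.0, amplitude_times_area=1.0):
    """Return (mean of sigma^2, variance of sigma^2) for a gate area.

    Assumptions (illustrative): <n> = traps_per_area * area, every trap has
    delta_VT = amplitude_times_area / area and beta = 1, so A = delta_VT / 2.
    """
    mean_n = traps_per_area * area
    a = 0.5 * amplitude_times_area / area
    mean_sigma2 = mean_n * a ** 2   # <n><A^2>
    var_sigma2 = mean_n * a ** 4    # <n><A^4>
    return mean_sigma2, var_sigma2


def loglog_slope(index, area_1, area_2):
    """Slope of log(statistic) versus log(area) between two areas."""
    y1 = jitter_statistics(area_1)[index]
    y2 = jitter_statistics(area_2)[index]
    return (math.log(y2) - math.log(y1)) / (math.log(area_2) - math.log(area_1))


if __name__ == "__main__":
    print("exponent of <sigma^2>:", loglog_slope(0, 0.01, 1.0))
    print("exponent of Var[sigma^2]:", loglog_slope(1, 0.01, 1.0))
```

The fitted exponents are -1 for the mean and -3 for the variance, independent of the constants chosen, because only the area dependence of <n> and Ai enters.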
The average value of threshold voltage jitter is related to the average value of 1/f noise power. Equation (7) above has the same form and the same area dependence as equation (34) in [3]. Both equation (7) above and equation (34) in [3] are proportional to $\langle A_i^2 \rangle$ and to the average number of traps in a given device size. The same area dependency is seen in (10) above and (35) in [3]. This is expected, since the average value of jitter (and the average value of phase noise) due to RTN is related to the average 1/f noise due to RTN. Also, similarly to 1/f noise (frequency domain), the traps that contribute most to jitter (time domain) are the ones with capture time similar to the emission time, i.e., β = τC/τE ≈ 1.
Similarly, the variability of threshold voltage jitter is related to the variability of 1/f noise power. Equation (9) above has the same form and the same area dependence as equation (38) in [3]. Both equation (9) above and equation (38) in [3] are proportional to $\langle A_i^4 \rangle$ and to the average number of traps in a given device size. The same area dependency is seen in (11) above and (38) in [3]. This is expected, since jitter variability (and phase noise variability) due to RTN is related to the variability of 1/f noise due to RTN.
Hence, as seen in 1/f noise, not only is the average value of jitter expected to increase with device area downscaling, but the variability of jitter performance between devices also strongly increases with device downscaling. Each device has a random number of traps with random amplitudes and random time constants. The VT jitter of a device is given by its particular number of traps and related parameters. Number of traps and related parameters vary among devices. The smaller the device size, the larger the jitter variability among devices.
[Fig. 3 caption: Monte Carlo simulations are run with 1000 runs for each device size. Each point in the graph is the value for a Monte Carlo simulation run, i.e., the value of $\sigma_{\Delta V_T}^2$ of a single device. Black diamonds show the average jitter for each area, which is in good agreement with (7). Not only does the VT jitter average value increase with device downscaling, but its spread (variability) also increases.]
MONTE CARLO SIMULATIONS
We ran Monte Carlo (MC) simulations to confirm the behavior predicted by the analytical model. The MC simulations assume that: i) charge trapping and detrapping are stochastic events governed by characteristic time constants, which are uniformly distributed on a log scale; ii) the number of traps is Poisson distributed, and the average number of traps in a device is proportional to the device area; and iii) the VT fluctuation induced by a single trap is an exponentially distributed random variable, whose average amplitude is inversely proportional to the device area. These assumptions are in line with relevant experimental data for RTN and BTI [1, 3], as well as with TCAD analysis [9, 10]. Please note that these assumptions are made only to allow running the MC simulations. The analytical model derived in the previous section is valid for any statistical distribution of number of traps and VT fluctuation (no particular distribution is assumed in model derivation). Besides illustrating the applicability of the model developed here, the results from the Monte Carlo simulations are compared to the analytical model, validating it.
Values for the parameters are taken from the literature. The average VT fluctuation due to a trap <δVTi> is assumed to scale inversely with transistor gate area, <δVTi>=Bη/(W·L) [1]. For details please see equation (6) in [1] and the related discussion. In the 28nm technology studied in [1], for the nMOSFET Bη = 0.0053 mV·μm² [1]. For the calculation of the average number of active traps per device <n>, a defect density of 10¹¹ cm⁻² is assumed [1, 8]. Table I shows the device sizes used in the Monte Carlo simulations and in the evaluation of the analytical model equations, as well as the respective average number of active traps per device <n> and the average VT fluctuation due to a trap <δVTi>.
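As a worked example of how the two device-level parameters follow from the quoted values, the sketch below computes <n> and <δVTi> for a given W x L. It assumes that Bη is expressed in mV·μm² (so that Bη/(W·L) is in mV) and that the defect density is per unit gate area; the function name `trap_statistics` is hypothetical.

```python
def trap_statistics(width_um, length_um,
                    b_eta_mv_um2=0.0053, trap_density_cm2=1e11):
    """Return (average number of traps, average delta_VT in mV).

    Assumptions: b_eta is in mV*um^2 and the trap density is per cm^2 of
    gate area, using the values quoted in the text.
    """
    area_um2 = width_um * length_um
    density_um2 = trap_density_cm2 / 1e8      # 1 cm^2 = 1e8 um^2
    mean_traps = density_um2 * area_um2        # <n> ~ W*L
    mean_dvt_mv = b_eta_mv_um2 / area_um2      # <delta_VT> ~ 1/(W*L)
    return mean_traps, mean_dvt_mv


if __name__ == "__main__":
    n_avg, dvt_avg = trap_statistics(0.12, 0.03)
    print(f"<n> = {n_avg:.2f} traps, <dVT> = {dvt_avg:.2f} mV")
```

For the 0.12μm x 0.03μm device this gives about 3.6 traps on average and an average single-trap fluctuation of about 1.47 mV under the stated unit assumptions.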
For each device size, 1000 Monte Carlo simulations are run. For each simulation, the size of the time series is 3×10⁵ points. At each simulation time step, the trap switching probability (capture or emission of a charge carrier by a trap) is evaluated according to the trap capture or emission time constant. As an example, Fig. 2 shows the first 10⁴ points of the Monte Carlo runs of two transistors with WxL of 0.12μm x 0.03μm. The device in Fig. 2(a) shows a much larger VT fluctuation than the one in Fig. 2(b), illustrating the importance of properly modeling not only the expected VT jitter in a transistor, but also its variability among devices that by design should be equal. For the transistor in Fig. 2(a), the threshold voltage jitter VTjitter was σΔVT = 5.03 mV (σΔVT² = 25.4 mV²). For the transistor in Fig. 2(b), the threshold voltage jitter VTjitter was σΔVT = 0.013 mV (σΔVT² = 0.00016 mV²). The values obtained in the Monte Carlo simulations are compared to the analytical model. Please note that the moments of an exponentially distributed random variable X are given by <Xⁿ> = n!/λⁿ, where λ is the parameter of the exponential distribution and <X>=1/λ [6]. This allows evaluation of equations (7) and (9) and comparison to the results from the Monte Carlo simulations.
The results are shown in Fig. 3. Each point in the graph is the value for a Monte Carlo simulation run, i.e., the value of $\sigma_{\Delta V_T}^2$ of a single device. As expected, due to the random number of traps, random trap amplitude and random trap activity, VTjitter² is different for each device. Black diamonds show the average jitter for each device size, which is in very good agreement with the value predicted by equation (7). The area scaling of $\langle \sigma_{\Delta V_T}^2 \rangle$ and $\sigma^2_{\sigma^2_{\Delta V_T}}$ is as predicted by equations (10) and (11), respectively. As expected, due to the area scaling of $\sigma^2_{\sigma^2_{\Delta V_T}}$, in Fig. 3 the maximum vertical value for each area increases rapidly as the area becomes smaller. This behavior has been experimentally observed, but analytical modeling was lacking. The RTN distributions are clearly non-Gaussian, with variation increasing as the device area scales down, as observed e.g. in [2, 4]. Furthermore, the distribution is seen to be heavy tailed, as observed e.g. in [2, 5]. Figure 4 shows the histograms of the Monte Carlo simulations for the two different device sizes. Each histogram shows VTjitter² for the 1000 devices of the same size.
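The device-to-device spread can be illustrated without simulating the full time series. The sketch below is a shortcut, not the time-domain simulation used in this work: for each device it draws a Poisson number of traps, an exponentially distributed δVTi and log-uniform τC and τE, and sums the stationary per-trap variances Ai² = δVTi²·β/(1+β)² with β = τC/τE. The time-constant range, the seed and all function names are illustrative assumptions.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's multiplication method; adequate for small means only."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_trap_a2(rng, mean_dvt, tau_min, tau_max):
    """Draw one trap and return A_i^2 = dVT^2 * beta / (1 + beta)^2."""
    dvt = rng.expovariate(1.0 / mean_dvt)          # exponential amplitude
    log_span = math.log(tau_max / tau_min)
    tau_c = tau_min * math.exp(rng.random() * log_span)  # log-uniform
    tau_e = tau_min * math.exp(rng.random() * log_span)  # log-uniform
    beta = tau_c / tau_e
    return dvt ** 2 * beta / (1.0 + beta) ** 2


def device_jitter_variances(n_devices, mean_traps, mean_dvt,
                            tau_min=1e-6, tau_max=1.0, seed=1):
    """Return the jitter variance (sum of A_i^2) of each simulated device."""
    rng = random.Random(seed)
    return [
        sum(sample_trap_a2(rng, mean_dvt, tau_min, tau_max)
            for _ in range(poisson(rng, mean_traps)))
        for _ in range(n_devices)
    ]


if __name__ == "__main__":
    values = device_jitter_variances(1000, mean_traps=3.6, mean_dvt=1.47)
    print("average sigma^2 (mV^2):", sum(values) / len(values))
    print("largest sigma^2 (mV^2):", max(values))
    print("devices without traps:", sum(1 for v in values if v == 0))
```

With a small average number of traps, some devices have no trap at all while a few have a single trap with large amplitude and β close to 1, which is what produces the wide spread between nominally identical devices.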
It is also important to address the question of how the VT jitter varies among devices that by design should be identical (have the same W x L). This question is addressed by looking at the variance of VTjitter², which is $\sigma^2_{\sigma^2_{\Delta V_T}}$, given by equation (9). Figure 5 shows how the VT jitter varies among transistors that by design should be identical. It is seen that variability increases with decreasing device size, as predicted by the model. Black squares show the value obtained from the Monte Carlo simulations. Red dots show $\sigma^2_{\sigma^2_{\Delta V_T}}$ as predicted by (9). Very good agreement between the analytical model and the Monte Carlo simulations is seen.
This increase in variability with decreasing device size has also been observed experimentally. In [13], the authors observed that by slightly increasing the transistor size, more than 50% reduction of ring oscillator frequency uncertainty can be achieved. Ring oscillator uncertainty was related to transistor delay uncertainty due to RTN.
The results also highlight the relevance of studying the shape of the distribution of the VT fluctuation (δVTi) induced by a single trap. For instance, for an exponentially distributed δVTi, $\langle \sigma_{\Delta V_T}^2 \rangle$ is larger by a factor of 2! (i.e., 2 times) than for a constant δVTi with the same average, while $\sigma^2_{\sigma^2_{\Delta V_T}}$ is larger by a factor of 4! (i.e., 24 times). The area scaling, however, does not depend on the shape of the δVTi distribution.
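The factors of 2 and 24 follow directly from the moments of the exponential distribution, <Xⁿ> = n!/λⁿ. The sketch below evaluates them and cross-checks the second moment by sampling; the function names and the sample size are illustrative assumptions.

```python
import math
import random


def exponential_moment(mean, order):
    """<X^k> = k! / lambda^k for an exponential variable with <X> = 1/lambda."""
    return math.factorial(order) * mean ** order


def ratio_to_constant(order, mean=1.0):
    """Moment of an exponential amplitude over that of a constant amplitude
    with the same average value."""
    return exponential_moment(mean, order) / mean ** order


def sampled_moment(mean, order, n_samples=200_000, seed=0):
    """Monte Carlo estimate of <X^order>, used only as a cross-check."""
    rng = random.Random(seed)
    total = sum(rng.expovariate(1.0 / mean) ** order for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    print("second-moment ratio:", ratio_to_constant(2))   # enters <sigma^2>
    print("fourth-moment ratio:", ratio_to_constant(4))   # enters Var[sigma^2]
    print("sampled second moment:", sampled_moment(1.0, 2))
```

The second-moment ratio enters the average jitter through <δVTi²> and the fourth-moment ratio enters the jitter variability through <δVTi⁴>, which is why the distribution shape affects the prefactors but not the area exponents.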
CONCLUSION
In this work we extend the knowledge of the time-dependent random variability induced by RTN, by providing an analytical model for the threshold voltage jitter produced by RTN. We addressed not only the average (expected) jitter value, but also its variability among devices. The area scaling of RTN induced jitter and its variability is detailed and discussed, supporting designers in transistor sizing towards a more reliable design. Monte Carlo simulations are run, validating the analytical model and illustrating its applicability.
