Abstract-Optimization of SRAM yield using dynamic stability metrics has been evaluated in the past to ensure continued scaling of bitcell size and supply voltage in future technology nodes. Various dynamic stability metrics have been proposed but they have not been used in practical failure analysis and compared with conventional static margins. This work compares static and dynamic metrics to identify expected correlations. A dynamic stability characterization architecture using pulsed word-lines is implemented in 45 nm CMOS to identify sources of variability, and their impact on SRAM stability. Static read margins were observed to overestimate failures by 10-100 X while static write margins failed to predict outliers in critical writeability. Critical writeability was demonstrated to exhibit an enhanced sensitivity to process variations, random telegraph noise (RTN), and negative bias temperature instability (NBTI), compared to static write margins.
I. INTRODUCTION

SRAM scaling has been identified as one of the bottlenecks for supply voltage reduction in current and future technology nodes. Minimum SRAM operating voltage is a function of the magnitude of process-induced variability as well as array size. Aggressive SRAM bitcell scaling, together with the continued increase in SRAM array sizes, has resulted in stagnation in the scaling of SRAM minimum operating voltage. This trend is observed in reported values of SRAM array minimum operating voltage and is recognized in the latest edition of the International Technology Roadmap for Semiconductors (ITRS) (Fig. 1) [1].
SRAM minimum operating voltage is traditionally estimated using static margins such as static noise margin (SNM) and N-curves [2], [3]. Comparisons between static and actual dynamic access show that these metrics are optimistic in writeability and pessimistic in read stability [26]. Dynamic stability metrics, derived from the SRAM under dynamic access, have been proposed to provide a better estimate of SRAM minimum operating voltage [4]-[6]. While these metrics have been studied extensively through simulations, results based on large-scale silicon characterization of both read and write stability have not yet been reported. Similarly, a quantitative relationship between the static and dynamic read and write margins has not been studied. The sensitivity of dynamic stability to non-idealities such as random telegraph noise (RTN) and aging is still largely an open problem.
In this work, we propose a characterization architecture for measuring dynamic SRAM stability through pulsed word-lines calibrated to within 10 ps [7]. Measuring word-line pulse-widths calibrates out any timing uncertainty introduced by SRAM peripheral circuits, thus allowing characterization of the fundamental variability of the SRAM bitcells. This characterization methodology is validated in a commercial low-power 45 nm CMOS process. The test chip also provides a means of correlation with static read and write metrics via direct bit-line measurements [8]. This method is used to identify new sources of variability in dynamic stability by observing deviations from expected correlations between dynamic stability and static margins.
We first review conventional static and dynamic 6-transistor (6T) SRAM metrics, as well as their expected correlations, in Section II. Monte Carlo simulations that apply Gaussian-distributed variations to the six SRAM transistors are presented in that section to illustrate the expected correlations between the metrics. Section III presents the proposed dynamic stability characterization architecture, while Section IV describes an implementation in a 45 nm CMOS test chip. Section V summarizes measurement results and their implications. The margins typically exhibit proportionality to the supply voltage, and normalizing them allows for comparison with prior studies (e.g. [8], [9]).
II. STATIC AND DYNAMIC SRAM METRICS
A. Read Access
1) Static Read Current: This corresponds to the current that is sourced from the bit-line into the SRAM node storing a "0". During an SRAM read operation, this current discharges the pre-charged bit-line capacitance far enough to overcome the offset voltage of the sense-amplifier, so that the correct value is latched. It is expected to correlate with the actual read access time through (1).
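As a first-order illustration of this correlation, the bit-line can be treated as a lumped capacitance discharged by a constant cell current. The symbols below are assumptions introduced for illustration and are not taken from the text: $T_{\mathrm{READ}}$ for read access time, $C_{\mathrm{BL}}$ for bit-line capacitance, $V_{\mathrm{OS}}$ for sense-amplifier offset voltage, and $I_{\mathrm{READ}}$ for static read current.

```latex
T_{\mathrm{READ}} \approx \frac{C_{\mathrm{BL}}\, V_{\mathrm{OS}}}{I_{\mathrm{READ}}}
```

Under this assumed model, read access time scales linearly with $1/I_{\mathrm{READ}}$ for a fixed offset voltage and bit-line capacitance.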
Actual read access time might deviate from this linear relationship due to leakage currents from inactive bitcells sharing the bit-line, as well as the fact that the bit-line is a distributed RC network spanning the entire column of the SRAM array. Degradation in read current due to RTN also contributes to this discrepancy, as will be shown in Section V.
2) Read Access Time: Fig. 2 illustrates an SRAM bitcell undergoing read access with two word-line pulse-widths. The shorter pulse-width is too short to discharge the bit-line capacitance far enough to overcome the offset of the sense-amplifier. There exists a critical pulse-width at which the sense-amplifier is on the threshold of a successful read access; this pulse-width is defined as the read access time. This is similar to the dynamic access failure criterion defined in [5]. This definition isolates the variability in the read access operation that is due to the SRAM bitcell and ignores other delays such as word-line driver delay and sense-amplifier delay.
B. Read Stability
1) Static Read Stability Margins: Conventional stability metrics, such as SNM and N-curves [2], [3], require sweeping internal nodes in order to obtain the voltage transfer curves, which is not practical for evaluating large arrays. We choose to characterize the supply read retention voltage (SRRV), which does not require access to the internal nodes. A direct correlation between this metric and other stability metrics has already been established in [8].
2) Critical Read Stability: Fig. 3 illustrates an SRAM bitcell undergoing read stress with two word-line pulse-widths. The shorter pulse-width is short enough that the internal storage nodes return to their original levels after the word-line pulse. The longer pulse-width subjects the bitcell to too much read stress, causing the cell to flip to the opposite state after the word-line pulse. There exists a critical pulse-width at which the bitcell is on the threshold of a read upset; this pulse-width is defined as the critical read stability. This is similar to the dynamic read failure criterion defined in [5]. This metric does not require access to the internal nodes of the SRAM cell. The challenge is to reliably evaluate the contents of the bitcell after the test without accidentally disrupting the stored state.
A bitcell with positive static read margin will have infinite critical read stability, while a bitcell with zero or negative static read margin will have a finite value. With the SRRV margin, it is possible to characterize a negative static read margin for a particular bitcell by measuring how much additional bitcell supply voltage, above the nominal voltage, is required to maintain the stored state of the SRAM cell. Increasing the supply voltage of a statically unstable bitcell by the absolute value of its negative SRRV results in infinite critical read stability. Fig. 4 plots the positive correlation observed between SRRV and critical read stability extracted from Monte Carlo simulations. Although critical read stability is observed to be exponentially dependent on static read margin, it is impossible to accurately estimate exact values of critical read stability from a voltage screen test at an elevated bias because of the large dispersion (up to 10x) observed in critical read stability at a particular SRRV.
SRAM access with read-after-read operation presents the worst-case condition for critical read stability [5], [6]. Fig. 5 illustrates the waveforms corresponding to an SRAM bitcell with read-after-read access. The SRAM bitcell is stable after the first word-line pulse but is subsequently corrupted by the second pulse. It is therefore important to characterize critical read stability as a function of the number of read-after-read pulses as well as the access frequency.
C. Writeability
1) Static Writeability Margins: Margins such as write noise margin (WNM) and write N-curve require sweeping internal nodes in order to obtain the voltage transfer curves [9], [10]. We choose to characterize bit-line write trip voltage (BWTV), which can be measured by sweeping the bit-line voltages of the SRAM bitcell. Correlations established with this margin can be extended to other static margins based on previously established relationships [8].
2) Critical Writeability: Fig. 6 illustrates a write operation to an SRAM bitcell with two word-line pulse-widths. The shorter pulse-width is too short to overwrite the contents of the SRAM cell, while the longer pulse-width is sufficient to complete the write operation. There exists a critical pulse-width at which the bitcell is on the threshold of a successful write access; this pulse-width is defined as the critical writeability. This is similar to the dynamic write failure criterion defined in [5]. This metric does not require access to the internal nodes of the SRAM cell. The challenge, however, is to reliably evaluate the contents of the bitcell after the test without accidentally disrupting the stored state.
Fig. 7 plots the expected correlation between critical writeability and static write margin, based on Monte Carlo simulations. Bitcells with poor static write margin (smaller values) are expected to be correlated with poor critical writeability (larger values). The dispersion between critical writeability and BWTV is small, especially at lower static margins, implying the possibility of using voltage screening (for example, with a reduced word-line bias) to identify cells with poor critical writeability. Table I tabulates the sensitivities of the respective write margins to variability in the six transistors of an SRAM bitcell under write operation, as illustrated in Fig. 6(a). The sensitivities in Table I reflect the negative correlation between BWTV and critical writeability. Both margins have sensitivities of similar magnitude except for the pull-up transistors: critical writeability is correlated with variability in a pull-up transistor to which BWTV is insensitive. Critical writeability is more susceptible to cell asymmetry than static write margin.
Read-before-write or read-after-write does not need to be considered because the read operation only helps to upset the cell and complete the write operation [5]. Critical writeability under write-after-write access, however, needs to be characterized to evaluate the impact of RTN.
III. DYNAMIC STABILITY CHARACTERIZATION ARCHITECTURE
Fig. 8 presents the SRAM array configuration for the characterization of dynamic metrics. It also shows the necessary infrastructure for collecting static metrics for the purpose of establishing correlations with dynamic metrics. The SRAM bitcells under test are organized into a conventional SRAM array. Various array bias voltages are connected to pads to characterize the SRAM under different read/write assist modes. A programmable pulse is generated on-chip and delivered to a single word-line at a time using existing row decoders. This architecture makes extensive use of simple circuits and calibration to ensure ease of implementation while providing high-fidelity measurements even in highly scaled process technologies.
A programmable pulse is generated by mixing two clocks that have a slight offset in clock period (Fig. 9). This generates a pulse train with a fixed difference in pulse-width between successive pulses. A counter is then used to pass the desired pulse based on a programmed codeword. This pass signal can also be programmed to be held for multiple clock cycles to generate multiple pulses, simulating read-after-read access. The sync signal used to reset the counter is generated digitally on-chip based on statistics of the beat frequency between the two clocks, averaged over 128 samples to minimize the impact of clock jitter.
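The pulse-selection step can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the on-chip implementation: the function names `pulse_train` and `pass_pulses`, the example clock periods, and the idealization that the two clocks are edge-aligned at the sync event are all hypothetical.

```python
def pulse_train(period_a_ps, period_b_ps):
    """Pulse widths (ps) produced between a sync event and the next phase wrap.

    Assumption: both clocks are edge-aligned at the sync event, so the k-th
    pair of edges is separated by k times the period offset.
    """
    delta = abs(period_a_ps - period_b_ps)
    if delta == 0:
        raise ValueError("the two clock periods must differ")
    shorter = min(period_a_ps, period_b_ps)
    widths = []
    k = 0
    while k * delta < shorter:  # stop before the phase offset wraps around
        widths.append(k * delta)
        k += 1
    return widths


def pass_pulses(widths, codeword, hold_cycles=1):
    """Counter-style gate: pass `hold_cycles` consecutive pulses starting at `codeword`."""
    if codeword < 0 or hold_cycles < 1 or codeword + hold_cycles > len(widths):
        raise ValueError("codeword/hold_cycles outside the available pulse train")
    return widths[codeword:codeword + hold_cycles]


if __name__ == "__main__":
    train = pulse_train(1000, 1010)    # hypothetical 1000 ps and 1010 ps clocks
    print(pass_pulses(train, 25))      # a single selected pulse
    print(pass_pulses(train, 25, 3))   # pass signal held for three cycles
```

In this model the period offset sets the pulse-width step per codeword, and holding the pass signal over several cycles yields consecutive pulses whose widths differ by one step each.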
To avoid process-induced uncertainties, the exact pulse-width is measured by word-line samplers located on every word-line (Fig. 8), in contrast to prior work in which only a small subset of the word-lines is sampled [11], [12]. The sampler consists of small transmission gates that sample the word-line pulse onto a parasitic capacitance. Charge injection by the sampling clock, non-linearity of the transmission gates, and offset voltages of the comparators are calibrated out by tuning the reference voltage of the comparators. The differential clock driving the transmission gates is calibrated using a phase comparator to minimize aperture uncertainty in sampling the rising and falling edges of the word-line pulse (Fig. 10). An ideal differential clock has no common-mode component. The phase comparator takes advantage of this fact and detects the common-mode component by summing the two clock signals using capacitors. The calibration scheme then skews the edges of the clock until the glitch on the sum node is minimized. A Monte Carlo simulation of this scheme reveals that it reduces the phase offset of the respective edges to less than 3 ps. The word-line pulse-width is finally measured by skewing the externally generated saen signal in steps of 1 ps resolution. This word-line sampling scheme produces finer resolution than delay-line samplers [11].
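The final measurement step can be sketched as an edge search over the sampler output. The sketch below assumes, hypothetically, that the calibrated comparator yields one binary decision per skew step with its reference at the 50% level, and that repeated sweeps are combined by majority vote; `majority` and `measure_pulse_width` are illustrative names, not part of the test chip.

```python
def majority(sweeps):
    """Combine repeated sweeps (lists of 0/1 decisions) by per-step majority vote."""
    n = len(sweeps)
    return [1 if 2 * sum(column) > n else 0 for column in zip(*sweeps)]


def measure_pulse_width(decisions, step_ps=1):
    """Pulse width from comparator decisions taken at successive skew steps.

    decisions[i] is assumed to be 1 when the sampled word-line voltage at skew
    i * step_ps exceeds the 50% reference, and 0 otherwise.
    """
    rise = fall = None
    for i in range(1, len(decisions)):
        prev, cur = decisions[i - 1], decisions[i]
        if rise is None and prev == 0 and cur == 1:
            rise = i                      # first 0 -> 1 crossing: rising edge
        elif rise is not None and prev == 1 and cur == 0:
            fall = i                      # next 1 -> 0 crossing: falling edge
            break
    if rise is None or fall is None:
        raise ValueError("sweep does not contain a complete pulse")
    return (fall - rise) * step_ps


if __name__ == "__main__":
    # Synthetic sweep: low for 10 steps, high for 120 steps, low for 10 steps.
    sweep = [0] * 10 + [1] * 120 + [0] * 10
    print(measure_pulse_width(majority([sweep, sweep, sweep])))
```

The resolution of this estimate is one skew step, which is why the step size of the externally skewed sampling signal bounds the measurement resolution in this simplified picture.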
Non-destructive read-back of the SRAM bitcells is accomplished using multiple minimum-width read pulses. This allows the bitcell to gradually discharge the bit-line capacitance without excessive read stress. Alternatively, the supply voltage is raised to the nominal voltage prior to read-back, especially when characterizing bitcells at low voltages. A built-in self-test (BIST) circuit is used to characterize the dynamic stability of each bitcell automatically. The static margins of the SRAM bitcells are measured through the bit-lines using source meters with four-terminal Kelvin sensing to calibrate out the series resistance of the bit-line switches [8]. I-V characteristics and RTN in each individual transistor of a 6T SRAM bitcell were characterized using the direct bit transistor access (DBTA) method [24].
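A BIST search of this kind can be sketched as a sweep over programmed pulse-widths. The code is a hedged illustration rather than the implemented BIST: `critical_pulse_width`, the `event` callback, and the 50% decision threshold are assumptions, while the default of 128 tries per pulse-width mirrors the averaging used in the measurements.

```python
def critical_pulse_width(event, widths_ps, tries=128, threshold=0.5):
    """Shortest pulse-width at which `event` occurs in at least `threshold` of tries.

    `event(width_ps)` is a hypothetical callback that initializes the bitcell,
    applies one word-line pulse, reads the cell back non-destructively, and
    returns True if the cell changed state. For a write test a state change is
    a successful write; for a read-stress test it is a read upset.
    Returns None if the event rate never reaches the threshold.
    """
    for width in sorted(widths_ps):
        hits = sum(1 for _ in range(tries) if event(width))
        if hits / tries >= threshold:
            return width
    return None


if __name__ == "__main__":
    # Stand-in for silicon: an idealized cell that flips for pulses of 230 ps or longer.
    def ideal_cell(width_ps):
        return width_ps >= 230

    print(critical_pulse_width(ideal_cell, range(0, 1000, 10)))
```

A linear sweep is used here instead of a binary search because noise near the threshold can make individual pass/fail outcomes non-monotonic in pulse-width.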
IV. 45 nm CMOS TEST CHIP
A 1.55 mm × 1.55 mm test chip [7], [13], [14] is implemented (Fig. 11) in a low-power strained-Si 45 nm CMOS process [15] with a poly-Si/SiON gate stack and seven metal layers. Experimental, high-density 0.252 µm² 6T SRAM bitcells that are smaller than ITRS requirements for the 45 nm technology node are characterized to observe a larger impact of process-induced variability on SRAM performance and also to predict variability in future scaled transistors. The test chip consists of two 64 × 256 arrays and two 128 × 256 arrays with full static and dynamic stability characterization coverage. The narrower array (64 columns) has reduced word-line parasitics and is used to characterize dynamic stability at high speeds with strict requirements on rise- and fall-transition times. The word-line samplers contribute a 16% array area overhead. The level-shifters and bit-line switches incur a larger area penalty and are required solely for static margin characterization.
Fig. 12 illustrates the fail bit count measured from the test chip, indicating a 10-100X discrepancy between quasi-static (1 s with bit-lines driven) and dynamic access. Static access fail bit counts are optimistic for writeability and pessimistic for read stability, compared to those for dynamic access. More than 10 write failures were observed at nominal supply voltage when the bitcells were accessed with 1 ns pulses, even though no write failures occurred when the bitcells were accessed quasi-statically. No read upset failures occurred when the bitcells were accessed with 20 ns pulses, even though tens of failed bits were observed when the same bitcells were accessed quasi-statically.
V. MEASUREMENT RESULTS
A. Pulse Generator
Multiple complete waveforms of word-line pulses were subsampled and plotted in real time in Fig. 13(a). Good rise and fall transition times of 75 ps and 30 ps were observed. Note that the rise and fall transitions account for a significant portion of narrow pulses (less than 100 ps) and effectively limit the correlation between static and dynamic margins. The pulse-width, corresponding to the delay between the 50% voltage levels of the rise and fall transitions, was measured across different codewords. The transfer function and the measured linearity error are plotted in Fig. 13(b). Up to 100 ps of non-linearity was observed in the transfer function. This error is believed to be caused by voltage droop in the power supply grid as the pulse is being distributed across the chip. These non-idealities demonstrate the importance of calibrating word-line pulse-widths at every word-line in order to separate this source of uncertainty from actual variability in the bitcells. All dynamic SRAM measurements presented are based on word-lines calibrated to 10 ps resolution using low-jitter signal generators and averaging.
B. Read Access Time
Fig. 14(a) plots the statistical distribution of read access time measured from 1024 bitcells at 0.8X nominal supply voltage. The distribution is observed to be multi-modal, a superposition of multiple Gaussian distributions. The multi-modal nature of this distribution is due to the strong dependence of read access time on sense-amplifier offset voltage, as expressed in (1). Measurements of read access time, normalized with separately characterized sense-amplifier offset voltages and estimated bit-line capacitance, were observed to correlate with static read current (Fig. 14(b)). The remaining dispersion in the data is due to the inherent difference between the read current measured statically out of the bitcell at a fixed bit-line voltage and the transient bitcell current as the bit-line is being discharged.
C. Critical Writeability
Fig. 15(a) plots measurements of critical writeability versus the static write margin for writing the same data value to the same bitcell. Each data-point of critical writeability corresponds to an average of 128 measurements. The expected correlation between poor BWTV and poor critical writeability is observed in Fig. 15(a); however, the uncorrelated outliers exceed the correlated data-points by more than ten times. These outliers are observed to appear exclusively in bitcells that have large static write margin on the opposite side of the cell (Fig. 15(b)). Further analysis of individual transistor characteristics using DBTA revealed that a large number of the bitcells sampled had large drain series resistance in one of the PMOS transistors. These marginal transistors were found to be on the side opposite to the half-cell being written to (see Fig. 6(a)), causing a significant degradation in the speed at which the bitcell pulls the storage node up to the supply rail. The remaining bitcells showed good correlation between critical writeability and BWTV after the marginal cells were screened out (Fig. 15(a)). These marginal transistors did not degrade static write margin because of the negligible sensitivity of that margin to variability in this transistor (Table I).
Voltage screen tests such as those described in [16] are commonly used to screen out defects and early failures in SRAM arrays. Such tests are usually carried out in-line at wafer sort using testers running at frequencies lower than actual operating frequencies. The lack of correlation between the outliers in critical writeability and static write margin invalidates results obtained from such tests, because the bitcells screened by these tests are not the bitcells that fail first at normal operating frequencies.
D. Critical Read Stability
Fig. 16 plots measurements of critical read stability against the negative static read margin. These measurements were obtained by lowering the bitcell supply voltage by 300 mV relative to the word-line and bit-line pre-charge voltage levels, to increase the probability of observing cells that are unstable under static access.
The expected correlation between critical read stability and negative SRRV (ref. Fig. 4) was observed in measurements. Bitcells with marginally negative static read margin (approximately 0.1 a.u.) were observed to have a large dispersion in critical read stability, ranging from 1 ns to 1 s. This dispersion reduces as the bitcell SRRV becomes more negative. The minimum critical read stability observed was 630 ps, indicating that this SRAM bitcell can be accessed with pulse-widths shorter than 630 ps without read upsets even with 300 mV of supply droop. Outliers with extremely poor SRRV that are not correlated with smaller values of critical read stability were observed. Such outliers were not observed in Monte Carlo simulations of a large 100,000 sample set (Fig. 4).
As expected, critical read stability degrades under read-after-read conditions [5]. Bitcells with small values of critical read stability (less than 2 ns) were observed to shift only by a small amount, while bitcells with larger values were observed to degrade by up to 1 ns. Susceptibility of a bitcell to read-after-read upset depends on the proximity of the internal node voltages to the rails when the next read pulse arrives. Bitcells with smaller values of critical read stability are less susceptible to read-after-read upsets than bitcells with larger values accessed with the same access period, because these bitcells have longer recovery periods to settle at the rail voltages. Fig. 18 plots critical read stability of a single bitcell as a function of the number of read-after-read pulses across decreasing access periods. The degradation due to read-after-read increases as the access period is decreased. This degradation eventually saturates after 6 cycles, in direct agreement with [5]. Evidence of slight degradation even with a relatively slow access period of 67 ns suggests that the recovery period of this bitcell is more than 67 ns, which is greater than 20 times the single-read critical read stability of this bitcell (3.2 ns).
E. Impact of Assist Techniques
Fig. 19 compares the impact of different assist techniques on critical writeability. Word-line voltage boosting and under-drive resulted in significant speed-up of critical writeability [28]. Word-line boost was slightly more effective than under-drive because it increases the strength of the pass-gate transistors, which have the strongest impact on critical writeability. Fig. 19(c) plots the statistical distributions of critical writeability under 300 mV of PMOS reverse body-bias (RBB) [29]. Not much improvement was observed even with 300 mV of RBB because of the small body-effect coefficient of this 45 nm CMOS process. RBB might even have a detrimental effect on critical writeability, due to its opposite sensitivities to variability in the two pull-up transistors. Fig. 19(d) investigates write assist using negative voltage levels on the bit-lines [30]. A 100 mV negative bit-line bias results in a significant improvement in critical writeability.
Fig. 20 demonstrates the effectiveness of boosting and under-drive for read assist [17]. Boosting was found to provide a larger improvement in critical read stability than under-drive. SRAM design using assist techniques involves a delicate balance of bias voltages in order to trade off the improvement in one margin against the degradation in the other. The strong sensitivity of read and write stability to the two assist bias voltages suggests the possibility of using them as voltage tuning knobs to increase the overall reliability of the SRAM array. Results in this work, however, demonstrate that this technique needs to be used with caution, as slight offsets in these biases will affect critical writeability and critical read stability exponentially. Because of this, any uncertainty or noise in setting the biases can result in large write or read stability failures.
F. Impact of Temporal Variations
1) Random Telegraph Noise (RTN): RTN refers to a noise phenomenon that is caused by charge trapping and de-trapping within the gate oxide of the transistor [21]. Aggressive scaling of SRAM transistor active area has resulted in an increasing contribution of RTN to transistor variability compared to random dopant fluctuations [18]. While RTN is observed in SRAM operation as low-frequency fluctuation in static read and write margins, the impact of RTN on SRAM operating at high frequency has not yet been evaluated [19], [20].
Fig. 21 plots drain currents of transistors measured from three different bitcells using the DBTA method. The pass-gate transistors were biased into strong inversion to reduce the contribution of RTN in the pass-gates to the measured drain current, as the pass-gate transistors are in series with the transistors under test. These bitcells were selected because the RTN amplitude of the selected transistors was much larger than that of the other transistors in the same bitcell. This allowed direct correlation between characteristics observed in dynamic stability metrics and RTN in a particular transistor.
Dynamic stability of the bitcells was characterized with different dynamic access patterns (Fig. 22), designed to emphasize the impact of RTN on dynamic stability. Fig. 23 plots statistical distributions extracted from the respective access patterns on the corresponding bitcells, averaged over 128 tries at each pulse-width. Low-frequency RTN in the transistors resulted in shifts in bitcell dynamic stability of up to 11%, depending on whether the access was single or multiple [27]. Write-after-write access degraded critical writeability in two of the bitcells, even though the large RTN was observed in a different transistor in each. This shift can be explained by considering the large-signal dependence of RTN trap occupancy [25]. The 100 ms hold condition for write-after-write access (Fig. 22) forced occupancy of traps in one of these transistors and emptied traps in the other. These traps maintain their occupancy state even though the gate biases change after the first write. The condition set up by the first write operation degraded the writeability of the cell compared to single write access (Fig. 23(a) and (b)). Read-after-write improved the read behavior of the third bitcell compared to single read access (Fig. 23(c)). The single-read result is degraded because the 100 ms hold condition automatically applied a positive gate bias on the affected transistor, forcing trap occupancy in that transistor and degrading the associated read metrics. These results indicate that dynamic stability should be characterized with write-after-write and single read access in order to capture the worst-case impact of RTN on dynamic stability.
2) Negative-Bias Temperature Instability (NBTI): NBTI refers to degradation in the threshold voltage of PMOS transistors that is accelerated by negative gate bias and increased temperature. While the impact of NBTI on read stability has been studied extensively, the impact on write stability has mostly been ignored because NBTI actually improves static write margins by weakening the PMOS transistors [22], [23]. Analysis of the sensitivities of critical writeability to transistor variability in Table I leads to the prediction that NBTI improves critical writeability on one side of the bitcell and degrades it on the opposite side, due to the opposite sensitivities to the two pull-up transistors. We experimentally verified this point by subjecting the bitcells to data-dependent NBTI stress while monitoring critical writeability before and after stress. The SRAM array was first initialized to a "0" state. The supply voltage was then raised to 1.8 V and the test chip was baked at 125 °C for 2 hours. The stored "0" state automatically applied NBTI degradation to only one PMOS transistor in the SRAM bitcell. Since positive-bias temperature instability (PBTI) is not expected in this process technology, only the characteristics of this one particular PMOS transistor were expected to change from pre-stress to post-stress conditions. Fig. 24 plots measurements of critical writeability before and after stress, indicating data-dependent improvement and degradation in write stability. Degradation in critical writeability due to NBTI translates to degradation in the maximum frequency of a product, or to product failure at a given operating frequency.
VI. CONCLUSION
A dynamic SRAM stability characterization architecture is implemented in 45 nm CMOS. Expected correlations between dynamic stability and static margins were observed, along with large uncorrelated outliers (10 times more than expected) that are primarily caused by extra PMOS drain resistance. This finding exemplifies the inadequacy of low-frequency voltage screen tests for identifying early failures and necessitates at-speed test. The assist bias voltages were observed to be effective tuning knobs for balancing critical read stability and writeability but need to be used with caution, due to the enhanced sensitivity of dynamic stability to these biases. Large-amplitude low-frequency RTN in SRAM transistors causes shifts in dynamic stability of similar magnitude that depend on bitcell access patterns. Critical writeability magnifies the impact of process-induced and temporal variability in transistor characteristics, compared to static write margins.
