Abstract -Perpendicular spin-transfer torque (p-STT) magnetic memory is gaining increasing interest as a candidate for storage-class memory, embedded memory, and possible replacement of static/dynamic memory. All these applications require extended cycling endurance, which should be based on a solid understanding and accurate modeling of the endurance failure mechanisms in the p-STT device. This paper addresses cycling endurance of p-STT memory under pulsed electrical switching. We show that endurance is limited by the dielectric breakdown of the magnetic tunnel junction stack, and we model endurance lifetime by the physical mechanisms leading to dielectric breakdown. The model predicts STT endurance as a function of applied voltage, pulsewidth, pulse polarity, and delay time between applied pulses. The dependence of the endurance on sample area is finally discussed.
cache, which takes advantage of the nonvolatile behavior to reduce the OFF-state power consumption [6] , [7] . Also, the MRAM technology and spintronics, in general, are gaining considerable interest for non-von Neumann computing architectures, such as low-power hybrid MTJ/CMOS logic circuit [8] and beyond-CMOS brain-inspired neuromorphic circuit [9] .
The state-of-the-art conceptual implementation of MRAM relies on the magnetic tunnel junction (MTJ), namely a metal-insulator-metal stack consisting of a MgO dielectric barrier ($t_{\mathrm{MgO}} \approx 1$ nm) between two CoFeB ferromagnetic electrodes. Of these two ferromagnets (FMs), the pinned layer (PL) has fixed magnetic polarization, whereas the free layer (FL) can change its polarization between parallel (P) and antiparallel (AP) with respect to the PL. The MTJ resistance depends on the relative orientation of the magnetic polarization in the two FMs due to the tunnel magnetoresistance (TMR) effect [10], where the P state has a relatively low resistance $R_P$, while the AP state has a relatively high resistance $R_{AP}$. Switching from P to AP and vice versa takes place by spin-transfer torque (STT), where the spin polarization of the electron flow across the MTJ is transferred to the FL ferromagnetic polarization by momentum conservation [11], [12]. The perpendicular STT (p-STT) concept, where the ferromagnetic polarization lies out of the MTJ plane, allows a smaller switching current at a given retention time, thus enabling low power operation and improved scalability [13].
To drive the switching current across the MTJ, bipolar voltage pulses are applied, which might induce degradation and time-dependent dielectric breakdown (TDDB) in the long term. Although the cycling endurance of STT-MRAM is generally referred to as virtually infinite [14] , the repeated electrical stress during cycling induces a breakdown-limited endurance lifetime, which poses a limitation on the applicability of STT-MRAM as working memory or in-memory computing element. Despite the relevant need for high endurance, the characterization methodology, the physical understanding, and the simulation models for breakdown-limited endurance are not yet well established.
In this paper, we address the endurance of p-STT-based memory. We study endurance failure for various pulse amplitudes, polarities, and pulsewidths. Then, we present a model for breakdown-limited endurance based on defect generation, activation, and diffusion, capable of predicting STT-MRAM lifetime under different cycling conditions. Finally, we discuss the endurance dependence on device area.
A preliminary study of the modeling of STT-MRAM endurance was reported previously in [15]. Here, we provide a fully detailed report, with a deeper investigation of the fundamental mechanisms of defect generation/activation, direct evidence for polarity-dependent activation, and a study of area-dependent endurance.
II. SAMPLES AND METHODOLOGY
We used p-STT memory devices sketched in Fig. 1(a), consisting of CoFeB PL [bottom electrode (BE)] and FL [top electrode (TE)] with a crystalline MgO dielectric layer. The device cross-sectional area was 47 nm × 47 nm. Fig. 1(a) also shows the experimental setup for the pulsed characterization of STT devices, including a TGA 12102 waveform generator (TTi) to apply triangular pulses for set (transition from AP to P under positive voltage) and reset (transition from P to AP under negative voltage) processes, while the applied voltage $V_{TE}$ and the current $I$ across the MTJ were monitored by a 600-MHz LeCroy Waverunner oscilloscope. Fig. 1(b) shows a typical sequence of set, read, reset, and read operations. Each triangular pulse had a width of $t_P = 100$ ns and a pulse delay of $t_D = 20$ ns, except where noted. The maximum positive voltage during set was $V_+$, while the maximum negative voltage for reset was $V_-$. The read current in Fig. 1(b) confirms the different states of the device, namely P state after set and AP state after reset. Fig. 1(c) shows the I-V curve obtained from the collected V and I data [16]. By monitoring the I-V curves at each cycle, we could observe possible degradation phenomena and the exact event of endurance failure. This technique thus reproduces the device operating conditions in real time more accurately than the unrealistic description by constant/ramped stress [17], [18]. Also, the pulsed signal of Fig. 1(b) enables a comprehensive analysis with respect to various parameters, such as voltage ($V_+$ and $V_-$) and time ($t_P$ and $t_D$). All measurements were carried out at room temperature.
Fig. 2(a) shows the measured resistance during a typical pulsed experiment under symmetric switching ($V_+ = |V_-|$) as in Fig. 1, as a function of the number of cycles. Data show clearly separate P and AP states with TMR $= \Delta R/R_P \approx 50\%$, where $\Delta R = R_{AP} - R_P$.
Endurance failure is marked by an abrupt drop of resistance, corresponding to a hard breakdown of the MgO dielectric layer, after a number $N_C$ of cycles. The resistance values $R_P$ and $R_{AP}$ are constant throughout the lifetime, thus indicating no obvious cycling-induced resistance degradation [19]. Even with a cycle-by-cycle observation of the I-V characteristics, enabled by the triangular stress pulses, no clear evidence of degradation could be found. Even though some preliminary studies [20] suggested a possible gradual dielectric breakdown for a relatively thick MgO layer, in a nanometer-thick tunnel barrier, an abrupt breakdown event is typically observed [19]. Breakdown could take place at either voltage polarity, e.g., breakdown during the positive sweep for $V_+ > |V_-|$ [see [21], [22]. After breakdown, the device shows a TMR of 0% and a constant resistance $R \approx 300\ \Omega$, which we attribute to the metal contacts and interfaces. No other kinds of cycling-induced failure, such as a degradation of the magnetoresistance ratio due to the cycling-induced degradation of the ferromagnetic layers, were observed, although this might be possible as a result of the thermal runaway right after dielectric breakdown. The latter was always responsible for device failure, consistent with the high electric field causing stress within the MgO barrier.
Fig. 3(a) shows the measured cycling endurance $N_C$ as a function of the applied voltage with a pulsewidth $t_P = 100$ ns and a pulse delay $t_D = 20$ ns. Three cycling conditions are compared in Fig. 3(a), i.e., symmetric bipolar stress with $V_+ = |V_-|$, positive unipolar stress with $V_- = 0$ V, and negative unipolar stress with $V_+ = 0$ V. $N_C$ data for positive and negative unipolar stress show similar behaviors, suggesting a high polarity symmetry of the MTJ structure with respect to degradation and breakdown processes. $N_C$ shows a steep exponential voltage dependence with a slope ≈ 50 mV/decade for the three regimes in Fig. 3(a).
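The failure criterion described above (an abrupt resistance drop together with a collapse of the TMR) can be sketched as a simple per-cycle check. This is only an illustration: the function names, the two thresholds, and the synthetic resistance values are hypothetical and are not taken from the measurements.

```python
def tmr(r_p, r_ap):
    """Tunnel magnetoresistance ratio, TMR = (R_AP - R_P) / R_P."""
    return (r_ap - r_p) / r_p


def find_breakdown_cycle(r_p_trace, r_ap_trace, drop_fraction=0.5, tmr_floor=0.05):
    """Return the 1-based index of the first cycle that looks like hard breakdown.

    r_p_trace, r_ap_trace: resistances (ohm) read after set and after reset.
    drop_fraction, tmr_floor: hypothetical thresholds chosen for illustration.
    Returns None if no cycle satisfies both conditions.
    """
    baseline = r_p_trace[0]  # pristine P-state resistance
    for index, (r_p, r_ap) in enumerate(zip(r_p_trace, r_ap_trace)):
        dropped = r_p < drop_fraction * baseline
        collapsed = tmr(r_p, r_ap) < tmr_floor
        if dropped and collapsed:
            return index + 1
    return None


# Synthetic trace (hypothetical values): TMR = 50% for five cycles,
# then both states collapse to the post-breakdown resistance of 300 ohm.
r_p_trace = [2000.0] * 5 + [300.0] * 2
r_ap_trace = [3000.0] * 5 + [300.0] * 2
failing_cycle = find_breakdown_cycle(r_p_trace, r_ap_trace)
```

Requiring both conditions avoids flagging a cycle where only one read is noisy; a real analysis would tune the thresholds to the measured read noise.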
A simple extrapolation to the switching voltage $V_{set} \approx |V_{reset}| \approx 0.3$ V [15] indicates an estimated $N_C \approx 10^{18}$ at V = 0.3 V and $t_P = 100$ ns, which is high enough to comply with most SCM and dynamic random access memory applications. Data indicate a higher $N_C$ value, hence reduced degradation, for unipolar stress, compared with the bipolar stress condition. This can be interpreted by considering the MgO-CoFeB interfaces to be the regions of maximum generation of stress-induced defects, and thus, unipolar stress predominantly creates damage at a single interface, whereas both interfaces are affected by bipolar stress-induced degradation. i.e., failure occurs during the set pulse and 2) region B for $|V_-| > V_+$ with a relatively low slope ≈ 600 mV/decade and negative-voltage breakdown, i.e., failure occurs during the reset pulse. Even though breakdown polarity is dictated by the largest applied voltage, surprisingly $V_+$ influences breakdown in region B, where $V_+ < |V_-|$. This remarkable evidence is further confirmed by Fig. 3(c), showing $N_C$ for asymmetric bipolar stress with variable $V_-$ and constant $V_+ = 1$ V and indicating the same qualitative behavior as in Fig. 3(b).
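The log-linear extrapolation quoted above can be written as a one-line rule: the endurance changes by one decade for every 50 mV of applied voltage. The sketch below assumes that a single slope holds over the whole voltage range, which is an assumption of the illustration; the function name and the choice of reference point are hypothetical.

```python
def extrapolate_endurance(v, v_ref, n_ref, mv_per_decade=50.0):
    """Extrapolate N_C to voltage v from a reference point (v_ref, n_ref).

    Assumes log10(N_C) decreases linearly with voltage, losing one decade
    for every `mv_per_decade` millivolts of additional stress voltage.
    """
    decades = (v_ref - v) * 1000.0 / mv_per_decade
    return n_ref * 10.0 ** decades


# Reference point from the text: N_C of about 1e18 at 0.3 V (t_P = 100 ns).
# Raising the stress voltage by 0.5 V costs 10 decades under this rule.
n_at_0v8 = extrapolate_endurance(0.8, v_ref=0.3, n_ref=1e18)
```

Because of the steep slope, an error of only 50 mV in the stress voltage shifts the extrapolated lifetime by a full decade, so such estimates are best read as orders of magnitude.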
III. CYCLING ENDURANCE
The same behavior is evidenced by Fig. 4. No other input patterns were explored, e.g., a mixed unipolar/bipolar regime, although we expect that the failure mechanism would not change, and the endurance would be intermediate between the unipolar and bipolar cases.
IV. ENDURANCE MODEL
We developed a semiempirical model of endurance, which describes the dependence of $N_C$ on the voltage amplitude and pulsewidth of the applied signal. In the model, $N_C$ is inversely proportional to the defect concentration within the MgO layer, namely $N_C = N_{C0}(n_D/n_{D0})^{-1}$, where $N_{C0}$ and $n_{D0}$ are constants and $n_D$ was calculated as $n_D = n_{D,TE} + n_{D,BE}$, where $n_{D,TE}$ and $n_{D,BE}$ are the defect concentrations originating from the TE interface and the BE interface, respectively. In this physical picture, defects are mostly generated near the interfaces where electrons have the highest kinetic energy and where the structure might display possible degradation precursors, e.g., dangling bonds or oxygen vacancies. For example, an incomplete Mg oxidation could allow unoxidized atoms to move more easily toward the anode due to electromigration, thus increasing the Mg/O vacancy concentration close to the interface [23]. In addition, boron (B) diffusing from the electrodes toward the tunnel barrier might initiate the creation of pinholes that might short circuit the tunnel conduction [24]. A relatively high density of initial degradation precursors also plays a key role in lowering the electron transport barrier height in the MTJ [25]. Defect concentrations are given by $n_{D,TE} = n_{D0} R_{TE}/R_0$ and $n_{D,BE} = n_{D0} R_{BE}/R_0$, where $R_{TE}$ and $R_{BE}$ are the generation rates describing the cycling-induced degradation at the TE and BE interfaces, respectively, while $R_0$ is a constant. For our crystalline MgO layer, defects might be attributed, e.g., to Frenkel pairs of O vacancies V in the following. Even though process variability is of great importance for the STT-MRAM design [27], we did not take into account such effects given the relatively low device-to-device variation of conduction and switching among our samples.
The observed variation in cycling endurance for a given voltage might result from intrinsic variability of both the position in the oxide layer and the number of generated defects. Fig. 5(a) shows the defect generation mechanism in our model. Electrons injected from one interface reach the other with a kinetic energy $E$ given by the difference of the Fermi levels in the two electrodes, i.e., $E = E_{F,TE} - E_{F,BE} = qV_+$ for positive voltage applied to the TE and hence electrons injected from the BE. The release of the energy $E$ induces lattice vibrations and defect generation at the TE interface by bond breaking. Even though the ionic bond between Mg and O is very strong, bond breaking is possible due to the extremely high local field and polarization that the bond experiences, leading to significant bond distortion [28]. This condition can be explained considering the high dielectric susceptibility and dipole moment of MgO [29].
A. Defect Generation
In our model, defect generation probability is assumed to increase exponentially with the energy $E$, and thus, the generation rate is given by $R_{TE} = R_0 e^{\alpha V_+}$, where $\alpha$ is a constant, in agreement with the E-model of dielectric breakdown [30], [31]. Similarly, the generation rate at the BE interface can be written as $R_{BE} = R_0 e^{\alpha |V_-|}$.
To test the defect generation model in Fig. 5(a), Fig. 6(a) shows the calculated $N_C$ value for asymmetric bipolar cycling, compared with data from Fig. 4. We assumed $\alpha = 42\ \mathrm{V}^{-1}$ in the calculations. The model correctly describes the steep decay of $N_C$ in region A; however, the model fails to predict the weak voltage dependence in region B. In fact, due to the exponential voltage dependence of $R_{BE}$ and $R_{TE}$, the defect generation model only attributes degradation to the largest voltage, in contrast to the experimental evidence in Figs. 3 and 4.
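The limitation of the generation-only picture can be checked numerically from the relations given so far: $N_C/N_{C0} = (n_D/n_{D0})^{-1}$ with $n_D/n_{D0} = e^{\alpha V_+} + e^{\alpha |V_-|}$ and $\alpha = 42\ \mathrm{V}^{-1}$. The sketch below is an illustration rather than the calculation used for Fig. 6(a); the function names and the test voltages are hypothetical, and only relative endurance is computed because $N_{C0}$ is not specified here.

```python
import math

ALPHA = 42.0  # 1/V, generation exponent quoted in the text


def relative_defect_density(v_plus, v_minus):
    """n_D / n_D0 with one exponential generation term per interface."""
    return math.exp(ALPHA * abs(v_plus)) + math.exp(ALPHA * abs(v_minus))


def relative_endurance(v_plus, v_minus):
    """N_C / N_C0 = (n_D / n_D0)^-1."""
    return 1.0 / relative_defect_density(v_plus, v_minus)


def mv_per_decade(rate_per_volt):
    """Voltage step (mV) that changes an exp(rate * V) law by one decade."""
    return 1000.0 * math.log(10.0) / rate_per_volt


# Region A slope implied by alpha alone, close to the measured 50 mV/decade.
slope_region_a = mv_per_decade(ALPHA)

# Region B (|V-| fixed at 1 V, V+ smaller): endurance barely reacts to V+.
ratio_region_b = relative_endurance(0.6, -1.0) / relative_endurance(0.8, -1.0)
```

With the larger voltage fixed, changing the smaller voltage by 200 mV alters the computed endurance by far less than 1%, which is the flat region-B behavior that the measured data contradict.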
B. Defect Activation
To account for the impact of the smaller voltage in the MgO degradation, we considered the defect activation mechanism displayed in Fig. 5(b). After a positive pulse of voltage $V_+$, the application of a negative pulse with amplitude $|V_-| < V_+$ can activate the defects generated by the positive semicycle, e.g., by displacing an interstitial $\mathrm{O}_i^{2-}$ away from the corresponding O vacancy in the newly created Frenkel pair, as shown in Fig. 5(c), with a rate $R_a/R_0 = k e^{\beta |V_-|}$, where $k$ and $\beta$ are constants with $\beta < \alpha$. The activation causes additional damage to the dielectric layer during the low-voltage semicycle, since the separation of the two constituents of the Frenkel pair leads to: 1) a reduced probability of recombination and 2) an increased defect concentration in the bulk of the MgO, supporting the formation of a percolative path [22]. Calculation results from the generation/activation model with $k = 1$ and $\beta = 4\ \mathrm{V}^{-1}$ are shown in Fig. 6(b), indicating better agreement with data in both regions A and B.
To further confirm that the activation process consists of a displacement rather than a thermal effect, e.g., a temperature-induced stabilization of the generated defect, we compared the asymmetric bipolar stress (fixed $V_+ = 1$ V and variable negative $V_-$) and the asymmetric unipolar stress, where both the fixed and variable voltages were positive. Data shown in Fig. 7 indicate a larger $N_C$ value and a rather flat behavior in region B for the asymmetric unipolar stress, thus suggesting that a positive voltage is not effective in displacing $\mathrm{O}_i^{2-}$. These data confirm that the activation process requires bipolar stress.
To complete our model, we included defect generation and activation at the BE side with the same parameters used for the TE side in view of the high symmetry of our MTJ stack. We also included an explicit dependence on the pulsewidths $t_+$ and $t_-$ of the positive and negative pulses, respectively. The total defect density due to generation and activation is thus written as in (1), where $t_0 = 10^{-30}$ s is a constant. The model parameters are summarized in Table I. Fig. 8(a) shows the measured and calculated $N_C$ value for both constant $V_+$ with variable $V_-$ and constant $V_-$ with variable $V_+$. Our model is able to predict the different slopes in regions A and B, where $n_D$ can be approximated as in (2) for the two regions, respectively. Slopes in regions A and B can be directly related to $\alpha$ and $\beta$. The model is able to account for $N_C$ for unipolar (positive and negative) and symmetric bipolar stress (i.e., $V_+ = |V_-|$), as shown in Fig. 8(b). Fig. 8(c) shows the simulated voltage-dependent endurance for $t_P = 100$ ns and $t_D = 20$ ns.
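To make the link between the slopes and the parameters $\alpha$ and $\beta$ concrete, the sketch below evaluates one functional form that is consistent with the description in this section. The form itself is an assumption of this illustration and should not be read as the expression used for the calculations: each interface contributes a generation term $(t/t_0)e^{\alpha V}$ multiplied by an activation factor $1 + k e^{\beta V'}$, where $V'$ is the amplitude of the opposite-polarity pulse. Parameter values ($\alpha = 42\ \mathrm{V}^{-1}$, $\beta = 4\ \mathrm{V}^{-1}$, $k = 1$, $t_0 = 10^{-30}$ s) are those quoted in the text; function names and test voltages are hypothetical.

```python
import math

ALPHA = 42.0  # 1/V, generation exponent (from the text)
BETA = 4.0    # 1/V, activation exponent (from the text)
K = 1.0       # activation prefactor (from the text)
T0 = 1e-30    # s, time constant (from the text)


def relative_defect_density(v_plus, v_minus, t_plus=100e-9, t_minus=100e-9):
    """Assumed form of n_D / n_D0: generation times (1 + activation), per interface."""
    vp, vm = abs(v_plus), abs(v_minus)
    te_side = (t_plus / T0) * math.exp(ALPHA * vp) * (1.0 + K * math.exp(BETA * vm))
    be_side = (t_minus / T0) * math.exp(ALPHA * vm) * (1.0 + K * math.exp(BETA * vp))
    return te_side + be_side


def relative_endurance(v_plus, v_minus, t_plus=100e-9, t_minus=100e-9):
    """N_C / N_C0 = (n_D / n_D0)^-1; N_C0 is left unspecified."""
    return 1.0 / relative_defect_density(v_plus, v_minus, t_plus, t_minus)


def local_mv_per_decade(endurance_of_v, v, dv=1e-3):
    """Voltage step (mV) that lowers endurance by one decade near v (central difference)."""
    dlog = math.log10(endurance_of_v(v + dv)) - math.log10(endurance_of_v(v - dv))
    return -1000.0 * 2.0 * dv / dlog


# Region A: V+ is the larger voltage, so the slope is set by alpha.
slope_a = local_mv_per_decade(lambda v: relative_endurance(v, -0.6), 0.9)
# Region B: |V-| = 1 V dominates generation, V+ only activates, slope set by beta.
slope_b = local_mv_per_decade(lambda v: relative_endurance(v, -1.0), 0.6)
```

Under this assumed form, the region-A slope is close to $\ln(10)/\alpha \approx 55$ mV/decade and the region-B slope is of the order of $\ln(10)/\beta \approx 600$ mV/decade, in line with the slopes reported for the data. The form also gives $N_C \propto 1/t_P$ for equal pulsewidths and makes the endurance insensitive to the width of the lower-voltage pulse.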
V. PULSE-TIME DEPENDENCE OF ENDURANCE
A. Impact of Pulsewidth t P
To test the impact of the pulsewidth $t_P$ on endurance, Fig. 9(a) shows the measured and calculated $N_C$ value for symmetric bipolar stress ($V_+ = |V_-|$) for increasing $t_P$ from 100 ns to 100 μs. Data indicate that $N_C$ decreases at increasing $t_P$ as $N_C \sim t_P^{-1}$, as also summarized in Fig. 9(b) for stress at $V_+ = |V_-| = 0.8$ V. Calculations accurately account for the $t_P$ dependence, as a result of the dependence on $t_+$ and $t_-$ in (1), which describes the increase of the defect density with increasing stress time.
To study the distinct impacts of $t_+$ and $t_-$ in (1), Fig. 10 From our data, $N_C$ shows a dependence only on the width of the pulse of the largest voltage, namely the one that generates defects in the MgO [see Fig. 5(a)]. The duration of the activation pulse instead does not affect degradation. This is in agreement with a physical picture where activation behaves like a binary event, i.e., resulting in either failure or success. The maximum number of cycles depends only on the pulsewidth of the highest voltage pulse, which is responsible for the generation step in Fig. 5(a).
B. Impact of Pulse Delay t D
Fig. 11(a) shows the measured and calculated $N_C$ value as a function of the delay time $t_D$ for both unipolar stress ($V_+ = 0.85$ V and $V_- = 0$ V) and bipolar stress ($V_+ = |V_-| = 0.85$ V). In both cases, $N_C$ slightly decreases for increasing $t_D$, which can be explained by defect diffusion from the interface region where the defects are generated toward the bulk region of the MgO layer, as shown in Fig. 11(b). Similar to defect activation, delay enhances the degradation due to generated defects by preventing recombination of oxygen interstitials and vacancies, and by enhancing the defect concentration within the bulk of the MgO layer, thus supporting the creation of percolation paths [22].
The dependence on t D was taken into account in the model by adding a diffusive rate
where t D0 and γ are the constant parameters shown in Table I . Calculations by (3) are shown in Fig. 11(a) , in close agreement with the experimental results. The results also suggest that the gap between unipolar and bipolar endurance decreases for increasing t D , which is fully taken into account by our diffusive model. In fact, as t D increases, defects efficiently diffuse toward the opposite interface, thus making the difference between unipolar and bipolar stress increasingly negligible. Note that the weak dependence of bipolar endurance on t D is consistent with previous results in [32] . On the other hand, our data for unipolar stress show no dramatic dependence on t D , in contrast to [32] , which might be explained by a different structure or etch damage profile in our MgO layer.
VI. AREA DEPENDENCE
The reduction of device area $A$ in p-STT-MRAM devices allows the switching current to be decreased, which is required to minimize the cell area limited by the driving transistor in 1T-1MTJ structures [14], [19]. In addition to reducing the footprint and power consumption, area scaling also improves cycling endurance due to the Poisson area scaling of TDDB [14], [18]. To study the area dependence, Fig. 12 shows the measured $N_C$ value for bipolar cycling as a function of $V_+ = |V_-|$ for increasing area, namely $A = (47\ \mathrm{nm})^2$, $(75\ \mathrm{nm})^2$, and $(105\ \mathrm{nm})^2$.
Data in Fig. 12(a) show an unexpected nonmonotonic behavior, which is summarized in Fig. 12(a) (inset) for the case $V_+ = |V_-| = 0.65$ V. Here, $N_C$ decreases with area but shows an anomalously large value for the largest area. This result was attributed to a series resistance effect, where the actual voltage drop $V'$ across the MTJ decreases with the device area. In fact, $V'$ is given by $V' = V - R_s I$, where $R_s$ is the series resistance associated with the contacts and interfaces and $I$ is the current. As the device area increases, $I$ also increases, thus causing a decrease of $V'$. We estimated $V'$ by assuming $R_s = 300\ \Omega$, corresponding to the resistance after breakdown in Fig. 2(a). Fig. 12(b) shows $N_C$ as a function of $V'$, evidencing a correct monotonic decrease of $N_C$ with area.
Fig. 13(a) shows $N_C$ as a function of device area for $V_+ = |V_-| = 0.65$ V, evidencing a decrease according to the power law $N_C \sim A^{-2}$. Based on Poisson area scaling, the exponent in the power law should be equal to the negative inverse of the Weibull shape factor of $N_C$, where the shape factor is the slope of the cumulative distribution in the Weibull plot [33]. The latter is shown in Fig. 13(b) for various $A$, indicating an area-independent Weibull shape factor $\eta = 1.35$ in the formula $\log(-\log(1-F)) = \eta \log(N_C/N_{C0})$. Such a value of the shape parameter $\eta$ can be explained by intrinsic TDDB processes, such as defect generation controlled by the electrical stress, in contrast to extrinsic breakdown processes for $\eta < 1$ [14], [34]. From Poisson area scaling, we calculate a theoretical slope in Fig. 13(a) of $-1/\eta \approx -0.75$, in contrast with the experimental slope ≈ −2. This disagreement might be explained by the etching process having beneficial effects on the device lateral surface, resulting in a low probability of breakdown initiation. This effect results in a reduced effective area for the breakdown process, appearing as a stronger area dependence for relatively small devices [18].
Also, Joule heating effects might contribute to TDDB, thus causing deviation from the field-driven Poisson area scaling for relatively small device area.
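The two numerical steps used in this section, the series-resistance correction of the applied voltage and the comparison between the Poisson exponent and the measured area exponent, can be sketched as follows. The series resistance (300 Ω), the shape factor (1.35), and the measured exponent (about −2) are taken from the text; the function names and the current value are hypothetical and only serve the illustration.

```python
def mtj_voltage(v_applied, current, r_series=300.0):
    """Voltage actually dropped across the MTJ: V' = V - R_s * I."""
    return v_applied - r_series * current


def poisson_area_exponent(weibull_shape):
    """Exponent of N_C versus area expected from Poisson area scaling, -1/eta."""
    return -1.0 / weibull_shape


def scale_endurance_with_area(n_ref, area_ref, area, exponent):
    """Scale endurance from a reference area assuming N_C ~ A^exponent."""
    return n_ref * (area / area_ref) ** exponent


# Hypothetical current of 100 uA at 0.65 V applied: 30 mV is lost on R_s.
v_corrected = mtj_voltage(0.65, 100e-6)

# Endurance ratio between the largest and smallest devices, (105 nm)^2 vs (47 nm)^2,
# for the Poisson exponent and for the measured exponent of about -2.
area_small, area_large = 47.0 ** 2, 105.0 ** 2
ratio_poisson = scale_endurance_with_area(1.0, area_small, area_large, poisson_area_exponent(1.35))
ratio_measured = scale_endurance_with_area(1.0, area_small, area_large, -2.0)
```

Since the endurance slope is about 50 mV/decade, even a few tens of millivolts lost on the series resistance shift $N_C$ by a sizable fraction of a decade, which is why the correction restores the monotonic area trend.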
VII. CONCLUSION
We show a comprehensive study of breakdown-limited cycling endurance in p-STT-MRAM devices. Cycling endurance is experimentally monitored as a function of the pulse amplitude, polarity, and timing. We developed a semiempirical model based on generation, activation, and diffusion of defects in the MgO tunnel barrier. The model accounts for the dependence of endurance lifetime on applied voltage, pulsewidth, and pulse delay. Finally, the area scaling of endurance is experimentally analyzed and discussed.
