ABSTRACT This paper proposes a compact nonvolatile three-terminal two-transistor spintronic memory cell with a fast-read operation. It is applicable to a variety of current-driven and voltage-controlled write mechanisms, such as spin diffusion, spin Hall effect, domain wall motion, and magnetoelectric effect. Compared to the prior three-terminal spintronic memory proposal, the new cell provides 20% improvement in cell density. Compared to the conventional spin torque transfer random access memory, the proposed cell separates the read and write paths, and improves the read energy-delay product by up to 22× considering process variations for transistors and MTJs.
I. INTRODUCTION
There is an increasing need for fast nonvolatile memory to enable low-standby-power and low-energy computing, especially in energy-harvesting applications that have a limited and unreliable power supply [1]-[3]. Spin torque transfer random access memory (STTRAM) has been proposed as one of the promising candidates thanks to its low latency, nonvolatility, and high endurance [4]-[8]. Data stored in the magnet are read via a magnetic tunnel junction (MTJ), and the tunneling magnetoresistance (TMR) ratio is a key metric that defines how far apart the two resistance values are. The TMR of an MTJ generally increases with its tunneling oxide thickness up to a point (∼2 nm) beyond which it saturates [9]. Therefore, a relatively thick tunneling oxide helps to achieve a large TMR ratio, improves the sensing margin, provides a small read current, and reduces the read disturb rate. However, the resistance-area product of an MTJ increases exponentially with its tunneling oxide thickness. Hence, the read delay and write energy increase significantly because of the large Joule heating energy associated with the large MTJ resistance.
To improve the write energy efficiency without sacrificing the read performance and read disturb rate, three-terminal memory cells have been proposed to separate the read and write current paths [7] , [8] , [10] - [12] . Significant improvements have been projected in terms of the write energy and delay. However, the drawback is that the cell footprint area increases by 50% due to the extra transistor required to decouple read and write paths. Regarding the read operation, it is still identical to the conventional STTRAM, whose delay and energy are constrained by the large MTJ resistance. In these proposals, because the MTJ tunneling oxide thickness needs to be sufficiently large to ensure a correct read operation, the large bitline capacitance needs to be charged or discharged by a read current passing through a highly resistive MTJ, causing a large read delay. Several fast nonvolatile SRAM implementations based on magnetoelectric MTJ have been proposed [13] , [14] . However, a large number of transistors are used in each cell, leading to a significant cell footprint area overhead.
In this paper, a nonvolatile two-transistor three-terminal spintronic memory cell structure with an enhanced read operation is proposed. It has an extra reference MTJ, and the transistors are reallocated to take advantage of the large driving capability of CMOS transistors. This structure has a compact layout and achieves 20% higher cell density compared to the previous three-terminal memory cell. In addition, it is compatible with a variety of spintronic writing mechanisms, including spin diffusion, spin Hall effect, domain wall motion, and magnetoelectric effect. An array-level memory performance analysis is performed based on Monte Carlo SPICE simulations that take into account the process variations for both MTJs and CMOS transistors. Optimal tunneling oxide thicknesses of MTJs are obtained for a variety of TMR assumptions and array sizes to minimize the overall read energy-delay product (EDP).
The rest of this paper is organized as follows. Section II illustrates the structure and layout of the proposed spintronic memory cell. Section III describes the performance modeling approaches, simulation results, and analyses. Conclusions are summarized in Section IV.
II. PROPOSED MEMORY STRUCTURE
Fig. 1 shows the schematic of the proposed cell. Both the read and write paths pass through an access transistor that is connected to the wordline (WL). During the write operation, the access transistor is turned on, and the operating principles of the spintronic writing mechanisms are the same as those described in [11]. The voltages applied to the sourceline (SL) and write bitline (WBL) determine the magnetization of the free magnet that stores the data. The read voltage (V_read) is held at ground during the write operation.
For each cell, a reference MTJ is connected in series with the sense MTJ. The drain and source of the top nMOS are connected to the read bitline (RBL) and the access transistor, respectively. During the read operation, the access transistor is turned on, and the output cell resistance depends on the bottom MTJ configuration. If the bottom MTJ is in the parallel state, it has a small resistance, reducing the voltage at the node between the two MTJs. Thus, the top nMOS operates in the cutoff region and provides a large cell resistance; otherwise, a large voltage is generated between the MTJs, turning on the top nMOS and lowering the cell resistance. Since the transistor has a large driving capability and on-off ratio, it improves the sensing margin and drives the bitline capacitance much faster than the conventional STTRAM does. For unselected memory cells in the same column, both V_read and WBL are held at ground during the read to ensure that no sneak current flows through the MTJs. The voltage at the gate of the top nMOS is therefore at ground, leading to a high cell impedance in nearby cells sharing the same RBL.
Fig. 2 shows the proposed memory cell layout based on spin diffusion and the cross-sectional views at dashed lines A-E. Note that the reference and storage MTJs are formed close to each other at the same metal level, as shown in Fig. 2(c). Therefore, the voltage division between the two MTJs is less sensitive to the process variation of the tunneling oxide thickness because the two thicknesses are correlated. The layout views of each layer marked in Fig. 2 are shown in Fig. 3, where a unidirectional metal design rule is assumed.
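The read mechanism described above can be illustrated with a minimal sketch that treats the two series MTJs as an ideal resistive divider. The parallel-state resistance `R_P`, the reference resistance `R_REF`, and the scaling factor are hypothetical values chosen only for illustration, and the access-transistor resistance is ignored; the 1 V read voltage and the ∼150% TMR are the values assumed elsewhere in this paper.

```python
def divider_voltage(v_read, r_top, r_bottom):
    """Voltage at the node between two series resistors (ideal divider)."""
    return v_read * r_bottom / (r_top + r_bottom)


V_READ = 1.0               # read voltage in volts (technology assumption in the paper)
TMR = 1.5                  # nominal TMR of about 150%
R_P = 10e3                 # hypothetical parallel-state resistance in ohms
R_AP = R_P * (1.0 + TMR)   # from TMR = (R_AP - R_P) / R_P
R_REF = R_P                # hypothetical reference-MTJ resistance in ohms

# Node voltage that drives the gate of the top nMOS for each stored state.
v_p = divider_voltage(V_READ, R_REF, R_P)
v_ap = divider_voltage(V_READ, R_REF, R_AP)

# If a correlated oxide-thickness shift scales both MTJ resistances by the
# same factor, the divider ratio (and hence the node voltage) is unchanged.
SCALE = 1.3                # hypothetical common resistance scaling
v_ap_scaled = divider_voltage(V_READ, SCALE * R_REF, SCALE * R_AP)

print(f"parallel: {v_p:.3f} V, antiparallel: {v_ap:.3f} V, "
      f"antiparallel with common scaling: {v_ap_scaled:.3f} V")
```

The antiparallel state yields the higher node voltage, and a common scaling of both resistances cancels in the ratio, which is the reason correlated thickness variation has a limited effect on the voltage division.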
For other spintronic writing mechanisms, including spin Hall effect, domain wall motion, and magnetoelectric effect, the corresponding cross-sectional views at dashed line A are shown in Fig. 4. Note that for the spin Hall effect-based memory cell, fabricating the free-layer magnet on top of the spin Hall layer may degrade the pristine condition of the interface between the free layer and the spin Hall layer. For the domain wall-based memory cell, fabricating two fixed magnets on opposite sides of the free magnet may also lower the interface quality between the fixed and free magnets. In this paper, however, potential benefits are projected under the assumption that these challenges associated with fabricating high-quality interfaces will be overcome.
To compare the proposed three-terminal memory cell with the conventional two-terminal STTRAM and previously proposed three-terminal spintronic memory cells, Fig. 5 shows the corresponding cell layout views and cross-sectional views at the dashed line A. The proposed memory cell improves the cell density by 20% compared to the previously proposed three-terminal memory cell. The main reason for the cell area advantage comes from the fact that two nMOS transistors in the proposed memory cell share the same source/drain contact. The drawback is that it requires three more metal levels to provide all connections to the read and write terminals compared to the previously proposed three-terminal memory cell [11] . The footprint area of the proposed memory cell is larger than that of the conventional two-terminal STTRAM because the separation of the read and write terminal requires extra connections to each cell, leading to a larger routing constraint. The detailed cell area and density comparison data are shown in Table 1 .
III. PERFORMANCE ANALYSIS
A. MODELING AND SIMULATION APPROACHES
The resistance-area products of MTJs in the antiparallel configuration at various oxide thicknesses are taken from previous experimental work, where the maximum TMR is ∼150% [9]. The read circuitry is adopted from the default sensing circuitry presented in [15], and the simplified read circuit diagram, including one memory cell and one reference cell, is shown in Fig. 6. The CMOS device-level SPICE model follows the ASU PTM FinFET HP model at the 16-nm technology node [16]. The minimum sensing voltage swing at the input of the sense amplifier is set to 50 mV. The technology assumptions are as follows [11].
1) The half metal pitch F is 30 nm.
2) The default memory array size is 1000 × 1000.
3) The read voltage V_read is set to 1 V.
4) The bitline and WL resistances are 20.5 and 53 Ω/µm, respectively, based on a compact interconnect resistivity model [17].
5) The interconnect capacitance is 0.15 fF/µm [18].
6) The parasitic capacitance of the CMOS transistors and local interconnects assumed for each cell is 0.1 fF.
For all the writing mechanisms considered here, the models used to estimate the magnet switching delay, domain wall propagation speed, and exchange bias effect follow the modeling approaches used in the uniform benchmarking methodology [11]. Since the write operation is identical to that of the memory proposed in [11], the rest of this paper focuses on the read operation.
B. SIMULATION RESULTS WITHOUT PROCESS VARIATION
Fig. 7 shows the read delay and energy consumption based on HSPICE simulations. Three TMR values are investigated, and the nominal TMR value is taken from the experimental measurement [9]. An optimal oxide thickness exists that minimizes the delay for each TMR value. The reason is that when the oxide is thin, an increase in its thickness improves the sensing voltage swing and reduces the delay required to reach the minimum sensing voltage of 50 mV. If the oxide thickness exceeds a certain point, however, the TMR saturates, and the large MTJ resistance dominates the RC delay at the input gate of the top nMOS and increases the overall read delay. The read energy decreases as the oxide thickness increases because, for a thin oxide, the small resistance of the MTJs leads to a large leakage current flowing from V_read to SL.
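As a rough back-of-the-envelope check on the interconnect assumptions listed in Section III-A, the sketch below estimates the lumped resistance, capacitance, and Elmore delay of a bitline and a WL. The cell height and width (`CELL_HEIGHT_F`, `CELL_WIDTH_F`) are hypothetical placeholders, not values from Table 1, and applying the same 0.1 fF per-cell load to both lines is also an assumption. The Elmore delay of a uniformly distributed RC line is taken as RC/2; this is only a first-order estimate and not a substitute for the SPICE simulations.

```python
F_UM = 0.030                # half metal pitch F = 30 nm, in micrometers
N_ROWS = 1000               # default array size is 1000 x 1000
N_COLS = 1000
R_BL_OHM_PER_UM = 20.5      # bitline resistance per unit length
R_WL_OHM_PER_UM = 53.0      # wordline resistance per unit length
C_WIRE_FF_PER_UM = 0.15     # interconnect capacitance per unit length
C_CELL_FF = 0.1             # per-cell parasitic capacitance
CELL_HEIGHT_F = 4.0         # hypothetical cell height in units of F
CELL_WIDTH_F = 8.0          # hypothetical cell width in units of F


def line_parasitics(n_cells, pitch_um, r_ohm_per_um):
    """Return (total resistance in ohms, total capacitance in fF) of a line."""
    length_um = n_cells * pitch_um
    r_ohm = r_ohm_per_um * length_um
    c_ff = C_WIRE_FF_PER_UM * length_um + n_cells * C_CELL_FF
    return r_ohm, c_ff


def elmore_delay_ps(r_ohm, c_ff):
    """Elmore delay of a uniformly distributed RC line, RC/2, in picoseconds."""
    return 0.5 * r_ohm * c_ff * 1e-3   # 1 ohm * 1 fF = 1e-3 ps


# A bitline spans all rows; a wordline spans all columns.
r_bl, c_bl = line_parasitics(N_ROWS, CELL_HEIGHT_F * F_UM, R_BL_OHM_PER_UM)
r_wl, c_wl = line_parasitics(N_COLS, CELL_WIDTH_F * F_UM, R_WL_OHM_PER_UM)

print(f"bitline:  R = {r_bl:.0f} ohm, C = {c_bl:.0f} fF, "
      f"Elmore delay = {elmore_delay_ps(r_bl, c_bl):.1f} ps")
print(f"wordline: R = {r_wl:.0f} ohm, C = {c_wl:.0f} fF, "
      f"Elmore delay = {elmore_delay_ps(r_wl, c_wl):.1f} ps")
```

Changing `N_ROWS` and `N_COLS` shows how the line parasitics scale with array size, which is the dependence that the array-level analysis explores.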
The proposed memory outperforms the STTRAM in both delay and energy at the optimal read EDP design points, as shown in Fig. 8. This is because the active transistors inside each cell charge and discharge the large bitline capacitances much faster than the STTRAM does. In addition, the read speed is insensitive to the MTJ resistance; hence, the tunneling oxide can be made sufficiently thick to achieve a large TMR and a large sensing voltage swing at the input gate of the transistor, leading to a large cell resistance difference and fast sensing. More details are discussed in Section III-C. At the optimal read EDP design points, the proposed two-transistor memory provides 6× smaller EDP than the STTRAM at the nominal TMR. The proposed spintronic memory structure shows a larger benefit at a low TMR because its overall read delay is dominated by the WL delay, as shown in Fig. 7(a), so the improvement in sensing delay due to an improved TMR only slightly affects the overall read delay. For the conventional STTRAM, however, the read delay is dominated by the bitline and sensing delay. A large TMR improves the voltage swing at the input of the sense amplifier and leads to a substantial read performance improvement.
During the read operation, a charge current flows from V_read to SL, which can potentially induce read disturb. The read disturb rate is calculated as [19]

$$P_{\text{disturb}} = 1 - \exp\left(-\frac{t_{\text{read}}}{\tau}\right)$$

where

$$\tau = \tau_0 \exp\left[E\left(1 - \frac{I_{\text{read}}}{I_c}\right)\right],$$

$\tau_0$ is the attempt period of 1 ns [19], $E$ is the thermal stability factor, $I_{\text{read}}$ is the read current flowing through the cell MTJ, $t_{\text{read}}$ is the read delay, and $I_c$ is the critical charge current to switch the magnet, which is estimated by following the previous three-terminal spintronic memory work [11]. Fig. 9 shows the read disturb rate versus the MTJ tunneling oxide thickness for in-plane and PMA magnets. Memory cells with PMA magnets have a lower critical switching current than those with in-plane magnets, leading to a higher read disturb rate. In addition, the read disturb rate decreases as the TMR improves owing to the increased MTJ resistance in the antiparallel state, which reduces the read current flowing through the cell MTJs. However, if the oxide thickness increases beyond a certain point, the read disturb rate saturates and then slightly increases because the longer read delay diminishes the benefit of the lower read current. Because of the large optimal oxide thickness (∼1.8 nm) at the minimum EDP design point in Fig. 8, the read current is small and the read disturb rate is very low ($<10^{-24}$). Another reason for the small read disturb rate is that the proposed memory has a fast read time, which further lowers the read disturb rate.
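The dependence of the read disturb rate on read current and read time can be sketched numerically. The snippet below assumes the standard thermal-activation form, P = 1 - exp(-t_read/τ) with τ = τ0·exp[E(1 - I_read/I_c)], which is consistent with the symbols defined above but may differ in detail from the expression used in [19]. The thermal stability factor, current ratios, and read times are hypothetical inputs.

```python
import math

TAU0_S = 1e-9   # attempt period of 1 ns, as stated in the text


def read_disturb_rate(t_read_s, i_ratio, stability, tau0_s=TAU0_S):
    """Assumed thermal-activation model of the read disturb probability.

    t_read_s  : read delay in seconds
    i_ratio   : I_read / I_c, read current over critical switching current
    stability : thermal stability factor E
    """
    tau = tau0_s * math.exp(stability * (1.0 - i_ratio))
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-t_read_s / tau)


STABILITY = 60.0   # hypothetical thermal stability factor

rate_low_current = read_disturb_rate(1e-9, 0.1, STABILITY)
rate_high_current = read_disturb_rate(1e-9, 0.5, STABILITY)
rate_long_read = read_disturb_rate(10e-9, 0.1, STABILITY)

print(f"I_read/I_c = 0.1, 1 ns read:  {rate_low_current:.3e}")
print(f"I_read/I_c = 0.5, 1 ns read:  {rate_high_current:.3e}")
print(f"I_read/I_c = 0.1, 10 ns read: {rate_long_read:.3e}")
```

Under this assumed model, the rate depends exponentially on the current ratio and approximately linearly on the read time when the probability is small, so a thick oxide (small read current) and a fast read both reduce the disturb probability.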
C. PERFORMANCE ANALYSIS UNDER PROCESS VARIATION
To quantify the impact of the process variation of transistors and MTJs, Monte Carlo simulations are performed based on the 3σ deviation of 10% in the footprint area and tunneling oxide thickness of MTJs and 3σ deviation of 60 mV in the threshold of CMOS transistors in both memory and reference cells [20] , where 5000 samples are taken for each configuration. Fig. 10 shows the gate voltage distribution of the top transistor when the cell MTJ is set at parallel and antiparallel states. If the cell MTJ is at the parallel state, it has a low resistance, reducing the gate voltage; otherwise, a high gate voltage is generated. The red dashed line marks the reference voltage applied on the reference memory cell. Enough margins are ensured to account for the process variation of MTJs. Fig. 11 shows the cell resistance distributions when the cell MTJ is set at parallel and antiparallel states. These distributions not only include the MTJ variations, but also take into account the threshold variation of the transistors. The high/low resistance state ratio of the proposed memory cell is above 100, whereas the conventional STTRAM has a resistance difference of only ∼2. This large resistance change improves the sensing margin significantly and reduces the read delay. The proposed memory cell has a much smaller resistance at the on state due to a larger driving capability of the CMOS transistors, leading to a major speed advantage. The reference cell resistance distributions of the proposed spintronic memory cell and the conventional STTRAM are shown in Fig. 12 . It can be observed that the proposed memory also has a smaller reference cell resistance.
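The sampling step of such a Monte Carlo analysis can be sketched in a few lines. The example below uses the stated 3σ deviation of 10% for MTJ area and oxide thickness and 5000 samples, but replaces the SPICE cell with an ideal two-MTJ divider. The exponential thickness sensitivity `KAPPA_PER_NM`, the nominal resistances, and the choice of a parallel-state reference MTJ are hypothetical; transistor threshold variation is omitted.

```python
import math
import random
import statistics

N_SAMPLES = 5000                     # samples per configuration (from the text)
SIGMA_REL = 0.10 / 3.0               # 3-sigma deviation of 10% gives 1-sigma
V_READ = 1.0                         # read voltage in volts
TMR = 1.5                            # nominal TMR of about 150%
T_OX_NM = 1.8                        # oxide thickness near the optimal design point
KAPPA_PER_NM = math.log(10.0) / 0.3  # hypothetical: RA grows 10x per 0.3 nm
R_P = 10e3                           # hypothetical parallel-state resistance in ohms
R_AP = R_P * (1.0 + TMR)


def mtj_resistance(r_nominal, t_dev_nm, rng):
    """Resistance with a thickness deviation and a random area deviation."""
    area_factor = 1.0 + rng.gauss(0.0, SIGMA_REL)
    return r_nominal * math.exp(KAPPA_PER_NM * t_dev_nm) / area_factor


def gate_voltages(r_cell_nominal, r_ref_nominal, correlated, seed=0):
    """Sample the divider voltage between the reference and cell MTJs."""
    rng = random.Random(seed)
    sigma_t = SIGMA_REL * T_OX_NM
    samples = []
    for _ in range(N_SAMPLES):
        t_dev_cell = rng.gauss(0.0, sigma_t)
        # Neighboring MTJs share the thickness deviation in the correlated case.
        t_dev_ref = t_dev_cell if correlated else rng.gauss(0.0, sigma_t)
        r_cell = mtj_resistance(r_cell_nominal, t_dev_cell, rng)
        r_ref = mtj_resistance(r_ref_nominal, t_dev_ref, rng)
        samples.append(V_READ * r_cell / (r_ref + r_cell))
    return samples


v_p_corr = gate_voltages(R_P, R_P, correlated=True)
v_ap_corr = gate_voltages(R_AP, R_P, correlated=True, seed=1)
v_p_indep = gate_voltages(R_P, R_P, correlated=False)

print("std (correlated thickness):  ", statistics.stdev(v_p_corr))
print("std (independent thickness): ", statistics.stdev(v_p_indep))
print("margin between states:       ", min(v_ap_corr) - max(v_p_corr))
```

In this simplified model, the thickness term cancels when the two MTJs share the same deviation, so the gate voltage spread is set by the area variation alone, and the parallel and antiparallel distributions remain separated.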
The normalized probability density function of the total read access time at the optimal read EDP design points is shown in Fig. 13. The memory with a lower TMR suffers more from the process variation and has a wider distribution because the voltage swing at the input gate of the transistor is limited, leading to large read delays at the worst corners.
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
FIGURE 11. Normalized probability density function of the cell resistance at (a) antiparallel and (b) parallel states.
The EDP improvement of the proposed memory cell compared to the conventional STTRAM under process variation is shown in Fig. 14, where the read access time is taken at the 6σ deviation point. Three different array sizes are investigated to quantify the potential benefit. It can be observed that the improvement of the proposed memory cell is more prominent at large array sizes and, in most cases, at low TMR ratios. This is because a larger memory array benefits more from the large driving capability of the in-cell transistor. For array sizes of 1 and 2 Mb, memory cells at the nominal TMR benefit most from the proposed memory structure for the same reasons discussed for Fig. 8. However, for an array size of 4 Mb, the proposed memory benefits more with a 20% improved TMR. This is because the large parasitic resistance of the long bitline becomes comparable to the cell resistance in the on state. The read margin decreases, and the memory becomes more vulnerable to the process variation of MTJs and transistors. For relatively large array sizes, the read EDP of the proposed memory cell is reduced by up to 22×.
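The metric used in this comparison can be written down compactly. The sketch below shows one plausible way to take a delay at the 6σ deviation point (mean plus six sample standard deviations) from Monte Carlo samples and to form the EDP improvement ratio. The function names and all numeric inputs are hypothetical and do not reproduce the data behind Fig. 14.

```python
import statistics


def k_sigma_point(samples, k=6.0):
    """Value at mean + k standard deviations of the sampled distribution."""
    return statistics.mean(samples) + k * statistics.stdev(samples)


def edp(energy, delay):
    """Energy-delay product."""
    return energy * delay


def edp_improvement(e_base, t_base, e_new, t_new):
    """Ratio of the baseline EDP to the new design's EDP."""
    return edp(e_base, t_base) / edp(e_new, t_new)


# Hypothetical read-delay samples in nanoseconds for two designs.
delays_base_ns = [4.0, 4.4, 3.9, 4.2, 4.1, 4.3]
delays_new_ns = [1.0, 1.05, 0.98, 1.02, 1.01, 1.03]

t_base = k_sigma_point(delays_base_ns)
t_new = k_sigma_point(delays_new_ns)

# Hypothetical read energies in arbitrary units.
improvement = edp_improvement(e_base=2.0, t_base=t_base, e_new=1.0, t_new=t_new)
print(f"EDP improvement at the 6-sigma delay point: {improvement:.1f}x")
```

Because the delay is taken at the tail of the distribution, a design with a narrower delay spread gains more in this metric than a comparison of mean delays alone would suggest.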
IV. CONCLUSION
This paper proposes a novel two-transistor spintronic memory cell with enhanced read performance that is applicable to a variety of writing mechanisms, including spin diffusion, spin Hall effect, domain wall motion, and magnetoelectric effect. Detailed cell structures are illustrated, and SPICE Monte Carlo simulations are performed to take into account the process variation of MTJs and CMOS transistors. Compared to the conventional STTRAM, the proposed memory provides up to 22× reduction in overall read EDP at relatively large array sizes. Compared to the prior three-terminal memory cell, it provides a 20% improvement in cell density.
