Abstract: A new device structure for spin-transfer torque-based magnetic random access memory (STT-MRAM) is proposed for on-chip memory applications. Our device structure exploits the spin Hall effect to create a differential memory cell that exhibits fast and energy-efficient write operation. In addition, because of the inherently differential device structure, fast and reliable read operation can be performed. Our simulation study shows a 10× improvement in write energy over the standard 1T1R in-plane STT-MRAM memory cell, and 1.6× faster read operation compared with single-ended sensing (as in standard 1T1R STT-MRAMs). The bit-cell characteristics are promising for high-performance on-chip memory applications.
I. INTRODUCTION
Spin-transfer torque magnetic random access memory (STT-MRAM) is a promising candidate for on-chip memories because of its high density, nonvolatility, and compatibility with CMOS technology [1], [2]. High-performance memory design is, however, quite challenging with standard 1T1R STT-MRAM. High-speed write requires a large write current (voltage), which imposes severe stress conditions on the tunneling oxide in the magnetic tunnel junction (MTJ) and leads to reliability concerns such as time-dependent dielectric breakdown [2]. Therefore, the upper limit for the MTJ voltage, or equivalently, the maximum achievable write speed, is determined by the tunnel barrier reliability. One approach to address such reliability issues is to design the MTJ with a lower critical switching current [1], [2]. However, when the switching current is lowered, the read current must also be reduced to prevent disturb failures (accidental write during read) [2]. The reduced read current then leads to increased sensing time, thus slowing the single-ended read operation. Therefore, achieving high speed for both write and read is challenging with standard 1T1R STT-MRAM because of its limited design space.
The STT induced by the spin current in the spin Hall effect (SHE) is a promising mechanism for manipulating the magnetization of a nanomagnet [3], [4]. Because the effective spin injection efficiency can exceed 100%, the write mechanism using SHE has the potential to be fast (>1 GHz) and energy efficient (<0.1 pJ/bit) [3], [4]. In this letter, we propose a differential spin Hall magnetic random access memory (DSH-MRAM) bit-cell that exploits the transverse spin currents in SHE to store both true and complementary bits for the same energy required to write the true bit. This results in a memory structure that is self-referenced for a fully differential and fast read operation.
II. PROPOSED DSH-MRAM STRUCTURE
DSH-MRAM consists of two MTJs and a spin Hall metal (SHM), which is a nonmagnetic conductor with spin-orbit interaction [Fig. 1(a)]. The free layers (FLs) of the MTJs are in contact with the SHM, and the magnetizations of the FLs (m_1 and m_2) represent the stored information. The pinned layer (PL) magnetization of the two MTJs is fixed in one direction.
Our proposed structure uses SHE to switch the magnetizations m_1 and m_2 for an energy-efficient write. In the example shown in Fig. 1(b), m_1 and m_2 are initially oriented along the +x and −x directions, respectively, and a charge current flows in the SHM in the +y direction. The coupling between electron spin and orbital motion (spin-orbit coupling) in the SHM deflects the −x and +x directed spins to the +z (top) and −z (bottom) surfaces of the SHM, respectively. Therefore, the accumulated spins on the top and bottom surfaces exert STT on m_1 and m_2. Because SHE separates opposite spins to opposite surfaces, m_1 and m_2 are subjected to spin currents of equal magnitude but opposite sign, writing both true and complementary bits at the same time. Note that the energy used for writing both true and complementary bits is the same as the energy required to write the true bit alone.
The differential read operation in the proposed memory structure is shown in Fig. 1(c), where the SHM acts as the common terminal for the read current paths. In this figure, MTJ_1 and MTJ_2 are in the parallel (P) and anti-parallel (AP) states, respectively. The resistance of each MTJ depends on the orientation of its FL with respect to the PL; hence, the difference between the two read currents (voltages) is sensed to evaluate the stored bit. Unlike standard STT-MRAM, our proposed device is self-referencing and does not require a global reference cell.
The schematic view of a DSH-MRAM bit-cell is shown in Fig. 2 with the access transistors in the read and write current paths. The read/write operations are performed by applying appropriate voltages to the write-word-line (WWL), read-word-line (RWL), two bit-lines (BL/BLB), and source-line (SL).
III. MODELING AND SIMULATION
The DSH-MRAM has different current paths for read and write; hence, the equivalent circuits during read and write operations are different. As shown in Fig. 3, the equivalent circuit during the read operation consists of MTJ_1 and MTJ_2 in series with their access transistors. The resistances of the MTJs in the parallel (R_P) and anti-parallel (R_AP) states are obtained from the nonequilibrium Green's function (NEGF)-based simulation framework [8]. The NEGF formalism uses a spin-dependent effective mass Hamiltonian for electron transport simulations, which were calibrated with experimental data to reproduce the resistance characteristics of the MTJ in [5]. Subsequently, the resistances of the MTJs were used in SPICE-based circuit simulations along with a commercial 45-nm transistor model to evaluate the read performance of the proposed bit-cell. During write, the equivalent bit-cell is a resistor (the resistance of the SHM) in series with an access transistor, as shown in Fig. 4(b). The charge current (I_e) flowing through the SHM is extracted from SPICE-based circuit simulation and the corresponding spin current is calculated as [3]
where the effective field (H_eff,i) includes the shape and magnetocrystalline anisotropy fields and the interlayer dipolar field. The dipolar coupling field is given by H_Dip,i = −N_dip M_S m_j, where N_dip is the effective dipolar coupling factor extracted from micromagnetic simulations [10]. In addition, the magnetic energy barrier for our proposed structure is 58 k_B T, which is calculated using the method described in [11]. The simulation parameters are listed in Table I.
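As a minimal sketch of the dipolar coupling term H_Dip,i = −N_dip M_S m_j, the field acting on one FL can be evaluated from the magnetization direction of the other FL. The function name and the numerical values of N_dip and M_S below are hypothetical placeholders (they are not the Table I values), and a positive N_dip is assumed.

```python
def dipolar_field(n_dip, m_s, m_j):
    """Dipolar field on FL i produced by FL j: H_Dip,i = -N_dip * M_S * m_j.

    n_dip : effective dipolar coupling factor (dimensionless, assumed positive)
    m_s   : saturation magnetization (the result is in the same units)
    m_j   : unit vector (x, y, z) of the other free layer's magnetization
    """
    return tuple(-n_dip * m_s * component for component in m_j)


# Hypothetical illustration values, not taken from the letter.
N_DIP = 0.01
M_S = 1000.0

m_1 = (1.0, 0.0, 0.0)   # FL 1 along +x
m_2 = (-1.0, 0.0, 0.0)  # FL 2 along -x (complementary bit)

h_dip_on_fl1 = dipolar_field(N_DIP, M_S, m_2)
h_dip_on_fl2 = dipolar_field(N_DIP, M_S, m_1)
print(h_dip_on_fl1, h_dip_on_fl2)
```

Under these assumptions, the field on FL 1 points along +x, parallel to m_1, and the field on FL 2 points along −x, parallel to m_2, so a positive N_dip favors the complementary configuration of the two FLs.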
IV. RESULTS AND DISCUSSION
DSH-MRAM exhibits four distinct advantages over the standard 1T1R STT-MRAM. First, the write current flows through the SHM instead of the tunneling oxide. Therefore, a high write current can be supplied to achieve fast switching without any reliability concerns associated with the tunnel barrier. Second, the SHM has a much lower resistance than an MTJ; hence, a smaller transistor width is needed for the write current. Third, the write operation using SHE is more energy efficient because the spin current injected into the FL can be larger than the charge current. Finally, the differential cell allows faster read operation. For quantitative comparison, we designed both the DSH-MRAM and a standard 1T1R STT-MRAM with in-plane MTJ (hereafter referred to as the 1T1R cell) with an identical target write time of 1 ns at iso-bit-cell area.
In DSH-MRAM, the spin Hall angle θ_SH and the SHM thickness t_SHM are expected to be the dominant factors in determining the spin injection efficiency. As shown in Fig. 4, for each type of SHM, we find the thickness (t_SHM) at which the spin injection efficiency is maximized. The t_SHM at which the maximum spin current flows is different for writing 1 and 0. When writing 0, the access transistors drive a charge current from SL to BL/BLB, in which case the transistors have a gate-to-source voltage (V_GS) of V_DD. On the other hand, when writing 1, the access transistors are source degenerated because of a finite voltage at node X (Fig. 4), which reduces their drive strength (V_GS < V_DD). This difference in transistor drive strength leads to unequal spin currents. To minimize the asymmetry, t_SHM is chosen for the worst write case, i.e., writing a 1. With a spin injection efficiency of ∼3× in DSH-MRAM, a minimum-sized access transistor (120 nm) is sufficient to achieve a 1-ns write time.
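The existence of an optimum t_SHM can be illustrated with a commonly used SHE expression, I_s/I_e = θ_SH (A_MTJ/A_SHM)[1 − sech(t_SHM/λ_sf)], where A_MTJ/A_SHM reduces to (FL length)/t_SHM when the FL and the SHM share the same width. This form, the spin-flip length λ_sf, and all numerical values below are assumptions for illustration only; they are not confirmed as the model or the parameters used in this letter. The sketch also ignores the access transistor, so it does not capture the write-1/write-0 asymmetry.

```python
import math


def spin_injection_efficiency(theta_sh, fl_length_nm, t_shm_nm, lambda_sf_nm):
    """Assumed form of I_s / I_e = theta_SH * (A_MTJ / A_SHM) * (1 - sech(t / lambda_sf))."""
    area_ratio = fl_length_nm / t_shm_nm          # widths cancel (assumption)
    sech = 1.0 / math.cosh(t_shm_nm / lambda_sf_nm)
    return theta_sh * area_ratio * (1.0 - sech)


# Hypothetical illustration values, not taken from the letter.
THETA_SH = 0.3        # spin Hall angle
FL_LENGTH_NM = 60.0   # FL length along the charge current
LAMBDA_SF_NM = 1.5    # spin-flip length of the SHM

# Sweep the SHM thickness from 0.5 nm to 10 nm in 0.01-nm steps.
thicknesses_nm = [0.5 + 0.01 * k for k in range(951)]
best_t_nm = max(
    thicknesses_nm,
    key=lambda t: spin_injection_efficiency(THETA_SH, FL_LENGTH_NM, t, LAMBDA_SF_NM),
)
best_efficiency = spin_injection_efficiency(
    THETA_SH, FL_LENGTH_NM, best_t_nm, LAMBDA_SF_NM
)
print(f"optimum t_SHM ~ {best_t_nm:.2f} nm, I_s/I_e ~ {best_efficiency:.2f}")
```

In this simplified model, a thinner SHM increases the area ratio but suppresses the spin accumulation once t_SHM approaches λ_sf, so the efficiency peaks at a thickness of roughly 1.5 λ_sf.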
In the case of the 1T1R cell design, the source degeneration of the access transistor is due to the MTJ, which has a much higher resistance than the SHM. The stronger source degeneration in 1T1R cells causes much stronger write asymmetry and results in wasted write energy [12]. Moreover, the 1T1R cell design needs a large transistor width (1.035 μm) and a boosted voltage (V_DD = 1.2 V) to meet the write-time requirement of 1 ns. Therefore, in 1T1R cells, a high write-speed design leads to severe stress conditions on the tunnel barrier, degrading the reliability. The simulation results show that DSH-MRAM achieves a write energy of 0.077 pJ, which is ∼10× smaller than that of the 1T1R cell (0.744 pJ). In addition, upon MTJ dimensional scaling, the write performance of DSH-MRAM is projected to be better than that of 1T1R cells because of its higher spin injection efficiency and lower SHM resistance.
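The reported write energies can be checked with a short calculation. The ratio follows directly from the two numbers above; the average write power is an additional illustration that assumes the write energy is dissipated over the 1-ns target write time, which the letter does not state explicitly.

```python
# Write energies reported above (pJ) and the target write time (ns).
E_DSH_PJ = 0.077
E_1T1R_PJ = 0.744
WRITE_TIME_NS = 1.0

# Energy improvement of DSH-MRAM over the 1T1R cell.
energy_ratio = E_1T1R_PJ / E_DSH_PJ

# Average power, assuming the energy is spent within the write window.
# 1 pJ / 1 ns = 1 mW, so multiply by 1e3 to express the result in microwatts.
avg_power_dsh_uw = E_DSH_PJ / WRITE_TIME_NS * 1e3
avg_power_1t1r_uw = E_1T1R_PJ / WRITE_TIME_NS * 1e3

print(f"energy ratio: {energy_ratio:.2f}x")
print(f"average write power: {avg_power_dsh_uw:.0f} uW vs {avg_power_1t1r_uw:.0f} uW")
```

The ratio evaluates to about 9.7, consistent with the ∼10× figure quoted in the text.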
For reading the data stored in DSH-MRAM, a voltage-sensing scheme was used, where a read current is injected into BL and BLB, and the corresponding voltages are sensed. For this differential cell, the sense margin (SM) can be written as (V_AP − V_P). On the other hand, in a single-ended sensing scheme (as in the 1T1R bit-cell), only the true bit is stored; hence, the bit-cell provides either V_AP or V_P, which is compared with a reference voltage (V_REF) to determine the stored data. V_REF is typically chosen as the average of V_AP and V_P. Therefore, the SM in single-ended read is (V_AP − V_P)/2. This twofold increase in the SM, which results from the differential nature of DSH-MRAM, allows fast read operation. The read time is typically defined as the time taken to develop a sense margin of 50 mV. Fig. 3 shows the read time of single-ended and differential sensing for different thicknesses of MgO (t_MgO). The optimum t_MgO occurs at 1.2 nm, and differential read is 1.6× faster than single-ended sensing. In the 1T1R bit-cell, t_MgO affects both read and write performance; hence, a thinner t_MgO may have to be chosen to improve write performance.
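The twofold difference in static sense margin can be sketched as follows. The resistance and read-current values are hypothetical, and the access-transistor resistance and bit-line capacitance are ignored, so the sketch covers only the steady-state margins and not the read time.

```python
def sense_margins(r_p, r_ap, i_read):
    """Return (differential SM, single-ended SM) in volts for a voltage-sensing read.

    r_p, r_ap : MTJ resistances in the parallel / anti-parallel states (ohms)
    i_read    : read current injected into each bit-line (amperes)
    """
    v_p = i_read * r_p
    v_ap = i_read * r_ap

    # Differential cell: BL and BLB carry V_AP and V_P (in either order).
    differential_sm = v_ap - v_p

    # Single-ended cell: compare against V_REF = (V_AP + V_P) / 2.
    v_ref = (v_ap + v_p) / 2.0
    single_ended_sm = min(v_ap - v_ref, v_ref - v_p)

    return differential_sm, single_ended_sm


# Hypothetical illustration values, not taken from the letter.
diff_sm, single_sm = sense_margins(r_p=2e3, r_ap=4e3, i_read=20e-6)
print(f"differential: {diff_sm * 1e3:.1f} mV, single-ended: {single_sm * 1e3:.1f} mV")
```

For any choice of R_P, R_AP, and read current, the differential margin in this idealized model is exactly twice the single-ended one; the 1.6× read-time improvement reported above comes from the full circuit simulation.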
V. CONCLUSION
We propose a differential STT-MRAM bit-cell that uses SHE, resulting in a write operation that consumes 10× less energy than standard 1T1R in-plane STT-MRAMs. In addition, storing both true and complementary bits comes without any write energy overhead because of the nature of SHE. The proposed DSH-MRAM bit-cell can perform a 1.6× faster read because of its inherently differential device structure. Therefore, DSH-MRAM is suitable for high-performance on-chip memories.
