Abstract-This paper presents a low-offset read sensing scheme for resistive memories. Due to increasing device variations in sub-32 nm CMOS processes, it becomes very challenging to design a high yield and low-offset read-sensing scheme. In this work we address these issues by using a pseudo-differential sensing scheme to get 2× signal margin and by full offset cancellation of the sense-amplifier, making it more suitable to tolerate variation from the memory array due to storage device resistance variation. Measurement results show the sense-amplifier can work with a 20mV input, which makes it ideal for small-signal sensing for resistive memories.
I. INTRODUCTION
Resistive memories [1], [2] are a class of non-volatile embedded memories that have the potential to be a universal memory technology by providing the density of DRAM, the speed of SRAM and the non-volatility of Flash. Resistive memories typically consist of a 1T-1R structure [3], [4], with a resistive storage device (e.g. a magnetic tunnel junction, or MTJ, for Spin-Transfer Torque Random Access Memory (STT-RAM)) and an access transistor. There are two resistance states: high-resistance ($R_H$, '1') and low-resistance ($R_L$, '0'), where $R_H = (1 + \mathrm{TMR}) \times R_L$. For MTJ devices the tunnel magneto-resistance ratio (TMR) is typically around 100%-200%, depending on technology, temperature etc., which makes it challenging to distinguish the two resistance states correctly. Also, when a current is passed through an MTJ device to read it, the MTJ's state may flip accidentally, resulting in an unwanted write operation (i.e. a read disturbance). Hence, the current through the MTJ ($I_{mtj}$) should be kept as low as possible. On the other hand, due to increasing variations in sub-32 nm CMOS processes along with variation in MTJ resistance, it becomes challenging to design a read-sensing scheme that achieves low read disturbance and high yield (> 5.5σ). Previous works, e.g. current sensing schemes [3], [4], suffer from mismatches in the mirroring transistors and hence do not have a very good yield. Furthermore, they typically operate from a high supply voltage due to voltage headroom requirements and hence consume a lot of power. In this work we address these issues by designing a robust read sensing circuit for resistive memories that targets a yield > 5.5σ and reduced power consumption. The robustness to variations is achieved in two main ways. Firstly, due to the pseudo-differential nature of the sensing scheme (comparing data to two references, 'ref1' and 'ref0'), we get 2× signal margin as compared to a single-reference scheme [3].
Secondly, the offset cancellation of the sense amplifier (SA) makes it more suitable to tolerate variation from the array due to MTJ resistance variation. Also, due to offset cancellation, we can use small-sized devices for the SA, which results in lower area and less power. Thus the proposed technique benefits from technology scaling. Furthermore, since the sense amplifier has a differential topology, it reduces the effect of supply noise and improves common-mode rejection.

Fig. 1 shows the proposed resistive-divider-based read sensing technique. The basic principle of sensing the MTJ resistance is comparing a resistive divider output voltage $V_O$ (formed by a 'ref1' MTJ and the data MTJ) to two reference voltages, $V_H$ and $V_L$. $V_H$ and $V_L$ are also generated from two resistive divider networks using 'ref1' and 'ref0' cells. $R_{ref,H}$ and $R_{ref,L}$ are reference '1' and '0' devices respectively. $R_{mtj}$ is the data MTJ to be sensed. In phase $\Phi_1$, the node 'in1' is connected to $V_O$ and the node 'in2' is connected to $V_L$. So $V_{in1}$ is close to $V_{dd}/2$ (if we assume the data bit is '1') and $V_{in2}$ is close to $V_{dd}/3$ (assuming $R_H = 2 \times R_L$). In the next phase $\Phi_{1B}$, 'in1' is connected to $V_H$ and 'in2' is connected to $V_O$. Hence, $V_{in1}$ is equal to $V_{dd}/2$ and $V_{in2}$ is close to $V_{dd}/2$. Thus the swing at the node 'in1' from phase $\Phi_1$ to $\Phi_{1B}$ is smaller than the swing at 'in2', i.e. $\Delta V_1 < \Delta V_2$. Whereas if the data is '0', $\Delta V_1 > \Delta V_2$. The sense margin is defined as $(\Delta V_1 - \Delta V_2)$.

An analytical model of this scheme was simulated to analyze the effect of MTJ resistance variation on the sense margin. The µ and µ/σ of the sense margin (SM) are shown in Fig. 1. $R_L$ and $R_H$ are assumed to be 4 kΩ and 10 kΩ respectively in the simulations, each having 5% variation. It is also assumed that the read current is kept below 20 µA to keep the read-disturbance rate low. This corresponds to $V_{dd} \approx 20\,\mu\mathrm{A} \times (10 + 10)\,\mathrm{k}\Omega = 400\,\mathrm{mV}$.
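The two-phase comparison can be checked numerically with a short sketch. It assumes, as an inference from the $V_{dd}/2$ and $V_{dd}/3$ values quoted above, that each divider has a 'ref1' device on top and takes its output across the bottom device, and that a swing is the $\Phi_{1B}$ value minus the $\Phi_1$ value. Under that convention $\Delta V_1 = V_H - V_O$ and $\Delta V_2 = V_O - V_L$, so the sense margin reduces to $V_H + V_L - 2V_O$. The function names are illustrative and not taken from the paper.

```python
def divider(vdd, r_top, r_bottom):
    """Output of a two-resistor divider, taken across the bottom device."""
    return vdd * r_bottom / (r_top + r_bottom)


def swings(data_bit, r_l=4e3, r_h=10e3, vdd=0.4):
    """Return (dV1, dV2, sense margin) in volts for a stored '0' or '1'.

    Assumed topology: 'ref1' (R_H) on top of every divider.
    Defaults are the nominal values quoted in the text.
    """
    r_mtj = r_h if data_bit else r_l
    v_o = divider(vdd, r_h, r_mtj)  # ref1 over the data MTJ
    v_h = divider(vdd, r_h, r_h)    # ref1 over ref1
    v_l = divider(vdd, r_h, r_l)    # ref1 over ref0
    dv1 = v_h - v_o                 # node in1: V_O in phase 1, V_H in phase 1B
    dv2 = v_o - v_l                 # node in2: V_L in phase 1, V_O in phase 1B
    return dv1, dv2, dv1 - dv2


for bit in (0, 1):
    dv1, dv2, sm = swings(bit)
    print(f"data={bit}: dV1={dv1 * 1e3:.1f} mV, "
          f"dV2={dv2 * 1e3:.1f} mV, SM={sm * 1e3:.1f} mV")
```

With nominal devices, one of the two swings vanishes for each data value, so the sign of the sense margin alone identifies the stored bit.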
As seen from Fig. 1, the proposed sensing scheme improves both the nominal signal margin and the mean/sigma compared to an ideal single-ended voltage sensing scheme. This is mostly due to the pseudo-differential nature of the sensing technique, which provides approximately 2× signal margin compared to a single-ended scheme. In addition, the voltages $V_{in1}$ (and $V_{in2}$) are generated using the same $R_{ref,H}$ resistor in both phases $\Phi_1$ and $\Phi_{1B}$, which reduces the effect of its variation on the sense voltage. The proposed resistive-divider technique also burns less power, since $V_{dd}$ is smaller (approximately 400 mV) than the nominal $V_{cc}$ (approximately 1 V, due to voltage headroom requirements) for ideal voltage sensing.
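A Monte Carlo sketch along the following lines could be used to explore how resistance variation affects the sense margin. It is not the authors' analytical model. It assumes that the quoted 5% variation is a one-sigma Gaussian spread, that all devices vary independently, and that each input node keeps its own top 'ref1' device across both phases (one reading of the statement that the same $R_{ref,H}$ is used in $\Phi_1$ and $\Phi_{1B}$). All names are illustrative.

```python
import random
from statistics import mean, stdev

R_L, R_H = 4e3, 10e3   # nominal resistances from the text (ohms)
VDD = 0.4              # divider supply from the text (volts)
SIGMA_FRAC = 0.05      # assumption: "5% variation" read as 1-sigma Gaussian


def divider(vdd, r_top, r_bottom):
    """Output of a two-resistor divider, taken across the bottom device."""
    return vdd * r_bottom / (r_top + r_bottom)


def draw(nominal, rng):
    """Draw one device resistance around its nominal value."""
    return rng.gauss(nominal, SIGMA_FRAC * nominal)


def sense_margin_sample(data_bit, rng):
    """One random instance of SM = dV1 - dV2 (volts)."""
    r_top1 = draw(R_H, rng)   # top ref1 device of node in1 (assumed shared)
    r_top2 = draw(R_H, rng)   # top ref1 device of node in2 (assumed shared)
    r_ref1 = draw(R_H, rng)   # bottom ref1 device used to form V_H
    r_ref0 = draw(R_L, rng)   # bottom ref0 device used to form V_L
    r_data = draw(R_H if data_bit else R_L, rng)
    dv1 = divider(VDD, r_top1, r_ref1) - divider(VDD, r_top1, r_data)
    dv2 = divider(VDD, r_top2, r_data) - divider(VDD, r_top2, r_ref0)
    return dv1 - dv2


def margin_stats(data_bit, trials=10000, seed=0):
    """Return (mean SM, |mean| / sigma) over the random trials."""
    rng = random.Random(seed)
    sm = [sense_margin_sample(data_bit, rng) for _ in range(trials)]
    mu = mean(sm)
    return mu, abs(mu) / stdev(sm)


if __name__ == "__main__":
    for bit in (0, 1):
        mu, ratio = margin_stats(bit)
        print(f"data={bit}: mean SM = {mu * 1e3:.1f} mV, mean/sigma = {ratio:.1f}")
```

Replacing the independent draws with correlated ones, or changing which devices are shared, would change the resulting µ/σ, so the numbers from this sketch should not be compared directly with Fig. 1.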
II. PROPOSED RESISTIVE DIVIDER BASED READ TECHNIQUE
The reference devices can be implemented in the memory array itself and they can be shared by a sub-array (or sector). Since the voltages at the nodes 'in1' and 'in2' are needed simultaneously, the unselected sector (shown in the bottom part of Fig. 2) is used to generate the reference voltages ($V_H$, $V_L$) by operating the switches in phases $\Phi_1$ and $\Phi_{1B}$ as described in Fig. 1.
III. PROPOSED SENSE-AMPLIFIER WITH FULL OFFSET-CANCELLATION
The differences of $V_O$ from the two reference voltages $V_H$ and $V_L$ are fed as the two inputs of a differential sense amplifier. The sense amplifier (SA) consists of two inverters which can be reconfigured both as an amplifier and a latch, similar to [5]. We propose a technique to fully cancel the trip-point ($V_{trip}$) mismatch of the two inverters to reduce the sense-amplifier offset voltage.

[Figure caption, partially recovered: ...) and using extra sampling capacitors ($C_{z1}$ and $C_{z2}$). Shown for the case of $\Delta V_1 < \Delta V_2$, i.e. data '1'.]

Fig. 3 shows the proposed sensing circuit. The nodes 'in1' and 'in2' are from the resistive divider network described above. During phase $\Phi_1$, the two inverters 'inv1' and 'inv2' are biased at their respective trip points. This self-biasing scheme ensures that the inverters are biased in their high-gain (amplifying) region irrespective of device variation. During this phase the sampling capacitors ($C_{z1}$ and $C_{z2}$) sample the difference of the trip points of the two inverters. Next, in phase $\Phi_{1B}$, the negative feedback for the inverters is disconnected. The swings at the nodes 'in1' and 'in2' are coupled to the inputs of the inverters via the coupling capacitors ($C_{f1}$ and $C_{f2}$). The swing at the input of each inverter is amplified by its gain ($A$) at the outputs ('y1' and 'y2'). The nodes 'z1' and 'z2' of the capacitors $C_{z1}$ and $C_{z2}$ are floating in this phase, so they do not provide any extra loading at the inputs of the inverters. After this, in phase $\Phi_{2e}$, 'z1' is connected to 'y2' and 'z2' is connected to 'y1'. This creates a positive feedback circuit (as long as $C_z > C_f / A$). So the inputs 'x1' and 'x2' will move in opposite directions (from their respective $V_{trip}$ points), depending on which of $\Delta V_1$ and $\Delta V_2$ is higher. Similarly, the outputs 'y1' and 'y2' will move in opposite directions.
It should be noted that even if the trip points of the two inverters are different, it does not affect the positive feedback, since the difference of $V_{trip1}$ and $V_{trip2}$ was already sampled across $C_{z1}$ and $C_{z2}$ during phase $\Phi_1$. By doing this we can fully cancel the offset due to $V_t$ mismatch in the two inverters. Finally, in phase $\Phi_2$ the capacitors $C_{z1}$ and $C_{z2}$ are shorted, completing the latch. The waveforms of both SAs are shown in Fig. 4 for a case where there is significant mismatch between the $V_{trip}$ of the two inverters. As seen from the figure, the proposed SA was able to resolve the data correctly due to the full offset cancellation technique. Table I shows the effect of variation on the simulated SA offset voltage ($V_{off}$). The proposed SA provides approximately 3× improvement in $\sigma(V_{off})$. The low sigma due to offset cancellation suggests that small-sized transistors can be used for the sensing inverters and hence the SA power can be reduced. In addition, the SA area, which is mainly dominated by the size of the sampling capacitors, also reduces with CMOS scaling. This is because the capacitors can be implemented by regular MOS transistors. Hence the proposed SA benefits from technology scaling in terms of both area and power metrics, which is very desirable when designing for future resistive memory technologies. Table I also shows the effect of supply noise on σ. For this analysis the $V_{cc}$ of the SA is switched by $\Delta V_{cc}$ from phase $\Phi_1$ to $\Phi_{1B}$. This is because changing $V_{cc}$ in phase $\Phi_{1B}$ shifts the inverter trip points, and this shift is not sampled during phase $\Phi_1$. Hence, this step jump in $V_{cc}$ can have a negative impact on the SA operation. However, because the proposed scheme is differential, it is more immune to supply noise (which is a common-mode signal) than a single-ended sensing scheme [6].
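The role of the sampled trip-point difference can be illustrated with a linearised toy model of the phase sequence. This is a behavioural sketch under strong assumptions, not the authors' circuit: the inverters are ideal amplifiers with gain `gain` around their trip points, `k` and `beta` are hypothetical capacitive coupling ratios, and the `cancel=False` branch is a hypothetical reference case in which the cross-coupled feedback sees the raw opposite output instead of its change since $\Phi_1$. The model does not capture the $C_z > C_f / A$ condition.

```python
def resolve(dv1, dv2, vtrip1, vtrip2, gain=10.0, k=0.5, beta=0.5, cancel=True):
    """Resolve a data bit from the input swings (toy linear model).

    dv1, dv2       : swings at 'in1' and 'in2' in volts
    vtrip1, vtrip2 : trip points of the two inverters in volts
    gain, k, beta  : hypothetical inverter gain and coupling ratios
    """
    # Phase 1: each inverter is self-biased, so input = output = trip point.
    # Phase 1B: feedback is opened and the input swings couple in through C_f.
    e1 = k * dv1                  # deviation of x1 from vtrip1
    e2 = k * dv2                  # deviation of x2 from vtrip2
    y1 = vtrip1 - gain * e1       # inverting amplification around the trip point
    y2 = vtrip2 - gain * e2

    # Phase 2e: the outputs are cross-coupled to the opposite inputs via C_z.
    if cancel:
        # C_z holds the trip-point difference sampled in phase 1, so only the
        # change of the opposite output reaches each input.
        fb1 = beta * (y2 - vtrip2)
        fb2 = beta * (y1 - vtrip1)
    else:
        # Hypothetical uncancelled case: the trip-point mismatch is fed back too.
        fb1 = beta * (y2 - vtrip1)
        fb2 = beta * (y1 - vtrip2)

    # The sign of the differential input deviation sets the latch direction.
    diff = (e1 + fb1) - (e2 + fb2)
    return 1 if diff < 0 else 0   # text: dV1 < dV2 corresponds to data '1'


# A 20 mV input with 100 mV of trip-point mismatch (illustrative values only).
print(resolve(0.0, 0.02, 0.45, 0.55, cancel=True))
print(resolve(0.0, 0.02, 0.45, 0.55, cancel=False))
```

In this model the cancelled decision depends only on the sign of $\Delta V_1 - \Delta V_2$, whereas the uncancelled case adds a term proportional to the trip-point mismatch that can overwhelm a small input.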

IV. SILICON IMPLEMENTATION AND MEASUREMENT RESULTS
The proposed SA was implemented in a 14 nm CMOS technology. The capacitors in the design ($C_f$, $C_z$) are realized using MOSFET devices to limit the technology cost while still achieving a compact footprint of 8 µm². To facilitate offset characterization, the resistive divider output voltages ($V_H$, $V_L$, and $V_O$) are provided externally through the pads as shown in Fig. 5. A Schmitt trigger and a timing control block are integrated on-die to relax the tester requirements. After the DC inputs are set, START is asserted, and the timer generates all the phase clocks that are necessary for SA operation as described in Section III. The SA output becomes available on the $D_{out}$ pin at the end of $\Phi_2$, and is held by the SA in the latch state until the $D_{out}$ value is recorded and START is deasserted for the next measurement.

The trip points of 71 SA instances were measured by stepping $V_O$ from $V_L$ (100 mV) to $V_H$ (150 mV) with a 2 mV resolution. Each instance was tested 16 times and the $V_O$ levels that can reliably produce '0' or '1' at the output were recorded. Fig. 6 shows the distributions of the trip voltage (defined as $V_O - 0.5 \times (V_L + V_H)$) for both low-to-high and high-to-low transitions. The low-to-high trip point varies with a (µ, σ) of (11.3 mV, 2.6 mV), whereas the distribution of the high-to-low trip voltage has a (µ, σ) of (-9.6 mV, 3.2 mV). All 71 SAs are able to reliably detect input signal levels beyond 20 mV in all 16 runs. We suspect that the data uncertainty within the 20 mV input range is partly caused by random AC noise events in our test environment, since the pre-silicon validation had confirmed the design to have a good tolerance to supply noise and low thermal noise. The AC noise is removed from the data by computing the average trip voltage of each SA from the 16 measurements. The DC offset is then defined as twice the difference between the measured average and ideal trip points.
Fig. 7 shows that the DC offset varies with a standard deviation of 1.9 mV, matching reasonably well to the expected value of 2.1 mV.

Fig. 7. Measured DC offset from 71 SAs (µ = 2.0 mV, σ = 1.9 mV). Each SA was tested 16 times and random noise was removed from data by computing the average trip point. DC offset is then defined as twice the difference between the measured average and ideal trip points.
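The DC offset post-processing described above can be sketched as follows. The measurement lists are hypothetical and are not data from the paper. The sketch also assumes that the ideal trip voltage is 0 mV, i.e. that the ideal trip point is the midpoint of $V_L$ and $V_H$, and it leaves open how low-to-high and high-to-low trip points are combined, since the text does not specify this.

```python
from statistics import mean, stdev

V_L_MV, V_H_MV = 100.0, 150.0   # reference levels quoted in the text (mV)


def trip_voltage(v_o_mv):
    """Trip voltage as defined in the text: V_O - 0.5 * (V_L + V_H)."""
    return v_o_mv - 0.5 * (V_L_MV + V_H_MV)


def dc_offset(trip_levels_mv, ideal_trip_mv=0.0):
    """DC offset of one SA from its repeated trip-level measurements.

    Averaging over the repeats suppresses random noise. The offset is twice
    the difference between the average and the ideal trip voltage.
    """
    avg = mean(trip_voltage(v) for v in trip_levels_mv)
    return 2.0 * (avg - ideal_trip_mv)


# Hypothetical V_O trip levels (mV) for three SA instances, four repeats each.
measurements = {
    "sa0": [126.0, 128.0, 124.0, 126.0],
    "sa1": [124.0, 124.0, 126.0, 122.0],
    "sa2": [126.0, 126.0, 124.0, 124.0],
}

offsets = [dc_offset(levels) for levels in measurements.values()]
print("per-SA DC offsets (mV):", offsets)
print("mean (mV):", mean(offsets), "sigma (mV):", stdev(offsets))
```

In an actual characterization flow each list would hold the 16 repeated measurements of one SA, and the statistics would be taken over all 71 instances.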
V. CONCLUSION
In this work we presented a four-phase differential sense amplifier (SA) with an offset cancellation technique for voltage-divider-based read sensing of resistive memories in a 14 nm CMOS process. The 71 tested sense amplifiers achieve correct operation with a 20 mV input and a DC offset σ of 1.9 mV. This shows that the SA can tolerate large variations from the memory array to achieve a high yield. In addition, due to the offset-cancellation technique, the SA can be designed using small-sized devices to achieve low area and power.
