Abstract: Spin-Torque Transfer RAM (STTRAM) is a promising candidate for last level cache due to its high density, high endurance and low leakage. However, STTRAM suffers from high write latency and high write current. Additionally, the latency and current depend on the polarity of the data being written. These factors introduce security vulnerabilities and expose the cache memory to side channel attacks (SCA). In this paper we propose an SCA model in which the adversary monitors the supply current of the memory array to partially identify the sensitive cache data being read or written. To mitigate the attack, we propose a suite of low-cost solutions: short retention STTRAM, obfuscation of the side channel using 1-bit parity and multi-bit random writes, and neutralization of the side channel using a constant current write driver. Our analysis reveals that 1-bit parity reduces the number of distinct write current states by 30% for a 32-bit word, and that the current signature is further obfuscated by multi-bit random writes. Constant current write makes it more challenging for the attacker to extract the entire word from a single supply current signature.
I. Introduction
Spin-Torque Transfer RAM (STTRAM) [1] is promising for Last Level Cache (LLC) due to numerous benefits such as high density, non-volatility, high speed, low power and CMOS compatibility. Fig. 1 shows the STTRAM cell schematic with a Magnetic Tunnel Junction (MTJ) as the storage element. The MTJ contains a free and a pinned (fixed) magnetic layer. The resistance of the MTJ stack is high (low) if the free layer magnetic orientation is anti-parallel (parallel) to that of the fixed layer. The MTJ can be toggled from parallel to anti-parallel (or vice versa) by injecting current from source-line to bitline (or vice versa). The data in the MTJ is stored in the form of magnetization: the stored value is '1' if the free layer magnetization is anti-parallel to the fixed layer magnetization and '0' if they are parallel. The read/write latency of the MTJ depends on the size of the device, the current passing through the layers, and process variation. STTRAM is also sensitive to ambient parameters such as magnetic field and temperature, which can be exploited to tamper with the stored data. The free layer of the MTJ flips under the influence of an external magnetic field, so an adversary can launch magnetic attacks using a horseshoe magnet or an electromagnet [2]. The switching of the MTJ also depends on the ambient temperature: at high temperature the MTJ resistance reduces, resulting in high read and write current [3]. The increased read current leads to read disturb failures, where bits are accidentally flipped during a read operation. Temperature can also be exploited to extend the persistence of the memory [11]. The persistent user data in a non-volatile cache can be compromised by launching unauthorized read and write operations and by probing the data buses after the authentic user has logged off. Persistent data leaving the cache can also be accessed by probing the data bus between the cache and main memory [4].
Traditional cache attacks can also be extended to STTRAM, such as (a) micro-probing, where conductors are attached to the chip surface directly to interfere with the integrated circuit; (b) radiation imprinting, where the contents are burned in using X-ray radiation to prevent overwriting or erasing of stored data; and (c) optical probing, where a laser is shone on the surface to activate the underlying circuit. The active components glow, which can then be used to interpret the stored data.
978-1-5090-3623-3/16/$31.00 ©2016 IEEE
In this work we investigate Simple Power Analysis (SPA) based SCA to decipher the contents of the STTRAM LLC by monitoring the current drawn from the supply during read and write operations. Because STTRAM has high write latency, high write current and asymmetric (polarity dependent) writes, it is vulnerable to SCAs that can compromise data privacy and integrity. The current in a circuit can be measured by inserting a small resistance in series with the Vdd or ground rail and measuring the voltage drop across it. Sophisticated devices can sample the voltages at high rates (1 GHz) with excellent accuracy (< 1% error) [5]. The system level illustration of the die and the regulated power supply is shown in Fig. 2. Although on-chip regulators have been investigated, their limited presence in ICs means that SPA-based attacks remain a non-trivial threat. Fig. 3 shows the variation of supply current when a 512b word is written into the LLC. To mimic the power signature of a processor core, we implement 15-, 17-, 19- and 21-stage ring oscillators and instantiate them 250 times. We note the change in the DC current level upon a bit-flip from all-0 to all-1, which is a direct indication of the value of the data being written. The data can be extracted more easily by forcing the CPU into idle mode.
II. STTRAM Vulnerabilities
A. Read/Write Latency: The write latency of STTRAM is a function of the thermal stability factor (Δt), which in turn depends on the retention time. For 10-year retention, Δt = 56 is required [13], which corresponds to a write latency of 0.67 ns at 1 V supply. Furthermore, STTRAM is susceptible to process variation (PV) [6], which randomly increases the thermal stability of bits, especially in larger arrays. Therefore, some bits suffer from excessively high read and write latencies. Fig. 4(a, b) shows the read and write latency distributions of a 40 nm × 40 nm × 4 nm STTRAM under PV. A 5000-point Monte Carlo simulation is performed and the data is extrapolated to 8 MB using extreme value theory in Matlab. The worst case write (read) latency is observed to be 1.3X (3.4X) the mean value. To avoid read and write failures, the worst case latency is applied to the entire memory array, which results in a longer wordline pulse. The longer read and write latency gives the adversary more opportunity to analyze the side channels and weaken data privacy (Section III).
B. Read/Write Current:
Another aspect of STTRAM is the high write current, which depends on the thermal stability, the retention time and the polarity of the stored data. We assume constant voltage write, which is commonly employed to simplify the write driver design [6]. STTRAM resistance is high (low) in state '1' ('0'). Fig. 5(a) shows the supply current waveform for a single-bit write of '1' when the previously stored value is '0'. Initially the current is high (STTRAM resistance low) and it goes low after a successful write. Fig. 5(b) shows the supply current waveform for a write of '0' when the previously stored value is '1'; in this case the current is initially low and goes high after a successful write. The high and low states of the current are very distinct and they reveal information about both the previous and the new data. The current difference between the states depends on the Tunnel Magneto Resistance (TMR) of STTRAM, which is given by (RH-RL)/RL. For robust read operation a higher TMR is desirable, but this adversely affects data privacy. The read current is comparatively lower than the write current (Fig. 5(c)), so read and write operations can be distinctly identified from the current waveforms. Source degeneration based read sensing is used in this work [7].
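The dependence of the current gap on TMR can be illustrated with a minimal sketch. It assumes an ideal constant voltage write in which the cell current is simply V/R, ignoring the access transistor and driver resistance. The function names, the 5 kΩ low-state resistance, the 1 V write voltage and the TMR values of 1.0 and 2.0 are illustrative assumptions, not parameters reported for the simulated cell.

```python
def write_current_levels(r_low_ohm, tmr, v_write):
    """Return (i_state0, i_state1) in amperes for a single cell.

    Assumption: ideal constant voltage write, I = V / R, with no access
    transistor or driver resistance in the path.
    """
    r_high_ohm = r_low_ohm * (1.0 + tmr)  # rearranged from TMR = (RH - RL) / RL
    i_state0 = v_write / r_low_ohm        # cell holds '0' (parallel, low R)
    i_state1 = v_write / r_high_ohm       # cell holds '1' (anti-parallel, high R)
    return i_state0, i_state1


def current_gap(r_low_ohm, tmr, v_write):
    """Difference between the two current levels seen on the supply."""
    i_state0, i_state1 = write_current_levels(r_low_ohm, tmr, v_write)
    return i_state0 - i_state1


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    for tmr in (1.0, 2.0):
        gap = current_gap(5e3, tmr, 1.0)
        print(f"TMR = {tmr:.1f}: gap = {gap * 1e6:.1f} uA")
```

Under these assumptions a larger TMR widens the gap between the two current levels, which is the trade-off between read robustness and data privacy described above.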
C. Temperature Sensitivity:
The thermal stability (Δt) of STTRAM is a function of ambient temperature, and the write current and write latency depend linearly on the thermal stability. The thermal stability is given by

$\Delta_t = \dfrac{H_k M_s V}{2 k_B T}$

where Hk = uniaxial anisotropy, Ms = saturation magnetization, V = free layer volume, kB = Boltzmann constant and T = absolute temperature. Colder temperature increases Δt, which in turn increases the write current and latency. Fig. 6 shows the write latency with respect to Δt values. The write latency increases with the increase in thermal barrier. This can be exploited by the adversary to strengthen the side channel signature from STTRAM.
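The temperature dependence can be sketched numerically, assuming the commonly used form Δ = Hk·Ms·V / (2·kB·T) and, as a simplification, that Hk, Ms and V do not themselves change with temperature. Under that assumption Δ scales as 1/T. The function name and the 300 K reference temperature for Δ = 56 are hypothetical choices made for illustration.

```python
def delta_at_temperature(delta_ref, t_ref_kelvin, t_kelvin):
    """Scale the thermal stability factor from a reference temperature.

    Assumption: Hk, Ms and V are temperature independent, so
    delta(T) = delta_ref * T_ref / T.
    """
    if t_ref_kelvin <= 0 or t_kelvin <= 0:
        raise ValueError("temperatures must be positive (kelvin)")
    return delta_ref * t_ref_kelvin / t_kelvin


if __name__ == "__main__":
    # Hypothetical reference point: delta = 56 at 300 K.
    for temp in (250.0, 300.0, 350.0):
        print(f"T = {temp:.0f} K -> delta = {delta_at_temperature(56.0, 300.0, temp):.1f}")
```

In this simplified model, cooling the device raises Δ, which is consistent with the longer write latency an adversary could induce by lowering the ambient temperature.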
III. STTRAM Attack Models
A. Exploiting Read/Write Current: The LLC contains sensitive data in raw form, such as logins, passwords and credit card details entered during a web transaction, and encryption keys used to encrypt data to be sent over the network. In current processor architectures all the user data processed by the CPU passes through cache memory. The adversary can steal the raw data or get clues about the data so that the correct data can be predicted in linear time. For an STTRAM LLC the adversary can perform SCA by monitoring the supply current waveform of the memory array. It is assumed that the adversary can monitor the current flowing into the memory array from the power supply. Even if the adversary only has access to the die-level power supply, it can still reveal the LLC side channel signature. Fig. 7 shows the write current waveforms for a 4-bit write operation in STTRAM. Out of 16 data values only 5 are unique in terms of the total number of 0's and 1's (1111, 0111, 0011, 0001, 0000). In a memory array all the bits in a word are written in parallel, so the order of 0's and 1's in a word does not affect the supply current waveform; rather, the overall number of 0's and 1's in the word defines the current signature. For 4 bits all 5 combinations are clearly distinct in the current waveform. Knowing the number of 0's and 1's weakens the security significantly, as it reduces the reverse engineering effort needed to identify the correct data.
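The grouping of data values by their number of 1's can be reproduced with a short sketch. It treats the current signature as a function of the count of 1's only, which is the idealization stated in the text; the function names are illustrative.

```python
from itertools import product
from math import comb


def weight_classes(n_bits):
    """Group all n-bit words by their number of 1's."""
    classes = {}
    for bits in product("01", repeat=n_bits):
        word = "".join(bits)
        classes.setdefault(word.count("1"), []).append(word)
    return classes


def candidates_given_weight(n_bits, ones):
    """Number of n-bit words that share a given count of 1's."""
    return comb(n_bits, ones)


if __name__ == "__main__":
    classes = weight_classes(4)
    for ones in sorted(classes):
        print(ones, classes[ones])
    print("classes:", len(classes))
```

For a 4-bit word the 16 values fall into five classes. An adversary who learns that a 4-bit word holds two 1's is left with 6 candidates instead of 16.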
B. Exploiting Read/Write Latency:
The high read and write latency provides a larger attack window to the adversary. By monitoring the current waveforms the adversary can not only predict the number of 0's and 1's in the new data being written but can also predict the previous data by sampling the current just after the wordline is asserted. The adversary samples the current during the attack window shown in Fig. 7. The difference between the current states of each combination depends on the TMR of STTRAM as discussed before: the higher the TMR, the further apart the current states. In Fig. 7 the write operation is completed in 800 ps, but to avoid write failures under PV the wordline is kept active for a longer duration. This gives the adversary more time to observe the transient current and gain confidence in the results. Thus, the data dependency of the current reveals the stored and new data, and the higher latency facilitates the attack. The figure also shows the attack window available to identify the old and new data. Note that a larger word size creates more states in the supply current signature; however, the difference between two consecutive states remains the same. Furthermore, a larger word size increases the total current, which makes the attack easier for the adversary. The adversary can intentionally increase the write latency by lowering the ambient temperature. The MTJ resistance increases at lower temperatures, which leads to less write current. The write latency is inversely related to the write current, so at lower temperatures the write latency increases, which gives the adversary more time to launch the attack.
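A minimal sketch of how the two samples in the attack window map to the old and new data is given below. It assumes an idealized constant voltage model in which the supply current is the sum of V/R over the cells of the word, with the resistance set by the old bit just after the wordline is asserted and by the new bit once switching completes. The function names and default values (1 V, 5 kΩ, 10 kΩ) are hypothetical.

```python
def word_current(word, v_write=1.0, r_low=5e3, r_high=10e3):
    """Total supply current for a word whose cells hold the bits in `word`."""
    return sum(v_write / (r_high if bit == "1" else r_low) for bit in word)


def ones_from_current(i_total, n_bits, v_write=1.0, r_low=5e3, r_high=10e3):
    """Invert the current model to recover the number of 1's in the word."""
    i_zero = v_write / r_low    # current contributed by a cell holding '0'
    i_one = v_write / r_high    # current contributed by a cell holding '1'
    return round((n_bits * i_zero - i_total) / (i_zero - i_one))


def attack_window_samples(old_word, new_word):
    """Return (ones in old data, ones in new data) as seen by the adversary."""
    n_bits = len(new_word)
    i_start = word_current(old_word)  # sampled just after wordline assertion
    i_end = word_current(new_word)    # sampled after switching completes
    return ones_from_current(i_start, n_bits), ones_from_current(i_end, n_bits)


if __name__ == "__main__":
    print(attack_window_samples("0011", "0111"))
```

In this model the spacing between adjacent current states is the same for every word size, while the total current grows with the number of bits, in line with the observations above.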
IV. Prevention Techniques
In this section we discuss preventive techniques to obfuscate the current signature and/or make the attack difficult or nearly impossible. Since the supply current signature is most prominent during write operations, we focus our efforts on obfuscating the write current signature.
A. Semi Non Volatile Memory (SNVM):
SNVM is a nonvolatile memory with a lower retention time. The typical retention time for STTRAM is 10 years; however, such a high retention time is not required for cache applications because the data is invalidated when the system restarts or the virtual address space is changed. Instead, the retention time can be lowered to improve the write latency and write current [8]. The write latency and write current (I) depend linearly on the thermal barrier (Δt) of STTRAM. The retention time (t) is exponentially related to Δt by $t = C \times e^{k\Delta_t}$, where C and k are fitting constants.
Both write latency and write current can be lowered by reducing Δt, which in turn lowers the retention time. Since Δt depends on the free layer volume of STTRAM, the volume can be scaled to lower the retention time (Fig. 9(a)). The lower write latency due to SNVM reduces the attack window as shown in Fig. 6. Lower write current brings the current states closer to each other, making it difficult to identify each state individually. However, simulations (Fig. 9(b)) show that at low temperatures the retention time increases dramatically, negating the benefits obtained from lower retention. Thus, SNVM cannot be used in isolation to prevent SCA.
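The exponential relation between retention time and Δt can be explored with a short sketch. It uses the relation t = C·exp(k·Δt) with one known operating point (Δt = 56 for 10-year retention, as stated in Section II), so that the fitting constant C cancels. The value k = 1 and the function names are assumptions made for illustration; the fitted constants are not given in the text.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def delta_for_retention(t_target_s, t_ref_s, delta_ref, k=1.0):
    """Thermal stability needed for a target retention time.

    From t = C * exp(k * delta):
        t_ref / t_target = exp(k * (delta_ref - delta)),
    so C drops out. k = 1.0 is an assumed value, not a fitted one.
    """
    if t_target_s <= 0 or t_ref_s <= 0 or k <= 0:
        raise ValueError("times and k must be positive")
    return delta_ref - math.log(t_ref_s / t_target_s) / k


def retention_for_delta(delta, t_ref_s, delta_ref, k=1.0):
    """Retention time (seconds) for a given delta, using the same reference point."""
    return t_ref_s * math.exp(k * (delta - delta_ref))


if __name__ == "__main__":
    ten_years = 10 * SECONDS_PER_YEAR
    needed = delta_for_retention(1.0, ten_years, 56.0)
    print(f"delta for 1 s retention (k = 1 assumed): {needed:.2f}")
```

Because the relation is exponential, a modest reduction in Δt shortens the retention time by many orders of magnitude, which is what allows a cache to trade retention for lower write latency and current.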
B. Adding 1-Bit Parity:
The objective of this prevention technique is to merge multiple supply current levels in the side channel current waveform, which makes it difficult for the adversary to predict the states accurately. This is achieved by writing an extra parity bit along with the original data. Fig. 10(a) shows the current waveform of a 4-bit write with 1-bit even parity. Instead of writing 4 bits we write 5 bits, with the last bit value decided by the parity of the 4 data bits. By doing this we are able to merge 5 states into 3 states (Fig. 10(a)). Compared to un-coded data, the reverse engineering effort increases because each observed state maps to a larger number of possible data values. The solution works on the principle that the overall write current depends on the number of 0's and 1's and not on their order. The extra 1-bit write makes some states identical to each other in terms of total 1's and 0's. For example, the un-coded 0111 becomes 01111, which merges with 1111 (written as 11110), since both contain four 1's. Fig. 10(b) shows the percent reduction in states with 1-bit parity for different word sizes. For a 32-bit word the number of states reduces by 30%. The reduction in states due to 1-bit parity goes down as the word size increases because the effect of the parity bit gets absorbed by the larger word size. For a 32-bit word the effect of a single bit is 1/32, whereas for a 256-bit word the effect reduces to 1/256. The reduction in states is maximum for a 16-bit word (70% reduction). Below 16 bits the reduction rate drops because there are not many states available to merge. For a 4-bit word there are 5 states, out of which 2 are merged by 1-bit parity. Therefore, the 1-bit parity mitigation technique works best for 16-32 bit word sizes. Note that the overhead associated with parity is negligible for practical word sizes. Furthermore, parity encoding is typically present in error correction code (ECC) protected memory arrays. Therefore, this technique is easily introduced in the design by reusing existing design features.
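The merging of states for the 4-bit case can be checked with a small sketch. It uses the idealization stated above, that the current signature depends only on the total number of 1's written, and appends an even parity bit at the end of the word. The function names are illustrative. This idealized count ignores analog effects and is not intended to reproduce the percentages in Fig. 10(b).

```python
def encode_even_parity(word):
    """Append an even parity bit so the coded word has an even number of 1's."""
    parity = word.count("1") % 2
    return word + str(parity)


def observable_weights(n_bits, with_parity):
    """Distinct counts of 1's the adversary can observe for n-bit data words."""
    weights = set()
    for ones in range(n_bits + 1):
        parity_bit = ones % 2 if with_parity else 0  # even parity adds a 1 to odd counts
        weights.add(ones + parity_bit)
    return sorted(weights)


if __name__ == "__main__":
    print(encode_even_parity("0111"), encode_even_parity("1111"))
    print("without parity:", observable_weights(4, with_parity=False))
    print("with parity:   ", observable_weights(4, with_parity=True))
```

Every odd count of 1's is pushed up to the next even count, so the five classes of a 4-bit word collapse to three, matching the 4-bit example in the text.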
C. Adding Random bits in Word:
The reduction in states with 1-bit parity diminishes as the word size increases. The signature of the current waveform at larger word sizes becomes difficult to interpret as the number of states increases. To further obfuscate the signature, we propose to add multiple random bits to the word during write. This technique further complicates and merges the states in the supply current signature. The results with the addition of 2, 3 and 4 random bits to the word are shown in Fig. 11. It can be observed that a larger number of extra random bits reduces the number of states substantially for larger word sizes. The random bits can be generated by employing a simple pseudo random number generator. For larger word sizes the overhead from a few extra bits is expected to be negligible.
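The ambiguity introduced by random padding can be sketched as follows. The example appends the random bits to the end of the word and again treats the observable signature as the total number of 1's written; both are simplifying assumptions, and the function names are hypothetical. Python's `random` module stands in for the simple pseudo random number generator mentioned above.

```python
import random


def pad_with_random_bits(word, n_random, rng):
    """Append n_random pseudo random bits to the data word before writing."""
    pad = "".join(rng.choice("01") for _ in range(n_random))
    return word + pad


def candidate_true_weights(observed_ones, n_bits, n_random):
    """Counts of 1's in the data word that are consistent with an observation.

    The padding contributes between 0 and n_random ones, so the true count
    lies in [observed - n_random, observed], clipped to [0, n_bits].
    """
    lowest = max(0, observed_ones - n_random)
    highest = min(n_bits, observed_ones)
    return list(range(lowest, highest + 1))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    data = "10110010"
    written = pad_with_random_bits(data, 3, rng)
    seen = written.count("1")
    print(written, "-> consistent data weights:", candidate_true_weights(seen, 8, 3))
```

Under this model a single observation no longer pins down the number of 1's in the data word: with r random bits, up to r + 1 different data weights are consistent with the same observed level.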
D. Constant Current Write:
As noted in Section II.B, the asymmetric, polarity dependent write current is a manifestation of constant voltage write. If we write both '1' and '0' with the same amount of current, there will be only one level in the current waveform and the write current will depend only on the word size. Constant current write can be achieved by using a current mirror with a voltage controlled current source (Fig. 12(a)). The two PMOS transistors form the current mirror, whereas the NMOS MC controls the current to be mirrored depending on the STTRAM resistance [9]. The bias voltage (VB) is adjusted to provide the initial read current in the main branch, which will pass through the STTRAM in the auxiliary branch. However, constant current write creates a mismatch in switching times between the '0' and '1' states (Fig. 12(b)). This affects the design of the wordline driver, but the adversary will have no clue about the data because the current remains constant throughout the write access.
Reducing power overhead of constant current write: To ensure functional correctness, the constant current approach injects the worst case write current for both polarities in order to homogenize the write current. This leads to power wastage while writing logic '1'. To address this issue it is possible to leverage the trade-off that exists between write current and error rate (as shown in Fig. 13). When the write current is lowered for both the '0'→'1' and '1'→'0' transitions, the write time of a certain number of bits may fall beyond the worst case latency. These bits contribute to the write error rate. By maintaining the write error rate under permissible levels, or by increasing the permissible write latency, it is possible to lower the power overhead of constant current write.
E. Increasing Word Size: The supply current waveform depends strongly on the number of bits that are read or written at once, i.e., the word size. With an increase in word size and under PV, the attack window for the adversary reduces. This affects the prediction accuracy and makes it more difficult for the adversary to correctly predict the number of 0's and 1's stored. Thus, increasing the word size during read/write can shrink the attack window for the adversary.
V. Discussions
In this section we discuss the applicability of the proposed attack model and countermeasures for various scenarios.
A. Impact of Scaling: With technology scaling the MTJ size reduces, which lowers the free layer thickness. The thermal stability (Δt) is linearly dependent on the free layer thickness and the retention time is exponentially related to Δt. Therefore, the write latency and write current of STTRAM are expected to scale down, making it more secure against power analysis attacks. The introduction of perpendicular magnetic anisotropy (PMA) STTRAM makes it even more challenging for the adversary to perform meaningful SCA due to its inherently lower write latency and write current.
Fig. 12 (a) Constant current write circuit [9]; and (b) write latency difference with constant current write (current in mA).
B. Impact of TMR: As described earlier, the TMR ratio determines the resistance difference between the two MTJ states: the larger the TMR, the greater the difference in resistance between the two states. For a good sense margin, a large TMR is always desired. However, this can prove detrimental from a security point of view because it allows a clearer distinction between the bits being written or read, thus improving the effectiveness of SPA.
C. Impact of Usage: Although an STTRAM LLC is considered in this paper, the proposed attack models are equally applicable to STTRAM main memory. The availability of a dedicated power supply makes it easy to probe the main memory active current. However, cryptographic keys cannot be revealed since the crypto operations are performed on chip. Nevertheless, the raw unencrypted sensitive data can be extracted.
D. Impact of Magnetic Tampering:
An external DC magnetic field that opposes the switching direction could be used to increase the switching time of the MTJ, which will increase the attack window for the adversary. Thus, with the help of a common horseshoe magnet the adversary can increase the write latency to facilitate attacks (especially for constant voltage write).
E. Cache Timing Attack:
In a shared computer the main memory and hard disk are protected against use by another user on the same machine, but the cache is not. If two users are working on the same machine, the malicious user can fill the entire cache with his own data and wait for the other user to perform secret operations such as encryption. The malicious user then measures the loading time to find which of his data has been replaced by the other user and thereby learns about the cache addresses used in encryption. This timing information can be exploited for key recovery of encryption algorithms like AES [14]. Since a larger cache can be afforded with STTRAM (due to the smaller footprint of the bitcell), the number of cache line replacements is expected to be lower, alleviating the cache timing attack. However, the persistence of data can be exploited to launch the attack at a later time to retrieve the sensitive information.
F. Other Side Channels: STTRAM resistance in the parallel and anti-parallel states is in the kΩ range (5-10 kΩ) and the write current is on the order of µA (100-150 µA). Thus, the IR drop will be on the order of mV, resulting in considerable droop in the supply voltage. The adversary can monitor the droops in supply voltage to identify write operations, and the amount of droop can reveal information about the data being written, much like the supply current.
G. Considerations for Other NVMs: Long/asymmetric write latency and high/asymmetric write current are common challenges for other NVMs such as Resistive RAM, Phase Change RAM and Domain Wall Memory. Therefore, the attack models presented in this paper are equally applicable to these emerging NVMs. Due to the generic nature of the solutions proposed in this paper, similar techniques could also be extended to other NVMs for mitigation.
VI. Conclusions
In this paper we showed that STTRAM read/write current, latency and asymmetry can be security vulnerabilities. We presented novel SCA models for STTRAM that compromise the sensitive data in the LLC. We also provided a suite of preventive countermeasures, namely constant current write, increased word size, SNVM, and parity bit encoding, to increase the reverse engineering effort required by the adversary to decipher the data from read and write current waveforms. The proposed techniques show significant promise in protecting against data privacy attacks and enabling secure NVM design. The solutions proposed in this paper could also be extended to other NVMs for attack mitigation.
