Magnetic skyrmions are particle-like spin structures with a whirling configuration and are promising candidates for spin-based memory. They offer attractive features, including remarkably high stability, ultra-low driving current density, and compact size. Because of their high stability and the low drive current needed to move them, skyrmions have great potential in energy-efficient spintronic device applications. We propose a skyrmion-based cache memory in which data are stored as multiple bits in a long nanotrack. A write operation (formation of a skyrmion) is achieved by injecting a spin-polarized current into a magnetic nanotrack; the skyrmion is subsequently shifted in either direction along the nanotrack by a charge current through a spin-Hall metal (SHM) underneath the magnetic layer. The presence of a skyrmion alters the resistance of a magnetic tunneling junction (MTJ) at the read port. Considering the read and write latency along the long nanotrack cache memory, we discuss a strategy of multiple read and write operations. In addition, the size of a skyrmion affects the packing density, the current-induced motion velocity, and the readability (the change of resistance when sensing the existence of a skyrmion). Design optimization to mitigate these effects is also investigated.
Introduction
Leakage current and process parameter variations significantly impact scaled CMOS (complementary metal-oxide semiconductor) devices. The need for non-volatility (zero off-state leakage), higher density, and robustness has consequently led researchers to explore alternative technologies to replace traditional CMOS-based memories. Several emerging technologies, such as phase change memory (PCM), resistive random-access memory (RRAM), spin-transfer torque magnetic RAM (STT-MRAM), and domain wall motion (DWM) based memory, have been proposed as potential substitutes for CMOS-based memories. One such promising high-density memory technology, domain wall motion based racetrack memory, was proposed by IBM 1. In such a racetrack memory, multiple data bits are coded in a sequence of up and down spin domains separated by domain walls, and the sequence is driven by spin-transfer torque. The DWM based TapeCache 2 has shown good performance improvement (with higher packing density and better energy efficiency) over other spintronic memory devices 3. However, the motion of a domain wall can be pinned by defects 4, and hence the feasibility of DWM based memory might be limited by imperfections in the material.

One attractive alternative is the magnetic skyrmion. As a magnetic storage element, the skyrmion has been shown to offer several benefits over domain wall motion based racetrack memory: higher density, lower power, and less sensitivity to material imperfections. Topological protection prevents moving skyrmions from being pinned at defect sites in the magnetic layer, and thus skyrmions are robust information carriers that are more resistant to pinning defects. Skyrmions can be observed in non-centrosymmetric bulk magnetic materials or in ultrathin magnetic systems with broken inversion symmetry and large spin-orbit coupling.
The state of a magnetic skyrmion can be explained by the presence of the Dzyaloshinskii-Moriya interaction (DMI) 5, 6. The DMI between two atomic spins $\mathbf{S}_1$ and $\mathbf{S}_2$ with a neighboring atom 7-12 can be expressed as

$$H_{DM} = -\mathbf{D}_{1,2} \cdot (\mathbf{S}_1 \times \mathbf{S}_2) \qquad (1)$$
where $\mathbf{D}_{1,2}$ is the Dzyaloshinskii-Moriya (DM) vector.

In this work, we propose a skyrmion based on-chip cache memory. The much higher density of the skyrmion based cache memory and its non-volatility are promising for last-level on-chip cache applications. Fig. 1 shows our proposed device structure of a multi-bit skyrmion cell. Magnetic skyrmions are stored in a ferromagnetic nanotrack adjacent to an SHM. The write port consists of an MTJ with two access transistors. A skyrmion can be nucleated in the nanotrack by injecting a spin-polarized current through the left MTJ (write MTJ) 3. The motion of skyrmions can be controlled by an in-plane current through the nanotrack, or by vertical injection of a spin current generated from a charge current flowing through the spin-Hall metal (SHM) layer 13, 14. It has been shown that skyrmions driven by current flowing through the SHM layer reach higher velocities at lower current densities 3. Consequently, we use a charge current through the SHM layer to move the skyrmions along the current flow direction. The shift ports at both sides of the SHM each consist of an access transistor. Stored data can be moved in either direction by turning ON the access transistors and properly biasing BL and SL.

The presence of a skyrmion can be detected by sensing the change of tunneling conductance of the right MTJ (read MTJ) with respect to a reference MTJ, using a simple CMOS inverter. The read port includes an MTJ in series with a reference MTJ and two access transistors. The presence of a skyrmion under the read port alters the resistance of the right MTJ. This resistance change can be sensed through the voltage difference between the read MTJ and the reference MTJ. Note, however, that the motion of skyrmions might bend away from the intended direction due to the Magnus force 15. Also, the repulsive force from neighboring skyrmions might cause the distance between data bits to become inconsistent.
To correctly sense the stored bits, it is necessary to move the target skyrmion bit to a position directly underneath the read head. We investigate robust operation of the memory by ensuring proper motion of skyrmions along the center of the nanotrack, and we alleviate the influence of the repulsive force from neighboring skyrmions through reliable spacing between consecutive bits. We also investigate design optimization. The size of the skyrmions influences the packing density of the proposed cache memory, the current-induced motion along the nanotrack, and the resistance change when detecting the presence of a skyrmion. The skyrmion size can be tuned by intrinsic parameters (such as the Dzyaloshinskii-Moriya interaction strength and the perpendicular magnetic anisotropy constant) and also by the width of the nanotrack.

Figure 1. Schematic of the device. The write/shift/read operations can be performed by the proposed device structure. A skyrmion can be nucleated in the nanotrack (yellow layer) by injecting a spin-polarized current through the left MTJ. The motion of skyrmions can be driven by vertical injection of a spin current generated from a charge current flowing through the spin-Hall metal layer (blue layer). The presence of a skyrmion can be detected by sensing the change of tunneling conductance of the right MTJ with respect to a reference MTJ. Here, the existence of a skyrmion represents logic "1", while its absence denotes logic "0".
Results

Operation of multi-bit skyrmion cell
In an ultrathin ferromagnetic nanotrack with strong spin-orbit coupling (SOC) and broken inversion symmetry, the large DMI at the interface between the nanotrack and the SHM stabilizes the skyrmion. Fig. 2 shows the logical representation of stored data along the nanotrack. Depending on the existence of a skyrmion, different logic values can be stored along the nanotrack as multiple bits. The existence of a skyrmion represents logic "1", while its absence denotes logic "0" at the corresponding address. A current injected into the SHM (blue layer) from the right shifts the skyrmions in the nanotrack to the right, and vice versa. The device structure of a word with a single port is shown in Fig. 2(a), and with multiple write/read ports in Fig. 2(b) and 2(c). Note that the write and read MTJs can be placed at any address location.

As an example, consider Fig. 2(a), which shows a write MTJ at address "0x0" and a read MTJ at address "0x7", and suppose we write a sequence of 0's and 1's. During the first write cycle, "0" is written into address "0x0" and subsequently shifted right to the next address, "0x1". "1" is written into address "0x0" during the next write cycle, and then the stored data in the nanotrack are shifted right to the next address. By repeatedly writing data into address "0x0" and shifting all stored data right to the next address, a sequence of bits can be stored in the nanotrack. As shown in Fig. 2(a), the read MTJ is located at address "0x7". To read the stored data at, say, address "0x4", the bit is shifted right by three positions to the address under the read MTJ. To prevent stored data in the nanotrack from being destroyed during the shift operation, we extend the nanotrack with extra bit positions. In the worst case, to read the stored data at address "0x0", the bit must be shifted right by seven positions before it can be sensed. Note that the worst-case write/read latency is therefore determined by the length of the word.
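The write-then-shift and shift-then-read sequence described above can be illustrated with a small logical model. This is only a sketch under stated assumptions: the class name `SkyrmionTrack`, its methods, and the amount of padding are hypothetical choices made for illustration, and the model tracks only bit positions (1 = skyrmion, 0 = no skyrmion), not the magnetization dynamics. It assumes the single-port layout of Fig. 2(a), with the write port at "0x0" and the read port at "0x7".

```python
class SkyrmionTrack:
    """Toy logical model of one multi-bit word (hypothetical helper, not the authors' code)."""

    def __init__(self, word_len=8, write_addr=0, read_addr=7):
        self.write_addr = write_addr
        self.read_addr = read_addr
        # Extra positions to the right of the word (assumed padding) so that
        # shifting a bit to the read port never pushes stored bits off the track.
        self.slots = [0] * (word_len + read_addr)
        self.shift_count = 0

    def shift_right(self):
        self.slots = [0] + self.slots[:-1]
        self.shift_count += 1

    def shift_left(self):
        self.slots = self.slots[1:] + [0]
        self.shift_count += 1

    def write_word(self, bits):
        # Write at the write port, shifting the stored data right between writes.
        for i, bit in enumerate(bits):
            if i > 0:
                self.shift_right()
            self.slots[self.write_addr] = bit

    def read(self, addr):
        # Shift the target bit right until it sits under the read port,
        # sense it, then shift back to restore the original layout.
        if addr > self.read_addr:
            raise ValueError("this sketch only shifts right toward the read port")
        n = self.read_addr - addr
        for _ in range(n):
            self.shift_right()
        value = self.slots[self.read_addr]
        for _ in range(n):
            self.shift_left()
        return value, n


track = SkyrmionTrack()
track.write_word([0, 1, 1, 0, 1, 0, 0, 1])   # first bit written ends up at 0x7
value, shifts = track.read(4)                # three right shifts, as in the text
```

In this model the first bit written ends at address "0x7" and the last at "0x0", so reading address "0x0" takes seven right shifts, matching the worst case described in the text.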
However, the write/read latency can be reduced by introducing multiple write/read ports. In the case of Fig. 2(b), the write and read MTJs are shared, i.e. each MTJ can be used to perform a write or a read operation depending on the bias voltage. The write/read MTJs are located at addresses "0x0" and "0x4", and thus multiple writes and reads can be performed at once. Data can be written into these two addresses simultaneously, followed by a shift-right operation. Consequently, the time required to write eight bits into a word is reduced by half compared to Fig. 2(a). The write/read latency can be improved further by adding more write/read MTJ ports. Using the same write strategy, the time required to write a word is again reduced by half in Fig. 2(c) compared to Fig. 2(b). Note that additional read ports, as in Fig. 2(b) and 2(c), also improve read latency.

Figure 2. Logic view of the multi-bit skyrmion based cache memory allowing simultaneous read and write.

The array organization of a skyrmion based cache memory is shown in Fig. 3. The SWLs and R/WWLs are shared among all the multi-bit cells placed in a row, and the BLs and SLs can be shared among all the multi-bit cells placed in a column. Since the read/write MTJ is shared, the write and read operations are performed by turning on the access transistors (R/WWL) of the read/write MTJ and precharging the BL to V_WRITE and V_READ, respectively. In addition, the shift operation is performed by precharging the BL to V_SHIFT and the SL to GND. In this architecture, multiple words can be placed in the same row and accessed independently. Note that the decoder is used to select a multi-bit skyrmion cell in the array and the sense amplifier is used to resolve the output signal to logic zero or one. Table 1 lists the bias voltage conditions (along with device dimensions) for write/shift/read/idle operations.
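The halving of write time with each doubling of ports can be checked with a short sketch. The function names below are hypothetical, and the sketch assumes evenly spaced shared ports that all write in the same cycle before a common shift-right; the four-port case (ports at "0x0", "0x2", "0x4", "0x6") is an assumption about Fig. 2(c), inferred only from the statement that the write time is halved again.

```python
def port_addresses(word_len, n_ports):
    """Evenly spaced port addresses (assumed layout), e.g. [0, 4] for two ports."""
    if word_len % n_ports:
        raise ValueError("word length must be a multiple of the port count")
    seg = word_len // n_ports
    return [i * seg for i in range(n_ports)]


def parallel_write(bits, n_ports):
    """Write a word through n_ports at once; return (track contents, write cycles)."""
    word_len = len(bits)
    ports = port_addresses(word_len, n_ports)
    seg = word_len // n_ports
    # Each port is responsible for one contiguous chunk of the input bits.
    chunks = [bits[i * seg:(i + 1) * seg] for i in range(n_ports)]
    slots = [0] * word_len
    for cycle in range(seg):
        if cycle > 0:
            slots = [0] + slots[:-1]          # common shift-right of all bits
        for port, chunk in zip(ports, chunks):
            slots[port] = chunk[cycle]        # simultaneous write at every port
    return slots, seg


word = [1, 0, 1, 1, 0, 0, 1, 0]
for n in (1, 2, 4):
    contents, cycles = parallel_write(word, n)
    print(n, "port(s):", cycles, "write cycles")
```

Under these assumptions an eight-bit word takes eight, four, and two write cycles with one, two, and four ports respectively.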
A skyrmion can be nucleated (logic ONE) by injecting a spin-polarized current through the write MTJ of diameter 20 nm with a current density of 2 × 10¹³ A/m² for 0.5 ns. The write "1" (logic ONE) operation is performed by turning ON the write access transistors and precharging BL to V_WRITE and SL to GND. The idle operation is used to stabilize the magnetization of the nanotrack and is achieved by turning OFF all access transistors. We represent the absence of a skyrmion as logic ZERO, so writing logic ZERO requires only a shift operation. To access the logic value stored in a particular bit, shift operations move the target data to the position underneath the read MTJ. Shifting stored data in a multi-bit skyrmion cell is accomplished by turning ON the access transistors of the shift ports and precharging BL and SL to the appropriate operating voltages. To shift the stored data right, we precharge the BL to V_SHIFT and the SL to GND. To shift the stored data in the opposite direction, the bias voltages of BL and SL are reversed.

The read operation is performed by turning ON the access transistors of the read port and driving BL to V_READ and SL to GND. The presence or absence of a skyrmion is detected by sensing the change of tunneling conductance of the read MTJ of diameter 20 nm. The magnetoresistance ratio of the read MTJ is 200%. However, since the average magnetization of a skyrmion is not fully anti-parallel to the fixed layer, the magnetoresistance change actually obtained is smaller. This change is directly proportional to the diameter of the skyrmion and inversely proportional to the cross-sectional area of the MTJ. Thus, we chose the size of the read MTJ to be similar to the dimension of the skyrmions in the nanotrack for better sensing. For accurate detection, the skyrmion should be located near the center of the read MTJ.
However, the motion of a skyrmion might deviate from the center region of the nanotrack during the shift operation due to the Magnus force (explained in the next section) 15. This deviation can be mitigated by an idle operation before the read: the skyrmions are allowed to relax and move back to the center of the track through edge repulsion. Note that the idle time required to move the skyrmions back to the center position is related to their deviation from the center region, which depends on the drive current density.
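The bias conditions described in the text for the write, shift, read, and idle operations can be collected in a small lookup structure. The sketch below is illustrative only: the names `Bias`, `BIAS`, and `op_sequence` are hypothetical, voltages are kept symbolic because their values are not stated in this section, and the BL/SL levels during idle are marked as undriven because the text only says that all access transistors are OFF.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bias:
    port_on: Optional[str]   # access transistors turned ON: "write", "shift", "read", or None
    bl: Optional[str]        # symbolic bit-line level, None = not specified in the text
    sl: Optional[str]        # symbolic source-line level, None = not specified in the text


BIAS = {
    "write_1":     Bias("write", "V_WRITE", "GND"),
    "shift_right": Bias("shift", "V_SHIFT", "GND"),
    "shift_left":  Bias("shift", "GND", "V_SHIFT"),   # BL and SL reversed
    "read":        Bias("read", "V_READ", "GND"),
    "idle":        Bias(None, None, None),            # all access transistors OFF
}


def op_sequence(bit: int) -> List[str]:
    """Operations to store one bit and advance the track by one position.

    Logic ONE nucleates a skyrmion and lets it relax; logic ZERO needs only
    the shift. Every shift is followed by an idle step so the skyrmions can
    return to the track center.
    """
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    nucleate = ["write_1", "idle"] if bit == 1 else []
    return nucleate + ["shift_right", "idle"]


for name in op_sequence(1):
    print(name, BIAS[name])
```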
Table 1. Bias voltage conditions for various operations
Motion of skyrmion
In our proposed device structure, the skyrmion is driven by a vertical spin current generated from the charge current flowing through the SHM underlayer. The motion of the skyrmion is well described by Thiele's equation 15,
where α is the Gilbert damping constant, $\mathbf{G}$ is the gyromagnetic coupling vector, $\mathcal{D}$ is the dissipative force tensor, $\mathbf{v}_d$ is the drift velocity of a skyrmion, and $\mathbf{j}_{spin}$ is the spin current induced by the charge current flowing through the SHM. The first term of eqn. (2) relates to the Magnus force caused by the interaction between the conduction electrons and the local magnetization. The longitudinal and transverse velocities can be written as

Figure 4 shows the trajectory of a skyrmion moving from the initial position to the next address position under a charge current density flowing through the SHM. Since a skyrmion acquires a transverse velocity if $\mathbf{G} \neq 0$, its motion bends away from the drive current direction. Hence, a shift operation followed by an idle operation is required. The skyrmion is shifted right for 1 ns, followed by one cycle of idle operation (0.8 ns). In our simulations, an idle time of 0.8 ns is required for the skyrmion to move back to the center along the width of the nanotrack. During the shift operation, the transverse motion of the skyrmion stops at a certain distance from the bottom edge due to the skyrmion-edge interaction. The distance from the bottom edge decreases as the SHM current density increases. Consequently, skyrmions are not annihilated at the edges unless a larger charge current density is applied along the SHM layer. In Fig. 4, under a drive current density of 2.22 × 10¹¹ A/m², the skyrmion stops at 23 nm from the bottom edge. Fig. 5 shows the annihilation process of a skyrmion during a shift operation in the nanotrack. Our simulation results show that the skyrmion annihilates within 1 ns if the drive current density through the SHM is larger than 4.44 × 10¹¹ A/m², which is twice the operating current density used to shift a skyrmion to the next address position (Fig. 4). As a result, for the shift operation, the SHM current density is kept well below the annihilation level.

Skyrmion packing density and device optimization

To ensure reliable read and write operations, the distance between consecutive skyrmion bits has to be consistent after each cycle of shift and idle operations. Furthermore, previously stored skyrmion bits should not be influenced by a newly nucleated skyrmion during the nucleation operation.
Consequently, reliable spacing between consecutive bits is required to alleviate the influence of repulsive forces from neighboring skyrmions. The material parameters used in our simulation were taken from ref. 3 and are shown in Table 1. In Fig. 6, a skyrmion is shifted rightward by 48 nm and 75 nm under SHM current densities of 1.38 × 10¹¹ A/m² and 2.22 × 10¹¹ A/m², respectively, applied for 1 ns and followed by 0.8 ns of relaxation time in each case. When the spacing between skyrmions is 48 nm (Fig. 6(a)-(d)), the stored skyrmion bit experiences a repulsive force from the newly nucleated skyrmion, and thus the existing skyrmion moves 20 nm to the right when a new skyrmion is nucleated. However, when the spacing between skyrmions is increased to 75 nm (Fig. 6(a')-(d')), the stored skyrmion bit is not affected, since the spacing is large enough to prevent any repulsive interaction with the newly nucleated skyrmion. Hence, a spacing of at least 75 nm between skyrmion bits is required (for the device dimensions and material parameters given in Table 1) for reliable operation. It has been proposed that the size of the skyrmions affects the reliable spacing between consecutive bits. The repulsive force between two skyrmions can be described by
where $d$ is the distance between the centers of the two skyrmions, $K_1$ is the modified Bessel function, $H_K$ is the perpendicular anisotropy, $A$ is the exchange stiffness, and $t$ is the thickness of the nanotrack.

The size of a skyrmion affects the tunneling conductance of the read MTJ, the current required to move the skyrmion, and the packing density of the proposed cache design. The tunneling conductance of the read MTJ increases with increasing skyrmion radius, and thus a larger read voltage swing can be obtained between the read and the reference MTJ. It has been reported that the skyrmion velocity depends linearly on the size of the skyrmion 16. Larger skyrmions couple more strongly to the spin-orbit torque, and thus faster motion can be induced. Therefore, a larger skyrmion can be shifted with a lower drive current density. However, the packing density decreases, since the reliable spacing between two consecutive skyrmions increases for larger skyrmions 17.

The size of a skyrmion can be tuned by an extrinsic parameter such as an external magnetic field, by intrinsic parameters (such as the DMI strength and the PMA constant), or by changing the width of the nanotrack. Applying an external magnetic field in the same direction as the polarization vector at the center of the skyrmion yields a larger skyrmion. In contrast, a skyrmion shrinks if the direction of the external magnetic field is opposite to the polarization vector at the skyrmion's center. Similarly, the skyrmion size can be enlarged by increasing the DMI strength or by reducing the PMA constant. Also, widening the nanotrack can result in a larger skyrmion because of weaker edge confinement.
Discussion
We have proposed a skyrmion based cache memory in a 0.4 nm thick nanotrack placed on top of a 700 × 60 × 3 nm³ SHM. An MTJ of diameter 20 nm, similar to the size of the skyrmions in the nanotrack, has been used in the simulations. Fig. 7 shows the simulation results after several cycles of nucleation, shift, and idle operations for our proposed cache memory. By injecting a spin-polarized current density of 2 × 10¹³ A/m² through the write MTJ for 0.5 ns, followed by 0.5 ns of idle time, a stable skyrmion can be nucleated (Fig. 7(a)). Subsequently, an SHM current density of 2.22 × 10¹¹ A/m² is used to shift the existing skyrmion rightward to the next position, which is 75 nm from the initial position (Fig. 7(b)). Depending on the bias conditions, skyrmions in the nanotrack can be shifted in either direction. To write a sequence of data bits into the nanotrack, we shift the skyrmions right after each nucleation.

Note that the drive current density should be lower than the current density required to annihilate skyrmions at the edge (4.44 × 10¹¹ A/m² in our simulation). However, it should be at least the minimum current density (2.22 × 10¹¹ A/m² in our simulation) needed to move each skyrmion at the same speed. In other words, the distance by which the drive current shifts the existing skyrmion from the nucleation position to the next position has to be large enough to mitigate the repulsive force from neighboring skyrmions. In our simulation, we find that a spacing of at least 75 nm ensures that skyrmions do not experience a repulsive force from their neighbors. Hence, each skyrmion moves at a consistent speed under the injection of a vertical spin-polarized current (as shown in Fig. 7(d), 7(f), 7(h)). Also, as shown in Fig. 7(c), 7(e), 7(g), the already existing skyrmions are not affected by the newly nucleated skyrmion. The read operation can be performed after shifting the target storage bit beneath the read MTJ and letting it relax.
From our simulations, we found that a voltage swing of 0.108 V can be sensed between the read and the reference MTJ by pulling the BL voltage up to 0.8 V and the SL to GND.
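The operating window and timing quoted above can be summarized in a few lines of arithmetic. The sketch below uses only numbers stated in the text (1 ns shift pulse, 0.8 ns idle, 2.22 × 10¹¹ A/m² minimum drive, 4.44 × 10¹¹ A/m² annihilation threshold, 48 nm and 75 nm displacements); the function and constant names are hypothetical, and the access-time estimate assumes that every one-position shift is followed by one full idle period.

```python
SHIFT_NS = 1.0            # shift pulse duration
IDLE_NS = 0.8             # relaxation time after each shift
J_SHIFT_MIN = 2.22e11     # A/m^2, drive needed to reach the 75 nm bit spacing in 1 ns
J_ANNIHILATE = 4.44e11    # A/m^2, above this the skyrmion annihilates at the edge


def drive_current_ok(j_shm):
    """True if the SHM current density lies inside the reported operating window."""
    return J_SHIFT_MIN <= j_shm < J_ANNIHILATE


def mean_velocity_m_per_s(distance_nm, pulse_ns):
    """Average velocity during the pulse; nm/ns is numerically equal to m/s."""
    return distance_nm / pulse_ns


def access_time_ns(n_shifts):
    """Time to move a bit n_shifts positions, one idle period per shift (assumption)."""
    return n_shifts * (SHIFT_NS + IDLE_NS)


print(drive_current_ok(2.22e11), drive_current_ok(1.38e11))
print(mean_velocity_m_per_s(75, SHIFT_NS), mean_velocity_m_per_s(48, SHIFT_NS))
print(access_time_ns(7))
```

With these figures the two simulated drive levels correspond to average velocities of 48 m/s and 75 m/s, and the worst-case seven-position shift of an eight-bit word takes roughly 12.6 ns before the read itself.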
Compared with DW based cache memory, the current density required to move skyrmions is much lower than that required to move a domain wall. Also, the flexibility of a skyrmion makes it less prone to pinning by defects than a domain wall, making skyrmion based memory more reliable. Since the average magnetization of a skyrmion is not fully anti-parallel to the fixed layer, the detected magnetoresistance change is smaller than in the DW based case. Thus, the size of the read MTJ is chosen to be similar to the dimension of the skyrmions in order to obtain a higher magnetoresistance change.
Methods
Micromagnetic modeling. The micromagnetic simulations are performed using the graphics-processing-unit-based tool MuMax3. The magnetization dynamics of magnetic skyrmions driven by a vertical current can be expressed by
where $\mathbf{m}$ is the normalized magnetization vector, $\mathbf{m}_p$ is the fixed layer polarization, γ is the gyromagnetic ratio, α is the Gilbert damping parameter, $\mathbf{H}_{eff}$ is the effective field, $j_z$ is the current density along the z axis, $M_{sat}$ is the saturation magnetization, $e$ is the elementary charge, $d$ is the skyrmion layer thickness, $P$ is the polarization of the conduction electrons, the Slonczewski parameter Λ characterizes the spacer layer, and ε is the secondary spin transfer term. The material parameters used in our simulations correspond to Co/Pt multilayers and are shown in Table 2. We consider a 0.4 nm thick Co nanotrack with perpendicular magnetic anisotropy on a 3 nm Pt substrate inducing DMI. The sample is discretized into elements of size 1 × 1 × 0.4 nm³.
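For orientation, the symbols defined above match the Slonczewski spin-transfer torque term as it is written in the MuMax3 documentation, where it is added to the Landau-Lifshitz torque. The block below restates that documented form as a reference sketch; whether the authors' equation is written in exactly this way is an assumption, and the primed symbol ε′ is taken here to be the secondary spin transfer term.

```latex
% Slonczewski torque in the form documented for MuMax3 (assumed to correspond
% to the symbols defined in the text; not copied from the authors' equation).
\begin{align}
  \boldsymbol{\tau}_{SL} &= \beta\,\frac{\epsilon - \alpha\epsilon'}{1+\alpha^{2}}
      \,\bigl(\mathbf{m}\times(\mathbf{m}_p\times\mathbf{m})\bigr)
      \;-\; \beta\,\frac{\epsilon' - \alpha\epsilon}{1+\alpha^{2}}
      \,\bigl(\mathbf{m}\times\mathbf{m}_p\bigr), \\
  \beta &= \frac{j_z\,\hbar}{M_{sat}\,e\,d}, \\
  \epsilon &= \frac{P\,\Lambda^{2}}
      {(\Lambda^{2}+1) + (\Lambda^{2}-1)\,(\mathbf{m}\cdot\mathbf{m}_p)} .
\end{align}
```

In this form the torque scales with $j_z/(M_{sat}\,d)$, so a thinner magnetic layer receives a larger torque for the same injected current density.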
A Non-equilibrium Green's Function (NEGF) based spin transport simulation has been used to obtain the resistance of the MTJ. The charge current ($I_e$) flowing through the SHM and the corresponding spin current ($I_s$) are related by 18

$$I_s = \theta_{sh}\,\frac{A_{MTJ}}{A_{SHM}}\,I_e \qquad (6)$$

where $A_{MTJ}$ and $A_{SHM}$ are the cross-sectional areas of the MTJ and the SHM, respectively, and $\theta_{sh}$ is the spin-Hall angle. The spin current from eqn. (6)

Table 2. Material parameters used for simulation
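As a worked illustration of eqn. (6), the sketch below converts an SHM charge current into a spin current. Several inputs are assumptions rather than values taken from the paper: the spin-Hall angle of 0.1 is a placeholder, since the material parameters of Table 2 are not reproduced here, and the SHM cross section is assumed to be the 60 nm × 3 nm face of the 700 × 60 × 3 nm³ layer, with the MTJ area taken as a disc of diameter 20 nm. The function name `spin_current` is hypothetical.

```python
import math


def spin_current(i_e, theta_sh, a_mtj, a_shm):
    """Spin current from eqn. (6): I_s = theta_sh * (A_MTJ / A_SHM) * I_e."""
    return theta_sh * (a_mtj / a_shm) * i_e


# Geometry (SI units). The choice of SHM cross section is an assumption.
a_mtj = math.pi * (10e-9) ** 2      # disc of diameter 20 nm
a_shm = 60e-9 * 3e-9                # width x thickness of the SHM layer

# Charge current for the reported shift current density of 2.22e11 A/m^2.
j_shm = 2.22e11
i_e = j_shm * a_shm

theta_sh = 0.1                      # placeholder spin-Hall angle (not from the paper)
i_s = spin_current(i_e, theta_sh, a_mtj, a_shm)
print(f"I_e = {i_e:.3e} A, I_s = {i_s:.3e} A")
```

Under these assumptions the area ratio is about 1.75, so the spin current is of the same order as the spin-Hall angle times the charge current.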
