Instant-on nonvolatile electronics, which can be powered on/off instantaneously without the loss of information, represents a new and emerging paradigm in electronics. Nonvolatile circuits consisting of volatile CMOS, combined with nonvolatile nanoscale magnetic memory, can make electronics nonvolatile at the gate, circuit and system levels. When high speed magnetic memory is embedded in CMOS logic circuits, it may help resolve the two major challenges faced in continuing CMOS scaling: power dissipation and variability of devices. We will give a brief overview of the current challenges of CMOS in terms of energy dissipation and variability. Then, we describe emerging nonvolatile memory (NVM) options, particularly spintronic solutions such as magnetoresistive random access memory (MRAM) based on spin transfer torque (STT) and voltage-controlled magnetoelectric (ME) write mechanisms. We will then discuss the use of STT memory for embedded applications, e.g., replacing volatile CMOS Static RAM (SRAM), followed by a discussion of the integration of CMOS reconfigurable circuits with STT-RAM. We will then present the scaling limits of the STT memory and discuss its critical performance parameters, particularly those related to switching energy. To further reduce the switching energy, we present the concept of electric-field control of magnetism, and discuss approaches to realize this new mechanism for low switching energy, allowing for implementation of nonvolatility at the logic gate level, and eventually at the transistor level with a magnetoelectric gate (MeGate). For nonvolatile logic (NVL), we present and discuss as an example an approach using interference of spin waves, which will have NVL operations remembering the state of computation. Finally, we will discuss the potential impact and implications of this new paradigm on low energy dissipation instant-on nonvolatile systems.
Introduction
The trend of the scaling of the CMOS feature size has followed Moore's Law for more than five decades. 1 This continuing CMOS scaling currently faces a major challenge due to increasing energy dissipation per unit area, among others. 2 The increase in power dissipation results from the increase of static leakage power as well as the continued increase of density as the feature size is scaled down. 3 The former is a consequence of the fact that power needs to be continuously applied to CMOS circuits in order for them to retain their information; that is, CMOS circuits are volatile in keeping their states after computation or logic operations. In addition, the dynamic switching energy per unit area has also been increasing continuously due to the increase of device density. This increase is a result of the fact that the dynamic power dissipation density per switching is a function of the number of electrons, since carriers in the device are independent. 4 The quest for novel low dissipation devices, circuits and systems is thus one of the most critical for the future of semiconductor technology and nanosystems. Today, at the system level, the information of a system is stored in two manners: (i) temporarily in Static Random Access Memory (SRAM) using CMOS circuits as embedded memory, and Dynamic Random Access Memory (DRAM) as a principal working memory; and (ii) permanently in the hard disk when the system is powered down. This memory structure arose from the lack of fast, energy-efficient, cost-effective, and high density nonvolatile memory (NVM). Spintronic devices, i.e., those utilizing the fundamental exchange interactions of electron spins, may offer very high speed (<1 ns) as illustrated in Table 1, which compares them with other memories in terms of their key parameters such as speed, energy, density, and endurance.
Table 2 gives the principal interaction physics, and their characteristics, from which different memory devices are derived. Generally speaking, there are two types of spintronic devices: one uses the manipulation of single or a few spins (as in the case of quantum information) and the other uses collective spins (as in the case of nanomagnetics). For this paper, we will focus on the latter, as single spin electronics requires operation at low temperature due to the fact that the Zeeman energy of a single spin is on the order of 0.058 meV at 1 Tesla, much smaller than $kT$ at room temperature, and consequently the state cannot be maintained for a sufficient time at room temperature. In addition, for CMOS, in which electrons are not correlated, the switching (dynamic) energy is defined by the Maxwell, Landauer, and Shannon limit and will be still larger than $NkT\ln 2$, 4 where $N$ is the number of electrons for a switch. In contrast, the collective spins of a magnet can be treated as a single element when considering the switching energy, i.e., $kT\ln 2$, indicating $N$ times saving in the fundamental energy limit. 17 Clearly, as the scaling continues, the switching power dissipation per unit area will continue to increase as the density increases. Furthermore, we will only discuss material systems using metallic materials for magnetism, and will not address the topic using semiconductors, such as dilute magnetic semiconductors. 5, 6 The reason for this choice is that with metallic systems, the fabrication process may be easily integrated with CMOS since they can be implemented in the back end of line processing (metallization). 7 The modulation of magnetic moments with electric voltage and current in magnetic nanostructures offers an exceptionally promising set of candidates for fast nonvolatile applications (see Table 2).
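The energy scales quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a temperature of 300 K for "room temperature" and taking the single-spin energy scale as $\mu_B B$; the function names are illustrative and not from the paper.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # joules per electron-volt
MU_B = 9.2740100783e-24     # Bohr magneton, J/T


def zeeman_energy_mev(b_tesla: float) -> float:
    """Single-spin energy scale mu_B * B, expressed in meV."""
    return MU_B * b_tesla / E_CHARGE * 1e3


def thermal_energy_mev(temp_k: float) -> float:
    """Thermal energy kT, expressed in meV."""
    return K_B * temp_k / E_CHARGE * 1e3


def min_switching_energy_j(temp_k: float, n_independent: int = 1) -> float:
    """Lower bound N * kT * ln(2) for N independent carriers, in joules."""
    return n_independent * K_B * temp_k * math.log(2)


if __name__ == "__main__":
    temp = 300.0  # assumed room temperature, K
    print("Zeeman energy at 1 T (meV):", zeeman_energy_mev(1.0))
    print("kT at 300 K (meV):", thermal_energy_mev(temp))
    # Hypothetical switch with 1000 uncorrelated electrons vs. one collective magnet.
    print("N kT ln2, N = 1000 (J):", min_switching_energy_j(temp, 1000))
    print("kT ln2, collective (J):", min_switching_energy_j(temp, 1))
```

The ratio of the last two printed values is simply $N$, which is the "N times saving" referred to in the text; the value $N = 1000$ is a placeholder chosen only for illustration.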
Examples of spintronic effects which have been utilized for memory are tunneling magnetoresistance (TMR) and spin transfer torque (STT), 7–17 which rely on the spin-dependent tunneling and angular momentum transfer between current and electron spin in a nanomagnet. These principles have been used, for example, in magnetoresistive random access memory (MRAM) based on STT (i.e., STT-RAM). 14–30 This is in contrast with existing memory devices such as volatile high speed embedded memories (e.g., SRAM having a speed on the order of that of CMOS circuits). Others, such as Flash and Resistive RAM, are nonvolatile but slow and have low endurance due to their intrinsic mechanisms (Table 2). Magnetic memory simultaneously offers all the needed features of high speed, high density, high endurance and reliability, and nonvolatility as shown in Fig. 1. While STT is currently being addressed for standalone memory applications, the convenience of integration with CMOS makes the device and the technology ideal for embedded applications as well.
Notes: (a) $V$ is the operating voltage domain and $C$ is the capacitance. At nanoscale, it is the Coulomb blockade energy $E = e^2/C$, and it is due to the quantization of charge in the single electron device. (b) For ballistic transport with cross-sectional dimensions in the range of the quantum mechanical wavelength of electrons, the capacitance values and drain-source spacing are typically small. (c) The limit can also be looked at from the energy point of view, given the Coulomb blockade energy, $E_c > kT$, yielding typically 5 to 10 nm$^3$ in volume. (d) For the exchange interaction, $E_x$, the electron wavefunctions need to couple with each other of collective spin moments, $S_i$, and thus the interaction distance is at atomic scale. (e) The energy is needed to flip the local collective spin moment at one site by exchange interaction. (f) The time needed for exchange interaction. (g) When the exchange energy is overcome by thermal energy, ferromagnetism is lost. The interaction will depend on the materials: for hard magnetic materials, the size will be smaller. The size is needed to maintain ferromagnetism and will depend on materials. (h) Nanomagnetic devices are preferred for room temperature operation. On the other hand, single spin manipulation uses exchange interaction for mostly low temperature. (i) The energy needed to flip a magnetic moment by an external magnetic dipole, $E_d = -\mu_0 m_i m_j/(4\pi r^3)$, where $m_i$ and $m_j$ are two magnetic moments.

From Table 1, it is clear that STT-RAM has many orders of magnitude improvements in energy and speed compared to nonvolatile Flash. In addition, compared with SRAM for embedded applications, STT-RAM offers comparable speed with the added benefits of high density and nonvolatility, enabling a new class of instant-on nanoelectronic systems beyond the conventional-scaled CMOS electronics (when integrated with a CPU on the same chip). From another point of view, spintronic memories such as STT-RAM in principle offer lower variability due to reduced quantum fluctuations, 31 which have plagued and limited the scaling of the number of electrons in CMOS. Beyond memory, this new concept of instant-on nonvolatile electronics is of interest for logic applications, as we argue as follows. When the size is decreased, the leakage current due to tunneling (e.g., from the source to the drain of the transistor) will continue to increase, and thus it is anticipated that this leakage current may impede further scaling. However, if devices/circuits can be turned on and off on demand and their states are nonvolatile when they are off, this leakage current may be reduced, and thus it will allow the feature size to be scaled down to smaller values (e.g., 2 nm). While benefiting from the inherent advantages of a magnetic memory (high speed and nonvolatility), the scalability of STT-RAM in terms of size and energy consumption will be ultimately limited due to the fact that its underlying mechanism is current-driven control of magnetism. The use of current-driven switching dictates a high current flow in the memory array and the chip, limiting the energy-efficiency of STT-RAM and causing it to be too power-hungry for ultralow-energy, high density applications. 15, 16, 21 To further realize the full potential of instant-on nonvolatile electronics, voltage control of high speed and high density nonvolatile magnetic memory is being explored using magnetoelectric (ME) materials. This research aims at realizing voltage control of magnetism as the solution to enable the ultralow-power, ultrafast memories of the future. This paper is organized as follows. Section 2 offers an overview of STT-RAM technology, as it is the main current spintronic contender for instant-on nonvolatile electronics; thus we will also highlight some of its recent advances. Section 3 illustrates the potential system level advantages of integrating such a nonvolatile spintronic memory with CMOS, including memory replacement of SRAM, hybrid CMOS-MTJ reconfigurable circuits and other circuit blocks. Section 4 will look at beyond-STT magnetic memory solutions, in particular, addressing ultralow-power memory using voltage control of magnetism.
The reduction of switching energies brought about by the new advances in electric-field-controlled magnetic switching will enable energy improvements in magnetic devices not just for memory, but also for logic applications. Section 5 will give an example of a nonvolatile logic (NVL) circuit design using such a principle associated with the voltage control of spin wave propagation. A brief discussion of another approach using nanoscale magnets will also be provided to contrast the spin wave approach. Finally, Sec. 6 provides a perspective on the future development in these areas as well as its impact on next generations of electronic systems.
Magnetic Memory Using STT

STT-RAM background and device architectures
MRAM and its variations, in particular spin transfer torque (STT-RAM) memory, have generated an explosion of technological and research interest in recent years. Because of its fundamental principles as discussed earlier, STT offers low area, high speed and high endurance for electronic systems, as well as nonvolatility due to the collective behavior of the spins in the free layer, and is thus ideal for embedded applications. The typical simplified structure of STT-RAM is illustrated in Fig. 2; it consists of a very thin free magnetic layer, a pinned layer and a thin tunnel oxide layer, which is normally MgO. Some of the details of typical structures can be found in Refs. 15, 16 and 21. The magnetization directions and anisotropies of the free and pinned layers give rise to different designs and performance, as will be discussed later. STT-RAM is often regarded as a potentially universal memory technology due to its high speed, density, nonvolatility, and very high endurance. It also offers the additional advantage of being more scalable (unlike Oersted-field-switched MRAM). It has recently been shown that STT-RAM can also achieve fairly low switching energies of ~100 fJ (compared to, e.g., Flash or field-switched MRAM), and thus is of particular interest in a number of embedded applications such as in mobile communications and computation systems. (For comparison, it should be noted that CMOS switching energy is on the order of 1 fJ, or $10^{-15}$ J, for the 32 nm node.) For STT, the write process is performed by passing current through a magnetic tunnel junction (MTJ) to transfer electron spins, inducing a torque to change the magnetic polarization of the layer according to the direction of current. The different states of the MTJ, i.e., parallel and anti-parallel directions of the free and the pinned layers, give rise to a current memory loop and can be read via TMR as shown in Fig. 2(b), which gives the quasi-static I–V relation as well as the pulsed R–V characteristics (Fig. 2(c)). STT-RAM is an NVM candidate for replacing several existing memory technologies. It is a strong contender to replace embedded high speed, relatively low energy but volatile SRAM, as it offers comparable or superior speed, unlimited endurance, and improved density, and more importantly additional features such as instant-on and nonvolatile characteristics (Fig. 1 and Table 1). Likewise, it may also be used in replacement of standalone as well as embedded DRAM, where STT offers not only superior speed, but also nonvolatility with comparable densities, as well as orders of magnitude improvements in write energy and speed compared to nonvolatile Flash memory.
In this section, we will briefly provide an overview of STT-RAM technology. We will review the major performance metrics and their state of the art. We will highlight the important design tradeoffs and challenges for the successful commercialization of this technology, followed by a discussion on scaling to next generations.
From the device structure, the available optimization schemes for improvement of each performance metric (e.g., in the CoFeB–MgO MTJ cells typically used for STT-RAM) can be generally divided into three parts:
(I) Barrier engineering: MgO barrier thickness affects the resistance-area (RA) product, TMR, and endurance of the magnetic bit.
(II) Free layer engineering: This includes (often inter-related) parameters such as composition, saturation magnetization, shape, size, and anisotropy of the free layer. The combination of these parameters affects the thermal stability (i.e., retention time), switching current density, and write energy of the magnetic bit.
(III) Polarizer (pinned layer) engineering: Polarizer composition and configuration (e.g., in-plane versus perpendicular) affects the spin transfer efficiency, switching times, and switching energy.
In the following we will discuss each of the above components in meeting the challenges and will describe several pathways for improving STT-RAM performance.
From a device structure point of view, one can envision three different ways to realize STT-RAM memory cells. These are illustrated in Fig. 3. In the fully in-plane (I-STT-RAM) case (top) (e.g., Refs. 11, 14, 15 and 24), both the magnetization directions of the free and fixed layers are parallel and lie in the sample plane. An elliptical shape is usually used to provide shape anisotropy for the bit. Perpendicular anisotropy (i.e., the magnetization direction preferably being out of the plane) in these layers either does not exist or is not large enough to pull their magnetization direction out of the sample plane. This is the earliest and easiest to realize MTJ structure, and has thus been studied the most over the past years. In the fully perpendicular (P-STT-RAM) case as shown in the middle of Fig. 3 (e.g., Refs. 26 and 30), both the magnetization directions of the free and fixed layers are parallel in the vertical direction, i.e., in this case both layers have a large enough perpendicular anisotropy which overcomes their in-plane shape anisotropy, setting their magnetization directions perpendicular to the sample plane. The main advantage of the P-STT-RAM structure (middle) compared to I-STT-RAM is better scalability and the potential to realize higher densities. This is because the structure can be made in a circular shape, in contrast with I-STT where an elliptical shape is used for maintaining the shape anisotropy along the major axis of the ellipse for memory. Both structures, however, suffer from a small spin torque during the initial incubation stage of magnetization switching, a consequence of their parallel-magnetized equilibrium states, which limits their switching speed to ~1 ns. This problem is addressed in the third, combined (C-STT-RAM) structure (e.g., Refs. 19, 32 and 33) (bottom), which combines in-plane and out-of-plane polarizers with an in-plane free layer.
The large torque due to the perpendicular polarizer in this case kicks the free layer magnetization out of the sample plane, whereupon it precesses around the out-of-plane demagnetizing field. This form of resonant precessional switching allows for very fast write times on the order of ~100 ps. 19, 32–37 The additional in-plane polarizer provides an additional torque and serves as a reference layer for readout. Thus, this high speed feature is desirable for embedded instant-on nonvolatile applications. The high endurance STT-RAM has been recently studied and assessed to be among the best for embedded memory. 38
STT-RAM performance metrics and optimization
The two fundamental performance metrics of STT-RAM at the MTJ cell level are its switching energy and thermal stability factor (particularly for standalone memory). The stability factor is given by

$$\Delta = \frac{M_s H_k V}{2kT}$$

where $M_s$ is the saturation magnetization of the free layer, $V$ is its volume, $H_k$ is the magnetic anisotropy field, $k$ is the Boltzmann constant, and $T$ is the temperature. 15, 22 The value of $\Delta$ defines the magnetic bit's stability against false switching events due to thermal activation (i.e., retention time). For I-STT and C-STT devices, $H_k$ is mainly determined by the in-plane shape anisotropy (i.e., the energetic preference for the magnetization to align along a preferred direction as determined by the shape of the magnetic bit), while in P-STT it is determined by the perpendicular anisotropy of the free layer. The write energy is the energy required to bring about a current-induced switching event, i.e., to write a bit of magnetic information. The write energy is closely related to, and needs to be optimized together with, the switching current (which affects the transistor size), the write voltage (which affects endurance), as well as the thermal stability, as described below.
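As a numerical illustration of the stability factor, the sketch below assumes the common uniaxial form $\Delta = M_s H_k V/(2kT)$ in CGS units (emu/cc, Oe, cm$^3$, erg). The elliptical bit dimensions are borrowed from the 60 × 170 nm devices mentioned later in the text; the values $M_s = 1000$ emu/cc, $H_k = 200$ Oe and $T = 300$ K are hypothetical placeholders, not measured data.

```python
import math

K_B_ERG = 1.380649e-16  # Boltzmann constant in erg/K (CGS)


def ellipse_volume_cm3(major_nm: float, minor_nm: float, thickness_nm: float) -> float:
    """Volume of an elliptical free layer; axes are full lengths in nm."""
    area_nm2 = math.pi / 4.0 * major_nm * minor_nm
    return area_nm2 * thickness_nm * 1e-21  # 1 nm^3 = 1e-21 cm^3


def stability_factor(ms_emu_cc: float, hk_oe: float, volume_cm3: float,
                     temp_k: float = 300.0) -> float:
    """Delta = Ms * Hk * V / (2 k T), assuming uniaxial anisotropy (CGS units)."""
    energy_barrier_erg = 0.5 * ms_emu_cc * hk_oe * volume_cm3
    return energy_barrier_erg / (K_B_ERG * temp_k)


if __name__ == "__main__":
    # Hypothetical in-plane bit: 170 x 60 nm ellipse, 1.8 nm thick free layer.
    vol = ellipse_volume_cm3(170.0, 60.0, 1.8)
    print("Delta:", stability_factor(1000.0, 200.0, vol))
```

With these placeholder inputs the result is roughly 35. Under this form, $\Delta$ scales linearly with the free layer volume at fixed $H_k$, which is why shrinking the bit area tends to erode retention unless the anisotropy or thickness is increased.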
A key advantage of introducing the STT write mechanism in MRAM development has been to lower the switching current and energy required to write the magnetic state of the bit. Unlike traditional (toggle) MRAM designs, which switch by the magnetic field of a driving current, STT-RAM scales down much better with shrinking device dimensions (i.e., advancing technology nodes), and so does its current. (Nevertheless, further reduction in the switching current of STT-RAM bits is needed, as it limits the transistor size, and hence the density, of the STT-RAM circuit at present.) For applications of instant-on nonvolatile electronics, speed and write energy are the two most important parameters. We discuss the switching energy next.
The write energy of an STT-RAM bit for switching from the anti-parallel (high resistance, AP) state to the parallel (low resistance, P) state is given by 15, 21:

$$E_w^{AP \to P} = \frac{V(P_{sw}, \tau)^2\, \tau}{R_{AP}} \tag{1}$$

where $V(P_{sw}, \tau)$ is the write voltage at the device corresponding to a switching probability of $P_{sw}$, $\tau$ is the write pulse width, and $R_{AP}$ is the MTJ resistance in the anti-parallel state. The write energy for switching from the parallel to anti-parallel states, $E_w^{P \to AP}$, can be obtained similarly using the parallel state resistance $R_P$ (instead of $R_{AP}$), and is in general different, due to differences in both switching current and resistance in the two cases. 20, 21 For a relatively long pulse width $\tau$, where the pulse is square shaped, Eq. (1) is a simplified form of the more general expression for write energy in the form of $\int_0^{\tau} \left(V(P_{sw}, t)^2/R\right) dt$. Useful figures of merit for comparing different MTJ designs are the mean write energies $E_{w,m}^{AP \to P}$ or $E_{w,m}^{P \to AP}$ for $P_{sw} = 0.5$, although one should note that for practical applications much lower write error rates (WER $= 1 - P_{sw}$) are required. Combined with the need to ensure reliable memory operation while accounting for process and operation temperature variations, this results in larger practical write energy values, depending on array size and design specifications.
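The relation between the square-pulse expression and the general integral can be illustrated numerically. This is a minimal sketch, assuming the square-pulse form $E = V^2\tau/R$ and a constant resistance during the pulse; the operating point (0.5 V, 1 ns, 2 kΩ) is a hypothetical choice made only so that the result lands near the ~100 fJ scale quoted in the text.

```python
from typing import Callable


def write_energy_square_j(v_write: float, pulse_s: float, r_ohm: float) -> float:
    """Square-pulse write energy V^2 * tau / R, in joules."""
    return v_write ** 2 * pulse_s / r_ohm


def write_energy_integral_j(v_of_t: Callable[[float], float], pulse_s: float,
                            r_ohm: float, steps: int = 1000) -> float:
    """Midpoint-rule estimate of the integral of V(t)^2 / R over [0, tau]."""
    dt = pulse_s / steps
    return sum(v_of_t((i + 0.5) * dt) ** 2 / r_ohm * dt for i in range(steps))


if __name__ == "__main__":
    # Hypothetical operating point, not taken from the paper.
    v, tau, r = 0.5, 1e-9, 2000.0
    print("square pulse (J):", write_energy_square_j(v, tau, r))
    print("integral     (J):", write_energy_integral_j(lambda t: v, tau, r))
    # A linearly ramped pulse with the same peak voltage dissipates less energy.
    print("ramped pulse (J):", write_energy_integral_j(lambda t: v * t / tau, tau, r))
```

For a constant $V(t)$ the two functions agree, which is the sense in which the square-pulse formula is a special case of the integral. The ramped pulse is included only to show how a non-square waveform changes the result; real MTJ resistance also depends on bias and state, which this sketch ignores.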
It can be seen from Eq. (1) that for energy-efficient operation, a critical condition is to maintain a low mean write voltage $V_m = V(P_{sw} = 0.5, \tau) = J_{c,m}(\tau) A R$, where $A$ is the MTJ area, $R$ is the resistance in the P or AP state, and $J_{c,m}$ is the (pulse-width-dependent) mean switching current density. The mean write energy can thus be rewritten as 15, 21:

$$E_{w,m} = J_{c,m}(\tau)^2 A^2 R\, \tau \tag{2}$$

where $R$ is the resistance in the P or AP state depending on the switching direction. The switching current density increases with reducing pulse width, and is well described by $J_{c,m} = J_{c0}(1 + \tau_0/\tau)$ (in the short-pulse ballistic limit) for elliptical magnetic bits small enough to exhibit single-domain behavior. 17 The switching current density $J_{c0}$ is given by 17:
$$J_{c0} = \frac{2e\alpha M_s t}{\hbar\eta}\left(H_k + \frac{H_d - H_{k\perp}}{2}\right) \tag{3}$$

where $\alpha$ is the free layer damping factor, $\eta$ is the spin transfer efficiency, $M_s$ and $t$ are the free layer saturation magnetization and thickness, $H_k$ is the in-plane shape-induced anisotropy field, $H_d \approx 4\pi M_s \gg H_k$ is the out-of-plane demagnetizing field, and $H_{k\perp}$ is the free layer perpendicular anisotropy. It should be mentioned that, in addition to the Slonczewski STT, 8 the so-called field-like torque 39, 40 can have a significant effect on switching dynamics in MgO-based MTJs. An additional torque which has recently been increasingly studied in CoFeB–MgO-based MTJs is the interfacial voltage-induced anisotropy change. 41, 42 Both of these effects can significantly modify the switching current density of the device.
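Combining the pulse-width dependence $J_{c,m} = J_{c0}(1 + \tau_0/\tau)$ with a mean write energy of the form $J_{c,m}^2 A^2 R\,\tau$ gives $E \propto \tau(1 + \tau_0/\tau)^2$, which has a minimum at $\tau = \tau_0$. The sketch below illustrates this under the assumption that the short-pulse expression is applied across the whole range of pulse widths, which the text does not claim; all numerical values are hypothetical.

```python
def mean_switching_current_density(jc0: float, tau0: float, tau: float) -> float:
    """J_c,m = J_c0 * (1 + tau0 / tau), the short-pulse form quoted in the text."""
    return jc0 * (1.0 + tau0 / tau)


def mean_write_energy(jc0: float, tau0: float, tau: float,
                      area: float, resistance: float) -> float:
    """E = (J_c,m * A)^2 * R * tau; units follow from the inputs (A/m^2, m^2, ohm, s)."""
    current = mean_switching_current_density(jc0, tau0, tau) * area
    return current ** 2 * resistance * tau


if __name__ == "__main__":
    # Hypothetical device: J_c0 = 2e10 A/m^2, tau0 = 1 ns, A = 8e-15 m^2, R = 1 kOhm.
    jc0, tau0, area, r = 2e10, 1e-9, 8e-15, 1000.0
    for tau in (0.25e-9, 0.5e-9, 1e-9, 2e-9, 4e-9):
        print(f"tau = {tau:.2e} s  E = {mean_write_energy(jc0, tau0, tau, area, r):.3e} J")
```

Within this simplified model, shortening the pulse below $\tau_0$ costs energy because the required current rises faster than the pulse shrinks, while lengthening it beyond $\tau_0$ costs energy because the pulse duration dominates.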
Based on the foregoing discussion, one can identify the following strategies for minimizing write energy of MTJ cells:
Minimization of the MTJ resistance R
In order to minimize the MTJ resistance, thinner MgO barriers must be used in the MTJ cell, but these are in practice limited by the MgO barrier quality. Figure 4 shows the typical dependence of RA product and TMR on the MgO barrier thickness in CoFeB–MgO MTJs, measured on a single wafer with varying MgO thicknesses. 20, 21 For reliable array operation, one needs to choose the MgO thickness to be high enough to ensure large TMR and small device-to-device variation. A second concern is the device breakdown voltage and endurance (number of write cycles to failure) for thin MgO. Figure 4(c) shows an endurance testing result for a device with RA = 3.5 Ω·μm$^2$ and a mean write voltage of ~0.6 V at 5 ns write time, which extrapolates to an endurance of >$10^{16}$ write cycles, although the latter may not be critical depending on application. Figure 5 shows the dependence of write energy and switching current density on RA product in a typical in-plane CoFeB/MgO/CoFeB tunnel junction. The write energy can be reduced by using devices with lower RA (i.e., thinner MgO barriers). However, note that the switching current density increases with reduced MgO thickness, possibly as a result of higher effective damping [$\alpha$ in Eq. (3)] in the free layer due to the reduced MgO quality for low RA devices. This indicates that the write voltage does not scale in a linear fashion with reduced RA. For AP to P operation and vice versa, these parameters will scale differently.
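Because $V = JAR$ and $R = RA/A$, the write voltage depends only on the product of current density and RA, not on the device area. The sketch below makes this explicit; it treats the MTJ as a fixed resistor, and the operating point (4 MA/cm$^2$ with RA = 3.5 Ω·μm$^2$) combines two numbers from different parts of the text purely for illustration, so the resulting voltage is not a reported value.

```python
def resistance_ohm(ra_ohm_um2: float, area_um2: float) -> float:
    """Device resistance from the resistance-area product."""
    return ra_ohm_um2 / area_um2


def write_voltage_v(j_a_per_cm2: float, ra_ohm_um2: float) -> float:
    """V = J * RA, with J converted from A/cm^2 to A/um^2 (1 cm^2 = 1e8 um^2)."""
    return j_a_per_cm2 * 1e-8 * ra_ohm_um2


def write_energy_j(j_a_per_cm2: float, ra_ohm_um2: float,
                   area_um2: float, pulse_s: float) -> float:
    """E = V^2 * tau / R = J^2 * A * RA * tau for a square pulse."""
    v = write_voltage_v(j_a_per_cm2, ra_ohm_um2)
    return v ** 2 * pulse_s / resistance_ohm(ra_ohm_um2, area_um2)


if __name__ == "__main__":
    j, ra = 4e6, 3.5  # hypothetical pairing of values, for illustration only
    print("write voltage (V):", write_voltage_v(j, ra))
    for area in (0.008, 0.004):  # um^2
        print(f"A = {area} um^2  R = {resistance_ohm(ra, area):.0f} ohm  "
              f"E = {write_energy_j(j, ra, area, 5e-9):.3e} J")
```

At fixed current density, lowering RA lowers both voltage and energy in proportion; the observation in the text that the switching current density rises for thin barriers is what breaks this simple linear scaling in practice.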
Reduction of the switching current density
While the write energy performance can be improved by reducing both $A$ and $J_{c0}$, one may need to account for the simultaneous effect of these parameters on reasonable thermal stability, even though the stability issue may not be as critical for embedded applications. A useful figure of merit for minimization purposes is the switching current divided by the thermal stability factor, given by:

$$\frac{I_{c0}}{\Delta} = \frac{4ekT\alpha}{\hbar\eta}\left(1 + \frac{H_d - H_{k\perp}}{2H_k}\right) \tag{4}$$

From this equation and Eq. (3), it can be seen that, while reducing the device area $A$ leads to a reduction of switching current, $I_{c0}/\Delta$ is unaffected by this change. Increasing the thickness $t$ of the magnetic free layer, on the other hand, will increase $H_k$ and reduce $I_{c0}/\Delta$. Scaling of in-plane-magnetized MTJs to smaller areas (and higher densities) therefore needs to be accompanied by increasing the free layer thickness, in order to allow for the desired stability factor. By inspection of the last term of Eq. (4), a promising approach to improve the switching energy is to increase significantly the perpendicular anisotropy in the free layer. This is because increasing $H_{k\perp}$ can reduce $J_c$ by partially cancelling the effect of the out-of-plane demagnetizing field $H_d$. 24–28
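The area independence of $I_{c0}/\Delta$ can be checked numerically. The sketch below assumes the standard in-plane forms $J_{c0} = (2e\alpha M_s t/\hbar\eta)\,[H_k + (H_d - H_{k\perp})/2]$ with $H_d = 4\pi M_s$, and $\Delta = M_s H_k A t/(2kT)$, in CGS units; these forms and every numerical value ($\alpha = 0.01$, $\eta = 0.6$, $M_s = 1000$ emu/cc, $H_k = 200$ Oe) are assumptions for illustration rather than values from the paper.

```python
import math

E_CHARGE = 1.602176634e-19    # electron charge, C
HBAR_ERG_S = 1.054571817e-27  # reduced Planck constant, erg*s
K_B_ERG = 1.380649e-16        # Boltzmann constant, erg/K


def jc0_in_plane(alpha, eta, ms, t_cm, hk, hk_perp=0.0):
    """Assumed in-plane critical current density in A/cm^2 (Ms in emu/cc, H in Oe)."""
    hd = 4.0 * math.pi * ms  # out-of-plane demagnetizing field, Oe
    prefactor = 2.0 * E_CHARGE * alpha * ms * t_cm / (HBAR_ERG_S * eta)
    return prefactor * (hk + (hd - hk_perp) / 2.0)


def stability_factor(ms, hk, area_cm2, t_cm, temp_k=300.0):
    """Delta = Ms * Hk * A * t / (2 k T)."""
    return ms * hk * area_cm2 * t_cm / (2.0 * K_B_ERG * temp_k)


def ic0_over_delta(alpha, eta, ms, t_cm, hk, area_cm2, hk_perp=0.0, temp_k=300.0):
    """Switching current per unit of thermal stability, in amperes."""
    ic0 = jc0_in_plane(alpha, eta, ms, t_cm, hk, hk_perp) * area_cm2
    return ic0 / stability_factor(ms, hk, area_cm2, t_cm, temp_k)


if __name__ == "__main__":
    args = dict(alpha=0.01, eta=0.6, ms=1000.0, t_cm=1.8e-7, hk=200.0)
    for area in (8e-11, 4e-11):  # cm^2
        print("A =", area, "cm^2  Ic0/Delta =", ic0_over_delta(area_cm2=area, **args), "A")
    # Partial cancellation of Hd by a hypothetical perpendicular anisotropy of 8000 Oe.
    print("with Hk_perp:", ic0_over_delta(area_cm2=8e-11, hk_perp=8000.0, **args), "A")
```

Both areas give the same ratio, while adding perpendicular anisotropy lowers it, in line with the qualitative argument above. The sketch holds $H_k$ fixed when the area changes, which ignores the dependence of shape anisotropy on bit geometry.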
A significant interface-induced perpendicular anisotropy in Fe-rich CoFeB films 24, 29, 30, 43 allows for switching current reduction by partial cancellation of the out-of-plane demagnetizing field in CoFeB–MgO MTJs. This approach is advantageous since the CoFeB–MgO material system has been shown to demonstrate the large TMR values required for circuit applications, and unlike other approaches, it does not necessitate the use of multilayer structures for perpendicular anisotropy. Figure 6 shows the dependence of the switching current density on the free layer thickness (CoFeB) in MTJs, demonstrating effectively that as the thickness of the CoFeB layer is decreased, the perpendicular anisotropy $H_{k\perp}$ increases. A typical structure consists of a Co 40 (free layer) MTJ with 60 × 170 nm in-plane dimensions. A comparison of Fe-rich MTJs to control devices with a Co-rich free layer having a free layer thickness of 1.80 nm shows a reduction of the average quasi-static switching current density by >40% (from ~2.8 to ~1.6 MA/cm$^2$) due to the presence of the perpendicular anisotropy. Note that this improvement in the switching current (hence energy) versus thermal stability tradeoff is realized despite an increase in the saturation magnetization (from ~1000 to ~1300 emu/cc, not shown), which can only come from the increase of perpendicular anisotropy. The data shown in Fig. 6 are switching current densities for a 10 ns write time. A clear reduction of the switching current density can be observed when the free layer thickness is reduced, reaching ~4 MA/cm$^2$ for a free layer thickness of 1.69 nm. Note that this reduction is much more significant than one might expect from the reduction of the free layer volume alone, and hence it must come from the increased effect of the perpendicular interfacial anisotropy for thinner films.
Finally, it should be noted that for a number of applications, in particular when STT-RAM is used as an embedded memory, the requirements on retention time may be significantly relieved, allowing for smaller acceptable values of the thermal stability factor $\Delta$. In such settings, smaller cell size and higher energy-efficiency may be more important constraints, and one could thus envision STT-RAM designs where the thermal stability is sacrificed to achieve a lower switching current density. This would allow for the STT-RAM switching energies to be further reduced by at least one order of magnitude beyond what is possible for standalone or storage applications, where nonvolatility is the major design criterion.
Perpendicular magnetic layers
For devices with a large enough perpendicular anisotropy, where $H_{k\perp} > H_d$, the free layer can be magnetized perpendicular to plane. For fully perpendicular MTJs with perpendicular free and fixed layers, the switching current density is given by $J_{c0} \approx 2e\alpha M_s H_k t/(\hbar\eta)$, where $H_k$ is now the perpendicular anisotropy, and the corresponding figure of merit for switching current divided by thermal stability is 15, 30:

$$\frac{I_{c0}}{\Delta} = \frac{4ekT\alpha}{\hbar\eta} \tag{5}$$

It is immediately seen that this value can be substantially smaller than that in the I-STT-RAM case. Little has been reported on the dynamic properties and short-pulse switching characteristics of such P-STT-RAM structures, however. Only recently, 30, 44 there have been reports both on high TMR > 100% and reasonable switching behavior in such devices, demonstrating their promise for high-density-scaled P-STT-RAM beyond the limit of I-STT-RAM technology. Interestingly, to date, the most promising material for these structures is Fe-rich CoFeB, similar to the films used for reducing switching currents in I-STT-RAM devices mentioned above. 24

A C-STT-RAM structure, which only has one polarizer orthogonal to the free layer, on the other hand, can exhibit low write energy through a different avenue, i.e., by realizing it with a much faster write time. Figure 7 shows switching results on an MTJ with this type of combined orthogonal structure, where the perpendicular polarizer is composed of a Co/Pd-based multilayer coupled to a thin CoFeB film adjacent to the MgO barrier to enhance the current spin polarization from this polarizer (b). 19 One challenge in this structure results from the use of two MgO barriers, which can increase the RA product and hence result in an increase of the write energy and write voltage. Since the perpendicular polarizer in this case does not contribute to TMR and is only used for switching, its associated MgO layer also reduces the overall TMR ratio of the device, by presenting an additional parasitic resistance. These problems can be overcome by replacing the MgO barrier at the perpendicular polarizer with a metal (i.e., similar to a GMR structure). The TMR has been shown to increase to ~100% in this manner. 33

Due to its high speed, low energy consumption, and nonvolatility, STT-RAM may provide a platform for a new generation of instant-on reconfigurable electronics. The next section will give examples of the use of MTJs in conjunction with CMOS, and how they can be used to enhance performance on a system level. This will be followed by a discussion of pathways for the realization of even more energy-efficient magnetic memories beyond STT.
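For the fully perpendicular case, dividing $I_{c0} = J_{c0}A$ with $J_{c0} \approx 2e\alpha M_s H_k t/(\hbar\eta)$ by $\Delta = M_s H_k A t/(2kT)$ cancels all material and geometry terms except damping and spin transfer efficiency, leaving $4ekT\alpha/(\hbar\eta)$. The sketch below evaluates that ratio in SI units; the uniaxial form of $\Delta$ and the values $\alpha = 0.01$, $\eta = 0.6$ and $T = 300$ K are assumptions chosen for illustration.

```python
E_CHARGE = 1.602176634e-19  # electron charge, C
HBAR = 1.054571817e-34      # reduced Planck constant, J*s
K_B = 1.380649e-23          # Boltzmann constant, J/K


def ic0_over_delta_perpendicular(alpha: float, eta: float, temp_k: float = 300.0) -> float:
    """Assumed perpendicular figure of merit 4 e k T alpha / (hbar eta), in amperes."""
    return 4.0 * E_CHARGE * K_B * temp_k * alpha / (HBAR * eta)


def switching_current_a(delta: float, alpha: float, eta: float,
                        temp_k: float = 300.0) -> float:
    """Critical current for a target stability factor under the same assumption."""
    return delta * ic0_over_delta_perpendicular(alpha, eta, temp_k)


if __name__ == "__main__":
    alpha, eta = 0.01, 0.6  # hypothetical damping and spin transfer efficiency
    print("Ic0/Delta (A):", ic0_over_delta_perpendicular(alpha, eta))
    for delta in (20, 40, 60):
        print(f"Delta = {delta}: Ic0 = {switching_current_a(delta, alpha, eta):.3e} A")
```

With these placeholder parameters the ratio is about 0.4 μA per unit of $\Delta$, so the critical current grows in direct proportion to the required thermal stability regardless of bit size. Compared with the in-plane expression, the factor involving $H_d/(2H_k)$ is absent, which is the origin of the smaller value noted in the text.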
Nonvolatile Circuits with Hybrid CMOS and STT-RAM
As mentioned before, while memory is the most immediate application area of MTJs, nanomagnetic devices also offer opportunities for integration of their inherent nonvolatility for application of circuit functions. Magnetic NVL offers the possibility of significantly reduced standby power consumption, for energy-efficient systems and for realizing instant-on nonvolatile operations, e.g., enabling processors to take up an unfinished computational task after a power failure without the need to start over (booting). Further, from the system point of view, the voltage applied to the devices may be turned off when they are idle, while a higher voltage can be applied for high speed operation. Thus together, NVM and logic based on spintronic nanodevices can create a new class of nonvolatile electronics, which may offer superior performance and new functionalities compared to today's CMOS technology. We will discuss the potential integration of STT devices for several levels of applications, or different granularities from the circuit point of view: replacing SRAM, and complementing reconfigurable circuits (e.g., Field Programmable Gate Arrays or FPGAs), logic circuits, and eventually the gate and transistor levels. This difference in the levels of integration will depend on the energy and speed of STT memory cells, and hence as progress on reducing switching energy continues, the integration may be moved down to the gate level for realizing instant-on nonvolatile electronics. To date, the minimum switching energy achieved for STT memory cells 20, 21, 45–48 is on the order of ~100 fJ, which is about 2–3 orders of magnitude larger than that of CMOS (i.e., ~1–2 fJ for a 65-nm CMOS gate). Thus, as far as write energy is concerned, an MTJ-based nonvolatile logic circuit which requires frequent MTJ switching is hardly power-efficient. Significant scaling of each parameter is needed to make MTJ-based nonvolatile logic circuits energetically competitive with CMOS.
Such scaled devices would be potentially compelling for integration with CMOS in a variety of applications. Significantly scaled MTJ devices with low switching energy should become possible through further advances in materials such as perpendicular films. The ultimate size will be limited by the anisotropy of the material, which defines its stability. As the size is reduced, harder magnetic materials may be used to prevent the layers from becoming superparamagnetic, so that the stability of the nanoscale bit cell can be maintained. It is important to note, however, that even with materials of high perpendicular anisotropy, the write current of P-STT-RAM does not scale with device size, as it is proportional to the thermal stability factor, which impedes its true scalability. Nonetheless, the energy according to Eq. (5) can be scaled to about 10 fJ (to be discussed next) by relaxing the nonvolatility requirements, with proper refresh and archiving memory management, in order to minimize the energy of circuits and to achieve instant-on nonvolatile systems. At present, the CMOS-STT integration approach may still be suitable for special or niche logic applications, where the CMOS logic gates need not be operated frequently.
To be thermally stable for nonvolatility on a reasonable time scale (e.g., hours or days for embedded applications versus ~10 years for storage applications), a single magnetic bit may only require a thermal stability factor Δ = E_b/kT of, e.g., ~20, corresponding to an energy barrier E_b of ~0.08 aJ. (As stated before, for archival applications, larger arrays will have more stringent requirements on thermal stability, i.e., a thermal stability factor of ~80–100 for fairly large arrays to ensure long-term nonvolatility.) For today's STT, there is a vast gap between the <1 aJ energy barrier of thermal switching (which is a measure of the minimum attainable write energy) and the practical values of ~100 fJ for STT memory cells. Because of today's high switching energy (about 100× that of CMOS), the applications are limited to caches, where STT-RAM replaces volatile SRAM, and to integration with large circuits such as cores and FPGAs. Generally, the fractional writing frequency of STT-RAM relative to CMOS logic operations (or to the number of CMOS gates in a circuit) should be limited to n = E_L/E_M, where E_L and E_M are the switching energies of CMOS and STT, respectively. For the implementation of STT for caches (replacing CMOS SRAM), Smullen et al. [38] have discussed the energy improvements for three levels of caches. Their simulations of several different configurations at the 32 nm node show that the energy and other performance parameters (delay, area, etc.) can be improved by a factor of about five by relaxing the nonvolatility [or Δ in Eqs. (3) and (4)] and by using only a single MTJ cell as compared with CMOS SRAM (six transistors), as shown in Fig. 8 [38]. FPGAs and look-up-table (LUT) based reconfigurable logic are other examples of implementing MTJ-based nonvolatile logic, since they require no switching of storage cells during the logic operations. Recent work [49, 50] shows that CMOS/MTJ hybrid LUT-based logic circuits that incorporate MTJs for nonvolatility (Fig. 9) are able to gain an overall performance benefit without paying a penalty in speed or dynamic switching energy. A similar study of a hybrid CMOS-STT FPGA [51] gives only a factor of two improvement. To achieve realistic nonvolatile electronics, the integration will need to go down to the small circuit or logic level and eventually to the transistor level, and the switching energy must be scaled down further to be comparable to or better than that of CMOS. Alternatively, different and more energy-efficient mechanisms need to be explored. The latter will be discussed later in this paper.
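The figures quoted above can be checked with a short Python sketch that evaluates the energy barrier E_b = Δ·kT and the write-fraction budget n = E_L/E_M. The temperature of 300 K, the function names, and the choice of 1 fJ for the CMOS gate energy are illustrative assumptions and are not fixed by the text.

```python
# Illustrative sketch only: function names, T = 300 K, and the 1 fJ CMOS
# energy are assumptions made for this example.
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def energy_barrier_joules(delta, temperature_k=300.0):
    """Energy barrier E_b = delta * k * T for a thermal stability factor delta."""
    return delta * K_B * temperature_k


def write_fraction_budget(e_logic_j, e_memory_j):
    """n = E_L / E_M, read here as the fraction of logic operations that may be
    accompanied by a memory write before write energy matches logic energy."""
    return e_logic_j / e_memory_j


if __name__ == "__main__":
    e_b_embedded = energy_barrier_joules(20)   # relaxed retention
    e_b_archival = energy_barrier_joules(80)   # lower end of the archival range
    print(f"E_b for delta = 20: {e_b_embedded / 1e-18:.3f} aJ")
    print(f"E_b for delta = 80: {e_b_archival / 1e-18:.3f} aJ")
    n = write_fraction_budget(1e-15, 100e-15)  # assumed 1 fJ CMOS vs ~100 fJ STT
    print(f"write fraction budget n = {n:.3f}")
```

With Δ = 20 the barrier comes out near 0.08 aJ, in line with the value stated in the text, and with the assumed energies the budget is roughly one MTJ write per hundred CMOS switching events.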
There have been several proposals for implementing such nonvolatile logic circuits through CMOS-MTJ hybrid integration. Usually a sense amplifier is used to read the total resistance difference between two groups of MTJ stacks in their low or high resistance states and to restore the readout voltage to a proper value for subsequent logic operations [52–54]. Others implement CMOS/MTJ hybrid logic gates that use MTJs both as memory cells and as functional inputs to latch data, which is also referred to as a logic-in-memory (LIM) architecture [55–57]. However, most of these proposals are conceptual and include little energy and performance analysis. A recent study that evaluates the energy performance of the MTJ-based logic-in-memory (LIM-MTJ) architecture (Fig. 10) in comparison with static and dynamic CMOS implementations shows that LIM-MTJ has no tangible advantage in energy performance over its equivalent CMOS design [49]. This is because write energies in STT devices are still too high compared to CMOS. We will discuss the potential use of beyond-STT solutions to resolve this issue in the next section. We believe that, at present, it is on the one hand necessary to implement MTJs only in parts of large scale circuits, while on the other hand a new generation of low switching energy devices, with efficiency comparable to that of CMOS, is needed for integration at the logic gate level. Nevertheless, even today's STT-RAM devices can provide tangible benefits, given that their nonvolatility reduces standby power.
Beyond-STT: ME Memory for High Energy-Efficiency
To dramatically improve the energy-efficiency of magnetic memory and make it suitable for integration with CMOS for nonvolatility at the gate level, we need to further reduce the switching energy of nonvolatile elements. For this purpose, one may use an alternate approach: voltage-induced switching of magnetization, as opposed to current-driven STT switching. (Footnote a: The use of current does not by itself imply high power dissipation. For current-driven switching, however, the power supply usually needs to be at a high voltage because of the device resistance, resulting in high I²R loss.)
Electric field control of magnetism (magnetic moment, anisotropy, or both), similar to that in a MOS structure as illustrated in Fig. 11, can lead to a new paradigm, enabling ultralow-power nonvolatile magnetic memory solutions (see performance metrics in Table 1). Figure 11 illustrates several different principles of electric field control of ferromagnetism: (1) the control of surface ferromagnetism by changing the Curie temperature with a surface carrier effect [58, 59] [Fig. 11(b)], (2) the control of magnetic anisotropy by magnetostriction of a thin magnetic film with strain [60, 61] [Fig. 11(c)], and (3) the change of surface anisotropy [41, 62] [Fig. 11(c)]. In the first approach, it has been theoretically predicted that control of the surface carrier density of the Thomas–Fermi layer can modify the magnetic transition at the surface through a carrier-mediated phase transition, and a change of T_c by a few tens of degrees is anticipated [58, 63]. This theoretical prediction was experimentally demonstrated in a thin Co film, in which a ΔT_c of 12 K was shown with an electric field of ±2 MV/cm [64]. The second principle uses magnetostriction or multiferroic materials. There are several approaches to the use of multiferroics, including single phase and synthetic multiferroic materials. One example is to use synthetic multiferroic heterostructures consisting of piezoelectric and ferromagnetic materials to realize voltage control of magnetization, as illustrated in Fig. 11(b) [60]. In this case, a voltage applied to the material stack generates a mechanical strain in the piezoelectric material. Due to the magnetostrictive property of the adjacent magnetic film, this strain can lead to a reorientation of magnetization as a result of the ME effect. The third principle is to use an electric field to control the orbital occupancies of different symmetries of surface atoms; this in turn changes the symmetry of the orbitals, and consequently the magnetic surface anisotropy can be altered through spin–orbit interaction [41, 65–69].
A few magnetic switching experiments have been demonstrated based on the above principles. For example, in the case of magnetostriction, Figs. 12(a) and 12(b) show experimental evidence of ME voltage control of magnetization for a 30 nm magnetostrictive Ni film deposited on a piezoelectric PMN-PT substrate [60]. Magneto-optical Kerr effect (MOKE) measurements show that a 90° reorientation of the easy axis can be observed with fields of ~0.14 MV/m (or 1.4 kV/cm). This experiment demonstrates voltage-controlled reorientation of magnetization as well as nonvolatility, since no continuously applied voltage is required to keep the magnetization in the reoriented state; it thus demonstrates an electric-field-controlled nonvolatile magnetic memory operation. A critical requirement for this scheme is that the magnetic film must have a large magnetostriction coefficient. The saturation magnetization (M_s) of the magnetic film needs to be optimized based on a tradeoff between ME coupling and thermal stability. While a small M_s increases the ME coupling and reduces the write voltage, a large M_s is desirable for a reasonable retention time and thermal stability for nonvolatility (similar to Δ = M_s H_k V / 2kT in the case of in-plane MRAM).
This type of memory takes advantage of the built-in strain of the piezoelectric layer prior to the magnetic film deposition (i.e., through partial poling of the piezoelectric material and subsequent deposition of the magnetic film), thus allowing one to realize permanent and reversible reorientation of the magnetization between two perpendicular in-plane directions [60]. This experiment provides a critical proof of concept for a new Magneto-Electric Random Access Memory (MeRAM) based on these types of synthetic multiferroic heterostructures. The state of such a memory bit can be read out, for example, by fabricating the magnetostrictive layer into an MTJ structure to allow for resistive TMR readout.
Energy loss in such a MeRAM consists of the energy capacitively stored in the memory cell, which will probably be dissipated to ground via a resistive path, similar to the CV² energy loss in CMOS switches, where C is the device capacitance and V is the applied voltage. Likewise, additional energy loss may be associated with leakage current through the device during switching. Another source of energy loss may come from the poling process, i.e., hysteresis of the multiferroic (or ferroelectric) material; this energy loss may be reduced if devices are designed to minimize or eliminate the need for poling to achieve magnetic switching. Given the nonvolatility of a circuit based on such an ME switch, which eliminates all standby power, as well as the dramatic reduction in dynamic power due to electric field control, ME devices may form the basis for a new paradigm of inherently nonvolatile and highly energy-efficient magnetic logic. It will allow for further size scaling of CMOS by tolerating somewhat larger leakage, as devices may be powered off from time to time. The read speed limit of such a device will be fundamentally similar to that of CMOS if current is used to read the state. Another example of electric-field-driven magnetic switching may use the interfacial magnetic anisotropy mentioned in the previous sections, particularly the recently discovered electric-field-dependent anisotropy [41, 42, 62, 70] at the interface of oxides (e.g., MgO) and magnetic films. Voltage-induced switching of magnetization using this mechanism has recently been demonstrated [71–73], as illustrated in Fig. 13, which also shows the basic structure (a). The device is similar to a conventional MTJ with a thicker dielectric layer. The magnetic anisotropy change can be utilized to bring about switching of the free layer, as illustrated by the R–V curves shown in Fig. 13(b).
A critical challenge of this approach is the use of a magnetic field in order to determine the switching direction in some cases, or the use of precise timing for toggle switching to the opposite magnetic direction. These requirements make the circuit designs more complicated and result in additional overhead for read/write as well as increased write energy. (However, other innovative approaches to eliminate these requirements are in progress.)
Realization and scaling of MeRAM will allow for ultralow-power nonvolatile electronics far beyond what is possible based on other NVM technologies (Table 1). The energy consumption of a MeRAM cell consists of a CV² term, where C is the cell capacitance and V is the write voltage, as well as a V²t/R term due to leakage through the device (which is material- and thickness-dependent and should be minimized), in which R is the device resistance and t is the switching time. MeRAM switching energies can already be lower than those of STT-RAM by ~2–3 orders of magnitude, as estimated for, e.g., the 65 nm technology node and the material systems presently used in STT-RAM [41, 42, 60, 62, 70]; continued scaling and material developments will further reduce MeRAM energies to the attojoule regime. Such low energy operation is fundamentally beyond the reach of STT-RAM. While STT-RAM has been predicted to approach write energies of a few fJ (1 fJ = 10⁻¹⁵ J) at the 10 nm node [47], we project MeRAM to reach ~10 aJ (= 10⁻¹⁷ J) for similar dimensions. This will enable nonvolatility to be implemented at the transistor level and may make possible high speed, low dissipation instant-on information processing systems.
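The two energy terms named above can be evaluated with a minimal Python sketch. The function name and all numerical values in the example (10 aF, 1 V, 1 GΩ, 1 ns) are hypothetical placeholders chosen only to illustrate the arithmetic; they are not device parameters taken from this paper.

```python
def meram_write_energy(capacitance_f, voltage_v, resistance_ohm, pulse_s):
    """Return (capacitive, leakage, total) write energy in joules.

    capacitive = C * V**2       charging term named in the text
    leakage    = V**2 * t / R   conduction through the device during the pulse
    """
    if resistance_ohm <= 0:
        raise ValueError("resistance must be positive")
    capacitive = capacitance_f * voltage_v ** 2
    leakage = voltage_v ** 2 * pulse_s / resistance_ohm
    return capacitive, leakage, capacitive + leakage


if __name__ == "__main__":
    # Hypothetical cell: 10 aF, 1 V write pulse, 1 GOhm barrier, 1 ns pulse.
    cap, leak, total = meram_write_energy(1e-17, 1.0, 1e9, 1e-9)
    print(f"capacitive: {cap / 1e-18:.1f} aJ")
    print(f"leakage:    {leak / 1e-18:.1f} aJ")
    print(f"total:      {total / 1e-18:.1f} aJ")
```

Because both terms scale with V², lowering the write voltage reduces the capacitive and leakage contributions together, while a thicker or more resistive barrier reduces only the leakage term.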
It should be noted that low write voltages are a key requirement for realizing low energy dissipation in MeRAM, and recent experiments have indicated that switching with such low voltages (<1 V) is indeed possible using this approach [71–73]. The read energy may be equally important for applications. However, given that typical readout using TMR would use only a fraction of the write voltage, the read energy would be even smaller and hence will not limit the overall energy consumption of the device.
The ultimate scaling limit (and hence the achievable energy-efficiency) of these kinds of MeRAMs will be determined by the development of materials with high perpendicular magnetic anisotropy, to ensure that the cells remain nonvolatile with high enough thermal stability. As the volume of magnetic bits is reduced with scaling, the magnetic anisotropy of the free layer materials will have to increase to maintain stability. Based on current projections [47], with the additional consideration of a shorter retention time for embedded applications, we anticipate that MeRAM may be scaled down to a few nm; the development of harder magnetic materials and other voltage-controlled mechanisms for MTJs could potentially take the scaling even further.
Next, we will discuss a fully magnetic (rather than hybrid with CMOS) approach for NVL using such ME effects.
Fully Magnetic NVL
To date a number of promising ideas have emerged for spin-based logic [74–83]. This section presents approaches to all-magnetic design for NVL and circuits. As pointed out before, there are two major approaches to spin-based logic devices: (i) the use of collective spins for the development of magnetic devices and circuits [74, 76, 77, 82, 83], and (ii) the use of spin in addition to charge for improving device performance (e.g., Spin-Modulator or Spin-FET devices such as that in Ref. 78). (From the materials point of view, there are two major classes of materials used for spintronics, namely dilute magnetic semiconductor and metallic systems.) For collective spin devices, magnetic logic devices may be realized by exploiting physical mechanisms such as dipole–dipole or exchange interactions for nonvolatile logic circuitry (see Table 2). We will mostly limit our discussion to our spin wave bus (SWB) approach, but a brief discussion of nanomagnetic logic (NML) based on dipole-coupled magnetic cellular automata will also be included later in this section [74, 77]. Both approaches have advantages and limitations in terms of their scalability, power dissipation, defect tolerance, and compatibility with CMOS-based circuits. In the SWB approach, information is transferred via exchange-coupled spins in a continuous magnetic medium (thin film), in contrast with nanomagnetic dipole-coupled logic devices, in which each element is coupled via magnetic dipole interaction to neighboring elements, as will be discussed later (Table 2). Figure 14 illustrates the SWB logic approach and its basic operation principle. A spin wave is a collective excitation of spins due to exchange interaction, which can also be viewed as a magnonic wave, as shown in Fig. 14(a). SWB uses spin wave interference for various gate operations; the result of the computation can be read out in the final stage by a nonlinear switch (an ME gate, very similar to a MeRAM cell), as illustrated in Fig. 14 and detailed in Fig. 16 [83].
Using SWB [84–87], we have previously proposed and developed the concept of magnonic logic circuits, where an applied in-plane magnetic field is used to control the spin wave propagation frequency and dispersion characteristics. The basic structure uses a magnetic film as the spin wave conduit (the SWB shown in Fig. 14), where the information is coded into the phase of a propagating spin wave and the logic functions are performed using spin wave interference in the bus. A prototype device is illustrated in Fig. 14(b), showing two microstrip lines (S1 and S2) acting as the spin wave inputs, with a third microstrip (S3) used to read out the signal. The amplitude and phase of propagating spin waves can be modulated by an applied or an effective magnetic field, the latter of which can be produced by an ME gate (MeGate) as in the case of MeRAM devices. In this case, the magnetic field is modulated via an electric field through the change of magnetic anisotropy, as discussed previously. Spin wave generation and detection at the input and output can also be achieved by ME gates for higher energy-efficiency. This approach combines several advantages: (i) information transmission is accomplished without electron transport, enabling one to minimize power dissipation in the interconnects; (ii) wave superposition in the SWB can be used to enhance functionality, while switching is done at readout; and (iii) a number of spin waves with different frequencies can be transmitted simultaneously among a number of spin-based devices (i.e., frequency multiplexing) [75, 83]. Other approaches proposed in recent years are based on a spin wave Mach–Zehnder-type interferometer [88, 89], where the spin wave amplitude was used to define the logic state of the output.
Figure caption (fragment): Schematic of a logic circuit based on SWB and ME gates. The ME devices are used for input and output functions, interfacing with spin waves propagating in the SWB [83].
In contrast, in SWB, information is coded into the phase of the spin wave signal. Two logic states 0 and 1 can be assigned to two phases of spin waves having the same amplitude. In this approach, data processing is accomplished via the change of the phase of the propagating spin waves [as illustrated in Figs. 14(a) and 16].
Using the latter approach, a prototype majority gate has been demonstrated. Figure 15(a) (top) shows a photo of the test chip used in the experimental study of a prototype majority (MAJ) gate. Each of the five wires of Fig. 15(b) can be used as an input or an output port, for which microwave coplanar structures are used, similar to the basic structure shown in Fig. 14(b). In order to demonstrate a three-input one-output MAJ gate, three of the five wires were used as input ports, and the two other wires were connected in a loop to detect the inductive voltage produced by the spin wave interference [Fig. 15(b)]. An electric current passing through each wire generates an Oersted magnetic field, which, in turn, excites spin waves in the ferromagnetic layer. The direction of the current flow (the polarity of the applied voltage) defines the initial spin wave phase. The relative phase between the waves excited by the wires may be 0 or π, for the same or the opposite direction of the current. The plot in Fig. 15(c) shows the output inductive voltage as a function of time for different combinations of the spin wave phases (e.g., 000, 00π, 0ππ, πππ), as illustrated in Fig. 15(d). The phase of the output inductive voltage corresponds to the majority of the phases of the interfering spin waves. The data are taken at an excitation frequency of 3 GHz and a bias magnetic field of 95 Oe (perpendicular to the spin wave propagation). Note that this bias magnetic field may be replaced with a MeGate. All measurements are performed at room temperature. This simple demonstration of the concept used a magnetic field to control the phase of the propagating spin wave. The use of electric field control of spin waves (i.e., ME gates) for the same purpose will reduce the dissipated energy by many orders of magnitude, as previously discussed for electric-field-controlled magnetic memory.
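The phase-majority behavior described above can be illustrated with an idealized Python sketch. It assumes lossless, equal-amplitude waves that arrive at the detector with no path-length difference, which is a simplification of the experiment; the function names and the bit-to-phase mapping (0 → phase 0, 1 → phase π) are choices made for this example.

```python
import cmath
import math
from itertools import product


def interfere(bits, amplitude=1.0):
    """Superpose equal-amplitude waves; bit 0 -> phase 0, bit 1 -> phase pi."""
    return sum(amplitude * cmath.exp(1j * math.pi * b) for b in bits)


def output_bit(bits):
    """Read the phase of the superposed wave: phase 0 -> 0, phase pi -> 1."""
    return 0 if interfere(bits).real > 0 else 1


def majority(bits):
    """Reference Boolean majority of an odd number of bits."""
    return 1 if sum(bits) > len(bits) / 2 else 0


if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        total = interfere(bits)
        print(bits, f"|sum| = {abs(total):.1f}", "output bit:", output_bit(bits))
```

In this idealization the superposed amplitude is three units when all inputs agree and one unit otherwise, but its phase always follows the majority of the inputs, which is the property the measured inductive voltage is reported to show.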
This prototype device demonstrates the feasibility of building wave-based MAJ logic gates, which are of great value for logic circuitry. In general, majority logic is more powerful in that it can implement a given digital function with a smaller number of logic gates than CMOS [90]. For example, a full adder may be constructed with three majority gates and two inverters (3 ME gates and 2 phase modulators). In contrast, a Boolean-based implementation requires a larger circuit with seven or eight gate elements (about 25–30 transistors) [91]. The main reason majority logic has been out of fashion for decades is that its CMOS realization is inefficient, a shortcoming that the use of spin waves will alleviate.
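One well-known way to build a full adder from three majority gates and two inverters is sketched below in Python. The text does not say which construction the authors have in mind, so this particular decomposition and the function names are assumptions made for illustration.

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority gate on bits 0/1."""
    return 1 if a + b + c >= 2 else 0


def inv(a):
    """Inverter (a pi phase shift in the spin wave picture)."""
    return 1 - a


def full_adder(a, b, cin):
    """Full adder from three majority gates and two inverters."""
    cout = maj(a, b, cin)            # majority gate 1: carry out
    inner = maj(a, b, inv(cin))      # majority gate 2, inverter 1
    s = maj(inv(cout), cin, inner)   # majority gate 3, inverter 2: sum
    return s, cout


if __name__ == "__main__":
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        print(a, b, cin, "-> sum", s, "carry", cout)
```

The carry output is itself a majority function of the three inputs, which is why the carry path costs only a single gate in majority logic.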
NVL circuits can be constructed using SWB structures and ME gates [83], as illustrated in Fig. 14(c). An illustration of the nonvolatile logic gate operation with electric field control of spin waves is shown in Fig. 16. In the illustration, we show the use of ME gates with the SWB, where spin wave excitation and storage of the final computational result (output) are done via ME gates. The only nonmagnetic input required for the circuit is a clock applied to the ME gates, which triggers the switching at the output end and the spin wave generation at the input bit. The direction of switching is determined by the spin wave phase at the output cell [83]. The voltage pulses are converted into spin waves (magnons); the data transmission and processing within the SWB are then accomplished via spin waves only. Nonvolatility is accomplished through the ME gate, as in the memory devices discussed before: (i) output data from each computational step are stored in an ME memory cell, which can be switched by the spin wave under the MeGate, assisted by a pulsed bias voltage. The bias is used to reduce the energy barrier to facilitate switching by the spin wave (resulting in a voltage-assisted spin-wave-induced switching process). The direction of switching is determined by the spin wave phase; (ii) an ME cell generates a spin wave for the next computational step with the next clock cycle, and the spin wave phase is determined by its memory state through exchange and/or dipole coupling to the SWB. Logic functionality is achieved as a sequence of ME switching events, where each ME gate changes its state (flips its magnetization) according to the magnetization of the preceding cells via propagating and interfering spin waves, as assisted by the MeGate bias.
Energy-efficient electric-field-induced generation of spin waves is a key requirement for this NVL scheme to work. For ME gates, we have recently demonstrated this effect using a multiferroic heterostructure, showing spin wave generation by voltage in a capacitive ME gate and propagation over distances of up to 40 micrometers [92]. Once switched, the ME gate preserves its magnetization until the next computation step. This makes it possible to eliminate static power consumption and at the same time drastically reduces the active power. The minimum energy of the spin wave switching is limited by thermal noise only and can be very close to kT, with the majority of the energy required for switching being provided by the clock (voltage) signal. Additional examples of different nonvolatile magnonic logic circuits based on this concept are given in Ref. 83. These SWB logic devices may also be further integrated with CMOS to make nonvolatile hybrid CMOS/magnetic logic circuits.
A different approach to realizing spin-based NVL is the use of magnetic quantum dot cellular automata (MQCA), also referred to as nanomagnet logic (NML) [74, 77, 93–95]. The NML approach relies on the transfer of information via consecutive switching events in rows and columns of dipole-coupled nanomagnets. Initial demonstrations used both ferromagnetic [74] and anti-ferromagnetic [77] dipole coupling of adjacent nanomagnets to bring about the consecutive switching events. Both isotropic circular nanomagnets [74] and elliptical nanomagnets exhibiting shape anisotropy [96] have been used for this purpose, and logic gates based on the NML approach were demonstrated [77]. The NML approach is similar in concept to the earlier idea of using electrostatically coupled arrays of quantum dots in an electrostatic quantum dot cellular automata (EQCA) structure [97]. EQCA, however, is not easily realizable at room temperature [77], while MQCA does not suffer from this problem, as the nanomagnetic dots can be designed to exhibit stable magnetization at room temperature due to the use of collective spins. An interesting feature of the NML approach is its inherent nonvolatility, provided that the magnetic bits are sufficiently stable to retain their information beyond the time scales used in the computation. Given that individual dots are only coupled via dipole interaction, however, scaling of the NML approach entails the use of materials with a high anisotropy (as previously discussed for memory) in order to ensure stability. Moreover, due to the need for a full switching of each bit, the speed of logic circuits based on this concept is determined by the speed of magnetic-field-induced switching (~nanoseconds) of the nanomagnets in an array. The SWB and NML approaches offer different variability and scalability behaviors. While in the NML approach a minimum uniformity among different cells has to be maintained in order to allow for circuit operation, this issue is to some extent resolved in the SWB approach due to the use of a continuous medium. The SWB approach, on the other hand, may suffer from nonuniformities of propagation speed and delay caused by issues such as edge roughness variation, different propagation angles, and different propagation lengths in different parts of the circuit. It should be noted that as the size and the separation gap of the nanomagnets are further reduced, the exchange coupling strength will increase and spin waves will propagate through the array; MQCA or NML then eventually approaches the SWB.
Figure caption (fragment): The phase of the generated spin wave is determined by the state of the input memory cell via dipole or exchange interaction. The ME gates excite and store spin wave information. At the output end, the ME gate is switched by the arriving spin wave, with the switching direction being determined by the spin wave phase. A bias voltage is applied by the clock to reduce the energy barrier to allow for the spin wave to switch the state of the ME cell.
Summary
Collective spin devices, or metallic magnetic devices, have the advantages of high speed, nonvolatility, and the ability to operate at room temperature. Since devices and circuits may be turned on and off on demand, they may allow CMOS to scale further to smaller feature sizes without much static leakage. Likewise, the dynamic switching energy may be reduced as well, while retaining minimal variability. In addition, these metallic devices can be conveniently processed and integrated with today's CMOS technology. We discussed the state of the art of spin transfer torque memory (STT-RAM). Although the switching energy of STT is on the order of 100 fJ (or about 100× higher than that of today's CMOS), the integration of nonvolatile STT-RAM has already been shown to lead to advantages in terms of energy-efficiency and density when integrated with CMOS at the block/circuit level. In particular, embedded applications where STT-RAM replaces traditional SRAM on-chip memories are an example of this. The integration may further advance to the gate and eventually to the transistor level as the energy-efficiency of the magnetic memory elements is improved. This will be possible, for example, through the development of voltage-controlled ferromagnetism and devices. We discussed several principles of electric field control of magnetism. The topic will be much pursued in the future for realizing MeRAM and ME gates, with dramatically higher energy-efficiency and scalability than those of STT-RAM. The work will continue to focus on reducing the write energy, improving the speed, and making it more convenient to integrate with CMOS transistors. With MeRAM, it should be possible to make low levels of circuits/systems nonvolatile and eventually to integrate it at the gate level. For the latter, it will be important to make the reading signal from the MeGate as close to that of CMOS (V_dd) as possible.
Eventually, with the enhanced energy-efficiency of nonvolatile spintronic devices, all-magnetic logic schemes such as SWB and NML, where CMOS is replaced and/or complemented with entirely spin-based logic circuits, will also become viable. Thus, these nonvolatile spintronic devices may transform electronic systems and result in a new era of instant-on nonvolatile nanoelectronics.
