

# POLITECNICO DI TORINO Repository ISTITUZIONALE

Virtual Clocking for NanoMagnet Logic

Original

Virtual Clocking for NanoMagnet Logic / VACCA, MARCO; CAIRO, FABRIZIO; TURVANI, GIOVANNA; RIENTE, FABRIZIO; ZAMBONI, Maurizio; GRAZIANO, MARIAGRAZIA. - In: IEEE TRANSACTIONS ON NANOTECHNOLOGY. - ISSN 1536-125X. - ELETTRONICO. - 15:6(2016), pp. 962-970.

Availability:

This version is available at: 11583/2655353 since: 2017-09-28T01:45:24Z

Publisher: IEEE

Published DOI:10.1109/TNANO.2016.2617866

Terms of use: openAccess

This article is made available under terms and conditions as specified in the corresponding bibliographic description in the repository

Publisher copyright ieee

copyright 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating.


# Virtual Clocking for NanoMagnet Logic

Marco Vacca, Fabrizio Cairo, Giovanna Turvani, Fabrizio Riente, Maurizio Zamboni and Mariagrazia Graziano

Abstract—Among emerging technologies NanoMagnet Logic (NML) has recently received particular attention. NML uses magnets as constitutive elements, and this leads to logic circuits where there is no need of an external power supply to maintain their logic state. As a consequence, a system with intrinsic memory and zero stand-by power consumption can be envisioned. Despite the interesting nature of NML, a fundamental open problem still calls for a solution that could really boost NML technology: the clock system. It constrains the layout of circuits and leads to a potentially high dynamic power consumption if not carefully conceived. The first clock system developed was based on the generation of a magnetic field through an on-chip current. After that other types of NML, based on several different types of clock systems, were proposed to improve clocking.

We present here our proposal for a new clock delivery method. We named this system "Virtual Clock". It offers several important advantages over previous solutions. First, it notably simplifies the clock generation network, reducing the complexity of the fabrication process. It improves the efficiency of circuits layout, substantially reducing interconnections overhead and boosting the reliability of the majority voter. It enables the fabrication of in-plane NML circuits with two layers, while they were confined to one single layer up to now. Finally, it allows to globally reduce dynamic power consumption by considerably shrinking circuits area. Overall the "Virtual Clock" system we propose represents an important step forward in the development of NML technology.

## I. INTRODUCTION

NanoMagnet Logic (NML) [1] is one of the implementations of the Quantum dot Cellular Automata (QCA) principle [2]. QCA technology is based on identical cells encoding digital values using different polarization states [3]. Nanomagnets are one of the most natural devices that can be used to implement QCA cells. When the size of magnets is reduced below 100nm and a proper shape and aspect ratio are used [1], only two stable magnetizations are allowed (Figure 1.A). This is a well known condition, since nanomagnets represent the basic element of modern hard drives [4]. The other natural approach to the physical implementation of the QCA principle is using molecules [5]. Molecular QCA are based on complex molecules to represent logic values [6][7]. However, NML technology is the only solution that can be fabricated with up-to-date technological processes [8]. The key feature of this technology lies in its magnetic nature. NML combines both memory and logic in the same device, allowing the design of "Logic-In-Memory" (LIM) architectures [9]. Moreover, magnets do not require an external power supply to maintain their state, so they are ideal for all applications where the stand-by power is a major concern.

Fig. 1. NanoMagnet Logic fundamentals. A) Single domain nanomagnets represent logic values "0" and "1" that are associated with the magnetization direction. B) Magnets are forced into an intermediate unstable state (RESET) to allow their switching. C) Multiphase clock system. Circuits are divided in small areas called clock zones, made by a limited number of magnets. D) Three clock signals with a phase difference of 120° are required for correct signal propagation. [10]

In NML circuits the logic information is associated with the magnetization along the major nanomagnet axis (Figure 1.A). Data propagates through magnetodynamic coupling among neighbor magnets [11]. Theoretically, if magnets are placed and correctly aligned on a plane, when one element switches, neighbor elements should switch accordingly (Figure 1.A). This result can be achieved only if magnets are previously forced into an intermediate unstable state (RESET in Figure 1.B). Being in the RESET state, magnets start to switch to a new stable state as soon as at least one of the neighbor elements switches. To force magnets into the RESET state an external means, typically a magnetic field, is necessary. It is normally referred to as "clock" [10] and is one of the requirements for correct signal propagation in NML circuits. In Section II we present a detailed discussion of the clocking system and of the technological solutions proposed in previous work and in the present one. Here it is enough to say that the clock system is the most distinctive part of NML (and QCA) technology, as it defines the logic behavior of circuits and constrains their layout [12]. The clocking system leads to a high dynamic power consumption [13][8][14]. The key targets in studying clock technology are i) reducing the per-area power consumption and ii) minimizing the area required by the design of realistically complex circuits. This is the main point that we address in our work: studying a clock system that enables the reduction of circuit area can ultimately lead to low power circuits, even if the power necessary for the RESET state generation is higher (see Section II). Another point motivates further investigation of the technique based on magnetic field generation: the magnetic field clock enables the integration of other magnetic structures inside NML circuits, like domain walls [15], leading to further advantages in terms of area reduction, while other clocking techniques might not.

With this purpose in mind, we present an innovative clock solution, which we refer to as a "Virtual Clock". This solution notably improves the clock system generation. It is based on the idea that the strength of the magnetic field required to force a magnet into the RESET state depends on its size, particularly on its aspect ratio. A smaller aspect ratio requires a smaller value of RESET magnetic field. In other words, applying the same magnetic field to two magnets of different aspect ratios, and then slowly removing it, makes the magnets leave the RESET state in two distinct time intervals. At first only the bigger magnet switches, while the external magnetic field still holds the smaller one in the RESET state. Only after the magnetic field is reduced below a certain threshold does the smaller magnet start to switch as well. As a consequence two "virtual" clock zones are created by differentiating the aspect ratio of magnets while using only one single clock signal. The detailed explanation can be found in the following sections. A summary of the Virtual Clock advantages ([ADV]) over previous solutions is reported in the following. Each point is explained later in the paper and referred to with the numbering used here.

- [ADV1] It simplifies the clock generation network (Sec. III).
- [ADV2] It improves circuit layout, greatly reducing the interconnection overhead, and therefore the area and the associated power dissipation (Sec. III).
- [ADV3] The Majority Voter logic gate works also in presence of asymmetric length of input wires (Sec. IV).
- [ADV4] It enables two layer in-plane NML circuits (Sec. V).
- [ADV5] It allows the implementation of the multilayer crosswire (Sec. VI).

We have already successfully applied this clock solution in the work presented in [9], where we implemented an innovative Logic-In-Memory circuit with NanoMagnet Logic. We chose Virtual Clocking as the clock scheme for its efficiency, even though the Logic-In-Memory principle does not rely on it. The focus of the work presented in [9] was the principle of Logic-In-Memory; Virtual Clocking, used as the clocking scheme, largely improves the performance of the Logic-In-Memory structure. In the present paper we present, describe and validate the clocking scheme itself. We demonstrate the validity of the proposed solution through rigorous low level simulations, obtained with the help of the NMAG [16] micromagnetic simulator and Comsol Multiphysics [17]. A 2-phase clock was previously proposed in [18]. The two phase clocking is based on shape engineering of magnets [19][20][21], where a trapezoidal magnet is used at the beginning of each clock phase. Our solution shares some similarities with this work, but provides further advantages discussed later on. Overall the advancements provided by the clock solution proposed in our work represent an important step forward in the development of NML technology.

## II. THE CLOCK SYSTEM

Fig. 2. A) The magnetic field is generated by a current flowing through a wire placed under the magnets plane. A ferrite yoke is used to confine the magnetic flux lines. B) Snake clock structure. Wires 2 and 3 are twisted and placed on different planes. C) Signals propagation in the snake clock case.

As introduced in Section I, magnets are reset in order to erase a previous magnetization state. As soon as the external force that maintains the RESET state is released, magnets will reach a stable state. Since the RESET state is unstable, magnets might be forced into a stable state by external influences. The major influence is expected to be that of another magnet placed nearby and already in a stable state; but other causes might also influence the transition, like thermal noise [22]. This is particularly true in long chains of magnets. In this case several magnets are released from the RESET state at the same time. As stated in [22], simulations clearly show that no more than 5 magnets can be cascaded at room temperature without an erroneous switching caused by the influence of thermal noise. As a consequence, the organization of complex circuits requires the use of a "multiphase clock" mechanism. It consists of dividing the circuit in small areas called clock zones (Figure 1.C). Only a small number of magnets are located in each clock zone. A different clock signal is applied to each clock zone. Three (or more) clock signals must be used, each one with a phase difference of $120^{\circ}$ (Figure 1.D). The whole circuit layout is based on a repetition of these consecutive zones. Given a portion of a circuit including three clock zones, at each instant only the magnets in one of the three clock zones are switching, as depicted in Figure 1.C. Magnets in the SWITCH phase are influenced by magnets located on one of their sides that are in the HOLD state, acting like inputs. On the opposite side magnets are in the RESET state, and have no influence on the switching magnets. As time advances, the clock signals are applied similarly, but shifted by one zone. The multiphase clock system enables the correct propagation of signals through the circuit (Figure 1.C), and it avoids at the same time errors due to the presence of noise.
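The zone schedule described above can be sketched as a small state table. This is only an illustrative model: the function names and the choice of indexing zones and time steps from 0 are assumptions made here, not part of the original work. A zone that is switching has a holding zone upstream and a reset zone downstream, and the pattern shifts by one zone at each step.

```python
from typing import List

def zone_state(zone: int, step: int) -> str:
    """State of one clock zone at a time step of a 3-phase clock (illustrative)."""
    offset = (zone - step) % 3
    # offset 0: the zone is switching at this step
    # offset 2: the zone switched one step earlier and now holds its data
    # offset 1: the zone will switch at the next step and is kept in RESET
    return {0: "SWITCH", 1: "RESET", 2: "HOLD"}[offset]

def zone_states(num_zones: int, step: int) -> List[str]:
    """States of all zones, listed in the direction of signal propagation."""
    return [zone_state(zone, step) for zone in range(num_zones)]

if __name__ == "__main__":
    for step in range(3):
        print(step, zone_states(6, step))
```

At every step, each switching zone sees valid data only from the HOLD zone on its input side, which is the property the multiphase clock relies on.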

The first type of clock mechanism proposed was based on a magnetic field. It uses a current flowing through a wire placed under the magnets plane [8]. The structure is depicted in Figure 2.A. A ferrite yoke is used to confine the magnetic field flux lines. This clock system was experimentally demonstrated in [13]. It works correctly but the value of current required is quite high, around 500mA for a wire with a cross-section of a few $\mu m^2$. It is important to underline that the magnetic field strength depends on the current density, therefore reducing the wire cross-section reduces the value of the required current as well. The circuit layout is also constrained by the limitations related to wire fabrication. The most realistic layout that can be obtained uses parallel wires, one for each clock signal. The consequence is that each clock zone is made by parallel stripes. This organization constrains the layout of the circuit. In particular, it notably increases the interconnection overhead, as demonstrated in [23], because signals cannot propagate efficiently in the vertical direction. The structure shown in [8], though effectively working, allows only the design of combinational circuits, i.e. sequential circuits, which normally require feedback signals, cannot be designed. This is caused by the propagation method of signals in NML circuits. In order to propagate in a specific direction, signals must cross clock zones in a precise order (1 then 2 and finally 3).

To allow the design of circuits of complex and realistic topology, the "snake clock" solution was proposed in [24]. The layout is depicted in Figure 2.B. The clock wire corresponding to phase 1 is a simple straight wire. The clock wires corresponding to phases 2 and 3 are instead twisted. They are placed under and over the magnets plane to allow the physical "virtual" twisting. The possibility of placing clock wires on different planes was also suggested in [8]. The information propagation is highlighted in Figure 2.C. Signals can propagate in both directions thanks to the wire twisting. Obviously magnets cannot be placed in the area where the two clock wires cross, otherwise they would be subjected to both clock signals of phases 2 and 3. In [14] the Comsol Multiphysics [17] simulations of these structures are reported; the confinement of the magnetic field is very good, suggesting a correct behavior of magnets under the clock influence.

In recent years many clock solutions were developed, leading to different NML technologies. Instead of a magnetic field, an STT-coupling with a current flowing through the magnets can be used [25]. In this case the magnet is a Magnetic Tunnel Junction (MTJ), the basic element used in Magnetic RAMs. This clock solution has two main advantages. It is based on Magnetic RAMs, a technology already available at commercial level [26]. Moreover the power consumption is lower than in the magnetic field case, if the number of elements in the circuit is less than 10000 [27]. A different NML technology is based on magnets with a strong magnetocrystalline anisotropy, a magnetic property that forces the magnetization perpendicularly to the plane [28]. It is still based on a magnetic field as clock, but no clock zones are required, simplifying the fabrication process [29]. Finally an ultra low power solution was developed in [30]. Magnets are deposited on a piezoelectric substrate. When an electric field is applied to the substrate, the corresponding deformation induces a mechanical stress on magnets, forcing them into the RESET state. This last solution achieves a power consumption much lower than that of ultra scaled CMOS transistors [30].

All the proposed clock solutions have their advantages and disadvantages. The solution based on the generation of a magnetic field actually requires a high current, but it has some important advantages. First it has been already experimentally demonstrated. Moreover it is a really flexible solution, because it allows to integrate different magnetic structures in one system. MTJ can be integrated to use as input and output interfaces [31]. In [15] domain walls are embedded into the circuit developing a new logic solution, called Domain Magnet Logic (DML), that reduces the overhead caused by the interconnections. An important feature of NML technology is that power consumption depends on circuits area. At this point of the state of the art we believe that the key point for reducing power is finding a solution that allows to reduce the area as much as possible, considering circuits of realistic complexity.

Fig. 3. Virtual Clock for NML circuits. A) Two clock wires are used. Virtual phases are defined by magnet sizes. B) Clock waveforms used in the simulations. C) Clock waveform suggested for experimental circuit implementation.

The Virtual Clock represents the results of our ongoing efforts to improve clocking in NML technology. It greatly improves the magnetic field clock system, leading therefore to the development of the first full-magnetic circuit. The following section fully explains the proposed structure.

## III. VIRTUAL CLOCKING

### A. The principle

Fig. 4. Simulation of an NML wire with Virtual Clock system. A) All magnets are initialized to logic '1'. A helper block [32] is used to assure a correct information propagation. B) When the magnetic field is applied to both clock wires, all magnets, except for the input, are forced into the RESET state. C) With the slow removal of the magnetic field from the first clock wire, the first four magnets start to switch. D) The four magnets in the second virtual phase switch when there is no field applied to the first wire. E) When the magnetic field is slowly removed from the second wire, the first four magnets of the second clock wire switch. F) Finally also the last four magnets reach the correct state. G) Initial wire state with all magnets initialized to logic '0'. H) Final state when a logic '0' is propagated. I) Clock signals waveform used and color legend.

The magnetic field required to force a magnet into the RESET state depends on its size, particularly on its aspect ratio. The aspect ratio is the ratio between the longer and the shorter magnet sides. The higher the aspect ratio, the higher the magnetic field required to force a RESET. The Virtual Clock system exploits this fact by creating virtual clock phases with magnets of different sizes subjected to the same clock signal. The basic concept is explained in Figure 3.A. The circuit used as an example is a simple chain of magnets. Two physically distinct clock wires (CLOCK WIRE 1 and CLOCK WIRE 2) are used to generate the clock signals. The effective number of virtual clock phases is four, two for each physical wire. In this example, magnets belonging to virtual phases 1 and 3 have dimensions 50x100x20nm<sup>3</sup>, while magnets belonging to phases 2 and 4 have dimensions 50x80x20nm<sup>3</sup>. The two CLOCK WIRES are subjected to the clock signals depicted in Figure 3.B. The clock period used in this example is equal to 4.8ns, corresponding to a frequency of 208MHz. Focusing on the signal applied to CLOCK WIRE 1, i.e. H WIRE1, the basic working principle can be understood (a detailed simulation follows in the next subsection). When the clock signal reaches the maximum value both types of magnet are in the RESET state. When this signal starts to fall, due to their different aspect ratio only the taller magnets start to switch from the RESET state. Only after the intensity of the clock signal is sufficiently reduced does the second group of smaller magnets leave the RESET state and start to switch. The signal applied to CLOCK WIRE 2, i.e. H WIRE2, is identical but delayed, so that magnets in virtual zone 3 are in the RESET state when magnets in virtual zone 2 are switching. The signals depicted in Figure 3.B are interesting for their simplicity and are used in principle for this explanation. The clock signals depicted in Figure 3.C lead to the same circuit behavior from a theoretical point of view, but they are a more robust solution. In this second solution (Figure 3.C), when one of the two clock signals is rising or falling, the other is kept stable. The advantages are an immunity to jitter noise on the clock signals, and the ability to set the duration of rise and fall time at any desired value. For example, with this second solution it is possible to increase the duration of the rise time to obtain adiabatic switching. As a result, the Virtual Clock leads to a 4 phase clock using only two clock signals. The typical sequence followed by the signal that carries the information is 1-2-3-4. Among the other advantages there is also the possibility to have feedback signals without requiring the clock wire twisting of the snake clock system [24]. Figure 3.A depicts a simple feedback signal (the set of magnets at the bottom), which travels through the correct sequence of (virtual) phases, 1-2-3-4-1-2. The information flow is defined by the order used for the aspect ratio, not by the clock wires layout.
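The release order on a single clock wire can be illustrated with a simple threshold model. The sketch below assumes a linear falling edge and uses hypothetical release thresholds (55 and 30 kA/m) and a hypothetical fall time (1.6 ns); only the 80 kA/m maximum field, the 4.8 ns period and the two magnet sizes come from the text. It is not a substitute for the micromagnetic simulations.

```python
H_MAX_KA_M = 80.0        # maximum clock field reported for the simulations
CLOCK_PERIOD_NS = 4.8    # clock period quoted in the text

# Hypothetical values, chosen only for illustration: field below which each
# magnet type leaves the RESET state, and duration of the falling edge.
RELEASE_THRESHOLD_KA_M = {"50x100x20": 55.0, "50x80x20": 30.0}
FALL_TIME_NS = 1.6

def release_time_ns(h_threshold: float,
                    h_max: float = H_MAX_KA_M,
                    fall_time_ns: float = FALL_TIME_NS) -> float:
    """Time after the start of a linear falling edge at which the field
    drops to h_threshold, i.e. when the magnet leaves RESET."""
    if not 0.0 <= h_threshold <= h_max:
        raise ValueError("threshold must lie between 0 and h_max")
    return fall_time_ns * (h_max - h_threshold) / h_max

def clock_frequency_mhz(period_ns: float) -> float:
    """Clock frequency in MHz for a period given in ns."""
    return 1e3 / period_ns

if __name__ == "__main__":
    for size, threshold in RELEASE_THRESHOLD_KA_M.items():
        print(size, "leaves RESET after", release_time_ns(threshold), "ns")
    print("frequency:", clock_frequency_mhz(CLOCK_PERIOD_NS), "MHz")
```

In this model the taller magnet, which needs the higher field to stay in RESET, is released first, and the gap between the two release times is what separates the two virtual phases.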

### B. Validation and results on a wire

Figure 4 depicts the physical simulation of a NML wire. All simulations are obtained using NMAG software [16], defining Permalloy as magnetic material and a maximum mesh size of 5nm. The maximum value of the magnetic field used is 80kA/m. The circuit is similar to the one represented in Figure 3.A but without the loop. This choice was made to keep the visualization as clear as possible. Clock signal waveforms used for this simple simulation are depicted in Figure 4.I box, along with the color (grayscale) legend. Clock waveforms used are slightly different from the one depicted in Figure 3.B. The simplification can be used in this case since the circuit is small and composed of only 4 virtual phases. Moreover the use of the signals presented in Figure 4.I, allows to better appreciate the signal flow.

The most significant simulation steps are reported in Figure 4.A-H. The corresponding time steps are also highlighted in Figure 4.I on the clock waveforms. At the beginning all magnets are initialized to logic '1' (Figure 4.A, time step 0 in Figure 4.I). At the end of the NML wire a helper block [32] is used (horizontally) to assure a correct signal propagation (it works as a sort of "termination" of the line). Figure 4.B depicts the moment when both clock wires are subject to the magnetic field, and all magnets are forced into the RESET state (time step 1 in Figure 4.I). The input magnet on the very left of the wire remains fixed as it is not subjected to any external field. Then the magnetic field of the first clock wire starts to slowly decrease, while the magnetic field of the second wire is still applied. As soon as the magnetic field decreases under a certain threshold, the first four magnets (Figure 4.C, time step 2 in Figure 4.I) switch according to the input element. The switching is abrupt even if the magnetic field is slowly removed. When the magnetic field of the first clock wire is completely removed, the four magnets in the second virtual phase also switch to a stable state (Figure 4.D, time step 3 in Figure 4.I). During this process the final eight magnets are still in the RESET state, because the magnetic field is still applied in that zone. They begin to switch only when the magnetic field starts to be removed from the second clock wire. The first four bigger magnets begin to switch (Figure 4.E, time step 4 in Figure 4.I) and then the last four magnets switch (Figure 4.F, time step 5 in Figure 4.I). Figure 4.G and Figure 4.H report instead the first and last simulation steps (time steps 0 and 5 in Figure 4.I) when the opposite information (logic '0') is propagated. The circuit behaves correctly also in this case. Simulations do not consider the influence of temperature. However, we follow the guidelines presented in [22], where the impact of thermal noise is analyzed. Following the results presented in [22] we therefore limit the number of magnets for each virtual clock phase to 4-5. Simulations highlight that the smaller magnets can drive the bigger magnets without any problem. The difference in size among magnets is small, so information propagates correctly also when magnets of different sizes are chained.

From these preliminary results some of the advantages of the Virtual Clock can already be seen. [ADV1.1] The layout of clock wires is composed of simple parallel wires. No clock wire twisting is required to allow feedback signals. To better perceive this advantage we recall the fundamental importance of feedback in sequential circuits and their problems in NML technology in [33]. [ADV1.2] Moreover, only two clock signals are required. The 2-phase clock solution proposed in [18] provides a similar advantage, however the solution proposed here has fewer limitations. For example, without the Virtual Clock the clock wire width must be equal to the maximum number of magnets that can be chained. According to [22] this number is 5 magnets, which leads to a wire width of 350nm considering magnets with a width of 50nm and a separation distance of 20nm. [ADV1.3] With the solution proposed here clock wires can have a width of 700nm, further relaxing the constraints on wire fabrication. While the simulations presented here do not consider the influence of temperature, its impact can be reduced by decreasing the number of magnets for each virtual clock phase, ideally to one element per phase. The clock wire width will then be reduced. Nonetheless, thanks to the virtual clock, the wire width is at least 2 times bigger with respect to the case without the virtual clock. [ADV2] The size, shape and position of clock phases are not limited by the shape of the clock wire. Magnets can be placed with more freedom, improving the layout and reducing the overhead caused by the interconnections.
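The wire widths quoted for [ADV1.3] follow from the magnet pitch (width plus separation). A minimal sketch of that arithmetic, with a helper name chosen here for illustration:

```python
def clock_wire_width_nm(magnets_per_phase: int,
                        phases_per_wire: int,
                        magnet_width_nm: int = 50,
                        spacing_nm: int = 20) -> int:
    """Width of a clock wire that must cover `phases_per_wire` groups of
    `magnets_per_phase` magnets, counting one spacing per magnet."""
    pitch_nm = magnet_width_nm + spacing_nm
    return magnets_per_phase * phases_per_wire * pitch_nm

if __name__ == "__main__":
    # Without Virtual Clock: one phase of 5 magnets per wire.
    print(clock_wire_width_nm(5, 1))
    # With Virtual Clock: two virtual phases of 5 magnets per wire.
    print(clock_wire_width_nm(5, 2))
    # One magnet per phase: the factor of two is preserved.
    print(clock_wire_width_nm(1, 1), clock_wire_width_nm(1, 2))
```

With the default 50nm width and 20nm separation this reproduces the 350nm and 700nm figures, and shows that the factor of two holds for any number of magnets per phase.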

Theoretically the number of virtual phases can be further increased choosing magnets with small differences in aspect ratio. However some important rules must be followed when deciding which aspect ratios to use.

- The difference between the aspect ratios of magnets belonging to two virtual phases should be high enough in order to guarantee a correct behavior also in presence of process variations. In our solution we have kept the short magnet side fixed at 50 nm, to have a more regular layout. For the longer magnet side we have chosen 100nm and 80nm. The difference between these two values is 20nm, equal to the separation distance between two magnets. According to our simulations we have chosen this value as the minimum difference among magnets of different virtual clock phases. The difference in magnetic field strength is around 20-30kA/m. As a consequence there is a clear separation between virtual clock phases, leading to a reliable information propagation.
- The aspect ratio cannot be reduced too much. In our simulations, adding a third virtual phase with magnets of 50x60x20nm<sup>3</sup>, has not produced good results. The smallest magnets were unable to reach a stable state, due to the formation of magnetization vortexes.

The two rules mentioned above realistically limit the number of virtual clock phases for each clock wire to two. It is possible to obtain a system with three virtual clock phases for each clock wire by adding bigger magnets (50x120x20nm<sup>3</sup>). The power consumption however increases by nearly 50%, since a stronger magnetic field is needed. It could be argued that a system composed only of equal smaller magnets (like 50x80x20nm<sup>3</sup>) leads to a lower power consumption, since a lower value of magnetic field is necessary. However such a system would lose all the advantages of the Virtual Clock. In particular, without the virtual clock the placement of magnets is severely constrained (see the work in [23]). Using the virtual clock system magnets can be placed with more freedom, therefore optimizing the layout with a clear performance improvement.
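The sizing rules can be checked numerically for the magnet lengths mentioned in the text. The sketch below takes the aspect ratio as long side divided by short side and encodes only the minimum 20nm length step; the function names are illustrative, and the vortex formation observed for the 60nm magnets is a micromagnetic effect that this arithmetic does not capture.

```python
from typing import List, Sequence

SHORT_SIDE_NM = 50        # short side kept fixed in the paper
MIN_LENGTH_STEP_NM = 20   # minimum difference between long sides

def aspect_ratio(long_side_nm: float,
                 short_side_nm: float = SHORT_SIDE_NM) -> float:
    """Long side divided by short side."""
    return long_side_nm / short_side_nm

def release_order(long_sides_nm: Sequence[float]) -> List[float]:
    """Lengths sorted from highest to lowest aspect ratio, i.e. the order
    in which magnets leave RESET as the clock field falls."""
    return sorted(long_sides_nm, reverse=True)

def lengths_are_separated(long_sides_nm: Sequence[float],
                          min_step_nm: float = MIN_LENGTH_STEP_NM) -> bool:
    """True if every pair of adjacent lengths differs by at least min_step_nm."""
    ordered = sorted(long_sides_nm)
    return all(b - a >= min_step_nm for a, b in zip(ordered, ordered[1:]))

if __name__ == "__main__":
    for length in (120, 100, 80, 60):
        print(length, "nm ->", aspect_ratio(length))
    print(release_order([80, 100]), lengths_are_separated([80, 100]))
```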

## IV. MAJORITY VOTER

The majority voter is the most important logic gate in QCA and NML technology. It is a logic gate with three inputs, where the value of the central element is equal to the majority of the inputs. It is a powerful gate that can also be used to implement AND/OR gates by fixing one of the inputs to '0' or '1'. However it has one main drawback: input signals must arrive at the central element at the same time. As a consequence the number of elements (magnets or QCA cells) that constitute each input wire must be the same. The NML case requires following strict constraints related to the fabrication of the clock generation network. The layout of the majority voter gate that can be used in NML technology is similar to the one depicted in Figure 5.A, but with magnets all equal in size. Clock zones are made by parallel stripes, and the inputs come from neighbor clock zones. To connect inputs to the central magnet, additional magnets must be used in the top and bottom NML wires, with respect to the central magnet. The direct consequence is that the three input wires have different lengths. As demonstrated in [34] this asymmetry leads to a majority voter that does not work correctly with some input combinations. The logic gate library is therefore limited to the AND/OR gates developed in [18]. One of many advantages of the Virtual Clock [ADV3] is that it also enables the use of majority voters, notably enriching the NML logic gate library.
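The logic function of the gate, and the way AND and OR are derived from it, can be written down directly. This is a purely logical sketch with illustrative function names; it says nothing about the magnetic implementation or its timing.

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority function on bits."""
    return (a & b) | (a & c) | (b & c)

def and_gate(a: int, b: int) -> int:
    """AND obtained by fixing one majority input to 0."""
    return majority(a, b, 0)

def or_gate(a: int, b: int) -> int:
    """OR obtained by fixing one majority input to 1."""
    return majority(a, b, 1)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
```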

Fig. 5. Majority voter with Virtual Clock. A) Gate layout. The three input wires are in the first virtual clock phase, while the other magnets are in the second virtual clock phase. Helper blocks are required to assure a correct propagation of signals. B) Simulation with inputs equal to 0-0-1. Magnets are initialized to logic '1'. C) Magnets are then forced into the RESET state. D) When the magnetic field starts to be removed, only the input wire magnets switch. E) Finally also the central and output magnets switch correctly. F) Final state with inputs equal to 0-0-0. G) Final state with inputs equal to 0-1-0. H) Final state with inputs equal to 0-1-1. I) Final state with inputs equal to 1-0-0. J) Final state with inputs equal to 1-0-1. K) Final state with inputs equal to 1-1-0. L) Final state with inputs equal to 1-1-1.

The new majority voter layout is shown in Figure 5.A. Magnets belonging to input wires have a bigger aspect ratio, therefore they are in the first virtual clock phase. The central and output magnets have a smaller aspect ratio, so they are in the second virtual clock phase. Helper blocks are necessary to allow signal propagation in vertical paths. The central magnet, the most important element of the gate, is in a different virtual phase. As a consequence, even if input signals propagate with different delays, the central magnet switches only when the magnetic field is lowered under a certain threshold. The duration of the clock fall time is chosen long enough to allow the switching of all input magnets. When the central magnet switches, all its inputs are stable and its final state is correct. This is confirmed by our simulations. Figure 5.B depicts the initial state of the majority voter when inputs are set to 0-0-1. This input combination was chosen because the simulation is particularly clear and understandable (all the cases have been tested, but the intermediate steps cannot be shown for the sake of space). All the magnets are initialized to logic '1', but the initial configuration does not impact the final result. All magnets are then forced into the RESET state (Figure 5.C). When the magnetic field starts to decrease, input elements switch accordingly, but the central and output magnets remain in the RESET state (Figure 5.D). When the magnetic field is completely removed, all the other magnets switch correctly (Figure 5.E). Figures 5.F to 5.L show the final majority voter state for the other seven possible input combinations. As these simulations demonstrate, the majority voter works correctly in every case.
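The timing argument for the Virtual Clock majority voter can be sketched with a toy model: the central magnet evaluates the majority only when it is released, and the result is guaranteed only if every input wire has already settled by then. The settle and release times below are hypothetical placeholders, and the function names are illustrative; the actual behavior is established by the simulations in Figure 5.

```python
from typing import Optional, Sequence

def majority(a: int, b: int, c: int) -> int:
    """Three-input majority function on bits."""
    return 1 if a + b + c >= 2 else 0

def central_magnet_state(inputs: Sequence[int],
                         settle_ns: Sequence[float],
                         release_ns: float) -> Optional[int]:
    """Output of the central magnet when it leaves RESET at release_ns.

    Returns None when at least one input wire is still switching at that
    time, meaning the outcome is not guaranteed in this toy model.
    """
    if len(inputs) != 3 or len(settle_ns) != 3:
        raise ValueError("exactly three inputs and three settle times expected")
    if any(t > release_ns for t in settle_ns):
        return None
    return majority(*inputs)

if __name__ == "__main__":
    unequal_delays = (0.3, 0.5, 0.3)  # hypothetical: longer top/bottom wires
    print(central_magnet_state((0, 0, 1), unequal_delays, release_ns=1.0))
    print(central_magnet_state((0, 0, 1), unequal_delays, release_ns=0.4))
```

Placing the central magnet in the second virtual phase corresponds to choosing a release time later than the slowest input, which is why unequal input wire lengths no longer matter.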

## V. ENABLING TWO LAYER CIRCUITS

To design NML (and QCA) circuits of a certain complexity it is important that wires delivering different signals can be crossed, as it happens in standard CMOS technology. The crosswire [8] in NML technology, a block that allows to cross two wires on the same plane without interferences, enables the design of circuits using only one physical layer of magnets (or cells). Nonetheless, the consequences of using just a single layer on circuits of realistic complexity cannot be tolerated: the area of circuits grows exponentially due to the overhead of the interconnections [23]. The availability of more layers could drastically reduce the area of circuits. Moreover, reduced area means also less power consumption. The necessity of routing clock wires, however, limits the adoption of multilayer circuits in in-plane NML technology. Out-of-plane NML technology already allows the design of 3D circuits, a 3D crosswire circuit was for example experimentally fabricated in [35]. Here we show that the Virtual Clock system [ADV4] easily enables the use of a multilayer circuit, basically boosting the possibility for NML to be used in realistically complex circuits.

Figure 6.A depicts the circuit structure and the Comsol Multiphysics simulation of a simple multilayer NML circuit. The purpose of this simulation is to investigate the magnetic field distribution. The structure is composed of three copper wires with a section of 280x300nm<sup>2</sup>. Each copper wire is surrounded by a ferrite yoke, used to confine the magnetic field flux lines. Saturation in the yoke is not a problem, as demonstrated in [13], because the magnetic field used is far lower than the saturation limit of the ferrite. Three layers of magnets are used. Each magnet is 50x100x20nm<sup>3</sup>, with a separation between neighboring elements of 20nm. The distance between magnet layers is 5nm. This value was chosen in order to have a better confinement of the magnetic flux lines. A current density of $1.5 \times 10^7\,\text{A/cm}^2$ is applied to the central clock wire. This current density is the same as that used in the experimental results presented in [8], where a current of 545mA flowed through a wire of section 2.6x1.4um<sup>2</sup>. The magnetic field distribution generated by the current is depicted in the detail in the figure. The magnetic field is calculated only in the area among magnets, because in the magnet volume the magnetization is present; this is due to the way the Comsol simulation engine works. The magnetic field distribution is not uniform. Near magnets belonging to the bottom layer the magnetic field is around 90kA/m. Near magnets belonging to the second and third layer the magnetic field drops to around 60kA/m.
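The correspondence between the simulated current density and the experimental conditions of [8] can be checked with simple arithmetic. The sketch below only divides the quoted current by the quoted wire cross-section; the helper name `current_density_a_per_cm2` is introduced here for illustration.

```python
def current_density_a_per_cm2(current_a: float, width_m: float, height_m: float) -> float:
    """Current density in A/cm^2 for a rectangular wire cross-section."""
    area_cm2 = (width_m * 100.0) * (height_m * 100.0)  # metres to centimetres
    return current_a / area_cm2


# Values quoted in the text for the experiment in [8]:
# 545 mA through a wire of section 2.6 um x 1.4 um.
j_ref = current_density_a_per_cm2(545e-3, 2.6e-6, 1.4e-6)

print(f"J = {j_ref:.3e} A/cm^2")
```

The result is within about one percent of the $1.5 \times 10^7\,\text{A/cm}^2$ used in the simulation.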



Fig. 6. Comsol Multiphysics simulation of two layer NML structures. A) One single plane is used for magnetic field generation. B) Clock wires are placed both over and under the magnets plane. The magnetic field value is quite uniform on all magnets.

If another level of clock wires is added on top of the last magnet layer, the magnetic field distribution changes. Results, along with the description of the structure used in the simulation, are reported in Figure 6.B. Geometrical dimensions are the same as in the previous structure. The detail of Figure 6.B shows that the magnetic field distribution is more uniform. The intensity of the magnetic field generated is around 90kA/m in the bottom and top magnet layers, and around 80kA/m in the intermediate layer. The current density used in this case is lower, only $1.4 \times 10^7\,\text{A/cm}^2$, and it is applied to the upper and lower central clock wires. In the two wires current flows in opposite directions, so that the total magnetic field is the sum of the fields generated by the two wires. Overall the magnetic field intensity in the intermediate layer increases substantially, while a lower current is used.
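One simple way to quantify the improvement in uniformity is the ratio between the weakest and the strongest field seen by the magnet layers. The sketch below uses the approximate values quoted in the text (they are read as "around" figures, not exact simulation outputs); the metric and the names `field_ka_per_m` and `uniformity` are choices made here for illustration, not part of the original analysis.

```python
# Approximate field values in kA/m, one entry per magnet layer,
# as quoted in the text for Figure 6.A and Figure 6.B.
field_ka_per_m = {
    "single_sided": [90.0, 60.0, 60.0],  # clock wires under the magnets only
    "double_sided": [90.0, 80.0, 90.0],  # clock wires under and over the magnets
}


def uniformity(values: list[float]) -> float:
    """Ratio of the minimum to the maximum field; 1.0 means perfectly uniform."""
    return min(values) / max(values)


for name, values in field_ka_per_m.items():
    print(f"{name}: min/max = {uniformity(values):.2f}")
```

With these figures the ratio rises from roughly two thirds to almost nine tenths when the second level of clock wires is added.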

The possibility of placing clock wires both under and over the magnets plane was suggested in [24] along with the snake clock, and then in [8]. However, with the snake clock solution, wires already need to be placed on two different layers to allow basic circuit functionality. An additional clock wire can be placed only in correspondence with clock phase 1 (see Figure 2.B). The 2-phase clock developed in [18] leaves enough room to place additional clock wires on top of the magnets layer. However, additional considerations must be taken into account. In the case of a two layer NML circuit, many magnets must be chained for a signal to travel from the bottom to the top layer. Considering a clock wire width equal to 4 magnets, at least 6 magnets must be chained, so the NML wire is composed in the worst case of 6 magnets. However, the maximum number of magnets that can be chained inside a clock zone is 5. The snake clock solution and the 2-phase clock solution can therefore be used in two layer structures only with a reduced clock wire width. Reducing the wire width increases the interconnection overhead and the complexity of the fabrication process.

The Virtual Clock proposed here does not suffer from any of these limitations. Clock wires can be placed above the magnets layer. Moreover, the limit of at most 5 magnets per clock zone applies to each virtual clock phase. With two or three virtual clock phases the number of magnets that can be effectively chained is therefore 10 or 15, respectively.
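The chaining argument reduces to a small piece of arithmetic, sketched below. The two constants come from the text (5 magnets per clock zone or virtual phase, and 6 chained magnets needed to change layer with a clock wire 4 magnets wide); treating a conventional clock zone as the case of a single phase is an assumption made here for comparison, and the function names are illustrative.

```python
MAX_CHAIN_PER_PHASE = 5      # from the text: magnets that can be chained per zone/phase
MAGNETS_TO_CHANGE_LAYER = 6  # from the text: worst case for a clock wire 4 magnets wide


def max_chain(virtual_phases: int) -> int:
    """Longest chain of magnets available within one clock zone."""
    return MAX_CHAIN_PER_PHASE * virtual_phases


def can_change_layer(virtual_phases: int) -> bool:
    """True if the chain is long enough to route a signal between layers."""
    return max_chain(virtual_phases) >= MAGNETS_TO_CHANGE_LAYER


# virtual_phases = 1 stands in for a conventional clock zone (assumption).
for phases in (1, 2, 3):
    print(phases, max_chain(phases), can_change_layer(phases))
```

Under these assumptions a single zone falls one magnet short of the 6 required, while two or three virtual phases leave a margin.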

## VI. MULTILAYER CROSSWIRE

To demonstrate the validity and advantages of the Virtual Clock we have designed and simulated a two layer crosswire. Magnet placement is described in Figure 7.A. Figure 7.B reports the final circuit state when both inputs are set to logic '0'. Figure 7.C and Figure 7.D report the cross-section view and the top view, respectively. Figure 7.C also reports the schematic representation of the clock wire positions. The simulations highlight the correct behavior of the circuit. We have validated the circuit with all input combinations; simulation results for the other input combinations are not reported here for space reasons. Two layer structures require two clock wires, with the current flowing simultaneously in both wires, doubling the power consumption. However, our preliminary results on two layer NML circuits [9] highlight a considerable reduction in circuit area, which largely compensates for the increase in power consumption due to the use of two clock wires.

A reliable crosswire [ADV5] is a fundamental requirement to implement NML circuits. The possibility of building a multilayer crosswire is therefore another advantage provided by the Virtual Clock solution presented here. As a final note, we do not have sufficient expertise in technology processes to give a definitive assessment of the technological feasibility of two layer NML circuits. We believe, on the basis of our knowledge, that it is possible to build multilayer NML structures. Should this prove not to be possible, the Virtual Clock system still has many advantages over the snake clock [24] and the 2-phase clock [18], even considering single layer circuits (ADV1-ADV4).

## VII. CONCLUSIONS

We have proposed and demonstrated with detailed micromagnetic simulations a new clock system for NML. The Virtual Clock offers several advantages with respect to previous solutions. It simplifies the clock structure and reduces the complexity of fabrication processes. It allows circuit layout to be optimized, reducing the interconnection overhead. It enables the use of Majority Voters without complex clock distribution structures. It favors the introduction of two layer NML circuits and specifically enables the use of multilayer crosswires. It globally reduces power consumption by substantially reducing circuit area.



Fig. 7. A) Two layer crosswire. The horizontal wire travels on the top layer to avoid the vertical wire. B) Final state of the simulation considering an input combination of 0-0. C) Cross-section view of the simulation. The position of clock wires is depicted. D) Top view of the simulation.

To better understand the impact of the Virtual Clock, two examples can be drawn from circuits proposed in the literature. In [36] a ripple carry adder, designed respecting all theoretical limitations of traditional clocking, was presented. Considering a 4-bit ripple carry adder, its size is 252x52 magnets. Each magnet is $50 \times 100 \text{ nm}^2$ with a distance among them of 20nm. The area is therefore $109\mu m^2$. Considering copper clock wires with a section of 280x300nm<sup>2</sup> and a current of 3mA, the total power consumption is $247\mu$W. In [9] we described a Logic-in-Memory circuit implemented with NanoMagnet Logic. The focus of that work was the implementation of the Logic-in-Memory circuit, and we reported as an example the detailed description of a 4-bit ripple carry adder, designed applying the Virtual Clock instead of another clocking system because of its efficiency. Here we mention the results of that case study to show the possible impact that the Virtual Clock has on a circuit. The circuit size is 55x19 magnets, corresponding to an area of $8.5\mu m^2$. The total power consumption, considering two clock wires above and under the magnets instead of one, is $69\mu$W. Applying the Virtual Clock therefore reduces the area of the circuit by a factor of about 12.8 and the power consumption by a factor of about 3.6. We believe that these results further demonstrate the advancement that the Virtual Clock mechanism represents for NML technology.
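The reduction factors follow directly from the area and power figures quoted above for the two 4-bit ripple carry adders. The short sketch below recomputes them; the dictionary layout and variable names are chosen here for illustration.

```python
# Figures quoted in the text for the two 4-bit ripple carry adders.
traditional = {"area_um2": 109.0, "power_uw": 247.0}    # design from [36]
virtual_clock = {"area_um2": 8.5, "power_uw": 69.0}     # design from [9]

area_ratio = traditional["area_um2"] / virtual_clock["area_um2"]
power_ratio = traditional["power_uw"] / virtual_clock["power_uw"]

print(f"area reduction:  {area_ratio:.1f}x")
print(f"power reduction: {power_ratio:.1f}x")
```

The power figure for the Virtual Clock design already includes the second clock wire, so the power ratio is smaller than the area ratio.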

We are now working on more complex circuit architectures based on the Virtual Clock. Our aim is twofold: to analyze and demonstrate the effectiveness of this clock solution, and to further improve NML clocking.

## REFERENCES

- [1] R.P. Cowburn and M.E. Welland. Room temperature magnetic quantum cellular automata. *Science*, 287:1466–1468, 2000.
- [2] C.S. Lent, P.D. Tougaw, W. Porod, and G.H. Bernstein. Quantum cellular automata. *Nanotechnology*, 4:49–57, 1993.
- [3] R.K. Kummamuru, A.O. Orlov, R. Ramasubramaniam, C.S. Lent, G.H. Bernstein, and G.L. Snider. Operation of a Quantum-dot Cellular Automata (QCA) shift register and analysis of errors. *IEEE Transactions on Electron Devices*, 50:1906, 2003.
- [4] S. Rajaram, D. K. Karunaratne, S. Sarkar, and S. Bhanja. Study of dipolar neighbor interaction on magnetization states of nano-magnetic disks. *IEEE Transactions on Magnetics*, 49(7):3129–3132, 2013.
- [5] U. Lu and C.S. Lent. Theoretical Study of Molecular Quantum-Dot Cellular Automata. *Journal of Computational Electronics - Springer*, 4:115–118, 2005.
- [6] A. Pulimeno, M. Graziano, A. Saginario, V. Cauda, D. Demarchi, and G. Piccinini. Bis-ferrocene molecular QCA wire: ab-initio simulations of fabrication driven fault tolerance. *IEEE Transactions on Nanotechnology*, 12(4):498–507, May 2013.
- [7] M. Graziano, A. Pulimeno, R. Wang, X. Wei, M.R. Roch, and G. Piccinini. Process variability and electrostatic analysis of molecular qca. *ACM Journal on Emerging Technologies in Computing Systems*, 12(2), 2015.
- [8] M. Niemier et al. Nanomagnet logic: progress toward system-level integration. *Journal of Physics: Condensed Matter*, 23:34, November 2011.
- [9] M. Cofano, G. Santoro, M. Vacca, D. Pala, G. Causapruno, F. Cairo, F. Riente, G. Turvani, M. R. Roch, M. Graziano, and M. Zamboni. Logic-in-memory: A nano magnet logic implementation. *ISVLSI*, pages 286–291, 2015.
- [10] M.T. Niemier, X.S. Hu, M. Alam, G. Bernstein, M. Putney, W. Porod, and J. DeAngelis. Clocking Structures and Power Analysis for nanomagnet-Based Logic Devices. In *International Symposium on Low Power Electronics and Design*, pages 26–31, Portland-Oregon, USA, 2007. IEEE.
- [11] A. Imre, L. Ji, G. Csaba, A.O. Orlov, G.H. Bernstein, and W. Porod. Magnetic Logic Devices Based on Field-Coupled Nanomagnets. 2005 International Semiconductor Device Research Symposium, page 25, December 2005.

- [12] J. Wang, M. Vacca, M. Graziano, and M. Zamboni. Biosequences analysis on NanoMagnet Logic. *International Conference on IC Design and Technology*, pages 131–134, May 2013.
- [13] M. T. Alam, M. J. Siddiq, G. H. Bernstein, M. Niemier, W. Porod, and X. S. Hu. On-chip clocking for nanomagnet logic devices. *IEEE Transactions on Nanotechnology*, 9(3):348–351, 2010.
- [14] M. Vacca, M. Graziano, and M. Zamboni. Nanomagnetic Logic Microprocessor: Hierarchical Power Model. *IEEE Transactions on VLSI Systems*, 21(8):1410–1420, August 2012.
- [15] F. Cairo, M. Vacca, M. Graziano, and M. Zamboni. Domain magnet logic (dml): A new approach to magnetic circuits. In *14th IEEE International Conference on Nanotechnology*, pages 956–961, 2014.
- [16] T. Fischbacher, M. Franchin, G. Bordignon, and H. Fangohr. A Systematic Approach to Multiphysics Extensions of Finite-Element-Based Micromagnetic Simulations: Nmag. *IEEE Transactions on Magnetics*, 43(6):Available on–line, 2007.
- [17] Comsol Multiphysics. http://www.comsol.com/.
- [18] M.T. Niemier, E. Varga, G.H. Bernstein, W. Porod, M.T. Alam, A. Dingler, A. Orlov, and X.S. Hu. Shape Engineering for Controlled Switching With Nanomagnet Logic. *IEEE Transactions on Nanotechnology*, 11(2):220–230, March 2012.
- [19] Bin Zhang, Xiaokuo Yang, Zhichun Wang, and Mingliang Zhang. Innovative orderly programmable in-plane majority gates using trapezoid shape nanomagnet logic devices. *Micro Nano Letters, IET*, 9(5):359–362, May 2014.
- [20] M. Alam, G.H. Bernstein, J. Bokor, D. Carlton, X.S. Hu, S. Kurtz, B. Lambson, M.T. Niemier, W. Porod, M. Siddiq, and E. Varga. Experimental progress of and prospects for nanomagnet logic (nml). In *Silicon Nanoelectronics Workshop (SNW), 2010*, pages 1–2, June 2010.
- [21] E. Varga, M. Siddiq, M.T. Niemier, M.T. Alam, G.H. Bernstein, W. Porod, X.S. Hu, and A. Orlov. Experimental demonstration of non-majority, nanomagnet logic gates. In *Device Research Conference (DRC), 2010*, pages 87–88, June 2010.
- [22] G. Csaba and W. Porod. Behavior of Nanomagnet Logic in the Presence of Thermal Noise. In *International Workshop on Computational Electronics*, pages 1–4, Pisa, Italy, 2010. IEEE.
- [23] M. Awais, M. Vacca, M. Graziano, and G. Masera. Quantum dot Cellular Automata Check Node Implementation for LDPC Decoders. *IEEE Transactions on Nanotechnology*, 12(3):368–377, 2013.
- [24] M. Graziano, M. Vacca, A. Chiolerio, and M. Zamboni. A NCL-HDL Snake-Clock Based Magnetic QCA Architecture. *IEEE Transactions on Nanotechnology*, 10(5):1141–1149, September 2011.
- [25] J. Das, S.M. Alam, and S. Bhanja. Ultra-low power hybrid CMOS-magnetic logic architecture. *IEEE Transaction on Computer And Systems*, 2011.
- [26] J. Das, S. M. Alam, and S. Bhanja. Nano magnetic stt-logic partitioning for optimum performance. *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, 22(1):90–98, 2014.
- [27] J. Das, S.M. Alam, and S. Bhanja. Low Power Magnetic Quantum Cellular Automata Realization Using Magnetic Multi-Layer Structures. *Journal on Emerging and Selected Topics in Circuits and Systems*, 1(3):267–276, September 2011.
- [28] N. Rizos, M. Omar, P. Lugli, G. Csaba, M. Becherer, and D. Schmitt-Landsiedel. Clocking Schemes for Field Coupled Devices from Magnetic Multilayers. In *International Workshop on Computational Electronics*, pages 1–4, Beijing, China, 2009. IEEE.
- [29] X. Ju, M.T. Niemier, M. Becherer, M. Putney, W. Porod, P. Lugli, and G. Csaba. Systolic Pattern Matching Hardware With Out-of-Plane Nanomagnet Logic Devices. *IEEE Transactions on Nanotechnology*, 12(3), May 2013.
- [30] M. Vacca, M. Graziano, L. Di Crescenzo, A. Chiolerio, A. Lamberti, D. Balma, G. Canavese, F. Celegato, E. Enrico, P. Tiberto, L. Boarino, and M. Zamboni. Magnetoelastic clock system for nanomagnet logic. *IEEE Transactions on Nanotechnology*, 13(5), September 2014.
- [31] M.A. Siddiq, M.T. Niemier, G. Csaba, A.O. Orlov, X.S. Hu, W. Porod, and G.H. Bernstein. A Nanomagnet Logic Field-Coupled Electrical Input. *IEEE Transactions on Nanotechnology*, 12(5), September 2013.
- [32] D.B. Carlton, N.C. Emley, E. Tuchfeld, and J. Bokor. Simulation Studies of Nanomagnet-Based Logic Architecture. *Nano Letters*, 8(12):4173–4178, November 2008.
- [33] M. Graziano, M. Zamboni, M. Vacca, and J. Wang. Feedbacks in QCA: a quantitative approach. *IEEE Transaction on VLSI computing*, 23(10), 2015.
- [34] M. Vacca, M. Graziano, and M. Zamboni. Majority Voter Full Characterization for Nanomagnet Logic Circuits. *IEEE Transactions on Nanotechnology*, 11(5):940–947, September 2012.

- [35] Irina Eichwald, Stephan Breitkreutz, Josef Kiermaier, Gyorgy Csaba, Doris Schmitt-Landsiedel, and Markus Becherer. Signal crossing in perpendicular nanomagnetic logic. *Journal of Applied Physics*, 115, 2014.
- [36] M. Vacca, S. Frache, M. Graziano, and M. Zamboni. ToPoliNano: A synthesis and simulation tool for NML circuits. *IEEE International Conference on Nanotechnology*, pages 1–6, August 2012.