Abstract. In this paper, we show that perpendicular Nanomagnetic Logic (pNML) is particularly well suited to realizing threshold logic gate (TLG)-based circuits. As an example, a 1-bit full adder circuit using a novel 5-input majority gate based on TLGs is experimentally demonstrated. The theory of pNML and its extension by TLGs is introduced, illustrating the benefit of pNML. Majority gates based on coupling field superposition allow each input to be weighted by its geometry and distance to the output. Only 5 magnets, combined in two logic gates with a footprint of 1.95 μm² and powered by a perpendicular clocking field, are required for operation. MFM and magneto-optical measurements demonstrate the functionality of the fabricated structure. The experimental results substantiate the feasibility and the benefits of combining threshold logic with pNML.
Introduction
Nanomagnetic Logic (NML) is an emerging information processing technology that uses the interaction of bistable magnets to perform logic operations [1, 2]. Low-power switching and high-density integration of interconnect-free, non-volatile magnets facilitate energy-efficient and area-saving integration of combined logic and memory devices in pure NML or hybrid CMOS/NML circuitry [3, 4]. Perpendicular NML (pNML) uses CoPt or CoNi nanomagnets with perpendicular magnetic anisotropy (PMA). It benefits from flexible geometries and shape-independent anisotropy, which is tuned by focused ion beam (FIB) irradiation [5]. To this end, so-called artificial nucleation centers (ANCs) are fabricated by partial FIB irradiation at user-defined positions and provide directed signal flow in chains [6–8] and gates [9] of field-coupled nanomagnets. Fig. 1 shows the vision of a fully integrated pNML system. Information is processed by complex circuits providing non-volatile logic operation using majority gates and inverters connected by wires [10]. Crossing of magnetic signals is achieved by detouring through additional functional layers [11], and logic gates can be programmed during runtime [12]. Current wires [13] and spin transfer torque (STT) devices [14] are envisioned as I/O elements for integration with electrical circuitry. An integrated on-chip coil generates perpendicular magnetic fields, which operate as both power supply and clock generator [15].
Furthermore, perpendicular NML is highly suitable for realizing threshold logic gate (TLG)-based architectures [16]. The working principle of majority gates based on fringing field superposition offers the possibility to weight each input by its size and distance to the ANC. In this paper, we experimentally demonstrate, as an example, a TLG-based full adder circuit using a novel 5-input majority gate.
Figure 1. Vision of a perpendicular NML system using field-coupled magnets for logic computation, electrical I/O elements for CMOS integration and an on-chip coil as power supply.
e-mail: stephan.breitkreutz@tum.de
Theory of perpendicular NML
To realize logic operations, the switching process of an output magnet needs to be controlled by the coupling fields of its surrounding input magnets. The reversal process of each nanomagnet with PMA is governed by domain wall (DW) nucleation and propagation [17]. In pNML devices, the DW nucleation at the ANC of the output is supported or hindered by the input coupling fields [9]. Fig. 2 shows the basic principle of pNML. The central magnet with magnetization $M_z$ is partially irradiated on the left side, and its switching field is reduced to $H_c$. Due to the location of the ANC, only the short-ranged coupling field $C$ of the left neighbor $M_1$ influences the switching process of the central magnet. The antiferromagnetic coupling field superposes with the applied perpendicular field $H_{ext}$ and therefore shifts the hysteresis of the central magnet to the left or right depending on the magnetization state of $M_1$, but independent of $M_2$ [6].
An alternating clocking field with adequate amplitude $H_{clock} = H_c$ will force the central magnet to switch into the state antiparallel to $M_1$. Similarly, $M_2$ will be ordered antiparallel to $M_z$. The stepwise antiparallel ordering enables directed signal flow in a chain of magnets and constitutes the basis for logic operations in pNML circuitry [7, 8].
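The antiparallel ordering described above can be illustrated with a minimal deterministic sketch. It assumes a single input magnet, ignores thermal noise and switching field distributions, and borrows illustrative values ($H_c$ = 620 Oe, $C$ = 35 Oe) from the experiment reported later in this paper. The function names and the switching rule (the output flips only if the total field at the ANC opposes its state and reaches $H_c$) are assumptions made for illustration, not the authors' model.

```python
def apply_pulse(m_out, m_in, h_pulse, h_c=620.0, c=35.0):
    """Return the output state (+1/-1) after one field pulse (toy model)."""
    # Antiferromagnetic coupling: the input's field at the ANC opposes m_in.
    h_total = h_pulse - c * m_in
    direction = 1 if h_total > 0 else -1
    # Switch only if the field opposes the current state and reaches H_c.
    if direction != m_out and abs(h_total) >= h_c:
        return direction
    return m_out


def clock_cycle(m_out, m_in, h_clock=620.0):
    """Apply one alternating clocking cycle: a negative then a positive pulse."""
    for h_pulse in (-h_clock, +h_clock):
        m_out = apply_pulse(m_out, m_in, h_pulse)
    return m_out


if __name__ == "__main__":
    for m_in in (-1, 1):
        for m_start in (-1, 1):
            print(m_in, m_start, clock_cycle(m_start, m_in))
```

In this toy model the output ends antiparallel to the input after one full cycle regardless of its starting state, because only the pulse supported by the coupling field reaches $H_c$.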
Due to switching field distributions (SFDs) caused by thermal noise and by fabrication variations of the ANC from dot to dot, the residual clocking window for complex NML circuits is noticeably reduced [15, 18–20]. Hence, it is essential to precisely control the location and the strength of the ANC through the size and dose of the partial FIB irradiation [5, 9].
TLG-based pNML
In general, threshold logic gates weight each binary input and compare the weighted sum to a threshold to define the binary output [21]. Adapted to pNML, where magnets can only take the binary (magnetization) states up (+1 = logic 1) or down (−1 = logic 0), the function $F$ of a threshold gate is given by
$$F = \operatorname{sgn}\left(\sum_i w_i x_i\right)$$
with $w_i$ as weight and $x_i \in \{-1, 1\}$ as binary state of the input $i$ [16]. In a majority gate, the superposed coupling fields of all input magnets at the ANC of the output determine whether it switches. The coupling field $C$ of a magnet approximately scales as
$$C \propto \frac{A^{n_1}}{d^{n_2}}$$
with the input size $A$ and the distance $d$ to the magnet. $n_1$ and $n_2$ are geometry-dependent exponents in the range of $0 < n_1 \le 1$ and $1 < n_2 \le 3$. Thus, pNML majority gates offer the possibility to weight each input by its geometry and distance to the output ANC. According to [16], the magnetization $m_{out}$ of an output magnet after clocking is
$$m_{out} = -\operatorname{sgn}\left(\sum_i C_i m_i\right)$$
with $m_i \in \{-1, 1\}$ as normalized magnetization state $M_z/M_s$ and $C_i$ as effective coupling field of the input $i$.
Fig. 3a shows the schematic of a TLG-based full adder circuit as proposed by [21] and adapted for pNML by [16]. The inverted carry-out $C_{out}$ is defined by a 3-input majority gate with equal weights. The inverted sum $S$ is given by a 4-input TLG in which the input $C_{out}$ is double-weighted compared to $A$, $B$ and $C_{in}$. However, in semiconductor technology, it is more common to fabricate structures with equal sizes and distances. Consequently, $S$ can be computed by a 5-input majority gate where two inputs are set by $C_{out}$ and the other three by $A$, $B$ and $C_{in}$, respectively.
Fig. 3b shows the layout of the TLG-based full adder structure using a 3-input and a 5-input gate to compute the outputs $C_{out}$ and $S$. The size and the distance (and therefore the weight) of each input structure are equal within each gate, but $C_{out}$ defines two inputs in the 5-input gate. During clocking, $C_{out}$ is set antiparallel to the majority of the input magnets $A$, $B$ and $C_{in}$ by the first clocking pulse in the 3-input gate, according to the truth table (Fig. 3c). Once $C_{out}$ is set, $S$ is defined (switched or not) in the 5-input gate by the second clocking pulse.
The input magnets $B$ and $C_{in}$ can be contacted by crossing elements [11] or MTJ/GMR structures. Remarkably, the whole circuit consists of only 5 magnets, which is the minimum possible number to realize a structure with 3 inputs and 2 outputs.
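The logic of the two-gate adder can be checked with a short sketch. It assumes equal unit weights within each gate, models antiparallel ordering as a negated sign of the summed input states, and treats both output magnets as holding inverted values, as stated in the text. The helper names (`sgn`, `tlg_full_adder`) are hypothetical and introduced only for this illustration.

```python
from itertools import product


def sgn(value):
    """Sign function for non-zero sums (the sums below are always odd)."""
    return 1 if value > 0 else -1


def tlg_full_adder(a, b, c_in):
    """Return (carry, sum) bits computed via the two majority gates."""
    spins = [1 if bit else -1 for bit in (a, b, c_in)]
    # 3-input majority gate: the C_out magnet orders antiparallel
    # to the majority of A, B and C_in (inverted carry-out).
    c_out_magnet = -sgn(sum(spins))
    # 5-input majority gate: the C_out magnet drives two of the five arms,
    # which doubles its weight relative to A, B and C_in (inverted sum).
    s_magnet = -sgn(sum(spins) + 2 * c_out_magnet)
    # Undo the inversion of both outputs and map +1/-1 back to 1/0.
    carry = 1 if -c_out_magnet > 0 else 0
    total = 1 if -s_magnet > 0 else 0
    return carry, total


if __name__ == "__main__":
    for a, b, c_in in product((0, 1), repeat=3):
        print(a, b, c_in, tlg_full_adder(a, b, c_in))
```

For every input combination the result agrees with binary addition of the three bits, which is why a double-weighted (or doubly connected) carry magnet is sufficient to obtain the sum.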
Experiment
The Ta(1 nm)/Pt(3 nm)/[Co(0.8 nm)/Pt(1 nm)]×4/Pt(3 nm) multilayer stack is magnetron sputtered on a thermally oxidized Si(100) wafer. The Pt seed layer enforces the PMA of the Co, and the top layer protects the Co from oxidation. The multilayer film is structured by FIB lithography and ion beam etching using an evaporated 5 nm Ti hard mask. Subsequently, the output magnets $C_{out}$ and $S$ are partially irradiated on an area of 20 nm × 20 nm with a dose of $2 \times 10^{13}$ ions/cm² using a 50 kV Ga⁺ FIB system. Fig. 4a shows an SEM image of the fabricated full adder. Its size is 1.5 μm × 1.3 μm, the width of the magnets is 100 nm and the gap between inputs and outputs is ≈ 20–30 nm. Fig. 4b shows the corresponding MFM phase image for the 011 input configuration ($A$, $B$, $C_{in}$) after clocking.
Due to the lowered anisotropy at the ANC, the partially irradiated magnets $C_{out}$ and $S$ have a mean switching field of ≈ 620 Oe, which is also the amplitude $H_{clock}$ of the clocking field. The input magnets $A$, $B$ and $C_{in}$ are not irradiated and therefore not affected by the clocking pulses. Hence, each input configuration can be set by external fields prior to the clocking phase of the experiment. In an operating circuit, the inputs would instead be set by preceding gates or input structures. In the initial state, the input magnets $A$, $B$ and $C_{in}$ are set to the desired configuration, and the outputs $C_{out}$ and $S$ are set to the up state by a 1 kOe pulse. Afterwards, $C_{out}$ and $S$ are sequentially ordered by two alternating clocking pulses with the amplitude $H_{clock} = \mp 620$ Oe.
Fig. 5 shows the clocking scheme utilized in the experiment. The input configuration $A$, $B$, $C_{in}$ is set by external fields, and the outputs $C_{out}$ and $S$ are set upwards by an initial field pulse with 1 kOe amplitude (Init, time $t_0$). Afterwards, the outputs are computed by two opposing clocking pulses with $H_{clock} = \mp 620$ Oe. The first pulse (Pulse 1, $t_1$) sets $C_{out}$ to the designated state. The second pulse (Pulse 2, $t_2$) switches $S$ back to the correct state.
Fig. 6 shows the measurement results for all possible input configurations of $A$, $B$ and $C_{in}$. The outputs $C_{out}$ and $S$ are sequentially ordered by the clocking pulses. Note that, depending on the input configuration, $C_{out}$ and/or $S$ may already be set correctly by the init pulse. In that case, the structure reaches its final state earlier and is not changed by any further pulses. However, correct ordering of the outputs is only guaranteed after one complete clocking cycle.
In some cases, the amplitude of the second pulse had to be reduced to 600 Oe to avoid undesired switching of $S$. Due to the influence of thermal noise, a magnet has a given probability of switching during an applied field pulse, depending on the pulse amplitude and length. The statistical switching behavior is described by a probability density function (PDF) [15, 18]. As described in the section "Theory of perpendicular NML", the coupling fields of the input magnets superpose with the field pulse and therefore shift the PDF up or down. For reliable computation, those shifted PDFs have to be separated so that the magnet's switching is clearly determined [7]. Fig. 7 shows the measured PDFs of $C_{out}$ for the 000 and the 111 input configuration. The means of those two PDFs are separated by 210 Oe, which is 6 times the coupling of each input of the 3-input gate: $C_3 = 35$ Oe. Accordingly, we measured a coupling field of $C_5 = 27$ Oe for each input of the 5-input gate, which is not sufficient to separate the different PDFs. Additionally, the mean switching fields of the two output magnets, $H_{c,C_{out}} = 630$ Oe and $H_{c,S} = 620$ Oe, are slightly different, which further decreases the clocking window.
Figure 7. Measured probability densities of $C_{out}$ for the 000 (inputs down) and the 111 input configuration (inputs up), separated by 6 times $C_3$ (input coupling of the 3-input gate).
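The factor of 6 and the tighter margin of the 5-input gate follow from simple superposition, as the sketch below illustrates. It assumes equal coupling per input arm, takes the measured values $C_3 = 35$ Oe and $C_5 = 27$ Oe from the text, and again models the $C_{out}$ magnet as antiparallel to the majority of the inputs. The function names are hypothetical and the calculation considers mean fields only, not the width of the PDFs.

```python
from itertools import product


def sgn(value):
    """Sign function for non-zero sums."""
    return 1 if value > 0 else -1


def net_coupling_levels(n_inputs, c_per_input):
    """All net coupling fields (Oe) for n equal-weight inputs in states +1/-1."""
    # With k inputs up and n - k down, the summed state is 2k - n.
    return [c_per_input * (2 * k - n_inputs) for k in range(n_inputs + 1)]


def adder_levels_5(c_per_input):
    """Net coupling levels the 5-input gate actually sees in the full adder."""
    levels = set()
    for spins in product((-1, 1), repeat=3):
        total = sum(spins)
        c_out_magnet = -sgn(total)  # assumed antiparallel to the majority
        levels.add(c_per_input * (total + 2 * c_out_magnet))
    return sorted(levels)


C3, C5 = 35.0, 27.0
levels_3 = net_coupling_levels(3, C3)
span_3 = levels_3[-1] - levels_3[0]  # separation between 000 and 111

if __name__ == "__main__":
    print(levels_3, span_3)
    print(net_coupling_levels(5, C5))
    print(adder_levels_5(C5))
```

Under these assumptions the 000 and 111 levels of the 3-input gate differ by $6 C_3 = 210$ Oe, while the 5-input gate in the adder only ever sees a net field of $\pm C_5$, so the two PDFs it must distinguish are only $2 C_5 = 54$ Oe apart.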
Therefore, considerable effort should be made to increase the coupling and to decrease the SFDs, both of which can be engineered by enhancing the material system and improving the fabrication technology. Both measures have great potential to increase the reliability of pNML circuitry. For instance, the coupling field can be substantially increased by enhancing the total magnetic moment or by reducing the distance between the inputs and the ANC according to Eq. 3. Simulations using compact modeling [20] show that the coupling of each input has to be increased to $C_x = 60$ Oe to reduce the error rate to $e < 10^{-3}$. Consequently, distances between inputs and ANCs have to be reduced to $d < 20$ nm to provide definite and reliable computation using TLG-based pNML.
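A rough estimate of the required distance can be obtained from the distance scaling of the coupling field stated earlier, with an exponent $1 < n_2 \le 3$. The sketch below is only an order-of-magnitude check under loudly stated assumptions: only the distance changes, the present distance is taken as 25 nm (the midpoint of the quoted 20–30 nm gap, used here as a stand-in for the distance to the ANC), and the exponents tried are arbitrary samples from the stated range. The function name is hypothetical.

```python
def required_distance(d_now, c_now, c_target, n2):
    """Distance that yields c_target if C scales as d**(-n2), all else fixed."""
    return d_now * (c_now / c_target) ** (1.0 / n2)


if __name__ == "__main__":
    # Assumed starting point: 25 nm distance, measured C_5 = 27 Oe, target 60 Oe.
    for n2 in (1.5, 2.0, 3.0):
        print(n2, round(required_distance(25.0, 27.0, 60.0, n2), 1))
```

For all sampled exponents the estimated distance falls below 20 nm, which is consistent with the requirement stated above, although the result depends directly on the assumed starting distance and exponent.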
Conclusion
Perpendicular NML is highly suitable for implementing threshold logic based circuits. Majority gates offer the possibility to weight each input by its coupling field, which is defined by the input's geometry and distance to the ANC. Hence, a straightforward and efficient way to realize TLG-based circuits is combined with the advantages of pNML, e.g. non-volatility and low-power computing.
In this paper, we experimentally demonstrate a TLG-based 1-bit full adder circuit using a novel 5-input majority gate. Here, the geometry and the distance of each input in the gate are kept constant, but one input signal is connected to two physical input arms to double its influence.
Compared to former implementations of a full adder [10], the device footprint is reduced by 88.5 % (from 17 μm² to 1.95 μm²) and the maximum device speed is increased by 200 % (smaller magnets and fewer clocking cycles).
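The quoted footprint figures can be verified with a few lines of arithmetic. The sketch uses only numbers given in this paper (device size 1.5 μm × 1.3 μm and the former footprint of 17 μm²); the variable names are illustrative.

```python
# Device dimensions from the SEM image and the former footprint from [10].
width_um, height_um = 1.5, 1.3
former_footprint_um2 = 17.0

footprint_um2 = width_um * height_um
reduction_percent = (former_footprint_um2 - footprint_um2) / former_footprint_um2 * 100

# A speed increase by 200 % corresponds to three times the former speed.
speed_factor = 1 + 200 / 100

if __name__ == "__main__":
    print(round(footprint_um2, 2), round(reduction_percent, 1), speed_factor)
```

The product of the two dimensions reproduces the 1.95 μm² footprint, and the relative reduction rounds to the stated 88.5 %.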
