Multi-frequency Data Parallel Spin Wave Logic Gates by Mahmoud, Abdulqader et al.
Multi-frequency Data Parallel Spin Wave Logic Gates
Abdulqader Mahmoud,1, a) Frederic Vanderveken,2, 3 Christoph Adelmann,3 Florin
Ciubotaru,3 Said Hamdioui,1 and Sorin Cotofana1, b)
1)Delft University of Technology, Department of Quantum and Computer
Engineering, 2628 CD Delft, The Netherlands
2)KU Leuven, Department of Materials, SIEM, 3001 Leuven,
Belgium
3)Imec, 3001 Leuven, Belgium
By their very nature, Spin Waves (SWs) with different frequencies can propagate
through the same waveguide without affecting each other, while only interfering with
their own species. Therefore, multiple SW encoded data sets can coexist, propagate,
and interact in parallel, which opens the road towards hardware replication free
parallel data processing. In this paper, we take advantage of these features and
propose a novel data parallel spin wave based computing approach. To explain and
validate the proposed concept, byte-wide 2-input XOR and 3-input Majority gates
are implemented and validated by means of Object Oriented MicroMagnetic Framework
(OOMMF) simulations. Furthermore, we introduce an optimization algorithm
meant to minimize the area overhead associated with multi-frequency operation and
demonstrate that it diminishes the byte-wide gate area by 30% and 41% for XOR
and Majority implementations, respectively. To gain insight into the practical
implications of our proposal, we compare the byte-wide gates with conventional functionally
equivalent scalar SW gate based implementations in terms of area, delay, and power
consumption. Our results indicate that the area optimized 8-bit 2-input XOR and
3-input Majority gates require 4.47x and 4.16x less area, respectively, at the expense
of a 5% and 7% delay increase, respectively, without inducing any power consumption
overhead. Finally, we discuss the factors that limit the currently achievable
parallelism to 8 for phase based gate output detection and demonstrate by means
of OOMMF simulations that this can be increased to 16 for threshold based
detection.
a)Electronic mail: a.n.n.mahmoud@tudelft.nl
b)Electronic mail: S.D.Cotofana@tudelft.nl
arXiv:2008.12220v1 [physics.app-ph] 27 Aug 2020
I. INTRODUCTION
The amount of raw data has rapidly increased in the last few decades due to the
unprecedented growth of information technology. These data are usually processed on high
efficiency CMOS technology based computing platforms1–3 and, as the amount of raw data
increased, technology feature size has been shrunk to keep up with the computation power
demands. However, when entering the deca-nanometer regime, CMOS downscaling becomes more
difficult due to: (i) the leakage wall4,5, (ii) the reliability wall6, and (iii) the cost wall4,6,
which suggests that the end of Moore’s law is near. As a result, different technologies, e.g.,
graphene7–11, memristor12–16, and spintronics17–21, have been explored in an attempt to meet
the exponentially increasing computing market demands22.
While each of these alternative technologies exhibits both strong and weak points,
spintronics in its Spin Wave (SW) flavour seems to have great potential to meet market needs22
due to its: (i) ultra-low power consumption, as no charge movements are required in order
to perform calculations, (ii) acceptable delay, (iii) scalability down to the nm range, and (iv)
natural support for data parallelism, enabled by the fact that SWs of different frequencies can
coexist and selectively interact within the same waveguide.
In view of this, different logic gates built on spin wave technology were presented, e.g.,23–42,
and in the sequel we briefly present some of them. A current controlled Mach-Zehnder
interferometer based NOT gate was the first experimentally demonstrated SW logic gate23
and, by making use of a similar method, other logic gates including XNOR, NAND, and
NOR were realized24–26. NOT, OR, and AND gates were designed using three terminal
devices with transmission lines27–30 and voltage-controlled XNOR and NAND gates utilizing
re-configurable nano-channel magnonic devices were suggested31. In addition, an XOR gate
was proposed by embedding magnon transistors between the Mach-Zehnder interferometer
arms32. By relying on another information encoding method, i.e., on SW phase rather
than on SW amplitude as is the case for the previously mentioned schemes, buffer, NOT,
(N)AND, (N)OR, XOR, and Majority gates were introduced in33. Moreover, alternative
Majority gate designs were suggested to decrease the SW back propagation and increase the
SW transmission efficiency34–36. OR and NOR gates were designed using cross structures37
and physically implemented Majority gates were reported in38–41.
All the previously mentioned designs operate on same frequency SWs, i.e., on 1-bit inputs,
therefore, if multiple-bit input functions are to be evaluated, e.g., bitwise XOR over two
n-bit inputs A = (a1, a2, . . . , an) and B = (b1, b2, . . . , bn), an XOR gate structure must be
replicated n times in order to process the n input bit-pairs (sets) in parallel at the expense of
area overhead. However, different frequency SWs can simultaneously propagate through the
same waveguide without affecting each other, while only interfering with their own species.
This suggests that if each input pair (ai, bi) is encoded with fi frequency SWs, XOR(A,B)
can potentially be evaluated with one instead of n XOR gates. This approach has been
pursued in42, which introduces a Majority gate structure able to simultaneously process 3
data sets encoded at 3 different frequencies. However, the suggested structure makes use of
bent regions, which have detrimental effects on SW propagation, and contains a magnonic
crystal that induces a large delay overhead.
In this paper we revisit the SW parallelism concept and propose a novel multi-frequency
data parallel in-line generic SW gate structure. Our contributions can be summarized as
follows:
• Generic multi-frequency data parallel in-line SW gate structure and an associated area
optimization algorithm.
• Design and validation of 8-bit data parallel in-line Spin Wave logic gates: 8-bit 3-input
Majority and 2-input XOR gates are instantiated and validated by means of Object
Oriented MicroMagnetic Framework (OOMMF) simulations.
• Performance assessment and comparison with SW state-of-the-art: The proposed 8-bit
2-input XOR and 3-input Majority gates require 4.47x and 4.16x less area, respectively,
when compared with functionally equivalent scalar SW gate based implementations,
at the expense of a 5% and 7% delay penalty, respectively, and no power consumption
overhead.
• Parallelism limit study: Demonstrate by means of OOMMF simulation that the max-
imum currently achievable parallelism, i.e., the number of different SW frequencies, is
8 for phase based output detection and 16 when spin wave magnetization is utilized
to detect the gate output.
• Design and OOMMF validation of a 16-bit data parallel in-line Spin Wave 2-input
XOR gate.
The remainder of the paper is organized as follows. Section II briefly explains the SW
physics fundamentals and the associated computing paradigm. Section III describes the
proposed n-bit data parallel SW logic gate and introduces the associated area optimization
algorithm. Section IV provides insight into the utilized simulation platform and parameters,
and presents simulation experiments related to the validation of the 8-bit 3-input Majority
and 2-input XOR gates. Section V presents evaluation results for the two byte-wide parallel
gates and a comparison with functionally equivalent scalar implementations. In addition, it
discusses fan-in and geometric scalability, maximum achievable parallelism, and
variability and thermal noise effects. Section VI concludes the paper.
II. SW BASED COMPUTING BACKGROUND
When a ferromagnetic material is exposed to an external magnetic field, the electron spins
align with the applied magnetic field direction in order to bring the total system
energy to the lowest possible level43. Further, if the electron spins are deflected by an
excitation method, e.g., by means of a Magnetoelectric (ME) cell or an antenna, a Spin Wave
(SW) is created, mainly due to exchange and dipole spin interactions. The precessional electron
spin movement43 can be described by the Landau-Lifshitz-Gilbert (LLG) relation44,45 as
follows:
$$\frac{d\vec{m}}{dt} = -|\gamma|\mu_0 \left(\vec{m} \times \vec{H}_{eff}\right) + \alpha \left(\vec{m} \times \frac{d\vec{m}}{dt}\right), \tag{1}$$
where γ is the gyromagnetic ratio, µ0 the vacuum permeability, α the damping factor, m
the magnetization, and Heff the effective field, which is expressed as:
Heff = Hext +Hex +Hdemag +Hani, (2)
where Hext is the external field, Hex the exchange field, Hdemag the demagnetizing field, and
Hani the magneto-crystalline anisotropy.
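As an illustration of Eq. (1), the following minimal sketch evaluates its right-hand side for a single magnetization vector. It relies on the standard rearrangement of the implicit Gilbert form into the explicit Landau-Lifshitz form, dm/dt = −(|γ|µ0/(1+α²)) [m × Heff + α m × (m × Heff)], which holds for a unit-length m. The function name llg_rhs, the numerical value of |γ|, the field strength, and the tilt angle are assumptions chosen for illustration only; this is not a description of how OOMMF solves the equation.

```python
import math

GAMMA = 1.760859e11    # |gamma| for the electron in rad/(s*T) (assumed value)
MU0 = 4e-7 * math.pi   # vacuum permeability in T*m/A


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def llg_rhs(m, h_eff, alpha):
    """Explicit dm/dt (1/s) for a unit vector m and an effective field h_eff in A/m."""
    prefactor = -GAMMA * MU0 / (1.0 + alpha ** 2)
    m_x_h = cross(m, h_eff)
    m_x_m_x_h = cross(m, m_x_h)
    return tuple(prefactor * (p + alpha * d) for p, d in zip(m_x_h, m_x_m_x_h))


# Hypothetical state: m tilted by 0.1 rad from an out-of-plane effective field.
ALPHA = 0.004
m = (math.sin(0.1), 0.0, math.cos(0.1))
h_eff = (0.0, 0.0, 1.0e5)
dm_dt = llg_rhs(m, h_eff, ALPHA)

# Plug the result back into the implicit form of Eq. (1) as a consistency check.
implicit = tuple(-GAMMA * MU0 * p + ALPHA * q
                 for p, q in zip(cross(m, h_eff), cross(m, dm_dt)))
print(dm_dt)
print(implicit)
```

Both printed vectors should agree up to rounding, and dm/dt stays perpendicular to m, so the magnetization length is preserved during precession.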
An excited SW is characterised by its wavelength λ (the shortest distance between similar
consecutive spins), wave number k = 2π/λ, frequency f (determined by the complete spin
precession time), phase φ, and amplitude A, as graphically indicated in Figure 1. As such,
an SW can carry information encoded in its amplitude, phase, frequency, or a combination of
them. Once formed, the SW propagates through the ferromagnetic material (waveguide) and
may eventually meet other SWs present in the waveguide, in which case their interaction
is governed by the wave interference principles. For instance, if two SWs with the same
amplitude, wavelength, and frequency coexist in a waveguide, they interfere constructively
if they have the same phase, and destructively if they are out of phase (∆φ = π), as depicted
in Figure 2. Furthermore, if more than two waves having the same A, f, and λ interfere in
the waveguide, the outcome captures a majority decision, i.e., if the number of spin waves
having φ = 0 is larger than the number of spin waves having φ = π, the resulting spin
wave has φ = 0, and φ = π otherwise. Thus, SW interference provides natural support
for direct Majority gate implementations, e.g., a 3-input Majority is evaluated by means of
a 3-SW interference in a waveguide33, while its CMOS based implementation requires 18
transistors. Moreover, SWs with different frequencies can coexist and propagate in the same
waveguide without affecting each other, interacting only with other same-frequency SWs,
which indicates that SW interaction provides intrinsic support for data parallel computing.

FIG. 1. SW Parameters: a) φ=0, k=1; b) φ=π, k=3.
FIG. 2. Wave Interference (constructive and destructive interference of Wave 1 and Wave 2).

Note that, in the most general case, spin waves with different amplitude, frequency, and
wavelength can coexist and selectively interfere in the same waveguide, which results in
more complex interference patterns as presented in Figure 3. As depicted in the Figure, the
interference of the f1 waves F1 and F2 results in F5 and the interference of the f2 waves F3
and F4 results in F6, while no interaction between the f1 and f2 waves occurs. We note that
in our investigation we consider that, regardless of their frequency, all input SWs have the
same amplitude.

FIG. 3. Different Frequency, Wavelength, and Amplitude Spin Wave Interference.
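The majority behaviour of same-frequency interference can be sketched with complex phasors. The example below is a simplified model under stated assumptions: all waves have unit amplitude and the same frequency, logic 0 is encoded as phase 0 and logic 1 as phase π, and the waves are superposed at a point where they have accumulated equal propagation phase. The function names interfere and detect_phase are hypothetical.

```python
import cmath
import math
from itertools import product


def interfere(bits):
    """Superpose unit-amplitude waves: bit 0 -> phase 0, bit 1 -> phase pi."""
    return sum(cmath.exp(1j * math.pi * b) for b in bits)


def detect_phase(bits):
    """Read the logic value from the phase of the resulting wave (0 -> 0, pi -> 1)."""
    return 0 if interfere(bits).real > 0 else 1


# 3-input Majority: the phase held by at least two inputs wins.
for bits in product((0, 1), repeat=3):
    print(bits, "->", detect_phase(bits), "amplitude", round(abs(interfere(bits)), 3))
```

In this model two in-phase waves add up to twice the amplitude, while two waves with ∆φ = π cancel, which matches the constructive and destructive cases described above.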
Depending on the relative orientation of the spin wave propagation direction, the effective
magnetic field, and the magnetization, three main Magnetostatic Spin Wave (MSW) types exist:
Magnetostatic Surface Spin Wave (MSSW), Forward Volume Magnetostatic Spin Wave (FVMSW),
and Backward Volume Magnetostatic Spin Wave (BVMSW)43. While each type has certain
interesting properties, FVMSWs are the most attractive as their in-plane spin wave propagation
is isotropic, which is beneficial from the circuit design perspective.
Figure 4 depicts the generic structure of an SW based logic gate, which consists of multiple
inputs (I1, I2, I3, ..., In), a Functional Region (FR), which might perform a Majority, AND,
OR, or XOR function or its inverted version, and an output O. All inputs are excited at the
same frequency, propagate from their sources through the waveguide, and interfere
constructively or destructively based on their phases. The result is available at the output as
an SW with the same frequency as the inputs. This is a scalar gate as each input SW represents
one bit. Thus, in case the same function has to be pairwise evaluated on n-bit inputs, this can
be done in parallel by instantiating n such gates or serially by using one gate only, with the
associated area and delay overhead, respectively. In the following section we take advantage
of the different frequency SW interaction behaviour and introduce data parallel SW gates that
can process n-bit inputs without hardware replication or serialisation.
III. n-BIT DATA PARALLEL SW LOGIC GATE
Figure 5 presents the parallel spin wave logic gate we introduced in46, which is able to
concurrently process m n-bit inputs. As indicated in the Figure, the input sets Ii = {Ii,1,
Ii,2, Ii,3, . . . , Ii,m}, i = 1, 2, . . . , n, are simultaneously encoded into SWs with frequency fi by
means of, e.g., Magnetoelectric (ME) cells or antennas. Subsequently, the SWs corresponding
to the sets Ii, i = 1, 2, . . . , n, propagate through the waveguide without affecting each other
until reaching the Functional Region (FR). Once the m × n spin waves arrive at the FR,
equal-frequency spin waves interfere constructively or destructively depending on their phases,
producing n output SWs Oi = F(Ii), i = 1, 2, . . . , n, where F is the gate function, e.g.,
AND, OR, XOR. Those SWs can be sensed and transformed into the voltage domain by the
detection cells located at O1, O2, . . . , On or transmitted to the next SW gate.

Although the approach in Figure 5 is generic, its practical realization requires stacked
waveguides and contains bent regions, which impede smooth SW propagation. We address
these issues by applying the same idea to a single waveguide structure and constructing the
in-line gate in Figure 6.

FIG. 4. Conventional SW Logic Gate Structure
FIG. 5. Multi-Frequency Spin Wave Logic Gate
FIG. 6. n-bit Inputs In-line Spin Wave Logic Gate
Note that for proper gate operation, SWs with the same frequency must be excited with
the same amplitude and wavelength. Moreover, the distances between input sources and
interference locations are SW frequency specific and crucial for proper gate functionality,
thus they must be accurately determined. For example, if constructive interference is
required for in-phase SWs and destructive interference for out-of-phase SWs, the distances
between the same frequency sources must be jq × λi, i = 1, 2, 3, . . . , n, i.e., d1 = j1λ1,
d2 = j2λ2, . . . , d_{nm} = j_{nm}λn, where jq ∈ {1, 2, 3, . . .}, q = 1, 2, 3, . . . , nm. Note that
to minimize gate area and delay jq = 1 is the preferred choice, which is feasible for scalar
gates but not always possible for parallel gates. Conversely, if the opposite behaviour is
desired, the distances must be (jq + 1/2)λi, i.e., d1 = (j1 + 1/2)λ1, d2 = (j2 + 1/2)λ2, . . . ,
d_{nm} = (j_{nm} + 1/2)λn.
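The spacing rule above can be sketched as a small helper. This is an illustrative sketch only: the function names source_spacing and relative_phase are hypothetical, distances are assumed to be measured between the points at which the waves are launched, and the 50 nm wavelength in the usage lines is an example value.

```python
def source_spacing(wavelength_nm, j, inverted=False):
    """Distance between two same-frequency sources.

    j * lambda keeps in-phase waves in phase (constructive interference);
    (j + 1/2) * lambda yields the opposite behaviour.
    """
    return (j + 0.5) * wavelength_nm if inverted else j * wavelength_nm


def relative_phase(distance_nm, wavelength_nm):
    """Propagation phase accumulated over a distance, in units of pi, modulo 2."""
    return (2.0 * distance_nm / wavelength_nm) % 2.0


d_same = source_spacing(50, 2)                 # 100 nm
d_flip = source_spacing(50, 2, inverted=True)  # 125 nm
print(d_same, relative_phase(d_same, 50))      # accumulated phase 0
print(d_flip, relative_phase(d_flip, 50))      # accumulated phase pi
```

A spacing of an integer number of wavelengths adds a phase that is a multiple of 2π and so leaves the encoded phase unchanged, while the extra half wavelength adds π and swaps constructive and destructive interference.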
In view of the previous discussion, each output wave Oi is available for detection after
a delay determined by the distance between the farthest input cell of the Ii set, i.e.,
Ii,1 in Figure 6, and the output cell Oi, thus full parallelism is achieved. Note that the
actual gate delay value can be optimized by an appropriate choice of, e.g., waveguide material,
dimensions, and thickness, as discussed in Section IV.
While delay optimization is a matter of waveguide material and geometry choice, the
gate area can be minimized by changing the position of the input and output transducers.
One can observe in Figure 6 that input and output cells are ordered by bit position for
clarity. However, they can be shuffled as long as the previously discussed constraints
are still satisfied, which results in an area (overall gate length) reduction. To this end
we introduce Algorithm 1, which identifies the transducer (source/detector) locations that
minimize the waveguide length, while not infringing the wavelength dependent
inter-transducer distance constraints. The algorithm iteratively constructs the gate structure
by instantiating one input set Ii, i = 1, 2, . . . , n, at a time, while optimizing its transducer
positions in relation to the already optimized structure embedding the previously instantiated
sets Ij, j = 1, 2, . . . , i − 1.
The algorithm starts with a configuration in which all transducers are placed overlapped
at the waveguide beginning. Subsequently, input sets are processed one at a time by initially
placing them one after the other at distance D, regardless of the wavelength of the SW they
process (lines 3 to 7). If the set currently being processed is the first one, no further
adjustments are required and the second set can be considered for placement. Otherwise,
the for loop (lines 9 to 24) repositions the transducers at the correct positions, which
are multiples of their wavelength. After this step, the transducer configuration
for the sets processed so far is the same as in Figure 6. Next, the for loop (lines 25 to
38) performs the area optimization by checking the spaces between transducers and
moving a transducer whenever its wavelength imposed distance condition remains satisfied. If
a transducer has been moved, Sort reorders the transducers in the TP matrix to capture
the new configuration. These steps are repeated until all sets are placed and the gate length
is optimized. At the end, the gate area is calculated by multiplying the waveguide width by
the waveguide length.
Let us assume a 3-bit 2-input gate operating on SWs with wavelength λ1=100 nm,
λ2=50 nm, and λ3=19 nm, 10 nm transducer length, and 1 nm minimum distance between
transducers. By following the structure in Figure 6, the second input set can begin at 33 nm
from the waveguide start because the first three sources I1,1, I1,2, I1,3 each occupy 10 nm and
are spaced 1 nm apart. As such the initial order is (I1,1, I1,2, I1,3, I2,1, I2,2, I2,3, O1, O2, O3)
with a corresponding waveguide length of 288 nm. The optimization algorithm changes the
order to (I1,1, I1,2, I1,3, I2,3, I2,2, I2,1, O3, O2, O1), which corresponds to a 210 nm waveguide
length, i.e., about 27% area savings.
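The arithmetic of this example can be reproduced with a few lines. This is only a check of the quoted numbers, not an implementation of Algorithm 1; the variable names are hypothetical, and the 288 nm and 210 nm lengths are taken from the text rather than computed.

```python
TRANSDUCER_NM = 10   # transducer length from the example
GAP_NM = 1           # minimum distance between transducers
BITS = 3             # I1,1, I1,2, I1,3 precede the second input set

# Each of the first three sources takes its own length plus one gap.
second_set_start_nm = BITS * (TRANSDUCER_NM + GAP_NM)

unoptimized_nm = 288  # waveguide length before optimization (from the text)
optimized_nm = 210    # waveguide length after optimization (from the text)
saving = (unoptimized_nm - optimized_nm) / unoptimized_nm

print(second_set_start_nm)   # 33
print(f"{saving:.1%}")       # 27.1%
```

Because the waveguide width is constant, the relative length reduction equals the relative area reduction.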
Furthermore, two main methods can be utilized for output detection: (i) phase detection,
and (ii) threshold detection. In the first case, a predefined phase is utilized as reference and
a phase difference of 0 represents a logic 0, and a phase difference of π a logic 1. The second
detection method assesses the SW magnetization (SWM) value and reports a logic 0 if the
SWM is smaller than a predefined threshold value and a logic 1 otherwise. If phase detection
is in place, the gate can provide non-inverted or inverted output (or even both of them) by
adjusting the reading location. For instance, referring to Figure 6, the detectors must be
placed at a distance (from the last fi SW source) equal to (jq + 1/2)λi, i = 1, 2, 3, . . . , n, such
that d_{nm+1} = (j_{nm+1} + 1/2)λ1, d_{nm+2} = (j_{nm+2} + 1/2)λ2, . . . , d_{nm+n} = (j_{nm+n} + 1/2)λn,
if the non-inverted results are desired. However, the detectors must be placed at a distance
(from the last fi SW source) equal to jqλi, such that d_{nm+1} = j_{nm+1}λ1, d_{nm+2} = j_{nm+2}λ2,
. . . , d_{nm+n} = j_{nm+n}λn, if the complement is required. In the case of threshold based
detection, the gate can provide non-inverted or inverted outputs without changing the output
detector position by just switching the thresholding condition in the detector cell. Note that,
regardless of the detection method, each read location should be as close as possible to the
last input in its set, in order to limit the SW energy lost due to damping and to process high
amplitude spin waves.
IV. SIMULATION SETUP
This section provides insight into the simulation platform, parameters, and performed
experiments.
A. Simulation Platform
The Object Oriented MicroMagnetic Framework (OOMMF)47 is utilized to evaluate the
proposed structures. OOMMF numerically solves the LLG equation to capture the gate
behaviour. The OOMMF input is a Tcl/Tk script, which describes the gate and the
input stimuli, and the results can be visualized within the OOMMF framework or
post-processed by other tools such as MATLAB, which is the case in this paper.
TABLE I. Parameters
Parameter                                 Value
Magnetic saturation Ms                    1.1 × 10^6 A/m
Perpendicular anisotropy constant kani    8.3177 × 10^5 J/m^3
Damping constant α                        0.004
Waveguide thickness t                     1 nm
Exchange stiffness Aexch                  18.5 pJ/m
B. Simulation Parameters
Fe60Co20B20 waveguides with a width of 50 nm and Perpendicular Magnetic
Anisotropy (PMA) are utilized for all gate constructions. We note that for this material
the anisotropy field Hanisotropy > Ms, which means that there is no need for the application
of an external magnetic field48. Table I presents the parameters we utilize to validate the
8-bit 2-input XOR/XNOR and 3-input Majority gates. The 8 SW frequencies are 10 GHz,
20 GHz, 30 GHz, 40 GHz, 50 GHz, 60 GHz, 70 GHz, and 80 GHz. By making use of the
FVMSW dispersion relation and given that the wavenumber k = 2π/λ, we determine that the
distances between transducers exciting/detecting SWs with the same frequency are: d1=166 nm
(j=2), d2=100 nm (j=2), d3=117 nm (j=3), d4=165 nm (j=5), d5=174 nm (j=6), d6=130 nm
(j=5), d7=168 nm (j=7), and d8=176 nm (j=8). Furthermore, a 1 nm minimum separation
distance between transducers is in place.
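The wavelengths implied by these distances can be recovered with a short calculation. The sketch assumes that each listed distance follows the Section III rule d = j × λ for same-frequency transducers; the list names are hypothetical and the dispersion relation itself is not evaluated here.

```python
FREQ_GHZ = [10, 20, 30, 40, 50, 60, 70, 80]
D_NM = [166, 100, 117, 165, 174, 130, 168, 176]   # distances quoted in the text
J = [2, 2, 3, 5, 6, 5, 7, 8]                      # multipliers quoted in the text

# Assumption: d = j * lambda, so lambda = d / j.
wavelengths_nm = [d / j for d, j in zip(D_NM, J)]

for f, lam in zip(FREQ_GHZ, wavelengths_nm):
    print(f"{f} GHz: lambda = {lam:g} nm")
```

Under this assumption the wavelength shrinks from 83 nm at 10 GHz to 22 nm at 80 GHz, which is why larger multipliers j are needed at higher frequencies to leave room for the 10 nm transducers.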
C. Performed Simulations
We perform the following simulation experiments:
• 8-bit 2-input XOR/XNOR gate with threshold detection. The two 8-bit inputs are
simultaneously excited using the sources (I1,1, I2,1, I3,1, . . . , I8,2). The excited spin
waves propagate through the waveguide and those that have the same frequency
interfere with each other. The resulting spin waves propagate towards the output
where they are captured at O1, O2, . . . , O8 based on threshold detection. We
validate both the area unoptimized (I1,1, I2,1, I3,1, I4,1, I5,1, I6,1, I7,1, I8,1,
I1,2, I2,2, I3,2, I4,2, I5,2, I6,2, I7,2, I8,2, I1,3, I2,3, I3,3, I4,3, I5,3, I6,3, I7,3, I8,3) and optimized
(I1,1, I2,1, I3,1, I4,1, I5,1, I6,1, I7,1, I8,1, I2,2, I3,2, I1,2, I6,2, I4,2, I5,2, I7,2, I8,2, I2,3, I8,3, I3,3, I1,3,
I6,3, I4,3, I5,3, I7,3) configurations. Note that, as the detector order is not important, the
detectors follow the same pattern, i.e., (O1, O2, O3, O4, O5, O6, O7, O8), in both cases.
• 8-bit 3-input Majority gate based on phase detection. We again consider area
unoptimized and optimized gate instances but in this case the detector order is relevant, thus
the source and detector order after optimization is I1,1, I2,1, I3,1, I4,1, I5,1, I6,1, I7,1, I8,1, I2,2,
I3,2, I1,2, I6,2, I4,2, I5,2, I7,2, I8,2, I2,3, I8,3, I3,3, I1,3, I6,3, I4,3, I5,3, I7,3, O6, O8, O4, O2, O5, O1,
O7, O3.

FIG. 7. Unoptimized 8-bit XOR Gate Time and Frequency Response.
V. SIMULATION RESULTS AND DISCUSSION
This section presents simulation results for the 8-bit 2-input XOR/XNOR and 3-input
Majority gate instances, performance estimations, and a comparison with SW state-of-the-art
functionally equivalent structures. Subsequently, it discusses fan-in and geometric
scalability, maximum achievable parallelism (the upper bound on the number of practically
achievable SW frequencies), and variability and thermal noise effects.
FIG. 8. Unoptimized 8-bit XOR Gate Outputs (Mx/Ms versus time): a) f1=10 GHz, b) f2=20 GHz, . . . , h) f8=80 GHz.
A. Simulation Results
8-bit 2-input threshold detection based XOR/XNOR gate
Figure 7 presents OOMMF simulation results for the area unoptimized byte-wide 2-input
XOR gate instance. The y-axis reflects the output SWs Mx over Ms ratio, i.e., the magnetization
in the x-direction over the magnetic saturation. To simplify the Figure we only consider all 0s
and all 1s input sets, thus only four input combinations are possible, and as such the gate
response to any input combination is the same at all frequencies. As expected, same-frequency
SW pairs interfere without affecting the other SWs, which is clear from Figure 7: 8 different
frequency components coexist without distorting each other in the
Fast Fourier Transform (FFT) amplitude spectrum for all the considered input combinations.
Moreover, as can be noticed from Figure 8, the output SWs are not distorted and can be
properly detected for each frequency. Let us consider the first output detection cell, which
is tuned for the 10 GHz SW. When reading the output at time 0.5 ns for I1 = I2 = 0 and
I1 = I2 = 1, the absolute SW magnetization value is greater than 0.0035 Ms due to the
constructive interference, whereas the SW magnetization is less than 0.0035 Ms when one
input set is 0 and the other one is 1. Therefore, if the detection threshold is set to 0.0035
Ms, an XOR function is obtained as an SW magnetization greater (lower) than 0.0035 Ms
is read as a logic 0 (1). An XNOR can be realized by flipping the condition such that an
SW magnetization lower (greater) than 0.0035 Ms is read as a logic 0 (1). Similarly, for the
second detection cell, which targets the 20 GHz SW, a threshold value of 0.0032 Ms is in place
and, by following a similar line of reasoning, threshold values of 0.0028 Ms, 0.0025 Ms, 0.0022
Ms, 0.0017 Ms, 0.0015 Ms, and 0.001 Ms can be determined for the remaining frequencies.

FIG. 9. Optimized 8-bit XOR Gate Time and Frequency Response.
FIG. 10. Optimized 8-bit XOR Gate Outputs (Mx/Ms versus time): a) f1=10 GHz, b) f2=20 GHz, . . . , h) f8=80 GHz.
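The threshold based read-out described for the unoptimized XOR gate can be sketched as follows. The threshold values are the ones quoted above, in units of Ms; the function name threshold_detect and the two sample readings are hypothetical and only illustrate the decision rule, they are not simulation data.

```python
# Detection thresholds (in units of Ms) per frequency in GHz, unoptimized gate.
XOR_THRESHOLDS = {10: 0.0035, 20: 0.0032, 30: 0.0028, 40: 0.0025,
                  50: 0.0022, 60: 0.0017, 70: 0.0015, 80: 0.001}


def threshold_detect(swm, freq_ghz, xnor=False):
    """Map an SW magnetization reading (in units of Ms) to a logic value.

    XOR: |SWM| above the threshold reads as 0, below it as 1.
    XNOR: the same reading with the condition flipped.
    """
    above = abs(swm) > XOR_THRESHOLDS[freq_ghz]
    xor_bit = 0 if above else 1
    return 1 - xor_bit if xnor else xor_bit


# Hypothetical 10 GHz readings: a strong (constructive) and a weak (destructive) output.
print(threshold_detect(0.005, 10))             # equal inputs   -> XOR = 0
print(threshold_detect(0.001, 10))             # unequal inputs -> XOR = 1
print(threshold_detect(0.005, 10, xnor=True))  # equal inputs   -> XNOR = 1
```

Only the comparison changes between XOR and XNOR, which reflects the statement that threshold detection can invert the output without moving the detector.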
Figures 9 and 10 present OOMMF simulation results for the optimized 8-bit 2-input
XOR gate. As depicted in Figure 10, the simulation confirms the correct functionality of the
XOR/XNOR gate. One can observe in the Figure that in this case the SW magnetization
at all frequencies is higher, as the spin waves propagate over shorter distances when compared
with the non-optimized case. In addition, the detection threshold values are higher, i.e.,
0.007 Ms, 0.005 Ms, 0.0045 Ms, 0.0038 Ms, 0.0034 Ms, 0.0027 Ms, 0.0025 Ms, and 0.002 Ms,
therefore, less sensitive detectors are required for the XOR/XNOR gate implementation.

FIG. 11. Unoptimized 8-bit Majority Gate Time and Frequency Response.
8-bit phase detection based 3-input Majority gate
The 8-bit 3-input unoptimized Majority gate OOMMF simulation results are presented
in Figure 11. The same notations are in place and again, to simplify the Figure, we only
consider all 0s and all 1s input sets, thus only 8 input combinations are presented. The
Figure clearly demonstrates proper gate functionality as 8 different frequency components
coexist without distorting each other in the Fast Fourier Transform (FFT) amplitude spectrum
for all the possible input combinations (I1 = I2 = I3 = 0), (I1 = I2 = 0, I3 = 1), . . . ,
(I1 = I2 = I3 = 1). Figure 12 indicates that the output SWs are not distorted and can
be properly detected for each frequency. Let us concentrate on Figure 12a, which captures
the 10 GHz 3-input Majority gate response, and consider the output at time 0.75 ns.
When the three inputs have the same phase of 0 (I1I2I3 = 000) they constructively interfere
in the waveguide, resulting in an SW with a phase of 0, which corresponds to a logic 0. Also,
when exactly one of the inputs is logic 1 (I1I2I3 = 001, I1I2I3 = 010, I1I2I3 = 100), i.e., has a
phase of π, the SWs interfere constructively and destructively, and the result is still a logic 0.
In contrast, if exactly one of the inputs is logic 0 (I1I2I3 = 011, I1I2I3 = 110, I1I2I3 = 101),
then the output is logic 1 as a result of the interference. Further, when the three inputs
have the same phase of π (I1I2I3 = 111), the spin waves interfere constructively in the
waveguide, which results in a phase of π, corresponding to a logic 1. The same line of
reasoning can be applied to the other 7 frequencies, as clearly indicated by Figure 12.

FIG. 12. Unoptimized 8-bit Majority Gate Outputs (Mx/Ms versus time): a) f1=10 GHz, b) f2=20 GHz, . . . , h) f8=80 GHz.
FIG. 13. Optimized 8-bit Majority Gate Time and Frequency Response.
The optimized 8-bit 3-input Majority gate OOMMF simulation results are presented
in Figures 13 and 14. As can be observed from Figure 14, the gate functions correctly,
while the SW amplitudes are higher because, due to the optimization, the SWs propagate over
shorter distances, which enables the utilization of less sensitive detectors.
B. Performance Evaluation
To gain insight into the practical potential of our proposal, we evaluate and compare the
8-bit gates with functionally equivalent state-of-the-art SW implementations obtained by the
instantiation of 8 conventional (scalar) Majority/XOR gates, in terms of area, delay, and power
consumption. In our evaluations we make the following assumptions: (i) source/detector
dimensions are 10 nm × 50 nm as suggested in46, (ii) SW propagation through the waveguide
does not consume noticeable energy, and (iii) transducer delay is 0.42 ns49.

FIG. 14. Optimized 8-bit Majority Gate Outputs (Mx/Ms versus time): a) f1=10 GHz, b) f2=20 GHz, . . . , h) f8=80 GHz.
Under this assumptions we first evaluate the optimization algorithm impact on the 8-
bit gates area. Our calculations indicate that the unoptimized XOR and Majority gates
have an area of 0.025 25µm2 and 0.047 25µm2, respectively, which become 0.017 55µm2 and
20
0.0279µm2, respectively, after the optimization. This clearly proves the algorithm efficiency
as it diminishes the area by 30% and 41%, respectively.
As the standard functionally equivalent implementations require 8 2-input XOR and 8
3-input Majority gates, which occupy 0.0784 µm² and 0.116 µm², respectively, our
proposal enables a 4.47x and 4.16x area reduction, respectively.
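The area figures above can be rechecked with a few lines of arithmetic. The sketch below only uses the areas quoted in the text; the dictionary layout and function names are hypothetical and introduced here for illustration.

```python
# Areas in square micrometres, copied from the text.
AREA_UM2 = {
    "XOR": {"unoptimized": 0.02525, "optimized": 0.01755, "scalar_x8": 0.0784},
    "MAJ": {"unoptimized": 0.04725, "optimized": 0.0279, "scalar_x8": 0.116},
}


def area_saving_percent(unoptimized, optimized):
    """Relative area saved by the optimization, in percent."""
    return 100.0 * (unoptimized - optimized) / unoptimized


def area_ratio(scalar, optimized):
    """How many times smaller the optimized 8-bit gate is than 8 scalar gates."""
    return scalar / optimized


for gate, area in AREA_UM2.items():
    saving = area_saving_percent(area["unoptimized"], area["optimized"])
    ratio = area_ratio(area["scalar_x8"], area["optimized"])
    print(f"{gate}: optimization saves {saving:.1f}%, "
          f"{ratio:.2f}x smaller than 8 scalar gates")
```

Rounded to the precision used in the text, the computed values agree with the quoted 30% and 41% savings and the 4.47x and 4.16x reductions.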
Generally speaking, to calculate an SW gate delay one needs to sum up the times associated
with SW generation, propagation, and detection. The delay due to SW propagation through
the waveguide depends on the distance travelled from generation to detection and
can be computed by dividing that distance by the SW group velocity, which is 3500 m/s
for CoFeB (Ref. 43). Given that the longest propagation path for the 8-bit 2-input XOR and
3-input Majority gates is 351 nm and 558 nm, respectively, the propagation delay is 100 ps and
159 ps, respectively, which, after adding the transducer delays, sums up to 940 ps and 999 ps,
respectively. For the scalar 2-input XOR and 3-input Majority gates the longest path is
196 nm and 290 nm, respectively, which translates into a propagation delay of 56 ps and
83 ps, respectively, and an overall gate delay of 896 ps and 923 ps, respectively. Thus, the 8-bit
2-input XOR and 3-input Majority gates are slower than their scalar counterparts by 5%
and 7%, respectively.
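The delay calculation can be sketched as follows. The path lengths, group velocity, and transducer delay are taken from the text; counting the 0.42 ns transducer delay twice (once for generation and once for detection) is an assumption made here, chosen because it reproduces the quoted totals. The function name is hypothetical.

```python
GROUP_VELOCITY_M_PER_S = 3500.0   # SW group velocity quoted for CoFeB
TRANSDUCER_DELAY_PS = 420.0       # 0.42 ns per transducer

# Longest propagation paths in nanometres, copied from the text.
LONGEST_PATH_NM = {
    "8-bit XOR": 351.0,
    "8-bit MAJ": 558.0,
    "scalar XOR": 196.0,
    "scalar MAJ": 290.0,
}


def gate_delay_ps(path_nm):
    """Gate delay = generation + propagation + detection, in picoseconds."""
    propagation_ps = path_nm * 1e-9 / GROUP_VELOCITY_M_PER_S * 1e12
    # Assumption: one transducer delay at the source and one at the detector.
    return propagation_ps + 2 * TRANSDUCER_DELAY_PS


for gate, path_nm in LONGEST_PATH_NM.items():
    print(f"{gate}: {gate_delay_ps(path_nm):.0f} ps")
```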
As both parallel and scalar gate implementations make use of the same number of
transducers and propagation through the waveguide consumes insignificant power, the two
implementations are equivalent in terms of power consumption.
C. Fan-in and Geometrical Scalability
The proposed structure is generic and the number of bits per frequency, i.e., the gate
fan-in, should not affect its functionality. However, as the number of inputs increases, the
damping effect might play a more significant role in diminishing SW amplitudes. Thus, if
a large number of inputs is targeted, it might be necessary to excite the same frequency SW
inputs in Figure 6 at different energy levels En < En−1 < . . . < E1, where Ei is the energy
at which the ith SW is excited. We note however that: (i) usual fan-in values are rather small
(2 and 3 in the gates we designed), (ii) energy level differentiation is only required for large
fan-in values in case the logic gate does not function correctly, and (iii) within certain limits
the SW energy levels can be adjusted by properly biasing the source transducers.
FIG. 15. MAJ Gate Outputs at f1=10GHz. (Mx/Ms versus time, 0 to 3 ns, for I1I2I3 = 000 and
I1I2I3 = 100, with the data encoded at 8 and at 9 different frequencies.)
To gain insight into the effect of the waveguide width on gate functionality we scaled it from
50 nm up to 500 nm. We observed that this scaling neither affects the gates' functionality nor
generates any crosstalk effects. We note that, as the waveguide width increases, the
ferromagnetic resonance frequency decreases and thus lower SW frequencies can be utilized.
Although this is advantageous from a signal loss perspective, such structures require stronger
static magnetic fields, which results in area and energy consumption overheads.
D. Practically Achievable Parallelism
To gain some insight into the practical upper bound of data parallelism we examined the
consequences of increasing the number of bits per set, i.e., the number of utilized frequencies.
To this end, we simulate 8-bit and 9-bit 3-input Majority gate instances in OOMMF and display
in Figure 15 the 10 GHz frequency output component for the input combinations I1I2I3 = 000 and
I1I2I3 = 100.
One can observe in the Figure that at time=0.5 ns the 8-bit Majority gate output has the
same phase for the considered input combinations, which reflects the correct functionality
of the Majority gate, as in both cases 0 is the majority. However, at time=0.5 ns the 9-bit
Majority gate output has different phases, 0 for I1I2I3 = 000 and approximately pi/4 for
I1I2I3 = 100, which indicates that the gate starts to malfunction. Based on this we can
conclude that, for the proposed topology and utilized material, 8 is the maximum number
of frequencies one can use to construct robust parallel SW gates with phase based output
detection.
FIG. 16. XOR Gate Outputs at f2=20GHz. (Mx/Ms versus time, 0 to 3 ns, for I1I2 = 00 and
I1I2 = 01, with the data encoded at 8, 9, 10, and 16 different frequencies.)
However, one can go beyond this limit if threshold based detection is utilized. To
examine the effect of embedding more than 8 frequencies we evaluate by means of OOMMF
simulations 2-input XOR gates with 8, 9, 10, and 16 frequencies. For illustration purposes
we display in Figure 16 the 20 GHz frequency output component for the input combinations
I1I2 = 00 and I1I2 = 01, which should give a 0 and 1 output value, respectively, for all the
considered input widths. The Figure indicates that, while the spin wave magnetization
difference between the two input combinations decreases as the number of frequencies
increases, which makes output detection more challenging, two different levels can still be
distinguished and a threshold defined, such that if the spin wave magnetization is greater
than the threshold, the output is 0, and 1 otherwise. To clarify this, let us inspect the
output value at time 0.4 ns for the 8, 9, 10, and 16-bit XOR gates. For the input
combination I1I2 = 00 the output SW has a higher amplitude than the one corresponding
to I1I2 = 01, which means that a threshold can be set and, based on threshold detection,
X(N)OR can be detected. This suggests that threshold detection based gates are more
robust and can operate with up to 16-bit inputs. Note that more than 16-bit inputs might
be realizable, but investigating this is part of planned future work.
Figure 17 presents OOMMF simulation results for the 16-bit 2-input XOR gate.
As can be observed from the FFT magnitude spectrum in Figure 17, the information is
encoded in SWs with 16 different frequencies, 10, 20, . . . , 160 GHz, and the output for all
the possible input combinations (I1 = I2 = 0), . . . , (I1 = I2 = 1) can be detected at each
frequency.
FIG. 17. Optimized 16-bit XOR Gate Response in Time and Frequency. (Panels: |FFT| Mx(f)/Ms
versus frequency, 0 to 160 GHz, and Mx/Ms versus time, 0 to 3 ns, for all four input
combinations.)
To further examine the results, we filter each frequency component for the different
input combinations separately in Figure 18, and one can observe that the output SWs are
not distorted and can be properly detected at each frequency, which means that the 16-bit
XOR/XNOR gate operates correctly. Let us consider the 20 GHz output at time
0.75 ns and a detection threshold value of 0.04 Ms. For I1 = I2 = 0 or I1 = I2 = 1 the
absolute SW magnetization value is greater than 0.04 Ms due to constructive interference,
which means a logic 0 output, as it should. For I1 = 0, I2 = 1 or I1 = 1, I2 = 0 the absolute
SW magnetization value is lower than 0.04 Ms, which means a logic 1 output, as it should.
An XNOR can be realized by flipping the condition such that an SW magnetization lower
(greater) than 0.04 Ms is read as a logic 0 (1). The same line of reasoning can be utilized
to determine all threshold values as 0.045 Ms, 0.04 Ms, 0.038 Ms, 0.033 Ms, 0.032 Ms, 0.03
Ms, 0.028 Ms, 0.025 Ms, 0.02 Ms, 0.015 Ms, 0.01 Ms, 0.007 Ms, 0.0068 Ms, 0.005 Ms, 0.0045
Ms, 0.004 Ms, 0.0035 Ms, and 0.002 Ms, for the frequencies in increasing order.
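The threshold based readout described above can be sketched in a few lines of Python. Only the 0.04 Ms threshold for the 20 GHz component comes from the text; the per-wave amplitude of 0.03 Ms is a hypothetical value chosen for illustration, the two waves are treated as ideal equal-amplitude phasors, and the function names are not taken from the paper.

```python
import cmath
import math

THRESHOLD_20GHZ = 0.04   # in units of Ms, example value quoted in the text
WAVE_AMPLITUDE = 0.03    # in units of Ms, hypothetical per-wave amplitude


def interference_magnitude(i1, i2, wave_amplitude=WAVE_AMPLITUDE):
    """Magnitude of two superposed waves with phase 0 (logic 0) or pi (logic 1)."""
    p1 = cmath.exp(1j * math.pi * i1)
    p2 = cmath.exp(1j * math.pi * i2)
    return abs(wave_amplitude * (p1 + p2))


def threshold_readout(magnitude, threshold, xnor=False):
    """Magnitude above the threshold reads as XOR = 0, otherwise XOR = 1.

    Setting xnor=True flips the condition, as described in the text.
    """
    bit = 0 if magnitude > threshold else 1
    return 1 - bit if xnor else bit


for i1 in (0, 1):
    for i2 in (0, 1):
        mag = interference_magnitude(i1, i2)
        xor_bit = threshold_readout(mag, THRESHOLD_20GHZ)
        xnor_bit = threshold_readout(mag, THRESHOLD_20GHZ, xnor=True)
        print(f"I1={i1}, I2={i2}: |M|={mag:.3f} Ms, XOR={xor_bit}, XNOR={xnor_bit}")
```

In a multi-frequency gate the same comparison would be repeated per frequency component, each with its own threshold.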
FIG. 18. Optimized XOR Gate Outputs: a) f1=10GHz, b) f2=20GHz, . . . , p) f16=160GHz.
(Each panel: Mx/Ms versus time, 0 to 3 ns, for all four input combinations.)
E. Variability and Thermal Noise Effects
In this paper, our main purpose is to propose and validate an intrinsic data parallel spin
wave technology under ideal conditions as a proof of concept, while disregarding factors, e.g.,
edge roughness, waveguide dimension variations, spin wave strength variation, and thermal
noise, which might negatively affect the performance of the proposed concept. However,
in Refs. 50 and 51, the effects of a trapezoidal waveguide cross section and edge roughness
were investigated, and it was demonstrated that they have a rather limited impact on gate
behaviour, i.e., the gates preserve their functionality in their presence. Moreover, an
investigation of SW gate behaviour at different temperatures was presented in Ref. 50, where
it was noticed that the gate functions correctly and that the temperature variation effect is
rather limited. In addition, as our proposed structure is in-line, waveguide width variations
do not affect gate functionality, thus we expect it to be rather robust to dimension variations.
Despite the fact that we expect variability and thermal noise not to fundamentally
affect the proposed gate behaviour, a thorough investigation of such effects is part of the
planned future work.
VI. CONCLUSIONS
A novel n-bit data parallel spin wave logic gate was proposed in this paper. In order to
explain the proposed concept, we implemented 8-bit 2-input XOR and 3-input Majority gates
and validated them by means of OOMMF simulations. Further, we proposed an optimization
algorithm to minimize the area overhead of the proposed multi-frequency gates and demonstrated
that the algorithm diminishes the area by 30% and 41% for XOR and MAJ gate implementations,
respectively. Moreover, to assess the potential of our proposal, we evaluated and
compared the proposed multi-frequency gates with functionally equivalent scalar SW gate
based implementations in terms of area, delay, and power consumption. The results indicated
that the byte-based XOR and Majority gates require 4.47x and 4.16x less area than
the conventional (scalar) implementations, respectively, at the expense of a 5% to 7% delay
overhead and without inducing any power consumption overhead. Finally, we demonstrated
that, for the current gate topology and materials, the maximum number of frequencies (gate
parallelism) is 8 and 16 for phase and threshold based output detection, respectively.
ACKNOWLEDGEMENT
This work has received funding from the European Union’s Horizon 2020 research and
innovation program within the FET-OPEN project CHIRON under grant agreement No.
801055. It has also been partially supported by imec’s industrial affiliate program on
beyond-CMOS logic. F.V. acknowledges financial support from Flanders Research Foundation
(FWO) through grant No. 1S05719N.
REFERENCES
1 N. D. Shah, E. W. Steyerberg, and D. M. Kent, JAMA (2018).
2 R. L. Villars, C. W. Olofson, and M. Eastwood, IDC (2011).
3 S. Agarwal, G. Burr, A. Chen, S. Das, E. Debenedictis, M. P. Frank, P. Franzon, S. Holmes, M. Marinella, and T. Rakshit, “International roadmap of devices and systems 2017 edition: Beyond cmos chapter.” Tech. Rep. (Sandia National Lab.(SNL-NM), Albuquerque, NM (United States), 2018).
4 D. Mamaluy and X. Gao, Applied Physics Letters 106, 193503 (2015).
5 B. Hoefflinger, Chips 2020: a guide to the future of nanoelectronics (Springer Science & Business Media, 2012).
6 N. Z. Haron and S. Hamdioui, in Design and Test Workshop, 2008. IDT 2008. 3rd International (IEEE, 2008) pp. 98–103.
7 Y. Jiang, N. C. Laurenciu, H. Wang, and S. D. Cotofana, IEEE Transactions on Nanotechnology 18, 287 (2019).
8 Y. Jiang, N. Cucu Laurenciu, and S. D. Cotofana, IEEE Transactions on Circuits and Systems I: Regular Papers 66, 1948 (2019).
9 S. Choudhary and S. Khandate, IEEE Transactions on Nanotechnology 18, 670 (2019).
10 Y. Jiang, N. C. Laurenciu, H. Wang, and S. D. Cotofana, IEEE Transactions on Nanotechnology 18, 287 (2019).
11 S. Bansal, A. Das, P. Jain, K. Prakash, K. Sharma, N. Kumar, N. Sardana, N. Gupta, S. Kumar, and A. K. Singh, IEEE Transactions on Nanotechnology 18, 781 (2019).
12 H. Nili, A. F. Vincent, M. Prezesio, M. R. Mahmoodi, I. Kataeva, and D. B. Strukov, IEEE Transactions on Nanotechnology 19, 344 (2020).
13 S. N. Truong, K. Van Pham, and K. Min, IEEE Transactions on Nanotechnology 17, 482 (2018).
14 C. E. Graves, C. Li, X. Sheng, W. Ma, S. R. Chalamalasetti, D. Miller, J. S. Ignowski, B. Buchanan, L. Zheng, S. Lam, X. Li, L. Kiyama, M. Foltin, M. P. Hardy, and J. P. Strachan, IEEE Transactions on Nanotechnology 18, 963 (2019).
15 N. Zheng and P. Mazumder, IEEE Transactions on Nanotechnology 17, 520 (2018).
16 M. R. Mahmoodi, A. F. Vincent, H. Nili, and D. B. Strukov, IEEE Transactions on Nanotechnology 19, 429 (2020).
17 K. P. Gnawali, S. N. Mozaffari, and S. Tragoudas, IEEE Transactions on Nanotechnology 17, 1206 (2018).
18 H. Zhang, W. Kang, B. Wu, P. Ouyang, E. Deng, Y. Zhang, and W. Zhao, IEEE Transactions on Nanotechnology 18, 473 (2019).
19 D. Zhang, Y. Hou, L. Zeng, and W. Zhao, IEEE Transactions on Nanotechnology 18, 518 (2019).
20 S. K. Thirumala, Y. Hung, S. Jain, A. Raha, N. Thakuria, V. Raghunathan, A. Raghunathan, Z. Chen, and S. K. Gupta, IEEE Transactions on Nanotechnology, 1 (2020).
21 A. Roohi and R. F. DeMara, IEEE Transactions on Nanotechnology 18, 885 (2019).
22 D. E. Nikonov and I. A. Young, Proceedings of the IEEE 101, 2498 (2013).
23 M. P. Kostylev, A. A. Serga, T. Schneider, B. Leven, and B. Hillebrands, Applied Physics Letters 87, 153501 (2005), https://doi.org/10.1063/1.2089147.
24 T. Schneider, A. A. Serga, B. Leven, B. Hillebrands, R. L. Stamps, and M. P. Kostylev, Applied Physics Letters 92, 022505 (2008), https://doi.org/10.1063/1.2834714.
25 K.-S. Lee and S.-K. Kim, Journal of Applied Physics 104, 053909 (2008), https://doi.org/10.1063/1.2975235.
26 I. A. Ustinova, A. A. Nikitin, A. B. Ustinov, B. A. Kalinikos, and E. Lähderanta, in 2017 11th International Workshop on the Electromagnetic Compatibility of Integrated Circuits (EMCCompo) (2017) pp. 104–107.
27 A. Khitun and K. L. Wang, Superlattices and Microstructures 38, 184 (2005).
28 Y. Wu, M. Bao, A. Khitun, J.-Y. Kim, A. Hong, and K. L. Wang, Journal of Nanoelectronics and Optoelectronics 4, 394 (2009).
29 A. Khitun, D. E. Nikonov, M. Bao, K. Galatsis, and K. L. Wang, Nanotechnology 18, 465202 (2007).
30 A. Khitun, M. Bao, Y. Wu, J. Kim, A. Hong, A. Jacob, K. Galatsis, and K. L. Wang, in Fifth International Conference on Information Technology: New Generations (itng 2008) (2008) pp. 1107–1110.
31 B. Rana and Y. Otani, Phys. Rev. Applied 9, 014033 (2018).
32 A. V. Chumak, A. A. Serga, and B. Hillebrands, Journal of Physics D: Applied Physics 50, 244001 (2017).
33 A. Khitun and K. L. Wang, Journal of Applied Physics 110, 034306 (2011), https://doi.org/10.1063/1.3609062.
34 S. Klingler, P. Pirro, T. Brächer, B. Leven, B. Hillebrands, and A. V. Chumak, Applied Physics Letters 105, 152410 (2014), https://doi.org/10.1063/1.4898042.
35 S. Klingler, P. Pirro, T. Brächer, B. Leven, B. Hillebrands, and A. V. Chumak, Applied Physics Letters 106, 212406 (2015).
36 O. Zografos, S. Dutta, M. Manfrini, A. Vaysset, B. Sorée, A. Naeemi, P. Raghavan, R. Lauwereins, and I. P. Radu, AIP Advances 7, 056020 (2017), https://doi.org/10.1063/1.4975693.
37 K. Nanayakkara, A. Anferov, A. P. Jacob, S. J. Allen, and A. Kozhanov, IEEE Transactions on Magnetics 50, 1 (2014).
38 T. Fischer, M. Kewenig, D. A. Bozhko, A. A. Serga, I. I. Syvorotka, F. Ciubotaru, C. Adelmann, B. Hillebrands, and A. V. Chumak, Applied Physics Letters 110, 152401 (2017), https://doi.org/10.1063/1.4979840.
39 P. Shabadi, A. Khitun, P. Narayanan, M. Bao, I. Koren, K. L. Wang, and C. A. Moritz, in 2010 IEEE/ACM International Symposium on Nanoscale Architectures (2010) pp. 11–16.
40 T. Fischer, M. Kewenig, D. A. Bozhko, A. A. Serga, I. I. Syvorotka, F. Ciubotaru, C. Adelmann, B. Hillebrands, and A. V. Chumak, Applied Physics Letters 110, 152401 (2017), https://doi.org/10.1063/1.4979840.
41 F. Ciubotaru, G. Talmelli, T. Devolder, O. Zografos, M. Heyns, C. Adelmann, and I. P. Radu, in 2018 IEEE International Electron Devices Meeting (IEDM) (2018) pp. 36.1.1–36.1.4.
42 A. Khitun, Journal of Applied Physics 111, 054307 (2012), https://doi.org/10.1063/1.3689011.
43 A. V. Chumak, A. A. Serga, and B. Hillebrands, Journal of Physics D: Applied Physics 50, 244001 (2017).
44 L. Landau and E. Lifshitz, Phys. Z. Sowjetunion, 101 (1935).
45 T. L. Gilbert, IEEE Transactions on Magnetics 40, 3443 (2004).
46 A. Mahmoud, F. Vanderveken, F. Ciubotaru, C. Adelmann, S. Cotofana, and S. Hamdioui, in 2020 Design, Automation Test in Europe Conference Exhibition (DATE) (2020) pp. 642–645.
47 M. J. Donahue and D. G. Porter, Interagency Report NISTIR 6376 (1999).
48 T. Devolder, J.-V. Kim, F. Garcia-Sanchez, J. Swerts, W. Kim, S. Couet, G. Kar, and A. Furnemont, Phys. Rev. B 93, 024420 (2016).
49 O. Zografos, B. Sorée, A. Vaysset, S. Cosemans, L. Amarù, P. Gaillardon, G. D. Micheli, R. Lauwereins, S. Sayan, P. Raghavan, I. P. Radu, and A. Thean, in 2015 IEEE 15th International Conference on Nanotechnology (IEEE-NANO) (2015) pp. 686–689.
50 Q. Wang, P. Pirro, R. Verba, A. Slavin, B. Hillebrands, and A. V. Chumak, Science Advances 4 (2018), 10.1126/sciadv.1701517, https://advances.sciencemag.org/content/4/1/e1701517.full.pdf.
51 Q. Wang, B. Heinz, R. Verba, M. Kewenig, P. Pirro, M. Schneider, T. Meyer, B. Lägel, C. Dubs, T. Brächer, and A. V. Chumak, Phys. Rev. Lett. 122, 247202 (2019).
