Energy Efficiency of Computation in All-spin Logic: Projections and Fundamental Limits by Chen, Zongya
University of Massachusetts Amherst 
ScholarWorks@UMass Amherst 
Masters Theses Dissertations and Theses 
March 2019 
Energy Efficiency of Computation in All-spin Logic: Projections 
and Fundamental Limits 
Zongya Chen 
Recommended Citation 
Chen, Zongya, "Energy Efficiency of Computation in All-spin Logic: Projections and Fundamental Limits" 
(2019). Masters Theses. 754. 
https://scholarworks.umass.edu/masters_theses_2/754 
This Open Access Thesis is brought to you for free and open access by the Dissertations and Theses at 
ScholarWorks@UMass Amherst. It has been accepted for inclusion in Masters Theses by an authorized 
administrator of ScholarWorks@UMass Amherst. For more information, please contact 
scholarworks@library.umass.edu. 
ENERGY EFFICIENCY OF COMPUTATION
IN ALL-SPIN LOGIC: PROJECTIONS AND
FUNDAMENTAL LIMITS
A Thesis Presented
by
ZONGYA CHEN
Submitted to the Graduate School of the
University of Massachusetts Amherst in partial fulfillment
of the requirements for the degree of
MASTER OF SCIENCE IN ELECTRICAL AND COMPUTER ENGINEERING
February 2019
Electrical and Computer Engineering
Approved as to style and content by:
Neal G. Anderson, Chair
Zlatan Aksamija, Member
Russell Tessier, Member
Christopher V. Hollot, Department Chair
Electrical and Computer Engineering
ABSTRACT
ENERGY EFFICIENCY OF COMPUTATION
IN ALL-SPIN LOGIC: PROJECTIONS AND
FUNDAMENTAL LIMITS
FEBRUARY 2019
ZONGYA CHEN
B.Sc., HEBEI UNIVERSITY
M.S.E.C.E., UNIVERSITY OF MASSACHUSETTS AMHERST
Directed by: Professor Neal G. Anderson
Built with nanomagnets, the all-spin logic (ASL) device is a spintronic device that
carries information using only spin currents, which allows a low supply voltage of
about 10 mV. This voltage is roughly 100 times smaller than that of conventional
CMOS devices (usually 0.8 V to 1 V). The potential for improved energy efficiency
made possible by this low operating voltage makes ASL one of the most promising
post-CMOS devices.
The basic working principles of the ASL device are introduced in this thesis, and two
complementary approaches to studying the energy efficiency of computation are applied
to a common set of ASL circuits: (1) a circuit simulation approach that provides
efficiency estimates for specific ASL circuit realizations, and (2) a
physical-information-theoretic approach that reveals fundamental efficiency bounds
for ASL circuits as limited by irreversible information loss.
The results of this study support the expectation that the energy efficiency of
computation in ASL can far exceed that of CMOS. However, they also reveal that the
energy dissipated in the ASL implementations studied here exceeds fundamental limits
by many orders of magnitude, and that ASL efficiencies are unlikely to approach those
limits because of the unavoidable energetic overhead cost of maintaining spin currents.
TABLE OF CONTENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER
1. INTRODUCTION
1.1 From CMOS to All-spin Logic
1.2 Energy Dissipation in ASL
1.3 Thesis Objectives
1.4 Thesis Overview
2. ALL-SPIN LOGIC DEVICE: WORKING PRINCIPLES AND BASIC UNITS
2.1 Working Principles of ASL Devices
2.2 Digital Computing with ASL
3. ENERGY EFFICIENCY OF COMPUTATION IN ASL: THEORETICAL METHODS
3.1 Energy Efficiency From Circuit Simulation
3.1.1 Landau-Lifshitz-Gilbert Equation and spin storage
3.1.2 Spin Injection
3.1.3 Spin Transport
3.1.4 Energy Efficiency
3.2 Efficiency Limits from Physical Information Theory
3.2.1 Introduction of Physical Information Theory
3.2.2 Physical Decomposition
3.2.3 Process Abstraction
3.2.4 Operational Decomposition
3.2.5 Cost Analysis
3.2.6 Energy Efficiency
4. ASL CIRCUIT STUDY: PRELIMINARY RESULTS AND DISCUSSION
4.1 Simple Circuits: Buffer, Inverter and Latch
4.1.1 Buffer Simulation Results
4.1.2 Inverter
4.1.3 Latch
4.2 Simplified Circuits Simulation Results: Buffer, Inverter, Latch
4.2.1 From Dynamic and Static Power to Projected Energy
4.2.2 Simplified Buffer and Inverter Simulation
4.2.3 Simplified Latch Simulation
4.2.4 Comparison of the full-LLG Model and Simplified Model
4.3 Simulation of Larger Circuits: Majority Gate, Half-adder, and ALU
4.3.1 Majority Gate
4.3.2 Half Adder
4.3.3 Arithmetic Logic Unit
4.4 Energy Efficiency Bounds for ASL Circuits from Physical Information Theory
4.4.1 From Projected Energy to Irreversibility Induced Energy
4.4.2 Efficiency Bounds for ASL Circuits: Detailed Application Example
4.4.3 Efficiency Bounds for Buffer, Inverter, Latch, Majority Gate, and Half Adder
4.4.4 Efficiency Bounds for ALU
4.4.5 Simulation-Based Projections vs Fundamental Bounds
5. SUMMARY AND CONCLUSIONS
APPENDICES
A. LANDAU-LIFSHITZ-GILBERT EQUATION DERIVATION
B. SPIN INJECTION WITH LLG EQUATION
BIBLIOGRAPHY
LIST OF TABLES
1.1 The energy cost and throughput of ASL vs CMOS FFT processor.
2.1 The truth table of three-input majority gate.
2.2 The truth table of full adder with the illustration of xor/xnor gate design.
4.1 ASL simulation parameters.
4.2 The energy cost of different circuits with and without LLG model.
4.3 The truth table of half adder with the illustration of majority-gate nand gate design.
4.4 ALU function table.
4.5 Projected energy results of different ASL devices.
4.6 State transformation for the ASL adder.
4.7 Inputs and outputs of the first adder.
4.8 Outputs of the first adder.
4.9 Inputs and outputs of the second adder.
4.10 Outputs of the second adder.
4.11 Outputs of buffer, inverter and latch.
4.12 Inputs and outputs of the majority gate.
4.13 Inputs and outputs of the half adder middle stage.
4.14 Inputs and outputs of the half adder middle stage.
4.15 State transformation for the buffer, inverter, latch, majority gate, half adder, and ALU.
4.16 The truth table of subtractor.
4.17 Truth table of increment and decrement function.
4.18 Irreversibility induced energy results of different ASL devices.
LIST OF FIGURES
1.1 A lab-fabricated ASL buffer/inverter.
1.2 A 3-input majority gate.
2.1 A typical structure of ASL unit.
2.2 Working principle of ASL.
2.3 Structure of a three-input majority gate.
2.4 A full adder using a three-input majority gate.
2.5 ASL full adder constructed from three majority gates.
2.6 The schematic of latch using ASL.
2.7 The schematic of D flip-flop using ASL (D1 and D2).
2.8 The schematic of a pipelined full adder using the 2-phase clock.
3.1 An overview of the circuit simulation model between two magnets and a non-ferromagnet channel.
3.2 Spin storage simulation.
3.3 The charge with a spin direction (spin up or down) flows right, generating a charge flow and a spin flow.
3.4 Magnetization direction of $\vec{m}_2$.
3.5 Spin injection model.
3.6 Spin transport model.
3.7 Physical decomposition of the target circuit, including computationally relevant domain and environmental domain.
4.1 Circuit simulation results of a two-magnet buffer.
4.2 Circuit simulation results of a two-magnet inverter.
4.3 Circuit simulation results of three-magnet latches.
4.4 Results of simplified buffer and inverter.
4.5 Results of a simplified three-magnet latch, implementing buffer function.
4.6 Results of a simplified 3-input 4-magnet majority gate, implementing NAND gate or NOR gate.
4.7 ASL half adder schematic.
4.8 Simulation results of the half adder.
4.9 Power simulation results of the half adder.
4.10 ASL-based ALU schematic.
4.11 Results of an ASL-based ALU.
4.12 Fundamental and supporting sources.
4.13 Physical decomposition of ASL magnet and its connecting channel, power supply, ground lead, and substrate.
4.14 Data zones in different computational steps c1 to c5.
4.15 Half adder layout.
A.1 A small region $dV_r$ in magnetic field $\vec{H}$. Elementary moments $|\mu_j\rangle$ point to random directions.
A.2 Precession and damping.
B.1 Spin current injection.
B.2 Self-consistency of LLG nanomagnet dynamics and spin transport.
B.3 ASL inverter with (a) top view and (b) side view.
B.4 Circuit simulation model of an ASL inverter.
CHAPTER 1
INTRODUCTION
1.1 From CMOS to All-spin Logic
Over the past decades, integrated circuit engineers have focused on device physics,
seeking ways to improve performance, speed, chip area, and energy dissipation.
In 1965, Moore [1] observed that the number of transistors in integrated circuits
was doubling at a regular interval, a trend later stated as a doubling about every
two years. As a result, Complementary Metal-Oxide Semiconductor (CMOS) devices have
been scaled down aggressively over the past few decades, enabling the industry to
build ultra-dense integrated circuits with higher performance [2].
However, as transistors scaled down into the nanoscale regime in recent years,
Moore’s Law slowed down. Power dissipation, and the heat it generates, became a
significant obstacle to doubling the number of transistors about every two years.
Dynamic power dissipation used to dominate the power consumption of CMOS devices,
but as transistor sizes kept shrinking, static power (which arises from excessive
leakage currents) started to play a more critical role in overall power and heat
dissipation.
Energy dissipation, or heat dissipation, remains a persistent open problem. It ended
the rapid growth of the transistor count and its historical doubling every two years
as per Moore’s Law. Heat dissipation causes severe performance problems in very large
scale integrated (VLSI) circuits. Considering the number of transistors in a modern
VLSI circuit, the heat generated by all transistors combined is enormous. The small
device scale also increases the density of defective transistors and limits further
scaling. Thus, energy/heat dissipation has become one of the most urgent problems in
CMOS integrated circuit technology.
The limitations of CMOS devices regarding heat dissipation are mainly caused by
leakage currents. There are two kinds of leakage current in CMOS. One is the
tunneling current from gate to ground, caused by the thinner dielectric layer. The
other is the subthreshold leakage from source to drain, i.e., unwanted current that
flows when the gate voltage is below the threshold voltage. These two leakage currents
result in significant energy dissipation and off-state power consumption as
transistors become smaller, i.e., as the transistor density becomes higher. One way
to reduce the power is to decrease the supply voltage, which also lowers the
tunneling current. However, the threshold voltage cannot be reduced to match the
lower supply voltage, which leads to poor performance [3].
To address this problem, engineers have explored many devices beyond CMOS, among
which is the all-spin logic device. Behin-Aein [4] showed that one distinguishing
feature of this post-CMOS device is that it carries both information and energy
using only spin current instead of charge current. A spin current is a flow of
electron spins. The electron spin is a form of angular momentum with two possible
orientations, spin-up and spin-down [5]. For this reason, the device is called the
all-spin logic (ASL) device. Instead of the charge stored on capacitors in
conventional CMOS, it uses the magnetization direction of ferromagnets to represent
the two stable states 0 and 1.
Moreover, the supply voltage of the ASL device is not constrained by the leakage
issues mentioned above. Thus, VDD can be reduced significantly (0.01 V, compared to
0.8 V to 1 V for conventional CMOS). The low power that follows from this low supply
voltage is a promising feature of ASL.
Up to now, a few physical ASL circuits have been fabricated in a lab environment.
Figure 1.1 shows an ASL unit with two magnets functioning as a buffer/inverter [6].
The voltage between the magnet and the conducting channel can be viewed as an output
voltage. Figure 1.2 shows a 3-input majority gate [7]. Initially, all three input
magnets were in the spin-up state. At time 1.0 ns, the left and upper magnets
switched to the spin-down state, while the third input magnet (the lower one)
remained unchanged. The majority gate then selected the majority state of the three
input magnets. In this case, the spin-down state was selected and transported to the
output magnet (the right one). At time 3.0 ns, the output magnet switched to the
spin-down state.
Figure 1.1: A lab-fabricated ASL buffer/inverter [6].
As D. Morris reported [8], compared with a published CMOS fast Fourier transform
(FFT) processor [9], their lab-built ASL-based FFT processor has 6.7 times lower
energy consumption and 1,700,000 times higher throughput (Table 1.1).
Table 1.1: The energy cost and throughput of ASL vs CMOS FFT processor [8].
        Energy (nJ/FFT)   VDD (V)   Throughput (FFT/s)   Number of devices
CMOS    155               0.35      3.8                  627K
ASL     23.1              0.1       6.5M                 1.5M
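The two ratios quoted from Table 1.1 can be checked with a few lines of arithmetic. The following Python sketch only re-derives them from the table values; the variable names are illustrative and not taken from [8].

```python
# Values copied from Table 1.1: energy per FFT (nJ) and throughput (FFT/s).
cmos = {"energy_nj": 155.0, "throughput": 3.8}
asl = {"energy_nj": 23.1, "throughput": 6.5e6}

# How many times less energy ASL uses per FFT.
energy_ratio = cmos["energy_nj"] / asl["energy_nj"]

# How many times more FFTs per second ASL completes.
throughput_ratio = asl["throughput"] / cmos["throughput"]

print(f"energy ratio:     {energy_ratio:.1f}x lower for ASL")
print(f"throughput ratio: {throughput_ratio:.2e}x higher for ASL")
```

Both results agree, after rounding, with the factors of 6.7 and 1,700,000 stated in the text.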
1.2 Energy Dissipation in ASL
Figure 1.2: A 3-input majority gate. (a) A majority gate fabrication layout. (b) to (d)
Magnetization directions of the majority gate at t = 0.5, 1, and 3 ns. Two of the
input magnets (the left and upper ones) were switched from the spin-up state to the
spin-down state and one of the input magnets remained unchanged (spin-up state). The
output magnet (the right one) then took the majority state of all the input states.
In this case, the output magnet switched to the spin-down state at t = 3.0 ns [7].
Since energy dissipation plays a vital role in device performance, and since ASL has
an advantage over CMOS in this area, we investigate two methods for calculating
energy dissipation and apply them to the same set of ASL circuits to see how this
novel post-CMOS device performs under each.
The first method is based on circuit simulation [4][10]. The energy dissipation
obtained from this method is called the projected energy, and it is reported as a
numerical estimate. It is obtained by simulating an ASL circuit specified by
particular material parameters, structural dimensions, and geometric configurations,
so the result can be read as a prediction, or projection, of the energy that the real
physical ASL circuit would dissipate; this is where the name projected energy comes
from. The projected energy is determined by many factors. Given the device
conditions, interconnection models, physical dimensions, material parameters,
environmental effects, etc., it describes how much energy is dissipated by a
particular circuit realization executing a particular computation. It is similar to
the energy dissipation measured in a circuit performance lab test, i.e., the
projected energy would ideally be close to the result of testing the real physical
circuit in a lab environment.
The second energy calculation method is related to the fundamental energy limits
discussed in Section 3.2. This method provides lower bounds on the irreversibility
induced energy, so called because it results from the irreversible information loss
during the logical computation [11]. It is an entirely different calculation method
from the first one, and the result is of a very different nature than the projected
energy. The irreversibility induced energy does not depend on implementation-specific
circuit parameters such as material properties or conductance channel length.
The irreversibility induced energy is derived from the working principle of a circuit
and reflects the logical behavior of the computation process. Only the logic
computation itself and how the investigated circuit realizes it (the function)
determine the final numerical result. It gives the lowest energy bound of a circuit,
i.e., the minimum energy that is necessarily dissipated by a circuit doing a
particular computation in a particular way, no matter what the material, structure
parameters, connection models, etc. are.
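To give a sense of scale for such a lower bound, the Python sketch below evaluates the standard Landauer form, k_B T ln 2 per bit of information irreversibly lost. The function name, the temperature of 300 K, and the choice of one bit are assumptions made for illustration only; the bounds derived in this thesis follow the procedure of Chapter 3 and are not reproduced here.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_bound(bits_lost, temperature_k=300.0):
    """Minimum dissipated energy (J) for irreversibly losing `bits_lost` bits.

    Assumes the standard Landauer form k_B * T * ln(2) per bit.
    The default temperature of 300 K is an illustrative assumption.
    """
    if bits_lost < 0 or temperature_k <= 0:
        raise ValueError("bits_lost must be >= 0 and temperature_k must be > 0")
    return bits_lost * K_B * temperature_k * math.log(2)


e_one_bit = landauer_bound(1)
print(f"Landauer bound for 1 bit at 300 K: {e_one_bit:.3e} J")
```

At the assumed 300 K this gives roughly 2.9e-21 J per bit, which is the kind of floor that the projected energies are later compared against.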
The relation between the projected and irreversibility induced energy is subtle.
The projected energy is the more familiar of the two, since it is comparable to the
result of a CMOS simulation in software such as Cadence Virtuoso. The irreversibility
induced energy is more theoretical and abstract; it rests on Landauer’s Principle
(introduced in Chapter 3) and is obtained through formula derivation and calculation.
The projected energy is related to both the static power and the dynamic power: it is
the integral of the static and dynamic power with respect to time. The
irreversibility induced energy, in contrast, is related only to the dynamic power.
The detailed definitions and comparison are given in Chapter 3 and Section 4.4.5.
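The statement that the projected energy is the time integral of static plus dynamic power can be sketched numerically. The Python example below uses a simple trapezoidal rule on sampled power traces; the function name and all sample values are hypothetical and are not simulation results from this thesis.

```python
def projected_energy(times, static_power, dynamic_power):
    """Trapezoidal integral of total power (W) over time (s), returned in joules."""
    if not (len(times) == len(static_power) == len(dynamic_power)):
        raise ValueError("all input sequences must have the same length")
    total = [s + d for s, d in zip(static_power, dynamic_power)]
    energy = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        energy += 0.5 * (total[i] + total[i - 1]) * dt
    return energy


# Hypothetical sampled traces over 3 ns (illustrative numbers only).
times = [0.0, 1e-9, 2e-9, 3e-9]          # s
static = [1e-6, 1e-6, 1e-6, 1e-6]        # W, constant background power
dynamic = [0.0, 2e-6, 2e-6, 0.0]         # W, power during switching

e_total = projected_energy(times, static, dynamic)
e_dynamic_only = projected_energy(times, [0.0] * 4, dynamic)
print(f"static + dynamic: {e_total:.2e} J, dynamic only: {e_dynamic_only:.2e} J")
```

Separating the dynamic-only integral from the total mirrors the distinction drawn above between the two kinds of energy, although the actual decomposition used in the thesis is defined in Chapter 3 and Section 4.4.5.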
Both calculation methods will be applied to the same set of ASL circuits. First, the
results will give a basic idea of how low the (projected) energy dissipation of the
ASL device is. Second, comparing the two methods will give a deeper understanding of
the projected and irreversibility induced energy: the projected energy of the ASL
device is low compared to CMOS, but still far above fundamental limits.
1.3 Thesis Objectives
The goal of this thesis is to apply two energy dissipation calculation methods to
ASL circuits. Both methods are introduced in detail in Chapter 3.
The working principle of ASL is also introduced in detail in this thesis. Using only
spin currents to transfer information is a promising feature that sets ASL apart
from competing post-CMOS devices. The layout of the device, including the commonly
used materials and the structure of the spin storage component and conducting
channel, is illustrated with words and figures. Because it is non-volatile, the ASL
device can be used for both logic computation and information storage. Several basic
logic circuits based on ASL, such as the buffer, latch, flip-flop, and adder [3], are
shown in Chapter 2. Their functions (truth tables), transient simulations with
provided pulses, and numerical results (i.e., simulation run time, delay time, and
power consumption) are also described.
The spin-current-only operation, from which the all-spin logic device gets its name,
allows a low supply voltage of 10 mV, enabling the device to work at high circuit
density with a low heat dissipation rate. Because of this feature, ASL stands out not
only against conventional CMOS devices but also among its post-CMOS competitors.
Motivated by the low power and energy consumption of ASL, we apply the two energy
calculation methods to the same ASL circuits. First, the results give a basic idea of
how low the (projected) energy dissipation of the ASL device is. Second, comparing
the two methods gives a deeper understanding of the projected and irreversibility
induced energy and the difference between them.
1.4 Thesis Overview
The second chapter will introduce the working principles of ASL devices and
several computing circuits based on ASL units.
In the third chapter, we introduce the two calculation methods: the projected energy
calculation method and the irreversibility induced energy calculation method.
The projected energy calculation method, which can also be seen as a circuit
simulation method, builds a set of circuit components specifically suited to ASL
devices. The physical structures and corresponding analog models for the stages of
spin storage, spin injection, and spin transport are illustrated and then built in
the MATLAB environment in Section 3.1. These models are combined in the following
chapter and applied to ASL digital circuit simulation to obtain the projected energy
dissipation.
The calculation method for irreversibility induced energy dissipation is also
presented in this chapter. The irreversibility induced energy is closely related to
physical information theory. In this part, we discuss Landauer’s Principle [12],
which states that irreversible information loss necessarily induces irreversible
energy loss, hence the phrase “irreversibility induced energy dissipation”. Following
this principle, we introduce the terminology and four steps that readers can follow
to analyze a circuit and obtain the irreversibility induced energy [11].
Chapter 4 shows the preliminary results of the two methods applied to several
ASL-based circuits, including the buffer, inverter, latch, and half adder. A more
complicated ASL-based circuit (an arithmetic logic unit) is also introduced. The
projected and irreversibility induced energy dissipation are calculated, and the
corresponding energy efficiencies obtained from the two methods are compared and
discussed.
Chapter 5 gives the summary and conclusions of the thesis.
CHAPTER 2
ALL-SPIN LOGIC DEVICE: WORKING PRINCIPLES
AND BASIC UNITS
In this chapter, we introduce the basic ideas behind the ASL device, including the
structure of the basic ASL functional unit, ASL circuit elements, and how spin
currents are excited and flow from one magnet to another. A few basic ASL units are
also introduced, giving a better understanding of how ASL works.
2.1 Working Principles of ASL Devices
Fig. 2.1 shows the structure of a typical ASL device unit [3]. The two ferromagnets
are shown in red. A ferromagnet is a material that contains individual magnetic
domains. Each domain has its own magnetic field with a random orientation (with high
probability along the magnetic easy axis). Once an external magnetic field is
applied, the orientation of each domain tends toward the direction of the external
field. The gray part is the spacer, which prevents charge from passing from one side
of the spacer to the other. The green rectangle is the nonmagnetic channel. Copper is
the better choice for short-distance communication channels, while graphene is the
better choice for long-distance communication channels [13][14][15]. The ground lead,
shown in blue, is also made of nonmagnetic material. It is close to one of the
ferromagnets, next to the spacer.
Figure 2.1: A typical structure of an ASL unit [3]. Ferromagnets (FM), nonmagnetic
(NM) channels isolated by the spacer, and the ground lead compose the basic ASL unit.
In Fig. 2.2, the supply voltage VDD applied to the left ferromagnet m1 drives a
current through m1 [16]. The current is then injected into the channel and flows to
ground via the ground lead (red arrow in Fig. 2.2). This current (“red dots”) has
the same polarization direction as m1 (a detailed discussion is given in Appendix B).
Note that a positive supply voltage extracts electrons, drawing them from ground to
the magnet, i.e., the charge current goes from the magnet to ground, while a negative
supply voltage injects electrons, i.e., the charge current goes from ground to the
magnet. The injection of electrons accumulates electrons under m1 that have the same
polarization as m1.
On the other hand, the extraction of electrons accumulates electrons with
polarization opposite to that of m1. These accumulated electrons, with either the
same or the opposite polarization as m1, diffuse from one side of the channel to the
other, i.e., from beneath m1 to beneath m2. This diffusion current is called the spin
current (green dots in Fig. 2.2). When the electrons reach the region beneath m2,
they exert a torque that switches the magnetization of m2 to the polarization
direction of the electrons. This switching process is discussed in detail in
Appendix B.
In summary, a negative supply voltage injects majority spins (those with the same
direction as the input magnetization) into the channel. Part of the spin flows to
ground (the red-dot charge current in Fig. 2.2). The rest accumulates at the left
side of the channel and creates a diffusion spin current (the green spin current in
Fig. 2.2) toward the right side (the ferromagnet m2 side), thus carrying the
information from input to output. Because the spin current has the same spin
direction as the input, the output realizes a COPY of the input.
Figure 2.2: Working principle of ASL [13]. This is an inverter (with a positive
supply voltage). The red current generated by the supply voltage flows from supply to
ground. It is polarized by magnet 1 and accumulates antiparallel-polarized spins in
the channel near magnet 1. The antiparallel-polarized spin current (green) acts as
the information current. It exerts a torque that switches the magnetization of
magnet 2 to the direction opposite to magnet 1, thus realizing a NOT function.
Conversely, when the supply voltage is positive, majority spins are extracted from
the channel, leading to the accumulation of opposite (minority) spins beneath m1,
which then diffuse toward m2. Thus, the diffusion current (spin current) has the
opposite spin direction from that of the input. In this case, the charge current and
the spin current have different polarization directions. The output realizes the logic
function NOT of the input, i.e. the receiving magnet has the opposite magnetization
direction to the transmitting magnet.
Because of the structure and working principle of ASL, we can easily tell that the
magnet closer to the ground lead is the sender, and the remote one is the receiver [3].
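The polarity rule described above can be summarized in a few lines of Python. This is only a behavioral sketch: the function name `asl_output` and the 0/1 encoding of the magnetization direction are assumptions made for illustration, and nothing here models the underlying spin transport.

```python
def asl_output(input_bit: int, vdd_positive: bool) -> int:
    """Logical state written to the receiving magnet m2 (illustrative encoding).

    Positive supply: minority spins reach m2, so the output is NOT(input).
    Negative supply: majority spins reach m2, so the output is COPY(input).
    """
    if input_bit not in (0, 1):
        raise ValueError("input_bit must be 0 or 1")
    return 1 - input_bit if vdd_positive else input_bit
```

With this encoding, `asl_output(1, True)` models the inverter of Fig. 2.2 and `asl_output(1, False)` models the COPY operation.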
Figure 2.3: Structure of a three-input majority gate [14]. Magnets A and C have
right-pointing magnetization and magnet B has left-pointing magnetization.
The output follows the majority direction of the inputs, i.e. the right-pointing
magnetization.
2.2 Digital Computing with ASL
Almost all ASL logic circuits are built from majority gates. Fig. 2.3 shows the basic
structure of a three-input majority gate based on ASL [14]. Inputs A, B, and C
transmit their spin currents from input to output via copper channels. Any two inputs
may have either parallel or anti-parallel spins. However, among three inputs
there is always a majority spin direction, which determines the magnetization
direction of the output. For example, as shown in Fig. 2.3, inputs A and C are
magnetized to the right and B to the left. In this case, the output magnetization
follows the majority direction, i.e. to the right (the same as inputs A and C).
A three-input majority gate itself can function as a NAND, NOR, AND, or OR
gate if the supply voltage is appropriately chosen, as shown in Table 2.1 [3].
Inputs B and C act as the data inputs, which are set by the user. Input A
and the supply voltage VDD act as controls. When A is 0 and VDD is positive, the
function B NAND C is realized. Similarly, when A is 0 and VDD is negative, the function
B AND C is realized; when A is 1 and VDD is positive, the function B NOR C is realized;
when A is 1 and VDD is negative, the function B OR C is realized. In this way, computing
elements can be built with three-input majority gates.

Table 2.1: The truth table of the three-input majority gate [3].
A  B  C   OUT (positive VDD)   OUT (negative VDD)
0  0  0   1  (NAND)            0  (AND)
0  0  1   1  (NAND)            0  (AND)
0  1  0   1  (NAND)            0  (AND)
0  1  1   0  (NAND)            1  (AND)
1  0  0   1  (NOR)             0  (OR)
1  0  1   0  (NOR)             1  (OR)
1  1  0   0  (NOR)             1  (OR)
1  1  1   0  (NOR)             1  (OR)

Table 2.2: The truth table of the full adder with the illustration of the XOR/XNOR
gate design [3].
A  B  Cin   Cout   Comp.Cout   Comp.S
0  0  0     0      1           1   (XNOR)
0  0  1     0      1           0   (XNOR)
0  1  0     0      1           0   (XNOR)
0  1  1     1      0           1   (XNOR)
1  0  0     0      1           0   (XOR)
1  0  1     1      0           1   (XOR)
1  1  0     1      0           1   (XOR)
1  1  1     1      0           0   (XOR)
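The gate selection summarized in Table 2.1 can be reproduced with a short behavioral sketch. The helper names (`majority`, `asl_maj3`, `table_2_1`) and the convention that a positive supply inverts the majority while a negative supply passes it are illustrative assumptions based on the description above.

```python
from itertools import product

def majority(*bits):
    """Return 1 if more than half of the input bits are 1."""
    return int(sum(bits) > len(bits) / 2)

def asl_maj3(a, b, c, vdd_positive):
    """Three-input ASL majority gate: positive VDD inverts, negative VDD copies."""
    m = majority(a, b, c)
    return 1 - m if vdd_positive else m

def table_2_1():
    """Rows (A, B, C, OUT at positive VDD, OUT at negative VDD)."""
    return [(a, b, c, asl_maj3(a, b, c, True), asl_maj3(a, b, c, False))
            for a, b, c in product((0, 1), repeat=3)]
```

Fixing A to 0 or 1 and choosing the supply polarity selects one of the four two-input functions of B and C.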
The typical building block discussed in this section is the full adder. Three inputs
are needed to build a full adder: two addends A and B, and a carry Cin. The truth
table of a full adder is shown in Table 2.2 [3]. Notice that in a full adder, XOR and
XNOR gates can also be realized.
The layout of a full adder is shown in Fig. 2.4 [3]. The supply voltage is positive,
so the circuit produces the complementary outputs Comp.Cout and Comp.S.
A trick is applied to reduce the number of majority gates. We can see
from the truth table (Table 2.2) that in order to get the complementary S, three inputs
and two complementary Couts are needed. The trick is: the channel that transmits the
complementary Cout is made shorter than the channels transmitting inputs A, B and
Cin, so that the spin current of the complementary Cout is strong enough to act as
two complementary Couts (in relative strength compared to A, B and Cin).
Figure 2.4: A full adder using a three-input majority gate [3]. A, B and Cin act as
inputs. Output Comp.Cout is obtained from the three inputs A, B and Cin. Output Comp.S
is obtained from five inputs: A, B, Cin, Cout and Cout (double Couts).
As shown in Fig. 2.5 (a), the channels connected to the five-input majority gate
(M5) are made to have the same length (the wire lengths in Fig. 2.5 do not represent
the real or relative lengths); thus the five inputs of the five-input majority gate have
the same strength. In this case, two three-input majority gates (M3) are built to
generate two Couts and one five-input majority gate is built to generate one S. Fig.
2.5 (b) is the modified schematic of a full adder, where the trick is used. Only one
three-input majority gate is used in place of the two. It generates a single Cout,
but this Cout is as strong as the two Couts generated in Fig. 2.5 (a) combined.
Because the connection channel between this Cout and the input side of M5 is made
shorter than the other connection channels, Cout is twice as strong as each of the
inputs A, B and Cin.
Figure 2.5: (a) ASL full adder constructed from three majority gates. Mx denotes a
majority gate with x inputs, e.g. M3 is a majority gate with three inputs; (b)
modified ASL full adder. The five-input majority gate has five inputs: Cout, Cout, A,
B and Cin. Because two Couts go into M5 at the same time, the logical Cout is
twice as strong as the other inputs.
In the actual design, instead of going into M5, the three inputs A, B and Cin are
combined and then joined with Cout, i.e. M5 is not used in this modified design, as
shown in Fig. 2.4. The combined input, containing the spins of two Couts, one
A, one B, and one Cin, goes into a single ASL basic unit (i.e. an inverter). Thus, by
using the trick, the full adder is made of a three-input majority gate and an inverter,
as shown in Fig. 2.4. Because of the routing of inputs A, B and Cin, the actual
size of this type of full adder is equal to that of a circuit of three three-input
majority gates.
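The double-strength carry trick can be checked with a weighted-majority sketch. It is a logical model only: the weight of 2 assigned to the complementary carry stands in for the shorter channel, and the function names are hypothetical. It assumes, following the positive-supply operation described above, that both stages invert.

```python
def weighted_majority(bits, weights):
    """Return 1 if the total weight of the 1-inputs exceeds half the total weight."""
    ones = sum(w for bit, w in zip(bits, weights) if bit)
    return int(2 * ones > sum(weights))

def asl_full_adder(a, b, cin):
    """Return (Comp.Cout, Comp.S) of the two-stage ASL full adder sketch."""
    # Stage 1: inverting three-input majority gate produces the complementary carry.
    comp_cout = 1 - weighted_majority((a, b, cin), (1, 1, 1))
    # Stage 2: inverter fed by A, B, Cin and the double-strength complementary carry.
    comp_s = 1 - weighted_majority((a, b, cin, comp_cout), (1, 1, 1, 2))
    return comp_cout, comp_s
```

For every input combination, the pair returned equals the bitwise complement of the carry and sum bits of A + B + Cin.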
A basic ASL unit can act as a latch because the state of each ferromagnet is stable
when no voltage is supplied. When the supply voltage VDD of a basic ASL
unit is replaced by a clock voltage VCLK, the basic ASL unit becomes an ASL latch.
In Fig. 2.6 [3], the middle ASL unit acts as a latch. We can see in Fig. 2.6 (b)
and (c) that the clock voltage VCLK has two phases: VSS and VDD. In (b), unit A
sends current to L and sets the magnetization of L, i.e. L has either the same
magnetization as A or the opposite one. However, L does not send current to B, because
no voltage (VSS) is supplied. The latch is “OFF”. Thus, L separates A from B. In (c),
VDD is supplied to L. L is “ON” and sends the information it received from A to B.
L is now transparent.
The schematic of a D flip-flop (DFF) is shown in Fig. 2.7 [3]. These components
have been physically fabricated and tested by M. Alawein's group. Two ferromagnets D1
and D2 are supplied with VCLK and VCLK−b, respectively. VCLK−b is the complement
of VCLK. VCLK has two phases: VSS and VDD. We can see from (b) that when D1 is
supplied with VDD, the information is transmitted from A to D1 and held there until
the supply voltage of D2 changes from VSS to VDD, as shown in (c). The information
is then transmitted from D2 to B in the same clock cycle. This is a
positive-edge-triggered DFF. A negative-edge-triggered DFF is obtained by exchanging
the VCLK and VCLK−b supplies of D1 and D2.
Figure 2.6: (a) The schematic of a latch using ASL; (b) Current cannot flow from A to
B via L when the supply voltage of L is low (opaque); (c) Current flows from A to B
via L when the supply voltage of L is high (transparent) [3].
Figure 2.7: (a) The schematic of a D flip-flop using ASL (D1 and D2) [3]; (b) D1 is
transparent (supplied with VDD) and D2 is opaque (supplied with VSS); (c) D1 is
opaque and D2 is transparent.
Fig. 2.8 shows the schematic of a pipelined full adder with a 2-phase clock [13].
Three full adders are connected and supplied with different clock voltages. The first
and third adders (FA1 and FA3) use VCLK and the second adder (FA2) uses VCLK−b.
VCLK−b is low when VCLK is high and vice versa. The width of the clock pulse
depends on the critical path of each adder.
When VCLK is high, the first and third adders start to compute and send information
to the next stage, i.e. the first adder sends its computation result to the second
adder, and the third adder sends its result to its next stage, which is not shown in
this figure. FA2 absorbs the output from FA1 (i.e. Cout from FA1) but does not do
any further computation. Thus, except for the one magnet that stores the result (Cout
from FA1) of the current cycle, the magnets of FA2 keep the results from the previous
clock cycle.
Then VCLK changes to low and VCLK−b changes to high. The first and third adders
go to the steady state, keeping the results they obtained in the last clock
cycle (except for the receiving magnet, i.e. the magnet that receives the previous
adder's Cout), while the second adder starts its computation and sends the result to
the third adder. The third adder receives the computation result (Cout) from the second
adder but does not do any computation until VCLK changes to high.
Figure 2.8: The schematic of a pipelined full adder using the 2-phase clock [13]. CLK
and CLK-b have opposite clock phases. When CLK is high and CLK-b is low, FA1
and FA3 compute and FA2 holds. When CLK is low and CLK-b is high, FA1 and FA3
hold and FA2 computes.
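The alternation between computing and holding stages can be written down as a small schedule generator. This is a bookkeeping sketch only: the function name `two_phase_schedule` and the assumption that the clock starts in its high phase are illustrative, and no magnet dynamics are modeled.

```python
def two_phase_schedule(n_half_cycles, stages=("FA1", "FA2", "FA3")):
    """List (clk_high, computing_stages, holding_stages) for each half cycle.

    Odd-numbered stages (FA1, FA3, ...) are driven by CLK and even-numbered
    stages (FA2, ...) by CLK-b, so they compute in alternating half cycles.
    """
    schedule = []
    for k in range(n_half_cycles):
        clk_high = (k % 2 == 0)  # assumption: the first half cycle has CLK high
        computing = [s for i, s in enumerate(stages) if (i % 2 == 0) == clk_high]
        holding = [s for s in stages if s not in computing]
        schedule.append((clk_high, computing, holding))
    return schedule
```

Calling `two_phase_schedule(2)` lists the two phases described above: FA1 and FA3 compute while FA2 holds, then the roles swap.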
CHAPTER 3
ENERGY EFFICIENCY OF COMPUTATION IN ASL:
THEORETICAL METHODS
This chapter introduces two different theoretical approaches for studying energy
dissipation in ASL: circuit simulation and physical-information-theoretic analysis.
Circuit simulation provides energy dissipation estimates for specified realizations
of ASL circuits, taking into account the material parameters, connection models, device
dimensions, supply voltage conditions, etc. The results are numerical projections
of energy dissipation [17][10]. The physical-information-theoretic analyses, on the
other hand, provide fundamental lower bounds on energy dissipation that depend
on the circuit structure and control but are independent of the particulars of the
circuit and device realizations. These analyses quantify the amount of information
irreversibly lost during computation for the underlying data processing method and
then obtain the energy dissipation associated with this information loss [11]. This
chapter introduces these two methods in detail and shows how to apply them to basic
ASL circuits.
3.1 Energy Efficiency From Circuit Simulation
3.1.1 Landau-Lifshitz-Gilbert Equation and spin storage
Figure 3.1: An overview of the circuit simulation model for two magnets and a
non-ferromagnetic channel. The input voltage is applied to magnet 1, affects the spins
and generates spin currents. Magnet 1 involves two steps of the simulation, spin
injection and spin storage. The spin currents then go through the non-ferromagnetic
channel, which is another step of the simulation called spin transport. When the spin
current arrives at magnet 2, it affects the spin direction of magnet 2, and an output
voltage is detected. The feedback spin current, which carries feedback information,
flows through the non-ferromagnetic channel and reaches magnet 1. The magnetization
direction of magnet 1 is influenced slightly by the feedback current and quickly
reaches the equilibrium state.
The Landau-Lifshitz-Gilbert equation (LLG equation) describes how a magnet responds
to changes in the magnetic field and in the spin current polarization, i.e. the
LLG equation describes how the information in the magnetization of the ferromagnet
and the current influence each other. The LLG equation is written as follows:
\[
\frac{\partial \vec{M}}{\partial t} = -\gamma\, \vec{M} \times \vec{H}_{eff}
  + \frac{\alpha}{M_s}\, \vec{M} \times \frac{\partial \vec{M}}{\partial t} \tag{3.1}
\]
where $M_s = |\vec{M}|$ is the magnitude of the magnetization vector in a small region
$dV_r$, $t$ is time, $\gamma$ is the gyromagnetic ratio, $\vec{H}_{eff}$ is the
effective magnetic field applied to $dV_r$, and $\alpha > 0$ is the Gilbert damping
constant, which depends on the material.
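The detailed derivation of the Landau-Lifshitz-Gilbert equation is in Appendix A. The
general LLG equation shown at the end of Appendix A is:
\[
\frac{\partial \vec{M}}{\partial t} = -\gamma\, \vec{M} \times \vec{H}_{eff}
  + \frac{\alpha}{M_s}\, \vec{M} \times \frac{\partial \vec{M}}{\partial t} \tag{3.2}
\]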
When applying the LLG equation in a spintronic device, we need to take the influence
of spin current polarization into consideration, as shown in Appendix B.
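To make the behavior described by the LLG equation concrete, the following sketch integrates it numerically for a single magnet without the spin-current term. It relies on several assumptions that are not part of the thesis: dimensionless units with $\gamma = 1$, a unit vector $\vec{m} = \vec{M}/M_s$, a constant effective field, a fourth-order Runge-Kutta integrator, and the explicit form $(1+\alpha^2)\,d\vec{m}/dt = -\gamma[\vec{m}\times\vec{H} + \alpha\,\vec{m}\times(\vec{m}\times\vec{H})]$ obtained by solving the Gilbert form for $d\vec{m}/dt$ when $|\vec{m}| = 1$.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(m):
    n = sum(c * c for c in m) ** 0.5
    return tuple(c / n for c in m)

def llg_rhs(m, h, gamma=1.0, alpha=0.1):
    """dm/dt = -(gamma / (1 + alpha^2)) * [m x H + alpha * m x (m x H)]."""
    mxh = cross(m, h)
    mxmxh = cross(m, mxh)
    k = -gamma / (1.0 + alpha * alpha)
    return tuple(k * (p + alpha * q) for p, q in zip(mxh, mxmxh))

def integrate_llg(m0, h, dt=0.05, steps=1000, gamma=1.0, alpha=0.1):
    """RK4 integration; m is renormalized after every step to keep |m| = 1."""
    m = normalize(m0)
    for _ in range(steps):
        k1 = llg_rhs(m, h, gamma, alpha)
        k2 = llg_rhs(tuple(a + 0.5 * dt * b for a, b in zip(m, k1)), h, gamma, alpha)
        k3 = llg_rhs(tuple(a + 0.5 * dt * b for a, b in zip(m, k2)), h, gamma, alpha)
        k4 = llg_rhs(tuple(a + dt * b for a, b in zip(m, k3)), h, gamma, alpha)
        m = normalize(tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + s)
                            for a, p, q, r, s in zip(m, k1, k2, k3, k4)))
    return m
```

Starting from a magnetization perpendicular to the field, e.g. `integrate_llg((1, 0, 0), (0, 0, 1))`, the magnetization precesses around the field while the damping term gradually aligns it with the field direction.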
Figure 3.2: Spin storage simulation [17].
We go a step further and break the modified LLG equation down into three
equations, each of which gives the relationship between current and magnetic
field or voltage in one direction (x, y, or z-direction) of the magnets [17]:
\[
N_s q (1 + a^2) \frac{dm_x}{dt} = f(\vec{m}, \vec{I}_s, \vec{H}_{eff}) \tag{3.3}
\]
\[
N_s q (1 + a^2) \frac{dm_y}{dt} = g(\vec{m}, \vec{I}_s, \vec{H}_{eff}) \tag{3.4}
\]
\[
N_s q (1 + a^2) \frac{dm_z}{dt} = h(\vec{m}, \vec{I}_s, \vec{H}_{eff}) \tag{3.5}
\]
The form of these three equations is similar to $C\,\frac{dv}{dt} = i$, i.e. the I-V
characteristic of charging and discharging a capacitor. Here the capacitance is
$C = N_s q (1 + a^2)$ and $i = f(\vec{m}, \vec{I}_s, \vec{H}_{eff})$,
$i = g(\vec{m}, \vec{I}_s, \vec{H}_{eff})$, or $i = h(\vec{m}, \vec{I}_s, \vec{H}_{eff})$.
Thus we have the spin storage simulation circuit shown in Fig. 3.2 [17]. The current
$i$ is a voltage-dependent current source, and $H_{eff}$ is a voltage-dependent
voltage source.
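The functions $f(\vec{m}, \vec{I}_s, \vec{H}_{eff})$, $g(\vec{m}, \vec{I}_s, \vec{H}_{eff})$,
and $h(\vec{m}, \vec{I}_s, \vec{H}_{eff})$ are
\[
\begin{aligned}
f(\vec{m}, \vec{I}_s, \vec{H}_{eff}) ={}& I_{s,y} m_x m_y - I_{s,x} m_y^2 + I_{s,z} m_x m_z - I_{s,x} m_z^2 \\
& - I_{s,z} m_x^2 m_y a - I_{s,z} m_y^3 a + I_{s,y} m_x^2 m_z a + I_{s,y} m_y^2 m_z a
  - I_{s,z} m_y m_z^2 a + I_{s,y} m_z^3 a \\
& + H_{eff,z} m_y N_s q \gamma \mu_0 - H_{eff,y} m_z N_s q \gamma \mu_0
  + H_{eff,y} m_x m_y N_s q a \gamma \mu_0 - H_{eff,x} m_y^2 N_s q a \gamma \mu_0 \\
& + H_{eff,z} m_x m_z N_s q a \gamma \mu_0 - H_{eff,x} m_z^2 N_s q a \gamma \mu_0
\end{aligned} \tag{3.6}
\]
\[
\begin{aligned}
g(\vec{m}, \vec{I}_s, \vec{H}_{eff}) ={}& - I_{s,y} m_x^2 + I_{s,x} m_x m_y + I_{s,z} m_y m_z - I_{s,y} m_z^2 \\
& + I_{s,z} m_x^3 a + I_{s,z} m_x m_y^2 a - I_{s,x} m_x^2 m_z a - I_{s,x} m_y^2 m_z a
  + I_{s,z} m_x m_z^2 a - I_{s,x} m_z^3 a \\
& - H_{eff,z} m_x N_s q \gamma \mu_0 + H_{eff,x} m_z N_s q \gamma \mu_0
  - H_{eff,y} m_x^2 N_s q a \gamma \mu_0 + H_{eff,x} m_x m_y N_s q a \gamma \mu_0 \\
& + H_{eff,z} m_y m_z N_s q a \gamma \mu_0 - H_{eff,y} m_z^2 N_s q a \gamma \mu_0
\end{aligned} \tag{3.7}
\]
\[
\begin{aligned}
h(\vec{m}, \vec{I}_s, \vec{H}_{eff}) ={}& - I_{s,z} m_x^2 - I_{s,z} m_y^2 + I_{s,x} m_x m_z + I_{s,y} m_y m_z \\
& - I_{s,y} m_x^3 a + I_{s,x} m_x^2 m_y a - I_{s,y} m_x m_y^2 a + I_{s,x} m_y^3 a
  - I_{s,y} m_x m_z^2 a + I_{s,x} m_y m_z^2 a \\
& + H_{eff,y} m_x N_s q \gamma \mu_0 - H_{eff,x} m_y N_s q \gamma \mu_0
  - H_{eff,z} m_x^2 N_s q a \gamma \mu_0 - H_{eff,z} m_y^2 N_s q a \gamma \mu_0 \\
& + H_{eff,x} m_x m_z N_s q a \gamma \mu_0 + H_{eff,y} m_y m_z N_s q a \gamma \mu_0
\end{aligned} \tag{3.8}
\]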
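The long component expressions are easier to check in vector form. Expanding the cross products shows that Eqs. 3.7 and 3.8 are the y and z components of $\vec{m}\times(\vec{m}\times\vec{I}_s) + a\,|\vec{m}|^2\,(\vec{I}_s\times\vec{m}) + N_s q \gamma\mu_0\,[\vec{m}\times\vec{H}_{eff} + a\,\vec{m}\times(\vec{m}\times\vec{H}_{eff})]$, with $f$ the corresponding x component. The sketch below evaluates this vector expression; the function and parameter names (`spin_storage_rhs`, `ns_q`, `gamma_mu0`) are illustrative assumptions, and the vector form itself is a compact restatement rather than a formula quoted from [17].

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def spin_storage_rhs(m, i_s, h_eff, ns_q, a, gamma_mu0):
    """Return (f, g, h), the right-hand sides of Eqs. 3.3-3.5, from the vector form.

    m, i_s, h_eff : 3-tuples (magnetization, spin current, effective field)
    ns_q          : the product N_s * q
    a             : the damping parameter written as 'a' in Eqs. 3.3-3.8
    gamma_mu0     : the product gamma * mu_0
    """
    m_sq = sum(c * c for c in m)
    torque = cross(m, cross(m, i_s))      # m x (m x I_s)
    torque_a = cross(i_s, m)              # I_s x m, scaled below by a * |m|^2
    mxh = cross(m, h_eff)                 # m x H_eff
    mxmxh = cross(m, mxh)                 # m x (m x H_eff)
    k = ns_q * gamma_mu0
    return tuple(t + a * m_sq * ta + k * (p + a * q)
                 for t, ta, p, q in zip(torque, torque_a, mxh, mxmxh))
```

Dividing the returned values by $N_s q (1 + a^2)$ gives the time derivatives of $m_x$, $m_y$ and $m_z$, which is the "capacitor current" interpretation used in the spin storage circuit.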
3.1.2 Spin Injection
The two-channel resistor model [18][19] is used to build the spin injection model. In
the two-channel resistor model, the spin current is split into a majority spin current
$I_\uparrow$ and a minority spin current $I_\downarrow$, determined by the majority and
minority spin conductances of the material, $G_\uparrow$ and $G_\downarrow$,
respectively [18].
Figure 3.3: The charge with a spin direction (spin up or down) flows right, generating
a charge flow and a spin flow. For charge flow, the charge current ~IC = ~I↑ + ~I↓. For
spin flow, the spin current ~Is = ~I↑ − ~I↓.
Usually, a flow of charges (electrons) carries only a charge current, because the
numbers of spin-up electrons and spin-down electrons are the same, so the spin current
$\vec{I}_s = \vec{I}_\uparrow - \vec{I}_\downarrow = 0$. However, if the numbers of
spin-up and spin-down electrons are made unequal, i.e. a majority spin current and a
minority spin current are created, we get both a charge current and a spin current,
which is the case for all-spin logic. Ideally, the charge current flows only from the
voltage supply to the ground lead, and the spin current is transported from one magnet
to the other via the channel.
For the charge current $I_C$ in all-spin logic, we can use the conventional model to
calculate the value:
\[
I_C = G_\uparrow (V_{C,N} + \vec{V}_s \cdot \vec{m} - V_{C,F})
    + G_\downarrow (V_{C,N} - \vec{V}_s \cdot \vec{m} - V_{C,F}) \tag{3.9}
\]
where $G_\uparrow$ and $G_\downarrow$ are the majority and minority spin conductances,
respectively. $V_{C,N}$ and $V_{C,F}$ are the charge voltages of the nonmagnetic
material and the ferromagnetic material, respectively. $\vec{V}_s$ is the spin voltage.
By evaluating the dot product in the equation above, we get
\[
I_C = (G_\uparrow + G_\downarrow)(V_{C,N} - V_{C,F})
    + (G_\uparrow - G_\downarrow)(V_{s,x} m_x + V_{s,y} m_y + V_{s,z} m_z) \tag{3.10}
\]
where $V_{s,x}$, $V_{s,y}$ and $V_{s,z}$ are the spin voltages in the x, y and
z-directions, and $m_x$, $m_y$ and $m_z$ are the components of $\vec{m}$.
Figure 3.4: Magnetization direction of $\vec{m}_2$
Because the magnetization directions on the two sides of the interface are usually
non-collinear, i.e. $\vec{m}_1 \neq \pm\vec{m}_2$, the spin current $I_s$ is divided
into a parallel and a perpendicular spin current. As shown in Figure 3.4, the parallel
current $I_{s,\parallel}$ is in-plane perpendicular to the magnetization direction of
the injected material ($\vec{m}_2$), thus its magnetization direction is parallel to
$\vec{m}_2$. The perpendicular current $I_{s,\perp}$ has two parts. One part is
parallel to the magnetization direction of the injected material ($\vec{m}_2$). The
other part is out-of-plane perpendicular to the magnetization direction. Both parts
have a magnetization direction that is perpendicular to $\vec{m}_2$.
Thus, the parallel spin current can be written as
\[
\vec{I}_{s,\parallel} = \vec{m}\,\bigl[ G_\uparrow (V_{C,N} + \vec{V}_s \cdot \vec{m} - V_{C,F})
  - G_\downarrow (V_{C,N} - \vec{V}_s \cdot \vec{m} - V_{C,F}) \bigr] \tag{3.11}
\]
which can be further written as
\[
\vec{I}_{s,\parallel} = \bigl[ (G_\uparrow - G_\downarrow)(V_{C,N} - V_{C,F})
  + (G_\uparrow + G_\downarrow)(V_{s,x} m_x + V_{s,y} m_y + V_{s,z} m_z) \bigr]
  (m_x \vec{x} + m_y \vec{y} + m_z \vec{z}) \tag{3.12}
\]
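The perpendicular spin current can be written as
\[
\vec{I}_{s,\perp} = 2\,\mathrm{Re}\,G_{\uparrow\downarrow}\; \vec{m} \times (\vec{V}_s \times \vec{m})
  + 2\,\mathrm{Im}\,G_{\uparrow\downarrow}\; \vec{V}_s \times \vec{m} \tag{3.13}
\]
which can be further written as
\[
\begin{aligned}
\vec{I}_{s,\perp} ={}& 2\,\mathrm{Re}\,G_{\uparrow\downarrow} \bigl[
  (V_{s,x} - V_{s,x} m_x^2 - V_{s,y} m_x m_y - V_{s,z} m_x m_z)\,\vec{x} \\
& \quad + (V_{s,y} - V_{s,x} m_x m_y - V_{s,y} m_y^2 - V_{s,z} m_y m_z)\,\vec{y} \\
& \quad + (V_{s,z} - V_{s,x} m_x m_z - V_{s,y} m_y m_z - V_{s,z} m_z^2)\,\vec{z} \bigr] \\
& + 2\,\mathrm{Im}\,G_{\uparrow\downarrow} \bigl[ (-V_{s,z} m_y + V_{s,y} m_z)\,\vec{x}
  + (-V_{s,x} m_z + V_{s,z} m_x)\,\vec{y} + (-V_{s,y} m_x + V_{s,x} m_y)\,\vec{z} \bigr]
\end{aligned} \tag{3.14}
\]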
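The interface currents of Eqs. 3.9, 3.11 and 3.13 can be evaluated directly from the vector definitions, which also makes it easy to confirm that the perpendicular spin current has no component along $\vec{m}$. The sketch below is illustrative: the function name `injection_currents`, the argument `dv` for $V_{C,N} - V_{C,F}$, and the use of a Python complex number for $G_{\uparrow\downarrow}$ are assumptions, and $\vec{m}$ is taken to be a unit vector.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def injection_currents(m, v_s, dv, g_up, g_dn, g_mix):
    """Return (I_C, I_s_parallel, I_s_perp) at a ferromagnet/channel interface.

    m     : unit magnetization vector (3-tuple)
    v_s   : spin voltage vector (3-tuple)
    dv    : charge voltage difference V_C,N - V_C,F
    g_up, g_dn : majority and minority spin conductances
    g_mix : complex spin-mixing conductance
    """
    vs_m = dot(v_s, m)
    i_c = g_up * (dv + vs_m) + g_dn * (dv - vs_m)                # Eq. 3.9
    magnitude = g_up * (dv + vs_m) - g_dn * (dv - vs_m)          # bracket of Eq. 3.11
    i_par = tuple(magnitude * c for c in m)
    damping_like = cross(m, cross(v_s, m))                       # m x (V_s x m)
    field_like = cross(v_s, m)                                   # V_s x m
    i_perp = tuple(2 * g_mix.real * p + 2 * g_mix.imag * q       # Eq. 3.13
                   for p, q in zip(damping_like, field_like))
    return i_c, i_par, i_perp
```

For a unit vector `m`, `dot(i_perp, m)` evaluates to zero up to rounding, and `i_par` is parallel to `m`, in line with the decomposition described above.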
3.1.3 Spin Transport
First, it can be inferred from the name of the all-spin logic device that, in the
transport channel, all the information is transmitted only by the spin current. Special
methodologies are used to generate only spin current but no charge current. However,
the charge current does exist in the channel. It is generated only by the charge
diffusion, while both spin drift and diffusion generate the spin current.
To present the charge diffusion equation, we start from the basic relation
$\frac{Q}{t} = I$, where $Q = CV$. Substituting $Q$ into this relation, we get
$\frac{CV}{t} = I$. Taking the partial derivative with respect to time on the left side
and the partial derivative with respect to length on the right side (to model the
charging and discharging of a transmission-line segment of length $\partial x$),
we get the continuity equation of the charge diffusion current as follows [20][21]
\[
C_e \frac{\partial V_c}{\partial t} = \frac{\partial I_c}{\partial x}, \qquad
I_c = \sigma A \frac{\partial V_c}{\partial x} \tag{3.15}
\]
where $C_e$ is the electrostatic capacitance per unit length, $V_c$ is the charge
voltage, $I_c$ is the charge current, $\sigma$ is the conductivity and $A$ is the
cross-sectional area.
Figure 3.5: Spin injection model [17].
For the spin current, following the form of Equation 3.15, we can write both a
continuity equation and a current density.
The spin current continuity equation is similar to the charge continuity equation
(Equation 3.15), but with an additional spin relaxation term:
\[
\frac{\partial \vec{s}}{\partial t} = \frac{\partial \vec{J}_s}{\partial x}
  - \frac{\vec{s}}{\tau_s} \tag{3.16}
\]
where $\vec{s}$ is the spin accumulation and $\tau_s$ is the spin relaxation time. The
part $\frac{\partial \vec{s}}{\partial t} = \frac{\partial \vec{J}_s}{\partial x}$ is
the continuity version of $\frac{Q}{t} = I$.
Driven by both drift and diffusion, the current density can be written as
$\vec{J}_s = \vec{J}_{diff} + \vec{J}_{drift}$, or
\[
\vec{J}_s = D \frac{\partial \vec{s}}{\partial x} + \mu E \vec{s} \tag{3.17}
\]
where $\mu$ is the mobility of electrons or holes, $E$ is the electric field applied to
the channel, $D$ is the diffusion coefficient for the electron, and $\vec{s}$ is the
spin accumulation. The spin accumulation can also be written as
$s = \mu_s \frac{\partial n_0}{\partial \eta}$, where $\mu_s$ is the spin
quasi-chemical potential, which can also be written as $\mu_s = V_s q^2$, $\eta$ is the
chemical potential, and $n_0$ is the carrier concentration. Substituting $\mu_s$, we
get $s = V_s q^2 \frac{\partial n_0}{\partial \eta}$. Dividing both sides by $V_s$, we
get $\frac{s}{V_s} = q^2 \frac{\partial n_0}{\partial \eta}$. This is the form of the
quantum capacitance $C_q$, i.e.
$\frac{s}{V_s} = q^2 \frac{\partial n_0}{\partial \eta} = C_q$.
According to the general form of Einstein's relation,
$\sigma = q^2 D \frac{\partial n_0}{\partial \eta}$, we get $C_q = \frac{\sigma}{D}$.
Thus, $s = \frac{\sigma}{D} V_s$. In vector form, we get
\[
\vec{s} = \frac{\sigma}{D} \vec{V}_s \tag{3.18}
\]
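Substituting Equation 3.18 into Equations 3.16 and 3.17, we get
\[
\begin{aligned}
C_q \frac{\partial \vec{V}_s}{\partial t} &= \frac{\partial \vec{J}_s}{\partial x}
  - \frac{C_q}{\tau_s} \vec{V}_s \\
\vec{J}_s &= \sigma \frac{\partial \vec{V}_s}{\partial x} + \frac{\mu E \sigma}{D} \vec{V}_s
\end{aligned} \tag{3.19}
\]
To turn the current density into a current, we multiply both sides of Equation 3.19 by
the cross-sectional area $A$:
\[
\begin{aligned}
C_q A \frac{\partial \vec{V}_s}{\partial t} &= \frac{\partial \vec{I}_s}{\partial x}
  - \frac{C_q A}{\tau_s} \vec{V}_s \\
\vec{I}_s &= \sigma A \frac{\partial \vec{V}_s}{\partial x} + \frac{\mu E A \sigma}{D} \vec{V}_s
\end{aligned} \tag{3.20}
\]
where $E A \sigma = \frac{\partial V_c}{\partial x} A \sigma = I_c$. Thus, Equation 3.20
can be written as
\[
\begin{aligned}
C_q A \frac{\partial \vec{V}_s}{\partial t} &= \frac{\partial \vec{I}_s}{\partial x}
  - \frac{C_q A}{\tau_s} \vec{V}_s \\
\vec{I}_s &= \sigma A \frac{\partial \vec{V}_s}{\partial x} + \frac{\mu I_c}{D} \vec{V}_s
\end{aligned} \tag{3.21}
\]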
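Combining Equation 3.15 and Equation 3.21, we get a general equation to describe
spin transport:
\[
\begin{aligned}
\mathbf{C}_s \frac{\partial \mathbf{V}_s}{\partial t} &= \frac{\partial \mathbf{I}_s}{\partial x}
  - \mathbf{G}_s \mathbf{V}_s \\
\mathbf{I}_s &= \sigma A \frac{\partial \mathbf{V}_s}{\partial x} + \frac{\mu I_C}{D} \mathbf{V}_s
\end{aligned} \tag{3.22}
\]
where $\mathbf{C}_s$ is the unit tensor capacitance of the material
\[
\mathbf{C}_s =
\begin{pmatrix}
C_e & 0 & 0 & 0 \\
0 & C_q A & 0 & 0 \\
0 & 0 & C_q A & 0 \\
0 & 0 & 0 & C_q A
\end{pmatrix} \tag{3.23}
\]
and $\mathbf{G}_s$ is the spatial tensor conductance of the non-ferromagnetic channel.
It is defined as
\[
\mathbf{G}_s =
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & \frac{C_q A}{\tau_s} & 0 & 0 \\
0 & 0 & \frac{C_q A}{\tau_s} & 0 \\
0 & 0 & 0 & \frac{C_q A}{\tau_s}
\end{pmatrix} \tag{3.24}
\]
Figure 3.6: Spin transport model [17].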
The spatial derivative can be approximated by dividing the channel of length $L$ into a
finite number $N$ of segments, each of small length $\Delta x$. Between the two ends of
a segment, the voltages are $\mathbf{V}_s(x)$ and $\mathbf{V}_s(x + \Delta x)$,
respectively. The corresponding currents are $\mathbf{I}_s(x)$ and
$\mathbf{I}_s(x + \Delta x)$, respectively.
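The segment-wise discretization can be illustrated for the simplest case. The sketch below makes several assumptions that go beyond the text: steady state, a single spin-voltage component, no drift term (no charge current in the channel), a fixed spin voltage `v0` at the injecting end and zero spin voltage at the far end. Under these assumptions Equation 3.21 with $C_q = \sigma/D$ reduces to $\partial^2 V_s/\partial x^2 = V_s/\lambda^2$ with $\lambda = \sqrt{D\tau_s}$, and the N-segment ladder becomes a tridiagonal system, solved here with the Thomas algorithm. The function names and the numerical values in the usage note are hypothetical.

```python
import math

def solve_spin_voltage(v0, length, lam, n):
    """Spin voltage at the n+1 nodes of a channel split into n segments (n >= 2).

    Interior nodes satisfy -V[i-1] + (2 + (dx/lam)^2) V[i] - V[i+1] = 0,
    with boundary values V[0] = v0 and V[n] = 0.
    """
    dx = length / n
    diag = 2.0 + (dx / lam) ** 2
    size = n - 1                        # number of interior unknowns V[1..n-1]
    rhs = [0.0] * size
    rhs[0] = v0                         # contribution of the boundary node V[0]
    c_prime = [0.0] * size
    d_prime = [0.0] * size
    c_prime[0] = -1.0 / diag
    d_prime[0] = rhs[0] / diag
    for j in range(1, size):            # forward elimination
        denom = diag + c_prime[j - 1]
        c_prime[j] = -1.0 / denom
        d_prime[j] = (rhs[j] + d_prime[j - 1]) / denom
    v = [0.0] * (n + 1)
    v[0] = v0
    v[size] = d_prime[size - 1]         # back substitution starts at V[n-1]
    for j in range(size - 2, -1, -1):
        v[j + 1] = d_prime[j] - c_prime[j] * v[j + 2]
    return v

def analytic_spin_voltage(x, v0, length, lam):
    """Closed-form solution of the same boundary-value problem."""
    return v0 * math.sinh((length - x) / lam) / math.sinh(length / lam)
```

For example, with an assumed spin diffusion length of 400 nm and a 1 µm channel, `solve_spin_voltage(1.0, 1e-6, 4e-7, 200)` reproduces the analytic decay closely, which gives a quick check that the chosen number of segments is fine enough.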
3.1.4 Energy Efficiency
The energy efficiency, derived from the input-averaged energy dissipation, is the
number of computational operations performed per dissipated joule of energy. It is
obtained by post-processing the total energy dissipation from either circuit
simulation or physical-information-theoretic analysis.
The energy dissipation is “input averaged” to meaningfully account for the
dissipation incurred in processing all computational inputs, which may differ from
input to input, i.e. the dissipation may vary with the input pattern and length.
For example, consider a circuit with three inputs (eight possible patterns
from 000 to 111). By applying the two calculation methods, we get the projected
energy EA and the irreversibility-induced energy EB. The projected energy EA is the
energy dissipation obtained from circuit simulation for each input pattern from 000 to
111, and the results may vary from pattern to pattern. The irreversibility-induced
energy EB, however, depends only on the statistics of the input, i.e. EB need not be
calculated individually for the inputs 000, 001, 010, 011, ... The final result is
already input averaged.
For this reason, the input-averaged energy is used in this thesis, which allows
different types of energy dissipation to be compared on the same basis.
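The post-processing step can be sketched as follows. The function names and the dictionary-based interface are hypothetical, and the uniform input distribution used when no probabilities are supplied is an assumption made for illustration.

```python
def input_averaged_energy(energy_by_input, prob_by_input=None):
    """Average the per-pattern dissipation (in joules) over the input distribution.

    energy_by_input : dict mapping an input pattern (e.g. "010") to its energy
    prob_by_input   : dict mapping the same patterns to probabilities;
                      a uniform distribution is assumed when omitted
    """
    if prob_by_input is None:
        p = 1.0 / len(energy_by_input)
        return sum(e * p for e in energy_by_input.values())
    if abs(sum(prob_by_input.values()) - 1.0) > 1e-9:
        raise ValueError("input probabilities must sum to 1")
    return sum(energy_by_input[x] * prob_by_input[x] for x in energy_by_input)

def energy_efficiency(avg_energy_joules):
    """Number of operations per dissipated joule for one operation per input."""
    return 1.0 / avg_energy_joules
```

A simulation-based projection would supply one energy value per input pattern, whereas a bound that is already input averaged can be passed to `energy_efficiency` directly.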
3.2 Efficiency Limits from Physical Information Theory
The projected energy is a numerical estimate of the energy dissipation per
computation obtained for a specific circuit realization from circuit simulations. The
fundamental lower bound of the projected energy, which cannot be obtained from circuit
simulations, is the dissipation resulting from the irreversible information loss
associated with the logic function performed by the circuit and the manner in which
this function is performed. The irreversibility-induced energy is related only to the
irreversible information loss, regardless of the circuit material, connections or
working environment. It yields the lower bound of the energy dissipation.
In this thesis, we obtain the fundamental lower bound on the energy dissipation
using the methodology of Ercan and Anderson [11]. In this methodology, there are
four steps to determine the lower bound for a given circuit: physical decomposition,
process abstraction, operational decomposition, and cost analysis. We will
first introduce physical information theory, and then explain in detail how to
calculate the irreversible information loss using the four steps mentioned above and
obtain the corresponding energy dissipation.
3.2.1 Introduction of Physical Information Theory
Devices give off heat during computation. Some of this heat exchange is reversible,
and some is irreversible. If we define the device and its associated thermodynamic
surroundings as a universe, a computation process is thermodynamically reversible if,
within this universe, the final equilibrium state can be restored to the initial
equilibrium state without any energetic cost after a circuit operation [22].
For example, suppose input x goes through device D, which realizes function F(x)
and produces the output y. The initial equilibrium state is the state in which input x
is in device D but the computation has not yet started. The final equilibrium state is
the state in which the device has processed x, the output y has been obtained, and all
the heat has been dissipated into the surrounding environment, i.e. the heat in the
universe has come to a steady state and no longer flows from one place to another. If
we can restore this final equilibrium state to the initial equilibrium state without
any energy cost from outside the universe, i.e. all the heat flow is within the
universe, we say this computation operation is reversible [22].
Logically, a reversible computation operation means that the inputs and outputs
have a one-to-one relation, i.e. for a given output y, the input x is determined. For
example, suppose the device D realizes function F(x), and the input series
x1, x2, ..., xn goes through the device D and produces the output series
y1, y2, ..., yn. If the number of distinct outputs is the same as the number of
distinct inputs and they have a one-to-one relation, i.e. each output has only
one related input, the computation process is reversible.
On the other hand, an irreversible computation operation is, in thermodynamic terms,
one for which the final equilibrium state of the universe cannot be restored to the
initial equilibrium state without an additional energy cost from outside the universe.
In terms of logic circuit computation, an irreversible computation operation means that
in the device D, there exists at least one output y related to at least two inputs x1
and x2, i.e. for this output y, we cannot figure out which input x generated it. In
this case, the computation operation is irreversible.
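The logical criterion above amounts to checking whether the input-to-output map is one-to-one. A minimal sketch, with a hypothetical helper name and bits encoded as 0/1:

```python
from itertools import product

def is_logically_reversible(func, n_inputs):
    """Return True if no two distinct input patterns produce the same output."""
    seen = set()
    for x in product((0, 1), repeat=n_inputs):
        y = func(*x)
        if y in seen:
            return False        # two different inputs share this output
        seen.add(y)
    return True
```

An inverter passes the check, while a two-input AND gate does not, because the inputs 00, 01 and 10 all map to the output 0.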
The irreversible computation process is closely related to our final goal, the
irreversibility-induced energy dissipation. Landauer's Principle points out that if the
computation process is irreversible, the information in the device is erased and cannot
be restored, while a reversible computation process means the information can be
restored and no information is lost.
If information is lost, energy is consumed, and heat is dissipated into
the environment. The relation between information loss and energy/heat dissipation
is also given by Landauer's Principle [12]
\[
\Delta \langle E \rangle \geq -k_B T \ln(2) \times \Delta I \tag{3.25}
\]
where $k_B$ is the Boltzmann constant, $T$ is the temperature, and $-\Delta I$ is the
information loss from the computation process.
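As a rough numerical illustration of Equation 3.25, the sketch below estimates the information lost by a deterministic logic function as the entropy of its inputs minus the entropy of its outputs, and converts it into a dissipation bound. This is a simplification under stated assumptions (uniformly distributed inputs, a single-step function, a temperature of 300 K in the usage note); the full methodology of [11] also accounts for circuit structure and clocking. All function names are hypothetical.

```python
import math
from itertools import product

K_B = 1.380649e-23  # Boltzmann constant in J/K

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_loss_bits(func, n_inputs):
    """H(inputs) - H(outputs) for a deterministic func with uniform inputs."""
    inputs = list(product((0, 1), repeat=n_inputs))
    p_in = 1.0 / len(inputs)
    p_out = {}
    for x in inputs:
        y = func(*x)
        p_out[y] = p_out.get(y, 0.0) + p_in
    return entropy_bits([p_in] * len(inputs)) - entropy_bits(p_out.values())

def landauer_bound_joules(bits_lost, temperature_k=300.0):
    """Lower bound k_B * T * ln(2) * (bits lost) on the dissipated energy."""
    return K_B * temperature_k * math.log(2) * bits_lost
```

For a three-input majority function with uniform inputs, three bits enter and one bit leaves, so `information_loss_bits` returns two bits, and `landauer_bound_joules` converts this into the corresponding minimum dissipation at the chosen temperature.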
From Equation 3.25, we can get a general idea of how physical information theory
works: the circuit gives off heat if the computation involves irreversible information
loss, i.e. if there exist one or more outputs that cannot be traced back to a unique
input, we say that information is lost in the computing process. In this way, we can
obtain the irreversibility-induced energy dissipation by calculating the irreversible
information loss.
The detailed analysis steps [11] are shown in the following subsections.
3.2.2 Physical Decomposition
The first step in the physical-information-theoretic analysis is the physical
decomposition [11], which divides the entire system into two separate parts: the
computationally relevant domain and the environmental domain. These two domains do not
overlap, but they can interact with each other during the computing process. Also,
the entire system is carefully defined so that we can treat it as a closed system, i.e.
all the heat exchange takes place within the system.
As shown in Fig. 3.7 (a) and (b), the computationally relevant domain has three
parts: the information processing artifact A (which can be further divided into
representational elements C and non-representational elements $\bar{C}$), the input
referent R, and the supporting computational subsystem $\bar{A}$. The detailed
definitions are given below.
• Computationally Relevant Domain - The part of the universe directly related to the
computing process, consisting of a physical artifact (device or circuit) and the
surrounding subsystem.
– Information Processing Artifact A - The information processing artifact A is
the circuit of interest that realizes the expected function $L(x_i)$, where $x_i$ can
be any possible argument $x_i \in \{x_i\}$. The artifact A consists of the following
elements [11]:
* Representational Elements C - The representational elements C are the
components designed for storing information during the computation.
* Non-representational Elements $\bar{C}$ - The non-representational elements
$\bar{C}$ are the remaining parts of the information processing artifact A.
Sometimes they store information during computation as well, but they
are not designed for information storage. An example is the connection
wires between representational elements. They are designed for
transport, but sometimes they bear information as well.
– Input Referent R - The input referent is the original input, not yet
processed by the information processing artifact A, i.e. the “source file”
copied from the I/O port.
– Supporting Computational Subsystem $\bar{A}$ - The subsystem external to A that
supports A in completing its computation. Basically, it provides all the
information A needs for its function $L(x_i^{(\eta)})$, where $x_i^{(\eta)}$ is the
$\eta$th input in the input sequence. Thus $\bar{A}$ can be a simple register or
buffer that stores $L(x_i^{(\eta)})$. Or it can be the previous or next stage “A”
that produces $L(x_i^{(\eta-1)}) = x_i^{(\eta)}$ or receives
$L(x_i^{(\eta)}) = x_i^{(\eta+1)}$.
• Environmental Domain - The environmental domain is everything in the closed system
except the computationally relevant domain. It consists of two parts, the heat bath
B and the remote environment $\bar{B}$ [11]:
– Heat Bath B - The heat bath B is the environment in direct contact with
both the artifact A and the remote environment $\bar{B}$. It is a heat-exchanging
bridge between the information processing artifact A and the environment $\bar{B}$.
It usually first exchanges heat with artifact A due to the information loss caused by
circuit computation. Second, it exchanges heat with the environment $\bar{B}$ to
bring its temperature back to the equilibrium temperature T.
– Environment $\bar{B}$ - The remote environment $\bar{B}$ is the greater environment
that restores the heat bath B to its equilibrium state (temperature T).
Figure 3.7: Physical decomposition of the target circuit, including the computationally
relevant domain and the environmental domain [11].
3.2.3 Process Abstraction
Process abstraction describes how the entire closed system changes and is
“rethermalized” (i.e. its temperature returns to the equilibrium temperature T) during
the computing process [11]. It comprises control operations and restoration processes.
• Control operations - The information in the representational elements changes,
which causes information loss and heat dissipation. The heat dissipation
usually involves the information processing artifact A, the supporting computational
subsystem $\bar{A}$ and the heat bath B. The process is usually carried out by a set
of control operations $\phi_t \in \{\phi_t\}$, where $\{\phi_t\}$ is determined by the
functions implemented by the information processing artifact A [11].
• Restoration processes - The remote environment $\bar{B}$ restores the heat bath B and
the computationally relevant domain. The heat bath B drives artifact A toward the
thermal equilibrium state, but the results of the analysis do not presume that the
state of A at the conclusion of a computational step is restored to equilibrium.
3.2.4 Operational Decomposition
Operational decomposition divides the entire closed system in time and
space so that we can analyze the information cost step by step in the next section.
Operational decomposition consists of a clocking part and a computation part. The
detailed explanation is as follows [11]:
• Clocking
– Clock zones and subzones - A clock zone is a set of representational elements
C that respond to the same control operation $\phi_t \in \{\phi_t\}$ at the same
time t. Time t can be any desired time. The representational elements C of a clock
zone may be at separate physical locations that have no
direct interaction with each other, i.e. they may compute independently.
A collection of these physically separated representational elements within
the same clock zone is called a subzone. A clock zone may have one or
more subzones. The $u$th clock zone is denoted as $C(u)$. The $l$th subzone in
the $u$th clock zone is denoted as $C_l(u)$ [11].
– Clock step - A clock step describes what happens in a certain period of time
to a certain area. More precisely, it denotes that in a period of time,
a desired control operation $\phi_t$ is applied to a desired clock zone $C(u)$. For
example, clock step $\varphi_v = \{(C(u); \phi_t)\}_v$ indicates that at clock step
$\varphi_v$, control operation $\phi_t$ is applied to clock zone $C(u)$ [11].
– Clock cycle - The control operations are usually periodic, i.e. after a
sequence of control operations $\Phi = \phi_1 \phi_2 \phi_3 \ldots \phi_n$, the next
control operation $\phi_{n+1}$ goes back to $\phi_1$. The time taken by one period of
this control-operation sequence is called the clock time, and one such period is
called a clock cycle.
• Computation
– Computational step - To realize the desired function L(xi), a function L
applied to the ηth input x
(η)
i in the input sequence ...x
(η−1)
i , x
(η)
i , x
(η+1)
i ....
The clock step that this function belongs to is called computational step.
The kth computational step of x
(η)
i in K clock steps is donated as ck. Clock
steps ϕ apply to the circuit structure, usually involved with artifact A and
other supporting computational subsystem A, while computational steps
apply to inputs. At the end of each computational step, the collective state
of all the computational elements is called computational state.
– Computational cycle - All the computational steps apply to a single input
xi to realize the entire function L(xi) is called a computational cycle. It’s
usually denoted as Γ(η) = c1...ck...cK where cK indicates that it takes K
38
clock steps to realize the entire function. In most of the cases, the amount
of computational steps in a computational cycle is equal to the amount
of the clock steps in the corresponding clock cycle. When comes to the
definition of the entire function L(xi), it means from the first information
of xi starts to appear (when xi starts to be loaded into artifact A) to the
last information loss when the next computational cycle Γ(η+1) erases all
the information of xi in artifact A [11].
3.2.5 Cost Analysis
Cost analysis is based on the irreversible loss of information, i.e. on an
analysis of how the information held in the computational elements changes
along the data flow [11]. Information dynamics and dissipation bounds are
therefore needed in the evaluation; they are defined as follows:
• Information dynamics - To analyze the information loss caused by data
changes, data zones and subzones are defined to group data according to how it
changes.
– Data zones and subzones - In the ηth computational cycle, a data zone is the
set of all representational elements C that hold information about input
$x_i^{(\eta)}$ in the kth computational step. It is denoted D(ck), the data
zone of computational step ck. A data zone consists of one or more data
subzones, as defined by the user. If only one data subzone exists, it coincides
with the data zone. If several exist, together they constitute the data zone.
Data subzones are physically disjoint. The size of a data subzone is also
chosen by the user to make the cost analysis easier. The wth data subzone in
data zone D(ck) is denoted Dw(ck) [11].
– Information loss - The information loss is the information lost from the
information processing artifact A during a computational cycle. To find it, we
compare the data zones D(ck−1) and D(ck) and identify the representational
elements C that belong to D(ck−1) but have been erased in D(ck). These erased
representational elements account for the information loss during
computational step ck. Note that new representational elements may appear in
D(ck). These new elements, which are not in D(ck−1), do not cause information
loss.
• Dissipation bounds - The dissipation bound follows from the inequality [21]
\[ \Delta\langle E^{B}\rangle_k \geq -k_B T \ln(2)\,\Delta I^{R_\eta A}_k \tag{3.26} \]
where $k_B$ is the Boltzmann constant, T is the temperature,
$\Delta\langle E^{B}\rangle_k$ is the energy dissipated from artifact A to the
heat bath B in the kth computational step, and $\Delta I^{R_\eta A}_k$ is the
change in the information about $x_i^{(\eta)}$ held in A during the kth
computational step (negative when information is lost).
The information loss can be obtained as follows. Assume a discrete independent
and identically distributed (IID) source X with alphabet
\[ \{x\} = \{x_0, x_1, \ldots, x_{d-1}\} \tag{3.27} \]
and distribution
\[ \{p\} = \{p(x_0), p(x_1), \ldots, p(x_{d-1})\} \tag{3.28} \]
The source then has Shannon entropy
\[ H(X) = -\sum_{i=0}^{d-1} p(x_i)\log_2 p(x_i) \tag{3.29} \]
where $I(x_i) = -\log_2 p(x_i)$ is the information of $x_i$. The Shannon
entropy is a measure of the information in the output of the source X.
Suppose this discrete IID source X is sent through a channel N, producing an
output Y. The output Y takes values in the symbol set {y} = {y0, y1, ..., yr−1}
with distribution {q} = {q(y0), q(y1), ..., q(yr−1)}. The channel N is not the
physical non-ferromagnetic channel introduced in Chapter 2. It is the logical
computation channel implementing the conditional distribution p(Y|X), i.e. the
probability of each output in Y given the input from X. Physically, it is the
information processing artifact A, including representational elements C and
non-representational elements C. For the ASL circuit, the channel N is the
combination of the ferromagnetic input and output magnets and the
non-ferromagnetic connection channel in Figure 2.1. The ground lead and the
voltage supply connection are not included.
We can then define the conditional entropy as
\[ H(Y|X) = \sum_{i=0}^{d-1} p(x_i)\,H(Y|x_i) = -\sum_{i=0}^{d-1}\sum_{j=0}^{r-1} p_i\,q_{j|i}\log_2 q_{j|i} \tag{3.30} \]
where $p_i$ is short for $p(x_i)$, $q_{j|i}$ is the conditional probability
that the output is $y_j$ given that the input is $x_i$, and $H(Y|x_i)$ is the
entropy of the output Y when the particular symbol $x_i$ of X is sent through
the channel.
The conditional entropy H(Y|X) is the information in Y that is not in X.
Conversely, if the output Y is given and we need the entropy of the input X, we
have
\[ H(X|Y) = \sum_{j=0}^{r-1} q(y_j)\,H(X|y_j) = -\sum_{j=0}^{r-1}\sum_{i=0}^{d-1} q_j\,p_{i|j}\log_2 p_{i|j} \tag{3.31} \]
where $q_j$ is short for $q(y_j)$, $p_{i|j} = p_i\,q_{j|i}/q_j$ is the
conditional probability that the input is $x_i$ given that the output is $y_j$
(for a deterministic logic operation this reduces to $p_i/q_j$ when $x_i$ is
mapped to $y_j$ and to 0 otherwise), and $H(X|y_j)$ is the entropy of the input
X when the particular symbol $y_j$ of Y is received from the channel.
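Thus, the information loss is
\[ -\Delta I = H(X|Y) \tag{3.32} \]
which means that the information loss is the uncertainty about the channel
input that remains when the channel output is given. Substituting Equation 3.31
and Equation 3.32 into Equation 3.26, we get
\[
\begin{aligned}
\Delta\langle E^{B}\rangle_k &\geq -k_B T \ln(2)\,\Delta I^{R_\eta A}_k \\
&= k_B T \ln(2)\,H_k(X|Y) \\
&= k_B T \ln(2)\sum_{j=0}^{r-1} q(y_j)\,H(X|y_j)
 = -k_B T \ln(2)\sum_{j=0}^{r-1}\sum_{i=0}^{d-1} q_j\,p_{i|j}\log_2 p_{i|j}
\end{aligned}
\tag{3.33}
\]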
where k indexes the computational step; η indexes the input $x_i^{(\eta)}$
whose information is lost in the kth computational step; $q_j$ is short for
$q(y_j)$; and $p_{i|j} = p_i\,q_{j|i}/q_j$ is the conditional probability that
the input is $x_i$ given that the output is $y_j$. Using this bound, we can
calculate the irreversibility induced energy for every computational step. The
total irreversibility induced energy is
\[ \Delta\langle E^{B}\rangle_{tot} = \sum_{k=1}^{n} \Delta\langle E^{B}\rangle_k \tag{3.34} \]
where n is the total number of computational steps k in a computational cycle
Γ(η) (η labels the ηth input $x_i^{(\eta)}$ in the input sequence
$\ldots, x_i^{(\eta-1)}, x_i^{(\eta)}, x_i^{(\eta+1)}, \ldots$).
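The bound in Equations 3.31 to 3.33 can be illustrated numerically. The following is a minimal Python sketch, not part of the original analysis: it assumes a deterministic two-input NAND operation with equally likely inputs and a temperature of 300 K, and the function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

K_B = 1.380649e-23  # Boltzmann constant in J/K


def conditional_entropy_bits(inputs, probs, gate):
    """H(X|Y) in bits for a deterministic channel y = gate(x) (Equation 3.31)."""
    # Output distribution q(y_j): sum of p(x_i) over the inputs mapped to y_j.
    q = defaultdict(float)
    for x, p in zip(inputs, probs):
        q[gate(x)] += p
    # For a deterministic channel the joint probability q_j * p_{i|j} equals p_i
    # and p_{i|j} = p_i / q_j for the output y_j = gate(x_i).
    h = 0.0
    for x, p in zip(inputs, probs):
        if p > 0.0:
            h -= p * math.log2(p / q[gate(x)])
    return h


def dissipation_bound_joules(h_bits, temperature_k):
    """Lower bound k_B * T * ln(2) * H(X|Y) on dissipated energy (Equation 3.33)."""
    return K_B * temperature_k * math.log(2) * h_bits


# Assumed example: NAND gate, uniform inputs, T = 300 K.
inputs = list(product((0, 1), repeat=2))
probs = [0.25] * 4


def nand(x):
    return 1 - (x[0] & x[1])


h_nand = conditional_entropy_bits(inputs, probs, nand)
bound_nand = dissipation_bound_joules(h_nand, 300.0)
print(f"H(X|Y) = {h_nand:.4f} bits, bound = {bound_nand:.3e} J")
```

For a logically reversible operation such as COPY, the same function returns H(X|Y) = 0, so the bound vanishes.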
3.2.6 Energy Efficiency
As stated in Section 3.1.4, the irreversibility induced energy is an energy
dissipation averaged over inputs, i.e. the probability of every input is taken
into account in the calculation. The result does not reflect any particular
input pattern but is a weighted average over all possible patterns.
CHAPTER 4
ASL CIRCUIT STUDY: PRELIMINARY RESULTS AND
DISCUSSION
In this chapter, preliminary results are presented for the projected and
limiting energy efficiencies of the ASL circuits introduced in Chapter 2. The
analog circuits used to obtain the projected energy are built and tested in
MATLAB [10]. The standard ASL model simulator was written by Alawein and
Fariborzi, whose paper [10] provides an ASL buffer/adder simulation circuit on
the MATLAB platform. Based on their code, circuits with other logic functions,
such as the full adder and the multiplexor, are implemented. The detailed
simulation results are shown and discussed in this chapter. The limiting energy
efficiencies, i.e. the irreversibility induced energies, are calculated using
the physical information theory introduced in Chapter 3.
4.1 Simple Circuits: Buffer, Inverter and Latch
4.1.1 Buffer Simulation Results
As shown in Figure 2.1, two basic ASL magnets form a buffer or an inverter,
depending on the supply voltage. A negative supply voltage realizes a COPY
function, which is used in the buffer circuit. A positive supply voltage
realizes a NOT function, which is used in the inverter circuit.
Figure 4.1 shows the projected simulation results of the ASL buffer [10].
Figure 4.1(a) shows the magnetization simulation results. The red line
indicates a negative voltage supply, which means the circuit function is COPY.
The input (blue line) switches between -1 and 1 every 2 ns, while the output
(green line) copies its state, which forms a basic buffer circuit.
Figure 4.1(b) shows the simulated spin currents. In our settings, the
magnetization is stored along the x-direction, so only the x-direction spin
currents are shown in Figure 4.1(b). The currents in the y- and z-directions
are not shown.
Figure 4.1(c) shows the spin power. The instantaneous power in this simulation
model is the current-voltage product at each time point, P = I × V, so the
energy dissipated over a time step t is E = I × V × t. The unstable glitches
are caused by the input currents and by the feedback currents from the
receiving magnet. The feedback currents are small compared with the main spin
currents that exert torque on the magnetization, and they are calculated by the
LLG model. The integral of the power curve is the total energy over this period
(8 ns).
\[ E = \int_{0}^{t} P(t')\,dt', \qquad t = 8\ \text{ns} \tag{4.1} \]
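Equation 4.1 can be evaluated numerically from a sampled power curve. The sketch below is only an illustration in Python (the simulator itself is in MATLAB): it uses the trapezoidal rule, and the flat 20 µW trace sampled every 0.1 ns is a hypothetical stand-in for a simulated curve such as p1.

```python
def integrate_power(times_s, powers_w):
    """Trapezoidal estimate of E = integral of P(t) dt, in joules."""
    if len(times_s) != len(powers_w):
        raise ValueError("times and powers must have the same length")
    energy_j = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        energy_j += 0.5 * (powers_w[k] + powers_w[k - 1]) * dt
    return energy_j


# Hypothetical trace: constant 20 uW over 8 ns, sampled every 0.1 ns.
times = [k * 1e-10 for k in range(81)]
powers = [20e-6] * len(times)

energy = integrate_power(times, powers)
print(f"E = {energy * 1e15:.2f} fJ")
```

A constant 20 µW over 8 ns integrates to roughly 160 fJ, the same order as the per-magnet energies reported in this chapter.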
The average value of p1 is higher than that of p2, because magnet 1 needs
supply power to drive currents and send information, while magnet 2 only needs
to receive information. Only half of magnet 2 is active. The other half is idle
and becomes active when information needs to be sent on to the next magnet,
i.e. when magnet 2 works as a sending magnet.
The integral of p1 is the energy of the sending magnet m1, while the integral
of p2 is the energy of the receiving magnet m2. The total energy is e1 + e2 =
(161.76 + 84.04) × 10^-15 J = 0.2458 pJ. This result indicates that, for ASL as
a promising post-CMOS device, the energy dissipation of ASL circuits, which is
dominated by the static power, is on the order of 10^-13 J with a 10 mV voltage
supply.
Table 4.1 shows the ASL simulation parameters for the buffer [10]. These
parameters are also used in the following circuit simulations.
Table 4.1: ASL simulation parameters [10].
Parameter    Description                                Value               Units
FM (Co)
tFM          Thickness                                  3                   nm
WFM          Width                                      50                  nm
LFM          Length                                     100                 nm
ρFM          Resistivity                                56                  nΩ·m
α            Damping factor                             0.01                -
Ms           Saturation magnetization                   1.414 × 10^6        A/m
K1           First uniaxial anisotropy constant         10 × 10^4           J/m^3
HK           Anisotropy field                           1.26 × 10^5         A/m
Nx, Ny, Nz   Demagnetization factors                    0.055, 0.89, 0.055  -
FM/NM Interface
P            Polarization                               0.5                 -
g↑           Spin-up conductance                        0.42 × 10^15        S/m^2
g↓           Spin-down conductance                      0.38 × 10^15        S/m^2
Re{g↑↓}      Real part of the mixing conductance        0.546 × 10^15       S/m^2
Im{g↑↓}      Imaginary part of the mixing conductance   0.015 × 10^15       S/m^2
NM1 (Cu)
tNM1         Thickness                                  200                 nm
WNM1         Width                                      50                  nm
LNM1         Length                                     100                 nm
lsf,NM1      Spin-diffusion length                      450                 nm
ρNM1         Resistivity                                16.7                nΩ·m
NM2 (Al)
tNM2         Thickness                                  200                 nm
WNM2         Width                                      50                  nm
LNM2         Length                                     100                 nm
lsf,NM2      Spin-diffusion length                      600                 nm
ρNM2         Resistivity                                26.3                nΩ·m
Figure 4.1: Circuit simulation results of a two-magnet buffer [10]. (a) Magnetization;
(b) spin current; (c) spin power.
Figure 4.2: Circuit simulation results of a two-magnet inverter. (a) Magnetization;
(b) spin currents; (c) spin power.
4.1.2 Inverter
Figure 4.2 shows the simulation results of an inverter.
In contrast to Figure 4.1(a), the red line in Figure 4.2(a) indicates a
positive voltage supply, which means the circuit function is NOT. In the same
figure, the input (blue line) switches between -1 and 1 every 2 ns, and the
output (green line) takes the state opposite to the input, which forms a basic
inverter circuit.
The total energy is the sum of the integrals of the p1 and p2 curves: e1 + e2 =
(130 + 66.75) × 10^-15 J = 0.1968 pJ.
4.1.3 Latch
As shown in Figure 2.6, a latch is made of three basic ASL magnet units.
Depending on the supply voltage VDD, the latch realizes a COPY or a NOT
function. When the supply voltage of the middle magnet is low, the information
transport (current transport) stops. In this case, the conduction channel
between the input (A) and output (B) magnets is opaque.
Figure 4.3(a), (c) and (e) show the simulation results when the supply voltage
(not shown in the figure) is negative. In (a), the outputs copy the states of
the inputs.
Figure 4.3(b), (d) and (f) show that when the supply voltage (not shown in the
figure) is positive, the outputs invert the states of the inputs.
The average values of p1 and p2 are higher than that of p3, because magnets 1
and 2 act as sending magnets while magnet 3 acts as the receiving magnet, only
half of which is active. The total energy of the NOT-function latch is e1 + e2
+ e3 = (126.76 + 129.24 + 66.26) × 10^-15 J = 0.3223 pJ. The total energy of
the COPY-function latch is e1 + e2 + e3 = (161.37 + 160.57 + 83.46) × 10^-15 J
= 0.4054 pJ.
Figure 4.3: Circuit simulation results of three-magnet latches. One implements
the “buffer” function, the other implements the “inverter” function.
(a) Magnetization of the buffer; (b) magnetization of the inverter; (c) spin
currents of the buffer; (d) spin currents of the inverter; (e) spin power of
the buffer; (f) spin power of the inverter.
4.2 Simplified Circuits Simulation Results: Buffer, Inverter,
Latch
4.2.1 From Dynamic and Static Power to Projected Energy
From the circuit simulations above, we can see that there are two contributions
to the total power. One is the dynamic power, which is caused by the feedback
currents and adds slight fluctuations to the power curve. The other is the
static power, which is caused by the supply voltage (the external supply) and
generates the majority of the current flow in an ASL device. Note that this
static power differs from the static leakage that plagues CMOS, in that it
plays a functional rather than a parasitic role.
For the final results, what concerns us is the projected energy dissipation,
which is obtained by integrating the combined power curve (dynamic plus static
power). The combined power curves are the ones shown in Figures 4.1 to 4.3;
their integrals will be compared with their irreversibility induced energy
counterparts.
Energy dissipation is clearly dominated by the static contribution to the total
power. Calculating the total power, including the feedback that determines the
dynamic component, becomes computationally intensive and induces instabilities
in the simulation for more complex circuits. For convenience and to reduce the
simulation time, we therefore simplify the simulator.
From Section 3.1.1, we know that the Landau-Lifshitz-Gilbert (LLG) equation
describes the relationship between the input and output spin currents and how
the feedback currents form a self-consistent system. The LLG equation captures
the physical behavior of ASL circuits and helps us understand spintronic
devices. On this basis, we build a simplified model that omits the LLG equation
and apply it in the following sections to the ASL buffer, inverter, and latch.
The results are shown and compared with those of the full-LLG model. The
comparison shows how small a share of the total energy cost the dynamic energy
takes and what the simplified ASL model is capable of.
4.2.2 Simplified Buffer and Inverter Simulation
Figure 4.4 shows the simulation results of the simplified buffer and inverter.
Panels (a), (c) and (e) are the results for the buffer. Panel (a) shows the
magnetization in the x direction, which also represents the logic implemented
by the circuit. The output magnet m2 follows the state of the input magnet m1,
which realizes the COPY function of a buffer circuit. The spin powers of
magnets 1 and 2 are around 20 µW and 10 µW, respectively.
Figure 4.4(b), (d), and (f) are the results for the inverter. Panel (b) shows
the magnetization in the x direction, from which we can see that the output
magnet m2 inverts the state of the input magnet m1.
According to the spin power, the total energy of the simplified buffer is E =
P1t + P2t = (163.97 + 85.07) × 10^-15 J = 0.249 pJ; the total energy of the
simplified inverter is E = P1t + P2t = (125.7 + 64.1) × 10^-15 J = 0.1898 pJ.
The buffer and the inverter each have one input, m1, so there are two possible
input patterns: 0 and 1. We run four clock cycles with inputs 0, 1, 0, 1,
respectively, so the total energy covers two full sets of the input patterns 0
and 1. Dividing the total energy by 2 gives the effective energy, or energy
dissipation per input pattern, for the simplified buffer and inverter. For the
buffer, the effective energy is Eeffective = Etot/2 = 0.249 pJ/2 = 0.1245 pJ.
For the inverter, the effective energy is Eeffective = Etot/2 = 0.1898 pJ/2 =
0.0949 pJ. The effective energy indicates how much energy an operation (an
output pattern) consumes during the computation process. From the effective
energy, we can calculate the energy efficiency, i.e. how many operations can be
performed per joule (or per pJ) of energy. For the buffer, the energy
efficiency is Eefficiency = 1/Eeffective = 1/(0.1245 pJ) = 8.032 operations/pJ.
For the inverter, the energy efficiency is Eefficiency = 1/Eeffective =
1/(0.0949 pJ) = 10.537 operations/pJ.
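The arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical; the totals are the rounded values quoted in the text, and the divisor is the number of full input-pattern sets covered by the simulated clock cycles.

```python
def effective_energy_pj(total_energy_pj, pattern_sets):
    """Effective energy: total energy divided by the number of pattern sets."""
    return total_energy_pj / pattern_sets


def efficiency_ops_per_pj(effective_pj):
    """Energy efficiency: operations per pJ, the reciprocal of effective energy."""
    return 1.0 / effective_pj


# Totals quoted in the text for the simplified model (pJ); the inputs
# 0, 1, 0, 1 cover the pattern set {0, 1} twice.
totals_pj = {"buffer": 0.249, "inverter": 0.1898}

results = {}
for name, total in totals_pj.items():
    e_eff = effective_energy_pj(total, pattern_sets=2)
    results[name] = (e_eff, efficiency_ops_per_pj(e_eff))
    print(f"{name}: E_eff = {e_eff:.4f} pJ, "
          f"efficiency = {results[name][1]:.3f} operations/pJ")
```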
4.2.3 Simplified Latch Simulation
Figure 4.5 shows the simulation results of a simplified latch implementing the
buffer function.
Figure 4.5(a) and (b) show the magnetization. The output copies the state of
the input, and the middle magnet m2 is transparent when the power supply is
high.
Figure 4.5(d) shows the spin power; p1, p2, and p3 are the spin powers of
magnet 1, magnet 2, and magnet 3, respectively. The results are similar to
those of the full-LLG-model buffer-function latch (Figure 4.3(e)). The
corresponding energy dissipation is E = E1 + E2 + E3 = (163.97 + 163.97 +
85.07) × 10^-15 J = 0.413 pJ. We can also change the supply voltage to positive
and calculate the corresponding energy cost of the inverter-function latch,
which is E = E1 + E2 + E3 = (125.71 + 125.71 + 64.10) × 10^-15 J = 0.3155 pJ.
The buffer-function and inverter-function latches each have one input, m1, so
there are two possible input patterns: 0 and 1. We run four clock cycles with
inputs 0, 1, 0, 1, respectively, so the total energy covers two full sets of
the input patterns 0 and 1. Dividing the total energy by 2, because the full
input pattern set is repeated twice during the four clock cycles, gives the
effective energy, or energy dissipation per pattern, for the simplified
buffer-function and inverter-function latches.
The effective energy of the simplified buffer-function latch is Eeffective =
E/2 = 0.413 pJ/2 = 0.2065 pJ; its energy efficiency is Eefficiency =
1/Eeffective = 1/(0.2065 pJ) = 4.843 operations/pJ. The effective energy of the
simplified inverter-function latch is Eeffective = E/2 = 0.3155 pJ/2 =
0.1578 pJ; its energy efficiency is Eefficiency = 1/Eeffective = 1/(0.1578 pJ)
= 6.337 operations/pJ.
Figure 4.4: Results of the simplified buffer and inverter. (a) Magnetization of
the buffer; (b) magnetization of the inverter; (c) spin currents of the buffer;
(d) spin currents of the inverter; (e) spin power of the buffer; (f) spin power
of the inverter.
Table 4.2: The energy cost of different circuits with and without the LLG model.
Circuit                  With LLG model   Without LLG model   Difference
Buffer                   0.2458 pJ        0.249 pJ            1.32%
Inverter                 0.1968 pJ        0.1898 pJ           3.53%
Latch (COPY function)    0.4054 pJ        0.413 pJ            1.88%
Latch (NOT function)     0.3223 pJ        0.3155 pJ           2.09%
4.2.4 Comparison of the full-LLG Model and Simplified Model
In the previous sections, we tested several circuits under the full-LLG model
and the simplified model. Table 4.2 compares the energy cost of each circuit
under the two models.
From Table 4.2, we can draw two conclusions. First, as shown in the previous
calculations, the energy dissipation of ASL circuits, which is dominated by the
static power, is on the order of 10^-13 J with a 10 mV voltage supply. It is a
small energy cost compared to its CMOS counterparts.
Second, the differences between the full-LLG-model circuits and the simplified
circuits are small (less than 5%). Thus, in the following experiments, we use
the simplified ASL circuits and compare their results with the irreversibility
induced energy dissipation.
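The differences in Table 4.2 can be checked with the following Python sketch. It assumes, consistent with the tabulated percentages, that the difference is taken relative to the full-LLG value; the per-magnet energies (in fJ) are those quoted in Sections 4.1 and 4.2, and the variable names are hypothetical.

```python
# Per-magnet energies in fJ: (full-LLG model, simplified model).
energies_fj = {
    "Buffer": ([161.76, 84.04], [163.97, 85.07]),
    "Inverter": ([130.0, 66.75], [125.7, 64.1]),
    "Latch (COPY function)": ([161.37, 160.57, 83.46], [163.97, 163.97, 85.07]),
    "Latch (NOT function)": ([126.76, 129.24, 66.26], [125.71, 125.71, 64.10]),
}


def relative_difference_percent(full_fj, simplified_fj):
    """Absolute difference of the totals, relative to the full-LLG total."""
    full_total = sum(full_fj)
    simplified_total = sum(simplified_fj)
    return abs(simplified_total - full_total) / full_total * 100.0


differences = {
    name: relative_difference_percent(full, simplified)
    for name, (full, simplified) in energies_fj.items()
}

for name, diff in differences.items():
    print(f"{name}: {diff:.2f}%")
```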
4.3 Simulation of Larger Circuits: Majority Gate, Half-adder,
and ALU
We have established a simplified model of the ASL device and tested several
simple ASL circuits. In this section, we proceed to more complex circuits with
the simplified model and calculate their projected energy dissipation. The
circuits are the 3-input majority gate, the half adder, which is built from
3-input majority gate units, and the ALU.
Figure 4.5: Results of a simplified three-magnet latch, implementing the buffer
function. (a) and (b) Magnetization; (c) spin currents; (d) spin power.
4.3.1 Majority Gate
As discussed in Section 2.2, a majority gate consists of four magnets: three
input magnets and one output magnet. According to the truth table in Table 2.1,
when the supply voltage is positive, a majority gate can act as a NOR gate or a
NAND gate.
Figure 4.6 shows the simulation results of one majority gate acting as a NOR
gate and as a NAND gate.
Magnets 1, 3, and 4 are input magnets, while magnet 2 is the output magnet.
Among the input magnets, magnet 3 is the control magnet. When the magnetization
direction of m3 is -1, or logic 0, the majority gate is a NAND gate. When the
magnetization direction of m3 is 1, or logic 1, the majority gate is a NOR
gate.
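The role of the control magnet can be checked at the purely logical level. The Python sketch below is an idealized Boolean model, not the ASL simulator: it assumes the output is the inverted majority of the three inputs (inversion corresponding to the positive supply voltage), and the function names are hypothetical.

```python
from itertools import product


def majority3(a, b, c):
    """Majority vote of three bits."""
    return 1 if a + b + c >= 2 else 0


def inverting_majority(m1, m4, m3):
    """Idealized output m2: inverted majority of inputs m1, m4 and control m3."""
    return 1 - majority3(m1, m4, m3)


# m3 = 0 should give NAND(m1, m4); m3 = 1 should give NOR(m1, m4).
for m3, name in ((0, "NAND"), (1, "NOR")):
    rows = [(m1, m4, inverting_majority(m1, m4, m3))
            for m1, m4 in product((0, 1), repeat=2)]
    print(name, rows)
```

With m3 fixed, only m1 and m4 vary, which is why four input patterns are used for each function.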
Figure 4.6(c) and (f) show the spin power consumption; p1, p2, p3 and p4 are
the spin powers of magnets m1, m2, m3 and m4, respectively.
According to the spin power, the total energy of a 3-input 4-magnet majority
gate implementing the NOR function is E = P1t + P2t + P3t + P4t = (213.66 +
213.66 + 213.66 + 93.92) × 10^-15 J = 0.735 pJ; the total energy of a 3-input
4-magnet majority gate implementing the NAND function is E = P1t + P2t + P3t +
P4t = (229.10 + 229.10 + 229.10 + 92.06) × 10^-15 J = 0.779 pJ.
The NOR-function and NAND-function majority gates each have two inputs, m1 and
m4 (m3 acts as a switch between the NAND and NOR functions and is fixed once
the function is set up), so there are four possible input patterns: 00, 01, 10
and 11. We run four clock cycles covering all the possible input patterns 00,
01, 10, 11, respectively. The effective energy, or energy dissipation per
pattern, is therefore the total energy divided by 4 (there are four input
patterns, and the effective energy is an input-averaged value).
The effective energy of a 3-input 4-magnet majority gate implementing the NOR
function is Eeffective = E/4 = 0.735 pJ/4 = 0.1837 pJ; its energy efficiency is
Eefficiency = 1/Eeffective = 1/(0.1837 pJ) = 5.444 operations/pJ. The effective
energy of a 3-input 4-magnet majority gate implementing the NAND function is
Eeffective = E/4 = 0.779 pJ/4 = 0.195 pJ; its energy efficiency is Eefficiency
= 1/Eeffective = 1/(0.195 pJ) = 5.128 operations/pJ.
Table 4.3: The truth table of the half adder, illustrating the majority-gate
NAND gate design.
mx1   mx2   mx6 (Sum)   mx7 (Carry)
0     0     0           0
1     0     1           0
0     1     1           0
1     1     0           1
4.3.2 Half Adder
Section 4.3.1 presented the basic unit of ASL devices, the majority gate. Using
majority gates as NAND or NOR gates, we can build almost any logic circuit.
In this section, we build a half adder using NAND-function majority gates. The
schematic is shown in Figure 4.7. Magnets 1 and 2 are the inputs. Magnets 6 and
7 are the outputs: magnet 6 is the sum and magnet 7 is the carry. Table 4.3 is
the truth table of the half adder. The simulation results are shown in Figure
4.8.
Each NAND gate has three input magnets and one output magnet. With a total of 5
NAND gates, there are (3 + 1) × 5 = 20 magnets. Each magnet has its own power;
Figure 4.9 shows the power consumption of each magnet.
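As a logic-level illustration, a half adder can be assembled from five two-input NAND gates and checked against Table 4.3. The wiring in the Python sketch below is the common textbook arrangement and is an assumption; it is not necessarily the exact connection scheme of Figure 4.7, and the function names are hypothetical.

```python
from itertools import product


def nand(a, b):
    """Ideal two-input NAND (a majority gate with its control input at logic 0)."""
    return 1 - (a & b)


def half_adder(a, b):
    """Half adder from five NAND gates (assumed textbook wiring)."""
    n1 = nand(a, b)      # gate 1
    n2 = nand(a, n1)     # gate 2
    n3 = nand(b, n1)     # gate 3
    s = nand(n2, n3)     # gate 4: sum = a XOR b
    carry = nand(n1, n1) # gate 5: carry = a AND b
    return s, carry


GATES = 5
MAGNETS_PER_GATE = 3 + 1  # three input magnets and one output magnet
total_magnets = GATES * MAGNETS_PER_GATE

for a, b in product((0, 1), repeat=2):
    print(a, b, half_adder(a, b))
print("magnets:", total_magnets)
```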
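Thus, the total energy cost is
\[
\begin{aligned}
E &= E_{gate1} + E_{gate2} + E_{gate3} + E_{gate4} + E_{gate5} = \sum_{i=1}^{20} P_i t \\
&= (221.0 + 221.0 + 93.17 + 214.04)\times 10^{-15}\,\text{J} \\
&\quad + (229.09 + 92.82 + 221.15 + 213.25)\times 10^{-15}\,\text{J} \\
&\quad + (229.09 + 92.82 + 221.15 + 213.25)\times 10^{-15}\,\text{J} \\
&\quad + (211.58 + 98.02 + 219.52 + 219.52)\times 10^{-15}\,\text{J} \\
&\quad + (210.68 + 103.67 + 209.70 + 209.70)\times 10^{-15}\,\text{J} \\
&= 3.744\,\text{pJ}
\end{aligned}
\]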
To get the effective energy, or energy dissipation per pattern, we divide the
total energy 3.744 pJ by 4. The half adder has two inputs, m1 and m2, so there
are four possible input patterns: 00, 01, 10 and 11. We run four clock cycles
covering all the possible input patterns 00, 01, 10, 11, respectively. The
effective energy is therefore the total energy divided by 4 (there are four
input patterns, and the effective energy is an input-averaged value).
Figure 4.6: Results of a simplified 3-input 4-magnet majority gate,
implementing a NAND gate or a NOR gate. Input magnets: m1, m3 and m4. Output
magnet: m2. m3 acts as a switch between the NAND and NOR functions. m1 and m4
act as the actual inputs of the NAND or NOR gate. (a), (b) and (c): NOR gate.
(d), (e) and (f): NAND gate.
Figure 4.7: (a) Half adder schematic. Numbers 1∼7 are the inputs and outputs of
four NAND gates. Black and red lines are the circuit connections. The two
colors only make the schematic clearer and have no difference in function.
(b) Half adder unit. The three images are all NAND gates; the second and third
are drawn in 3-input majority gate format.
Figure 4.8: Simulation results of the half adder. (a) Inputs. (b) Outputs.
The projected effective energy of the half adder is Eeffective = E/4 =
3.744 pJ/4 = 0.936 pJ; its energy efficiency is Eefficiency = 1/Eeffective =
1/(0.936 pJ) = 1.068 operations/pJ.
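The half-adder totals can be reproduced from the per-magnet energies listed above. The Python sketch below simply sums the twenty values (in fJ) gate by gate and then applies the division by the four input patterns; the variable names are hypothetical.

```python
# Per-magnet energies in fJ, grouped by NAND gate, as listed in the text.
gate_energies_fj = [
    [221.0, 221.0, 93.17, 214.04],
    [229.09, 92.82, 221.15, 213.25],
    [229.09, 92.82, 221.15, 213.25],
    [211.58, 98.02, 219.52, 219.52],
    [210.68, 103.67, 209.70, 209.70],
]

INPUT_PATTERNS = 4  # 00, 01, 10, 11

total_pj = sum(sum(gate) for gate in gate_energies_fj) * 1e-3  # fJ -> pJ
effective_pj = total_pj / INPUT_PATTERNS
efficiency = 1.0 / effective_pj  # operations per pJ

print(f"E_tot = {total_pj:.3f} pJ")
print(f"E_eff = {effective_pj:.3f} pJ")
print(f"efficiency = {efficiency:.3f} operations/pJ")
```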
4.3.3 Arithmetic Logic Unit
Qi An’s papers [23][24] report an ASL implementation of a 5-input ALU based on
two 5-input majority gates and one 3-input majority gate, implementing the
functions of full adder, subtractor, multiplexor, increment, decrement, NAND
and NOR.
Figure 4.10 [23] shows the 5-input majority gate layout and the ALU schematic.
Table 4.4 shows the integrated functions of the ALU and the detailed
configuration of each terminal. The ALU discussed here is a standard 1-bit ALU,
which can realize the basic functions of a 1-bit full adder (1-bit addend A,
1-bit addend B, and 1-bit carry-in Cin, with 1-bit sum and 1-bit carry-out as
outputs), a 1-bit subtractor, a 1-bit multiplexor, and 1-bit
increment/decrement.
Figure 4.9: Power simulation results of the half adder. (a)∼(e): NAND gates 1∼5.
Figure 4.10: ASL-based ALU schematic [23]. (a) 5-input majority gate layout.
(b) ASL-based ALU schematic. (c) ALU symbol.
Figure 4.11: Results of an ASL-based ALU. (a)∼(c): full adder. Inputs A, B, and
carry-in (Cin) are magnets 1, 3, and 4, respectively; outputs sum and carry-out
(Cout) are magnets 8 and 2. (d)∼(f): subtractor. Inputs A, B, and borrow-in
(Bin) are magnets 1, 3, and 4, respectively; outputs difference and borrow-out
are magnets 8 and 16. (g)∼(h): multiplexor. Inputs A, B, and the select signal
are magnets 3, 4, and 1; the output is magnet 8. When the select signal is
logic 1, input A is selected; when the select signal is logic 0, input B is
selected.
Each magnet for the ALU has two configurable parts. One is the injected spin cur-
rent, and the other is the magnetization state of the magnet. Iinj represents injected
spin currents. The possibilities are 0 (no spin current), P (positive spin currents), or
N (negative spin current). The logic state is represented by the magnetization state of
each magnet. The possibilities are logic 0 or 1, i.e. parallel or anti-parallel direction,
or Z (don’t care).
For example, to set up a full adder, we need the following configuration: no
current injection for terminals A1, H, Z, and U; negative current injection for
terminals A2, A3, B1, B2, and C; positive current injection for terminal M.
Figure 4.11 shows the simulation results of the full adder, subtractor, and
multiplexor:
• Figure 4.11(a)∼(c): The simulation results of a full adder. As mentioned in
the previous paragraph, it is a 1-bit full adder. The one-bit inputs A
(addend), B (addend), and carry-in (Cin) are represented by magnets 1, 3, and
4, respectively; the outputs sum and carry-out (Cout) are represented by
magnets 8 and 2, respectively. The truth table of a full adder is shown in
Table 2.2.
• Figure 4.11(d)∼(f): The simulation results of a 1-bit subtractor. The one-bit
inputs A, B, and borrow-in (Bin) are represented by magnets 1, 3, and 4,
respectively; the outputs difference and borrow-out are represented by magnets
8 and 16, respectively.
• Figure 4.11(g)∼(h): The simulation results of a multiplexor. Inputs A, B, and
the select signal are represented by magnets 3, 4, and 1, respectively; the
output is represented by magnet 8. When the select signal is logic 1, input A
is selected; when the select signal is logic 0, input B is selected.
The ALU has 16 magnets. By integrating the power consumption of each magnet
over time, we get the effective energy, or energy dissipation per pattern, of
each function. For a full adder, the total energy covers eight clock cycles
(16 ns), which allows the input magnets to go through all the possible input
patterns (from 000 to 111). The total energy is Etot = 3.54 pJ, the
corresponding effective energy is Eeffective = Etot/8 = 0.4425 pJ, and the
energy efficiency is Eefficiency = 1/Eeffective = 1/(0.4425 pJ) =
2.26 operations/pJ.
Table 4.4: ALU function table [23]. Each entry for input terminals A1 to U
gives the magnet state followed by the injected current Iinj; M1 has an
injected current only.
Function      A1       A2       A3       B1       B2       C        H       Z      U      M1   Fout
Full-adder    0/1, 0   0/1, N   0/1, N   0/1, N   0/1, N   0/1, N   X, 0    X, 0   X, 0   P    F1F0
Subtractor    0/1, P   0/1, N   0/1, N   0/1, N   0/1, N   0/1, N   X, 0    X, 0   X, 0   P    F2F0
Multiplexor   0/1, 0   0/1, P   0/1, N   0/1, N   0/1, 0   0/1, N   B1, N   1, N   0, N   N    F0
Increment     0/1, 0   0/1, N   0/1, N   1, N     1, N     0, N     X, 0    X, 0   X, 0   P    F1F0
Decrement     0/1, P   0/1, N   0/1, N   1, N     1, N     0, N     X, 0    X, 0   X, 0   P    F1F0
NAND3         0/1, 0   0/1, P   0/1, 0   0/1, P   0/1, 0   0/1, P   Z, P    0, P   X, 0   0    F0
NOR3          0/1, 0   0/1, P   0/1, 0   0/1, P   0/1, 0   0/1, P   Z, P    1, P   X, 0   0    F0
Similarly, the subtractor has a total energy of Etot = 3.206 pJ, a
corresponding effective energy Eeffective = Etot/8 = 0.4 pJ, and an energy
efficiency Eefficiency = 1/Eeffective = 1/(0.4 pJ) = 2.5 operations/pJ. The
multiplexor has a total energy of Etot = 3.222 pJ, a corresponding effective
energy Eeffective = Etot/8 = 0.403 pJ, and an energy efficiency Eefficiency =
1/Eeffective = 1/(0.403 pJ) = 2.481 operations/pJ. The increment/decrement
function has a total energy of Etot = 0.795 pJ, a corresponding effective
energy Eeffective = Etot/2 = 0.398 pJ, and an energy efficiency Eefficiency =
1/Eeffective = 1/(0.398 pJ) = 2.513 operations/pJ. For the increment/decrement
function, we divide the total energy by 2 (instead of 8) to get the effective
energy, or energy dissipation per pattern, because the number of effective
inputs is two (A and B).
Table 4.5 shows the projected energy results of the different ASL devices. The
effective energy of the half adder (non-ALU based) is higher than that of the
full adder (ALU based) because of the difference in circuit structure: the half
adder has 20 magnets, while the ALU-based full adder has 16 magnets.
66
Table 4.5: Projected energy results of different ASL devices.
Device                  Projected Energy (pJ)   Energy Efficiency (operations/pJ)
Buffer                  0.147                   8.032
Inverter                0.0949                  10.537
Latch (Buffer)          0.2065                  4.843
Latch (Inverter)        0.1578                  6.337
Majority gate (NAND)    0.195                   5.444
Majority gate (NOR)     0.1837                  5.128
Half adder              0.936                   1.068
Full adder              0.4425                  2.26
Subtractor              0.4                     2.5
Multiplexor             0.403                   2.481
increment/decrement     0.398                   2.513
4.4 Energy Efficiency Bounds for ASL Circuits from Physical
Information Theory
4.4.1 From Projected Energy to Irreversibility Induced Energy
In Sections 4.1, 4.2, and 4.3, we discussed the circuit simulation results for ASL devices. In this section, physical-information-theoretic analysis is applied to the same circuits to calculate their energy dissipation. As introduced in Chapter 3, physical information theory differs from analog circuit simulation: it concerns only the energy dissipation caused by irreversible information loss during logic computation. Other sources of energy dissipation, such as the power supply and transport processes, are excluded.
In order to apply physical information theory to ASL, we need to distinguish the irreversibility induced sources of energy dissipation from the practical sources that contribute to the projected dissipation in computation via ASL. The irreversibility induced sources of heat dissipation are those that cannot be eliminated by improving the structure or material of the circuit, while the associated practical sources can be minimized only to the extent that the structure, material, and other related factors can be optimized (an extent that is unknowable). Physical information theory applies only to the heat dissipation caused by irreversibility induced information loss, which is knowable and unavoidable for a given circuit and clocking scheme.
Figure 4.12: Fundamental and supporting sources. Green arrows are charge currents and pink arrows are spin currents. Solid arrows are currents that make up fundamental sources; dashed arrows are supporting sources.
In an ASL circuit, the supporting sources of heat dissipation are the currents tunneling through spacers and the leakage currents in the conductors, i.e. currents that flow perpendicularly in lateral channels or laterally where they are supposed to flow perpendicularly, as shown in Fig. 4.12. The irreversibility induced sources consist of two parts. One is the irreversible information loss incurred in realizing the implemented logic function. The other is the heat dissipation from currents that cannot be eliminated even if the device is perfect, i.e. currents that are fundamental to keeping the device functioning normally.
The physical decomposition of ASL is shown in Fig. 4.13. The green part is the artifact $A$, the orange part is the supporting subsystem $\bar{A}$, the blue part is the heat bath $B$, and the black part is the remote environment $\bar{B}$.
Figure 4.13: Physical decomposition of an ASL magnet and its connecting channel, power supply, ground lead, and substrate. The green part comprises the ferromagnetic magnet and the connection channel, corresponding to the artifact $A$. The orange part comprises the magnets and channels of other computing units, corresponding to the supporting subsystem $\bar{A}$. The blue part is the non-ferromagnetic substrate, corresponding to the heat bath $B$, and the black part comprises the power contact and substrate, corresponding to the remote environment $\bar{B}$.
4.4.2 Efficiency Bounds for ASL Circuits: Detailed Application Example
We take the 2-bit ripple adder (Fig. 4.14) as an example to show how physical information theory applies to ASL devices. The clock is shown at the bottom of Fig. 4.14: the blue line is CLK and the green line is CLK-b. The first five panels of Fig. 4.14 show the schematic of the 2-bit ripple adder in the ηth full computational cycle. In computational step c1, addends A1 and B1 and carry C1 receive information from the previous stage but do no further computation because CLK is low. In computational step c2, CLK is high. The computation in adder 1 is done in this step and the results are sent to C2. In this scenario, we assume A2 and B2 are loaded at the same time as C2 receives the result from the previous stage. Thus, at the end of c2, information about this computational cycle is held in A1, B1, C1, Co1, S1, A2, B2, and C2, as shown in Fig. 4.14 (c2). In computational step c3, A1, B1, and C1 receive the inputs of the (η+1)th computational cycle, erasing the information of the ηth computational cycle, but the (η+1)th inputs do no further computation because CLK is low, so Co1 and S1 still store the information of the ηth computational cycle. Adder 2 completes its logic function in this step and sends its result to an external register, which is not shown in Fig. 4.14. In computational step c4, the information in Co1, S1, A2, B2, and C2 is erased because adder 1 rewrites them with the results of the (η+1)th computational cycle. In computational step c5, the information in Co2, S2, A3, B3, and C3 is replaced with the information from the (η+1)th computational cycle. At this point, all the information of the ηth computational cycle has been erased from artifact A.
Process Abstraction: From the analysis above, the clock cycle has two phases, SWITCH ($\phi_1$) and HOLD ($\phi_2$), which operate on the selected clock zones. When the supply voltage is high (or negative), SWITCH ($\phi_1$) acts on the related clock zones to complete the logic function, changing the information of every representational element C according to the logic function $L : x_i^{(\eta)} \to L(x_i^{(\eta)})$ and sending the results to the next clock zones. When the supply voltage is zero, HOLD ($\phi_2$) acts on the related clock zones, which receive information from the previous stage; the representational elements C connected directly to the previous stage change, while the other representational elements stay the same.
To understand process abstraction a step further, we can analyze state transfor-
mation for 2-bit ASL ripple adder.
First, at the time point that the ηth input is about to go into the concerned
information processing artifact A, there is no information about the referent R in the
artifact A.
The referent R is correlated with the supporting computational subsystem $\bar{A}$, which could be the buffer register connected to the artifact A or the previous-stage registers that store the computing results. The state of the ηth input referent is represented as
$\hat{\rho}^{R_\eta} = \sum_{i=1}^{8} p_i \, |x_i^{R_\eta}\rangle\langle x_i^{R_\eta}|$
where the $|x_i^{R_\eta}\rangle$ are the orthogonal pure states encoding the inputs $\{x_i\} = \{A1_i, B1_i, C1_i\}$. Because each of $A1_i$, $B1_i$, and $C1_i$ can be 0 or 1, there are $2^3 = 8$ different pure states, and $p_i$ is the probability of each pure state.
Figure 4.14: Data zones in the computational steps c1 to c5. Panels: (c1), (c2), (c3), (c4), (c5), (clock).
After the ηth referent $R_\eta$ goes into the artifact A, the entire closed system evolves according to the Schrödinger equation. The evolution in each computational step $c_k$ is shown in Table 4.6. In computational steps c3 and c4, information loss, and hence heat dissipation, occurs. These steps therefore include a restoration process $\hat{U}^{\mathrm{rest}}_{\mathrm{diss}} = \hat{U}^{B\bar{B}} \otimes I^{R_\eta A_k \bar{A}_k}$, which means that the remote environment $\bar{B}$ restores the heat bath $B$ to the thermal equilibrium state.
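Here we interpret the state transformation for the ASL adder in Table 4.6. At the beginning of computational step c1, the inputs have not been loaded into the artifact, which means the referent $R_\eta$ is correlated only with the supporting system $\bar{A}_k$. Thus $\hat{\rho}_i^{R_\eta}$ appears in a tensor product only with $\hat{\rho}_i^{\bar{A}_k}$ for i = 1 to 8. During the computation process, the data flows into the artifact $A_k$. Thus $A_k$ becomes part of the data zone, while $R_\eta$, $B$, and $\bar{B}$ remain unchanged. In computational step c2, the initial state involves $R_\eta$, $A_k$, and $\bar{A}_k$ ($\hat{\rho}_{0,i}$ stands for the ending state of c1). Thus $\hat{\rho}_i^{R_\eta}$, $\hat{\rho}_{0,i}^{A_k}$, and $\hat{\rho}_{0,i}^{\bar{A}_k}$ are tensored together. The state transformation of c2 has the same form as that of c1. In computational step c3, all the $x_i$ are in $A_k$ and $\bar{A}_k$, so $\hat{\rho}_i^{R_\eta}$, $\hat{\rho}_{1,i}^{A_k}$, and $\hat{\rho}_{1,i}^{\bar{A}_k}$ are tensored together in the initial state. In the state transformation, we send information to the next stage ($\bar{A}_k$) and the information in A1, B1, and C1 is erased, leading to an irreversible information loss in this step. Thus we have heat dissipation from $A_k$ to $B$ and $\bar{A}_k$ ($\hat{U}^{A_k \bar{A}_k B}$) and from $B$ to $\bar{B}$ ($\hat{U}^{\mathrm{rest}}_{\mathrm{diss}} = \hat{U}^{B\bar{B}} \otimes I^{R_\eta A_k \bar{A}_k}$). Here, $\hat{U}^{A_k \bar{A}_k}$ also indicates that $A_k$ starts to disconnect from $\bar{A}_k$ and finishes the disconnection by the end of c3. In c4, the data $x_i$ or $L(x_i)$ is in the referent, the artifact $A_k$, the receiving magnets $\bar{A}_k$, and the greater environment. So $\hat{\rho}_i^{R_\eta}$, $\hat{\rho}_{2,i}^{A_k}$, $\hat{\rho}_{2,i}^{\bar{A}_k}$, and $\hat{\rho}_{2,i}^{B}$ are tensored together. In the state transformation, the data in S1, Co1, A2, B2, and C2 are erased, leading to an irreversible information loss. Thus we have heat dissipation from $A_k$ to $B$ ($\hat{U}^{A_k B}$) and from $B$ to $\bar{B}$ ($\hat{U}^{\mathrm{rest}}_{\mathrm{diss}} = \hat{U}^{B\bar{B}} \otimes I^{R_\eta A_k \bar{A}_k}$). In computational step c5, the data $x_i$ or $L(x_i)$ is in the referent, the artifact $A_k$, the receiving magnets $\bar{A}_k$, and the greater environment. So $\hat{\rho}_i^{R_\eta}$, $\hat{\rho}_{3,i}^{A_k}$, $\hat{\rho}_{3,i}^{\bar{A}_k}$, and $\hat{\rho}_{3,i}^{B}$ are tensored together. In the state transformation, the data in S2 and Co2 are erased. However, the information loss associated with this erasure belongs to the next computational cycle, so no heat dissipation is attributed to this cycle. We only have the control operation $\hat{U}^{A_k \bar{A}_k}$, which disconnects $x_i$ between $A_k$ and $\bar{A}_k$.
Table 4.6: State transformation for the ASL adder.
Computational Step | Initial State | State Transformation | Control Operation
c1 | $\hat\rho_0 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_i^{\bar A_k}) \otimes \hat\rho^{A_k} \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_1 = \hat U^{\mathrm{rest}} \hat U_1 \hat\rho_0 \hat U_1^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_1 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c2 | $\hat\rho_1 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{0,i}^{A_k} \otimes \hat\rho_{0,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_2 = \hat U^{\mathrm{rest}} \hat U_2 \hat\rho_1 \hat U_2^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_2 = \hat U^{A_k} \otimes I^{R_\eta \bar A_k B \bar B}$
c3 | $\hat\rho_2 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{1,i}^{A_k} \otimes \hat\rho_{1,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_3 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_3 \hat\rho_2 \hat U_3^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_3 = \hat U^{A_k \bar A_k B} \otimes I^{R_\eta \bar B}$
c4 | $\hat\rho_3 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{2,i}^{A_k} \otimes \hat\rho_{2,i}^{\bar A_k} \otimes \hat\rho_{2,i}^{B}) \otimes \hat\rho^{\bar B}$ | $\hat\rho_4 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_4 \hat\rho_3 \hat U_4^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_4 = \hat U^{A_k B} \otimes I^{R_\eta \bar A_k \bar B}$
c5 | $\hat\rho_4 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{3,i}^{A_k} \otimes \hat\rho_{3,i}^{\bar A_k} \otimes \hat\rho_{3,i}^{B}) \otimes \hat\rho^{\bar B}$ | $\hat\rho_5 = \hat U^{\mathrm{rest}} \hat U_5 \hat\rho_4 \hat U_5^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_5 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$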
Operational decomposition: Because the clock cycle Φ has two phases, $\phi_1$ and $\phi_2$, it can be written as $\Phi = \varphi_1 \varphi_2$, where $\varphi_v$ is a clock step, with
$\varphi_1 : \{(C(1); \phi_2), (C(2); \phi_1)\}$ (4.2)
$\varphi_2 : \{(C(1); \phi_1), (C(2); \phi_2)\}$ (4.3)
where C(1) and C(2) are the clock zones, containing the representational elements of adder 1 and adder 2. For a single input $x_i^{(\eta)}$, the computational cycle Γ extends over three clock cycles and can be described as
$\Gamma = c_1 c_2 c_3 c_4 c_5 = \varphi_1^{(1)} \varphi_2^{(1)} \varphi_1^{(2)} \varphi_2^{(2)} \varphi_1^{(3)}$ (4.4)
This completes the process abstraction, which is a qualitative analysis of information loss. We now move on to the quantitative analysis of information loss: the cost analysis.
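The clocking scheme of Equations 4.2 to 4.4 can be written out as a small schedule. The sketch below is an illustrative encoding only: the names `CLOCK_STEPS`, `computational_cycle`, and `switching_steps` are hypothetical, and zones 1 and 2 stand for C(1) and C(2). It checks which computational steps place each zone in the SWITCH phase.

```python
SWITCH, HOLD = "SWITCH", "HOLD"  # phases phi_1 and phi_2

# Clock steps from Eqs. (4.2) and (4.3): phase applied to each clock zone.
CLOCK_STEPS = {
    "varphi1": {1: HOLD, 2: SWITCH},
    "varphi2": {1: SWITCH, 2: HOLD},
}


def computational_cycle(n_steps=5):
    """Clock-step sequence for c1..c5, alternating varphi1 and varphi2 (Eq. 4.4)."""
    order = ("varphi1", "varphi2")
    return [order[k % 2] for k in range(n_steps)]


def switching_steps(zone, n_steps=5):
    """Computational steps in which the given clock zone is in the SWITCH phase."""
    return [f"c{k + 1}"
            for k, name in enumerate(computational_cycle(n_steps))
            if CLOCK_STEPS[name][zone] == SWITCH]


print(switching_steps(1))  # steps in which adder 1 computes
print(switching_steps(2))  # steps in which adder 2 computes
```

Under this encoding, adder 1 switches in c2 and c4 and adder 2 switches in c1, c3, and c5, which matches the step-by-step description of Fig. 4.14.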
Cost Analysis: First, denote the receiving magnets A, B, and C as clock subzone C1(u) of each clock zone C(u), and denote magnets S and Co as clock subzone C2(u) of each clock zone C(u). Then the kth data zone D(k) for the input $x_i^{(\eta)}$ in the ηth computational cycle can be written as
D(c1) = C1(1) (4.5)
D(c2) = C1(1) ∪ C2(1) ∪ C1(2) (4.6)
D(c2) = C1(1) ∪ C2(1) ∪ C1(2) (4.7)
D(c3) = C2(1) ∪ C1(2) ∪ C2(2) (4.8)
D(c4) = C2(2) (4.9)
D(c5) = ∅ (4.10)
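The data zones listed above can be treated as sets, and the subzones that leave the data zone from one step to the next can be read off with a set difference. The following is a minimal sketch under that assumption; the dictionary `DATA_ZONES` and the function `erased_subzones` are hypothetical names, and the sketch tracks zone membership only, not which computational cycle a loss is attributed to.

```python
# Data zones D(c1)..D(c5), written with the subzone labels used in the text.
DATA_ZONES = {
    "c1": {"C1(1)"},
    "c2": {"C1(1)", "C2(1)", "C1(2)"},
    "c3": {"C2(1)", "C1(2)", "C2(2)"},
    "c4": {"C2(2)"},
    "c5": set(),
}


def erased_subzones(zones):
    """Subzones present in the previous data zone but absent from the current one."""
    steps = list(zones)
    return {cur: zones[prev] - zones[cur] for prev, cur in zip(steps, steps[1:])}


for step, erased in erased_subzones(DATA_ZONES).items():
    print(step, sorted(erased))
```

The result agrees with the walk-through of Fig. 4.14: C1(1) (A1, B1, C1) leaves in c3, C2(1) and C1(2) (Co1, S1, A2, B2, C2) leave in c4, and C2(2) (Co2, S2) leaves in c5.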
To calculate the physical information loss, note that only computational steps c3 and c4 have non-zero information loss. We assume each input bit is equally likely to be 1 or 0, i.e. 50% of inputs are 1 and 50% are 0. For the first adder of the ripple adder, the input and output states are listed in Table 4.7. Note that because we apply a positive voltage to the magnet, the outputs are the logical complements of their original values.
From Table 4.7, we can see that the outputs Co1 and S1 take four patterns: 00, 01, 10, and 11. Their rates are shown in Table 4.8. From Table 4.7 and Table 4.8 we can see that output pattern 11 has only one corresponding input pattern, 000, and output pattern 00 likewise has only one corresponding input pattern, 111. So no information is lost during the data transport if the inputs are 000 or 111. For the other six input patterns, we can use the method of Chapter 3 to get the result.
For the second adder, the input rate of C2 is fixed by the first-stage calculation (see Table 4.7 and Table 4.8). The input rates of A2 and B2 follow a uniform distribution, i.e. input A2 or B2 has a 50% probability of being 1 and a 50% probability of being 0. Thus, the inputs and outputs of the second adder are as given in Table 4.9.
Table 4.7: Inputs and outputs of the first adder.
        Input              Output
Rate    A1   B1   C1    Co1   S1    Rate
1/8     0    0    0     1     1     1/8
1/8     0    0    1     1     0     1/8
1/8     0    1    0     1     0     1/8
1/8     0    1    1     0     1     1/8
1/8     1    0    0     1     0     1/8
1/8     1    0    1     0     1     1/8
1/8     1    1    0     0     1     1/8
1/8     1    1    1     0     0     1/8
From Table 4.9, we obtain the output rates of the second adder, shown in Table 4.10. As for the first adder, output pattern 11 has only one corresponding input pattern, 000, and output pattern 00 has only one corresponding input pattern, 111. No information is lost for these two input patterns. For the remaining six input patterns, we can do the following calculation.
Table 4.8: Outputs of the first adder.
Output
Co1   S1    Rate
0     0     1/8
0     1     3/8
1     0     3/8
1     1     1/8
Table 4.9: Inputs and outputs of the second adder.
        Input              Output
Rate    A2   B2   C2    Co2   S2    Rate
1/16    0    0    0     1     1     1/16
3/16    0    0    1     1     0     3/16
3/16    0    1    0     1     0     3/16
1/16    0    1    1     0     1     1/16
1/16    1    0    0     1     0     1/16
3/16    1    0    1     0     1     3/16
3/16    1    1    0     0     1     3/16
1/16    1    1    1     0     0     1/16
Table 4.10: Outputs of the second adder.
Output
Co2   S2    Rate
0     0     1/16
0     1     7/16
1     0     7/16
1     1     1/16
According to physical information theory, the heat dissipated in the kth computational step satisfies
$\Delta\langle E^{B}\rangle_k \ge \sum_{L_w^{(k)}} -k_B T \ln(2)\, \Delta I^{R_\eta L_w^{(k)}}$ (4.11)
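In step 3, $p_i = \frac{1}{8}$ and $p_{i|j} = \frac{p_i}{q_j} = \frac{1}{8} \times \frac{8}{3} = \frac{1}{3}$. The information loss is
$-\Delta I = H(X|Y) = \sum_{j=0}^{r-1} q(y_j) H(X|y_j) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -2 \times \frac{1}{8} \times 3 \times \frac{1}{3} \times 3 \times \log_2 \frac{1}{3} = 1.19.$
Similarly, in step 4, the information loss is
$-\Delta I = H(X|Y) = \sum_{j=0}^{r-1} q(y_j) H(X|y_j) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -2 \times \frac{7}{16} \times \left( 2 \times \frac{3}{7} \log_2 \frac{3}{7} + \frac{1}{7} \log_2 \frac{1}{7} \right) = 1.27.$
Thus, the bounds on the heat dissipation caused by information loss in computational steps 3 and 4 are
$\Delta\langle E^{B}\rangle_3 \ge 1.19\, k_B T \ln(2)$ (4.12)
and
$\Delta\langle E^{B}\rangle_4 \ge 1.27\, k_B T \ln(2)$ (4.13)
The total bound is
$\Delta\langle E\rangle_{TOT} = \sum_{k=1}^{6} \Delta E(c_k) = \sum_{k=1}^{6} \Delta\langle E^{B}\rangle_k = \Delta\langle E^{B}\rangle_3 + \Delta\langle E^{B}\rangle_4 \ge 2.46\, k_B T \ln(2)$ (4.14)
With the Boltzmann constant $k_B = 1.38 \times 10^{-23}\ \mathrm{m^2\,kg\,s^{-2}\,K^{-1}}$, the total energy cost is
$\Delta\langle E\rangle_{TOT} = 2.46\, k_B T \ln(2) = 2.46 \times 1.38 \times 10^{-23} \times \ln(2) = 2.35 \times 10^{-23}\ \mathrm{J}$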
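The two conditional entropies and the resulting bound can be checked numerically. The sketch below is illustrative, not the thesis's own tooling: the helper names are hypothetical, the truth table is the inverted full adder of Tables 4.7 and 4.9, and the second-adder input rates are copied from Table 4.9. The temperature is left as an explicit parameter; the numerical value 2.35 × 10^-23 J quoted above is reproduced when the temperature factor is numerically 1.

```python
import math
from collections import defaultdict
from itertools import product

K_B = 1.38e-23  # Boltzmann constant in J/K, value used in the text


def conditional_entropy(joint):
    """H(X|Y) in bits from a joint distribution {(x, y): probability}."""
    q = defaultdict(float)
    for (_, y), p in joint.items():
        q[y] += p
    return -sum(p * math.log2(p / q[y]) for (_, y), p in joint.items() if p > 0)


def inverted_full_adder(a, b, c):
    """Outputs (Co, S) complemented, as in Tables 4.7 and 4.9."""
    total = a + b + c
    return (1 - total // 2, 1 - total % 2)


def dissipation_bound(bits, temperature_k):
    """Lower bound bits * k_B * T * ln(2), in joules."""
    return bits * K_B * temperature_k * math.log(2)


patterns = list(product((0, 1), repeat=3))

# Step 3: first adder, uniform inputs (Table 4.7).
joint_1 = {(x, inverted_full_adder(*x)): 1 / 8 for x in patterns}
loss_step3 = conditional_entropy(joint_1)

# Step 4: second adder, input rates as listed in Table 4.9.
rates_2 = [1 / 16, 3 / 16, 3 / 16, 1 / 16, 1 / 16, 3 / 16, 3 / 16, 1 / 16]
joint_2 = {(x, inverted_full_adder(*x)): p for x, p in zip(patterns, rates_2)}
loss_step4 = conditional_entropy(joint_2)

print(f"step 3: {loss_step3:.2f} bits, step 4: {loss_step4:.2f} bits")
print(f"bound with temperature factor 1: {dissipation_bound(loss_step3 + loss_step4, 1.0):.3e} J")
```

Passing a physical temperature, for example `dissipation_bound(bits, 300.0)`, scales the bound linearly with T.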
4.4.3 Efficiency Bounds for Buffer, Inverter, Latch, Majority Gate, and
Half Adder
We illustrated the physical-information calculation steps in the previous section. Now we can apply them to the ASL circuits of Sections 4.1, 4.2, and 4.3.
• Buffer, inverter and latch: The buffer, inverter, and latch implement simple logic functions. Each has one input magnet and one corresponding output magnet, which means each input X has only one possible output Y related to it. Whether the function is a NOT or a COPY, inputs and outputs are paired one to one. No 1-to-N or N-to-1 situation exists. This means all information is preserved from the input to the output, and no information is lost.
Table 4.11 shows the inputs and outputs of the ASL buffer, inverter, and latch. For any given output, only one input corresponds to it, which means the conditional probability is $p_{i|j} = \frac{p_i}{q_j} = \frac{1/2}{1/2} = 1$ and the entropy is $H(X|y_j) = -\log_2 p_{i|j} = -\log_2 1 = 0$.
Thus the information loss is $-\Delta I = H(X|Y) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = 0$.
• Majority gate: The 3-input majority gate has three inputs and one output. As shown in Table 2.1, the 3-input majority gate can be a NAND gate or a NOR gate. The input rates and corresponding output rates are shown in Table 4.12, where m1, m2, and m3 represent the three input magnets and m4 represents the output magnet.
Thus, the information loss for the NAND-function majority gate is $-\Delta I = H(X|Y) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -\frac{3}{4} \times \frac{1}{3} \times 3 \times \log_2 \frac{1}{3} = 1.19$. The information loss for the NOR gate is the same: $-\Delta I = H(X|Y) = -\frac{3}{4} \times \frac{1}{3} \times 3 \times \log_2 \frac{1}{3} = 1.19$. The total energy cost is $\Delta\langle E\rangle_{TOT} = 1.19\, k_B T \ln(2) = 1.19 \times 1.38 \times 10^{-23} \times \ln(2) = 1.14 \times 10^{-23}\ \mathrm{J}$ for both the NAND gate and the NOR gate. Because the total irreversibility induced energy here is already averaged to the per-pattern value, the energy efficiency is $E_{efficiency} = 1/E_{tot} = 1/(1.14 \times 10^{-23}\ \mathrm{J}) = 8.772 \times 10^{22}$ operations/J.
• Half adder: The half adder schematic is shown in Figure 4.7. Replacing each NAND gate with the 3-input majority gate unit, we get the detailed half adder layout shown in Figure 4.15. The half adder is made of five NAND gates. Following the information flow, we divide it into three stages.
In the first stage, the inputs go into the first NAND gate (colored in blue), producing the first interim output m3. In the second stage (colored in blue), the interim output goes into the next three NAND gates, producing one final output, Carry, and two interim outputs, m6 and m9. In the third stage (colored in yellow), the two interim outputs go into the last NAND gate, producing the final output SUM. According to the truth table, every majority gate has one magnet held at logic zero (negative voltage) to obtain the NAND function.
The information loss occurs when the information stored in stages 1, 2, and 3 is erased.
For the first stage, the information loss is the same as that of a NAND-function majority gate: $-\Delta I = H(X|Y) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -\frac{3}{4} \times \frac{1}{3} \times 3 \times \log_2 \frac{1}{3} = 1.19$.
For the second stage, the inputs and the corresponding outputs are shown in Table 4.13. The output magnets m6, m9, and m12 have a one-to-one relation with the input patterns, i.e. for any given output y, only one input x is related to it. Thus the information loss for the second stage is 0.
For the third stage, the inputs and the corresponding output rates are shown in Table 4.14: $-\Delta I = H(X|Y) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -\frac{1}{2} \times \frac{1}{2} \times 2 \times \log_2 \frac{1}{2} = 0.5$.
Thus the total information loss of the three stages is 1.19 + 0.5 = 1.69. The corresponding energy cost is $\Delta\langle E\rangle_{TOT} = 1.69\, k_B T \ln(2) = 1.69 \times 1.38 \times 10^{-23} \times \ln(2) = 1.61 \times 10^{-23}\ \mathrm{J}$. The energy efficiency is $E_{efficiency} = 1/E_{tot} = 1/(1.61 \times 10^{-23}\ \mathrm{J}) = 6.211 \times 10^{22}$ operations/J.
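The per-gate information losses in the list above can be reproduced from the truth tables. The sketch below is illustrative only: `info_loss_bits` and the gate functions are hypothetical names, the majority gate is modeled as an inverted 3-input majority with one input held fixed (0 for NAND, 1 for NOR, as in Table 4.12), and the third half-adder stage uses the input rates listed in Table 4.14.

```python
import math
from collections import defaultdict


def info_loss_bits(func, weighted_inputs):
    """H(X|Y) in bits for y = func(x), given a list of (x, probability) pairs."""
    q = defaultdict(float)
    for x, p in weighted_inputs:
        q[func(x)] += p
    return -sum(p * math.log2(p / q[func(x)]) for x, p in weighted_inputs if p > 0)


def inverted_majority(fixed):
    """Inverted 3-input majority gate with one input magnet held at `fixed`."""
    return lambda x: 1 - int(fixed + x[0] + x[1] >= 2)


two_bit_uniform = [((a, b), 1 / 4) for a in (0, 1) for b in (0, 1)]
one_bit_uniform = [(0, 1 / 2), (1, 1 / 2)]

loss_buffer = info_loss_bits(lambda x: x, one_bit_uniform)
loss_inverter = info_loss_bits(lambda x: 1 - x, one_bit_uniform)
loss_nand = info_loss_bits(inverted_majority(0), two_bit_uniform)
loss_nor = info_loss_bits(inverted_majority(1), two_bit_uniform)

# Third half-adder stage: input patterns and rates as listed in Table 4.14.
stage3_inputs = [((1, 0), 1 / 4), ((0, 1), 1 / 4), ((1, 1), 1 / 2)]
loss_stage3 = info_loss_bits(inverted_majority(0), stage3_inputs)

loss_half_adder = loss_nand + 0.0 + loss_stage3  # stages 1, 2, and 3
print(f"NAND {loss_nand:.2f}, NOR {loss_nor:.2f}, half adder {loss_half_adder:.2f} bits")
```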
4.4.4 Efficiency Bounds for ALU
The state transformations for all the circuits calculated above are shown in Table 4.15. To calculate the energy efficiency of the ALU, we list the input and output rates for each function.
Table 4.11: Outputs of buffer, inverter and latch.
Rate   Input   Output (buffer, latch)   Output (inverter)
1/2    0       0                        1
1/2    1       1                        0
Table 4.12: Inputs and outputs of the majority gate.
        Input            Output
Rate    m1   m2   m3     m4    Rate   Function
1/4     0    0    0      1     1/4    NAND
1/4     0    0    1      1     1/4    NAND
1/4     0    1    0      1     1/4    NAND
1/4     0    1    1      0     1/4    NAND
1/4     1    0    0      1     1/4    NOR
1/4     1    0    1      0     1/4    NOR
1/4     1    1    0      0     1/4    NOR
1/4     1    1    1      0     1/4    NOR
Table 4.13: Inputs and outputs of the half adder middle stage.
        Input                              Output
Rate    m4   m5   m7   m8   m10   m11     m6   m9   m12
1/4     0    1    0    1    1     1       1    1    0
1/4     0    1    1    1    1     1       1    0    0
1/4     1    1    0    1    1     1       0    1    0
1/4     1    0    1    0    0     0       1    1    1
Table 4.14: Inputs and outputs of the half adder third stage.
        Input         Output
Rate    m14   m13     m15 (SUM)   Rate
1/4     1     0       1           1/4
1/4     0     1       1           1/4
1/2     1     1       0           1/4
Figure 4.15: Half adder layout.
• Full adder: According to the ALU schematic (Fig. 4.10) and the function configurations (Table 4.4), the full adder has two computation stages. In the first stage we get the carry-out from majority gates M1 and M3, where M1 provides the interim outcome for M2 and one of the final outputs, Cout (F1). Here the output of M3 (F2) doesn't have any logic meaning. In the second stage, we get the final sum of A, B, and Cin. The detailed truth table is shown in Table 2.2. The inputs are evenly distributed, each with a rate of 1/8.
For the first stage, $-\Delta I = H(X|Y) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -2 \times \frac{1}{2} \times \frac{1}{4} \times 4 \times \log_2 \frac{1}{4} = 2$. For the second stage, $-\Delta I = H(X|Y) = -2 \times \frac{1}{2} \times \frac{1}{4} \times 4 \times \log_2 \frac{1}{4} = 2$. Thus the total information loss of the two stages is 2 + 2 = 4. The corresponding energy cost is $\Delta\langle E\rangle_{TOT} = 4\, k_B T \ln(2) = 4 \times 1.38 \times 10^{-23} \times \ln(2) = 3.87 \times 10^{-23}\ \mathrm{J}$. The energy efficiency is $E_{efficiency} = 1/E_{tot} = 1/(3.87 \times 10^{-23}\ \mathrm{J}) = 2.584 \times 10^{22}$ operations/J.
• Subtractor: According to the ALU schematic (Fig. 4.10) and the function configurations (Table 4.4), the subtractor has two computation stages, similar to the configuration of the full adder. In the first stage we get the borrow-out from majority gates M1 and M3, where M1 has the interim outcome ready for M2 and one of the final outputs Bout (F1). Here the output of M1 (F1) doesn't have any logic meaning. In the second stage, we get the final difference of A, B, and Bin. The detailed truth table is shown in Table 4.16. The inputs are evenly distributed, each with a rate of 1/8.
For the first stage, $-\Delta I = H(X|Y) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -2 \times \frac{1}{2} \times \frac{1}{4} \times 4 \times \log_2 \frac{1}{4} = 2$. For the second stage, $-\Delta I = H(X|Y) = -2 \times \frac{1}{2} \times \frac{1}{4} \times 4 \times \log_2 \frac{1}{4} = 2$. Thus the total information loss of the two stages is 2 + 2 = 4. The corresponding energy cost is $\Delta\langle E\rangle_{TOT} = 4\, k_B T \ln(2) = 4 \times 1.38 \times 10^{-23} \times \ln(2) = 3.87 \times 10^{-23}\ \mathrm{J}$. The energy efficiency is $E_{efficiency} = 1/E_{tot} = 1/(3.87 \times 10^{-23}\ \mathrm{J}) = 2.584 \times 10^{22}$ operations/J.
• Multiplexor: According to the ALU schematic (Fig. 4.10) and the function configurations (Table 4.4), the multiplexor has two computation stages. In the first stage, we get the interim results from majority gate M1. Majority gate M3 also processes information in this stage, but its output F2 is not needed for this function. In the second stage, we get the final output. When the select signal A is logic 1, input B is selected. When the select signal is logic 0, input C is selected.
For the first stage, $-\Delta I = H(X|Y) = -\sum_{j=0}^{r-1} \sum_{i=0}^{d-1} q_j \, p_{i|j} \log_2 p_{i|j} = -\left( \frac{3}{8} \times \frac{1}{3} \times 3 \times \log_2 \frac{1}{3} + \frac{5}{8} \times \frac{1}{5} \times 5 \times \log_2 \frac{1}{5} \right) = 0.594 + 1.451 = 2.045$. For the second stage, $-\Delta I = H(X|Y) = -2 \times \frac{1}{2} \times \frac{1}{4} \times 4 \times \log_2 \frac{1}{4} = 2$. Thus the total information loss of the two stages is 2.045 + 2 = 4.045. The corresponding energy cost is $\Delta\langle E\rangle_{TOT} = 4.045\, k_B T \ln(2) = 4.045 \times 1.38 \times 10^{-23} \times \ln(2) = 3.87 \times 10^{-23}\ \mathrm{J}$. The energy efficiency is $E_{efficiency} = 1/E_{tot} = 1/(3.87 \times 10^{-23}\ \mathrm{J}) = 2.584 \times 10^{22}$ operations/J.
• Increment/Decrement: The truth table of the increment and decrement functions is shown in Table 4.17. Inputs B and C are control signals, and A is the actual input. When the implemented function is increment, the output F1F0 realizes the function F1F0 = A + 1. When the implemented function is decrement, the output F2F0 realizes the function F2F0 = A − 1. From the truth table, we can see that the input and output patterns have a one-to-one relationship. Thus the information loss of the increment and decrement functions is 0.
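For equiprobable inputs, each stage loss used in the list above depends only on how many input patterns map to each output pattern: H(X|Y) = Σ_j (n_j/N) log2 n_j. The sketch below applies this shortcut; the function name is hypothetical, and the preimage sizes (4 and 4 for an adder stage, 3 and 5 for the first multiplexor stage, 1 and 1 for increment/decrement) are read off the expressions in the text rather than derived from the ALU netlist.

```python
import math


def info_loss_from_preimages(preimage_sizes):
    """H(X|Y) in bits when all inputs are equiprobable.

    preimage_sizes[j] is the number of input patterns mapped to output j.
    """
    n_inputs = sum(preimage_sizes)
    return sum(n / n_inputs * math.log2(n) for n in preimage_sizes)


adder_stage = info_loss_from_preimages([4, 4])   # each full adder / subtractor stage
mux_stage1 = info_loss_from_preimages([3, 5])    # first multiplexor stage
inc_dec = info_loss_from_preimages([1, 1])       # one-to-one mapping

print(f"full adder total:  {2 * adder_stage:.3f} bits")
print(f"multiplexor total: {mux_stage1 + adder_stage:.3f} bits")
print(f"increment/decrement: {inc_dec:.3f} bits")
```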
4.4.5 Simulation-Based Projections vs Fundamental Bounds
In previous sections, we used two methods to calculate some ASL circuits and get
their energy dissipation (or heat dissipation). Results for projected energy dissipation
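Table 4.15: State transformation for the buffer, inverter, latch, majority gate, half adder, and ALU.
Computational Step | Initial State | State Transformation | Control Operation
Buffer and Inverter
c1 | $\hat\rho_0 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_i^{\bar A_k}) \otimes \hat\rho^{A_k} \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_1 = \hat U^{\mathrm{rest}} \hat U_1 \hat\rho_0 \hat U_1^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_1 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c2 | $\hat\rho_1 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_{0,i}^{A_k} \otimes \hat\rho_{0,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_2 = \hat U^{\mathrm{rest}} \hat U_2 \hat\rho_1 \hat U_2^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_2 = \hat U^{A_k} \otimes I^{R_\eta \bar A_k B \bar B}$
c3 | $\hat\rho_2 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_{1,i}^{A_k} \otimes \hat\rho_{1,i}^{\bar A_k}) \otimes \hat\rho_{1,i}^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_3 = \hat U^{\mathrm{rest}} \hat U_3 \hat\rho_2 \hat U_3^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_3 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c4 | $\hat\rho_3 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_{2,i}^{A_k} \otimes \hat\rho_{2,i}^{\bar A_k}) \otimes \hat\rho_{2,i}^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_4 = \hat U^{\mathrm{rest}} \hat U_4 \hat\rho_3 \hat U_4^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_4 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
Latch
c1 | $\hat\rho_0 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_i^{\bar A_k}) \otimes \hat\rho^{A_k} \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_1 = \hat U^{\mathrm{rest}} \hat U_1 \hat\rho_0 \hat U_1^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_1 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c2 | $\hat\rho_1 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_{0,i}^{A_k} \otimes \hat\rho_{0,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_2 = \hat U^{\mathrm{rest}} \hat U_2 \hat\rho_1 \hat U_2^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_2 = \hat U^{A_k} \otimes I^{R_\eta \bar A_k B \bar B}$
c3 | $\hat\rho_2 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_{1,i}^{A_k} \otimes \hat\rho_{1,i}^{\bar A_k}) \otimes \hat\rho_{1,i}^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_3 = \hat U^{\mathrm{rest}} \hat U_3 \hat\rho_2 \hat U_3^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_3 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c4 | $\hat\rho_3 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_{2,i}^{A_k} \otimes \hat\rho_{2,i}^{\bar A_k}) \otimes \hat\rho_{2,i}^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_4 = \hat U^{\mathrm{rest}} \hat U_4 \hat\rho_3 \hat U_4^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_4 = \hat U^{A_k} \otimes I^{R_\eta \bar A_k B \bar B}$
c5 | $\hat\rho_4 = (\sum_{i=1}^{2} \hat\rho_i^{R_\eta} \otimes \hat\rho_{3,i}^{A_k} \otimes \hat\rho_{3,i}^{\bar A_k}) \otimes \hat\rho_{3,i}^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_5 = \hat U^{\mathrm{rest}} \hat U_5 \hat\rho_4 \hat U_5^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_5 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
Majority Gate
c1 | $\hat\rho_0 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_i^{\bar A_k}) \otimes \hat\rho^{A_k} \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_1 = \hat U^{\mathrm{rest}} \hat U_1 \hat\rho_0 \hat U_1^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_1 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c2 | $\hat\rho_1 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{0,i}^{A_k} \otimes \hat\rho_{0,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_2 = \hat U^{\mathrm{rest}} \hat U_2 \hat\rho_1 \hat U_2^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_2 = \hat U^{A_k} \otimes I^{R_\eta \bar A_k B \bar B}$
c3 | $\hat\rho_2 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{1,i}^{A_k} \otimes \hat\rho_{1,i}^{\bar A_k}) \otimes \hat\rho_{1,i}^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_3 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_3 \hat\rho_2 \hat U_3^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_3 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c4 | $\hat\rho_3 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{2,i}^{A_k} \otimes \hat\rho_{2,i}^{\bar A_k}) \otimes \hat\rho_{2,i}^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_4 = \hat U^{\mathrm{rest}} \hat U_4 \hat\rho_3 \hat U_4^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_4 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
Half Adder
c1 | $\hat\rho_0 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_i^{\bar A_k}) \otimes \hat\rho^{A_k} \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_1 = \hat U^{\mathrm{rest}} \hat U_1 \hat\rho_0 \hat U_1^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_1 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c2 | $\hat\rho_1 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{0,i}^{A_k} \otimes \hat\rho_{0,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_2 = \hat U^{\mathrm{rest}} \hat U_2 \hat\rho_1 \hat U_2^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_2 = \hat U^{A_k} \otimes I^{R_\eta \bar A_k B \bar B}$
c3 | $\hat\rho_2 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{1,i}^{A_k} \otimes \hat\rho_{1,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_3 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_3 \hat\rho_2 \hat U_3^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_3 = \hat U^{A_k \bar A_k B} \otimes I^{R_\eta \bar B}$
c4 | $\hat\rho_3 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{2,i}^{A_k} \otimes \hat\rho_{2,i}^{\bar A_k} \otimes \hat\rho_{2,i}^{B}) \otimes \hat\rho^{\bar B}$ | $\hat\rho_4 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_4 \hat\rho_3 \hat U_4^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_4 = \hat U^{A_k B} \otimes I^{R_\eta \bar A_k \bar B}$
c5 | $\hat\rho_4 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{3,i}^{A_k} \otimes \hat\rho_{3,i}^{\bar A_k} \otimes \hat\rho_{3,i}^{B}) \otimes \hat\rho^{\bar B}$ | $\hat\rho_5 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_5 \hat\rho_4 \hat U_5^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_5 = \hat U^{A_k B} \otimes I^{R_\eta \bar A_k \bar B}$
c6 | $\hat\rho_5 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{4,i}^{A_k} \otimes \hat\rho_{4,i}^{\bar A_k} \otimes \hat\rho_{4,i}^{B}) \otimes \hat\rho^{\bar B}$ | $\hat\rho_6 = \hat U^{\mathrm{rest}} \hat U_6 \hat\rho_5 \hat U_6^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_6 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
ALU
c1 | $\hat\rho_0 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_i^{\bar A_k}) \otimes \hat\rho^{A_k} \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_1 = \hat U^{\mathrm{rest}} \hat U_1 \hat\rho_0 \hat U_1^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_1 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$
c2 | $\hat\rho_1 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{0,i}^{A_k} \otimes \hat\rho_{0,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_2 = \hat U^{\mathrm{rest}} \hat U_2 \hat\rho_1 \hat U_2^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_2 = \hat U^{A_k} \otimes I^{R_\eta \bar A_k B \bar B}$
c3 | $\hat\rho_2 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{1,i}^{A_k} \otimes \hat\rho_{1,i}^{\bar A_k}) \otimes \hat\rho^{B} \otimes \hat\rho^{\bar B}$ | $\hat\rho_3 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_3 \hat\rho_2 \hat U_3^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_3 = \hat U^{A_k \bar A_k B} \otimes I^{R_\eta \bar B}$
c4 | $\hat\rho_3 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{2,i}^{A_k} \otimes \hat\rho_{2,i}^{\bar A_k} \otimes \hat\rho_{2,i}^{B}) \otimes \hat\rho^{\bar B}$ | $\hat\rho_4 = \hat U^{\mathrm{rest}}_{\mathrm{diss}} \hat U_4 \hat\rho_3 \hat U_4^{\dagger} \hat U^{\mathrm{rest}\,\dagger}_{\mathrm{diss}}$ | $\hat U_4 = \hat U^{A_k B} \otimes I^{R_\eta \bar A_k \bar B}$
c5 | $\hat\rho_4 = (\sum_{i=1}^{8} \hat\rho_i^{R_\eta} \otimes \hat\rho_{3,i}^{A_k} \otimes \hat\rho_{3,i}^{\bar A_k} \otimes \hat\rho_{3,i}^{B}) \otimes \hat\rho^{\bar B}$ | $\hat\rho_5 = \hat U^{\mathrm{rest}} \hat U_5 \hat\rho_4 \hat U_5^{\dagger} \hat U^{\mathrm{rest}\,\dagger}$ | $\hat U_5 = \hat U^{A_k \bar A_k} \otimes I^{R_\eta B \bar B}$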
Table 4.16: The truth table of subtractor.
A B Bin Bout Difference
0 0 0 0 0
0 0 1 1 0
0 1 0 1 1
0 1 1 0 0
1 0 0 1 1
1 0 1 0 0
1 1 0 0 1
1 1 1 1 1
Table 4.17: Truth table of increment and decrement function.
Input Output
Increment
CB (control signal) A F1 F0
01 0 0 1
01 1 1 0
Decrement
CB (control signal) A F2 F0
01 0 1 1
01 1 0 0
Table 4.18: Irreversibility induced energy results of different ASL devices.
Device                Irreversibility Induced Energy (J)   Energy Efficiency (operations/J)
Buffer                0                                    -
Inverter              0                                    -
Latch                 0                                    -
Majority gate         1.14 × 10^-23                        8.772 × 10^22
Half adder            1.61 × 10^-23                        6.211 × 10^22
Full adder            3.87 × 10^-23                        2.584 × 10^22
Subtractor            3.87 × 10^-23                        2.584 × 10^22
Multiplexor           3.87 × 10^-23                        2.584 × 10^22
Increment/Decrement   0                                    -
from circuit simulations were shown in Table 4.5. Lower bounds on dissipated energy
induced by irreversible information loss are shown in Table 4.18. Both refer to energy
dissipation, but the results are quite different. They are from very different approaches
that capture complementary aspects of energy dissipation.
The circuit simulation method yields the projected energy dissipation, which is numerically closest to what one would expect in actual ASL realizations. It combines both the static power and the dynamic power from the full LLG-model circuit simulation (integrated in time to obtain the final projected energy dissipation).
The physical information method, on the other hand, reveals the minimum energy dissipation, i.e. the lower bound set by physical law for realizing the logic function via the strategy employed in the ASL circuit. It reflects the dynamic power associated with irreversible information loss: it is not about how much (static) power is spent keeping the circuit running, but about the dissipation incurred when input information is irreversibly erased from the circuit during a computational step. Thus the physical information method is closely related to the dynamic power. Using Equation 3.34, we obtain the corresponding irreversibility induced energy dissipation.
Tables 4.5 and 4.18 allow the two methods to be compared. We use the simplified ASL circuit model for the projected energy dissipation results.
The irreversibility induced energy for the buffer, inverter, and latch is 0, because the computation processes of these circuits (the buffer and inverter share the same circuit, distinguished by the supply voltage) are reversible. Inputs and outputs have a one-to-one relation, which means no information is erased from the circuits. In this case, no irreversibility induced energy is dissipated to the environment. The 3-input majority gate realizes the NAND and NOR functions, whose computation processes are irreversible: at least one output has multiple inputs related to it.
The projected energies of the buffer and inverter are similar because they use the same circuit. The different logic functions of the buffer and the inverter cause the slight difference between their energy dissipation results. When supplied with a negative voltage, the ASL circuit acts as a buffer; with a positive voltage, it acts as an inverter. The projected energies of the latch, majority gate, and half adder increase successively, which is reasonable because the circuits become successively more complicated.
The irreversibility induced energy dissipation is the minimum cost a device incurs for a logic computation. Unlike the projected energy, it is an intrinsic characteristic of the circuit configuration and control strategy, regardless of the materials used to realize the devices and circuit interconnects. It is obtained from physical information theory, which is concerned with the computation steps and with how the fundamental working principle of the device implements them.
Thus, the irreversibility induced energy excludes everything but the primary energy needed to realize a logic computation. This is why some of the results are as low as 10^-23 J or even zero (buffer, inverter, and latch in our cases). They show only the lower bound for a computation step, an intrinsic characteristic determined by the logic computation step and by the way the implementing device carries it out. For the detailed calculation steps, please refer to Chapter 3.
The projected energy, on the other hand, is obtained from simulations based on models of actual circuit realizations. It depends on constituent material properties, supply voltage, device dimensions, and operating temperature; the target circuit is transformed into an equivalent circuit model that reflects these dependencies and is simulated in a virtual environment. The simulation is a digital circuit simulation; all possible inputs are considered and are set manually by the user, and the results should be similar to what would be measured in a real chip test.
Thus, the projected energy captures most aspects of dissipation in a real circuit, including the irreversibility induced energy. The gap between the projected and irreversibility induced energies is as large as a factor of 10^10. Given their complementary nature and the very different aspects of energy dissipation that they capture, one should not be surprised by the different results and the huge gap.
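The size of the gap can be checked directly from the tabulated values. The sketch below is illustrative: the dictionary and function names are hypothetical, the projected energies are taken from Table 4.5, and the bounds are the values quoted in Section 4.4 for the same circuits.

```python
import math

PJ_TO_J = 1e-12

# Projected energy per pattern (pJ, Table 4.5) and irreversibility induced
# bound (J, Section 4.4) for circuits with a non-zero bound.
CIRCUITS = {
    "majority gate (NAND)": (0.195, 1.14e-23),
    "full adder": (0.4425, 3.87e-23),
    "subtractor": (0.4, 3.87e-23),
    "multiplexor": (0.403, 3.87e-23),
}


def gap_ratio(projected_pj, bound_j):
    """Ratio of projected energy to the irreversibility induced bound."""
    return projected_pj * PJ_TO_J / bound_j


for name, (projected, bound) in CIRCUITS.items():
    ratio = gap_ratio(projected, bound)
    print(f"{name}: ratio {ratio:.2e} (about 10^{round(math.log10(ratio))})")
```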
CHAPTER 5
SUMMARY AND CONCLUSIONS
The goal of this thesis was to understand energy efficiency of computation in a new
post-CMOS device, the all-spin logic (ASL) device. By comparing projected energy
efficiencies from ASL circuit simulations and irreversibility induced energy dissipation
of different ASL circuits, we aimed to provide estimates of the computational energy
efficiencies of a variety of ASL circuits and to gauge the gap between the projected
and irreversibility induced energy dissipation in these circuits.
The thesis was introduced in Chapter 1, which first pointed out that the limits of CMOS devices make energy dissipation a central concern. Two types of energy dissipation were defined, and the objectives of the thesis were stated. The working principle and basic logic circuits of the ASL device were introduced in Chapter 2. The current types, including charge current and spin current, were identified and discussed, illustrating why the ASL device has a low supply voltage requirement and how it has become one of the most promising post-CMOS devices. Chapter 3 discussed in detail the two energy calculation methods: analog circuit simulation with MATLAB and physical information theory. The analog circuit simulation yields the projected energy dissipation, while physical information theory yields the theoretical, information-related energy dissipation. In Chapter 4, the results of these two calculation methods were compared and discussed. Simple ASL circuits such as the buffer, inverter, latch, and half adder were tested. A larger circuit, an ASL arithmetic logic unit (ALU), was finally studied. Simulation results and fundamental limits were compared for the various ASL circuits.
The use of complementary theoretical approaches to analyze the same set of ASL
circuits provides insight into what might be expected for particular ASL circuit re-
alizations and what is and is not physically achievable with optimization. The low
energy dissipation resulting from the low required supply voltage in ASL was ob-
served in the circuit simulation results, supporting the widely recognized advantages
in energy efficiency of ASL when compared to conventional CMOS. The irreversibility
induced energy, which fixes the lowest bound on the dissipation incurred in a com-
putation process implemented by a particular strategy, is orders of magnitude lower
than energy dissipation projected from simulations. The theoretical lower bounds are
approachable to differing degrees in various technologies. This raises the question of
whether the gap between projected efficiencies from simulations and theoretical lower
bounds can be closed in ASL.
Most of the energy cost in ASL goes toward operating the circuit, e.g. transporting
particles and generating the spin torque, as is clear from the calculations of
this work. While the associated energy costs are far lower than in CMOS, and can be
optimized through improvements in materials and structure design, the energy effi-
ciencies of ASL circuits cannot be expected to approach fundamental efficiency limits
associated with the computations they perform. Energetic overheads from static
power alone, induced by the currents required to maintain ASL circuit operation,
exceed fundamental limits by orders of magnitude as suggested by the projections
and limits calculated in this work.
APPENDIX A
LANDAU-LIFSHITZ-GILBERT EQUATION DERIVATION
The Landau-Lifshitz-Gilbert equation can be obtained through the following steps
[25][26].
In a spatially uniform magnetic field $\vec{H}$, we focus on a small region $dV_r$,
which contains $N$ elementary magnetic moments $\vec{\mu}_j$ (also written
$|\mu_j\rangle$, $j = 1, 2, 3, \ldots, N$), as shown in Fig. A.1. The magnetization
vector $\vec{M}$ is obtained by summing all $\vec{\mu}_j$ and dividing by $dV_r$:
$$\vec{M}(\vec{r}) = \frac{\sum_{j=1}^{N} \vec{\mu}_j}{dV_r} \tag{A.1}$$
where $\vec{M}$ is the magnetization vector and $\vec{\mu}_j$ is an elementary
magnetic moment in the small region $dV_r$.
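As a small numerical illustration of equation A.1, the sketch below sums elementary moments over a region and divides by its volume. The number of moments, the region volume, and the choice of one Bohr magneton per moment are assumptions made for the example, not values from this thesis.

```python
import math
import random

MU_B = 9.2740100783e-24  # Bohr magneton in J/T (CODATA 2018)

def magnetization(moments, volume):
    """Equation A.1: sum the elementary moments (3-tuples) and divide by the volume."""
    return tuple(sum(m[i] for m in moments) / volume for i in range(3))

def random_unit_vector(rng):
    """Direction drawn uniformly on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

N = 10000          # number of elementary moments (assumed)
VOLUME = 1.0e-24   # region volume in m^3 (assumed)
rng = random.Random(0)

# All moments aligned along z: |M| equals N * mu_B / dV_r.
aligned = magnetization([(0.0, 0.0, MU_B)] * N, VOLUME)

# Randomly oriented moments, as in Fig. A.1: contributions largely cancel.
disordered = magnetization(
    [tuple(MU_B * c for c in random_unit_vector(rng)) for _ in range(N)], VOLUME)
```

Comparing `norm(disordered)` with `norm(aligned)` shows how strongly random orientations suppress the net magnetization.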
Note that the spin moment $\vec{\mu}_j$ and the angular momentum $\vec{L}$ are
related by
$$\vec{\mu}_j = -\gamma \vec{L} \tag{A.2}$$
where
$$\gamma = \frac{g|e|}{2 m_e c} \tag{A.3}$$
Here $g \approx 2$ is the Landé splitting factor, $e$ and $m_e$ are the electron
charge and mass, respectively, and $c$ is the speed of light.
Figure A.1: A small region $dV_r$ in a magnetic field $\vec{H}$. The elementary
moments $|\mu_j\rangle$ point in random directions.
Figure A.2: Precession and damping [27]. The dotted circle indicates the precession
without damping, while the solid curve shows the actual damped path. $\vec{T}_D$ is
the damping torque (given in equation A.9).
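According to the angular momentum theorem, we have
$$\frac{d\vec{L}}{dt} = \vec{\mu}_j \times \vec{H} \tag{A.4}$$
where $\vec{L}$ is the angular momentum, $t$ is time, $\vec{\mu}_j$ is the elementary
magnetic moment, and $\vec{H}$ is the magnetic field. Substituting equation A.2 into
equation A.4, we get
$$\frac{d\vec{\mu}_j}{dt} = -\gamma \vec{\mu}_j \times \vec{H} \tag{A.5}$$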
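The substitution leading to equation A.5 can be written out explicitly. Solving equation A.2 for the angular momentum and treating $\gamma$ as a constant gives:

```latex
\begin{align*}
\vec{L} &= -\frac{1}{\gamma}\,\vec{\mu}_j
  && \text{(from A.2)} \\
\frac{d\vec{L}}{dt} &= -\frac{1}{\gamma}\,\frac{d\vec{\mu}_j}{dt}
  = \vec{\mu}_j \times \vec{H}
  && \text{(inserted into A.4)} \\
\frac{d\vec{\mu}_j}{dt} &= -\gamma\,\vec{\mu}_j \times \vec{H}
  && \text{(multiplying by } -\gamma\text{, which is A.5)}
\end{align*}
```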
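Next we sum over $\vec{\mu}_j$ from $j = 1$ to $N$ and divide both sides by $dV_r$:
$$\frac{d \sum_{j=1}^{N} \vec{\mu}_j}{dt\, dV_r} = -\gamma \frac{\sum_{j=1}^{N} \vec{\mu}_j}{dV_r} \times \vec{H} \tag{A.6}$$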
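Substituting equation A.1 into A.6, we get the continuum precession equation:
$$\frac{\partial \vec{M}}{\partial t} = -\gamma \vec{M} \times \vec{H} \tag{A.7}$$
Replacing the spatially uniform magnetic field $\vec{H}$ by the effective magnetic
field $\vec{H}_{eff}$, we get
$$\frac{\partial \vec{M}}{\partial t} = -\gamma \vec{M} \times \vec{H}_{eff} \tag{A.8}$$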
Equation A.8 describes the undamped gyromagnetic precession shown as the dotted line
in Fig. A.2 [27]. The magnetization vector $\vec{M}$ traces a circle because damping
is not taken into consideration. In practice, however, gyromagnetic precession is
usually affected by a damping torque $\vec{T}_D$, so the magnetization vector
$\vec{M}$ follows a spiral (solid line in Fig. A.2) instead of a circle. This spiral
motion is known as damped precession. The damping torque can be described as
$$\vec{T}_D = -\lambda \vec{M} \times (\vec{M} \times \vec{H}_{eff}) \tag{A.9}$$
where $\lambda$ is a constant depending on the material.
By adding the torque $\vec{T}_D$ to equation A.8, i.e. by adding a damping torque to
the undamped precession, we get the equation describing the damped precession, which
is also known as the Landau-Lifshitz equation:
$$\frac{\partial \vec{M}}{\partial t} = -\gamma \vec{M} \times \vec{H}_{eff} - \lambda \vec{M} \times (\vec{M} \times \vec{H}_{eff}) \tag{A.10}$$
The first term of this equation describes an undamped precession generated by the
interaction of the magnetization vector $\vec{M}$ and the effective magnetic field
$\vec{H}_{eff}$. The second term describes the damping.
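The two terms of the Landau-Lifshitz equation can be illustrated numerically. The sketch below integrates equation A.10 with a classical fourth-order Runge-Kutta step in dimensionless units. The values of $\gamma$, $\lambda$, the field, the step size, and the initial conditions are assumptions chosen for illustration, and the integrator is not the simulation method used in this thesis. With $\lambda = 0$ the magnetization circles the field at a fixed angle; with $\lambda > 0$ it spirals toward the field direction.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ll_rhs(m, h, gamma, lam):
    """Right-hand side of equation A.10: -gamma M x H - lambda M x (M x H)."""
    mxh = cross(m, h)
    mxmxh = cross(m, mxh)
    return tuple(-gamma * mxh[i] - lam * mxmxh[i] for i in range(3))

def rk4_step(m, h, gamma, lam, dt):
    """Advance M by one fourth-order Runge-Kutta step of size dt."""
    def shifted(k, s):
        return tuple(m[i] + s * k[i] for i in range(3))
    k1 = ll_rhs(m, h, gamma, lam)
    k2 = ll_rhs(shifted(k1, dt / 2), h, gamma, lam)
    k3 = ll_rhs(shifted(k2, dt / 2), h, gamma, lam)
    k4 = ll_rhs(shifted(k3, dt), h, gamma, lam)
    return tuple(m[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                 for i in range(3))

def integrate(m0, h, gamma, lam, dt, steps):
    """Integrate equation A.10 from m0 over steps * dt time units."""
    m = m0
    for _ in range(steps):
        m = rk4_step(m, h, gamma, lam, dt)
    return m

H = (0.0, 0.0, 1.0)  # unit field along z (dimensionless, assumed)
# Undamped case (lambda = 0): the z component stays fixed while M circles H.
undamped = integrate((0.6, 0.0, 0.8), H, gamma=1.0, lam=0.0, dt=0.001, steps=5000)
# Damped case (lambda > 0): M spirals toward H, so the z component grows.
damped = integrate((1.0, 0.0, 0.0), H, gamma=1.0, lam=0.1, dt=0.001, steps=10000)
```

For a unit vector and a unit field along z, the damped case has the closed-form solution $M_z(t) = \tanh(\lambda t)$ when $M_z(0) = 0$, which gives a convenient check on the integrator.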
The damping torque has another form, shown in equation A.11:
$$\vec{T}_D = \alpha \vec{M} \times \frac{\partial \vec{M}}{\partial t} \tag{A.11}$$
where $\alpha > 0$ is the Gilbert constant, which depends on the material. This
torque can be viewed as generated by the magnetic field
$$\vec{H} = -\frac{\alpha}{\gamma} \frac{\partial \vec{M}}{\partial t}$$
since substituting this field into the precession term $-\gamma \vec{M} \times \vec{H}$
gives equation A.11. If this torque $\vec{T}_D$ is added to the undamped precession
equation A.8 instead of the torque described in equation A.9, we get a new form
describing the damped precession:
$$\frac{\partial \vec{M}}{\partial t} = -\gamma \vec{M} \times \vec{H}_{eff} + \alpha \vec{M} \times \frac{\partial \vec{M}}{\partial t} \tag{A.12}$$
Equation A.12 is also known as the Landau-Lifshitz-Gilbert equation. The first term
of this equation describes an undamped precession generated by the interaction of
the magnetization vector $\vec{M}$ and the effective magnetic field $\vec{H}_{eff}$.
The second term describes the damping torque that turns the undamped precession into
the damped precession.
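The Landau-Lifshitz-Gilbert equation is implicit, because the time derivative of the magnetization appears on both sides. A sketch of how it can be rearranged into an explicit form is given below, using only equation A.12 and the identity $\vec{a} \times (\vec{b} \times \vec{c}) = \vec{b}(\vec{a} \cdot \vec{c}) - \vec{c}(\vec{a} \cdot \vec{b})$. The shorthand $\dot{\vec{M}} = \partial \vec{M} / \partial t$ and $M = |\vec{M}|$ is introduced here for brevity and is not notation from the cited references.

```latex
\begin{align*}
\vec{M} \cdot \dot{\vec{M}} &= 0
  && \text{(both right-hand terms of A.12 are perpendicular to } \vec{M}\text{)} \\
\vec{M} \times \dot{\vec{M}}
  &= -\gamma\, \vec{M} \times (\vec{M} \times \vec{H}_{eff})
     + \alpha\, \vec{M} \times (\vec{M} \times \dot{\vec{M}})
  && \text{(} \vec{M} \times \text{ applied to A.12)} \\
  &= -\gamma\, \vec{M} \times (\vec{M} \times \vec{H}_{eff})
     - \alpha M^{2}\, \dot{\vec{M}}
  && \text{(identity with } \vec{M} \cdot \dot{\vec{M}} = 0\text{)} \\
(1 + \alpha^{2} M^{2})\, \dot{\vec{M}}
  &= -\gamma\, \vec{M} \times \vec{H}_{eff}
     - \alpha \gamma\, \vec{M} \times (\vec{M} \times \vec{H}_{eff})
  && \text{(inserted back into A.12)}
\end{align*}
```

Under these assumptions the result has the same structure as the Landau-Lifshitz equation A.10, with $\gamma$ replaced by $\gamma / (1 + \alpha^{2} M^{2})$ and $\lambda = \alpha \gamma / (1 + \alpha^{2} M^{2})$.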
APPENDIX B
SPIN INJECTION WITH LLG EQUATION
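For the ASL device, we need to add a magnetization torque to the original LLG
equation to account for the absorption of incoming spin current [28][18]:
$$\frac{\partial \vec{m}}{\partial t} = -\gamma \mu_0 \vec{m} \times \vec{H}_{eff} + \alpha \vec{m} \times \frac{\partial \vec{m}}{\partial t} + \frac{\vec{I}_\perp}{e N_s} \tag{B.1}$$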
where $\mu_0$ is the free-space permeability, $\vec{m}$ is the unit vector of the
magnetic moment of the ferromagnet, $\vec{H}_{eff}$ is the effective magnetic field
applied to the ferromagnet, $e$ is the charge of an electron, $N_s$ is the total
number of Bohr magnetons per magnet, and $\alpha > 0$ is the material-dependent
Gilbert constant. $\vec{I}_\perp$ is the component of the interface spin current
that is perpendicular to $\vec{m}$, as shown in Fig. B.1 [18]. It is the part of the
current that actually passes through the interface between the magnet (FM) and the
channel (NF), while $\vec{I}_\parallel$ is the part that is reflected by the
interface.
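Thus, the third term on the right-hand side is the spin torque ($\vec{I}_\perp$) in
the NF channel. $\vec{I}_\perp$ can be rewritten as
$$\vec{I}_\perp = \vec{I}_s - \vec{m}(\vec{m} \cdot \vec{I}_s) = \vec{m} \times (\vec{I}_s \times \vec{m}) \tag{B.2}$$
where $\vec{I}_s$ is the spin current.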
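The two expressions for the perpendicular spin current in equation B.2 agree only when $\vec{m}$ is a unit vector. The short check below evaluates both forms for sample vectors; the numerical values and function names are arbitrary choices for illustration.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def perp_by_projection(i_s, m):
    """First form of B.2: remove the component of I_s along m."""
    d = dot(m, i_s)
    return tuple(i_s[k] - m[k] * d for k in range(3))

def perp_by_double_cross(i_s, m):
    """Second form of B.2: m x (I_s x m)."""
    return cross(m, cross(i_s, m))

m_unit = (0.6, 0.0, 0.8)        # unit vector: 0.36 + 0.64 = 1
i_spin = (1.0, 2.0, 3.0)        # arbitrary spin current (illustrative units)
a = perp_by_projection(i_spin, m_unit)
b = perp_by_double_cross(i_spin, m_unit)
```

Both results are perpendicular to `m_unit`, which is why this component exerts a torque on the magnetization rather than changing its magnitude.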
We can infer a coupled spin transport-magnetization dynamics model [28], as
shown in Fig. B.2 [29]. The self-consistency can be described as follows. The angular
momentum of the magnet $\vec{m}$ determines the conductance of the magnet. The
conductance of the magnet determines how many spins enter the magnet. The spin
current entering the magnet influences the angular momentum of the magnet, which
in turn changes the conductance of the magnet. Thus, the spin current entering the
magnet changes as well. These changes continue until every value becomes stable.
This self-consistency occurs at the receiver side as well as the sender side.
Figure B.1: Spin current injection [18]. A nonmagnetic material (red block) is in
contact with a ferromagnetic material (green block). For a normal magnetization
$\vec{m}$, an incoming spin-up current can be written as a sum of right and left
spin states $|\rightarrow\rangle$ and $|\leftarrow\rangle$, each with a factor of
$1/\sqrt{2}$. The conductance $G$ is assumed to allow only the majority spin (right
spin) to pass. Only the perpendicular spin current $I_\perp$ (right spin state) is
allowed to go through the interface between the nonmagnetic and ferromagnetic
materials, while $I_\parallel$ (left spin state) is reflected. The magnetization
torque generated by $\vec{I}_\perp$ changes the original magnetization $\vec{m}$ of
the FM.
Figure B.2: Self-consistency of LLG nanomagnet dynamics and spin transport [29].
It starts with the first pair of coupled spin current ($\vec{I}_{s1}$) and magnetic
moment unit vector ($\vec{m}$), and keeps circulating until both of them settle down.
Figure B.3: ASL inverter with (a) top view and (b) side view [29].
Figure B.4: Circuit simulation model of an ASL inverter [29]. $G_{FM1}(\vec{m}_1)$
and $G_{FM2}(\vec{m}_2)$ represent the conductance of ferromagnet 1 (FM1) and
ferromagnet 2 (FM2), respectively. Gsfpi and Gsepi represent the conductance of the
ground lead and the conducting channel, respectively.
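Fig. B.4 [29] shows the circuit model of an ASL inverter, whose top view and side
view are shown in Fig. B.3. Applying equation B.1 to this particular circuit, we
obtain the LLG equations for the inverter [29]:
$$\frac{\partial \vec{m}_1}{\partial t} = -\gamma \mu_0 \vec{m}_1 \times \vec{H}_{eff} + \alpha \vec{m}_1 \times \frac{\partial \vec{m}_1}{\partial t} + \frac{\vec{I}_{13\perp}}{e N_s} \tag{B.3}$$
$$\frac{\partial \vec{m}_2}{\partial t} = -\gamma \mu_0 \vec{m}_2 \times \vec{H}_{eff} + \alpha \vec{m}_2 \times \frac{\partial \vec{m}_2}{\partial t} + \frac{\vec{I}_{23\perp}}{e N_s} \tag{B.4}$$
where $\vec{I}_{13\perp}$ is the perpendicular current from node 1 to node 3, i.e.
the current that carries the magnetic information from magnet 1, and
$\vec{I}_{23\perp}$ is the current from node 2 to node 3, i.e. the current that
generates the torque that changes the magnetization of the receiving magnet.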
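Equations of the form B.1, B.3, and B.4 are implicit in the time derivative of the magnetization. The sketch below shows one way to evaluate that derivative explicitly, assuming a unit vector $\vec{m}$ and using the rearrangement $(1 + \alpha^2)\,\partial\vec{m}/\partial t = -\gamma\mu_0[\vec{m} \times \vec{H}_{eff} + \alpha\,\vec{m} \times (\vec{m} \times \vec{H}_{eff})] + \vec{\tau} + \alpha\,\vec{m} \times \vec{\tau}$ with $\vec{\tau} = \vec{I}_\perp / (e N_s)$. This rearrangement follows from B.1 by vector algebra and is not quoted from the cited references. All numerical values are dimensionless placeholders, and the function names are introduced only for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def llg_spin_torque_rhs(m, h_eff, i_s, gamma_mu0, alpha, e_ns):
    """Explicit dm/dt for equation B.1, assuming |m| = 1."""
    i_perp = cross(m, cross(i_s, m))          # equation B.2
    tau = tuple(c / e_ns for c in i_perp)     # spin-torque term I_perp / (e N_s)
    mxh = cross(m, h_eff)
    mxmxh = cross(m, mxh)
    mxtau = cross(m, tau)
    scale = 1.0 / (1.0 + alpha * alpha)
    return tuple(scale * (-gamma_mu0 * (mxh[i] + alpha * mxmxh[i])
                          + tau[i] + alpha * mxtau[i]) for i in range(3))

def implicit_residual(m, h_eff, i_s, gamma_mu0, alpha, e_ns):
    """Insert the explicit dm/dt back into B.1; the result should be close to zero."""
    dmdt = llg_spin_torque_rhs(m, h_eff, i_s, gamma_mu0, alpha, e_ns)
    i_perp = cross(m, cross(i_s, m))
    mxh = cross(m, h_eff)
    mxd = cross(m, dmdt)
    return tuple(dmdt[i] + gamma_mu0 * mxh[i] - alpha * mxd[i] - i_perp[i] / e_ns
                 for i in range(3))

# Dimensionless placeholder values (assumptions, not device parameters).
m = (0.6, 0.0, 0.8)
h_eff = (0.3, -0.2, 1.0)
i_s = (0.5, 0.1, -0.4)
dmdt = llg_spin_torque_rhs(m, h_eff, i_s, gamma_mu0=1.7, alpha=0.05, e_ns=2.0)
residual = implicit_residual(m, h_eff, i_s, gamma_mu0=1.7, alpha=0.05, e_ns=2.0)
```

In a self-consistent simulation of the kind described in this appendix, a function like this would be evaluated for each magnet at every time step, with the spin current recomputed from the updated magnetization before the next step.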
BIBLIOGRAPHY
[1] G. E. Moore, “Cramming more components onto integrated circuits,” Electronics,
vol. 38, no. 8, pp. 114–117, 1965.
[2] L. Wilson, “International technology roadmap for semiconductors (ITRS),”
Semiconductor Industry Association, 2013.
[3] V. Calayir, D. E. Nikonov, S. Manipatruni, and I. A. Young, “Static and clocked
spintronic circuit design and simulation with performance analysis relative to
CMOS,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 61,
no. 2, pp. 393–406, 2014.
[4] B. Behin-Aein, D. Datta, S. Salahuddin, and S. Datta, “Proposal for an all-spin
logic device with built-in memory,” Nature nanotechnology, vol. 5, no. 4, p. 266,
2010.
[5] S. F. Tuan and J. Sakurai, Modern quantum mechanics. Pearson Education
Asia, 2017.
[6] M. Johnson, “Optimized device characteristics of lateral spin valves,” IEEE
Transactions on Electron Devices, vol. 54, no. 5, pp. 1024–1031, 2007.
[7] D. Wan, M. Manfrini, A. Vaysset, L. Souriau, L. Wouters, A. Thiam, E. Ray-
menants, S. Sayan, J. Jussot, J. Swerts, and S. Couet, “Fabrication of magnetic
tunnel junctions connected through a continuous free layer to enable spin logic
devices,” Japanese Journal of Applied Physics, vol. 57, no. 4S, p. 04FN01, 2018.
[8] D. Morris, D. Bromberg, J.-G. J. Zhu, and L. Pileggi, “mLogic: Ultra-low
voltage non-volatile logic circuits using STT-MTJ devices,” in Proceedings of the
49th Annual Design Automation Conference. ACM, 2012, pp. 486–491.
[9] A. Wang and A. Chandrakasan, “A 180-mV subthreshold FFT processor using
a minimum energy design methodology,” IEEE Journal of Solid-State Circuits,
vol. 40, no. 1, pp. 310–319, 2005.
[10] M. Alawein and H. Fariborzi, “Improved circuit model for all-spin logic,” in
Nanoscale Architectures (NANOARCH), 2016 IEEE/ACM International Sym-
posium on. IEEE, 2016, pp. 135–140.
[11] I. Ercan and N. G. Anderson, “Heat dissipation in nanocomputing: lower
bounds from physical information theory,” IEEE Transactions on Nanotechnol-
ogy, vol. 12, no. 6, pp. 1047–1060, 2013.
[12] R. Landauer, “Irreversibility and heat generation in the computing process,”
IBM journal of research and development, vol. 5, no. 3, pp. 183–191, 1961.
[13] M. Sharad, K. Yogendra, K.-W. Kwon, and K. Roy, “Design of ultra high density
and low power computational blocks using nano-magnets,” in Quality Electronic
Design (ISQED), 2013 14th International Symposium on. IEEE, 2013, pp.
223–230.
[14] C. Augustine, G. Panagopoulos, B. Behin-Aein, S. Srinivasan, A. Sarkar, and
K. Roy, “Low-power functionality enhanced computation architecture using spin-
based devices,” in Nanoscale Architectures (NANOARCH), 2011 IEEE/ACM
International Symposium on. IEEE, 2011, pp. 129–136.
[15] B. Dlubak, M.-B. Martin, C. Deranlot, B. Servet, S. Xavier, R. Mattana,
M. Sprinkle, C. Berger, W. A. De Heer, Petroff, and A. Anane, “Highly effi-
cient spin transport in epitaxial graphene on SiC,” Nature Physics, vol. 8, no. 7,
p. 557, 2012.
[16] S. Manipatruni, D. E. Nikonov, and I. A. Young, “Material targets for scaling
all-spin logic,” Physical Review Applied, vol. 5, no. 1, p. 014002, 2016.
[17] P. Bonhomme, S. Manipatruni, R. M. Iraei, S. Rakheja, S.-C. Chang, D. E.
Nikonov, I. A. Young, and A. Naeemi, “Circuit simulation of magnetization
dynamics and spin transport,” IEEE Transactions on Electron Devices, vol. 61,
no. 5, pp. 1553–1560, 2014.
[18] A. Brataas, G. E. Bauer, and P. J. Kelly, “Non-collinear magnetoelectronics,”
Physics Reports, vol. 427, no. 4, pp. 157–255, 2006.
[19] S.-F. Lee, W. Pratt Jr, Q. Yang, P. Holody, R. Loloee, P. Schroeder, and J. Bass,
“Two-channel analysis of CPP-MR data for Ag/Co and AgSn/Co multilayers,”
Journal of Magnetism and Magnetic Materials, vol. 118, no. 1-2, pp. L1–L5, 1993.
[20] J. Fabian and S. D. Sarma, “Spin relaxation of conduction electrons in polyva-
lent metals: Theory and a realistic calculation,” Physical review letters, vol. 81,
no. 25, p. 5624, 1998.
[21] J. Fabian, A. Matos-Abiague, C. Ertler, P. Stano, and I. Žutić, “Semiconductor
spintronics,” Acta Physica Slovaca. Reviews and Tutorials, vol. 57, no. 4-5, pp.
565–907, 2007.
[22] C. H. Bennett, “Demons, engines and the second law,” Scientific American, vol.
257, no. 5, pp. 108–117, 1987.
[23] Q. An, S. Le Beux, I. O’Connor, J. O. Klein, and W. Zhao, “Arithmetic logic
unit based on all-spin logic devices,” in New Circuits and Systems Conference
(NEWCAS), 2017 15th IEEE International. IEEE, 2017, pp. 317–320.
[24] Q. An, L. Su, J.-O. Klein, S. Le Beux, I. O’Connor, and W. Zhao, “Full-
adder circuit design based on all-spin logic device,” in Nanoscale Architectures
(NANOARCH), 2015 IEEE/ACM International Symposium on. IEEE, 2015,
pp. 163–168.
[25] L. Landau and E. Lifshitz, “On the theory of the dispersion of magnetic per-
meability in ferromagnetic bodies,” Phys. Z. Sowjetunion, vol. 8, no. 153, pp.
101–114, 1935.
[26] W. F. Brown, Micromagnetics. Interscience Publishers, 1963, no. 18.
[27] D. Wei, Micromagnetics and recording materials. Springer Science & Business
Media, 2012.
[28] S. Salahuddin and S. Datta, “Self-consistent simulation of quantum transport
and magnetization dynamics in spin-torque based devices,” Applied physics let-
ters, vol. 89, no. 15, p. 153504, 2006.
[29] S. Manipatruni, D. E. Nikonov, and I. A. Young, “Modeling and design of spin-
tronic integrated circuits,” IEEE Transactions on Circuits and Systems I: Reg-
ular Papers, vol. 59, no. 12, pp. 2801–2814, 2012.
