Stochastic p-bits for Invertible Logic by Camsari, Kerem Yunus et al.
Stochastic p-bits for Invertible Logic
Kerem Yunus Camsari,1, ∗ Rafatul Faria,1 Brian M. Sutton,1 and Supriyo Datta1, †
1School of Electrical and Computer Engineering, Purdue University, IN, 47907
(Dated: July 24, 2017)
Conventional semiconductor-based logic and nanomagnet-based memory devices are built out of
stable, deterministic units such as standard MOS (metal oxide semiconductor) transistors, or nano-
magnets with energy barriers in excess of ≈ 40-60 kT. In this paper we show that unstable, stochastic
units which we call “p-bits” can be interconnected to create robust correlations that implement pre-
cise Boolean functions with impressive accuracy, comparable to standard digital circuits. At the
same time they are invertible, a unique property that is absent in standard digital circuits. When
operated in the direct mode, the input is clamped, and the network provides the correct output.
In the inverted mode, the output is clamped, and the network fluctuates among all possible inputs
that are consistent with that output. First, we present a detailed implementation of an invertible
gate to bring out the key role of a single three-terminal transistor-like building block to enable the
construction of correlated p-bit networks. The results for this specific, CMOS-assisted nanomagnet-
based hardware implementation agree well with those from a universal model for p-bits, showing
that p-bits need not be magnet-based: any three-terminal tunable random bit generator should be
suitable. We present a general algorithm for designing a Boltzmann machine (BM) with a sym-
metric connection matrix [J] (Jij = Jji), that implements a given truth table with p-bits. The [J]
matrices are relatively sparse with a few unique weights for convenient hardware implementation.
We then show how BM Full Adders can be interconnected in a partially directed manner (Jij ≠ Jji)
to implement large logic operations such as 32-bit binary addition. Hundreds of stochastic p-bits
get precisely correlated such that the correct answer out of 2^33 (≈ 8 billion) possibilities can be
extracted by looking at the statistical mode or majority vote of a number of time samples. With
perfect directivity (Jji=0) a small number of samples is enough, while for less directed connections
more samples are needed, but even in the former case logical invertibility is largely preserved. This
combination of digital accuracy and logical invertibility is enabled by the hybrid design that uses
bidirectional BM units to construct circuits with partially directed inter-unit connections. We es-
tablish this key result with extensive examples including a 4-bit multiplier which in inverted mode
functions as a factorizer.
I. INTRODUCTION
Conventional semiconductor-based logic and
nanomagnet-based memory devices are built out of
stable, deterministic units such as standard MOS (metal
oxide semiconductor) transistors, or nanomagnets with
energy barriers in excess of ≈ 40-60 kT. The objective
of this paper is to introduce the concept of what we call
“p-bits” representing unstable, stochastic units which
can be interconnected to create robust correlations that
implement precise Boolean functions with impressive
accuracy comparable to standard digital circuits. At
the same time this “probabilistic spin logic” (PSL) is
invertible, a unique property that is absent in standard
digital circuits. When operated in the direct mode, the
input is clamped, and the network provides the correct
output. In the inverted mode, the output is clamped,
and the network fluctuates among all possible inputs
that are consistent with that output.
∗ kcamsari@purdue.edu
† datta@purdue.edu
Any random signal generator whose randomness can be
tuned with a third terminal should be a suitable building
block for PSL. The icon in Fig. 1b represents our generic
building block whose input Ii controls the output mi according
to the equation (Fig. 1a),
mi(t) = sgn{rand(−1, 1) + tanh(Ii(t))} (1)
where rand(−1,+1) represents a random number uni-
formly distributed between −1 and +1. It is assumed
to change every τ seconds which represents the retention
time of individual p-bits. We normalize the time axis to
τ so that t is dimensionless and progresses in steps (0,
1, 2, . . .). At each time step, if the input is zero, the
output takes on a value of −1 or +1 with equal probabil-
ity, as shown in the middle panel of Fig. 1d. A negative
input Ii makes negative values more likely (left panel)
while a positive input makes positive values more likely
(right panel). Fig. 1c shows mi(t) as the input is ramped
from negative to positive values. Also shown is the time-
averaged value of mi which equals tanh(Ii).
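The behavior described by Eq. (1) can be checked numerically. The following is a minimal sketch, assuming an idealized p-bit updated once per normalized time step; the function names `pbit` and `time_average` are illustrative and do not come from the paper.

```python
import math
import random

def pbit(I, rng):
    """One update of Eq. (1): sgn(rand(-1, 1) + tanh(I)), returned as -1 or +1."""
    return 1 if rng.uniform(-1.0, 1.0) + math.tanh(I) > 0 else -1

def time_average(I, steps, rng):
    """Average the p-bit output over `steps` normalized time steps at fixed input I."""
    return sum(pbit(I, rng) for _ in range(steps)) / steps

rng = random.Random(0)  # fixed seed so the run is repeatable
for I in (-2.0, -0.5, 0.0, 0.5, 2.0):
    avg = time_average(I, 20000, rng)
    print(f"I = {I:+.1f}  <m> = {avg:+.3f}  tanh(I) = {math.tanh(I):+.3f}")
```

Since rand(−1, +1) exceeds −tanh(Ii) with probability (1 + tanh(Ii))/2, the printed averages should track tanh(Ii) up to sampling noise.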
A possible physical implementation of p-bits could
use stochastic nanomagnets with low energy barriers ∆
whose retention time [1]:
τ = τ0 exp (∆/kT )
is very small, on the order of τ0 which is a material de-
pendent quantity called the attempt time and is experi-
mentally found to be ≈ 10 ps− 1 ns [1] among different
magnetic materials. Such stochastic nanomagnets can be
pinned to a given direction with spin currents that are
at least an order of magnitude less than those needed
to switch 40 kT magnets. The sigmoidal tuning curve in
Fig. 1c describing the time average of a fluctuating signal
represents the essence of a p-bit. Purely CMOS implementations
of a p-bit are possible [2, 3], but the sigmoid
seems like a natural feature of nanomagnets driven by
spin currents. Indeed, the use of stochastic nanomagnets
in the context of random number generators, stochastic
oscillators and autonomous learning [4–6] has been
discussed in the literature. But performing “invertible”
Boolean logic utilizing large scale correlations has not
been discussed before to our knowledge.
arXiv:1610.00377v4 [cond-mat.mes-hall] 21 Jul 2017
FIG. 1. Generic building block for PSL: (a) A generic model for PSL described by Eq. (1) with distinct READ and
WRITE units represented by the R/W icon shown in (b). Useful functionalities are obtained by interconnecting R/W units
according to Eq. (2), Ii = I0 × (hi + ∑j Jij mj), with appropriately designed {h} and [J]. (c) The blue trace shows the
“magnetization” (mi) obtained from Eq. (1) as the current (Ii) is ramped. The red trace shows the sigmoid response obtained
from an RC circuit which provides a moving average of the time-dependent “magnetization” which agrees very well with the
black curve showing tanh(Ii). The bias terminal could involve a voltage (V) instead of a current (I), just as the output could
involve quantities other than magnetization. (d) The idealized telegraphic behavior of the model is shown at various bias points
along with corresponding distributions.
Note that we are using the term invertibility in the
broader sense of relation inverses and not in the narrower
sense of function inverses. For example, AND, when
interpreted as a relation, consists of the set {{1, 1 →
1}, {0, 0→ 0}, {1, 0→ 0}, {0, 1→ 0}} where each term is
of the form {A,B → AND(A,B)}. The relation inverse
of 0 is the set {{0, 0}, {0, 1}, {1, 0}} even though the cor-
responding functional inverse is not defined. What our
scheme provides, probabilistically, is the relation inverse
[7, 8].
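The distinction between a relation inverse and a function inverse can be made concrete by brute-force enumeration. This is a small illustrative sketch (the helper name `relation_inverse` is hypothetical); it lists the inputs consistent with a given output deterministically, whereas the scheme in this paper visits them probabilistically.

```python
from itertools import product

def relation_inverse(gate, output, n_inputs=2):
    """Return every input tuple that the gate maps to `output`."""
    return [bits for bits in product((0, 1), repeat=n_inputs)
            if gate(*bits) == output]

def AND(a, b):
    return a & b

print(relation_inverse(AND, 0))  # [(0, 0), (0, 1), (1, 0)]
print(relation_inverse(AND, 1))  # [(1, 1)]
```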
Ensemble-average versus time-average: A sigmoidal
response was presented in [9] for the ensemble-averaged
magnetization of large barrier magnets biased along a
neutral state. This was proposed as a building block for
both Ising computers as well as directed belief networks
and a recent paper [10] describes a similar approach ap-
plied to a graph coloring problem. By contrast low bar-
rier nanomagnets provide a sigmoidal response for the
time-averaged magnetization and a suitably engineered
network of such nanomagnets could cycle through the
2^N collective states at GHz rates, with an emphasis on
the “low energy states” which can encode the solution to
combinatorial optimization problems, like the traveling
salesman problem (TSP) as shown in [11]. Once
the time-varying magnetization has been converted into
a time-varying voltage through a READ circuit, a simple
RC circuit can be used to extract the answer through a
moving time average. For example, in Fig. 1c the red
trace was obtained from the rapidly varying blue trace
using an RC circuit in a SPICE simulation.
FIG. 2. PSL designs discussed in this paper: (a) Basic Boolean elements (AND/OR, Full Adder) are implemented
as Boltzmann Machines based on symmetrically coupled networks with Jij = Jji. (b) Complex Boolean functions like a 32-bit
Ripple Carry Adder/Subtractor and 4-bit Multiplier/Factorizer are implemented by combining the reciprocal Boltzmann
machines in a directed fashion.
The central feature underlying both implementations
is the p-bit that acts like a tunable random number gen-
erator, providing an intrinsic sigmoidal response for the
ensemble-averaged or the time-averaged magnetization
as a function of the spin current. It is this response that
allows us to correlate the fluctuations of different p-bits
in a useful manner by interconnecting them according to
$$I_i(t) = I_0 \left( h_i(t) + \sum_j J_{ij} m_j(t) \right)$$ (2)
where hi provides a local bias to magnet i, Jij defines
the effect of bit j on bit i, and I0 sets a global scale for
the strength of the interactions like an inverse “pseudo-temperature”,
giving a dimensionless current Ii to each p-bit.
The computation of Ii(t) in terms of mj(t) in Eq. (2)
is assumed instantaneous; in hardware implementations
there can be interconnect delays that relate mj(t) to currents
at a later time, Ii(t′).
Equation (1) arises naturally from the physics of low
barrier nanomagnets as we have discussed above. Equation (2)
represents the “weight logic” for which there are
many candidates such as memristors [12], floating-gate
based devices [13], domain-wall based devices [14], and standard
CMOS [15]. The suitability of these options will
depend on the range of J values and the sparsity of the
J-matrix.
Equations (1-2) are essentially the same as the defining
equations for Boltzmann machines introduced by Hinton
and his collaborators [16] which have had enormous im-
pact in the field of machine learning, but they are usually
implemented in software that is run on standard CMOS
hardware. The primary contributions of this paper are
threefold:
• Hardware implementation: It may seem “obvious”
that an unstable magnet could provide a natural
hardware for representing a p-bit, but we would
like to stress a less obvious point. To the best of
our knowledge, simple two-terminal devices are not
suitable for constructing large scale correlated net-
works of the type envisioned here. Instead, we need
three-terminal building blocks with transistor-like
gain and input-output isolation as shown in Fig. 1b
[9]. To stress this point, we describe a concrete im-
plementation of a Boolean function using detailed
nanomagnet and transport simulations that are in
good agreement with those obtained by the generic
model based on Eq. (1). All other results in this
paper are based on Eq. (1) in order to emphasize
the generality of the concept of p-bits which need
not necessarily be nanomagnet-based [17, 18].
• Boltzmann machines (BM) for invertible Boolean
logic (Fig.2a): Much of the current emphasis on
BMs is on “learning” giving rise to the concept
of restricted Boltzmann machines [19]. By con-
trast this paper is about Boolean logic, extend-
ing an established method for Hopfield networks
[20] to provide a mathematical prescription to turn
any Boolean truth table into a symmetric J-matrix
(Eq. (2), with Jij = Jji), in one shot with no
“learning” being involved. This design principle
seems quite robust, functioning satisfactorily even
when the J-matrix elements are rounded off, so that
the required interconnections are relatively sparse
and quantized which simplifies the hardware imple-
mentation. The numerical probabilities agree well
with those predicted from the energy functional
$$E(\{m\}) = -I_0 \left( \sum_{i,j} \frac{1}{2} J_{ij} m_i m_j + \sum_i h_i m_i \right)$$ (3)
using the Boltzmann law:
$$P(\{m\}) = \frac{\exp(-E)}{\sum_{i,j} \exp(-E)}$$ (4)
Most importantly we show that the resulting
Boolean gates are invertible: not only do they pro-
vide the correct output for a given input, for a
given output they provide the correct input(s). If
the given output is consistent with multiple in-
puts, the system fluctuates among all possible an-
swers. This remarkable property of invertibility is
absent in standard digital circuits and could help
provide solutions to the Boolean satisfiability prob-
lem (Fig. 8) [21].
• Directed networks of BM (Fig.2b): Finally we show
that individual BM’s can be connected to perform
precise arithmetic operations which are the norm
in standard digital logic, but quite surprising for
BM which are more like a collection of interacting
particles than like a digital circuit. We show that a
32-bit adder converges to the one correct sum out
of 2^33 ≈ 8 billion possibilities when the interaction
parameter is suddenly turned up from say I0 = 0.25
to I0 = 5. This can be likened to quenching a
molten liquid and getting a perfect crystal. What
we expect is plenty of defects, distributed differently
every time we do the experiment. That is exactly
what we get if the individual BM Full Adders
comprising the 32-bit adder are connected bidirectionally
(Jij = Jji). But by making the connection
between Adders directed (Jij ≠ Jji), we obtain the
striking accuracy of digital circuits while largely retaining
the invertibility of BM. This is a key result
that we establish with extensive examples including
a 4-bit multiplier which in inverted mode functions
as a factorizer.
Each of these three contributions is described in detail in
the three sections that follow.
II. AN EXAMPLE HARDWARE
IMPLEMENTATION OF PSL
To ensure that individual p-bits can be interconnected
to produce robust correlations, it is important to have
separate terminals for writing (more correctly biasing)
and reading, marked W and R respectively in Fig. 3a.
With IMA nanomagnets (e.g., circular nanomagnets) this
could be accomplished following existing experiments
[24, 25] using the giant spin Hall effect (GSHE). Recent
experiments using a built-in exchange bias [26–29] could
make this approach applicable to PMA as well. Note
however, that these experiments have all been performed
with stable free layers, and would have to be carried out
with low barrier magnets in order to establish their suitability
for the implementation of p-bits. As the field
progresses, one can expect the bias terminal to involve
voltage control [30, 31] instead of current control, just as
the output could involve quantities other than magnetization.
We will now show a concrete implementation of a
Boolean function using minimal CMOS circuitry in conjunction
with stochastic nanomagnets through detailed
nanomagnet and transport simulations that are in good
agreement with those obtained from the generic model
based on Eq. (1).
FIG. 3. CMOS-assisted implementation of p-bits: (a)
A possible CMOS-assisted implementation of p-bits that have
separate READ/WRITE paths. A GSHE layer provides a
spin current that pins the magnetization of circular magnets
(∆ ≈ 0 kT). The change in magnetization is sensed by an
MTJ and amplified by two CMOS inverters that act as a
buffer, providing the necessary isolation and gain. (b) Self-consistent,
modular modeling of transport and magnetization
dynamics. See “Assumptions of the model” in the text. (c)
Equivalent READ circuit. (d) SPICE-based average output
voltage normalized to the VDD = 0.8 V of 14 nm FinFET
HP-inverters [22]. (e) sLLG-based average magnetization of
the circular magnet as a function of the spin current (averaged
over 500 ns for each bias point with a time step of
∆t = 0.05 ps, 10 million points per marker), normalized to
the GSHE gain and the thermal noise strength, I_s^th. (f) The
time-dependent output voltage at various bias points.
FIG. 4. An invertible AND gate: (a) Passive resistor
network that is used to obtain the connection terms Jij to
correlate p-bits. The output impedance Rij = 1/Gij is much
smaller than the input impedance RGSHE, allowing separate
voltages to add at the input of the ith p-bit. (b) Explicit
implementation of an AND gate based on Eq. (10). (c) When
C is clamped to 1, A and B spend most of their time in the
(11) state, the only combination consistent with C=1. (d)
The invertible operation of the AND gate when C
is clamped to 0, while A and B are left floating. The A and
B bits fluctuate between the 3 possible combinations consistent
with C=0, (A,B)=(00),(01),(10). The time responses of the A,B,C
voltages are normalized by VDD. The histogram is obtained by
averaging over 200 ns of thresholded voltages; only the first
20 ns of A,B,C voltages are shown for clarity.
Fig. 3a shows a possible, CMOS-assisted p-bit that has
a separate READ and WRITE path. The device con-
sists of a heavy metal exhibiting Giant Spin Hall Effect
(GSHE) that drives a circular magnet which replaces the
usual elliptical magnets in order to provide the stochasticity
needed for the magnetization. A small read current
flowing through the fixed layer, which is assumed not to
disturb the magnetization of the free layer in our design,
is used to sense the instantaneous magnetization,
which is amplified and isolated by two inverters
that act as a buffer. This structure is very similar to
the experimentally demonstrated GSHE switching of elliptical
magnets that were similarly read out by an MTJ
[24], with the only exception that the elliptical magnets
are replaced by circular magnets with an aspect ratio of
one. This device could be viewed as replacing the free
layers of the GSHE-driven MTJs demonstrated in [24]
with those in the telegraphic regime [25, 32–34].
In the presence of thermal noise the magnetization of
such a circular magnet rotates in the plane of the circle
without a preferred easy-axis that would have arisen
due to the shape anisotropy, effectively making its thermal
stability ∆ ≈ 0 kT [35]. This magnetization can be
pinned by a spin current that is generated by flowing a
charge current through the GSHE layer. The magnetic
field driven sigmoidal responses of magnetization for such
circular magnets have experimentally been demonstrated
[36, 37], while the spin current driven pinning has not
been demonstrated to our knowledge. Using validated
modules for transport and magnetization dynamics [38]
(Fig. 3b), we solve the stochastic Landau-Lifshitz-Gilbert
(sLLG) equation in the presence of thermal noise and a
GSHE current. The following subsection shows detailed
simulation parameters.
Sigmoidal response: A long-time average (t = 500 ns)
of the magnetization 〈mz〉 as a function of a GSHE-
generated spin current is plotted in Fig. 3e that displays
the desired sigmoidal characteristic for p-bits dictated by
Eq. (1). The x-axis of Fig. 3e is normalized to the geo-
metric gain factor that relates the charge current to the
spin current exerted [39, 40]:
$$\beta \equiv \frac{I_s}{I_c} = \theta_{SH} \frac{L_{FM}}{t} \left( 1 - \mathrm{sech}\left( \frac{t}{\lambda} \right) \right)$$ (5)
where θSH is the Hall angle, t is the thickness and λ is the
spin-relaxation length of the heavy metal. The quantity
β can be made to be much greater than 1 providing an
intrinsic gain [41], however for the parameters used in the
present examples, β is ≈ 1.5.
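As a quick numerical check of Eq. (5), the sketch below evaluates β with the heavy-metal values listed in Table I (θSH = 0.5, t = 3.15 nm, λ = 2.1 nm). It assumes that LFM equals the 15 nm spin Hall length of Table I, which the text does not state explicitly; the function name `gshe_gain` is illustrative.

```python
import math

def gshe_gain(theta_sh, length, thickness, lam):
    """Eq. (5): beta = theta_SH * (L_FM / t) * (1 - sech(t / lambda))."""
    sech = 1.0 / math.cosh(thickness / lam)
    return theta_sh * (length / thickness) * (1.0 - sech)

# Assumption: L_FM is taken to be the 15 nm length listed in Table I.
beta = gshe_gain(theta_sh=0.5, length=15e-9, thickness=3.15e-9, lam=2.1e-9)
print(f"beta = {beta:.2f}")
```

Under these assumptions the gain comes out slightly below the ≈ 1.5 quoted in the text, which is consistent with that figure being a rounded estimate.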
Another quantity that is used to normalize the x-axis
of Fig. 3e is the “thermal spin current” that corresponds
to the strength of the thermal noise that needs to be
overcome for a circular magnet to be pinned in a given
direction:
$$I_s^{th} = \left( \frac{4q}{\hbar} \right) \alpha \, (kT)$$ (6)
where q is the electron charge and α is the damping coefficient
of the magnet. I_s^th, Is and Ic all have units of charge
current, therefore we can define the dimensionless interaction
parameter I0 of Eq. (2) as I0 ≡ βIc/I_s^th = Is/I_s^th.
It can be seen from Fig. 3e that when the applied spin
current βIc/I_s^th = Is/I_s^th ≈ 10, the magnetization of the
circular magnet is pinned in the ±z directions for these
particular parameters. For PMA magnets with low barriers
(∆ ≪ kT), the pinning current is independent of the
volume as long as increasing the volume does not invalidate
the ∆ ≪ kT assumption. This can be analytically
shown from a 1D Fokker-Planck equation [42], and we
have reproduced this behavior directly from sLLG simulations.
For the in-plane (circular) magnets considered
here, the pinning current in general depends on Ms and
the volume (Vol.), and the dimensionless pinning current can
be larger.
Nevertheless, it is possible to estimate the thermal spin
current: for typical damping coefficients of α = 0.01−0.1,
I_s^th is ≈ 0.25 µA − 2.5 µA. Pinning currents for superparamagnets
are at least an order of magnitude smaller
than the critical switching currents of stable magnets
[43]. I_s^th, defined by Eq. (6), also sets the scale for I0
defined in Eq. (2) suggesting that a stochastic nanomag-
net based implementation of PSL could be more energy
efficient than the standard spin-torque switching of stable
magnets that suffer from high current densities.
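The quoted range for the thermal spin current follows directly from Eq. (6). The sketch below evaluates it at T = 300 K (the temperature in Table I) using standard SI values for the physical constants; the function name `thermal_spin_current` is illustrative.

```python
Q = 1.602176634e-19      # electron charge (C)
HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)

def thermal_spin_current(alpha, temperature=300.0):
    """Eq. (6): I_s^th = (4 q / hbar) * alpha * k T, in amperes."""
    return (4.0 * Q / HBAR) * alpha * K_B * temperature

for alpha in (0.01, 0.1):
    current_uA = thermal_spin_current(alpha) * 1e6
    print(f"alpha = {alpha}: I_s^th = {current_uA:.2f} uA")
```

For α = 0.01 and α = 0.1 this gives roughly 0.25 µA and 2.5 µA, matching the range stated above.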
Need for three-terminal devices with READ-WRITE
separation: Note that a crucial function of the READ
circuit and the CMOS transistors in this design is the
ability to turn the magnetization into an output voltage
that is proportional to mz, providing gain for fan-out and
isolation to avoid any read disturb. Indeed, a critical re-
quirement for any other alternative implementations of
p-bits is the need for three terminal devices with sepa-
rate READ and WRITE paths to provide gain and isola-
tion. In this particular design these features come in by
directly integrating CMOS transistors, but CMOS-free,
all-magnetic designs with these characteristics have been
proposed [41, 44]. Our purpose is to simply show how
a p-bit can be realized by using experimentally demon-
strated technology. Alternative designs are beyond the
scope of this paper.
READ Circuit: For the output to provide symmetric
voltage swings on the GSHE layer, the minus supply V−
needs to be set to VDD/2 since VOUT ranges between 0
and VDD. V+ is set to VDD/2 + VR where VR is a small
READ voltage that is amplified by the inverters. We
assume a simple, bias-independent MTJ model [45]:
$$G_{MTJ} = G_0 (1 + P^2 m_z)$$ (7)
where P is the interface polarization and G0 is the average
MTJ conductance. Setting the reference resistance
R0 (Fig. 3c) equal to G0^−1, the input voltage to the
inverters, VM, becomes:
$$V_M = \frac{V_{DD}}{2} + \frac{V_R}{2 + m_z P^2}$$ (8)
In the absence of a bias 〈mz〉 becomes 0 and the middle
voltage fluctuates around the mean 〈VM 〉 = VDD/2 +
VR/2. This requires the inverter characteristic to be
shifted to this value to produce a telegraphic output that
fluctuates between 0 and VDD with equal probability
(Fig. 3f). This shift is easily engineered by sizing the
pFET and nFET transistors differently: a wider pFET
shifts the inverter characteristic towards VDD, as we will
show in the next subsection.
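To see how small the swing at the inverter input is, Eq. (8) can be evaluated at the extreme and neutral magnetizations. This is a minimal sketch using the Table I values VDD = 0.8 V, VR = 0.5 V and P = 0.5; the function name `middle_voltage` is illustrative.

```python
def middle_voltage(mz, vdd=0.8, vr=0.5, p=0.5):
    """Eq. (8): V_M = V_DD/2 + V_R / (2 + mz * P^2), with R0 = 1/G0."""
    return vdd / 2.0 + vr / (2.0 + mz * p ** 2)

for mz in (-1.0, 0.0, 1.0):
    print(f"mz = {mz:+.0f}: V_M = {middle_voltage(mz):.4f} V")
```

With these values VM stays within a few tens of millivolts of VDD/2 + VR/2 = 0.65 V, which illustrates why the inverter switching point has to be shifted to that value and why the buffer gain is needed.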
Interconnection matrix: A passive resistor network can
be used as a possible interconnection scheme to correlate
the p-bits as shown in Fig. 4. A proper design of the
interconnection matrix J that has only a few discrete val-
ues ensures a minimal number of different conductances
(Gij). In this demonstrated example the AND gate re-
quires only 2 unique, discrete conductance values.
The spin currents that need to be delivered to each
p-bit are on the order of a few µA and can be generated
with charge currents that are even smaller, due to the
GSHE gain. This means the interconnection resistances
Rij could be on the order of 100 kΩ since the voltage
drops across these resistances are around VOUT − V− ≈
±0.5 V. Since the GSHE ground V− = VDD/2 simply
shifts all the voltages to get symmetric ± swings, we
define the voltages (V′OUT)i = (VOUT)i − V−. Then the input
currents to each p-bit can be expressed as (Fig. 4a):
$$(I_{IN})_i = \sum_j G_{ij} (V'_{OUT})_j + G_i (V'_{BIAS})$$ (9)
assuming ∑j Gij ≪ GGSHE since the heavy metal resistances
are typically much less than hundreds of kΩ. We
have verified the validity of Eq. (9) by SPICE simula-
tions, for the parameters chosen for these examples.
As a result, we observe that Eq. (9) constitutes a hard-
ware mapping for the interconnections of Eq. (2). In this
scheme Gij conductances are initially adjusted to obtain
a global interaction strength I0 for a given problem. Al-
ternatively, the interaction strength can be adjusted elec-
trically by varying the supply voltages.
Invertible AND Gate: Fig. 4b shows an explicit im-
plementation of an invertible AND gate (A ∩ B = C)
corresponding to [J] and {h} matrices [46] that have 3
unique, integer entries:
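$$J = \begin{array}{c|ccc} & A & B & C \\ \hline A & 0 & -1 & +2 \\ B & -1 & 0 & +2 \\ C & +2 & +2 & 0 \end{array} \qquad h^T = \begin{bmatrix} +1 & +1 & -2 \end{bmatrix}$$ (10)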
In Fig. 4d, we show the inverse operation of the AND
gate where we clamp the output bit C to a 0 or 1 by the
bias voltage attached to its input terminal. The interconnection
resistance is chosen to be R0 = 125 kΩ that
roughly provides ≈ ±6 µA of charge current to each p-bit,
corresponding to an I0 ≈ 3.5 for the chosen parameters.
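The [J] and {h} of Eq. (10) can be checked against the Boltzmann law of Eqs. (3) and (4) by enumerating all eight states. The sketch below does this for an illustrative I0 = 2 (not the value used in the hardware simulation), normalizing over all 2^3 bipolar configurations; the helper names are hypothetical.

```python
import math
from itertools import product

J = [[0, -1, 2],
     [-1, 0, 2],
     [2, 2, 0]]          # Eq. (10), ordered (A, B, C)
h = [1, 1, -2]

def energy(m, I0):
    """Eq. (3): E = -I0 * (sum_ij J_ij m_i m_j / 2 + sum_i h_i m_i)."""
    pair = sum(J[i][j] * m[i] * m[j] for i in range(3) for j in range(3)) / 2.0
    bias = sum(h[i] * m[i] for i in range(3))
    return -I0 * (pair + bias)

def boltzmann(I0):
    """Eq. (4), normalized over all 2^3 bipolar states."""
    states = list(product((-1, 1), repeat=3))
    weights = [math.exp(-energy(m, I0)) for m in states]
    z = sum(weights)
    return {m: w / z for m, w in zip(states, weights)}

probs = boltzmann(I0=2.0)
for m, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    a, b, c = ((x + 1) // 2 for x in m)   # bipolar -> binary
    print(f"A={a} B={b} C={c}  P={p:.4f}")
```

The four lines of the AND truth table share the lowest energy and therefore carry nearly all of the probability in equal parts, while the four inconsistent states are exponentially suppressed.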
Generating the histogram: At the end of the simulation
(t=200 ns), we threshold the voltage output of A,B and C
by legislating all voltages above VDD/2 = 0.4 V to be 1,
and below VDD/2 to be 0. Then a histogram output for
the thresholded word [ABC] is obtained and normalized
to unit probability. Clamping the output to 0 and letting
A and B float, make A and B fluctuate in a correlated
manner and they visit the three possible states (00, 01,
10) with approximately equal probability. Resolving the
output 0 to the three possible input combinations is, in a
way “factorizing” the output. Conversely, clamping the
output to 1 produces a strong (11) peak in the histogram
of [ABC], which is the only consistent input combination
for C=1 (Fig. 4c-d).
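The inverted operation can also be reproduced with the generic model alone. The following is a rough sketch, not the SPICE/sLLG simulation described above: it applies Eqs. (1) and (2) to the AND gate of Eq. (10) with C clamped, updating A and B one after the other at each time step (the sequential update order and I0 = 2 are assumptions made here for illustration).

```python
import math
import random
from collections import Counter

J = [[0, -1, 2], [-1, 0, 2], [2, 2, 0]]   # Eq. (10), ordered (A, B, C)
h = [1, 1, -2]

def sample_inputs(c_clamp, I0=2.0, steps=20000, seed=0):
    """Histogram of binary (A, B) with C clamped to the bipolar value c_clamp."""
    rng = random.Random(seed)
    m = [rng.choice((-1, 1)), rng.choice((-1, 1)), c_clamp]
    counts = Counter()
    for _ in range(steps):
        for i in (0, 1):  # update A, then B; C stays clamped
            I = I0 * (h[i] + sum(J[i][j] * m[j] for j in range(3)))    # Eq. (2)
            m[i] = 1 if rng.uniform(-1, 1) + math.tanh(I) > 0 else -1  # Eq. (1)
        counts[((m[0] + 1) // 2, (m[1] + 1) // 2)] += 1
    return {ab: n / steps for ab, n in counts.items()}

print(sample_inputs(c_clamp=-1))  # C = 0: (0,0), (0,1), (1,0) each near 1/3
print(sample_inputs(c_clamp=+1))  # C = 1: dominated by (1,1)
```

Because each update draws mi = +1 with probability (1 + tanh(Ii))/2, this sequential scheme samples the Boltzmann distribution of Eq. (4) restricted to the clamped value of C.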
Assumptions of the model: We have made several sim-
plifying assumptions while modeling the hardware im-
plementation of a p-bit. (1) The READ voltage that is
amplified by the inverters produces a small current that
passes through the circular magnet and might potentially
disturb its current state. We assumed that this current
(labeled as IS2 in Fig. 3b) is negligible and does not affect
the magnetization of the stochastic magnet. (2) We
assumed that the spin current generated by the heavy
metal is deposited to the free layer with perfect efficiency
(I ′S1 = IS1 in Fig. 3b), however, depending on the inter-
face properties this conversion factor can be less than
100%. (3) We have also assumed that the fixed layer
does not produce a notable stray field on the circular
magnet. Note that the presence of such a constant field
would simply shift the sigmoidal behavior presented in
Fig. 3d-e to the right (or left) and could have been offset
by a constant bias current. (4) Finally, we have neglected
the resistance of the GSHE portion in the READ circuit
(Fig. 3c), assuming the MTJ resistance would be domi-
nant in this path.
Detailed Simulation Parameters
This section shows the details of simulation parameters
for the hardware implementation of p-bits that are used
for Fig. 3−4.
sLLG for stochastic circular magnets: The magnetization
of a circular nanomagnet described as $\hat{m}_i$ is
obtained from the stochastic Landau-Lifshitz-Gilbert
(sLLG) equation:
$$(1 + \alpha^2) \frac{d\hat{m}_i}{dt} = -|\gamma| \, \hat{m}_i \times \vec{H}_i - \alpha |\gamma| \, (\hat{m}_i \times \hat{m}_i \times \vec{H}_i) + \frac{1}{q N_i} (\hat{m}_i \times \vec{I}_{Si} \times \hat{m}_i) + \left( \frac{\alpha}{q N_i} (\hat{m}_i \times \vec{I}_{Si}) \right)$$ (11a)
where α is the damping coefficient, q is the electron
charge, γ is the electron gyromagnetic ratio, IS is the
spin current that is assumed to be uniformly distributed
over the total number of spins in the macrospin, Ni =
MsVol./µB, µB being the Bohr magneton. It is assumed
that the spin current generated from the GSHE layer is
polarized in the z-direction, such that $\vec{I}_{Si} = I_S \hat{z}$. $\vec{H}_i$ is
the effective field of the circular magnet, where the uniaxial
anisotropy is assumed to be negligible, but there
is still a strong demagnetizing field. The thermal fluctuations
also enter through the effective magnetic field:
$\vec{H}_i = -4\pi M_s m_x \hat{x} + \vec{H}_{th}$, the x-axis being the out-of-plane
direction of the magnet, and $\langle |\vec{H}_{th}|^2 \rangle = 2\alpha kT / (|\gamma| M_s \mathrm{Vol.})$
in units [Oe^2/Hz] with zero mean, and equal in all
three directions. Table I shows the parameters used in
Figs. 3−4. We note that this parameter selection is simply
one possibility; many other parameters could have
been used with no change in the basic conclusions.
Obtaining the sigmoidal response of CMOS+sLLG:
Each data point in the sigmoids shown in Figs. 3−4 is
obtained by averaging the z-component of the magnetization
over 500 ns, with a time-step of ∆t = 0.05 ps.
The CMOS inverter characteristics in conjunction with
a spherical representation-based sLLG are obtained using
the modular framework developed in [38] using HSPICE.
14 nm FinFET Inverter Characteristics: Fig. 5 shows
the input/output characteristics of the single and double
inverters that are used to amplify the stochastic signal
that is generated by the MTJ (Fig. 3). At zero-bias from
the GSHE, the amplified signal VM (Eq. 8) is in the mid-
dle of V + and V − which is VDD/2 + VR/2. The buffer
response can be shifted to this value by increasing the
size of pFETs, as shown in Fig. 5.
Parameters | Value
Saturation magnetization (Ms) | 300 emu/cc
Magnet diameter (Φ), thickness (t) | 15 nm, 0.5 nm
MTJ Polarization (P) (Eq. (7)) | 0.5
MTJ Conductance (G0) (Eq. (7)) | 176 µS
Damping coefficient (α) | 0.1
Spin Hall Length, Width (Eq. (5)) | L = W = 15 nm
Hall Angle, Spin relax. length | θ = 0.5 [47], λsf = 2.1 nm [48]
Spin Hall res. (ρ), thickness (t) | 200 µΩ-cm [49], 3.15 nm
Temperature (T) | 300 K
CMOS Models | 14 nm HP-FinFET [22]
Supply and READ Voltage | VDD = 0.8 V, VR = 0.5 V
Timestep for transient sim. (SPICE) | ∆t = 0.05 ps
TABLE I. Parameters used for simulations in Figs. 3−4.
FIG. 5. 14 nm PTM, Inverter/Buffer: DC response of
14 nm high performance (HP) FinFETs based on [22] for an
inverter and buffer. Sizing the transistors differently allows
the switching point to be shifted.
III. INVERTIBLE BOOLEAN LOGIC WITH
BOLTZMANN MACHINES
We now present a mathematical prescription that
shows how any given truth table can be implemented
in terms of Boltzmann Machines, in “one shot” with no
learning being involved, unlike much of the past work
in this area (See for example, [50, 51]). In Section II,
we chose a simple [J] and {h} matrix to implement an
AND gate based on [46]. In this section, we outline a
general approach to show how any truth table can be
implemented in terms of such matrices. Our approach,
pictorially described in Fig. 6, begins by transforming a
given truth table from binary (0, 1) to bipolar (−1,+1)
variables. The lines of the truth table are then required to
be eigenvectors each with eigenvalue +1, while all other eigenvectors
are assumed to have eigenvalues equal to 0. This
leads to the following prescription for J as shown in Fig. 6:
$$[J] = \sum_{i,j} [S^{-1}]_{ij} \, u_i u_j^{\dagger}$$ (12a)
$$S_{ij} = u_i^{\dagger} u_j$$ (12b)
where ui are the eigenvectors corresponding to lines in
the truth table of a Boolean operation and S is a pro-
jection matrix that accounts for the non-orthogonality
of the vectors defined by different lines of the truth ta-
ble. Note that the resultant J-matrix is always symmet-
ric (Jij = Jji) with diagonal terms that are subtracted
in our models such that Jii = 0. The number of p-bits in
the system is made greater than the number of lines in a
truth table through the addition of hidden units (Fig. 6)
to ensure that the number of conditions we impose is less
than the dimension of the space defined by the number
of p-bits.
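As a short check of this prescription, using only Eqs. (12a) and (12b), one can verify that each truth-table line $u_k$ is an eigenvector of [J] with eigenvalue +1 even when the lines are not mutually orthogonal (this is written for [J] before its diagonal is removed):

```latex
\begin{aligned}
[J]\,u_k &= \sum_{i,j} [S^{-1}]_{ij}\, u_i \left(u_j^{\dagger} u_k\right)
          = \sum_{i,j} [S^{-1}]_{ij}\, S_{jk}\, u_i \\
         &= \sum_{i} \delta_{ik}\, u_i = u_k ,
\end{aligned}
\qquad
[J]\,v = 0 \quad \text{whenever } u_j^{\dagger} v = 0 \text{ for all } j .
```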
FIG. 6. Truth Table to J-Matrix: A given truth table is first transformed from binary to bipolar variables by using the transformation m = 2t − 1, where m and t represent the magnetization and binary values of the truth table. Additional bits are introduced to each line of the truth table to ensure that the resultant S-matrix is invertible. The indices i, j correspond to the number of lines in the truth table. ui, uj are column vectors. As an example, we have shown auxiliary bits that result in an S-matrix proportional to the identity matrix, since the eigenvectors are orthogonal. The J-matrix is then obtained by Eq. (12a) which ensures that the truth table corresponds to the low energy states of the Boltzmann machines according to Eq. (4). A handle bit of +1 is introduced to each line of the truth table which can be biased to ensure that the complementary truth table does not appear along with the desired one. This bit also allows a truth table to be electrically reconfigured into its complement.

Another important aspect in the construction of [J] is that an eigenvector ui implies that its complement −ui is also a valid eigenvector. However only one of these might belong to a truth table. We introduce a “handle” bit to each ui that is biased (hi) to distinguish complementary eigenvectors. These handle bits provide the added benefit of reconfigurability. For example, AND and OR gates have complementary truth tables, and a given gate can be electrically reconfigured as an AND or an OR gate using the handle bit.
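The complement degeneracy and the role of the handle bit can be illustrated with a minimal Python sketch. It assumes that the energy of Eq. (4) has the form E = −I0(Σ_{i<j} Jij mi mj + Σ_i hi mi), uses JAND as printed in Eq. (15), and treats bit 5 as the handle bit; the bias value h = +1 is purely illustrative and is not a value from the paper.

```python
# J_AND copied from Eq. (15); bit order: 1-4 auxiliary, 5 handle, 6 A, 7 B, 8 C.
J_AND = [
    [ 0, -1,  0,  0,  1,  1,  1,  0],
    [-1,  0,  1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0,  1,  1, -1,  0],
    [ 0,  1,  0,  0,  1, -1,  1,  0],
    [ 1,  0,  1,  1,  0,  0,  0, -1],
    [ 1,  0,  1, -1,  0,  0,  0,  1],
    [ 1,  0, -1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0, -1,  1,  1,  0],
]

def energy(m, J, h, I0=1.0):
    """Assumed form of Eq. (4): E = -I0 * (sum_{i<j} J_ij m_i m_j + sum_i h_i m_i)."""
    n = len(m)
    pair = sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))
    bias = sum(h[i] * m[i] for i in range(n))
    return -I0 * (pair + bias)

u1 = [-1, +1, +1, +1, +1, -1, -1, -1]    # first truth-table line quoted in the text
u1_bar = [-x for x in u1]                # its complement

no_bias = [0] * 8
handle_bias = [0, 0, 0, 0, 1, 0, 0, 0]   # illustrative bias on the (assumed) handle bit 5

E_u, E_ubar = energy(u1, J_AND, no_bias), energy(u1_bar, J_AND, no_bias)
Eb_u, Eb_ubar = energy(u1, J_AND, handle_bias), energy(u1_bar, J_AND, handle_bias)
print(E_u, E_ubar)      # -8.0 -8.0 : line and complement are degenerate
print(Eb_u, Eb_ubar)    # -9.0 -7.0 : the handle bias favours the desired line
```

Reversing the sign of the handle bias favours the complementary lines instead, which is the reconfiguration between AND and OR mentioned above.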
J-Matrices for AND/FA: We now provide the details
of the J-matrix for the AND gate, obtained using the
prescription shown in Fig. 6 based on Eq. (12a). The
eigenvectors of the truth table for the AND in Fig. 6 are
placed into a matrix U, such that U = [u1 u2 u3 u4],
where u1 is the first row of the matrix shown in Fig. 6,
u1 = [−1 +1 +1 +1 +1 −1 −1 −1]^T and so on. In matrix
notation, the S-matrix can be written as:
$$S = U^{T} U = 8\, I_{4\times 4} \tag{13}$$
Then the J-matrix becomes:
$$J = \sum_{ij} \underbrace{[S^{-1}]_{ij}}_{\frac{1}{8}\delta_{ij}}\, u_i u_j^{\dagger} = \frac{1}{8} \sum_{i} u_i u_i^{\dagger} \tag{14}$$
FIG. 7. Correlated p-bits, AND Gate: When the interaction strength (I0) is zero, p-bits produce uncorrelated noise, visiting all possible states with equal probability. In this example, the interaction strength (pseudo inverse-temperature) is suddenly increased from 0 to 2 as a step function at t = t0, to effectively “quench” the network. This correlates the p-bits to produce the truth table of an AND gate (AND: A ∩ B = C). Note that after this quenching, the p-bits only visit the low energy states corresponding to the truth table of the AND gate and once the system is in one of the low energy states, it tends to stay there for a while, until being kicked out by the thermal noise. The time averages of the uncorrelated and the correlated system are well-explained by the Boltzmann law stated in Eq. (4). The total simulation used T = 4e6 steps to compare the results with the Boltzmann distribution, though only a fraction is shown in the upper panel for clarity.

Removing the diagonal entries by making Jii = 0 and multiplying the matrix entries by 4 to obtain simple integers, JAND evaluates to:
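$$J_{\mathrm{AND}} = \begin{bmatrix}
0 & -1 & 0 & 0 & 1 & 1 & 1 & 0 \\
-1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 & -1 & 0 \\
0 & 1 & 0 & 0 & 1 & -1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & -1 \\
1 & 0 & 1 & -1 & 0 & 0 & 0 & 1 \\
1 & 0 & -1 & 1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & -1 & 1 & 1 & 0
\end{bmatrix} \tag{15}$$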
with the notation, [1-5: auxiliary bit and handle bit,
6:“A”, 7:“B”, 8:“C”]. Following a similar procedure, we
use the following 14× 14 Full Adder matrix, JFA:
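$$J_{\mathrm{FA}} = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 4 & -1 & -1 & -1 & -1 & -2 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 4 & 0 & -1 & -1 & 2 & -1 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 4 & 0 & 0 & -1 & -1 & -1 & 2 & 1 & -1 \\
0 & 0 & 0 & 0 & 4 & 0 & 0 & 0 & -1 & -2 & 1 & 1 & -1 & 1 \\
0 & 0 & 0 & 4 & 0 & 0 & 0 & 0 & -1 & 2 & -1 & -1 & 1 & -1 \\
0 & 0 & 4 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & 1 & -2 & -1 & 1 \\
0 & 4 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & -2 & 1 & -1 & 1 \\
4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & 1 & 1 & 2 & 1 \\
-1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & -1 & -1 & -2 & 2 & 1 & 1 & 1 & 0 & 0 & -1 & -1 & 1 & 2 \\
-1 & 2 & -1 & 1 & -1 & 1 & -2 & 1 & 0 & -1 & 0 & -1 & 1 & 2 \\
-1 & -1 & 2 & 1 & -1 & -2 & 1 & 1 & 0 & -1 & -1 & 0 & 1 & 2 \\
-2 & 1 & 1 & -1 & 1 & -1 & -1 & 2 & 0 & 1 & 1 & 1 & 0 & -2 \\
-1 & -1 & -1 & 1 & -1 & 1 & 1 & 1 & 0 & 2 & 2 & 2 & -2 & 0
\end{bmatrix} \tag{16}$$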
with the notation, [1−9: auxiliary bits and handle bit,
10: “Cin”, 11: “B”, 12: “A”, 13: “S”, 14: “Cout”].
These are the J-matrices (AND and FA) that are used
for all examples in the paper, except for the AND gate
described in Section II. Fig. 10 shows the “truth table”
operation of the Full Adder where all input/output termi-
nals are “floating” using the J-matrix of Eq. (16), show-
ing excellent quantitative agreement with the Boltzmann
distribution of Eq. (4) at steady state even for the unde-
sired peaks of the truth table.
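The construction of JAND from Eqs. (12)-(14) can be sketched in a few lines of Python. Only u1 is quoted in the text; the lines u2, u3 and u4 used below are assumptions of this sketch, chosen so that the four vectors are orthogonal and encode the AND truth table in the bit order given under Eq. (15). The final rescaling to integer weights is done by dividing by the smallest non-zero weight.

```python
# Truth-table lines in bipolar form, bit order [aux1..aux4, handle, A, B, C].
# u1 is quoted in the text; u2, u3, u4 are assumptions of this sketch.
U = [
    [-1, +1, +1, +1, +1, -1, -1, -1],   # A=0, B=0 -> C=0
    [+1, -1, -1, +1, +1, -1, +1, -1],   # A=0, B=1 -> C=0
    [+1, -1, +1, -1, +1, +1, -1, -1],   # A=1, B=0 -> C=0
    [+1, +1, +1, +1, +1, +1, +1, +1],   # A=1, B=1 -> C=1
]
n = len(U[0])

# Overlap matrix S_ij = u_i . u_j of Eq. (12b); here it equals 8 times the identity.
S = [[sum(a * b for a, b in zip(ui, uj)) for uj in U] for ui in U]

# With S = 8 I, Eq. (14) reduces to J = (1/8) * sum_i u_i u_i^T.
J = [[sum(u[a] * u[b] for u in U) / 8 for b in range(n)] for a in range(n)]
for a in range(n):
    J[a][a] = 0.0                        # remove the diagonal, J_ii = 0

# Rescale so that the smallest non-zero weight becomes 1 (integer weights).
w_min = min(abs(x) for row in J for x in row if x != 0)
J_int = [[int(round(x / w_min)) for x in row] for row in J]

for row in J_int:
    print(row)
```

With these assumed lines the printed rows can be compared entry by entry with Eq. (15).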
Note that this prescription for [J] is similar to the prin-
ciples developed originally for Hopfield networks ([52],
and Eq. (4.20) in [20]). However, other approaches are
possible along the lines described in the context of Ising
Hamiltonians for quantum computers [46]. We have tried
some of these other designs for [J] and many of them lead
to results similar to those presented here. For practical
implementations, it will be important to evaluate differ-
ent approaches in terms of their demands on the dynamic
range and accuracy of the weight logic.
Description of universal model: Once a J-matrix and
the h-vector are obtained for a given problem, the system
is initialized by randomizing all mi at time, t = t0. First,
the current (voltage) that a given p-bit (mi) feels due to
the other coupled mj is obtained from Eq. (2), and the
mi value is updated according to Eq. (1). Next, the procedure
is repeated for the remaining p-bits by finding the
current they receive due to all other mj using the updated
values of mj. For this reason, the order of updating was
chosen randomly in our models, and we found that the
order of updating has no effect on our results. However,
updating the p-bits in parallel leads to incorrect results.
FIG. 8. Implementing a Boolean function and its inverse: The input or output terminals of an appropriately interconnected network of p-bits can be “clamped” to perform a specific logic operation or its inverse. In this example, the input bits (A,B) of an OR Gate are clamped to be +1, forcing the output bit C to be 1, during the first phase of operation (t < t0). In the second phase of operation (t > t0), the output of the OR gate C is clamped to the value +1, which is consistent with three different combinations of (A,B). As shown in the time response and the long-time histogram plots, all three possibilities emerge with equal probability, demonstrating the “inverse” OR operation. In each case, the expected probabilities from the Boltzmann Law (Eq. (4)) closely match those produced by the generic model, Eq. (1-2), after running the system for one million steps; only a fraction is shown in the upper panel for clarity.

FIG. 9. Noise Tolerance of AND: The probability of a wrong output for an (AND) gate (Eq. 15) operated with clamped inputs is investigated in the presence of a random noise field which enters Eq. (2) as indicated in the figure. The noise is assumed to be uniformly distributed over all p-bits in a given network, and centered around zero with magnitude ±h̃n, where (I0 = 2, hi = ±1). Each gate is simulated 50000 times for T=100 time steps to produce an error probability for a given noise value, and the maximum peak produced by the system is assumed to be an output that can be read with certainty. The system shows robust behavior even in the presence of large levels of noise.

These two observations are well-known in the context of Hopfield networks and Boltzmann Machines [53–55]. This type of serial updating corresponds to the “asynchronous dynamics” [20, 56]. We note that the hardware implementation discussed in this paper naturally leads to an asynchronous updating of p-bits in the absence of a global clock signal. We have set up an online simulator based on this model in Ref. [23] so that interested readers can simulate some of the examples discussed in this paper.
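The serial update procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' simulator: it assumes that Eq. (2) has the form Ii = I0(hi + Σj Jij mj) and that Eq. (1) has the form mi = sgn(rand(−1, 1) + tanh(Ii)), and it uses JAND from Eq. (15) with no bias. The function name `sweep`, the seed and the run length are choices made for this sketch.

```python
import math
import random

# J_AND copied from Eq. (15).
J_AND = [
    [ 0, -1,  0,  0,  1,  1,  1,  0],
    [-1,  0,  1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0,  1,  1, -1,  0],
    [ 0,  1,  0,  0,  1, -1,  1,  0],
    [ 1,  0,  1,  1,  0,  0,  0, -1],
    [ 1,  0,  1, -1,  0,  0,  0,  1],
    [ 1,  0, -1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0, -1,  1,  1,  0],
]

def sweep(m, J, h, I0, rng):
    """Update every p-bit once, one at a time, in a random order."""
    n = len(m)
    order = list(range(n))
    rng.shuffle(order)
    for i in order:
        # Assumed Eq. (2): input from the bias and the *current* state of the other p-bits.
        I_i = I0 * (h[i] + sum(J[i][j] * m[j] for j in range(n)))
        # Assumed Eq. (1): m_i = sgn(rand(-1, 1) + tanh(I_i)).
        m[i] = 1 if rng.uniform(-1.0, 1.0) + math.tanh(I_i) > 0 else -1

def pair_sum(m, J):
    n = len(m)
    return sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))

rng = random.Random(0)                        # fixed seed, for repeatability only
h = [0.0] * 8                                 # no bias: lines and complements both allowed
m = [rng.choice((-1, 1)) for _ in range(8)]   # random initial state

T = 20000
in_low_energy = 0
for _ in range(T):
    sweep(m, J_AND, h, 2.0, rng)
    if pair_sum(m, J_AND) == 8:               # largest possible value for this J
        in_low_energy += 1

fraction = in_low_energy / T
print(f"fraction of samples in the lowest-energy states: {fraction:.3f}")
```

Because each p-bit sees the already-updated values of the others, the loop corresponds to the serial (asynchronous) scheme; replacing it by a simultaneous update of all p-bits would be the parallel scheme that the text reports as giving incorrect results.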
Fig. 7 shows the time evolution of an AND based on
Eq. (15). Initially for t < t0 the interaction strength is
zero (I0 = 0), making the pseudo-temperature of the sys-
tem infinite and the network produces uncorrelated noise
visiting each state with equal probability. In the second
phase (t > t0), the interaction strength is suddenly in-
creased to I0 = 2, effectively “quenching” the network
by reducing the temperature. This correlates the system
such that only the states corresponding to the truth table
of the AND gate are visited, each with equal probability
when a long time average is taken. The average probabil-
ities in each phase quantitatively match the Boltzmann
Law defined by Eq. (4).
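For a network this small the Boltzmann law can also be evaluated exactly by enumerating all states. The sketch below assumes the energy form E = −I0 Σ_{i<j} Jij mi mj for Eq. (4) with zero bias, holds the (assumed) handle bit 5 at +1, and sums over the auxiliary bits to obtain the probabilities of the visible bits (A, B, C).

```python
import itertools
import math

# J_AND copied from Eq. (15).
J_AND = [
    [ 0, -1,  0,  0,  1,  1,  1,  0],
    [-1,  0,  1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0,  1,  1, -1,  0],
    [ 0,  1,  0,  0,  1, -1,  1,  0],
    [ 1,  0,  1,  1,  0,  0,  0, -1],
    [ 1,  0,  1, -1,  0,  0,  0,  1],
    [ 1,  0, -1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0, -1,  1,  1,  0],
]

def pair_sum(m, J):
    n = len(m)
    return sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))

def visible_distribution(J, I0, handle=4, visible=(5, 6, 7)):
    """Exact probabilities of (A, B, C) with the handle bit held at +1."""
    weights = {}
    for m in itertools.product((-1, 1), repeat=len(J)):
        if m[handle] != 1:
            continue
        key = tuple(m[i] for i in visible)
        weights[key] = weights.get(key, 0.0) + math.exp(I0 * pair_sum(m, J))
    Z = sum(weights.values())
    return {k: w / Z for k, w in weights.items()}

p_hot = visible_distribution(J_AND, 0.0)    # I0 = 0: infinite pseudo-temperature
p_cold = visible_distribution(J_AND, 2.0)   # I0 = 2: after the quench

AND_LINES = [(-1, -1, -1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
for abc in sorted(p_cold):
    print(abc, round(p_hot[abc], 3), round(p_cold[abc], 3))
```

At I0 = 0 every (A, B, C) combination is equally likely, while at I0 = 2 almost all of the probability sits on the four lines of the AND truth table, each with nearly the same weight.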
In Fig. 8, we show how a correlated network producing
a given truth table can be used to do directed computa-
tion analogous to standard CMOS logic. An OR gate
is constructed by using the same [J] matrix for an AND
gate, but with a negated handle bit. By “clamping” the
input bits of an OR gate (t < t0) through their bias ter-
minals, hi, to (A,B)=(+1,+1), the system is forced to
only one of the peaks of the truth table, effectively mak-
ing C=1.
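The direct-mode OR operation can be checked by exact enumeration. In this sketch clamping is modelled by simply holding a bit at a fixed value (equivalent to a very strong bias, which is an assumption here), the handle bit is taken to be bit 5 and is held at −1 to select the OR truth table, and the energy form E = −I0 Σ_{i<j} Jij mi mj is assumed for Eq. (4).

```python
import itertools
import math

# J_AND copied from Eq. (15); the same matrix serves as an OR gate with the handle negated.
J_AND = [
    [ 0, -1,  0,  0,  1,  1,  1,  0],
    [-1,  0,  1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0,  1,  1, -1,  0],
    [ 0,  1,  0,  0,  1, -1,  1,  0],
    [ 1,  0,  1,  1,  0,  0,  0, -1],
    [ 1,  0,  1, -1,  0,  0,  0,  1],
    [ 1,  0, -1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0, -1,  1,  1,  0],
]
HANDLE, A, B, C = 4, 5, 6, 7                 # 0-based positions (handle position assumed)

def pair_sum(m, J):
    n = len(m)
    return sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))

def clamped_distribution(J, I0, clamp):
    """Exact Boltzmann probabilities of all states compatible with clamp = {index: +1/-1}."""
    free = [i for i in range(len(J)) if i not in clamp]
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(free)):
        m = [0] * len(J)
        for i, v in clamp.items():
            m[i] = v
        for i, v in zip(free, values):
            m[i] = v
        weights[tuple(m)] = math.exp(I0 * pair_sum(m, J))
    Z = sum(weights.values())
    return {m: w / Z for m, w in weights.items()}

# Direct mode: inputs A = B = +1 clamped, output C left floating.
dist = clamped_distribution(J_AND, 2.0, {HANDLE: -1, A: +1, B: +1})
p_C_high = sum(p for m, p in dist.items() if m[C] == +1)
print(f"P(C = +1 | A = B = +1) = {p_C_high:.4f}")
```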
FIG. 10. Full Adder: Full Adder in the truth table mode, where all inputs and outputs are floating, calculated using JFA from Eq. (16), with I0 = 0.5. The statistics are collected for T = 10^6 steps, and each terminal output is then placed in the histogram. The states are numbered using the decimal number corresponding to the binary number [Ci A B S Co]. The decimal numbers corresponding to the truth table are shown in the inset, and these match the location of the taller peaks in the histogram. Note that the Boltzmann distribution (Eq. (4)) quantitatively matches the model even for the suppressed peaks.

The PSL gates however exhibit a remarkable difference with standard logic gates, in that inputs and outputs are on an equal footing. Not only do clamped inputs give the corresponding output, but a clamped output also gives the corresponding input(s). In the second phase (t > t0) the output of the OR gate is clamped to +1, which produces three possible peaks for the input terminals, corresponding to the input combinations that are consistent with the clamped output: (A,B) = (0,1), (1,0) and (1,1). The probabilistic nature of PSL allows it to obtain multiple solutions (Fig. 8c). It also seems to make the results more resilient to unwanted noise due to stray fields that are inevitable in physical implementations as shown in Fig. 9. Here, we simulate an AND gate in the presence
of a normally distributed random noise that enters the
bias fields of each p-bit and define the computation to be
faulty, if the mode (most frequent value) of the output
bit is not consistent with the programmed input combinations
after T = 100 time steps. We observe that even with
large levels of uncontrolled noise, the gate produces correct
results with high probability.
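The inverse OR operation discussed above (output clamped, inputs floating) can be checked in the same exact way. As before, this is a sketch under stated assumptions: clamping is modelled by holding a bit fixed, bit 5 is taken as the handle bit and held at −1, and the energy form E = −I0 Σ_{i<j} Jij mi mj is assumed for Eq. (4).

```python
import itertools
import math

# J_AND copied from Eq. (15); with the handle bit at -1 it acts as an OR gate.
J_AND = [
    [ 0, -1,  0,  0,  1,  1,  1,  0],
    [-1,  0,  1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0,  1,  1, -1,  0],
    [ 0,  1,  0,  0,  1, -1,  1,  0],
    [ 1,  0,  1,  1,  0,  0,  0, -1],
    [ 1,  0,  1, -1,  0,  0,  0,  1],
    [ 1,  0, -1,  1,  0,  0,  0,  1],
    [ 0,  1,  0,  0, -1,  1,  1,  0],
]
HANDLE, A, B, C = 4, 5, 6, 7                 # 0-based positions (handle position assumed)

def pair_sum(m, J):
    n = len(m)
    return sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))

def input_distribution(J, I0, clamp):
    """Exact probabilities of (A, B) given the clamped bits in clamp = {index: +1/-1}."""
    free = [i for i in range(len(J)) if i not in clamp]
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(free)):
        m = [0] * len(J)
        for i, v in clamp.items():
            m[i] = v
        for i, v in zip(free, values):
            m[i] = v
        key = (m[A], m[B])
        weights[key] = weights.get(key, 0.0) + math.exp(I0 * pair_sum(m, J))
    Z = sum(weights.values())
    return {k: w / Z for k, w in weights.items()}

# Inverse mode: output C = +1 clamped, inputs A and B left floating.
p_inputs = input_distribution(J_AND, 2.0, {HANDLE: -1, C: +1})
for ab in sorted(p_inputs):
    print(ab, round(p_inputs[ab], 4))
```

The three input combinations consistent with C = 1 share the probability almost equally, and the inconsistent combination A = B = −1 is strongly suppressed.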
Fig. 10 shows the design of a Full Adder (FA) with the
8-line truth table shown. There are three inputs in all,
two from the numbers to be added, and one carry bit
from previous FA. It produces two outputs, one the sum
bit and the other a carry bit to be passed on to the next
FA. The probabilities of different states are calculated
using JFA from Eq. (16), with I0 = 0.5 in the truth
table mode, where all inputs and outputs are floating
and the states are numbered using the decimal number
corresponding to the binary word [Ci A B S Co]. The
decimal numbers corresponding to the truth table are
shown in the inset, and these match the location of the
taller peaks in the histogram. Note that the Boltzmann
distribution (Eq. (4)) quantitatively matches the model
even for the suppressed peaks. A higher I0 reduces these
suppressed peaks further. The statistics are collected for
T = 10^6 steps, and each terminal output is then placed
in the histogram.
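The state numbering used for the histogram can be reproduced with a few lines of Python. The sketch assumes that the binary word [Ci A B S Co] is read with Ci as the most significant bit; the function name is illustrative.

```python
def full_adder_lines():
    """Return the 8 Full Adder truth-table lines with their decimal labels."""
    lines = []
    for ci in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                total = a + b + ci
                s, co = total % 2, total // 2      # sum bit and carry-out bit
                label = int(f"{ci}{a}{b}{s}{co}", 2)
                lines.append(((ci, a, b, s, co), label))
    return lines

labels = [label for _, label in full_adder_lines()]
print(labels)   # [0, 6, 10, 13, 18, 21, 25, 31]
```

Under this reading, these eight labels are where the taller peaks of the 32-bin histogram are expected.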
IV. DIRECTED NETWORKS OF BOLTZMANN
MACHINES
When constructing larger circuits composed of indi-
vidual Boltzmann machines, the reciprocal nature of the
Boltzmann machine often interferes with the directed na-
ture of computation that is desired. It seems advisable
to use a hybrid approach. For example in constructing
a 32-bit adder we use Full-Adders (FA) that are individ-
ually BMs with symmetric connections, Jij = Jji. But
when connecting the carry bit from one FA to the next,
the coupling element Jij is non-zero in only one direc-
tion from the least significant to the most significant bit.
This directed coupling between the components distin-
guishes PSL from purely reciprocal Boltzmann machines.
Indeed, even the Full Adder could be implemented not
as a Boltzmann machine but as a directed network of
more basic gates. But then it would lose its invertibility.
On the other hand, the directed connection of BM Full
Adders largely preserves the invertibility of the overall
system as we will show.
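One way to picture the hybrid design is as a block-diagonal matrix of symmetric Full Adder blocks with a one-way link from each carry-out to the next carry-in. The sketch below assembles such a matrix from JFA of Eq. (16). The coupling strength `J0`, the `reverse` parameter and the function name are hypothetical; the paper does not give the numerical value of the carry coupling, and the convention that Jij lets p-bit i respond to p-bit j is assumed.

```python
# J_FA copied from Eq. (16); bits 10 and 14 are "Cin" and "Cout".
J_FA = [
    [ 0,  0,  0,  0,  0,  0,  0,  4, -1, -1, -1, -1, -2, -1],
    [ 0,  0,  0,  0,  0,  0,  4,  0, -1, -1,  2, -1,  1, -1],
    [ 0,  0,  0,  0,  0,  4,  0,  0, -1, -1, -1,  2,  1, -1],
    [ 0,  0,  0,  0,  4,  0,  0,  0, -1, -2,  1,  1, -1,  1],
    [ 0,  0,  0,  4,  0,  0,  0,  0, -1,  2, -1, -1,  1, -1],
    [ 0,  0,  4,  0,  0,  0,  0,  0, -1,  1,  1, -2, -1,  1],
    [ 0,  4,  0,  0,  0,  0,  0,  0, -1,  1, -2,  1, -1,  1],
    [ 4,  0,  0,  0,  0,  0,  0,  0, -1,  1,  1,  1,  2,  1],
    [-1, -1, -1, -1, -1, -1, -1, -1,  0,  0,  0,  0,  0,  0],
    [-1, -1, -1, -2,  2,  1,  1,  1,  0,  0, -1, -1,  1,  2],
    [-1,  2, -1,  1, -1,  1, -2,  1,  0, -1,  0, -1,  1,  2],
    [-1, -1,  2,  1, -1, -2,  1,  1,  0, -1, -1,  0,  1,  2],
    [-2,  1,  1, -1,  1, -1, -1,  2,  0,  1,  1,  1,  0, -2],
    [-1, -1, -1,  1, -1,  1,  1,  1,  0,  2,  2,  2, -2,  0],
]
CIN, COUT = 9, 13        # 0-based positions of "Cin" and "Cout" inside one block

def ripple_adder_J(n_bits, J0=1.0, reverse=0.0):
    """Block-diagonal J for n_bits Full Adders plus carry links (J0, reverse are hypothetical)."""
    k = len(J_FA)
    size = n_bits * k
    J = [[0.0] * size for _ in range(size)]
    for f in range(n_bits):
        off = f * k
        for i in range(k):
            for j in range(k):
                J[off + i][off + j] = float(J_FA[i][j])   # symmetric BM block
    for f in range(n_bits - 1):
        cout = f * k + COUT              # carry-out of the less significant adder
        cin = (f + 1) * k + CIN          # carry-in of the next, more significant adder
        J[cin][cout] = J0                # Cin responds to Cout ...
        J[cout][cin] = reverse           # ... Cout ignores Cin when reverse = 0
    return J

J32 = ripple_adder_J(32)
print(len(J32), "p-bits")                # 448 p-bits
```

Setting `reverse` equal to `J0` would give a fully reciprocal connection instead of a directed one.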
32-bit Adder/Subtractor
Fig. 11 shows the operation of a 32-bit adder that sums
two 32-bit numbers A and B to calculate the 33-bit sum
S. In the initial phase (t < t0) we have I0 = 0 corre-
sponding to infinite temperature so that the sum bits (S)
fluctuate among 2^33 ≈ 8.6 billion possibilities. With I0 =
1, Fig. 11 shows that the correct answer has a probability
of ≈ 12% which is much lower than the ≈ 100%
that can be achieved with larger I0 values (as in Fig. 13
a-c with I0=5). Nevertheless the peak is unmistakable
as evident from the expanded scale histogram and the
correct answer is extracted from the majority vote of
T=100 samples as shown in Fig. 13. This ability to ex-
tract the correct answer despite large fluctuations is a
general property of probabilistic algorithms.
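The read-out by majority vote can be illustrated with a small Python sketch. The sample list is synthetic and purely hypothetical: the correct value appears in 12 of T = 100 read-outs and every other read-out is a different wrong value, mimicking a single peak on a broad background.

```python
from collections import Counter

def majority_vote(samples):
    """Return the most frequent sample (statistical mode) and its relative frequency."""
    value, count = Counter(samples).most_common(1)[0]
    return value, count / len(samples)

correct = 123456789                                        # hypothetical correct sum
samples = [correct] * 12 + [correct + k for k in range(1, 89)]

answer, share = majority_vote(samples)
print(answer, share)        # 123456789 0.12
print(2 ** 33)              # 8589934592 possible 33-bit outputs
```

Even though the correct value accounts for only 12% of the read-outs in this example, it is still the mode, because no wrong value repeats as often.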
Interestingly, although the overall system includes sev-
eral unidirectional connections, it seems to be able to per-
form the inverse function as well. With A and B clamped
it calculates S=A+B as noted above. Conversely with S
clamped, the input bits A and B fluctuate in a corre-
lated manner so as to make their sum sharply peaked
around S. Fig. 11 shows the time evolution of the input
bits that have broad distributions spanning a wide range.
Initially, when I0 is small, the sum of A and B also shows
a broad distribution, but once I0 is turned up to 1, the
distributions of A and B get strongly correlated making
the distribution of A+B sharply peaked around the fixed
value of S. Unlike standard digital circuits, which are
not invertible, the 32-bit adder shown in Fig. 11 therefore
operates in both directions. The demonstration of such an invertible
32-bit adder could be practically significant, since binary
addition is noted to be the most fundamental and frequently
used operation in digital computing [57].
FIG. 11. 32-bit Ripple Carry Adder (RCA): (a) A 32-bit Ripple Carry Adder (RCA) is designed using individual Full Adder (FA) units with the carry bit designed as a directed connection from the least significant bit to the most significant bit. The overall J-matrix for a 32-bit adder is shown, and it is quite sparse and quantized. (b) For t < t0, I0 = 0 and the sum fluctuates randomly. At t = t0, I0 is suddenly increased, and the adder converges on the correct result for two random inputs A and B. The distribution of 1000 data points (t > t0) shows a single peak with 24% probability of time spent in the correct state (not including the uncorrelated time points for t < t0). (c) Even though the connections between the Full Adder units are directed, the system performs the inverse function as well. When the output (S) is clamped to a fixed number, the inputs (A) and (B) fluctuate in a correlated manner to make A+B=S when I0 = 1. Note the broad distributions of A and B (collected for t > t0) as compared to the extremely sharp distribution of A+B.

FIG. 12. Ripple Carry Adder delay: The delay of the RCA as a function of number of bits in the Ripple Carry Adder (RCA) is shown. The worst case input combination generates a carry that propagates all the way through bit-1 to bit-N, and has a linear dependence on the number of bits, exhibiting O(n) complexity. When the inputs are random, the delay increases logarithmically. The delay is defined to be the time it takes for the network to reach the mode of the array for T=200 after getting quenched at t=0. Each point is an average of 500 trials with random initial conditions for an I0 = 1.5, and the mode of the array was exactly equal to the arithmetic sum of the inputs in each case. The worst-case inputs are A=0 . . . 000 and B=1 . . . 111 with an input carry (Cin) of 1. Results show a weak I0 dependence.

Delay of Ripple Carry Adder: Just as in CMOS-based Ripple Carry Adders, the delay of the p-bit based RCA is a function of the inputs A and B. In Fig. 12 we have systematically studied the worst-case delay of the p-bit based Ripple Carry Adder (RCA) as a function of increasing bit size. We selected a “worst-case” combination that results in a carry that needs to be propagated from bit 1 to bit N which results in a linear increase in the delay, exhibiting O(n) complexity with input size similar to CMOS implementations [58]. When the inputs are random, the delay seems to increase sub-linearly. The system is quenched at t=0 for different interaction parameters I0 and the delay is defined to be the time it takes for the system to settle to the mode of the array for T=200. An error check has been carried out separately to ensure the calculated sum (mode) is always exactly equal to the expected sum. For random inputs the delay of the 32-bit adder is close to 20 time steps, in accordance with the example shown in Fig. 11.
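The difference between worst-case and random inputs can be illustrated with the classical carry chain of a deterministic ripple-carry adder. This is only an analogy for the carry path, not a simulation of the p-bit dynamics; the function name, seed and trial count are choices of this sketch.

```python
import random

def longest_carry_chain(a, b, cin, n_bits):
    """Longest run of consecutive bit positions through which a carry is passed on."""
    carry, run, best = cin, 0, 0
    for i in range(n_bits):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        propagate = ai ^ bi                   # this stage forwards an incoming carry
        if propagate and carry:
            run += 1
            best = max(best, run)
        else:
            run = 0
        carry = (ai & bi) | (propagate & carry)
    return best

N = 32
# Worst case quoted in Fig. 12: A = 0...000, B = 1...111, input carry Cin = 1.
worst = longest_carry_chain(0, (1 << N) - 1, 1, N)

rng = random.Random(1)                        # fixed seed, for repeatability only
trials = 500
average = sum(
    longest_carry_chain(rng.getrandbits(N), rng.getrandbits(N), 0, N)
    for _ in range(trials)
) / trials

print("worst-case chain length:", worst)      # 32
print("average chain length for random inputs:", average)
```

In the worst case the carry has to traverse all N stages, whereas for random inputs the longest chain is typically much shorter than N.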
Digital accuracy AND logical invertibility: The strik-
ing combination of accuracy and invertibility is made
possible by our hybrid design, whereby the individual
Full Adders are Boltzmann Machines, even though their
connection is directed. Our 32-bit adder is more like a
collection of interacting particles than like a digital cir-
cuit as evident from Fig. 13a which shows a colormap
of the binary state of each of the 448 p-bits as a func-
tion of time with the interaction parameter I0 suddenly
increased from 0.25 to 5 at t0 = 50, thereby quenching
a “molten liquid” into a “solid”. Nevertheless it shows
the striking accuracy of a digital circuit, with S−A−B
exactly equal to zero in each of the 1000 trials as shown
in Fig. 13b. We do not expect a “molten liquid” to be
quenched into a “perfect crystal” every time. Instead,
we would expect a “solid full of defects” with different
non-zero values for S−A−B in each trial. That is exactly
what we get if the carry bits are bidirectional as in a fully
BM implementation (Fig. 13d).
Note however, that this digital accuracy is achieved
while maintaining the property of invertibility that is ab-
sent in digital circuits. Fig. 13 is not for direct mode
operation, but for the adder operating in reverse mode
as a subtractor. It might be expected that the directed
connection of carry bits from the less significant to the
more significant bit could lead to a loss of invertibility.
To investigate this point, we show the error S−A−B as a
function of trial number (Fig. 14) for four different modes
of operation with (i) A and B clamped (Addition), (ii) S
and A clamped (Subtraction), (iii) A, B and S for the 16
most significant bits (msb) clamped, and (iv) A, B and S
for the 16 least significant bits (lsb) clamped. The fully
bidirectional implementation shows very large errors for
all modes of operation. The directed implementation, on
the other hand, works perfectly for both the adder and
the subtractor modes. It also works if we clamp the least
significant bits, but not if we clamp the most significant
bits. This seems reasonable since we expect to be able
to control a flow by making changes upstream (lsb), but
not downstream (msb).
Partial directivity: So far in our examples we have
only considered fully directed (Jij = 2 J0, Jji = 0) or
fully bidirectional (Jij = J0, Jji = J0) carry bits when
connecting the individual Full Adders. In Fig. 15 we sys-
tematically analyze the effects of partial directivity in the
operation of a 32-bit adder. We observe that the 32-bit
adder operates correctly even when there is a large degree
of bidirectionality (Jji = Jij × 0.75) provided that the
system is allowed to run for a long time, T = 50000, in
stark contrast with the fully directed case that could re-
solve the right answer within T = 100, shown in Fig. 14b.
Decreasing the time steps systematically increases the error.
Increasing the correlation parameter while keeping T
constant also seems to adversely affect the bidirectional
designs, possibly because the system gets stuck in local
minima.
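The interpolation between the fully directed and the fully bidirectional limits, with the sum Jij + Jji held fixed as in Fig. 15, can be written as a small helper. The function name and the default J0 = 1 are hypothetical; only the two limiting cases and the fixed sum are taken from the text.

```python
def carry_link(ratio, J0=1.0):
    """Split a fixed total coupling 2*J0 into forward J_ij and backward J_ji = ratio * J_ij."""
    J_ij = 2.0 * J0 / (1.0 + ratio)
    J_ji = ratio * J_ij
    return J_ij, J_ji

print(carry_link(0.0))     # (2.0, 0.0)  fully directed
print(carry_link(1.0))     # (1.0, 1.0)  fully bidirectional
print(carry_link(0.75))    # 75% bidirectionality
```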
Directionality and computation time, 2 p-bit model :
The qualitative relation between I0, T and bidirection-
ality J12/J21 described above is derived from extensive
numerical simulations based on Eq. 1-2. However, the
broad features can be understood from a model involv-
ing just two p-bits, 1 and 2, with
$$h = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{and} \quad J = \begin{bmatrix} 0 & J_{12} \\ J_{21} & 0 \end{bmatrix}$$
FIG. 13. Accuracy of 32-bit adder, directed versus bidirectional: The results are shown for the adder operating in a subtractor mode, clamping one (random) 32-bit input (A) and a (random) 33-bit output (Cout+ S), and observing the other 32-bit input B which should provide the difference S−A. (a): Colormap of the binary state of each of the 448 p-bits comprising the directed adder as a function of time with the interaction parameter I0 suddenly increased from 0.25 to 5 at t0=50. For low values of I0 at t<50, the collection of p-bits is like a molten liquid which is quenched at t0 = 50 into a solid. (b) Surprisingly this solid corresponds to a “perfect crystal” in each of the 1000 trial experiments, with S−A−B exactly equal to zero (Dark blue). (c) Same as (a) but for a bidirectional adder. Here too the “liquid” quenches to a solid at t0 = 50, but in this case the resulting “solid” is full of defects (with hardly any zeros), with S−A−B ≠ 0, yielding a different wrong result for each trial as evident from (d). For (c) and (d) the colorbar is modified to have a dark blue color corresponding to exactly zero. S,A,B are taken to be the statistical mode of the 100×1 array obtained at the end of each trial.

It is straightforward to write a master equation describing the time evolution of the probabilities of different configurations:

$$\frac{d}{dt} \begin{bmatrix} P_{11} \\ P_{10} \\ P_{01} \\ P_{00} \end{bmatrix} = [W] \begin{bmatrix} P_{11} \\ P_{10} \\ P_{01} \\ P_{00} \end{bmatrix}$$

W being the transition matrix [20], P00 representing the probability of both p-bits being −1, P11 both being +1, and so on. We can write two matrices W1 and W2 describing the updating of p-bits 1 and 2 respectively:

$$W_1 = \begin{array}{c|cccc}
(1,2) & (11) & (10) & (01) & (00) \\ \hline
(11) & p & 0 & p & 0 \\
(10) & 0 & \bar{p} & 0 & \bar{p} \\
(01) & \bar{p} & 0 & \bar{p} & 0 \\
(00) & 0 & p & 0 & p
\end{array}$$

$$W_2 = \begin{array}{c|cccc}
(1,2) & (11) & (10) & (01) & (00) \\ \hline
(11) & q & q & 0 & 0 \\
(10) & \bar{q} & \bar{q} & 0 & 0 \\
(01) & 0 & 0 & \bar{q} & \bar{q} \\
(00) & 0 & 0 & q & q
\end{array}$$

where W(i, j) represents the probability that state (j) makes a transition to state (i), and p̄ = 1 − p, q̄ = 1 − q. p and q are obtained from Eq. 1-2:

$$p = \tfrac{1}{2}\left(1 + \tanh\left(I_0 (J_{12} + h_1)\right)\right) = \tfrac{1}{2}\left(1 + \tanh(I_0 J_{12})\right)$$

$$q = \tfrac{1}{2}\left(1 + \tanh\left(I_0 (J_{21} + h_2)\right)\right) = \tfrac{1}{2}\left(1 + \tanh(I_0 J_{21})\right)$$
The overall transition matrix W is given by W2 × W1
or W1 × W2 depending on which bit is updated first. Either
way the matrix W has four eigenvalues λ1 = 1, λ2 =
0, λ3 = 0 and λ4 = (2p − 1)(2q − 1) = tanh(I0J12) ×
tanh(I0J21) and the corresponding eigenvectors evolve
with time as ∼ λ^T.
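The eigenvalue statement can be checked numerically with a short Python sketch. The entries p̄ = 1 − p and q̄ = 1 − q are placed so that every column of W1 and W2 sums to one, which is an assumption of this sketch about where the complementary probabilities sit. Since W1 has two pairs of identical columns, W has rank at most two, so the two non-zero eigenvalues must be 1 and trace(W) − 1.

```python
import math

def transition_matrices(I0, J12, J21):
    """W1 and W2 for states ordered (11), (10), (01), (00); columns = from, rows = to."""
    p = 0.5 * (1.0 + math.tanh(I0 * J12))    # P(m1 -> +1) when m2 = +1
    q = 0.5 * (1.0 + math.tanh(I0 * J21))    # P(m2 -> +1) when m1 = +1
    pb, qb = 1.0 - p, 1.0 - q
    W1 = [[p, 0, p, 0], [0, pb, 0, pb], [pb, 0, pb, 0], [0, p, 0, p]]
    W2 = [[q, q, 0, 0], [qb, qb, 0, 0], [0, 0, qb, qb], [0, 0, q, q]]
    return W1, W2

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def trace(X):
    return sum(X[i][i] for i in range(4))

I0, J12, J21 = 1.0, 1.0, 1.0                 # illustrative values
W1, W2 = transition_matrices(I0, J12, J21)
W = matmul(W2, W1)                           # update p-bit 1 first, then p-bit 2

lam4 = math.tanh(I0 * J12) * math.tanh(I0 * J21)
col_sums = [sum(W[i][j] for i in range(4)) for j in range(4)]

print("column sums:", col_sums)              # each close to 1 (probability is conserved)
print("trace(W) - 1 =", trace(W) - 1.0)      # equals lambda_4
print("lambda_4     =", lam4)
```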
The components corresponding to λ=0 decay instan-
taneously while the eigenvector corresponding to λ=1 is
the stationary result representing the correct solution.
But for the system to reach this state, we have to wait
for the fourth eigenvector corresponding to λ4 to decay
sufficiently. A fully directed network has J21 =0, so that
λ4 = 0 and the system quickly reaches the correct solu-
tion. But in a bidirectional network with J12 = J21, the
fourth eigenvalue can be quite close to one, especially for
large I0, and can take an exponentially long time to decay, as
λ^T = exp(T ln λ) ≈ exp(−T (1 − λ)) when λ is close to
1.
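The dependence of the waiting time on I0 and on directionality can be made concrete by asking how many update passes T are needed before λ4^T falls below a chosen tolerance. The tolerance `eps` and the function name are assumptions of this sketch.

```python
import math

def steps_to_decay(I0, J12, J21, eps=1e-3):
    """Smallest T with |lambda_4|**T <= eps, where lambda_4 = tanh(I0*J12) * tanh(I0*J21)."""
    lam = abs(math.tanh(I0 * J12) * math.tanh(I0 * J21))
    if lam == 0.0:
        return 1                              # slow mode removed after a single pass
    if lam >= 1.0:
        return math.inf                       # tanh saturated in floating point
    return max(1, math.ceil(math.log(eps) / math.log(lam)))

print(steps_to_decay(1.0, 1.0, 0.0))          # fully directed: 1
for I0 in (1.0, 2.0, 3.0):
    print(I0, steps_to_decay(I0, 1.0, 1.0))   # bidirectional: grows rapidly with I0
```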
This 2 p-bit model provides some insight into our gen-
eral observation that directivity can be used to obtain
accurate answers quickly. However, depending on the
problem at hand it may be desirable to retain some de-
gree of bidirectionality, since full directivity does lead to
some loss of invertibility as seen for one set of inputs in
Fig. 14. An example of a partially directed p-bit network
is discussed in the next section.
FIG. 14. Invertibility of 32-bit adder, directed vs bidirectional: An adder that provides the sum S of two 32-bit
numbers A and B: S = A+B. The left panel shows the adder implemented with bidirectional carry bits, while the right panel
shows one with carry bits directed from the least significant to the most significant bit. Four different modes are shown with
(i) A and B clamped (Addition), (ii) S and A clamped (Subtraction), (iii) A, B and S for the 16 most significant bits (msb)
clamped, and (iv) A, B and S for the 16 least significant bits (lsb) clamped. Note that the bidirectional implementation shows
very large errors for all modes of operation. The directed implementation works perfectly for both the adder and the subtractor
modes. It also works if we clamp the least significant bits, but not if we clamp the most significant bits. Correlation parameter
I0 = 1, T = 100 steps for all trials. S,A,B are taken to be the mode (most frequent value) of the 100×1 array obtained at the
end of each trial. Clamped inputs are random 32-bit words for each trial, for a total of 1000 trials.
FIG. 15. Error versus bidirectionality: The degree of
bidirectionality Jji/Jij of the carry-out (j) to carry-in (i) link
between the Full Adders is systematically varied while keeping
the sum Jij + Jji constant. In each case the sum is obtained
from the statistical mode (or majority vote) of T time samples
over 50 trials. The y-axis shows the fraction of trials that yield
the wrong result. Note that for large I0 and small T , error-
free operation is obtained only if bidirectionality is close to
zero similar to standard digital circuits. But with I0 = 1.5
and T=50,000, error-free operation (at least for 50 trials) is
obtained even with ≈ 75% bidirectionality.
4-Bit Multiplier / Factorizer
Fig. 16 shows how the invertibility of PSL logic blocks
can be used to perform integer factorization using a multiplier
in reverse. Normally, the factorization problem requires
specific algorithms [59] to be performed in CMOS-like
hardware; here we simply use a digital 4-bit multiplier
working in reverse to achieve this operation.
Specifically with the output of the multiplier clamped
to a given integer from 0 to 15, the input bits float to
the correct factors. The interconnection strength I0 is
increased suddenly from 0 to 2 at t = t0 (Fig. 16) and the
input bits get locked to one of the possible solutions. For
example, when the output is set to 9, both inputs float
to 3. With the output set to 6, both inputs fluctuate
between two values, 2 and 3. Note that factors like 9 =
9×1 do not show up, since encoding 9 in binary requires
4 bits (1001) and the input terminals only have 2 bits.
We have checked other cases where factorizing 3 shows
both 3×1 and 1×3, and factorizing zero shows all possible
peaks since there are many solutions such that 0 = 0 ×
1, 2, 3 and so on.
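The set of input combinations that the clamped multiplier is expected to visit can be listed by direct enumeration. This is a classical check of which factor pairs fit into two 2-bit inputs, not a simulation of the p-bit network; the function name is illustrative.

```python
def factor_pairs(product, input_bits=2):
    """All ordered pairs (x, y) of input_bits-wide integers with x * y == product."""
    limit = 1 << input_bits
    return [(x, y) for x in range(limit) for y in range(limit) if x * y == product]

print(factor_pairs(9))        # [(3, 3)]
print(factor_pairs(6))        # [(2, 3), (3, 2)]
print(factor_pairs(3))        # [(1, 3), (3, 1)]
print(len(factor_pairs(0)))   # 7
```

The pair 9 × 1 is absent because 9 does not fit into a 2-bit input, in line with the remark above.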
FIG. 16. Factorization through inverse multiplication: The reversibility of PSL allows the operation of integer factorization using a binary multiplication circuit implemented using the principles of digital logic using AND gates and Full Adders (FA) as shown in (a). The output nodes of a 4-bit multiplier are clamped to a given integer, and the system produces the only consistent factors of the product at the input terminals, probabilistically. The interaction parameter I0 is suddenly increased to a saturation value of 2, and held constant as shown. (b) The output terminal is clamped to 9 and is factored into 3 × 3; note that 9 × 1 is not an achievable solution in this setup since encoding 9 requires 4-bit inputs in binary, whereas inputs are limited to 2 bits. (c) The output terminal is clamped to 6 and after being correlated, the factors cross-oscillate between 2 and 3. In both cases the histogram is obtained by counting outputs after t > ttotal/2 = 1.25 × 10^4 time steps to collect statistics after the system is thermalized.

We also kept the same directed connections between the Full Adders for the carry bits, making them a directed network of Boltzmann Machines, similar to the 32-bit Adder. Moreover, we kept a directed connection from the Full Adders to the AND gates as shown in Fig. 16a since the information needs to flow from the output to the input in the case of factorization. The input bits that
go to multiple AND gates are “tied” to each other with
a positive exchange (J > 0) value much like 2-spins in-
teracting ferromagnetically, however in PSL we envision
these interactions to be controlled purely electrically. In
this example, we have observed that the system is sensi-
tive to the relative strengths of couplings within the AND
gates and between the AND gates and the Full Adders
which can also depend on a chosen annealing profile.
The design of factorizers of practical relevance is be-
yond the scope of this paper. Our main purpose has been
to establish how the key feature of invertibility of p-bits
can be creatively used for different circuits with unique
functionalities. The demonstration of 4-bit factorization
through reverse multiplication is similar to memcomput-
ing [60] based on deterministic memristors. Note, how-
ever, that the building blocks and operating principles
of stochastic p-bits and memcomputing [61] are very dif-
ferent and the only similarity noted here is the fact that
both approaches treat the input and output terminals on
an equal footing.
V. SUMMARY
It is generally believed that (1) probabilistic algorithms
can tackle specific problems much more efficiently than
classical algorithms [62], and that (2) probabilistic al-
gorithms can run far more efficiently on a probabilis-
tic computer than on a deterministic computer [62, 63].
As such, it seems reasonable to expect that probabilis-
tic computers based on robust room temperature p-bits
could provide a practically useful solution to many chal-
lenging problems by rapidly sampling the phase space in
hardware.
In this paper we have presented a framework for us-
ing probabilistic units or “p-bits” as a building block for
a probabilistic spin logic (PSL) which is used to imple-
ment precise Boolean logic with an accuracy comparable
to standard digital circuits, while exhibiting the unique
property of invertibility that is unknown in deterministic
circuits. Specifically we have:
• presented an implementation based on stochastic
nanomagnets to illustrate the importance of three-
terminal building blocks in the construction of large
scale correlated networks of p-bits. We emphasize
that this is just one of several possible implementations
(Section II).
• presented an algorithm for implementing Boolean
gates as BM with relatively sparse and quantized
J-matrix elements, benchmarked their operation
against the Boltzmann law, and established their
capability to perform not just direct functions but
also their inverse (Section III), and
• presented a 32-bit adder implemented as a hybrid
BM that achieves digital accuracy over a broad
combination of the interaction parameter I0, di-
rectionality and the number of samples T . This
striking accuracy is reminiscent of digital circuits,
but it is achieved while preserving a certain degree
of invertibility which is absent in digital circuits.
The accuracy is particularly surprising with high
degrees of bidirectionality (J12 = 0.75×J21) where
the system is picking out the one correct answer out
of nearly 2^33 ≈ 8.6 billion possibilities. This may require
a larger number of time samples, but these
could be collected rapidly at GHz rates. (Sec-
tion IV).
We hope these findings will help emphasize a new direc-
tion for the field of spintronic and nanomagnetic logic
by shifting the focus from stable high barrier magnets to
stochastic, low barrier magnets, while inspiring a search
for other possible physical implementations of p-bits.
ACKNOWLEDGMENTS
It is a pleasure to acknowledge many helpful discus-
sions with Behtash Behin-Aein (Globalfoundries) and
Ernesto E. Marinero (Purdue University). We thank
Jaijeet Roychowdhury (UC Berkeley) for suggesting the
term “invertible”. This work was supported in part by
C-SPIN, one of six centers of STARnet, a Semiconductor
Research Corporation program, sponsored by MARCO
and DARPA, in part by the Nanoelectronics Research
Initiative through the Institute for Nanoelectronics Dis-
covery and Exploration (INDEX) Center, and in part
by the National Science Foundation through the NCN-
NEEDS program, contract 1227020-EEC.
[1] L Lopez-Diaz, L Torres, and E Moro, “Transition from
ferromagnetism to superparamagnetism on the nanosec-
ond time scale,” Physical Review B 65, 224406 (2002).
[2] Krishna Palem and Avinash Lingamneni, “Ten years of
building broken chips: the physics and engineering of
inexact computing,” ACM Transactions on Embedded
Computing Systems (TECS) 12, 87 (2013).
[3] Suresh Cheemalavagu, Pinar Korkmaz, Krishna V
Palem, Bilge ES Akgul, and Lakshmi N Chakrapani, “A
probabilistic cmos switch and its realization by exploiting
noise,” in IFIP International Conference on VLSI (2005)
pp. 535–541.
[4] Akio Fukushima, Takayuki Seki, Kay Yakushiji, Hitoshi
Kubota, Hiroshi Imamura, Shinji Yuasa, and Koji Ando,
“Spin dice: A scalable truly random number generator
based on spintronics,” Applied Physics Express 7, 083001
(2014).
[5] Won Ho Choi, Yang Lv, Jongyeon Kim, Abhishek Desh-
pande, Gyuseong Kang, Jian-Ping Wang, and Chris H
Kim, “A magnetic tunnel junction based true random
number generator with conditional perturb and real-time
output probability tracking,” in Electron Devices Meet-
ing (IEDM), 2014 IEEE International (IEEE, 2014) pp.
12–5.
[6] Julie Grollier, Damien Querlioz, and Mark D Stiles,
“Spintronic nanodevices for bioinspired computing,” Pro-
ceedings of the IEEE 104, 2024–2039 (2016).
[7] J. Roychowdhury, private communication.
[8] For an example of the use of “invertible relations”, see
Ran Canetti and Mayank Varia, “Non-malleable obfusca-
tion,” in Theory of Cryptography Conference (Springer,
2009) pp. 73–90.
[9] Behtash Behin-Aein, Vinh Diep, and Supriyo Datta, “A
building block for hardware belief networks,” Scientific
Reports 6, 29893 (2016).
[10] Y. Shim, A. Jaiswal, and K. Roy, Journal of Applied
Physics 121, 193902 (2017).
[11] B. Sutton, K. Y. Camsari, B. Behin-Aein, and S. Datta,
Scientific Reports 7 (2017).
[12] J Joshua Yang, Dmitri B Strukov, and Duncan R Stew-
art, “Memristive devices for computing,” Nature nan-
otechnology 8, 13–24 (2013).
[13] Vinh Quang Diep, Brian Sutton, Behtash Behin-Aein,
and Supriyo Datta, “Spin switches for compact imple-
mentation of neuron and synapse,” Applied Physics Let-
ters 104, 222405 (2014).
[14] Abhronil Sengupta, Yong Shim, and Kaushik Roy, “Pro-
posal for an all-spin artificial neural network: Emulating
neural and synaptic functionalities through domain wall
motion in ferromagnets,” IEEE Transactions on Biomed-
ical Circuits and Systems (2016).
[15] Masanao Yamaoka, Chihiro Yoshimura, Masato Hayashi,
Takuya Okuyama, Hidetaka Aoki, and Hiroyuki Mizuno,
“Ising computer,” Hitachi Review 65, 157 (2016).
[16] David H Ackley, Geoffrey E Hinton, and Terrence J Se-
jnowski, “A learning algorithm for boltzmann machines,”
Cognitive science 9, 147–169 (1985).
[17] Masanao Yamaoka, Chihiro Yoshimura, Masato Hayashi,
Takuya Okuyama, Hidetaka Aoki, and Hiroyuki Mizuno,
“24.3 20k-spin ising chip for combinational optimization
problem with cmos annealing,” in 2015 IEEE Interna-
tional Solid-State Circuits Conference-(ISSCC) Digest of
Technical Papers (IEEE, 2015) pp. 1–3.
[18] Takahiro Inagaki, Kensuke Inaba, Ryan Hamerly, Kyo
Inoue, Yoshihisa Yamamoto, and Hiroki Takesue,
“Large-scale ising spin network based on degenerate op-
tical parametric oscillators,” Nature Photonics (2016).
[19] Ruslan Salakhutdinov, Andriy Mnih, and Geoffrey Hin-
ton, “Restricted boltzmann machines for collaborative
filtering,” in Proceedings of the 24th international con-
ference on Machine learning (ACM, 2007) pp. 791–798.
[20] Daniel J Amit, Modeling brain function: The world of
attractor neural networks (Cambridge University Press,
1992).
[21] Dingzhu Du, Jun Gu, Panos M Pardalos, et al., Satisfia-
bility problem: theory and applications: DIMACS Work-
shop, March 11-13, 1996, Vol. 35 (American Mathemat-
ical Soc., 1997).
[22] “Predictive Technology Model (PTM)
(http://ptm.asu.edu/)”.
[23] B. Sutton, K. Y. Camsari, R. Faria, and S. Datta,
“Probabilistic spin logic simulator,”
http://dx.doi.org/doi:10.4231/D3C24QP4B (2017).
[24] Luqiao Liu, Chi-Feng Pai, Y Li, HW Tseng, DC Ralph,
and RA Buhrman, “Spin-torque switching with the gi-
ant spin hall effect of tantalum,” Science 336, 555–558
(2012).
[25] Nicolas Locatelli, Alice Mizrahi, A Accioly, Rie Mat-
sumoto, Akio Fukushima, Hitoshi Kubota, Shinji Yuasa,
Vincent Cros, Luis Gustavo Pereira, Damien Querlioz,
et al., “Noise-enhanced synchronization of stochastic
magnetic oscillators,” Physical Review Applied 2, 034009
(2014).
[26] Arno van den Brink, Guus Vermijs, Aurélie Solignac,
Jungwoo Koo, Jurgen T Kohlhepp, Henk JM Swagten,
and Bert Koopmans, “Field-free magnetization reversal
by spin-hall effect and exchange bias,” Nature communi-
cations 7 (2016).
[27] Yong-Chang Lau, Davide Betto, Karsten Rode, JMD
Coey, and Plamen Stamenov, “Spin–orbit torque switch-
ing without an external field using interlayer exchange
coupling,” Nature nanotechnology (2016).
[28] Angeline Klemm Smith, Mahdi Jamali, Zhengyang Zhao,
and Jian-Ping Wang, “External field free spin hall effect
device for perpendicular magnetization reversal using a
composite structure with biasing layer,” arXiv preprint
arXiv:1603.09624 (2016).
[29] Shunsuke Fukami, Chaoliang Zhang, Samik DuttaGupta,
Aleksandr Kurenkov, and Hideo Ohno, “Magnetization
switching by spin-orbit torque in an antiferromagnet-
ferromagnet bilayer system,” Nature materials (2016).
[30] JT Heron, JL Bosse, Q He, Y Gao, M Trassin, L Ye,
JD Clarkson, C Wang, Jian Liu, S Salahuddin, et al.,
“Deterministic switching of ferromagnetism at room tem-
perature using an electric field,” Nature 516, 370–373
(2014).
[31] Sasikanth Manipatruni, Dmitri E Nikonov, and Ian A
Young, “Spin-orbit logic with magnetoelectric nodes: A
scalable charge mediated nonvolatile spintronic logic,”
arXiv preprint arXiv:1512.05428 (2015).
[32] Roger H Koch, G Grinstein, GA Keefe, Yu Lu, PL Trouil-
loud, WJ Gallagher, and SSP Parkin, “Thermally as-
sisted magnetization reversal in submicron-sized mag-
netic thin films,” Physical review letters 84, 5419 (2000).
[33] Sergei Urazhdin, Norman O Birge, WP Pratt Jr,
and J Bass, “Current-driven magnetic excitations in
permalloy-based multilayer nanopillars,” Physical review
letters 91, 146803 (2003).
[34] IN Krivorotov, NC Emley, AGF Garcia, JC Sankey,
SI Kiselev, DC Ralph, and RA Buhrman, “Temperature
dependence of spin-transfer-induced switching of nano-
magnets,” Physical review letters 93, 166603 (2004).
[35] AV Khvalkovskiy, D Apalkov, S Watts, R Chepulskii,
RS Beach, A Ong, X Tang, A Driskill-Smith, WH But-
ler, PB Visscher, et al., “Basic principles of stt-mram
cell operation in memory arrays,” Journal of Physics D:
Applied Physics 46, 074001 (2013).
[36] RP Cowburn, “Property variation with shape in mag-
netic nanoelements,” Journal of Physics D: Applied
Physics 33, R1 (2000).
[37] Punyashloka Debashis, Rafatul Faria, Kerem Y Camsari,
Joerg Appenzeller, Supriyo Datta, and Zhihong Chen,
“Experimental demonstration of nanomagnet networks
as hardware for ising computing,” in Electron Devices
Meeting (IEDM), 2016 IEEE International (IEEE, 2016)
pp. 34–3.
[38] Kerem Yunus Camsari, Samiran Ganguly, and Supriyo
Datta, “Modular approach to spintronics,” Scientific Re-
ports 5 (2015).
[39] Luqiao Liu, Takahiro Moriyama, D. C. Ralph, and R. A.
Buhrman, “Spin-torque ferromagnetic resonance induced
by the spin hall effect,” Phys. Rev. Lett. 106, 036601
(2011).
[40] Seokmin Hong, Shehrin Sayed, and Supriyo Datta, “Spin
circuit representation for the spin hall effect,” IEEE
Transactions on Nanotechnology 15, 225–236 (2016).
[41] Supriyo Datta, Sayeef Salahuddin, and Behtash Behin-
Aein, “Non-volatile spin switch for boolean and non-
boolean logic,” Applied Physics Letters 101, 252411
(2012).
[42] William H Butler, Tim Mewes, Claudia KA Mewes,
PB Visscher, William H Rippard, Stephen E Russek, and
Ranko Heindl, “Switching distributions for perpendicu-
lar spin-torque devices within the macrospin approxima-
tion,” IEEE Transactions on Magnetics 48, 4684–4700
(2012).
[43] Andrew D Kent and Daniel C Worledge, “A new spin
on magnetic memories,” Nature nanotechnology 10, 187–
191 (2015).
[44] Daniel Morris, David Bromberg, Jian-Gang Jimmy Zhu,
and Larry Pileggi, “mlogic: Ultra-low voltage non-
volatile logic circuits using stt-mtj devices,” in Proceed-
ings of the 49th Annual Design Automation Conference
(ACM, 2012) pp. 486–491.
[45] Deepanjan Datta, Behtash Behin-Aein, Supriyo Datta,
and Sayeef Salahuddin, “Voltage asymmetry of spin-
transfer torques,” IEEE Transactions on Nanotechnology
11, 261–272 (2012).
[46] JD Biamonte, “Nonperturbative k-body to two-body
commuting conversion hamiltonians and embedding
problem instances into ising spins,” Physical Review A
77, 052331 (2008).
[47] Kai-Uwe Demasius, Timothy Phung, Weifeng Zhang,
Brian P Hughes, See-Hun Yang, Andrew Kellock, Wei
Han, Aakash Pushp, and Stuart SP Parkin, “Enhanced
spin-orbit torques by oxygen incorporation in tungsten
films,” Nature communications 7 (2016).
[48] Chi-Feng Pai, Luqiao Liu, Y Li, HW Tseng, DC Ralph,
and RA Buhrman, “Spin transfer torque devices utilizing
the giant spin hall effect of tungsten,” Applied Physics
Letters 101, 122404 (2012).
[49] Qiang Hao and Gang Xiao, “Giant spin hall effect and
switching induced by spin-transfer torque in a
W/Co40Fe40B20/MgO structure with perpendicular magnetic
anisotropy,” Physical Review Applied 3, 034009 (2015).
[50] Terrence J Sejnowski, Paul K Kienker, and Geoffrey E
Hinton, “Learning symmetry groups with hidden units:
Beyond the perceptron,” Physica D: Nonlinear
Phenomena 22, 260–275 (1986).
[51] S Patarnello and P Carnevali, “Learning networks of neu-
rons with boolean logic,” EPL (Europhysics Letters) 4,
503 (1987).
[52] L Personnaz, I Guyon, and G Dreyfus, “Collective com-
putational properties of neural networks: New learning
mechanisms,” Physical Review A 34, 4217 (1986).
[53] Sreeram VB Aiyer, Mahesan Niranjan, and Frank Fall-
side, “A theoretical investigation into the performance of
the hopfield model,” IEEE Transactions on Neural Net-
works 1, 204–215 (1990).
[54] Hideyuki Suzuki, Jun-ichi Imura, Yoshihiko Horio, and
Kazuyuki Aihara, “Chaotic boltzmann machines,” Scien-
tific reports 3, 1610 (2013).
[55] G. E. Hinton, “Boltzmann machine,” Scholarpedia 2,
1668 (2007), revision #91075.
[56] John J Hopfield, “Neural networks and physical systems
with emergent collective computational abilities,” Pro-
ceedings of the national academy of sciences 79, 2554–
2558 (1982).
[57] Jianhua Liu, Shuo Zhou, Haikun Zhu, and Chung-Kuan
Cheng, “An algorithmic approach for generic parallel
adders,” in Proceedings of the 2003 IEEE/ACM interna-
tional conference on Computer-aided design (IEEE Com-
puter Society, 2003) p. 734.
[58] R Uma, Vidya Vijayan, M Mohanapriya, and Sharon
Paul, “Area, delay and power comparison of adder
topologies,” International Journal of VLSI Design &
Communication Systems 3, 153 (2012).
[59] Donald E Knuth and Luis Trabb Pardo, “Analysis of
a simple factorization algorithm,” Theoretical Computer
Science 3, 321–348 (1976).
[60] Fabio L. Traversa and Massimiliano Di Ventra,
“Polynomial-time solution of prime factorization and
np-complete problems with digital memcomputing ma-
chines,” Chaos: An Interdisciplinary Journal of Nonlin-
ear Science 27, 023107 (2017).
[61] Massimiliano Di Ventra, Fabio L Traversa, and Igor V
Ovchinnikov, “Topological field theory and comput-
ing with instantons,” arXiv preprint arXiv:1609.03230
(2016).
[62] Artur Ekert and Richard Jozsa, “Quantum computa-
tion and shor’s factoring algorithm,” Reviews of Modern
Physics 68, 733 (1996).
[63] Richard P Feynman, “Simulating physics with comput-
ers,” International journal of theoretical physics 21, 467–
488 (1982).
