Spin-Hall MTJ Cells for Intra-Column Competition in Hierarchical
  Temporal Memory by Stephan, Andrew W. & Koester, Steven J.
GENERIC COLORIZED JOURNAL, VOL. XX, NO. XX, XXXX 2019 1
Spin-Hall MTJ Cells for Intra-Column
Competition in Hierarchical Temporal Memory
Andrew W. Stephan and Steven J. Koester, Fellow, IEEE
Abstract— We propose a dedicated winner-take-all cir-
cuit to efficiently implement the intra-column competition
between cells in Hierarchical Temporal Memory, which is
a crucial part of the algorithm. All inputs and outputs are
charge-based for compatibility with standard CMOS. The
circuit incorporates memristors for competitive advantage
to emulate a column with a cell in a predictive state. The
circuit can also detect columns ’bursting’ by passive av-
eraging and comparison of the cell outputs. The proposed
spintronic devices and circuit are thoroughly described and
a series of simulations is used to predict the performance.
The simulations indicate that the circuit can complete a
nine-cell, nine-input competition operation in under 15 ns
at a cost of about 25 pJ.
Index Terms— Hierarchical Temporal Memory, Neuromor-
phic Computing, Spintronics, Spin Hall, Magnetic Tunnel
Junction.
I. INTRODUCTION
Hierarchical Temporal Memory (HTM) is an emerging
neuromorphic algorithm inspired by the structural properties
of the neocortex [2]. HTM boasts powerful recognition and
prediction abilities [3], [4]. The conceptual architecture is
fairly complex, with many different functions required to
implement it. A comprehensive processor architecture has
been proposed for this purpose [5], [6]. HTM consists of
two primary components, the spatial pooler and the temporal
memory. The spatial pooler consists of a set of columns with
proximal connections to the input space. The input space is a
sparse distributed representation (SDR) of data in the form
of a binary matrix. Each column activates upon receiving
input which exceeds its threshold value. The temporal memory
portion of HTM divides each column into multiple cells that
share the same proximal connections and compete with one
another to represent the column. Axial connections between
cells in different columns can give some cells a competitive
advantage, but the proximal connections must still be solely
responsible for surpassing the threshold.
Implementing the full HTM structure in hardware is energy-
expensive due to its complexity. The inclusion of dedicated
circuitry capable of efficiently performing specific HTM-
related tasks can reduce this load. This provides motivation
to design a variable-threshold analog winner-take-all (WTA)
circuit with competitive advantage. In this work we propose an
efficient spintronic implementation of a WTA circuit meant to
emulate the cells within an HTM column. The circuit includes
an option for competitive advantage so that certain cells can
be biased to win even when all of the competitors receive
the same input, which emulates the ’predictive state’ of HTM.
We also consider how to detect a column ’bursting’ which
indicates that the threshold was exceeded but no cells were in a
predictive state. In the following sections we will describe the
device and circuit design, analyze its performance and explain
our simulation methodology.
Manuscript submitted July 20, 2020. This work was supported by
Seagate Technology PLC.
In part the purpose of this work is to study the effect of the
competitive advantage term on the operation of the spintronic
WTA circuit, and determine which values should be used.
This knowledge will guide future efforts, especially the choice
of memristive devices needed to induce the advantage. This
work does not deal with the overall HTM architecture, but
focuses specifically on the proposal for an efficient implementation
of individual columns. Although spintronic elements
are involved, the inputs and outputs are voltage-based, which
allows the columns to be paired with any other charge-based
implementation of the other HTM functions to emulate a full
HTM architecture. We use beyond-CMOS methods to develop
a WTA circuit in this work to explore the potential advantages
conferred by the inherently analog and non-linear nature of
certain spintronic devices. As will be discussed below, we
determine that the transition from CMOS to spintronic WTA
implementations results in a tradeoff between delay and energy
cost.
II. DESIGN
A. Spintronic Cell Design
The cell is based on a well-known device, the spin-Hall-
effect (SHE) driven magnetic tunnel junction (MTJ) voltage
divider [7]–[11]. The particular version of SHE-MTJ we use
is derived from [1], as the analog WTA circuit requires non-
digital behavior from the MTJs. The MTJ free layer (FL) is in
contact with a heavy metal (HM) which, when charge flows
through it, produces a spin-polarized current which can be
used to drive the FL. The conductance GMTJ of the MTJ
varies with the relative angle φ of the FL magnetization to the
pinned layer (PL) as
$$G_{MTJ} = \frac{1}{2}(G_P + G_{AP}) + \frac{1}{2}(G_P - G_{AP})\cos\phi, \qquad (1)$$
where GP and GAP are the conductance when the FL is
parallel or antiparallel to the pinned layer, respectively. The
output potential of the voltage divider and ultimately that of
the attached inverter [12] thus depend on φ. The equations
governing the cell dynamics will be covered in more detail
arXiv:2007.08659v1 [cs.ET] 16 Jul 2020
Fig. 1. (a) Basic cell design including MTJ, reference resistor and
output inverter. (b) Output potential vs. input current in steady-state. Two
different FL geometries are considered, with a step-like transition and a
smooth transition.
in Section IV. The attached inverter gives the cell a stable
voltage-based read path that avoids perturbing the voltage
divider (see Fig. 1). As in [1], we choose the anisotropic
characteristics of the MTJ to generate a smooth linear response
loop, in this case by utilizing shape anisotropy. The MTJ FL
width dimension is shorter than the length dimension, creating
a smooth transition due to the demagnetization field. We assign
the PL orientation such that a negative current produces a spin
torque in the parallel direction while positive current drives the
FL in the antiparallel direction.
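The angular dependence in (1) and its effect on the voltage divider can be illustrated with a short numerical sketch. The device values below (R_P, the reference resistor, and the reading of the TMR ratio as (R_AP − R_P)/R_P) are illustrative assumptions, as is the placement of the MTJ on the V_S2 side of the divider; none of these are specified in the text.

```python
import math

def mtj_conductance(phi, g_p, g_ap):
    """Eq. (1): MTJ conductance for a relative FL/PL angle phi (radians)."""
    return 0.5 * (g_p + g_ap) + 0.5 * (g_p - g_ap) * math.cos(phi)

def divider_output(phi, g_p, g_ap, r_ref, v_s1=-0.5, v_s2=0.4):
    """Midpoint potential of the MTJ / reference-resistor divider.

    Assumption (not stated in the text): the MTJ sits between V_S2 and the
    output node, and the reference resistor between the node and V_S1.
    """
    r_mtj = 1.0 / mtj_conductance(phi, g_p, g_ap)
    return v_s1 + (v_s2 - v_s1) * r_ref / (r_ref + r_mtj)

# Hypothetical device values: R_P = 4 kOhm, and TMR = 1.5 read as
# (R_AP - R_P) / R_P, so that R_AP = 2.5 * R_P = 10 kOhm.
R_P = 4.0e3
R_AP = R_P * (1.0 + 1.5)
G_P, G_AP = 1.0 / R_P, 1.0 / R_AP
R_REF = 6.0e3  # hypothetical value inside the 3.25 - 140 kOhm range of Table I

for phi_deg in (0, 45, 90, 135, 180):
    phi = math.radians(phi_deg)
    print(phi_deg, mtj_conductance(phi, G_P, G_AP),
          divider_output(phi, G_P, G_AP, R_REF))
```

Under these assumptions the divider output moves monotonically between its parallel and antiparallel values as φ sweeps from 0 to π, which is the analog behavior the cell relies on.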
B. WTA Circuit Design
An HTM column consists of multiple cells that receive
the same proximal input and compete in a WTA fashion to
represent the column if the input exceeds some threshold. The
threshold may be tuned by an input bias. In this work we
draw much of the WTA cell design from [1]. The workings
of individual cells are studied in detail in that work. For
this application, we simplify the pooler circuit by removing
the second stage of each activation pair because the neural
activation function is not needed. We assume the input space
takes the form of a set of voltage sources. Current is provided
to the cells by a memristor crossbar array joining the cells with
the input space. An example is shown in Fig. 2. The precise
nature of the memristors is not treated here but many examples
exist in the literature including filamentary, MTJ-based and
ferroelectric memristors [14]–[17], any of which would be
suitable for this purpose. Some architectures also incorporate
an intermediary device which reads the input and transmits a
corresponding signal to the cells. Besides the current from the
proximal connections, each cell has an additional connection
from each of the output inverters of its neighbors. The result
is an inhibitory connection between each cell that induces
more negative torque in the receiver cell as the magnetization
of the source cell becomes more positive. The strength of
the inhibitory connections depends on the conductance of the
memristor joining each output inverter to the HM input. A
higher conductance gives the source cell a competitive advan-
tage, which is measured as the ratio of the conductance to that
of the other cells. The complete WTA circuit design is shown
Fig. 2. A set of inputs connected to the column via an array
of memristors. If the conductances in each row of the crossbar are
identical, each cell receives the same net input as is standard in HTM.
in Fig. 3, where the crossbars are represented as simple current
sources and the inverters are represented by the standard circuit
symbol for brevity. The operating parameters are carefully
chosen such that the low sensing potential VS1 on the voltage
divider and the inverter low rail potential −VDD are matched.
The result is that the inhibitory output connections cease to
provide current when the source cell magnetization reaches
the −1 state. This allows the cells to achieve an equilibrium
by balancing the excitatory proximal connections with the
intra-column inhibitory connections. Example results for four
different cases are given in Fig. 4. If all cells compete equally
when receiving a negative input current sufficiently large to
excite them, they all achieve a similar steady-state output
below the value expected according to the proximal input but
above the minimum output. Alternatively if the competition is
uneven due to a certain cell being in a predictive state, that cell
drives the others to the minimum state while itself achieving
a higher state. This result is achieved by giving the predictive
cell a higher output conductance on its inhibitory connections.
A detailed breakdown of the WTA circuit performance is given
in Section III.
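The qualitative behavior described in this subsection (no response below threshold, a shared above-minimum equilibrium under equal competition, and suppression of the competitors when one cell has an advantage) can be reproduced with a toy lateral-inhibition model. The sketch below is not the LLG-based circuit model of this work: it is a generic rate model with hypothetical parameters (`w`, `tau`, normalized outputs in [0, 1]), and it uses a positive `u` for excitation, whereas the circuit is excited by negative current.

```python
def simulate_column(u, advantage=1.0, n_cells=9, w=0.9, tau=1.0,
                    dt=0.01, steps=5000):
    """Toy rate model of intra-column competition (not the paper's LLG model).

    Each normalized output y_i in [0, 1] relaxes toward its clipped net drive:
        tau * dy_i/dt = -y_i + clip(u - sum_{j != i} w_j * y_j, 0, 1)
    Cell 0 plays the predictive cell: its outgoing inhibitory weight is
    advantage * w, mirroring the conductance ratio described in the text.
    """
    weights = [advantage * w] + [w] * (n_cells - 1)
    y = [0.0] * n_cells
    for _ in range(steps):
        total = sum(wj * yj for wj, yj in zip(weights, y))
        new_y = []
        for wi, yi in zip(weights, y):
            drive = u - (total - wi * yi)      # inhibition from all other cells
            drive = min(1.0, max(0.0, drive))  # outputs saturate at both ends
            new_y.append(yi + dt / tau * (drive - yi))  # forward-Euler update
        y = new_y
    return y

below_threshold = simulate_column(-0.2)           # all cells stay at minimum
equal_terms = simulate_column(0.5)                # shared partial excitation
with_advantage = simulate_column(0.5, advantage=1.6)  # cell 0 suppresses the rest
```

With equal weights the cells settle at the common value u / (1 + (n_cells − 1)·w), below the value the input alone would produce; with an advantage on cell 0 the symmetric state becomes unstable and the other cells are driven to the minimum.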
III. RESULTS
There are two crucial questions that determine the success
of this WTA circuit in its intended function for the HTM
architecture. The first is the question of whether it can emulate
a predictive state via competitive advantage for one cell. The
second is the question of whether it can emulate a column
bursting. This situation occurs in an HTM when the proximal
input is large but none of the cells is in a predictive state.
The discussion below resulted from a series of simulations
of the full 9-cell circuit using a custom simulator written
in Matlab, which includes empirical approximations of the
inverter behavior based on previous HSPICE simulations [1].
STEPHAN et al.: SPIN-HALL MTJ CELLS FOR INTRA-COLUMN COMPETITION IN HIERARCHICAL TEMPORAL MEMORY 3
Fig. 3. Winner-take-all circuit design. A column consists of multiple
cells, each of which receives the same proximal input current. Each cell
also receives an inhibitory current from each of the other cells.
Fig. 4. Outcomes for four different basic cases. Each case assumes a 9-
cell column competing with identical inputs. When the input is insufficient
to excite the cells, all cells quickly reach a -0.5 V output and remain
there. When the input is sufficient and the cells compete on equal terms,
all cells achieve an equilibrium with one another at an above-minimum
output. When the input is sufficient and the cells compete on unequal
terms, the cell with advantage quickly drives the others to the minimum
output.
A. Predictive State
To determine whether the WTA circuit can succeed in
emulating a predictive state, we performed a series of Monte-
Carlo simulations and averaged the results. Each 100-round
ensemble assumed a specific proximal input value and a com-
petitive advantage, encoded as the ratio between the inhibitory
output conductance of the predictive cell and that of the
other cells. In Fig. 5(a) we show the designated predictive
cell output vs. the input current. The larger the competitive
advantage is, the greater the winner output grows as a function
of input magnitude. Meanwhile in Fig. 5(b) we show that as
the competitive advantage grows, the other cell outputs shrink.
Fig. 6 shows directly the difference between the winner cell
and the mean of the others, as the other cells deviate very
Fig. 5. (a) Predictive cell output vs. input current. (b) Average of other
cell outputs vs. input current.
Fig. 6. Difference between the predictive cell output and the average
of the other cell outputs for various inputs as a function of competitive
advantage. The dashed line indicates the minimum required separation
of 70 mV.
little from one another. The dashed line indicates a difference
of 70 mV, which is the smallest signal difference which is
still large enough for a digital inverter to differentiate. With
a reference potential of VS1 + 35 mV, the inverter in Fig.
7 can produce outputs that differ by 500 mV using inputs
of VS1 and VS1 + 70 mV. Here we note that the choice of
competitive advantage can be used to determine the effective
input threshold for detection of a predictive state event. In
order to noticeably differentiate even at weak excitation inputs
such as +1 µA (see Fig. 1), a competitive advantage of at
least 1.6 is required. This yields sufficient output separation
to differentiate the winner cell from the others. Alternatively,
a low advantage of 1.1 can be chosen in order to enforce a
0 µA threshold because at this advantage level only negative
currents are shown to produce results which exceed VS1 + 70
mV. In general the current threshold which is enforced depends
on the magnitude of the competitive advantage used.
B. Bursting
To simulate a column bursting, we assume no cells are
in the predictive state, which corresponds to a competitive
advantage of 1. As shown in Fig. 4, the cells all behave
identically when the column bursts, reaching a steady state
slightly above the minimum output. This can be detected by
measuring at least two outputs. We assume the mean output of
Fig. 7. Inverter behavior based on HSPICE simulation with linear
approximation included.
Fig. 8. Difference between the column average output and the minimum
cell potential vs. input current. The dashed line indicates the minimum
required separation of 70 mV.
all cells is measured, and the column is determined to be bursting
if the average exceeds some threshold but no single cell has a
large output. To estimate the detection capability, we assume
that this average output must exceed VS1 by at least 70 mV,
sufficient for a digital inverter to differentiate the signals as
mentioned above. Fig. 8 shows the average potential difference
vs. input. We note that if the input is below 0 µA, which would
suffice to excite the cells if not for the inhibitory connections,
a passive averaging circuit can detect that the column is
bursting. We note that while VAvg−VS1 can be expected to exceed
70 mV in cases with a single predictive cell, a simple digital
comparison circuit can differentiate those cases from the ones
in which all cells are partially excited.
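The detection logic described in the last two subsections can be summarized in a few lines. This is a behavioral sketch only: the function name is hypothetical, and the criterion used for "no single cell has a large output" (the largest output exceeding the mean of the others by less than the 70 mV margin) is an assumption borrowed from the separation metric of Fig. 6. The example potentials are made up for illustration.

```python
from statistics import mean

def classify_column(outputs, v_s1=-0.5, margin=0.070):
    """Classify a column from its cell output potentials (in volts).

    'predictive': one cell exceeds the mean of the others by >= margin.
    'bursting'  : no cell stands out, but the column average exceeds
                  v_s1 by >= margin.
    'inactive'  : neither condition holds.
    """
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    others = [v for i, v in enumerate(outputs) if i != winner]
    if outputs[winner] - mean(others) >= margin:
        return "predictive"
    if mean(outputs) - v_s1 >= margin:
        return "bursting"
    return "inactive"

# Hypothetical nine-cell readouts
print(classify_column([-0.5] * 9))           # all cells at the minimum
print(classify_column([-0.38] * 9))          # all cells partially excited
print(classify_column([0.1] + [-0.5] * 8))   # one cell well above the rest
```

Checking the single-winner condition first reflects the note above that V_Avg − V_S1 may also exceed 70 mV when one predictive cell dominates.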
C. Energy Usage
The average time to complete a WTA function depends on
the competitive advantage of the predictive cell. If there is no
predictive cell, or no competitive advantage, then the process
is quite fast, requiring about 3 ns. In the worst case the WTA process
takes less than 60 ns to finish. If there is a predictive cell,
then the process finishes more quickly if the advantage is
larger, as shown in Fig. 9. This is because the additional
Fig. 9. Time and energy required to complete the WTA function vs
competitive advantage. Greater advantage leads to a faster solution.
advantage causes the predictive cell to drive its competitors
with more current, suppressing them more swiftly. When there
is no competitive advantage, the process finishes most quickly
since the cells quickly reach an equilibrium. In this case there
is no need to wait for one cell to differentiate itself from the
rest by suppressing their outputs, as all cells behave nearly as
one, barring noise. The energy consumption is also given in
that figure. The cost is at most 120 pJ for a nine-cell column,
or about 13 pJ per cell. The relevant power calculations are
given in Section IV-C.
D. Process Variation
Here we consider the effects of process variation on the
outcome of the WTA cell in the predictive state case using
Monte-Carlo simulations. Each device in the simulated circuit
is randomly assigned a set of parameters drawn from normal
distributions at the beginning of every round of simulation. As
before, each data point is the mean result of an ensemble of
at least 100 rounds. We consider four different independent
variables: transistor threshold, MTJ parallel resistance, MTJ
antiparallel resistance and MTJ base switching current Ic0,
which is the value of I at which HSHE equals HI (see (3)-(5)).
As in [1], typical standard deviations of 5% for each
MTJ parameter were chosen after consulting [20], [21] and
a transistor threshold deviation of 20 mV was selected based
on the Pelgrom plots in [22], assuming 200 nm gate width.
Fig. 10 shows the output difference between the winner cell
and the others with competitive advantage as a parameter as
in Fig. 6. Incorporating process variation makes it somewhat
more difficult for the predictive cell to differentiate itself from
the others. A competitive advantage of 2.0 is required when a
weak proximal excitation of 1 µA is applied, compared to 1.6
when ideal devices are used. Similarly an advantage of 1.2
is required to enforce the 0 µA threshold as opposed to 1.1
with ideal devices. While these differences are notable, they
indicate that the circuit is not prevented from proper func-
tion by process variation so long as the modified advantage
requirements are applied. We also note that simulation of a
bursting column with process variation incorporated showed
no noticeable difference in results.
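The sampling procedure described above can be sketched as follows. The nominal values in `NOMINAL`, the function names, and the `simulate_round` callback are hypothetical placeholders; only the 5% relative deviation for the MTJ parameters, the 20 mV threshold deviation, and the 100-round ensemble size are taken from the text.

```python
import random

# Hypothetical nominal values, for illustration only
NOMINAL = {"r_p": 4.0e3, "r_ap": 10.0e3, "i_c0": 10.0e-6, "v_th": 0.25}

def sample_cell_parameters(rng, nominal, mtj_rel_sigma=0.05, vth_sigma=0.020):
    """Draw one randomized device parameter set from normal distributions."""
    return {
        "r_p": rng.gauss(nominal["r_p"], mtj_rel_sigma * nominal["r_p"]),
        "r_ap": rng.gauss(nominal["r_ap"], mtj_rel_sigma * nominal["r_ap"]),
        "i_c0": rng.gauss(nominal["i_c0"], mtj_rel_sigma * nominal["i_c0"]),
        "v_th": rng.gauss(nominal["v_th"], vth_sigma),
    }

def run_ensemble(simulate_round, nominal, n_cells=9, rounds=100, seed=0):
    """Average a per-round result over freshly randomized device sets.

    simulate_round is any callable that maps a list of per-cell parameter
    dicts to a number (for example, the winner-minus-others separation).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(rounds):
        cells = [sample_cell_parameters(rng, nominal) for _ in range(n_cells)]
        results.append(simulate_round(cells))
    return sum(results) / len(results)
```

Re-drawing every device at the start of each round, rather than once per ensemble, is what lets each data point represent a mean over many independent fabricated instances.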
Fig. 10. Difference between the predictive cell output and the average
of the other cell outputs for various inputs as a function of competitive
advantage. The dashed line indicates the minimum required separation
of 70 mV. This simulation accounts for process variation in the transistors
and MTJs of the WTA circuit.
TABLE I
SIMULATION PARAMETERS
Symbol   Quantity                           Value
K        crystalline anisotropy             10 kJ/m^3
V        ferromagnet volume                 1800 nm^3
MS       saturation magnetization           1 MA/m
α        Gilbert damping                    0.01
tHM      heavy metal thickness              5 nm
RHM      heavy metal resistance             ≈ 50 Ω
θ        spin-Hall angle                    0.3 [18], [19]
VS1      low sensing voltage                -0.5 V
VS2      high sensing voltage               0.4 V
RM       MTJ RA product                     8 Ω·µm^2 [12]
TMR      tunnel magnetoresistance ratio     1.5 [12]
RR       reference resistor                 3.25 - 140 kΩ
∆t       simulation time step               0.5 ps
VDD      inverter rail voltage              ± 0.5 V
τ        inverter intrinsic delay           4.5 ps
Cg       inverter gate capacitance          6.6 fF
RT       inverter on-resistance             10.8 kΩ
IV. SIMULATION METHODS
The physical parameters used in the simulation are available
in Table I. In choosing the magnetic saturation, weak in-plane
crystalline anisotropy, MTJ resistance and TMR, we selected
values typical for spintronic devices after consulting [11], [12].
A. MTJs
The cell is simulated using the fourth-order Runge-Kutta
method to predict the behavior of the circuit and magnetic
FL. The FL is treated using the macrospin approximation and
the Landau-Lifshitz-Gilbert (LLG) equation
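$$\frac{d\hat{m}}{dt} = -\gamma\mu_0 \left( (\hat{m} \times \mathbf{H}_{Eff}) - \alpha \left( \hat{m} \times (\hat{m} \times \mathbf{H}_{Eff}) \right) \right), \qquad (2)$$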
where m̂ indicates the unit magnetization of the FL and HEff
is the effective field on the FL. The symbols γ, µ0 and α
are the gyromagnetic ratio, vacuum permeability and Gilbert
damping respectively. Bold font indicates vector quantities.
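As an illustration of the integration scheme, the following is a minimal macrospin sketch using a fourth-order Runge-Kutta step. It is not the simulator used in this work: the effective field is held constant (no demagnetization, anisotropy, SHE or thermal terms), the value of `GAMMA_MU0` is an approximate textbook constant, and the damping term is written in the common Landau-Lifshitz sign convention in which it relaxes the magnetization toward the field. Sign conventions differ between references, so the signs here are an assumption and may not match (2) term for term.

```python
import math

GAMMA_MU0 = 2.21e5  # approximate gamma * mu_0 in m/(A*s); textbook value

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def llg_rhs(m, h_eff, alpha):
    """dm/dt with a precession term and a damping term that pulls m toward h_eff."""
    mxh = cross(m, h_eff)
    mxmxh = cross(m, mxh)
    return tuple(-GAMMA_MU0 * (p + alpha * d) for p, d in zip(mxh, mxmxh))

def rk4_step(m, h_eff, alpha, dt):
    """One classical RK4 step, followed by renormalization to |m| = 1."""
    def shifted(k, scale):
        return tuple(mi + scale * ki for mi, ki in zip(m, k))
    k1 = llg_rhs(m, h_eff, alpha)
    k2 = llg_rhs(shifted(k1, dt / 2), h_eff, alpha)
    k3 = llg_rhs(shifted(k2, dt / 2), h_eff, alpha)
    k4 = llg_rhs(shifted(k3, dt), h_eff, alpha)
    m_new = tuple(mi + dt / 6 * (a + 2 * b + 2 * c + d)
                  for mi, a, b, c, d in zip(m, k1, k2, k3, k4))
    norm = math.sqrt(sum(x * x for x in m_new))
    return tuple(x / norm for x in m_new)

def relax(m0, h_eff, alpha=0.01, dt=0.5e-12, steps=2000):
    """Integrate the macrospin for a fixed field (field in A/m, dt in s)."""
    m = m0
    for _ in range(steps):
        m = rk4_step(m, h_eff, alpha, dt)
    return m
```

For example, `relax((1.0, 0.0, 0.0), (0.0, 1.0e5, 0.0), alpha=0.1, steps=20000)` starts the magnetization perpendicular to a hypothetical 100 kA/m field and lets it spiral into alignment; the default `alpha` and `dt` match Table I.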
The effective field consists of two terms,
$$\mathbf{H}_{Eff} = \mathbf{H}_I + \mathbf{H}_{SHE}, \qquad (3)$$
where HSHE is the effective SHE-field and HI represents
all the intrinsic field terms. HSHE is proportional to the spin
current IS [23], [24]:
$$\mathbf{H}_{SHE} = \frac{1}{\mu_0} \frac{I_S}{2q} \frac{\hbar}{\alpha V M_S} \hat{y}, \qquad (4)$$
where ħ is the reduced Planck constant. The spin current IS
is in turn proportional to the charge current I flowing through
the HM layer:
$$I_S = \theta \frac{t_{HM}}{L_{FM}} I, \qquad (5)$$
where θ is the spin-Hall angle. The intrinsic field HI includes
the demagnetization and anisotropy fields as well as the
thermal noise field which consists of a multivariate Gaussian
random variable with zero mean and variance
$$\sigma_T = \sqrt{\frac{2 k_B T \alpha}{\gamma M_S V \Delta t}}, \qquad (6)$$
where kB , T , MS , V and ∆t are the Boltzmann constant,
temperature, magnetic saturation, FL volume and simulation
time step respectively. We account for this term as it can cause
noise in the cell voltage readouts.
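To give a sense of scale, (6) can be evaluated with the Table I values. The temperature of 300 K is an assumption (no temperature is stated in the text), and the value of γ is the standard electron gyromagnetic ratio; with γ in rad/(s·T) the expression evaluates to a flux density in tesla. The helper names are hypothetical.

```python
import math
import random

K_B = 1.380649e-23   # Boltzmann constant, J/K
GAMMA = 1.760859e11  # electron gyromagnetic ratio, rad/(s*T)

def thermal_sigma(temperature, alpha, m_s, volume, dt, gamma=GAMMA):
    """Eq. (6): spread of the thermal noise field for one time step."""
    return math.sqrt(2.0 * K_B * temperature * alpha
                     / (gamma * m_s * volume * dt))

def thermal_field_sample(sigma, rng=random):
    """One zero-mean Gaussian field sample with independent x, y, z components."""
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))

# Table I values; the 300 K temperature is an assumption.
SIGMA_T = thermal_sigma(temperature=300.0, alpha=0.01, m_s=1.0e6,
                        volume=1800e-27, dt=0.5e-12)
print(SIGMA_T)                        # roughly 0.023 with these inputs
print(thermal_field_sample(SIGMA_T))  # a fresh sample is drawn every time step
```

Because ∆t appears in the denominator, shrinking the time step increases the per-step field spread, which keeps the accumulated noise independent of the step size.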
B. Inverters
The voltage dividers provide the gate potential for the
inverters and are treated with standard circuit equations while
taking into account the changing RC gate delay due to the
varying MTJ resistance. The output of the inverter is treated
with a first-order approximation based on HSPICE simulations
using the 16-nm node Predictive Technology Model [13] (see
Fig. 7). The inverter gate width for the HSPICE simulations
was 200 nm.
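A minimal sketch of this treatment is given below, under several assumptions: the gate node is modeled as a single RC low-pass whose resistance is the Thevenin resistance of the divider (so it changes as the MTJ resistance changes), and the inverter is modeled as a line through a reference potential clipped at the rails. The function names and the slope value are hypothetical; the slope is chosen only so that inputs 70 mV apart give outputs 500 mV apart, as stated in Section III-A.

```python
import math

def gate_voltage_step(v_gate, v_divider, r_thevenin, c_gate=6.6e-15, dt=0.5e-12):
    """Advance the inverter gate node by one time step toward the divider output.

    r_thevenin would be recomputed every step from the current MTJ
    resistance (R_R in parallel with R_MTJ), which makes the RC delay vary.
    """
    tau_rc = r_thevenin * c_gate
    return v_divider + (v_gate - v_divider) * math.exp(-dt / tau_rc)

def inverter_output(v_gate, v_ref, gain, v_dd=0.5):
    """Piecewise-linear inverter: linear around v_ref, clipped at the rails."""
    return max(-v_dd, min(v_dd, -gain * (v_gate - v_ref)))

V_S1 = -0.5
V_REF = V_S1 + 0.035   # reference potential quoted in Section III-A
GAIN = 0.5 / 0.070     # hypothetical slope: 500 mV of output per 70 mV of input

high = inverter_output(V_S1, V_REF, GAIN)
low = inverter_output(V_S1 + 0.070, V_REF, GAIN)
print(high - low)
```

The default `c_gate` and `dt` are the Table I values; the exponential update is exact for a constant resistance over one step, so it stays stable even when the RC constant becomes comparable to the time step.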
C. Energy calculations
The power consumed by the WTA circuit comes from three
sources: inverter rail-to-rail leakage PI, crossbar input power
PCB and voltage divider leakage PVD. Assuming nine cells
in a column with nine powered connections to the input space
each, there are nine voltage divider stacks and 162 inverters,
so NVD = 9 and NI = 162. Using HSPICE simulations to
estimate the rail-to-rail leakage we find that PI = 7.3 µW.
Assuming an average inhibition output resistance of R = 280
kΩ, each crossbar connection drains an average of
$P_{CB} = E[(V_{In} - V_{S1})^2] / R = 1.2$ µW. Finally, the voltage divider leakage
is estimated as $P_{VD} = (V_{S2} - V_{S1})^2 / (R_R + E[R_{MTJ}]) = 30.4$ µW. The overall
WTA delay τ is estimated as the time at which the average
output comes to within 5 mV of its steady-state value. The
total energy cost is $E = \tau \cdot (N_I \cdot (P_I + P_{CB}) + N_{VD} \cdot P_{VD})$.
The distribution of τ and E is shown in Fig. 9. We note of
course that the values in Fig. 9 vary depending on the number
of cells per column and the average number of connections
each column has to the input space.
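The energy expression can be checked numerically with the quoted per-element powers. The function name is hypothetical; the 15 ns delay used in the example is the figure quoted in the abstract, and other delays scale the result linearly.

```python
def wta_energy(delay_s, n_inv=162, n_vd=9,
               p_inv=7.3e-6, p_cb=1.2e-6, p_vd=30.4e-6):
    """E = tau * (N_I * (P_I + P_CB) + N_VD * P_VD), in joules."""
    return delay_s * (n_inv * (p_inv + p_cb) + n_vd * p_vd)

total_power_w = wta_energy(1.0)   # energy per second, i.e. total power (~1.65 mW)
energy_15ns_j = wta_energy(15e-9) # ~24.8 pJ for a 15 ns competition

print(total_power_w * 1e3, "mW")
print(energy_15ns_j * 1e12, "pJ")
```

With these values the 162 inverter and crossbar terms contribute about 1.38 mW and the nine voltage dividers about 0.27 mW, so a 15 ns operation costs roughly 25 pJ, in line with the abstract.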
This work is comparable to [25]–[27] which describe purely
CMOS-based WTA implementations. Although the circuit in
this work consumes more energy per operation, it requires
significantly less time per input set. We ascribe this difference
to the several additional layers of computation required by
the CMOS circuits which introduce more delay. However, the
spintronic WTA circuit involves more leakage current, which
accounts for the increased energy cost despite its reduced
runtime.
V. CONCLUSION
The HTM algorithm is a powerful recognition and pre-
diction tool with the potential to revolutionize neuromorphic
systems. Each portion of the HTM algorithm that can be
implemented using efficient dedicated circuits significantly
reduces the overall computational burden. The intra-column
dynamics are an important part of HTM, and we have demon-
strated a novel spintronic circuit based upon spin-Hall MTJs
with a simple design that can emulate these dynamics quickly
and efficiently. To the best of the authors’ knowledge, there
are no proposals for column circuits which more efficiently
implement the intra-column competition aspect of HTM.
REFERENCES
[1] MAAP placeholder.
[2] J. Hawkins and S. Blakeslee, On Intelligence. New York, NY, USA:
Macmillan, 2007.
[3] J. Xing, T. Wang, Y. Leng and J. Fu, “A Bio-Inspired Olfactory Model
Using Hierarchical Temporal Memory,” Proc. 5th Int. Conf. Biomed.
Eng. Informat., pp. 923–927, 2012.
[4] D. E. Padilla, R. Brinkworth and M. D. McDonnell, “Performance of a
Hierarchical Temporal Memory Network in Noisy Sequence Learning,”
Proc. IEEE Int. Conf. Comput. Int. Cybern., pp. 45–51, 2013.
[5] A. M. Zyarah and D. Kudithipudi, “Neuromorphic Architecture for the
Hierarchical Temporal Memory,” IEEE Trans. Emerg. Top. in Comp.
Int., vol. 3, no. 1, Feb. 2019, DOI:10.1109/TETCI.2018.2850314
[6] A. M. Zyarah and D. Kudithipudi, “Neuromemristive Archi-
tecture of HTM with On-Device Learning and Neurogenesis”,
arXiv:1812.10730v1, Dec. 2018, DOI:10.1145/3300971.
[7] A. Sengupta and K. Roy, “Encoding Neural and Synaptic Function-
alities in Electron Spin: A Pathway to Efficient Neuromorphic Com-
puting,” Appl. Phys. Rev., vol. 4, no. 4, pp. 041105–1–25, Dec. 2017,
DOI:10.1063/1.5012763.
[8] D. Morris, D. Bromberg, J.-G. Zhu and L. Pileggi, “mLogic: Ultra-
Low Voltage Non-Volatile Logic Circuits Using STT-MTJ Devices,”
Proceedings 49th DAC, pp. 486–491, Jun. 2012.
[9] S. Datta, S. Salahuddin and B. Behin-Aein, “Non-Volatile Spin Switch
for Boolean and Non-Boolean Logic,” Appl. Phys. Lett., 101, pp.
252411–1–5 (2012).
[10] C. Pan and A. Naeemi, “Non-Boolean Computing Benchmarking for
Beyond-CMOS Devices Based on Cellular Neural Network,” IEEE J.
Expl. Sol.-Stat. Computat. Dev. and Circ., vol. 2 pp. 36-43, Nov. 2016,
DOI:10.1109/JXCDC.2016.2633251.
[11] W. Kang, Z. Wang, Y. Zhang, J.-O. Klein, W. Lv and W. Zhao,
“Spintronic Logic Design Methodology Based on Spin Hall Effect-
Driven Magnetic Tunnel Junctions”, J. Phys. D: Appl. Phys., vol. 49,
pp. 065008–1–11, Jan. 2016, DOI:10.1088/0022-3727/49/6/065008.
[12] J.-G. Zhu and C. Park, “Magnetic Tunnel Junctions,” Mater. Today, vol.
9, no. 11, pp. 36–45, Nov. 2006, DOI:10.1016/S1369-7021(06)71693-5.
[13] Predictive Technology Model. Accessed:Jul. 25, 2017. [Online]. Avail-
able: http://ptm.asu.edu
[14] D. Fan, Y. Shim, A. Raghunathan and K. Roy, “STT-SNN: A Spin-
Transfer-Torque Based Soft-Limiting Non-Linear Neuron for Low-
Power Artificial Neural Networks,” IEEE Trans. Nano., vol. 14, no. 6,
pp. 1013–1012, Nov. 2015, DOI:10.1109/TNANO.2015.2437902.
[15] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder and W. Lu,
“Nanoscale Memristor Device as Synapse in Neuromorphic Systems,”
NanoLett., vol. 10, no. 4, pp. 1297–1301, DOI:10.1021/nl904092h.
[16] T. Li, S. Duan, J. Liu and L. Wang, “An Improved Design of RBF Neural
Network Control Algorithm Based on Spintronic Memristor Crossbar
Array,” Neural Comput. & Appl., vol. 30, no. 6, pp. 1939–1946, Sep.
2018, DOI:10.1007/s00521-016-2715-8.
[17] M. Jerry, P.-Y. Chen, J. Zhang, P. Sharma, K. Ni, S. Yu and S. Datta,
“Ferroelectric FET Analog Synapse for Acceleration of Deep Neural
Network Training,” 2017 IEEE Int. Elec. Dev. Meeting (IEDM), pp.
6.2.1–6.2.4, Dec. 2017, DOI:10.1109/IEDM.2017.8268338.
[18] C.-F. Pai, L. Liu, Y. Li, H. W. Tseng, D. C. Ralph and R. A. Buhrman,
“Spin Transfer Torque Devices Utilizing the Giant Spin Hall Effect of
Tungsten,” Appl. Phys. Lett., 101, 122404–1–4 (2012).
[19] L. Liu, C.-F. Pai, Y. Li, H. W. Tseng, D. C. Ralph and R. A. Buhrman,
“Spin-Torque Switching with the Giant Spin Hall Effect of Tantalum,”
Science, vol. 336, pp. 555–558 (2012).
[20] W. Kang, L. Zhang, J.-O. Klein, Y. Zhang, D. Ravelosona, W. Zhao,
“Reconfigurable Codesign of STT-MRAM Under Process Variations in
Deeply Scaled Technology,” IEEE Trans. Elec. Dev., vol. 62, no. 6, pp.
1769–1777, Jun. 2015, DOI:10.1109/TED.2015.2412960
[21] P. Wang, E. Eken, Z, W. Zhang, R. Joshi, R. Kanj and Y. Chen, “A
Thermal and Process Variation Aware MTJ Switching Model and its
Applications in Soft Error Analysis,” More than Moore Technologies
for Next Generation Computer Design, Chapter 5, pp. 101–125, Springer
New York, 2015, DOI:10.1007/978-1-4939-2163-8.
[22] M. D. Giles, N. Arkali Radhakrishna, D. Becher, A. Kornfeld, K.
Maurice, S. Mudanai, S Natarajan, P. Newman, P. Packan and T. Rakshit,
“High Sigma Measurement of Random Threshold Voltage Variation in
14nm Logic FinFET technology,” Proc. of VLSI Technology, pp. T150–
T151, Aug. 2015, DOI:10.1109/VLSIT.2015.7223657.
[23] D. C. Ralph and M. D. Stiles, “Spin Transfer Torques,” J.
Magn. Magn. Mater., vol. 320, pp. 1190-1216, Apr. 2008,
DOI:10.1016/j.jmmm.2007.12.019.
[24] W.H. Butler, T. Mewes, C. K. A. Mewes, P. B. Visscher, W. H.
Rippard, S. E. Russek and R. Heindl, “Switching Distributions for
Perpendicular Spin-Torque Devices Within the Macrospin Approxima-
tion,” IEEE Trans. Mag., vol. 48, no. 12, pp. 4684–4700, Dec. 2012,
DOI:10.1109/TMAG.2012.2209122.
[25] S. Ramakrishnan and J. Hasler, “Vector-Matrix Multiply and Winner-
Take-All as an Analog Classifier,” IEEE Trans. VLSI Syst., vol. 22, no.
2, pp. 353–361, Feb. 2014.
[26] T. Kulej and F. Khateb, “Sub 0.5-V Bulk-Driven Winner Take All Circuit
based on a New Voltage Follower,” Analog Int. Circ. and Sig. Proc., vol.
90, no. 3, pp. 687–691, 2017.
[27] Y.-C. Hung, B.-D. Liu and C.-Y. Tsai, “1-V Bulk-Driven CMOS Analog
Programmable Winner-Takes-All Circuit,” Analog Integ. Circ. Sig. Proc.,
vol. 49, no. 1, pp. 53–61, Oct. 2006.
