Scalability of spin FPGA: A Reconfigurable Architecture based on spin
  MOSFET by Tanamoto, Tetsufumi et al.
arXiv:1104.1493v1 [cond-mat.mes-hall] 8 Apr 2011
Scalability of spin FPGA: A Reconfigurable Architecture based on spin MOSFET
Tetsufumi Tanamoto, Hideyuki Sugiyama, Tomoaki Inokuchi,
Takao Marukame, Mizue Ishikawa, Kazutaka Ikegami, and Yoshiaki Saito
Advanced LSI Technology Laboratory, Corporate Research and Development Center,
Toshiba Corporation, 1, Komukai Toshiba-cho, Saiwai-ku, Kawasaki 212-8582, Japan.
(Dated: May 28, 2018)
Scalability of Field Programmable Gate Array (FPGA) using spin MOSFET (spin FPGA) with
magnetocurrent (MC) ratio in the range of 100% to 1000% is discussed for the first time. Area and
speed of million-gate spin FPGA are numerically benchmarked with CMOS FPGA for 22nm, 32nm
and 45nm technologies including 20% transistor size variation. We show that area is reduced and
speed is increased in spin FPGA owing to the nonvolatile memory function of spin MOSFET.
INTRODUCTION
A spin metal-oxide-semiconductor field-effect transistor (spin MOSFET) is a novel MOSFET whose source and drain are contacted with ferromagnetic materials [1]. Ferromagnetic materials provide stable and robust nonvolatile memory [2]. Fig. 1(a) shows a spin MOSFET in which the write process is carried out by using a magnetic tunneling junction (MTJ) [3, 4]. The spin MOSFET directly couples a logic element with a nonvolatile memory element, opening up a path to a new style of logic-in-memory architecture [5].
A Field Programmable Gate Array (FPGA) has the great advantage that the chip is completely programmable and reconfigurable. However, a conventional FPGA includes a large amount of static random access memory (SRAM), which is a volatile memory composed of six transistors per cell and faces the fabrication limits of the Si MOSFET. Thus, new FPGAs based on novel devices are anticipated.
Here, for the first time, we report a numerical benchmark of an island-style FPGA using 22nm, 32nm and 45nm spin MOSFETs (spin FPGA) [4], obtained by improving standard benchmark tools [6]. Compared with other proposals [7, 8], the spin FPGA has the advantage that it is based on Si transistors equipped with stable nonvolatile magnetic memory. Moreover, an SRAM cell (six transistors) can be replaced by one spin MOSFET. Many SRAM cells are used in an FPGA, for example in lookup tables (LUTs) and in the pass transistors of the interconnect area, so this replacement reduces both the number of transistors and the FPGA area. Because the speed of an FPGA is governed by the length of its wires, the smaller area of the spin FPGA leads to faster performance. Monte Carlo simulation based on the Predictive Technology Model [9] is carried out to account for the variation of device size caused by fabrication difficulties. Although experiments on MTJs [2] at present show a maximum magnetocurrent (MC) ratio of 260% (RA ≈ 10 Ω µm²), in this paper we treat 100% ≤ MC ratio ≤ 1000%, assuming future realization of larger MC.
[Figure for Fig. 1: (a) device schematic with labels Gate, Tunnel Barrier, Ferromagnetic metal, Silicon and MTJ; (b) plot for 22nm of drain current versus Gate Voltage [V] (0 to 0.8), with curves for Parallel and for MC=100%, 200%, 400%, 600% and 1000%.]
FIG. 1: (a) Spin-based MOSFET of the “Spin-transfer-Torque-Switching MOSFET” type, in which a magnetic tunnel junction (MTJ) is attached to one of the electrodes. (b) $I_d$-$V_g$ characteristics for parallel and antiparallel states (100% ≤ MC ≤ 1000%) based on the PTM SPICE model (see text).
SPIN FPGA
Spin MOSFET.—We model the spin MOSFET by
changing the SPICE mobility parameter such that the MC ratio, defined by $\mathrm{MC} = (I_P - I_{AP})/I_{AP}$, coincides with a given value ($I_P$ and $I_{AP}$ are the parallel and antiparallel currents, respectively). For $I_P$, we use the same SPICE parameters as those of the conventional MOSFET (Fig. 1(b)). Although the MTJ adds extra resistance to the spin MOSFET, Ref. [10] reported that the resistance of a 50 nm square MTJ can be kept below 400 Ω, which is negligible compared with the resistance of a conventional MOSFET, of the order of 10 kΩ.
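As a minimal illustration of the MC definition above, the following Python sketch solves $\mathrm{MC} = (I_P - I_{AP})/I_{AP}$ for the antiparallel current and checks the size of the quoted MTJ resistance relative to the MOSFET resistance. It does not reproduce the SPICE mobility fitting used in the paper; the function names and the normalised current value are assumptions made only for this example.

```python
def antiparallel_current(i_parallel: float, mc_ratio: float) -> float:
    """Solve MC = (I_P - I_AP) / I_AP for I_AP.

    mc_ratio is a fraction (1.0 means MC = 100%).
    """
    if mc_ratio < 0:
        raise ValueError("MC ratio must be non-negative")
    return i_parallel / (1.0 + mc_ratio)


def mtj_resistance_fraction(r_mtj_ohm: float = 400.0,
                            r_mosfet_ohm: float = 10_000.0) -> float:
    """Ratio of the MTJ resistance bound (400 ohm) to the MOSFET resistance (~10 kohm)."""
    return r_mtj_ohm / r_mosfet_ohm


if __name__ == "__main__":
    i_p = 1.0  # hypothetical parallel current, normalised to 1
    for mc_percent in (100, 200, 400, 600, 1000):
        i_ap = antiparallel_current(i_p, mc_percent / 100.0)
        print(f"MC = {mc_percent:4d}%  ->  I_AP / I_P = {i_ap:.4f}")
    print(f"MTJ / MOSFET resistance = {mtj_resistance_fraction():.2%}")
```

For MC = 100% the antiparallel current is half the parallel current, and for MC = 1000% it is one eleventh; the 400 Ω bound corresponds to about 4% of a 10 kΩ channel resistance.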
Spin Cluster Logic Block.—Fig. 2 shows our spin LUT
structure [11] for 4 inputs and 1 output, which is a typical set of LUT parameters [6]. The transistor sizes of the amplifiers are adjusted such that the input pulse signal is appropriately transferred to the output of the LUT.
Pass transistor.—We propose a spin control pass tran-
sistor depicted in Fig. 3 (a). SPICE simulations show
that, by adjusting the width of the control transistors, the speed of the pass transistor in Fig. 3(a) is of the same order as that in Fig. 3(b) (the total transistor area of Fig. 3(a) is four in units of the minimum transistor size). Although this pass transistor structure has a disadvantage, namely a leakage path from the p-type transistor (PMOS) to the n-type transistors (NMOS), this power dissipation can be reduced by limiting the on-state to the periods when it is required [12].
[Schematic for Fig. 2: inputs In1, In2, In3 and In4; elements labeled Operate, Power supply, Spin MOSFET, Reference, Sense amplifier (+/−) and Out; transistor widths marked 1x, 2x and 22x.]
FIG. 2: Schematic of a 4-input lookup table based on spin MOSFETs (spin LUT). Spin MOSFETs replace the SRAMs at the leftmost part of this figure.
[Schematic for Fig. 3: (a) Pass Transistor, Spin MOSFET, Enable, Enable; (b) Pass Transistor, SRAM cell.]
FIG. 3: (a) New routing pass transistor using a spin MOSFET and (b) that using a conventional SRAM. In (a), one extra transistor is required to change the P/AP state of the spin MOSFET, and the widths of the spin MOSFET and the PMOS are enlarged to control the ON/OFF state of the attached pass transistor. The estimated number of required control transistors in (a) is four in the minimum-width transistor area model [6].
FPGA AREA REDUCTION BY SPIN MOSFET
First, let us compare the number of transistors in a spin LUT and a CMOS LUT. In Ref. [11], we counted only the number of transistors of a spin LUT. Here, we estimate the number of transistors for a general clustered logic block (CLB) in which four CLBs are clustered with 10 inputs and 4 outputs. For a $K$-input LUT, $2^K$ SRAM cells and $2^{K+1}-2$ pass transistors (multiplexer trees) are required, together with three input buffers. The total number of transistors in a complementary MOS (CMOS) LUT is then given by
$N_{\rm lut}^{\rm (cmos)} = 2^{K+3} - 2 + 6K$.
In a spin LUT (Fig. 2), the leftmost SRAMs are replaced by spin MOSFETs with an additional write/erase transistor. In addition, a sense amplifier (five transistors), a reference transistor and two power supply transistors are required. Thus, the number of transistors required in the spin LUT is given by
$N_{\rm lut}^{\rm (spin)} = 3 \times 2^{K} + 6(K+1)$,
so that
$N_{\rm lut}^{\rm (cmos)} - N_{\rm lut}^{\rm (spin)} = 5 \times 2^{K} - 8$.
For example, a 4-input LUT conventionally has 150 transistors, whereas the spin LUT includes 78 transistors (a 48% reduction).
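The transistor-count formulas above are easy to check numerically. The short Python sketch below evaluates them for a $K$-input LUT and confirms the quoted 4-input figures; the function names are chosen for this example only, and the formulas are taken directly from the text rather than re-derived from the circuit.

```python
def cmos_lut_transistors(k: int) -> int:
    """N_lut^(cmos) = 2^(K+3) - 2 + 6K for a K-input CMOS LUT."""
    return 2 ** (k + 3) - 2 + 6 * k


def spin_lut_transistors(k: int) -> int:
    """N_lut^(spin) = 3 * 2^K + 6(K + 1) for a K-input spin LUT."""
    return 3 * 2 ** k + 6 * (k + 1)


def transistor_saving(k: int) -> int:
    """Closed-form difference N_lut^(cmos) - N_lut^(spin) = 5 * 2^K - 8."""
    return 5 * 2 ** k - 8


if __name__ == "__main__":
    for k in range(2, 8):
        n_cmos = cmos_lut_transistors(k)
        n_spin = spin_lut_transistors(k)
        reduction = (n_cmos - n_spin) / n_cmos
        print(f"K={k}: CMOS={n_cmos:5d}  spin={n_spin:5d}  "
              f"saving={n_cmos - n_spin:5d}  reduction={reduction:.0%}")
```

For $K=4$ this gives 150 and 78 transistors, a saving of 72, which is the 48% reduction stated above. Because the saving grows as $5 \times 2^K$, the relative reduction increases with LUT size.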
Circuit area is calculated by the minimum-width transistor area model [6], in which each transistor area is estimated in units of the minimum-width NMOS. When $W_{\min}$ and $S_{\min}$ are the width and area of the minimum NMOS, respectively, a transistor of width $Z W_{\min}$ is estimated as having an area of $(1+Z)S_{\min}/2$. The width of the PMOS is determined such that an inverter switches at half of the drain voltage. For PMOSs of the 22nm, 32nm and 45nm nodes,
$Z^{\rm (pmos)}_{\rm 22nm} = 1.53, \quad Z^{\rm (pmos)}_{\rm 32nm} = 2.22, \quad Z^{\rm (pmos)}_{\rm 45nm} = 2.57$ (1)
(The PMOS is scaled down more than the NMOS because of advanced technologies such as strain effects.) The area of recent FPGAs is mostly occupied by the interconnect, or wiring, part. Wire resistance and capacitance are calculated from Ref. [13].
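To make the area model concrete, the following Python sketch applies $(1+Z)S_{\min}/2$ to the PMOS width factors of Eq. (1). Areas are expressed in units of $S_{\min}$; the dictionary and function names, and the choice of a minimum-width NMOS in the inverter example, are assumptions for illustration.

```python
# PMOS width factors Z from Eq. (1), in units of the minimum NMOS width W_min.
Z_PMOS = {"22nm": 1.53, "32nm": 2.22, "45nm": 2.57}


def area_in_min_units(z: float) -> float:
    """Area of a transistor of width Z * W_min, in units of S_min: (1 + Z) / 2."""
    return (1.0 + z) / 2.0


def inverter_area_in_min_units(z_pmos: float) -> float:
    """Illustrative inverter: one minimum-width NMOS (Z = 1) plus one PMOS of width factor z_pmos."""
    return area_in_min_units(1.0) + area_in_min_units(z_pmos)


if __name__ == "__main__":
    for node, z in Z_PMOS.items():
        print(f"{node}: PMOS area = {area_in_min_units(z):.3f} S_min, "
              f"inverter area = {inverter_area_in_min_units(z):.3f} S_min")
```

Under this model the PMOS area falls from about 1.79 $S_{\min}$ at 45nm to about 1.27 $S_{\min}$ at 22nm, which is the trend in relative PMOS area that the paper refers to when comparing technology generations.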
BENCHMARK RESULTS AND DISCUSSION
The area and speed of the spin FPGA are benchmarked over 20 typical million-gate circuits with a modified VPR ver. 5 [6] for 22nm, 32nm and 45nm transistors. We take standard parameters such as $F_s = 3$ (Wilton switch box), $F_{c,\rm in} = 1.0$ and $F_{c,\rm out} = 0.25$ with length-1 wire segments [6]. Figs. 4-6 show the average results over 200 Monte Carlo simulations for up to 20% (3 sigma) variations of length and width in 22 nm transistors, where the vertical axes show the advantage in area, critical path delay and area-delay product, defined by $(\Theta_{\rm cmos}-\Theta_{\rm spin})/\Theta_{\rm spin}$ for $\Theta \in \{A, t_{\rm delay}, A \times t_{\rm delay}\}$ (area, critical path delay and area-delay product, respectively). The area-delay product is treated as a metric of FPGA performance. Fig. 4 and Table I show that the area of the spin FPGA is greatly reduced compared with the CMOS FPGA. For 22 nm transistors, an average of 16% area reduction is realized. This area reduction leads to a shorter critical path delay, resulting in faster operation of the spin FPGA. In Fig. 5, speed is improved by an average of 24%. As the MC ratio increases, the P/AP signals that go into the amplifier in the spin LUT (Fig. 2) become clearer. This leads to more robust operation against the variation of transistors, resulting in the shorter delay in Fig. 5. Thus, the area-delay product is improved on average by 43%. Fig. 7 summarizes the benchmark results from 22 nm to 45 nm transistors. As mentioned above, as the transistor scale decreases, the ratio of PMOS area to NMOS area decreases. This means that the effect of area reduction by the spin MOSFET (NMOS) becomes larger, resulting in better performance at smaller transistor nodes.
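The advantage metric defined above composes in a simple way: since $A_{\rm cmos}t_{\rm cmos}/(A_{\rm spin}t_{\rm spin}) = (1+a_A)(1+a_t)$, the area-delay advantage of a single circuit is $(1+a_A)(1+a_t)-1$. The Python sketch below shows this; applying it to the averaged 16% and 24% figures is only an approximation, because an average of products is not in general the product of averages. Function names and the sample numbers in the consistency check are assumptions for this example.

```python
def advantage(theta_cmos: float, theta_spin: float) -> float:
    """Advantage metric (Theta_cmos - Theta_spin) / Theta_spin, as a fraction."""
    return (theta_cmos - theta_spin) / theta_spin


def area_delay_advantage(adv_area: float, adv_delay: float) -> float:
    """Advantage in area-delay product implied by separate area and delay advantages."""
    return (1.0 + adv_area) * (1.0 + adv_delay) - 1.0


if __name__ == "__main__":
    # Averages quoted in the text for 22 nm transistors.
    combined = area_delay_advantage(0.16, 0.24)
    print(f"Implied area-delay advantage: {combined:.1%}")

    # Consistency check on one hypothetical circuit (numbers are illustrative only).
    a_cmos, a_spin, t_cmos, t_spin = 120.0, 100.0, 30.0, 25.0
    direct = advantage(a_cmos * t_cmos, a_spin * t_spin)
    composed = area_delay_advantage(advantage(a_cmos, a_spin),
                                    advantage(t_cmos, t_spin))
    print(f"direct = {direct:.4f}, composed = {composed:.4f}")
```

With the quoted averages the composition gives roughly 44%, close to the 43% average improvement reported for the area-delay product.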
One of the advantages of the spin MOSFET compared with a CMOS system with interlayer MRAM is that, for the spin MOSFET, a change in MC ratio directly affects the subthreshold region of the MOSFET, which leads to more efficient device operation. The effect of direct injection of spin into the channel on device performance will be clarified in more detail in the near future.
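The size variation used in the Monte Carlo runs of this section (200 simulations, up to 20% variation at 3 sigma in transistor length and width) can be sketched as follows. This is not the authors' simulation flow, which feeds each sample into SPICE and VPR; the Gaussian distribution, the clipping at the 3 sigma bound, the nominal width value and all names are assumptions made for this illustration.

```python
import random


def sample_dimension(nominal: float, rng: random.Random,
                     max_variation: float = 0.20, n_sigma: float = 3.0) -> float:
    """Draw one transistor dimension with max_variation corresponding to n_sigma.

    Assumes a Gaussian spread, clipped so the deviation never exceeds max_variation.
    """
    sigma = nominal * max_variation / n_sigma
    value = rng.gauss(nominal, sigma)
    lower = nominal * (1.0 - max_variation)
    upper = nominal * (1.0 + max_variation)
    return min(max(value, lower), upper)


def sample_transistors(nominal_length_nm: float, nominal_width_nm: float,
                       n_runs: int = 200, seed: int = 0):
    """Return n_runs (length, width) pairs for independent Monte Carlo runs."""
    rng = random.Random(seed)
    return [(sample_dimension(nominal_length_nm, rng),
             sample_dimension(nominal_width_nm, rng))
            for _ in range(n_runs)]


if __name__ == "__main__":
    # 22 nm gate length; the 44 nm width is a hypothetical nominal value.
    samples = sample_transistors(22.0, 44.0)
    lengths = [length for length, _ in samples]
    print(f"{len(samples)} runs, length range "
          f"{min(lengths):.2f} to {max(lengths):.2f} nm")
```

Each sampled pair would then parameterise one circuit simulation, and the plotted advantages are averages over the runs.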
[Plot for Fig. 4: 22nm transistors. Horizontal axis: Design (alu4, apex2, apex4, bigkey, des, diffeq, dsip, elliptic, ex1010, ex5p, frisc, misex3, pdc, s298, s38417, s38584.1, seq, spla, tseng, clma, average). Vertical axis: Advantage of FPGA area (%), scale up to 30. Series: spin MC=100%, 200%, 400%, 600% and 1000%.]
FIG. 4: Benchmark calculation of the advantage of spin
FPGA to CMOS FPGA over 20 circuits (area). Rightmost
data shows average over the 20 circuits.
[Plot for Fig. 5: 22nm transistors. Horizontal axis: Design (alu4, apex2, apex4, bigkey, des, diffeq, dsip, elliptic, ex1010, ex5p, frisc, misex3, pdc, s298, s38417, s38584.1, seq, spla, tseng, clma, average). Vertical axis: Advantage of critical path delay (%), scale from -25 to 75. Series: spin MC=100%, 200%, 400%, 600% and 1000%.]
FIG. 5: Benchmark calculation of the advantage of spin
FPGA to CMOS FPGA over 20 circuits (delay). Mean crit-
ical path delay of CMOS FPGA is 30.8 ns and those of spin
FPGA are 25.4 ns (MC=100%), 25.5ns (MC=200%), 25.2 ns
(MC=400%), 24.5ns (MC=600%) and 24.5ns (MC=1000%).
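The mean delays listed in the Fig. 5 caption can be turned into delay advantages with the metric $(t_{\rm cmos} - t_{\rm spin})/t_{\rm spin}$, as in the Python sketch below. The result is a ratio of mean delays, which need not equal the mean of the per-circuit advantages plotted in the figure; the variable and function names are assumptions for this example.

```python
# Mean critical path delays from the Fig. 5 caption, in nanoseconds.
CMOS_DELAY_NS = 30.8
SPIN_DELAY_NS = {100: 25.4, 200: 25.5, 400: 25.2, 600: 24.5, 1000: 24.5}


def delay_advantage_percent(mc_percent: int) -> float:
    """Delay advantage (t_cmos - t_spin) / t_spin in percent, from the mean delays."""
    t_spin = SPIN_DELAY_NS[mc_percent]
    return 100.0 * (CMOS_DELAY_NS - t_spin) / t_spin


if __name__ == "__main__":
    for mc in sorted(SPIN_DELAY_NS):
        print(f"MC = {mc:4d}%: advantage from mean delays = "
              f"{delay_advantage_percent(mc):.1f}%")
```

The values range from about 21% at MC = 100% to about 26% at MC = 600% and above, bracketing the 24% average speed improvement quoted in the text.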
CONCLUSION
Spin FPGA was numerically benchmarked for 22nm, 32nm and 45nm transistors. We showed that the performance of spin FPGA becomes superior to that of conventional CMOS FPGA as transistor size decreases and MC ratio increases.
        CLB area (µm²)       Interconnect area (×10³ µm²)
        CMOS     SpinMOS     CMOS     SpinMOS
                                      100%     200%     600%     1000%
22nm    118.6    97.2        237.8    207.0    208.2    206.9    208.1
32nm    124.1    102.7       242.8    210.3    211.6    207.8    212.1
45nm    250.7    208.3       483.0    416.7    418.4    421.8    413.9
TABLE I: Area of a single CLB and interconnect. Result of interconnect is taken from Fig. 4.
[Plot for Fig. 6: 22nm transistors. Horizontal axis: Design (alu4, apex2, apex4, bigkey, des, diffeq, dsip, elliptic, ex1010, ex5p, frisc, misex3, pdc, s298, s38417, s38584.1, seq, spla, tseng, clma, average). Vertical axis: Advantage of area-delay product (%), scale from -20 to 100. Series: spin MC=100%, 200%, 400%, 600% and 1000%.]
FIG. 6: Benchmark calculation of the advantage of spin FPGA to CMOS FPGA over 20 circuits (area-delay product).
[Plot for Fig. 7: Horizontal axis: MC (%), 0 to 1000. Vertical axis: Advantage of area-delay product (%), scale from 20 to 50. Curves: 22nm, 32nm and 45nm.]
FIG. 7: Comparison of transistor generation. An average result of the benchmark calculation as a function of MC ratio. Relations between generations are considered to be related with relative PMOS areas (see Eq.(1) and text).
[1] S. Sugahara and M. Tanaka, Appl. Phys. Lett. 84, 2307 (2004).
[2] J. Hayakawa, S. Ikeda, F. Matsukura, H. Takahashi, and H. Ohno, Jpn. J. Appl. Phys. 44, L587 (2005).
[3] T. Marukame, T. Inokuchi, M. Ishikawa, H. Sugiyama, and Y. Saito, IEDM 2009-215.
[4] T. Inokuchi, T. Marukame, T. Tanamoto, H. Sugiyama, M. Ishikawa, and Y. Saito, VLSI Symp. 2010, p. 119.
[5] W. H. Kautz, IEEE Trans. Comput. 18, 719 (1969).
[6] V. Betz, J. Rose, and A. Marquardt, Architecture and CAD for Deep-Submicron FPGAs, Kluwer Academic Publishers, February 1999. ISBN 0-7923-8460-1.
[7] A. DeHon, ACM Journal on Emerging Technologies in Computing Systems 1, 109 (2005).
[8] C. Dong, D. Chen, S. Haruehanroengra, and W. Wang, IEEE Trans. Circuits and Systems I 54, 2489 (2007).
[9] W. Zhao and Y. Cao, http://www.eas.asu.edu/~ptm/.
[10] Y. Nagamine, H. Maehara, K. Tsunekawa, D. D. Djayaprawira, N. Watanabe, S. Yuasa, and K. Ando, International Magnetics Conference (Intermag), 2006, p. 281.
[11] H. Sugiyama, T. Tanamoto, T. Marukame, M. Ishikawa, T. Inokuchi, and Y. Saito, Int. Conf. Solid State Devices and Materials, 2008, pp. 670-671.
[12] Y. Gao, C. Augustine, D. E. Nikonov, K. Roy, and M. S. Lundstrom, VLSI Symp. 2010, p. 117.
[13] International Technology Roadmap for Semiconductors, http://www.itrs.net/.
