The scalability of a field-programmable gate array (FPGA) using spin MOSFETs (spin FPGA) with a magnetocurrent (MC) ratio in the range of 100% to 1000% is discussed for the first time. The area and speed of a million-gate spin FPGA are numerically benchmarked against a CMOS FPGA for 22 nm, 32 nm and 45 nm technologies, including 20% transistor size variation. We show that area is reduced and speed is increased in the spin FPGA owing to the nonvolatile memory function of the spin MOSFET.
INTRODUCTION
The spin metal-oxide-semiconductor field-effect transistor (spin MOSFET) is a novel MOSFET whose source and drain are contacted with ferromagnetic materials [1]. Ferromagnetic materials provide stable and robust nonvolatile memory [2]. Fig. 1(a) shows a spin MOSFET in which the write process is carried out by using a magnetic tunneling junction (MTJ) [3, 4]. The spin MOSFET directly couples a logic element with a nonvolatile memory element, opening up a path to a new style of logic-in-memory architecture [5].
The field-programmable gate array (FPGA) has a great advantage in that a chip is completely programmable and reconfigurable. However, a conventional FPGA includes a large amount of static random access memory (SRAM), which is a volatile memory composed of six transistors and faces the fabrication limits of the Si MOSFET. Thus, a new FPGA based on novel devices is desirable. Here, for the first time, we report a numerical benchmark for an island-style FPGA using 22 nm, 32 nm and 45 nm spin MOSFETs (spin FPGA) [4], obtained by improving standard benchmark tools [6]. Compared with other proposals [7, 8], the spin FPGA has the advantage that it is based on Si transistors equipped with stable nonvolatile magnetic memory. Moreover, an SRAM cell (six transistors) can be replaced by one spin MOSFET. Many SRAM cells are used in an FPGA, for example in lookup tables (LUTs) and in the pass transistors of the interconnect area. Therefore, this replacement reduces the number of transistors and the FPGA area. Because the speed of an FPGA is governed by the wire length, the smaller area of the spin FPGA leads to faster performance. Monte Carlo simulation based on the Predictive Technology Model [9] is carried out to account for variation of device size arising from fabrication difficulties. Although experiments on MTJs [2] at present show a maximum magnetocurrent (MC) ratio of 260% ($RA \approx 10\,\Omega\,\mu\mathrm{m}^2$), in this paper we treat 100% ≤ MC ratio ≤ 1000%, assuming future realization of larger MC.
SPIN FPGA
Spin MOSFET.-We model the spin MOSFET by changing a SPICE parameter (mobility) such that the MC, defined by $\mathrm{MC} = (I_P - I_{AP})/I_{AP}$, coincides with a given MC ratio ($I_P$ and $I_{AP}$ are the parallel and antiparallel currents, respectively). For $I_P$, we use the same SPICE parameters as those of the conventional MOSFET (Fig. 1(b)). Although there is extra resistance owing to the MTJ in the spin MOSFET, Ref. [10] reported that the resistance of a 50 nm square MTJ can be controlled to less than 400 Ω, which is negligible compared with the resistance of a conventional MOSFET, of the order of 10 kΩ.
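As a minimal sketch of the MC definition above, the following Python snippet solves the definition for the antiparallel current given a parallel current and a target MC ratio, and evaluates the MTJ-to-channel resistance ratio from the quoted values. The function names and the example parallel current of 100 µA are illustrative assumptions, not values from the paper; the model described above adjusts the SPICE mobility parameter rather than scaling the current directly.

```python
def antiparallel_current(i_p: float, mc_ratio: float) -> float:
    """Solve MC = (I_P - I_AP) / I_AP for I_AP, with MC given as a fraction."""
    if mc_ratio <= -1.0:
        raise ValueError("MC ratio must be greater than -1")
    return i_p / (1.0 + mc_ratio)


def mc_from_currents(i_p: float, i_ap: float) -> float:
    """Evaluate MC = (I_P - I_AP) / I_AP."""
    return (i_p - i_ap) / i_ap


# Hypothetical parallel current, chosen only for illustration.
I_P = 100e-6  # amperes

# MC ratios of 100%, 260% and 1000% expressed as fractions.
for mc in (1.0, 2.6, 10.0):
    i_ap = antiparallel_current(I_P, mc)
    print(f"MC = {mc * 100:.0f}%  ->  I_AP = {i_ap * 1e6:.2f} uA")

# Upper bound on MTJ resistance relative to the order-of-magnitude
# MOSFET resistance quoted in the text (400 ohm vs 10 kohm).
MTJ_RESISTANCE = 400.0
MOSFET_RESISTANCE = 10e3
mtj_fraction = MTJ_RESISTANCE / MOSFET_RESISTANCE
print(f"MTJ / MOSFET resistance = {mtj_fraction:.2%}")
```

A larger MC ratio widens the gap between the parallel and antiparallel currents, which is the quantity the sense amplifier has to distinguish.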
Spin Cluster Logic Block.-Fig. 2 shows our spin LUT structure [11] with 4 inputs and 1 output, which is a typical set of LUT parameters [6]. The transistor sizes of the amplifiers are adjusted such that the input pulse signal is appropriately transferred to the output of the LUT.
Pass transistor.-We propose a spin control pass transistor, depicted in Fig. 3(a). SPICE simulations show that the speed of the pass transistor in Fig. 3(a) is of the same order as that in Fig. 3(b) when the width of the control transistors is adjusted (the total transistor area of Fig. 3(a) is four in units of the minimum transistor size). Although this pass transistor structure has a disadvantage, namely a leakage path from the p-type transistor (PMOS) to the n-type transistors (NMOS), this power dissipation can be reduced by limiting the on-state to only when it is required [12].
The number of transistors in a conventional $K$-input LUT is given by $2^{K+3} - 2 + 6K$. In a spin LUT (Fig. 2), the leftmost SRAMs are replaced by spin MOSFETs with an additional write/erase transistor. In addition, a sense amplifier (five transistors), a reference transistor and two power supply transistors are required. Thus, the number of transistors required in the spin LUT is given by $N$ [...]
Circuit area is calculated by the minimum-width transistor area model [6], in which each transistor area is estimated in units of the minimum-width NMOS. When $W_{\min}$ and $S_{\min}$ are the width and area of the minimum NMOS, respectively, a transistor of width $Z W_{\min}$ is estimated as having an area of $(1 + Z) S_{\min}/2$. The PMOS width is determined such that an inverter switches at half of the drain voltage. For PMOSs of the 22 nm, 32 nm and 45 nm nodes, [...] (1) (PMOS is scaled down more than NMOS because of advanced technologies such as strain effects.) The area of recent FPGAs is mostly occupied by the interconnect, or wiring, part. Wire resistance and capacitance are calculated from Ref. [13].
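The expression $2^{K+3} - 2 + 6K$ and the minimum-width transistor area model quoted above can be evaluated with a short Python sketch. The function names are illustrative, reading the expression as the transistor count of a conventional K-input LUT is an assumption from context, and areas are expressed in units of $S_{\min}$.

```python
def conventional_lut_transistors(k: int) -> int:
    """Evaluate 2^(K+3) - 2 + 6K for a K-input LUT.

    Assumption: the expression is the transistor count of a
    conventional (SRAM-based) K-input LUT.
    """
    if k < 1:
        raise ValueError("K must be a positive integer")
    return 2 ** (k + 3) - 2 + 6 * k


def transistor_area(z: float, s_min: float = 1.0) -> float:
    """Area of a transistor of width Z * W_min: (1 + Z) * S_min / 2."""
    if z <= 0:
        raise ValueError("width factor Z must be positive")
    return (1.0 + z) * s_min / 2.0


# K = 4 is the typical LUT size used in the text.
print(conventional_lut_transistors(4))

# A minimum-width transistor (Z = 1) has area S_min by construction.
print(transistor_area(1.0))

# Tripling the width only doubles the estimated area in this model.
print(transistor_area(3.0))
```

Because the area grows as $(1 + Z)/2$ rather than linearly in $Z$, widening a transistor is cheaper in this model than adding separate minimum-width devices.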
BENCHMARK RESULTS AND DISCUSSION
The area and speed of the spin FPGA over 20 typical million-gate circuits are benchmarked with a modified VPR ver. 5 [6] for 22 nm, 32 nm and 45 nm transistors. We take standard parameters such as $F_s = 3$ (Wilton switch box), $F_{c,\mathrm{in}} = 1.0$ and $F_{c,\mathrm{out}} = 0.25$ with length-1 wire segments [6]. Figs. 4-6 show the average results over 200 Monte Carlo simulations for up to 20% (3 sigma) variations of length and width in 22 nm transistors, where the vertical axes show the advantage in area, critical path delay and area-delay product, defined by $(\Theta_{\mathrm{cmos}} - \Theta_{\mathrm{spin}})/\Theta_{\mathrm{spin}}$ for $\Theta$ = {$A$ (area), $t_{\mathrm{delay}}$ (critical path delay), $A \times t_{\mathrm{delay}}$ (area-delay product)}. The area-delay product is treated as a metric of FPGA performance. Fig. 4 and Table I show that the area of the spin FPGA is greatly reduced compared with the CMOS FPGA. For 22 nm transistors, an average area reduction of 16% is realized. This area reduction leads to a smaller critical path delay, resulting in faster operation in the spin FPGA. In Fig. 5, speed is improved by an average of 24%. As the MC ratio increases, the P/AP signals that enter the amplifier in the spin LUT (Fig. 2) become more distinct. This leads to more robust operation against transistor variation, resulting in the shorter delay in Fig. 5. Thus, the area-delay product is improved on average by 43%. Fig. 7 summarizes the benchmark results from 22 nm to 45 nm transistors. As mentioned above, as the transistor scale decreases, the ratio of PMOS area to NMOS area decreases. This means that the effect of area reduction by the spin MOSFET (NMOS) becomes larger, resulting in better performance at smaller transistor nodes.
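To illustrate the advantage metric defined above, the Python sketch below evaluates $(\Theta_{\mathrm{cmos}} - \Theta_{\mathrm{spin}})/\Theta_{\mathrm{spin}}$ and shows how an area advantage and a delay advantage compose into an area-delay advantage. It also draws transistor size factors with a 3-sigma spread of 20%, which is one possible reading of the variation setting. The function names, the Gaussian distribution and the random seed are assumptions made for illustration. The reported figures are averages over 20 circuits, so composing the averaged 16% and 24% reproduces the reported 43% only approximately.

```python
import random


def advantage(theta_cmos: float, theta_spin: float) -> float:
    """Advantage metric (theta_cmos - theta_spin) / theta_spin."""
    return (theta_cmos - theta_spin) / theta_spin


def combined_advantage(area_adv: float, delay_adv: float) -> float:
    """Area-delay advantage implied by separate area and delay advantages.

    Since theta_cmos / theta_spin = 1 + advantage, the ratios multiply.
    """
    return (1.0 + area_adv) * (1.0 + delay_adv) - 1.0


def sample_size_factor(rng: random.Random, three_sigma: float = 0.20) -> float:
    """Draw a multiplicative size factor with a 3-sigma spread (assumed Gaussian)."""
    return rng.gauss(1.0, three_sigma / 3.0)


# Compose the average 22 nm area and delay advantages quoted in the text.
print(f"{combined_advantage(0.16, 0.24):.4f}")

# 200 samples, matching the number of Monte Carlo runs in the text.
rng = random.Random(0)  # fixed seed, chosen only for repeatability
factors = [sample_size_factor(rng) for _ in range(200)]
print(f"mean size factor = {sum(factors) / len(factors):.3f}")
```

The multiplicative composition is why the area-delay improvement is larger than the sum of the individual area and delay improvements.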
One of the advantages of the spin MOSFET over a CMOS system with interlayer MRAM is that, for the spin MOSFET, a change in MC ratio directly affects the subthreshold region of the MOSFET, which leads to more efficient device operation. The effect of direct spin injection into the channel on device performance will be clarified in more detail in the near future.
