In this paper we propose a novel low-leakage FPGA look-up table (LUT) that can operate in three different modes: high-speed, low-power, or sleep. In high-speed mode, the LUT provides power and performance similar to those of a conventional LUT. In low-power mode, at the expense of speed, leakage power is reduced by 68% to 73% relative to high-speed mode. Leakage power in sleep mode is over 95% lower than in high-speed mode.
Introduction
Technology scaling trends have made power consumption, and leakage power in particular, a major concern for the semiconductor industry. With each process generation, supply voltages are reduced, and transistor threshold voltages (V_TH) must also be reduced to mitigate performance degradation. Reducing V_TH leads to an exponential increase in subthreshold leakage. In modern IC processes, gate oxides are also thinned to improve transistor drive capability, which has led to a considerable increase in gate leakage.
Field-programmable gate arrays (FPGAs) have become one of the most popular implementation media for digital circuits because they offer high speed, fast time-to-market, and steadily decreasing cost. However, FPGAs are power-inefficient compared to logically equivalent application-specific integrated circuits (ASICs) [1]. Most early work on low-power FPGAs focused on dynamic power consumption; however, leakage power can now make up over 50% of total FPGA power [2]. Recent work has concentrated on reducing leakage in the routing switches, which account for 60% to 70% of total FPGA leakage [3] [4]. The leakage of LUTs, which may constitute 20% to 30% of total chip power, has not yet been targeted. In this paper we propose a novel FPGA LUT architecture that significantly reduces leakage in low-power mode and sleep mode.
The remainder of the paper is organized as follows. Section 2 presents related work and necessary background material. The proposed LUT architecture is described in Section 3. Section 4 analyzes the experimental results. Conclusions are offered in Section 5.
Related Work for Leakage Optimization
A variety of techniques for leakage optimization in ASICs have been proposed in the literature. One of the most popular methods is to introduce sleep transistors into the N-network (and/or P-network) of CMOS gates, as shown in Figure 1 (a). The sleep transistors (MPSLEEP and MNSLEEP) are ON when the circuit is active and OFF when the circuit is in sleep mode. Owing to the stacking effect, leakage power is significantly reduced. However, a limitation of this approach is that in sleep mode the internal voltages of sleeping gates are not well-defined, and therefore the technique cannot be directly applied to data storage elements.
Figure 1. Sleep leakage reduction techniques
Another technique, which addresses the data retention issue, is shown in Figure 1 (b). Two diodes, DP and DN, are introduced in parallel with the sleep transistors. In active mode, the virtual V_DD voltage (V_VD) and the virtual ground voltage (V_VGND) are equal to the rail V_DD and GND, respectively. In sleep mode, the sleep transistors are turned OFF and V_VD ≈ V_DD - V_DP, where V_DP is the built-in potential of diode DP. Likewise, V_VGND ≈ GND + V_DN in sleep mode. The potential difference across the latch in sleep mode is therefore well-defined and equal to V_DD - V_DP - V_DN, making data retention possible. In sleep mode, both subthreshold and gate oxide leakage are reduced: 1) the reduced drain/source potential difference (V_DS) of an OFF transistor results in an exponential decrease in subthreshold leakage; 2) gate oxide leakage decreases superlinearly with a reduction in the gate/source potential difference (V_GS).
The novel LUT can operate in three modes, as follows.
In high-speed mode, the gate terminals of MNX and MPX are tied to V_DD and GND, respectively. MPX is turned ON; therefore V_VD is equal to V_DD and output swings are full rail-to-rail. LUTs on timing-critical paths can be set to this mode.
In low-power mode, the gate terminals of MNX and MPX are both tied to V_DD. MPX is turned OFF and MNX is turned ON. The input inverters and output buffers are powered by the reduced voltage V_VD ≈ V_DD - V_THN, where V_THN is the threshold voltage of the NMOSFET. The input inverters' outputs drive the gate terminals of the pass transistors to turn them ON or OFF; because logic 1 equals V_DD - V_THN, which is always greater than V_THN, the circuit still operates correctly. Since V_VD < V_DD, speed is reduced relative to high-speed mode. However, because the output swing is reduced, dynamic power and leakage power are reduced simultaneously. LUTs on paths that are not timing-critical can be set to this mode.
Lastly, in sleep mode, the gate terminals of MNX and MPX are tied to GND and V_DD, respectively. Both MPX and MNX are turned OFF. Leakage power is significantly lower than in high-speed mode and low-power mode. In this mode, the circuit can no longer retain data. Unused LUTs should be set to this mode.
In our proposed LUT architecture, it is infeasible to insert sleep transistors between the GND rail and the n-network. If they were inserted, logic 0 in low-power mode would be equal to GND + V_TH; because the input inverters' outputs are tied to the gate terminals of the pass transistors, this could lead to errors.
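The mode selection described above can be summarized in a minimal Python sketch. The names `MODES`, `virtual_vdd`, and `pass_gate_turns_on` are hypothetical, and the threshold voltage of 0.3 V is an assumed placeholder rather than a value from this paper; only the 1 V supply comes from the experimental setup. The sketch covers only the three gate-bias combinations described in the text.

```python
from typing import Optional, Tuple

# (MNX gate, MPX gate) -> (mode name, MNX conducting, MPX conducting)
# Only the three bias combinations described in the text are listed.
MODES = {
    ("VDD", "GND"): ("high-speed", True, True),
    ("VDD", "VDD"): ("low-power", True, False),
    ("GND", "VDD"): ("sleep", False, False),
}


def virtual_vdd(mnx_gate: str, mpx_gate: str,
                v_dd: float = 1.0,
                v_thn: float = 0.3) -> Tuple[str, Optional[float]]:
    """Return (mode, approximate virtual supply V_VD) for a gate-bias pair.

    v_thn is an assumed NMOS threshold voltage used for illustration only.
    A KeyError is raised for bias pairs the text does not describe.
    """
    mode, mnx_on, mpx_on = MODES[(mnx_gate, mpx_gate)]
    if mpx_on:
        v_vd = v_dd            # PMOS passes the full rail voltage
    elif mnx_on:
        v_vd = v_dd - v_thn    # NMOS drops one threshold voltage
    else:
        v_vd = None            # both devices OFF: supply is not driven
    return mode, v_vd


def pass_gate_turns_on(v_dd: float = 1.0, v_thn: float = 0.3) -> bool:
    """Check the low-power condition: logic 1 (V_DD - V_THN) exceeds V_THN."""
    return (v_dd - v_thn) > v_thn


if __name__ == "__main__":
    for bias in MODES:
        print(bias, virtual_vdd(*bias))
    print("pass transistors switch in low-power mode:", pass_gate_turns_on())
```

Under these assumed values, the low-power virtual supply is about 0.7 V, which still exceeds the assumed threshold, so the reduced logic 1 level can turn on the pass transistors.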
Low Leakage

Experimental Study and Results
All HSPICE simulation results reported in this paper were produced at 110 °C using the Berkeley Predictive Technology Models (BPTM) for a 65 nm technology [6]. The technology models were enhanced to account for gate tunneling leakage. The supply voltage is 1 V, and logic 0 or logic 1 was pre-stored in the SRAM cells at random.
Sleep transistor size
In high-speed mode, the PMOS transistor is ON, so its size is very important for circuit performance. Figure 4 shows the increase in delay and area vs. PMOSFET size. As the PMOSFET size increases from the smallest size to size 10, the area overhead rises from 2.35% to 4.96% and the delay overhead falls from 17.2% to 2.3%. When the transistor is larger than size 5, the delay curve flattens, which means that further increases in transistor size have little influence on the delay overhead. Figure 5 shows the output voltage vs. NMOSFET size. As the NMOSFET size increases from the smallest size to size 10, the output voltage rises from 649.3 mV to 740.4 mV. The output voltage increases very slowly once the transistor is larger than size 5. Based on these observations, we size both the parallel PMOSFET and NMOSFET at about 5X the smallest size.
Effect on performance, area and power
a) Performance. The sleep transistors introduced into the LUT reduce circuit performance. Compared to a conventional LUT, operating speed decreases by about 4.8% in high-speed mode and 41.9% in low-power mode.
b) Area. The total area increases because of the inserted sleep transistors. The PMOSFET and NMOSFET are both 5X the smallest size, so the area increase is only about 3.86% of the total area.
c) Power. Compared to a conventional LUT, our novel LUT architecture saves considerable power, especially leakage power. In high-speed mode, leakage power is reduced by 3.6% when the output logic is 1 and by 4.1% when it is 0. In low-power mode, owing to the sleep transistors, leakage power is reduced by 70% for output logic 1 and by about 71.8% for logic 0. Lastly, in sleep mode, leakage power is only 3.59% of that in high-speed mode.
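The leakage figures above use two different baselines, so a short arithmetic sketch may help relate them. It assumes that the high-speed and low-power reductions quoted in c) are both measured against the conventional LUT, and it re-expresses the low-power reduction relative to high-speed mode. The function names are hypothetical.

```python
def rebase_reduction(hs_reduction_pct: float, lp_reduction_pct: float) -> float:
    """Low-power leakage reduction relative to high-speed mode, in percent.

    Both inputs are assumed to be reductions vs. the conventional LUT.
    """
    hs_residual = 1.0 - hs_reduction_pct / 100.0   # high-speed leakage / conventional
    lp_residual = 1.0 - lp_reduction_pct / 100.0   # low-power leakage / conventional
    return 100.0 * (1.0 - lp_residual / hs_residual)


def reduction_from_residual(residual_pct: float) -> float:
    """Percent reduction implied by a residual given as a percent of the baseline."""
    return 100.0 - residual_pct


if __name__ == "__main__":
    # Reported reductions vs. the conventional LUT: (high-speed, low-power).
    reported = {"logic 1": (3.6, 70.0), "logic 0": (4.1, 71.8)}
    for level, (hs, lp) in reported.items():
        print(f"{level}: {rebase_reduction(hs, lp):.2f}% lower than high-speed mode")
    # Sleep-mode leakage is reported as 3.59% of high-speed leakage.
    print(f"sleep: {reduction_from_residual(3.59):.2f}% lower than high-speed mode")
```

Under that baseline assumption, the re-based low-power reductions come to roughly 68.9% and 70.6%, both within the 68% to 73% range quoted elsewhere in the paper, and the sleep-mode residual of 3.59% corresponds to a reduction of 96.41%, consistent with the "over 95%" claim.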
Conclusion
In this paper, we proposed a novel FPGA look-up table architecture in which the LUT can operate in three different modes: high-speed, low-power, or sleep. In low-power mode and sleep mode, leakage power is significantly reduced. Compared to a conventional LUT, our novel LUT obtains leakage power reductions of about 68% to 73% in low-power mode and over 95% in sleep mode.
