A Field Programmable Gate Array (FPGA) with fine-grained body biasing achieves substantial static power reduction. However, such an FPGA incurs high area overhead because additional body bias selectors and electrical isolation regions are needed to program the threshold voltage (Vt) of elemental circuits such as MUXes, buffers and LUTs. In this paper, a low-overhead design of an FPGA with fine-grained body biasing is described. The FPGA is designed and fabricated on 65-nm SOTB CMOS technology. By adopting a customized design rule whose reliability is verified by TEGs and by downsizing the body bias selector, the FPGA tile area is reduced by 39% compared with the conventional design, resulting in 900 FPGA tiles with 44,000 programmable Vt regions. In addition, the chip performance is evaluated by implementing a 32-bit binary counter in the supply voltage range from 0.5V to 1.2V. The counter circuit operates at frequencies of 72MHz and 14MHz at supply voltages of 1.2V and 0.5V, respectively. In the best case, a static power saving of 80% is achieved in the elemental circuits of the FPGA at a 0.5-V supply voltage and a 0.5-V reverse body bias voltage. In the whole chip, which includes the configuration memory and body bias selectors in addition to the elemental circuits, an effective static power reduction of around 30% is maintained by applying a 0.3-V reverse body bias voltage at each supply voltage.
key words: FPGA, programmable Vt, body biasing, static power
Introduction
Field Programmable Gate Arrays (FPGAs) can be applied in various fields such as consumer products, industrial equipment, telecommunications, automobiles, aerospace and military applications because the function of a chip can be reprogrammed on-site. Recently, demand to employ FPGAs in battery-powered equipment such as mobile and IoT (Internet of Things) devices has been increasing. To expand the use of FPGAs in battery-powered equipment, it is necessary to further reduce their power consumption.
In general, the power consumption P_total can be divided into the dynamic power P_dynamic and the static power P_static, as in Eq. (1). Dynamic power decreases in proportion to the square of the supply voltage V and in proportion to the operating frequency f. On the other hand, static power originates from the static current I_static, which is primarily composed of the gate leakage current I_gate and the subthreshold leakage current I_sub, as in Eq. (2). Gate leakage current increases exponentially as the gate insulator becomes thinner. Subthreshold leakage current tends to increase because the threshold voltage (Vt) becomes smaller as the semiconductor fabrication process advances. The increase in static power is especially serious in FPGAs because the many redundant resources that ensure the implementation of various functions become a source of large leakage current. In addition, modern FPGAs adopt leading-edge fabrication processes to compensate for the performance degradation caused by the large number of redundant circuits, which is another factor in the leakage current increase.
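Eqs. (1) and (2) are not reproduced in this text. A form consistent with the description above is sketched below. It is an assumption based on the standard CMOS power model, not necessarily the notation of the original equations; the activity factor \alpha and the switched capacitance C are symbols introduced here only for illustration.

```latex
% Assumed form of Eq. (1): total power as dynamic plus static power.
% \alpha (activity factor) and C (switched capacitance) are illustrative symbols.
\begin{align*}
P_{\mathrm{total}} &= P_{\mathrm{dynamic}} + P_{\mathrm{static}}
  \approx \alpha C V^{2} f + V I_{\mathrm{static}} \\
% Assumed form of Eq. (2): static current dominated by two leakage components.
I_{\mathrm{static}} &\approx I_{\mathrm{gate}} + I_{\mathrm{sub}}
\end{align*}
```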
Several power saving techniques for FPGAs have been studied. Li et al. showed that the leakage current in LUTs could be reduced by applying a higher Vt to SRAMs [1]. Rahman et al. evaluated leakage reduction techniques such as dual-Vt design, body biasing and gate biasing [2]. Gayasen et al. implemented a power-gating technique to eliminate wasted power consumption in unused resources [3]. Anderson et al. proposed programmable routing switches with a low-power mode and a sleep mode [4]. Lin et al. integrated the power-gating function into the programmable routing switch [5]. Rahman et al. presented a heterogeneous routing architecture that uses fast and slow routing resources to optimize the standby power, area penalty and performance [6]. Tuan et al. evaluated the performance of a 90-nm FPGA with low-leakage SRAMs and a power-gating technique [7].
Copyright © 2016 The Institute of Electronics, Information and Communication Engineers
Our research group has developed a power reconfigurable FPGA with fine-grained body biasing, called the Flex Power FPGA [8]. Figure 1 shows an overview of the power reconfigurable FPGA. In the FPGA, the Vt of elemental circuits, e.g. multiplexers (MUXes) and Basic Logic Elements (BLEs) in the Logic Block (LB) and Switch Matrix (SM), can be programmed by body bias selectors. Not only the circuit but also the power configuration is mapped onto the FPGA, as shown in Fig. 2. A lower Vt is applied to the elemental circuits on critical paths to maintain the operating speed, while a higher Vt is assigned to the elemental circuits on non-critical paths to reduce static current. After various studies, e.g. exploration of effective Vt steps [9], evaluation of the trade-off between area and leakage current reduction for different Vt programming granularities [10], and investigation of higher-speed operation [11], a fully functional power reconfigurable FPGA chip on a 90-nm bulk CMOS process was successfully realized [12]. Furthermore, 0.4-V power supply operation and effective static power reduction were demonstrated by implementing a 32-bit binary counter on the FPGA in 65-nm SOTB (Silicon On Thin BOX) CMOS technology [13], [14].
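The Vt assignment policy described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of the dedicated CAD tools [8]: the `Element` type, the use of timing slack as the criticality measure, and the example values are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A programmable-Vt elemental circuit (hypothetical model for illustration)."""
    name: str
    slack_ns: float  # timing slack; 0.0 is taken to mean "on a critical path"


def assign_vt(elements, slack_threshold_ns=0.0):
    """Assign low Vt to critical elements and high Vt to all others.

    Elements whose slack does not exceed the threshold keep a low Vt to
    preserve speed; the rest get a high Vt (reverse body bias) to cut leakage.
    """
    return {
        e.name: "low" if e.slack_ns <= slack_threshold_ns else "high"
        for e in elements
    }


# Made-up example: two elements on the critical path, two with spare slack.
elements = [
    Element("mux0", 0.0),
    Element("lut3", 0.0),
    Element("buf7", 1.8),
    Element("mux9", 0.4),
]
vt_map = assign_vt(elements)
print(vt_map)
```

Raising `slack_threshold_ns` would keep near-critical elements at low Vt as a timing margin, at the cost of less leakage reduction.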
However, one remaining problem is the large area-overhead, composed of the body bias circuits and the electrical isolation area between elemental circuits, that is required to realize the programmable Vt function. In the earlier works [12], [13], the mask layout of the whole chip was designed manually, and many design margins were needed to simplify the construction of the mask layout. The number of FPGA tiles was 121. In the recent work [14], 400 FPGA tiles were integrated in the chip, partly because EDA tools such as a placer and router were introduced to design the FPGA layout automatically. However, the design margin for programmable Vt still occupies most of the chip area.
In this paper, the area-overhead of the power reconfigurable FPGA with fine-grained body biasing is reduced both by adopting a customized design rule [15] and by redesigning the body bias selector. The FPGA is designed on 65-nm SOTB CMOS technology. Moreover, the power consumption of the fabricated FPGA chip is evaluated by implementing a 32-bit binary counter. This paper is organized as follows. In Sect. 2, the area-overhead factors and reduction methods are described. In Sect. 3, the power reconfigurable FPGA architecture is explained. In Sect. 4, evaluation results of the area and the power saving ability are shown. Finally, conclusions are given.
Area-Overhead Factors and Reduction Method
In this section, the area-overhead factors and their reduction methods are explained. The Vt of elemental circuits such as MUXes, LUTs and DFFs becomes lower or higher when a body bias voltage is applied. As a result, elemental circuits either operate faster at the cost of larger subthreshold leakage current, or leak less at the cost of slower operation. The body region of the transistors in each elemental circuit needs to be electrically isolated by a triple-well structure to eliminate interaction between body bias voltages. When the triple-well structure is used, the transistor's body region is electrically isolated, but the circuit area increases compared with typical circuits because n-well extensions are added and the space between n-wells is enlarged. It is also necessary to prepare configuration bits that store the power configuration data. Moreover, a voltage level shifter circuit and a voltage selector circuit are needed in the body bias selector in order to handle body bias voltages that are higher than VDD or negative, which are used to shift the Vt of elemental circuits. The enlarged separation space, well extension, additional configuration bits and body bias selector mentioned above are the main area-overhead factors in a power reconfigurable FPGA with body biasing, as shown in Fig. 3. It has been shown that a trade-off exists between static power reduction and area-overhead [10]. Since the programmable Vt regions have the finest granularity in this work, the area-overhead becomes extremely large, while highly efficient static power reduction is obtained. To reduce this area-overhead thoroughly, adoption of a customized design rule and redesign of the body bias selector are attempted.
Adoption of Customized Design Rule
In this work, the power reconfigurable FPGA is designed on the 65-nm SOTB CMOS technology [16], which is an advanced SOI technology with an ultra-thin buried oxide (BOX). Effective energy reduction with the SOTB CMOS technology has recently been reported for new applications such as energy harvesting sensor network systems and long-lasting wearable computers [17]. However, the triple-well structure, which consumes extra silicon area, is needed to form the back-gate under the BOX layer.
The design rules of the triple-well structure for adaptive body bias technology, such as those for the n-well, deep n-well and n-diffusion layers, are customized to shrink the circuit area on SOTB CMOS technology, as shown in Table 1 [15]. The design rule of SOTB CMOS including the triple-well structure is the same as that of bulk CMOS except for the thin-box definition. Several kinds of TEGs (Test Element Groups) based on the modified design rules are designed, and the leakage current that flows between n-wells is evaluated. In TEG4, the n-well space is shrunk by 44%, but a huge current flows when a voltage difference larger than 2V is applied between n-wells because a current path is formed between the lower corners of the n-wells. TEG7 reduces the deep n-well space by 56%. However, a huge current flows between the side-walls of the deep n-wells under the same voltage condition as TEG4. In TEG8, both the n-well space and the deep n-well space are shrunk. In this condition, current paths are created not only between the lower corners of the n-wells but also between the side-walls of the deep n-wells. Although the design rules in TEG8 could realize the maximum area-overhead reduction, it is difficult to adopt them because the breakdown voltage between n-wells is excessively low.
Consequently, the design rules in TEG7 are adopted to design the power reconfigurable FPGA with body biasing in this work. These design rules satisfy two requirements: effective area-overhead reduction, and small current flow between n-wells at the maximum n-well voltage difference of 1V, which is the largest difference applied in the operation experiments on the fabricated chip. It is expected that the customized design rule can reduce the circuit separation space by up to 40%.
Redesign of Body Bias Selector
In the power reconfigurable FPGA with body biasing, additional body bias selectors are implemented in order to finely control the Vt of transistors in elemental circuits such as MUXes and LUTs. Figure 4 (a) depicts the conventional body bias selector. The body bias selector is composed of a signal level shifting stage and a body bias switching stage. In the signal level shifting stage, the input signal level of VDD/VSS is shifted to a level higher than VDD or lower than VSS so that the body bias voltage is switched completely at the body bias switching stage. The body bias switching stage selects the body bias voltages for the pMOS body and the nMOS body based on the information in the configuration memory. Half of the transistors in the conventional body bias selector are normal voltage transistors. The rest are 3.3-V high voltage transistors, which are used in the body bias switching stage and a part of the signal level shifting stage. The 3.3-V high voltage transistors enlarge the area of the body bias selector because their design dimensions are considerably larger, resulting in a channel area 20 times larger than that of a normal transistor. Figure 4 (b) shows the body bias selector redesigned in this work. The 3.3-V high voltage transistors in the conventional body bias selector are replaced with normal transistors. The transistor count is reduced to 14, compared with 20 in the conventional body bias selector. The body bias selector designed in SOTB CMOS technology allows a forward-biased condition in which the transistors' source terminals are connected to a voltage higher than VDD or lower than VSS while the body terminals are tied to VDD or VSS, since the body terminals are electrically isolated from the diffusions. As a result, not only is the junction leakage between the body and the diffusion eliminated, but the transistors can also be designed to be relatively small.
Although it is difficult for this body bias selector to apply an extreme body bias voltage to the transistor's body, a satisfactory subthreshold leakage current reduction is expected to be maintained because SOTB technology has a high sensitivity of Vt to body biasing. The static power of the body bias selector is evaluated in Sect. 4.
Tile Architecture of Power Reconfigurable FPGA
The tile architecture features of our FPGA are summarized in Table 2. The power reconfigurable FPGA with fine-grained body biasing is a typical island-style FPGA. An FPGA tile includes a Logic Block (LB) and a Switch Matrix (SM). The switch matrix uses the disjoint topology [18]. Programmable routing structure is the unidirectional architec-
Evaluation Results
In this section, the area of the power reconfigurable FPGA to which the area-overhead reduction methods are applied is evaluated. Moreover, the operating frequency and power consumption of the chip are measured with a simple sequential circuit implemented on it.
Area
A digital circuit design flow using well-known commercial EDA (Electronic Design Automation) tools is carried out to obtain the mask layout of the FPGA chip. The cell libraries for the MUXes, body bias selectors and DFFs in the BLE are custom designed. Inverters and buffers are selected from the standard cell library for 65-nm SOTB CMOS technology. A placer and router are executed to produce the mask layouts of the elemental circuits, the FPGA tile and the whole chip.
In Fig. 5, the FPGA tile area of this work is compared with that of the earlier work [13] and the recent work [14]. The FPGA tiles of the recent work and this work both have a horizontally line-symmetric structure, and the elemental circuits are arranged in the same manner in both. The tile area of the recent work is 51% smaller than that of the earlier work because the many design margins used to simplify the mask layout were eliminated by introducing EDA tools for the whole chip design. Furthermore, in this work, the FPGA tile area is shrunk by 40% compared with the recent work by adopting the customized design rule and redesigning the body bias selector. Figure 6 compares the two area-overhead evaluation results on the SOTB technology. The area-overhead is reduced by 49% in total. In particular, the shrink of the electrical isolation space between elemental circuits contributes to the area-overhead reduction. The detailed area ratio is summarized in Table 3. The layouts of the elemental circuits and the configuration memory are optimized, resulting in area reductions of 7% and 21%, respectively. The area of the body bias selector becomes about half because of the decrease in transistor count and the removal of the high voltage transistors, which occupy a larger area. Also, the area related to the space between n-wells and the n-well extension is reduced by 48%. As a result, 900 FPGA tiles can be integrated, compared with 400 tiles in the previous work.
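As a worked example of how the two successive tile-area reductions quoted above combine, the short sketch below multiplies the remaining area fractions. The helper name `cumulative_reduction` is introduced only for this illustration, and the result is simple arithmetic on the quoted percentages rather than a figure reported in the paper.

```python
def cumulative_reduction(*step_reductions):
    """Combine successive fractional reductions into one overall reduction.

    Each step shrinks what remains after the previous step, so the
    remaining fractions multiply.
    """
    remaining = 1.0
    for r in step_reductions:
        remaining *= 1.0 - r
    return 1.0 - remaining


# Earlier work [13] -> recent work [14]: 51% smaller tile.
# Recent work [14] -> this work: a further 40% smaller tile.
total = cumulative_reduction(0.51, 0.40)
print(f"Tile area reduction relative to the earlier work: {total:.1%}")
```

Under these figures the tile of this work occupies about 29% of the tile area of the earlier work [13], i.e. a reduction of roughly 71%.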
Operating Speed and Power Consumption
In this subsection, the operating speed and power consumption of the fabricated FPGA are evaluated by using a 32-bit binary counter circuit. Dedicated CAD tools for the FPGA [8] are executed to implement the binary counter on the FPGA. The function of the binary counter is mapped onto LUTs by the LUT mapper and packing tool. After that, the placer and router are run to implement the function of the counter on the FPGA tiles. Finally, the bit-stream data including the circuit description and the speed/power information is generated. Low Vt and high Vt are assigned to the transistors on the critical path and on the non-critical paths of the binary counter circuit, respectively. Figure 7 shows the operating frequency of the 32-bit binary counter. The counter operates successfully over the supply voltage range from 1.2V to 0.5V, with the operating frequency ranging from 72MHz to 14MHz. The operating frequency at each VDD is independent of the reverse body bias voltage because the dedicated CAD tool configures the FPGA so that reverse body biasing is applied only to transistors on the non-critical paths.
Fig. 7 Operating frequency of 32-bit binary counter.
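To give a feel for the two measured operating points, the sketch below applies the proportionality stated in the Introduction (dynamic power scales with the square of the supply voltage and with the frequency). It is a rough estimate under the assumption that the switched capacitance and switching activity are the same at both points; it is not a measured result, and the function name is hypothetical.

```python
def relative_dynamic_power(v, f, v_ref, f_ref):
    """Dynamic power at (v, f) relative to (v_ref, f_ref), assuming P ~ V^2 * f.

    Assumes identical switched capacitance and switching activity at both
    operating points, which is an idealization.
    """
    return (v / v_ref) ** 2 * (f / f_ref)


# Measured operating points of the 32-bit counter: 72MHz at 1.2V, 14MHz at 0.5V.
ratio = relative_dynamic_power(0.5, 14e6, 1.2, 72e6)
energy_per_cycle_ratio = (0.5 / 1.2) ** 2  # frequency cancels out per cycle

print(f"Estimated dynamic power at 0.5V relative to 1.2V: {ratio:.3f}")
print(f"Estimated dynamic energy per cycle ratio: {energy_per_cycle_ratio:.3f}")
```

Under this idealized model the dynamic power at 0.5V is a few percent of that at 1.2V, which is why static power becomes the dominant concern at low supply voltage.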
The static power of the elemental circuits such as the MUXes, buffers, LUTs and DFFs is measured as shown in Fig. 8. The supply voltage VDD is varied from 1.2V to 0.5V, while a reverse body bias voltage with an absolute value |VRBB| in the range of 0V to 0.5V is applied to the bodies of the high-Vt transistors. The supply voltage of the body bias selector is set as follows.
When VDD is set to 1.2V, the static power is reduced by 59% by applying a reverse body bias of 0.5V. At a VDD of 0.5V, the static power is reduced by 80%. The smaller gate leakage of the transistors at lower VDD is considered to improve the static power saving efficiency. Figure 9 shows the total static power of the chip. The total static power tends to decrease as the supply voltage is lowered. On the other hand, when the reverse body bias is gradually increased, a |VRBB| of 0.3V minimizes the total static power. When |VRBB| is 0.4V or 0.5V, the total static power becomes larger than that in the zero body bias condition. An effective static power reduction of around 30% is maintained by applying a 0.3-V reverse body bias voltage at each supply voltage.
The total static power at a VDD of 0.5V is broken down as shown in Fig. 10. The static power of the configuration memory is constant even if the reverse body bias voltage is varied because the transistor body regions of the configuration memory are tied to VDD and VSS. In contrast to the drastic reduction in the elemental circuits, the static power of the body bias selector increases as the reverse body bias voltage is applied. As a result, a minimum point exists in the total static power of the chip. The body bias selector is in a forward-biased condition of |VRBB| because its supply voltages are VDDH and VSSL while the transistor body regions of the body bias selector are tied to VDD and VSS. Therefore, the Vt of the body bias selector is lowered and its subthreshold leakage current increases as the reverse body bias voltage becomes larger. Figure 11 shows the breakdown of the static current and static power per FPGA tile in this work and the recent work [14] in the zero body bias (ZBB) condition. As shown in Fig. 11 (a), the static current of the body bias selectors in this work and the recent work occupies approximately equal proportions of an FPGA tile, 5% and 8%, respectively. On the other hand, the static power of the previous body bias selector increases to 20% of an FPGA tile, while the static power of the body bias selector in this work remains at 5%, as shown in Fig. 11 (b). The static power increase in the previous body bias selector arises because its supply voltage cannot be scaled down below 3.3V. The previous body bias selector does not operate normally below a 3.3-V supply voltage because a part of it is composed of transistors with a relatively higher threshold voltage. In contrast, the supply voltage of the body bias selector in this work, which is designed by using only normal transistors, can be scaled down below 3.3V. The body bias selector successfully operates at a 0.5-V supply voltage.
Fig. 11 The breakdown of (a) static current and (b) static power per FPGA tile in this work and the recent work [14].
Here, static power reduction is discussed for the case in which the body bias selector of the recent work [14], whose body bias voltage range is wider than that in this work, is adopted. First, the static power of the body bias selector is calculated as one quarter of the sum of the static power of the elemental circuits (BUF, MUX, LUT, DFF, etc.) and the configuration memory in the ZBB condition, because the static power of the previous body bias selector occupies 20% of an FPGA tile as shown in Fig. 11 (b). Furthermore, it is assumed that the static power of the body bias selector stays constant at each body bias voltage. Secondly, by fitting to the measured static power as shown in Fig. 12 (a), the behavior of the static power in the elemental circuits is given by
P_e = 0.576 e^{-3.391 V_{RBB}} (5)
where P_e is the static power of the elemental circuits and V_{RBB} is the reverse body bias voltage. The static power of the configuration memory is the same value as the measured data in this work.
Based on the above, the calculated total static power of an FPGA tile adopting the previous body bias selector is shown in Fig. 12 (b). The static power of an FPGA tile in the ZBB condition increases by 14% when the previous body bias selector is adopted. However, the static power is reduced effectively by applying a deeper reverse body bias because the previous body bias selector can apply the body bias voltage over a wider range. Applying a 1-V RBB achieves a static power per FPGA tile that is a further 22% lower than the minimum static power in the measurement result.
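The estimation procedure above can be sketched as follows. The fitted constants come from Eq. (5); the configuration memory power `P_CONFIG` is a hypothetical placeholder because its measured value is not given in this text, and the units are those of Fig. 12, which are not stated here. The function names are introduced only for this sketch.

```python
import math

P_E0 = 0.576   # fitted elemental-circuit static power at zero body bias, Eq. (5)
DECAY = 3.391  # fitted exponential coefficient in Eq. (5), per volt


def elemental_power(vrbb):
    """Static power of the elemental circuits at reverse body bias vrbb, Eq. (5)."""
    return P_E0 * math.exp(-DECAY * vrbb)


def selector_power(p_config):
    """Previous body bias selector: one quarter of the rest of the tile at ZBB.

    Assumed constant over body bias voltage, as stated in the text.
    """
    return 0.25 * (elemental_power(0.0) + p_config)


def tile_power(vrbb, p_config):
    """Total static power of a tile that uses the previous body bias selector."""
    return elemental_power(vrbb) + p_config + selector_power(p_config)


P_CONFIG = 0.2  # hypothetical placeholder, NOT a measured value

# The selector share at ZBB is 20% by construction, whatever P_CONFIG is.
share_at_zbb = selector_power(P_CONFIG) / tile_power(0.0, P_CONFIG)
# Reduction of the elemental-circuit power predicted by the fit at 0.5V RBB.
reduction_at_0v5 = 1.0 - elemental_power(0.5) / elemental_power(0.0)

print(f"Selector share at ZBB: {share_at_zbb:.0%}")
print(f"Elemental power reduction at 0.5V RBB: {reduction_at_0v5:.1%}")
```

The fit gives a reduction of roughly 82% at 0.5V RBB, in line with the 80% measured for the elemental circuits. The 14% and 22% figures quoted in the text depend on the measured configuration memory power and are not reproduced by the placeholder value.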
Conclusions
In order to reduce the area-overhead of the power reconfigurable FPGA with fine-grained programmable Vt control domains, a customized design rule for the triple-well structure that electrically isolates the body regions of circuits is adopted, and the body bias selector is redesigned. As a result of fabrication on the 65-nm SOTB CMOS technology, the FPGA tile area is reduced by 40% compared with the recent work. With the customized rule, the design lengths related to the n-well, deep n-well and n-diffusion layers are shrunk by 48%. The body bias selector becomes 57% smaller by using only normal transistors, without the high voltage transistors that need extra design area. 900 FPGA tiles can be packed in the same silicon area as the previous work, in which 400 FPGA tiles were integrated. Moreover, the operating speed and static power consumption are evaluated by implementing a 32-bit binary counter. The counter circuit operates at frequencies of 72MHz and 14MHz at supply voltages of 1.2V and 0.5V, respectively. By applying a reverse body bias voltage of 0.5V, the static power of the elemental circuits is reduced by 80% at a supply voltage of 0.5V. In the whole chip, the leakage current of the body bias selector increases as the reverse body bias voltage is applied, whereas the leakage current of the elemental circuits decreases. An effective static power reduction of around 30% is maintained by applying a 0.3-V reverse body bias voltage at each supply voltage. If the previous body bias selector with its wider body bias voltage range is adopted in the FPGA, applying a 1-V RBB can achieve a static power per FPGA tile that is a further 22% lower than the minimum static power in the measurement result.
