I. INTRODUCTION
CMOS device scaling has been one of the major driving forces behind high performance and very high density integrated circuits. However, to maintain the same pace in nanometer technologies, low power and process variation control have emerged as formidable challenges. Device scaling also results in Vth scaling, which leads to higher sub-threshold leakage due to its exponential dependence on Vth.
In nanometer technologies, short channel effects (SCEs) further aggravate the problem of leakage power. SCEs are secondary effects that come into the picture when the physical dimensions of the transistor reach the nanometer regime. In short channel devices, Vth decreases with reduction in channel length. This phenomenon is also known as Vth roll-off. Drain induced barrier lowering (DIBL) is another phenomenon that reduces the Vth of the transistors, causing higher leakage [1]. DIBL reduces the gate control of the channel, and the device is not able to shut off completely. Thus, in nanometer technologies, the contribution of leakage power to the total power of an IC is significant, which reduces the battery life of portable devices.
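The exponential dependence of sub-threshold leakage on Vth can be illustrated with the standard first-order relation in which leakage is proportional to 10^(-Vth/S), where S is the sub-threshold slope. The following is a minimal sketch under stated assumptions: the function names, the 100 mV/decade slope and the DIBL coefficient are illustrative choices, not values taken from this paper.

```python
def leakage_ratio(delta_vth_mv, slope_mv_per_decade=100.0):
    """Factor by which sub-threshold leakage changes for a Vth shift.

    First-order model: I_sub is proportional to 10 ** (-Vth / S),
    where S is the sub-threshold slope in mV/decade (assumed value).
    """
    return 10.0 ** (-delta_vth_mv / slope_mv_per_decade)


def effective_vth_mv(vth0_mv, vds_v, dibl_mv_per_v):
    """Vth lowered by DIBL, using the linear approximation Vth0 - eta * Vds."""
    return vth0_mv - dibl_mv_per_v * vds_v


if __name__ == "__main__":
    # Leakage multiplier for a range of Vth shifts (negative = lower Vth).
    for shift_mv in (-100, -50, 0, 50, 100):
        print(f"dVth = {shift_mv:+4d} mV -> leakage x{leakage_ratio(shift_mv):.2f}")

    # Hypothetical device: 300 mV nominal Vth, 100 mV/V DIBL, Vds = 1 V.
    vth_eff = effective_vth_mv(300.0, 1.0, 100.0)
    print(f"Vth with DIBL: {vth_eff:.0f} mV, "
          f"leakage x{leakage_ratio(vth_eff - 300.0):.1f} versus no DIBL")
```

With these assumed numbers, every 100 mV reduction in Vth, whether from roll-off or from DIBL, costs one decade of leakage, which is why small Vth shifts matter so much for standby power.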
Device scaling also increases the process variations leading to variations in threshold voltage, channel length and other circuit parameters. The variations in process parameters are induced due to the imperfections associated with the fabrication process. Factors such as random dopant fluctuations, line edge roughness and imperfections due to sub-wavelength lithography cause large variations in device circuit parameters [2] . Process variations can be broadly classified into die-to-die (D2D) variations, within-die variations, wafer-to-wafer and lot-to-lot variations. For high performance ICs, D2D and within-die variations significantly impact the performance and power consumption [3] . Process variations lead to large variation in the operating frequency and leakage power consumption of an IC. After fabrication, each die must meet frequency and the maximum power consumption requirement. Dies operating at lower frequency or exceeding power budget contribute to yield loss.
The impact of process variations can be significantly reduced by adaptive circuit design for variation control. Body biasing can be used to dynamically modify the Vth of a transistor. Reverse body biasing (RBB) increases the Vth of a device, which reduces the leakage power. Similarly, forward body biasing (FBB) reduces the Vth of a transistor, which improves transistor performance and can thus be used to increase the operating frequency of a design, but at the cost of higher leakage power. Bidirectional adaptive body biasing (ABB) [4] makes use of both RBB and FBB on the same chip. Post-silicon tuning using body biasing can also be used to improve yield by increasing the number of dies that satisfy the frequency and power constraints. Adaptive supply voltage (ASV) [3] can be used as well for variation control by trading off power and performance. Adaptive MTCMOS [5] is another technique for active mode leakage and frequency control. Multiple footer devices are inserted between the ground and the virtual ground terminal of the circuit. The total width of the footer devices can be dynamically controlled using a feedback mechanism.
At the circuit level, leakage power can be reduced by using multi-Vth devices [6], in which gates lying on non-critical paths are replaced by high-Vth devices while gates on critical paths use low-Vth devices. RBB and power gating [7] are also effective techniques for leakage power reduction in standby mode. At the device level, leakage power can be controlled by reducing SCEs through process level techniques. DIBL can be reduced by using shallow junctions and pocket implants at the source and drain junctions to reduce the depletion layer widths. DIBL can also be mitigated by using a thin gate oxide (to increase the influence of the gate on the channel) [8], but this can cause higher gate leakage. SCEs can also be controlled by using heavy channel doping, but this degrades device performance due to mobility degradation, larger depletion capacitance and a degraded sub-threshold slope. It is clear that scaling will be very difficult to sustain using Bulk technology, and it is time to look for alternative devices that perform better in the nanometer regime and are scalable as well.
Recently, planar double gate devices [9] [1] [8] have been proposed which minimize SCEs due to a very thin body and a lightly doped channel. They also reduce the random dopant fluctuations in the channel and the mobility degradation due to Coulombic scattering. These devices also show a better sub-threshold slope than Bulk CMOS due to effective control of SCEs. In this paper, we show that these devices are very robust under process variations as compared to Bulk technology and are well suited for adaptive circuit design.
In a double gate FET (DGFET), the passive substrate is replaced by an actively biased gate known as the back gate. The primary gate terminal is known as the front gate, and the channel can be modulated using the front and/or back gate [9]. Figure 1 shows a planar double gate device. It consists of a very thin body and a lightly doped channel, which eliminates the leakage paths that are not well controlled by the gate, namely the ones that are physically far from the gate [1]. As a result, the DGFET minimizes SCEs effectively and provides better performance than conventional Bulk devices in sub-micron designs. An additional advantage is the reduction in capacitive load through the elimination of both depletion and junction capacitances.
DGFETs can be broadly classified into two categories:
• Symmetric: In a symmetric DGFET, the front and the back gate are identical, having the same oxide thickness and gate material work function. Symmetric DGFETs can be used as 3 terminal (3T) or 4 terminal (4T) devices [10]. In 3T mode, the front gate and the back gate are connected together and provide better control of the channel by the gate. In 4T mode, the two gates can be connected to different input signals and the device acts as two transistors in parallel. Due to less coupling between the front and back gates, the front gate Vth cannot be modulated using the back gate voltage.
• Asymmetric: The asymmetric DGFET (IGFET) consists of non-identical front and back gates. The back gate material has a higher work function than the front gate and a thicker gate oxide [10]. IGFETs can be used like conventional Bulk CMOS devices. The back gate is used to modulate the Vth of the front channel by applying a voltage to it, which can be used to control circuit delay and leakage power.
In this paper, we will focus on asymmetric DGFET only. 
II. PREVIOUS WORK
In [11] , the authors propose the use of asymmetric, planar double gate FDSOI devices to control the front gate threshold voltage and to reduce the sensitivity of the threshold voltage to film thickness in FDSOI devices. They compare the threshold voltage, leakage current and drive currents for FDSOI and asymmetric, double gate FDSOI device.
The authors of [12] propose new double gate logic circuit schemes using only symmetrical gates to reduce the area and leakage/active power. The authors propose new circuit style for NAND and NOR logic gates. The parallel devices are implemented using split front gate and back gate devices while connected gate devices are used for stacked transistors.
In [13], the authors evaluate the performance of connected gate and independent gate symmetric DGFETs using benchmark circuits such as a NAND gate and a ring oscillator. In the connected gate configuration, the front gate and back gate are shorted, while in the independent gate configuration, the back gate is grounded for NMOS and connected to the supply for PMOS devices. The authors also design a VCO in which the back gate voltage is used to control the VCO frequency.
In [10], the authors present various double gate devices including FinFET and discuss the various leakage mechanisms in symmetric and asymmetric DGFETs. Various circuit design techniques using 3T and 4T DGFETs are presented, including a Schmitt trigger and a sense amplifier.
Similarly, in [14] , the authors report various circuit design techniques specific to dynamic logic like keeper circuitry and precharge logic using independent gate DGFET. They also discuss a case study where they compare the performance of a tunable VCO designed using IGFET with Bulk CMOS. All the previous work reported above focuses on circuit design techniques using 3T and 4T symmetric double gate devices whereas in this work we extensively analyze the suitability of IGFET for variation control and yield improvement.
III. Vth CONTROL IN BULK AND IGFET
In Bulk CMOS devices, the body effect parameter γ plays a very important role in Vth control using FBB or RBB. However, with device scaling, the body effect parameter decreases, which reduces the effectiveness of body biasing for dynamic Vth control. In IGFET, the back gate is strongly coupled to the front gate and can be used for threshold modulation. The back gate has a very high threshold voltage compared to the front gate. Unlike in Bulk CMOS, the Vth of IGFET does not depend on γ, which provides effective Vth modulation even in nanometer technologies.
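The role of the body effect parameter (written γ here) in Bulk Vth control can be illustrated with the textbook body effect relation Vth = Vth0 + γ(sqrt(2φF + VSB) - sqrt(2φF)). The sketch below is only an illustration under assumed values: the nominal Vth, the Fermi potential φF and the two γ values are hypothetical and are not taken from the PTM models used in this work.

```python
import math


def bulk_vth(vth0_v, gamma, vsb_v, phi_f_v=0.45):
    """Textbook body effect: Vth = Vth0 + gamma*(sqrt(2*phiF + Vsb) - sqrt(2*phiF)).

    vsb_v > 0 corresponds to reverse body bias, vsb_v < 0 to forward body bias.
    phi_f_v is an assumed Fermi potential, not a value from the paper.
    """
    two_phi = 2.0 * phi_f_v
    if two_phi + vsb_v < 0.0:
        raise ValueError("forward bias too large for this simple model")
    return vth0_v + gamma * (math.sqrt(two_phi + vsb_v) - math.sqrt(two_phi))


if __name__ == "__main__":
    vth0 = 0.30  # hypothetical nominal Vth in volts
    for gamma in (0.40, 0.10):  # larger vs. smaller body effect parameter
        rbb = bulk_vth(vth0, gamma, +0.32) - vth0
        fbb = bulk_vth(vth0, gamma, -0.32) - vth0
        print(f"gamma = {gamma:.2f}: RBB shift {rbb * 1e3:+.1f} mV, "
              f"FBB shift {fbb * 1e3:+.1f} mV")
```

For the same bias, the Vth shift scales directly with γ, so a smaller γ at scaled nodes translates into a proportionally smaller tuning range for Bulk body biasing.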
In Bulk CMOS, there is a fundamental limit to the FBB that can be applied because of the forward biasing of the PN junctions formed between the drain and source junctions and the substrate. Forward biasing these PN junctions leads to a large current between the drain and source junctions and the substrate, which significantly impacts the power. With scaling, the supply voltage also decreases, which further reduces the FBB range in Bulk CMOS devices. On the other hand, in IGFET, the back gate is isolated from the body through an oxide layer, and the back gate can be used for FBB until the back surface becomes strongly inverted. Due to the very high Vth of the back channel (about 1 V higher than that of the front channel), a large range of FBB can be supported in IGFET. RBB is used for sub-threshold leakage reduction in standby mode by increasing the Vth of the transistors. With scaling, RBB becomes less effective for Bulk due to higher SCEs in the nanometer regime [15]. IGFETs minimize SCEs due to the lightly doped channel and very thin body. Thus, RBB using the back gate can be effectively applied in nanometer IGFETs.
In this paper, all the experiments were conducted using a 32 nm process technology with Vdd = 1 V. PTM [16] models were used for Bulk based circuit simulations, while Verilog-A based BSIM-IMG [17] models were used for IGFET. The NMOS and PMOS devices in both technologies had comparable Vth and drive current around VDS = VGS = 1 V. Figures 2 and 3 show the plot of power-delay product versus NMOS body/back-gate bias for an FO4 inverter and a 51-stage ring oscillator designed using Bulk CMOS and IGFET for the same area. The bias was applied to both NMOS and PMOS. The curves are shown for an extended range of FBB and RBB. From the plots it can be observed that for Bulk, beyond 0.5 V, there is a sharp increase in the power-delay product due to forward biasing of the junction diodes, which increases power significantly. For IGFET, the power-delay product increases sharply only beyond 1 V (Vdd) due to inversion of the back channel, and the slope of the curve is much smaller than for Bulk. With reverse bias, for the FO4 inverter, the power-delay product is delay dominated and the slope of the curve for IGFET is comparable to that of Bulk; due to its larger RBB range, IGFET offers better delay control than Bulk. In the ring oscillator, RBB increases the delay due to higher Vth and reduces power due to active mode leakage reduction. For Bulk, the percentage increase in delay is comparable to the percentage decrease in power, so the curve remains relatively flat. However, for IGFET, the power savings are larger than the increase in delay, so the power-delay product decreases with RBB. Moreover, in both cases above, the power-delay product curve for IGFET is always lower than that for Bulk, suggesting low power and high performance using IGFET.
Apart from this, there are several other reasons that make IGFET superior to Bulk CMOS. In stacked Bulk CMOS devices, the reverse body effect lowers the operating speed of wide-input logic gates, but in IGFET the reverse body effect does not occur because of the floating body [18]. In Bulk, all devices share the same substrate, so body biasing individual transistors is highly impractical because of the large area penalty, whereas in IGFET individual transistors can be controlled without any extra penalty. The frequency at which the substrate bias can be applied is limited by the RC delay of the substrate contacts, whereas in IGFET the limit is imposed by the back gate capacitance and wire delay, which is the same as the main logic delay. This leads to faster Vth modulation than in Bulk.
IV. VARIATION TOLERANCE
To compare the performance in the presence of process variations, we performed Monte Carlo (MC) simulations on an inverter chain circuit designed using Bulk and IGFET. In IGFET, the back gate of NMOS was tied to Vss, while for PMOS the back gate was tied to Vdd. The nominal value of the supply voltage used was 1 V. Based on ITRS predictions, parameter variations were induced in all the transistors of the circuit. The 3σ values of the parameter variations [19] are listed in Table I. For a given MC run, the same supply voltage was used for the whole combinational circuit. Thickness variation in the back gate oxide of IGFET was neglected due to its high thickness as compared to the front gate. 1000 MC simulations were performed for each of Bulk and IGFET.
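The sampling scheme described above can be sketched as follows. This is only an illustration of how 3σ specifications might be turned into per-transistor Gaussian samples with one supply voltage shared across each run; the parameter names, nominal values and 3σ fractions below are hypothetical placeholders (the actual values are those of Table I), and treating the supply voltage as a varied parameter is an assumption.

```python
import random

# Hypothetical per-transistor parameters: name -> (nominal, 3-sigma as a fraction).
DEVICE_PARAMS = {
    "channel_length_nm": (32.0, 0.10),
    "front_oxide_nm": (1.0, 0.10),
}
# Hypothetical supply variation: drawn once per run, shared by all transistors.
VDD_PARAM = (1.0, 0.10)


def gauss_from_3sigma(rng, nominal, three_sigma_frac):
    """Draw a Gaussian sample whose 3-sigma spread is the given fraction of nominal."""
    sigma = nominal * three_sigma_frac / 3.0
    return rng.gauss(nominal, sigma)


def draw_run(rng, n_transistors):
    """One MC run: a single Vdd plus independent parameters for each transistor."""
    devices = [
        {name: gauss_from_3sigma(rng, nom, frac)
         for name, (nom, frac) in DEVICE_PARAMS.items()}
        for _ in range(n_transistors)
    ]
    return {"vdd_v": gauss_from_3sigma(rng, *VDD_PARAM), "devices": devices}


def run_monte_carlo(n_runs=1000, n_transistors=8, seed=0):
    """Generate the parameter sets that would be handed to the circuit simulator."""
    rng = random.Random(seed)
    return [draw_run(rng, n_transistors) for _ in range(n_runs)]


if __name__ == "__main__":
    runs = run_monte_carlo()
    print(len(runs), "runs; first run Vdd =", round(runs[0]["vdd_v"], 3), "V")
```

Each generated run would then be simulated once for Bulk and once for IGFET, so that the two technologies are compared under the same kind of parameter spread.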
For delay comparison, the circuits were designed to have a delay of 1 ns and identical inputs were applied. The plot in Figure 4 shows the histogram of the delay distribution. The mean of the delay distribution is almost the same for Bulk and IGFET, but for Bulk σ is 67.9 ps whereas IGFET shows a σ of 44.6 ps. The σ for Bulk is 52% higher than that of IGFET, showing better variation tolerance in IGFET. Another experiment was performed to compare the variation in circuit leakage in idle mode under the effect of process variations. For a fair comparison, both circuits were designed for the same area, and in idle mode the inputs were tied to Vss. Figure 5 shows the distribution of circuit leakage for Bulk and IGFET. For IGFET, the mean leakage is 2.81 µW and σ is 0.27 µW, whereas for Bulk the mean leakage is 6.64 µW and the observed σ is 1.05 µW, which is 3.8x that of IGFET. The better variation tolerance of IGFET can be attributed to better control of SCEs, a nearly ideal sub-threshold slope, lower DIBL and better Vth roll-off as compared to Bulk.
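The comparison above can be cross-checked with a few lines of arithmetic on the reported statistics. The sketch below uses only the numbers quoted in this section; the helper names are our own, and the 1 ns mean delay is the design target rather than a separately reported mean.

```python
def percent_higher(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1.0) * 100.0


def coeff_of_variation(sigma, mean):
    """Relative spread sigma/mean, in percent."""
    return sigma / mean * 100.0


# Values reported in the text (delay sigma in ps, leakage in microwatts).
DELAY_SIGMA_PS = {"Bulk": 67.9, "IGFET": 44.6}
LEAK_MEAN_UW = {"Bulk": 6.64, "IGFET": 2.81}
LEAK_SIGMA_UW = {"Bulk": 1.05, "IGFET": 0.27}
TARGET_DELAY_PS = 1000.0  # design target of 1 ns

if __name__ == "__main__":
    extra = percent_higher(DELAY_SIGMA_PS["Bulk"], DELAY_SIGMA_PS["IGFET"])
    print(f"Bulk delay sigma is {extra:.1f}% higher than IGFET")
    for tech in ("Bulk", "IGFET"):
        d_cv = coeff_of_variation(DELAY_SIGMA_PS[tech], TARGET_DELAY_PS)
        l_cv = coeff_of_variation(LEAK_SIGMA_UW[tech], LEAK_MEAN_UW[tech])
        print(f"{tech}: delay sigma/mean = {d_cv:.2f}%, leakage sigma/mean = {l_cv:.1f}%")
    ratio = LEAK_SIGMA_UW["Bulk"] / LEAK_SIGMA_UW["IGFET"]
    print(f"Leakage sigma ratio Bulk/IGFET = {ratio:.2f}")
```

Normalizing the leakage spread by its mean shows that IGFET is tighter in relative terms as well (about 9.6% against about 15.8% for Bulk), not only in absolute microwatts.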
In chip design, the goal is to achieve the highest frequency of operation while meeting the power constraints. However, process variations lead to a distribution of die frequencies and leakages. ABB is an effective technique for meeting the desired frequency and power constraints by dynamically modifying the Vth of the transistors. ABB trades off performance for power and vice versa. Chips that do not meet the desired frequency require FBB for higher performance at the cost of higher leakage power, whereas chips that fail to meet the leakage constraint need to be operated at a lower frequency, with RBB used for leakage reduction. In order to compare the performance of Bulk and IGFET in variation control using bidirectional ABB, we implemented a critical path replica based delay locked loop (DLL) circuit [4]. The aim of the circuit is to minimize the leakage power while ensuring that the critical path meets timing. The circuit is shown in Figure 6. It consists of a critical path replica, which is used to replicate the delay of the critical path in the circuit. Instead of replicating the critical path, the inputs and outputs of the critical path can also be used directly, but this is intrusive to the design. We created a critical path replica using a chain of buffers whose total delay is matched to the critical path delay of the circuit. Here, we assume that a buffer chain can be effectively used for modeling the critical path. A phase detector detects whether the critical path replica is faster or slower than the target clock. The output of the phase detector is connected to a 5 bit up-down counter, which digitally records the phase difference between the input reference clock signal and the output of the critical path replica. The counter increments or decrements dynamically. The output of the up-down counter is connected to a digital to analog (D2A) converter, which creates the body bias for the NMOS and PMOS devices in the circuit and the critical path replica.
For simplicity, we have used a single up-down counter shared by the NMOS and PMOS devices, with a separate D2A converter for each.
The DLL circuit was implemented in 32 nm technology for a target frequency of 1 GHz. The critical path replica was designed to match a delay of 1 ns. In order to simulate the corner case, we induced 3σ variation (obtained from MC simulations) in the circuit corresponding to the worst case delay for Bulk and IGFET. For a fair comparison, the same body bias range was used for Bulk and IGFET. A body bias range of +320 mV for FBB to -320 mV for RBB in 20 mV steps was used. SPICE simulations were run to determine the locking behavior of the DLL for Bulk and IGFET. Figure 7 shows the plot of frequency vs time units and PMOS body/back gate bias voltage vs time units, for Bulk and IGFET (worst case). Due to better variation tolerance, under 3σ variations, IGFET operates at a higher frequency than Bulk when no bias is applied. The bias changes with time and the frequency rises towards the target frequency. The body/back gate bias curves for Bulk and IGFET almost overlap. The DLL locks within 10 ps of the target clock period. From the curves it can be seen that Bulk and IGFET lock when the body bias reaches the maximum value. The locking time is also almost the same for both technologies, suggesting the effectiveness of modulating Vth in IGFET based circuits. The locking time for IGFET can be further reduced by increasing the maximum bias range. Similar experiments were done for best case delay variation as well and similar results were obtained, but they have been omitted here for brevity.
VI. ENHANCING YIELD USING Vth MODULATION
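The locking behavior of such a loop can be sketched with a simple behavioral model. This is not the SPICE-level circuit used in our experiments: the linear delay-versus-bias model and its sensitivity are assumptions made purely for illustration, and the counter is modeled as a signed step count clamped to ±16 steps of 20 mV (how the 5 bit counter encodes the two end points of the ±320 mV range is not modeled here). The 10 ps lock window and the 1 ns target period are taken from the text.

```python
STEP_MV = 20          # bias step from the text
MAX_CODE = 16         # 16 steps * 20 mV = 320 mV, the range used in the text
LOCK_WINDOW_PS = 10.0 # lock is declared within 10 ps of the target period


def replica_delay_ps(delay_at_zero_bias_ps, bias_mv, sensitivity_per_v=0.3):
    """Assumed linear model: forward bias (positive) shortens the replica delay."""
    return delay_at_zero_bias_ps * (1.0 - sensitivity_per_v * bias_mv / 1000.0)


def run_abb_loop(delay_at_zero_bias_ps, period_ps=1000.0, max_cycles=64):
    """Step the up-down counter once per cycle until lock or bias saturation.

    Returns a list of (code, bias_mv, delay_ps) tuples, one per cycle.
    """
    code = 0
    history = []
    for _ in range(max_cycles):
        bias_mv = code * STEP_MV                       # D2A conversion
        delay = replica_delay_ps(delay_at_zero_bias_ps, bias_mv)
        history.append((code, bias_mv, delay))
        error = delay - period_ps                      # phase detector
        if abs(error) <= LOCK_WINDOW_PS:
            break                                      # locked
        step = 1 if error > 0 else -1                  # slow -> FBB, fast -> RBB
        new_code = max(-MAX_CODE, min(MAX_CODE, code + step))
        if new_code == code:
            break                                      # bias range exhausted
        code = new_code
    return history


if __name__ == "__main__":
    for label, d0 in (("slow die", 1050.0), ("very slow die", 1200.0)):
        code, bias_mv, delay = run_abb_loop(d0)[-1]
        print(f"{label}: final bias {bias_mv:+d} mV, replica delay {delay:.1f} ps")
```

In this toy model a moderately slow die locks with a small forward bias, while a very slow die runs into the +320 mV limit, which mirrors the observation above that the worst case samples settle at the maximum bias.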
After fabrication, each chip should meet minimum frequency and maximum power constraints. But, process variations can lead to significant yield loss due to deviation in circuit delay and power from the intended target. The chips which operate at low frequency or consume excessive power contribute to yield loss. Post-silicon tuning using body biasing is effective for tightening the delay and power distribution for improving yield. In this section we compare the effectiveness of post-silicon tuning using body biasing for yield enhancement in Bulk and IGFET.
We implemented a combinational circuit having same area for Bulk and IGFET. For a fair comparison, the maximum and minimum body bias range of +320mV and -320mV was used for both. MC simulations were done using the parameter values listed in Table I . The resulting device parameters were used to simulate the effect of process variations. Different samples (from MC simulations) have different delay and power. Samples which violate the maximum delay or maximum power constraint are compensated for delay and power using body biasing. For a violating sample, multiple SPICE simulations are done to find the optimum bias voltage. Due to high runtime requirements we performed this experiment on a set of 700 samples only. Figure 8 shows the power-delay scatter plot for the combinational circuit implemented using Bulk technology. The delay and power numbers shown are normalized with maximum delay and power constraint. A large number of samples violate the delay or power constraints. After compensation, the sample which violates delay constraint is now accepted at higher power but within the maximum limit. Similarly, the samples which violate power constraint are accepted at higher delay but within maximum delay constraint. A small number of samples are still not able to meet the delay and power constraint and contribute to yield loss. Figure 9 shows the scatter plot for IGFET based circuit and results after compensation for delay and power. A comparison of results after compensation for Bulk and IGFET reveals that IGFETs result in a tighter distribution than Bulk. In fact, better results can be obtained for IGFET by increasing the range of back gate bias which is not possible in Bulk due to reasons explained earlier.
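The compensation procedure can be outlined in a short sketch. In our experiments the effect of each bias value is obtained from SPICE simulation; here it is replaced by a toy model (linear delay and exponential power dependence on bias) whose coefficients are purely hypothetical. The 20 mV search step and the choice of the lowest-power feasible bias as the "optimum" are also assumptions of this sketch. Delay and power are normalized so that 1.0 is the constraint, as in the scatter plots.

```python
import math

# Candidate biases from -320 mV (RBB) to +320 mV (FBB); 20 mV step is assumed.
BIAS_STEPS_V = [s * 0.02 for s in range(-16, 17)]


def apply_bias(delay, power, bias_v, delay_sens=0.3, power_sens=3.0):
    """Toy stand-in for a SPICE run: FBB speeds the circuit up and raises power."""
    return delay * (1.0 - delay_sens * bias_v), power * math.exp(power_sens * bias_v)


def meets_constraints(delay, power):
    """Both values are normalized to their maximum allowed value."""
    return delay <= 1.0 and power <= 1.0


def tune(delay, power):
    """Return the bias (V) that rescues a sample, or None if none exists.

    Samples that already pass are left at zero bias. Among feasible biases,
    the one giving the lowest power is chosen.
    """
    if meets_constraints(delay, power):
        return 0.0
    best = None
    for bias in BIAS_STEPS_V:
        d, p = apply_bias(delay, power, bias)
        if meets_constraints(d, p) and (best is None or p < best[1]):
            best = (bias, p)
    return None if best is None else best[0]


def yield_fraction(samples, with_tuning):
    """Fraction of (delay, power) samples accepted, with or without tuning."""
    if with_tuning:
        passed = sum(1 for d, p in samples if tune(d, p) is not None)
    else:
        passed = sum(1 for d, p in samples if meets_constraints(d, p))
    return passed / len(samples)


if __name__ == "__main__":
    samples = [(0.9, 0.8), (1.05, 0.5), (1.2, 1.2), (0.8, 1.3)]
    print("yield before tuning:", yield_fraction(samples, with_tuning=False))
    print("yield after tuning: ", yield_fraction(samples, with_tuning=True))
```

Slow samples with power headroom are recovered with forward bias, leaky samples with timing slack are recovered with reverse bias, and samples that violate both constraints remain as yield loss, which is the qualitative behavior seen in the scatter plots.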
VII. CONCLUSIONS
We have shown that planar asymmetric double gate transistors enable low power and high performance circuits in the nanometer regime. IGFET provides better variation tolerance than Bulk and is highly suitable for variation control and yield enhancement using Vth modulation.
