Abstract: The double-gate (DG) transistor has emerged as one of the most promising devices for nano-scale circuit design. In this paper, we propose a high-performance and robust sense-amplifier design using independent gate control in symmetric and asymmetric DG devices for sub-50-nm technologies. Compared to the connected gate design, the proposed sense amplifier has better performance (30%-35% less sensing delay) and robustness (60%-80% less minimum input bit-differential for correct operation, considering 10% worst case silicon thickness mismatch). Hence, the proposed design demonstrates the benefit of using independent gate control in DG devices for efficient circuit design in the sub-50-nm regime.
I. INTRODUCTION

Due to better short-channel effect control, lower leakage current, and higher "on" current, double-gate (DG) devices (Fig. 1) have emerged as a very promising candidate for circuit design in the sub-50-nm regime [1]-[5]. DG devices can have front and back gates connected (ConnGateDG) or independent control of the front and the back gates (IndGateDG) [6], [7]. The connected gate DG devices can directly translate the circuits designed in single-gate technology (e.g., bulk-CMOS) to DG technologies. However, the Directly Translated circuit (DirTrans) style does not utilize the possibility of independent control of front and back gates [6], [7]. The independent gate control is a unique property of the DG circuits, which is very attractive for circuit design. Independent gate control can be used for dynamic operations [6]-[8]. Application of independent input signals to the two gates can also improve the power and performance of logic/memory circuits [9]-[12].
Designing high-performance and robust sense amplifiers is extremely important for SRAM design [14]. Voltage mode sense amplifiers are widely used in SRAM design [13]-[15]. In this paper, we propose an independent gate sense amplifier (IGSA), where separate control of the front and the back gate is used to improve the performance and robustness of voltage mode sense amplifiers. DG devices of 50-nm channel length (designed following the ITRS guidelines [16]) are used to implement and simulate (in the device simulator MEDICI [17]) the proposed circuit. The IGSA was observed to reduce the sensing delay by 30%-35% compared to the connected gate design. Moreover, the proposed design also showed better robustness against device mismatch. Hence, the proposed IGSA demonstrates the advantage of using independent gate control in DG devices for circuit design in the sub-50-nm regime.
Manuscript received December 16, 2004.
II. DEVICE CHARACTERISTICS
The DG devices can be designed in different structures, namely: 1) symmetric device with same gate material (e.g., near-midgap metals) and oxide thickness for the front and back gate (SymDG) [3], [4]; 2) asymmetric device with different front and back oxide thickness (AsymOxDG) [5]; and 3) asymmetric device with materials of different workfunction (e.g., poly and poly) in the front and the back gate (AsymWfDG) [4]. In this work, we designed symmetric and asymmetric devices (both AsymOxDG and AsymWfDG) with 50-nm gate length ( nm, nm, nm, nm) in the device simulator MEDICI [17] following the ITRS guidelines [16]. MEDICI is used to perform device and circuit simulations. The quantum correction models were included in the simulation. Fig. 2 and Table I show the characteristics of the designed devices (designed for equal "off" current). Let us now qualitatively discuss (using simple long-channel transistor theory) different aspects of the independent gate operation of DG devices. The long-channel threshold voltage at the front gate ( ) of a DGMOS is given by [18] (
where is the Fermi potential and is the back gate bias; and are the work-function difference at the front and back gate, respectively; is sensitivity of to the back gate bias. From (1), it can be observed that, increasing the back gate bias reduces , thereby increasing its "ON" current . It has been discussed in [18] that, for devices with thick , back gate bias does not impact after inversion. However, with thin , due to volume inversion, the back gate bias can impact the even after inversion [20] - [26] . In this work, since we are using a very thin , we have assumed depends on at all regions of operations. We also assumed that in (1) is same as the voltage applied at the back gate. These assumptions, although not truly physical, simplify the qualitative understanding of the independent gate operation.
It should also be noted that in case of AsymOxDG the capacitance in the back gate is less compared to the front gate . On the other hand, for AsymWfDG the of the back gate is higher than that of the front gate (since, , e.g., with poly front and poly back gate , ). The total inversion charge in a DG device is contributed by the front and the back gates and is given by [19] (2)
The "ON" current of the transistor is proportional to the inversion charge [19] . Using the above discussion, let us now analyze the effect of on the drain current under the conditions and .
Case-1: Current Difference Between Two Identical Transistors With a Difference in the Back Gate Bias:
A reduction of from reduces the inversion charge and hence the drain current [6], [7] as (3). MEDICI simulations of SymDG and AsymDG devices show that increases with an increase in , which verifies the trend predicted by (3) (Fig. 3). The above analysis shows that a change in the back gate bias produces a current difference between two identical transistors, both with their front gate at . Moreover, at a given reduces with a thicker back oxide (higher ) as (4). Hence, the value of at a particular is maximum for symmetric devices. MEDICI simulation results for AsymOxDG devices with different verify the trend predicted by (4). Moreover, (4) also predicts that (at a given ) is a weak function for AsymWfDG devices, which is verified by MEDICI simulations as shown in Fig. 3(b). From (1)-(3) we can further observe that increasing (lower capacitance and higher ) in AsymOxDG reduces the "ON" current (i.e., drain current at ) [Fig. 4(a)]. Similarly, increasing in AsymWfDG increases of the back gate, thereby reducing the "ON" current (lower ) as shown in Fig. 4 
It can be observed that and the drain current at ( and ) reduce with an increase in and for AsymOxDG and AsymWfDG devices, respectively. The trends predicted by (5) are verified by the MEDICI simulation results as shown in Fig. 5 . Hence, using the AsymDG devices reduces the current at ( and ). 
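The qualitative trends of this section can be illustrated with a minimal charge-sheet sketch. It does not reproduce the equations referenced above and is no substitute for the MEDICI simulations. It assumes, purely for illustration, that each gate contributes an inversion charge equal to its oxide capacitance times its gate overdrive, and that the drain current is proportional to the total charge. All function names, threshold voltages, oxide thicknesses, and bias values are hypothetical.

```python
EPS_OX = 3.9 * 8.854e-12  # permittivity of SiO2 in F/m


def oxide_capacitance(t_ox_nm):
    """Gate-oxide capacitance per unit area (F/m^2) for a thickness in nm."""
    return EPS_OX / (t_ox_nm * 1e-9)


def inversion_charge(v_gf, v_gb, v_tf, v_tb, t_oxf_nm, t_oxb_nm):
    """Toy inversion charge: each gate adds C_ox * overdrive, never negative."""
    q_front = oxide_capacitance(t_oxf_nm) * max(v_gf - v_tf, 0.0)
    q_back = oxide_capacitance(t_oxb_nm) * max(v_gb - v_tb, 0.0)
    return q_front + q_back


def charge_difference(delta_v, t_oxb_nm, v_tb=0.2, vdd=1.0,
                      t_oxf_nm=1.0, v_tf=0.2):
    """Charge difference between two identical devices whose front gates
    are at vdd and whose back gates are at vdd and vdd - delta_v.
    Under the proportionality assumption this tracks the current difference.
    """
    q_high = inversion_charge(vdd, vdd, v_tf, v_tb, t_oxf_nm, t_oxb_nm)
    q_low = inversion_charge(vdd, vdd - delta_v, v_tf, v_tb,
                             t_oxf_nm, t_oxb_nm)
    return q_high - q_low


if __name__ == "__main__":
    # Hypothetical sweep: a thicker back oxide gives a smaller difference.
    for t_oxb in (1.0, 2.0, 3.0):
        print(t_oxb, charge_difference(0.1, t_oxb))
```

In this toy model the difference grows with the back gate bias difference, shrinks as the back oxide gets thicker, and does not depend on the back gate threshold as long as both devices stay in inversion, which matches the trends described qualitatively in the text.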
III. VOLTAGE MODE SENSE AMPLIFIERS
The voltage mode sense amplifiers used in the SRAM design can be classified into two categories, namely: 1) current latch sense amplifiers (CLSA) [13] - [15] , [27] and 2) voltage latch sense amplifier (VLSA) [15] , [28] , [29] . The quality of a sense amplifier is determined primarily by performance and robustness. The performance is determined by the sensing delay. The robustness is determined by the minimum voltage difference between two bit-lines that can be correctly sensed (input offset voltage). A lower sensing delay and smaller input offset voltage represent a better sense amplifier [14] , [30] .
A. Operation of Directly Translated Current Latch Based Sense Amplifier (CLSA)
Let us first consider the Directly Translated (DirTrans) implementation of the CLSA [13], [14] with the ConnGateDG device [Fig. 6(a)]. In the precharge mode (SE is low), O1 and O2 are precharged to VDD through PC1 and PC2. After the word-line of an SRAM cell attached to the bit-lines BL and BLB is raised high, one of the bit-lines (say BL) is discharged and the other one (say BLB) stays high. After the difference between BL and BLB (bit-differential) reaches a prespecified value (usually 10% of VDD), the sense amplifier is enabled by raising SE. However, as , the strength of ND2 is lower than that of ND1 (i.e., ). Hence, O1 discharges at a faster rate than O2. After a small difference is built up between the voltages of O1 and O2 (say ), due to the cross coupled inverter action O1 reduces to "0" and O2 switches back to "1" [Fig. 6(b)]. Hence, in the CLSA, the input bit-differential is converted into a current difference through the transistors ND1 and ND2, and this current difference is converted to a rail-to-rail voltage difference by the cross-coupled inverters. The sensing delay is defined as the time from when SE is turned on (i.e., ) to when O1 (i.e., the node that is finally discharged) is reduced to 0.5 VDD [14]. The sensing delay can be reduced by: 1) increasing the currents through the pull-down path, resulting in faster discharge of O1 and O2, and 2) increasing produced by the application of .
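The sensing delay definition above can be sketched as a small post-processing routine on a sampled output waveform. This is only an illustration: it assumes the threshold is half the supply voltage, that the waveform is available as two equal-length lists of samples, and that linear interpolation between samples is accurate enough. The function name and the sample values are hypothetical.

```python
def sensing_delay(times, v_out, t_se, vdd):
    """Time from SE rising (t_se) until v_out first falls to 0.5 * vdd.

    times and v_out are equal-length sequences of samples. Returns None
    if the output never reaches the threshold after t_se.
    """
    threshold = 0.5 * vdd
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        v0, v1 = v_out[i - 1], v_out[i]
        if t1 < t_se:
            continue  # this interval ends before the amplifier is enabled
        if v0 > threshold >= v1:
            # Linear interpolation between the samples around the crossing.
            t_cross = t0 + (v0 - threshold) * (t1 - t0) / (v0 - v1)
            if t_cross >= t_se:
                return t_cross - t_se
    return None


if __name__ == "__main__":
    # Hypothetical waveform of the discharging node, times in picoseconds.
    t = [0.0, 10.0, 20.0, 30.0, 40.0]
    v = [1.0, 1.0, 0.8, 0.4, 0.0]
    print(sensing_delay(t, v, t_se=10.0, vdd=1.0))
```

Applying the same routine to both the connected gate and the independent gate designs gives delays that can be compared directly, provided both waveforms use the same enable time reference.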
B. Operation of Directly Translated Voltage Latch Based Voltage Mode Sense Amplifier (VLSA)
Fig. 7 shows the circuit schematic of the Directly Translated implementation of the VLSA. The basic operation of the VLSA is similar to that of the CLSA. In the VLSA, the initial voltage difference between the output nodes O1 and O2 is latched and amplified by the sense amplifier [15], [28], [29]. The VLSA is faster than the CLSA as the output discharges through a 2-Transistor stack instead of the 3-Transistor stack in the CLSA. Moreover, only the offset due to mismatch in the latch transistors (i.e., PI1, PI2, N1, and N2) contributes to the input offset in the VLSA. Hence, the VLSA is more robust to process variation than the CLSA, as the offset due to mismatch in transistors ND1 and ND2 is eliminated [14].
The major issue with VLSA is that, the nodes O1 and O2 are both the input and output terminals. Hence, the circuit attempts to discharge the bit-line capacitances during sensing. To prevent this discharge O1 and O2 are decoupled from bit-lines using pMOS transistors which are turned "OFF" (by raising YSEL) at the time sensing operation starts (Fig. 7) [14] , [15] , [28] , [29] . The voltage drop across the pMOS devices reduces the available input bit-differential at the time of sense amplifier firing [14] . Moreover, decoupling increases the complexity of the timing. First, if decoupling signal arrives later than SE, the sensing delay increases as the sense amplifier also discharges the bit-lines (Fig. 7) . If YSEL arrives earlier than SE, the input bit-differential is reduced which degrades both performance and robustness. Hence, an exact synchronization of YSEL and SE is required. Second, VLSA requires separate precharge signal from SE. The precharge signal (PRE) needs to be turned "OFF" as soon as word-line is raised high. If PRE arrives late, the input bit-differential is reduced. The delay variation in the circuits used to synchronize YSEL, PRE, and SE can significantly reduce the robustness and performance of VLSA. In CLSA inputs are isolated through the high impedance input-differential stage formed by ND1 and ND2. Hence, additional decoupling circuit is not required, which reduces the complexity of the timing circuit.
The above discussion shows that the VLSA is faster and more robust than the CLSA (provided there is no variation in the synchronization circuit). However, the CLSA does not require any additional decoupling transistor and has significantly simpler timing requirements than the VLSA. In this work, we propose an independent gate sense amplifier (IGSA) which combines the benefits of both the CLSA and the VLSA to design high-speed and robust sense amplifiers.
IV. INDEPENDENT GATE SENSE AMPLIFIERS (IGSA)
In this section, we present the proposed IGSA designed using the 50-nm DG devices presented in Section II.
A. Operation of IGSA
Fig. 8 shows the proposed IGSA circuit using SymDG devices. Using the independent gate operation of DG devices, the current difference in the two pull-down paths is achieved by using a single DGMOS in each path (N1 instead of NI1 and ND1 and N2 instead of NI2 and ND2) (Fig. 6). The front gates of N1 and N2 are connected in the cross-coupled inverter configuration whereas BLB and BL are connected to the back gates (Fig. 8). When SE is turned "ON", the front gates of N1 and N2 are at but the back gates are at different voltages ( and ). The currents through N1 and N2 are given by (6) where represents the current difference between the two paths in Fig. 8. Hence, from (6) it can be observed that the voltage difference results in a current difference between the two paths, thereby ensuring correct sensing.
B. Advantages of the IGSA Over DirTrans CLSA
In the IGSA, O1 and O2 are discharged through a 2-Transistor stack (instead of the 3-Transistor stack in the DirTrans CLSA design). Reducing the number of transistors in the stack (i.e., the stack height) reduces the sensing delay. Also, in the proposed IGSA, nodes O1 and O2 drive only the front gates of N1 and N2 instead of the front and back gates of NI1 and NI2 as in the DirTrans CLSA. This reduces the capacitive load on O1 and O2, thereby increasing the speed and reducing the switching power. It is also evident that the proposed IGSA has fewer transistors (NI1 and NI2 are eliminated). Moreover, removal of ND1 and ND2 eliminates the input offset due to mismatch in these transistors, thereby reducing the input offset voltage. It also eliminates the offset due to differential noise at nodes INT1 and INT2. Hence, the proposed IGSA has better robustness than the DirTrans CLSA.
C. Merging of Precharge and Pull-Up Transistors
The precharge transistors and the pMOS pull-up transistors (i.e., PC1 and PI1, PC2 and PI2) in the IGSA can also be merged together. This results in the independent gate operation of the merged transistor. Merging reduces the load on the sense-amp enable driver and on the nodes O1 and O2, thereby improving the speed and the switching power. However, merging reduces the precharging speed as only back gate of the merged transistors is used for precharging (Fig. 9) . Moreover, merging also reduces the strength of the pMOS pull-up transistors PI1 and PI2. A weaker pull-up pMOS enhances the initial voltage swing at node O2 (Fig. 9 ). This has two impacts, namely: 1) it increases the power dissipation of the sense amplifier and 2) it results in a voltage swing at the output of the inverter INV2, thereby increasing power dissipation of INV2. Due to these reasons we have not used the merging of the precharge and pMOS pull-up transistors in the proposed design. This emphasizes that selective use of the independent gate control is necessary to obtain maximum benefit from the double gate technology.
D. Disadvantage of the IGSA
In the proposed IGSA, after the sensing operation, N2 is not completely "OFF" ( and ), which results in a short circuit current through PI2, N2, and NC. The short circuit current increases the power dissipation and reduces the voltage at node O2 from (the voltage difference between O2 and is called the Noise Voltage at ). The power overhead associated with the short-circuit current depends on the time interval for which the sense amplifier is turned "ON", i.e., the width of the SE pulse. For a smaller pulsewidth, the reduction in the switching capacitance results in a lower power in the IGSA compared to the DirTrans CLSA design. However, for a larger pulsewidth, the higher short-circuit power increases the total power of the IGSA beyond that of its DirTrans CLSA counterpart (Fig. 10). We have observed that the IGSA designed with the SymDG device functions correctly with a 35% improvement in speed compared to the DirTrans design. The noise voltage at O2 is less than 10% of and the maximum power overhead (which occurs for a pulsewidth equal to the width of the word-line signal) is 10% (at 6 GHz of operating frequency). However, the short-circuit current needs to be reduced to improve the design.
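The pulsewidth trade-off described above can be illustrated with a first-order energy sketch. It is not taken from the simulations behind Fig. 10. It assumes that the energy per sensing operation is a switching term (capacitance times supply voltage squared) plus a short-circuit term that grows linearly with the SE pulsewidth, and that the DirTrans design has no short-circuit term. All names and numerical values are hypothetical.

```python
def energy_per_access(c_switch, vdd, i_short, pulse_width):
    """First-order energy: switching term plus short-circuit term (joules)."""
    return c_switch * vdd ** 2 + i_short * vdd * pulse_width


def break_even_pulse_width(c_clsa, c_igsa, vdd, i_short):
    """Pulsewidth at which the short-circuit energy of the IGSA cancels its
    saving in switching energy relative to the CLSA (assumed to have no
    short-circuit current)."""
    if i_short <= 0:
        raise ValueError("short-circuit current must be positive")
    return (c_clsa - c_igsa) * vdd / i_short


if __name__ == "__main__":
    # Hypothetical values: 10 fF vs 8 fF switched, 1 V supply, 20 uA.
    t_be = break_even_pulse_width(10e-15, 8e-15, 1.0, 20e-6)
    print(t_be)  # about 1e-10 s, i.e. roughly 100 ps
```

Under these assumptions, SE pulses shorter than the break-even width favor the IGSA, while longer pulses favor the DirTrans CLSA.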
Use of Asymmetric Devices (AsymDG) to Reduce Short-Circuit Current: The short-circuit current can be reduced by using asymmetric devices for N1 and N2, and connecting the back gates (with thicker oxide) to BLB and BL. In the case of AsymOxDG, as increases, the current through N1 with and reduces [Fig. 5(a)]. A reduction in the short-circuit current reduces the short-circuit power and the Noise Voltage at [Fig. 11(a)]. However, increasing reduces both the current difference between N1 and N2 and the discharging current for nodes O1 and O2 (lower ON current of N1 and N2) [Figs. 3(a) and 4(a)]. Hence, the sensing delay increases with an increase in asymmetry (i.e., ). Similarly, the use of AsymWfDG devices also reduces the short-circuit power and noise voltage at O2 [Fig. 11(a)] at the cost of higher sensing delay.
Circuit Technique to Reduce Short-Circuit Current: In order to eliminate the short circuit current in the IGSA, the voltage at the gate of N2 needs to be reduced to "0" after the sensing occurs. This can be achieved by adding nMOS transistors Ndis1 and Ndis2 to the back gates of N1 and N2 [Fig. 12(a)]. The front gates of Ndis1 and Ndis2 are controlled by the outputs of the inverters INV1 and INV2 (OUT1 and OUT2) and the back gates are connected to ground (to reduce the load on INV1 and INV2). This added circuit is called the short-circuit prevention circuit (SCPC). When OUT1 and OUT2 are "0" (before the sensing), Ndis1 and Ndis2 are "OFF". Hence, the gate voltages of N1 and N2 (i.e., BLBN and BLN) follow BLB and BL. After the sensing, OUT1 switches to "1," which turns on transistor Ndis2, thereby discharging node BLN. To prevent the discharging of the bit-lines (i.e., BL in this case), we modified the column decoder-multiplexer circuit to isolate the bit-lines after the sensing starts (Fig. 12). In this technique, the outputs of the column decoders are controlled by a delayed sense-amp signal (SE_DEL). When SE switches to high, SE_DEL switches to low (after some delay), which turns "OFF" the pMOS pass transistors. This isolates the bit-lines BL and BLB from the nodes BLN and BLBN. It should be noted that, after the pass transistors are turned "OFF", BLBN floats and in the worst case can be discharged to "0" by noise. However, even if BLBN gets discharged, node O1 is strongly held at "0" as the front gate of N1 is at "1" (i.e., N1 is "half ON"). The proposed technique reduces the short-circuit power, but it introduces a power overhead due to the control circuit and increases the layout area. The control circuit that isolates the bit-lines can, however, be shared by a row of sense amplifiers, which reduces the area and power penalty.
E. Advantage of IGSA Over VLSA
In the VLSA, the discharging of the output nodes O1 and O2 occurs through a 2-Transistor stack. Hence, the sensing delay of the DirTrans VLSA is comparable to that of the IGSA. Moreover, the difference in the current between N1 (say I1) and N2 (say I2) is higher in the VLSA than in the IGSA. This is because, at the start of the sensing operation, in the DirTrans VLSA both gates of N2 are at whereas in the IGSA only the front gate is at . However, in the DirTrans VLSA, due to the presence of the decoupling transistors, the input bit-differential at the sense amplifier firing time is less than that of the IGSA. Considering this effect, it has been observed that the proposed IGSA has a sensing delay comparable to (marginally higher than) that of the DirTrans VLSA. However, in the proposed IGSA, inputs are isolated through the high impedance input-differential stage formed by the back gates of N1 and N2. Hence, the IGSA does not require decoupling circuits or additional decoupling and precharge signals (its timing requirement is as simple as that of the CLSA).
V. RESULTS AND DISCUSSIONS
Let us now compare the performance, power and robustness of the proposed IGSA with DirTrans CLSA and VLSA.
A. Comparison With Directly Translated CLSA
The IGSA circuit with the SymDG device and the SCPC results in a 33% reduction in the sensing delay and 10% (at 6 GHz) increase in the dynamic power compared to the DirTrans circuit (sizes of the different transistors in the two designs were kept same). Application of the AsymOxDG and AsymWfDG devices reduces the short-circuit power but increases the sensing delay (Figs. 13 and 14) . With AsymOxDG at the delay improvement is reduced to 24% (negligible power overhead) (Fig. 13) . With AsymWfDG at eV , a negligible power overhead with a 20% delay reduction is observed (Fig. 14) . The proposed IGSA has a lower sensitivity to the input bit-differential and mismatch in the capacitances at nodes O1 and O2 compared to DirTrans CLSA (i.e., better robustness, Fig. 15 ). The improvement in the robustness is due to the elimination of the intermediate nodes INT1 and INT2. Moreover, the sensing delay in the IGSA is less sensitive to the local drop in the supply voltage of the sense amplifier (Fig. 15) ( i.e., supply of the sense amplifier is reduced whereas that of the bit-lines remains same). In the DirTrans CLSA, drop in of the sense amplifier reduces discharging current by lowering the strength of NI1 and NI2 (in series with ND1 and ND2). However, in case of the IGSA, only the strengths of the front gates of N1 and N2 are reduced. But the strength of the back gates remains the same as they are connected to the higher of the bit-lines. Thus, the reduction in the discharging current is less resulting in lower delay sensitivity.
Variation in the Tsi of a DG device modifies its ON current [19] [Fig. 16(a)]. Increasing Tsi increases the "ON" current while reducing Tsi decreases the "ON" current [19]. The sensitivity of the ON current to Tsi variation is minimum in a symmetric device [19]. For the device structure used in this work, we observed that the current in the AsymOxDG device has a lower sensitivity to Tsi than in the AsymWfDG device [Fig. 16(a)]. The current through a 2-T stack shows higher sensitivity to Tsi compared to a single device, due to the voltage variation at the intermediate nodes [Fig. 16(a)]. Let us now consider the sensing operation described in Fig. 6. A reduction in the Tsi of N1 (i.e., NI1 and ND1 in the DirTrans design) and/or an increase in that of N2 (NI2 and ND2 in the DirTrans design) results in slower discharge of O1 (I1 reduces) and faster discharge of O2 (I2 increases). This increases the sensing delay and may result in incorrect operation. Fig. 16(b) shows that the sensing delay in the DirTrans CLSA has a stronger sensitivity to the worst case Tsi mismatch than the IGSA with SCPC or AsymOxDG. This is principally due to the elimination of the intermediate nodes INT1 and INT2 as explained in Section III. However, the strong sensitivity of the ON current in AsymWfDG to Tsi makes the IGSA with AsymWfDG more susceptible to the mismatch in silicon thickness. The susceptibility can be reduced by lowering the amount of asymmetry at the cost of higher short-circuit current.
B. Comparison With Directly Translated VLSA
Due to the additional voltage across the decoupling transistors, the input bit-differential present at the time of sense amplifier firing is lower in the DirTrans VLSA compared to the IGSA. This results in a stronger sensitivity to input bit-differential and load capacitance in the DirTrans VLSA compared to the IGSA, as shown in Fig. 17. Although both the IGSA and the DirTrans VLSA have two transistors in the discharging path, due to the lower input bit-differential at the time of sensing, the sensing delay of the DirTrans VLSA has a stronger sensitivity to the worst case mismatch in Tsi of the nMOS transistors (N1 and N2 in Figs. 7 and 8). This indicates that the IGSA is more robust against device mismatch than the DirTrans VLSA. Variation in the arrival time of the decoupling signal (YSEL) and SE (due to delay variation of the synchronizing circuit) results in a large variation in sensing delay in the DirTrans VLSA [Fig. 18(a)]. However, no such variation is present in the IGSA (no separate YSEL is required as the input is isolated). Moreover, for the IGSA with SCPC, variation in the arrival time between SE and SE_DEL (in Fig. 12) from its designed value has negligible impact on the sensing delay [Fig. 18(a)]. Similarly, late arrival of the precharge signal (PRE in Fig. 7) in the DirTrans VLSA significantly increases the sensing delay, which does not occur in the IGSA as no separate precharge signal is present [Fig. 18(b)]. Fig. 19 shows the minimum bit-differential required for a correct sensing operation considering worst case variation in device mismatch and delay of timing circuits. It can be observed that the minimum bit-differential required for the IGSA is approximately 80% less than for the DirTrans CLSA and 60% less than for the DirTrans VLSA for 10% worst case mismatch. Hence, the IGSA shows better robustness than both the CLSA and the VLSA.
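One way to extract a minimum bit-differential of the kind reported above is a bisection search over the applied differential. The sketch below is not the procedure used in this work. It assumes a pass/fail predicate (here a hypothetical stand-in for a circuit simulation under worst case mismatch) that is monotonic, i.e., once a differential is sensed correctly, every larger differential is too. All names and values are illustrative.

```python
def minimum_bit_differential(senses_correctly, low, high, tol=1e-4):
    """Smallest differential in [low, high] (volts) that is sensed correctly.

    senses_correctly is a callable returning True or False and is assumed
    to be monotonic in its argument. Returns None if even `high` fails.
    """
    if not senses_correctly(high):
        return None
    if senses_correctly(low):
        return low
    # Invariant: `low` fails and `high` passes.
    while high - low > tol:
        mid = 0.5 * (low + high)
        if senses_correctly(mid):
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    # Hypothetical predicate: sensing succeeds above a 30 mV offset.
    result = minimum_bit_differential(lambda dv: dv >= 0.03, 0.0, 0.2)
    print(result)
```

In practice the predicate would wrap a transient simulation of the sense amplifier with the mismatch and timing skew applied, and the search would be repeated for each design being compared.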
VI. CONCLUSION
In this paper we have proposed a novel design technique for voltage mode sense amplifiers using symmetric and asymmetric DG devices in sub-50-nm technology. The independent back gate control of the DG device in the pull-down path (other transistors are kept in the connected gate mode) is used to improve the performance and robustness in sense-amplifier circuits. The proposed IGSA has better performance compared to the DirTrans CLSA, a simpler timing requirement compared to the DirTrans VLSA and better robustness compared to both VLSA and CLSA. Hence, in the proposed design the independent gate control in DG devices is effectively used to improve the circuit quality of the sense amplifiers. The proposed design illustrates the fact that selective use of independent control of the front and the back gates in the DG devices is very effective in designing efficient circuits in nanometer regimes.
