In this paper we report the fully depleted CMOS/SOI device design guidelines for low power applications. Optimal technology, device and circuit parameters are discussed and compared with bulk CMOS based design.
INTRODUCTION
In CMOS ULSI design, the interplay between device technology and low-power applications using FDSOI technology needs careful study. A detailed device design and its impact on low power in bulk technology were considered in [1]. Device design guidelines drawn from bulk technology may not necessarily be optimized for low-power design on FDSOI technology. In this paper, device design guidelines for FDSOI technology are reported and compared with bulk technology guidelines for low-power applications.
Permission to make digital/hard copy of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ©1997 ACM 0-89791-903-3/97/08..$3.50
DYNAMIC POWER IN FDSOI CMOS
Consider a simple CMOS inverter circuit driving another load inverter as shown in Figure 1. Ignoring the short circuit power, dynamic power consumed in the inverter can be written as:
$$P_{dynamic} = a\,C_L\,V_{dd}^{2}\,f \tag{1}$$
where CL is the load capacitance at the output of the inverter, Vdd is the supply voltage, a is the activity factor and f is the clock frequency. CL can also be written as
$$C_L = C_{gate} + C_{junc} + C_{wire} \tag{2}$$
where Cgate, Cjunc and Cwire are, respectively, the gate capacitance of the load inverter, the source/drain junction capacitance and the wire capacitance at the driver inverter output node. In FDSOI technology, the gate capacitance (Cgate) is less than in bulk technology, as shown in Figure 2. This reduction in Cgate is due to the negligible gate-to-body capacitance, which is Cbox in SOI technology compared with Cox in bulk technology. An average gate capacitance can be obtained by integrating the curve in Figure 2 as shown below
where Cfoxn/Cboxn and Cfoxp/Cboxp are the front/back gate capacitances of the load inverter n- and p-channel devices. Cfox and Cbox are .
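The dynamic power relation described in this section can be checked numerically. The following is a minimal sketch, assuming the usual form P = a·CL·Vdd²·f with CL = Cgate + Cjunc + Cwire; the function names and all numerical values are illustrative assumptions, not data from the paper.

```python
def load_capacitance(c_gate, c_junc, c_wire):
    """Total load at the driver output node: CL = Cgate + Cjunc + Cwire."""
    return c_gate + c_junc + c_wire


def dynamic_power(activity, c_load, vdd, freq):
    """Dynamic power with short-circuit power ignored: P = a * CL * Vdd^2 * f."""
    return activity * c_load * vdd ** 2 * freq


# Illustrative values only (hypothetical, not taken from the paper).
c_load = load_capacitance(c_gate=2e-15, c_junc=1e-15, c_wire=3e-15)  # farads
p_dyn = dynamic_power(activity=0.5, c_load=c_load, vdd=1.0, freq=100e6)  # watts
print(f"CL = {c_load:.2e} F, Pdynamic = {p_dyn:.2e} W")
```

Because the supply voltage enters quadratically, halving Vdd in this sketch cuts the dynamic power to one quarter, which is why the Vdd/Vth choice discussed later matters so much for low-power design.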
where Idsatn is the n-channel saturation drain current, and Wn and Wp are the widths of the n- and p-channel devices [1]. In (5), τ is interpreted as the average of the time for the n-channel device to discharge CL from Vdd to Vdd/2 and the time for the p-channel device to charge CL from zero to Vdd/2. It is also assumed that Idsatn = 2.2 Idsatp if Wn = Wp, to account for the difference in electron and hole mobilities.
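The verbal definition of τ above can be turned into a small calculation. This is a sketch under stated assumptions: each device is treated as a constant current source at its saturation current, Idsat scales linearly with width, and the function name, per-width current and capacitance values are hypothetical.

```python
def inverter_delay(c_load, vdd, idsat_n_per_um, w_n_um, w_p_um, mobility_ratio=2.2):
    """Average of the pull-down and pull-up half-swing times at constant current."""
    i_n = idsat_n_per_um * w_n_um                   # n-channel saturation current (A)
    i_p = idsat_n_per_um / mobility_ratio * w_p_um  # Idsatn = 2.2 * Idsatp at equal width
    t_fall = c_load * (vdd / 2) / i_n               # discharge CL from Vdd to Vdd/2
    t_rise = c_load * (vdd / 2) / i_p               # charge CL from 0 to Vdd/2
    return 0.5 * (t_fall + t_rise)


# Hypothetical numbers: 10 fF load, 1 V supply, 400 uA/um n-channel drive, equal widths.
tau = inverter_delay(c_load=10e-15, vdd=1.0, idsat_n_per_um=400e-6, w_n_um=1.0, w_p_um=1.0)
print(f"tau = {tau:.2e} s")
```

Under these assumptions the result is equivalent to the closed form τ = CL·Vdd/(4·Idsatn)·(1 + 2.2·Wn/Wp), with Idsatn the total n-channel current.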
The dependence of Idsat on channel length (L), mobility (μeff), saturation velocity (vsat) and effective front gate bias (Vgsf − Vth) can be simplified by neglecting &, effect as
Equation (6) relates the drain saturation current to device parameters such as surface mobility and applied front gate bias (Vgsf). Surface mobility is a function of front gate oxide thickness (Tfox), effective gate bias (Vgsf − Vth) and silicon film thickness (Tsi). However, in FDSOI technology, Idsat is a weak function of Tbox and silicon film thickness (Tsi) [2]. The dependence of Idsat on Tsi comes from the threshold voltage (Vth) and the surface mobility. For a fixed Vth, its dependence on Tsi comes from surface mobility only. Furthermore, in the deep submicron regime, velocity saturation is dominant and the surface mobility dependence on Tsi can be ignored. Hence the surface mobility dependence on (Vgsf − Vth) and Tfox can be written as
Substituting (7) in (6) gives Equation (8). The weak dependence of Idsat on Tsi is re-confirmed in (9), and Idsat in FDSOI depends more strongly on Tfox and (Vgsf − Vth) than Idsat in bulk. Substituting (8) in (5) with Vgsf = 0.9Vdd, one obtains the delay expression (11). Switching energy in the CMOS inverter is a function of the load capacitance and the supply voltage, and is given by
$$E = C_L V_{dd}^{2}$$
Hence the Eτ product is given by (12). Compared to bulk technology, our data suggest that the dependence of inverter delay on Tfox and Vdd is increased. Equation (11) was also verified by device simulation.
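The energy and energy-delay metrics used in the rest of the paper can be sketched as follows, taking the switching energy as CL·Vdd² and treating the delay τ as a given input, since the closed-form delay expression is not reproduced here. Function names and numbers are hypothetical.

```python
def switching_energy(c_load, vdd):
    """Switching energy of the inverter: E = CL * Vdd^2 (joules)."""
    return c_load * vdd ** 2


def energy_delay_product(c_load, vdd, delay):
    """Energy-delay product E * tau (joule-seconds) for a given delay."""
    return switching_energy(c_load, vdd) * delay


# Hypothetical operating point: 10 fF load, 1 V supply, 20 ps delay.
energy = switching_energy(10e-15, 1.0)
edp = energy_delay_product(10e-15, 1.0, 20e-12)
print(f"E = {energy:.2e} J, E*tau = {edp:.2e} J*s")
```

The product penalizes designs that save energy only by becoming slow, which is why it is used alongside the delay itself when choosing Vdd, device widths and oxide thicknesses.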
An FDSOI CMOS inverter driving another load inverter, as shown in Figure 1, was simulated in the MEDICI 2-D device simulator with its circuit-analysis advanced-application module for different stage ratios (a) and wire capacitances (Cwire). The PMOS and NMOS devices in the simulated driver and load inverters have L = 0.1 µm, Tfox = 5 nm, Tsi = 50 nm and Tbox = 400 nm. Figure 5 compares Equation (11) with the device simulation results. Good agreement between Equation (11) and the simulation data was obtained for different threshold voltages, stage ratios and wire capacitances.
OPTIMAL Vdd/Vth RATIO
To minimize Pdynamic without suffering excessive speed degradation, the normalized delay and energy-delay products obtained from Equations (11) and (12) are plotted for varying Vdd/Vth ratios in Figure 6. It appears that a Vdd between 3 and 4 times Vth is optimal without suffering a large degradation in speed. The delay is independent of Vdd/Vth for Vdd > 3Vth. This is less than in bulk CMOS technology [1]. Similarly, the energy-delay product is minimal at Vdd = 1.5Vth instead of 2Vth as in the case of bulk CMOS technology. Due to the sharp turn-on characteristics of FDSOI devices, the trade-off between Vth scaling and device leakage current is improved in FDSOI technology.
TRANSISTOR SIZING
Sizing the transistors for optimal delay and energy-delay product is essentially the same as for bulk CMOS technology, because the delay and the energy-delay product have the same functional dependence on Wn and Wp. The optimal ratio is independent of load capacitance and is given by
$$\frac{W_p}{W_n} = \sqrt{2.2} \approx 1.5$$
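The value of about 1.5 for the optimal width ratio can be reproduced with a short numerical sweep. This is a sketch under assumptions not stated in the paper: the load is taken to be proportional to Wn + Wp, and the drive term is proportional to 1/Wn + 2.2/Wp (from Idsatn = 2.2 Idsatp at equal widths). The function name is hypothetical.

```python
import math


def relative_delay(ratio, mobility_ratio=2.2):
    """Delay up to a constant factor for Wp/Wn = ratio, with Wn normalized to 1.

    Load term ~ (Wn + Wp); drive term ~ (1/Wn + mobility_ratio/Wp).
    """
    w_n, w_p = 1.0, ratio
    return (w_n + w_p) * (1.0 / w_n + mobility_ratio / w_p)


# Sweep Wp/Wn from 0.5 to 3.5 in steps of 0.001 and pick the fastest ratio.
ratios = [0.5 + 0.001 * i for i in range(3001)]
best_ratio = min(ratios, key=relative_delay)
print(f"numerical optimum Wp/Wn = {best_ratio:.3f}, sqrt(2.2) = {math.sqrt(2.2):.3f}")
```

In this model the minimum sits at the square root of the assumed current ratio, which is why adding a fixed wire capacitance would not move it.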
The optimal Wn+Wp is obtained by plotting the delay and energy-delay product versus the driver device gate capacitance as a fraction of the total load capacitance CL. Figure 7 shows that the delay decreases with increasing Wn+Wp and has no minimum. However, the energy-delay product initially improves as the delay is reduced and degrades thereafter due to the increase in load capacitance. There is a broad minimum for the energy-delay product that occurs when the driving device gate capacitance (Cg) is around one third of the load capacitance (CL) for different Cwire with a unity fanout. The minimum of the energy-delay product shifts to a lower value of Cg/CL for a higher fanout.
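The shape of this trade-off can be illustrated with a toy model. It is a sketch only, assuming a load of fanout × (driver gate capacitance) plus a fixed capacitance, a drive current proportional to the driver gate capacitance, and energy proportional to the load; none of these names or values come from the paper. This simplified model places the unity-fanout minimum at one half of the load rather than the one third reported above, so it shows the trend (no delay minimum, a broad energy-delay minimum that moves lower with fanout) and not the exact numbers.

```python
def delay_and_edp(c_g, fanout, c_fixed):
    """Return (relative delay, relative energy-delay product, Cg/CL)."""
    c_load = fanout * c_g + c_fixed   # gate load of the next stage plus fixed capacitance
    delay = c_load / c_g              # drive current taken proportional to c_g
    energy = c_load                   # constant Vdd^2 factor dropped
    return delay, energy * delay, c_g / c_load


def edp_minimum_fraction(fanout, c_fixed=1.0, steps=20000):
    """Grid-search the Cg/CL value at which the energy-delay product is smallest."""
    best = None
    for i in range(1, steps + 1):
        c_g = 5.0 * c_fixed * i / steps
        _, edp, fraction = delay_and_edp(c_g, fanout, c_fixed)
        if best is None or edp < best[0]:
            best = (edp, fraction)
    return best[1]


for fanout in (1, 2, 4):
    print(f"fanout {fanout}: energy-delay minimum near Cg/CL = {edp_minimum_fraction(fanout):.3f}")
```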
OPTIMAL BURIED OXIDE THICKNESS
In SOI technology, the reduced junction capacitance is due to the presence of a thick buried oxide. However, thick buried oxide formation is difficult. Moreover, the self-heating and floating-body effects increase with increasing buried oxide thickness. It is therefore important to know the buried oxide thickness that minimizes the delay and energy-delay product. Figure 8 plots the delay and the energy-delay product versus normalized buried oxide thickness. The buried oxide thickness is normalized by 1 µm.
A buried oxide thickness between 300 and 400 nm is a good choice.
The delay and energy-delay product are independent of Tbox when Tbox > 400 nm. This means that a Tbox thicker than 400 nm would not improve speed or power consumption significantly.
OPTIMAL GATE OXIDE THICKNESS
The dependence of the delay and the energy-delay product on the gate oxide thickness is increased in FDSOI technology because of the increased mobility dependence on the gate oxide thickness. Figure 9 plots the delay and energy-delay product versus front gate oxide thickness (Tfox) for different fanouts. The delay has no minimum as the front gate oxide thickness is reduced. However, there exists an optimal gate oxide thickness at which the energy-delay product is minimized. The optimal gate oxide thickness increases with increasing fanout. For the unity fanout condition, the optimal front gate oxide thickness is between 7 and 8 nm for low-power applications.
Interestingly, the general trend in front gate oxide scaling [3] falls in the vicinity of the energy-delay product minimum. Hence, high-speed and low-power conditions can be met without significantly sacrificing either speed, power or short channel effects in deep submicron FDSOI MOSFET technology.
OPTIMAL STAGE RATIO
The optimal way to drive a large capacitance is to use a minimally sized inverter to drive a larger inverter. The next step is to use the larger inverter to drive a still larger inverter, until at some point the larger inverter is able to drive the load capacitance directly. If N such stages are used, each larger than the previous by the stage ratio a, then the total delay of the inverter chain is given by [4]:
where τ is the single inverter delay and a is the stage ratio. A similar approach is used to compute the energy dissipated in the inverter chain, which is given by
$$E_{total} = \frac{a\,(a^{N} - 1)}{a - 1}\,E$$
where $N = \log(C_L/C_g)/\log(a)$ and E is the single inverter energy dissipation. Figure 10 plots the total delay, the total energy and the total energy-delay product versus the stage ratio used in driving large capacitive loads. The energy is minimized at stage ratios that are also near the optimum for performance. Hence the optimal stage ratio is between 2 and 4 for low-power designs.
The device design guidelines obtained from the above discussions are summarized in Table 1. Compared with bulk CMOS [1], it appears that bulk CMOS designs can be transferred to FDSOI with minimal perturbation. This saves design effort and time in meeting the SOI low-power application requirements. However, the optimal front gate oxide thickness that minimizes the delay and energy-delay product in FDSOI differs from that in bulk technology. An optimal buried oxide thickness is about 400 nm for low-power applications without significant speed degradation.
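The inverter-chain expressions from the stage-ratio discussion can be explored numerically. The sketch below uses the energy expression quoted in the text and, as an assumption, the common textbook estimate N·a·τ for the total delay; cap_ratio is assumed to be the ratio of the final load capacitance to the input capacitance of the first inverter, and N is treated as a continuous quantity. All names and values are hypothetical.

```python
import math


def chain_metrics(stage_ratio, cap_ratio, tau=1.0, energy=1.0):
    """Return (total delay, total energy, energy-delay product) for a tapered chain."""
    n_stages = math.log(cap_ratio) / math.log(stage_ratio)  # N, treated as continuous
    total_delay = n_stages * stage_ratio * tau              # assumed form: N * a * tau
    total_energy = (stage_ratio * (stage_ratio ** n_stages - 1)
                    / (stage_ratio - 1) * energy)           # a (a^N - 1) / (a - 1) * E
    return total_delay, total_energy, total_delay * total_energy


# Sweep the stage ratio from 1.5 to 6.0 for a hypothetical 1000x capacitance ratio.
ratios = [1.5 + 0.01 * i for i in range(451)]
fastest = min(ratios, key=lambda a: chain_metrics(a, cap_ratio=1000.0)[0])
print(f"delay-optimal stage ratio is about {fastest:.2f}")
```

Under these assumptions the delay is proportional to a/ln(a), so its minimum falls at a = e, close to the stage ratio of 2.8 quoted for Figure 10, while the total energy keeps falling as the stage ratio grows.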
Table 1
Parameter
Wp/Wn              1-3 (1-3)      1-3 (1-3)
Wp+Wn              max (max)      Cd=C/3 (C/2)
Tbox               300-400nm      300-400nm
Stage Ratio (a)    2-4 (2.8)      2-4
Device design guidelines using devices with L = 0.1 µm for FDSOI low-power applications are presented using a simple drain saturation current model fitted to experimental results and 2-D numerical simulations. The optimum occurs at Vdd = 3Vth for performance and Vdd = 1.5Vth for low power. The optimal buried oxide thickness is between 300 nm and 400 nm. The optimal gate oxide thickness for low power and good performance is between 3 nm and 8 nm. The optimal transistor sizing is when the driver device gate capacitance is 0.3 times the total load capacitance. The optimal stage ratio for low power is the same as the optimal stage ratio for high performance.
Figure 10. The delay and energy-delay products versus stage ratio. Low power and optimal speed can be obtained at a stage ratio equal to 2.8.
Figure 8. The delay and energy-delay products versus normalized buried oxide thickness. The buried oxide thickness is normalized to 1 µm.
ACKNOWLEDGMENT
