Abstract - Partially depleted floating-body (PDFB) SOI technology offers the potential of increased speed and lower power dissipation over traditional bulk CMOS. A key problem, however, in using traditional design flows for new SOI designs is that the delay of logic gates built from PDFB SOI transistors varies with the signal history. This complicates timing analysis and simulation of logic circuits. In this paper, we formulate a simulation model that tracks the changes in delay during a dynamic gate-level simulation of a PDFB SOI circuit. This is essential in order to properly account for the shift of transistor body voltage in SOI devices. The model captures the "state" of a logic gate via two delay "state variables", which represent the rise and fall delays of the logic gate. As the logic circuit is simulated, the state variables are updated.
Introduction
Partially depleted floating-body (PDFB) SOI transistors [1] offer improved performance and lower power compared to CMOS [2, 3], but lead to complications in the design flow because the transistor body is floating, and therefore has a variable voltage [4, 5]. As a logic gate experiences a series of logic transitions, the body voltages of its transistors change [6]. Two types of changes determine the dynamics of the body voltage: fast changes and slow changes [7]. The fast changes occur when a logic gate undergoes a transition. Due to capacitive coupling between the gate and drain nodes, and due to the transistor impact ionization current, the body voltage of an n-channel MOSFET in an inverter, for example, will experience a fast step up when the inverter input rises. It will experience a fast step down when the input falls. It can be shown that these fast body voltage changes due to logic transitions depend on the input rise/fall time, the output load capacitance, and the body voltage values before the transition of the logic gate. When the gate is in steady state, the body voltage drifts slowly due to pn-junction currents. For the n-channel transistor in an inverter, forward-biased source junction currents will cause a slow change of body voltage [6, 7]. The time scales of these changes are very different. The fast changes happen over picoseconds or nanoseconds, while the slow changes take milliseconds [7]. A good way to visualize the process is to think of the body node as a charge tank. For a given signal waveform, charge is pumped into the tank and taken out of the tank, and the final voltage is a function of the signal history [6, 7].
Since the transistor threshold voltage depends on the body potential, and since the delay of a logic gate depends on the threshold voltage, the delay of a logic gate becomes a function of the signal history [4, 6, 7]. In a logic gate, the body voltage values of all the transistors determine the delay value (because they affect $V_T$, which affects the delay [8]).
† This research is supported by the Semiconductor Research Corporation, under contract SRC 2000-TJ-755.
Inverter Model
We have developed a model for the basic static SOI inverter that maintains information on the rising delay (delay due to a rising input) and the falling delay (delay due to a falling input) of the inverter, as state variables. These values are updated as the simulation proceeds, without maintaining any body voltage information. In the following, we describe the motivation and structure of the model. In section 2.3, we describe how the model would be used inside a simulator. We also verify the model against SPICE in section 2.4.
Motivation
The motivation behind our model is the following set of empirical observations. Different body voltage combinations inside the inverter can correspond to the same delay value. This can be seen from the contour plots in Fig. 1. Furthermore, if we start the inverter in two different body voltage combinations that correspond to the same delay, and we then apply in each case the same signal stream, the inverter delay and the way it changes over time will be approximately the same. This is shown in Fig. 2, which contains two groups of three curves each. The curves are so close that it is hard to tell them apart. In each group, one curve starts from time 0, and corresponds to the inverter starting from a DC steady state and being clocked towards its AC steady state. The other two curves in each group correspond to the inverter starting in a different body voltage combination, with some corresponding delay, and then being simulated for the same 200 MHz pulse sequence. Each of these two curves is then drawn on the plot starting from a carefully chosen initial time instant, as follows: given the initial starting delay value for each curve, we find the time when the first (DC) curve has that delay value, and we start each of the two curves from that time instant. This achieves a condition whereby the three curves correspond to three versions of the inverter that have the same initial delay but different initial body voltage combinations. It is clear that, having the same delay value and in spite of the different body voltage combinations, the inverters have approximately the same delays under subsequent inputs. This behavior was found to apply in all cases, and is the basis for our model. Fig. 3 shows the percentage spread between final delays for simulation runs of a typical inverter that is started in different body voltage states for various initial delay values. The results shown are for a total of 216 SPICE runs.
0-7803-7057-0/01/$10.00 ©2001 IEEE.
This test has been performed over the full range of loads and input slopes for an inverter and for NAND and NOR gates with up to three inputs.
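The curve alignment used for Fig. 2 can be illustrated with a short sketch. It assumes the reference (DC-start) delay curve is available as sampled (time, delay) points and uses linear interpolation between samples; the function name `alignment_time` and the sample values are hypothetical and are not taken from the paper.

```python
def alignment_time(ref_times, ref_delays, start_delay):
    """Return the first time at which the reference curve reaches start_delay.

    ref_times / ref_delays: samples of the delay curve of the inverter that
    starts from DC steady state (hypothetical data layout).
    start_delay: initial delay of an inverter started in some other body
    voltage combination.
    """
    points = list(zip(ref_times, ref_delays))
    for (t0, d0), (t1, d1) in zip(points, points[1:]):
        lo, hi = min(d0, d1), max(d0, d1)
        if lo <= start_delay <= hi:
            if d1 == d0:
                return t0
            # Linear interpolation inside the segment that brackets start_delay.
            return t0 + (start_delay - d0) * (t1 - t0) / (d1 - d0)
    raise ValueError("start_delay is outside the range of the reference curve")


# Illustrative numbers only: a reference delay that falls as the inverter is clocked.
offset = alignment_time([0.0, 1.0, 2.0, 3.0], [100.0, 90.0, 85.0, 83.0], 87.5)
```

A curve started with an initial delay of 87.5 would then be plotted starting at `offset` on the time axis of the reference curve.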
Model structure
The inverter model maintains two state variables, $D_r$ and $D_f$, defined as the propagation delays due to a rising input and a falling input, respectively. For each, the model contains two main components and one auxiliary component.
1. Δ-Delay tables:
The first main component is the "Δ-Delay" tables, or simply the ΔD tables. This component captures the delay change due to a transition, and is shown in Fig. 4(a) for a rising input (a similar table gives ΔD for a falling input). The tables give the change in delay due to a rising or falling transition, given the delay value immediately before the transition. Several versions of each table are required, each corresponding to a certain output load (capacitance) and input rise/fall time combination. These tables are generated by starting from a DC steady state (one high and one low), applying a sequence of fast pulses to the gate, and measuring the delay values using SPICE.
2. Delay decay tables:
The second main component is the "delay decay" table set, shown in Fig. 4(b) for a rising input (again, a similar table exists for a falling input). This component captures the delay change due to staying at logic high or low, so that we have two decay curves for each state variable. Several tables are required, to account for all the delay and input slope combinations.
3. Mapping tables:
The auxiliary component is a set of "mapping" tables, as shown in Fig. 4(c). This component captures the relation between delay values for different input slopes with a fixed load, and allows one to use delay values characterized for a certain input slope to compute delays due to another. Several tables are required, to account for the different load values.
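As an illustration of how such a table might be stored and queried, the sketch below keeps one Δ-Delay table per (load, input slope) pair and interpolates linearly between characterized delay points. The dictionary layout, the clamping at the table ends, the names `interp` and `delta_delay`, and all numbers are assumptions made for this example; the paper does not specify a storage format or an interpolation scheme.

```python
from bisect import bisect_right


def interp(xs, ys, x):
    """Piecewise-linear lookup in a table with increasing xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # xs[i-1] <= x < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical table for a rising input, keyed by (load in fF, rise time in ps).
# Each entry maps "delay before the transition" to "change in delay" (both in ps).
delta_rise_tables = {
    (10.0, 50.0): ([80.0, 90.0, 100.0], [0.4, 0.1, -0.3]),
}


def delta_delay(tables, load, slope, delay_before):
    """Look up the delay change for one transition at a characterized (load, slope)."""
    delays, changes = tables[(load, slope)]
    return interp(delays, changes, delay_before)


change = delta_delay(delta_rise_tables, 10.0, 50.0, 85.0)
```

The decay and mapping tables could be held in the same way, with the decay tables taking the time spent at a logic level as an additional lookup axis.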
Model usage
We now give a step-by-step description of how the model is used, in the general case when the input slopes in a waveform may be different, as shown in Fig. 5. We start from input state low, and right before the first transition (with rise time $t_{r1}$), we assume that the state variables $D_{r0}$ and $D_{f0}$ are known. The following three functions denote the components of the model.
$\delta_{r/f}(D, t)$ is the Δ-Delay function, based on the Δ-Delay tables. The first argument is the delay before the transition and the second argument is the input rise/fall time. The result is ΔD due to that transition. The r/f subscript stands for rise or fall.
$X_{r/f,\,high/low}(D, t, T)$ is the decay function, based on the decay tables. The first argument is the initial value of delay. The second argument is the input rise/fall time, and the third argument is the decay time. The result is the final value of delay. The r/f subscript stands for rise or fall. The high/low subscript specifies whether it is a decay to high or a decay to low.
$\mu_{r/f}(D, t_s, t_d)$ is the mapping function, based on the mapping tables. The first argument is the delay value for a given slope, and the second argument is the rise/fall time associated with the first argument. The third argument is the rise/fall time value for which we want to compute the delay. The r/f subscript stands for rise or fall.
At point A, right before the first transition, we have:
At point B, after the first transition, we compute the new delay values using the A D tables, as follows:
At point C, right before the second transition, we update the state variables using the decay tables, as follows:
The reason for this peculiar use of the mapping function is that it is much cheaper to create the other tables using the same input slope throughout a waveform than to vary the slope combinations within a waveform. The mapping function is a cheap mechanism for using such tables in the general case.
At point D, we use the ΔD tables again to compute the state variables after the transition, as follows:
At point E, right before the third transition, we update the state variables using the decay tables, then we map this value to the slope $t_{r3}$, as follows:
At point F, applying the same update as point B, we have:
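The walk from point A to point F can be sketched as a loop over input events. This is a minimal illustration and not the authors' implementation: the callable signatures for the Δ-Delay, decay, and mapping functions, the event format, the convention that a rising input is followed by a stay at "high", and the toy functions at the end are all assumptions introduced here. In a real setting the three callables would wrap SPICE-characterized tables.

```python
def simulate(state, events, delta, decay, mapping):
    """Advance the two delay state variables through a list of input events.

    state   : {"r": D_r, "f": D_f}, valid for the slope of the first event
    events  : list of (edge, slope, hold); edge is "rise" or "fall", and hold
              is the time spent at the new input level before the next edge
    delta   : delta(var, edge, D, slope)            -> change in delay
    decay   : decay(var, level, D, slope, hold)     -> delay after the hold
    mapping : mapping(var, D, slope_from, slope_to) -> delay at the new slope
    """
    after_edges = []
    for i, (edge, slope, hold) in enumerate(events):
        # Points B, D, F: add the delay change caused by the transition.
        state = {v: d + delta(v, edge, d, slope) for v, d in state.items()}
        after_edges.append(dict(state))
        # Points C, E: let both delays decay while the input holds its level...
        level = "high" if edge == "rise" else "low"
        state = {v: decay(v, level, d, slope, hold) for v, d in state.items()}
        # ...then map them to the slope of the next transition, if there is one.
        if i + 1 < len(events):
            next_slope = events[i + 1][1]
            state = {v: mapping(v, d, slope, next_slope) for v, d in state.items()}
    return state, after_edges


# Toy stand-ins for the three table-based functions (illustrative only).
def toy_delta(var, edge, d, slope):
    return 1.0 if edge == "rise" else -1.0


def toy_decay(var, level, d, slope, hold):
    return d - 0.5 if level == "high" else d + 0.5


def toy_mapping(var, d, slope_from, slope_to):
    return d * slope_to / slope_from


final, after_edges = simulate(
    {"r": 10.0, "f": 20.0},
    [("rise", 1.0, 5.0), ("fall", 2.0, 5.0)],
    toy_delta, toy_decay, toy_mapping,
)
```

Passing the model components as callables keeps the update loop independent of how the tables are stored, and the state carried between events is only the pair of delays, with no body voltage information.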
Experimental results

In order to show the accuracy of this model, we used an inverter and constructed a table of $(t_{high}, t_{low})$ combinations representing different duty cycle values. For each combination, we simulated the inverter in SPICE and compared our delay prediction at the end of the cycle to that obtained from SPICE. A typical percentage error histogram is shown in Fig. 6, where $t_{high}$ and $t_{low}$ are taken from the set {1 ns, 10 ns, 100 ns, 1 μs, 10 μs, 100 μs, 1 ms}, giving 49 combinations of input pulses. The errors are all under 5%, which is very good agreement, even by the accuracy standards typical of bulk CMOS gate timing models.
It is important to investigate whether the error accumulates over time. To do so, we constructed and simulated different tables of $(t_{high}, t_{low})$ combinations, with the results shown in Fig. 7(a). A 3 x 3 table denotes a matrix of all combinations of values of $t_{high}$ and $t_{low}$ from the set {1 ns, 10 ns, 100 ns}. Thus a 3 x 3 table simulation in Fig. 7(a) consists of a simulation of the following sequence of cycles: $(t_{high}, t_{low})$ = {(1, 1), (1, 10), (1, 100), (10, 1), (10, 10), (10, 100), (100, 1), (100, 10), (100, 100)}, all values in ns. The 4 x 4 table corresponds to combinations of values of $t_{high}$ and $t_{low}$ from the set {1 ns, 10 ns, 100 ns, 1000 ns}, and similarly for the other tables. Notice that this set of pulses covers fast pulses as well as pulses that take the inverter near its DC steady state in each half cycle. The error values are all small, again justifying the use of this model. The errors also do not seem to accumulate. In the two cases of the 3 x 3 and the 4 x 4 tables, where Fig. 7(a) seems to show increasing error, we repeated the simulation a number of times, and the results, shown in Fig. 7(b), indicate that the error does not accumulate in these cases either.
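The pulse tables used in these experiments are easy to enumerate. The sketch below builds the N x N sequence of $(t_{high}, t_{low})$ cycles from the first N pulse widths and defines the percentage error against a SPICE reference. The helper names `cycle_table` and `percent_error` are hypothetical, and the error is assumed to be measured relative to the SPICE delay.

```python
from itertools import product

# Pulse widths in ns: 1 ns, 10 ns, 100 ns, 1 us, 10 us, 100 us, 1 ms.
PULSE_WIDTHS_NS = [1, 10, 100, 1_000, 10_000, 100_000, 1_000_000]


def cycle_table(n):
    """All (t_high, t_low) pairs, in ns, built from the first n pulse widths."""
    return list(product(PULSE_WIDTHS_NS[:n], repeat=2))


def percent_error(model_delay, spice_delay):
    """Signed model error, as a percentage of the SPICE delay."""
    return 100.0 * (model_delay - spice_delay) / spice_delay


three_by_three = cycle_table(3)   # the 9 cycles of the 3 x 3 table
full_table = cycle_table(7)       # the 49 combinations of the 7 x 7 table
```

Simulating the cycles of a table back to back and recording `percent_error` after each one gives a direct view of whether the error grows along the sequence.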
Extension to Other Gates
This modeling approach can, in principle, be used for other gates. So far, the model has been tested for two-input NAND gates, and a typical model accuracy is shown in Fig. 8. However, it may not be practical to simply increase the number of decay tables according to the set of input states ($2^n$). Instead, we are working on an approximate technique by which a fixed number of decay tables can be used.
Conclusion and Future Work
In this paper, we have presented a model that can be used for timing simulation/analysis of PDFB SOI logic gates, and we showed empirical results for an inverter and a 2-input NAND gate. The concept can be extended to other types of static gates as well as to gates with a larger number of inputs. Future work will include testing the model for gates with more inputs as well as AOI gates.
