Progress in semiconductor process technology has made SOI transistors one of the most promising candidates for high-performance and low-power designs. With smaller diffusion capacitances, SOI transistors switch significantly faster than their traditional bulk MOS counterparts and consume less power per switching event. However, design and simulation of SOI MOS circuits is more challenging because an SOI transistor behaves in a more complex way: it exhibits floating body effects, a dependence of delay on the history of transistor switching, the bipolar effect, and others. This paper develops a fast table model of SOI transistors suitable for use in fast transistor-level simulators. We propose using body charge instead of body potential as an independent variable of the model to improve convergence of the circuit simulation integration algorithm. An SOI transistor has one more terminal than a bulk MOSFET and hence requires larger tables to model. We propose a novel transformation that reduces the number of table dimensions and thereby keeps the size of the tables reasonable. The paper also presents an efficient implementation of our SOI transistor table model using piece-wise polynomial approximation, nonuniform grid discretization, and splitting of the transistor model into models of its equilibrium and non-equilibrium states. The effectiveness of the proposed model is demonstrated by employing it in a fast transistor-level simulator to simulate high-performance industrial SOI microprocessor circuits.
Introduction
SOI technology is one of the most promising ways to increase the switching speed of MOS transistors without changing their size [1], [2]. SOI transistors have significantly smaller source and drain diffusion capacitances and a lower body effect, which reduces gate delays and power dissipation. Another very attractive feature of SOI technology is the possibility of reusing the schematic solutions developed for bulk digital chips. Unfortunately, designing SOI VLSI circuits is more difficult than designing traditional bulk MOS circuits because SOI transistors behave in a significantly more complex way [1]. Unlike their bulk MOS counterparts, SOI transistors are fabricated in electrically isolated islands of silicon, so their bodies are completely isolated from each other. This allows them to be used in different configurations: with the body connected to the transistor source, with the body connected to any node of the circuit, or with the body left floating. The last case is the most common, as it requires the smallest transistor size and provides the fastest switching speed. However, the behavior of a floating body SOI transistor is the most complex. The most important phenomena are the history, bipolar, and I-V curve kink effects [1], [2]. Due to the history effect, delays of logic cells depend on their switching history. The bipolar effect is an additional source of noise in circuits that may lead to malfunctions.
Transistor-level simulation is traditionally used for designing and characterizing library cells and critical blocks of custom VLSI circuits. Due to the complex behavior of the SOI transistor, designing SOI circuits requires even more transistor-level simulation than designing traditional MOS circuits. The large number of simulations and the relatively low performance of SPICE simulators are critical issues in the VLSI design flow. To reduce simulation time, it was proposed to use fast simulators for simulating large digital blocks at the transistor level [3], [9]. Fast simulators are generally simplified versions of SPICE simulators, with significantly higher simulation speed obtained at the expense of slightly lower accuracy. Fast transient simulators are even more beneficial for SOI circuits than for bulk CMOS circuits, as SPICE simulators are usually much slower for SOI circuits than for bulk ones. This is due to the significantly higher complexity of the SOI transistor model, the smaller time step, and the worse convergence properties of the SOI transistor model.
One of the key components of a fast transient simulator is its transistor model. The accuracy of the simulator cannot be better than the accuracy of the model, and the efficiency of any transistor-level simulator greatly depends on the efficiency of its transistor model. This is especially true for a fast simulator, because it uses simplified simulation algorithms, so transistor model computation takes a much larger share of the run time than in a traditional simulator. Accurate transistor models such as BSIM3SOI [SI, used in SPICE, are not suitable for fast simulators because of their complexity and low speed.
Models of MOS transistors for fast simulation attract a lot of interest. In this well-studied field of computer-aided design, it was widely recognized that only [...] The traditional bulk MOSFET requires three-dimensional tables for describing its currents and charges [5]. However, it was noticed that MOSFET behavior can be approximated by two-dimensional tables using the so-called "gate-offset-voltage concept" [5], [6]. In our models we use a nonuniform discretization grid combined with binary access trees. To reduce the approximation error and to allow a coarser discretization grid, we also found it useful to use piece-wise polynomial approximation instead of the commonly used piece-wise linear approximation.
Above mentioned research on [...] An SOI transistor has an extra external terminal compared to a bulk transistor. Therefore, an SOI transistor model requires an additional independent variable. The resulting multidimensional tables of such a transistor model are too large to provide reasonable efficiency. Thus the problem of reducing the size of SOI model tables is even more important than in the case of bulk transistors. In the proposed model we use a novel transformation to reduce the number of independent table variables, exploiting an assumption about the linearity of the capacitance between the transistor backgate and body. Additionally, we reduce the size of the tables by using a nonuniform grid and piece-wise polynomial approximation. For increased simulator accuracy we construct a [...]
SOI transistor and its effects
The structure of an SOI transistor is shown in Figure 1(a). In general it is similar to a traditional bulk MOSFET. However, each SOI transistor is fabricated in its own silicon island that is isolated both from the silicon substrate by buried oxide and from all the other transistors by shallow trench isolation. Due to the oxide isolation, an SOI transistor has very small diffusion capacitance [1], [2]. Therefore the performance of SOI circuits is significantly higher than that of traditional ones.
The substrate of an SOI chip is the fifth terminal of an SOI transistor, affecting its behavior through capacitive coupling. Thus, unlike a traditional MOSFET, the electrical model of an SOI transistor has 5 terminals (Figure 1(b)), which increases the number of independent variables and, correspondingly, the number of dimensions of a table model. In our model we apply a special transformation of variables to reduce the number of table dimensions.
The body of an SOI transistor plays the same role as the substrate of a traditional MOSFET, but it is electrically isolated from the bodies of all other transistors. This electrical isolation provides several ways of using an SOI transistor in circuits, which differ in the body connection [2]. The simplest and most common way is to leave the body floating. This requires the smallest area for the transistor. Another benefit is that in most cases transistors with a floating body are faster because of the reduced body effect on the transistor threshold voltage. However, transistors with a floating body are the most difficult case for simulation. Unlike a traditional bulk MOSFET, the state of an SOI transistor with a floating body is not uniquely defined by the voltages applied to the external transistor terminals: source, drain, gate, and backgate. The state of the transistor additionally depends on the body potential, which cannot be directly controlled externally. The transistor body can be either in an equilibrium or a non-equilibrium state with the other transistor terminals. In the equilibrium state, the transistor body voltage has reached a stable [...] The potential and charge of a floating body are affected by multiple influences and can vary over a very wide range. The body potential depends both on the body's own charge and on the potentials of the other transistor terminals, which affect the body through capacitive coupling. In the off state of the transistor, its body is completely isolated from other circuit nodes and can abruptly change its potential without changing its charge because of capacitive coupling to the other transistor terminals. That kind of behavior is very difficult to simulate in the traditional way, where circuit node potentials are the independent variables defining the circuit state. To improve the accuracy of integrating the circuit differential equations, we use body charge as an independent variable.
Thus, in our fast transient simulator the circuit state is defined as a combination of node potentials and transistor body charges. For the implementation, we developed the necessary transformation of the transistor and circuit equations.
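The difference between body potential and body charge as state variables can be illustrated with a minimal sketch. It assumes constant (linear) coupling capacitances between the body and each terminal, which is a simplification for illustration only; the function name `body_potential` and all capacitance values are hypothetical and do not come from the model described here.

```python
def body_potential(q_body, voltages, caps):
    """Floating-body potential from body charge and terminal voltages.

    Assumes constant coupling capacitances caps[t] (in farads) between the
    body and each terminal t, so that
        q_body = sum_t caps[t] * (v_body - voltages[t]).
    Real junction capacitances are nonlinear; this is only an illustration.
    """
    c_total = sum(caps.values())
    coupled = sum(caps[t] * voltages[t] for t in caps)
    return (q_body + coupled) / c_total


# Hypothetical coupling capacitances, chosen only for illustration.
caps = {"gate": 0.4e-15, "drain": 0.2e-15, "source": 0.2e-15, "backgate": 0.05e-15}
q_body = 0.0  # body charge is held fixed: no body current flows during the step

before = body_potential(
    q_body, {"gate": 0.0, "drain": 1.0, "source": 0.0, "backgate": 0.0}, caps)
after = body_potential(
    q_body, {"gate": 1.0, "drain": 1.0, "source": 0.0, "backgate": 0.0}, caps)

print(f"body potential before gate step: {before:.3f} V")
print(f"body potential after gate step:  {after:.3f} V")
```

In this sketch the body potential jumps as soon as the gate voltage steps, while the body charge stays exactly the same, which is why the charge is the smoother quantity to integrate.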
Another difficult problem of modeling an SOI transistor with a floating body is related to the so-called "history effect". The transistor threshold voltage depends on the body potential and, through it, on the body charge. On the other hand, the body charge depends on rather small currents due to impact ionization and junction leakage that occur during transistor switching and in its off state too. The characteristic time of this mechanism is of the order of magnitude of milliseconds. As a result, the body charge acts like a memory remembering the switching history of the transistor [1], [2]. That kind of transistor behavior creates significant difficulties for circuit simulation, because even very small errors in a model or in integration can accumulate over the simulation time and result in a large error in body charge and potential and, consequently, in a wrong value of the transistor threshold voltage. Therefore simulation of SOI circuits requires very high accuracy of the transistor model for both its currents and charges. The necessity of computing accurate transistor charges requires non-linear models of transistor capacitances. To accommodate these requirements, our model includes tables for both currents and charges, imposing stringent constraints on the table sizes.
The second type of body connection ties the body to the transistor source through a P+ diffusion region and salicide covering the N+ and P+ regions, as shown in Figure 2. Using this contact, the body can be connected to any node of the circuit. This type of connection is sometimes used in analog circuits and in some other special cases, such as memory sense amplifiers, when explicit control of the body potential is required. With this connection the transistor behavior is easier to model than that of a floating body transistor, as the body potential is equal to the potential of the circuit node that it is tied to. Furthermore, a body connected this way attenuates the influence of the transistor backgate.
Simplified SOI transistor model
The electrical model of an SOI transistor is shown in Figure 1(b). [...] (i = d, g, y, e) are the external currents of the transistor terminals. These equations are the formulation of Kirchhoff's law for currents at the transistor terminal nodes. They should be accompanied by the equations of the transistor model:
In our model, we neglect the gate and backgate currents because they are close to 0 for SOI MOS transistors.
Independent variables change
As mentioned above, the floating body potential of an SOI transistor can change its value very abruptly even when the body charge does not change at all. This happens due to capacitive coupling of the body with the other transistor terminals. This fast voltage variation can be rather large because the floating body capacitance is very small. On the other hand, the body voltage can also vary slowly due to very small impact ionization and leakage currents [1], [2]. This imposes strict requirements on both the integration time step and the accuracy of the circuit simulator. The body charge, however, cannot change its value as fast, because the value of the body current is limited. Therefore the body charge has a significantly larger characteristic time of variation. By using body charge as an independent variable during circuit simulation, we can increase the minimum simulation time step and, consequently, increase the speed of the simulation. However, direct usage of body charge as an independent variable is not convenient enough. Unlike node potentials, the body charge may vary significantly from transistor to transistor depending on the transistor size. Therefore we apply a linear transformation to the body charge to make its range of variation approximately the same as the range of circuit voltages. This new variable is called the normalized shifted body charge:
where $Q_{min}$ and $Q_{max}$ are the minimum and maximum body charges when the body voltage changes from 0 to [...] From this equation we can see that the difference between the normalized body charge and the backgate voltage, $V_y - V_e$, varies linearly from 0 to $V_{dd}$ when the body charge varies from its minimum to its maximum value. This kind of behavior is similar to the variation of the other circuit voltages, which simplifies error control during integration of the circuit equations.
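The defining equation is not legible in this copy, so the following is only a sketch of a linear map that is consistent with the description above: the difference between the normalized variable and the backgate voltage runs linearly from 0 to an upper voltage limit as the body charge runs from its minimum to its maximum. The function names and the parameter `v_ref` (the upper voltage limit) are assumptions made for illustration.

```python
def normalized_shifted_charge(q_body, q_min, q_max, v_backgate, v_ref):
    """Map body charge to a voltage-like variable (assumed linear form).

    (result - v_backgate) goes linearly from 0 to v_ref as q_body goes
    from q_min to q_max. v_ref is a hypothetical name for the upper limit.
    """
    return v_backgate + v_ref * (q_body - q_min) / (q_max - q_min)


def body_charge(v_y, q_min, q_max, v_backgate, v_ref):
    """Inverse map: recover the body charge from the normalized variable."""
    return q_min + (v_y - v_backgate) * (q_max - q_min) / v_ref


# Hypothetical numbers for one transistor size (charges in coulombs).
q_min, q_max = -0.2e-15, 0.6e-15
v_backgate, v_ref = 0.0, 1.2

v_y = normalized_shifted_charge(0.2e-15, q_min, q_max, v_backgate, v_ref)
print(f"normalized shifted body charge: {v_y:.3f} V")
print(f"recovered body charge: {body_charge(v_y, q_min, q_max, v_backgate, v_ref):.3e} C")
```

Because the map is linear and invertible, a simulator could integrate the voltage-like variable with the same error tolerances as ordinary node voltages and convert back to charge only where the device equations need it.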
By introducing average body capacitance: 
Reduction of independent variables
As mentioned above, the behavior of an SOI transistor depends on 4 independent variables, which greatly complicates constructing a compact model, as it requires four-dimensional tables. However, it is known that transistor currents are almost independent of the backgate voltage. Ignoring this dependence, we obtain currents as functions of 3 variables only.
$J_i(V_d, V_g, V_y, V_e) = J_i(V_d, V_g, V_y)$ (EQ 10)
The capacitance between the backgate and the other transistor terminals is almost independent of the terminal voltages, because the backgate is separated from the other parts of the transistor by a thick oxide layer that is a good dielectric. Figure 3 shows that the charges of all the transistor terminals depend linearly on the backgate voltage. We can use this fact for simplifying our transistor [...]
Organization of model tables
The transistor model is one of the most heavily used parts of a circuit simulator. Accurate analytical transistor models like BSIM3SOI [SI are very complicated and rather slow. The table look-up technique is the only way to increase the speed of transistor models significantly. Therefore we constructed our transistor models as a set of multidimensional tables. Multidimensional tables are usually very large and require a lot of memory. We developed a special technique for improving the accuracy of table models while keeping their size reasonably low. We split the floating body transistor model into a part describing the transistor behavior when the body is in the equilibrium state with the drain and source voltages and a part describing the deviation of the transistor behavior from the equilibrium state. We use piece-wise polynomial approximation on a nonuniform discretization grid.
Splitting model into two parts
The model of an SOI transistor with a floating body consists of two sets of tables. The first set of tables describes the currents and charges of the transistor under the condition that its body is in the equilibrium state. The second set of tables describes the transistor currents and charges in the general case.
The equilibrium state of the transistor is described by the following set of tables: [...]

Splitting the table model into these two sets of tables helps to improve the accuracy of the model. The model for the equilibrium floating body state is only two-dimensional and can be implemented more accurately. The general case has three-dimensional tables but describes only deviations of the transistor behavior from the floating body equilibrium state. Even a higher relative approximation error of the three-dimensional tables therefore results in a total error that is not very large. Consequently, we can relax the accuracy requirements for the three-dimensional tables and use a coarser discretization grid.
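The error argument above can be made concrete with a small worked example. It assumes that the modeled quantity is the sum of an equilibrium part and a deviation part, which is one plausible reading of the text; the table definitions themselves are not legible in this copy. All numbers and the function name are hypothetical.

```python
def total_relative_error(eq_value, deviation, eq_rel_err, dev_rel_err):
    """Worst-case relative error of (eq_value + deviation).

    eq_rel_err:  relative error of the accurate 2D equilibrium table
    dev_rel_err: relative error of the coarser 3D deviation table
    Assumes the two contributions simply add (an illustrative assumption).
    """
    exact = eq_value + deviation
    worst_abs_err = abs(eq_value) * eq_rel_err + abs(deviation) * dev_rel_err
    return worst_abs_err / abs(exact)


# Hypothetical drain current: 100 uA at equilibrium, 10 uA deviation.
split_err = total_relative_error(100e-6, 10e-6, eq_rel_err=0.001, dev_rel_err=0.02)
# Same total current modeled by a single 3D table with 2% relative error.
single_err = total_relative_error(0.0, 110e-6, eq_rel_err=0.0, dev_rel_err=0.02)

print(f"split model:  {100 * split_err:.2f} % worst-case error")
print(f"single table: {100 * single_err:.2f} % worst-case error")
```

In this illustration a 2% error on the small deviation term contributes far less to the total than a 2% error on the whole quantity would, so the three-dimensional table can tolerate a coarser grid.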
Piece-wise polynomial approximation
Piece-wise polynomial approximation provides higher accuracy for both function values and derivatives, which allows us to use a coarser discretization grid. On the other hand, piece-wise polynomial approximation requires storing several polynomial coefficients per cell instead of the single function value required by piece-wise linear approximation. The higher the order of the polynomial, the better the accuracy and the coarser the grid we can use, but at the cost of more memory for coefficients. In our tables we use second-order polynomials for representing functions of three variables and third-order polynomials for functions of two variables. This matches the higher relative accuracy required of the transistor model for the equilibrium state and the lower relative accuracy that is acceptable for the model describing the deviation of the non-equilibrium state from the equilibrium one.
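The trade-off between polynomial order and grid density can be seen in a one-dimensional sketch. For brevity it builds each cell's polynomial by interpolation rather than by the least squares fitting used for the actual tables, and the test function is an arbitrary smooth exponential, not a transistor characteristic; all names are hypothetical.

```python
import math


def linear_cell(f, a, b):
    """Piece-wise linear approximation of f on [a, b] through both end points."""
    fa, fb = f(a), f(b)
    return lambda x: fa + (fb - fa) * (x - a) / (b - a)


def quadratic_cell(f, a, b):
    """Second-order polynomial through the end points and the midpoint."""
    m = 0.5 * (a + b)
    fa, fm, fb = f(a), f(m), f(b)

    def poly(x):
        la = (x - m) * (x - b) / ((a - m) * (a - b))
        lm = (x - a) * (x - b) / ((m - a) * (m - b))
        lb = (x - a) * (x - m) / ((b - a) * (b - m))
        return fa * la + fm * lm + fb * lb

    return poly


def max_error(f, make_cell, n_cells, lo=0.0, hi=1.0, samples=50):
    """Largest sampled error over a uniform grid of n_cells cells."""
    worst = 0.0
    width = (hi - lo) / n_cells
    for i in range(n_cells):
        a = lo + i * width
        b = a + width
        approx = make_cell(f, a, b)
        for k in range(samples + 1):
            x = a + (b - a) * k / samples
            worst = max(worst, abs(approx(x) - f(x)))
    return worst


def f(x):
    return math.exp(4.0 * x)  # arbitrary smooth, fast-growing test function


for n in (4, 8, 16):
    print(n, max_error(f, linear_cell, n), max_error(f, quadratic_cell, n))
```

In this sketch the quadratic approximation on a coarse grid is more accurate than the linear approximation on a grid twice as fine, at the cost of one extra stored value per cell.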
Nonuniform discretization
The transistor terminal currents and charges vary over a very wide range. For accurate circuit simulation it is necessary to have a small relative error in current and charge approximation for all regions of voltages. Therefore, to reduce the size of the model while preserving sufficient accuracy, we use nonuniform discretization as shown in Figure 4. Large grid cells are used for regions of slow variation and small grid cells for regions with fast variation. The size of the cells is adaptively computed at the time of transistor model characterization.
For fast access to grid cells during circuit simulation, we use a multidimensional binary tree, as shown in Figure 4(d) for a two-dimensional discretization grid. The grid has a hierarchical binary structure as well. It is built from a two- or three-dimensional rectangular domain by splitting it into pairs of equal cells, as shown in Figure 4(b, c, d). Each splitting is independent of the previous ones. The root node of the binary tree corresponds to the whole domain. The leaf nodes correspond to the terminal grid cells. The other nodes of the tree correspond to the intermediate grid cells. Each of them specifies the direction along which a cell is split into a pair of smaller cells. Figure 5 shows the algorithm for computing an approximate function value using the table model with a nonuniform grid and a binary access tree. This algorithm requires on average log(N) time for accessing a cell of the discretization grid. Moreover, only small regions require more access steps.
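The access procedure described above can be sketched as follows. This is a minimal illustration, not the algorithm of Figure 5 itself: the `Node` layout, the field names, and the tiny hand-built tree are all hypothetical, and each leaf holds an arbitrary callable standing in for the stored polynomial.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    lo: tuple                        # lower corner of the cell
    hi: tuple                        # upper corner of the cell
    axis: int = -1                   # split direction; -1 marks a leaf cell
    left: Optional["Node"] = None    # half below the midpoint along `axis`
    right: Optional["Node"] = None   # half above the midpoint along `axis`
    poly: Optional[Callable] = None  # leaf approximation of the function


def lookup(root, point):
    """Descend from the root to the leaf containing `point` and evaluate it.

    Returns the approximate function value and the number of tree steps.
    """
    node, steps = root, 0
    while node.axis >= 0:
        mid = 0.5 * (node.lo[node.axis] + node.hi[node.axis])
        node = node.left if point[node.axis] < mid else node.right
        steps += 1
    return node.poly(point), steps


# Hand-built example: the unit square is split along axis 0, then its right
# half is split along axis 1. Leaf functions are placeholders.
root = Node(
    (0.0, 0.0), (1.0, 1.0), axis=0,
    left=Node((0.0, 0.0), (0.5, 1.0), poly=lambda p: 1.0),
    right=Node(
        (0.5, 0.0), (1.0, 1.0), axis=1,
        left=Node((0.5, 0.0), (1.0, 0.5), poly=lambda p: 2.0 + p[0]),
        right=Node((0.5, 0.5), (1.0, 1.0), poly=lambda p: 3.0 + p[1]),
    ),
)

print(lookup(root, (0.25, 0.9)))
print(lookup(root, (0.75, 0.25)))
```

Because every split halves a cell, each step needs only one comparison against a midpoint, and points in coarse regions reach their leaf in fewer steps than points in finely divided regions.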
Handling transistor width
For circuit simulation we need to model transistors of different widths. In our simulator we use two approaches. We can create an individual model for each transistor width used in the circuit, or we can build models only for several transistor widths and use linear interpolation for the other transistor widths. The first approach is more accurate but requires more memory for transistor models and [...]
Transistor model characterization
Transistor model creation consists of building a non-uniform grid in the region limited by the possible variations of the independent parameters and computing the coefficients of the approximation polynomial for each elementary cell of the grid. This procedure is controlled by the required accuracy of approximation. The simplified algorithm for building the non-uniform grid is shown in Figure 6. The algorithm iteratively constructs approximation polynomials for each elementary cell of the grid and checks the accuracy of the approximation. If the accuracy is acceptable, the algorithm stops splitting the cell and makes it a leaf cell of the tree. Otherwise it tries to split the cell in all possible directions and selects the one that provides better accuracy. This procedure continues until the required accuracy is achieved for all cells of the grid.
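The grid construction procedure (build an approximation for a cell, accept the cell if it is accurate enough, otherwise try every split direction and keep the better one) can be sketched in two dimensions as follows. This is not the algorithm of Figure 6: bilinear interpolation stands in for the fitted polynomials, the error is estimated on a small sample grid, and the test surface, tolerance, and depth limit are arbitrary assumptions.

```python
import math


def bilinear(f, lo, hi):
    """Bilinear interpolant of f over the cell [lo, hi] (stand-in for a fit)."""
    (x0, y0), (x1, y1) = lo, hi
    f00, f10, f01, f11 = f(x0, y0), f(x1, y0), f(x0, y1), f(x1, y1)

    def approx(x, y):
        u = (x - x0) / (x1 - x0)
        v = (y - y0) / (y1 - y0)
        return (f00 * (1 - u) * (1 - v) + f10 * u * (1 - v)
                + f01 * (1 - u) * v + f11 * u * v)

    return approx


def cell_error(f, lo, hi, n=8):
    """Largest sampled approximation error inside one cell."""
    approx = bilinear(f, lo, hi)
    worst = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            x = lo[0] + (hi[0] - lo[0]) * i / n
            y = lo[1] + (hi[1] - lo[1]) * j / n
            worst = max(worst, abs(approx(x, y) - f(x, y)))
    return worst


def split(lo, hi, axis):
    """Split a cell into two equal halves along `axis`."""
    mid = 0.5 * (lo[axis] + hi[axis])
    hi_left = tuple(mid if k == axis else hi[k] for k in range(2))
    lo_right = tuple(mid if k == axis else lo[k] for k in range(2))
    return (lo, hi_left), (lo_right, hi)


def build(f, lo, hi, tol, depth=0, max_depth=12):
    """Recursively split cells until each leaf meets the tolerance."""
    if cell_error(f, lo, hi) <= tol or depth >= max_depth:
        return {"lo": lo, "hi": hi, "leaf": True}
    # Try both split directions; keep the one whose halves are more accurate.
    axis = min((0, 1),
               key=lambda ax: max(cell_error(f, *c) for c in split(lo, hi, ax)))
    left, right = split(lo, hi, axis)
    return {"lo": lo, "hi": hi, "leaf": False, "axis": axis,
            "left": build(f, *left, tol, depth + 1, max_depth),
            "right": build(f, *right, tol, depth + 1, max_depth)}


def leaves(node):
    """Collect all leaf cells of the tree."""
    if node["leaf"]:
        return [node]
    return leaves(node["left"]) + leaves(node["right"])


def surface(x, y):
    return math.exp(6.0 * x) * (1.0 + 0.1 * y)  # varies fast only near x = 1


tree = build(surface, (0.0, 0.0), (1.0, 1.0), tol=1.0)
cells = leaves(tree)
widths = [c["hi"][0] - c["lo"][0] for c in cells]
print(len(cells), "leaf cells; x-widths from", min(widths), "to", max(widths))
```

The resulting cells are narrow where the surface changes quickly and wide elsewhere, which is the property that keeps the table small compared with a uniform grid of the same accuracy.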
The coefficients of an approximation polynomial are computed by linear least squares curve fitting. We minimize the approximation error for both the function and its derivatives because, for circuit simulation, accurate derivatives are as important as function values. We use the following goal function: [...] where [...] is an approximating polynomial [...]
[Table: access tree depth, accuracy, table size]
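Since the goal function is not legible in this copy, the following is only a sketch of one common way to fit values and derivatives together: a one-dimensional quadratic is chosen to minimize the sum of squared value errors plus a weighted sum of squared derivative errors. The weight `w_deriv`, the function names, and the use of normal equations are assumptions made for illustration.

```python
import math


def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(xs, fs, dfs, w_deriv):
    """Fit p(x) = c0 + c1*x + c2*x^2 to values fs and derivatives dfs.

    Minimizes sum (p(x) - f)^2 + w_deriv * sum (p'(x) - f')^2.
    """
    s = math.sqrt(w_deriv)
    rows, rhs = [], []
    for x, f, df in zip(xs, fs, dfs):
        rows.append([1.0, x, x * x])          # value residual
        rhs.append(f)
        rows.append([0.0, s, 2.0 * s * x])    # weighted derivative residual
        rhs.append(s * df)
    # Normal equations (A^T A) c = A^T b for the stacked residuals.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
    return solve(ata, atb)


# Example: fit exp(x) on one cell [0, 1] from five sample points.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
coeffs = fit_quadratic(xs, [math.exp(x) for x in xs], [math.exp(x) for x in xs], 1.0)
print("fitted coefficients:", coeffs)
```

Raising `w_deriv` trades some accuracy in the function value for a better match of the slope, which matters when the simulator uses the table derivatives in its Newton iterations.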
Results
The proposed transistor table model is implemented in our fast transient simulator FSIM. FSIM is used as a fast simulation mode of our internal SPICE-level simulation tool and as a fast simulation engine for our noise analysis tool [10] for simulating noise clusters. In both modes, FSIM uses accurate transistor models for constructing its table models, invoking functions of our accurate SPICE simulator. Figure 7 shows waveforms of the bipolar effect current simulated using accurate and table models.
The current waveforms are almost the same. Figure 8 shows simulation of the SOI history effect by the accurate SPICE simulator and by our fast simulator using table models. To demonstrate the history effect, a typical SOI inverter was simulated by applying 8000 short pulses (125 ps) with a frequency of 4 GHz to its input and measuring the delay of each pulse at the inverter's output. In order to emphasize the delay variation, the time axis is moved up by 3 ps. We see that the table-based simulator accurately predicts the delay variation due to the inverter switching history. The maximum error is 0.01 ps for the falling delay and 0.06 ps for the rising delay. This shows that the accuracy of the proposed SOI transistor table model is sufficient even for such a sensitive effect.
Conclusion
An accurate table model for SOI transistors is proposed in this paper. The model is used in a fast transistor-level simulator of CMOS SOI circuits. The proposed model describes both the currents and the charges of an SOI transistor and is suitable for all types of SOI transistor body connections. To improve the convergence of the fast simulator integration algorithm, the model uses the normalized shifted body charge as an independent variable instead of the body potential. By applying a transformation of the transistor equations, the number of table dimensions is reduced from 4 to 3. The model uses piece-wise polynomial approximation of second and third order with a nonuniform discretization grid and a binary access tree. The tables approximate both the function and its first-order derivatives.
Experiments carried out on large industrial circuits demonstrated the high accuracy and efficiency of the proposed model in simulating SOI circuits, achieving less than 4.8% average error in delay and less than 3.6% average error in transition time. It is demonstrated that nonuniform discretization requires 226528 times fewer cells than a uniform grid for the same accuracy.
Our current and future investigations include reduction of table dimensions from 3 to 2 at the cost of reduced accuracy by using the "gate-offset-voltage concept" [5], variable accuracy for different transistor operation regions, development of an extrinsic transistor model for 90 nm transistors, and gate leakage modeling.
