A one-region compact $I_{ds}$ model from subthreshold to saturation, which has the same form as the well-known long-channel model but includes all major short-channel effects (SCEs) in deep-submicron (DSM) MOSFETs, has been formulated through physics-based effective transformation. The model has 23 process-dependent fitting parameters, which are extracted with an 11-step, one-iteration procedure. The new approach to modeling channel-length modulation (CLM), subthreshold diffusion current, and edge-leakage current, all in a compact form, has been verified with 0.25-µm experimental data. The model covers the full range of gate lengths (without "binning") and bias conditions, and can be correlated to true process variables to aid technology development.
subject to process fluctuations, and the CM only models the average values of these quantities. Models that are based on single-device extraction [6], [7] may not be useful for DSM device/circuit modeling since the trend now is toward technology, rather than transistor, characterization. In other words, the CM for DSM devices should be developed for, and its parameters extracted from, a given technology of varying gate lengths, with short-channel effects (SCEs) "calibrated" to the length-independent long-channel devices of the same technology (wafer).
In this paper, an approach to formulating unified CMs for DSM MOSFETs is presented, which is based on physics-based "effective transformation," a step-by-step process of adding higher order effects to the well-known long-channel MOSFET equations. The idea is similar to the "effective voltage transformation" [8], and has been demonstrated in our previous work on compact threshold-voltage formulation [9]. The approach is based on the belief that the SCE manifests itself as a gradual effect as the gate length alone is decreased and, thus, its modeling must be separated from (and can be calibrated by) its long-channel counterpart. The general procedure is to incorporate physically derived or empirically based equations for each individual SCE with effective quantities that contain process-dependent fitting parameters, which also approach the values of their long-channel counterparts in the long-channel limit. All the fitting parameters have physical meanings, and their extraction follows a unique, one-iteration procedure at the "boundary" condition at which their effect is most pronounced. The sequence of model parameter extraction should be such that when a parameter is to be extracted, those that have not been extracted should have little effect; and once extracted, its value should be fixed in subsequent extraction. In addition, third-order effects should only be considered after first- and second-order effects have been accounted for. The general guide in the tradeoff between detailed physics and compact form is what Albert Einstein described: "Everything should be made as simple as possible, but not any simpler." Our unified compact model is developed with the help of a set of experimental data from a 0.25-µm CMOS shallow trench isolation (STI) wafer with drawn gate lengths ( ) from 10 µm down to 0.2 µm.
Section II describes the model-equation formulation with a step-by-step effective transformation that results in a one-region equation from subthreshold to saturation covering all gate lengths and bias conditions, which also resembles the form of the long-channel model. Section III presents the one-iteration parameter extraction procedure as well as the necessary measurement data. Section IV reports another unique feature of the approach: correlating the model to true process variables. Results of the developed model are demonstrated and discussed in Section V. Finally, Section VI concludes the paper. The way our CM is formulated, as well as the sequence in which the CM parameters are extracted, also demonstrates the principles behind our novel approach to compact modeling.
II. COMPACT MODEL FORMULATION
The rationale for the sequence of our model formulation is the following. Accurate modeling of the threshold voltage ( ) [9], including roll-off SCE and roll-up reverse SCE (RSCE) as well as drain-induced barrier lowering (DIBL) and body effect, is first carried out. represents a critical point on the -curve and is relatively independent of the channel mobility and S/D series resistance. The (electrical) effective channel length ( ), which also appears in the short-channel expression, is modeled with a simple conceptual model for the bias-independent metallurgical channel length ( ) with CD correction and lateral lightly doped drain (LDD) diffusion. Mobility due to the vertical field ( ) is then modeled semi-empirically [10], followed by a separate, semi-empirical, gate-bias-dependent S/D series resistance ( ) modeling [11]. Next, the turn-on ("first-order") is formulated by the conventional approach with a two-region velocity-field model and a newly derived saturation voltage ( ), followed by channel-length modulation (CLM) modeling [12] described by an effective Early voltage ( ). Subthreshold current ("second-order") is then formulated by a modified effective gate overdrive ( ) for the correct diffusion current and a novel transformation to retain the compact form. Edge-leakage current ( ) in STI structures [13] ("third-order"), which "borrows" the same MOS model, is finally added to the main MOS current ( ), together with diode-leakage current ( ).
A. Threshold Voltage and Effective Channel Length
The complete model has been presented [9] . In this subsection, we present an enhanced version with a different, more accurate, DIBL modeling.
The ideal threshold-voltage equation is given by
where symbols have their usual meanings. The basic idea in [9] is first to transform the uniform substrate doping to an effective vertical nonuniform doping [14] , which also extends to short-channel devices; and then, transform to an effective lateral nonuniform doping [9] . To model the charge-sharing effect including the effect of [15] , [16] , the average source and drain depletion width ( and ) is modeled, with two fitting parameters, (major) and (fine-tune)
where is the (short-channel) surface potential at strong inversion. (for DIBL) is approximated by a linear function of (2b) Replacing in (1) by the effective body factor in (2a), remains the same form as in the ideal equation (1a)
For , (2a) becomes the one presented in [9] . From the quasi-2D model [15] , the surface potential (at strong inversion) in short-channel devices is lowered by from the long-channel one ( )
At high , the channel surface potential becomes asymmetric and the minimum potential is no longer at [15] . It can be shown [16] that
In (4b), a fitting parameter (for DIBL) is added, which is approximated by a linear function of :
When and in (4c) is small, (4b) reduces to that presented in [9] .
For completeness, the empirical RSCE model in [9] is shown below, which replaces in the previous equations
The effective channel length is a critical parameter that influences the electrical behavior of a compact terminal model. For DSM devices, the conventional approach to extracting starts to become invalid because of the nonscaling behavior of the total linear resistance [17], and the partition between the intrinsic channel resistance and the S/D series resistance becomes strongly definition dependent. In our CM, we use a very simple model [18] for the actual poly-gate length with a constant CD where is the LDD junction depth.
is then assumed to be , with two physical parameters, and
and a new method of extraction together with (see Section III). This bias-independent model makes subsequent modeling and parameter extraction much simpler. The bias-dependent two-dimensional (2-D) SCE is modeled by based on the new "critical-current at linear-threshold" definition [19] as well as separate modeling of [11] (see Section II-B).
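For orientation, the ideal long-channel threshold voltage that this subsection takes as its starting point can be sketched numerically. The snippet below is only a minimal sketch: it assumes the textbook form V_t = V_FB + phi_s + gamma*sqrt(phi_s - V_bs) with uniform substrate doping, which is not copied from (1); the function name, the physical constants, and the example doping, oxide-thickness, and flat-band values are all hypothetical.

```python
import math

# Physical constants (CGS-style units commonly used in device work).
Q = 1.602e-19                 # electron charge, C
EPS_SI = 11.7 * 8.854e-14     # silicon permittivity, F/cm
EPS_OX = 3.9 * 8.854e-14      # oxide permittivity, F/cm
V_THERMAL = 0.02585           # kT/q near 300 K, V
N_I = 1.45e10                 # intrinsic carrier density, cm^-3


def ideal_vt(n_ch, t_ox_cm, v_fb, v_bs=0.0):
    """Textbook long-channel threshold voltage (assumed form, not Eq. (1)).

    n_ch    : uniform channel doping, cm^-3
    t_ox_cm : gate-oxide thickness, cm
    v_fb    : flat-band voltage, V
    v_bs    : body-source bias, V (negative for reverse bias in nMOS)
    """
    phi_s = 2.0 * V_THERMAL * math.log(n_ch / N_I)      # strong-inversion surface potential
    c_ox = EPS_OX / t_ox_cm                             # oxide capacitance, F/cm^2
    gamma = math.sqrt(2.0 * Q * EPS_SI * n_ch) / c_ox   # body factor, V^0.5
    return v_fb + phi_s + gamma * math.sqrt(phi_s - v_bs)


if __name__ == "__main__":
    # Hypothetical example values, not taken from the measured wafer.
    for v_bs in (0.0, -1.0, -2.0):
        print(v_bs, ideal_vt(n_ch=3e17, t_ox_cm=5.5e-7, v_fb=-0.9, v_bs=v_bs))
```

In the effective-transformation approach described above, short-channel effects would enter by replacing quantities such as the body factor and surface potential with effective ones, while the overall form stays the same.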
B. Effective Mobility and Series Resistance
Our CM adopts a separate and physical modeling of the effective mobility [10] and series resistance [11] . In [10] , the vertical-field channel mobility is modeled semi-empirically, with a compact form to minimize correlation among the three fitting parameters, and
where the effective (vertical) field is given by the well-known expression [20] (7b)
Each fitting parameter has its own physical meaning related to doping or temperature [10] , and the model will be extended to short-channel devices.
Likewise, source and drain series resistance is modeled physically by a bias-independent (extrinsic) part and a -dependent (intrinsic) part as [11] (8) with two fitting parameters, and , to be fitted to the short-channel linear data. represents some effective resistivity in the LDD region with a spacer thickness of . A first-order and dependence is also built into the model. Since is modeled/extracted separately from , what is modeled is actually the voltage drop across the LDD region [11], leaving the correct intrinsic voltage drop across the MOSFET effective channel.
C. Turn-on Current and Effective Saturation Voltage
Like model formulation, our modeling also starts with the well-known long-channel equation. Assuming the two-region piecewise velocity-field relation, the drain current is given by [21] (9a) (9b)
where the subscript "0" denotes the condition for .
(9e)
is the bulk-charge factor [22, p. 128] in which is a fitting parameter. The effect of is modeled by the conceptual total resistance ( ) partitioned into the channel resistance ( ) and the S/D resistance ( )
where . Substituting (9a) into (10), with some algebra it can be shown that (10) 
where is given by (12b). This newly-derived saturation current should "join" the one in the linear region. Equating (13b) to in (11a), and extracting out of the equation, it can be shown that
This formulation (as well as result) is different from, and simpler than, the BSIM3v3 expression [1] , [27] .
To achieve a smooth transition from linear to saturation region, the smoothing function in BSIM3v3 [1] , [27] (13e) is used to replace in (9), where is chosen as a fixed parameter.
approaches when and when . Then, (9) becomes a unified one-region equation. Strictly speaking, when is not ignored and, hence, (14b) is not accurate since contains . This is the real case where , even for long-channel devices. However, as will be shown in Section III, our separate and extraction keeps the error involved minimal.
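The limiting behavior of the linear-to-saturation smoothing function can be checked with a short numerical sketch. The expression below is the commonly published BSIM3v3 effective drain-voltage form and is assumed to correspond to (13e); the function name and the default value of the fixed smoothing parameter `delta` are hypothetical.

```python
import math


def v_ds_eff(v_ds, v_dsat, delta=0.01):
    """Smooth effective drain voltage (BSIM3v3-style form, assumed here).

    Tends to v_ds when v_ds << v_dsat and to v_dsat when v_ds >> v_dsat.
    delta is a small fixed smoothing parameter in volts (hypothetical value).
    """
    x = v_dsat - v_ds - delta
    return v_dsat - 0.5 * (x + math.sqrt(x * x + 4.0 * delta * v_dsat))


if __name__ == "__main__":
    v_dsat = 1.0  # hypothetical saturation voltage, V
    for v_ds in (0.0, 0.1, 0.5, 1.0, 2.0, 3.0):
        print(v_ds, v_ds_eff(v_ds, v_dsat))
```

Because the function is smooth everywhere, substituting it for the drain voltage in the linear-region equation avoids a separate saturation-region expression, which is what makes the one-region form possible.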
D. Channel-Length Modulation and Effective Early Voltage
So far, (14b) has included the effects of vertical- and parallel-field mobility, bulk charge, velocity saturation, and series resistance, but no CLM. A new approach to modeling CLM including high-field effect based on the quasi-2D formulation [21] has been developed [12].
The piecewise velocity-field relation assumes that when the electric field , electron velocity saturates, . However, the quasi-2D solution [21] reveals that the electric field in the velocity saturation region (VSR) of length increases exponentially as . Since it is not practical to include "local" quantities in a CM, an "effective average field" is introduced [12] , defined as (15) We assume that the saturation field in (14a) (without CLM) is replaced by (with CLM based on the quasi-2D solution). The physics behind this assumption is to model the voltage drop across the VSR such that the voltage across the intrinsic channel can be modeled correctly, with length and bias dependencies (see [12, Figs. 9 and 10] ). With this replacement, it can be shown that (14a) becomes (16a) where is given by (14a), which takes the familiar form of the "pinch-off" model. The effect of CLM due to increased at decreasing is included in an effective Early voltage given by [12] (16b) in which is a fitting parameter. As (14a) changed to (16a) with the inclusion of CLM, (14b) is then changed to (16c) which will not affect the characteristics in the linear region.
E. Subthreshold Current and Effective Gate Overdrive
Accurate subthreshold modeling (second-order) is only meaningful after threshold voltage, mobility, series resistance, CLM, etc., have been modeled accurately from turn-on current (first-order). In order to obtain a unified equation from subthreshold to strong inversion, the smoothing function in BSIM3v3 [23] , [1] , [27] , for an "effective gate overdrive" ( ) is adopted to replace all -in the previous equations such that in (14a) becomes (17a) where is given by [23] ( 
as in (18a). By replacing in (18b) with , the following unified one-region expression (18g) leads to correct currents for both strong-inversion and subthreshold regions. The final complete model including the effect of CLM and is then given by (16c).
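The role of an effective gate overdrive can be illustrated with a simplified smoothing function. The sketch below is an assumption for illustration only: it keeps just the logarithmic "softplus" numerator of the BSIM3v3-style expression and omits the denominator term, so it is not the paper's (17a) or its modified form; the function name, the subthreshold slope factor `n`, and the thermal-voltage default are hypothetical.

```python
import math


def v_gt_eff(v_gt, n=1.5, v_thermal=0.02585):
    """Simplified smooth gate overdrive (illustrative, not Eq. (17a)).

    v_gt is the plain overdrive V_gs - V_t. The result tends to v_gt in
    strong inversion and decays exponentially with v_gt in subthreshold.
    """
    scale = 2.0 * n * v_thermal
    x = v_gt / scale
    if x > 40.0:
        # exp(x) would be huge and log1p(exp(x)) equals x to machine precision.
        return v_gt
    return scale * math.log1p(math.exp(x))


if __name__ == "__main__":
    for v_gt in (-0.4, -0.2, 0.0, 0.2, 1.0):
        print(v_gt, v_gt_eff(v_gt))
```

Replacing the overdrive by such a function in the strong-inversion equation gives a single expression that remains positive and continuous through threshold; obtaining the correct subthreshold diffusion current additionally requires the transformation described in this subsection.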
F. Edge-and Diode-Leakage Current
Since the experimental data of our STI wafer exhibit significant edge-leakage current, a novel approach to modeling and extracting is developed. We first rename our developed model as for the main MOS current. Knowing the fact that edge-leakage current in STI structures is due to the parallel parasitic MOSFET along the edge of the channel [13] , it is assumed that should have the same length and bias dependencies as the main MOSFET but with a different channel width ( ) and a scaled threshold voltage ( ), as modeled by the same complete model
where and are two fitting parameters to be extracted from the data when is most pronounced (Section III). Together with a simple model for the diode-leakage current (19b) where is a fitting parameter, the final compact drain current is given by (19c)
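The way the edge-leakage component reuses the main MOS model can be sketched as follows. This is a minimal sketch under stated assumptions: the threshold scaling is taken to be a simple multiplicative factor, the diode leakage is treated as a constant, and `toy_i_mos` is a hypothetical stand-in for the complete model, not the model of this paper. All names and numerical values are illustrative.

```python
import math


def toy_i_mos(v_gs, width, v_t, k=1e-4, n=1.5, v_thermal=0.02585):
    """Hypothetical stand-in for the main MOS current (arbitrary units)."""
    scale = 2.0 * n * v_thermal
    x = (v_gs - v_t) / scale
    overdrive = (v_gs - v_t) if x > 40.0 else scale * math.log1p(math.exp(x))
    return k * width * overdrive ** 2


def drain_current_components(v_gs, width, v_t, edge_width, vt_scale, i_diode):
    """Return (main, edge, diode) components of the drain current.

    The edge transistor reuses the same model with its own width and a
    scaled threshold voltage (multiplicative scaling is an assumption).
    """
    i_main = toy_i_mos(v_gs, width, v_t)
    i_edge = toy_i_mos(v_gs, edge_width, vt_scale * v_t)
    return i_main, i_edge, i_diode


def total_drain_current(v_gs, width, v_t, edge_width, vt_scale, i_diode):
    """Sum of the main, edge-leakage, and diode-leakage currents."""
    return sum(drain_current_components(v_gs, width, v_t, edge_width, vt_scale, i_diode))


if __name__ == "__main__":
    for v_gs in (0.0, 0.3, 0.6, 1.5):
        print(v_gs, drain_current_components(v_gs, 10.0, 0.5, 0.05, 0.4, 1e-13))
```

With a narrow edge width and a lower edge threshold, the edge component dominates deep in subthreshold and becomes negligible above threshold, which is the qualitative "hump" behavior associated with STI edge leakage.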
III. MODEL PARAMETER EXTRACTION
The philosophy behind our model parameter extraction is based on three principles: 1) minimum measurement data requirement; 2) separate fitting and physical parameters; and 3) one-iteration extraction. Process-dependent fitting parameters ("unknown") should be extracted at the average values of the process-variable physical parameters ("known" or estimated), and the former should be fixed in subsequent application of the model with the latter varied for statistical analysis of process fluctuations.
The unified model requires 11 steps to extract its 23 fitting parameters, which are detailed in this section. The idea behind our one-iteration extraction is to "calibrate" (or fit) the model at "extreme" (length and bias) conditions: the model already has the correct physics built in, but it needs to be fitted to the "particular process" at hand, and this needs to be done only at the "boundary" cases. Of course, the technology data must include a full range of gate lengths (down to the roll-off region) from the same wafer. In this paper, data are based on ten devices of µm ( µm), and the following "extreme" conditions are used: the longest gate µm, the shortest gate µm, the medium gate (with maximum ) µm, low V, high V, low , and high V. We had altogether 200 sets of measured -data.
There are four independent variables (inputs): , and . The process fitting parameters will be extracted with the assumed (measured or estimated) "average" values of the physical parameters, mainly , nm, and secondarily: , cm , nm, cm/s, and cm . The principal -sweep that is required is the linearcurve (at and ) for each device. From these curves, linear threshold voltage ( ) is extracted based on the maximumdefinition, and the corresponding critical drain current ( ) at is interpolated (i.e., "measured") for each device. The threshold voltage based on the " @ " definition [19] includes the effects of mobility and series resistance at [18] , which is a key to the success of the model. In principle, all other required data are "point measurements," i.e., one pair of ( , ) data, similar to constant-current method. After @ determination, the required measurement data and the sequence of parameter extraction are described as follows. 
A. Threshold Voltage and Effective Channel Length
Threshold voltage is the most sensitive parameter on an model, whose formulation and extraction are independent of mobility and series resistance based on our definition. Our model attempts to build in all major SCEs and RSCEs with its 11 fitting parameters: and ; together with the parameter, . The extraction follows a five-step procedure, as illustrated (with real data) in Fig. 1 and detailed below.
Step 1) At the longest gate : Biased at , measure @ for a few values of . Fit the long-channel to the data to extract the channel-doping parameter ( ) and workfunction ( ), which are then fixed. Values of all other parameters are irrelevant and set to zero.
Step 2) At each : Biased at and , measure @ . With a few trial values of (ranging, for example, from 0.5 to 1), fit to the data to extract the parameters for charge-sharing ( ), barrier-lowering ( ), and RSCE . Values of the DIBL parameters and are set to zero and to be fine-tuned in step 4. The equation has good properties in nonlinear regression for any practical data [9] .
Step 3) From the extracted data: For each value of as well as the extracted ( ) in step 2, fit to extract . The best parameter set ( ) is selected with minimum error in all values in the extraction of , and the extracted is then fixed in (5d) and (5e). Steps 2 and 3 are the only steps involving the concept of optimization, but it is done with a few trial values and one iteration. This novel extraction has been proven to be simple and efficient, and it gives a bias-independent that is supposed to be close to since it is extracted at (i.e., zero gate overdrive).
Step and, hence, the complete model.
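The first step of the sequence above, fitting the long-channel threshold voltage against body bias and then fixing the result, can be illustrated with a small regression. This is a sketch under simplifying assumptions: it uses the textbook form V_t = V_FB + phi_s + gamma*sqrt(phi_s - V_bs), treats phi_s as known so that the fit is linear, and extracts a body factor and flat-band voltage as stand-ins for the channel-doping and workfunction parameters. The function name and the synthetic data are hypothetical.

```python
import math


def fit_long_channel_vt(v_bs_list, v_t_list, phi_s):
    """Least-squares fit of V_t versus sqrt(phi_s - V_bs).

    Returns (gamma, v_fb) for the assumed textbook threshold equation,
    with the surface potential phi_s held fixed.
    """
    xs = [math.sqrt(phi_s - v_bs) for v_bs in v_bs_list]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(v_t_list) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, v_t_list))
    gamma = s_xy / s_xx                        # slope: body factor
    v_fb = mean_y - gamma * mean_x - phi_s     # intercept minus phi_s
    return gamma, v_fb


if __name__ == "__main__":
    # Synthetic "measurements" generated from known values (not wafer data).
    phi_s, gamma_true, v_fb_true = 0.87, 0.50, -0.90
    v_bs = [0.0, -0.5, -1.0, -2.0]
    v_t = [v_fb_true + phi_s + gamma_true * math.sqrt(phi_s - v) for v in v_bs]
    print(fit_long_channel_vt(v_bs, v_t, phi_s))
```

Once extracted, such long-channel values would be held fixed while the short-channel parameters are fitted in the later steps, which is the essence of the one-iteration sequence.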
B. Mobility
Vertical-field mobility should be determined at long channel in linear mode after characterization. This is shown in Fig. 2 by the solid circles.
Step 6) At the longest gate : Use the measured (@ ) data (for ). Fit to the data to extract the mobility parameters ( ), which are then fixed. At long channel and low , bulk charge, series resistance, and CLM are negligible; hence, the long-channel (with ) can be used. This step has the most parameter dependency among all the steps. The formulation as a ratio (7a) helps to reduce correlation in nonlinear regression [10].
C. Bulk Charge
With the fixed from step 6, bulk-charge effect is then characterized from long-channel device at high , where the effect is most pronounced, as shown in Fig. 2 by the solid squares.
Step 7) At the longest gate : Biased at and , measure (for ). Fit to the data to extract the bulk-charge parameter ( ), which is then fixed.
D. Series Resistance
Once the mobility ( ) and bulk-charge factor ( ) are characterized, they are extended to short-channel devices. Series resistance ( ) is then extracted from short channel in linear mode, as shown in Fig. 2 by the solid triangles.
Step 8) At the shortest gate : Use the measured (@ ) data (for ). Fit ) to the data to extract the series resistance parameters ( ), which are then fixed. In linear mode, CLM is unimportant, so . 
E. Channel-Length Modulation
CLM is next characterized from saturation currents for all gate-length devices, which is shown in the inset of Fig. 3 .
Step 9) At each : Biased at and , measure . Fit to the data to extract the CLM parameter for the effective Early voltage, which is then fixed.
F. Subthreshold Current
After characterizing the turn-on current, subthreshold current is then extracted, which is second order compared to the turn-on current and, hence, it will not influence what has been characterized. This is shown in Fig. 3 by the solid circles.
Step 10) At the condition when is minimum (i.e., and ): measure . Fit to the data to extract the subthreshold-current parameter , which is then fixed. At this condition, edge- and diode-leakage currents are negligible (third order); thus, and can be set to zero.
G. Edge-and Diode-Leakage Current
After the complete MOS current has been characterized, edge-leakage current in STI structures is extracted at the condition when it is most pronounced, i.e., when the main MOS is much smaller than . The real data for this excellent example of extraction (i.e., extract which is embedded in ) is shown in Fig. 3 by the solid triangles.
Step 11) At the condition when is maximum (i.e., , and ): measure . One simple nonlinear regression is used to fit to the data to extract the edge-leakage and diode-leakage parameters. In summary, to extract the 23 parameters used in the model, assuming devices of varying gate lengths on the same wafer, only -sweeps plus point ( , ) measurements for the various s and are needed. This compares favorably with BSIM3v3 [1] , [27] , which requires a minimum of 18 -sweeps (three devices, each at six different bias conditions).
IV. EMPIRICAL PROCESS CORRELATION
One major objective and application of the developed model is empirical process correlation, which has been demonstrated previously [9], [24]. In Section III, our model is extracted based on data from wafer #15, which has a -implant dose of cm . We have measurement data from a split-lot with only varied as 0, 1, and cm for wafers #17, #18, and #19, respectively. There are 17 sites on each wafer from which E-test data have been measured. In this section, we present the prediction of our model, with as input, of threshold voltage, on-state saturation current ( ), and off-state leakage current ( ), compared to the averaged values from those 17 sites. The original measured data have been presented in [9], [24]. By extracting the model parameter, , of wafers #17 and #19 from their long-channel (10-µm) data, a one-to-one correlation of to has been found [9]: cm (20) where is in cm , and it is plotted in the inset of Fig. 4. Without using any other measurement data from wafers #17 and #19 (and none from wafer #18), excellent predictions of our model (with and as inputs) for the (average) measured and are shown in Figs. 4 and 5, respectively. For prediction, the increase in due to roll-off at decreasing should be equally well modeled by the model correlated to by (20). However, since our diode-leakage modeling is too simple, long-channel data from the four wafers are taken to formulate an empirical relation pA m (21) where is in 10 cm , which is shown in the inset of Fig. 6. Using (20) and (21) with as input, our model predicts the measured quite well, as shown in Fig. 6. The excellent prediction of our model with a very simple process correlation is the result of the correct physics that has been built into the model. This approach, combined with a carefully designed wafer split, can be very efficient and useful in reducing experimental wafer split-lots.
V. RESULTS AND DISCUSSION
Besides the unique approach to CM formulation through effective transformation, we have proposed three new models: CLM (with ), subthreshold modeling (with ), and edge-leakage prediction. These results are presented in this section. Other sample results based on the same model have been presented in [24]. Fig. 7 shows the modeled curves with two values of (crosses) and (solid lines) for the 0.2-µm device compared to the measurement data (symbols). When (no CLM), the small finite drain conductance in saturation is due to inclusion of the bulk-charge current. It can be seen that the -dependent model (see [12, Fig. 5]) can predict CLM very well. Fig. 8 compares the new model (18d) before extraction (dotted lines), which is fitted to the (@ ) data (triangles), with the BSIM expression (17a) (crosses), which is fitted to the (@ ) data (circles), with separate extraction of (after the same turn-on current extraction). For the BSIM expression, a thermal voltage has to be added to [1], [27] in order to get the correct slope, but the extracted V fails to predict the data well at high , as it is only valid at low [1], [27]. Our new model, however, can easily fit the data with the correct slope at high (in Step 10) when edge leakage is negligible, and further extract at high (in Step 11), which has little influence on the extracted subthreshold slope at low , as shown by the solid lines in Fig. 8.
The best proof of validity and accuracy of our model is the excellent prediction of the edge-leakage currents at various gate lengths and biases. A sample result is shown in Fig. 9 (solid lines) in which none of the data (symbols) has been used in parameter extraction. Also shown (crosses) as a comparison are the BSIM3v3 results whose parameters are extracted automatically by BSIMPro using all (200 sets) of the available -data. Excellent predictions of the -dependent subthreshold current ("hump") at fixed gate length have also been obtained, one of which was reported in [24] , which further validates our unified model and the approach to STI current modeling. This is believed to be the first one-region CM for STI edge-leakage current, which has only two fitting parameters and one extraction.
Due to length limitation, we show in Fig. 10 one sample result of the versus curves for two devices in linear and saturation regions, which demonstrates the smoothness of our one-region model. So far, our model does not include many effects, such as narrow channel, substrate and gate leakage, poly depletion, quantum effects, etc., which can and will be formulated following the same approach.
With the separation of process and physical parameters, device performance fluctuations due to statistical process variations can be studied following the approach in [25] . For example, after process-dependent fitting parameters have been extracted and fixed, , and fluctuations can be related to variations in , etc., which, in turn, are due to process variations. This further research will be carried out as a novel application of the developed model.
The model has been formulated and demonstrated with nMOSFETs. Following the described extraction, the model has also been applied directly to the pMOSFETs on the same wafer (#15), with equally good accuracy. The model has been automated and can be applied to automatic wafer test systems as a quick and reliable aid to technology developers, at least in the 0.25-µm regime. This unified compact model has been named Xsim, and it will be implemented in the mixed-mode circuit simulator (Xsim) [26].
VI. CONCLUSION
In conclusion, a unified one-region equation for DSM MOSFETs has been developed and verified through physics-based effective transformation. The novelty lies in the philosophy of one-iteration parameter extraction, which follows a prioritized sequence for extracting the parameters being modeled at the condition when their effect is most pronounced, with process-dependent parameters fitted to the measured terminal data with assumed average physical parameters. The simple form of the formulated equations is a result of building SCEs into long-channel models, resulting in a true single-piece model for all gate lengths (no binning). Development of technology-dependent CMs for circuit-level simulation becomes one of the grand challenges for the 0.1-µm technology node. The demonstrated approach to model formulation and process correlation will prove to be extremely useful for DSM technology modeling and process monitoring, as well as in bridging technology developers and circuit designers.
