A nonlinear electro-thermal scalable model for high-power RF LDMOS transistors by Wood, J et al.
A Nonlinear Electro-Thermal Scalable Model for
High Power RF LDMOS Transistors
John Wood, Fellow, IEEE, Peter H. Aaen, Member, IEEE, Daren Bridges, Member, IEEE,
Dan Lamey, Member, IEEE, Michael Guyonnet, Member, IEEE, Daniel S. Chan, Member, IEEE, and
Nelsy Monsauret
Abstract— A new nonlinear, charge-conservative, scalable, dy-
namic electro-thermal compact model for LDMOS RF power
transistors is described in this paper. The transistor is character-
ized using pulsed I-V and S-parameter measurements, to ensure
isothermal conditions. A new extrinsic network and extrinsic
parameter extraction methodology is developed for high power
RF LDMOS transistor modeling, using manifold de-embedding
by electromagnetic simulation, and optimization of the extrinsic
network parameter values over a broad frequency range. The
intrinsic model comprises controlled charge and current sources
that have been implemented using artificial neural networks
(ANNs), designed to permit accurate extrapolation of the tran-
sistor’s performance outside of the measured data domain. A
thermal sub-circuit is coupled to the nonlinear model. Large-
signal validation of this new model shows a very good agreement
with measurements at 2.14 GHz.
Index Terms— field effect transistor, LDMOS, nonlinear, tran-
sistor model
I. INTRODUCTION
LATERALLY-DIFFUSED MOS (LDMOS) FETs are used almost exclusively for high power transistors for wireless
infrastructure or base-station applications. They provide an
unmatchable combination of performance and cost, and are
capable of delivering hundreds of watts of RF power.
This combination of high powers and high frequencies in
RF and microwave power amplifiers brings together a unique
set of challenges for the device modeling engineer. The power
transistors themselves are physically large, and may be a
significant fraction of a wavelength wide, even at microwave
frequencies. The electrical behavior in this distributed en-
vironment must be captured in the model. The device will
generate a lot of heat, and the thermal effects on the transistor’s
electrical behavior will also need to be characterized and
included in the model. Further, power amplifiers for wireless
communications systems are tightly specified in terms of their
linearity performance, bandwidth, etc., placing a premium on
the availability of accurate nonlinear transistor models.
In this paper we shall outline a new nonlinear scalable
model for the LDMOS transistor, and its extraction and imple-
mentation. The transistor model architecture is shown in Fig. 1.
The intrinsic part of the model, shown in the center of the
figure, describes the nonlinear behavior of the transistor, using
J. Wood, P. H. Aaen, D. Bridges, D. Lamey, M. Guyonnet and D. Chan
are with the RF Division, Freescale Semiconductor Inc., Tempe, AZ 85284
(e-mail: John.Wood@freescale.com)
N. Monsauret is with the RF Division, Freescale Semiconductor Inc.,
Toulouse France.
Fig. 1. Block representation of the transistor model architecture.
voltage-controlled current and charge sources: the model state
functions or constitutive relations. This basic structure is no
different in principle to those nonlinear models presented by
[1]–[3]. A charge-conservative approach is adopted, as this is
crucial to the accurate prediction of the low-level nonlinearities
such as those described by intermodulation products, adjacent
channel power, and so forth [4]. Included in this model are
accurate, continuously-differentiable functions for the model
currents and charges, and a self-consistent electro-thermal
model coupled to these controlled sources.
These model functions are designed to compute, rapidly
and accurately, values of the constitutive relations anywhere
within the measured characterization plane. It is also necessary
for these functions to be able to handle evaluations of the
state functions outside of the measured data range; in other
words, these functions must have predictable and controlled
extrapolation properties. The reason for this is that in circuit
simulation, as the simulation iterates towards the solution,
the harmonic balance simulator (for example) may at some
point require a solution for input voltages that lie outside the
measurement space. Poor extrapolation behaviour can lead to
non-convergence even if the final solution exists within the
characterized region [5]. Here we present also a new formula-
tion for approximating the state functions for a measurement-
based model that permits well behaved and accurate extrap-
olation. The methodology is extended to include the thermal
dependency of the state-functions.
The extrinsic network describes the electrical behaviour
of the transistor layout; it accounts for the metallizations,
substrate, and semiconductor that connect the intrinsic model
to the outside world. How the electrical behaviour of this
network changes with device size determines how the extrinsic
Fig. 2. Discrete LDMOS transistor used for device modeling; this structure
is GSG probe-able, and the reference planes for the manifold are shown.
network component values scale. Scaling is a significant con-
sideration: the RF power transistor may have a total gate width
of over 100 mm: this is impossible to model in a traditional
small-signal S-parameter environment – the RF powers are
too high, and the device impedances are too small to permit
measurements of sufficient accuracy for model extraction, if
indeed the device is stable in a 50 Ω environment. Instead, we
measure and extract a model for a smaller device structure,
and scale this model to the appropriate size.
II. TRANSISTOR CHARACTERIZATION
The transistor characterization is carried out on-wafer. The
test transistor has the same basic structure as the production
device, but is much smaller: it has fewer gate fingers. The
RF calibration to the Ground-Signal-Ground (GSG) probe
tips is carried out using an impedance standard substrate.
The current-voltage and S-parameter measurements are all
made under pulsed conditions. A low pulse duty cycle is
used to ensure that the data are captured isothermally. Using
pulsed conditions also enables us to measure beyond the DC
dissipation limit of the device, into realistic instantaneous
voltage and current regimes. A dense pattern of the pulsed
I-V and S-parameter measurements is made over the gate-
drain voltage space of the transistor, bounded by the maximum
drain current, breakdown voltage, and the maximum allowable
power dissipation.
III. MANIFOLD MODELING & DE-EMBEDDING
Figure 2 shows the metallization in a typical test struc-
ture used for transistor model development. The problem is
segmented into two areas of analysis, viz the GSG launches
and the manifolds. The goal here is to remove the effects
of the launch and manifolds, leaving only the response of
the transistor including extrinsic elements. The electrical
behaviour of the GSG launches is determined from measure-
ments of transmission line structures, with care taken over
the placement of the calibration reference plane. The GSG
launches can then be modeled by error boxes placed at the
Fig. 3. Manifold test structure for electromagnetic simulation showing shunt
capacitor loads.
Fig. 4. Measured versus EM-modeled transmission characteristics (magnitude of S21 in dB and phase of S21 in degrees, versus frequency from 0 to 20 GHz) for the
structure shown in Fig. 3, indicating excellent broadband agreement.
measurement reference planes. The effects of the manifolds
can then be determined, and the GSG measurements are then
de-embedded to the reference plane shown in Fig. 2.
The behavior of the manifolds in turn is obtained through
electromagnetic simulation, using Sonnet’s em™. An accurate
substrate definition is key to the manifold extraction. The
substrate definition is found from careful analysis of measured
S-parameters for transmission lines of various lengths and
widths. By converting these S-parameters to the telegrapher’s
parameters for each line, the properties of each layer in the
substrate and metal stack can be determined, and the appro-
priate metal model chosen. This is particularly challenging in
LDMOS, because the silicon substrate is quite lossy.
The substrate definition is then used in the simulation of
S-parameters for each of the manifold structures. For the
results presented here, the wafer used for transistor modeling
also included structures for determination of the simulation
substrate definition.
To validate the procedure, measured S-parameters of a
loaded manifold were compared to a circuit simulator result for
simulated manifolds plus measured loads. Large manifold 2-
port GSG structures were built, both with capacitive loads and
unloaded. A typical structure is illustrated in Fig. 3. The loads
were shunt capacitors placed periodically along the manifold
width, with capacitance matching the effective capacitance of
a typical LDMOS device under bias. S-parameters of a single
such capacitor were also measured directly. Electromagnetic
simulations of the manifolds were combined with the mea-
sured load S-parameters in a circuit simulator.
The two-port measurements of the loaded manifold struc-
ture, illustrated in Fig. 3, are in good agreement with simula-
tions. The transmission through the loaded manifold structure
is plotted against measured results in Fig. 4. Although not
plotted, the remaining S-parameters are in similar agreement.
IV. EXTRINSIC NETWORK PARAMETER EXTRACTION
The traditional extrinsic network for a MESFET comprises
a series resistor-inductor network connected to each of the gate,
drain, and source nodes of the intrinsic model (the Z-shell),
followed by shunt capacitors from gate and drain to ground,
and between gate and drain (the Y-shell) [6].
For LDMOS transistors we have augmented this simple
network with a resistor in parallel with the series gate & drain
inductance, to account for the frequency dependent behavior of
the silicon substrate material in an empirical manner. Adding
these shunt resistors enables an improved broadband fit of
the return losses. Further, we have observed coupling effects
between the gate and drain metallizations in these power
transistors. The unit gate widths in these power transistors are
quite significant, over 500 µm, and there is measurable mutual
inductance between these long metal traces.
We use the Cold-FET method [6]–[8] as the basis for
our extrinsic parameter extraction, using optimization over a
broad frequency range. Our methodology is outlined below,
and described in detail in [9]. Broadband S-parameters were
measured under bias conditions of zero volts on gate and
drain terminals of the transistor: this ensures the LDMOS
transistor is biased below threshold, and is in the passive
or ‘Cold’ condition. The manifolds were then de-embedded
from these data by using their S-parameter blocks determined
as described in Section III. The Cold-FET circuit for the
intrinsic transistor is assumed to be a purely capacitive
network. The extrinsic network and Cold-FET circuit are
illustrated in schematic form in Fig. 5.
The extrinsic network parameter values were extracted
from this de-embedded data using the following sequence
of actions: first, the extrinsic network components were
de-embedded to obtain intrinsic S-parameters; next, the
Cold-FET intrinsic parameter values were calculated from
these intrinsic S-parameters; then this intrinsic model was re-
embedded with the extrinsic network; and finally these model
S-, Y-, and Z-parameters were compared to the measured
values (after de-embedding of the manifolds). The extrinsic
network parameters were adjusted until the measured and
model data were suitably close, over the broadband frequency
Fig. 5. The new extrinsic network with the Cold-FET intrinsic circuit also
shown, for a transistor with a total gate periphery of 4.8 mm.
Fig. 6. Comparison between measured and modeled manifold-de-embedded
S-parameters (magnitude in dB and phase in degrees of S11 and S22, versus frequency from 0 to 10 GHz).
range. Also, we ensured that the resulting small-signal
intrinsic component parameters were physically realistic,
for example, the model resistances were non-negative. This
procedure was performed in Agilent-EEsof ADS™, using the
Sequencer component and the Tuning simulator. As a starting
point, either nominal or zero values for the extrinsic network
components can be used with equal success, indicating the
robustness of the technique. The broadband S-parameters for
the small-signal model and extrinsic network are compared
to the measured and de-embedded S-parameters in Fig. 6,
showing excellent agreement.
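The de-embedding and Cold-FET steps of the sequence described above can be illustrated with a minimal sketch. It assumes the traditional Z-shell/Y-shell topology only; the parallel resistors and mutual inductance of the augmented network are not reproduced, and all function and parameter names are hypothetical. The optimizer loop that adjusts the extrinsic values is also omitted.

```python
import numpy as np

def z_shell(freq_hz, rg, lg, rd, ld, rs, ls):
    """Series R-L branches at gate, drain and source as a 2-port Z-matrix."""
    w = 2 * np.pi * freq_hz
    zg, zd, zs = rg + 1j * w * lg, rd + 1j * w * ld, rs + 1j * w * ls
    return np.array([[zg + zs, zs], [zs, zd + zs]])

def y_shell(freq_hz, cpg, cpd, cpgd):
    """Shunt pad capacitances (gate-ground, drain-ground, gate-drain) as a Y-matrix."""
    w = 2 * np.pi * freq_hz
    return 1j * w * np.array([[cpg + cpgd, -cpgd], [-cpgd, cpd + cpgd]])

def embed(y_int, zsh, ysh):
    """Wrap intrinsic Y-parameters with the Z-shell, then the outer Y-shell."""
    return np.linalg.inv(np.linalg.inv(y_int) + zsh) + ysh

def deembed(y_meas, zsh, ysh):
    """Inverse of embed(): strip the Y-shell, then the Z-shell."""
    return np.linalg.inv(np.linalg.inv(y_meas - ysh) - zsh)

def cold_fet_caps(y_int, freq_hz):
    """Capacitances of a purely capacitive pi-network from intrinsic Y-parameters."""
    w = 2 * np.pi * freq_hz
    cgd = -y_int[0, 1].imag / w
    cgs = y_int[0, 0].imag / w - cgd
    cds = y_int[1, 1].imag / w - cgd
    return cgs, cgd, cds
```

In an extraction loop, `deembed` and `cold_fet_caps` would be evaluated at each frequency, the result re-embedded with `embed`, and the extrinsic values adjusted until the re-embedded and measured parameters agree.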
V. NONLINEAR INTRINSIC MODEL
The manifold structure and the extrinsic network are de-
embedded to obtain S-parameter data at the intrinsic model
reference planes [9]. After converting to Y-parameters, the
LDMOS transistor model current and charge state functions
can then be obtained by integration of the small-signal voltage-
dependent parameters, using the following equations [3].
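$$ I_d(V_{gs}, V_{ds}) = \int_{V_{gs0}}^{V_{gs}} g_m(v_{gs}, V_{ds0})\,dv_{gs} + \int_{V_{ds0}}^{V_{ds}} g_{ds}(V_{gs}, v_{ds})\,dv_{ds} + I_d(V_{gs0}, V_{ds0}) \tag{1} $$

$$ Q_g(V_{gs}, V_{ds}) = \int_{V_{gs0}}^{V_{gs}} \left[ C_{gs}(v_{gs}, V_{ds0}) + C_{gd}(v_{gs}, V_{ds0}) \right] dv_{gs} - \int_{V_{ds0}}^{V_{ds}} C_{gd}(V_{gs}, v_{ds})\,dv_{ds} + Q_g(V_{gs0}, V_{ds0}) \tag{2} $$

$$ Q_d(V_{gs}, V_{ds}) = \int_{V_{gs0}}^{V_{gs}} \left[ C_m(v_{gs}, V_{ds0}) - C_{gd}(v_{gs}, V_{ds0}) \right] dv_{gs} + \int_{V_{ds0}}^{V_{ds}} \left[ C_{ds}(V_{gs}, v_{ds}) + C_{gd}(V_{gs}, v_{ds}) \right] dv_{ds} + Q_d(V_{gs0}, V_{ds0}) \tag{3} $$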
The integrations are carried out starting from the quiescent
bias point, denoted by (Vgs0, Vds0). Typically for LDMOS
power transistors this corresponds to Class AB quiescent bias
conditions: Vdd = 28 V, Id = 6 mA/mm, corresponding to
Vgs ≈ 2.7 V. Also for LDMOS, the gate current source is set
to zero.
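The two-leg integration path of (1) can be sketched numerically as follows. This is an illustration only, assuming that gm and gds are available on a rectangular (Vgs, Vds) grid and using trapezoidal integration; the function names and the grid layout are hypothetical, not those of the extraction tools used here.

```python
import numpy as np

def cumtrapz(y, x, axis):
    """Cumulative trapezoidal integral of y along `axis`, starting at zero."""
    y = np.moveaxis(np.asarray(y, dtype=float), axis, 0)
    dx = np.diff(x).reshape((-1,) + (1,) * (y.ndim - 1))
    steps = 0.5 * (y[1:] + y[:-1]) * dx
    out = np.concatenate([np.zeros_like(y[:1]), np.cumsum(steps, axis=0)], axis=0)
    return np.moveaxis(out, 0, axis)

def integrate_drain_current(vgs, vds, gm, gds, i0, j0, id0):
    """Evaluate (1) on a rectangular bias grid.

    gm, gds : arrays of shape (len(vgs), len(vds)).
    (i0, j0): grid indices of the quiescent point (Vgs0, Vds0).
    id0     : drain current at the quiescent point.
    """
    # First leg: integrate gm along Vgs at fixed Vds = Vds0.
    leg1 = cumtrapz(gm[:, j0], vgs, axis=0)
    leg1 = leg1 - leg1[i0]              # zero at Vgs0
    # Second leg: integrate gds along Vds at each fixed Vgs.
    leg2 = cumtrapz(gds, vds, axis=1)
    leg2 = leg2 - leg2[:, [j0]]         # zero at Vds0
    return id0 + leg1[:, None] + leg2
```

The same path structure applies to the charge integrals (2) and (3), with the capacitance combinations in place of gm and gds.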
A. Drain Current Model
The drain current expression given by (1) is a ‘high fre-
quency’ drain current, and using this for the model at RF
overcomes some of the dispersion issues that are present
in some FET technologies, such as trap-related dispersion,
for example. Even though LDMOS does not suffer from
significant trap-induced dispersion, we use the high frequency
drain current to derive the model so that the model generation
process is a generic one that can be applied to different FET
technologies.
In our nonlinear model implementation, we can choose
from a number of possible drain current functions, all of
which are analytical expressions that result in a smooth and
differentiable relationship, enabling accurate prediction of high
order nonlinearities, and better interpolation, or, more strictly,
function approximation, between the measured data points.
One example of a drain current function approximation
that we can use is the analytical expression due to Fager
et al. [10] to fit the high frequency drain current data [11].
This expression has an improved fit, compared with earlier
FET models, to the drain current data in the near-threshold
region, which is typically where the LDMOS power transistor
is biased for power amplifier applications.
The focus of this work is the use of artificial neural
networks (ANNs) to approximate the drain current data. This
approach shows similar levels of accuracy as methods of
function approximation using elementary functions with fitting
parameters [3]. The function approximation using ANNs is
described in more detail in Section VI-A, in the context of
controlled extrapolation. In addition to this ANN model, which
is used to fit the data in the active on-state and threshold
regions, we add functions to mimic the drain current behavior
in the on- and off-state breakdown regions of the FET. These
breakdown currents are modeled using diode-like expressions,
as used in the Motorola Electro-Thermal (MET) model [12].
It is essential to include these effects in the model for large-
signal applications.
B. Charge Model
One of the main objectives of this new nonlinear model
development was to improve the charge model description
over that of the existing MET model. The integral formulation
for determining the charges described by eqns. (2) and (3)
ensures a conservative charge formulation, which is essential
for accurate prediction of low-level phase nonlinearity, and for
convergence in time-domain simulations [3], [13]. A charge-
conservative model formulation, approximated by a smooth
and differentiable function, and capable of including tempera-
ture dependence, was seen as an essential goal for this model.
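As a brief aside, the conservation requirement can be stated as an integrability condition. If single-valued charge functions exist whose partial derivatives are the measured capacitance combinations appearing in (2) and (3), then equality of the mixed second derivatives requires

```latex
\frac{\partial Q_g}{\partial V_{gs}} = C_{gs} + C_{gd}, \qquad
\frac{\partial Q_g}{\partial V_{ds}} = -C_{gd}
\quad\Longrightarrow\quad
\frac{\partial \left( C_{gs} + C_{gd} \right)}{\partial V_{ds}}
  = -\frac{\partial C_{gd}}{\partial V_{gs}},
\\[6pt]
\frac{\partial Q_d}{\partial V_{gs}} = C_m - C_{gd}, \qquad
\frac{\partial Q_d}{\partial V_{ds}} = C_{ds} + C_{gd}
\quad\Longrightarrow\quad
\frac{\partial \left( C_m - C_{gd} \right)}{\partial V_{ds}}
  = \frac{\partial \left( C_{ds} + C_{gd} \right)}{\partial V_{gs}}.
```

When these conditions hold, the values returned by (2) and (3) do not depend on the integration path chosen in the bias plane. Measured data satisfy them only approximately, which is one reason for fitting a single smooth charge function to the extracted tables.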
The conservative gate and drain charges, Qg and Qd of (2)
and (3) were obtained through a Root model extraction from
the measured intrinsic data, using the commercial Agilent-
EEsof IC-CAP™ function. In our model extraction, this pro-
duces tables of the charge data indexed by the intrinsic gate
and drain voltages. These data are then approximated using
artificial neural networks (ANNs), resulting in smooth and
infinitely differentiable two-dimensional charge functions that
are used directly in the model in Fig. 1. It is also possible
to obtain the neural network charge functions by fitting an
adjoint neural net to the small-signal capacitances, avoiding
the direct integration step in (2) and (3) [14].
VI. EXTRAPOLATION CONSIDERATIONS FOR
LARGE-SIGNAL DEVICE MODELING
Extrapolation is a major concern for high power RF transis-
tor modeling. The high current densities associated with power
transistors, and the limited power and current capabilities of
the measurement instrumentation often mean that the region
of the device output characteristics that can be accessed and
measured is much smaller than the region that is covered in RF
operation of the transistor. This is illustrated in Fig. 7, which
shows a dynamic load-line under RF drive superimposed on
the measured DC I-V characteristics. While the dynamic load-
line includes displacement currents, it can be seen from this
illustration that the loadline trajectory can have a significant
excursion into the unmeasured region. Hence, the extrapolation
Fig. 7. A load-line for a transistor operating under mismatched conditions
superimposed upon the drain current (under pulsed operation). Axes: drain current Ids (A) versus drain-to-source voltage Vds (V); the dynamic load-line and the characterization limit are marked.
characteristics of the model state functions need to be well
behaved and accurate.
Extrapolation is the ability of the function approximation
to return a value for input data that is outside the original
training or characterization region. Generally, this statement
is taken to mean that the extrapolated value returned by the
function approximation is sufficiently close to the actual or
true value of the data outside the training region. While this is
one requirement of successful extrapolation, there is a further
consideration for the simulation of large-signal models. The
circuit simulator uses an optimization algorithm to determine
the circuit voltages and currents, and for successful conver-
gence there must be no local minima in the solution domain.
The model nonlinear functions must therefore be extrapolated
in such a way as to avoid these non-physical local minima:
the extrapolation of the function outside the characterization
data set should be able to redirect the simulator into the
characterization domain.
In Fig. 8 we plot the measured characterization data, region
I, and the regions of extrapolation corresponding to accurate
function approximation, denoted by region II, and to redi-
rection of the function, denoted by region III. In region I,
the characterized region, the model used must approximate
the measured data accurately, to within some acceptable level
of error. In region II, accurate extrapolation is required since
the loadline of the transistor may pass through this region. In
region III, the breakdown region, provided that the form of the
breakdown currents is correctly modeled, the main emphasis
is to ensure robust convergence of the simulator; fidelity to
the actual drain current is less important.
A. Artificial Neural Networks for Function Approximation and
Extrapolation
The universal approximation theorem was a key factor in
establishing ANNs as a useful approximation method [15].
This theorem states that a network with a single hidden layer
of neurons is capable of approximating any given function,
with any degree of accuracy, provided that there are sufficient
neurons, and that their activation functions are non-constant,
Fig. 8. An illustration of the various regions of the drain current (Id in A versus Vds in V). The
measured characterization data is indicated by region I, while regions II & III
represent the extrapolation and breakdown regions.
bounded and monotonically increasing [16], [17]. A single-
hidden-layer neural network function is expressed as
$$ y = \sum_{i=1}^{n} a_i\, f\!\left( \sum_{j=1}^{m} w_{ji}\, x_j + b_i \right) + b_o \tag{4} $$
where y is the data that is to be approximated, x is the input
data, a, b, w are the adjustable parameters of the function, n
is the number of neurons in the single hidden layer, and m is
the number of inputs – in our case this would be two: the gate
and drain voltage values. The nonlinear activation function f
of the neuron is typically a hyperbolic tangent function, and
this is used in our model.
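For concreteness, (4) with a hyperbolic tangent activation can be evaluated in a few lines. This is a minimal sketch; the array layout (with `w[j, i]` holding w_ji) and the function name are assumptions made for illustration.

```python
import numpy as np

def single_hidden_layer(x, w, b, a, b_o):
    """Evaluate (4) for one input vector.

    x   : inputs, shape (m,), e.g. normalized gate and drain voltages.
    w   : weights, shape (m, n), with w[j, i] = w_ji.
    b   : hidden-layer biases, shape (n,).
    a   : output weights, shape (n,).
    b_o : output bias (scalar).
    """
    hidden = np.tanh(x @ w + b)      # f(sum_j w_ji x_j + b_i) for each neuron i
    return float(a @ hidden + b_o)   # weighted sum over neurons plus output bias
```

Because tanh is smooth, the output is infinitely differentiable with respect to the inputs, which is the property exploited for the model state functions.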
A known problem with a single hidden layer neural network
is that the neurons tend to interact with each other globally:
modifying the weights to make the approximation better at
one point in the data will degrade it at another: the ‘water-
bed’ effect. This global interaction can be overcome through
the use of two hidden layers. Neurons in the first layer tend
to partition the space into sub-domains and approximate the
function locally. The second layer fits more global features.
That is, the outputs of the first layer tend to operate only
on localized areas of the function range and output zero
elsewhere. The second layer combines these localized regions
[16], [18].
We use a two-hidden-layer neural network for good approx-
imation and extrapolation of the state function data. Further,
a number of other techniques are used in the training of
the neural network to improve the generality of the function
approximation. One technique is that of regularization, where
the performance function of the approximation is modified
from a simple mean square error estimation of the function
approximation. Often, this performance function is modified
to include the mean of the sum of the squares of the weights
and biases of the network. Bayesian regularization assumes
that the weights and biases are random variables with given
distributions, and the regularization is performed using statis-
tical techniques. Another technique is that of ‘early-stopping’
and cross-validation, wherein the training data set is divided
into several different subsets that can be used for training
and validation of the neural network function approximation.
The network is trained on one set of data, and the function
approximation checked on another set of data: once the error in
the approximation of the validation data set begins to increase,
the training is terminated [19].
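The two generalization techniques described above can be sketched as follows. The weighting factor `gamma` and the `patience` count are hypothetical parameters introduced for illustration; the actual settings used in training are not given here.

```python
import numpy as np

def regularized_performance(errors, params, gamma=0.9):
    """Mean square error blended with the mean square of weights and biases."""
    mse = np.mean(np.square(errors))   # fit to the training data
    msw = np.mean(np.square(params))   # penalty on large weights and biases
    return gamma * mse + (1.0 - gamma) * msw

def early_stopping_epoch(val_errors, patience=1):
    """Return the epoch with the lowest validation error seen before stopping.

    Training stops once the validation error has failed to improve for
    `patience` consecutive epochs.
    """
    best, best_epoch, bad = np.inf, 0, 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, bad = err, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_epoch
```

A larger `patience` tolerates noisy validation curves at the cost of extra training epochs.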
We use the MathWorks MATLAB™ Neural Network Toolbox
as our basic tool for training the artificial neural networks.
We normalize the input and output data to ensure that the
nonlinear neural network functions operate over their optimal
range, and to prevent ill-conditioning of the optimization
process. The training also employs a combination of the above
methods to obtain high accuracy of approximation, and good
generalization of the neural network. Using these techniques,
the network output can represent the data accurately within the
training region, and be well-behaved outside of the training
region, as will be demonstrated later.
We have implemented a generic drain current expression
using a combination of a neural-network and a breakdown
model in the following manner,
$$ I_{ds}(V_{ds}, V_{gs}, T) = \mathrm{ANN}(V_{ds}, V_{gs}, T) \times V_{br}(V_{ds}, V_{gs}, T) \tag{5} $$
where T is the temperature, and ANN is a neural network
capable of an accurate approximation of the data over the
measured characterization region. The breakdown model for
Vbr is of the following general exponential form
$$ V_{br} \approx \left( 1 + k\, e^{V_{brk}} \right) \tag{6} $$
where k is a constant and Vbrk is a temperature dependent
breakdown voltage parameter. In fact, the actual breakdown
model has more parameters than this to ensure good global
fit; details of this breakdown model can be found in the
documentation for the MET model [12].
This drain current formulation is very convenient to imple-
ment within a circuit simulator and the derivatives of Ids can
be readily computed as it is the product of two differentiable
functions. The parameters of the breakdown model are set so
that the function has no influence on the neural network for
the drain current within the characterized region and minimal
influence within the extrapolation region. Extrapolation is
governed by the two-layer feed-forward neural network and the
breakdown model.
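The convenience of the product form (5) for derivative evaluation can be shown with a small sketch. Both factors below are hypothetical stand-ins: `ann_stub` is a simple smooth function rather than a trained network, and `breakdown_factor` assumes an exponential in (Vds - Vbrk) with an illustrative scale voltage, which is only one possible reading of the general form (6). Temperature dependence is omitted.

```python
import numpy as np

def ann_stub(vds, vgs):
    """Hypothetical smooth stand-in for the ANN; returns (value, d/dVds)."""
    gate = np.log1p(np.exp(vgs - 2.7))   # soft turn-on near threshold (placeholder)
    t = np.tanh(0.5 * vds)
    return gate * t, gate * 0.5 * (1.0 - t**2)

def breakdown_factor(vds, k=1e-3, v_brk=65.0, v_scale=2.0):
    """Assumed factor 1 + k*exp((Vds - Vbrk)/v_scale); returns (value, d/dVds)."""
    e = k * np.exp((vds - v_brk) / v_scale)
    return 1.0 + e, e / v_scale

def ids_and_gds(vds, vgs):
    """Drain current as a product of the two factors, and its Vds derivative."""
    f, df = ann_stub(vds, vgs)
    g, dg = breakdown_factor(vds)
    return f * g, df * g + f * dg        # product rule
```

With these placeholder values the breakdown factor is indistinguishable from unity at low drain voltage, so the stand-in network alone sets the current there, mirroring the parameter choice described in the text.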
The extrapolation characteristics of the two-layer neural
network are demonstrated using a set of drain current data
obtained from measurement to train the network. The mea-
surement system was limited to points with less than 15 watts
dissipation: this defines the boundary between regions I &
II in Fig. 8. The neural network was trained only on data
from this region. The neural network is then used to
predict the drain current in regions II & III, and the results are
plotted in Fig. 9, where it can be seen that the neural network
predicts the drain current accurately within the training data
range, and is well behaved in the extrapolation regions. The
drain current, plotted as a surface in Fig. 10, shows a smooth
surface without local minima, and again the extrapolation is
well behaved. Typically, we find that two hidden layers,
each containing between five and fifteen neurons, are sufficient
for this level of accuracy and extrapolation performance.
Fig. 9. A plot of the full drain current model including the breakdown model
compared with the drain current as predicted by the neural network (Id in A versus Vds in V).
Fig. 10. A surface plot of the drain current (Id versus Vds and Vgs) as predicted by the full drain
current model. The thick line indicates the range of voltages over which the
drain current was measured.
VII. MODEL IMPLEMENTATION AND VERIFICATION
The model has been implemented in ADS using the Agilent-
EEsof MINT™ modeling interface. A modular approach has
been adopted allowing us to choose from and compare a
variety of model implementations for the current and charge
models: table-based models, MET expressions, neural net-
works, and the new model descriptions. This new model has
been compared against the measured data, and existing MET
and Root models extracted from the same data.
A 4.8-mm gate periphery LDMOS transistor was charac-
terized using pulsed I-V and S-parameter measurements, and
these data were used to train the neural networks used to model
this device. The measurement region was defined by several
factors: a maximum permissible power dissipation for the
transistor, a maximum permissible drain current, and a gate-
drain breakdown voltage which has limits on the derivatives
of Gds to prevent the measurement procedure from taking
the transistor into on- or off-state breakdown regions where
the possibility of permanent damage to the transistor occurs.
The measurement procedure was repeated at three different
chuck temperatures: 25, 75, and 125°C. Approximately 1200
measurement points were captured at each temperature.
Fig. 11. Measured and simulated results of the intermodulation distortion
products for fo = 2.14 GHz with a tone spacing of 100 kHz, as a function
of the input power, at the intrinsic reference plane, for biases of 3 and 6
mA/mm comparing MET and Root models to the new model.
Fig. 12. Simulation results of the intermodulation distortion products (IM3, IM5, and IM7 in dBc) for
fo = 2.14 GHz with a tone spacing of 100 kHz, as a function of the input
power, at the intrinsic reference plane, comparing MET and Root models to
the new model.
The drain current model, presented earlier in Fig. 9, has
very accurate approximation by the neural network within
the measurement region, and reliable extrapolation beyond
the measurement region boundary, giving confidence in the
simulated drain current performance of this model.
The new model also displays excellent fidelity to the non-
linear device behaviour at RF frequencies. This is shown in
Fig. 11, where we show that the new model predicts accurately
the measured IM3 over a wide range of input drive. Also
shown is the prediction using a table model: while this is
accurate at high powers, the interpolation can
cause errors at small signal levels. The new model also
exhibits the correct asymptotic behavior of the intermodulation
products at low drive powers. This is true also for higher order
IM products as shown in Fig. 12, in contrast to table- or data-
based models, which are often limited by the order of the
spline interpolation.
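The asymptotic behaviour referred to above can be illustrated with a textbook memoryless cubic nonlinearity, for which the third-order product relative to the carrier falls by 2 dB for every 1 dB reduction in drive. The sketch below is not the transistor model; the coefficients `g1` and `g3` and the FFT bin choices are arbitrary illustrative values.

```python
import numpy as np

def two_tone_im3_dbc(amplitude, g1=1.0, g3=-0.1, n=4096, k1=100, k2=101):
    """IM3 (at 2*f1 - f2) relative to the f1 carrier, in dBc, for i = g1*v + g3*v**3."""
    t = np.arange(n) / n
    # Two equal-amplitude tones placed exactly on FFT bins k1 and k2.
    v = amplitude * (np.cos(2 * np.pi * k1 * t) + np.cos(2 * np.pi * k2 * t))
    i = g1 * v + g3 * v**3
    spec = 2.0 * np.abs(np.fft.rfft(i)) / n   # single-sided amplitude spectrum
    return 20.0 * np.log10(spec[2 * k1 - k2] / spec[k1])
```

Evaluating this at two drive levels 20 dB apart gives IM3 values close to 40 dB apart. A model whose state functions are smooth and differentiable can reproduce this slope, whereas a low-order interpolant may not.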
The approximation and extrapolation of the charge state
functions are also well modeled by the neural networks. In
Fig. 13 we show the gate charge as a function of gate and drain
Fig. 13. A plot of Qg vs. Vds and Vgs. Outside the measured region indicated
by the thick line, the charge surface predicted by the neural network is smooth
and very well behaved, even at extremely high voltages which would never be
experienced in practice, but may be used by the simulator during convergence.
Fig. 14. Derivatives of the gate charge (∂Qg/∂Vds, in pF) for various drain voltages plotted as
a function of the gate voltage, Vgs.
voltages. The surface is well-behaved with no local minima
or sharp gradients, and the extrapolation is smooth and well-
behaved. The measured region used during the neural network
training process is also indicated on this figure. The charges
exhibit only small temperature dependence, and the model
also extrapolates well as a function of temperature. Similar
results are found for the drain charge surface. As an illustration
of the robust extrapolation property of this neural network
formulation, we show the model gate charge function over
an extreme range of drain voltage. The function is smooth,
well-behaved, and convergent.
The derivatives predicted by the neural networks fit the
numerical derivatives of the charge very well. In Fig. 14, the
derivatives of the gate charge, as a function of gate voltage,
for various drain voltages are plotted.
VIII. ELECTRO-THERMAL MODEL
Power transistors dissipate a lot of heat, and so it is
necessary to include temperature dependence in the model
parameters. Further, the information signals used in modern
wireless communications have large peak-to-average ratios of
Fig. 15. The modeled and measured drain current is plotted at 25, 75, and
125°C, as a function of the applied gate voltage. The drain voltage is 28
V. In addition the model drain current is plotted at -25 and 150°C outside
the measured range. This demonstrates the accurate prediction of the zero-
temperature coefficient (ZTC) point.
the signal voltage or power: therefore, the dynamical response
of the transistor’s thermal behavior to these inputs must also be
captured in the model. An accurate dynamical electro-thermal
model is required.
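As a rough illustration of a dynamical thermal response, the sketch below assumes a single-pole thermal network (one thermal resistance and one thermal capacitance) driven by the dissipated power. The single-pole topology, the function name `junction_temperature`, and all numerical values are assumptions chosen for illustration; the thermal sub-circuit of the model itself is not reproduced here.

```python
import math

def junction_temperature(power_w, dt_s, r_th, c_th, t_amb):
    """Temperature response of a single-pole thermal RC network.

    Solves C_th * dT/dt = P(t) - (T - T_amb) / R_th, using the exact
    update for power that is constant over each time step.
    """
    alpha = math.exp(-dt_s / (r_th * c_th))  # decay factor per time step
    temps = []
    t = t_amb
    for p in power_w:
        t_steady = t_amb + r_th * p          # steady-state value for this power
        t = t_steady + (t - t_steady) * alpha
        temps.append(t)
    return temps

# Hypothetical values: 2 K/W, 1 mJ/K (time constant 2 ms), 25 degC ambient
R_TH, C_TH, T_AMB = 2.0, 1e-3, 25.0
DT = 1e-4
# 10 W dissipated for 100 ms, then power removed for 100 ms
power = [10.0] * 1000 + [0.0] * 1000
temps = junction_temperature(power, DT, R_TH, C_TH, T_AMB)
print("peak temperature (degC):", max(temps))
print("final temperature (degC):", temps[-1])
```

With an envelope that varies on a time scale comparable to the thermal time constant, the temperature lags the dissipated power, which is the behavior a static de-rating alone cannot capture.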
A simple way of including the thermal effects on the output
current is to use a de-rating function on the drain current
expression [3], [11], [20].
The thermal variation of the threshold voltage acts in the
opposite sense to the drain current de-rating, producing a point
in the drain current-gate voltage relationship that is indepen-
dent of temperature: the zero-temperature coefficient or ZTC
point [21]. This threshold voltage variation is accommodated
by an additional parameter in the drain current model, as in
the MET model [12].
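The origin of the ZTC point can be sketched with a simple square-law current. This is an assumed textbook form, not the neural network model or the MET equations: the gain factor is de-rated with a power law in temperature and the threshold voltage shifts linearly with temperature. All constants and function names are hypothetical.

```python
import math

K0 = 0.1          # gain factor at T0 (A/V^2), hypothetical
VTH0 = 2.0        # threshold voltage at T0 (V), hypothetical
T0 = 298.15       # reference temperature (K)
M = 1.5           # de-rating exponent, hypothetical
DVTH_DT = -2e-3   # threshold temperature coefficient (V/K), hypothetical

def gain(temp_k):
    """Gain factor de-rated with temperature (falls as the device heats)."""
    return K0 * (temp_k / T0) ** (-M)

def threshold(temp_k):
    """Threshold voltage, falling linearly with temperature."""
    return VTH0 + DVTH_DT * (temp_k - T0)

def drain_current(vgs, temp_k):
    """Square-law drain current above threshold, zero below it."""
    vov = vgs - threshold(temp_k)
    return 0.5 * gain(temp_k) * vov * vov if vov > 0.0 else 0.0

def crossover_vgs(t1, t2):
    """Gate voltage at which the curves for t1 and t2 intersect.

    Follows from sqrt(K1) * (V - Vth1) = sqrt(K2) * (V - Vth2).
    """
    s1, s2 = math.sqrt(gain(t1)), math.sqrt(gain(t2))
    return (s1 * threshold(t1) - s2 * threshold(t2)) / (s1 - s2)

T_COLD, T_HOT = 298.15, 398.15
v_ztc = crossover_vgs(T_COLD, T_HOT)
print("crossover Vgs (V):", v_ztc)
print("Ids cold, hot (A):", drain_current(v_ztc, T_COLD), drain_current(v_ztc, T_HOT))
```

Below the crossover the falling threshold dominates and the hotter device conducts more; above it the gain de-rating dominates and the hotter device conducts less. In this simple form the crossover drifts slightly with the pair of temperatures chosen, so it is only an approximation to a single ZTC point.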
In addition to this accurate isothermal representation, the
neural network model also returns accurate predictions of the
drain current over temperature. In Fig. 15 we show the drain
current dependence on the gate voltage in the region close
to threshold, where competing thermal effects on the drain
current exist, resulting in an observed point of thermally-
independent drain current: the ‘zero temperature coefficient’
current. The neural network predicts the nonlinear drain cur-
rent behavior and its thermal dependences accurately, with the
modeled and measured data being virtually indistinguishable.
The neural network also predicts well-behaved currents at
temperatures outside the measured range, maintaining the ZTC
point as expected.
IX. VALIDATION OF THE MODEL
The correct prediction of large-signal performance by the
nonlinear model is also crucial. In Fig. 16 the drive-up
characteristics of the model are compared with measurement
at two different biases, showing very good agreement, even in
compression. In Fig. 17 we show the measured transducer gain
contours for a 4.8 mm gate-width LDMOS power transistor
operating in Class-AB bias, taken at 2.14 GHz. This device is
of identical layout to the device used for the model extraction.
The model matches the contours and peak gain point with
[Figure 16 plot: output power and PAE (%) versus input power (dBm); legend: Measured, New Model.]
Fig. 16. Measured and modeled output power versus input power for bias
currents equal to 6 and 9 mA/mm.
[Figure 17 plot: imaginary part of impedance (Ω) versus real part of impedance (Ω); legend: Measured, Model. Max simulated = 21.8 dB; max measured = 21.8 dB.]
Fig. 17. Comparison between measured and modeled transducer gain load-
pull contours at P1dB. The contours are spaced in steps of 0.3 dB.
TABLE I
COMPARISON OF THE ROOT, MET, AND NEW MODELS (SIMULATION TIME IN SECONDS).
Model        DC-IV   Envelope   HB 1-tone   LSSP   SP     Trans.
Root         1.97    67.90      6.19        3.73   1.39   24.73
MET          1.44    28.30      3.41        3.67   1.05   18.63
This Model   1.36    58.59      6.86        3.38   0.97   19.88
excellent accuracy over the range of output powers and load
impedances. The measured and predicted power-added effi-
ciency load-pull contours for various output powers are shown
in Fig. 18. This is a stringent test of a model: the large-signal,
high-frequency nonlinear behavior and the DC conditions must
be predicted accurately at the same time, and to do this over a
range of load conditions and output powers indicates that the
model is accurate, tracking the measured data very closely.
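For reference, the two quantities contoured in the load-pull figures can be computed from RF and DC quantities as in the following sketch. It uses the standard definitions (gain as output power over input power, power-added efficiency as (Pout − Pin)/Pdc); the operating-point numbers and function names are hypothetical and are not taken from the measurements.

```python
def dbm_to_watts(p_dbm):
    """Convert a power in dBm to watts."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)

def gain_db(pin_dbm, pout_dbm):
    """Power gain in dB for powers given in dBm."""
    return pout_dbm - pin_dbm

def power_added_efficiency(pin_dbm, pout_dbm, vdd_v, idd_a):
    """PAE = (Pout - Pin) / Pdc, returned as a fraction."""
    p_dc = vdd_v * idd_a
    return (dbm_to_watts(pout_dbm) - dbm_to_watts(pin_dbm)) / p_dc

# Hypothetical operating point: 20 dBm in, 40 dBm out, 28 V supply, 0.6 A drain current
pae = power_added_efficiency(20.0, 40.0, 28.0, 0.6)
print("gain (dB):", gain_db(20.0, 40.0))
print("PAE (%):", 100.0 * pae)
```

Because PAE depends on the DC supply current as well as on the RF powers, matching its contours requires the model to predict both at each load impedance.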
This new model also executes in the simulator with no speed
penalty, when compared with the industry-standard MET and
Root models, as indicated in Table I. It should be noted that
we have not yet optimized the new model for speed.
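The timings in Table I are easier to compare as ratios. The short sketch below divides the new model's simulation time by that of each reference model for every analysis type; the numbers are transcribed from Table I, and the helper name `time_ratios` is illustrative.

```python
ANALYSES = ["DC-IV", "Envelope", "HB 1-tone", "LSSP", "SP", "Trans."]

# Simulation times in seconds, transcribed from Table I
TIMES_S = {
    "Root":       [1.97, 67.90, 6.19, 3.73, 1.39, 24.73],
    "MET":        [1.44, 28.30, 3.41, 3.67, 1.05, 18.63],
    "This Model": [1.36, 58.59, 6.86, 3.38, 0.97, 19.88],
}

def time_ratios(model, reference):
    """Per-analysis ratio of simulation times (below 1 means `model` is faster)."""
    return {
        name: t_model / t_ref
        for name, t_model, t_ref in zip(ANALYSES, TIMES_S[model], TIMES_S[reference])
    }

vs_root = time_ratios("This Model", "Root")
vs_met = time_ratios("This Model", "MET")
for name in ANALYSES:
    print(f"{name:10s} vs Root: {vs_root[name]:.2f}   vs MET: {vs_met[name]:.2f}")
```

On these numbers the new model is faster than the Root model in five of the six analyses, and its time is within about a factor of two of the MET model in all of them.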
X. SCALING CONSIDERATIONS
The scaling behavior of the complete transistor model is
heavily dependent on the scaling of the extrinsic network.
Therefore, the scaling performance of the extrinsic network
[Figure 18 plot: reactance (Ω) versus resistance (Ω); legend: Model, Measurement. Max PAE = 63.7 %.]
Fig. 18. Load-pull contours of power-added efficiency at P1dB, comparing
measured data and the new model. The contours are spaced in steps of 4%.
parameters is a good test of how well this electrical behavior
is modeled over a range of transistor gate peripheries or
widths. For RF power transistors, the unit gate width is
generally maintained constant for a given application, and
scaling of the device is simply by the number of gate fingers.
However, the unit gate width will be different for different
applications. For example, the unit gate widths of the driver
transistors in a multi-stage IC power amplifier will generally be
much shorter than those of the power transistors; and the unit
gate width will be larger for lower frequency applications, from
considerations of overall die size. Moreover, from a modeling
perspective, it is most useful to have one model kernel that can
be used for all circuit and frequency applications; hence the
ability to scale accurately with transistor gate periphery is
extremely beneficial.
In Figs. 19-21 we illustrate the scaling behavior of the
extrinsic resistance and inductance parameters. Four devices
were chosen, having 4, 8, 16 and 32 gate fingers and a constant
unit gate-width of 600 µm, yielding total gate peripheries
of 2.4, 4.8, 9.6 and 19.2 mm, respectively. These devices
were measured under Cold-FET conditions as outlined in
Section IV, and the extrinsic parameters were extracted: the
series gate and drain inductances; the gate, drain and source
resistances; and the shunt gate and drain resistances. These
extrinsic network parameters are seen to scale with total gate
width in either a linear or inverse linear manner. The only
outlier is the gate inductance of the 19.2-mm device: this is
only one sample, and the higher than anticipated inductance
may be a result of feeding a relatively wide device from a
single point. Such simple and direct scaling indicates that
the extrinsic network captures the electrical characteristics of
the device layout very successfully. The extracted extrinsic
capacitances are relatively independent of the gate periphery,
over this range.
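The scaling rules described above can be sketched as follows. The finger counts and the 600 µm unit gate width are taken from the text; the reference resistance value and the function names are hypothetical, and which extrinsic parameter follows which rule should be read from Figs. 19-21 rather than from this sketch.

```python
UNIT_GATE_WIDTH_MM = 0.6        # 600 um unit gate width, from the text
FINGER_COUNTS = [4, 8, 16, 32]  # devices described in the text

def total_periphery_mm(n_fingers, unit_width_mm=UNIT_GATE_WIDTH_MM):
    """Total gate periphery for a given number of fingers."""
    return n_fingers * unit_width_mm

def scale_linear(value_ref, periphery_ref_mm, periphery_mm):
    """Scale a parameter in proportion to total gate periphery."""
    return value_ref * periphery_mm / periphery_ref_mm

def scale_inverse(value_ref, periphery_ref_mm, periphery_mm):
    """Scale a parameter in inverse proportion to total gate periphery."""
    return value_ref * periphery_ref_mm / periphery_mm

peripheries = [total_periphery_mm(n) for n in FINGER_COUNTS]
inverse_peripheries = [1.0 / w for w in peripheries]  # x-axis of Figs. 19-22

# Hypothetical reference: 1.0 ohm extracted on the 4.8 mm device
R_REF_OHM = 1.0
W_REF_MM = peripheries[1]
scaled = [scale_inverse(R_REF_OHM, W_REF_MM, w) for w in peripheries]

for n, w, inv_w, r in zip(FINGER_COUNTS, peripheries, inverse_peripheries, scaled):
    print(f"{n:2d} fingers: W = {w:5.1f} mm, 1/W = {inv_w:.3f} 1/mm, R = {r:.3f} ohm")
```

A parameter that follows the inverse rule plots as a straight line through the origin against inverse gate periphery, which is the presentation used in the scaling figures.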
Further, as indicated in Fig. 5, we have identified a mutual
inductance as a component of the extrinsic network, modeling
the inductive coupling between the long fingers that are
characteristic of power transistors. In Fig. 22 the extracted
mutual inductance is shown as a function of the inverse of the
total gate periphery for transistors with a unit gate width of
[Figure 19 plot: inductance (pH) versus inverse gate periphery (1/mm); traces: Lg, Ld.]
Fig. 19. Extrinsic series gate and drain inductance variation with the inverse
of total gate periphery.
[Figure 20 plot: resistance (Ω) versus inverse gate periphery (1/mm); traces: Rgsh, Rdsh.]
Fig. 20. Extrinsic gate and drain shunt resistance variation with the inverse
of total gate periphery.
600 µm. The mutual inductance is seen to scale inversely with
gate width, or, alternatively, inversely with the number of gate
fingers for a constant unit gate width.
XI. CONCLUSION
We have described the architecture, extraction and imple-
mentation of a new, nonlinear, charge-conservative, dynamic
electro-thermal compact transistor model for RF power LD-
MOS FETs. This model includes a new extrinsic network for
RF power LDMOS transistors, that is designed to accommo-
date features attributable to the lossy silicon substrate, and
to the layout of the power transistor. The parameter extraction
procedure takes advantage of sequencer and tuning capabilities
of the circuit simulator to identify the network parameter
values over a broad frequency range. The new network was
observed to scale with gate width effectively and simply. The
intrinsic part of the model uses voltage-controlled current and
charge sources, the latter being defined in a charge-
conservative manner. These sources are implemented using
two-hidden-layer artificial neural nets, and are shown to be
accurate and to extrapolate well. The new Freescale Electro-
Thermal LDMOS FET (FET2) model is found to be more
[Figure 21 plot: resistance (Ω) versus inverse gate periphery (1/mm); traces: Rs, Rd.]
Fig. 21. Extrinsic drain and source resistance variation with the inverse of
total gate periphery.
[Figure 22 plot: mutual inductance (pH) versus inverse gate periphery (1/mm).]
Fig. 22. Mutual inductance variation with the inverse of the total gate
periphery.
accurate than both the industry-standard Motorola Electro-
Thermal (MET) model for power FETs and the isothermal
Root model, which is another charge-conservative formulation.
The new model is also at least as fast in simulation of typical
power amplifier design applications, saving valuable design
time.
XII. ACKNOWLEDGMENTS
Thanks are due to the management staff of RF Division,
Freescale, in particular Jaime Plá, for encouraging and sup-
porting the development of this new nonlinear model.
REFERENCES
[1] D. E. Root, “Measurement-based mathematical active device modeling
for high frequency circuit simulation,” IEICE Trans. Electronics, vol.
E82-C, no. 6, pp. 924–936, June 1999.
[2] P. Jansen, D. Schreurs, W. de Raedt, B. Nauwelaers, and M. van
Rossum, “Consistent small-signal and large-signal extraction techniques
for heterojunction FETs,” IEEE Trans. Microwave Theory Tech., vol. 43,
no. 1, pp. 87–93, Jan. 1995.
[3] P. H. Aaen, J. A. Plá, and J. Wood, Modeling and Characterization of RF
and Microwave Power FETs. Cambridge, UK: Cambridge University
Press, 2007.
[4] J. Staudinger, M. C. de Baca, and R. Vaitkus, “An examination of several
large-signal capacitance models to predict GaAs HEMT linear power
amplifier performance,” in Proc. IEEE Radio and Wireless Conf.
(RAWCON), Colorado Springs, CO, Aug. 1998, pp. 343–346.
[5] J. Xu, M. C. E. Yagoub, R. Ding, and Q.-J. Zhang, “Robust neural based
microwave modeling and design using advanced model extrapolation,”
in IEEE MTT-S Int. Microwave Symp. Dig., Fort Worth, TX, June 2004,
pp. 1459–1552.
[6] G. Dambrine, A. Cappy, F. Heliodore, and E. Playez, “A new method
for determining the FET small-signal equivalent circuit,” IEEE Trans.
Microwave Theory Tech., vol. 36, no. 7, pp. 1151–59, July 1988.
[7] D. Lovelace, J. Costa, and N. Camilleri, “Extracting small-signal model
parameters of silicon MOSFET transistors,” in IEEE MTT-S Int. Mi-
crowave Symp. Dig., San Diego, CA, May 1994, pp. 865–868.
[8] J. Wood and D. E. Root, “Bias-dependent linear scalable millimeter-
wave FET model,” IEEE Trans. Microwave Theory Tech., vol. 48, no. 12,
pp. 2352–2360, Dec. 2000.
[9] J. Wood, D. Lamey, D. C. M. Guyonnet, D. Bridges, N. Monsauret, and
P. H. Aaen, “An extrinsic component parameter extraction method for
high power RF LDMOS transistors,” in IEEE MTT-S Int. Microwave
Symp. Dig., Atlanta, GA, June 2008, accepted for publication.
[10] C. Fager, J. C. Pedro, N. B. de Carvalho, and H. Zirath, “Prediction of
IMD in LDMOS transistor amplifiers using a new large-signal model,”
IEEE Trans. Microwave Theory Tech., vol. 50, no. 12, pp. 2834–42,
Dec. 2002.
[11] D. Bridges, J. Wood, M. Guyonnet, and P. H. Aaen, “A nonlinear electro-
thermal model for high power RF LDMOS transistors,” in IEEE MTT-
S Int. Microwave Symp. Dig., Atlanta, GA, June 2008, accepted for
publication.
[12] W. R. Curtice, J. A. Plá, D. Bridges, T. Liang, and E. E. Shumate, “A
new dynamic electro-thermal nonlinear model for silicon RF LDMOS
FETs,” in IEEE MTT-S Int. Microwave Symp. Dig., Anaheim, CA, June
1999, pp. 419–422.
[13] D. E. Root, “Charge modeling and conservation laws,” in Asia-Pacific
Microwave Conference Workshop WS2, ‘Modeling and characterization
of Microwave devices and packages’, Sydney, Australia, June 1999.
[14] J. Xu, D. Gunyan, M. Iwamoto, A. Cognata, and D. E. Root,
“Measurement-based non-quasi-static large-signal FET model using
artificial neural networks,” in IEEE MTT-S Int. Microwave Symp. Dig.,
San Francisco, CA, June 2006, pp. 469–472.
[15] G. Cybenko, “Approximation by superpositions of a sigmoidal function,”
Mathematics of Control, Signals, and Systems, vol. 2, no. 4, pp. 303–314, 1989.
[16] S. Haykin, Neural Networks: a Comprehensive Foundation, 2nd ed.
Upper Saddle River, NJ: Prentice Hall, 1995.
[17] C. M. Bishop, Neural Networks for Pattern Recognition. New York,
NY: Oxford University Press, 1995.
[18] D. L. Chester, “Why two hidden layers are better than one,” in Proc. Int.
Joint Conf. Neural Networks, Washington, DC, Jan. 1990, pp. 265–268.
[19] L. Prechelt, “Automatic early stopping using cross validation: quantifying
the criteria,” Neural Networks, vol. 11, no. 4, pp. 761–767, 1998.
[20] P. C. Canfield, S. C. F. Lam, and D. J. Allstot, “Modeling of frequency
and temperature effects in GaAs MESFETs,” IEEE J. Solid State
Circuits, vol. 25, no. 1, pp. 299–306, Feb. 1990.
[21] J.-M. Collantes, P. Bouysse, and R. Quere, “Characterising and model-
ing thermal behaviour of radio-frequency power LDMOS transistors,”
Electron. Lett., vol. 34, no. 14, pp. 1428–30, July 1998.
John Wood (M’87, SM’03, F’07) received the B.Sc.
and Ph.D. degrees in Electrical and Electronic Engi-
neering from the University of Leeds, UK, in 1976
and 1980, respectively. He is a Senior Member of
the Technical Staff, responsible for RF CAD &
Modeling in the RF Division of Freescale Semi-
conductor, Inc, Tempe, AZ, USA. From 1997–2005
he worked in the Microwave Technology Center
of Agilent Technologies (then Hewlett Packard) in
Santa Rosa, CA, USA. His areas of expertise include
the development of compact device models and
nonlinear behavioral models for RF power transistors and ICs. He is a Fellow
of the IEEE, and a member of the Microwave Theory and Techniques, and
Electron Devices Societies.
Peter H. Aaen (S’93, M’97) received the B.A.Sc.
degree in Engineering Science and the M.A.Sc.
degree in Electrical Engineering, both from the Uni-
versity of Toronto, Toronto, ON., Canada, and the
Ph.D. degree in Electrical Engineering from Arizona
State University, Tempe, AZ., USA, in 1995, 1997,
and 2005, respectively. He is the Manager of the
RF Modeling team of the RF Division of Freescale
Semiconductor, Inc, Tempe, AZ, USA. His areas
of interest include the development and validation
of microwave transistor models, passive component
modeling, and the electromagnetic simulation of complex packaged envi-
ronments. He has co-authored Modeling and Characterization of RF and
Microwave Power FETs (Cambridge University Press, 2007), and has authored
or co-authored over a dozen papers.
Daren Bridges received the B.S. degree in electrical engineering from the
University of Utah in 1992 and the M.S. degree in electrical engineering
from the University of Texas at Dallas in 1996. In 1992, he joined the
RF/Microwave Group at Texas Instruments and specialized in linear MESFET
and HEMT modeling for MMIC devices. In 1997, he joined the Modeling
team of the RF Division of Motorola Inc.’s Semiconductor Products Sector
(now Freescale Semiconductor Inc.). His areas of expertise include RF high power
nonlinear model development and implementation within harmonic balance
simulators and the building and testing of active component model libraries.
He is author or co-author on several papers, articles and workshops in the
areas of RF and microwave device modeling and simulation.
Dan Lamey received his M.S. Physics degree from
the University of Minnesota. After joining Mo-
torola’s Semiconductor Products Sector in 1983 he
worked in the Logic Division as a product analyst
of TTL and ECL technologies, where he was device
engineer in 1990, initially assigned to the technol-
ogy introduction of Mixed-mode/RF BiCMOS. This
led to his subsequent involvement in LDMOS and
GCMOS technology development. Since 2001 he
has been a member of the RF Division of Freescale
Semiconductor Inc., where he has been responsible
for RF LDMOS integrated circuit CAD, as well as electromagnetic simulation
and compact modeling of on-wafer passive devices.
Michael Guyonnet received his Ph.D. degree in microwaves & RF from
the University of Limoges, France, in 2005. While at the University of
Limoges his work focused on the development of non linear electrothermal
modeling techniques for LDMOS RF power transistors. In 2004 he joined
Freescale Semiconductor Inc., as a senior modeling engineer and his work now
concentrates on improving transistor linearity and advanced Doherty designs.
Daniel S. Chan (M’06) received the B.A.Sc. de-
gree in electrical engineering from the University
of Toronto, Toronto, Canada, in 2003. In 2004, he
joined Qualcomm’s CDMA Technologies division,
San Diego, California, as an applications engineer.
Since 2006, he has been a member of Freescale’s
RF Division, Tempe, Arizona, as a device mod-
eling engineer where he has been responsible for
RF LDMOS CAD and the development of passive
component models.
Nelsy Monsauret received a Ph.D. degree in mi-
crowaves & RF from the University of Limoges,
France, in 2001. Between 1997 and 2001, she
worked on multi-band RF power amplifiers design in
the wireless mobile group of Freescale Semiconduc-
tor (formerly Motorola Semiconductor), Toulouse,
France. She joined the base-station power amplifier
division of Freescale Semiconductor in 2001, as
CAD and Design support. Since 2007, she has been
working as RF modeling engineer in charge of
developing transistor models.
