An LMS-based adaptive predistorter for cancelling nonlinear memory effects in RF power amplifiers by Montoro López, Gabriel et al.
An LMS-Based Adaptive Predistorter for Cancelling 
Nonlinear Memory Effects in RF Power Amplifiers 
 
G. Montoro, P. L. Gilabert , E. Bertran  
 Dpt. Signal Theory and Communications  
Universitat Politècnica Catalunya (UPC)  
Castelldefels (Barcelona), Spain 
montoro@tsc.upc.edu 
A. Cesari  
Dpt. of Intégration de Systèmes de 
Gestion de l’Energie, (LAAS – CNRS) 
Toulouse, France 
acesari@lass.fr 
J. A. García  
Dpt. Communications Engineering 
Universidad de Cantabria (UC) 
 Santander, Spain 
joseangel.garcia@unican.es 
 
 
Abstract— This paper presents the design of an adaptive
Digital Predistorter (DPD) for Power Amplifier (PA)
linearization whose implementation and real-time adaptation can
be fully performed in a Field Programmable Gate Array
(FPGA). The distinctive characteristics of this adaptive DPD are
its straightforward derivation from a Nonlinear Auto Regressive
Moving Average (NARMA) PA model and the possibility of
implementing it completely in an FPGA, without an
additional digital signal processor performing the DPD
adaptation. The adaptive DPD has a NARMA structure
that can be implemented by means of Look-Up Tables (LUTs).
This configuration results in a multi-LUT implementation where
the LUT contents are directly updated by means of an LMS
algorithm. Details on the internal organization of the adaptive
DPD, as well as on its linearization capabilities, are provided,
taking into account the compensation of memory effects.
Keywords – RF amplifiers; nonlinear memory effects; adaptive
predistortion
I.  INTRODUCTION 
Modern spectrally efficient multilevel modulation schemes
are very sensitive to the intermodulation distortion (IMD) that
results from nonlinearities in the RF transmitter chain, mainly
due to the nonlinear behavior of the PA. Consequently,
achieving linear amplification, and thus complying with the
linearity requirements specified in communication standards,
requires significant PA back-off. Back-off operation is power
inefficient, especially when the PA has to handle signals
with high peak-to-average power ratios (PAPRs). PA
linearizers are a recognized solution to this trade-off
between linearity and efficiency.
Among linearizers, digital predistortion (DPD) takes
advantage of ever faster digital signal processing
devices (already present in the software defined radio
subsystems of the transmitter) to perform adaptive PA
linearization digitally, thus avoiding RF hardware adjustment
problems. However, for signals with significant bandwidths,
the performance of a DPD linearizer can be degraded by
PA memory effects.
DPD linearization has been the object of intensive research,
generating multiple publications on the compensation of
nonlinearity and memory effects ([1]-[5]). Some common solutions
design the predistortion function as the composition of a
memoryless nonlinearity and a linear time invariant (LTI)
block. This generic configuration can be seen as a simplified
decomposition of a more general Volterra series function.
Among these solutions are DPDs based on memory polynomials,
as in [1], and Hammerstein based schemes, as in [2], where the
LTI block is usually described by a finite impulse response
(FIR) filter. Other solutions directly describe the memoryless
nonlinear block by means of a Look-Up Table (LUT), while
memory compensation is achieved by adding an FIR filter as in
[3], by considering a two-dimensional LUT as in [4], or by
using a set of LUTs associated with delayed samples of the
input signal as in [5].
DPD solutions are commonly validated on laboratory test
benches formed by the closed-loop interconnection of a PC
performing the DPD function, a vector signal generator
(VSG), the PA, and a vector signal analyzer (VSA). As a result,
little attention is paid to adaptation issues.
On the other hand, solutions in which the DPD function is
carried out in an FPGA, such as the one presented in [5], perform
the adaptation process in a PC or in a DSP, which
introduces an additional power-hungry device that reduces
the overall power efficiency of the system.
This paper presents an adaptive DPD for PA linearization
whose implementation and real-time adaptation can be fully
performed in a Field Programmable Gate Array (FPGA). The
DPD presented here is based on the predictive Nonlinear Auto
Regressive Moving Average (NARMA) DPD presented in [6],
which can be implemented by means of Look-Up Tables
(LUTs). The purpose and main contribution of this paper is the
development of an adaptation algorithm that permits real-time
updates of the LUT contents that configure the multi-LUT
structure of the predictive NARMA DPD. Therefore, unlike in
[6], real-time adaptation can be performed in an FPGA
device without a DSP or any other kind of advanced
coprocessor.
Proceedings of Asia-Pacific Microwave Conference 2007
1-4244-0749-4/07/$20.00 ©2007 IEEE.
 
 
Figure 1. Adaptive digital predistorter. 
 
 
Figure 2. Structure of a look-up table. 
II. PREDICTIVE NARMA PREDISTORTER 
The predictive DPD system presented here follows the
general block diagram shown in Fig. 1, where DPD
linearization is carried out at baseband by adaptively forcing
the PA to behave as a linear device, as explained in [6].
First, it is necessary to identify the low-pass complex
envelope behavioral model of the PA. The coefficients
of the NARMA structure defining the PA behavioral model
are extracted from the discrete complex envelope data at the
PA input ($x_A$) and output ($y_A$). The input-output relation
of a NARMA model can be expressed as
$$y_A(k) = f_0\big(x_A(k)\big) + \sum_{i=1}^{N} f_i\big(x_A(k-\tau_i^f)\big) + \sum_{j=1}^{D} g_j\big(y_A(k-\tau_j^g)\big) \qquad (1)$$
where $f_0$, $f_i$ and $g_j$ are memoryless nonlinear functions
that can be described by polynomials or by LUTs. On the other
hand, $\tau_i^f$ and $\tau_j^g$ ($\tau \subset \mathbb{N}$) are the
most significant sparse delays (in the discrete-time domain) of
the input and the output, respectively, which contribute to the
description of the PA memory effects. These optimal delays are
identified by means of a heuristic search algorithm,
simulated annealing, as explained in [7]. Details on the
stability analysis of this NARMA structure, needed to ensure
the overall stability of the DPD, can also be found in [6].
From (1) it is possible to obtain the expression defining the 
PA input, 
$$x_A(k) = f_0^{-1}\left[\, y_A(k) - \sum_{i=1}^{N} f_i\big(x_A(k-\tau_i^f)\big) - \sum_{j=1}^{D} g_j\big(y_A(k-\tau_j^g)\big) \right] \qquad (2)$$
Then, we define the nonlinear functions $f_i$ and $g_j$ as a Cartesian
complex product between an input/output sample, present or
delayed ($x_A(k-\tau_i^f)$, $y_A(k-\tau_j^g)$), and a complex gain
($G_i^f$, $G_j^g$) that is stored in a LUT, as shown in Fig. 2, and
depends on the signal amplitude:
$$\begin{aligned}
f_i\big(x_A(k-\tau_i^f)\big) &= G_i^f\big(|x_A(k-\tau_i^f)|\big) \cdot x_A(k-\tau_i^f) \\
g_j\big(y_A(k-\tau_j^g)\big) &= G_j^g\big(|y_A(k-\tau_j^g)|\big) \cdot y_A(k-\tau_j^g)
\end{aligned} \qquad (3)$$
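A minimal Python sketch of the LUT evaluation in (3) is given here for illustration. The uniform amplitude quantization, the normalization constant `max_amplitude`, the function names `lut_index` and `apply_lut`, and the contents of `example_lut` are all assumptions made for this example; the paper does not specify how the amplitude is mapped to a table address.

```python
import numpy as np

LUT_SIZE = 512  # table depth; treated here as a free parameter


def lut_index(sample, max_amplitude=1.0, size=LUT_SIZE):
    """Map |sample| to a table address by uniform quantization (assumed scheme)."""
    scaled = abs(sample) / max_amplitude * (size - 1)
    return int(min(max(round(scaled), 0), size - 1))  # clamp to a valid address


def apply_lut(lut, sample, max_amplitude=1.0):
    """Evaluate f(x) = G(|x|) * x: one table lookup and one complex product."""
    return lut[lut_index(sample, max_amplitude, len(lut))] * sample


# Hypothetical table: mild amplitude-dependent compression plus a small phase shift.
amplitudes = np.linspace(0.0, 1.0, LUT_SIZE)
example_lut = (1.0 - 0.2 * amplitudes**2) * np.exp(1j * 0.1 * amplitudes)

x = 0.5 * np.exp(1j * np.pi / 4)   # one complex envelope sample
y = apply_lut(example_lut, x)      # the corresponding nonlinear function output
```

Storing a gain rather than the function value keeps each branch to a single complex multiplication, which is what makes the structure attractive for an FPGA.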
We now consider $y_D$ as the desired output, that is, the PA
output after linearization. This linear output can be defined as
the transmitted signal $x_T$ amplified by a linear gain $G_{linear}$.
As can be observed in Fig. 1, if no DPD is
considered, $x_A(k) = x_T(k)$. On the other hand, with (2) it is
possible to obtain the PA input $x_A(k)$ that
guarantees a certain PA output $y_A(k)$. If the desired output
$y_D(k)$ is evaluated a priori, then in (2) we can replace $y_A$ by
$y_D$ (and likewise for all delayed output samples). In other
words, we impose $y_D(k)$ (the desired output) as a prediction of
the future value of $y_A(k)$ and, consequently, we calculate the
PA input $x_A(k)$ that achieves the desired
performance at the PA output, that is, $y_A(k) = y_D(k)$.
Now it is possible to rewrite the predistortion function in (2)
as a more convenient DPD expression, in terms of the
(delayed) complex inputs ($x_A(k-\tau_i^f)$) and desired outputs
($y_D(k-\tau_j^g)$), each multiplied by its corresponding LUT
complex gain ($G_i^f$, $G_j^g$),
$$x_A(k) = G_0^{f\_inverse}\big(|u_A(k)|\big) \cdot u_A(k) \qquad (4)$$
where $G_0^{f\_inverse}$ is the inverse of $G_0^f$, and the intermediate
variable $u_A(k)$ (see Fig. 3) is described as
$$u_A(k) = y_D(k) - \sum_{i=1}^{N} G_i^f\big(|x_A(k-\tau_i^f)|\big) \cdot x_A(k-\tau_i^f) - \sum_{j=1}^{D} G_j^g\big(|y_D(k-\tau_j^g)|\big) \cdot y_D(k-\tau_j^g) \qquad (5)$$
with $y_D = G_{linear} \cdot x_T$.
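The computation of one predistorted sample from (4) and (5) can be sketched in Python as follows. This is only an illustration under stated assumptions: the delays `tau_f` and `tau_g`, the constant-gain tables, the uniform amplitude addressing in `lut_gain`, and the zero initial conditions are hypothetical choices, not values taken from the paper.

```python
import numpy as np


def lut_gain(lut, sample, max_amplitude=1.0):
    """Return G(|sample|) using uniform amplitude quantization (an assumption)."""
    idx = int(round(abs(sample) / max_amplitude * (len(lut) - 1)))
    return lut[min(max(idx, 0), len(lut) - 1)]


def predistort_sample(k, y_d, x_a, f_luts, g_luts, f0_inv_lut, tau_f, tau_g):
    """Compute x_A(k) from (4)-(5); samples before k = 0 are taken as zero."""
    u = y_d[k]
    for lut, tau in zip(f_luts, tau_f):      # branches on delayed PA inputs
        if k - tau >= 0:
            past = x_a[k - tau]
            u -= lut_gain(lut, past) * past
    for lut, tau in zip(g_luts, tau_g):      # branches on delayed desired outputs
        if k - tau >= 0:
            past = y_d[k - tau]
            u -= lut_gain(lut, past) * past
    return lut_gain(f0_inv_lut, u) * u       # equation (4)


# Toy usage with made-up delays and constant-gain tables.
size = 512
tau_f, tau_g = [1, 3], [2]
f_luts = [np.full(size, 0.1 + 0j), np.full(size, 0.05 + 0j)]
g_luts = [np.full(size, 0.02 + 0j)]
f0_inv_lut = np.full(size, 1.0 + 0j)

g_linear = 1.0
x_t = 0.3 * np.exp(1j * np.linspace(0, 2 * np.pi, 16))  # transmitted signal
y_d = g_linear * x_t                                     # desired output
x_a = np.zeros_like(y_d)
for k in range(len(y_d)):
    x_a[k] = predistort_sample(k, y_d, x_a, f_luts, g_luts, f0_inv_lut, tau_f, tau_g)
```

Because the recursion uses the desired output rather than the measured PA output, every term of (5) is available before the sample is sent to the amplifier.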
 Figure 3. Multi-LUT structure. 
The resulting DPD configuration is depicted in Fig. 3,
where the predictive NARMA structure can be mapped onto an
FPGA as a set of multiple LUTs. Each nonlinear
function ($f_0^{-1}$, $f_i$ and $g_j$) in (2) is thus implemented with a
LUT. This configuration permits an FPGA implementation
that reduces the computational effort of evaluating the nonlinear
functions. In addition, it permits scalability,
that is, increasing or reducing the number of delays considered in
the predistorter structure.
III. LOOK-UP TABLES ADAPTATION  
A. Digital Predistorter Operation
The adaptive process followed by this DPD to linearize
the PA consists of the following steps:
i) First, it is necessary to search for the best sparse delays
defining the PA model in (1) and to perform the stability test
(bounds given by the small-gain theorem, as explained in
[6]) to avoid possible instabilities due to the recursive part
of the DPD structure.
ii) Once we have identified the best sparse delays and
ensured that the nonlinear functions related to the recursive part
are consistent, we map the DPD structure in Fig. 3 onto the
FPGA. The LUT Gains are initialized with 0's or
1's.
iii) Finally, we run the DPD process; at every iteration
step, the following actions are performed in parallel in the
FPGA device:
• Applying the algorithm defined in (1), implemented
with LUTs as defined in (3), an output sample of the
PA NARMA model ($y_{A\_NARMA}(k)$) is obtained in order
to compute the LMS error.
• Applying the algorithm defined in (4), a new DPD
output sample ($x_A(k)$) is obtained.
• All complex LUT Gains involved in the calculation of
the predistortion output sample are updated by means
of the LMS algorithm.
After a transient period, in which all LUT Gains are
continuously updated, the PA output converges to the
desired output, thus achieving the desired linear amplification.
B. Updating the Multi-LUT Gains
The update of the complex LUT Gains is performed by 
means of the complex LMS algorithm [8], described by   
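$$\begin{aligned}
e(k) &= y_{A\_NARMA}(k) - y_{A\_Measured}(k) \\
\Delta G_i^f\big(|x_A(k-\tau_i^f)|\big) &= \mu \cdot x_A(k-\tau_i^f) \cdot \bar{e}(k) \\
\Delta G_j^g\big(|y_D(k-\tau_j^g)|\big) &= \mu \cdot y_D(k-\tau_j^g) \cdot \bar{e}(k)
\end{aligned} \qquad (9)$$
where $y_{A\_Measured}$ is the measured amplifier output,
$y_{A\_NARMA}$ is the PA NARMA model output, $\mu$ is the LMS step
size, and $\bar{e}(k)$ is the conjugate of the complex error.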
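As an illustration of an LMS update applied to LUT entries, the following Python sketch adapts a single gain table to a made-up memoryless "amplifier". It is not the authors' FPGA implementation: the table size, the step size `MU`, the test nonlinearity `true_gain` and the addressing function `lut_index` are hypothetical. The error is taken as measured minus model and the conjugate is placed on the input sample, which is the textbook steepest-descent form for a model y = G·x; this sign and conjugation convention is an assumption and may differ from the convention used in (9).

```python
import numpy as np

rng = np.random.default_rng(0)
SIZE = 64    # small table for the demo (hypothetical)
MU = 0.5     # LMS step size (hypothetical)


def lut_index(sample, size=SIZE):
    """Uniform amplitude addressing for |sample| in [0, 1] (assumed scheme)."""
    return min(int(abs(sample) * (size - 1) + 0.5), size - 1)


def true_gain(sample):
    """Stand-in amplifier: amplitude-dependent complex gain (made up)."""
    a = abs(sample)
    return (1.0 - 0.3 * a**2) * np.exp(1j * 0.2 * a)


lut = np.ones(SIZE, dtype=complex)            # gains initialized with 1's
for _ in range(20000):
    x = rng.uniform(0.05, 1.0) * np.exp(1j * rng.uniform(0, 2 * np.pi))
    y_measured = true_gain(x) * x
    i = lut_index(x)
    y_model = lut[i] * x
    e = y_measured - y_model                  # assumed sign: measured minus model
    lut[i] += MU * e * np.conj(x)             # only the addressed entry changes

# Residual modeling error on fresh samples, expressed as an NMSE in dB.
test_x = rng.uniform(0.05, 1.0, 200) * np.exp(1j * rng.uniform(0, 2 * np.pi, 200))
ref = np.array([true_gain(s) * s for s in test_x])
est = np.array([lut[lut_index(s)] * s for s in test_x])
nmse_db = 10 * np.log10(np.sum(np.abs(ref - est) ** 2) / np.sum(np.abs(ref) ** 2))
```

Each iteration touches only the entry addressed by the current sample, so the cost per LUT is one lookup, a few complex multiplications and one write, independently of the table size.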
IV. IMPLEMENTATION AND PERFORMANCE RESULTS  
A. Implementation Issues 
The multi-LUT structure can be easily implemented in a
commercial FPGA board such as the Nallatech XtremeDSP, which
consists of a Xilinx Virtex XCV4SX35 connected to two
analog-to-digital and two digital-to-analog converters running
at a clock of 105 MHz. Each LUT corresponds to an
addressable memory table containing 512 address-gain pairs
and, as explained previously, every time a LUT entry is
addressed its content is updated. The NARMA based
predictive DPD actually implemented contains 3 FIR terms,
3 IIR terms and 2 additional LUTs, related to the $f_0$ and
$f_0^{-1}$ nonlinear functions.
A total of 8 LUTs of size $2^9 = 512$ addresses corresponds to
a total of 4096 complex Gains. For testing and debugging
purposes, the developed predictive NARMA DPD has been
assessed in Matlab, using measured data obtained from an
LDMOS power amplifier (with a 1 dB compression point of
39 dBm and a center frequency around 2 GHz). A WIMAX
signal (OFDM with 256 carriers and 16-QAM modulation)
has been used as the excitation signal. A sample rate
of 10 MSamples/s implies that a complex Gain update is
performed every $10^{-7}$ seconds. Then, assuming that the
accesses to a LUT are uniformly distributed over its addresses,
a particular Gain is updated approximately every
$512 \cdot 10^{-7}$ seconds. That
corresponds to 512 iterations, that is, 512 data samples.
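The figures quoted above can be checked with a short calculation. It assumes, as the text does, that each iteration addresses one entry per LUT and that the accesses are uniformly distributed over the 512 addresses; the symbols $N_{LUT}$, $f_s$, $T_s$ and $T_{entry}$ are introduced here only for illustration.

```latex
\begin{align*}
N_{\text{gains}} &= N_{\text{LUT}} \cdot 2^{9} = 8 \cdot 512 = 4096, \\
T_s &= \frac{1}{f_s} = \frac{1}{10 \times 10^{6}\ \text{samples/s}} = 10^{-7}\ \text{s}, \\
T_{\text{entry}} &\approx 512 \cdot T_s = 512 \cdot 10^{-7}\ \text{s} = 51.2\ \mu\text{s}.
\end{align*}
```

Under this assumption, each individual gain is refreshed at roughly 19.5 kHz.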
B. Simulation Results 
In order to show the in-band distortion compensation achieved
by the DPD, Fig. 4 shows both the unlinearized (EVM=16.4%)
and linearized (EVM=0.9%) 16-QAM constellations of a
demodulated WIMAX signal. The observed reduction in
scattering is achieved thanks to the NARMA structure, which is
aimed at compensating memory effects. On the
other hand, Fig. 5 and Fig. 6 show the AM/AM characteristic
and the output power spectra, respectively, for both scenarios:
PA with and without DPD. Moreover, the evolution of the
update is shown in Fig. 7, where the Normalized Mean Square
Error (NMSE) (see [7]) gives an idea of the convergence
speed of the adaptive DPD presented here. All these results
have been obtained after 5000 iteration steps, which
correspond to 5000 data samples.
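For readers who wish to reproduce this kind of figure of merit, the following Python sketch computes an NMSE in dB and an RMS EVM in percent. The definitions used are common ones and are an assumption here: the exact NMSE definition is given in [7], and the EVM normalization used for Fig. 4 is not stated in the text. The function names and the toy data are hypothetical.

```python
import numpy as np


def nmse_db(reference, measured):
    """NMSE in dB: error energy normalized by the reference energy."""
    reference = np.asarray(reference, dtype=complex)
    measured = np.asarray(measured, dtype=complex)
    err = measured - reference
    return 10.0 * np.log10(np.sum(np.abs(err) ** 2) / np.sum(np.abs(reference) ** 2))


def evm_percent(ideal_symbols, received_symbols):
    """RMS EVM in percent, normalized to the RMS power of the ideal symbols."""
    ideal = np.asarray(ideal_symbols, dtype=complex)
    received = np.asarray(received_symbols, dtype=complex)
    num = np.mean(np.abs(received - ideal) ** 2)
    den = np.mean(np.abs(ideal) ** 2)
    return 100.0 * np.sqrt(num / den)


# Toy check: a 1 % amplitude error on every symbol gives about 1 % EVM
# and about -40 dB NMSE.
ideal = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j])
received = 1.01 * ideal
print(evm_percent(ideal, received), nmse_db(ideal, received))
```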
V. CONCLUSIONS 
The proposed predictive NARMA digital predistorter
configuration has been shown to compensate PA
nonlinear memory effects. Moreover, the multi-LUT approach
and the LMS based adaptation permit a full implementation of
the adaptive DPD in an FPGA device, thereby reducing the
additional complexity and power consumption derived from an
external adaptation scheme.
ACKNOWLEDGMENT 
This work was partially supported by the Spanish Government (MEC) under
project TEC2005-07985-C03-02 and by the EU network TARGET “Top
Amplifier Research Group in a European Team” (IST-1-507893-NOE).
 
REFERENCES 
[1] L. Ding, G. T. Zhou, D. R. Morgan, Z. Ma, J. S. Kenney, J. Kim,
and C. R. Giardina, “A robust digital baseband predistorter
constructed using memory polynomials,” IEEE Trans. on Comm., vol.
52, pp. 159-165, Jan. 2004.
[2] T. Liu, S. Boumaiza, and F. M. Ghannouchi, “Augmented
Hammerstein Predistorter for Linearization of Broad-band Wireless
Transmitters,” IEEE Trans. on Microwave Theory and Tech., vol. 54, pp.
1340-1349, June 2006.
[3] W.-J. Kim, K.-J. Cho, S. P. Stapleton, and J.-H. Kim, “Piecewise
Pre-Equalized Linearization of the Wireless Transmitter With a Doherty
Amplifier,” IEEE Trans. on Microwave Theory and Tech., vol. 54, pp.
3469-3478, Sep. 2006.
[4] Z.-Y. He, J.-H. Ge, S.-J. Geng, and G. Wang, “An improved
look-up table predistortion technique for HPA with memory effects in
OFDM systems,” IEEE Trans. on Broadcasting, vol. 52, pp. 87-91,
March 2006.
[5] A. Cesari, P. L. Gilabert, E. Bertran, G. Montoro, and J. M. Dilhac, “A
FPGA Based Digital Predistorter for RF Power Amplifiers with Memory
Effects,” in Proc. European Microw. Int. Circuits Conf. (EuMIC),
EMW07, Munich 2007, to be published.
[6] G. Montoro, P. L. Gilabert, E. Bertran, A. Cesari, and D. D. Silveira, “A
New Digital Predictive Predistorter for Behavioral Power Amplifier
Linearization,” IEEE Microw. and Wireless Components Lett., vol. 17,
pp. 448-450, June 2007.
[7] P. L. Gilabert, D. D. Silveira, G. Montoro, M. E. Gadringer, and E.
Bertran, “Heuristic Algorithms for Power Amplifier
Behavioral Modeling,” IEEE Microw. and Wireless Components Lett.,
(in press).
[8] S. Haykin, “Adaptive Filter Theory,” Prentice-Hall, 2001.
 
 
Figure 4. Nonlinearized and linearized WIMAX constellations. 
 
Figure 5. Amplifier AM/AM curves (with and without linearizer). 
 
 
Figure 6. Power spectra (linearized and nonlinearized cases).  
 
 
Figure 7. NMSE evolution with the iteration number. 
