Abstract-This paper presents the design of an adaptive Digital Predistorter (DPD) for Power Amplifier (PA) linearization whose implementation and real-time adaptation can be fully performed in a Field Programmable Gate Array (FPGA). The distinctive characteristics of this adaptive DPD are its straightforward derivation from a Nonlinear Auto Regressive Moving Average (NARMA) PA model and the possibility of implementing it completely in an FPGA, without an additional digital signal processor to perform the DPD adaptation. The adaptive DPD has a NARMA structure that can be implemented by means of Look-Up Tables (LUTs). This configuration results in a multi-LUT implementation in which the LUT contents are directly updated by means of an LMS algorithm. Details on the internal organization of the adaptive DPD and on its linearization capabilities are provided, taking into account the compensation of memory effects.
I. INTRODUCTION
Modern spectrally efficient multilevel modulation schemes are very sensitive to the intermodulation distortion (IMD) that results from nonlinearities in the RF transmitter chain, mainly due to the nonlinear behavior of the PA. Consequently, achieving linear amplification, and thus complying with the linearity requirements specified in communication standards, requires significant back-off levels in the PA. Back-off operation results in power-inefficient amplification, especially when the PA has to handle signals with high peak-to-average power ratios (PAPRs). PA linearizers are a recognized solution to this trade-off between linearity and efficiency.
Among linearizers, digital predistortion (DPD) takes advantage of ever faster digital signal processing devices (already present in the software defined radio subsystems of the transmitter) to perform digital adaptive PA linearization, thus avoiding RF hardware adjustment problems. However, for signals with significant bandwidths, the performance of a DPD linearizer can be degraded by PA memory effects. DPD linearization has been the object of intensive research, generating multiple publications on nonlinear and memory effects compensation ([1]-[5]). Some common solutions design the predistortion function as the composition of a memoryless nonlinearity and a linear time invariant (LTI) block. This generic configuration can be seen as a simplified decomposition of a more general Volterra series function. Among these solutions are DPDs based on memory polynomials, as in [1], and Hammerstein based schemes, as in [2], where the LTI block is usually described by a finite impulse response (FIR) filter. Other solutions directly describe the memoryless nonlinear block by means of a Look-Up Table (LUT), while memory compensation is achieved by adding a FIR filter as in [3], by considering a two-dimensional LUT as in [4], or by using a set of LUTs associated with delayed samples of the input signal as in [5].
DPD solutions are commonly validated in laboratory test benches formed by the closed-loop interconnection of a PC performing the DPD function, a vector signal generator (VSG), the PA and a vector signal analyzer (VSA). Thus little attention is paid to adaptation issues. On the other hand, solutions in which the DPD function is carried out in an FPGA, such as the one presented in [5], perform the adaptation process in a PC or possibly in a DSP, which introduces an additional power-hungry device that reduces the overall power efficiency of the system. This paper presents an adaptive DPD for PA linearization whose implementation and real-time adaptation can be fully performed in a Field Programmable Gate Array (FPGA). The DPD presented here is based on the predictive Nonlinear Auto Regressive Moving Average (NARMA) DPD presented in [6], which can be implemented by means of Look-Up Tables (LUTs). The purpose and main contribution of this paper is the development of an adaptation algorithm that permits real-time updates of the contents of the LUTs that make up the multi-LUT structure of the predictive-NARMA DPD. Therefore, unlike in [6], it is possible to perform real-time adaptation in an FPGA device without needing a DSP or any other kind of advanced coprocessor.
1-4244-0749-4/07/$20.00 ©2007 IEEE.
[7]. Details on the stability analysis of this NARMA structure, which is needed to ensure the overall stability of the DPD, can also be found in [6]. From (1) it is possible to obtain the expression defining the PA input. We now consider \( y_D \) as the desired output, that is, the PA output after linearization. This linear output can be defined as the transmitted signal \( x_T \) amplified by a linear gain \( G_{linear} \). As can be observed in Fig. 1, if no DPD is considered, \( x_A(k) = x_T(k) \). On the other hand, with the DPD, the PA input is given by the predistortion function (2). It is now possible to rewrite the predistortion function in (2) as a more convenient DPD expression, in terms of the (delayed) complex inputs \( x_A(k-\tau_i) \) and desired outputs \( y_A(k-\tau_j) \), each multiplied by its corresponding LUT complex gain (\( G_{f_i} \), \( G_{g_j} \)),
where \( G_{f_0}^{inverse} \) is the inverse of \( G_{f_0} \), and the intermediate variable \( u_A(k) \) (see Fig. 3) is described as
As the LUT gains contributing to each predistortion output sample are updated by means of the LMS algorithm, the PA output converges to the desired output after a transient period in which all LUT gains are continuously being updated, thereby achieving the desired linear amplification.
B. Updating the Multi-LUT Gains
The update of the complex LUT gains is performed by means of the complex LMS algorithm [8], described by
\[ e(k) = y_A^{\mathrm{NARMA}}(k) - y_A^{\mathrm{Measured}}(k) \]
\[ \Delta G_{f_i}\big(x_A(k-\tau_i)\big) = \mu_f \, x_A(k-\tau_i) \, e(k) \]
Figure 3. Multi-LUT structure.
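The complex LMS update of a single LUT gain can be illustrated with a minimal sketch. The function name `lms_gain_update`, the step size `mu` and the test values are hypothetical. The error sign (measured minus model) and the conjugate on the input follow the usual textbook form of complex LMS, which is an assumption here and not necessarily the exact notation of [8] or of this paper.

```python
def lms_gain_update(gain, x, measured, mu=0.1):
    """One complex LMS step for the LUT entry addressed by sample x.

    Textbook convention (an assumption): error = measured - model,
    and the correction is proportional to error * conj(x).
    """
    model = gain * x                      # branch output with the current gain
    error = measured - model              # complex LMS error
    new_gain = gain + mu * error * x.conjugate()
    return new_gain, error


# Toy check: recover a fixed, hypothetical complex gain from repeated samples.
true_gain = 0.8 - 0.2j
x = 0.6 + 0.3j
gain = 1 + 0j                             # LUT entry initialized to 1
for _ in range(200):
    gain, error = lms_gain_update(gain, x, measured=true_gain * x)
```

With these values the gain error shrinks by a factor of 1 - mu*|x|^2 at every step, so the step size must be small enough to keep that factor between -1 and 1.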
The resulting DPD configuration is depicted in Fig. 3, where the predictive-NARMA structure is mapped into an FPGA as a set of multiple LUTs. Each nonlinear function (\( f_0^{-1} \), \( f_i \) and \( g_j \)) in (2) is therefore implemented with a LUT. This configuration permits an FPGA implementation that relaxes the computational effort of evaluating the nonlinear functions. It also permits scalability, that is, increasing or reducing the number of delays considered in the predistorter structure.
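A minimal software sketch of such a multi-LUT evaluation is given below. It assumes that each LUT is addressed by the quantized magnitude of its input sample and stores one complex gain per entry. The table depth, the full-scale amplitude and all function names are hypothetical choices for illustration, not details taken from the FPGA design.

```python
LUT_SIZE = 64          # assumed number of entries per LUT
MAX_AMPLITUDE = 1.0    # assumed full-scale input amplitude


def lut_index(sample, lut_size=LUT_SIZE, max_amplitude=MAX_AMPLITUDE):
    """Map the magnitude of a complex sample to a LUT address."""
    level = abs(sample) / max_amplitude
    index = int(level * lut_size)
    return min(index, lut_size - 1)   # clip full-scale and overdriven samples


def apply_lut(lut, sample):
    """Evaluate one nonlinear branch as sample * G(|sample|)."""
    return sample * lut[lut_index(sample)]


def multi_lut_sum(luts, samples):
    """Add the branch outputs, one LUT per (delayed) sample."""
    return sum(apply_lut(lut, s) for lut, s in zip(luts, samples))
```

Adding or removing a delay in this sketch only means adding or removing one table and one sample from the two lists, which mirrors the scalability argument above.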
III. LOOK-UP TABLES ADAPTATION
A. Digital Predistorter Operation
The adaptive process followed by this DPD to linearize the PA consists of the following steps:
i) First, it is necessary to search for the best sparse delays defining the PA model in (1) and to perform the stability test (bounds given by the small-gain theorem, as explained in [6]) in order to avoid possible instabilities due to the recursive part of the DPD structure.
ii) Once we have identified the best sparse delays and ensured that the nonlinear functions related to the recursive part are consistent, we map the DPD structure in Fig. 3 into the FPGA. The LUT gains are initially filled with 0's or 1's.
iii) Finally, we run the DPD process and at every iteration step the following actions are performed in parallel in the FPGA device:
Applying the algorithm defined in (1), but implemented with LUTs as defined in (3), an output sample of the PA NARMA model (\( y_A^{\mathrm{NARMA}}(k) \)) is obtained in order to create the LMS error.
Applying the algorithm defined in (4), a new DPD output sample (\( x_A(k) \)) is obtained.
All complex LUT gains involved in the calculation of the predistortion output sample are updated by means of the LMS algorithm.
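The model-output and LUT-update actions of step iii) can be sketched in software as a sequential loop. The PA below is a made-up toy (gain compression plus one memory tap), the model uses only two feed-forward LUTs, and the predistorter output of (4) is not reproduced, so this is an illustration under stated assumptions and not the FPGA design. The table depth, the step size and all names are hypothetical, and the LMS sign convention is the textbook one.

```python
import cmath
import random

LUT_SIZE = 16  # assumed table depth
MU = 0.5       # assumed LMS step size


def lut_index(sample):
    """Address a LUT with the quantized magnitude of the sample."""
    return min(int(abs(sample) * LUT_SIZE), LUT_SIZE - 1)


def toy_pa(x_now, x_prev):
    """Stand-in for the measured PA: compression plus one memory tap."""
    return x_now * (1.0 - 0.3 * abs(x_now) ** 2) + 0.1 * x_prev


def model_output(luts, taps):
    """LUT-based model output: each tap times its addressed gain."""
    return sum(x * lut[lut_index(x)] for lut, x in zip(luts, taps))


def adapt(samples, luts):
    """Per sample: model output, LMS error, update of every addressed entry."""
    x_prev = 0j
    for x_now in samples:
        taps = (x_now, x_prev)
        error = toy_pa(x_now, x_prev) - model_output(luts, taps)
        for lut, x in zip(luts, taps):    # same error drives all branches
            lut[lut_index(x)] += MU * error * x.conjugate()
        x_prev = x_now
    return luts


def mean_square_error(samples, luts):
    """Average squared model error with the LUT contents frozen."""
    x_prev, total = 0j, 0.0
    for x_now in samples:
        err = toy_pa(x_now, x_prev) - model_output(luts, (x_now, x_prev))
        total += abs(err) ** 2
        x_prev = x_now
    return total / len(samples)


rng = random.Random(0)
samples = [cmath.rect(rng.random(), rng.uniform(-cmath.pi, cmath.pi))
           for _ in range(5000)]
initial_luts = [[1 + 0j] * LUT_SIZE, [0j] * LUT_SIZE]  # filled with 1's and 0's
adapted_luts = adapt(samples, [list(lut) for lut in initial_luts])
mse_before = mean_square_error(samples, initial_luts)
mse_after = mean_square_error(samples, adapted_luts)
```

In hardware the inner loop over branches would run concurrently, since each branch touches a different table. The sequential loop here computes the error once per sample before any entry is modified, which gives the same result.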
IV. IMPLEMENTATION AND PERFORMANCE RESULTS

A. Implementation Issues
The multi-LUT structure can be easily implemented in a commercial FPGA board such as the Nallatech XtremeDSP, which consists of a Xilinx Virtex-4 XC4VSX35 connected to two analog-to-digital and two digital-to-analog converters.
B. Simulation Results
In order to show the in-band distortion compensation achieved by the DPD, Fig. 4 shows both the unlinearized (EVM = 16.4%) and linearized (EVM = 0.9%) 16-QAM constellations of a demodulated WiMAX signal. The observed reduction in scattering is achieved thanks to the NARMA structure, which is designed to compensate for memory effects. Fig. 5 and Fig. 6 show the AM/AM characteristic and the output power spectra, respectively, for both scenarios: PA with and without DPD. Moreover, the evolution of the update is shown in Fig. 7, where the Normalized Mean Square Error (NMSE) (see [7]) gives an idea of the convergence speed of the adaptive DPD presented here. All these results have been obtained after 5000 iteration steps, which correspond to 5000 data samples.
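The two figures of merit quoted above can be computed as in the following sketch. It assumes the common definitions: EVM as the RMS error vector relative to the RMS reference constellation, in percent, and NMSE as the error energy normalized by the measured signal energy, in dB. The exact normalization used in [7] may differ, and the function names are hypothetical.

```python
import math


def evm_percent(received, reference):
    """RMS error vector magnitude relative to the reference, in percent."""
    err = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    ref = sum(abs(s) ** 2 for s in reference)
    return 100.0 * math.sqrt(err / ref)


def nmse_db(measured, estimated):
    """Normalized mean square error between two sequences, in dB."""
    err = sum(abs(m - e) ** 2 for m, e in zip(measured, estimated))
    ref = sum(abs(m) ** 2 for m in measured)
    return 10.0 * math.log10(err / ref)


# Usage with four ideal corner symbols and a 10 % amplitude error.
reference = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
received = [1.1 * s for s in reference]
evm = evm_percent(received, reference)
nmse = nmse_db(reference, [0.9 * s for s in reference])
```

Both functions assume nonzero error and reference energy; a practical implementation would guard the division and the logarithm.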
V. CONCLUSIONS
The proposed predictive NARMA digital predistorter configuration has been shown to compensate PA nonlinear memory effects. Moreover, the multi-LUT approach and the LMS-based adaptation permit a full implementation of the adaptive DPD in an FPGA device, thus avoiding the additional complexity and power consumption that an external adaptation scheme would entail.
