Abstract-In this paper, an improved digital-stage design of a mixed-signal Cartesian feedback loop for a zero-IF WCDMA transmitter is presented. The transmitter architecture consists of an analog stage, which includes filters, an I/Q modulator and a feedback I/Q demodulator, and a digital stage, which corrects the phase misalignment around the loop. We propose an optimized CORDIC design for the digital part in order to improve the system operating frequency without increasing the silicon area. ASIC synthesis shows that a not fully pipelined CORDIC architecture reaches 230 MHz with a system power consumption under 4.3 mW, which is half that of a fully analog system.
I. INTRODUCTION
The third-generation wireless communication standard WCDMA uses non-constant envelope modulation techniques to increase spectral efficiency at high data rates [1]. These modulations require a highly linear radio-frequency (RF) power amplifier (PA). However, power efficiency is maximized when the PA operates in its non-linear region. The best solution consists in designing a moderately linear PA and then applying an adequate linearization technique. The amplifier then operates as close as possible to saturation, maximizing power efficiency, while the linearization system maximizes spectral efficiency. Many analog and digital methods have been proposed to reduce the effects of nonlinearities, such as pre-distortion, post-distortion, feedback and feed-forward techniques [2]. Among these, the Cartesian feedback loop (CFB) [3], an alternative feedback technique, is an attractive solution for two reasons: first, it automatically compensates for all process variations; second, its linearization is applied to all components in the loop. Nevertheless, this technique suffers from a practical shortcoming: it needs a phase corrector to compensate for the delay around the loop. Furthermore, an analog implementation of the phase corrector is difficult to realize and very costly in area [4].
In this paper, we present a detailed study that improves on the LUT-based solution discussed in [5] and propose a new optimized CORDIC solution that provides an accuracy of 1° for the phase estimation process, reaches the desired frequency and reduces energy consumption. Because of the increasing demand for cost reduction, a zero-intermediate-frequency (Zero-IF) architecture, which avoids the use of an external filter, has been chosen. Delegating the phase rotation adjustment to a digital stage provides flexibility, higher integration and a smaller area than a fully analog architecture [3].
The paper is organized as follows. Section II presents the Zero-IF transmitter with the CFB linearization loop and the design considerations. Section III deals with the implementation study of the digital stage. Section IV describes the proposed architecture, and Section V gives implementation results such as system operating frequency, occupied area and power consumption for the CORE65LPSVT ASIC target.
II. MIXED CARTESIAN FEEDBACK TRANSMITTER

A. Cartesian Linearization Technique
The proposed linearization architecture, based on a digital CFB implementation, is made up of both analog and digital building blocks, as shown in Fig. 1. The quadrature baseband signals (I_return, Q_return) are directly upconverted to the RF frequency (1.95 GHz) by mixers associated with a local oscillator [6]. The resulting RF signal is then amplified by the power amplifier. In the feedback path, the PA output is attenuated, down-converted and filtered. After the analog signals are converted to the digital domain by analog-to-digital converters (ADC), a phase adjustment is applied to I_FB and Q_FB in order to cancel the phase rotation around the loop. The feedback signals are subtracted from the input quadrature components to provide the return signals I_return and Q_return. These signals contain the forward-path nonlinearity, which the loop action subtracts from the input signals. Thus, the input I/Q signals are pre-distorted to provide a linearized PA output. Implementing the phase estimation and the vector rotation in the digital domain relaxes the linearity and in-band noise constraints compared to a fully analog circuit. With an optimized and highly integrated digital stage, we can reach lower power consumption than an analog design.
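The loop action described above can be illustrated with a small behavioural model. This is only a sketch under strong assumptions that are not taken from the paper: the PA is a hypothetical memoryless soft-saturation function `pa`, the loop filter is reduced to a discrete integrator with step `mu`, the input is a constant complex sample x = I + jQ, and the attenuator exactly cancels the PA small-signal gain.

```python
import cmath

def pa(u):
    """Hypothetical memoryless PA: unit small-signal gain, soft saturation."""
    return u / (1.0 + abs(u) ** 2) ** 0.5

def cfb_loop(x, loop_phase, phase_correction, mu=0.5, steps=200):
    """Iterate the loop for a constant input x = I + jQ; return the PA output."""
    u = 0j  # return signal accumulated by the (assumed) integrating loop filter
    for _ in range(steps):
        fb = pa(u) * cmath.exp(1j * loop_phase)           # feedback path rotates the vector
        fb_corr = fb * cmath.exp(-1j * phase_correction)  # digital stage undoes the rotation
        u += mu * (x - fb_corr)                           # subtraction, then loop filter
    return pa(u)

x = 0.6 + 0.2j
linearized = cfb_loop(x, loop_phase=0.8, phase_correction=0.8)
print(abs(linearized - x))  # residual error once the loop has settled
```

When the phase correction matches the loop rotation, the fixed point of this model is a PA output equal to the input, so the saturation is compensated. If `phase_correction` is left at zero, the fixed point becomes a rotated copy of x, which is the misalignment the digital stage is meant to remove.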
B. CFB Digital Stage and Design Consideration
The baseband loop filters in the feedback path result in a delay and a symbol rotation after the input and feedback signals are subtracted [7]. Phase variations are cancelled by using the circular transformation given in (1), where θ represents the phase correction value.
θ is calculated by comparing the forward-path phase with the feedback-path phase. Two architectures are evaluated for the implementation of the circular transform. The first uses lookup tables (LUT) and a costly multiplier [5]. Although this solution does not introduce a large delay into the loop, it is very expensive in terms of area. In this paper, we focus on the second architecture, which is based on the coordinate rotation digital computer (CORDIC) algorithm [8]. This iterative algorithm requires less area than a high-complexity multiplier when the data path exceeds 10 bits. Because the pipelined CORDIC introduces latency into the CFB loop, a trade-off between area, latency and throughput arises. Fine tuning of the input variables when implementing this architecture leads to an optimal solution with 1° accuracy. For stability, the delay in the loop is limited by the period of the WCDMA data (T_chip) [9]. The operating-frequency threshold of the digital stage is set to 220 MHz with respect to the DAC and ADC characteristics (Fig. 1).
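As a reference for the circular transformation, the following floating-point sketch estimates θ and rotates the feedback vector with sine and cosine products, in the spirit of the multiplier-based first architecture. The function names and the sign convention (counter-clockwise rotation by θ, with θ equal to the forward phase minus the feedback phase) are assumptions made for illustration; the paper's equation (1) may use a different convention.

```python
import math

def phase_correction(i_in, q_in, i_fb, q_fb):
    """Assumed definition: forward-path phase minus feedback-path phase."""
    return math.atan2(q_in, i_in) - math.atan2(q_fb, i_fb)

def rotate_iq(i_fb, q_fb, theta):
    """Rotate (i_fb, q_fb) counter-clockwise by theta radians using four products."""
    c, s = math.cos(theta), math.sin(theta)
    return i_fb * c - q_fb * s, i_fb * s + q_fb * c

# Feedback vector: half the amplitude of the input, rotated by -0.7 rad.
i_in, q_in = 1.0, 0.0
i_fb, q_fb = rotate_iq(0.5, 0.0, -0.7)
theta = phase_correction(i_in, q_in, i_fb, q_fb)
print(rotate_iq(i_fb, q_fb, theta))  # realigned with the input direction
```

The four products per sample are what make this form costly in hardware and motivate the CORDIC alternative.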
III. DIGITAL CFB IMPLEMENTATION
As already mentioned, the main task of the digital stage is to perform the vector rotation. Beforehand, the angular deviation has to be estimated. This angle is the difference between the phase of the feedforward channel I/Q and that of the feedback channel I_FB/Q_FB. Therefore, the digital CFB architecture is organized into three distinct blocks, as shown in Fig. 2.
1. Phase estimation: to estimate the angular distortions in the two paths.
2. Vector rotation: to compensate for the phase error.
3. Subtraction: to predistort the input signals.
A. Phase Estimation
In the phase estimation process, it is important to note that the phase subtraction must be done "modulo 2π" so that the angle applied to the next block (vector rotation) keeps the same range of variation. The phase estimation block is therefore divided into two functional sub-units:
• Phase estimation for both paths: computing the atan function.
• Modulo function: to keep the result of the phase subtraction within a fixed 2π-wide range.
1) "atan" function implementation
The atan function can be implemented in several ways. The most straightforward method, discussed in [5], uses a lookup table. This solution is oversized and very costly in silicon area in comparison with alternatives such as the CORDIC algorithm. This well-known iterative algorithm was first described by Jack E. Volder in 1959 [8]. It has two operating modes and allows trigonometric, hyperbolic and some linear functions to be calculated using only basic operations, with respect to (2). It was subsequently improved in order to reduce the computing cost and to ease implementation on embedded targets. For our application, we use the rotation mode (RM) to perform the vector rotation and the vectoring mode (VM) to compute the atan function (Fig. 2).
The atan computation takes the two coordinates of the vector as input variables, initializes z_0 to zero and retrieves the corresponding phase value after nine iterations.
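The vectoring-mode computation can be sketched as follows. The recurrence is the textbook CORDIC iteration, which we assume corresponds to (2); the sketch uses floating-point arithmetic, whereas a hardware version would use fixed-point words and wired shifts. The name `cordic_atan` and the restriction to x > 0 (no quadrant pre-rotation) are assumptions made for illustration.

```python
import math

N_ITER = 9  # number of iterations quoted in the text
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N_ITER)]  # elementary angles

def cordic_atan(x, y):
    """Vectoring mode: drive y towards zero and accumulate the angle in z.

    Assumes x > 0; a complete design would first fold the other quadrants.
    """
    z = 0.0  # z_0 initialized to zero
    for i in range(N_ITER):
        d = -1.0 if y >= 0.0 else 1.0  # rotate towards the x axis
        # Multiplying by 2**-i stands in for a right shift by i bits.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return z

theta = math.radians(37.0)
estimate = cordic_atan(math.cos(theta), math.sin(theta))
print(math.degrees(abs(estimate - theta)))  # angular error in degrees
```

In this idealized model the residual angle after nine iterations is bounded by the last elementary angle, atan(2^-8), which is well inside the 1° target; finite word length in a real datapath consumes part of that margin.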
A careful study of the different hardware implementations of the CORDIC algorithm described in [10] has shown that a fully pipelined design meets the performance criterion. This architecture, shown in Fig. 3, has low computational complexity and reaches a high operating frequency. We note that the calculation accuracy depends only on the number of CORDIC iterations; nine iterations were needed to achieve the same accuracy, 1°, as the LUT-based solution described in [5].
2) "Modulo" function implementation
The subtraction result must be normalized to avoid overflow due to phase computing before it is fed to the next stage. Consequently, the "Modulo" function consists in calculating the remainder of the Euclidean division of the wanted angle by K*2π. An efficient implementation of this function can be described by the following algorithm. First, a sign test is performed to benefit from the symmetry property of this function. Then the angle is tested: if it has exceeded a full circle turn, 2π is subtracted from it and the test continues; otherwise the angle is retained as the output result.
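The algorithm above can be sketched as follows. The function name `wrap_phase` and the output range, the open interval (-2π, 2π), are assumptions: the sketch applies the sign test, works on the magnitude, and restores the sign at the end, which is one plausible reading of the symmetry property mentioned in the text.

```python
import math

TWO_PI = 2.0 * math.pi

def wrap_phase(angle):
    """Reduce an angle using only a sign test, comparisons and subtractions."""
    negative = angle < 0.0                 # sign test: exploit odd symmetry
    magnitude = -angle if negative else angle
    while magnitude >= TWO_PI:             # has the angle exceeded a full turn?
        magnitude -= TWO_PI                # subtract 2*pi and test again
    return -magnitude if negative else magnitude

print(wrap_phase(TWO_PI + 0.5), wrap_phase(-(TWO_PI + 0.5)))
```

No divider is needed: the reduction costs one comparator and one subtractor per pass through the test.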
B. Vector Rotation
The CORDIC algorithm performs the vector rotation without allocating multiplier resources. The rotation is computed using a series of specific incremental angles whose sum is equal to the desired angle of rotation. Each elementary rotation is performed using only shift and add operations. The same pipelined architecture is reused: a simple initialization of its inputs with the vector to rotate, namely I_FB/Q_FB, and the angle to perform is enough. Nine iterations were required to reach the target accuracy.
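A floating-point sketch of the rotation mode is given below. It uses the textbook CORDIC recurrence with nine iterations; the name `cordic_rotate` is hypothetical, and the final division by the constant CORDIC gain is a convenience of this model, since the text does not state how the design compensates for that gain.

```python
import math

N_ITER = 9
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N_ITER)]
# Each micro-rotation stretches the vector by sqrt(1 + 2**(-2i)).
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_ITER))

def cordic_rotate(x, y, theta):
    """Rotation mode: rotate (x, y) by theta radians (about |theta| <= pi/2)."""
    z = theta  # residual angle still to be applied
    for i in range(N_ITER):
        d = 1.0 if z >= 0.0 else -1.0      # rotate so that z goes towards zero
        # Multiplying by 2**-i stands in for a right shift by i bits.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return x / GAIN, y / GAIN              # assumed gain compensation

i_rot, q_rot = cordic_rotate(1.0, 0.0, math.radians(30.0))
print(i_rot, q_rot)  # close to (cos 30°, sin 30°)
```

Apart from the gain term, the loop body contains only additions, sign tests and power-of-two scalings, which is why the same datapath can serve both operating modes.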
C. Subtraction
The subtraction function is simple to implement digitally. Indeed, it is only necessary to take the two's complement of the second operand and to use an adder.
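The operation can be modelled on fixed-width words as follows. The 12-bit default width and the function name are hypothetical; the text only indicates that the data path exceeds 10 bits.

```python
def twos_complement_sub(a, b, width=12):
    """Compute a - b on `width`-bit words using only bit inversion and addition."""
    mask = (1 << width) - 1
    neg_b = (~b + 1) & mask               # two's complement of the second operand
    result = ((a & mask) + neg_b) & mask  # plain adder; the carry out is discarded
    if result & (1 << (width - 1)):       # reinterpret the word as a signed value
        result -= 1 << width
    return result

print(twos_complement_sub(100, 30), twos_complement_sub(30, 100))
```

Like the hardware it models, the function wraps around silently when the true difference does not fit in the chosen width.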
IV. NOT FULLY-PIPELINED CORDIC DESIGN
ASIC synthesis results using the [...] technology have shown that with the [...] architecture we reach an operating frequency of [...] MHz. However, this result does not suit the power criterion. A smart alternative which [...] meets system specifications consists in [...] the pipelined architecture by reducing the number of registers used at each output stage. We notice that this should not decrease the operating frequency under [...] in order to meet the DAC and ADC constraints. Starting from [...] pipelined architecture, a register is added after [...] unit. An FPGA target was used to assess the [...] the implemented modules. By trying all possible configurations, [...] the best is to reduce the number of registers to two. The choice of these registers [...] CORDIC's operating mode. When [...] mode, the best configuration is to [...] different registers at each output stage [...]. However, it is more appropriate [...] performing the CORDIC's rotation [...]
In the next section, ADS simulations and ASIC synthesis results of the whole system are discussed.
V. SYSTEM VALIDATION AND A
All results given in this section were obtained by performing ASIC synthesis.
A. Linearization Technique Validation
All building blocks making up the [...] were designed and simulated in a hardware description language (HDL) with ModelSim® and have [...] alone. Now, we are able to realize [...] in order to validate the overall architecture [...] Software have been used. Fig. 6 and [...] spectrum of the PA with and without [...] same output power. This last exhibits [...] distortions on the adjacent channel [...] channel power ratio (ACPR) [...] 22 dB at 5 MHz from the carrier). [...] mask defined by the standards UMTS [...] a Zero-IF architecture is out of spec[...] the output spectrum of the CFB loop [...]
B. Hardware Implementation Results
Simulation results of the optimized CORDIC [...] are shown in Fig. 8. The predistorted signal [...] floating simulation is illustrated by a dotted line [...] that obtained using an HDL implementation is [...] continuous line, as shown in Fig. 8. Table I summarizes ASIC synthesis results [...] power technology (CORE65LPSVT) in [...] occupation and energy consumption for our [...] solutions. Two power types are presented: the [...], which depends on the working frequency, and the [...], whose consumption depends only on the occupied area. ASIC synthesis of the proposed architecture [...] frequency of 232 MHz is reached with [...] consumption than the fully pipelined solution. We note that the power consumption of the implemented [...] reduced by 35 % with respect to system consumption [...] also that the obtained surface area occupation [...] than the fully pipelined solution (Fig. 9).
