This paper proposes a novel scaling-free CORDIC algorithm operating in vectoring mode, which computes the absolute magnitude and phase angle of an input vector. With this algorithm, the micro-rotations of the vector are unidirectional and completely scaling free. The range of convergence is extended to cover the entire coordinate space without increasing hardware complexity. Further, a 16-bit scaling-free vectoring CORDIC architecture based on the proposed algorithm is synthesized on a Xilinx Virtex-5 FPGA using the Verilog hardware description language. Synthesis results show a throughput of one result per clock cycle at a maximum operating frequency of 243.55 MHz, with very low dynamic power consumption.
INTRODUCTION
CORDIC is an acronym for COordinate Rotation DIgital Computer. It is a simple and hardware-efficient algorithm for the implementation of various trigonometric functions. It uses simple shift, add, subtract and table look-up operations instead of the calculus-based methods traditionally used, such as polynomial or rational function approximation. Trigonometric functions are used in many applications, including real-time digital signal processing (DSP), wireless communication, robotics, computer graphics, navigation and astronomy. In the field of DSP, CORDIC is used to calculate various transforms such as the fast Fourier transform (FFT), discrete sine/cosine transform (DST/DCT), discrete Hartley transform (DHT) and Hough transform (HT). In digital communication, it is used to generate signals during modulation and to estimate phase and frequency parameters during demodulation. Its efficiency in matrix computation makes it an attractive choice for applications involving QR decomposition, singular value decomposition (SVD) and the estimation of eigenvalues and eigenvectors. In antenna systems it is used for estimation of direction of arrival (DOA) and in multiple-input multiple-output (MIMO) detectors. It is also used in 3D graphics as a vector interpolator [3]. CORDIC has found a significant place even in upcoming technologies such as cognitive radio and software defined radio (SDR) [8].
The CORDIC algorithm was first proposed by [9] on the basis of Givens rotation of vectors in two-dimensional space. Since then it has been subject to continuous development, through algorithmic changes and architectural variations, to achieve higher throughput, lower hardware complexity and lower latency. The path of development of CORDIC is summarized in [6].
The fundamental idea behind the CORDIC is to carry out a sequence of rotations on two-dimensional vectors using a series of specific incremental rotation angles selected such that each is performed by a shift and add operation. It is relatively simple in design and VLSI implementation, as no multipliers are required.
The CORDIC algorithm can operate in two modes: rotation and vectoring. In rotation mode, the objective is to rotate a given vector from its initial position to a final position at the target angle θ through a series of iterations. The rotation decision at each iteration is made so as to reduce the magnitude of the residual angle to zero. The rotation trajectory can be linear, circular or hyperbolic, depending on the requirement. In vectoring mode, the aim is to find the magnitude and argument of a vector. The vector is rotated from its initial position so as to reduce its y component to zero. The magnitude of the vector is then held in the x component, and the angle accumulated over these rotations is held in the z register, representing the phase.
As indicated by [6], a lot of research has been done on reducing the complexity and increasing the performance of the rotation mode of CORDIC, using various techniques and hardware architectures. Yet there is ample scope for optimization of the vectoring mode. The conventional vectoring-mode CORDIC has the major drawback of generating a scale factor that must be compensated using extra circuitry, which increases the hardware cost significantly. This paper proposes a novel algorithm for the vectoring mode of CORDIC that is completely scaling free and includes a provision for skipping iterations that are not needed, so as to speed up the operation. Unlike the conventional vectoring CORDIC, the rotation of the vector in the proposed algorithm is always in one direction, and the convergence range extends over the entire coordinate space. The algorithm also eliminates the need for a lookup table, which saves memory area.
The rest of the paper is structured as follows: Section 2 briefly presents an overview of the CORDIC algorithm. The proposed algorithm is discussed in Section 3. Section 4 presents the architecture for the proposed algorithm. Section 5 details the FPGA implementation and comparison, and Section 6 concludes the paper.
CORDIC ALGORITHM OVERVIEW
The basic principle of the CORDIC algorithm is to iteratively rotate a vector in a plane using simple shift and add operations, in order to compute either the phase and magnitude of a vector or the sine and cosine of an angle along a circular trajectory. Various other trigonometric, hyperbolic and linear functions can also be computed efficiently on the basis of this principle.
Conventional CORDIC algorithm
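The conventional CORDIC algorithm [9] is derived from the general equation of vector rotation. If a vector V with components $(X_i, Y_i)$ is rotated through an angle $\phi_i$, a new vector V' with components $(X_{i+1}, Y_{i+1})$ is formed. In matrix form, the vector after this microrotation can be represented as:
$$\begin{bmatrix} X_{i+1} \\ Y_{i+1} \end{bmatrix} = \begin{bmatrix} \cos\phi_i & -\sin\phi_i \\ \sin\phi_i & \cos\phi_i \end{bmatrix} \begin{bmatrix} X_i \\ Y_i \end{bmatrix} \tag{2.1}$$
which can be rearranged as:
$$\begin{bmatrix} X_{i+1} \\ Y_{i+1} \end{bmatrix} = \cos\phi_i \begin{bmatrix} 1 & -\tan\phi_i \\ \tan\phi_i & 1 \end{bmatrix} \begin{bmatrix} X_i \\ Y_i \end{bmatrix} \tag{2.2}$$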
The multiplication by the tangent term can be avoided if the microrotation angle $\phi_i$ is restricted so that $\tan\phi_i = 2^{-i}$, i.e.
$$\phi_i = \arctan(2^{-i}) \tag{2.3}$$
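In digital hardware, multiplication by $2^{-i}$ is a simple binary shift operation. If the microrotations are performed with a variable direction $d_i$ at every iteration, equation 2.2 can be rewritten as:
$$X_{i+1} = K_i \left( X_i - d_i\, 2^{-i}\, Y_i \right), \qquad Y_{i+1} = K_i \left( Y_i + d_i\, 2^{-i}\, X_i \right) \tag{2.4}$$
where $K_i = \cos(\arctan(2^{-i})) = 1/\sqrt{1 + 2^{-2i}}$ and $d_i = \pm 1$. The product of the $K_i$'s represents the K factor or scaling factor:
$$K = \prod_{i} K_i = \prod_{i} \frac{1}{\sqrt{1 + 2^{-2i}}} \tag{2.5}$$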
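As a numerical illustration of the scaling factor, the following minimal sketch evaluates the product of the $K_i$ terms for an arbitrary set of iteration indices. The function name `scale_factor` and its `indices` parameter are introduced here for illustration only and do not come from the paper.

```python
import math

def scale_factor(indices):
    """Product of K_i = cos(arctan(2^-i)) over the given iteration indices."""
    k = 1.0
    for i in indices:
        k *= math.cos(math.atan(2.0 ** -i))
    return k

# All iterations 0..15 executed exactly once: K is approximately 0.60725.
k_full = scale_factor(range(16))

# Skipping one iteration (here i = 3) changes the product, so a single
# fixed compensation constant is no longer valid.
k_skip = scale_factor(i for i in range(16) if i != 3)

print(k_full, k_skip)
```

The second call shows why skipping or repeating iterations in the conventional algorithm makes the scale factor data dependent.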
The direction of each rotation is defined by $d_i$, and the sequence of all $d_i$'s determines the final vector. This leads to a third variable $Z_{i+1}$, as shown in 2.6, which acts as an angle accumulator and keeps track of the angle already rotated:
$$Z_{i+1} = Z_i - d_i \arctan(2^{-i}) \tag{2.6}$$
The input angle of rotation $\phi$ is achieved through the summation of all microrotations $\phi_i$ with the appropriate direction:
$$\phi = \sum_{i=0}^{w-1} d_i \arctan(2^{-i}) \tag{2.7}$$
where w is the word length and the values of $\arctan(2^{-i})$ are pre-calculated and stored in a lookup table during implementation.
The conventional CORDIC algorithm can be made to operate in either rotation or vectoring mode, depending on how the direction of microrotation is determined. In rotation mode, the coordinates of the vector and an angle of rotation are given, and the coordinates of the vector after rotation through that angle are computed. The rotation decision at each iteration is made so as to reduce the magnitude of the residual angle in the angle accumulator to zero, and thus $d_i$ is computed from the sign of $Z_i$: $d_i = +1$ if $Z_i \geq 0$ and $d_i = -1$ otherwise.
In vectoring mode, by contrast, the coordinates of the vector are given and the magnitude and angular argument of the vector are computed. In this mode the y component is driven to zero, and it is the sign of $Y_i$ that decides the direction of microrotation: $d_i = -1$ if $Y_i \geq 0$ and $d_i = +1$ otherwise.
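The two modes described above can be sketched in floating point as follows. This is a minimal illustration of the conventional algorithm with the standard sign conventions, not the hardware design of this paper. The function name `cordic_conventional`, the `mode` strings and the final division by the accumulated gain are choices made for this sketch.

```python
import math

def cordic_conventional(x, y, z, mode, n_iter=16):
    """Conventional circular CORDIC.

    mode = "rotation":  rotate (x, y) by the angle z (radians).
    mode = "vectoring": drive y to zero; x tends to the magnitude, z to the phase.
    Vectoring assumes x > 0 so that the input lies within the convergence range.
    """
    # Lookup table of elementary angles arctan(2^-i).
    atan_table = [math.atan(2.0 ** -i) for i in range(n_iter)]
    # Gain accumulated by the unscaled shift-add iterations (equals 1/K).
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    for i in range(n_iter):
        if mode == "rotation":
            d = 1 if z >= 0 else -1      # reduce the residual angle
        else:
            d = -1 if y >= 0 else 1      # reduce the y component
        shift = 2.0 ** -i                # a wired shift in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * atan_table[i]

    # Scale factor compensation: the extra step a scaling-free design avoids.
    return x / gain, y / gain, z

# Rotation mode: rotate the unit vector (1, 0) by 30 degrees.
print(cordic_conventional(1.0, 0.0, math.pi / 6, "rotation"))
# Vectoring mode: magnitude and phase of the vector (3, 4).
print(cordic_conventional(3.0, 4.0, 0.0, "vectoring"))
```

The final division by `gain` stands in for the compensation circuitry discussed in the text.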
In the conventional CORDIC algorithm, if some iterations are skipped or repeated to achieve a larger convergence range or faster convergence, the scale factor varies and requires extra circuitry and clock cycles for its compensation. The algorithm thus suffers from slow speed, a small convergence range and bulky scale factor compensation circuitry.
Scaling free CORDIC Algorithm for Rotation mode
To remove the effect of the variable scaling factor and the associated complex circuitry, scaling-free CORDIC algorithms were developed. The Taylor series approximation of the sine and cosine functions forms the basis of scaling-free CORDIC. In [4] a modified virtually scaling-free algorithm is proposed, whereas in [2] an enhanced version of the modified virtually scaling-free CORDIC is suggested that uses Booth recoding and conventional CORDIC to make it scaling free. Recent research in [1] uses a third-order approximation of the series together with a high-speed most-significant-1 detection scheme.
The Taylor series expansions of the sine and cosine of an angle α are given by:
$$\sin\alpha = \alpha - \frac{\alpha^3}{3!} + \frac{\alpha^5}{5!} - \cdots, \qquad \cos\alpha = 1 - \frac{\alpha^2}{2!} + \frac{\alpha^4}{4!} - \cdots$$
First order approximation for sin and cosine series was used in [4] during the implementation, reducing the series as:
But, this approximation imposes a restriction on the allowed values of iterations i as:
For 16-bit data, the value of i comes out to be 4.
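The value quoted above can be reproduced with a short calculation under one plausible reading of the restriction. The reading assumed here is that the truncation error of the approximation sin α ≈ α, which is about α³/6 for α = 2^-i, should not exceed roughly one least significant bit of a w-bit word, 2^-w. This gives 3i + log2(6) ≳ w, i.e. i ≳ (w − 2.585)/3. Rounding this bound down is a further assumption, chosen because it matches the value i = 4 stated for 16-bit data. The function name `min_shift_index` is hypothetical.

```python
import math

def min_shift_index(word_length):
    """Smallest usable shift index i under the assumed error criterion.

    Assumption: (2^-i)^3 / 6 is compared against 2^-word_length and the
    resulting bound (word_length - log2(6)) / 3 is rounded down.
    """
    return math.floor((word_length - math.log2(6)) / 3)

print(min_shift_index(16))  # 4, matching the value quoted for 16-bit data
```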
No completely scaling-free version of vectoring CORDIC is available as yet. Attempts were made in [5] and [7] to design a vectoring CORDIC with a virtually constant scaling factor, but these designs still require circuitry for multiplication by a fixed scale factor.
PROPOSED ALGORITHM FOR SCALING FREE VECTORING CORDIC
The proposed algorithm for scaling-free vectoring CORDIC uses a third-order approximation of the Taylor series and divides the coordinate space into eight equal sectors of 45 degrees each, as shown in figure 1, based on a minimum value of i of 2. With this division the range of convergence is (0, π/4). To cover the entire coordinate space, i.e. for vectors lying in other sectors and quadrants, quadrant mapping is performed: such vectors are mapped to the first sector of the first quadrant by pre-rotating them in the clockwise direction. The magnitudes of the x and y coordinates determine the sector of the vector, whereas their signs indicate the quadrant.
Quadrant demapping is performed to reverse the effect of the pre-rotation and to obtain the final phase angle of the vector. The rules for demapping are listed in table 1.
The essential steps of the algorithm can be summarized as follows:
(1) Input the x and y coordinates of the vector. Depending on the magnitude and sign of the coordinates, determine the sector and quadrant of the vector.
(2) Map the vector to sector I of the first quadrant as per the pseudocode.
The flow chart of the above algorithm is shown in figure 2.
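The mapping and demapping steps can be illustrated with the following sketch. It is not the paper's pseudocode or its table 1, which are not reproduced here. It folds the vector into the first sector using absolute values and a coordinate swap (reflections) rather than the clockwise pre-rotation described in the text, and it undoes the folding on the angle afterwards. The names `map_to_first_sector` and `demap` are hypothetical.

```python
import math

def map_to_first_sector(x, y):
    """Fold (x, y) into the sector 0 <= y <= x and return a demapping function.

    Assumption: reflections (abs and swap) stand in for the paper's
    pre-rotation rules; both require only sign changes and exchanges.
    """
    ax, ay = abs(x), abs(y)          # signs identify the quadrant
    swapped = ay > ax                # magnitudes identify the sector
    if swapped:
        ax, ay = ay, ax

    def demap(theta):
        """Convert a first-sector angle back to the angle of (x, y)."""
        if swapped:
            theta = math.pi / 2 - theta   # undo the coordinate swap
        if x < 0:
            theta = math.pi - theta       # undo the reflection in the y axis
        if y < 0:
            theta = -theta                # undo the reflection in the x axis
        return theta

    return ax, ay, demap

# Example: a vector in the second quadrant.
fx, fy, demap = map_to_first_sector(-3.0, 4.0)
sector_angle = math.atan2(fy, fx)    # stands in for the CORDIC pipeline output
print(fx, fy, demap(sector_angle))
```

In this sketch `math.atan2` only stands in for the angle that the vectoring pipeline would produce for the folded vector.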
ARCHITECTURE OF THE PROPOSED SCALING-FREE VECTORING CORDIC
The architecture of the proposed scaling-free vectoring CORDIC is shown in the block diagram of figure 3. The inputs to the CORDIC module are the x and y coordinates of the vector whose magnitude and phase angle are to be determined. These inputs are stored in the X and Y registers, using a 16-bit fixed-point representation. The angle accumulator register is initialized to zero. The inputs X and Y are passed to the quadrant mapping block, which determines the quadrant and sector of the given vector and maps it to the first sector of the first quadrant. The transformed vector is applied to the basic CORDIC pipeline shown in figure 4.
The CORDIC pipeline rotates the vector in the clockwise direction and computes its new value. This computation is a simple shift and add operation performed for a particular value of the iteration index i. Depending on the sign of the computed y component, a multiplexer either accepts the iteration or skips it. If the iteration is skipped, a 0 is stored in the angle accumulator register; otherwise a 1 is entered.
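The accept-or-skip behaviour of a pipeline stage can be sketched in floating point as follows. The sketch assumes the third-order Taylor approximations sin α ≈ α − α³/6 and cos α ≈ 1 − α²/2 with α = 2^-i, and it assumes that the coarsest stage (i = 2) is applied up to three times so that the accumulated angle can reach π/4. That stage schedule, the exact division by 6 and the function name `scaling_free_vectoring` are assumptions of this sketch and are not taken from the architecture in the figures.

```python
import math

def scaling_free_vectoring(x, y, stages=None):
    """Unidirectional (clockwise) vectoring for an input with 0 <= y <= x.

    Returns the final x (close to the magnitude), the residual y and the
    accumulated angle in radians.
    """
    if stages is None:
        # Assumed schedule: i = 2 tried three times, then i = 3..14 once each.
        stages = [2, 2, 2] + list(range(3, 15))
    angle = 0.0
    for i in stages:
        a = 2.0 ** -i
        s = a - a ** 3 / 6.0         # third-order approximation of sin(a)
        c = 1.0 - a ** 2 / 2.0       # approximation of cos(a)
        # Tentative clockwise rotation by the angle a.
        x_new = c * x + s * y
        y_new = c * y - s * x
        if y_new >= 0.0:
            # Accept: the vector has not crossed the x axis ("1" recorded).
            x, y = x_new, y_new
            angle += a
        # Otherwise skip the stage and keep the old vector ("0" recorded).
    return x, y, angle

mag, resid, phase = scaling_free_vectoring(math.cos(0.6), math.sin(0.6))
print(mag, resid, phase)
```

Because every accepted stage contributes exactly 2^-i radians, the accumulated angle is built from the accept and skip decisions alone, without a table of arctangent values. No gain compensation is applied at the end; in this floating-point sketch the small remaining magnitude error comes mainly from the coarsest stage.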
Processing the vector through the pipeline stages results in:
(1) The X register holding the magnitude of the vector, which requires no further processing because the scale factor is one, i.e. the vectoring CORDIC is completely scaling free.
(2) The Y register holding zero.
(3) The angle accumulator register holding a value that must be modified by the quadrant demapper to obtain the final phase angle.
Basic CORDIC Pipeline Architecture
The basic CORDIC pipeline consists of thirteen stages, from i = 2 to 14.
Stages i = 2 to 6 require six adders each, as shown in figure 4, whereas for stages i = 7 to 14 the requirement reduces to only two adders, as shown in figure 5. The shifters for each stage are simple wired connections and do not add to the complexity of the circuit.
FPGA IMPLEMENTATION OF THE VECTORING CORDIC
The proposed architecture is implemented in Xilinx ISE 9.2i, targeting a Virtex-5 device, using the Verilog hardware description language. The output data rate is one set of data per clock period, except for the first data set, which is obtained after the pipeline has been completely filled. The power dissipation of the proposed design at different clock frequencies is computed using the XPower tool of Xilinx and is plotted. Hardware utilization on the target device is shown in table 3. Functional simulation of the proposed design is successfully carried out using the test bench waveform option of the Xilinx tool. The simulation results are shown in figure 7. The simulation values are verified mathematically as well.
CONCLUSION
The proposed algorithm realizes a completely scaling-free vectoring CORDIC. It does not require any complex pre- or post-processing circuitry. The approximation used here for the sine and cosine series has not only increased the accuracy of the processor but also expanded the range of convergence to the complete coordinate space. It has also completely eliminated the use of lookup tables in the implementation of CORDIC. The hardware requirement is also lower than that of other designs.
