The complex radix (−1 + j) allows arithmetic on complex numbers to be carried out without applying divide-and-conquer rules, which offers a significant speed improvement in circuitry for complex-number computation. This paper introduces the design and hardware implementation of a complex radix (−1 + j) converter. Extensive simulation results are reported, and an application of the converter to the implementation of a discrete Fourier transformation (DFT) processor is presented. The functionality of the DFT processor has been verified in Xilinx ISE design suite version 14.7, and performance parameters such as propagation delay and dynamic switching power consumption have been calculated with the Virtuoso platform in Cadence. The proposed DFT processor is implemented through conversion, multiplication and addition. In terms of delay and power consumption, it offers a significant improvement over other traditional implementations of the DFT processor.
Introduction
In 1960, D.E. Knuth presented the quater-imaginary number system based on radix 2j and established a unique representation of complex numbers using the digit set {0, 1, 2, 3} [1] . In 1964, W. Penny represented complex numbers using base −4 [2] and later extended his research to the base (−1 + j) [3] . It has been shown that a complex number can be treated as a single binary string using the base (−1 + j). Moreover, the representation offers addition rules similar to binary addition, with some exceptions.
A substantial amount of research has been carried out on the conversion of decimal numbers to quater-imaginary numbers [1] [2] [3] [4] . W.J. Gilbert [4] proposed that every complex number can be represented in positional notation using certain complex bases such as (−1 ± j), (1 − j), 2j, etc. It has been observed that (−1 + j) is a good radix because the pattern covers the entire complex plane [4] . Later, in 2000, T. Jamil et al. reinvigorated the conversion algorithm based on the radix (−1 + j) [5] . Besides the conversion algorithm, the work has been extended to hardware implementation of arithmetic operations such as addition, subtraction, multiplication and division [5] [6] [7] [8] [9] [10] [11] [12] .
Although circuitry for performing arithmetic operations on complex numbers in the complex base (−1 + j) has been proposed, circuitry for converting a decimal number to a complex base number has not been introduced so far. This paper introduces a gate-level implementation of a converter from decimal numbers to the complex base (−1 + j). An 8-bit binary input is applied to the converter, which produces a 20-bit value in complex radix (−1 + j). The converter is designed around the division method for binary numbers, where the last two bits (starting from the least significant bit (LSB)) are taken as the remainder in each iteration and the rest of the bit string is taken as the quotient. Basic components such as 3:8 decoders, full adders, half adders, multiplexers and compressors [11] make up the converter.
The implemented converter has been applied to the implementation of a discrete Fourier transformation [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] processor. The proposed DFT processor has been implemented through conversion, multiplication and addition. The reported DFT processor has been compared with three different methodologies: distributed arithmetic (DA) [20] , systolic array (SA) [19] , and CORDIC [21] based implementations. Gate-level implementation of the circuit has been carried out to test its functionality. Moreover, the circuit functionality has been verified with Xilinx ISE design suite version 14.7. Performance parameters such as propagation delay and dynamic switching power consumption have been calculated with the NC Launch platform in Cadence. Simulation results of the 8-point DFT processor show a propagation delay of approximately 109 ns along with a dynamic power consumption of approximately 89 mW. Moreover, the proposed design is 40% faster than the DA-based implementation and 43% faster than the SA-based implementation.
Review of complex binary number system (CBNS)
A complex number (A + jB) can be represented in the radix (−1 + j) using positional notation similar to the radix 2 representation [2, 5] . An n-bit number whose radix is (−1 + j) can be represented in the form of a power series

$$a_{n-1}(-1+j)^{n-1} + a_{n-2}(-1+j)^{n-2} + \cdots + a_2(-1+j)^2 + a_1(-1+j)^1 + a_0(-1+j)^0$$

where the coefficients $a_{n-1}, a_{n-2}, \ldots, a_2, a_1, a_0$ are binary bits (0 or 1). Consider two numbers in radix (−1 + j), 1011 and 1100, whose exact decimal values are 2 + 3j and 2 respectively:

$$1011 = (-1+j)^3 + (-1+j)^1 + (-1+j)^0 = (2+2j) + (-1+j) + 1 = 2+3j$$

$$1100 = (-1+j)^3 + (-1+j)^2 = (2+2j) + (-2j) = 2$$
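The positional expansion can be checked numerically. The following is a minimal Python sketch, not part of the reported hardware; the function name `cbns_to_complex` is an illustrative assumption. It evaluates a radix (−1 + j) bit string with Horner's rule on an exact integer pair (real part, imaginary part).

```python
def cbns_to_complex(bits: str) -> complex:
    """Evaluate a base (-1 + j) bit string, most significant bit first."""
    re, im = 0, 0  # running value kept as an exact Gaussian integer
    for ch in bits.replace(" ", ""):
        if ch not in "01":
            raise ValueError(f"not a binary digit: {ch!r}")
        # Horner step: multiply the running value by (-1 + j) ...
        re, im = -re - im, re - im
        # ... then add the next bit to the real part.
        re += int(ch)
    return complex(re, im)


print(cbns_to_complex("1011"))   # (2+3j)
print(cbns_to_complex("1100"))   # (2+0j)
print(cbns_to_complex("11101"))  # (-1+0j)
```

Spaces inside the string are ignored, so grouped strings such as `0001 0001 0001 1100 0000` can be passed directly.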
The addition rule for complex base numbers is very similar to binary addition, with some exceptions. The addition process is illustrated in Table 1 .
Conversion algorithm for integer to CBNS (positive and negative)
To represent a positive integer in CBNS, the following steps are followed:
1. Convert the integer into its radix 4 representation. This can be implemented through successive division by 4. Mathematically, an n-digit result can be written as

$$X = \sum_{i=0}^{n-1} x_i 4^i$$

where $x_i \in \{0, 1, 2, 3\}$.
2. Convert the radix 4 number to a radix (−4) number by introducing a negative sign at the odd places, counted from the least significant digit (i.e., at the digits that multiply odd powers of 4).
3. Normalize the number obtained in step 2, i.e., bring every digit into the range 0 to 3. This is done by adding 4 to each negative digit and adding 1 to the digit on its left. If a digit is 4, replace it by zero and subtract one from the digit on its left.
4. Finally, replace each digit with the corresponding four-bit sequence (0 → 0000, 1 → 0001, 2 → 1100, 3 → 1101).
To illustrate the algorithm, consider an example: successive division of 200 (base 10) by 4 yields the digits (3, 0, 2, 0) in base 4. Converting these digits to base −4 gives (−3, 0, −2, 0).
Normalization then yields the digits (1, 1, 1, 2, 0), and replacing each digit by its four-bit string gives (0001 0001 0001 1100 0000), which is the final result. For a negative integer, the result obtained above is multiplied by 11101, which is equivalent to −1, i.e. (0001 0001 0001 1100 0000) × 11101 = −200.
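The conversion steps for a non-negative integer can be sketched in software as follows. This is an illustrative Python model of the algorithm, not the gate-level design; the names `DIGIT_TO_BITS` and `int_to_cbns` are assumptions introduced here.

```python
DIGIT_TO_BITS = {0: "0000", 1: "0001", 2: "1100", 3: "1101"}


def int_to_cbns(value: int) -> str:
    """Convert a non-negative integer to a base (-1 + j) bit string."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")

    # Radix-4 digits by successive division, least significant digit first.
    digits = []
    while True:
        value, remainder = divmod(value, 4)
        digits.append(remainder)
        if value == 0:
            break

    # Radix -4: negate the digits at odd positions.
    digits = [-d if i % 2 else d for i, d in enumerate(digits)]

    # Normalize every digit into 0..3, passing a carry to the left.
    i = 0
    while i < len(digits):
        if digits[i] < 0 or digits[i] > 3:
            carry = 1 if digits[i] < 0 else -1   # negative digit: +4 here, +1 left
            digits[i] += 4 * carry               # digit 4: -4 here, -1 left
            if i + 1 == len(digits):
                digits.append(0)
            digits[i + 1] += carry
        i += 1

    # Drop leading zero digits, keeping at least one digit.
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()

    # Replace each digit by its four-bit sequence, most significant first.
    return "".join(DIGIT_TO_BITS[d] for d in reversed(digits))


print(int_to_cbns(200))  # 00010001000111000000
```

For 200 the intermediate digit lists are (0, 2, 0, 3), (0, −2, 0, −3) and (0, 2, 1, 1, 1), least significant digit first, which matches the worked example.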
Conversion algorithm for complex numbers to CBNS
For imaginary numbers, the positive integer obtained in Section 2.1 is simply multiplied by 11, which is j in base (−1 + j), or by 111, which is −j in base (−1 + j),
i.e. (0001 0001 0001 1100 0000 × 11) = 000110011000001000000 = (200j) and (0001 0001 0001 1100 0000 × 111) = 0001110111011101000000 = (−200j)
Hardware implementation of the converter
The hardware implementation of the conversion is described in this section. The design flow of the converter is shown in Fig. 1 . An 8-bit binary string is applied to the input of the converter to obtain a 20-bit radix (−1 + j) output. To obtain the imaginary numbers, the converted output bits are shifted once to the left. The converted output and its shifted version are then added with a special radix (−1 + j) adder designed for this purpose (shown in Section 3.4) to obtain the positive imaginary number in base (−1 + j).
The positive real number is left-shifted twice and added to the positive imaginary number to obtain the negative imaginary number. The multiplication is thus carried out by a shift-and-add algorithm. To obtain the negative real number, the positive real number has to be multiplied by 11101. Processing this multiplier from the LSB, its last three digits (111) are nothing but the multiplier of the negative imaginary number; hence the negative imaginary number is shifted to the left twice and added to the positive real number using the same adder as above.
Table 1 Addition rule in base (−1 + j).
Converter for positive real number
The first step of the conversion algorithm [1] is based on the binary division principle. The architecture has been implemented through successive division by 4, with the remainder calculated at each step. For an 8-bit binary number $X = \sum_{i=0}^{7} x_i 2^i$, the rightmost bits $(x_1, x_0)$ can be taken as the remainder and the remaining bits as the quotient when dividing by 4. The next remainder is $(x_3, x_2)$, and the process continues until the MSB is reached. The flowchart of the design is shown in Fig. 2 , where the input is an 8-bit binary number. The architecture is made up of 5 blocks, shown in Fig. 3(a)-(e). The block of Fig. 3(a) converts the radix (base) 2 representation to radix (base) 4; i.e., it is the circuit that obtains the remainders when the input is divided by 4. It is an 8-bit parallel-in parallel-out register made up of D flip-flops. Its output is fed to the input of the radix (base) −4 circuit. The second block [ Fig. 3(b) ], the base 4 to base (−4) circuit, is a 2's complement circuit that negates the digits at the odd positions. This is done by applying to it only the digits at the odd positions among the remainders obtained from the first block.
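The reason no divider is needed in the first block can be seen in a few lines of Python: each remainder of a division by 4 is just the next pair of input bits. The sketch below is a behavioural illustration only; the function names are assumptions and do not correspond to signals in Fig. 3.

```python
def radix4_digits_from_byte(x: int) -> list[int]:
    """Split an 8-bit value into four radix-4 digits, least significant first."""
    if not 0 <= x <= 0xFF:
        raise ValueError("expected an 8-bit unsigned value")
    # Bit pairs (x1,x0), (x3,x2), (x5,x4), (x7,x6) are the successive remainders.
    return [(x >> shift) & 0b11 for shift in (0, 2, 4, 6)]


def negate_odd_digits(digits: list[int]) -> list[int]:
    """Radix 4 to radix -4: negate the digits at odd positions."""
    return [-d if i % 2 else d for i, d in enumerate(digits)]


digits = radix4_digits_from_byte(200)
print(digits)                     # [0, 2, 0, 3]
print(negate_odd_digits(digits))  # [0, -2, 0, -3]
```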
The third block [ Fig. 3(c) ], which converts base (−4) to the digit representation, has been implemented through a 3-bit serial adder. These adders are used to add 4 to the odd-position digits and 1 to the digit on the left of each odd-position digit.
The fourth block [ Fig. 3(d) ] is needed when one of the obtained digits is 4; the step is then to replace 4 by 0 and subtract 1 from the digit on its left. The circuitry has been implemented through 3:8 decoders, a 2:1 multiplexer and a 2's complement circuit. The input of the decoder is taken from the output of the adder circuit of Fig. 3(c) . When the input number is (100)_2, the corresponding output line of the decoder is enabled and selects the register storing the value 0000, so that 4 is replaced by 0. For the subtraction of 1 from the digit on the left, the same output line is given as the select input of the 2:1 multiplexer, which selects 1 or 0 accordingly. The output of the multiplexer is then fed to the input of a 2's complement circuit.
The last block [ Fig. 3(e) ], which selects the equivalent bit representation in base (−1 + j), is a sequence of registers storing the values 0000, 0001, 1100 and 1101 to represent the digits 0, 1, 2, 3 respectively. Only these digits are considered because, after applying the algorithm, the digits obtained always fall in the range 0 to 3. The selection depends on the output of the adders, which is given to the decoder; the output of the decoder then acts as an enable to the registers.
Converter for imaginary numbers
To obtain the imaginary numbers, the positive integer is multiplied by 11 for the positive imaginary number and by 111 for the negative imaginary number. In base (−1 + j), 11 is equivalent to j and 111 is equivalent to −j. Instead of using a multiplier, which would increase the hardware complexity, shifting and adding are used. If a positive imaginary number is needed, the digits obtained from the converter of Section 3.1 are shifted to the left once and then added to the unshifted digits using the complex base adder shown in Section 3.4. The adder has been implemented through half adders, full adders, 4:2 compressors and 5:2 compressors. To obtain the negative imaginary number, the positive imaginary number is shifted once and added to the converter output (the positive real number).
Converter for negative real number
To obtain negative integers, the positive integer is multiplied by 11101, which is equivalent to (−1) in base (−1 + j). In Section 3.2 the negative imaginary number is obtained by adding three shifted copies of the positive real number, which is multiplication by 111. The three leading digits of 11101 therefore need no separate computation, because they correspond to the negative imaginary number already obtained in Section 3.2. Simply shifting this negative imaginary number twice and adding it to the positive real number obtained in Section 3.1 gives the negative real number.
Adder for obtaining imaginary and negative integers
The hardware implementation of the adder is shown in Fig. 4 . This special adder has been implemented through binary half adders, full adders and compressors [11] . The adder is designed according to the addition rule of the complex binary number system (shown in Table 1 ), where the carry is propagated to the (n + 2)th and (n + 3)th positions. The converter outputs of Fig. 3 are given as inputs to this adder to obtain the negative real part and the positive and negative imaginary parts of the complex number.
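The shift-and-add derivation of the imaginary and negative outputs can be modelled in Python. Note the substitution: the sketch below does not reproduce the compressor-based adder of Fig. 4 with carries into positions n + 2 and n + 3. Instead it uses a simpler digit-serial adder whose carry is a Gaussian integer that is divided exactly by (−1 + j) at every position. All function names are assumptions made for illustration.

```python
def cbns_add(a: str, b: str) -> str:
    """Add two base (-1 + j) bit strings (most significant bit first)."""
    n = max(len(a), len(b))
    a_bits = [int(c) for c in reversed(a.zfill(n))]
    b_bits = [int(c) for c in reversed(b.zfill(n))]
    carry_re, carry_im = 0, 0  # carry kept as a Gaussian integer
    out = []
    i = 0
    while i < n or (carry_re, carry_im) != (0, 0):
        re = carry_re + (a_bits[i] + b_bits[i] if i < n else 0)
        im = carry_im
        bit = (re + im) % 2  # output bit; the rest is divisible by (-1 + j)
        re -= bit
        # Exact division by (-1 + j): multiply by (-1 - j) and halve.
        carry_re, carry_im = (im - re) // 2, (-re - im) // 2
        out.append(str(bit))
        i += 1
    return "".join(reversed(out)).lstrip("0") or "0"


def shift_left(bits: str, places: int) -> str:
    """Multiply by (-1 + j)**places by appending zeros."""
    return bits + "0" * places


def derived_outputs(pos_real: str) -> dict[str, str]:
    """Derive j*x, -j*x and -x from the converter output x by shift-and-add."""
    pos_imag = cbns_add(pos_real, shift_left(pos_real, 1))  # x * 11
    neg_imag = cbns_add(shift_left(pos_real, 2), pos_imag)  # x * 111
    neg_real = cbns_add(shift_left(neg_imag, 2), pos_real)  # x * 11101
    return {"+j": pos_imag, "-j": neg_imag, "-1": neg_real}


for name, bits in derived_outputs("00010001000111000000").items():
    print(name, bits)
```

Leading zeros are stripped from the results, so the strings for 200j and −200j are shorter than the zero-padded forms quoted in Section 2.2 but denote the same values.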
Application: Discrete Fourier transformation (DFT)
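The Discrete Fourier Transform (DFT) of a discrete signal x(n) can be directly computed as:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1, \qquad W_N = e^{-j 2\pi / N}$$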
A. Shadap, P. Saha / Engineering Science and Technology, an International Journal xxx (2016) xxx-xxx
An efficient DFT computation method that significantly reduces the number of required arithmetic operations is called the fast Fourier transformation (FFT) [12] [13] [14] [15] [16] [17] [18] [19] [20] . An FFT algorithm divides the DFT calculation into many short-length DFTs and results in huge savings in computation [13] . If the length of the DFT is $N = R^v$, i.e., the product of identical factors, the corresponding FFT algorithms are called radix-R algorithms.
Hardware implementation
The block diagram of the implemented DFT structure is shown in Fig. 5 . The structure comprises the (−1 + j) base converter for generation of the input vectors as well as the twiddle factors. The output of this converter is then multiplied with the twiddle factors using the complex multiplier, and the results are added using the complex adder to obtain the final output. The multiplication and addition are performed according to the complex radix algorithm.
Examples for implementation
For N = 2 and N = 4, multiplication is not required because the coefficient of the twiddle factor involved is 1 (either positive or negative). The converter produces both positive and negative real outputs and positive and negative imaginary outputs, which have been handled carefully for multiplication purposes. When such an output is multiplied by a twiddle factor of 1, the result is the input itself. This means that the computation reduces to addition of the input vectors generated by the converter. So when the twid-
The terms involving imaginary factors, such as −jx(1) and jx(3), are obtained as outputs of the converter when the input is x(1) or x(3). Similarly, terms such as −x(1) or −x(2) are obtained as outputs of the converter. The remaining terms are also obtained using the converter. There are four inputs, so four converters are required, one for each of the terms x(0), x(1), x(2) and x(3). The twiddle factor is not needed in this case because the required terms are obtained from the converter itself. Similarly, for an 8-point DFT, whose matrix is given in Eq. (5), the twiddle factors involved now include values other than ±1 and its imaginary form ±j. These need to be generated using a separate converter and then multiplied with the inputs; where the twiddle factor is ±1 or ±j, multiplication is not required. In this way the number of multiplications is reduced, and the operation becomes very simple, involving only conversion, multiplication (not required if the twiddle factor is ±1 or ±j), and addition.
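The multiplier-free 4-point case can be illustrated in Python. The sketch assumes the standard DFT sign convention (twiddle factor $e^{-j2\pi/N}$), which is consistent with the terms −jx(1) and jx(3) mentioned above. It works on ordinary complex numbers rather than on (−1 + j) bit strings, and the function names are hypothetical.

```python
import cmath


def times_j(z: complex) -> complex:
    """Multiply by j without a multiplier: swap the parts and negate one."""
    return complex(-z.imag, z.real)


def dft4_additions_only(x: list[complex]) -> list[complex]:
    """4-point DFT from sign changes, multiplications by j and additions."""
    x0, x1, x2, x3 = (complex(v) for v in x)
    return [
        x0 + x1 + x2 + x3,
        x0 - times_j(x1) - x2 + times_j(x3),
        x0 - x1 + x2 - x3,
        x0 + times_j(x1) - x2 - times_j(x3),
    ]


def dft_direct(x: list[complex]) -> list[complex]:
    """Reference DFT: X(k) = sum over n of x(n) * exp(-2j*pi*n*k/N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


samples = [1, 2, 3, 4]
print(dft4_additions_only(samples))  # values: 10, -2+2j, -2, -2-2j
print(dft_direct(samples))           # same values up to rounding error
```

In the hardware described here, the four signed variants of each input (x, −x, jx, −jx) come directly from the converter, so the 4-point case needs only the complex adder.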
The conventional algorithm for computing a DFT is given by 
Results and discussions

Simulation environment
The converter was tested for decimal values up to 255. The converter and its sub-units were specified in Verilog and synthesized with Xilinx ISE design suite 14.7. The inputs are applied at a clock frequency of 250 MHz. Propagation delay and dynamic power consumption of the converter have been calculated with the NC Launch tool of Cadence. To validate the correctness of the design, more than 500 test cases covering all rounding modes and exceptions were simulated successfully on the designs.
Results
Synthesis results based on the above-mentioned criteria are tabulated in Table 2 .
The values of delay, power, energy delay product (EDP), and power delay product (PDP) [23] of converters of different operand lengths are measured and shown in Table 3 . The EDP (in units of 10^−21 J·s) and PDP (in units of 10^−12 J) are quantitative measures of efficiency and of the compromise between speed and power dissipation. Input data were taken in a regular fashion for simulation purposes. For each transition, the delay is measured from 50% of the input voltage swing to 50% of the output voltage swing. The proposed DFT design has also been verified and synthesized, and its delay has been noted. The power and cost of the design have been checked in the RTL Encounter tool on the Cadence platform using 45 nm technology, which was generated through the Verilog code. Table 4 shows the DFT processors of different lengths as a function of the modules.
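As a quick illustration of how the two figures of merit relate to delay and power, the Python sketch below uses the usual definitions (PDP = power × delay, EDP = PDP × delay). These definitions are assumed here rather than taken from [23], and the function names are hypothetical. The inputs are the approximate 8-point DFT figures quoted in the Introduction, not entries of Table 3.

```python
def power_delay_product(power_w: float, delay_s: float) -> float:
    """PDP in joules: energy consumed per operation."""
    return power_w * delay_s


def energy_delay_product(power_w: float, delay_s: float) -> float:
    """EDP in joule-seconds: the PDP weighted once more by the delay."""
    return power_delay_product(power_w, delay_s) * delay_s


# About 89 mW and about 109 ns, as quoted for the 8-point DFT processor.
pdp = power_delay_product(89e-3, 109e-9)
edp = energy_delay_product(89e-3, 109e-9)
print(f"PDP = {pdp:.3e} J, EDP = {edp:.3e} J-s")
```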
$$\%\,\text{error} = \frac{\text{expected} - \text{generated}}{\text{expected}} \times 100\%$$
The (−1 + j) converter was 100% accurate for fractional inputs of up to 4 decimal places, and for inputs of up to 5 places the generated output is rounded off. This makes the circuit more efficient because no rounding algorithm or circuit has been employed in designing the converter. It can therefore serve as an efficient floating point unit and can be used for many floating point applications. The degree of accuracy was measured in terms of the percentage error calculated using the formula above. Table 5 shows the comparison of the error for twiddle factor implementation for DFT computation. From Table 5 it has been observed that up to 5 decimal places the error is only 0.026%. Table 6 indicates the performance parameters, such as propagation delay and dynamic switching power consumption, and their product analysis for the proposed DFT processor. The implemented methodology has been compared with three different methodologies, namely distributed arithmetic (DA) [20] , systolic array (SA) [19] and CORDIC [21] based implementations.
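The percentage error formula translates directly into code. The Python sketch below is illustrative only: the function name is an assumption, and the sample values (a twiddle factor cos(π/4) rounded to 4 decimal places) are hypothetical and are not taken from Table 5.

```python
import math


def percent_error(expected: float, generated: float) -> float:
    """Percentage error: (expected - generated) / expected * 100."""
    if expected == 0:
        raise ValueError("percentage error is undefined for expected == 0")
    return (expected - generated) / expected * 100.0


# Hypothetical example: a twiddle factor rounded to 4 decimal places.
expected = math.cos(math.pi / 4)
generated = round(expected, 4)
print(f"{percent_error(expected, generated):.6f} %")
```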
Discussions
From Table 6 it has been observed that the proposed design is 40% faster than the DA-based implementation, 43% faster than the SA-based implementation and 39% faster than the CORDIC-based implementation.
Conclusions
A high-speed hardware implementation of a complex radix (−1 + j) converter has been introduced in this paper. Extensive simulation results are provided, and an application of this converter to a discrete Fourier transformation (DFT) processor has been presented. The functionality of the DFT processor has been verified in Xilinx ISE design suite version 14.7, and performance parameters such as propagation delay and dynamic switching power consumption have been calculated with the Virtuoso platform in Cadence. Gate-level implementation of the circuit has been validated for its functionality.
Table 5 Comparison between expected and generated output for twiddle factors of DFT from complex base (−1 + j) converter.
Input
