ABSTRACT
INTRODUCTION
The use of alternative number systems in the implementation of application-specific Digital Signal Processing (DSP) systems has gained considerable importance in recent years. This is due to the carry propagation problem associated with conventional number systems such as binary. The attractive carry-free properties of the Residue Number System (RNS) make it a suitable candidate. RNS is an integer system with the potential for high-speed, parallel computation. It is mostly applied in addition- and multiplication-dominated DSP applications such as digital filtering and convolution [1]. In RNS, arithmetic operations such as addition, subtraction, and multiplication can be carried out independently and concurrently in several residue channels, more efficiently than in the conventional binary system [2].
The difficult RNS operations include division, magnitude comparison, overflow detection, sign detection, moduli selection and reverse conversion. For a successful RNS implementation, moduli selection and reverse conversion are the most critical issues. Reverse conversion has become an important research topic because the solutions to the other difficult RNS operations depend largely on conversion. In the literature, the moduli sets that have been presented are classified according to the number of channels or their lengths and their Dynamic Range (DR). Several reverse conversion techniques have been proposed based on either the Chinese Remainder Theorem (CRT) [14], [15], [16], or the Mixed Radix Conversion (MRC) [17]. The major problem with the CRT is the complex and slow modulo-$M$ operation, where $M = m_1 m_2 m_3 \cdots m_n$ is the system dynamic range and thus a rather large constant to deal with.
In this paper, we introduce a novel moduli set $\{2^{2n}, 2^{2n}+1, 2^{2n}-1\}$ by enhancing the modulus $2^n$ to $2^{2n}$ in $\{2^n, 2^{2n}+1, 2^{2n}-1\}$ [13]. Next, we present an efficient CRT-based reverse converter that leads to an efficient VLSI architecture with high-speed conversion and low hardware cost. Theoretically, our proposal outperforms equivalent state-of-the-art converters. We also implemented the proposed converter and the best equivalent state-of-the-art converters on a Xilinx Spartan 6 FPGA. The synthesis results are given in terms of the number of slices and input-to-output delay in nanoseconds. The results indicate that, on average, our proposal is about 52.35% and 43.94% better than the existing equivalent state of the art in terms of conversion time and hardware resource utilization, respectively.
The rest of the paper is structured as follows. Section 2 provides brief background information on reverse conversion. In Section 3, the novel moduli set is introduced and the associated reverse conversion algorithm is presented. The hardware implementation of the proposed algorithm is described in Section 4, and Section 5 evaluates the performance of the proposed scheme. Finally, the paper is concluded in Section 6.
RNS is defined in terms of a set of relatively prime moduli $\{m_i\}_{i=1,k}$ such that $\gcd(m_i, m_j) = 1$ for $i \neq j$, where $\gcd(m_i, m_j)$ denotes the greatest common divisor of $m_i$ and $m_j$, while $M = \prod_{i=1}^{k} m_i$ is the dynamic range. The residues of a decimal number $X$ are obtained as $x_i = |X|_{m_i}$, where $|X|_{m_i}$ denotes the $X \bmod m_i$ operation.
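The forward conversion described above can be illustrated with a minimal Python sketch. The function names `to_residues` and `dynamic_range`, the choice n = 2, and the sample value 1000 are illustrative assumptions, not part of the proposed design.

```python
from math import prod


def to_residues(x, moduli):
    """Return the residue tuple (x mod m_i) for every modulus m_i."""
    return tuple(x % m for m in moduli)


def dynamic_range(moduli):
    """Return M, the product of all moduli."""
    return prod(moduli)


# Illustrative instance of the moduli set {2^(2n), 2^(2n)+1, 2^(2n)-1} with n = 2.
n = 2
moduli = (2 ** (2 * n), 2 ** (2 * n) + 1, 2 ** (2 * n) - 1)  # (16, 17, 15)
M = dynamic_range(moduli)                                     # 16 * 17 * 15 = 4080
residues = to_residues(1000, moduli)                          # (8, 14, 10)
```

Each residue is computed independently of the others, which is what allows the residue channels to operate in parallel.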
The main methods for reverse conversion are based on the CRT, New CRT and MRC techniques. In this paper, we utilize the CRT. Given a moduli set $\{m_i\}_{i=1,3}$, the residues $(x_1, x_2, x_3)$ can be converted into the corresponding decimal number $X$ using the CRT as follows [2]:
$$X = \left| \sum_{i=1}^{3} M_i \left| M_i^{-1} x_i \right|_{m_i} \right|_M \qquad (1)$$
where $M = m_1 m_2 m_3$, $M_i = M / m_i$, and $|M_i^{-1}|_{m_i}$ is the multiplicative inverse of $M_i$ with respect to (w.r.t.) $m_i$.
The complexity of Equation (1) is significantly reduced by using the proposed moduli set $\{2^{2n}, 2^{2n}+1, 2^{2n}-1\}$.
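As a point of reference, the general CRT can be sketched in a few lines of Python. This is the textbook form with the large modulo-$M$ reduction, not the simplified converter proposed in this paper; the function name `crt_reverse` and the n = 2 test values are assumptions made for illustration.

```python
from math import prod


def crt_reverse(residues, moduli):
    """Recover X in [0, M) from its residues using the general CRT."""
    M = prod(moduli)
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        inv = pow(M_i, -1, m_i)        # multiplicative inverse of M_i w.r.t. m_i (Python 3.8+)
        total += M_i * ((inv * x_i) % m_i)
    return total % M                    # the costly modulo-M reduction


# Illustrative instance with n = 2: moduli (16, 17, 15), M = 4080.
moduli = (16, 17, 15)
x = crt_reverse((8, 14, 10), moduli)    # 1000
```

For this instance the three inverses evaluate to 15, 9 and 8. The final reduction modulo $M$ is the step that a dedicated converter for a specific moduli set aims to simplify.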
NEW MODULI SET WITH PROPOSED REVERSE CONVERTER
For a given RNS moduli set to be legitimate, all the elements in the set must be pairwise co-prime. Thus, in order to prove that the proposed set can be used to construct a valid RNS architecture, we have to demonstrate that the moduli $2^{2n}$, $2^{2n}+1$ and $2^{2n}-1$ are pairwise relatively prime.
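The co-primality requirement can be checked numerically with a short Python sketch. This is a sanity check over a finite range of n, not a substitute for the proof; the helper names `proposed_moduli` and `pairwise_coprime` and the range n = 1..64 are illustrative assumptions. The intuition is that $2^{2n}$ is a power of two while the other two moduli are odd, and that $2^{2n}+1$ and $2^{2n}-1$ are odd numbers differing by 2, so any common divisor must divide 2 and can only be 1.

```python
from itertools import combinations
from math import gcd


def proposed_moduli(n):
    """Return the moduli set (2^(2n), 2^(2n)+1, 2^(2n)-1) for a given n."""
    base = 2 ** (2 * n)
    return (base, base + 1, base - 1)


def pairwise_coprime(moduli):
    """True if every pair of moduli has a greatest common divisor of 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))


# Numerical check for n = 1..64 (a finite sanity check only).
checked = all(pairwise_coprime(proposed_moduli(n)) for n in range(1, 65))
```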
Following the basic integer division definition in RNS, we finally have:
In order to reduce the hardware complexity, we use the following properties to simplify Equation (12):
Property 1: The multiplication of a residue number by $2^k$ in modulo $(2^p - 1)$ is computed by a $k$-bit circular left shift.
Property 2: A negative number in modulo $(2^p - 1)$ is calculated by subtracting the number in question from $(2^p - 1)$. In binary representation, the one's complement of the number gives the result.
Equation (12) can be directly rewritten as:
Where,
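Properties 1 and 2 can be verified numerically. The following is a minimal Python sketch rather than the hardware realization; the helper names `rotl` and `ones_complement` and the width p = 6 are assumptions chosen for illustration.

```python
def rotl(value, k, p):
    """Circular left shift of a p-bit value by k bit positions."""
    k %= p
    mask = (1 << p) - 1
    return ((value << k) | (value >> (p - k))) & mask


def ones_complement(value, p):
    """Bitwise one's complement of a p-bit value, i.e. (2^p - 1) - value."""
    return value ^ ((1 << p) - 1)


p = 6
m = (1 << p) - 1  # modulus 2^p - 1 = 63

# Property 1: multiplying by 2^k modulo (2^p - 1) equals a k-bit circular left shift.
# Only reduced residues 0 .. m-1 are checked; the all-ones pattern is a second
# encoding of zero and is left unchanged by rotation.
prop1 = all(rotl(v, k, p) == (v * 2 ** k) % m for v in range(m) for k in range(p))

# Property 2: negation modulo (2^p - 1) equals the one's complement.
# The complement of 0 is the all-ones pattern, which is congruent to 0,
# so both sides are compared modulo m.
prop2 = all(ones_complement(v, p) % m == (-v) % m for v in range(m))
```

Both operations reduce to wiring and inverters in hardware, which is why they lower the cost of the modulo $(2^p - 1)$ arithmetic.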
HARDWARE REALIZATION

PERFORMANCE ANALYSIS
The performance of the proposed reverse converter is evaluated in terms of hardware cost and conversion time. In order to properly evaluate the performance of our proposal against the state of the art, both theoretical and experimental analyses are performed.
Theoretical Evaluation
We compare our converter with the state-of-the-art converters presented in [18], [13], and [19]. Note that the converters in [18] and [13] are for moduli sets with a 5n-bit DR, while the moduli set in [19] has a 6n-bit DR. [18] and [13] are included in the comparison to demonstrate that our converter competes favourably with existing state-of-the-art 5n-bit DR moduli sets, and because our proposed moduli set is an improvement of [13]. The theoretical analysis is presented in Table 1. The table shows that our proposal outperforms the existing state-of-the-art converters of similar dynamic range in terms of area and delay. For our CE converter, the delay is $(8n + 2)\,t_{FA}$, while the converter in [19] exhibits a delay of $(12n + 6)\,t_{FA}$. To further simplify the area comparison, we assume that one FA is twice as large as an HA, and express the area cost of all the considered designs in terms of HAs. It is therefore evident that our converter uses fewer area resources.
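The two delay expressions quoted above can be compared directly. The sketch below evaluates only these theoretical unit-gate expressions, in units of $t_{FA}$; it does not reproduce Table 1 or the measured FPGA figures, and the function names are hypothetical.

```python
def delay_proposed(n):
    """Theoretical delay of the proposed converter, in units of t_FA."""
    return 8 * n + 2


def delay_ref19(n):
    """Theoretical delay of the converter in [19], in units of t_FA."""
    return 12 * n + 6


def delay_reduction_percent(n):
    """Relative theoretical delay reduction of the proposal w.r.t. [19], in percent."""
    return 100.0 * (1.0 - delay_proposed(n) / delay_ref19(n))


for n in (1, 2, 4, 8, 16):
    print(n, delay_proposed(n), delay_ref19(n), round(delay_reduction_percent(n), 2))
```

Under this model the reduction is $(4n + 4)/(12n + 6)$, which is about 44.4% for n = 1 and approaches one third as n grows.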
Experimental Evaluation
For the experimental assessment, the converters were described in VHDL and then implemented on a Spartan 6 xc6slx45t-3fgg484 FPGA with Xilinx ISE 14.3 for various dynamic range requirements. The performance is evaluated in terms of area, measured as the number of slices, and delay, corresponding to the critical path in nanoseconds. Table 2 shows the synthesis results for the various values of n. Confirming the theoretical results, the experimental results clearly show the superiority of our converter over the state of the art. The generated values suggest that, on average, our proposal performs the conversion 52.35% faster than the converter proposed in [19]. In terms of area cost, our converter exhibits a 43.94% reduction with respect to the state of the art. Figures 3 and 4 present the performance of our proposal against the state of the art in terms of delay and area.
CONCLUSIONS
In this paper, we proposed a novel moduli set $\{2^{2n}, 2^{2n}+1, 2^{2n}-1\}$ with its associated reverse converter using the CRT. The moduli set provides a 6n-bit DR and is therefore appropriate for applications requiring a 6n-bit DR. We simplified the CRT to obtain an effective algorithm. Further, we reduced the resulting architecture in order to obtain a reverse converter that utilizes only two CSAs and a CPA. We performed both theoretical and experimental evaluations of our proposal. The theoretical analysis clearly shows the advantages of our moduli set and its associated reverse converter. This is confirmed by the experimental results. We described our scheme and those presented in [18] and [19] in VHDL and carried out the implementation on an FPGA using a wide range of values of n. The results indicate that, on average, our scheme is 52.35% faster than the converter proposed in [19], while it exhibits a 43.94% reduction in area cost. The results show that our proposal outperforms the best known state of the art.
