The three-modulus residue number system (RNS) 
Keywords: Computer Architecture, Residue Number System (RNS), Reverse Converter

Introduction
The residue number system (RNS) makes it possible to implement arithmetic operations such as addition and multiplication in a fast, parallel architecture, owing to its carry-free nature. The reverse converter decodes an RNS-represented number into its weighted binary form. The reverse conversion is usually the most complex and critical part of an RNS system, since a slow reverse converter can cancel out the speed gain of the internal RNS arithmetic operations [1, 2].
The moduli set $\{2^n, 2^n-1, 2^{2n-1}-1\}$ was recently suggested in [3] to provide an efficient implementation of both the RNS arithmetic unit circuits and the reverse converter. In other words, the absence of the modulus $(2^n+1)$ from the set $\{2^n, 2^n-1, 2^{2n-1}-1\}$ reduces the complexity of the RNS arithmetic unit, and the simple multiplicative inverses can lead to a high-performance hardware design for the reverse converter. However, the reverse converter introduced in [3] for this moduli set has large hardware requirements, which can degrade the performance of the overall RNS system.
In this work, a reduced-area reverse converter for the moduli set $\{2^n, 2^n-1, 2^{2n-1}-1\}$ using the new Chinese remainder theorem 2 (CRT-II) [4] is presented. The proposed converter requires less hardware and has a slightly lower conversion delay than the converter of [3]. In addition, a comparison of the hardware complexity of the proposed design with the reverse converter of the moduli set $\{2^n, 2^n-1, 2^{2n-1}-1\}$ [5], which has approximately the same dynamic range, shows a considerable improvement in both hardware requirements and conversion delay.
Brief Background
The RNS [1] is based on a moduli set $\{P_1, P_2, \ldots, P_n\}$ which consists of pairwise relatively prime numbers. The dynamic range is defined as $M = P_1 P_2 \cdots P_n$, which refers to the interval of integers that can be uniquely represented in the RNS. A weighted number $X < M$ has a unique representation in the RNS as $(x_1, x_2, \ldots, x_n)$, where
$$x_i = |X|_{P_i} = X \bmod P_i, \quad i = 1, \ldots, n \qquad (1)$$
A Reduced-Area Reverse Converter for the Moduli Set $\{2^n, 2^n-1, 2^{2n-1}-1\}$
A. Sabbagh Molahosseini, M. Kuchaki Rafsanjani, S.H. Ghafouri1, M. Hashemipour
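In the RNS with the three-moduli set $\{P_1, P_2, P_3\}$, and based on the new Chinese remainder theorem 2 [4], the RNS number $(x_1, x_2, x_3)$ can be converted into weighted binary form by
$$X = Z + P_1 P_2 \left| k (x_3 - Z) \right|_{P_3} \qquad (2)$$
$$Z = x_1 + P_1 \left| m (x_2 - x_1) \right|_{P_2} \qquad (3)$$
Also, the multiplicative inverses $k$ and $m$ can be obtained based on the following equations
$$\left| k P_1 P_2 \right|_{P_3} = 1 \qquad (4)$$
$$\left| m P_1 \right|_{P_2} = 1 \qquad (5)$$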
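A minimal Python sketch of forward conversion and of the two-level CRT-II reverse conversion may make the roles of $k$ and $m$ concrete. It assumes the form $X = Z + P_1 P_2 |k(x_3 - Z)|_{P_3}$ with $Z = x_1 + P_1 |m(x_2 - x_1)|_{P_2}$, where $k$ is the inverse of $P_1 P_2$ modulo $P_3$ and $m$ is the inverse of $P_1$ modulo $P_2$. The function names `to_rns` and `crt2_three_moduli` are hypothetical and are not taken from [4].

```python
def to_rns(x, moduli):
    """Forward conversion: residue x_i = x mod P_i for each modulus."""
    return tuple(x % p for p in moduli)


def crt2_three_moduli(residues, moduli):
    """Reverse conversion for three pairwise coprime moduli.

    Assumed two-level form (see the lead-in):
        Z = x1 + P1 * |m * (x2 - x1)|_P2
        X = Z  + P1 * P2 * |k * (x3 - Z)|_P3
    """
    x1, x2, x3 = residues
    p1, p2, p3 = moduli
    # pow(a, -1, p) returns the modular inverse (Python 3.8 or later).
    m = pow(p1, -1, p2)        # |m * P1|_P2 = 1
    k = pow(p1 * p2, -1, p3)   # |k * P1 * P2|_P3 = 1
    z = x1 + p1 * ((m * (x2 - x1)) % p2)
    return z + p1 * p2 * ((k * (x3 - z)) % p3)


if __name__ == "__main__":
    moduli = (4, 7, 3)
    residues = to_rns(47, moduli)
    print(residues, crt2_three_moduli(residues, moduli))
```

Because the first level combines only $x_1$ and $x_2$, the modular reductions are taken with respect to the individual moduli $P_2$ and $P_3$ rather than the full dynamic range $M$.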
Conversion Algorithm
Consider the three-moduli set $\{P_1, P_2, P_3\} = \{2^n, 2^{2n-1}-1, 2^n-1\}$ with corresponding residues $(x_1, x_2, x_3)$. First, the required multiplicative inverses can be obtained using the following lemmas.
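Lemma 1: The multiplicative inverse of $2^n \times (2^{2n-1}-1)$ modulo $(2^n-1)$ is $k = -2$.
Proof: Based on (4), it is easy to find that $\left| -2 \times 2^n \times (2^{2n-1}-1) \right|_{2^n-1} = \left| -2 \times (2^{n-1}-1) \right|_{2^n-1} = \left| -(2^n-2) \right|_{2^n-1} = 1$, since $2^n \equiv 1$ and $2^{2n-1} \equiv 2^{n-1}$ modulo $(2^n-1)$.
Lemma 2: The multiplicative inverse of $2^n$ modulo $(2^{2n-1}-1)$ is $m = 2^{n-1}$.
Proof: $\left| 2^{n-1} \times 2^n \right|_{2^{2n-1}-1} = \left| 2^{2n-1} \right|_{2^{2n-1}-1} = 1$.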
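The inverses for this moduli set can be checked numerically. The sketch below assumes that $k$ is the inverse of $P_1 P_2$ modulo $P_3$ and $m$ is the inverse of $P_1$ modulo $P_2$, and compares them with the closed forms $-2$ and $2^{n-1}$ that follow from $2^n \equiv 1 \pmod{2^n-1}$ and $2^n \cdot 2^{n-1} = 2^{2n-1}$. The helper names are hypothetical.

```python
def computed_inverses(n):
    """Return (k, m) for {P1, P2, P3} = {2^n, 2^(2n-1) - 1, 2^n - 1}."""
    p1, p2, p3 = 2**n, 2**(2 * n - 1) - 1, 2**n - 1
    k = pow(p1 * p2, -1, p3)   # inverse of P1*P2 modulo P3 (Python 3.8+)
    m = pow(p1, -1, p2)        # inverse of P1 modulo P2
    return k, m


def closed_form_inverses(n):
    """Closed forms: k = -2 (reduced modulo 2^n - 1) and m = 2^(n-1)."""
    return (-2) % (2**n - 1), 2**(n - 1)


if __name__ == "__main__":
    for n in range(2, 9):
        assert computed_inverses(n) == closed_form_inverses(n)
        print(n, computed_inverses(n))
```

Both inverses are either a power of two or a small negative constant, which is why the later simplification reduces to bit rotations and complements instead of general multipliers.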
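Now, by substituting the values of the multiplicative inverses and the moduli into the CRT-II algorithm, i.e. (2) and (3), the initial conversion equations can be obtained as follows
$$X = Z + 2^n (2^{2n-1}-1) \left| -2 (x_3 - Z) \right|_{2^n-1} \qquad (8)$$
$$Z = x_1 + 2^n \left| 2^{n-1} (x_2 - x_1) \right|_{2^{2n-1}-1} \qquad (9)$$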
The simplification of these equations can be done based on the following well-known arithmetic properties:
Property 1: "Modulo $(2^p-1)$ multiplication of a residue number by $2^k$, where $p$ and $k$ are positive integers, is equivalent to $k$ bit circular left shifting" [3, 6].
Property 2: "Modulo $(2^p-1)$ of a negative number is accomplished by subtracting this number from $(2^p-1)$. This is equivalent to taking the one's complement of the number" [3, 6].
First, to simplify (9), we have
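Properties 1 and 2 can be illustrated with a short Python sketch on $p$-bit values. The helper names `rotl` and `ones_complement` are hypothetical. One assumption is made explicit in the comments: the all-ones pattern $2^p-1$ is treated as a second representation of zero, so results are compared after reduction modulo $(2^p-1)$.

```python
def rotl(value, shift, width):
    """Circular left shift of a width-bit value by shift positions."""
    shift %= width
    mask = (1 << width) - 1
    return ((value << shift) | (value >> (width - shift))) & mask


def ones_complement(value, width):
    """Invert every bit of a width-bit value."""
    return value ^ ((1 << width) - 1)


if __name__ == "__main__":
    p = 5
    modulus = (1 << p) - 1  # 2^p - 1
    for v in range(modulus):
        for k in range(1, 2 * p):
            # Property 1: multiplying by 2^k is a k-bit circular left shift.
            assert rotl(v, k, p) == (v * 2**k) % modulus
        # Property 2: negation is the one's complement. For v = 0 the
        # complement is all ones, which is congruent to 0 modulo 2^p - 1.
        assert ones_complement(v, p) % modulus == (-v) % modulus
    print("properties hold for p =", p)
```

In hardware both operations cost only wiring and inverters, which is the reason moduli of the form $2^p-1$ simplify the converter.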
Similarly, (8) can be calculated by 
Where
Finally, (14) can be rewritten as
Numerical Example: Consider the moduli set $\{4, 7, 3\}$, which is obtained from $\{2^n, 2^{2n-1}-1, 2^n-1\}$ for $n=2$. To obtain the regular weighted form of the RNS number $X = (3, 5, 2)$, we have
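The example can be checked numerically. The sketch below is an illustration only: it assumes the two-level CRT-II form with an intermediate value $Z$ and the inverses $m = 2^{n-1}$ and $k = -2$, and it does not reproduce the simplified hardware equations step by step. The variable names are hypothetical.

```python
n = 2
p1, p2, p3 = 2**n, 2**(2 * n - 1) - 1, 2**n - 1   # moduli (4, 7, 3)
x1, x2, x3 = 3, 5, 2                              # residues of X

m = 2**(n - 1)   # assumed inverse of P1 modulo P2
k = -2           # assumed inverse of P1*P2 modulo P3

# First level: combine x1 and x2.
z = x1 + p1 * ((m * (x2 - x1)) % p2)      # 3 + 4 * |2 * 2|_7  = 3 + 4 * 4
# Second level: bring in x3.
x = z + p1 * p2 * ((k * (x3 - z)) % p3)   # 19 + 28 * |34|_3   = 19 + 28 * 1

print(z, x)  # 19 47
```

The result $X = 47$ is consistent with the given residues, since $47 \bmod 4 = 3$, $47 \bmod 7 = 5$ and $47 \bmod 3 = 2$.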
Hardware Realization
The main conversion equations that must be realized in hardware are (11), (15) and (20). Fig. 1 shows the proposed improved design. As in [3] and [5], carry-save adders (CSAs) with end-around carries (EACs) and carry-propagate adders (CPAs) with EACs [7] are used in the proposed architecture to realize the modulo $(2^k-1)$ operations. Note that the constant bits in some of the binary vectors allow full adders (FAs) to be reduced to XOR/AND or XNOR/OR pairs. Also, a $(3n-1)$-bit binary CPA with one carry-in is employed to implement (20). It should be noted that, since $x_1$ is an $n$-bit number, multiplication by $2^n$ in (20) needs no hardware and is realized by concatenation. Table 1 presents the hardware details of the proposed converter. Table 2 compares the proposed reverse converter with the converters of [3] and [5] in terms of hardware requirements and conversion delay. Based on this table, the proposed converter performs better than those of [3] and [5]. In particular, the proposed design requires less hardware than the previous converter for the set $\{2^n, 2^n-1, 2^{2n-1}-1\}$ while having a slightly lower conversion delay.
Figure 1. The reduced-area reverse converter for the moduli set $\{2^n, 2^{2n-1}-1, 2^n-1\}$.
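The two hardware ideas mentioned above, end-around-carry addition modulo $(2^k-1)$ and multiplication by $2^n$ through concatenation, can be modelled in software. The Python sketch below is a behavioural illustration under stated assumptions, not the circuit of Fig. 1: the function names are hypothetical, and the all-ones result is normalized to zero, whereas an adder circuit may leave it as a second representation of zero.

```python
def add_eac(a, b, k):
    """Add two k-bit values modulo 2^k - 1 using an end-around carry."""
    mask = (1 << k) - 1
    s = a + b
    # Feed the carry-out back into the least significant position.
    s = (s & mask) + (s >> k)
    # All ones is congruent to zero modulo 2^k - 1; normalize it here.
    return 0 if s == mask else s


def concat_low_residue(y, x1, n):
    """Form x1 + 2^n * y by placing the n-bit value x1 below y."""
    if not 0 <= x1 < (1 << n):
        raise ValueError("x1 must fit in n bits")
    return (y << n) | x1


if __name__ == "__main__":
    print(add_eac(5, 6, 3))             # (5 + 6) mod 7
    print(concat_low_residue(11, 3, 2)) # 3 + 4 * 11
```

Since $x_1 < 2^n$, the bitwise OR in `concat_low_residue` never overlaps the shifted bits of `y`, which is why the operation corresponds to plain wiring rather than an adder.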
Conclusion
This paper presented an improved design of the reverse converter for the moduli set $\{2^n, 2^n-1, 2^{2n-1}-1\}$ with considerably lower hardware requirements than the latest converter design for this set. Because the moduli set $\{2^n, 2^n-1, 2^{2n-1}-1\}$ supports an efficient RNS arithmetic unit, the proposed reverse converter can increase the overall performance of the RNS.
