Abstract: Stored Unibit Transfer (SUT) has recently been proposed as a redundant high-radix encoding for the channels of a Residue Number System (RNS) that can improve the efficiency of conventional redundant RNS. In this paper we propose modulo $2^n \pm 1$ forward and reverse converters for the SUT-RNS encoding. The proposed converters are based on parallel-prefix binary or modulo adders and are therefore very efficient.
INTRODUCTION
Residue Number System (RNS) [1] [2] is a number system commonly adopted for speeding up computations in digital signal processing [3] [4] [5] [6], cryptography [7] and telecommunication applications [8] [9]. A non-positional RNS is defined by a set of L moduli, say $\{m_1, \ldots, m_L\}$, that are pairwise relatively prime. Let $|A|_M$ denote the modulo M residue of an integer A, that is, the least nonnegative remainder of the division of A by M. A has a unique representation in the RNS, given by the set $\{a_1, \ldots, a_L\}$ of residues, where $a_i = |A|_{m_i}$. An addition, subtraction or multiplication, denoted $\diamond$, of two operands A and B with residues $a_i$ and $b_i$ yields $Z = A \diamond B$ with residues $z_i = |a_i \diamond b_i|_{m_i}$, and each $z_i$ is computed in parallel in a separate arithmetic unit, often called a channel. Since each channel deals with narrow residues instead of wide numbers and since all channels operate in parallel, significant speedup over binary arithmetic may be achieved. RNSs built on $2^n \pm 1$ moduli have received significant attention due to the efficient arithmetic circuits that have been proposed for them. Any carry propagation in an RNS is restricted inside each channel.
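As a small illustration of these definitions, the following Python sketch maps integers to their residues and performs addition and multiplication channel by channel. The moduli set {7, 8, 9} and the helper names `to_rns` and `rns_op` are assumptions chosen here only for illustration; they do not come from the cited works.

```python
from math import gcd, prod

MODULI = (7, 8, 9)  # hypothetical example set: 2^3 - 1, 2^3, 2^3 + 1

def to_rns(a, moduli=MODULI):
    """Return the residues |a|_{m_i} of a for every modulus m_i."""
    return tuple(a % m for m in moduli)

def rns_op(x, y, op, moduli=MODULI):
    """Apply op independently in every channel and reduce modulo m_i."""
    return tuple(op(xi, yi) % m for xi, yi, m in zip(x, y, moduli))

# The moduli must be pairwise relatively prime.
assert all(gcd(p, q) == 1 for i, p in enumerate(MODULI) for q in MODULI[i + 1:])

M = prod(MODULI)  # product of the moduli, the size of the representable range
a, b = 123, 45
z_add = rns_op(to_rns(a), to_rns(b), lambda u, v: u + v)
z_mul = rns_op(to_rns(a), to_rns(b), lambda u, v: u * v)
```

Each channel only ever handles values smaller than its own modulus, which is the property that keeps carry propagation confined to the channel.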
Binary Signed-Digit (BSD) [10] has been proposed as a redundant, carry-free, number system where addition can be performed in constant time. The BSD number system represents each number with a set of digits in {-1, 0, +1}. Each digit requires two bits for its representation leading to a significant overhead in storage, processing and interconnection requirements. Hybrid redundant number systems, such as those with weighted two-valued digit set encodings [11] [12] , have been proposed as an alternative that can limit the maximum length of carry propagation chains to any desired value and can lead to a wide representation range without the added costs associated with BSD. Furthermore, they can utilize conventional components such as full/half adders and can therefore produce highly efficient circuit implementations. Examples of such encodings are the Stored-Unibit Transfer (SUT) encoding [11] [12] and the Signed-LSB encoding [13] .
Several attempts have been made to combine the parallel nature of RNS with the carry-free or carry-limited nature of redundant number systems. References [14], [15] and [16] are among the most recent works that deal with the use of BSD inside each channel of an RNS in order to eliminate the intra-channel carry propagation. They propose efficient arithmetic circuits, such as adders and multipliers, for the modulo $2^n \pm 1$ cases. The authors of [13] propose modulo $2^n \pm 1$ adders based on the Signed-LSB encoding. In order to trade off the area overhead of the BSD against the delay, [17] [18] propose the use of the SUT encoding for the modulo $2^n \pm 1$ RNS channels and present SUT-RNS addition, subtraction and multiplication circuits. However, to the best of our knowledge, no architecture has been reported so far for converting a binary modulo $2^n \pm 1$ number from/to its corresponding SUT-RNS encoding, which makes the arithmetic circuits proposed for SUT-RNS in [17] [18] inapplicable in practice.
In this paper we fill this gap by presenting forward and reverse converters for the modulo $2^n \pm 1$ SUT-RNS encoding. The proposed converters are based on parallel-prefix binary or modulo $2^n \pm 1$ adders and some extra simple logic and are very efficient.
The remainder of the paper is organized as follows. The next section presents an overview of the SUT-RNS encoding. Forward and reverse converters for modulo $2^n \pm 1$ SUT-RNS channels are given in Sections III and IV, respectively. Section V evaluates the proposed circuits and presents some experimental results. Section VI concludes the paper.
II. REDUNDANT HIGH-RADIX SUT-RNS
Every SUT-encoded number is composed of SUT digits. Each SUT digit consists of two-valued digits (twits) of three types: posibits {0, +1}, negabits {-1, 0}, and unibits {-1, +1}. A posibit has a lower value equal to 0, whereas a negabit and a unibit have a lower value equal to -1. Furthermore, a posibit and a negabit use a gap size equal to 1, whereas a unibit uses a gap size equal to 2 [11]. All three twits require one bit for their representation and use bias encoding, that is, their lower value is encoded in binary with logical 0 whereas their upper value is encoded in binary with logical 1. The dot notations, symbolic notations and binary encodings of the twits are given in Table I.
. . . 
III. FORWARD CONVERTERS
In order to utilize the adder, subtractor and multiplier circuits that were proposed in [18] for SUT-RNS, one has to use forward converters to derive the SUT-RNS encodings of the input operands. No such circuits are presented in [18]. An algorithm for converting a signed two's complement number to an SUT representation was reported in [12]. However, this algorithm cannot be used as is for the SUT-RNS encoding, since in this case the input is an unsigned modulo $2^n + 1$ or modulo $2^n - 1$ number. We present in this section efficient forward converters of a modulo $2^n \pm 1$ number to the SUT-RNS encoding. We consider the two moduli cases separately.
A. Modulo $2^n - 1$
Consider an n-bit modulo $2^n - 1$ number $X = x_{n-1} \ldots x_0$. The algorithm for forward conversion presented in [12] can correctly encode in SUT-RNS every value of X that lies in the positive range of the SUT-RNS encoding (see Fig. 2). However, for all values of X that are encoded in SUT-RNS in the negative range, a value decreased by one compared to the correct modulo $2^n - 1$ value is produced, since modulo $2^n$ arithmetic is used instead and $2^n \equiv 1 \pmod{2^n - 1}$. Hence, for all the SUT-RNS encoded values of X that lie in the negative range, we have to increase the corresponding value of X by one in order to get the correct modulo $2^n - 1$ value.
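The off-by-one effect can be checked numerically. The sketch below is only an illustration with n = 8 (an arbitrary choice) and hypothetical helper names. It compares the negative representative of X modulo $2^n - 1$ with the value obtained when $2^n$ is subtracted instead; it does not model the SUT digit structure or the exact boundary of the negative range.

```python
n = 8
modulus = (1 << n) - 1  # 2^n - 1 = 255

def as_negative_residue(x):
    """Negative representative of x modulo 2^n - 1."""
    return x - modulus

def as_wrapped_mod_2n(x):
    """Value obtained if 2^n is subtracted instead (modulo 2^n arithmetic)."""
    return x - (1 << n)

x = 250                            # congruent to -5 modulo 255
correct = as_negative_residue(x)   # -5
wrapped = as_wrapped_mod_2n(x)     # -6, one less than the correct value
fixed = as_wrapped_mod_2n(x + 1)   # increasing X by one restores -5
```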
The following two-step algorithm is a modification of the forward conversion algorithm of [12] that deals with the above-mentioned problem and performs correct modulo $2^n - 1$ forward SUT-RNS conversion:
Step I: Compute $y = X + R + s = y' + s$, where y and $R = (2^{kh} - 1)/(2^h - 1)$ denote n-bit operands, $y' = X + R$, and s denotes a sign indication bit whose value is equal to 0 when the value of X lies in the positive range of the SUT-RNS encoding and equal to 1 when it lies in the negative range. The sign bit is given by $s = y'_{n-1} \vee c_{n-1}$, where $y'_{n-1}$ and $c_{n-1}$ are the most significant bit and the carry out of the (X+R) addition and $\vee$ denotes the logical OR operation. A straightforward solution for deriving the bits of y uses a binary adder for deriving $y'$ and a controllable incrementer for incorporating s. However, those two operations can be efficiently merged in a parallel-prefix-based adder (see Fig. 3). The X and R operands can be driven to a parallel-prefix structure that derives the n carries $(c_{n-1}, \ldots, c_0)$ of X+R in $\log_2 n$ levels. Then, s can be derived by an XOR and an OR gate, as $s = (hs_{n-1} \oplus c_{n-2}) \vee c_{n-1}$, where $hs_{n-1} = x_{n-1} \oplus R_{n-1}$ is the half-sum bit of the most significant bit position. An extra prefix level can then be used for adding the value of s and for producing a new set of carries $(c'_{n-2}, \ldots, c'_0)$. Finally, the n bits of y can be derived by 2-input XOR gates. Note that since R is a constant that has a value equal to 1 in all bit positions with weights $2^{ih}$, $0 \le i < k$, and a value equal to 0 in all other bit positions, the parallel-prefix structure can be significantly simplified.
Step II: Use the following logic equations [12] to transform the bits of y to the corresponding SUT-RNS encoding of X, denoted as $X_{SUT\text{-}RNS}$, assuming that $y_{-1} = 0$ and that $\wedge$ denotes the logical AND operation, while $\bar{z}$ denotes the logical NOT operation on bit z:
Step II implies that the unibit $x'_0$ is always equal to 0. The complete circuit structure that realizes the above algorithm is given in Fig. 3.
B. Modulo $2^n + 1$
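Step I can be modelled at the bit level as follows. This is a behavioural sketch only: it uses a simple ripple loop rather than a parallel-prefix structure, it assumes that the sum is truncated to n = kh bits, and the names `make_R` and `step_one` are introduced here for illustration. Step II is not modelled.

```python
def make_R(k, h):
    """R = (2^(k*h) - 1) / (2^h - 1): a 1 at every bit position i*h, 0 <= i < k."""
    return ((1 << (k * h)) - 1) // ((1 << h) - 1)

def step_one(x, k, h):
    """Return (y, s) with y = (X + R + s) mod 2^n, following Step I."""
    n = k * h
    r = make_R(k, h)
    carries = []  # carries[i] = carry out of bit position i of X + R
    c = 0
    for i in range(n):
        xi, ri = (x >> i) & 1, (r >> i) & 1
        c = (xi & ri) | ((xi ^ ri) & c)
        carries.append(c)
    hs_msb = ((x >> (n - 1)) ^ (r >> (n - 1))) & 1  # half-sum bit of the MSB position
    c_in_msb = carries[n - 2] if n >= 2 else 0      # carry into the MSB position
    s = (hs_msb ^ c_in_msb) | carries[n - 1]        # s = (hs XOR c_{n-2}) OR c_{n-1}
    y = (x + r + s) & ((1 << n) - 1)                # assumed truncation to n bits
    return y, s
```

With k = 2 and h = 4, for example, `make_R` returns 00010001 in binary, that is, a 1 at bit positions 0 and 4.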
Consider now an (n+1)-bit modulo $2^n + 1$ number $X = x_n x_{n-1} \ldots x_0 \in [0, 2^n]$. The forward converter of [12] could be used to encode X in SUT-RNS. However, for all values of X that lie in the negative range of the SUT-RNS encoding, a value increased by one compared to the correct one would be produced, since $2^n \equiv -1 \pmod{2^n + 1}$. Hence, for all these values of X we have to decrease the corresponding value by one in order to get the correct modulo $2^n + 1$ SUT-RNS encoding.
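The direction of the error follows from a short calculation. As a simplifying assumption made here only for illustration, let a negative value $-v$ be held as the residue $X = 2^n + 1 - v$. Subtracting $2^n$ (modulo $2^n$ arithmetic) instead of $2^n + 1$ then gives:

```latex
X - 2^n = (2^n + 1 - v) - 2^n = -v + 1,
\qquad
X - (2^n + 1) = -v .
```

For example, with n = 4 and X = 12 (congruent to -5 modulo 17), the wrapped value is 12 - 16 = -4, which is one more than -5.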
The following algorithm is similar to the one presented previously for modulo $2^n - 1$ and performs modulo $2^n + 1$ SUT-RNS forward conversion.
Step I: … respectively. Note that s also incorporates the most significant bit of X, $x_n$, in order to add the value $x_n 2^n$. Hence, the n least significant bits of X and (R-1) are driven to an n-bit parallel-prefix structure. Then s is derived by an XOR and a NOR gate, while an extra parallel-prefix level is used to add the value of s and produce y.
Step II: Use the same logic equations as in the modulo $2^n - 1$ case.
IV. REVERSE CONVERTERS
We present in this section efficient reverse converters of an SUT-RNS encoded modulo $2^n \pm 1$ number to its corresponding binary encoding. We consider the two moduli cases separately.
A. Modulo $2^n - 1$
In order to get the binary encoding X of an SUT-RNS encoded modulo $2^n - 1$ number $X_{SUT\text{-}RNS}$, we need to add in modulo $2^n - 1$ arithmetic the following four n-bit vectors, as shown in dot notation in … Since, for every bit z, it holds that …
(c) the unibits vector, denoted as U. Unibits can be treated as doublebits or, equivalently, as posibits in the next higher bit position, as long as we also consider a correction equal to -R. Hence, $U = 0 \ldots 0\, x'_{(k-1)h}\, 0 \;\; 0 \ldots 0\, x'_{(k-2)h}\, 0 \;\; \ldots \;\; 0 \ldots 0\, x'_0\, 0$. Instead of using a 4-operand modulo $2^n - 1$ adder, we can merge the four vectors into two and use only a 2-operand modulo $2^n - 1$ adder. The posibits of P along with the negabits of N form an n-bit vector denoted as PN. PN is actually the main part of the … A modulo 255 adder (which is equivalent to an end-around-carry binary adder) with PN and $PNUC^-$ as inputs produces the value 01101000 at the output, which is equal to 104.
B. Modulo $2^n + 1$
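Two facts used above can be checked with a short Python sketch: that unibits at positions ih equal posibits one position higher minus R, and that an end-around-carry binary adder adds modulo $2^n - 1$. All function names are introduced here for illustration. The operands 200 and 159 are hypothetical values chosen so that the sum is 104; they are not the PN and PNUC vectors of the example in the text.

```python
def make_R(k, h):
    """R = (2^(k*h) - 1) / (2^h - 1): a 1 at every bit position i*h."""
    return ((1 << (k * h)) - 1) // ((1 << h) - 1)

def unibit_sum(bits, h):
    """Sum of unibits (-1 or +1, stored as 0 or 1) located at bit positions i*h."""
    return sum((2 * b - 1) << (i * h) for i, b in enumerate(bits))

def unibits_as_posibits(bits, h):
    """Vector U: each stored unibit moved one position up and read as a posibit."""
    return sum(b << (i * h + 1) for i, b in enumerate(bits))

def add_mod_2n_minus_1(a, b, n):
    """End-around-carry addition of two n-bit operands (modulo 2^n - 1)."""
    mask = (1 << n) - 1
    total = a + b
    # Feed the carry out back into the least significant position.
    return (total & mask) + (total >> n)

result = add_mod_2n_minus_1(200, 159, 8)  # hypothetical operands
```

Note that an end-around-carry adder of this kind may return the all-ones pattern, which is a second representation of zero modulo $2^n - 1$.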
A similar approach can be used in the modulo $2^n + 1$ case as well. In order to get the binary encoding X of an SUT-RNS encoded modulo $2^n + 1$ number $X_{SUT\text{-}RNS}$, we need to add in modulo $2^n + 1$ arithmetic four n-bit vectors (see Fig. 7): the vectors P, N and U for the posibits, negabits and unibits, respectively, which are equal to those in the modulo $2^n - 1$ case, and a constant correction vector $C^+$, which in the case of modulo $2^n + 1$ is equal to …
The two n-bit vectors PN and $PNUC^+$ are then driven to an enhanced diminished-one modulo $2^n + 1$ adder [19] that produces the (n+1)-bit binary encoding of $X_{SUT\text{-}RNS}$, as shown in Fig. 8. Note that in $PNUC^+$ a constant correction term equal to -1 is also taken into account, since a diminished-one adder always increases the sum of its two input operands by one.
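The "plus one" behaviour of a diminished-one adder can be seen in a behavioural model. The sketch below assumes a plain inverted end-around-carry scheme with an (n+1)-bit result; it is not the enhanced architecture of [19], and the function name is hypothetical.

```python
def diminished_one_add(a, b, n):
    """Return (a + b + 1) mod (2^n + 1) for n-bit a and b, as an (n+1)-bit value."""
    mask = (1 << n) - 1
    total = a + b
    carry_out = total >> n
    if carry_out:
        # Inverted end-around carry contributes 0; keep the n low-order bits.
        return total & mask
    # Inverted end-around carry contributes 1; the result may equal 2^n.
    return total + 1
```

Because the adder itself contributes the extra one, a correction of -1 has to be folded into one of the operands beforehand, as the text notes for $PNUC^+$.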
V. EVALUATION AND EXPERIMENTAL RESULTS
In this section we first evaluate the forward and reverse converters that were proposed in Sections III and IV, respectively, and then present some experimental results based on CMOS VLSI circuit implementations.
The SUT-RNS forward converters for both modulo $2^n - 1$ and $2^n + 1$ are based on an n-bit parallel-prefix structure. A few gates are used to derive the sign bit s, which is then added with an extra prefix level and a level of 2-input XOR gates. Finally, Step II of forward conversion requires some extra 2-input XOR gates in parallel. Since the parallel-prefix structure has a logarithmic delay and all remaining subcircuits have small constant delays, we conclude that the forward converters are very efficient in delay. The SUT-RNS reverse converters are also very efficient, since they are based on modulo $2^n - 1$ or diminished-one modulo $2^n + 1$ adders whose input operands are formed at a minimum delay of an inverter. Furthermore, both the parallel-prefix structure in the forward converters and the modulo adders in the reverse converters can be designed using any desirable architecture.
We described in HDL forward and reverse converters for both moduli cases and for several values of n, k and h. For the forward converters we considered a Kogge-Stone [20] parallel-prefix structure. For the reverse converters we considered modulo $2^n - 1$ and diminished-one modulo $2^n + 1$ adders that follow the architectures of [21] and [22], respectively. After validating the correct operation of the HDL descriptions via simulation, we synthesized them in a power-characterized 90nm CMOS technology, using a standard delay optimization script, and derived estimates for area, delay and average power dissipation. The attained results, given in Table II, indicate that the proposed converters are very fast and require small area and power dissipation. Since we are not aware of any other work on forward and reverse modulo $2^n \pm 1$ SUT-RNS converters, no comparison with other proposals is possible.
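To make the logarithmic depth of the prefix structure concrete, the following Python sketch computes all carries of an n-bit addition with a Kogge-Stone style recurrence on generate and propagate bits. It is a software model written for illustration, not the HDL used for the reported results, and the function name is hypothetical.

```python
def kogge_stone_carries(a, b, n):
    """Return (carries, levels): carries[i] is the carry out of bit i of a + b."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]  # generate bits
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # propagate (half-sum) bits
    levels = 0
    dist = 1
    while dist < n:
        new_g, new_p = g[:], p[:]
        for i in range(dist, n):
            # Combine the group ending at bit i with the group ending at bit i - dist.
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2
        levels += 1
    return g, levels
```

For n = 8 the loop runs three times, matching the $\log_2 n$ prefix levels mentioned in Section III.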
VI. CONCLUSIONS
Redundant number systems can be used to reduce the carry propagation inside each channel in an RNS. SUT has been proposed as a redundant high-radix encoding for RNS that can improve the efficiency of BSD-based RNS, since it can utilize conventional arithmetic components such as full/half adders. We have presented in this paper, for the first time in the open literature, efficient forward and reverse converters for the SUT-RNS encoding for the two most commonly used moduli cases, that is, modulo $2^n \pm 1$. The forward converters are based on parallel-prefix binary adders and simple logic gates, whereas the reverse converters are based on parallel-prefix modulo $2^n \pm 1$ adders and simple logic gates.
The incorporation of the proposed converters in the various already proposed forward and reverse converters from/to binary to/from RNS is currently under investigation. This will make it possible to convert binary representations to SUT-RNS and vice versa without using a residue representation as an intermediate step.
