








SIGN DETECTION AND SIGNED INTEGER COMPARISON
FOR THREE-MODULI SET $\{2^n \pm 1, 2^{n+k}\}$
Abstract Comparison, division, and sign detection are considered to be complicated
operations in a residue number system (RNS). A straightforward solution is to
convert RNS numbers into binary format and then perform the complicated
operations using conventional binary operators. If efficient circuits are provided
for comparison, division, and sign detection, the application of RNS can be
extended to cases that include these operations.
For RNS comparison in the three-moduli set $\tau = \{2^n - 1, 2^{n+k}, 2^n + 1\}$ $(0 \le k \le n)$,
we have found only one hardware realization. In this paper, an efficient
RNS comparator is proposed for moduli set τ, which employs a sign-detection
method and operates more efficiently than its counterparts. The proposed sign
detector and comparator utilize dynamic range partitioning (DRP), which has
recently been presented for unsigned RNS comparison. The delay and cost of
the proposed comparator are lower than those of previous works, which makes it
appropriate for RNS applications with limited delay and cost budgets.
Keywords computer arithmetic, residue number system, signed integer comparison,
dynamic range partitioning
Citation Computer Science 22(3) 2021: 391–405
Copyright © 2021 Author(s). This is an open access publication, which can be used, distributed






392 Z. Torabi, S. Timarchi
1. Introduction
A number $X$ in a residue number system (RNS) with $k$ moduli $\{m_1, m_2, \ldots, m_k\}$ is
represented by the set of residues $\{x_1, x_2, \ldots, x_k\}$, where $x_i = |X|_{m_i}$ denotes the
remainder of the integer division $X/m_i$. If the moduli are pairwise coprime, the dynamic range $M$ is
maximized (i.e., $M = \prod_{i=1}^{k} m_i$). In RNS, operations like addition, subtraction, and
multiplication are performed in k parallel independent channels, which makes it a
promising candidate for applications that use frequent add/multiply operations such
as finite impulse response digital filters [4], data transmission [1], cryptography [18],
and image processing [28]. Furthermore, digital signal processing (DSP) has employed
RNS due to such properties as carry-free operations, parallelism, and modularity [2].
Recently, RNS has been used to achieve the energy-efficient hardware implementation
of neural networks for inference computations [16,17].
Since RNS is a non-weighted number system, comparison, division, sign detection, and
overflow detection are difficult. These difficult operations are often required for several
nonlinear procedures, such as median and rank-order filtering [20]. Several RNS comparison
[3,5,10,12,13,20,22–24,27], sign-detection [9], and division [25] methods have
been proposed over the past two decades. Given the role of comparison in other
complicated operations such as division, sign detection, and overflow detection, it is expected
that a cost-effective and high-speed implementation of comparison will assist in
improving these operations. Moreover, the RNS comparators used in
some applications such as video coding [19] and deep neural networks [17] lead to
better results than a binary number system.
Besides the popular moduli set $\{2^n \pm 1, 2^n\}$, several three-moduli sets with dynamic ranges
of more than $3n$ bits have been reported in the relevant literature (along with their
reverse converters) [6, 7, 14, 15]. Moduli set $\tau = \{2^n - 1, 2^{n+k}, 2^n + 1\}$ has efficient
balanced arithmetic operations, with a $(3n + k)$-bit dynamic range and an efficient
reverse converter [8]. Besides the general methods for RNS comparison, there is only
one comparator that is designed specifically for this moduli set [20]. By considering
the efficiency of the dynamic range partitioning method for unsigned number com-
parison (as is experienced in several three-moduli sets), we are motivated to apply
this method for sign detection and signed number comparison to moduli set τ . In this
paper, we propose a signed number comparator for moduli set τ by employing the
DRP method. With analytical evaluations, we show that the proposed DRP-based
signed integer comparator outperforms other previous methods [8,20]. The work of [8]
only provides a reverse converter for moduli set τ , so we have augmented its converter
with a normal binary comparator.
The rest of this paper is organized as follows. Section 2 reviews a number of efficient
RNS comparison methods in the literature. In Sections 3 and 4, the proposed signed
RNS comparator for moduli set τ is described (along with its implementation
details). Section 5 provides a comparison of the proposed comparator with the existing
schemes presented in [20] and [8]. Finally, we draw our conclusions based on the





Sign detection and signed integer comparison for. . . 393
2. Background material
Here, we first describe the representation of signed numbers in RNS and then review
some previous RNS comparison schemes. To represent negative numbers, the dynamic range
of an arbitrary moduli set (i.e., $[0, M)$) is partitioned into two nearly equal parts:
$[0, [M/2])$ and $[[M/2], M)$ are considered to be the positive
and negative intervals, respectively, where the absolute value of a negative number $X$
is $M - X$. The sign of an RNS number $X$ can be detected as follows:
$$\mathrm{sign}(X) = \begin{cases} 0 & \text{if } 0 \le X < [M/2] \\ 1 & \text{if } [M/2] \le X < M. \end{cases}$$
Example: Consider moduli set $\{3, 4, 5\}$ with $M = 60$. $X = (1, 2, 3) = 58 > [60/2]$;
therefore, $X$ is a negative number. The absolute value of $X$ equals $M - X = 60 - 58 = 2$;
since $X$ is negative, we have $X = (1, 2, 3) = -2$.
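The sign rule and the example above can be checked with a short Python sketch. It is only an illustration: the helper names `crt_decode` and `signed_value` are hypothetical, the decoding uses a textbook CRT rather than any circuit from this paper, and `M // 2` is an assumed reading of [M/2] (the rounding does not matter whenever M is even, as in this example).

```python
from math import prod

def crt_decode(residues, moduli):
    """Return the unique X in [0, M) whose residues match (moduli pairwise coprime)."""
    M = prod(moduli)
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        # pow(M_i, -1, m_i) is the multiplicative inverse of M_i modulo m_i (Python 3.8+)
        total += x_i * M_i * pow(M_i, -1, m_i)
    return total % M

def signed_value(residues, moduli):
    """Map X in [0, M) to its signed value: [0, M/2) is positive, [M/2, M) is negative."""
    M = prod(moduli)
    X = crt_decode(residues, moduli)
    return X if X < M // 2 else X - M

print(crt_decode((1, 2, 3), (3, 4, 5)))    # 58
print(signed_value((1, 2, 3), (3, 4, 5)))  # -2
```

Decoding the full number is exactly the costly step that the rest of the paper tries to avoid; the sketch only serves as a reference model for the sign convention.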
Several approaches have attempted to develop RNS comparison methods (mostly for
unsigned numbers) [3,5,10,12,22–24,27]. Some other works compared signed numbers
from a different perspective in which the dynamic range included both positive and
negative numbers [13,20].
The straightforward technique utilizes a reverse converter to convert RNS operands
to binary and then compares them with a normal comparator. Such converters are
usually based on the Chinese remainder theorem (CRT) [21] or mixed-radix conversion
(MRC) [21]. In addition, there are some other comparators [2, 12, 24, 27] that do
not convert numbers completely and compare operands during the reverse-conversion
process.
In [10], a method for non-modular operations in RNS by summation sets of floating-
point numbers is proposed; however, this method is efficient only on modern massively
parallel general-purpose computing platforms such as GPU-based systems.
The authors of [13] proposed a fast signed number comparator for three four-moduli
sets based on the simple quantization method. Such a method exploits the unique
number theoretic properties of the moduli.
Some other methods use the diagonal function [3, 5] to compare RNS
numbers. A diagonal number based on (1) is assigned to each number in the dynamic
range. To compare two operands $X$ and $Y$, $D(X)$ and $D(Y)$ are computed; the
result of the comparison is then determined by comparing $D(X)$ and $D(Y)$. Like CRT,
this method is based on modular operations in a large modulus $SQ$, where
$SQ = \sum_{i=1}^{n} (M/m_i)$, and the values of $\mu_i$ satisfy $|\mu_i m_i|_{SQ} = 1$.
$$D(X) = |\mu_1 x_1 + \mu_2 x_2 + \cdots + \mu_n x_n|_{SQ} \tag{1}$$
The works of [22, 23] compare unsigned RNS numbers via DRP in moduli sets
$\{2^n, 2^n \pm 1\}$ and $\{2^n, 2^n - 1, 2^{n+1} - 1\}$. Partitioning the dynamic range of an
arbitrary three-moduli set (such as $\{m_1, m_2, m_3\}$) produces $m_1$ partitions of size
$m_2 \times m_3$, each of which is assigned a primary integer identifier in $[0, m_1)$. Likewise, each





integer identifier in $[0, m_2)$. Consequently, each operand of an RNS comparison has exactly
one primary integer identifier and one secondary integer identifier.
Therefore, the comparison operation can be reduced to comparing the
primary and secondary integer identifiers of the operands (represented by $p_1(X)$ in (3)
and $p_2(X)$ in (2)) as well as their modulo-$m_3$ residues.
Equations (2) and (3) (which are borrowed from [22]) show how the DRP components
of an operand $X = (x_1, x_2, x_3)$ can be derived from the corresponding RNS residues,
where $x_{23} = |X|_{m_2 m_3}$, $M_1 = m_2 m_3$, and the multiplicative inverse $|A^{-1}|_B$ satisfies
$|A \times A^{-1}|_B = 1$.
$$p_2(X) = \big|\,|m_3^{-1}|_{m_2}\,(x_2 - x_3)\big|_{m_2} \tag{2}$$
$$p_1(X) = \big|\,|M_1^{-1}|_{m_1}\,(x_1 - x_{23})\big|_{m_1} \tag{3}$$
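As an illustration of (2) and (3), the following Python sketch computes both identifiers for an arbitrary three-moduli set with pairwise coprime moduli. The function name `drp_components` is hypothetical, and the sketch assumes the relation $x_{23} = p_2(X) m_3 + x_3$ to rebuild $x_{23}$ from the residues; it is a software model, not the hardware realization.

```python
def drp_components(x1, x2, x3, m1, m2, m3):
    """Return (p1, p2) for residues (x1, x2, x3) of pairwise coprime moduli (m1, m2, m3)."""
    # Equation (2): secondary identifier, i.e. which block of size m3 inside [0, m2*m3)
    p2 = (pow(m3, -1, m2) * (x2 - x3)) % m2
    # x23 = |X| modulo m2*m3, rebuilt from p2 and x3 (assumed relation, see lead-in)
    x23 = p2 * m3 + x3
    # Equation (3): primary identifier, i.e. which partition of size M1 = m2*m3
    M1 = m2 * m3
    p1 = (pow(M1, -1, m1) * (x1 - x23)) % m1
    return p1, p2

# Exhaustive check on a small set: X = p1*M1 + p2*m3 + x3 for every X in [0, M)
m1, m2, m3 = 8, 3, 5
for X in range(m1 * m2 * m3):
    p1, p2 = drp_components(X % m1, X % m2, X % m3, m1, m2, m3)
    assert X == p1 * (m2 * m3) + p2 * m3 + X % m3
```

Because the triple (p1, p2, x3) acts like a mixed-radix representation of X, comparing two operands lexicographically on these three values gives the same result as comparing the decoded integers.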
The only comparator for moduli set τ is proposed in [20], which compares signed
numbers. This comparator is based on an optimized version of MRC for sign identi-
fication and performs the comparison through utilizing the sign bits of the operands
as well as their difference. The authors of [20] assumed that the inputs of the com-
parison method are two RNS numbers with two extra bits that identify their signs.
Such a comparator can be used in an architecture in which the sign of the result is detected
and stored after every RNS operation. For multiplication/division
operations, the sign of the result is identified with a simple XOR gate in parallel with
the multiplication/division computation and does not cause any overhead. Unlike
multiplication/division, the sign of the addition/subtraction result is identified by
comparing the result with [M/2] (via MRC digits). Therefore, the adder and sub-
tractor should be augmented with the comparator.
To capture the overhead cost of the sign-detection circuit after each RNS operation,
we have synthesized a modular adder with and without the sign-identification circuit
for moduli set τ with n = 8 and k = 0 in TSMC 90-nm standard CMOS technology
using Synopsys Design Compiler for delay constraints in the range of 0.7–2.0 ns





















[Figure 1 (plot not recovered). Legend: addition; addition with sign detection]






The delay of the addition in moduli set τ with and without the sign-detection
circuit is 1.6 and 0.7 ns, respectively. Adding the sign-detection circuit after each
addition/subtraction costs up to twice the delay and power dissipation of a simple
addition/subtraction. Such a growth in power and delay undermines the
strengths of RNS that make it an appropriate candidate for applications
with repeated usage of addition and multiplication operations.
3. Proposed DRP-based signed integer comparator
The proposed comparator is presented as Algorithm 1. First, the signs of the operands
are identified; then, if their signs are identical, the comparison operation is performed
by comparing their corresponding DRP components. The E, G, and L values in this
function denote the equality of the inputs, X > Y , and X < Y , respectively. In the
following, the proposed sign-detection method, which is based on the
DRP method [22], is explained first. Then, the computation of the DRP components in
moduli set τ is described.
Algorithm 1: Signed number comparison in moduli set τ
function comparison (inputs: X : (x1, x2, x3), Y : (y1, y2, y3), output: Comp)
  if sign(X) = sign(Y) then
    if (p1(X) > p1(Y)) then Comp = G // X > Y
    else if (p1(X) < p1(Y)) then Comp = L // X < Y
    else if (p2(X) > p2(Y)) then Comp = G // X > Y
    else if (p2(X) < p2(Y)) then Comp = L // X < Y
    else if (x3 > y3) then Comp = G // X > Y
    else if (x3 < y3) then Comp = L // X < Y
    else Comp = E // X = Y
  else if sign(X) = 0 then Comp = G // X is positive, Y is negative
  else Comp = L // X is negative, Y is positive
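A possible software model of Algorithm 1 is sketched below in Python. It is an illustration under stated assumptions rather than the authors' implementation: the names `drp_tau` and `compare` are hypothetical, the sign is read from the most significant bit of $p_1$ (as Theorem 1 states), and equal $p_1$ and $p_2$ values are resolved by the modulo-$(2^n + 1)$ residues, as described for the final comparison stage in Section 4.

```python
def drp_tau(x, n, k):
    """Return (p1, p2, x3) for x = (x1, x2, x3): x1 mod 2^(n+k), x2 mod 2^n - 1, x3 mod 2^n + 1."""
    x1, x2, x3 = x
    m1, m2, m3 = 1 << (n + k), (1 << n) - 1, (1 << n) + 1
    p2 = (pow(m3, -1, m2) * (x2 - x3)) % m2          # equation (2)
    x23 = p2 * m3 + x3
    p1 = (pow(m2 * m3, -1, m1) * (x1 - x23)) % m1    # equation (3)
    return p1, p2, x3

def compare(x, y, n, k):
    """Return 'G', 'L' or 'E' for the signed comparison (requires n >= 2 and 0 <= k <= n)."""
    dx, dy = drp_tau(x, n, k), drp_tau(y, n, k)
    sign_x = dx[0] >> (n + k - 1)    # most significant bit of p1(X)
    sign_y = dy[0] >> (n + k - 1)    # most significant bit of p1(Y)
    if sign_x != sign_y:
        return "G" if sign_x == 0 else "L"
    # Same sign: tuples compare lexicographically, i.e. p1 first, then p2, then x3
    if dx > dy:
        return "G"
    if dx < dy:
        return "L"
    return "E"

# n = 2, k = 1: X = 100 represents -20 and Y = 10 represents +10 (M = 120)
print(compare((100 % 8, 100 % 3, 100 % 5), (10 % 8, 10 % 3, 10 % 5), 2, 1))  # L
```

When both operands have the same sign, the unsigned order of the representatives in [0, M) coincides with the signed order, which is why the same three comparisons serve both positive and negative pairs.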




Let $m_1 = 2^{n+k}$, $m_2 = 2^n - 1$, and $m_3 = 2^n + 1$. The DRP method described in (2)
and (3) splits the dynamic range into $2^{n+k}$ equal partitions in moduli set τ (each
of which contains consecutive numbers; see Table 1). Each number in the dynamic
range belongs to exactly one partition. Each row of Table 1 presents a partition of the
dynamic range with its primary and secondary integer identifiers (i.e., $p_1(X)$ and
$p_2(X)$), where $M_1 = M/m_1 = 2^{2n} - 1$. As shown in Table 1, the values
of $p_2(X)$ are different in each subset of the dynamic range that has a unique $p_1(X)$.
For example, the values of $p_2(X)$ for the first row of Table 1 with $p_1(X) = 0$ are
presented in Table 2.
In moduli set τ, $[0, 2^{n+k-1}(2^{2n} - 1))$ and $[2^{n+k-1}(2^{2n} - 1), 2^{n+k}(2^{2n} - 1))$ are the
intervals for the positive and negative numbers, respectively. As was proven in The-





$p_1(X)$.
Theorem 1: An RNS number $X$ in moduli set τ is negative if and only if
$p_{1,n+k-1} = 1$, where $p_1(X) = p_{1,n+k-1} \cdots p_{1,0}$.
Proof: $p_{1,n+k-1} = 1$ indicates that $p_1(X) \ge 2^{n+k-1}$. Equation (4) describes
$X$ in terms of DRP components, where $p_1(X) \in [0, 2^{n+k})$, $M_1 = 2^{2n} - 1$, and
$x_{23} = |X|_{2^{2n}-1} \in [0, 2^{2n} - 1)$.
$$X = p_1(X) M_1 + x_{23} = p_1(X)(2^{2n} - 1) + x_{23} \tag{4}$$
Based on (4), if $p_1(X) < 2^{n+k-1}$ (i.e., $p_1(X) \le 2^{n+k-1} - 1$), then
$X \le (2^{n+k-1} - 1)(2^{2n} - 1) + x_{23} < 2^{n+k-1}(2^{2n} - 1)$ because $x_{23} < 2^{2n} - 1$,
which shows that $X$ lies in the positive interval. Similarly, $p_1(X) \ge 2^{n+k-1}$
leads to $X \ge 2^{n+k-1}(2^{2n} - 1)$, which indicates that $X$ is negative.
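A small worked instance may help to see Theorem 1 at work. The values below are chosen for illustration only (they do not come from the paper): $n = 2$ and $k = 1$, so that $m_1 = 8$, $m_2 = 3$, $m_3 = 5$, $M = 120$, and $M_1 = 15$; the identifiers are evaluated with (2) and (3).

```latex
% Worked example with illustrative values: n = 2, k = 1
% m_1 = 2^{n+k} = 8, m_2 = 2^n - 1 = 3, m_3 = 2^n + 1 = 5, M = 120, M_1 = 15
\begin{align*}
X &= 100 \;\Rightarrow\; (x_1, x_2, x_3) = (4, 1, 0) \\
p_2(X) &= \big|\,|5^{-1}|_3\,(x_2 - x_3)\big|_3 = |2 \cdot 1|_3 = 2 \\
x_{23} &= x_3 + m_3\,p_2(X) = 0 + 5 \cdot 2 = 10 \\
p_1(X) &= \big|\,|15^{-1}|_8\,(x_1 - x_{23})\big|_8 = |7 \cdot (-6)|_8 = 6 = (110)_2 \\
p_{1,2} &= 1 \;\Rightarrow\; X \text{ is negative; indeed } 100 \ge M/2 = 60
           \text{ and } X = 100 - 120 = -20
\end{align*}
```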
Table 1
p1(X) values for moduli set τ
subset of dynamic range                      p1(X)            p2(X)
$[0, M_1)$                                   $0$              $[0, 2^n - 1)$
...                                          ...              ...
$[2^n M_1, (2^n + 1)M_1)$                    $2^n$            $[0, 2^n - 1)$
...                                          ...              ...
$[(2^{n+k} - 1)M_1, 2^{n+k}M_1)$             $2^{n+k} - 1$    $[0, 2^n - 1)$
Table 2
Values of p1(X) and p2(X) for $0 \le X < (2^{2n} - 1)$
subset of dynamic range                      p1(X)    p2(X)
$[0, m_3)$                                   $0$      $0$
$[m_3, 2m_3)$                                $0$      $1$
...                                          ...      ...
$[(2^n - 2)m_3, (2^n - 1)m_3)$               $0$      $2^n - 2$
3.2. Computation of DRP components
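$p_1(X)$ and $p_2(X)$ are now derived for moduli set τ. The multiplicative inverses that
are required for the DRP components are given by Properties 1 and 2. Based
on (2), (3), and Properties 1 and 2, $p_1(X)$ and $p_2(X)$ are described in (5) and (6).
Property 1: $|(2^n + 1)^{-1}|_{2^n - 1} = 2^{n-1}$
Property 2: $|(2^{2n} - 1)^{-1}|_{2^{n+k}} = -1$
$$p_1(X) = \big|\,|(2^{2n} - 1)^{-1}|_{2^{n+k}}\,(x_1 - x_{23})\big|_{2^{n+k}} = |x_{23} - x_1|_{2^{n+k}} \tag{5}$$
$$p_2(X) = \big|\,|(2^n + 1)^{-1}|_{2^n - 1}\,(x_2 - x_3)\big|_{2^n - 1} = |2^{n-1}(x_2 - x_3)|_{2^n - 1} \tag{6}$$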
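Properties 1 and 2 are easy to confirm numerically, and doing so also shows how the inverses disappear from the identifier computation. The Python sketch below is illustrative only; `properties_hold` and `drp_tau_simplified` are hypothetical names, and the relation $x_{23} = x_3 + (2^n + 1) p_2(X)$ is assumed when rebuilding $x_{23}$.

```python
def properties_hold(n, k):
    """Check Properties 1 and 2 numerically for one (n, k) with n >= 2 and 0 <= k <= n."""
    m1, m2, m3 = 1 << (n + k), (1 << n) - 1, (1 << n) + 1
    prop1 = ((1 << (n - 1)) * m3) % m2 == 1     # 2^(n-1) inverts 2^n + 1 modulo 2^n - 1
    prop2 = ((m1 - 1) * (m2 * m3)) % m1 == 1    # -1 (i.e. m1 - 1) inverts 2^(2n) - 1 modulo 2^(n+k)
    return prop1 and prop2

def drp_tau_simplified(x1, x2, x3, n, k):
    """p1 and p2 with the multiplicative inverses replaced by their closed forms."""
    m2, m3 = (1 << n) - 1, (1 << n) + 1
    p2 = ((1 << (n - 1)) * (x2 - x3)) % m2      # equation (2) with Property 1
    x23 = x3 + m3 * p2                          # assumed relation, see lead-in
    p1 = (x23 - x1) % (1 << (n + k))            # equation (5)
    return p1, p2

for n in range(2, 5):
    for k in range(0, n + 1):
        assert properties_hold(n, k)
```

Multiplying by $2^{n-1}$ modulo $2^n - 1$ is a cyclic rotation of the bits, and the inverse $-1$ turns the second multiplication into a subtraction, which is why neither identifier needs a multiplier in hardware.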





4. Implementation
The actual delay and cost of the proposed comparison method directly depend on the
complexity of the $p_1(X)$ and $p_2(X)$ generators. In this section, we provide
implementation details for generating $p_1(X)$ and $p_2(X)$. To derive implementation-friendly
equations for the DRP components (in consideration of $\bar{x}_3 = 2^{n+1} - 1 - x_3$, where the
overbar denotes the bitwise complement), $p_2(X)$ can
be simplified to (7). Given that $|2^{n-1}x_2 + 2^{n-1}\bar{x}_3 - 2^{n-1}|_{2^n-1} = |U + V|_{2^n-1}$, $U$ and
$V$ are obtained through a modulo-$(2^n - 1)$ carry-save adder (CSA), where $U + V = w$
and $w = w_n w_{n-1} \cdots w_0$. Equation (8) is obtained by the well-known property of
modulo-$(2^n - 1)$ arithmetic (i.e., $|2^n w_n|_{2^n-1} = w_n$). Table 3 illustrates the weighted-bit
arrangements for the three terms of $p_2(X)$, where $U = u_{n-1} \cdots u_0$, $V = v_{n-1} \cdots v_0$,
and $b_{n-1} \cdots b_0$ and $c_n \cdots c_0$ represent $x_2$ and $x_3$, respectively.
$$p_2(X) = |2^{n-1}x_2 + 2^{n-1}(\bar{x}_3 - 2^{n+1} + 1)|_{2^n-1} = |2^{n-1}x_2 + 2^{n-1}\bar{x}_3 - 2^{n-1}|_{2^n-1} \tag{7}$$
$$p_2(X) = |U + V|_{2^n-1} = w - 2^n w_n + w_n \tag{8}$$
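The folding step in (8) can be illustrated with a few lines of Python. The function name `fold_end_around` is hypothetical, and the sketch only models the arithmetic identity $2^n \equiv 1 \pmod{2^n - 1}$, not the adder circuit itself.

```python
def fold_end_around(w, n):
    """Fold an (n+1)-bit value w into n bits using 2^n = 1 (mod 2^n - 1)."""
    w_n = (w >> n) & 1               # most significant bit of w
    return w - (w_n << n) + w_n      # w - 2^n * w_n + w_n

n = 4
m = (1 << n) - 1
for U in range(1 << n):
    for V in range(1 << n):
        r = fold_end_around(U + V, n)
        assert r % m == (U + V) % m  # same residue class modulo 2^n - 1
        assert 0 <= r <= m           # fits in n bits; r == m is the all-ones form of 0
```

Note that the folded value can be the all-ones pattern $2^n - 1$, which is congruent to 0; this double representation of zero is a common feature of modulo-$(2^n - 1)$ adders and has to be handled wherever a unique value of $p_2(X)$ is required.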
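Table 3
Bit organization of p2(X)
$2^{n-1}$      ...   $2^1$         $2^0$         Component
$b_0$          ...   $b_2$         $b_1$         $|2^{n-1}x_2|_{2^n-1}$
$\bar{c}_0$    ...   $\bar{c}_2$   $\bar{c}_1$   $|2^{n-1}\bar{x}_3|_{2^n-1}$
$\bar{c}_n$    ...   $1$           $1$           $|-2^{n-1}|_{2^n-1}$
$u_{n-1}$      ...   $u_1$         $u_0$         $U$
$v_{n-1}$      ...   $v_1$         $v_0$         $V$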
By the use of the new CRT [26] and (8), we replace $x_{23}$ with $x_3 + (2^n + 1)|U + V|_{2^n-1}$
in (5), which leads to (9), where $\bar{x}_1 = 2^{n+k} - 1 - x_1$. The bit organization of $p_1(X)$
is illustrated in Table 4, where $a_{n+k-1} \cdots a_0$ represents $x_1$.
$$\begin{aligned}
p_1(X) &= \big|x_3 + (2^n + 1)\,|2^{n-1}(x_2 - x_3)|_{2^n-1} - x_1\big|_{2^{n+k}} \\
&= \big|x_3 + (2^n + 1)\,|U + V|_{2^n-1} - x_1\big|_{2^{n+k}} \\
&= |x_3 + (2^n + 1)(w - 2^n w_n + w_n) - x_1|_{2^{n+k}} \\
&= |x_3 + 2^n w + w + w_n - x_1|_{2^{n+k}} \\
&= |x_3 + (2^n + 1)w + w_n + \bar{x}_1 + 1|_{2^{n+k}}
\end{aligned} \tag{9}$$
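The last steps of this derivation can be sanity-checked numerically. The Python sketch below compares the form that uses the folded sum $w - 2^n w_n + w_n$ with the final add-only form. It checks only this algebraic step (not the CSA that produces $w$); the function names are hypothetical, and the complemented operand is assumed to be the $(n+k)$-bit one's complement $2^{n+k} - 1 - x_1$.

```python
def p1_with_folded_sum(x1, x3, w, n, k):
    """Form that multiplies 2^n + 1 by the folded value w - 2^n*w_n + w_n."""
    w_n = (w >> n) & 1
    folded = w - (w_n << n) + w_n
    return (x3 + ((1 << n) + 1) * folded - x1) % (1 << (n + k))

def p1_add_only(x1, x3, w, n, k):
    """Final form: additions only, with -x1 replaced by the one's complement of x1 plus 1."""
    w_n = (w >> n) & 1
    x1_bar = (1 << (n + k)) - 1 - x1
    return (x3 + ((1 << n) + 1) * w + w_n + x1_bar + 1) % (1 << (n + k))

n = 3
for k in range(0, n + 1):                      # the identity relies on k <= n
    for x1 in range(1 << (n + k)):
        for x3 in range((1 << n) + 1):
            for w in range(1 << (n + 1)):
                assert p1_with_folded_sum(x1, x3, w, n, k) == p1_add_only(x1, x3, w, n, k)
```

The two forms agree because $(2^n + 1)(2^n - 1)w_n = (2^{2n} - 1)w_n \equiv -w_n$ modulo $2^{n+k}$ whenever $2n \ge n + k$, which is exactly the condition $k \le n$ imposed on the moduli set.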





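Table 4
Bit organization of p1(X)
$2^{n+k-1}$          ...   $2^n$          $2^{n-1}$          $2^{n-2}$          ...   $2^0$
                           $c_n$          $c_{n-1}$          $c_{n-2}$          ...   $c_0$
$u_{k-1}$            ...   $u_0$          $u_{n-1}$          $u_{n-2}$          ...   $u_0$
$v_{k-1}$            ...   $v_0$          $v_{n-1}$          $v_{n-2}$          ...   $v_0$
$\bar{a}_{n+k-1}$    ...   $\bar{a}_n$    $\bar{a}_{n-1}$    $\bar{a}_{n-2}$    ...   $\bar{a}_0$
                                                                                      $1$
                                                                                      $w_n$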
The value of $p_2(X)$ is obtained after the $n$-bit modulo-$(2^n - 1)$ CSA, whose outputs
(i.e., $U$ and $V$) feed an $n$-bit modular adder. The implementation of $p_1(X)$ includes
an array of 4:2 compressors (42C) that reduces the operands other than $w_n$
to two. Two adders, one for each possible value of $w_n$, receive the outputs
of the 42C array (as shown in Figure 2). To obtain $p_1(X)$, the outputs of these two adders are
multiplexed by $w_n$. In another realization (NewRed.), the two $(n+k)$-bit adders are
reduced to one $(n+k)$-bit adder, the carry input of which is $w_n$, where Red. indicates the
reduced version of the proposed architecture.
[Figure 2 block labels: n-bit modulo CSA; modulo adder; two (n+k)-bit adders]
Figure 2. Proposed architecture of $p_1(X)$ and $p_2(X)$ generators for the moduli set τ
4.1. Parallel prefix realization
In the high-speed design of the proposed comparator (called NewPPA), the two simple
(n+k)-bit adders of Figure 2 can be replaced by one parallel prefix adder (PPA) [11],
which includes a carry input (as shown in Figure 3). In the PPA shown in Figure 2, the





for two inputs $u = u_{n+k-1} \cdots u_0$ and $v = v_{n+k-1} \cdots v_0$ are computed as follows:
$$g_i = u_i \wedge v_i, \quad p_i = u_i \vee v_i$$
Using $g_i$ and $p_i$, group propagate $P_{i:j}$ and group generate $G_{i:j}$ are computed, which
indicate the carry propagation and generation ability within positions $j$ to $i$, where
$j < i$, $G_{i:0} = G_i$, and $P_{i:0} = P_i$. After the computation of carries $c_i$, sum bits $s_i$ can
be computed in a straightforward way as
$$h_i = u_i \oplus v_i, \quad s_i = h_i \oplus c_i.$$
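The generate/propagate formulation above can be modeled bit by bit in Python. The sketch below is illustrative and rests on an assumption: it uses a Kogge-Stone prefix network, whereas the text does not state which prefix structure is used. The function name `prefix_add` is hypothetical, and the carry input is merged after the prefix stage as $c_{i+1} = G_{i:0} \vee (P_{i:0} \wedge c_{in})$.

```python
def prefix_add(u, v, width, c_in=0):
    """Add two width-bit numbers plus a carry input with a Kogge-Stone style prefix network."""
    g = [(u >> i) & (v >> i) & 1 for i in range(width)]    # generate:  g_i = u_i AND v_i
    p = [((u >> i) | (v >> i)) & 1 for i in range(width)]  # propagate: p_i = u_i OR v_i
    h = [((u >> i) ^ (v >> i)) & 1 for i in range(width)]  # half sum:  h_i = u_i XOR v_i
    G, P = g[:], p[:]
    d = 1
    while d < width:                                       # about log2(width) prefix levels
        G_new, P_new = G[:], P[:]
        for i in range(d, width):
            G_new[i] = G[i] | (P[i] & G[i - d])
            P_new[i] = P[i] & P[i - d]
        G, P = G_new, P_new
        d *= 2
    # Now G[i] = G_{i:0} and P[i] = P_{i:0}; merge the carry input
    carries = [c_in] + [G[i] | (P[i] & c_in) for i in range(width)]
    s = 0
    for i in range(width):
        s |= (h[i] ^ carries[i]) << i                      # s_i = h_i XOR c_i
    return s, carries[width]

for u in range(32):
    for v in range(32):
        for c in (0, 1):
            s, c_out = prefix_add(u, v, 5, c)
            assert s + (c_out << 5) == u + v + c
```

Handling the carry input after the prefix tree costs one extra gate level, which is what allows a late-arriving signal such as $w_n$ to be absorbed without a second adder.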
On the other hand, the prefix design of PPA can be extended to achieve the sum
of two operands and a carry input (i.e., wn) (as shown in Figure 3). Therefore, the
critical delay path (CDP) of this design consists of an array of CSA, an n-bit modular
adder, one-stage PPA, and an XOR array. Figure 3 depicts the required PPA for
generating p1(X) in the aforementioned design.
[Figure 3 block labels: k-bit CSA; 2-level n-bit modulo CSA; modulo adder; (n+k)-bit parallel prefix adder]
Figure 3. PPA architecture of adder in NewPPA, which includes carry as input (i.e., $w_n$)
An evaluation of the proposed designs is discussed in the next section. If the
operands have identical signs, a final comparison of the two operands ($X = (x_1, x_2, x_3)$





[$p_2(X)$, $p_2(Y)$], [$x_3$, $y_3$] into three different comparators. In other words, the final
comparison operation can be reduced to primarily comparing $p_1(X)$ and $p_1(Y)$. In
the event of $p_1(X) = p_1(Y)$, a comparison of $p_2(X)$ and $p_2(Y)$ yields the overall
comparison result unless they are also equal; in such a case, a comparison of the
modulo-$(2^n + 1)$ residues yields the final result.
5. Discussion
The proposed comparison method in this paper includes $p_1(X)$ and $p_2(X)$ generators
followed by three comparators. Based on the $p_1(X)$ generator, three different
implementations (namely, New, NewRed., and NewPPA) are proposed. New is exactly
based on Figure 2, while NewRed. and NewPPA eliminate the multiplexer that
selects the desired $p_1(X)$. In addition, to save cost, the two $(n + k)$-bit adders are merged into a
single $(n+k)$-bit carry-ripple adder (CRA) in NewRed. and a single $(n+k)$-bit PPA
in NewPPA.
As discussed in Section 2, the comparator of [20] is applicable to architectures
that have a sign-detection circuit after each operation (which leads to extra power
consumption). Therefore, we assume that, in the comparator of [20], the signs of
the operands are computed in the same way as the sign of their difference, which triples the
hardware or the execution time. Here, we compare the three proposed schemes with the
straightforward comparator and the comparator of [20]. The straightforward
comparator includes the best RNS reverse converter [8] for moduli set τ and a normal
binary comparator.
In Tables 5 and 6, the number of different components within the CDP and all of
the components in each design are illustrated, respectively. Based on the components
of each design in Table 5, the comparators of [8, 20] have more components within
the CDP, which leads to more delay than the proposed architectures. Indeed, the
comparators of [8,20] have three different adders, while New and NewPPA have only
two adders within the CDP.
Table 5
Number of components within CDP
Method CSA
CRA PPA
n-bit n+ k-bit 3n+ k-bit n+ k-bit
New 3 2
NewRed. 1 1 2
NewPPA 2 1 1
[20] 3 2 1
[8] 1 2 1
The total delays and costs of the proposed architectures as well as those of [8,20]
are described in Table 7. The maximum and minimum total cost of the τ -comparators





cost than the proposed comparators; however, its delay is drastically greater than the
others. The delay comparison is based on gate counting within the CDP, where the
delay of a simple two-input gate (i.e., AND, OR, NAND, and NOR) is denoted by
△G [11]. An $n$-bit CRA or CSA has a $2n$△G delay and a $7n$#G cost, while an $n$-bit
PPA has a $(2\log n + 3)$△G delay and a $(1.5n\log(n + 1) + 5n)$#G cost.
Table 6
Number of components in different comparators
Method Simple Gate
CSA CRA PPA
k-bit n-bit n+ k-bit n-bit n+ k-bit 3n+ k-bit n-bit n+ k-bit
New log(3n+ k) 2 4 4 5
NewRed. log(3n+ k) 2 3 4 5
NewPPA log(3n+ k) 2 3 2 5 2
[20] 3log(3n+ k) 3 3 6 6 3 3
[8] 12k + 2n 4 4 1
Table 7
Total delay and cost in RNS comparators
Method     Total delay (△G)                     Total cost (#G)
New        $4n + 4k + 12$                       $95n + 63k + \log(3n + k)$
NewRed.    $6n + 4k + 4$                        $88n + 56k + \log(3n + k)$
NewPPA     $4n + 2k + 8$                        $91n + 63k + 3n\log n + \log(3n + k)$
[20]       $4(n + k) + 2\log(n + k) + 15$       $135n + 99k + \log(3n + k) + 4.5n(\log(n) + \log(n + k))$
[8]        $10n + 2k + 4$                       $79n + 19k$
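To get a feel for these expressions, the delay column of Table 7 can be evaluated for concrete parameters. The Python sketch below is only a convenience for reading the table: the function name `total_delays` is hypothetical, and the logarithm is assumed to be base 2, which the text does not state explicitly.

```python
from math import log2

def total_delays(n, k):
    """Evaluate the total-delay expressions of Table 7 (in units of one gate delay)."""
    return {
        "New": 4 * n + 4 * k + 12,
        "NewRed.": 6 * n + 4 * k + 4,
        "NewPPA": 4 * n + 2 * k + 8,
        "[20]": 4 * (n + k) + 2 * log2(n + k) + 15,   # log assumed to be base 2
        "[8]": 10 * n + 2 * k + 4,
    }

for k in (0, 4, 8):
    d = total_delays(8, k)
    print(k, d, "fastest:", min(d, key=d.get))
```

Under these assumptions, NewPPA gives the smallest value for n = 8 at each of these k, and its margin over New grows with k because its k-coefficient is 2 rather than 4.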
To gain better insight into the merits of the proposed architectures, the
total delay and cost of each design are compared via the plots of Figures 4 and 5 for
$n = 8, 16$, where $k \le n$. As shown in Figures 4 and 5, the cost of the comparator
of [20] and the delay of [8] are each more than twice as much as those in the proposed
architectures. Based on the plots in Figure 4, the delays of two of the proposed
architectures (New and NewPPA) are lower than those of the comparators of [8] and [20]. In addition,
this confirms the superiority of NewPPA in terms of delay as $k$ increases. In
conclusion, the best previous τ-comparator [20] has more delay and cost than
the proposed method. The lower cost of the straightforward comparator [8] is
achieved at the expense of much more delay. The considerably high delay of [8] (which
is in contrast to the RNS properties) makes it inefficient for RNS applications. The
performances of the proposed architectures are better than their counterparts; among








































Figure 4. Delays of New, NewPPA, NewRed., and two previous τ-comparators for a) n = 8
and b) n = 16
































Figure 5. Costs of New, NewPPA, NewRed., and two previous τ -comparators for a) n = 8
and b) n = 16
6. Conclusion
Because of the costly and time-consuming comparison operation in a residue number





computations. However, research efforts have been ongoing to realize
efficient comparators in order to widen the application of this number system. Moduli set τ is
an extension of the popular moduli set $\{2^n - 1, 2^n, 2^n + 1\}$, whose reverse converter has
appeared in the relevant literature.
Dynamic range partitioning, which splits the dynamic range into several
intervals, helps to facilitate complicated operations. With the use of DRP components,
the proposed comparator identifies the signs of the operands and compares them. An
evaluation of the proposed designs and the best previous comparator showed that the
proposed parallel prefix design has less delay and that the proposed reduced design
has less cost for n = 8 and n = 16.
References
[1] Alhassan I.Z., Ansong E.D., Abdul-Salaam G., Alhassan S.: Enhancing Image
Security during Transmission using Residue Number System and k-shuffle, Earth-
line Journal of Mathematical Sciences, vol. 4(2), pp. 399–424, 2020.
[2] Bi S., Gross W.J.: The mixed-radix Chinese remainder theorem and its appli-
cations to residue comparison, IEEE Transactions on Computers, vol. 57(12),
pp. 1624–1632, 2008.
[3] Boyvalenkov P., Chervyakov N.I., Lyakhov P., Semyonova N., Nazarov A., Val-
ueva M., Boyvalenkov G., Bogaevskiy D., Kaplun D.: Classification of Moduli
Sets for Residue Number System With Special Diagonal Functions, IEEE Access,
vol. 8, pp. 156104–156116, 2020.
[4] Cardarilli G.C., Di Nunzio L., Fazzolari R., Nannarelli A., Petricca M., Re M.:
Design Space Exploration based Methodology for Residue Number System Digital
Filters Implementation, IEEE Transactions on Emerging Topics in Computing,
2020.
[5] Dimauro G., Impedovo S., Pirlo G., Salzo A.: RNS architectures for the
implementation of the 'diagonal function', Information Processing Letters, vol. 73(5-6),
pp. 189–198, 2000.
[6] Gbolagade K., Chaves R., Sousa L., Cotofana S.: Residue-to-binary converters
for the moduli set $\{2^{2n+1} - 1, 2^{2n}, 2^n - 1\}$. In: 2009 2nd International Conference
on Adaptive Science & Technology (ICAST), pp. 26–33, IEEE, 2009.
[7] Gbolagade K.A., Chaves R., Sousa L., Cotofana S.D.: An improved RNS reverse
converter for the $\{2^{2n+1} - 1, 2^n, 2^n - 1\}$ moduli set. In: Proceedings of 2010
IEEE International Symposium on Circuits and Systems, pp. 2103–2106, IEEE,
2010.
[8] Hiasat A.: A residue-to-binary converter with an adjustable structure for an
extended RNS three-moduli set, Journal of Circuits, Systems and Computers,





[9] Hiasat A., Sousa L.: Sign Identifier for the Enhanced Three Moduli Set
$\{2^{n+k}, 2^n - 1, 2^{n+1} - 1\}$, Journal of Signal Processing Systems, vol. 91(8),
pp. 953–961, 2019.
[10] Isupov K.: Using Floating-Point Intervals for Non-Modular Computations in
Residue Number System, IEEE Access, vol. 8, pp. 58603–58619, 2020.
[11] Kalampoukas L., Efstathiou C., Nikoloo D., Vergos H.T., Kalamatianos J.:
High-speed parallel-prefix modulo $2^n - 1$ adders, 2006. US Patent 7,155,473.
[12] Krasnobayev V., Yanko A., Koshman S.: A Method for arithmetic comparison of
data represented in a residue number system, Cybernetics and Systems Analysis,
vol. 52(1), pp. 145–150, 2016.
[13] Kumar S., Chang C.H., Tay T.F.: New algorithm for signed integer comparison
in $\{2^{n+k}, 2^n - 1, 2^n + 1, 2^{n \pm 1} - 1\}$ and its efficient hardware implementation, IEEE
Transactions on Circuits and Systems I: Regular Papers, vol. 64(6), pp. 1481–1493,
2016.
[14] Latha M.M., Rachh R.R., Mohan P.A.: RNS-to-Binary Converters for a
Three-Moduli Set $\{2^{n-1} - 1, 2^n - 1, 2^{n+k}\}$, IETE Journal of Education, vol. 58(1),
pp. 20–28, 2017.
[15] Mohan P.V.A.: RNS-To-Binary Converter for a New Three-Moduli Set
$\{2^{n+1} - 1, 2^n, 2^n - 1\}$, IEEE Transactions on Circuits and Systems II: Express
Briefs, vol. 54(9), pp. 775–779, 2007.
[16] Salamat S., Imani M., Gupta S., Rosing T.: Rnsnet: In-memory neural network
acceleration using residue number system. In: 2018 IEEE International Confer-
ence on Rebooting Computing (ICRC), pp. 1–12, IEEE, 2018.
[17] Samimi N., Kamal M., Afzali-Kusha A., Pedram M.: Res-DNN: A Residue Num-
ber System-Based DNN Accelerator Unit, IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 67(2), pp. 658–671, 2019.
[18] Schoinianakis D.: Residue arithmetic systems in cryptography: a survey on
modern security applications, Journal of Cryptographic Engineering, vol. 10(3),
pp. 249–267, 2020.
[19] Sousa L.: Efficient method for magnitude comparison in RNS based on two
pairs of conjugate moduli. In: 18th IEEE Symposium on Computer Arithmetic
(ARITH’07), pp. 240–250, IEEE, 2007.
[20] Sousa L., Martins P.: Sign Detection and Number Comparison on RNS 3-Moduli
Sets $\{2^n - 1, 2^{n+k}, 2^n + 1\}$, Circuits, Systems, and Signal Processing, vol. 36(3),
pp. 1224–1246, 2017.
[21] Szabo N.S., Tanaka R.I.: Residue arithmetic and its applications to computer
technology, McGraw-Hill, 1967.
[22] Torabi Z., Jaberipur G.: Low-power/cost RNS comparison via partitioning the
dynamic range, IEEE Transactions on Very Large Scale Integration (VLSI) Sys-





[23] Torabi Z., Jaberipur G.: Low-power/cost RNS comparison via partitioning the
dynamic range, IEEE Transactions on Very Large Scale Integration (VLSI) Sys-
tems, vol. 24(5), pp. 1849–1857, 2015.
[24] Torabi Z., Jaberipur G.: Fast low energy RNS comparators for 4-moduli sets
$\{2^n \pm 1, 2^n, m\}$ with $m \in \{2^{n+1} \pm 1, 2^{n-1} - 1\}$, Integration, vol. 55, pp. 155–161,
2016.
[25] Torabi Z., Jaberipur G., Belghadr A.: Fast division in the residue number
system $\{2^n + 1, 2^n, 2^n - 1\}$ based on shortcut mixed radix conversion, Computers &
Electrical Engineering, vol. 83, p. 106571, 2020.
[26] Wang Y.: New Chinese remainder theorems. In: Conference Record of Thirty-
Second Asilomar Conference on Signals, Systems and Computers (Cat. No.
98CH36284), vol. 1, pp. 165–171, IEEE, 1998.
[27] Wang Y., Song X., Aboulhamid M.: A new algorithm for RNS magnitude com-
parison based on new Chinese remainder theorem II. In: Proceedings Ninth Great
Lakes Symposium on VLSI, pp. 362–365, IEEE, 1999.
[28] Youssef M., Emam A.E., Abd Elghany M.: Image multiplexing using residue
number system coding over MIMO-OFDM communication system, International
Journal of Electrical & Computer Engineering (2088-8708), vol. 9, 2019.
Affiliations
Zeinab Torabi
Shahid Rajaee Teacher Training University, Department of Computer Engineering, Tehran,
Iran, z.torabi@sru.ac.ir
Somayeh Timarchi
Shahid Beheshti University, Faculty of Electrical Engineering, Tehran, Iran,
s timarchi@sbu.ac.ir
Received: ??.??.2021
Revised: ??.??.2021
Accepted: ??.??.2021
