A new Binary Number Code and a Multiplier, based on 3 as semi-primitive
  root of 1 mod 2^k by Benschop, N. F.
arXiv:math/0105029v1 [math.GM] 3 May 2001
A new Binary Number Code and a Multiplier,
based on 3 as semi-primitive root of 1 mod 2^k
N.F.Benschop, Geldrop (NL), 3-aug-1997
Patent US-5923888 (13 Jul 1999)
1 Prior art
The usual parallel array multipliers [1, p164] are much too powerful for their purpose, as will be
shown next. Assuming without loss of generality a square array, the known parallel n × n
bit array multipliers all have a structure consisting of two main parts. The input part is a
2-dimensional array of n (horizontal) + n (vertical) bitlines for the two n-bit input operands x and y, with
an AND-gate at each of the n^2 bitline crossings (details for signed two's complement (TC) code are neglected here).
The processing part accumulates this pattern of n^2 bits to the required 2n-bit result,
using an array of some n^2 Full-Adders (FA). Various types of adder array exist, such as a normal
array of n rows of n FAs each (for a compact layout and small silicon area), the known
'Wallace tree' [1, p167] (with an irregular and larger layout but less delay), or anything between
these extremes, trading off total delay against silicon area.
The inefficiency of the usual adder array hardware is easily seen as follows. The adder array
can add any n × n bit pattern of n^2 bits (there are 2^{n.n} such patterns), while for multiplication of
two n-bit operands only 2^{2n} of these are ever input and processed (each n-bit row or column
is either all 0's or a copy of one operand). So the hardware is used for processing only a very
small fraction 2^{n+n}/2^{n.n} of all possible input patterns it could process. For n = 8 this
fraction is 2^{16}/2^{64} = 2^{-48}. Clearly, the hardware is
much too powerful for its purpose, and is used very inefficiently. Some recoding schemes have
been applied in the past to improve the efficiency of multipliers.
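As a rough numerical illustration of this fraction, the following Python sketch evaluates 2^{2n}/2^{n.n} for a few array sizes. The function name used_fraction is chosen here for illustration only, and the count 2^{2n} is taken as stated in the text (one pattern per operand pair).

```python
from fractions import Fraction

def used_fraction(n: int) -> Fraction:
    """Fraction of n x n bit patterns that two n-bit operands can produce."""
    reachable = 2 ** (2 * n)   # one pattern per operand pair (x, y), as counted in the text
    possible = 2 ** (n * n)    # all bit patterns the adder array could accept
    return Fraction(reachable, possible)

for n in (4, 8, 16):
    print(n, float(used_fraction(n)))
```

The fraction shrinks doubly exponentially in n, which is the sense in which the adder array is called over-dimensioned.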
For instance in the known Booth multiplier [1, p198], each successive bit-pair of one input
operand has value range {0, 1, 2, 3}, where 3 is recoded as −1+ 4. The -1 causes a subtraction
of the other operand, while ’+4’, as positive carry into the next bit-pair position, implies an
addition there. The result is an effective reduction of the logic depth in the add/subtract array,
and a corresponding speed-up, at the cost of a more complex recoding of one operand, and extra
subtract hardware.
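The bit-pair recoding just described can be sketched in Python as follows. This is a minimal illustration of the '3 = -1 + 4' rule only, not the full Booth scheme of [1]; the name recode_pairs and the digit-list format (least significant pair first, final carry appended) are assumptions made for this sketch.

```python
def recode_pairs(x: int, n_pairs: int) -> list:
    """Recode an unsigned operand into base-4 digits from {-1, 0, 1, 2}.

    A pair value of 3 (or 4, after an incoming carry) is replaced by
    -1 (or 0) plus a carry of +4 into the next bit-pair position.
    """
    digits = []
    carry = 0
    for _ in range(n_pairs):
        d = (x & 3) + carry    # pair value 0..3 plus incoming carry
        x >>= 2
        if d >= 3:
            d -= 4             # 3 -> -1, 4 -> 0
            carry = 1          # the '+4' moves to the next pair
        else:
            carry = 0
        digits.append(d)
    digits.append(carry)       # final carry becomes the top digit
    return digits

print(recode_pairs(0b1111, 2))   # digits for 15, least significant first
```

Each digit -1 corresponds to one subtraction of the other operand, each digit 1 or 2 to an addition of the operand or its double.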
A similar recoding scheme, but now for both operands, and based on a deeper algebraic property
of the powers of 3 in the semigroup of binary multiplication M(.) mod 2^k, will be proposed next.
2 Proposed new binary number code
A better structure might be found by using the algebraic properties of the closed system
(semigroup) of binary multiplication mod 2^k, such as associativity a(bc) = (ab)c, commutativity
ab = ba, and the iterative sub-structures or iteration class a* = {a^i} of all powers of any number
a. Of special interest is a = 3, which generates the maximum possible iteration class, of order 2^{k-2}, to be
proven next. Exploiting this 3* property reduces multiplication to addition of exponents, which makes multipliers much more efficient.
For k ≥ 3 bits the powers of 3 generate half of the odd residues. In other words, in binary coded
residues: 3 is a semi-primitive root of unity. A new binary number code based on this property
simplifies binary multiplication, and in fact translates it to addition, using base 3 logarithm for
odd residues. The proof is best given by first considering residues mod p^k for prime p > 2, and
then taking p = 2 as special case. Denote a cyclic group of order n by Cn or C[n].
Lemma: For prime p > 2, the cyclic subgroup B = (p+1)* mod p^k has order p^{k-1}.
Proof: The group of units G of all n with n^i = 1 mod p^k for some i > 0 is known to be cyclic.
Its order (p-1).p^{k-1} has two relatively prime factors, so G = A × B is a direct product of two
cycles. Here B = (p+1)* because (p+1)^p = p^2 + 1 mod p^3, and by induction
(p+1)^{p^m} = p^{m+1} + 1 mod p^{m+2}. The period of p+1, the smallest x with (p+1)^x = 1 mod p^k, implies m+1 = k, so
m = k-1, yielding period p^{k-1}. No smaller x yields 1 mod p^k since |B| has only divisors p^s. ♠
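The lemma can be checked numerically for small cases. The following Python sketch is a brute-force verification, not a proof; the helper name multiplicative_order is an assumption of this sketch.

```python
def multiplicative_order(a: int, m: int) -> int:
    """Smallest x > 0 with a**x == 1 (mod m); assumes gcd(a, m) == 1."""
    x, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        x += 1
    return x

# Check |(p+1)*| = p^(k-1) mod p^k for a few odd primes p and small k.
for p in (3, 5, 7):
    for k in (1, 2, 3, 4):
        assert multiplicative_order(p + 1, p ** k) == p ** (k - 1)
print("lemma holds for p in (3, 5, 7), k <= 4")
```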
Corollary (binary 3* property): For p = 2 we have p+1 = 3, and it is readily verified that 3
does not generate -1 mod 2^k if k ≥ 3, since (2+1)^2 > 2^3 (in binary code 3^2 = 1001), while
(p+1)^2 = p^2 + 2p + 1 < p^3 for all p > 2. The carry in binary code is the cause of this phenomenon.
In fact B = C2.C[2^{k-2}] is not cyclic, with sign 2-cycle C2 = {-1, 1}. Then |3*| = 2^{k-2}, with 3
generating only half of the odd numbers mod 2^k; the other half are their complements. So each
non-zero residue is n = ±3^i.2^j mod 2^k, with i < 2^{k-2} and j < k, while n = 0 for j = k. ♠
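The three claims of the corollary (period 2^{k-2}, -1 not a power of 3, and the powers of 3 together with their complements covering all odd residues) can be checked by enumeration for small k. This Python sketch is only such a check; the function name cycle_of_3 is an assumption.

```python
def cycle_of_3(k: int) -> list:
    """List 3^1, 3^2, ... mod 2^k up to and including the first 1."""
    m = 1 << k
    cycle, x = [], 1
    while True:
        x = (x * 3) % m
        cycle.append(x)
        if x == 1:
            return cycle

for k in range(3, 11):
    m = 1 << k
    cycle = cycle_of_3(k)
    assert len(cycle) == 2 ** (k - 2)            # |3*| = 2^(k-2)
    assert m - 1 not in cycle                    # 3 does not generate -1
    both = set(cycle) | {(-x) % m for x in cycle}
    assert both == set(range(1, m, 2))           # +/- 3^i gives every odd residue
print("corollary checked for 3 <= k <= 10")
```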
2.1 Example
For instance mod 32 (k=5) the cycle 3* = {3, 9, -5, -15, -13, -7, 11, 1} has period 8, while
the remaining 8 odd numbers are their complements, with a two-component decomposition
G = C2.C8 = {-1, 1} × 3* for all 16 odd numbers, which allows component-wise multiplication.
The 5-bit binary codes of 3^i are shown in the next table, as well as for p > 2 the lower significant
digits of (p+1)^{p^m} in p-ary code. The logic structure of the few least significant bits of 3^i is
rather simple, as boolean functions of the k-2 exponent bits, but the higher order bits quickly
increase in complexity, showing no obvious structure.
Table 1: The powers of 3 in binary code mod 2^5, and (p+1)^{p^m} in p-ary code:

  i   3^i (bin)   3^i (dec)    |  p>2: (p+1)^i    i
  1   00011        3           |          11      1
  2   01001        9           |       ..101      p
  3   11011       27 = -5      |    ....1001      p^2
  4   10001       17 = -15     |  .....10001      p^3
  5   10011       19 = -13     |
  6   11001       25 = -7      |
  7   01011       11           |
  8   00001        1           |

Notice: 3^even = 1 mod 8 and 3^odd = 3 mod 8, so two bits are fixed:
bit(2^0) = 1 and bit(2^2) = 0, hence |3*| = 2^k / 4.
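The binary half of Table 1 can be regenerated for any k with a few lines of Python. This is a sketch for checking the table entries; the name power_table and the row layout (i, bit string, residue, signed residue) are assumptions.

```python
def power_table(k: int) -> list:
    """Rows (i, binary code, residue, signed residue) of 3^i mod 2^k."""
    m = 1 << k
    rows = []
    for i in range(1, (1 << (k - 2)) + 1):
        value = pow(3, i, m)
        signed = value - m if value >= m // 2 else value   # complement form
        rows.append((i, format(value, f"0{k}b"), value, signed))
    return rows

for i, bits, value, signed in power_table(5):
    print(f"{i:2d}  {bits}  {value:2d}  {signed:3d}")
```

In every row the bit of weight 2^0 is 1 and the bit of weight 2^2 is 0, matching the note with the table.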
Table 2: Multiplier structure

  Operands:  a = sign(a) 3^i.2^j
             b = sign(b) 3^r.2^s
  Product:   p = a.b = sign(p) 3^t.2^u,  where:
             sign(p) = XOR(signs)
             t = i+r mod 2^{k-2}
             u = j+s < k  (saturate at k: 'overflow')
3 Application to multipliers
By the corollary each residue is n = ±3^i.2^j mod 2^k (k > 2) for a unique pair (i, j) of
exponents, with 0 ≤ i < 2^{k-2} (k-2 bits mantissa) and 0 ≤ j < k (binlog k bits), with n = 0 iff j = k.
This 2.3-star number code reduces multiplication to addition of exponent-pairs, because:
(3^i.2^j).(3^r.2^s) = 3^{i+r}.2^{j+s}, and the 1-bit signs add (mod 2). The multiplier structure is
summarized in table 2: the product sign is the XOR of the operand signs, the exponents of 3 add mod
2^{k-2} using only the k-1-(j+s) least significant bits, and those of 2 add, with saturation at the
chosen maximum precision k.
The input precision k must be taken equal to the desired output precision. For instance, for
an 8 x 8 bit multiplier with 16-bit output, odd input operands are encoded as index i in a
16-bit power 3^i. Addition is difficult in this code, so application is suggested for environments
restricted to multiplication mod 2^k.
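A software model of this coding may help to see the conversion and the exponent addition at work. The Python sketch below encodes a residue as (sign, i, j) by table lookup of the base-3 logarithm, multiplies by adding exponents, and decodes again. The function names, the dictionary-based log table and the tuple layout are assumptions of this sketch, not part of the proposed hardware.

```python
def build_log3(k: int) -> dict:
    """Map each power 3^i mod 2^k (0 <= i < 2^(k-2)) to its exponent i."""
    m = 1 << k
    table, x = {}, 1
    for i in range(1 << (k - 2)):
        table[x] = i
        x = (x * 3) % m
    return table

def encode(n: int, k: int, log3: dict) -> tuple:
    """Return (sign, i, j) with n = (-1)^sign . 3^i . 2^j mod 2^k; j = k means zero."""
    m = 1 << k
    n %= m
    if n == 0:
        return (0, 0, k)
    j = (n & -n).bit_length() - 1        # number of trailing zero bits
    odd = n >> j
    if odd in log3:
        return (0, log3[odd], j)
    return (1, log3[m - odd], j)         # odd is the complement of a power of 3

def multiply(a: tuple, b: tuple, k: int) -> tuple:
    """Signs XOR, exponents of 3 add mod 2^(k-2), exponents of 2 saturate at k."""
    (sa, ia, ja), (sb, ib, jb) = a, b
    return (sa ^ sb, (ia + ib) % (1 << (k - 2)), min(ja + jb, k))

def decode(code: tuple, k: int) -> int:
    sign, i, j = code
    m = 1 << k
    if j >= k:
        return 0
    value = pow(3, i, m) << j
    return (-value if sign else value) % m

k = 16                                   # 8 x 8 bit operands, 16-bit result
log3 = build_log3(k)
p = multiply(encode(201, k, log3), encode(77, k, log3), k)
print(decode(p, k), 201 * 77)
```

The only arithmetic in multiply is two small additions and one XOR; the cost has moved into the encode and decode steps, which is why the code suits environments that stay in the multiplicative domain.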
3.1 Signed magnitude binary code over bases 2 and 3
The proposed new number code is a signed magnitude code, well suited for multiplication, and
it uses two bases, namely 2 and 3. As shown, each k-digit binary coded residue n (mod 2^k) is
the product of a power 2^j of 2 (j ≤ k), called the even part of n, and an odd residue, called the
odd part of n, which is the binary residue of a signed power ±3^i of 3 with i < 2^{k-2}.
Exponent pair (i, j) and sign s uniquely encode each nonzero residue from -(2^k - 1) to 2^k - 1,
while the zero number 0 requires j = k, which can be considered as an extra zero-bit z.
To represent all k-bit binary numbers n (integers), of which there are 2^k, a 4-component code
n = [z, s, t, u] is proposed, with the next interpretation:
z : one zero bit, with z = 0 if n = 0 and z = 1 if n ≠ 0.
s : one sign bit, with s = 0 if n > 0 and s = 1 if n < 0.
t : k-2 bits for the exponent t of odd part 3^t.
u : e bits for the exponent u of even part 2^u (u < k ≤ 2^e).
Extra overflow bit v = 1 iff ua + ub ≥ k : in case a product a.b exceeds 2^{k-1} in magnitude.
The code of the product of two such coded numbers a = [za, sa, ta, ua] and b = [zb, sb, tb, ub]
is obtained by adding in binary code, by known means, the odd and even code parts t and u
respectively, and adding the signs sa + sb mod 2 (XOR), while multiplying the two zero bits
za.zb (AND). The overflow result bit v = 1 iff the even part overflows: ua + ub ≥ k.
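The product rule for two codes [z, s, t, u] can be written out at word level as follows. This is a minimal sketch for the mod 32 case (k = 5); the names product_code and value_mod, and the choice to saturate u at k, are assumptions made for illustration.

```python
K = 5   # precision in bits; the mod 32 example is assumed here

def product_code(a, b, k=K):
    """Given codes a = [za, sa, ta, ua], b = [zb, sb, tb, ub], return ([z, s, t, u], v)."""
    za, sa, ta, ua = a
    zb, sb, tb, ub = b
    z = za & zb                          # zero bits multiply (AND)
    s = sa ^ sb                          # signs add mod 2 (XOR)
    t = (ta + tb) % (1 << (k - 2))       # odd-part exponents add mod 2^(k-2)
    u_sum = ua + ub                      # even-part exponents add
    v = 1 if u_sum >= k else 0           # overflow bit
    return [z, s, t, min(u_sum, k)], v

def value_mod(code, k=K):
    """Residue mod 2^k represented by a code [z, s, t, u]."""
    z, s, t, u = code
    m = 1 << k
    if z == 0 or u >= k:
        return 0
    magnitude = pow(3, t, m) << u
    return (-magnitude if s else magnitude) % m

a = [1, 0, 1, 1]    # +3^1 . 2^1 =  6
b = [1, 1, 0, 2]    # -3^0 . 2^2 = -4
print(product_code(a, b))    # code of -24 = -3^1 . 2^3, with v = 0
```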
Using for instance the known 'ripple-carry' way of binary addition hardware with a full-adder
cell FA per bit position, the schematic diagram is as follows, where t, ta, tb, u, ua, ub consist of 3
bits (of weights 2^0, 2^1, 2^2), and the optional overflow bit v = u[2] * (u[1] + u[0]), so v = 1 iff u ≥ 5:
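[Figure 1: schematic. An AND gate combines zb and za into z; an XOR gate combines sb and sa into s;
a 3-bit ripple-carry adder (full-adder cells of weights 2^2, 2^1, 2^0, with the carry passing from 2^0 up to 2^2)
adds tb and ta into t; a second 3-bit ripple-carry adder of the same form adds ub and ua into u.]
Fig.1: Example multiplier mod 32 = 2^5, with code ±3^t.2^u (t < 2^3, u ≤ 5)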
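The circuit of Fig.1 can be simulated at gate level in Python. The sketch below models one full-adder cell per bit position and chains them into 3-bit ripple-carry adders. All function names are assumptions; in addition, the carry out of the u adder is OR-ed into v, which goes beyond the formula v = u[2] * (u[1] + u[0]) and is an assumption made here to cover the case ua + ub = 8.

```python
def full_adder(a: int, b: int, cin: int) -> tuple:
    """One FA cell: returns (sum bit, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_add(x_bits: list, y_bits: list) -> tuple:
    """Add two bit lists (least significant bit first); returns (sum bits, carry out)."""
    carry, out = 0, []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

def to_bits(value: int, width: int) -> list:
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits: list) -> int:
    return sum(b << i for i, b in enumerate(bits))

def fig1_multiplier(za, sa, ta, ua, zb, sb, tb, ub):
    """Mod 32 multiplier on codes [z, s, t, u]; returns (z, s, t, u, v)."""
    z = za & zb                                               # AND gate
    s = sa ^ sb                                               # XOR gate
    t_bits, _ = ripple_add(to_bits(ta, 3), to_bits(tb, 3))    # carry dropped: t adds mod 8
    u_bits, u_carry = ripple_add(to_bits(ua, 3), to_bits(ub, 3))
    v = (u_bits[2] & (u_bits[1] | u_bits[0])) | u_carry       # u >= 5, or carry out (assumed)
    return z, s, from_bits(t_bits), from_bits(u_bits), v

print(fig1_multiplier(1, 0, 1, 1, 1, 1, 0, 2))   # 6 times -4
```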
Reference: 1. K.Hwang: Computer Arithmetic, J.Wiley & Sons, NY 1979.
