High-Speed Area-Efficient Multiplier Design Using Multiple-Valued Current-Mode Circuits
Author: 亀山 充隆 (Michitaka Kameyama)
Journal or publication title: IEEE Transactions on Computers
volume 43
number 1
page range 34-42
year 1994
URL http://hdl.handle.net/10097/46842
doi: 10.1109/12.250607
34 IEEE TRANSACTIONS ON COMPUTERS, VOL. 43, NO. 1, JANUARY 1994
High-Speed Area-Efficient Multiplier Design Using Multiple-Valued Current-Mode Circuits
Shoji Kawahito, Member, IEEE, Makoto Ishida, Tetsuro Nakamura, Michitaka Kameyama, Senior Member, IEEE, 
and Tatsuo Higuchi, Fellow, IEEE 
Abstract-This paper presents a very-large-scale-integration (VLSI)-oriented high-speed multiplier design method based on carry-propagation-free addition trees and a circuit technique called multiple-valued current-mode (MVCM) circuits. The carry-propagation-free addition method uses a redundant digit set such as {0, 1, 2, 3} or {0, 1, 2, 3, 4}. Number representations using such redundant digit sets are called redundant positive-digit number representations. The carry-propagation-free addition is written in three steps, and the adder can be designed directly and efficiently from the algorithm using MVCM circuits. The designed multiplier, which internally uses the MVCM parallel adder with the digit set {0, 1, 2, 3} in radix 2, has attractive features in speed, regularity of structure, and reduced complexity of active elements and interconnections. A prototype CMOS integrated circuit of the MVCM parallel adder has been implemented, and its stable operation has been confirmed. Other possible multiplier schemes with redundant digit sets using MVCM technology are discussed.
Index Terms-Area-efficient design, carry-propagation-free addition, high-speed multiplier, multiple-valued current-mode circuits, redundant number representations, tree structure, VLSI.
I. INTRODUCTION 
ADVANCES in microelectronics have made high-speed parallel multipliers possible as macrocells in various very-large-scale-integration (VLSI) chips. High-speed multipliers are usually built as tree structures, because their multiplication time is in the fastest class; that is, it is proportional to the logarithm of the operand length. Wallace [1] first suggested a tree scheme for parallel multipliers using full adders [(3, 2) parallel counters]. Dadda [2], [3] generalized the idea of using full adders by introducing the concept of (p, q) parallel counters, which count the number of 1's in the same column of the partial product matrix. Alternatively, high-speed multipliers are possible using a tree of carry-propagation-free parallel adders such as those based on the signed-digit number representations [4], [5] using a redundant digit set {-1, 0, 1} and the carry-save number representation [6] using a redundant digit set {0, 1, 2}. These multipliers have a rather simple and regular layout compared with that of the Wallace tree multiplier. In the design of such high-speed multipliers in VLSI, it is important to consider not only the reduction of the number of gate delays in the critical signal path, but also the regularity of the structure and the reduction of the complexity. These reduce the length of interconnections and the corresponding delays. In addition to the design of a good parallel structure, we should consider the use of special circuit techniques such as quasidigital circuits [7] for the efficient design of high-speed multipliers in VLSI.
Manuscript received October 21, 1991; revised April 2, 1992, and July 20, 1992. This work was supported in part by a Grant-in-Aid for Young Scientists 03750350 from the Ministry of Education, Science and Culture of Japan.
S. Kawahito is with the Department of Information and Computer Sciences, Toyohashi University of Technology, Toyohashi 441, Japan.
M. Ishida and T. Nakamura are with the Department of Electrical and Electronic Engineering, Toyohashi University of Technology, Toyohashi 441, Japan.
M. Kameyama and T. Higuchi are with the Department of System Information Sciences, Graduate School of Information Sciences, Tohoku University, Sendai 980, Japan.
IEEE Log Number 9208773.
In this paper, we present a design method for high-speed area-efficient VLSI multipliers based on regular tree structures with carry-propagation-free parallel adders and a circuit technique, the so-called multiple-valued current-mode (MVCM) circuitry. For the carry-propagation-free additions, we utilize a group of redundant number representations, the so-called redundant positive-digit number representations [8], in which each digit in radix $r$ may take not only the ordinary digit values $(0, 1, \ldots, r-1)$ but also the redundant positive-digit values $(r, r+1, \ldots, q-1, q)$, where $q$ is a positive integer with $q \ge r$. By using a redundant number representation with the digit set {0, 1, 2, 3} in radix 2, for instance, two-operand carry-propagation-free addition can be written as the following three steps in each digit position.
1) Linearly sum up input digits.
2) Generate carries and an intermediate sum from the linear sum.
3) Linearly sum up the intermediate sum and the carries from lower positions.
A multiplier internally using the carry-propagation-free addition has a binary-tree structure. A multiplier with a similar structure can also be designed with the conventional (6, 3) parallel binary counter, which has six inputs and three outputs. In contrast, the use of the redundant number representations and the three-step expression of the parallel addition is especially useful when designing multipliers with the special circuit technology of MVCM. With MVCM circuit technology, the linear summations of Steps 1 and 3 can be performed by wiring without active devices. This property makes the resulting circuit configuration quite simple. Step 2 needs some specific MVCM circuits. Step 3 performs the conversion to a multiple-valued digit, and is effective for reducing the number of output wires to one per digit. Since multiple-valued information can be carried by a single line, the complexity of interconnections is reduced greatly. In the case of using a (6, 3) parallel binary counter, the operation corresponding to Step 3 is not defined, and it has three output wires per digit. Since the MVCM carry-propagation-free adders can be used effectively in the multiplier design, the complexities of interconnections and transistors can be reduced greatly compared with the corresponding binary CMOS logic implementations.
The carry-propagation-free parallel adder with the digit set {0, 1, 2} can also be designed with MVCM circuits. However, in our present design, the addition method with {0, 1, 2, 3} is more suitable than that with {0, 1, 2}, because the addition with {0, 1, 2} has to be written in five steps. The carry-propagation-free three-operand addition is written in three steps with the digit set {0, 1, 2, 3, 4} and can be designed by a similar technique. The multiplier employing this three-operand parallel adder has a ternary-tree structure. More generally, carry-propagation-free multiple-operand parallel adders with radix $r$, and multipliers with a multiple-operand addition tree, are realized with the corresponding redundant positive-digit number representations. The application of MVCM technology to multipliers with such multiple-operand adders might be a very powerful method to realize high-speed area-efficient VLSI multipliers, although the implementation requires some effort in present VLSI technology.
This paper is organized as follows. In Section II, a general form of the carry-propagation-free multiple-operand addition and a typical example of parallel addition with the digit set {0, 1, 2, 3} are described. Section III treats the design of the carry-propagation-free adders with MVCM circuit technology, as the basis for the multiplier design. The result of the implementation of a prototype CMOS integrated circuit is also shown. In Section IV, we describe the multiplication algorithm and the multiplier design with MVCM circuits. The performance of several types of multipliers is compared from the viewpoint of VLSI implementation.
II. CARRY-PROPAGATION-FREE ADDITIONS
A. Redundant Number Representations and Multiple-Operand Additions
For unsigned numbers, the radix-$r$ ($r \ge 2$) redundant positive-digit number representations use digit sets of $q + 1$ values, $\{0, 1, 2, \ldots, r-1, r, \ldots, q\}$, where $q$ is a positive integer such that $q \ge r$. These are hereafter called the PD($r$, $q$) representations. For example, the PD(2, 3) representation means a radix-2 redundant positive-digit representation in which each digit belongs to {0, 1, 2, 3}. Any $n$-digit positive integer $X$ is denoted by $X = (x_{n-1} \ldots x_i \ldots x_0)_{PD(r,q)}$ and has the value $X = \sum_{i=0}^{n-1} x_i r^i$, where $x_i \in \{0, \ldots, q\}$ $(i = 0, \ldots, n-1)$. An $n$-digit PD($r$, $q$) number takes a value in the range $0 \le X \le q \times (r^n - 1)/(r - 1)$.
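The definition above can be checked numerically with a minimal Python sketch. The function names `pd_value` and `pd_max` are illustrative and do not come from the paper; digits are stored least significant first.

```python
def pd_value(digits, r):
    """Value of a PD(r, q) number; digits[i] is x_i (least significant first)."""
    return sum(x * r**i for i, x in enumerate(digits))


def pd_max(n, r, q):
    """Largest n-digit PD(r, q) value, q * (r**n - 1) / (r - 1)."""
    return q * (r**n - 1) // (r - 1)


# Redundancy: several PD(2, 3) digit strings denote the same integer.
print(pd_value([1, 1, 1], 2), pd_value([3, 2, 0], 2), pd_value([3, 0, 1], 2))
# Upper end of the range for n = 3 digits in PD(2, 3).
print(pd_max(3, 2, 3), pd_value([3, 3, 3], 2))
```

The three digit strings in the first call all evaluate to the same integer, which is the redundancy that the carry-propagation-free addition exploits.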
With the PD($r$, $q$) representation for both the input and the output numbers, a carry-propagation-free addition of multiple operands is performed. The inputs are $M$ $n$-digit unsigned integers in the PD($r$, $q$) representation, $X_j = (x_{n-1}^{(j)} \ldots x_i^{(j)} \ldots x_0^{(j)})_{PD(r,q)}$ $(j = 1, \ldots, M)$, where $x_i^{(j)} \in \{0, 1, \ldots, q\}$ $(i = 0, 1, \ldots, n-1)$. The addition can be written as the following three steps for the $i$th digit.
Step 1: Linearly sum up the $i$th digits of the $M$ operands as follows:
$z_i = x_i^{(1)} + x_i^{(2)} + \cdots + x_i^{(j)} + \cdots + x_i^{(M)}$, (1)
where $z_i \in \{0, 1, \ldots, q, q+1, \ldots, 2q, 2q+1, \ldots, Mq-1, Mq\}$ is a linear sum digit.
Step 2: Generate carry digits for $k$ places to the left (more significant digit positions), $c_i^{(k)}$ ($k = 1, 2, \ldots, L$, where $L$ is the number of carry digits), and an intermediate sum digit $w_i$ from the linear sum digit $z_i$ $(i = 0, 1, \ldots, n-1)$:
$r^L c_i^{(L)} + r^{L-1} c_i^{(L-1)} + \cdots + r^k c_i^{(k)} + \cdots + r c_i^{(1)} + w_i = z_i$, (2)
where $c_i^{(L)} \in \{0, 1, \ldots, p\}$ $(1 \le p \le r-1)$, $c_i^{(k)} \in \{0, 1, \ldots, r-1\}$ $(k = 1, 2, \ldots, L-1)$, and $w_i \in \{0, 1, \ldots, r-1\}$.
Step 3: Linearly sum up $w_i$ and the carry digits from the $(i-k)$th positions on the right (less significant digit positions), $c_{i-k}^{(k)}$ $(k = 1, 2, \ldots, L)$:
$s_i = w_i + c_{i-1}^{(1)} + c_{i-2}^{(2)} + \cdots + c_{i-k}^{(k)} + \cdots + c_{i-L}^{(L)}$. (3)
TABLE I
POSSIBLE CHOICES OF q, M, L, AND p FOR r = 2, 3, AND 4 TO PERFORM THREE-STEP CARRY-PROPAGATION-FREE ADDITION
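The three steps can be sketched in software as follows. This is a behavioural model only, assuming digit lists stored least significant first; the function name `pd_add` and the list layout are illustrative choices, not part of the paper, and the MVCM circuit performs Steps 1 and 3 by wiring rather than by explicit summation.

```python
def pd_add(operands, r, L, p):
    """Three-step carry-propagation-free addition of PD(r, q) operands.

    operands: equal-length digit lists, least significant digit first.
    Returns the sum as a digit list of length n + L.
    """
    q = (r - 1) * L + p
    n = len(operands[0])
    # Step 1: linear sum digit z_i at every position.
    z = [sum(op[i] for op in operands) for i in range(n)]
    # Step 2: split z_i into w_i and the carries c_i^(1) .. c_i^(L).
    w, c = [], []
    for zi in z:
        assert zi <= r**L * (p + 1) - 1, "too many operands for this digit set"
        parts, rest = [], zi
        for _ in range(L):            # yields w_i, then c^(1) .. c^(L-1)
            parts.append(rest % r)
            rest //= r
        parts.append(rest)            # c^(L), which is at most p
        w.append(parts[0])
        c.append(parts[1:])           # c[i][k - 1] holds c_i^(k)
    # Step 3: s_i = w_i + sum over k of c_{i-k}^(k).
    s = []
    for i in range(n + L):
        si = w[i] if i < n else 0
        for k in range(1, L + 1):
            if 0 <= i - k < n:
                si += c[i - k][k - 1]
        assert 0 <= si <= q           # result stays inside the digit set
        s.append(si)
    return s


# Two PD(2, 3) operands (r = 2, L = 2, p = 1), both equal to 21.
print(pd_add([[3, 3, 3], [3, 3, 3]], 2, 2, 1))
```

Every output digit is computed from at most L + 1 neighbouring linear sums, so no loop in the sketch carries information along the whole word.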
The $i$th carries, $c_i^{(k)}$ $(k = 1, \ldots, L)$, are determined by the linear sum $z_i$ of (1) and are independent of the other linear sum digits. A final sum digit depends on carry digits from at most the $(i-L)$th position on the right. Therefore, the addition of $M$ operands is performed in a constant time independent of the length of the operands, since the number of carry digits, $L$, is also independent of the length of the operands. Since the final sum, $s_i$, should be in the same digit set as the inputs, $\{0, 1, \ldots, q\}$, the value $q$ is chosen as
$q = (r-1) \times L + p$, (4)
from (2) and (3). Since the maximum value of $z_i$, $M \times q$, must be coded with the carry and the intermediate sum digits by (2), the following relation should be satisfied:
$M \times q \le p \times r^L + (r-1) \times (r^{L-1} + r^{L-2} + \cdots + r + 1) = r^L \times (p+1) - 1$. (5)
When $r$ and $q$ are given, the minimum number of carry digits, $L$, and the maximum number of input operands, $M$, are determined by satisfying (4) and (5), and the typical relationship is shown in Table I. For example, the radix-2 three-operand carry-propagation-free adder is realized with $q = 4$, $p = 1$, and $L = 3$. In the case of radix 2, (4) and (5) are simplified to $q = L + 1$ and $M \times q \le 2^q - 1$, respectively.
Fig. 1. Two-operand parallel PD(2, 3) adder.
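Relations (4) and (5) are easy to evaluate for a given radix, carry count, and top-carry limit. The following sketch is illustrative only (the function names are not from the paper) and simply solves (5) for the largest admissible M.

```python
def digit_set_max(r, L, p):
    """q from (4): q = (r - 1) * L + p."""
    return (r - 1) * L + p


def max_operands(r, L, p):
    """Largest M with M * q <= r**L * (p + 1) - 1, from (5)."""
    q = digit_set_max(r, L, p)
    return (r**L * (p + 1) - 1) // q


for r, L, p in [(2, 1, 1), (2, 2, 1), (2, 3, 1)]:
    print(f"r={r} L={L} p={p}: q={digit_set_max(r, L, p)}, "
          f"M<={max_operands(r, L, p)}")
```

For radix 2 this gives two operands for PD(2, 3) and three for PD(2, 4), in agreement with the examples in the text, and only one operand for PD(2, 2), which is consistent with the carry-save case needing more than three steps.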
B. Radix-2 Two-Operand Additions 
For radix-2 carry-propagation-free additions, there are two possible choices of redundant positive-digit representations, with digit sets {0, 1, 2, 3} and {0, 1, 2}; these are the PD(2, 3) and PD(2, 2) representations, respectively. The addition of two integers in the PD(2, 3) representation, $X = (x_{n-1} \ldots x_i \ldots x_0)_{PD(2,3)}$ and $Y = (y_{n-1} \ldots y_i \ldots y_0)_{PD(2,3)}$, where $x_i, y_i \in \{0, 1, 2, 3\}$ $(i = 0, 1, \ldots, n-1)$, is performed by the aforementioned three-step multiple-operand addition method for the case of $M = 2$, $r = 2$, $q = 3$, and $L = 2$. The addition is written as follows for the $i$th digit.
Step 1: Linearly add up $x_i$ and $y_i$ as follows:
$z_i = x_i + y_i$, (6)
where $z_i \in \{0, 1, \ldots, 6\}$ is a linear sum digit.
Step 2: Generate an intermediate sum digit $w_i$ and carry digits $c_i^{(2)}, c_i^{(1)}$ from $z_i$:
$4c_i^{(2)} + 2c_i^{(1)} + w_i = z_i$, (7)
where $c_i^{(2)}, c_i^{(1)}, w_i \in \{0, 1\}$.
Step 3: Linearly add up $w_i$ and the carry digits of the $(i-1)$th and $(i-2)$th positions on the right:
$s_i = w_i + c_{i-1}^{(1)} + c_{i-2}^{(2)}$. (8)
Obviously, any $z_i$ can be uniquely encoded by $c_i^{(2)}$, $c_i^{(1)}$, and $w_i$ by (7). It is clear from (8) that a final sum digit $s_i$ belongs to {0, 1, 2, 3}, so that the final result is also an operand in the PD(2, 3) representation. Since $w_i$, $c_{i-1}^{(1)}$, and $c_{i-2}^{(2)}$ are determined by the input digits $(x_i, y_i)$, $(x_{i-1}, y_{i-1})$, and $(x_{i-2}, y_{i-2})$, respectively, $s_i$ depends on only these six digits. This property allows the carry-propagation-free addition of two operands.
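As a worked illustration of the PD(2, 3) addition, consider the three-digit operands below. The operand values are chosen here for illustration and do not appear in the paper; carries from positions below digit 0 are taken as zero.

```latex
\begin{align*}
X &= (1\,3\,2)_{PD(2,3)} = 1\cdot 4 + 3\cdot 2 + 2 = 12, \qquad
Y  = (3\,3\,1)_{PD(2,3)} = 3\cdot 4 + 3\cdot 2 + 1 = 19,\\[2pt]
\text{Step 1:}\quad & z_0 = 2 + 1 = 3,\quad z_1 = 3 + 3 = 6,\quad z_2 = 1 + 3 = 4,\\
\text{Step 2:}\quad & z_0 = 3 \Rightarrow (c_0^{(2)}, c_0^{(1)}, w_0) = (0, 1, 1),\\
                    & z_1 = 6 \Rightarrow (c_1^{(2)}, c_1^{(1)}, w_1) = (1, 1, 0),\\
                    & z_2 = 4 \Rightarrow (c_2^{(2)}, c_2^{(1)}, w_2) = (1, 0, 0),\\
\text{Step 3:}\quad & s_0 = w_0 = 1,\\
                    & s_1 = w_1 + c_0^{(1)} = 0 + 1 = 1,\\
                    & s_2 = w_2 + c_1^{(1)} + c_0^{(2)} = 0 + 1 + 0 = 1,\\
                    & s_3 = c_2^{(1)} + c_1^{(2)} = 0 + 1 = 1,\\
                    & s_4 = c_2^{(2)} = 1,\\[2pt]
S &= (1\,1\,1\,1\,1)_{PD(2,3)} = 16 + 8 + 4 + 2 + 1 = 31 = 12 + 19.
\end{align*}
```

Each $s_i$ is obtained from the linear sums at positions $i$, $i-1$, and $i-2$ only, so all digits can be evaluated at the same time.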
Fig. 1 shows the structure of a two-operand adder with the PD(2, 3) representation, hereafter called the parallel PD(2, 3) adder. In Fig. 1, LS1, PDA(2, 3), and LS2 denote cells that correspond to (6), (7), and (8), respectively. It is clear that each final digit is determined by six input digits.
For the addition of two signed numbers, we utilize redundant representations having a negative weighting factor at the most significant digit, and the operation of Step 2 at the most significant digit position is changed as follows:
$-4c_{n-1}^{(2)} + 2c_{n-1}^{(1)} + w_{n-1} = -z_{n-1}$, (9)
where $c_{n-1}^{(2)} \in \{0, 1, 2\}$ and $c_{n-1}^{(1)}, w_{n-1} \in \{0, 1\}$. A final sum digit $s_{n+1} = c_{n-1}^{(2)}$ has a negative weighting factor. The arithmetic is quite similar to ordinary 2's-complement arithmetic.
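The modified Step 2 at the most significant digit can be explored by enumeration. The sketch below assumes that (9) reads $-4c^{(2)} + 2c^{(1)} + w = -z$ with $c^{(2)} \in \{0, 1, 2\}$ and $c^{(1)}, w \in \{0, 1\}$; the function name `encode_msd` is illustrative and not from the paper.

```python
def encode_msd(z):
    """Find (c2, c1, w) with -4*c2 + 2*c1 + w == -z at the MSD position.

    Assumed ranges: c2 in {0, 1, 2}, c1 and w in {0, 1}, z in 0..6.
    """
    solutions = [
        (c2, c1, w)
        for c2 in (0, 1, 2)
        for c1 in (0, 1)
        for w in (0, 1)
        if -4 * c2 + 2 * c1 + w == -z
    ]
    assert len(solutions) == 1      # the encoding is unique under these ranges
    return solutions[0]


for z in range(7):
    print(z, encode_msd(z))
```

Under these assumptions every linear sum from 0 to 6 has exactly one encoding, which mirrors the uniqueness of (7) for the positively weighted digits.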
In the case of using the PD(2, 2) representation (or carry-save representation), the carry-propagation-free addition is written as the following five steps:
Step 1:
$z_i = x_i + y_i$. (10)
Step 2:
Step 3:
Step 4:
Step 5:
$s_i = u_i + d_{i-1}$. (14)
Here $x_i, y_i \in \{0, 1, 2\}$ are the input digits, $z_i \in \{0, 1, 2, 3, 4\}$ is the input linear sum digit, $t_i \in \{0, 1, 2, 3\}$ is the intermediate linear sum digit, $c_i^{(2)}, c_i^{(1)}, d_i \in \{0, 1\}$ are the carry digits, $w_i, u_i \in \{0, 1\}$ are the intermediate sum digits, and $s_i \in \{0, 1, 2\}$ is the final sum digit.
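The equations of Steps 2 to 4 are not reproduced in the text above, so the following sketch fills them with one plausible reading that is consistent with the stated digit ranges: Step 2 splits $z_i$ into $c_i^{(2)}, c_i^{(1)}, w_i$ as in (7), Step 3 forms $t_i = w_i + c_{i-1}^{(1)} + c_{i-2}^{(2)}$, and Step 4 splits $t_i$ as $2d_i + u_i$. These three forms are assumptions, as is the function name `pd22_add`.

```python
def pd22_add(x, y):
    """Two-operand PD(2, 2) (carry-save) addition, least significant digit first.

    Steps 2 to 4 follow an assumed reading of the five-step procedure.
    """
    n = len(x)

    def get(seq, i):
        """Digit i of seq, or 0 outside the stored range."""
        return seq[i] if 0 <= i < len(seq) else 0

    z = [x[i] + y[i] for i in range(n)]                 # Step 1: z_i in 0..4
    c2 = [zi // 4 for zi in z]                          # Step 2 (assumed form)
    c1 = [zi // 2 % 2 for zi in z]
    w = [zi % 2 for zi in z]
    t = [get(w, i) + get(c1, i - 1) + get(c2, i - 2)    # Step 3 (assumed form)
         for i in range(n + 2)]
    d = [ti // 2 for ti in t]                           # Step 4 (assumed form)
    u = [ti % 2 for ti in t]
    return [get(u, i) + get(d, i - 1) for i in range(n + 3)]  # Step 5


print(pd22_add([2, 2, 2], [2, 2, 2]))
```

With this reading the result digits stay in {0, 1, 2}, at the cost of a second carry-generation stage, which matches the statement that the PD(2, 2) adder needs two levels of basic cells.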
III. CARRY-PROPAGATION-FREE ADDERS WITH MVCM CIRCUITS
Multiple-valued current-mode circuits are attractive for implementing arithmetic VLSI very compactly, because the linear summations often used in arithmetic operations can be performed by wiring without active devices [9]-[11]. Fig. 2(a) shows a parallel PD(2, 3) adder designed with MVCM circuits. In this scheme, the linear summations of (6) and (8) are performed by wiring without active devices. Each signal level is represented by a multiple of the unit current $I_0$. The PDA(2, 3) cell of Fig. 2(a) performs the operation of (7) and is designed with 28 transistors as shown in Fig. 2(b) [8]. The current source is realized with the output of PMOS current mirrors. NMOS current mirrors, whose symbol and circuit configuration are shown in Fig. 3, are useful for generating replicas of an input current at multiple outputs and for scaling the current level. In the NMOS current mirror of Fig. 2, the current levels are reduced to half at the outputs. The current levels are detected by current differencing between each of the half-scaled currents, $(z_i + 0.5) I_0 / 2$, and each of six threshold currents from $0.5 I_0$ to $3.0 I_0$. The voltages of the nodes at which the threshold current sources and the outputs of the NMOS current mirrors are connected become "high" or "low" according to the result of the threshold detections. The final outputs are generated by switching current signal paths with combinational circuits of NMOS and PMOS transistors. The propagation delay times from the input to the outputs of the designed PDA(2, 3) cell are estimated by simulation using the SPICE program for a 5-μm CMOS LSI implementation. For a unit current of 70 μA, the maximum delay time is estimated to be 63 ns.
Fig. 2. Parallel PD(2, 3) adder with MVCM circuits: (a) adder; (b) PDA(2, 3) cell.
Fig. 3. NMOS current mirror: (a) symbol; (b) circuit configuration.
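The threshold detection can be modelled behaviourally as a bank of comparators. The sketch below works in multiples of the unit current and assumes that the six thresholds are spaced 0.5 unit apart, as the range from 0.5 to 3.0 suggests; it does not model the transistor-level circuit, and the names `detect` and `decode` are illustrative.

```python
# Six threshold currents, 0.5, 1.0, ..., 3.0, in multiples of the unit current.
THRESHOLDS = [0.5 * k for k in range(1, 7)]


def detect(z):
    """Comparator outputs for a linear sum z in 0..6 (thermometer code)."""
    scaled = (z + 0.5) / 2          # half-scaled current with the 0.5 offset
    return [scaled > th for th in THRESHOLDS]


def decode(z):
    """Recover (c2, c1, w) of 4*c2 + 2*c1 + w = z from the comparator outputs."""
    level = sum(detect(z))          # number of thresholds exceeded
    return level // 4, level // 2 % 2, level % 2


for z in range(7):
    print(z, detect(z), decode(z))
```

The 0.5 offset places each half-scaled level midway between two neighbouring thresholds, so the number of exceeded thresholds equals the linear sum in this idealized model.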
To confirm the principal operations of the MVCM PD(2, 3) adder, we have implemented a prototype integrated circuit based on 5-μm CMOS technology. Fig. 4 shows the photomicrograph of the implemented integrated circuit of the PDA(2, 3) cell and the measured dc current-transfer characteristics. The unit current is 50 μA. The measured characteristics agree with those expected from the function of the PDA(2, 3) cell.
The two-operand parallel adder with the carry-save representation, or the PD(2, 2) representation, can also be designed with MVCM circuits. Fig. 5(a) shows the parallel adder with the PD(2, 2) representation. The two basic cell types shown in Fig. 5(b) and (c) are needed because the addition procedure is given by five steps, in contrast with the three-step addition with the PD(2, 3) representation. The maximum delay times of the basic cells, PDA1(2, 2) and PDA2(2, 2), are estimated to be 49 and 41 ns, respectively, for a unit current of 70 μA. The basic cells, PDA1(2, 2) and PDA2(2, 2), are composed of 21 and 15 transistors, respectively.
Other carry-propagation-free parallel adders can also be designed with MVCM circuits by extending the design method of the PD(2, 3) parallel adder. Fig. 6 shows a generalized configuration of the PD($r$, $q$) parallel adder having multiple inputs designed with MVCM circuits. The linear summations of (1) and (3) are performed by wiring, and the implementation of (2) needs some specific MVCM circuits. For example, the three-operand parallel PDA(2, 4) adder cell is designed with 52 transistors. Since this basic cell handles 13 current levels from 0 to $12 I_0$, the reduced noise immunity of the circuits must be considered.
Fig. 4. Implementation of a prototype CMOS integrated circuit: (a) photomicrograph of the implemented PDA(2, 3) cell; (b) measured current-transfer curves of the PDA(2, 3) cell.
Fig. 5. Parallel PD(2, 2) adder (carry-save adder) with MVCM circuits: (a) adder; (b) PDA1(2, 2) cell; (c) PDA2(2, 2) cell.
Fig. 6.
Fig. 7. Block diagram of the 12 × 12-bit multiplier internally using the PD(2, 3) representation.
Table III shows a comparison of several addition cells of parallel adders and parallel counters. The equivalent numbers of gate delays are estimated by using the gate delays of several simple gates relative to that of a two-input CMOS NOR gate with a fan-out of 2 (12 ns in 5-μm CMOS), which were estimated by SPICE simulation as shown in Table II. The number of transistors required and the number of input/output wires are also shown in Table III. These results are useful for estimating the performance of parallel multipliers using these basic addition cells.
IV. MULTIPLIER DESIGN 
A. Algorithm 
High-speed multiplication can be achieved by internally using the multiple-operand adders with the PD($r$, $q$) representation in a tree structure. For simplicity, a multiplication internally using two-operand PD(2, 3) adders is considered here. The number of digits, $n$, is assumed to be $3 \times 2^k$, $k = 1, 2, \ldots$. For instance, the operand length is 6, 12, 24, 48, etc. The inputs are $n$-bit unsigned binary integers $X = (x_{n-1} \ldots x_i \ldots x_0)_2$ and $Y = (y_{n-1} \ldots y_j \ldots y_0)_2$, where $x_i \in \{0, 1\}$ $(i = 0, 1, \ldots, n-1)$ and $y_j \in \{0, 1\}$ $(j = 0, 1, \ldots, n-1)$. The output is a $(2n)$-bit unsigned integer $P = (p_{2n-1} \ldots p_i \ldots p_0)_2$, where $p_i \in \{0, 1\}$ $(i = 0, 1, \ldots, 2n-1)$.
The multiplication algorithm is as follows:
Step 1: Generate $n \times n$ partial products, $p_{i,j}$ $(i = 0, 1, \ldots, n-1,\ j = 0, 1, \ldots, n-1)$:
$p_{i,j} = x_i \times y_j$, (15)
where $p_{i,j} \in \{0, 1\}$. Thus, $n$ partial-product operands, $P_j = (p_{n-1,j} \ldots p_{i,j} \ldots p_{1,j}\, p_{0,j})_2$ $(j = 0, 1, \ldots, n-1)$, are generated.
TABLE II
EQUIVALENT GATE DELAYS OF CMOS GATES COMPARED WITH TWO-INPUT NOR GATE DELAY
Fan-out
0.68 0.84 1.0 1.16 1.32 
Tb 0.79 1.0 1.2 1.6 
Step 2: Convert the $n$ partial-product operands into $n'$ operands in the PD(2, 3) representation by linearly summing up each digit of three partial-product operands $(i = 0, 1, \ldots, n+1,\ j = 0, 1, \ldots, n'-1)$, where $n' = n/3$ is a positive integer:
$p_{i,j}^{(0)} = p_{i,3j} + p_{i-1,3j+1} + p_{i-2,3j+2}$, (16)
where $p_{-1,j} = p_{-2,j} = 0$ and $p_{n,j} = p_{n+1,j} = 0$ $(j = 0, 1, \ldots, n-1)$. Since $p_{i,j}^{(0)} \in \{0, 1, 2, 3\}$, $n'$ operands in the PD(2, 3) representation, $P_j^{(0)} = (p_{n+1,j}^{(0)} \ldots p_{i,j}^{(0)} \ldots p_{1,j}^{(0)}\, p_{0,j}^{(0)})_{PD(2,3)}$ $(j = 0, 1, \ldots, n'-1)$, are generated.
Step 3: Add up the partial-product operands by means of a tree of parallel PD(2, 3) adders and obtain the product $P_0^{(L)}$, where $L = \log_2 n'$. Perform all additions in the PD(2, 3) representation in parallel at each level of the tree. Namely, perform the two-operand additions of $P_{2j}^{(k-1)}$ and $P_{2j+1}^{(k-1)}$ in parallel for $j = 0, 1, \ldots, n'/2^k - 1$ at the $k$th level $(k = 1, 2, \ldots, L)$, where $P_j^{(k)}$ denotes the $j$th intermediate result at the $k$th level of the addition tree.
Step 4: Convert the product $P_0^{(L)}$ in the PD(2, 3) representation to the equivalent binary number $P$, where $P$ is the final result.
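The four steps can be traced with a small software model. The sketch below is illustrative only: it assumes unsigned operands with $n = 3 \times 2^k$, stores every operand at absolute digit positions (least significant first), and replaces the final binary adder of Step 4 by a direct weighted sum. The names `pd23_add` and `multiply` are not from the paper.

```python
def pd23_add(x, y):
    """Two-operand PD(2, 3) addition; digit lists, least significant first."""
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))
    z = [a + b for a, b in zip(x, y)]                 # linear sums, 0..6
    w = [zi % 2 for zi in z] + [0, 0]                 # intermediate sum digits
    c1 = [0] + [zi // 2 % 2 for zi in z] + [0]        # carries to position i + 1
    c2 = [0, 0] + [zi // 4 for zi in z]               # carries to position i + 2
    return [a + b + c for a, b, c in zip(w, c1, c2)]  # final digits, 0..3


def multiply(x, y, n):
    """n x n-bit unsigned multiplication following Steps 1 to 4.

    Returns (product, number of adder levels used in the tree).
    """
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(n)]
    # Step 1: partial products, p[j][i] = x_i * y_j.
    p = [[xi & yj for xi in xb] for yj in yb]
    # Step 2: sum three neighbouring rows into one PD(2, 3) operand.
    operands = []
    for g in range(n // 3):
        digits = [0] * (3 * g + n + 2)                # absolute digit positions
        for t in range(3):                            # row 3g + t is shifted by t
            for i in range(n):
                digits[3 * g + t + i] += p[3 * g + t][i]
        operands.append(digits)
    # Step 3: binary tree of PD(2, 3) adders.
    levels = 0
    while len(operands) > 1:
        operands = [pd23_add(operands[m], operands[m + 1])
                    for m in range(0, len(operands), 2)]
        levels += 1
    # Step 4: convert the redundant result to an ordinary integer.
    product = sum(d << i for i, d in enumerate(operands[0]))
    return product, levels


print(multiply(4095, 4095, 12))
```

For the 12 × 12-bit case the model uses two adder levels, and the number of levels grows only with the logarithm of the operand length.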
In this algorithm, the partial product generation in Step 1 and the linear summations of Step 2 can be performed completely in parallel, independent of the operand length $n$. The additions of partial-product operands at each tree level can be performed in a constant time independent of the operand length by using the PD(2, 3) adders. Therefore, the computation time for the total addition of partial-product operands to generate a single product operand in the PD(2, 3) representation is proportional to the number of adder levels in the tree. The number of adder levels is given by $\lceil \log_2 (n/3) \rceil$, where $\lceil a \rceil$ denotes the smallest integer such that $\lceil a \rceil \ge a$. The conversion from the PD(2, 3) to the binary number representation can be performed in a computation time proportional to $\log_2 n$ by using a conventional binary adder such as a carry-lookahead adder. In this way, $O(\log n)$ time multiplication can be achieved.
Fig. 7 shows an example of a 12 × 12-bit unsigned integer multiplier with the parallel PD(2, 3) adders. In this case, the tree structure is composed of two levels of PD(2, 3) adders. At the first level of the tree, six operands are added in parallel by a PD(2, 3) adder. This multiplier can be implemented with a regular structure in VLSI, similar to parallel multipliers with a signed-digit representation and a carry-save representation [5], [6].
TABLE III
COMPARISON OF VARIOUS ADDITION CELLS
Technology                              Addition cell          Gate delays  Transistors  Input wires  Output wires  References
Binary logic CMOS                       Binary full adder      3.1          24           3            2             [14]
Binary logic CMOS                       Signed digit           6.3          42           4            2             [5]
Binary logic CMOS                       (7, 3) counter         5.6          110          7            3             [19]
Binary logic CMOS                       Carry save             6.2          48           4            2
Binary logic CMOS                       PD(2, 3)               6.0          56           4            2
Binary logic CMOS                       PD(2, 4)               10.1         146          9            3
Multiple-valued current-mode circuits   PD(2, 2) (carry save)  7.5          36           1            1
Multiple-valued current-mode circuits   Signed digit           9.0          34           1            1             [17]
Multiple-valued current-mode circuits   PD(2, 3)               5.3          28           1            1
Multiple-valued current-mode circuits   PD(2, 4)               -            52           1            1
Fig. 8. 12 × 12-bit multiplier internally using the MVCM PD(2, 3) adders.
This multiplication algorithm can be applied easily to 2's-complement binary integer multiplication, and the modified Booth's algorithm can be used [12]. In this case, the number of adder levels is further reduced to $\lceil \log_2 (n/6) \rceil$.
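The two level counts, $\lceil \log_2 (n/3) \rceil$ without recoding and $\lceil \log_2 (n/6) \rceil$ with the modified Booth's algorithm, can be tabulated with integer arithmetic. The function name `adder_levels` is illustrative.

```python
def adder_levels(n, booth=False):
    """Levels of two-operand PD(2, 3) adders for an n x n-bit multiplier."""
    divisor = 6 if booth else 3
    operands = -(-n // divisor)              # ceil(n / divisor)
    return (operands - 1).bit_length()       # ceil(log2(operands))


for n in (6, 12, 24, 48):
    print(n, adder_levels(n), adder_levels(n, booth=True))
```

The integer form avoids floating-point logarithms; it relies on the fact that the ceiling of the base-2 logarithm is unchanged when its argument is first rounded up to an integer.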
B. Design 
Fig. 8 shows an example of the 12 × 12-bit multiplier internally using the MVCM parallel PD(2, 3) adders. The multiplier and the multiplicand inputs are in 2's-complement binary number representation, and the modified Booth's algorithm is applied to the multiplier input. In Fig. 8, the block denoted by "Booth Recoder" recodes the multiplier input into six groups. The outputs of the $j$th group are three two-valued signals: the no-shift signal S0(j), the 1-bit-shift signal S1(j), and the complement signal C(j). The circuit of Fig. 9(a) generates a partial product $p_{i,j}$ as a current-mode signal from the $i$th digit of the multiplicand ($x_i$), S0(j), S1(j), C(j), and a signal from the partial product generator on the right. The circuit of Fig. 9(b) generates partial products at the most significant digit position. These are denoted by P. Blocks C in Fig. 8 add 1 at the least significant position when the multiplicand is negated in 2's-complement representation. The circuit is shown in Fig. 9(d). So as not to exceed the allowed digit value of the PD(2, 3) representation, the additions of 1's to the least significant digit are performed based on the fact that adding 1 to the $i$th digit position is equivalent to adding 1 and 2 to the $(i-1)$th and $(i-2)$th positions, respectively, or to adding 1's to the $(i-1)$th and $(i-2)$th positions and 2 to the $(i-3)$th position.
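The recoding into groups of three signals can be sketched with the standard radix-4 modified Booth recoding. The mapping used below (S0 for a digit of magnitude 1, S1 for magnitude 2, C for a negative digit) is an assumption about how the signals are defined; the paper does not spell out the encoding, and the names `booth_recode` and `booth_value` are illustrative.

```python
def booth_recode(y, n):
    """Recode an n-bit 2's-complement bit pattern y (n even) into n/2 groups.

    Returns (S0, S1, C) per group j, least significant group first.
    The signal mapping is an assumed one, chosen for illustration.
    """
    def bit(i):
        return (y >> i) & 1 if i >= 0 else 0

    groups = []
    for j in range(n // 2):
        d = -2 * bit(2 * j + 1) + bit(2 * j) + bit(2 * j - 1)  # digit in -2..2
        groups.append((abs(d) == 1, abs(d) == 2, d < 0))
    return groups


def booth_value(groups):
    """Signed value represented by the recoded groups (for checking only)."""
    total = 0
    for j, (s0, s1, c) in enumerate(groups):
        magnitude = 1 * s0 + 2 * s1
        total += (-magnitude if c else magnitude) * 4**j
    return total


# A 12-bit multiplier yields six groups, as in the Booth Recoder of Fig. 8.
print(booth_recode(-5 & 0xFFF, 12))
print(booth_value(booth_recode(-5 & 0xFFF, 12)))
```

Each group selects 0, the multiplicand, or the multiplicand shifted by one bit, optionally complemented, so the number of partial-product operands is halved.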
In Fig. 8, the six partial products in each digit are summed up by wiring, and the summed currents flow into the blocks denoted by A, where block A is the PDA(2, 3) cell of the MVCM PD(2, 3) adder shown in Fig. 2. The block A(MSB) is the PDA(2, 3) cell for the most significant bit, which can be designed by a minor change of the circuit in Fig. 2(b). At the outputs of the adder, an operand in the PD(2, 3) representation is obtained as an intermediate product. The intermediate result is converted to an equivalent 2's-complement binary number as the final result by the PD-to-Binary Converter in Fig. 8. The converter consists of decoders, shown in Fig. 9(c), that convert four-valued current-mode signals into two-valued voltage-mode signals, and a two-operand 2's-complement binary adder, for which a high-speed adder such as a carry-lookahead adder is used [13].
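The conversion can be sketched as a decode followed by a single binary addition. The model below is a simplification that assumes unsigned digits only, so it ignores the negatively weighted most significant digit of the 2's-complement design; the function name `pd23_to_binary` is illustrative.

```python
def pd23_to_binary(digits):
    """Convert a PD(2, 3) number (least significant digit first) to an integer.

    Each digit s_i in 0..3 is decoded into two bits, s_i = 2*h_i + l_i, which
    gives two ordinary binary operands for one two-operand binary adder.
    """
    low = sum((s & 1) << i for i, s in enumerate(digits))           # bits l_i
    high = sum((s >> 1) << (i + 1) for i, s in enumerate(digits))   # bits h_i
    return low + high                                               # binary adder


print(pd23_to_binary([0, 1, 2, 2, 1]))
```

Only this last addition involves carry propagation, which is why a fast adder such as a carry-lookahead adder is used at this stage.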
In the multiplier with the three-operand PD(2, 4) adder tree, application of the MVCM circuits is also attractive. Fig. 10 shows an example of a 72-bit 2's-complement multiplier internally using three-operand parallel PD(2, 4) adders and employing the modified Booth's algorithm. As in the multiplier with the PD(2, 3) adders, every 12 operands can be added by a parallel PD(2, 4) adder at the first tree level. The entire structure of the multiplier is a ternary tree. Since the outputs of 12 partial products can be summed by a single wire in each digit, a great reduction of the number of interconnections is expected compared with the corresponding design with binary logic gates or parallel counters.
TABLE IV
COMPARISON OF VARIOUS 24 × 24-BIT PARALLEL MULTIPLIERS
Technology                              Parallel multiplier                                   Gate delays  Transistors  Interconnections
Binary CMOS logic                       Carry-save array                                      65.2         12,200       850
Binary CMOS logic                       Wallace tree                                          49.6         13,900       970
Binary CMOS logic                       Signed-digit addition tree                            52.3         13,900       900
Binary CMOS logic                       (7, 3) counter                                        45.3         17,700       810
Binary CMOS logic                       Carry-save addition tree                              49.6         13,600       710
Binary CMOS logic                       PD(2, 3) addition tree                                46.1         14,000       780
Binary CMOS logic                       PD(2, 4) addition tree                                47.0         16,500       890
Multiple-valued current-mode circuits   Signed-digit addition tree                            58.6         10,900       180
Multiple-valued current-mode circuits   PD(2, 2) addition tree (carry-save addition tree)     53.1         10,800       200
Multiple-valued current-mode circuits   PD(2, 3) addition tree                                38.2         8,500        150
Multiple-valued current-mode circuits   PD(2, 4) addition tree                                -            9,300        140
Fig. 9. Building blocks of the multiplier of Fig. 8: (a) partial product generator; (b) partial product generator (MSD); (c) decoder; (d) compensator.
C. Comparison 
Table IV compares various multipliers in 24-bit precision. For all the multipliers, the inputs are 2's-complement binary numbers and the modified Booth's algorithm is employed. For the two-operand adder at the final stage of all the multipliers, a carry-lookahead adder with iterative circuits is used [13]. The number of interconnections shown in Table IV includes only those between basic cells, such as the adder cells and partial product generator cells, and excludes the interconnections from the multiplier and multiplicand inputs to the partial-product generators, because these interconnections are common to all the multipliers. We should consider the total performance with respect to critical signal path delay, complexities of gates and interconnections, and regularity of the structure.
The carry-save array multiplier [14] has the best regularity of layout. However, its multiplication time is not reasonable when the operand length is very long. The other multipliers belong to the same class of multiplication, whose time is proportional to the logarithm of the operand length. We can hardly say which type is outstanding in terms of the gate delays of the critical signal path. The Wallace tree multiplier [1] is one of the traditional high-speed multipliers. A drawback of the Wallace tree is the complexity of the structure and the interconnections. In the multipliers with parallel counters introduced by Dadda, the regularity of the structure is improved compared with the Wallace tree. Recently, VLSI multipliers with (7, 3) counters [15], [7] and the use of (9, 2) counters [16] have been proposed. However, the interconnection schemes are not always regular. The multipliers with the radix-2 signed-digit (redundant binary) adder tree [5], [17] and the carry-save adder tree (4:2 compressor tree) [6], [18] are excellent in the regularity of the structure, including the regularity of the interconnection schemes.
Fig. 10. Block diagram of the 72-bit multiplier internally using the MVCM PD(2, 4) adders.
The multipliers with the PD(2,3) adder tree designed with
MVCM circuits are excellent in both speed and compactness.
This is because of a good combination of the addition method
and the MVCM circuit technique. The regularity of layout is
rather good, similar to those of the redundant binary and
carry-save adder trees. In the design
of the signed-digit addition cell, a circuit technique using
multiple-valued bidirectional current levels [10], [17] has
been developed. By using this technique, the addition cell
can be designed directly following the basic signed-digit
addition algorithm. In the design of the PD(2,2), PD(2,3),
and PD(2,4) adders, the addition cells can be designed
directly following the basic addition algorithm by using
unidirectional current levels [8]. This property allows
simpler circuit configurations. The PD(2,2) adder designed
with MVCM circuits has two levels of basic circuits,
as shown in Fig. 5, whereas the PD(2,3) and PD(2,4)
adders have a single level of basic circuits. The PD(2,4)
adder is a powerful module for reducing the complexity of
interconnections and achieving fewer levels in the tree
structure of the multipliers. However, its practical
implementation requires some effort to ensure reasonable
noise immunity. In our present design with MVCM technology,
the PD(2,3) adder offers the best speed performance among
the various types of parallel adders with MVCM circuits, as
shown in Table III. Furthermore, for 24-bit precision, the
multiplier with the PD(2,3) addition tree needs two adder
levels, compared with three adder levels for the multipliers
with the signed-digit and the PD(2,2) adder (carry-save
adder) trees. Thus, the multiplier with the PD(2,3) adder
tree is the best of the designs compared.
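
To make the idea of carry-propagation-free addition with the digit set {0,1,2,3} concrete, the following is a minimal software sketch. It is an assumption-laden illustration, not the authors' circuit or necessarily their exact three-step algorithm: it assumes a linear digit-wise sum, a decomposition of each sum into an intermediate sum and two one-bit carries, and a final digit-wise sum. The names `pd23_add` and `value` are hypothetical.

```python
def value(digits):
    """Integer value of a little-endian radix-2 digit list."""
    return sum(d << i for i, d in enumerate(digits))


def pd23_add(x, y):
    """Add two radix-2 numbers whose digits lie in {0,1,2,3}.

    x, y: little-endian digit lists. Returns a little-endian digit list
    with digits in {0,1,2,3}. No carry travels more than two positions.
    """
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))

    # Step 1 (assumed): linear sum of each digit pair, z_i in 0..6.
    z = [a + b for a, b in zip(x, y)]

    # Step 2 (assumed): z_i = w_i + 2*c1_i + 4*c2_i with w, c1, c2 in {0,1}.
    w = [zi & 1 for zi in z]
    c1 = [(zi >> 1) & 1 for zi in z]
    c2 = [(zi >> 2) & 1 for zi in z]

    # Step 3 (assumed): s_i = w_i + c1_{i-1} + c2_{i-2}, so s_i <= 3.
    s = []
    for i in range(n + 2):
        wi = w[i] if i < n else 0
        c1_in = c1[i - 1] if 1 <= i <= n else 0
        c2_in = c2[i - 2] if 2 <= i <= n + 1 else 0
        s.append(wi + c1_in + c2_in)
    return s


if __name__ == "__main__":
    a, b = [3, 3, 3], [3, 3, 3]          # both represent 21
    total = pd23_add(a, b)
    print(total, value(total))
```

Because each output digit depends only on the input digits at its own position and the two positions below it, the delay of this addition is independent of the word length, which is the property the adder trees discussed above rely on.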
V. CONCLUDING REMARKS 
We have presented a very-large-scale-integration (VLSI)-
oriented multiplier design method based on a circuit
technique of multiple-valued current-mode (MVCM) circuits. We
have introduced a generalized carry-propagation-free multiple- 
operand addition method based on the redundant number rep- 
resentations using unsigned digit values. The addition method 
is quite useful for the efficient design of the parallel adders and 
the parallel multipliers with MVCM circuits. The multiplier 
designed internally using the MVCM parallel adder with a 
redundant digit set {0,1,2,3} has good total performance
with respect to speed, regularity of the structure, and reduced 
complexities of active devices and interconnections. A pro- 
totype CMOS integrated circuit for examining the principal 
operation has been implemented, and we have shown that 
the MVCM adder can be realized by standard CMOS LSI 
technology. For demonstration of the actual performance of the 
proposed multipliers with the MVCM circuits, implementation 
with VLSI technology will be necessary. We also need addi- 
tional studies including not only the hardware algorithm and 
circuit technique, but also the development of VLSI devices 
and fabrication technology essentially suitable for MVCM 
circuits. 
ACKNOWLEDGMENT 
The authors thank the anonymous referees for their
constructive comments that helped to improve the paper. Thanks
also to M. Ashiki of Toyohashi University of Technology for 
helpful discussions. 
REFERENCES 
[1] C. S. Wallace, “A suggestion for a fast multiplier,” IEEE Trans. Electron.
Comput., vol. EC-13, pp. 14-17, Feb. 1964.
[2] L. Dadda, “Some schemes for parallel multipliers,” Alta Freq., vol. 34,
pp. 349-356, Mar. 1965.
[3] L. Dadda, “On parallel digital multipliers,” Alta Freq., vol. 45, pp. 574-580,
1976.
[4] A. Avizienis, “Signed-digit number representations for fast parallel
arithmetic,” IRE Trans. Elect. Comput., vol. EC-10, pp. 389-400, Sept.
1961.
[5] N. Takagi, H. Yasuura, and S. Yajima, “High-speed VLSI multiplication
algorithm with a redundant binary addition tree,” IEEE Trans. Comput.,
vol. C-34, pp. 789-796, Sept. 1985.
[6] J. E. Vuillemin, “A very fast multiplication algorithm for VLSI
implementation,” Integration, VLSI J., vol. 1, pp. 39-52, Apr. 1983.
[7] E. E. Swartzlander, “Parallel counters,” IEEE Trans. Comput., vol. C-22,
pp. 1021-1024, Nov. 1973.
[8] S. Kawahito, K. Mizuno, and T. Nakamura, “Multiple-valued current-
mode arithmetic circuits based on redundant positive-digit number
representations,” in Proc. Int. Symp. Multiple-Valued Logic, Victoria,
Canada, May 1991, pp. 330-339.
[9] M. Kameyama and T. Higuchi, “Design of radix-4 signed-digit
arithmetic circuits for digital filtering,” in Proc. Int. Symp. Multiple-Valued
Logic, June 1980, pp. 272-277.
[10] S. Kawahito, M. Kameyama, T. Higuchi, and H. Yamada, “32 x 32 bit
multiplier using multiple-valued MOS current-mode circuits,” IEEE J.
Solid-State Circuits, vol. SC-23, pp. 124-132, Feb. 1988.
[11] M. Kameyama, S. Kawahito, and T. Higuchi, “A multiplier chip
with multiple-valued bidirectional current-mode logic circuits,” IEEE
Computer, vol. 21, pp. 43-56, Apr. 1988.
[12] L. P. Rubinfield, “A proof of the modified Booth’s algorithm for
multiplication,” IEEE Trans. Comput., vol. C-24, pp. 1014-1015, Oct.
1975.
[13] S. H. Unger, “Tree realizations of iterative circuits,” IEEE Trans.
Comput., vol. C-26, pp. 365-383, Apr. 1977.
[14] K. Hwang, Computer Arithmetic-Principle, Architecture and Design.
New York: Wiley, 1979.
[15] R. K. Montoye, P. W. Cook, E. Hokenek, and R. P. Havreluk, “An 18
ns 56-bit multiply-adder circuit,” in Dig. Tech. Papers, Int. Solid-State
Circuits Conf., WPM 3.4, Feb. 1990, pp. 46-47.
[16] P. J. Song and G. D. Micheli, “Circuit and architecture trade-offs for
high-speed multiplication,” IEEE J. Solid-State Circuits, vol. SC-26, pp.
1184-1198, Sept. 1991.
[17] S. Kawahito, M. Kameyama, and T. Higuchi, “Multiple-valued radix-
2 signed-digit arithmetic circuits for high-performance VLSI systems,”
IEEE J. Solid-State Circuits, vol. SC-25, pp. 125-131, Feb. 1990.
[18] M. R. Santoro and M. A. Horowitz, “SPIM: A pipelined 64 x 64-bit
iterative multiplier,” IEEE J. Solid-State Circuits, vol. SC-24, Apr. 1989.
[19] M. Mehta, V. Parmar, and E. Swartzlander, “High-speed multiplier
design using multi-input counters and compressor circuits,” in Proc.
Int. Symp. Comput. Arithmetic, 1991, pp. 43-50.
Shoji Kawahito (S’85-M’88) was born in
Tokushima, Japan, on March 21, 1961. He received
the B.E. and M.E. degrees in electrical and
electronic engineering from Toyohashi University
of Technology, Toyohashi, Japan, in 1983 and
1985, respectively, and the D.E. degree in electronic
engineering from Tohoku University, Sendai, Japan,
in 1988.
He is currently a Lecturer with the Department 
of Electrical and Electronic Engineering, Toyohashi 
University of Technology. His research interests 
include multiple-valued arithmetic, VLSI, and integrated smart sensors. 
Dr. Kawahito is a member of the Institute of Electronics, Information and 
Communication Engineers of Japan and the Institute of Electrical Engineers 
of Japan. He received the Outstanding Paper Award at the 1987 IEEE 
International Symposium on Multiple-Valued Logic (with T. Higuchi et al.). 
Makoto Ishida was born in Hyogo, Japan, on July 
14, 1950. He received the B.E. and M.E. degrees in 
electronic engineering from Toyama University in 
1974 and Shizuoka University in 1976, respectively, 
and the D.E. degree from Kyoto University in 
1979. 
He is an Associate Professor at the Department 
of Electrical and Electronic Engineering, Toyohashi 
University of Technology. His research interests 
include silicon-on-insulator material, structures, and 
devices. 
Dr. Ishida is a member of the Institute of Electronics, Information and 
Communication Engineers of Japan, the Japanese Society of Applied Physics, 
and the Material Research Society. 
Tetsuro Nakamura was born in Niigata, Japan,
in 1932. He received the B.E. and D.E. degrees
from Tohoku University, Sendai, Japan, in 1957 and 
1968, respectively. 
He joined NEC, Kawasaki, Japan, in 1957, where 
he worked on Ge-alloy-type switching transistors, 
Si high-frequency high-power transistors, and mi- 
crowave diode and diffusion processes on Si. Since 
1978, he has been with Toyohashi University of 
Technology, where he is currently a Professor. His 
research interests are in semiconductor integrated 
sensors. 
Dr. Nakamura is a member of the Institute of Electrical Engineers of Japan 
and the Japanese Society of Applied Physics.
Michitaka Kameyama (M’79-SM’91) was born in 
Utsunomiya, Japan, on May 12, 1950. He received 
the B.E., M.E., and D.E. degrees in electronic 
engineering from Tohoku University, Sendai, Japan, 
in 1973, 1975, and 1978, respectively. 
He is currently a Professor in the Department 
of System Information Sciences, Graduate School 
of Information Sciences, Tohoku University. His 
general research interests include robot electronics, 
VLSI systems, highly reliable digital systems, and 
multiple-valued logic systems. 
He received the Outstanding Paper Award at the 1984, 1985, 1987, and 1989
IEEE International Symposia on Multiple-Valued Logic (with T. Higuchi et
al.), and the Technically Excellent Award from the Society of Instrument and 
Control Engineers of Japan in 1986, the Outstanding Paper Award from the 
Institute of Electronics, Information and Communication Engineers of Japan 
in 1989 (with T. Higuchi et al.), and the Technically Excellent Award from 
the Robotics Society of Japan in 1990 (with T. Higuchi et al.). Dr. Kameyama 
is a member of the Institute of Electronics, Information and Communication 
Engineers of Japan, the Society of Instrument and Control Engineers of Japan, 
the Information Processing Society of Japan, and the Robotics Society of 
Japan. 
Tatsuo Higuchi (M’70-SM’83-F’92) was born in 
Sendai, Japan, on March 30, 1940. He received 
the B.E., M.E., and D.E. degrees in electronic 
engineering from Tohoku University, Sendai, Japan, 
in 1962, 1964, and 1969, respectively. 
He is currently a Professor with the Depart- 
ment of System Information Sciences, Graduate 
School of Information Sciences, Tohoku University.
His research interests include the design of one-
and multidimensional digital filters, multiple-valued
logic systems, robot electronics, and VLSI computing
structures for signal and image processing.
Dr. Higuchi received the Outstanding Paper Award at the 1984, 1985,
1987, and 1989 IEEE International Symposia on Multiple-Valued Logic, the
Technically Excellent Award from the Society of Instrument and Control 
Engineers of Japan in 1984 and 1986, the Outstanding Transactions Paper 
Award from the Institute of Electronics, Information and Communication 
Engineers of Japan in 1989, and the Technically Excellent Award from the 
Robotics Society of Japan in 1990. He was the Program Chairman of the 
1983 IEEE International Symposium on Multiple-Valued Logic, and served 
as Symposium Chair of the 1992 symposium. He is a member of the Society 
of Instrument and Control Engineers of Japan, the Institute of Electronics, 
Information and Communication Engineers of Japan, and the Robotics Society
of Japan. 
