Fast carry accumulator design by Mastin, W. C.
FAST CARRY ACCUMULATOR DESIGN
By William C. Mastin
Astrionics Laboratory
February 9, 1970
https://ntrs.nasa.gov/search.jsp?R=19700020676 2020-03-23T19:27:09+00:00Z
TECHNICAL REPORT STANDARD TITLE PAGE
Report No.: TM X-53983
Title and Subtitle: Fast Carry Accumulator Design
Report Date: February 9, 1970
Author: William C. Mastin
Performing Organization Name and Address: George C. Marshall Space Flight Center,
Marshall Space Flight Center, Alabama 35812
Technical Memorandum
Supplementary Notes: Prepared by Astrionics Laboratory, Science and Engineering Directorate
Abstract: Methods for increasing the speed of binary addition by decreasing carry propagation time
are reviewed. Particular attention is given to those methods that would be applicable to an
accumulator using ones complements with end-around carry for subtraction. An iterative
accumulator using pulse logic for input and carry signals is described. Realizations of gated
carry, carry-completion detection, and carry-skip circuits that would be compatible with this
accumulator are presented. NAND gates are used in the design of the required combinational
networks.
Key Words: Digital Accumulation, Gated Carry, Carry-Completion Detection, Carry Skip
Security Classification (of this report): Unclassified
Security Classification (of this page): Unclassified
MSFC - Form 3292 (May 1969)
TABLE OF CONTENTS
SUMMARY
INTRODUCTION
REVIEW OF METHODS FOR DECREASING CARRY PROPAGATION TIME
    Gated Carry
    Carry-Completion Detection
    Carry-Skip Techniques
DESCRIPTION OF A SIMPLE ITERATIVE ACCUMULATOR
RAPID ACCUMULATOR REALIZATIONS AND ANALYSIS
    Gated Carry
    Carry-Completion Detection
    Carry-Skip Techniques
DISCUSSIONS AND CONCLUSIONS
APPENDIX: DETERMINATION OF OPTIMAL AND ECONOMICAL SKIP DISTRIBUTIONS
REFERENCES
BIBLIOGRAPHY
Table  Title
1. Truth Table for Carry Determination
2. Logic Element Delay Times
3. Relative Cost of Logic Elements
4. Accumulator Cost and Time Requirements
5. Comparison of Accumulator Designs
Figure  Title
1. Closed-loop system with sampling
2. Digital compensator
3. Simple series accumulator
4. Typical full binary adder stage
5. Simple iterated (pseudoparallel) adder
6. Gated carry
7. Stored carry
8. Principle of carry-completion detection
9. Carry-completion detection, one stage
10. NOR gate adder stage
11. NOR gate realization of adder carry and carry-completion detection circuits
12. Full binary adder stage
13. Three-bit adder group with full carry prediction
14. Carry-look-ahead for three groups
15. Carry-skip circuit
16. A simple iterative accumulator
17. Timing diagram for carry generation and assimilation
18. Accumulator with gated carry
19. Timing diagram for carry generation and assimilation (gated carry)
20. Carry-completion detection circuit
21. Timing diagram for detection enable signal generation
22. Continuity of carry signal at sensing gates
23. Timing diagram for carry generation and assimilation in carry-completion detection realization
24. Timing diagram for carry-completion signal generation
25. Carry propagation through group with $t_1 \cdot t_2 \cdot t_3 \cdot t_4 \cdot t_5$ true
26. Carry skip group k
27. Gating level timing diagram
28. Six-group representation
Symbol            Definition
+                 Logical OR function
·                 Logical AND function
overbar           Inversion
⊕                 Logical EXCLUSIVE OR function
⊙                 Logical COINCIDENCE function
AC                Accumulator register element; T flip-flop
DM                Delay multivibrator
(graphic symbol)  NAND gate
(graphic symbol)  NOR gate
(graphic symbol)  Delay
TECHNICAL MEMORANDUM X-53983
SUMMARY
Methods for increasing the speed of binary addition by decreasing carry
propagation time are reviewed. Particular attention is given to those methods
that would be applicable to an accumulator using ones complements with
end-around carry for subtraction. An iterative accumulator using pulse logic for
input and carry signals is described. Realizations of gated carry,
carry-completion detection, and carry-skip circuits that would be compatible with this
accumulator are presented. NAND gates are used in the design of the required
combinational networks.
INTRODUCTION
Digital techniques are finding application in closed-loop control systems.
Sampling permits the control of large amounts of power by sensitive control
elements and decreases the loading on sensing devices. The use of sampled
data and digital components in control systems facilitates time sharing of
portions of the system, which results in both economy and reliability in design.
A digital compensator in a control loop is shown in Figure 1, where $D(z)$
represents the transfer function of the compensator, $G_a(s)$ is the transfer
function of the system sensing element, and $G_b(s)$ is the transfer function of
the system driving element.
Figure 1. Closed-loop system with sampling [1].
A model of a digital compensator is shown in Figure 2 [1]. A second-order
compensator transfer function may be described by [2]
\[ e_2(KT) = e_1(KT) + a_1 e_1(KT-T) + a_2 e_1(KT-2T) - b_1 e_2(KT-T) - b_2 e_2(KT-2T) \tag{1} \]
where T is the sampling period.
Figure 2. Digital compensator [1].
The accuracy with which this equation actually describes the transfer
function may be seen to depend upon the speed with which the computer performs
the indicated arithmetic operations. If $\tau$ is the time required to complete the
computation, the output of the compensator is a function of $KT + \tau$. This $\tau$
corresponds to a lag in a continuous system and, if large enough, could
contribute to stability problems.
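The arithmetic that the compensator must finish within each sampling period can be sketched as follows. This is only an illustration of the second-order difference equation given above: the function name `make_compensator` and the coefficient values are hypothetical and are not taken from the report.

```python
def make_compensator(a1, a2, b1, b2):
    """Return a step function for the second-order difference equation
    e2(KT) = e1(KT) + a1*e1(KT-T) + a2*e1(KT-2T) - b1*e2(KT-T) - b2*e2(KT-2T).
    """
    e1_hist = [0.0, 0.0]  # e1(KT-T), e1(KT-2T)
    e2_hist = [0.0, 0.0]  # e2(KT-T), e2(KT-2T)

    def step(e1):
        # Four multiplications and four additions/subtractions per sample;
        # all of them pass through the adder circuitry.
        e2 = (e1 + a1 * e1_hist[0] + a2 * e1_hist[1]
              - b1 * e2_hist[0] - b2 * e2_hist[1])
        e1_hist[1], e1_hist[0] = e1_hist[0], e1
        e2_hist[1], e2_hist[0] = e2_hist[0], e2
        return e2

    return step


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
step = make_compensator(a1=0.5, a2=0.25, b1=0.5, b2=0.0)
outputs = [step(sample) for sample in (1.0, 0.0, 0.0)]
print(outputs)
```

Every output sample depends on a chain of additions, so the time taken by the adder bounds how closely the output instant can approach the sampling instant KT.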
Adder circuitry is essential not only for addition, but also for multipli- 
cation, subtraction, and division. The speed of the computational process is 
then heavily dependent upon that of the adder. In general, the basic design 
problem for arithmetic units is that of achieving high speed and accuracy at 
low cost. Size, weight, and sensitivity to environment are not generally of 
great importance in a general purpose computer. These factors, along with a 
greater emphasis on reliability, do become of great concern in the special 
purpose computers used in digital compensators. 
The most economical adder is probably the serial machine represented
in Figure 3. Only one full binary adder (FBA) unit is required to sum the
addend and augend words. A typical FBA stage is shown in Figure 4. However,
only one element of each word is added in one computation time, so the time for
the addition of the two words (of equal length) is then nT, where n is the
number of elements in a word and T is the time required for each addition.
The time required for the addition of the two words can be decreased
by the use of a pseudoparallel or simple iterated adder as represented in
Figure 5. Here the elements of the n-bit addend and augend are applied to n
FBA's simultaneously, and all partial sums ($x_i \oplus y_i$) are formed during one
addition time T. The addition is not completed, however, until all carries
generated have been added to the partial sums in the succeeding stages. This
carry propagation is essentially a serial process, and in the worst case a carry
generated in the first stage would have to propagate through the entire length of
the adder.
Figure 3. Simple series accumulator.
It is thus recognized that the problem of increasing the speed of the 
addition process reduces, to a great extent, to that of decreasing carry prop- 
agation time. A considerable amount of work has been done in this area and 
several solutions have been proposed. 
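The serial nature of carry propagation in the simple iterated adder can be seen in a short simulation. The sketch below is illustrative only: the name `iterated_add` is hypothetical, the end-around carry is ignored, and a carry out of the last stage is simply discarded.

```python
def iterated_add(x, y, n):
    """Add two n-bit words the way a simple iterated adder does.

    Returns (sum, passes), where passes counts how many times carries
    had to be added to the partial sums before none remained.
    """
    mask = (1 << n) - 1
    partial = (x ^ y) & mask          # all partial sums formed at once
    carries = ((x & y) << 1) & mask   # carries generated, sent to next stage
    passes = 0
    while carries:
        # A carry arriving at a stage whose partial sum is one
        # produces a further carry into the following stage.
        new_carries = ((partial & carries) << 1) & mask
        partial ^= carries
        carries = new_carries
        passes += 1
    return partial, passes


print(iterated_add(0b0111, 0b0001, 4))
print(iterated_add((1 << 30) - 1, 1, 30))
```

With every stage of a 30-bit word set to one, a single carry generated in the first stage ripples through all 29 remaining stages, which is the worst case described above.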
REVIEW OF METHODS FOR DECREASING CARRY PROPAGATION TIME
Several methods for increasing the speed of binary addition by decreasing
carry propagation time have been presented by Sklansky [3], MacSorley [4],
and Lehman and Burla [5]. Those methods and others that might be applicable
to accumulators using ones complements with end-around carry for subtraction
will be reviewed here.
Figure 4. Typical full binary adder stage. Inputs: $x_i$, $y_i$, $c_i$; outputs: $s_i$, $c_{i+1}$. From the truth table,
\[ s_i = \bar{c}_i \bar{x}_i y_i + \bar{c}_i x_i \bar{y}_i + c_i \bar{x}_i \bar{y}_i + c_i x_i y_i = c_i \oplus x_i \oplus y_i \]
\[ c_{i+1} = c_i x_i + x_i y_i + c_i y_i = c_i (x_i \oplus y_i) + x_i y_i \]
Figure 5. Simple iterated (pseudoparallel) adder.
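The truth table for carry determination for two-summand binary addition
is given in Table 1.
TABLE 1. TRUTH TABLE FOR CARRY DETERMINATION
x_i  y_i  c_i  c_{i+1}  Remarks
0    0    0    0        Carry out determined solely by the summands, x_i, y_i
0    0    1    0        Carry out determined solely by the summands, x_i, y_i
1    1    0    1        Carry out determined solely by the summands, x_i, y_i
1    1    1    1        Carry out determined solely by the summands, x_i, y_i
0    1    0    0        Carry out determined solely by carry in
0    1    1    1        Carry out determined solely by carry in
1    0    0    0        Carry out determined solely by carry in
1    0    1    1        Carry out determined solely by carry in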
Gated Carry
Perhaps the simplest method for decreasing carry propagation time is
that of gated carry. Table 1 [6] shows that the dependency of the carry out of
a stage will fall into two categories: one in which the carry out is determined
entirely by the carry in and one in which the carry out is determined only by
the summands. Unlike summand bits in a stage may be detected and used to
provide a path for carries from its preceding stage directly to its succeeding
stage. In a stage with like summands, the above path is interrupted, and a
one carry or zero carry is provided for its succeeding stage, depending upon
whether the summands are both ones or both zeros. One such scheme using
a 2-bit time addition is shown in Figure 6 [7]. A variation of this method
using storage elements for the carry produced in each stage is called the
stored-carry technique and is shown in Figure 7 [7].
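The two categories of carry dependency can be expressed directly in code. The following is a minimal sketch of the gating rule only, not of the circuits in Figures 6 and 7; the function names and the least-significant-bit-first list convention are assumptions made for illustration.

```python
def carry_out(x_i, y_i, c_i):
    """Carry out of one stage, following the two categories described above."""
    if x_i != y_i:
        return c_i   # unlike summands: the carry in is passed straight through
    return x_i       # like summands: path interrupted, summands fix the carry


def gated_carry_add(x_bits, y_bits, c_in=0):
    """Add two words given as bit lists, least significant bit first."""
    carry, sum_bits = c_in, []
    for x_i, y_i in zip(x_bits, y_bits):
        sum_bits.append(x_i ^ y_i ^ carry)
        carry = carry_out(x_i, y_i, carry)
    return sum_bits, carry


# 3 + 1 = 4: the carry generated in stage 0 passes through stage 1 unchanged.
print(gated_carry_add([1, 1, 0], [1, 0, 0]))
```

The rule agrees with the ordinary majority function for the carry, which is why the gated path can replace the carry logic of a full adder stage.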
Figure 6. Gated carry. Legend: $x_i$, element of accumulator register; $y_i$, element of input register; the sum, X+Y, is stored in the accumulator register. Signals shown: end-around carry, add control pulse, carry control pulse.
Figure 7. Stored carry. Legend: $x_i$, element of accumulator register; $y_i$, element of input register; $c_i$, element of carry register; the sum, X+Y, is stored in the accumulator register. Signals shown: end-around carry, reset, add control signal, carry control signal.
In either of these techniques, a carry entering the first stage would
pass through two gates per stage through the (n-1)th stage and through two gates
in the nth stage to be assimilated. The maximum carry propagation time is
then 2(nd), where n is the number of stages and d is the delay time per
gate.
Carry-Completion Detection
In methods using fixed-time addition, time for the worst case propagation
through the entire accumulator plus a safety factor must be allowed. It
can be shown that for the addition of numbers with a random distribution of
ones and zeros, the average maximum carry length is less than $\log_2 n$, where
n is the word length [8]. For the propagation of both carry and no carry, the
average length is $\log_2 (5n/4)$ [9, 10]. The addition could then be speeded up if
only the time needed for the actual carry propagation could be allowed.
Gilchrist, Pomerene, and Wong [6] proposed a scheme to detect the presence
of a carry and use its completion to initiate the next addition step. The principle
of carry-completion detection is illustrated in Figure 8. At the initiation
of a carry operation, a one is inserted into the zero-carry line preceding the
first adder stage. A one carry or zero carry is then gated or generated in
the succeeding stages as determined by the inputs at those stages. The
condition for a one carry is
\[ c_i^1 = x_i y_i + c_{i-1}^1 (x_i \oplus y_i) \tag{2} \]
and the condition for a zero carry is
\[ c_i^0 = \bar{x}_i \bar{y}_i + c_{i-1}^0 (x_i \oplus y_i) \tag{3} \]
Note that both the one- and zero-carry lines must be monitored so that a false
completion signal will not be generated. One stage of an accumulator using
carry-completion detection is shown in Figure 9 [7]. A method for evaluating
the reduction in time achieved by the use of asynchronous addition techniques
has been given by Hendrickson [10], who arrived at a value of 90 percent
savings for a 100-bit adder.
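The principle can be illustrated with a small model of the one-carry and zero-carry lines. This is a sketch under stated assumptions, not the circuit of Gilchrist, Pomerene, and Wong: each pass of the loop stands for one stage delay, all stages update together, and `c1[i]`/`c0[i]` are hypothetical names for the one- and zero-carry lines entering stage i (indices start at zero).

```python
def add_with_completion(x_bits, y_bits, c_in=0):
    """Dual-line carry propagation; returns (sum_bits, stage_delays).

    Bit lists are least significant bit first. Completion is signalled
    when every position has either its one-carry or zero-carry line set.
    """
    n = len(x_bits)
    c1 = [0] * (n + 1)
    c0 = [0] * (n + 1)
    c1[0], c0[0] = c_in, 1 - c_in   # known carry inserted ahead of stage 0
    delays = 0
    while not all(c1[i] or c0[i] for i in range(n + 1)):
        nxt1, nxt0 = c1[:], c0[:]
        for i in range(n):
            x, y = x_bits[i], y_bits[i]
            t = x ^ y
            nxt1[i + 1] = (x & y) | (c1[i] & t)               # one carry
            nxt0[i + 1] = ((1 - x) & (1 - y)) | (c0[i] & t)   # zero carry
        c1, c0 = nxt1, nxt0
        delays += 1
    sum_bits = [x_bits[i] ^ y_bits[i] ^ c1[i] for i in range(n)]
    return sum_bits, delays


print(add_with_completion([1, 1, 1, 1], [1, 0, 0, 0]))  # long carry chain
print(add_with_completion([1, 0, 1, 0], [1, 0, 1, 0]))  # like summands only
```

When all summand pairs are alike, every stage fixes its own carry and completion is reached after a single stage delay; the long chain in the first call takes one delay per stage.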
Figure 8. Principle of carry-completion detection.
A NOR gate binary adder with carry-completion detection has been
presented by Majerski and Wiweger [11]. One stage of their realization of an
adder is shown in Figure 10. The Boolean formulas describing the circuit of
Figure 10 are
\[ D_i = x_i \bar{y}_i + \bar{x}_i y_i \tag{4} \]
\[ D_i c_i = \overline{x_i y_i + \bar{x}_i \bar{y}_i + \bar{x}_{i-1} \bar{y}_{i-1} + D_{i-1} \bar{c}_{i-1}} \tag{5} \]
\[ s_i = D_i \bar{c}_i + \bar{D}_i c_i = D_i \bar{c}_i + \overline{D_i c_i + \bar{x}_{i-1} \bar{y}_{i-1} + D_{i-1} \bar{c}_{i-1}} \tag{6} \]
where
$x_i$, $y_i$ (i = 1, 2, ..., n) = bits of summands
$c_i$ (i = 1, 2, ..., n+1) = carry from i-1 to i position
$s_i$ (i = 1, 2, ..., n) = ith bit of sum.
Figure 9. Carry-completion detection, one stage. Legend: x, accumulator register element; y, input register element; the sum, X+Y, is stored in the accumulator register.
Figure 10. NOR gate adder stage [11].
Figure 11 shows the adder carry and carry-completion detection circuits of
Majerski and Wiweger [11]. An advantage of the adder carry circuit is that
a carry must propagate through only one gate per stage, whereas two gates
are required in other known designs. The Boolean formulas for the
carry-completion detection circuit are
\[ H_i = x_i y_i + \bar{x}_i \bar{y}_i + D_i c_i \bar{h} + D_i \bar{c}_i \bar{h}, \quad i = 2, 4, 6, \ldots, 2E(n/2) \tag{7} \]
\[ H = \overline{\bar{H}_2 + \bar{H}_4 + \bar{H}_6 + \cdots + \bar{H}_{2E(n/2)}} \tag{8} \]
where h is the carry control signal.
Operation begins with h = 1 and the add control signal. After at least
2d, where d is the maximum propagation time of a NOR gate, h is switched
to 0. The carry propagation process then begins and lasts until all $D_i c_i$ and
$D_i \bar{c}_i$ gates stabilize. The h = 1 signal is applied to prevent a false
carry-completion signal from being generated. The carry-completion signal may
precede the real completion by up to d since only every second position is
monitored.
Carry-Skip Techniques
Assuming the complements of both summands are available, an FBA
stage may be represented as shown in Figure 12. The functions representing
the sum and carry are then
\[ s_i = x_i \oplus y_i \oplus c_i \tag{9} \]
\[ c_{i+1} = x_i y_i + x_i c_i + y_i c_i \tag{10} \]
Figure 12. Full binary adder stage.
Let
\[ c_{i+1} = r_i \quad (c_i = r_{i-1}) \tag{11} \]
Equation (10) can be written as
\[ r_i = (x_i \oplus y_i) c_i + x_i y_i \tag{12} \]
Let
\[ t_i = x_i \oplus y_i = \text{condition for transmission of a carry} \tag{13} \]
\[ g_i = x_i y_i = \text{condition for generation of a carry.} \tag{14} \]
Then
\[ r_i = g_i + t_i c_i = \text{carry out of a stage.} \tag{15} \]
The carry into any stage may then be expanded as
\[ c_i = r_{i-1} = g_{i-1} + t_{i-1} r_{i-2} = g_{i-1} + t_{i-1} g_{i-2} + t_{i-1} t_{i-2} r_{i-3} \tag{16} \]
The expansion of r may be carried as far back as desired. The limit
for an n-stage adder is an expression for $c_n$ containing $r_1$.
Application of this principle to a section of an adder is illustrated in
Figure 13 [4]. Allowing the i to take on successive values in equation (16)
and omitting all terms with negative subscripts, it is seen that each stage of the
adder will require one i-input OR gate and i AND gates having one through i
inputs. Thus the number of circuit elements would become prohibitive for
adders of more than a few stages. However, the maximum carry path between
any two stages is two levels, and it is four levels for a complete addition.
The adder can be broken up into groups of stages connected in the manner
described. The carry into each group would be designated $c_{gi}$, the carry out of
a group designated $r_{gi}$, and the group containing the lower-order stages designated
group 1. Assume that six stages would be a reasonable number of stages to be
connected with full carry-look-ahead. If the six-stage groups are now connected
in series with $c_{gi} = r_{g(i-1)}$, a carry will require four levels to be generated,
two levels to be transmitted through each intermediate group, and four levels
to reach and produce a sum in the final group. For 6-bit groups then, the
maximum carry path length would be 4 + (2n/6), where n is the number of
stages. For a 30-bit adder, this would be 14. This technique may be extended
by providing carry-look-ahead circuitry between groups and even further by
dividing the groups into sections and providing look-ahead between sections. A
carry-look-ahead, or carry-skip, network for three groups is illustrated in
Figure 14.
Figure 13. Three-bit adder group with full carry prediction.
Figure 14. Carry-look-ahead for three groups.
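The expansion of the carry into generation and transmission terms, and the path-length estimate for series-connected groups, can be checked with a short sketch. The function names are hypothetical, stage indices start at zero, and the initial carry is passed in as `c_in`; the code evaluates the two-level OR-of-ANDs form rather than modelling any particular gate network.

```python
def lookahead_carries(x_bits, y_bits, c_in=0):
    """Carry into every stage from the expanded form of r_i = g_i + t_i c_i.

    Bit lists are least significant bit first. Element i of the result is
    the carry into stage i; the last element is the carry out of the word.
    """
    g = [x & y for x, y in zip(x_bits, y_bits)]   # generation conditions
    t = [x ^ y for x, y in zip(x_bits, y_bits)]   # transmission conditions
    carries = [c_in]
    for i in range(1, len(x_bits) + 1):
        # g[i-1] + t[i-1] g[i-2] + ... + t[i-1] ... t[0] c_in
        c, prefix = 0, 1
        for j in range(i - 1, -1, -1):
            c |= prefix & g[j]
            prefix &= t[j]
        c |= prefix & c_in
        carries.append(c)
    return carries


def max_carry_path_levels(n, group=6):
    """Gate levels for look-ahead groups connected in series: 4 + 2n/group."""
    return 4 + 2 * n // group


print(lookahead_carries([1, 1, 0], [1, 0, 0]))
print(max_carry_path_levels(30))
```

The inner loop makes the cost visible: the carry into stage i needs i AND terms of growing width, which is why full look-ahead is restricted to small groups.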
Majerski [12] has proposed and implemented a NOR gate realization of
a carry-skip circuit. For use of this circuit the skips may comprise only odd
numbers of adder stages, and their spacing relative to one another must
always be an even number of stages. The adder with which this circuit is used
is an asynchronous binary NOR gate implementation, of which the structure is
given by the following equations:
For odd adder stages,
\[ D_i \bar{c}_i = \overline{x_i y_i + \bar{x}_i \bar{y}_i + x_{i-1} y_{i-1} + D_{i-1} c_{i-1} + h} \tag{18} \]
where h is the carry control signal;
for even adder stages,
\[ s_i = \overline{D_i c_i + \bar{x}_{i-1} \bar{y}_{i-1} + D_{i-1} \bar{c}_{i-1}} + \overline{x_i y_i + \bar{x}_i \bar{y}_i + D_i c_i} \tag{19} \]
\[ D_i c_i = \overline{x_i y_i + \bar{x}_i \bar{y}_i + \bar{x}_{i-1} \bar{y}_{i-1} + D_{i-1} \bar{c}_{i-1}} \tag{20} \]
where
\[ D_i = x_i \bar{y}_i + \bar{x}_i y_i \tag{21} \]
Carry propagation begins with a change in h from 1 to 0 and ends with the
stabilization of the states of all gates.
The carry-skip circuit, comprising k stages from j + I to j + k of 
an adder, is described by the function: 
where j is an even number and k is an odd number. The carry and skip
circuits for k = 3 are shown in Figure 15 [12].
Majerski's equations for determining optimal and economical skip
distributions in adders using his design are presented in the Appendix.
DESCRIPTION OF A SIMPLE ITERATIVE ACCUMULATOR
The parallel binary accumulator shown in reduced block diagram form
in Figure 16 has been used by the Auburn Research Foundation in the design of
a digital compensator [2]. It was selected because of its automatic carry
feature and relatively simple control circuitry. Its properties will be briefly
described here and used as a basis on which to judge improvements in carry
propagation time resulting from the application of fast carry techniques.
The blocks labeled AC are T flip-flop storage elements of the
accumulator register, and those labeled DM are delay multivibrators used for carry
generation. The NAND gate pairs in the lower part of the figure form the
augend-complementing structure. The input from the complementing-gate
structure is a negative-going pulse when the input is positive. The negation
output of the DM is used. The input to the ith-stage flip-flop (AC) is then
\[ \overline{\bar{y}_i \cdot \bar{c}_i} = y_i + c_i \tag{23} \]
The output equation of the ith flip-flop is
\[ x_i' = \bar{t}_i x_i + t_i \bar{x}_i \tag{24} \]
where $t_i$ is the input.
Figure 15. Carry-skip circuit [12].
Figure 16. A simple iterative accumulator.
The total addition for a stage takes place in two steps. In the first step,
\[ t_i = y_i \]
Then
\[ x_i' = \bar{y}_i x_i + y_i \bar{x}_i = y_i \oplus x_i \tag{25} \]
The complement output of the flip-flop is the function
\[ \overline{y_i \oplus x_i} = y_i \odot x_i \tag{26} \]
In the second step of the addition, the input of the ith flip-flop is $c_i$. The final
output is then
\[ x_i'' = \bar{c}_i (y_i \oplus x_i) + c_i (y_i \odot x_i) = c_i \oplus y_i \oplus x_i \tag{27} \]
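The two-step addition in a single stage can be mimicked with a toggle element. This is a behavioral sketch only: the class `TFlipFlop` and the function `accumulate_stage` are hypothetical names, and pulse timing is not modelled.

```python
class TFlipFlop:
    """Toggle flip-flop: the stored bit changes state when the input is one."""

    def __init__(self, state=0):
        self.state = state

    def pulse(self, t):
        # x' = (not t) x + t (not x), which is the exclusive OR of t and x
        self.state ^= t
        return self.state


def accumulate_stage(x_i, y_i, c_i):
    """Return (partial_sum, final_sum) for one accumulator stage."""
    ac = TFlipFlop(x_i)
    partial = ac.pulse(y_i)   # step one: the augend bit toggles the stage
    final = ac.pulse(c_i)     # step two: the incoming carry toggles it again
    return partial, final


for bits in [(1, 1, 0), (1, 0, 1), (0, 0, 1)]:
    print(bits, accumulate_stage(*bits))
```

Because each step is an exclusive OR with the stored bit, the stage ends up holding the full-adder sum without any separate sum logic.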
The accumulate operation is initiated by the application of the accumulate
pulse. The contents of each stage of the augend, or input, register are gated to
the input of the corresponding stage of the addend, or accumulator, register.
This operation is simultaneous for all stages within the limits of the variation in
propagation time between NAND gates. If the augend bit is a logic one, with the
sign bit a logic zero (positive), the accumulator flip-flop changes state. Thus
if a logic one from the augend stage is added to a logic one in the accumulator
stage, the partial sum, a logic zero, is stored in the accumulator. The
transition of the accumulator stage from a one to a zero state triggers the delay
multivibrator and generates a carry, which is routed to the succeeding stage.
If a logic one from the augend stage is added to a logic zero in the corresponding
accumulator stage, the partial sum, a logic one, is stored in the accumulator.
The transition of the accumulator stage from a zero state to a one state does not
trigger the delay multivibrator, and no carry is generated. At this point the
accumulator contains the results of the operation $x_i \oplus y_i$. The carry generated
in the preceding stage is then added to the contents of each stage of the
accumulator, forming another partial sum and possibly generating other carries. The
delay time of the multivibrator is such that all partial sums are formed before
the carries are generated. A timing diagram for the operation of one stage of
the accumulator is shown in Figure 17. The AC registers trigger when there is
a change from the one to the zero level. Therefore, the trailing edge of the
accumulate pulse is used as the zero-time reference. At the end of the
accumulate operation, the augend in the input register has been added to the addend in
the accumulator register and the sum has been stored in the accumulator
register.
Typical delay times of the elements used in the accumulator are given
in Table 2 [13]. The DM pulse width is set at 100 ns to assure operation of all
circuits to which it is applied. It can be seen from the values in this table and
by referring to Figure 16 that a maximum delay time of 120 ns is required from
the end of the accumulate pulse to the formation of partial sums in the
accumulator register. The time required to form a carry and transmit it to the
succeeding register is a 30-ns delay for the multivibrator, a 100-ns pulse width
of the multivibrator, and a 30-ns delay for the input gate, yielding 160 ns. The
total time for the generation and transmission of the first carry is then 280 ns.
The time for the generation and transmission of each secondary carry (generated
by the assimilation of a carry) would be 220 ns, since the delay time of the
complementation gate and one input gate would then be omitted. The worst case
carry propagation time for a 30-stage accumulator of this design would then be
280 ns for the generation of a primary carry in the first stage, 6160 ns (28 x 220)
for transmission through the intermediate stages, and 60 ns for assimilation in
the 30th stage, yielding 6.50 µs.
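The worst-case figure can be reproduced from the delays quoted in the paragraph above. The constant and function names below are hypothetical; the numbers are those stated in the text, and the generalization to other word lengths assumes the same per-stage delays apply.

```python
PARTIAL_SUM_NS = 120             # end of accumulate pulse to partial sums
CARRY_FORM_NS = 30 + 100 + 30    # DM delay + DM pulse width + input gate
PRIMARY_CARRY_NS = PARTIAL_SUM_NS + CARRY_FORM_NS   # first carry
SECONDARY_CARRY_NS = 220         # each carry produced by assimilating a carry
ASSIMILATE_NS = 60               # assimilation in the final stage


def worst_case_ns(stages):
    """Carry generated in the first stage and assimilated in the last."""
    intermediate = stages - 2
    return PRIMARY_CARRY_NS + intermediate * SECONDARY_CARRY_NS + ASSIMILATE_NS


print(PRIMARY_CARRY_NS)      # 280
print(worst_case_ns(30))     # 6500 ns, i.e. 6.50 microseconds
```

Nearly all of the time is spent in the intermediate stages, so the per-stage secondary carry delay is the quantity that the fast carry techniques must reduce.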
The relative cost of the circuit elements used in the accumulator is given
in Table 3 [13], using a 2-input NAND gate as a basis. Inverters will be
considered as 2-input NAND gates with the inputs common. A considerable variation
in cost is found, but these figures should be realistic enough for comparative
evaluation of designs.
The cost figure for the portion of this accumulator dealing with carry
propagation is then 630.
Figure 17. Timing diagram for carry generation and assimilation.
TABLE 2. LOGIC ELEMENT DELAY TIMES
TABLE 3. RELATIVE COST OF LOGIC ELEMENTS
Logic Element
2-input NAND gate
3-input NAND gate
4-input NAND gate
6-input NAND gate
12-input NAND gate
Flip-flop
Delay multivibrator
RAPID ACCUMULATOR REALIZATIONS AND ANALYSIS
The methods previously reviewed were investigated with regard to their
application to the accumulator. This accumulator utilizes pulse logic for the
input and carry signals. NAND gates were used in the design of combinational
circuits because of their availability in integrated circuit form and their use in
the basic accumulator.
The techniques presented by Majerski and Wiweger [11, 12] offer faster
propagation times than those given in the following sections when used with
adders of their design, but were not directly applicable to this accumulator
because of its use of pulse logic.
Both one- and two-layer carry-skip methods were investigated. It was
found that two-layer skips offered no appreciable speed advantage over
one-layer skips for this accumulator.
The results of the investigation are presented in the following sections.
Gated Carry
It is seen from Table 1 that the carry out of a stage depends solely upon
the carry in when the summands are unlike. The function representing unlike
summands is
\[ x_i \oplus y_i = x_i \bar{y}_i + \bar{x}_i y_i \]
The carry out of a stage depends only on the summands when they are both
alike. The function representing like summands is
\[ x_i \odot y_i = x_i y_i + \bar{x}_i \bar{y}_i \]
In the accumulator previously described, the above two functions are available
from the ith-stage flip-flop of the accumulator at the true and complement
outputs, respectively. The carry from the ith stage is generated when the
complement output goes to the logic-one level. The functions necessary for
applying the gated-carry technique presented are available. The assimilation
of a carry in a stage for which the function $x_i \oplus y_i$ is true will produce another
carry, which poses no problem with level logic. In the accumulator, extraneous
pulses would be inserted into the carry propagation line. The suppression of
these extraneous pulses must be considered in applying the gated-carry
technique.
Let $r_i$ be the carry inserted into the carry propagation line by the ith
stage, and let
\[ c_i = r_{i-1} \tag{28} \]
Let $g_{i-1}$ be the carry pulse generated by the condition $x_{i-1} y_{i-1}$, which results
when both summands of the (i-1)th stage are at the logic-one level. The function
for $c_i$ must then be
\[ c_i = e\, g_{i-1} + (x_{i-1} \oplus y_{i-1})\, c_{i-1} \tag{29} \]
where e is a carry-generation enable pulse. This pulse must be present for
the generation of a carry in step one of the addition process, and absent to
prevent the generation of an extraneous carry when a carry is assimilated.
A circuit to realize this $c_i$ function is shown in Figure 18. The equation
for $c_i$, written from the figure, is
\[ c_i = \overline{\overline{e\, g_{i-1}} \cdot \overline{(x_{i-1} \oplus y_{i-1})\, c_{i-1}}} \]
Figure 18. Accumulator with gated carry.
Applying DeMorgan's theorem,
\[ c_i = e\, g_{i-1} + (x_{i-1} \oplus y_{i-1})\, c_{i-1} \]
which is the desired result.
The accumulation step is initiated with the accumulate pulse. If $x_i$
and $y_i$ are logic ones, the AC complement output changes state from a zero to
a one, triggering the DM and generating a carry pulse. The carry pulse enters
the carry propagation line through the gate labeled 4 in Figure 18. If $x_i$ and
$y_i$ are both at the logic-zero level, the complement output of the AC is at the
logic-one level and does not change state, and no carry is generated. This is
the situation in which the carry out is determined solely by the summands. The
true output of the AC is then at the logic-zero level and attains this level 120 ns
before a carry is generated in the preceding stages. This level is then applied
to the gate labeled 3 in Figure 18, blocking the transmission of any carry
generated in the preceding stages. A carry entering the ith stage is routed
through gates 5 and 6 to the AC input, causing a change in state. The
assimilation of the carry then removes the transmission block, but does so 120 ns
after the carry pulse has ceased to be present at gate 3.
If $x_i$ and $y_i$ are different, $x_i \oplus y_i$ is true, and the true output of the
AC takes on the logic-one level. If $x_i$ is one and $y_i$ is zero, there is no AC
state change. If $y_i$ is one and $x_i$ is zero, the AC would change state from a
complement logic-one level to a true logic-one level. In either case, no carry
would be generated. The resulting situation is that the carry out is determined
solely by the carry in. The true level is applied to gate 3, enabling the
transmission of an incoming carry. A carry into the stage is then assimilated through
gates 5 and 6, causing a state change of the AC. This removes the gating level
to gate 3, 120 ns after the carry is transmitted. No new carry is generated by
the stage because of the removal of the carry-generation enabling level, e,
from gate 2. The enabling level, e, is generated by a DM, which is actuated
by the accumulate pulse. The DM pulse width is set at 370 ns to gate all
primary carries generated and to block all secondary carries generated.
A timing diagram for one stage of the accumulator is shown in Figure
19. Assuming the necessary conditions, the time required to insert a carry
into the propagation line after the end of the accumulate pulse is a 30-ns delay
each for the complementing gate and gate 6, a 60-ns delay for the state change
of the AC, a 60-ns delay for the DM, a 100-ns delay for the DM pulse width,
a 30-ns delay for gate 2, and 30 ns for the insertion-gate delay, or a total of
340 ns. There is a delay of 30 ns per gate per stage, or 60 ns per stage through
the carry propagation line. It requires 120 ns to assimilate a carry in a stage.
The equation for determining the total accumulation time for two binary numbers
is
\[ A = (460 + 60 n')\ \text{ns} \tag{30} \]
where n' is the number of stages through which a carry must be propagated
before assimilation. The total time required to generate a carry in the first
stage of a 30-stage accumulator, propagate it through the entire length, and
assimilate it in the last stage is
\[ A = [460 + 60(28)]\ \text{ns} = 2.14\ \mu\text{s} \]
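The accumulation-time formula follows directly from the individual delays listed above, as the following sketch shows. The dictionary keys and function name are hypothetical labels; the delay values are those quoted in the text.

```python
INSERTION_DELAYS_NS = {
    "complementing gate": 30,
    "gate 6": 30,
    "AC state change": 60,
    "DM delay": 60,
    "DM pulse width": 100,
    "gate 2": 30,
    "insertion gate": 30,
}
INSERTION_NS = sum(INSERTION_DELAYS_NS.values())   # carry enters the line
ASSIMILATION_NS = 120                              # carry absorbed by a stage
PER_STAGE_NS = 2 * 30                              # two gates per stage


def accumulation_time_ns(stages_propagated):
    """A = 460 + 60 n', with n' the number of stages the carry passes through."""
    return INSERTION_NS + ASSIMILATION_NS + PER_STAGE_NS * stages_propagated


print(INSERTION_NS)                 # 340
print(accumulation_time_ns(28))     # 2140 ns, i.e. 2.14 microseconds
```

The fixed 460 ns is paid once per accumulation, so for long words the 60 ns per stage dominates.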
The equipment that must be added to the accumulator previously described
to facilitate carry gating consists of four 2-input NAND gates per stage and one
DM unit to generate the carry-generation enable pulse, e. The cost figure for
the portion of this accumulator concerned with carry propagation is then 756.
The gated-carry design gives a speed increase of 6.50 µs/2.14 µs, or about
3.04 to 1, over the simple iterative accumulator. The cost increase is 756/630,
or 20 percent.
The gated-carry method produces an appreciable decrease in the maxi- 
mum carry-propagation time. To operate the accumulator on a fixed-time basis, 
maximum delays with an added safety factor must be considered, and the accum- 
ulation rate must always allow time for the maximum length of carry propagation. 
[Figure 19. Timing diagram for one stage of the accumulator; time axis in ns.]
As pointed out previously, the average carry propagation length is
$\log_2 n$. This value is 4.91 for an n of 30, so more rapid accumulation could
be accomplished using a variable rate, with the rate controlled by the completion
of carry propagation. Let $r_i$, $i = 1, 2, \ldots, 30$, be the carry pulse inserted
into the carry-propagation line by the $i$th stage. Carry propagation is
complete when a pulse is not present at any of the insertion points. The required
condition for carry completion, D, is then

$$D = \bar{r}_1 \cdot \bar{r}_2 \cdots \bar{r}_{30}$$
A carry-completion-detection circuit that could be used with the gated- 
carry accumulator is shown in Figure 20. From the figure, the carry-com- 
pletion signal is a positive level described by the equation 
Applying DeMorgan's theorem gives 
where d is the detection-enable level. This level is generated from the
accumulate pulse by means of a DM and a T flip-flop. The delay of the DM
is set to ensure that any carry being propagated will be present at the output
of the gates labeled 5 before the detection system becomes operative. This
prevents the generation of a false completion signal. The completion detection
signal then resets the flip-flop in preparation for the next accumulate pulse,
which will also be controlled by D. The timing diagram for this operation is
shown in Figure 21. The overlap of carry signal at adjoining sensing points
along the carry propagation line is shown in Figure 22 and is seen to be 40 ns.
[Figure 20. Carry-completion-detection circuit.]

[Figure 21. Timing diagram for carry-completion detection; time axis in ns.]

[Figure 22. Continuity of carry signal at sensing gates. Traces, top to bottom:
carry generated; carry out of gate 2_i; carry out of gate 4_i; carry out of
gate 5_i; "d" generated; carry out of gate 3_(i+1); carry out of gate 4_(i+1);
carry out of gate 5_(i+1). Time axis: 0 to 300 ns.]

The timing diagram for the generation and assimilation of a carry is
shown in Figure 23. The maximum delay times of circuit elements were used
in defining this diagram. It is seen that 340 ns are required to insert a carry
into the carry propagation line. The time required to assimilate a carry
after it has arrived at a stage is 120 ns. The equation for accumulation time
is

$$A = (460 + 60\,n')\ \text{ns} \tag{32}$$

where $n'$ is the number of stages through which a carry must be propagated
before it is assimilated. The time required to generate a carry and propagate
it through the entire length of a 30-stage accumulator is

$$A = [460 + 60(28)]\ \text{ns} = 2.14\ \mu\text{s}$$
[Figure 23. Timing diagram for the generation and assimilation of a carry;
time axis in ns.]
The carry propagation complete signal is generated 210 ns after the
last carry has ceased to be present at a sensing point, as shown in Figure 24.
The final accumulator-register transition is completed 90 ns from that time.
The difference, 120 ns, must be added to the 2.14 μs to yield the total maximum
accumulation time of 2.26 μs. For a one-stage propagation, $x_{i+1}$ would
change state 460 ns after the end of the accumulate pulse. The total time for
the accumulation step in this case is 460 ns + 120 ns = 580 ns. If no carries
are generated in an accumulation, the limiting factors on speed are the
generation of the detection enable signal, d, and its propagation through the
completion detection circuit. The minimum accumulation time is
300 ns + 210 ns = 510 ns.
For the theoretical average propagation length of 4.91 (or 5) stages,
the accumulation time is

      340 ns     for carry generation and insertion into carry propagation line
    + 5 x 60 ns  for propagation through 5 stages
    + 30 ns      for the delay of gate 5
    + 210 ns     for completion circuit delay
    = 880 ns

Compared with the accumulation time of 6.50 μs, which must always be allowed
in the accumulator previously described, this is an increase in speed of 7.39
to 1 for the addition of a large number of summands with random bit distribution.
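The three timing cases for the carry-completion design (maximum, minimum, and average) follow from the same handful of delays. The sketch below only restates the arithmetic of the preceding paragraphs; the constant and function names are illustrative and do not appear in the report.

```python
# Timing constants quoted in the text, in nanoseconds (names are illustrative).
INSERT_NS = 340           # carry generation and insertion into the carry line
ASSIMILATE_NS = 120       # assimilation of a carry in a stage
PER_STAGE_NS = 60         # propagation per stage
GATE5_NS = 30             # delay of sensing gate 5
COMPLETION_NS = 210       # completion-circuit delay after the last carry
REGISTER_SETTLE_NS = 90   # final register transition after the last carry
ENABLE_NS = 300           # generation of the detection-enable level d


def max_time_ns(stages: int = 28) -> int:
    """Equation (32) plus the 120 ns by which completion lags the register."""
    propagate = INSERT_NS + ASSIMILATE_NS + PER_STAGE_NS * stages
    return propagate + (COMPLETION_NS - REGISTER_SETTLE_NS)


def min_time_ns() -> int:
    """No carries generated: enable level plus completion-circuit delay."""
    return ENABLE_NS + COMPLETION_NS


def average_time_ns(stages: int = 5) -> int:
    """Average case for a propagation length of about log2(30) stages."""
    return INSERT_NS + PER_STAGE_NS * stages + GATE5_NS + COMPLETION_NS


if __name__ == "__main__":
    print(max_time_ns(), min_time_ns(), average_time_ns())
    print(round(6500 / average_time_ns(), 2))  # speed ratio vs. 6.50 microseconds
```

The average of 880 ns against the fixed 6.50 μs of the simple iterative accumulator gives the 7.39-to-1 ratio stated in the text.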
The circuit elements that must be added to the gated-carry accumulator
to implement the carry-completion detection feature are three 11-input NAND
gates, one 3-input NAND gate, four 2-input NAND gates, two flip-flops, and
one delay multivibrator. From Table 3 it can be seen that the cost figure will
increase by 62, yielding a total of 818. This is a 29.8-percent cost increase
over that of the simple iterative accumulator.
[Figure 24. Timing diagram for generation of the carry-completion signal;
time axis in ns.]
The accumulator under investigation uses ones complements representation
of negative numbers with end-around carry, so a skip distribution using
equal size groups is considered [6]. The accumulator of n bits is divided
into k groups of m bits each so that

$$mk = n \tag{33}$$

The greatest carry propagation time results when the conditions for
the summands are

$$x_1 \cdot y_1, \tag{34}$$

$$x_i \oplus y_i, \quad i = 2, 3, \ldots, n-1, \tag{35}$$

and

$$\bar{x}_{30} \cdot \bar{y}_{30}. \tag{36}$$

The time required for propagation is then

$$\begin{aligned}
\eta &= [1 + (m-1) + (k-2) + (m-1)]\ \text{t.u.} \\
     &= [2m + k - 3]\ \text{t.u.} \\
     &= (2m + n/m - 3)\ \text{t.u.}
\end{aligned} \tag{37}$$
where t.u. is the propagation time of two NAND gates [6]. Differentiating
equation (37) with respect to m and equating to zero yields

$$2 - n/m^2 = 0,$$

$$m^2 = n/2,$$

$$m = \sqrt{n/2}, \tag{38}$$

or

$$m^2 = km/2,$$

$$k = 2m \tag{39}$$

for the minimum propagation time. For n = 30, the approximations m = 5
and k = 6 are used. The group containing the lower-order bits is called
group 1.
The maximum propagation time for a 30-bit accumulator is

$$(2 \cdot 5 + 6 - 3)\ \text{t.u.} = 13\ \text{t.u.} = 13 \times 60\ \text{ns} = 780\ \text{ns}.$$

The time required to generate a carry was 340 ns, and the time to
assimilate a carry after it arrived at a stage was 120 ns. The maximum
accumulation time for a 30-stage accumulator using one-layer carry-skips
is then

$$A = (460 + 780)\ \text{ns} = 1.24\ \mu\text{s}.$$
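The group-size choice can be checked numerically against equation (37). The sketch below is illustrative only: the function names are hypothetical, and restricting m to divisors of n is an assumption that reflects the equal-size groups of equation (33).

```python
TU_NS = 60               # one t.u. = two NAND-gate delays of 30 ns each
OVERHEAD_NS = 340 + 120  # carry insertion plus assimilation


def skip_time_units(n: int, m: int) -> int:
    """Equation (37): propagation time in t.u. for k = n/m groups of m bits."""
    k = n // m
    return 2 * m + k - 3


def best_group_sizes(n: int):
    """Return the minimum of (37) over divisors m of n, and the m achieving it."""
    divisors = [m for m in range(1, n + 1) if n % m == 0]
    best = min(skip_time_units(n, m) for m in divisors)
    return best, [m for m in divisors if skip_time_units(n, m) == best]


def max_accumulation_ns(n: int, m: int) -> int:
    """Maximum accumulation time with one-layer skips of group size m."""
    return OVERHEAD_NS + TU_NS * skip_time_units(n, m)


if __name__ == "__main__":
    print(best_group_sizes(30))
    print(max_accumulation_ns(30, 5), "ns")
```

For n = 30 the unconstrained optimum of equation (38) is m = sqrt(15), about 3.9, which does not divide 30. Under formula (37) the divisors m = 3 and m = 5 both give 13 t.u.; the text adopts m = 5, k = 6.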
Equations (11) through (16) will be rewritten here using notation for a
five-stage group. Let $r_{k,i}$ be the carry signal produced by the $i$th stage
of the $k$th group, where $y_{k,i}$ and $x_{k,i}$ are the summands of the $i$th
stage of the $k$th group and $c_{k,i}$ is the carry into that stage, and

$$c_{k,i} = r_{k,i-1}.$$

Let $t_{k,i}$ be the condition for the transmission of a carry through a stage.
This signal is present at the true output of the $k,i$th-stage accumulator
register 110 ns after the accumulate pulse has ended (Fig. 23). Let

$$g_{k,i} = x_{k,i} \cdot y_{k,i} \tag{43}$$

be the condition for the generation of a carry in the $k,i$th stage. This signal
is generated by the $k,i$ DM when the complement output of the $k,i$th
accumulator register becomes a one. Figure 23 shows that $g_{k,i}$ is generated
280 ns after the end of the accumulate pulse. The carry produced by the $k$th
group is then

$$\begin{aligned}
r_k ={} & g_{k,5} + t_{k,5} \cdot g_{k,4} + t_{k,5} \cdot t_{k,4} \cdot g_{k,3} \\
        & + t_{k,5} \cdot t_{k,4} \cdot t_{k,3} \cdot g_{k,2} \\
        & + t_{k,5} \cdot t_{k,4} \cdot t_{k,3} \cdot t_{k,2} \cdot g_{k,1} \\
        & + t_{k,5} \cdot t_{k,4} \cdot t_{k,3} \cdot t_{k,2} \cdot t_{k,1} \cdot c_k
\end{aligned} \tag{44}$$
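Equation (44) is satisfactory for carry levels, but with carry pulses
100 ns in width, Figure 25 shows that for $t_1 \cdot t_2 \cdot t_3 \cdot t_4 \cdot t_5$
true, each $c_k$ pulse into a group will result in two $r_k$ pulses out of that
group. For pulse carries, equation (44) must then be changed to

$$\begin{aligned}
r_k ={} & \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \cdot
          (g_{k,5} + t_{k,5} \cdot g_{k,4} + t_{k,5} \cdot t_{k,4} \cdot g_{k,3} \\
        & \quad + t_{k,5} \cdot t_{k,4} \cdot t_{k,3} \cdot g_{k,2}
          + t_{k,5} \cdot t_{k,4} \cdot t_{k,3} \cdot t_{k,2} \cdot g_{k,1}) \\
        & + (t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5}) \cdot c_k \\
    ={} & \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \cdot g_{k,5} \\
        & + \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \cdot
          (t_{k,5} \cdot g_{k,4} + t_{k,5} \cdot t_{k,4} \cdot g_{k,3}
          + t_{k,5} \cdot t_{k,4} \cdot t_{k,3} \cdot g_{k,2} \\
        & \quad + t_{k,5} \cdot t_{k,4} \cdot t_{k,3} \cdot t_{k,2} \cdot g_{k,1}) \\
        & + t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5} \cdot c_k
\end{aligned} \tag{45}$$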
[Figure 25. Carry pulses through a five-stage group; time axis 0 to 800 ns.]

The carries that must be produced by the stages within the group are described
by the following equations:

$$r_{k,1} = c_k \cdot t_{k,1} + g_{k,1} \tag{46}$$

$$\begin{aligned}
r_{k,2} &= r_{k,1} \cdot t_{k,2} + g_{k,2} \\
        &= c_k \cdot t_{k,1} \cdot t_{k,2} + g_{k,1} \cdot t_{k,2} + g_{k,2}
\end{aligned} \tag{47}$$

$$r_{k,3} = r_{k,2} \cdot t_{k,3} + g_{k,3} \tag{48}$$

$$\begin{aligned}
r_{k,4} &= r_{k,3} \cdot t_{k,4} + g_{k,4} \\
        &= c_k \cdot t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4}
           + g_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \\
        &\quad + g_{k,2} \cdot t_{k,3} \cdot t_{k,4} + g_{k,3} \cdot t_{k,4} + g_{k,4}
\end{aligned} \tag{49}$$

$$r_{k,5} = r_k \tag{50}$$
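The expanded form in equation (49) is just the stage-by-stage recurrence of equations (46) through (48) unrolled. A brute-force check over all input combinations confirms this. The sketch below is illustrative; the function names are not from the report, and signals are modeled as static Boolean levels rather than pulses.

```python
from itertools import product


def ripple(c_k, t, g):
    """Stage carries r_{k,1}..r_{k,4} from the recurrence r_i = r_{i-1} t_i + g_i."""
    r = c_k
    carries = []
    for t_i, g_i in zip(t, g):
        r = (r and t_i) or g_i
        carries.append(r)
    return carries


def r4_expanded(c_k, t, g):
    """Sum-of-products form of r_{k,4} as written in equation (49)."""
    t1, t2, t3, t4 = t
    g1, g2, g3, g4 = g
    return ((c_k and t1 and t2 and t3 and t4)
            or (g1 and t2 and t3 and t4)
            or (g2 and t3 and t4)
            or (g3 and t4)
            or g4)


def recurrence_matches_expansion():
    """Compare both forms for every combination of c_k, t_1..t_4, g_1..g_4."""
    for bits in product([False, True], repeat=9):
        c_k, t, g = bits[0], bits[1:5], bits[5:9]
        if ripple(c_k, t, g)[3] != r4_expanded(c_k, t, g):
            return False
    return True


if __name__ == "__main__":
    print(recurrence_matches_expansion())
```

The check treats every t and g as independent inputs, so the agreement does not rely on any relation between the transmission and generation conditions.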
A reduced block diagram of the five-stage group using NAND gates
to produce the combinational functions in equations (46) through (50) is given
in Figure 26. To show that this circuit does satisfy the required conditions,
DeMorgan's theorem is applied to equations written from the diagram. The
carry-generation enable level, e, is a necessary condition for $g_{k,i}$, as
explained in the section on gated carry, but will be omitted from the equations
as it does not directly affect the argument.

[Figure 26. Reduced block diagram of the five-stage carry-skip group.]

$$r_{k,1} = \overline{\overline{c_k \cdot t_{k,1}} \cdot \overline{g_{k,1}}}
          = c_k \cdot t_{k,1} + g_{k,1}$$

$$r_{k,2} = \overline{\overline{r_{k,1} \cdot t_{k,2}} \cdot \overline{g_{k,2}}}
          = r_{k,1} \cdot t_{k,2} + g_{k,2}$$

$$r_{k,3} = \overline{\overline{r_{k,2} \cdot t_{k,3}} \cdot \overline{g_{k,3}}}
          = r_{k,2} \cdot t_{k,3} + g_{k,3}$$

$$\begin{aligned}
r_{k,4} &= \overline{\overline{r_{k,3} \cdot t_{k,4}} \cdot \overline{g_{k,4}}}
         = r_{k,3} \cdot t_{k,4} + g_{k,4} \\
        &= c_k \cdot t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4}
           + g_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \\
        &\quad + g_{k,2} \cdot t_{k,3} \cdot t_{k,4} + g_{k,3} \cdot t_{k,4} + g_{k,4}
\end{aligned}$$

It is seen that these equations are in agreement with equations (46) through
(49).
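The carry produced by the fifth stage of the group is

$$\begin{aligned}
r_{k,5} = r_k
  ={} & \overline{\,\overline{g_{k,5} \cdot \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})}}
        \cdot \overline{r_{k,4} \cdot t_{k,5} \cdot \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})}}
        \cdot \overline{c_k \cdot (t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})}\,} \\
  ={} & g_{k,5} \cdot \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \\
      & + r_{k,4} \cdot t_{k,5} \cdot \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \\
      & + c_k \cdot (t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})
\end{aligned}$$

It must be shown that

$$\begin{aligned}
r_{k,4} \cdot t_{k,5} \cdot \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})}
  ={} & \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \cdot
        (g_{k,4} \cdot t_{k,5} + g_{k,3} \cdot t_{k,4} \cdot t_{k,5} \\
      & + g_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5}
        + g_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})
\end{aligned}$$

We see that

$$\begin{aligned}
t_{k,5} \cdot r_{k,4} ={} & c_k \cdot (t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})
        + g_{k,4} \cdot t_{k,5} + g_{k,3} \cdot t_{k,4} \cdot t_{k,5} \\
      & + g_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5}
        + g_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5}
\end{aligned}$$

Then

$$\begin{aligned}
\overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \cdot t_{k,5} \cdot r_{k,4}
  ={} & c_k \cdot \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})}
        \cdot (t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5}) \\
      & + \overline{(t_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5})} \cdot
        (g_{k,4} \cdot t_{k,5} + g_{k,3} \cdot t_{k,4} \cdot t_{k,5} \\
      & \quad + g_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5}
        + g_{k,1} \cdot t_{k,2} \cdot t_{k,3} \cdot t_{k,4} \cdot t_{k,5}),
\end{aligned}$$

in which the first term on the right is identically zero, which is the
required result.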
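The algebraic argument above can also be confirmed exhaustively. The sketch below assumes the fifth-stage output is a three-term NAND-NAND network as read from the (partly illegible) diagram equations, and compares it with equation (45) for all 2^11 input combinations. Function names are illustrative, and signals are treated as static levels.

```python
from itertools import product


def nand(*inputs):
    """Multi-input NAND gate."""
    return not all(inputs)


def r4_ripple(c_k, t, g):
    """Carry out of the fourth stage, by the recurrence r_i = r_{i-1} t_i + g_i."""
    r = c_k
    for i in range(4):
        r = (r and t[i]) or g[i]
    return r


def r5_from_nands(c_k, t, g):
    """Assumed NAND realization: g5 gated, r4*t5 gated, and the skip term c_k*T."""
    all_t = all(t)  # T = t1 t2 t3 t4 t5
    return nand(
        nand(g[4], not all_t),
        nand(r4_ripple(c_k, t, g), t[4], not all_t),
        nand(c_k, all_t),
    )


def r_k_equation_45(c_k, t, g):
    """Group carry for pulse operation, equation (45)."""
    t1, t2, t3, t4, t5 = t
    g1, g2, g3, g4, g5 = g
    lookahead = (g5 or (t5 and g4) or (t5 and t4 and g3)
                 or (t5 and t4 and t3 and g2)
                 or (t5 and t4 and t3 and t2 and g1))
    return ((not all(t)) and lookahead) or (all(t) and c_k)


def forms_agree():
    """True if the NAND network equals equation (45) for every input."""
    for bits in product([False, True], repeat=11):
        c_k, t, g = bits[0], bits[1:6], bits[6:11]
        if r5_from_nands(c_k, t, g) != r_k_equation_45(c_k, t, g):
            return False
    return True


if __name__ == "__main__":
    print(forms_agree())
```

The key step is the same as in the text: the rippled term carries a copy of c_k multiplied by all five t's, and gating it with the complement of that product removes the duplicate so that c_k reaches the group output only through the skip path.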
It is necessary that the proper gating levels be at gates 2, 3, and 13
of Figure 26 before a carry pulse reaches them. Figure 27 shows that the
blocking level will be present 50 ns before $g_{k,5}$ reaches gate 2 and 90 ns
before $r_{k,4}$ reaches gate 3. The gating level will be present 80 ns before
$c_k$ reaches gate 13.

The 30-stage accumulator is made up of six groups of five stages
with $c_k = r_{k-1}$, except that $c_1 = r_6$, the end-around carry. This is
shown in Figure 28.

The accumulator using carry skips yields a speed advantage of
6.50 μs/1.24 μs, or 5.24/1, over the accumulator previously described.

The components that must be added to the gated-carry circuit to
facilitate carry skips are one 6-input NAND gate, one 2-input NAND gate, and
one inverter for each group. In every fifth stage, two 2-input NAND gates must
be replaced with 3-input NAND gates. The cost figure for the 30-stage
accumulator with carry skips is then 816. This is an increase of 29.5 percent
over the cost of the simple iterative accumulator.
[Figure 27. Timing diagram for the gating levels at gates 2, 3, and 13 of
Figure 26; time axis in ns.]

[Figure 28. The 30-stage accumulator with carry skips.]
The relative cost figures and accumulation times for the simple
iterative accumulator and accumulators using fast carry techniques are
presented in Table 4. It is seen that all three realizations investigated offer
a considerable decrease in accumulation time without too great a difference in
cost figure.

TABLE 4. ACCUMULATOR COST AND TIME REQUIREMENTS

                                                 Max Accum    Avg Accum
Accumulator Design                 Cost Figure   Time (μs)    Time (μs)

Simple iterative accumulator           630         6.50         6.50
Accumulator with gated carry           756         2.14         2.14
Accumulator with carry-
  completion detection                 818         2.26         0.88
Accumulator with carry skips           816         1.24         1.24
The comparison may be formalized by the use of a criterion similar
to that used by Burla [5] for parallel adders. The criterion is

$$Q = \frac{t}{t_0} \cdot \frac{F}{F_0} \times 100$$

where
t  = accumulation time of the circuit under consideration
t0 = accumulation time of the iterative accumulator
F  = cost figure of the network under consideration
F0 = cost figure of the iterative accumulator.
The results of applying this criterion are shown in Table 5. Using
this criterion, the circuit for carry-completion detection appears best.
However, the control circuitry for it is more complicated than for the others
considered. The average accumulation time was calculated using carry propagation
lengths resulting from the accumulation of numbers with random bit distribution.
If this distribution is not random, the accumulation time for the
carry-completion detection circuit could be greater than that for the carry-skip
circuit.
TABLE 5. COMPARISON OF ACCUMULATOR DESIGNS

Accumulator Design               t (μs)        t/t0 x 100    F     F/F0    Q

Simple iterative accumulator     6.50             100       630    1      100
Accumulator with gated carry     2.14              33       756    1.2     40
Accumulator with carry-
  completion detection           0.880 (Avg)       14       818    1.3     18
Accumulator with carry skips     1.24              19       816    1.3     25
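The Q column of Table 5 can be recomputed from the times and cost figures. The sketch below assumes the criterion is the product of the time ratio and the cost ratio, scaled by 100, which is the form that reproduces the tabulated values; the function and dictionary names are illustrative.

```python
# (accumulation time in microseconds, cost figure) for each design.
DESIGNS = {
    "simple iterative": (6.50, 630),
    "gated carry": (2.14, 756),
    "carry-completion detection (avg)": (0.88, 818),
    "carry skips": (1.24, 816),
}


def merit(t, cost, t0=6.50, cost0=630):
    """Assumed criterion: Q = 100 * (t / t0) * (F / F0); lower is better."""
    return 100 * (t / t0) * (cost / cost0)


if __name__ == "__main__":
    for name, (t, cost) in DESIGNS.items():
        print(f"{name:34s} Q = {merit(t, cost):5.1f}")
```

Rounded to whole numbers, the results are 100, 40, 18, and 25, in agreement with the table. A lower Q indicates a better time-cost product.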
In some applications the gated-carry circuit might be desirable. If
the accumulation time for it is acceptable, it has the advantage of having fewer
components.

The carry-skip circuit offers the fastest accumulation time for
fixed-time operation. In general, the carry-completion detection circuit would
be recommended for application when a variable accumulation rate is acceptable;
the carry-skip circuit would be recommended when a fixed accumulation rate
is desired.
APPENDIX
A discussion of Majerski's methods for determining optimal and
economical skip distributions in an adder with a maximum possible number of
stages for a given maximum allowable propagation time will require the
introduction of several definitions and notational conventions. One-layer skip
distribution occurs when adder stages are included in carry skips, and every
stage is included at most in one skip.
Two-layer skip distribution occurs when adder stages are included in 
two adder skips, and every stage is included at most in two skips. If two skips 
include common stages, all stages contained in the shorter skip are included in 
the longer skip. The longer skip is called first-layer skip, and the shorter one 
is called second-layer skip. If none of the stages in a skip are contained in 
another skip, the skip is treated as a first-layer skip. 
An adder with one-layer skip distribution is divided into groups, each 
of which includes all stages in a skip circuit or a single stage not contained in a 
skip. 
An adder with two-layer skip distribution is divided into sections, 
which include all stages composing one first-layer skip or a single stage not 
included in skips. A section is divided into groups. A group includes all stages 
in a second-layer skip or a single stage in a second-layer skip. Sections and 
groups are numbered successively, starting with the least significant bits. 
The notation used is:

n = Number of adder stages

m = Number of groups in an adder with one-layer skip distribution

$k_i$ (i = 1, 2, ..., m) = Number of stages of the $i$th group of an adder
with one-layer skip distribution

p = Number of skips in an adder with one-layer skip distribution

M = Number of sections in an adder with two-layer skip distribution

$K_i$ (i = 1, 2, ..., M) = Number of stages in the $i$th section of an adder
with two-layer skip distribution

P = Number of first-layer skips in an adder with two-layer skip distribution

$m_i$ (i = 1, 2, ..., M) = Number of groups in the $i$th section of an adder

$k_{ij}$ (i = 1, 2, ..., M; j = 1, 2, ..., $m_i$) = Number of stages in the
$j$th group of the $i$th section of an adder

$p_i$ (i = 1, 2, ..., M) = Number of second-layer skips in the $i$th section
of an adder

d = Maximum propagation time of a NOR gate

T = Maximum carry propagation time (mcpt); the greatest carry propagation
time for all possible combinations of adder input variables.
The optimal skip distribution in an n-stage adder is defined as
occurring when, for a given n, T is minimal and the number of skips is
minimal. The economical skip distribution is defined as occurring when, for a
given n and T, the number of skips is minimal.
The maximum carry propagation time (mcpt) in the NOR gate adder with
one-layer skip distribution is given as:

$$T_1 = \max_{\alpha,\beta}\,[(k_\alpha - 2) + (\beta - \alpha - 1) + k_\beta]$$

$$T_2 = (m - 1) + \max_{\alpha}\,(k_\alpha - 1) \tag{A-2}$$

and where the $\alpha$, $\beta$ indexes of groups satisfy the conditions
$1 \le \alpha \le m$, $\alpha < \beta < m + \alpha$, and $k_{m+i} = k_i$.

The terms $(k_\alpha - 2)$, $k_\beta$, and maximum $(k_\alpha - 1)$ denote
maximum times of carry propagation through positions of the groups in which the
carry propagation begins and ends. The -2 in the term $(k_\alpha - 2)$ is due to
the carry propagation time beginning with the change of the carry control signal
rather than at the change of states of the adder inputs.
For the optimal skip distribution in an $n^{(T)}$-position adder, the
maximum number of adder positions is the function of mcpt, T ≥ 5, given by
the formulas
I ,  T =(y)2 
= 2E “3 
where v is defined by equation (A-4).  T 
For odd values of T for which T/8 does not produce a remainder
equal to 3, the optimal skip distribution is the one in which the numbers of
stages in groups are k and 1 alternately. For this case, the following formulas
apply:
T+l+qFj ] - l  s , 
(A-6)

$$k = \frac{T + 3}{4}, \tag{A-7}$$

$$m = T - 2k + 5 \quad (m \ldots \text{even number}) \tag{A-8}$$

For a 36-stage adder, the values given by these formulas are a minimal value
of T = 17 and a skip distribution of stages as 5, 1, 5, 1, 5, 1, 5, 1, 5, 1,
5, 1.
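The 36-stage example can be reproduced from formulas (A-7) and (A-8). The sketch below is illustrative and deliberately narrow: the function name is hypothetical, and it refuses values of T for which (T + 3)/4 is not an integer, because the rounding convention of the original formulas is not recoverable from this copy.

```python
def alternating_distribution(T: int):
    """Group sizes k, 1, k, 1, ... from (A-7) and (A-8) for a given odd mcpt T.

    Only handles T for which (T + 3) / 4 is an integer (an assumption made
    here, since the source's rounding rule is not legible).
    """
    if T % 2 == 0 or T % 8 == 3:
        raise ValueError("formulas apply to odd T with T mod 8 != 3")
    if (T + 3) % 4 != 0:
        raise ValueError("(T + 3)/4 is not an integer for this T")
    k = (T + 3) // 4          # (A-7)
    m = T - 2 * k + 5         # (A-8), number of groups (even)
    groups = [k, 1] * (m // 2)
    return k, m, groups


if __name__ == "__main__":
    k, m, groups = alternating_distribution(17)
    print(k, m, groups, sum(groups))
```

For T = 17 this gives k = 5, m = 12, and twelve groups totaling 36 stages, matching the distribution quoted above.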
For the values of T such that T/8 gives a remainder of 3, the
optimal skip distribution is the one with the numbers of stages in successive
fours of groups equaling k-2, 1, k, 1. In this case, the following formulas
are used:

$$k = \frac{T + 7}{4}, \tag{A-10}$$

$$m = T - 2k + 7 \quad (m \ldots \text{number divisible by 4}). \tag{A-11}$$
To determine an economical skip distribution, the distributions for
two adders are computed and the one with the smaller number of skips is
chosen.

The first adder, with k and 1 stages in groups alternately, is
determined by formulas (A-4), (A-6), (A-7), and (A-8), with n replacing
$n^{(T)}$ in (A-6). An adder with (m/2)(k + 1) stages is obtained. If
(m/2)(k + 1) is greater than n, (m/2)(k + 1) - n stages are canceled. This
should be done in pairs in groups and preferably in such a way as to cancel
one skip.

For the adder with k-2, 1, k, 1 positions in successive fours of
groups, formulas (A-4), (A-5), (A-9), (A-10), and (A-11), with n replacing
$n^{(T)}$ in (A-10), apply. The number of stages obtained is

$$m'(k - 1) + m''(k + 1),$$
where 
(A-12) 
The adder with the smaller number of skips is then the one with economical 
skip distribution. 
The mcpt for an adder with two-layer skip distribution is given by

$$T = \max\,(T_1, T_2, T_3, T_4), \tag{A-13}$$

where
(A-16) 
(A-17) 
(A-18) 
1 T, = (M - I)  + maximum 9 , @ , P  
(A-19) 
The methods presented here apply exclusively to the NOR gate adders
with an even number of stages, in which T always takes odd values.

In an adder composed of an even number of M sections, having K and
1 stages alternately, with identical skip distributions in multistage sections,
T may be expressed as:

$$T = t + M - 3 \quad (t \text{ an even number}), \tag{A-21}$$

where t is a function of the number of stages and of the skip distribution in a
section. Formula (A-21) defines t.

Optimal skip distribution in a section occurs when, for a given t, the
number $K^{(t)}$ of positions in a section is a maximum and, with maximum
$K^{(t)}$, the number of skips in the section is a minimum. The parameters for
such a distribution are given by the following formulas:
if t <  16 (A-23) 
(t) for j = 1,2,* . . ,113. P (A-24) 
(A-25) 
take exclusively odd values. The parameters m , k. 
3 
Majerski found for T = 13, t = 8, $t_i$ = 8, 1, 8, 1, 8, 1, 8, 1,
$n^{(T)}$ = 32, and for T = 15, t = 10, $t_i$ = 12, 1, 8, 1, 12, 1, 8, 1,
$n^{(T)}$ = 34.
The optimal skip distribution in an $n^{(T)}$-position adder with the
maximum number of positions for a given T ≥ 9 may be determined as follows:

$$\frac{2T - 6}{3} \le t \le \frac{2T + 9}{3} \tag{A-26}$$

$$M = T - t + 3 \tag{A-27}$$
For t = 6 and t = 8, 12, 16, ..., the adder is made up of sections
of $K^{(t)}$ and 1 stages alternately. The number of adder stages is
(A-28)
For t = 10, 14, 18, ..., successive adder sections are taken according
to the parameters given by the terms of sequences

t+2, 1, t-2, 1, t+2, 1, t-2, 1, ..., t+2, 1, t-2, 1        (A-29)

if M is a number divisible by 4 and

t+2, 1, t-2, 1, t+2, 1, t-2, 1, ..., t+2, 1, t-2, 1, t, 1  (A-30)

if M is a number not divisible by 4.
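The two sequences are easy to generate mechanically. The sketch below is illustrative: the function name is hypothetical, and it assumes that for M not divisible by 4 the repeating four-term pattern is followed by a final pair t, 1, as in the second sequence above. The returned values are the per-section parameters, not stage counts.

```python
def section_parameters(t: int, M: int):
    """Per-section parameters per sequences (A-29)/(A-30) for t = 10, 14, 18, ...

    M is the (even) number of sections; every second entry is a one-stage section.
    """
    if M % 2 != 0:
        raise ValueError("M must be even")
    pattern = [t + 2, 1, t - 2, 1]
    if M % 4 == 0:
        return pattern * (M // 4)                 # (A-29)
    return pattern * ((M - 2) // 4) + [t, 1]      # (A-30)


if __name__ == "__main__":
    T, t = 15, 10
    M = T - t + 3  # (A-27)
    print(M, section_parameters(t, M))
```

With T = 15 and t = 10, formula (A-27) gives M = 8 and the sequence 12, 1, 8, 1, 12, 1, 8, 1, which agrees with the example attributed to Majerski.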
The 1 terms in formulas (A-29) and (A-30) correspond to one-stage
sections. A cyclic change of adder stages is permitted. The number of adder
stages is then

From among the skip distributions for all t obtained from formula
(A-26), the one with maximum $n^{(T)}$ stages and with minimum
number of skips is chosen.
For an economical skip distribution for an n-stage adder with t ≥ 9,
the skip distribution is determined for two adders with

In the first adder, accept $K^{(t)}$ and 1-stage sections alternately. The
number of adder sections and the number of adder stages are given in formulas
(A-27) and (A-28).

In the second adder (only if t ≥ 8), accept the same number of adder
sections, but for successive sections use instead of t the successive terms
of the sequences

t+2, 1, t-2, 1, ..., t+2, 1, t-2, 1                         (A-32)

if M is a number divisible by 4,

or

t+2, 1, t-2, 1, t+2, 1, t-2, 1, ..., t+2, 1, t-2, 1, t, 1   (A-33)

if M is a number not divisible by 4 and t = 10, 14, 18, ...

or

t-2, 1, t+2, 1, t-2, 1, t+2, 1, ..., t-2, 1, t+2, 1, t, 1

if M is a number not divisible by 4 and t = 8, 12, 16, ...

Such skip distributions are determined for at least three successive
integers t and for further t until both numbers n' and n'' are smaller than
n. Then for all skip distributions in the adders for which n' > n or n'' > n,
cancel n' - n or n'' - n stages to cancel the greatest number of skips. From
among the obtained skip distributions of n-position adders, the one with
the minimum number of skips is chosen.
REFERENCES

1. Carroll, C. C.; et al.: On the Realization of a Generalized Second-Order
   Digital Compensator. Technical Report Nr 9, NAS8-11274, Auburn
   Research Foundation, Auburn, Ala., September 1967.

2. Carroll, C. C.; and Nagle, H. T.: A Special-Purpose Computer
   Realization of a Third-Order Digital Filter for the PIGA Control Loop.
   Technical Report Nr 9, NAS8-20163, Auburn Research Foundation,
   Auburn, Ala., May 1968.

3. Sklansky, J.: An Evaluation of Several Two-Summand Binary Adders.
   IRE Transactions on Electronic Computers, vol. EC-9, June 1960,
   pp. 213-226.

4. MacSorley, O. L.: High-Speed Arithmetic in Binary Computers.
   Proceedings of IRE, vol. 49, January 1961, pp. 67-91.

5. Lehman, M.; and Burla, N.: Skip Techniques for High-Speed Carry-
   Propagation in Binary Arithmetic Units. IRE Transactions on
   Electronic Computers, vol. EC-10, December 1961, pp. 691-698.

6. Gilchrist, B.; Pomerene, J. H.; and Wong, S. Y.: Fast-Carry Logic
   for Digital Computers. IRE Transactions on Electronic Computers,
   vol. EC-4, December 1955, pp. 133-136.

7. Chu, Yaohan: Digital Computer Design Fundamentals. McGraw-Hill,
   Inc. (New York), 1962.

8. Burks, A. W.; Goldstine, H. H.; and Neumann, J. Von: Preliminary
   Discussion of the Logical Design of an Electronic Computing Instrument.
   The Institute for Advanced Study (Princeton, N. J.), 1947.

9. Reitwiesner, G. W.: The Determination of Carry Propagation Length
   for Binary Addition. IRE Transactions on Electronic Computers, vol.
   EC-9, March 1960, pp. 35-38.

10. Hendrickson, H. C.: Fast High-Accuracy Binary Parallel Addition.
    IRE Transactions on Electronic Computers, vol. EC-9, December 1960,
    pp. 465-469.

11. Majerski, S.; and Wiweger, M.: NOR-Gate Binary Adder with Carry
    Completion Detection. IEEE Transactions on Electronic Computers,
    vol. EC-16, February 1967, pp. 90-92.

12. Majerski, S.: On Determination of Optimal Distributions of Carry
    Skips in Adders. IEEE Transactions on Electronic Computers, vol.
    EC-16, no. 1, February 1967, pp. 90-92.

13. Instruction Manual, μ-Pac Integrated Circuit Modules. Honeywell
    Computer Control Division, September 1966.
BIBLIOGRAPHY

Bartee, T. C.; and Chapman, D. J.: Design of an Accumulator for a General
Purpose Computer. IEEE Transactions on Electronic Computers, vol. EC-14,
no. 4, August 1965, pp. 570-575.

McCluskey, E. J.: Introduction to the Theory of Switching Circuits.
McGraw-Hill Book Co. (New York), 1965.

Miller, R. E.: Switching Theory (Combinatorial Circuits). John Wiley and
Sons (New York), vol. I, 1965.

Morgan, C. P.; and Jarvis, D. B.: Transistor Logic Using Current Switching
and Routing Techniques and Its Application to a Fast-Carry Propagation Adder.
Proceedings IEE (London), vol. 106B, September 1959, pp. 477-478.

Prather, Ronald E.: Introduction to Switching Theory: A Mathematical
Approach. Allyn and Bacon, Inc. (Boston), 1967.

Specialist discussion meeting on new digital computer techniques - Special
Aspects of Logical Design. Proceedings IEE (London), vol. 106B,
September 1959, pp. 462-469.

Weinberger, A.; and Smith, J. L.: A Logic for High-Speed Addition. Natl.
Bur. Standards (Washington, D. C.), Circular 591, Sec. 1, February 1958.
APPROVAL                                                      TM X-53983

FAST CARRY ACCUMULATOR DESIGN
By William C. Mastin

The information in this report has been reviewed for security
classification. Review of any information concerning Department of Defense or
Atomic Energy Commission programs has been made by the MSFC Security
Classification Officer. This report, in its entirety, has been determined to be
unclassified.

This document has also been reviewed and approved for technical
accuracy.

Chief, Optical Sensors Section

P. H. BROUSSARD, JR.
Chief, Sensors Branch

Chief, Guidance and Control Division

F. B. MOORE
Director, Astrionics Laboratory
DISTRIBUTION                                                  TM X-53983

INTERNAL

DIR
DEP-T
AD-S
PM-PR-M
S&E-DIR
S&E-CSE-DIR
   Dr. Haeussermann
S&E-ASTR-DIR
   Mr. Moore
S&E-ASTR-A
   Mr. Hosenthien
   Miss Flowers
S&E-ASTR-G
   Mr. Mandel
   Dr. Dome
   Mr. Wood
   Mr. Broussard
   Mr. Walls
   Mr. Jones
   Mr. Doran
   Mr. Mastin (3)
   Mr. Carter
   Mr. Turner
   Mr. Richards
S&E-ASTIA-S
   Mr. Wojtalik
S&E-ASTR-C
   Mr. Swearingen
   Mr. White
S&E-ASTR-ZX
A&TS-MS-IL (8)
A&TS-MS-IP (2)
A&TS-MS-H
A&TS-PAT
   Mr. Wofford
A&TS-TU (15)
   Mr. Window

EXTERNAL

Dr. C. C. Carrol (2)
Dept. of Electrical Engineering
Auburn University
Auburn, Alabama

Scientific and Technical Information Facility (2)
P. O. Box 33
College Park, Maryland 20740
Attn: NASA Representative (S-AK/RKT)

NASA-MSFC
