Evaluating A+B=K conditions in constant time by Cortadella, Jordi & Llaberia Griñó, José M.
EVALUATING “A+B = K” CONDITIONS IN CONSTANT TIME 
J. Cortadella, J.M. Llaberia 
Dept. d'Arquitectura de Computadors, Facultat d'Informàtica
Universitat Politècnica de Catalunya
Pau Gargallo, 5 08028 Barcelona (Spain) 
ABSTRACT
One of the most important components of an ALU is the
adder. Its response time is mainly determined by the carry
propagation delay. Conditions between two numbers are
usually evaluated in the ALU by means of a subtraction. In
this paper we deal with a type of condition that can be
evaluated without requiring a complete ALU operation. The
circuit presented detects the condition A+B=K (n-bit
numbers) in constant time, avoiding the carry propagation
delay. Some applications of this circuit are also presented.

INTRODUCTION
The carry computation is one major problem in the
response time of parallel adders. Several approaches have
been proposed in order to reduce it [1][2][3]. VLSI
techniques have also been used to minimize design costs and
chip area. The fastest adders, such as lookahead adders,
perform additions of n-bit numbers in time O(log n) and area
O(n log n).

This paper deals with a problem associated with parallel
adders. A circuit for detecting when the addition of two n-bit
numbers is equal to another n-bit number (A+B=K) is
presented. We will prove that the result of this evaluation can
be computed in constant time and area O(n), avoiding the
problem of the carry propagation delay.

The paper is organized as follows. First, the theoretical
basis of the problem is presented. Next, the design of the
circuit is described. Finally, some applications for this circuit
are discussed.

THEORETICAL BASIS
Given three n-bit vectors $A = a_n a_{n-1} \dots a_1$,
$B = b_n b_{n-1} \dots b_1$ and $K = k_n k_{n-1} \dots k_1$ that
represent two's complement integers, we want to design a
circuit that can evaluate the condition A+B=K (arithmetic
addition) in constant time. This means that the evaluation
time does not depend on the length (number of bits) of the
vectors.

The basic idea of the circuit is the local evaluation of the
condition at each bit position i, assuming that the condition is
fulfilled in the rest of the bits. The last stage computes the
global logical OR of all the local evaluations.

The behavior of a full adder can be described with the
following expressions:

$p_i = a_i \oplus b_i$   (Carry propagation)
$g_i = a_i \land b_i$   (Carry generation)
$c_i = (p_i \land c_{i-1}) \lor g_i$   (Carry. We define $c_0 = 0$)
$r_i = p_i \oplus c_{i-1}$   (Addition result)

The condition "A+B=K" could be detected by comparing
$r_i$ and $k_i$ at each bit position i and performing a global OR
of all the comparisons. In this case the response time would be
determined by the carry propagation time, since $c_i$ is defined
as a function of $c_{i-1}$. In order to avoid the carry
propagation problem, we define the following expressions:

$q_i = (p_i \land \overline{k_i}) \lor g_i$   (Predicted carry. We define $q_0 = 0$)
$s_i = p_i \oplus q_{i-1}$   (Predicted addition result)
$z_i = s_i \oplus k_i$   ($z_i = 0 \Leftrightarrow s_i = k_i$)
This work was supported by the Ministry of Education of Spain
(CAICYT) under contract number 314-85.
ISCAS’88 
CH2458-8/88/0000-0243$1.00 © 1988 IEEE
© 1988 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing 
this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work 
in other works. DOI 10.1109/ISCAS.1988.14912
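The predicted carry ($q_i$) substitutes for the carry in the
computation of the predicted result. We can easily prove that
$q_i = c_i$ whenever $r_i = k_i$. When $p_i = 1$, the definition
$r_i = p_i \oplus c_{i-1}$ gives $\overline{r_i} = c_{i-1}$, so:

$q_i = (p_i \land \overline{k_i}) \lor g_i = (p_i \land \overline{r_i}) \lor g_i = (p_i \land c_{i-1}) \lor g_i = c_i$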
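The identity above can be checked bit by bit. The following is a minimal Python sketch, not the authors' circuit: the helper names `bits`, `carries` and `predicted_matches_true` are hypothetical, and list index 0 stands for the paper's bit position 1 (the least significant bit).

```python
def bits(x, n):
    """Return [x_1, ..., x_n], with x_1 the least significant bit."""
    return [(x >> j) & 1 for j in range(n)]


def carries(a, b, k, n):
    """Return per-bit lists (c, r, q) for n-bit operands a, b and target k."""
    c_prev = 0                            # c_0 = 0
    c, r, q = [], [], []
    for ai, bi, ki in zip(bits(a, n), bits(b, n), bits(k, n)):
        p = ai ^ bi                       # carry propagation p_i
        g = ai & bi                       # carry generation g_i
        r.append(p ^ c_prev)              # addition result r_i
        c_prev = (p & c_prev) | g         # true carry c_i, needs c_{i-1}
        c.append(c_prev)
        q.append((p & (1 - ki)) | g)      # predicted carry q_i, local inputs only
    return c, r, q


def predicted_matches_true(a, b, k, n):
    """Check q_i == c_i at every position where r_i == k_i."""
    c, r, q = carries(a, b, k, n)
    return all(qi == ci
               for ci, ri, qi, ki in zip(c, r, q, bits(k, n))
               if ri == ki)
```

For example, with the 4-bit operands 3 + 5 = 8 every result bit matches K, so the predicted carries coincide with the true carries at all four positions.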
The design of the circuit is based on the following
theorem:

Theorem. Let $Z_n = z_1 \lor z_2 \lor \dots \lor z_n$. Then
$Z_n = 0 \Leftrightarrow r_i = k_i$ for each $i \in \{1, \dots, n\}$
Proof. By induction on n.

First, we will prove that the theorem holds for n = 1.

$Z_1 = z_1 = s_1 \oplus k_1 = p_1 \oplus k_1 = r_1 \oplus k_1$   (since $c_0 = q_0 = 0$)

thus

$Z_1 = 0 \Leftrightarrow r_1 = k_1$

Next, we will prove that the theorem holds for n > 1. Let us
assume the theorem holds for n-1 bits.

If $Z_{n-1} = 1$ then $Z_n = Z_{n-1} \lor z_n = 1$, and the theorem
holds for n bits, since there is $i \in \{1, \dots, n-1\} \subset \{1, \dots, n\}$
such that $r_i \ne k_i$ (by induction hypothesis).

If $Z_{n-1} = 0$ then $Z_n = z_n$. By the definition of $z_n$ we
have that

$z_n = s_n \oplus k_n = p_n \oplus q_{n-1} \oplus k_n$

Since $Z_{n-1} = 0$, then $r_{n-1} = k_{n-1}$ and $q_{n-1} = c_{n-1}$
(by induction hypothesis). Therefore,

$z_n = p_n \oplus c_{n-1} \oplus k_n = r_n \oplus k_n$

and

$z_n = 0 \Leftrightarrow r_n = k_n$

Since $r_i = k_i$ for each $i \in \{1, \dots, n-1\}$, we have that

$Z_n = z_n = 0 \Leftrightarrow r_n = k_n \Leftrightarrow r_i = k_i$ for each $i \in \{1, \dots, n\}$
$\square$
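The theorem can also be checked exhaustively for small word sizes. The sketch below is an illustration under stated assumptions rather than part of the paper: `sum_equals` and `check_theorem` are hypothetical names, Python integers stand in for the bit vectors, and "A+B=K" is read as equality of the n result bits (that is, modulo 2^n, with overflow ignored).

```python
def sum_equals(a, b, k, n):
    """Return True when Z_n = 0, using only the carry-free definitions."""
    mask = (1 << n) - 1
    a, b, k = a & mask, b & mask, k & mask
    p = a ^ b                       # p_i for all bits at once
    g = a & b                       # g_i
    q = (p & ~k & mask) | g         # predicted carries q_i
    s = p ^ ((q << 1) & mask)       # s_i = p_i xor q_{i-1}, with q_0 = 0
    z = s ^ k                       # z_i
    return z == 0                   # Z_n is the OR of all z_i


def check_theorem(n):
    """Compare the detector with a real addition for every n-bit triple."""
    size = 1 << n
    return all(sum_equals(a, b, k, n) == ((a + b) % size == k)
               for a in range(size)
               for b in range(size)
               for k in range(size))
```

Calling `check_theorem(4)` compares all 4096 triples of 4-bit values. The shift by one position is the only interaction between neighbouring bits, which mirrors the fact that $s_i$ depends on $q_{i-1}$ and on nothing further down the word.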
Figure 1. Circuit design (each cell takes inputs $g_i$, $p_i$ and $k_i$)
CIRCUIT DESIGN 
From the previous section we can observe that function
$q_i$ (predicted carry) does not depend on any information
from the other stages, so the functions $z_i$ can be computed in
parallel. Figure 1 depicts a diagram of the circuit that
computes $Z_n$. We assume that each gate and the $Z_n$ line
discharge (n-input NOR) take constant time, so the
computation of $Z_n$ also takes constant time. Since the circuit
is made of n identical cells, the chip area is O(n).
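The locality argument can be made concrete with a small software model. This is a sketch under assumptions, not a transcription of Figure 1: the function names are hypothetical, and each modeled cell recomputes $q_{i-1}$ from the neighbouring inputs, whereas a real cell would presumably receive it on a wire from the adjacent cell.

```python
def bit(x, j):
    """Bit j of x (j = 0 is the least significant); 0 below bit 0."""
    return (x >> j) & 1 if j >= 0 else 0


def cell(a_i, b_i, k_i, a_prev, b_prev, k_prev):
    """Compute z_i from the inputs of positions i and i-1 only."""
    p_i = a_i ^ b_i
    p_prev = a_prev ^ b_prev
    g_prev = a_prev & b_prev
    q_prev = (p_prev & (1 - k_prev)) | g_prev   # predicted carry q_{i-1}
    return p_i ^ q_prev ^ k_i                   # z_i = s_i xor k_i


def detect(a, b, k, n):
    """Model n identical cells followed by an n-input NOR."""
    z = [cell(bit(a, j), bit(b, j), bit(k, j),
              bit(a, j - 1), bit(b, j - 1), bit(k, j - 1))
         for j in range(n)]
    return int(not any(z))                      # 1 when A + B = K
```

No cell reads anything beyond its immediate neighbour, so the list comprehension could be evaluated in any order. Feeding zeros below bit 0 yields $q_0 = 0$, as the paper defines.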
The circuit design can be simplified if any of the
operands is constant. Figure 2 shows a diagram of the circuit
when vector K is constant.
Figure 2. Circuit design with K constant
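One way to see why a constant K simplifies the cell is to substitute each possible value of $k_i$ into the definitions of $q_i$ and $z_i$. The derivation below follows from the paper's expressions. It is not a description of the drawn circuit, which is not recoverable from the text.

```latex
\begin{aligned}
k_i = 0:&\quad q_i = p_i \lor g_i = a_i \lor b_i, &\quad z_i &= s_i \\
k_i = 1:&\quad q_i = g_i = a_i \land b_i,         &\quad z_i &= \overline{s_i}
\end{aligned}
```

Under this reading, each bit position needs one of only two fixed cell types, selected by the corresponding bit of K.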
SOME APPLICATIONS 
The condition evaluation is one of the most important 
operations performed in the execution of conditional 
branches. In most architectures, the condition is evaluated as a 
function of the condition codes. Their value depends on the 
result of ALU operations. Figure 3 depicts a widely used 
ALU structure [4]. It consists of an Operand Modifier Unit 
(OMU) and an adder. The OMU computes the carry 
propagation and generation functions (pi and gi) depending on 
the data input (ai and bi) and the operation. 
Figure 3. ALU structure
The dependencies produced by the execution of branches
have been extensively studied by many authors [5][6]. In
pipelined processors, the condition evaluation has to be
delayed until all the instructions that modify the condition
codes and precede the branch have finished their execution.
The condition evaluation is the most important dependency
that restricts the execution of branches with zero delay [7].
Katevenis observed that 80% of conditional branches
involve tests for equality, inequality and any relation with zero
(fast comparisons) [8]. Equality and inequality tests are
determined by the zero condition code (Z), usually after a
subtraction operation (comparison). Relations with zero are
determined by Z and the sign bit of the operand that is tested.
Figure 4. Evaluation of condition code Z
By designing a circuit that detects the condition A+B=0, the
computation of Z can be advanced and the delay produced by
the condition evaluation reduced (see figure 4). This
improvement can increase processor performance
substantially, since about one in every six instructions is a
branch that has to evaluate this kind of condition.
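For the special case K = 0 the predicted carry reduces to $q_i = a_i \lor b_i$ and $z_i = s_i$, so the zero condition can be written very compactly. The sketch below is a hypothetical illustration of that special case in Python (`sum_is_zero` is not a name from the paper). It takes the two adder inputs directly, whereas in Figure 4 the detector would presumably be fed by the Operand Modifier Unit.

```python
def sum_is_zero(a, b, n):
    """Return True when the n-bit sum a + b is zero, without adding."""
    mask = (1 << n) - 1
    a, b = a & mask, b & mask
    p = a ^ b                       # p_i
    q = a | b                       # q_i when every k_i is 0
    z = p ^ ((q << 1) & mask)       # z_i = s_i = p_i xor q_{i-1}
    return z == 0
```

For instance, `sum_is_zero(5, -5, 32)` holds because the 32-bit two's complement pattern of -5 added to 5 wraps around to zero.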
Another application can be found in DO-like loops. Loops
of this kind are widely used in numerical programs (figure
5.a). The compiler generates code similar to that shown
in figure 5.b. As Katevenis also observed, some comparisons
can be converted to fast comparisons [6]. This is the case of
"I ≤ N", which can be converted to "I ≠ N+1" without
modifying the loop behavior. By considering an ALU
structure such as the one shown in figure 6, and introducing a
new instruction in the machine language (NEXT), the compiler
can generate code similar to that in figure 5.c. The
instruction NEXT increments Ri and compares the new value
with Rlimit. If Ri ≠ Rlimit, control is transferred to
the branch target address. Again, the improvement is based on
the fact that the condition can be evaluated before computing
the new value of Ri, avoiding pipeline delays in the execution
of conditional branches.

(a)              (b)                    (c)
do i = 1, n      Ri ← 1                 Ri ← 1
                 do:                    Rlimit ← n + 1
                                        do:
end do           Ri ← Ri + 1
                 if Ri ≤ n goto do      NEXT Ri, do

Figure 5. Do-like loops
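The behavior described for NEXT can be simulated in software. This is a sketch under assumptions, not the authors' hardware: `next_branch_taken` and `run_loop` are hypothetical names, the 32-bit default width is an arbitrary choice, and the condition Ri + 1 = Rlimit is evaluated with the carry-free expressions from the theoretical basis (A = Ri, B = 1, K = Rlimit) using the old value of Ri.

```python
def next_branch_taken(ri, rlimit, n=32):
    """Return True when Ri + 1 != Rlimit, i.e. NEXT jumps back to the loop."""
    mask = (1 << n) - 1
    a, b, k = ri & mask, 1, rlimit & mask
    p = a ^ b
    g = a & b
    q = (p & ~k & mask) | g             # predicted carries
    z = (p ^ ((q << 1) & mask)) ^ k     # z_i for all bits
    return z != 0                       # Z = 1 means the sum differs from K


def run_loop(iterations, n=32):
    """Count the body executions of 'do i = 1, iterations' (iterations >= 1)."""
    mask = (1 << n) - 1
    ri, rlimit, count = 1, iterations + 1, 0
    while True:
        count += 1                                  # loop body
        taken = next_branch_taken(ri, rlimit, n)    # decided from the old Ri
        ri = (ri + 1) & mask                        # the increment itself
        if not taken:
            break
    return count
```

In this model the branch decision is available before the incremented value of Ri exists, which is the property the text relies on to avoid pipeline delays.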
Figure 6. ALU structure for DO-like loops
CONCLUSIONS 
In this paper we have presented a circuit that detects the
condition "A+B=K" in constant time. Its area is proportional
to the number of bits of the vectors. The theoretical basis and
several design approaches have been described.
This circuit can be used to detect a wide spectrum of 
conditions in branch instructions. It can improve the 
processor performance by advancing the evaluation of 
conditions and eliminating the pipeline delays produced by 
these operations. 
REFERENCES
[1] K. Hwang, "Computer Arithmetic: Principles,
Architecture and Design", John Wiley & Sons, 1979.
[2] R.P. Brent and H.T. Kung, "A Regular Layout for
Parallel Adders", IEEE Transactions on Computers, Vol.
C-31, No. 3, March 1982, pp. 260-264.
[3] M. Lehman and N. Burla, "Skip Techniques for
High-Speed Carry-Propagation in Binary Arithmetic
Units", IRE Transactions on Electronic Computers, Dec.
1961, p. 691.
[4] M. Pomper et al., "A 32-bit Execution Unit in an
Advanced nMOS Technology", IEEE Journal of Solid
State Circuits, Vol. SC-17, No. 3, June 1982, pp.
533-538.
[5] E.M. Riseman and C.C. Foster, "The Inhibition of
Potential Parallelism by Conditional Jumps", IEEE
Transactions on Computers, Vol. C-21, No. 12, Dec.
1972, pp. 1405-1411.
[6] S. McFarling and J.L. Hennessy, "Reducing the Cost
of Branches", Proc. 13th Annual Symposium on
Computer Architecture, June 1986, pp. 396-403.
[7] D.R. Ditzel and H.R. McLellan, "Branch Folding in the
CRISP Microprocessor: Reducing Branch Delay to
Zero", 14th Ann. Int. Symp. on Computer Architecture,
June 1987.
[8] M.G.H. Katevenis, "Reduced Instruction Set Computer
Architectures for VLSI", Ph.D. dissertation, University
of California, Berkeley, October 1983.
