Cellular logic array for computation of squares by Whitaker, S. et al.
3rd NASA Symposium on VLSI Design 1991
N94-18343
Cellular Logic Array for
Computation of Squares¹
M. Shamanna, S. Whitaker and J. Canaris
NASA Space Engineering Research Center
for VLSI System Design
University of Idaho
Moscow, Idaho 83843
Abstract- A cellular logic array is described for squaring binary numbers. This
array offers a significant increase in speed, with a relatively small hardware
overhead. This improvement is a result of a novel implementation of the formula
$(x \pm y)^2 = x^2 + y^2 \pm 2xy$. These results can also be incorporated in the existing
arrays, achieving considerable hardware reduction.
1 Introduction
The advent of VLSI has spurred a renewed interest in the development of specialized
arithmetic circuits. Special arithmetic functions like squares and square-roots are generally
implemented in software. However, when a machine is designed for a specific application
in which squaring is a frequent operation, a hardware implementation may prove
advantageous in terms of speed. Most of the approaches reported in the literature for
squaring and square-rooting use array multipliers or special purpose arrays which perform
a multitude of other operations in addition to squaring. As a result, there are very few
arrays which are solely devoted to the extraction of squares. However, Dean [1] has reported
such a dedicated array, which is probably among the fastest squaring circuits known thus
far. In addition, Dean's array uses considerably less hardware than other arrays reported
so far. Hence Dean's array has been selected as the obvious choice for comparison with the
array proposed in this paper. The proposed array provides a significant gain in speed, with
a very small hardware overhead, as compared to Dean's squarer [1].
2 Algorithm
Dean [1] has not presented a formal algorithm for his implementation. So, for clarity, the
widely used general binary squaring algorithm [3] is presented first, followed by the
proposed algorithm. The existing algorithm for binary squaring is generally formulated as
follows:
$$(1)^2 = (01)_b$$
$$(a_1 1)^2 = (a_1)^2 + (0 a_1 0 1)_b \quad \text{or}$$
¹This research was supported (or partially supported) by NASA under Space Engineering Research
Center Grant NAGW-1406.
https://ntrs.nasa.gov/search.jsp?R=19940013870 2020-06-16T18:09:54+00:00Z
$$F_2 = F_1 + (0 a_1 0 1)_b$$
where $F_1 = (01)_b$ if $a_1 = 1$ and $F_1 = (00)_b$ otherwise. Similarly, we have
$$(a_2 a_1 1)^2 = (a_2 a_1)^2 + (0 0 a_2 a_1 0 1)_b, \quad \text{or}$$
$$F_3 = F_2 + (0 0 a_2 a_1 0 1)_b$$
In general, if $a_{r+1} = 1$ then
$$F_{r+1} = F_r + D_r$$
where $F_r = (a_r a_{r-1} \ldots a_2 a_1)^2$ is the $r$th square and
$D_r = (\underbrace{0 0 \ldots 0}_{r\ \text{times}} a_r a_{r-1} \ldots a_1 0 1)_b$ is called
the $r$th radicand. It is obvious that $F_{r+1} = F_r$ if $a_{r+1} = 0$. The above iterative
formula applies for all $r = 1, 2, \ldots, n$. Figures 4 and 5 show the schematic details of a
three bit squaring array for the above algorithm [3].
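The iteration above can be sketched in a few lines of Python. This is only an illustration under one stated assumption: the bit strings are read as integers, so that appending a new least significant bit shifts the previous square left by two positions before the radicand is added (the text leaves this alignment implicit). The function name `square_by_appending_bits` is hypothetical.

```python
def square_by_appending_bits(bits):
    """Square the number whose bits are given MSB first, one bit at a time.

    Assumption (not spelled out in the text): values are integers, so the
    previous square is shifted left by two places when a bit is appended.
    """
    value = 0   # number formed by the bits consumed so far
    square = 0  # its square, F_r
    for bit in bits:
        square <<= 2                        # realign F_r for the new bit position
        if bit:
            square += (value << 2) | 0b01   # radicand: previous bits followed by 01
        value = (value << 1) | bit
    return square
```

For the bits 1, 0, 1 the loop adds the radicand (01) for the first bit, nothing for the zero bit, and (1001) for the last bit, which gives 25.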
The proposed algorithm makes use of the well known formula $(x + y)^2 = x^2 + y^2 + 2xy$.
Consider a three bit number $(a_2 2^2 + a_1 2^1 + a_0 2^0)$. The LSB-1 and LSB of the square of
any number will respectively be 0 and the LSB of the original number itself. Therefore:
$$(a_2 2^2 + a_1 2^1 + a_0 2^0)^2 = (a_2 + a_2 a_1) 2^4 + (a_2 a_0) 2^3 + (a_1 a_0 + a_1) 2^2 + a_0 2^0$$
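As a quick sanity check, the three bit expansion can be evaluated for all eight inputs. The sketch below assumes the expansion obtained by multiplying out the square with $a_i^2 = a_i$, namely coefficients $(a_2 + a_2 a_1)$, $(a_2 a_0)$, $(a_1 a_0 + a_1)$ and $a_0$ at weights $2^4$, $2^3$, $2^2$ and $2^0$; the function name `three_bit_square` is hypothetical.

```python
from itertools import product

def three_bit_square(a2, a1, a0):
    """Evaluate the expansion term by term (a coefficient may equal 2)."""
    return ((a2 + a2 * a1) << 4) + ((a2 * a0) << 3) + ((a1 * a0 + a1) << 2) + a0

for a2, a1, a0 in product((0, 1), repeat=3):
    n = 4 * a2 + 2 * a1 + a0
    sq = three_bit_square(a2, a1, a0)
    assert sq == n * n            # the expansion equals the true square
    assert sq & 1 == a0           # LSB of the square is the LSB of the input
    assert (sq >> 1) & 1 == 0     # LSB-1 of the square is always 0
```

The coefficients are arithmetic sums rather than single bits, so a value of 2 at one weight carries into the next higher position.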
The same result can also be achieved by the repeated application of the formula
$(x + y)^2 = x^2 + y^2 + 2xy$ where y is the LSB and x is the rest of the binary number.
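$$
\begin{aligned}
(\underbrace{a_2 2^1}_{x} + \underbrace{a_1 2^0}_{y})^2
  &= \underbrace{(a_2 2^1)^2}_{x^2} + \underbrace{2(a_2 a_1 2^1)}_{2xy} + \underbrace{a_1 2^0}_{y^2} \\
  &= (a_2) 2^2 + (a_2 a_1) 2^2 + a_1 2^0 \\
  &= (a_2 + a_2 a_1) 2^2 + a_1 2^0
\end{aligned}
\tag{1}
$$
Also,
$$
\begin{aligned}
(\underbrace{a_2 2^2 + a_1 2^1}_{x} + \underbrace{a_0 2^0}_{y})^2
  &= \underbrace{(a_2 2^2 + a_1 2^1)^2}_{x^2} + \underbrace{2(a_2 a_0 2^2 + a_1 a_0 2^1)}_{2xy} + \underbrace{a_0 2^0}_{y^2} \\
  &= (a_2 2^1 + a_1 2^0)^2 \, 2^2 + (a_2 a_0 2^3 + a_1 a_0 2^2) + a_0 2^0
\end{aligned}
\tag{2}
$$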
Equation 1 proves that the LSB-1 bit and the LSB of the final answer are always 0
and the LSB of the original number itself, respectively. Since multiplication by 2 implies
a left-shift by one bit position, the term $(2 a_2 a_1)$ has been shifted from the $2^1$ bit
position to the $2^2$ bit position in Equation 1. This result for a three bit binary number is
realized by the array of Figure 1. The algorithm can easily be extended to any n bit number.
The novelty of the algorithm lies in the fact that squaring of the number is carried out in
steps, coupled with the ingenious use of left-shifts in the bit positions.
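The extension to n bits can be illustrated by applying the formula recursively, peeling off the LSB as y at each step. This is a software sketch of the arithmetic only, not a model of the cell array; the function name `square_lsb_recursive` and the variable `half` (the number formed by the remaining bits, so that x = 2 * half) are introduced here for illustration.

```python
def square_lsb_recursive(n):
    """Square a non-negative integer by peeling off the LSB at each step."""
    if n == 0:
        return 0
    y = n & 1                                     # LSB of the number
    half = n >> 1                                 # remaining bits; x = 2 * half
    x_squared = square_lsb_recursive(half) << 2   # x^2 = 4 * half^2
    two_xy = (half << 2) if y else 0              # 2xy = 2 * (2 * half) * y
    return x_squared + two_xy + y                 # y^2 = y for a single bit
```

Every multiplication in the recursion is either a left-shift or a gating by the single bit y, which is the property the array exploits.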
3 Comparison
The implementation of the proposed algorithm for a 3 bit and a 4 bit number has been
illustrated in Figures 1 and 3 respectively. The proposed array is built of the basic half-
adder cell shown in Figure 2. Its function may be defined as follows:
= (w + _-1)(_v)
The symbols + and • stand for the Inclusive-Or and And operations in the above
expressions.
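For readers unfamiliar with the cell type, a textbook half-adder can be written with the same Inclusive-Or and And operations. The sketch below is the standard half-adder, not a transcription of the cell equations for Figure 2; the names `half_adder`, `a` and `b` are placeholders.

```python
def half_adder(a, b):
    """Return (sum_bit, carry_bit) for two input bits."""
    carry = a & b                    # And of the inputs
    total = (a | b) & (1 - carry)    # Inclusive-Or, masked when both inputs are 1
    return total, carry
```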
The implementation of a 3 bit squarer based on Dean's algorithm is also illustrated in
Figures 6 and 7. The basic cell (Figure 7) has two control inputs A and B. The inputs
on the lines C and D are added in the cell, S being the sum out and P being the carry
out. When both A and B are present, a further digit is added to the sum (and carry), so
that the cell then functions as a full-adder [1].
It can be seen that the proposed array has $1 + \sum_{i=3} i$ cells, whereas Dean's array [1]
uses $1 + \sum_{i=1} i$ cells, resulting in an overhead of $(n - 2)$ cells. However, the
hardware inside the proposed basic cell is much simpler, as it utilizes only half-adders,
compared to full-adders in Dean's array. So the increase in the number of cells is offset by
the reduction in the complexity of the individual cell. This leads to the authors' contention
that the hardware overhead, which translates into increased chip area, is almost negligible.
Moreover, the propagation time through the proposed array is only $n\tau$, as compared to
$(2n - 3)\tau$, which is the delay through Dean's array. The hardware overhead-speed gain
relation follows the square law for most specialized arithmetic arrays. Here, an increase in
speed has been accomplished with a linear increase in hardware.
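The two delay figures can be tabulated to show how the gap grows with word length. The sketch reads them as n and (2n - 3) multiples of a unit delay, assumed here to be the delay of one cell; the function name `delays_in_units` is hypothetical.

```python
def delays_in_units(n):
    """Return (proposed, dean) propagation delays in multiples of the unit delay."""
    return n, 2 * n - 3

for n in (3, 4, 8, 16):
    proposed, dean = delays_in_units(n)
    print(f"n={n:2d}  proposed={proposed:2d}  Dean={dean:2d}  saving={dean - proposed:2d}")
```

Under this reading the two arrays are equally fast at n = 3, and the saving of (n - 3) unit delays grows linearly with the word length.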
The proposed array has a number of unused inputs which can be used to add in
another number, so that the array would function as a full squarer (all outputs in 1 state).
A specialized array of this sort has a number of applications, including the generation of
binary logarithms [2], which depends on iterative squaring.
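To show why logarithm generation leans on a fast squarer, the following sketch uses the standard digit-by-digit method: squaring a value in [1, 2) doubles its logarithm, and each fraction bit is read off by checking whether the result has reached 2. This is the textbook procedure, not necessarily the scheme of [2]; the function name `log2_by_squaring` and the parameter `fraction_bits` are assumptions made for illustration.

```python
def log2_by_squaring(x, fraction_bits=16):
    """Approximate log2(x) for 1 <= x < 2 using one squaring per result bit."""
    if not 1.0 <= x < 2.0:
        raise ValueError("x must lie in [1, 2)")
    result = 0.0
    weight = 0.5                 # weight of the fraction bit being resolved
    for _ in range(fraction_bits):
        x *= x                   # squaring doubles the logarithm
        if x >= 2.0:             # the current fraction bit is 1
            result += weight
            x /= 2.0             # renormalize back into [1, 2)
        weight /= 2.0
    return result
```

Each result bit costs one squaring, so the squarer's delay directly sets the throughput of such a scheme.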
4 Conclusions
A new cellular array for extraction of squares of binary numbers has been presented. A
squaring algorithm based on the formula $(x + y)^2$ has been described. The proposed array
provides impressive speed gains compared to the existing arrays at the expense of negligible
hardware overhead. It is hoped that the algorithm discussed in this paper will provide
fresh insights to reduce the redundant hardware present in most of the existing squaring
arrays.
References
[1] K. J. Dean, Cellular Logical Array for Obtaining the Square of a Binary Number,
Electronics Letters, Vol. 5, Aug. 1969, pp. 370-371.
[2] K. J. Dean, A fresh approach to logarithmic computation, Electron. Engng., 41, April
1969, pp. 488-490.
[3] K. Hwang, Computer Arithmetic: Principles, Architecture and Design, John Wiley
and Sons, 1979.
Figure 1: Proposed squaring array for three bit numbers
Figure 2: Basic cell used in the proposed squaring array
Figure 3: Proposed squaring array for four bit numbers
Figure 4: A three bit squaring array using the general algorithm
Figure 5: Basic cell used in the general three bit squaring array
Figure 6: Dean's array for three bit numbers
Figure 7: Basic cell used in Dean's array
