Binary addition and multiplication in cellular space by Katona, Endre
Cellular automata are highly parallel bitprocessors, so they are suitable for the
bitparallel execution of distinct computational tasks. In this paper powerful
bitparallel algorithms are given for fixed point binary addition and multiplication,
taking into account the cellprocessor architecture developed by T. LEGENDI [1]. For
this architecture more than 100 cellular algorithms solving different computational
tasks have been constructed [6]. In a large cellular space a high number of cellular
adders, multipliers and other processing elements may be embedded, so that more
complex tasks, such as matrix multiplication [4] and certain data processing tasks [5],
may be computed in parallel.
1. Introduction 
A cellular automaton is a highly parallel processor, but the economical programming
of such a processor is not an easy task. If macro-cells are applied (a cell works as
a microprocessor), then the programming of the cellular structure is somewhat
easier [7], but the architecture has lower flexibility (fixed operations, fixed
word length, etc.) and in general the bitparallel execution of the operations is
impossible.
If micro-cells are applied (having at most 16 states) with variable transition
functions, then the cellprocessor has high flexibility and totally bitparallel
processing is possible. In [3], [4], [5], [6] and in this paper it is shown that a
cellprocessor consisting of micro-cells is economically programmable, and that the
speed of the cellular algorithms is independent of the word length in most cases.
The cellprocessor architecture proposed in [1] is based on the micro-cell conception
and has, from the point of view of this paper, the following characteristic
properties:
(i) The cellular space is a two-dimensional rectangular cell-matrix which is
bounded by dummy cells (the dummy cells have no transition function, but their
states can be set from the outside world). In the cellular net the von Neumann
neighbourhood is assumed.
(ii) The cells do not have a fixed transition function, but receive commands
(microinstructions) from a central control (CCPU), and an arbitrary local transition
function may be realized by the execution of a certain sequence of microinstructions.
This implies that the cellprocessor can work with an arbitrary local transition
function and, moreover, with a time-varying transition function.
(iii) The cellular space is inhomogeneous, that is, the individual cells may work
with different transition functions at the same time. To ensure this property, each
cell has an internal state, and cells having different internal states may work with
different transition functions. So, if there are n different internal states, then at
most n different transition functions may work in parallel. The internal states are
set at t = 0 and remain unchanged while the cellprocessor is working.
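Properties (i) and (iii) can be illustrated by a minimal Python sketch. It is only an illustration under stated assumptions, not the CCPU mechanism of [1]: the names `step`, `functions` and `internal` are hypothetical, the dummy cells are modelled by a single constant state, and each transition function is an ordinary Python callable selected by the internal state of the cell.

```python
from typing import Callable, Dict, List

# A transition function sees the cell's own state and its four
# von Neumann neighbours (up, right, down, left) and returns the new state.
Transition = Callable[[int, int, int, int, int], int]


def step(states: List[List[int]],
         internal: List[List[int]],
         functions: Dict[int, Transition],
         dummy: int = 0) -> List[List[int]]:
    """Apply one synchronous step to a rectangular cell matrix."""
    rows, cols = len(states), len(states[0])

    def at(r: int, c: int) -> int:
        # Assumption: every cell outside the rectangle is a dummy cell
        # holding the same externally set state.
        if 0 <= r < rows and 0 <= c < cols:
            return states[r][c]
        return dummy

    return [
        [
            # The internal state of the cell selects its transition function.
            functions[internal[r][c]](
                states[r][c], at(r - 1, c), at(r, c + 1), at(r + 1, c), at(r, c - 1)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical example: internal state 0 copies the left neighbour,
# internal state 1 keeps its own state.
functions = {
    0: lambda own, up, right, down, left: left,
    1: lambda own, up, right, down, left: own,
}
states = [[1, 0, 0, 1]]
internal = [[0, 0, 0, 1]]
next_states = step(states, internal, functions)
```

In this example the first three cells shift their contents one place to the right while the last cell holds its state, so two different transition functions work in the same step.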
The transition functions will be defined according to [2] by microconfiguration
terms. A microconfiguration term has the form:
(the state of a group of cells at time t) → (the required state of (another) group of cells at time t + 1)
Each cell on the right side also occurs on the left side, where it is marked by a
double frame for identification. Because of the inhomogeneity, a single
microconfiguration term may describe several transition functions together.
The notation $[x_k x_{k-1} \ldots x_1]$ will be used often in the text; it denotes a k-digit
binary number having the digits $x_k, x_{k-1}, \ldots, x_1$ ($x_i \in \{0, 1\}$).
2. Binary addition 
Binary addition is the most fundamental arithmetic operation. The cellular
algorithm described below is applied in many further cellular processing elements
(see the cellular multiplier in this paper, and [4], [5], [6]).
The cellular binary addition is based on the "carry save" addition algorithm.
Let $x = [x_k \ldots x_1]$, $y = [y_k \ldots y_1]$ and $z = [z_k \ldots z_1]$ be binary numbers of k digits
to be added. In the first step x and y are added in a parallel way: a (partial) sum
$s = [s_k \ldots s_1]$ and a carry vector $c = [c_k \ldots c_1]$ are computed as follows:
$[c_i s_i] := x_i + y_i$ for any i. (1)
In the second step the number z can be added to s and c by the formula
$[c'_i s'_i] := z_i + s_i + c_{i-1}$ for any i. (2)
(The sign ' serves for the distinction between the old and new values of s and c.)
If there are more numbers to be added, then they can be added to s and c also
by formula (2). The complete sum of the operands should be computed from the last
s and c in $k - 1$ steps applying the formula
$[c'_i s'_i] := s_i + c_{i-1}$ for any i. (3)
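Formulas (1), (2) and (3) can be tried out with a short Python sketch on whole bit vectors. It rests on a few assumptions that are not stated in the text: the vectors s, c and z are packed into Python integers, the missing carry $c_0$ is taken to be 0, and the function names are hypothetical. Starting from s = c = 0 and feeding the first operand through formula (2) gives the same result as formula (1) one step later.

```python
from typing import Iterable, Tuple


def carry_save_step(s: int, c: int, z: int, k: int) -> Tuple[int, int]:
    """One application of formula (2): [c'_i s'_i] := z_i + s_i + c_{i-1}."""
    mask = (1 << k) - 1
    # Shift the carries one place left so that c_{i-1} lines up with
    # position i; a carry leaving the k-th digit is dropped (overflow).
    c_in = (c << 1) & mask
    s_new = s ^ c_in ^ z                        # sum bit of three input bits
    c_new = (s & c_in) | (s & z) | (c_in & z)   # carry bit (majority)
    return s_new, c_new


def carry_save_sum(operands: Iterable[int], k: int) -> int:
    """Add k-bit operands, then resolve the carries with formula (3)."""
    s, c = 0, 0
    for z in operands:
        s, c = carry_save_step(s, c, z, k)
    for _ in range(k - 1):
        # Formula (3) is formula (2) with a zero operand.
        s, c = carry_save_step(s, c, 0, k)
    return s


first = carry_save_step(0b0101, 0, 0b0011, 4)       # formula (1) for x=5, y=3
total = carry_save_sum([0b0101, 0b0011, 0b0110], 4)  # 5 + 3 + 6
```

Here `first` holds the partial sum 0110 and the carry vector 0001 of formula (1), and `total` is 14, provided the sum fits into k digits.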
On the basis of the described parallel addition algorithm it is easy to construct
a cellular automaton for binary addition. It consists of k adder cells, each
containing a sum bit S and a carry bit C (4-state cells). A dummy cell is connected
to each adder cell as upper neighbour (Fig. 1).
[Fig. 1: a row of k adder cells (bits S and C), each with a dummy cell (bit I) as upper neighbour.]
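At t = 0 the bits S and C are 0, and the bits I contain the first number to be
added. In any further step a new number is written into the bits I, and the adder
cells work according to a transition function (its microconfiguration term is not
legible in the source) where $[C'S'] = S + C + I$. Here, as in formula (2), C is the
carry bit of the right neighbour.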
After the input of the last operand the dummy cells are set to 0, and after $k - 1$
further steps the complete sum of the operands is available in the bits S of the
cell-row. (In this way the above transition function includes the formulas (1), (2), (3).)
The addition of n numbers, each consisting of k bits, needs $n + k - 1$ steps, so
the parallel addition algorithm is economical for many operands.
Remark. To prevent overflow, for n operands a cellular adder consisting of
$k + \log_2 n$ cells should be used. If only k cells are applied, then the leftmost cell
needs a special overflow-watching transition function (inhomogeneity).
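The behaviour of the adder row, its step count and the width given in the Remark can be checked with a cell-level Python sketch. The sketch assumes that cell 0 is the rightmost cell and receives no carry, that a carry leaving the leftmost cell is simply lost (no overflow-watching cell is modelled), and that $\log_2 n$ is rounded up when n is not a power of two; `cellular_adder` and `adder_width` are hypothetical names.

```python
import math
from typing import List, Tuple


def adder_width(k: int, n: int) -> int:
    """Number of cells from the Remark, with log2(n) rounded up (assumption)."""
    return k + math.ceil(math.log2(n))


def cellular_adder(operands: List[int], cells: int) -> Tuple[int, int]:
    """Simulate the adder row; return (value of the bits S, steps used)."""
    S = [0] * cells   # sum bits, index 0 = rightmost cell
    C = [0] * cells   # carry bits
    steps = 0
    # After the last operand the dummy cells are set to 0 for cells - 1 steps.
    for word in list(operands) + [0] * (cells - 1):
        I = [(word >> i) & 1 for i in range(cells)]
        # Every cell computes [C'S'] = S + C(right neighbour) + I in parallel.
        total = [S[i] + (C[i - 1] if i > 0 else 0) + I[i] for i in range(cells)]
        S = [t & 1 for t in total]
        C = [t >> 1 for t in total]
        steps += 1
    value = sum(bit << i for i, bit in enumerate(S))
    return value, steps


operands = [13, 7, 9, 15]                     # n = 4 operands of k = 4 bits
width = adder_width(4, len(operands))         # 4 + 2 = 6 cells
value, steps = cellular_adder(operands, width)
narrow_value, _ = cellular_adder(operands, 4)
```

With 6 cells the sum 44 is obtained in 4 + 6 - 1 = 9 steps; with only 4 cells the row ends with 44 mod 16 = 12, which is why the leftmost cell would need to watch for overflow.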
The above cellular adder has many simple applications, such as the binary counter,
the computation of certain number sequences (e.g. Fibonacci numbers), vector addition,
etc. [6]; but the most important application is the binary multiplication discussed in
the next section.
3. The multiplication of two binary numbers 
The cellular multiplication algorithm is based, as usual, on the addition: the 
partial products will be generated in a special cell-row, and another cell-row under 
it works as an adder (Fig. 2). 
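[Fig. 2: the cellular multiplier, consisting of an upper cell-row (bits A and B) that generates the partial products and a lower adder row (bits S and C).]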
The partial products are generated in an overlapped manner. Between the digits
of the multiplicand $a = [a_k \ldots a_1]$ and the multiplier $b = [b_k \ldots b_1]$ zero digits are
inserted, and in this form they move step by step towards each other in the upper
cell-row (Fig. 3).
step 1   A:  a4 0  a3 0  a2 0  a1
         B:                    b4 0  b3 0  b2 0
step 2   A:     a4 0  a3 0  a2 0  a1
         B:                 b4 0  b3 0  b2 0
step 3   A:        a4 0  a3 0  a2 0  a1
         B:              b4 0  b3 0  b2 0  b1
Fig. 3
Cellular algorithm for binary multiplication in the case k = 4. (In each step the adder row lies below the row shown.)
The products of the operand digits standing in the same position are summed by
the adder (in Fig. 3, $a_1 b_4$ is summed in the first step, $a_2 b_4$ and $a_1 b_3$ in the second step).
Fig. 3 shows that in steps 1, 2, 3 and 4 the bit $b_4$ is multiplied by $a_1$, $a_2$, $a_3$ and
$a_4$, thus the partial product $[a_4 a_3 a_2 a_1] \cdot b_4$ is generated for the adder. The partial
products corresponding to $b_3$, $b_2$ and $b_1$ are computed in a similar way, and each is
created in the appropriate position.
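The two rows of the cellular multiplier have distinct transition functions, which
may be defined together as follows:
[The microconfiguration term is not legible in the source. As Fig. 3 indicates, in the upper row the bits A move one cell to the right and the bits B one cell to the left in each step; the lower row is the adder.]
where $[C'S'] = S + C + A \cdot B$.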
If k-bit numbers are multiplied, then the product has 2k bits, therefore an adder
of length 2k should be used. Thus the multiplier needs 4k 4-state cells.
If at t = 0 the configuration of Fig. 3 (step 1) is assumed, then at $t = 2k - 1$
all the partial products are generated. It is easy to see that at $t = 2k$ the rightmost
k cells of the adder have zero carry bits. Therefore k further steps are needed to
compute the complete product, so the whole multiplication process uses 3k steps.
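The timing claims can be checked with a small Python simulation of the multiplier. The sketch makes several assumptions that go beyond the text: adder cell w has the weight $2^w$ (cell 0 is the rightmost), digit $a_i$ starts over cell $(k-1) + 2(i-1)$ and moves right, digit $b_j$ starts over cell $-(k-1) + 2(j-1)$ and moves left, and digits that lie outside the 2k cells are treated as supplied by the outside world. The name `cellular_multiply` is hypothetical.

```python
from typing import List, Tuple


def cellular_multiply(a: int, b: int, k: int) -> Tuple[int, List[List[int]]]:
    """Simulate 3k steps; return (value of bits S, carry bits after each step)."""
    width = 2 * k
    S = [0] * width
    C = [0] * width
    carry_history = []
    for t in range(3 * k):
        # Upper row at time t: operand digits with zeros in between.
        A = [0] * width
        B = [0] * width
        for d in range(k):                      # d = digit index, counted from 0
            pos_a = (k - 1) + 2 * d - t         # digits of a move right
            pos_b = -(k - 1) + 2 * d + t        # digits of b move left
            if 0 <= pos_a < width:
                A[pos_a] = (a >> d) & 1
            if 0 <= pos_b < width:
                B[pos_b] = (b >> d) & 1
        # Adder row: [C'S'] = S + C(right neighbour) + A*B.
        total = [S[w] + (C[w - 1] if w > 0 else 0) + (A[w] & B[w])
                 for w in range(width)]
        S = [x & 1 for x in total]
        C = [x >> 1 for x in total]
        carry_history.append(C)
    product = sum(bit << w for w, bit in enumerate(S))
    return product, carry_history


k = 4
product, carries = cellular_multiply(0b1011, 0b1101, k)   # 11 * 13
```

Under these assumptions `product` equals 143, `carries[2 * k - 1][:k]` (the rightmost k carry bits at t = 2k) contains only zeros, and `carries[-1]` (all carry bits at t = 3k) is zero as well.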
Remark. If between the digits of a and b the digits of two further k-bit numbers
x and y are written (instead of the zeros), then the multiplier computes the expression
$a \cdot b + x \cdot y$. The cellular multiplier may be used for vector-multiplication in a similar
way [4].
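The Remark can be explored with a variant of the same kind of simulation. The text does not say exactly how the digits of x and y are interleaved, so the layout below is an assumption chosen so that the digit products meet over the correct adder cell: each digit of x travels one cell to the left of the corresponding digit of a, and each digit of y one cell to the right of the corresponding digit of b. The adder is given 2k + 1 cells and extra steps so that the larger result fits; these choices and the name `cellular_multiply_add` are likewise hypothetical.

```python
def cellular_multiply_add(a: int, b: int, x: int, y: int, k: int) -> int:
    """Simulate an interleaved multiplier that accumulates a*b + x*y."""
    width = 2 * k + 1            # assumption: one extra cell for the larger sum
    S = [0] * width
    C = [0] * width
    # 2k steps feed all digit products, width further steps clear the carries.
    for t in range(2 * k + width):
        A = [0] * width
        B = [0] * width
        for d in range(k):
            placements = (
                ((k - 1) + 2 * d - t, a, A),    # digits of a, moving right
                (k + 2 * d - t, x, A),          # digits of x, one cell further left
                (-(k - 1) + 2 * d + t, b, B),   # digits of b, moving left
                (-k + 2 * d + t, y, B),         # digits of y, one cell further right
            )
            for pos, word, row in placements:
                if 0 <= pos < width:
                    row[pos] = (word >> d) & 1
        # Same adder rule as before: [C'S'] = S + C(right neighbour) + A*B.
        total = [S[w] + (C[w - 1] if w > 0 else 0) + (A[w] & B[w])
                 for w in range(width)]
        S = [v & 1 for v in total]
        C = [v >> 1 for v in total]
    return sum(bit << w for w, bit in enumerate(S))


result = cellular_multiply_add(5, 6, 7, 3, 3)   # 5*6 + 7*3
```

With this layout a digit of a never meets a digit of y, and a digit of x never meets a digit of b, so only the two intended products are accumulated.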
4. Multiplication of more than two numbers
In this section a cellular algorithm is given to compute the product $x_1 \cdot x_2 \cdot \ldots \cdot x_n$,
where $x_i$ is a k-bit number and $0 \le x_i < 1$ holds for any i (the leftmost digit of $x_i$
has the positional value $2^{-1}$). To solve this task the cellular multiplier of section 3
will be modified: 3-bit cells (i.e. 8-state cells) will be used, where the third bits
serve as control bits (Fig. 4).
[Fig. 4: Cellular multiplier for more than two numbers. The control bits are marked by V.]
At t = 0 the number $x_1$ is stored in the bits "S" of the adder. The numbers $x_2, \ldots, x_n$
come from the outside world and go left on the bits "B". Before each number $x_i$
a control signal of value 1 is sent, which goes left on the control bits and copies the
bits "S" into the bits "A" (at the same time the adder is cleared). Thus the number $x_i$
coming from the outside world is multiplied by the product $x_1 \cdot \ldots \cdot x_{i-1}$, and the
process may be repeated as long as necessary.
According to the above principle, the transition functions of section 3 should 
be modified as follows. 
If the adder cell contains a control signal 0: 
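[The microconfiguration term is not legible in the source. It involves the bits A and B of the upper row, the bits S and C of the adder cell, and the control bit V, which is 0 in the adder cell and is taken over from the neighbouring cell.]
where $[C'S'] = S + C + A \cdot B$.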
Listing 
S T E P 0 : STEP 1 2 : 1 . 0 . 1 . . . 
. . . . 1 . 0 . 
1 0 0 1 0 0 0 0 0 1 1 0 1 1 0 0 
STEP 1 : STEP 1 3 : . 0 . 1 . . . 1 
0 1 . 0 
1 0 0 1 0 0 0 0 0 1 1 0 2 0 0 0 
STEP 2 : STEP 1 4 : 0 . 1 . . . 1 
0 . . . . . 0 . 1 . 
1 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 
< . . 
S T E P 3 : 1 STEP 1 5 : . 1 . . . 1 . 1 
0 . 0 . . . 1 . 0 . 1 
1 0 0 1 0 0 0 0 0 1 1 0 0 0 1 0 
< . . 
S T E P 4 : 1 STEP 1 6 ; 1 . . . 1 . 1 
. . . . 0 . 0 . . . 1 . 1 . 0 . 
1 0 0 1 0 0 0 0 0 1 0 0 0 0 1 1 . < . . . . . - < 
STEP 5 : 1 . 1 STEP 1 7 : . . . 1 . 1 . 1 
. . . 1 . 0 . 0 . 1 . 1 . 1 . 0 
1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 
STEP 6 : . . . . 1 . 1 . STEP 1 8 : ,. . 1 . 1 . 1 . 
. . 0 . 1 . 0 . Ó . 1 . 1 . 1 . 
1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 
S T E P 7 : . . . 1 . 1 . 0 STEP 1 9 : . 1 . 1 . 1 . 0 
. 0 . 0 . 1 . 0 . 0 . 1 . 1 . 1 
1 0 0 0 1 0 0 0 0 0 1 1 2 1 2 1 
S T E P 8 : . . 1 . 1 . 0 . S T E P 2 0 : 1 . 1 . 1 . 0 
1 . 0 . 0 . 1 . . . 0 . 1 . 1 
0 0 0 0 1 1 0 0 0 0 1 3 0 3 0 
S T E P 9 : . 1 . 1 . 0 . 1 STEP 2 1 : . 1 . 1 . 0 . . 
. 1 . 0 . 0 . 1 . . . 0 . 1 . 1 
0 0 0 0 1 1 0 0 0 0 2 1 2 1 0 1 
S T E P 1 0 : 1 . 1 . 0 . 1 S T E P 2 2 : 1 . 1 . 0 . . . 
. . 1 . 0 . 0 . . . . . 0 . 1 . 
0 1 0 0 1 1 0 1 0 1 0 2 0 1 0 1 
STEP 1 1 : . 1 . 0 . 1 . . STEP 2 3 : . 1 . 0 . . . . 
. . . 1 . 0 . 1 0 . 1 
0 1 1 0 1 1 0 0 0 1 1 0 0 1 0 1 
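If the adder cell contains a control signal 1: (the corresponding microconfiguration term is missing from the source; according to the description above, the control signal copies the bit S into the bit A and clears the adder cell).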
The multiplication process is demonstrated on a simulation example (see Listing).
The product of $x_1 = 0.1001$, $x_2 = 0.1101$ and $x_3 = 0.1110$ will be computed
by an 8-bit multiplier. The multiplier is displayed in 4 rows, according to Fig. 4,
but in the third row the bits S and C are printed together in the form [CS] (that is,
for example, the value 2 means C = 1 and S = 0). The dots denote insignificant
zeros in each row.
At t = 0, $x_1$ is stored in the adder, and a control signal marked by "<" starts
at the right end of the multiplier. Between t = 1 and t = 8 the number $x_1$ is copied
into the bits "A" and shifted right (zeros are thereby inserted between the digits).
The number $x_2$ comes from outside and is multiplied by $x_1$. At t = 10 the
rightmost digit of $x_1 x_2$ is computed. Already at this moment a new control signal
may be started which ensures the multiplication of $x_1 x_2$ by $x_3$, thus an overlapping
is possible between the consecutive multiplications.
For the multiplication of n numbers $(2k + 2)(n - 1) + 2k \approx 2kn$ steps are
required, and the modified multiplier consists of 4k 8-state cells. The product
contains 2k digits (the leftmost digit has the positional value $2^{-1}$) and the first
$2k - \log_2 k - \log_2 n$ bits are always correct.
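As a quick check of the step count, the formula can be evaluated for the parameters of the simulation example (k = 4, n = 3). The lines below are plain arithmetic on the stated formula, reading it as $(2k+2)(n-1)+2k \approx 2kn$; they are not taken from the Listing.

```latex
\[
(2k+2)(n-1) + 2k \;=\; 2kn + 2(n-1),
\]
\[
k = 4,\; n = 3:\qquad (2\cdot 4 + 2)(3 - 1) + 2\cdot 4 \;=\; 28,
\qquad 2kn \;=\; 2\cdot 4\cdot 3 \;=\; 24 .
\]
```

The exact count thus exceeds the approximation $2kn$ by $2(n-1)$ steps, independently of k.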
5. Concluding remarks 
In this paper three fundamental cellular processing elements have been discussed,
each designed for the same cellprocessor architecture [1]. Each processing element
is based on a bitparallel cellular algorithm where nearly all cells work effectively in
each time-step. By the interconnection of such simple processing elements more
complex tasks may be solved in a bitparallel way by a cellprocessor.
RESEARCH GROUP ON THEORY OF AUTOMATA 
HUNGARIAN ACADEMY OF SCIENCES
SOMOGYI U. 7.
SZEGED, HUNGARY
H-6720 
References 
[1] LEGENDI, T., Cellprocessors in computer architecture, Computational Linguistics and Computer
Languages, v. 11, 1977, pp. 147–167.
[2] LEGENDI, T., A 2D transition function definition language for a subsystem of the CELLAS
cellular processor simulation language, Computational Linguistics and Computer Languages,
v. 13, 1979, pp. 169–194.
[3] KATONA, E., T. LEGENDI, Cellular algorithms for fixed point decimal addition and multiplication,
Elektron. Informationsverarb. Kybernet., v. 17, 1981, pp. 637–644.
[4] KATONA, E., Cellular algorithms for fixed point vector- and matrix-multiplication, Proceedings
of the Conference Programming Systems '81, pp. 262–280, in Hungarian.
[5] KATONA, E., The application of cellprocessors in conventional data processing, Proceedings of
the Third Hungarian Computer Science Conference, Publishing House of the Hungarian Academy
of Sciences, Budapest, 1981, pp. 295–306.
[6] KATONA, E., Cellular algorithms (Selected results of the cellprocessor team led by T. Legendi),
Von Neumann Society, Budapest, 160 pages, in Hungarian, 1981.
[7] DOMÁN, A., A 3-dimensional cellular space, Sejtautomaták, Gondolat Kiadó, Budapest, 1978,
in Hungarian.
[8] VOLLMAR, R., Algorithmen in Zellularautomaten, B. G. Teubner, Stuttgart, 1979.
[9] NISHIO, H., Real time sorting of binary numbers by 1-dimensional cellular automaton,
Proceedings of the International Symposium on Uniformly Structured Automata and Logic, Tokyo, 1975,
pp. 153–162.
(Received April 10, 1981) 
