Polynomial division using left shift register  by Sarkar, P. et al.
Pergamon 
Computers Math. Applic. Vol. 35, No. 6, pp. 27-31, 1998 
Copyright © 1998 Elsevier Science Ltd
Printed in Great Britain. All rights reserved 
PII: S0898-1221(98)00014-5 0898-1221/98 $19.00 + 0.00 
Polynomial Division Using Left Shift Register 
P. SARKAR, B. K. ROY AND P. P. CHOUDHURY
Computer Science Unit, Indian Statistical Institute 
203, B.T. Road, Calcutta 700035, India 
{palash, bimal, pabitra}@isical.ernet.in
R. BARUA 
Stat/Math Unit, Indian Statistical Institute 
203, B.T. Road, Calcutta 700035, India 
rana@isical.ernet.in
(Received December 1996; accepted October 1997) 
Abstract--In this short note, we describe a simple polynomial division circuit based on a left shift
register. The circuit essentially performs the modulo operation $f(x) \bmod p(x)$. It is shown how the
same circuit can be used to perform $f(x)g(x) \bmod p(x)$. Applications to standard basis multiplication
and encoding and decoding of systematic cyclic codes are also described.
Keywords--VLSI, Shift register, Polynomial division, Finite field arithmetic.
1. INTRODUCTION 
Polynomial division over GF(2) is of fundamental importance in designing circuits for error
correcting codes. Presently, this is done using a Linear Feedback Shift Register (LFSR) [1,2]. In
this short note, we describe simple circuits using a left shift register to perform the operations
$f(x) \bmod p(x)$ and $f(x)g(x) \bmod p(x)$. The main advantage of our scheme is that the circuit
for performing $f(x) \bmod p(x)$ (which we will call the MOD circuit) is independent of $p(x)$, and
the same circuit can be used for dividing by another polynomial of the same degree (requiring
only a change in stored values). Moreover, this MOD circuit can also be used to perform
$f(x)g(x) \bmod p(x)$ for arbitrary polynomials $f(x)$ and $g(x)$. For corresponding LFSR circuits
which perform modulo multiplication, $p(x)$ along with one of $f(x)$ or $g(x)$ must be fixed. This
flexibility allows the MOD circuit to be used as a standard basis multiplier. The trade-off is
that our circuit requires additional flip-flops and EX-OR gates. A second advantage is that, compared
to the corresponding LFSR circuit, the MOD circuit takes fewer clock cycles to perform the
operation $f(x) \bmod p(x)$.
In what follows, all operations are over GF(2). 
2. POLYNOMIAL DIVISION
Let $p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$ be a fixed polynomial of degree $n$.
Then, $x^n \pmod{p(x)} = a_{n-1}x^{n-1} + \cdots + a_0$.
Let $x^{n+\alpha} \pmod{p(x)} = c_{n-1}x^{n-1} + \cdots + c_0$, $\alpha \ge 0$.
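Then,
\[
\begin{aligned}
x^{n+\alpha+1} \pmod{p(x)} &= c_{n-1}x^{n} + \cdots + c_0 x \\
&= c_{n-1}(a_{n-1}x^{n-1} + \cdots + a_0) + c_{n-2}x^{n-1} + \cdots + c_0 x \\
&= (c_{n-1}a_{n-1} + c_{n-2})x^{n-1} + (c_{n-1}a_{n-2} + c_{n-3})x^{n-2} \\
&\quad + \cdots + (c_{n-1}a_1 + c_0)x + c_{n-1}a_0.
\end{aligned}
\]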
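Using the natural representation of polynomials as vectors, we can say that if $x = [c_{n-1}, \ldots, c_0]$
is the vector representing $x^{n+\alpha} \pmod{p(x)}$, then the vector $y = [c'_{n-1}, \ldots, c'_0]$ representing
$x^{n+\alpha+1} \pmod{p(x)}$ is obtained as follows:
\[
\underbrace{\begin{bmatrix}
a_{n-1} & 1 & 0 & \cdots & 0 \\
a_{n-2} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_1 & 0 & 0 & \cdots & 1 \\
a_0 & 0 & 0 & \cdots & 0
\end{bmatrix}}_{A}
\begin{bmatrix} c_{n-1} \\ c_{n-2} \\ \vdots \\ c_1 \\ c_0 \end{bmatrix}
=
\begin{bmatrix} c'_{n-1} \\ c'_{n-2} \\ \vdots \\ c'_1 \\ c'_0 \end{bmatrix},
\]
which we can write as $Ax^T = y^T$.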
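As a rough illustration of the step $Ax^T = y^T$, the following Python sketch stores the vector $(c_{n-1}, \ldots, c_0)$ as an integer whose bit $i$ holds $c_i$. The function name `times_x_mod_p`, the integer encoding, and the example polynomial $p(x) = x^3 + x + 1$ are assumptions made for this sketch; they do not come from the circuit description.

```python
def times_x_mod_p(c, a, n):
    """Return the vector of x * c(x) mod p(x) over GF(2), i.e. y = A x.

    c: int whose bit i is c_i (degree below n)
    a: int whose bit i is a_i, so a(x) = x^n mod p(x)
    n: degree of p(x)
    """
    shifted = c << 1                   # multiplying by x is a left shift
    if (shifted >> n) & 1:             # the bit that moved into position n is c_{n-1}
        shifted ^= (1 << n) | a        # drop x^n and add a(x) in its place
    return shifted


# Illustrative choice (an assumption): p(x) = x^3 + x + 1, so n = 3, a = 0b011.
n, a = 3, 0b011
t = a                                  # x^3 mod p(x)
powers = []
for _ in range(4):                     # collect x^3, x^4, x^5, x^6 mod p(x)
    powers.append(t)
    t = times_x_mod_p(t, a, n)
```

Each call mirrors one row-by-row evaluation of the matrix product: the first column contributes $c_{n-1}a_j$ and the shifted diagonal contributes $c_{j-1}$.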
We first describe an algorithm for obtaining $f(x) \pmod{p(x)}$ for any polynomial $f(x) = f_m x^m +
f_{m-1}x^{m-1} + \cdots + f_0$ with $f_m = 1$.
ALGORITHM A.
input f = (f_m, ..., f_0), a vector representing the polynomial f(x).
output r = (r_{n-1}, ..., r_0), a vector representing the polynomial f(x) (mod p(x)).
method
begin
    r = (f_{n-1}, ..., f_0)
    t = (a_{n-1}, ..., a_0)
    for i = 0 to (m - n) do
        if (f_{n+i} = 1) then r = r + t
        t = At
    od
end
REMARK 2.1.
1. Correctness of the algorithm follows from the previous discussion.
2. Assuming that the loop can be executed in one clock cycle, the algorithm takes $m - n + 1$
clock cycles. Next, we show how to execute the loop in one clock cycle.
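A minimal Python sketch of Algorithm A is given below, under the assumption that polynomials are stored as integers (bit $j$ is the coefficient of $x^j$). The loop body follows the description of accumulating the powers $x^{n+i} \bmod p(x)$ wherever $f_{n+i} = 1$; the function name `poly_mod` and the encoding are choices made for this sketch only.

```python
def poly_mod(f, p):
    """Sketch of Algorithm A: f(x) mod p(x) over GF(2), both given as ints."""
    n = p.bit_length() - 1             # degree of p(x)
    m = f.bit_length() - 1             # degree of f(x), so f_m = 1
    a = p ^ (1 << n)                   # (a_{n-1}, ..., a_0), i.e. x^n mod p(x)
    r = f & ((1 << n) - 1)             # r = (f_{n-1}, ..., f_0)
    t = a                              # t = x^n mod p(x)
    for i in range(m - n + 1):         # i = 0 .. m - n (no iterations if m < n)
        if (f >> (n + i)) & 1:         # f_{n+i} = 1: add x^{n+i} mod p(x) to r
            r ^= t
        t <<= 1                        # t = A t: shift, then reduce if needed
        if (t >> n) & 1:
            t ^= (1 << n) | a
    return r


# Example with the assumed polynomial p(x) = x^3 + x + 1 and f(x) = x^6.
remainder = poly_mod(0b1000000, 0b1011)
```

The loop runs exactly $m - n + 1$ times, matching the clock-cycle count stated in the remark.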
Let $x = (c_{n-1}, \ldots, c_0)$ and let $C$ be an $(n+1)$-cell left shift register. Let the contents of $C$ be
$(d, c_{n-1}, \ldots, c_0)$, where $d$ means don't care. Now $C$ is left shifted once. If the content of the first
cell is 1, then $(a_{n-1}, \ldots, a_0)$ is added bitwise to the contents of $C$ leaving out the first cell. Then
the contents of $C$ (leaving out the first cell) give the vector $y = Ax$.
Using this implementation of the operation $t \leftarrow At$, it is easy to implement Algorithm A in
VLSI. We essentially compute the powers $x^i \pmod{p(x)}$ and add up the partial results in another
register. The entire MOD circuit is shown in Figure 1, which operates as follows.
Initially, $C$ is loaded with $(0, a_{n-1}, \ldots, a_0)$ and $\mathcal{R}$ is loaded with $(f_{n-1}, \ldots, f_0)$. The register $\mathcal{P}$
contains the fixed vector $(a_{n-1}, \ldots, a_0)$. The coefficient $f_{n+i}$, $0 \le i \le m - n$, is available in the $i$th
clock cycle. In the positive half of the $i$th clock cycle, $C$ is evolved (left shifted) once and if
$f_{n+i} = 1$, then simultaneously, the contents of $C$ are added bitwise to the contents of $\mathcal{R}$. In the
negative half of the $i$th cycle, if the leftmost bit of $C$ is 1, then the contents of $\mathcal{P}$ are added bitwise
to the contents of $C$.
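The register-level behaviour can be mimicked in software. The sketch below is one possible reading of the two half-cycles, assuming that "simultaneously" means the accumulator register samples the contents of $C$ before the shift takes effect; the function name `simulate_mod_circuit` and the integer encoding of the registers are hypothetical.

```python
def simulate_mod_circuit(f, a, n):
    """Cycle-level sketch of the MOD circuit (an interpretation, not a netlist).

    f: int, bit j is f_j; a: int holding (a_{n-1}, ..., a_0); n: degree of p(x).
    Returns (final contents of the accumulator register, clock cycles used).
    """
    low = (1 << n) - 1                 # mask for the n rightmost cells
    full = (1 << (n + 1)) - 1          # mask for all n + 1 cells of C
    P = a & low                        # fixed coefficients of p(x)
    C = a & low                        # C starts as (0, a_{n-1}, ..., a_0)
    R = f & low                        # accumulator starts as (f_{n-1}, ..., f_0)
    m = f.bit_length() - 1
    cycles = 0
    for i in range(m - n + 1):
        # Positive half: accumulate the old contents of C, then shift C left.
        if (f >> (n + i)) & 1:
            R ^= C & low
        C = (C << 1) & full
        # Negative half: a 1 in the leftmost cell triggers the addition of P.
        if (C >> n) & 1:
            C ^= P                     # only the n rightmost cells change
        cycles += 1
    return R, cycles


# Assumed example: p(x) = x^3 + x + 1 (a = 0b011), f(x) = x^6.
result, used = simulate_mod_circuit(0b1000000, 0b011, 3)
```

Changing the divisor polynomial only changes the value passed as `a`, which corresponds to reloading the register holding the coefficients of $p(x)$.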
REMARK 2.2.
1. The flexibility mentioned in the introduction arises due to the fact that to divide by another
polynomial only requires a change of values in register $\mathcal{P}$ (which stores the coefficients
of polynomial $p(x)$).
2. Using the above implementation, Algorithm A can be completed in $m - n + 1$ clock cycles.
This time is an improvement over LFSR circuits, which require $m$ clock cycles.
3. To implement Algorithm A, $2n$ 2-input EXOR gates and 3 $n$-bit registers are required.
For LFSR circuits, one $n$-bit register is required and the number of gates required is $\le n$.
3. POLYNOMIAL MULTIPLICATION
MODULO A POLYNOMIAL
In this section, we describe how the MOD circuit (Figure 1) can be used to perform $f(x)g(x)
\pmod{p(x)}$, where $f(x)$ and $g(x)$ are arbitrary polynomials and $p(x)$ is a fixed polynomial of
degree $n$.
Let
\[
\begin{aligned}
f(x) &= f_m x^m + \cdots + f_0, \\
g(x) &= g_m x^m + \cdots + g_0.
\end{aligned}
\]
The algorithm for modulo multiplication is presented as Algorithm B.
ALGORITHM B.
We assume a function modulo(f(x)) exists which returns f(x) (mod p(x)).
begin
    q(x) = modulo(f(x))
    if (g_0 = 0) then r(x) = 0 else r(x) = q(x)
    for i = 1 to m do
        q(x) = modulo(x * q(x))
        if (g_i = 1) then r(x) = q(x) + r(x)
    od
end
Figure 1. The MOD circuit.
To see that the algorithm works, note that the following holds.
1. $x^{i+1} * h(x) \bmod p(x) = x * (x^{i} * h(x) \bmod p(x)) \pmod{p(x)}$,
2. $f(x)g(x) \bmod p(x) = (g_m x^m * h(x) \pmod{p(x)} + \cdots + g_i x^i * h(x) \pmod{p(x)} + \cdots + g_0 *
h(x) \pmod{p(x)}) \pmod{p(x)}$,
where $h(x) = f(x) \bmod p(x)$.
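The two identities translate directly into a loop that multiplies by $x$ once per coefficient of $g(x)$. The sketch below is a software rendering of Algorithm B under the assumption that polynomials are integers with bit $j$ holding the coefficient of $x^j$; `modulo` is implemented here by plain shift-and-subtract division as a stand-in for the MOD circuit, and `mul_mod` is a name chosen for this example.

```python
def modulo(f, p):
    """Stand-in for the MOD circuit: f(x) mod p(x) over GF(2) by long division."""
    n = p.bit_length() - 1
    while f.bit_length() - 1 >= n:
        f ^= p << (f.bit_length() - 1 - n)   # cancel the leading term of f
    return f


def mul_mod(f, g, p):
    """Sketch of Algorithm B: f(x) g(x) mod p(x) over GF(2)."""
    m = max(f.bit_length(), g.bit_length()) - 1   # common degree bound m
    q = modulo(f, p)                   # q(x) = h(x) = f(x) mod p(x)
    r = q if g & 1 else 0              # g_0 decides the initial value of r(x)
    for i in range(1, m + 1):
        q = modulo(q << 1, p)          # q(x) = x^i h(x) mod p(x), by identity 1
        if (g >> i) & 1:               # g_i = 1: add this term, by identity 2
            r ^= q
    return r


# Assumed example: p(x) = x^3 + x + 1, f(x) = x^2 + 1, g(x) = x^2 + x.
product = mul_mod(0b101, 0b110, 0b1011)
```

Because `q` is reduced after every multiplication by $x$, no intermediate value ever has degree above $n$, which is what lets a fixed-width register hold it.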
The MOD circuit can be used to implement Algorithm B in the following way. In the first phase
of operation, we compute $f(x) \bmod p(x)$ as usual. This takes $m - n + 1$ clock cycles. In the next
clock cycle, $C$ (leaving out the leftmost cell) is loaded with $\mathcal{R}$ and if $g_0 = 0$, then simultaneously
register $\mathcal{R}$ is reset to 0. Starting from the next clock cycle and continuing for the next $m$ clock
cycles, the coefficients $g_1, \ldots, g_m$ are input to the circuit (in place of the $f_{n+i}$). So after $2m - n + 2$
clock cycles, the register $\mathcal{R}$ contains the result.
As in the case of Algorithm A, dividing by another $p(x)$ (of the same degree) will only require a
change in the stored values of $\mathcal{P}$. Also note that for a fixed $p(x)$, both $f(x)$ and $g(x)$ are arbitrary.
For LFSR circuits, $p(x)$ along with one of $f(x)$ or $g(x)$ must be fixed. Thus, our implementation
provides more flexibility in design.
4. APPLICATIONS
4.1. Standard Basis Multiplication 
In this section, we describe the use of the MOD circuit for finite field multiplication. Arithmetic
over finite fields has important applications. See [3-8] for related work in this area. We first
describe some finite field terminology, all of which can be found in [1,2].
Let GF($2^n$) be the field extension of degree $n$ over GF(2). Then it is also a vector space
of dimension $n$ over GF(2). Let $p(x)$ be an irreducible polynomial of degree $n$ over GF(2)
and let $\alpha$ be one of its roots. Then $1, \alpha, \ldots, \alpha^{n-1}$ forms a basis (called a standard basis) for
GF($2^n$) over GF(2). Any element $\beta$ of GF($2^n$) can be written as $\beta = f(\alpha)$, where $f(x)$ is a
polynomial of degree at most $n$ over GF(2). If $\beta, \gamma \in$ GF($2^n$), where $\beta = f(\alpha)$ and $\gamma = g(\alpha)$,
then $\beta + \gamma = f(\alpha) + g(\alpha)$. So addition over GF($2^n$) is simply the polynomial addition of $f(x)$
and $g(x)$, and hence is simple to perform. The multiplication is more complicated and it is the
multiplication which can be easily performed by the MOD circuit. Let
\[
\begin{aligned}
\beta &= f_0 + f_1\alpha + \cdots + f_{n-1}\alpha^{n-1}, \\
\gamma &= g_0 + g_1\alpha + \cdots + g_{n-1}\alpha^{n-1}.
\end{aligned}
\]
Then $\beta$ and $\gamma$ are, respectively, represented in a unique way by the tuples $(f_0, \ldots, f_{n-1})$
and $(g_0, \ldots, g_{n-1})$, with respect to the standard basis $1, \alpha, \ldots, \alpha^{n-1}$. Then $\beta\gamma = h(\alpha)$, where
\[
h(x) = f(x)g(x) \bmod p(x) = h_0 + h_1 x + \cdots + h_{n-1}x^{n-1}.
\]
Again $\beta\gamma$ is also uniquely represented by the tuple $(h_0, \ldots, h_{n-1})$. Note that in the whole
operation the role played by $\alpha$ is only that of a placeholder. Hence to obtain the element $\beta\gamma$ all
we have to do is to perform the modulo multiplication $f(x)g(x) \bmod p(x)$. This is done by the
MOD circuit in the way described in the previous section and takes a total of $n + 2$ clock cycles.
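A small worked instance may help. The field GF($2^3$) with $p(x) = x^3 + x + 1$ and the two elements below are chosen here purely for illustration; they are not taken from the discussion above.
\[
\begin{aligned}
\beta &= 1 + \alpha^2, & f(x) &= x^2 + 1, & (f_0, f_1, f_2) &= (1, 0, 1), \\
\gamma &= \alpha + \alpha^2, & g(x) &= x^2 + x, & (g_0, g_1, g_2) &= (0, 1, 1).
\end{aligned}
\]
Since $x^3 \bmod p(x) = x + 1$ and $x^4 \bmod p(x) = x^2 + x$,
\[
\begin{aligned}
f(x)g(x) &= x^4 + x^3 + x^2 + x, \\
h(x) = f(x)g(x) \bmod p(x) &= (x^2 + x) + (x + 1) + x^2 + x = x + 1,
\end{aligned}
\]
so $\beta\gamma = 1 + \alpha$, represented by the tuple $(h_0, h_1, h_2) = (1, 1, 0)$.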
REMARK 4.1.
1. For a modulo multiplication circuit to be used as a standard basis multiplier, it is essential
for the circuit design to be independent of $f(x)$ and $g(x)$. Hence, LFSR based modulo
multiplication circuits cannot be used for standard basis multiplication.
2. In the use of the MOD circuit for standard basis multiplication, the register $\mathcal{P}$ stores
the coefficients of $p(x)$. If one were to choose some other irreducible polynomial, then all
that is required is a change in the stored values of register $\mathcal{P}$.
4.2. Systematic Cyclic Codes 
One of the major application areas for polynomial division circuits is encoding and decoding of
error correcting codes [1,2,9]. Here, we briefly describe how the above circuits can be used for
encoding and decoding of systematic cyclic codes [1]. See [10] for a discussion on Reed-Solomon
code and its VLSI implementation.
Let $d(x)$ denote the data polynomial (of degree $n$) and $g(x)$ the generator polynomial (of
degree $r$) for the code space. Then the codeword $u(x)$ is formed as follows:
\[
u(x) = d(x)x^r - r(x), \tag{1}
\]
\[
\text{where } r(x) = d(x)x^r \pmod{g(x)}. \tag{2}
\]
The operation in (2) can be done using the MOD circuit. The actual method of obtaining u(x) 
on line is discussed below. 
Decoding is done by generating the syndrome for the received codeword and then using it for 
possible error correction. Let v(x) be the received codeword. Then the syndrome of v(x) is 
formed as follows. 
\[
s(v(x)) = v(x) \pmod{g(x)}.
\]
Again the above operation can be done using the MOD circuit. 
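The encoding and syndrome computations can be sketched in a few lines of Python. The sketch assumes polynomials stored as integers (bit $j$ is the coefficient of $x^j$) and uses $g(x) = x^3 + x + 1$, the generator of the (7,4) cyclic Hamming code, purely as an example; `poly_mod`, `encode`, and `syndrome` are hypothetical names, and `poly_mod` uses ordinary long division in place of the MOD circuit.

```python
def poly_mod(v, g):
    """Remainder of v(x) divided by g(x) over GF(2); stand-in for the MOD circuit."""
    r = g.bit_length() - 1
    while v.bit_length() - 1 >= r:
        v ^= g << (v.bit_length() - 1 - r)   # cancel the leading term of v
    return v


def encode(d, g):
    """Systematic codeword u(x) = d(x) x^r - r(x), with r(x) = d(x) x^r mod g(x)."""
    r = g.bit_length() - 1
    shifted = d << r                   # d(x) x^r: data occupies the high-order bits
    return shifted ^ poly_mod(shifted, g)    # over GF(2), subtraction is XOR


def syndrome(v, g):
    """Syndrome s(v(x)) = v(x) mod g(x); zero for an error-free codeword."""
    return poly_mod(v, g)


g = 0b1011                             # assumed generator x^3 + x + 1
d = 0b1101                             # assumed data word x^3 + x^2 + 1
u = encode(d, g)
clean = syndrome(u, g)                 # syndrome of the transmitted codeword
corrupted = syndrome(u ^ (1 << 2), g)  # syndrome after flipping one bit
```

The codeword is systematic because the data bits appear unchanged in its high-order positions, with the $r$ check bits appended below them.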
For LFSR based designs, the data polynomial is fed to the encoding circuit, high order bit first.
Since LFSR based division circuits require input with high order bit first, the data polynomial can
be input to the division circuit and transmitted simultaneously. After the remainder has been
formed, it is also transmitted with high order bit first. Thus, the polynomial $d(x)x^r + r(x)$ can
be transmitted high order bit first with no delay.
For the MOD circuit that we have described, input is required low order bit first. Therefore,
$d(x)$ is also transmitted low order bit first. After $r(x)$ has been computed, it is transmitted low
order bit first. At the receiving end, $d(x)$ is obtained before $r(x)$. This, however, does not cause
any delay in syndrome generation, since $d(x)x^r \bmod g(x)$ is first computed and then $r(x)$ is added
to the result.
REFERENCES 
1. T.R.N. Rao and E. Fujiwara, Error Control Coding for Computer Systems, Prentice Hall, (1989).
2. W.W. Peterson and E.J. Weldon, Error Correcting Codes, MIT Press, Cambridge, MA, (1972).
3. M.A. Hasan, Division-and-accumulation over GF($2^m$), IEEE Transactions on Computers 46 (6), (1997).
4. R. Barua and S. Sengupta, Proceedings of 10th Int. Conf. on VLSI Design, pp. 465-468, IEEE Press, (1994).
5. E.D. Mastrovito, VLSI architectures for computations in Galois fields, Ph.D. Thesis, Dept. of Elec. Eng.,
Linköping Univ., Linköping, Sweden, (1991).
6. C.C. Wang, T.K. Truong, H.M. Shao, L.J. Deutsch, J.K. Omura and I.S. Reed, VLSI architectures for
computing multiplications and inverses in GF($2^m$), IEEE Transactions on Computers C-34, 709-717, (1985).
7. B.B. Zhou, A new bit-serial systolic multiplier over GF($2^m$), IEEE Transactions on Computers C-37, 749-751,
(1988).
8. P. Pal Choudhury and R. Barua, Cellular automata based VLSI architectures for computing multiplication
and inverses in GF($2^m$), In Proceedings of 7th Int. Conf. on VLSI Design, pp. 279-282, IEEE Press, (1994).
9. E.R. Berlekamp, Algebraic Coding Theory, McGraw-Hill, New York, (1968).
10. Algorithms and Architectures for the Design of a VLSI Reed-Solomon Code, Edited by S.B. Wicker and V.K.
Bhargava, IEEE Press, (1994).
