1) Introduction
Current digital networks constantly push the envelope in performance and reliability. Engineers must employ clever techniques to maintain today's performance metrics and to surpass them tomorrow. One commonly used technique is block coding. Block coding refers to a class of signal transformations designed to improve performance over a digital communications link by enabling the transmitted signals to better withstand various channel impairments. Such impairments include (but are not limited to) noise, fading, dispersion, and jamming. One reason this technique has become so popular is that it can be implemented quite efficiently with very large scale integrated (VLSI) circuits. Such components can be used to effect an 8 dB improvement over uncoded transmission without an increase in the power needed to transmit the signal [1]. This paper is devoted to the discussion, design, and simulation of a class of block codes known as "cyclic codes". We discuss the pertinent details of the technique and simulate it to better understand its functionality.
2) Linear Block Codes
Block codes insert redundancy (overhead) into the original message vector so that the presence of errors can be detected and ultimately corrected. Codes built on this kind of redundant encoding are also known as parity-check codes. The technique improves P_b (probability of bit error) performance, but at the expense of bandwidth; this cost is known as "bandwidth expansion".
2.1) Code Rate & Redundancy
With block codes, the source data is segmented into blocks of k data bits called message bits. Each block can represent any one of 2^k distinct messages. The encoder transforms each k-bit data block into a larger block of n bits. The n-k bits that the encoder adds to each data block are known as the redundant bits, parity bits, or overhead (they carry no new information). The code is typically referred to as an (n,k) code. The ratio of redundant (overhead) bits to data (information) bits within a block, (n-k)/k, is called the redundancy of the code, and the ratio of data bits to total bits (overhead and information), k/n, is called the code rate. Essentially, the code rate can be regarded as the portion of the code that constitutes information [1].
Given this definition of the code rate, it should be readily apparent how bandwidth expansion occurs when block coding is used. An error control technique that uses a (2,1) code has a code rate of ½; it employs 100% redundancy and doubles the bandwidth requirement. However, a (4,3) code has a code rate of ¾ and a redundancy of only 33%. Keeping the proportion of information bits high is the key to bandwidth efficiency in block coding [1].
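The two ratios defined above can be checked with a few lines of Python. This is only an illustrative sketch; the function names `code_rate` and `redundancy` are ours, and the (7,4) entry is included only because it is the code simulated later in the paper.

```python
from fractions import Fraction


def code_rate(n: int, k: int) -> Fraction:
    """Portion of each n-bit block that carries information: k / n."""
    return Fraction(k, n)


def redundancy(n: int, k: int) -> Fraction:
    """Ratio of redundant bits to data bits: (n - k) / k."""
    return Fraction(n - k, k)


for n, k in [(2, 1), (4, 3), (7, 4)]:
    expansion = Fraction(n, k)  # transmitted bits per information bit
    print(f"({n},{k}) code: rate = {code_rate(n, k)}, "
          f"redundancy = {float(redundancy(n, k)):.0%}, "
          f"bandwidth expansion = {float(expansion):.2f}x")
```

The bandwidth expansion factor n/k is simply the reciprocal of the code rate, which is why a rate-½ code doubles the required bandwidth.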
2.2) Parity-Check Codes
Parity-check codes use linear sums of the information bits (k-bits), called parity symbols or parity bits for error detection or correction. A single parity check code is constructed by adding a single parity bit to a block of data bits. The parity bit takes on the value of one or zero depending on whether the modulo-2 summation of the information bits yields an even or odd result. For even parity, if the modulo-2 summation of the information bits yields a value of zero, the transmitter sets the parity bit to zero as well so that the overall codeword modulo-2 summation has a zero result. However, if the modulo-2 summation of the information bits yields a value of one, the transmitter sets the parity bit to one so that the overall codeword modulo-2 summation has a zero result. At the receiver, the decoding procedure consists of testing the modulo-2 summation of the received codeword to see if it yields a zero result. If the result is nonzero, the codeword is known to contain errors [1] . This is shown in Figure 1 .
Figure 1: Example of parity & codeword (n,k). [1]
For single parity, the decoder cannot correct errors, but it can detect them as long as the number of bits in error is odd. For example, if the codeword 1001 is transmitted and the word 1111 is received, two bits (an even number) are in error. The modulo-2 summation of the received word nevertheless has a zero result, indicating that there are no errors. To correct errors, we must use a form of parity-check codes called rectangular (or product) codes.
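The single-parity behaviour described above can be sketched as follows. The helper names are hypothetical, and the words are the 1001 / 1111 example from the text plus one additional single-error case chosen here for contrast.

```python
def add_even_parity(data_bits):
    """Append one bit so the modulo-2 sum of the whole codeword is zero."""
    return data_bits + [sum(data_bits) % 2]


def parity_check_passes(word):
    """Receiver test: a zero modulo-2 sum means no error was detected."""
    return sum(word) % 2 == 0


sent = add_even_parity([1, 0, 0])        # -> [1, 0, 0, 1], the codeword 1001
one_error = [1, 0, 1, 1]                 # one bit flipped (odd number of errors)
two_errors = [1, 1, 1, 1]                # two bits flipped (even number of errors)

print(parity_check_passes(sent))         # valid codeword passes
print(parity_check_passes(one_error))    # single error is detected
print(parity_check_passes(two_errors))   # double error slips through
```

The last case is the weakness the text points out: an even number of errors leaves the modulo-2 sum unchanged.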
The idea behind rectangular codes is to arrange the message bits in M rows by N columns, and then append a horizontal parity check to each row and a vertical parity check to each column. The transmitted codeword then comprises the new (M+1) x (N+1) matrix. This is shown in Figure 2. Rectangular codes can correct some errors by performing a triangulation. For instance, if the first bit in the matrix shown in Figure 2 is received as a zero instead of a one, the first-row horizontal modulo-2 summation will result in a value of one, indicating an error somewhere in row one. As we compute the vertical modulo-2 summation for each column, we will detect that there is an error in column one. Thus we can identify that the first bit in the matrix is in error and correct it.
Although rectangular codes can detect and correct single errors, they too have trouble with multiple errors. In some cases, rectangular codes can detect multiple errors but cannot correct them. In other cases, they cannot even detect the errors. For example, if the first two bits in row one and the first two bits in row two in Figure 2 are received in error, both the horizontal and vertical parity checks will compute a modulo-2 summation of zero, indicating no errors when there are actually four.
The advantage of rectangular codes over single parity check codes is the ability to detect and correct some errors. Rectangular codes are also less likely to miss a bit in error. The trade-off is that, by adding more redundancy, rectangular codes use more bandwidth than single parity check codes.
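A minimal sketch of the rectangular-code behaviour discussed above is given below. The 2 x 3 message block is a hypothetical example (it is not the matrix of Figure 2), and the function names are ours. The sketch shows a single error being located by the intersection of the failing row and column, and a square pattern of four errors passing every check.

```python
def encode_rectangular(rows):
    """Append an even-parity bit to each row, then a parity row over all columns."""
    with_row_parity = [r + [sum(r) % 2] for r in rows]
    parity_row = [sum(col) % 2 for col in zip(*with_row_parity)]
    return with_row_parity + [parity_row]


def failing_checks(block):
    """Return the indices of rows and columns whose modulo-2 sum is nonzero."""
    bad_rows = [i for i, r in enumerate(block) if sum(r) % 2]
    bad_cols = [j for j, c in enumerate(zip(*block)) if sum(c) % 2]
    return bad_rows, bad_cols


def correct_single_error(block):
    """Flip the bit at the intersection of exactly one bad row and one bad column."""
    bad_rows, bad_cols = failing_checks(block)
    fixed = [r[:] for r in block]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed


message = [[1, 0, 1],
           [0, 1, 0]]                    # hypothetical M = 2, N = 3 block
sent = encode_rectangular(message)       # (M+1) x (N+1) transmitted block

single = [r[:] for r in sent]
single[0][0] ^= 1                        # one bit in error
print(failing_checks(single))            # row 0 and column 0 both fail
print(correct_single_error(single) == sent)

square = [r[:] for r in sent]
for i in (0, 1):
    for j in (0, 1):
        square[i][j] ^= 1                # four errors in a 2 x 2 square
print(failing_checks(square))            # no check fails
```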
2.3) Linearity & Vector Subspace
A block code represents a one-to-one assignment, whereby the 2^k message k-tuples are uniquely mapped into a new set of 2^k codeword n-tuples. For linear codes, the mapping transformation is linear: the modulo-2 sum of any set of codewords is also a codeword. This linear property lends itself to less complicated encoding and decoding.
A set of 2^k n-tuples is called a linear block code if, and only if, it is a subspace of the vector space V_n of all n-tuples. A subset of the vector space V_n is deemed a subspace if the following conditions are met [1]:
• The all-zeros vector is included in the subspace.
• The sum of any two vectors in the subspace is also included in the same subspace. 
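The two subspace conditions can be tested mechanically. The sketch below assumes binary n-tuples represented as Python tuples; the two small sets of 3-tuples are hypothetical examples, not codes taken from [1].

```python
from itertools import combinations


def is_linear_code(codewords):
    """Check the two subspace conditions for a set of binary n-tuples."""
    words = set(codewords)
    n = len(next(iter(words)))
    # Condition 1: the all-zeros vector is a member.
    if (0,) * n not in words:
        return False
    # Condition 2: the modulo-2 sum of any two members is also a member.
    return all(tuple(a ^ b for a, b in zip(u, v)) in words
               for u, v in combinations(words, 2))


linear_set = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
non_linear_set = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]

print(is_linear_code(linear_set))
print(is_linear_code(non_linear_set))   # 011 + 101 = 110 is missing
```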
2.4) Generator Matrix
If k is large, a table look-up implementation of the encoder becomes prohibitive. It is possible to reduce complexity by generating the required code vectors as needed, instead of storing them in tabular form.
Since the set of code vectors that forms a linear block code is a k-dimensional subspace of the n-dimensional binary vector space (k < n), it is always possible to find a set of n-tuples, fewer than 2^k, that can generate all 2^k member vectors of the subspace. The generating set of vectors (V_1, V_2, …, V_k) is said to span the subspace [1]. Each of the 2^k code vectors U can be described by:
U = m_1 V_1 + m_2 V_2 + … + m_k V_k
where m_i = (0 or 1) are the message digits and i = 1, …, k. The generator matrix can be defined as the following k x n array:
Figure 3: Example of generator matrix (G). [1]
For systematic linear block codes the generator matrix is made up of a parity matrix, P, and an identity matrix, I. The generator matrix, G, then takes the following form:
Figure 4: Example of systematic generator matrix (G). [1]
Code vectors are typically designated as row vectors. The message vector m, derived from a sequence of k bits, can therefore be written as a 1 x k matrix (one row and k columns):
m = [m_1, m_2, …, m_k]
The generation of the code vector, U, is written in matrix notation as the product of the message vector, m, and the generator matrix G as shown in the following:
U = mG
It can be seen that the code vector, U, is a linear combination of the rows of the generator matrix G. Since the code is completely defined by the generator matrix G, the encoder need only store the k rows of G instead of all 2^k possible vectors of the code [1]. This results in a significant reduction in the memory required to generate code vectors.
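As a sketch of U = mG, the following Python fragment encodes a message by summing, modulo 2, the rows of G selected by the message digits. The matrix G here is a hypothetical (6,3) systematic generator chosen for illustration and arranged as [P | I_k]; it is not the matrix of Figure 3 or Figure 4, and that arrangement is an assumption.

```python
from itertools import product

# Hypothetical (6,3) systematic generator matrix, arranged as [P | I_k].
G = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]


def encode(m, G):
    """Compute U = mG over GF(2): XOR together the rows of G where m_i = 1."""
    n = len(G[0])
    return [sum(m[i] * G[i][j] for i in range(len(m))) % 2 for j in range(n)]


k = len(G)
for m in product([0, 1], repeat=k):      # all 2^k messages
    print(m, "->", encode(list(m), G))
```

Only the k rows of G are stored; the 2^k code vectors are generated on demand. With this arrangement the last k bits of each code vector repeat the message.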
2.5) Parity-Check Matrix & Syndrome Testing
Having discussed how code vectors are generated for transmission, we now turn to decoding and error detection for the received signal. A parity check matrix, H, enables us to decode the received code vectors. For each k x n generator matrix, G, there exists an (n-k) x n matrix, H, such that the rows of G are orthogonal to the rows of H. That is, GH^T = 0, where H^T is the transpose of the parity check matrix H, and 0 is a k x (n-k) all-zeros matrix [1]. The components of the parity check matrix, H, and its transpose can be written as follows:
Figure 5: Example of parity check matrix H & its transpose H^T. [1]
Because the transmitted code vector U is subject to channel impairments, the received vector r (where r = r_1, r_2, …, r_n is one of the 2^n n-tuples) can be written as:
r = U + e
where U is the original code vector transmitted and e is the error vector introduced by the channel impairments.
The syndrome, S, of a received code vector, r, is the result of a parity check performed on the received vector to determine if it is a valid member of the codeword set. The syndrome, S, can be expressed as:
S = rH^T
If the received code vector, r, is indeed a member of the codeword set, the syndrome, S, has the value 0. If, however, r contains detectable errors, the syndrome has some nonzero value. This information can be used to identify which bit(s) of the received vector are in error so that the decoder can correct them. The method of correction (FEC, ARQ, etc.) is a decoder design choice. The value of the syndrome, S, is used to look up the error pattern vector kept in a predetermined syndrome lookup table. This error pattern is then added to the received vector, r, to obtain the corrected vector [1].
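Syndrome testing with a lookup table can be sketched as below. The parity check matrix H is a hypothetical (6,3) example arranged as [I_(n-k) | P^T] (an assumption made here, not the matrix of Figure 5), and the table is populated only with single-bit error patterns, which is a simplification of a full syndrome lookup table.

```python
# Hypothetical (6,3) parity check matrix, arranged as [I_(n-k) | P^T].
H = [
    [1, 0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1],
]


def syndrome(r, H):
    """Compute S = r H^T over GF(2): one parity check per row of H."""
    return tuple(sum(ri * hi for ri, hi in zip(r, row)) % 2 for row in H)


def build_single_error_table(H):
    """Map the syndrome of every single-bit error pattern to that pattern."""
    n = len(H[0])
    table = {}
    for i in range(n):
        e = [0] * n
        e[i] = 1
        table[syndrome(e, H)] = e
    return table


def correct(r, H, table):
    """Add the looked-up error pattern to r; return None if no pattern is known."""
    s = syndrome(r, H)
    if not any(s):
        return list(r)                   # zero syndrome: accept r as a codeword
    e = table.get(s)
    if e is None:
        return None                      # detected, but not in the table
    return [ri ^ ei for ri, ei in zip(r, e)]


table = build_single_error_table(H)
U = [1, 0, 1, 1, 1, 0]                   # a valid code vector for this H
r = U[:]
r[3] ^= 1                                # r = U + e with a single-bit error

print(syndrome(U, H))                    # all zeros for a valid code vector
print(syndrome(r, H))                    # nonzero: error detected
print(correct(r, H, table) == U)
```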
3) Cyclic Codes
As our preceding discussion of linear block codes showed, the imposition of structure, specifically linearity, yields advantages in the complexity of encoding and decoding the code vectors. The disadvantage is that even the syndrome lookup table method of error correction is cumbersome and complex, and is impractical for all but the shortest code vectors [2].
In an effort to reduce complexity in encoding and decoding, algebraic coding theory can be utilized to introduce even more structure into code vectors. The algebraic structure of this class of codes makes encoding and decoding much more efficient. Cyclic codes are an example of this and are an important subclass of linear block codes. In particular, encoders and syndrome computation circuits for cyclic codes can be implemented using simple shift register circuits [2] .
The components of a code vector U = (u_0, u_1, …, u_(n-1)) can be treated as the coefficients of a polynomial U(X) as follows:
U(X) = u_0 + u_1 X + u_2 X^2 + … + u_(n-1) X^(n-1)
The polynomial function U(X) can be thought of as a "placeholder" for the digits of the code vector U. The presence or absence of each term in the polynomial indicates the presence of a 1 or 0 in the corresponding location [1] .
3.1) Polynomial Dividing Circuit
Cyclic code implementation requires the implementation of polynomial division. This operation can be achieved with a dividing circuit comprised of feedback shift registers. The following diagram performs the polynomial division which determines the quotient and remainder necessary for decoding and error detection [1] .
Figure 6: Example of circuit for dividing polynomials. [1]
Here the dividend V(X) has degree m and the divisor g(X) has degree r. Initially all stages of the register are set to zero (initialized). The first r shifts enter the most significant coefficients of the polynomial V(X). After the r-th shift, the quotient output is g_r^(-1) v_m, the highest order term in the quotient. For each quotient coefficient q_i, the polynomial q_i g(X) must be subtracted from the dividend. The feedback connections shown in Figure 6 accomplish this subtraction. The difference between the leftmost r terms remaining in the dividend and the feedback terms is formed on each shift of the circuit and appears as the contents of the register. At each shift of the register, the difference is shifted one stage; the highest order coefficient is shifted out while the next is shifted in. It takes m + 1 total shifts into the register to present the quotient serially at the output. The remainder then resides in the register [1].
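The arithmetic performed by the dividing circuit can be sketched in software as ordinary long division over GF(2), where subtraction is the same as modulo-2 addition (XOR). This is not a gate-level model of Figure 6. The coefficient lists are written lowest order first, and both the dividend V(X) = X^3 + X^5 + X^6 and the divisor 1 + X + X^3 are illustrative values chosen here.

```python
def gf2_divmod(dividend, divisor):
    """Divide binary polynomials; coefficient lists are lowest order first.

    Returns (quotient, remainder). The divisor's highest coefficient must be 1.
    """
    r = len(divisor) - 1                          # degree of the divisor
    if divisor[-1] != 1:
        raise ValueError("leading coefficient of the divisor must be 1")
    work = list(dividend) + [0] * max(0, r - len(dividend))
    quotient = [0] * max(1, len(work) - r)
    # Work from the highest order term down, as the shift register does.
    for shift in range(len(work) - r - 1, -1, -1):
        if work[shift + r]:
            quotient[shift] = 1
            for j, g_j in enumerate(divisor):     # subtract q_i * g(X)
                work[shift + j] ^= g_j
    return quotient, work[:r]


V = [0, 0, 0, 1, 0, 1, 1]      # X^3 + X^5 + X^6
g = [1, 1, 0, 1]               # 1 + X + X^3
q, rem = gf2_divmod(V, g)
print("quotient :", q)         # coefficients of 1 + X + X^2 + X^3
print("remainder:", rem)       # coefficients of 1
```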
3.2) Shift Register Systematic Encoding
Systematic encoding of cyclic codes involves shifting the message polynomial bit by bit to make room for the parity check bits required by the chosen code. The parity bits (parity polynomial) are calculated and then placed in the appropriate location alongside the message polynomial. The parity polynomial is the remainder after dividing by the generator polynomial. It appears in the register after n shifts through the (n-k)-stage feedback register shown in the following diagram.
Figure 7: Example of Systematic Encoding Circuit. [1]
Since the first n-k shifts through the register merely fill the register, there cannot be any feedback until the rightmost stage has been filled. Loading the input data at the output of the last stage shortens the shifting cycle. The circuit feedback connections correspond to the coefficients of the generator polynomial g(X), which can be expressed by [1]:
g(X) = 1 + g_1 X + g_2 X^2 + … + g_(n-k-1) X^(n-k-1) + X^(n-k)
The following describes the encoding procedure for the circuit shown in Figure 7:
1. Switch 1 is closed during the first k shifts, to allow transmission of the message into the (n-k)-stage encoding shift register.
2. Switch 2 is in the down position to allow transmission of the message bits directly to an output register during the first k shifts.
3. After transmission of the k th message bit, switch 1 is opened and switch 2 is moved to the up position.
4. The remaining n -k shifts clear the encoding register by moving the parity bits to the output register.
5. The total number of shifts is equal to n, and the contents of the output register are the codeword polynomial r(X) + X^(n-k) m(X), where r(X) here denotes the parity (remainder) polynomial.
For example, let us set the generator vector to 1101 and the message vector to 1011. If we follow the above steps, we see that the redundant bits are calculated as shown in Figure 8.
Figure 8: Theoretical Redundant Bits
Thus, for this example, the theoretical codeword combines the message bits with the redundant bits held in the register contents of Figure 8, which yields 1001011 (redundant bits 100 followed by message bits 1011).
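The same result can be checked algebraically. The worked division below assumes that the vectors are written with the lowest order coefficient first, so that the generator vector 1101 corresponds to g(X) = 1 + X + X^3 and the message vector 1011 to m(X) = 1 + X^2 + X^3; under that assumption the arithmetic reproduces the codeword 1001011.

```latex
\begin{align*}
m(X) &= 1 + X^2 + X^3, \qquad g(X) = 1 + X + X^3, \qquad n - k = 3 \\
X^{3} m(X) &= X^3 + X^5 + X^6 \\
           &= (1 + X + X^2 + X^3)\, g(X) + 1 \\
r(X) &= 1 \quad\Rightarrow\quad \text{parity bits } 100 \\
U(X) &= r(X) + X^{3} m(X) = 1 + X^3 + X^5 + X^6 \quad\Rightarrow\quad U = 1001011
\end{align*}
```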
3.3) Decoding in Cyclic Codes
The decoder for a linear code consists of three basic steps:
1. Calculate the syndrome of the received vector.
2. Identify the correctable error pattern that corresponds to the syndrome calculated in step 1. The correspondence between the syndrome and a correctable error pattern is one-to-one. This is the error pattern that presumably has occurred.
3. Correct the errors by taking the modulo-2 sum of the received vector and the error pattern found in step 2.
A general decoder for an (n, k) cyclic code is shown in Figure 9. It consists of three major parts: a syndrome register, an error-pattern detector, and a buffer register that holds the received vector.
Figure 9: Cyclic Code Decoder. [2]
The correction procedure can be described as follows:
Step 1. Shifting the entire received vector into the syndrome register forms the syndrome. At the same time, the received vector is stored into the buffer.
Step 2. The syndrome is read into the detector and is tested for the corresponding error pattern. The detector is a combinational logic circuit. It is designed so that its output is "1" if and only if the syndrome in the syndrome register corresponds to a correctable error pattern with an error at the highest order position. That is, if a "1" appears at the output of the detector, the received symbol in the rightmost stage of the buffer register is assumed to be erroneous and must be corrected. On the other hand, if a "0" appears at the output of the detector, the received symbol at the rightmost stage of the buffer register is assumed to be correct and no correction is necessary. The output of the detector is therefore the estimated error value for the symbol about to come out of the buffer.
Step 3. The received symbol is read out of the buffer. At the same time, the syndrome register is shifted once. If the received symbol is detected to be an erroneous symbol, it is then corrected by the output of the detector. That output is also fed back into the syndrome register to modify the syndrome. The result is a new syndrome, which corresponds to the altered received vector shifted one place to the right.
Step 4. The new syndrome formed in Step 3 is used to detect whether or not the next received symbol now at the rightmost stage of the buffer register, is an erroneous symbol. If the symbol is in error, this next received symbol is then corrected in exactly the same way as the first received symbol was corrected.
Step 5. The decoder decodes the received vector symbol by symbol in the above manner until the entire received vector is read out of the buffer.
After the entire received vector has shifted through, the errors should have been corrected if they correspond to the error pattern built into the detector, and the syndrome register will contain all zeros. If at the end of the process the syndrome register does not contain all zeros, an uncorrectable error has been detected. The error correction performance of the decoder depends largely on the capacity of error patterns built into the detector.
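The symbol-by-symbol procedure above can be imitated in software. The sketch below is a simplification under stated assumptions: it uses the (7,4) code with g(X) = 1 + X + X^3 (coefficients lowest order first), its detector recognises only the syndrome of a single error in the highest order position, and it recomputes the syndrome by polynomial division at every step instead of updating a syndrome register through feedback as the decoder of Figure 9 does. The function names are ours.

```python
G_POLY = [1, 1, 0, 1]   # assumed g(X) = 1 + X + X^3, lowest order first


def poly_mod(poly, g):
    """Remainder of poly(X) divided by g(X) over GF(2)."""
    work = list(poly)
    r = len(g) - 1
    for shift in range(len(work) - r - 1, -1, -1):
        if work[shift + r]:
            for j, g_j in enumerate(g):
                work[shift + j] ^= g_j
    return work[:r]


def decode(received, g=G_POLY):
    """Correct a single error by testing one symbol per cyclic shift.

    Returns (decoded_vector, final_syndrome_is_zero).
    """
    n = len(received)
    buffer = list(received)
    # Detector pattern: syndrome of an error in the highest order position.
    detector_pattern = poly_mod([0] * (n - 1) + [1], g)
    for _ in range(n):
        if poly_mod(buffer, g) == detector_pattern:
            buffer[-1] ^= 1                      # correct the symbol leaving the buffer
        buffer = [buffer[-1]] + buffer[:-1]      # cyclic shift by one place
    return buffer, not any(poly_mod(buffer, g))


codeword = [1, 0, 0, 1, 0, 1, 1]                 # 1001011 from section 3.2
received = codeword[:]
received[2] ^= 1                                 # inject a single-bit error
decoded, clean = decode(received)
print(decoded == codeword, clean)
```

Because the code is cyclic, shifting the received vector also shifts the error, so after n shifts every position has been tested at the highest order position and the buffer is back in its original orientation.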
4) Simulation of Systematic Cyclic Codes
Now that we have investigated the theory behind cyclic codes, we look specifically at the systematic encoding of cyclic codes with shift registers. Using Figure 7 as a guide, both a software and a hardware approach are taken to simulate the block diagram. To simplify the simulation, both approaches use the following generator polynomial (generator vector 1101), which generates a (7, 4) code:
g(X) = 1 + X + X^3
4.1) Software Simulation
To simulate the systematic cyclic codes in software, Borland C++ was used to construct a relatively simple program that allows a user to input a generator vector and a message vector. Pressing "C" computes and displays the codeword. Most of the program code is devoted to the user interface. The portion we are concerned with is the createCode function listed below:
To begin, both the message vector and the new code vector are passed into the function. The code vector is modified and returned to the calling function. The generator vector is defined as a global variable and is referenced directly from the createCode function. We use the tempTerm variable to hold the value of the highest order redundant bit, which will be used as part of the codeword. This is necessary to retain the current value while the next state value is computed and stored. The tempTerm is then used as the feedback value into the generator vector.
Once we have initialized the new codeword, we shift through the message vector from the Most Significant Bit (MSB) to the Least Significant Bit (LSB) by using a FOR loop. We identify the MSB to be the coefficient to the highest order term in the vector and the LSB to be the lowest order term in the vector. While in the FOR loop we start by storing the tempTerm. Then we apply the generator vector by using a nested FOR loop. For each term of the generator vector, we check to see if the value is one or zero. If the value is one, we exclusive OR (XOR) the tempTerm with the next lower order redundant bit term until we reach the lowest order redundant bit term. If the value is zero, we just shift the next lower order redundant bit into the current redundant bit term. Once we reach the end of the nested FOR loop, we set the lowest order redundant bit term to the tempTerm value. The last step we do while still in the first FOR loop is to shift the current message into the codeword register.
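For readers without access to the original program, the following Python sketch imitates the shift-register loop described above. It is not the authors' createCode function: the name `create_code`, the list layout (lowest order term first) and the `feedback` variable are assumptions made for illustration, and the feedback is taken as the incoming message bit XORed with the highest order register stage, as in a conventional systematic encoder.

```python
def create_code(message, generator):
    """Systematic cyclic encoding with an (n-k)-stage feedback shift register.

    Both vectors are lists of bits, lowest order term first.
    Returns the codeword: redundant bits followed by the message bits.
    """
    stages = len(generator) - 1
    register = [0] * stages                  # redundant-bit register, cleared
    for bit in reversed(message):            # shift the message in, MSB first
        feedback = bit ^ register[-1]        # incoming bit XOR highest order stage
        # Update from the highest stage down so each stage reads the old value
        # of its lower neighbour; taps follow the generator coefficients.
        for j in range(stages - 1, 0, -1):
            register[j] = register[j - 1] ^ (generator[j] & feedback)
        register[0] = feedback               # lowest order redundant bit
    return register + list(message)


generator = [1, 1, 0, 1]                     # generator vector 1101
codeword = create_code([1, 0, 1, 1], generator)
print("".join(map(str, codeword)))
```

For the generator vector 1101 and message vector 1011 this sketch returns the bits of 1001011, the value derived in section 3.2.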
After the FOR loop has completed, the codeword is now composed of the message vector appended by the redundant n-k bits as shown in Figure 10 below:
The image in Figure 11 shows that the codeword 1001011 was computed with the Systematic Cyclic Test Program for a generator vector of 1101 and a message vector of 1011. This matches the theoretical value determined in the example of section 3.2.
Without going into a detailed explanation, we can see that the received vector is equal to the transmitted codeword vector, which gives a syndrome vector of all zeros, indicating that no errors were detected. However, if we receive a bit in error, as shown in Figure 12, the syndrome will have a nonzero value, indicating that an error has been detected.
4.2) Hardware Simulation
To simulate the systematic cyclic codes in hardware, a trial version of a software package called Design Explorer 99 SE developed by Protel International was used. We again use the example discussed in section 3.2 to construct the circuit shown in Figure 13 . All the parts shown in the schematic are SPICE simulation ready components within the Design Explorer software. The logic chips are all Texas Instruments components.
The first part of the circuit is designed to load in the message vector and to initialize the D flip-flops. Part U8 is a 74LS165 parallel-load 8-bit shift register, which loads the message vector when the Trigger signal is low and then shifts its contents out serially on each positive clock edge after the Trigger signal has gone high again. The Trigger signal is a pulse wave that was specially constructed to initialize the system. It is used in conjunction with the AND gates (part U1) to make the D flip-flops load a zero value on the first clock pulse. After the first clock pulse, the Trigger signal goes high so that the output of each AND gate follows its other input.
The D-Flip Flops (parts U3 and U4) are used to store the redundant bits. On each clock pulse the next state values are loaded. From the block diagram in Figure 7 , we can see that the generator vector value and the XOR gates determine the next state values.
Finally, we use a switching mechanism to help generate the complete codeword. Another specialized signal, called SWpulse, marks the moment when all the message bits have been shifted. SWpulse is assumed to be generated outside our immediate circuit; some form of up/down counter with supporting logic devices could generate it. Initially switches S1 and S3 are closed while switch S2 is open. This allows the message bits to be shifted into the codeword register while the loopback simultaneously produces the redundant bits. Once the message bits have all been shifted, switches S1 and S3 are opened while S2 is closed. This appends the redundant bits to the codeword register by shifting the D flip-flops. It also stops the loopback signal, since S1 is open. The resistors, R3 and R4, tie the open connections to ground whenever a switch is left open.
The circuit is simulated with the input values of U8 set to E=1, F=0, G=1, H=1. The shift register shifts out the value of H first, followed by G, F, and E. In this manner, the message vector 1011 is passed through the circuit. We are mostly interested in the Codeword signal shown on the schematic. When looking at the output waveforms, it is important to note that the Codeword signal read from left to right corresponds to a software-generated codeword read from right to left. This can be confusing, but it is due to the manner in which time is displayed. For instance, reading the Codeword waveform shown in Figure 14 from left to right, at each positive clock edge the value is +5 V, +5 V, 0 V, +5 V, 0 V, 0 V, +5 V, which translates to logic 1,1,0,1,0,0,1. What we have to remember is that the Codeword waveform at the first clock pulse occupies the same position as the MSB message bit of the codeword shown in Figure 10. Likewise, the Codeword waveform at the last clock pulse occupies the same position as the LSB redundant bit of the codeword shown in Figure 10. With this in mind, we can see that the hardware and software simulations produce the same codeword given the same generator vector and message vector.
Figure 14: Hardware Simulated Codeword
4.3) Simulation Results
In any testing environment, it is important to verify the results. In this section we use the method discussed in the example of section 3.2 to compute all the theoretical codewords for a generator vector of 1101 and a four-bit message vector. We then use the Systematic Cyclic Code Test Program and the Design Explorer simulation software to compute the software and hardware simulation results for the same conditions. Figures 15 and 16 show the software and hardware simulation results for a message vector of 1100. Accounting for the time display phenomenon discussed at the end of section 4.2, we can see that the two simulations agree with the theoretical codeword of 1011100. Table 1 shows all the theoretical, software simulated, and hardware simulated codewords for these conditions.
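The theoretical codewords for all sixteen four-bit messages can be regenerated with a short script. The sketch below is an independent check under stated assumptions, not a reproduction of Table 1: polynomials are stored as integers whose bit i is the coefficient of X^i, vectors are printed lowest order first, and the generator vector 1101 is taken to mean g(X) = 1 + X + X^3. Only the two codewords quoted in the text (1001011 and 1011100) are asserted.

```python
from itertools import product

N, K = 7, 4
G_INT = 0b1011          # assumed g(X) = 1 + X + X^3 (bit i = coefficient of X^i)


def gf2_mod(value, g):
    """Remainder of value(X) divided by g(X) over GF(2), using integer bitmasks."""
    deg = g.bit_length() - 1
    while value.bit_length() - 1 >= deg:
        value ^= g << (value.bit_length() - 1 - deg)
    return value


def to_int(bits):
    """Convert a bit string written lowest order first into an integer."""
    return sum(int(c) << i for i, c in enumerate(bits))


def to_bits(value, width):
    """Convert an integer back to a bit string, lowest order first."""
    return "".join(str((value >> i) & 1) for i in range(width))


def theoretical_codeword(message_bits):
    """Return r(X) + X^(n-k) m(X) as a bit string, lowest order first."""
    shifted = to_int(message_bits) << (N - K)
    return to_bits(gf2_mod(shifted, G_INT) | shifted, N)


assert theoretical_codeword("1011") == "1001011"
assert theoretical_codeword("1100") == "1011100"

for bits in product("01", repeat=K):
    message = "".join(bits)
    print(message, "->", theoretical_codeword(message))
```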
