Abstract
INTRODUCTION
As their name suggests, low-density parity-check (LDPC) codes are block codes with parity-check matrices that contain only a very small number of non-zero entries. They were proposed by Gallager in 1962 [1] and have gained popularity due to their capacity-approaching error-correcting performance [2]. In LDPC codes the sparseness of H guarantees both a decoding complexity that increases only linearly with the code length and a minimum distance that also increases linearly with the code length. However, finding a sparse parity-check matrix for an existing code is not practical. Instead, LDPC codes are designed by constructing a sparse parity-check matrix first and then determining a generator matrix for the code afterwards.
In order to reduce encoding complexity, LDPC codes with a dual-diagonal structure have been adopted by the next-generation wireless LAN standard, IEEE 802.11n [3]. The LDPC encoding algorithm used here is the near-linear-time algorithm proposed in [4] and [5]. An LDPC code parity-check matrix is called (wc,wr)-regular if each code bit is contained in a fixed number, wc, of parity checks and each parity-check equation contains a fixed number, wr, of code bits. An efficient encoding algorithm [6] is used to reduce the encoding complexity.
In this paper we implement the low-complexity encoder algorithm in hardware on a Xilinx Spartan 3E FPGA and simulate it using 
LDPC CONSTRUCTION
The construction of binary LDPC codes involves assigning a small number of the values in an all-zero matrix to be 1 so that the rows and columns have the required degree distribution.
The original LDPC codes presented by Gallager are regular and defined by a banded structure in H. The rows of Gallager's parity-check matrices are divided into wc sets with M/wc rows in each set. The first set of rows contains wr consecutive ones ordered from left to right across the columns (i.e. for i ≤ M/wc, the i-th row has non-zero entries in the ((i − 1)wr + 1)-th to (i·wr)-th columns). Every other set of rows is a randomly chosen column permutation of this first set. Consequently every column of H has a '1' entry once in every one of the wc sets. Since LDPC codes are often constructed pseudo-randomly, we often talk about the set (or ensemble) of all possible codes with certain parameters (for example a certain degree distribution) rather than about a particular choice of parity-check matrix with those parameters. LDPC codes are often represented in graphical form by a Tanner graph.
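The Gallager construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the construction used for the hardware encoder: the function name `gallager_parity_check`, the `seed` parameter and the example sizes are assumptions made here, and the sketch makes no attempt to avoid short cycles.

```python
import random

def gallager_parity_check(n, wc, wr, seed=0):
    """Return a (wc, wr)-regular Gallager parity-check matrix as a list of rows.

    Hypothetical helper for illustration: n is the code length, wc the column
    weight and wr the row weight. The matrix has M = n * wc / wr rows.
    """
    if n % wr != 0:
        raise ValueError("n must be a multiple of wr")
    rows_per_set = n // wr  # equals M / wc
    rng = random.Random(seed)

    # First set: row i holds wr consecutive ones in columns i*wr .. (i+1)*wr - 1
    # (0-indexed version of the ((i-1)wr + 1)-th to (i*wr)-th columns).
    first_set = [
        [1 if i * wr <= j < (i + 1) * wr else 0 for j in range(n)]
        for i in range(rows_per_set)
    ]

    H = [row[:] for row in first_set]
    # Every other set is a random column permutation of the first set.
    for _ in range(wc - 1):
        perm = list(range(n))
        rng.shuffle(perm)
        H.extend([[row[perm[j]] for j in range(n)] for row in first_set])
    return H

H = gallager_parity_check(n=12, wc=3, wr=4)
row_weights = [sum(row) for row in H]
col_weights = [sum(row[j] for row in H) for j in range(12)]
```

For these example parameters every entry of `row_weights` is wr and every entry of `col_weights` is wc, which is the regularity property stated in the text.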
The Tanner graph, as shown in Figure 1, consists of two sets of vertices: n vertices for the code word bits (called bit nodes), and m vertices for the parity-check equations (called check nodes). An edge joins a bit node to a check node if that bit is included in the corresponding parity-check equation, and so the number of edges in the Tanner graph is equal to the number of ones in the parity-check matrix.
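As a small illustration of this correspondence, the sketch below builds the two adjacency lists of a Tanner graph from a parity-check matrix. The function name `tanner_graph` and the 4 × 6 matrix `H_example` are assumptions chosen for the example; they are not taken from Figure 1.

```python
def tanner_graph(H):
    """Return (bit_to_checks, check_to_bits) adjacency lists for H.

    H is a list of rows over {0, 1}. Bit node j is joined to check node i
    exactly when H[i][j] == 1.
    """
    n = len(H[0])
    check_to_bits = [[j for j, h in enumerate(row) if h] for row in H]
    bit_to_checks = [[i for i, row in enumerate(H) if row[j]] for j in range(n)]
    return bit_to_checks, check_to_bits

# Illustrative parity-check matrix (an assumption, not the paper's matrix).
H_example = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
]

bit_to_checks, check_to_bits = tanner_graph(H_example)
num_edges = sum(len(bits) for bits in check_to_bits)
num_ones = sum(sum(row) for row in H_example)
```

Here `num_edges` and `num_ones` agree, as the text states for any parity-check matrix.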
Fig 1: The Tanner graph representation of the parity-check matrix. A 6-cycle is shown in bold.
A cycle in a Tanner graph is a sequence of connected vertices that starts and ends at the same vertex in the graph and contains other vertices no more than once. The length of a cycle is the number of edges it contains, and the girth of a graph is the length of its shortest cycle. The MacKay-Neal construction method for LDPC codes can be adapted to avoid cycles of length 4, called 4-cycles, by checking each pair of columns in H to see if they overlap in two places. The construction of 4-cycle free codes using this method is given in Algorithm 1. The input is the code length n, the rate r, and the column and row degree distributions v and h. The vector α is a length-n vector which contains an entry i for each column in H of weight i, and the vector β is a length-m vector which contains an entry i for each row in H of weight i.
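The pairwise column test mentioned above can be sketched as follows. This shows only the 4-cycle check, not the full construction of Algorithm 1; the function name `has_four_cycle` and both test matrices are assumptions made for illustration.

```python
from itertools import combinations

def has_four_cycle(H):
    """Return True if any two columns of H share ones in two or more rows.

    Two columns overlapping in two places correspond to a cycle of
    length 4 in the Tanner graph of H.
    """
    n = len(H[0])
    # Support of column j: the set of rows in which it has a one.
    supports = [{i for i, row in enumerate(H) if row[j]} for j in range(n)]
    return any(len(a & b) >= 2 for a, b in combinations(supports, 2))

# Illustrative matrices (assumptions, not taken from the paper).
H_free = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
]
H_cyclic = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]

free_result = has_four_cycle(H_free)
cyclic_result = has_four_cycle(H_cyclic)
```

In a column-by-column construction, a candidate column would be rejected and redrawn whenever adding it makes this check succeed.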
Algorithm 1: H Matrix Generation

ENCODING USING GENERATOR MATRIX
For a linear block code, the sum of any two code words results in another code word. LDPC codes are constructed in the same way as linear block codes: from a given parity-check matrix H, a generator matrix G is derived. Data $m = [m_1, m_2, \ldots, m_k]$ is encoded by multiplying it by the generator matrix, $c = mG$, where m is a string of k information bits. Note that once H is put in systematic form, $H = [P^T \mid I_M]$, it no longer has fixed column or row weights, and P is very likely to be dense. The density of P determines the computational complexity of the encoder: a dense generator matrix requires a large number of operations in the matrix multiplication with the data to be sent. For some codes the encoding complexity can be reduced by pre-processing the parity-check matrix. An efficient encoding technique has been developed that reduces the encoding complexity by rearranging the parity-check matrix before encoding.
The encoding complexity also depends on the structure of the code. Constructions of LDPC codes fall into two main categories: random constructions and structured constructions. The type of construction is determined by the connections between check nodes and variable nodes in the Tanner graph, and each type has its advantages over the other.
Random constructions have unstructured row-column connections in the parity-check matrix, with no predefined pattern. Random codes perform better than structured codes for long code lengths, and they are used when we want to increase the girth or the rate for a given code size. However, long random LDPC codes require large memory storage in a practical implementation, which affects the computational efficiency of the code. Because random constructions cannot guarantee asymptotically optimum performance, structured constructions of LDPC codes are also used.
Structured construction methods put constraints on the row-column connections to obtain a desired or predefined connection pattern that is easier to implement in hardware.
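The generator-matrix encoding c = mG described in this section can be sketched as below for a systematic code. The sketch assumes G = [I_k | P], which pairs with H = [P^T | I_M] over GF(2); the function names and the 3 × 3 matrix `P` are hypothetical and chosen only to keep the example small.

```python
def encode_systematic(message, P):
    """Compute c = mG over GF(2) for G = [I_k | P].

    P is a k x M list of rows; the code word is the message followed by
    M parity bits. The cost grows with the number of ones in P.
    """
    k, M = len(P), len(P[0])
    if len(message) != k:
        raise ValueError("message length must equal the number of rows of P")
    parity = [sum(message[i] & P[i][j] for i in range(k)) % 2 for j in range(M)]
    return list(message) + parity

def syndrome(codeword, P):
    """Compute H c^T over GF(2) for H = [P^T | I_M]."""
    k, M = len(P), len(P[0])
    return [
        (sum(P[i][j] & codeword[i] for i in range(k)) + codeword[k + j]) % 2
        for j in range(M)
    ]

# Illustrative parity part (an assumption, not a matrix from the paper).
P = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]

c1 = encode_systematic([1, 0, 1], P)
c2 = encode_systematic([0, 1, 1], P)
c_sum = [a ^ b for a, b in zip(c1, c2)]  # sum of two code words
```

Both `c1` and `c_sum` give an all-zero syndrome, which illustrates the linearity property stated at the start of the section.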
LDPC Encoding Example
LINEAR-TIME ENCODING FOR LDPC CODES
Instead of finding a generator matrix for H, an LDPC code can be encoded using the parity-check matrix directly by transforming it into upper triangular form and using back substitution. The idea is to do as much of the transformation as possible using only row and column permutations so as to keep as much of H as possible sparse.
Firstly, using only row and column permutations, the parity-check matrix is put into approximate lower triangular form:
$$H_t = \begin{bmatrix} A & B & T \\ C & D & E \end{bmatrix}$$
where the matrix T is a lower triangular matrix (that is, T has ones on the diagonal from left to right and all entries above the diagonal zero) of size 
Once E has been cleared by Gaussian elimination, the parity bits in $p_1$ depend only on the message bits u, and so can be calculated independently of the parity bits in $p_2$. Writing $\tilde{C}$ and $\tilde{D}$ for the blocks C and D after E has been cleared, if $\tilde{D}$ is invertible, $p_1$ can be found from
$$p_1 = -\tilde{D}^{-1}\tilde{C}u \qquad (3)$$
If $\tilde{D}$ is not invertible the columns of H can be permuted until it is. By keeping the gap g (the number of rows of E) as small as possible, the added complexity burden of the matrix multiplication in Equation (3) is kept low. Once $p_1$ is known, $p_2$ can be found from
$$p_2 = -T^{-1}(Au + Bp_1) \qquad (4)$$
where the sparseness of A, B and T can be employed to keep the complexity of this operation low and, as T is lower triangular, $p_2$ can be found using back substitution.
Step 1
Instead of putting H into reduced row-echelon form we put it into approximate lower triangular form using only row and column swaps. For this H we swap the 2nd and 3rd rows and the 6th and 10th columns to obtain $H_t$.
Step 3
We partition the length-10 codeword $c = [c_1, c_2, \ldots, c_{10}]$ as $c = [u, p_1, p_2]$, where $p_1 = [c_6, c_7]$ and $p_2 = [c_8, c_9, c_{10}]$. The bits in $p_1$ are calculated from the message using Equation (3). As T is lower triangular, the bits in $p_2$ can then be calculated using back substitution.
Again column permutations were used to obtain $H_t$ from H, and so either $H_t$, or H with the same column permutation applied, will be used at the decoder.
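The two-stage computation of p1 and p2 can be sketched in Python as follows. This is an illustrative software model under stated assumptions, not the FPGA implementation: the block names A, B, T, C, D, E follow the approximate lower triangular partition, all helper names are hypothetical, the small example blocks are invented for the demonstration, and E is cleared implicitly by solving with T rather than by forming a new matrix. Over GF(2) the minus signs in Equations (3) and (4) have no effect.

```python
def matvec(M, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in M]

def add(x, y):
    """Vector addition over GF(2)."""
    return [a ^ b for a, b in zip(x, y)]

def solve_lower(T, b):
    """Solve T x = b over GF(2) by substitution (T lower triangular, unit diagonal)."""
    x = []
    for i, row in enumerate(T):
        acc = b[i]
        for j in range(i):
            acc ^= row[j] & x[j]
        x.append(acc)
    return x

def solve_dense(M, b):
    """Solve M x = b over GF(2) by Gauss-Jordan elimination (small g x g system)."""
    g = len(M)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(g):
        pivot = next((r for r in range(col, g) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(g):
            if r != col and aug[r][col]:
                aug[r] = [a ^ c for a, c in zip(aug[r], aug[col])]
    return [row[g] for row in aug]

def encode(A, B, T, C, D, E, u):
    """Return c = [u, p1, p2] for H_t = [[A, B, T], [C, D, E]]."""
    g = len(D)
    # D~ = E T^-1 B + D, built one column at a time.
    cols = [
        add(matvec(E, solve_lower(T, [row[j] for row in B])), [row[j] for row in D])
        for j in range(g)
    ]
    D_tilde = [[cols[j][i] for j in range(g)] for i in range(g)]
    # C~ u = E T^-1 (A u) + C u.
    Au = matvec(A, u)
    Cu_tilde = add(matvec(E, solve_lower(T, Au)), matvec(C, u))
    p1 = solve_dense(D_tilde, Cu_tilde)           # Equation (3)
    p2 = solve_lower(T, add(Au, matvec(B, p1)))   # Equation (4)
    return list(u) + p1 + p2

# Illustrative blocks with gap g = 1 (assumptions, not the paper's example).
A = [[1, 1, 0], [0, 1, 1]]
B = [[1], [0]]
T = [[1, 0], [1, 1]]
C = [[1, 0, 1]]
D = [[1]]
E = [[1, 1]]

codeword = encode(A, B, T, C, D, E, [1, 0, 1])
```

Only the g × g system for p1 is dense; every other step is a sparse product or a substitution, which is why keeping g small keeps the encoder close to linear time.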
DESIGN & IMPLEMENTATION
The encoder is implemented in hardware on a Xilinx Spartan 3E FPGA starter kit. Figure 2 shows the flow diagram of the encoder implementation. We have implemented the rate-½ encoder for parity-check matrices of size 4×8, 16×32, 32×64 and 64×128.
IMPLEMENTATION RESULTS
We have simulated the LDPC encoder and the log-domain decoder algorithm in Matlab, and the results are verified in both simulation and implementation. Figure 4 shows the Matlab simulation results. We have implemented a linear-time encoder for LDPC codes. This algorithm is implemented on a Xilinx Spartan 3E board using ISE 13.1 and the Xilinx Vivado HLS high-level synthesis tool.
The synthesis results for the Spartan 3E FPGA are shown in Figure 3.
Fig 3: Device Utilization for Spartan 3E FPGA
