



Available online at www.sciencedirect.com



Procedia Computer Science 79 (2016) 765 – 771



# 7th International Conference on Communication, Computing and Virtualization 2016

# VLSI Implementation of a Rate Decoder for Structural LDPC Channel Codes

<sup>a</sup>Sandeep Kakde, <sup>b</sup>Atish Khobragade \*a

<sup>a</sup>Department of Electronics Engineering, Y C College of Engineering, Nagpur,441 110 India <sup>b</sup>Department of Electronics Engineering, Rajiv Gandhi College of Engineering, Nagpur,441 110 India

#### Abstract

This paper proposes a low-complexity low-density parity-check (LDPC) decoder design. The design combines a message passing algorithm with a systolic high-throughput architecture. The approach is based on the observation that nodes with a high log-likelihood ratio provide almost the same information in every iteration and can be considered stationary. We propose an algorithm in which the parity check matrix H is updated to a reduced-complexity form every time a stationary node is encountered, which results in fewer numerical computations in subsequent iterations. The paper focuses on computational complexity; the decoder also benefits in throughput from improvements introduced at various levels of abstraction in the design. The LDPC decoder is implemented with the Threshold Controlled Min Sum Algorithm for a code compliant with wired and wireless applications. The resulting high-performance LDPC decoder achieves a throughput of 0.890 Gbps. The whole LDPC decoder is designed, simulated and synthesized using the Xilinx ISE 13.1 EDA tool.

© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Organizing Committee of ICCCV 2016

Keywords: High Throughput, LDPC Decoder, Message Passing Algorithm, Low Complexity.

# 1. Introduction

Implementing high-throughput LDPC decoders with low area and power on a silicon chip for practical applications is a challenge. High-throughput emulation allows analysis at low bit error rates (BER). The excellent error-correction performance of LDPC codes is observed up to a moderate BER. Low-density parity-check (LDPC) codes have been demonstrated to perform very close to the Shannon limit when decoded iteratively using a message-passing algorithm. LDPC codes have been chosen for forward error correction in applications including digital video broadcasting (DVB-S2), 10 Gigabit Ethernet (10GBASE-T), broadband wireless access (WiMAX), wireless local area networks, deep-space communications, magnetic storage in hard disk drives, and high-end processors. In high-throughput applications, error floors are a major factor limiting the deployment of LDPC codes. An approximation of the sum-product algorithm is often used, which greatly reduces the implementation complexity but degrades decoding performance. Communication and memory systems require very fast, low-complexity error correcting schemes. Among existing forward error correction decoding algorithms for LDPC codes on the Binary Symmetric Channel, the bit flipping algorithms are the least complex and possess desirable error correcting abilities.

The near-Shannon-limit performance of LDPC codes [7] under decoding algorithms such as sum-product (SP) [8], min-sum, and modified min-sum [3] has helped LDPC codes to be adopted in various digital communication standards such as DVB-S2, WiMAX (802.16e), Wi-Fi (802.11n), 10 Gbit Ethernet (802.3an) and others. However, due to the logarithmic and exponential functions involved in their decoding algorithms, LDPC decoders are often criticized for their large memory requirement and long convergence time. Various VLSI hardware architectures have been proposed for real-time processing of decoding methods based on the Belief Propagation and min-sum algorithms. However, parallel architectures usually result in excessive implementation cost, while serial architectures are too slow for most applications. Apart from memory-related requirements, flexibility with respect to code rate and length is also important. Due to these design and implementation problems, LDPC is still considered an optional feature in wireless standards for mobile networks such as IEEE 802.11n and WiBro.

\* Sandeep Kakde. Tel.: +91 9730675845. *E-mail address*: sandip.kakde@gmail.com

#### 2. Related Work

In a low-precision implementation, the error floors are dominated by fixed-point decoding effects, whereas in a higher-precision implementation the errors are attributed to special configurations within the code, whose effect is worse in a fixed-point decoder [2]. This paper explores critical issues in realistic LDPC decoder design using a simulation-based approach. The main contribution is to shed light on the throughput, which is influenced both by intrinsic properties of the code and by aspects of the quantization scheme. Conventional quantization schemes applied to an array-based LDPC code can induce low-weight weak absorbing sets and, as a result, elevate the error floor [4]. An adaptive quantization approach improves the fidelity of extrinsic messages and channel likelihoods, and it performs well even with very few iterations. Quantization has a significant effect on the composition of absorbing sets in the error floor region.
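To make the role of quantization concrete, the following is a minimal sketch of a uniform fixed-point quantizer with saturation applied to a log-likelihood ratio. The function name `quantize_llr`, the 5-bit word length and the 2 fractional bits are illustrative assumptions; they are not values taken from [2] or [4], and this is not the adaptive scheme discussed above.

```python
def quantize_llr(llr, total_bits=5, frac_bits=2):
    """Uniformly quantize an LLR to a signed fixed-point value with saturation.

    total_bits and frac_bits are illustrative assumptions, not values from the paper.
    """
    step = 2.0 ** -frac_bits                 # quantization step size
    max_level = 2 ** (total_bits - 1) - 1    # symmetric range, e.g. -15..15 for 5 bits
    level = round(llr / step)                # nearest quantization level
    level = max(-max_level, min(max_level, level))  # saturate large magnitudes
    return level * step


small = quantize_llr(0.3)     # falls on the nearest multiple of the step
large = quantize_llr(10.0)    # saturates at the largest representable value
```

With a coarse step, small messages collapse onto a few levels, and with a short word length, reliable messages saturate; both effects change the messages that the decoder exchanges.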

#### 3. Taxonomy of Message Passing Algorithms

**Girth:** A cycle of length l in a Tanner graph is a path of l distinct edges that closes on itself. The girth of a Tanner graph is the minimum cycle length of the graph. The shortest possible cycle in a Tanner graph has length 4.

**Regular Vs Irregular LDPC Codes:** If every row of the sparse matrix has the same weight and every column has the same weight, the matrix is called a regular LDPC code matrix; otherwise it is irregular.
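Both properties can be checked directly on a parity check matrix. The sketch below is illustrative only: it uses the 5 x 7 example matrix of Fig. 1(a), the helper names `has_four_cycle` and `is_regular` are hypothetical, and regularity is taken in the usual sense that all rows share one weight and all columns share one weight.

```python
from itertools import combinations

# Example parity check matrix (the 5 x 7 matrix of Fig. 1(a)).
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]


def has_four_cycle(H):
    """Return True if two rows share a 1 in at least two columns (a length-4 cycle)."""
    for row_a, row_b in combinations(H, 2):
        shared = sum(a & b for a, b in zip(row_a, row_b))
        if shared >= 2:
            return True
    return False


def is_regular(H):
    """Return True if all rows have one common weight and all columns have one common weight."""
    row_weights = {sum(row) for row in H}
    col_weights = {sum(col) for col in zip(*H)}
    return len(row_weights) == 1 and len(col_weights) == 1


four_cycle_present = has_four_cycle(H)
regular = is_regular(H)
```

For this example the first two rows overlap in two columns, so the girth is 4, and the second column has weight 2 while the others have weight 3, so the matrix is irregular under this definition.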

**Bipartite Graph:** A Tanner graph is a bipartite graph that describes the parity check matrix H. There are two types of nodes namely:

**Variable-nodes:** Correspond to bits of the code word or equivalently, to columns of the parity check matrix. There are *n* v-nodes.

**Check-nodes:** Correspond to parity check equations or equivalently, to rows of the parity check matrix. There are m = n-k c-nodes.

**Bipartite** means that nodes of the same type cannot be connected; for example, a v-node cannot be connected to another v-node. The i<sup>th</sup> check node is connected to the j<sup>th</sup> variable node if the (i,j)<sup>th</sup> element of the parity check matrix is one, i.e. if $h_{ij} = 1$. All of the v-nodes connected to a particular c-node must sum (modulo 2) to zero. Example: a sparse parity check matrix H is shown in Fig. 1(a).



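$$H = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{bmatrix}$$

[Tanner graph figure: variable nodes VN1 to VN7]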

Fig. 1. (a) Sparse Parity Check H-Matrix (b) Tanner Graph

The corresponding Tanner Graph is as shown in Fig 1(b).
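The mapping from H to its Tanner graph, and the rule that the bits attached to each check node sum to zero modulo 2, can be sketched as follows. The function names `tanner_graph` and `syndrome` are hypothetical, and indices are 0-based, so VN1 corresponds to index 0.

```python
# Parity check matrix of Fig. 1(a); rows are check nodes, columns are variable nodes.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]


def tanner_graph(H):
    """Return (check_to_vars, var_to_checks) adjacency lists built from H."""
    m, n = len(H), len(H[0])
    check_to_vars = [[j for j in range(n) if H[i][j]] for i in range(m)]
    var_to_checks = [[i for i in range(m) if H[i][j]] for j in range(n)]
    return check_to_vars, var_to_checks


def syndrome(H, word):
    """Return the modulo-2 sum of the bits attached to each check node."""
    return [sum(h * x for h, x in zip(row, word)) % 2 for row in H]


check_to_vars, var_to_checks = tanner_graph(H)
valid = syndrome(H, [0, 1, 0, 0, 1, 1, 0])      # a codeword of this H
corrupted = syndrome(H, [1, 1, 0, 0, 1, 1, 0])  # same word with the first bit flipped
```

A word is a codeword exactly when every entry of its syndrome is zero; flipping the first bit violates the three checks connected to VN1.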

### 4. Iterative Decoding Algorithms

# 4.1. Belief Propagation Algorithm (BPA)

A binary LDPC code [1], [2] is a linear block code described by a sparse parity-check matrix H. A bipartite graph with check nodes in one class and symbol (variable) nodes in the other can be created using H as its incidence matrix. Such a graph is known as the Tanner graph. An LDPC code is called $(d_v, d_c)$-regular if, in its bipartite graph, every symbol node is connected to $d_v$ check nodes and every check node is connected to $d_c$ symbol nodes; otherwise, it is called an irregular LDPC code. The graphical representation of LDPC codes is attractive because it not only helps in understanding their parity-check structure but, more importantly, also facilitates a powerful decoding approach. The key decoding steps are the local application of Bayes' rule at each node and the exchange of the results ("messages") with neighbouring nodes. At any given iteration, two types of messages are passed: probabilities or "beliefs" from symbol nodes to check nodes, and probabilities or "beliefs" from check nodes to symbol nodes. In density evolution, we keep track of the message densities rather than the messages themselves. At each iteration, we average over all of the edges, which are connected by a permutation. We assume that the all-zeros codeword was transmitted, which requires that the channel be symmetric.
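The two message types can be illustrated in the log-likelihood ratio (LLR) domain, where the check-node rule takes the well-known tanh form. This is a minimal sketch rather than the decoder of this paper: the function names are hypothetical, a positive LLR is assumed to favour bit 0, and the clamping constant is an assumption added only to keep `atanh` finite.

```python
import math


def check_node_update(incoming):
    """Sum-product (tanh rule): the message on each edge uses all other incoming edges."""
    outgoing = []
    for k in range(len(incoming)):
        product = 1.0
        for idx, llr in enumerate(incoming):
            if idx != k:
                product *= math.tanh(llr / 2.0)
        # Clamp (assumed constant) so that atanh stays finite for very reliable inputs.
        product = max(-0.999999, min(0.999999, product))
        outgoing.append(2.0 * math.atanh(product))
    return outgoing


def variable_node_update(channel_llr, incoming):
    """Each outgoing message is the channel LLR plus all other incoming check messages."""
    total = channel_llr + sum(incoming)
    return [total - msg for msg in incoming]


to_vars = check_node_update([2.0, -3.0, 4.0])
to_checks = variable_node_update(1.0, [0.5, -0.25])
```

Each check-to-variable message carries the product of the signs of the other inputs and a magnitude smaller than the least reliable of them, which is the behaviour that min-sum type algorithms approximate.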

#### 4.2. Threshold Controlled Min Sum Algorithm

This section presents the proposed transformation of the Min-Sum algorithm formulation. The Belief Propagation Algorithm involves complex check-node computations and is difficult to implement in hardware. M. P. C. Fossorier first introduced the Min-Sum (MS) algorithm by simplifying the check-node computation in the Belief Propagation Algorithm [17]. Although the MS algorithm decreases the computational complexity, it sacrifices too much decoding performance. Subsequently, much design work has been done to modify the MS algorithm to achieve a lower bit error rate and better performance [5-6]. Chen then introduced the Normalized Min-Sum (NMS) and Offset Min-Sum (OMS) algorithms [20]. In these two algorithms, a normalization factor and an offset factor, respectively, are applied to the check-node update equation to improve the decoding performance. However, as mentioned in Ref. [21], the normalization factor is determined by the magnitude of the minimum value, and the decoding performance degrades severely when the output is close to zero. The offset factor is set before decoding and does not take the output value of each iteration into account, so the decoding performance is suboptimal. In Ref. [21], the authors introduce the modified OMS algorithm to solve the above problems and achieve some improvements.

We first introduce some conventional definitions and notation following the open literature. LDPC codes are binary linear block codes that have a low-density $M \times N$ parity check matrix H. The LDPC encoder encodes a K-bit binary input message $(x_0, x_1, ..., x_{K-1})$ into an N-bit systematic LDPC codeword $X = (x_0, x_1, ..., x_{K-1}, x_K, ..., x_{N-1})$. A valid codeword $X \in C$ has to satisfy

$$H X^{T} = 0$$

Here, H is the parity check matrix and C is the set of valid codewords. Each column in H is associated with a bit of the codeword and each row corresponds to a parity check. In the Tanner graph, the codeword bits are shown as the variable nodes (VN) and the parity checks as the check nodes (CN). A VN is connected by an edge to a CN in the Tanner graph if and only if the corresponding codeword bit takes part in the corresponding parity check equation.

The proposed algorithm is based on the observation that nodes with a high log-likelihood ratio provide almost the same information in every iteration and can be considered stationary. We propose an algorithm in which the parity check matrix H is updated to a reduced-complexity form every time a stationary node is encountered, which results in fewer numerical computations in subsequent iterations. We concentrate not only on complexity; the design also benefits in throughput from improvements introduced at various levels of abstraction in the decoder design.
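The text does not give the update equations or the threshold value, so the following is only one possible reading of the stationary-node idea, sketched with a plain min-sum check update. The function name `tc_min_sum_decode`, the default threshold of 8.0, the flooding schedule, and the convention that a positive LLR favours bit 0 are all assumptions. A variable node whose posterior magnitude reaches the threshold is frozen: its column is dropped from the working copy of H and its bit is folded into the parity of the checks it touched.

```python
def tc_min_sum_decode(H, channel_llr, threshold=8.0, max_iter=20):
    """Min-sum decoding that freezes 'stationary' nodes (assumed interpretation)."""
    m, n = len(H), len(H[0])
    rows = [[j for j in range(n) if H[i][j]] for i in range(m)]  # active edges of H
    parity = [0] * m      # parity contributed to each check by frozen bits
    frozen = {}           # variable index -> fixed hard decision
    c2v = {(i, j): 0.0 for i in range(m) for j in rows[i]}
    posterior = list(channel_llr)
    hard = [1 if llr < 0 else 0 for llr in posterior]
    for _ in range(max_iter):
        # Freeze stationary nodes and remove their columns from the working matrix.
        for j in range(n):
            if j not in frozen and abs(posterior[j]) >= threshold:
                frozen[j] = hard[j]
                for i in range(m):
                    if j in rows[i]:
                        rows[i].remove(j)
                        parity[i] ^= hard[j]
        # Variable-to-check messages on the remaining edges.
        v2c = {(i, j): posterior[j] - c2v[(i, j)]
               for i in range(m) for j in rows[i]}
        # Min-sum check update: sign product (including frozen parity) times minimum magnitude.
        for i in range(m):
            for j in rows[i]:
                others = [v2c[(i, k)] for k in rows[i] if k != j]
                negatives = sum(1 for msg in others if msg < 0) + parity[i]
                sign = -1.0 if negatives % 2 else 1.0
                magnitude = min((abs(msg) for msg in others), default=threshold)
                c2v[(i, j)] = sign * magnitude
        # Posterior LLRs and hard decisions for the active nodes.
        for j in range(n):
            if j not in frozen:
                posterior[j] = channel_llr[j] + sum(
                    c2v[(i, j)] for i in range(m) if j in rows[i])
        hard = [frozen.get(j, 1 if posterior[j] < 0 else 0) for j in range(n)]
        if all(sum(h * x for h, x in zip(row, hard)) % 2 == 0 for row in H):
            break
    return hard, sorted(frozen)


H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]
# Codeword 0100110 of the Fig. 1(a) matrix, with the first bit received weakly in error.
received = [-0.5, -2.0, 2.0, 2.0, -2.0, -2.0, 2.0]
decoded, frozen_nodes = tc_min_sum_decode(H, received)
decoded_low, frozen_low = tc_min_sum_decode(H, received, threshold=1.5)
```

With the lower threshold, six of the seven nodes are frozen before the first check update, so only three edges remain active. This shows the intended saving in computation, at the risk of freezing a wrong decision if the threshold is set too low.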

#### 5. Hardware Implementation Results



Fig. 2. Decoding Architecture

The LDPC decoder design is implemented using the Threshold Controlled Min Sum Algorithm. A serial approach leads to low-cost, low-power implementations and offers a high level of flexibility with respect to the supported code. However, serial architectures have not received much attention, because sequential processing does not achieve high throughput. The input vector used here is 4 bits long; its length can be increased to the block lengths used for specific applications. The systolic, fully parallel high-throughput decoding architecture is shown in Fig. 2. The device utilization summary and the throughput comparison are shown in Table 1 and Table 2, respectively.






Fig. 3. (a) RTL View; (b) Technological View.

Table 1: Device Utilization Summary

FPGA Family: Xilinx. Target Device: xc3s200-5pq208. Algorithm: Threshold Controlled Min Sum Algorithm. Channel Model: AWGN. Modulation Scheme: BPSK.

| Logic Utilization                              | Used  | Available | Utilization |
|------------------------------------------------|-------|-----------|-------------|
| 4 Input LUTs                                   | 11    | 3840      | 1 %         |
| Slices                                         | 6     | 1920      | 1 %         |
| Number of Slices containing only related logic | 6     | 6         | 100 %       |
| No. of Bonded IOBs                             | 6     | 141       | 4 %         |
| Average Fan-out of Non-Clock Nets              | 2.56  |           |             |
| Delay (ns) (Clock to out pin)                  | 8.929 |           |             |

Table 2: Throughput Comparison

| Design Parameter (LDPC Decoder) | [16]     | This Work |
|---------------------------------|----------|-----------|
| Throughput                      | 522 Mbps | 890 Mbps  |

## Conclusion

In this paper, we have designed an LDPC decoder based on the Threshold Controlled Min Sum Algorithm. In fact, simplified reduced-complexity decoding schemes sometimes can outperform the MAP decoding algorithm. The proposed algorithm shows an efficient high-level approach to designing the variable node processing unit (VNPU) and check node processing unit (CNPU) blocks for the modified MSA. The resulting high-performance LDPC decoder achieves a throughput of 0.890 Gbps. The decoder can be configured to support multiple 3G and 4G wireless standards. LDPC codes have been selected in a number of next-generation wireless standards for forward error correction, and implementing a flexible VLSI architecture while satisfying silicon area, latency and dynamic power metrics is still a demanding task. The presented Threshold Controlled Min Sum Algorithm based decoder architecture can decode any structured or unstructured LDPC channel code. Furthermore, we implemented the proposed decoder architecture on an FPGA and showed simulation results using Xilinx ISE 13.1. The whole LDPC decoder design is implemented in Verilog and downloaded to a Virtex-VI FPGA kit.

# References

- 1. R. G. Gallager, "Low density parity check codes," IRE Trans. Inform. Theory, vol. IT-8, pp. 21-28, Jan. 1962.
- 2. R. Tanner, A recursive approach to low complexity codes, Information Theory, IEEE Transactions on 27 (5) (1981) 533-547.
- 3. D. J. C. MacKay and R. M. Neal, "Near Shannon limit performance of low density parity check codes," Electron. Lett., vol. 32, no. 18, pp. 1645-1646, 1996.
- 4. Zhengya Zhang, "Design of LDPC Decoders for Improved Low Error Rate Performance: Quantization and Algorithm Choices", IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 8, NO. 11, NOVEMBER 2009.
- 5. Zhengya Zhang, Lara Dolecek, Borivoje Nikolić, Venkat Anantharam, and Martin Wainwright, "Investigation of Error Floors of Structured Low-Density Parity-Check Codes by Hardware Emulation", 1-4244-0357-X/06 © 2006 IEEE.
- 6. Lara Dolecek, Zhengya Zhang, Venkat Anantharam, Martin J. Wainwright, Borivoje Nikolić, "Analysis of Absorbing Sets and Fully Absorbing Sets of Array-Based LDPC Codes", IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 1, JANUARY 2010.
- 7. Zhengya Zhang, Lara Dolecek, Martin Wainwright, Venkat Anantharam, and Borivoje Nikolić, "Quantization Effects in Low-Density Parity-Check Decoders", 1-4244-0353-7/07 © 2007 IEEE.
- 8. Achilleas Anastasopoulos, "A comparison between the sum-product and the min-sum iterative detection algorithms based on density evolution", 0-7803-7206-9/01 © 2001 IEEE.
- 9. Jinghu Chen, Ajay Dholakia, Evangelos Eleftheriou, "Reduced-Complexity Decoding of LDPC Codes", IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 53, NO. 8, AUGUST 2005.
- 10. Fabian Angarita, Javier Valls, Vicenc Almenar, and Vicente Torres, "Reduced-Complexity Min-Sum Algorithm for Decoding LDPC Codes With Low Error-Floor", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: REGULAR PAPERS, VOL. 61, NO. 7, JULY 2014.
- 11. Jinghu Chen, Ajay Dholakia, Evangelos Eleftheriou, Marc P. C. Fossorier, and Xiao-Yu Hu, "Reduced-Complexity Decoding of LDPC Codes", IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 53, NO. 8, AUGUST 2005.
- 12. Jingyu Kang, Qin Huang, Shu Lin, and Khaled Abdel-Ghaffar, "An Iterative Decoding Algorithm with Backtracking to Lower the Error-Floors of LDPC Codes", IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 1, 0090-6778/11 © 2011 IEEE , JANUARY 2011.
- 13. Brian K. Butler, Paul H. Siegel, "Error Floor Approximation for LDPC codes in AWGN Channel", IEEE TRANSACTIONS ON INFORMATION THEORY, VERSION: JUNE 4, 2013.
- 14. Xiaojie Zhang and Paul H. Siegel," Quantized Min-Sum Decoders with Low Error Floor for LDPC Codes", IEEE International Symposium on Information Theory Proceedings, 2012.
- 15. Y. Han and W. E. Ryan, "Low-floor decoder for LDPC codes," IEEE Trans. Commun., vol. 57, no. 6, pp. 1663-1673, June 2009.
- 16. Sachin Singh Khati, Pooja Bisht, Subhash Chandra Pujari, "Improved Decoder Design for LDPC Codes based on selective node processing", 2012 World Congress on Information and Communication Technologies, pp. 413-419, 978-1-4673-4805-8/12 © 2012 IEEE.
- 17. M. P. C. Fossorier, M. Mihaljevic, and H. Imai. Reduced complexity iterative decoding of low density parity check codes based on belief propagation[J]. IEEE Trans. on Commun., 1999, 47(5): 673-680.
- 18. N. Kanistras, V. Paliouras, "Impact of round off error on the decisions of the Log Sum Product Algorithm for LDPC Decoding", 978-1-4244-2924-0/08 © 2008 IEEE.
- 19. L. Liu and C. J. R. Shi. Sliced message passing: High throughput overlapped decoding of high-rate low-density parity-check codes, IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 11, pp. 3697–3710, Dec. 2008.
- 20. J. Chen and M. P. C. Fossorier. Density evolution for two improved BP-based decoding algorithm of LDPC codes[J]. IEEE Communication Letters, 2002, 6(5): 208-210.
- 21. Ming Jiang, Chunming Zhao, Li Zhang, et al. Adaptive offset min-sum algorithm for low-density parity check codes[J]. IEEE Communication Letters, 2006, 10(6): 483-485.
- 22. J. Zhao, F. Zarkeshvari, and A. H. Banihashemi. On implementation of min-sum algorithm and its modifications for decoding low-density parity-check (LDPC) codes[J]. IEEE Trans. on Commun., 2005, 53(4): 549-554.
- 23. Xiaojie Zhang and Paul H. Siegel. Quantized Min-Sum Decoders with Low Error Floor for LDPC Codes. IEEE International Symposium on Information Theory Proceedings, 2012.
- 24. Brian K. Butler, Paul H. Siegel. Error Floor Approximation for LDPC codes in AWGN Channel. IEEE TRANSACTIONS ON INFORMATION THEORY, VERSION: JUNE 4, 2013.