Abstract: In this letter, a novel bit-flipping decoding scheme for systematic LDPC codes is proposed. An unsuccessfully decoded codeword is efficiently re-decoded by the candidate information bit flipping (CIBF) decoder using cyclic redundancy check (CRC) information at the end of each iteration. Adding the CIBF decoder to a conventional LDPC decoder reduces the power consumption by up to 12.7%, owing to the reduced average number of iterations, and improves the frame error rate (FER) performance. A hardware cost analysis based on a CMOS cell library shows that the additional hardware cost of the CIBF decoder is negligible compared with that of the conventional LDPC decoder.
Introduction
Low-density parity-check (LDPC) codes [1] are adopted in many communication standards such as DVB-S2, WLAN (802.11n), and WiMAX (802.16e), and improving the performance or convergence rate of the LDPC decoder with minimal additional complexity is of prime interest. Owing to their capacity-approaching performance, LDPC codes with sum-product (SP) decoding have also been adopted in data-storage systems [2, 3]. The most popular LDPC decoding algorithm is the min-sum algorithm (MSA) because of its low computational complexity and only slight loss of coding gain. To reduce the performance loss of the MSA, modified versions have been studied, such as normalized MS (NMS) decoding [4], which simply multiplies the check-to-variable node messages by a scaling factor $\alpha$ to compensate for the overestimated messages relative to the SP algorithm. Unlike the conventional SP decoder, the MS decoder does not require an SNR estimator, which reduces the decoder implementation size and the power consumption. The implementation of efficient LDPC decoders has therefore been actively studied in various ways, such as new decoding algorithms [4], smart scheduling schemes [5], and early termination schemes [6].
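The scaling step of NMS decoding can be illustrated with a minimal sketch of a single check-node update. The function name and the list-based message layout are assumptions made for illustration and are not taken from [4]; the default scaling factor 0.75 is the value used later in this letter.

```python
def nms_check_node_update(v2c, alpha=0.75):
    """Normalized min-sum update for one check node (illustrative sketch).

    v2c: list of incoming variable-to-check LLRs, one per connected edge.
    Returns the outgoing check-to-variable LLRs in the same edge order.
    """
    out = []
    for j in range(len(v2c)):
        # Extrinsic rule: use every incoming message except the one on edge j.
        others = v2c[:j] + v2c[j + 1:]
        # Sign of the outgoing message: product of the other signs.
        sign = 1.0
        for m in others:
            if m < 0:
                sign = -sign
        # Magnitude: minimum of the other magnitudes, scaled by alpha to
        # compensate for the overestimation of plain min-sum.
        out.append(alpha * sign * min(abs(m) for m in others))
    return out


msgs = nms_check_node_update([2.0, -0.5, 4.0])
print(msgs)  # [-0.375, 1.5, -0.375]
```

Setting `alpha=1.0` gives the plain min-sum update, so the only extra cost of NMS at a check node is one multiplication per outgoing message.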
In general, a belief propagation (BP)-based LDPC decoder stops when the syndrome check is passed or the maximum number of iterations is reached. However, even when the parity-check equations (syndrome check) of the LDPC code are satisfied, an undetected error occurs if the decoder converges to a different codeword, which frequently happens in most finite-length LDPC codes. Thus, a cyclic redundancy check (CRC) is included in the frame structure of recent deep-space communication, data-storage, and digital communication standards [7, 8, 9].
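The difference between the two checks can be seen on a toy example. The sketch below is not an LDPC code: it assumes a systematic (7,4) Hamming code as a small stand-in, with 2 data bits and a 2-bit CRC whose generator $x^2 + x + 1$ is chosen purely for illustration. A valid but wrong codeword passes the syndrome check, while the CRC over its information bits fails.

```python
# Toy parity-check matrix of a systematic (7,4) Hamming code, used only as a
# small stand-in for an LDPC code: codeword = 4 information bits + 3 parity bits.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
POLY = [1, 1, 1]  # illustrative CRC generator x^2 + x + 1 (an assumption)


def crc_remainder(data, poly):
    """Remainder of data(x) * x^r divided by poly(x) over GF(2)."""
    r = len(poly) - 1
    reg = list(data) + [0] * r
    for i in range(len(data)):
        if reg[i]:
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return reg[-r:]


def encode(data):
    """Append the CRC to the data bits, then the Hamming parity bits."""
    info = list(data) + crc_remainder(data, POLY)
    parity = [sum(h * b for h, b in zip(row[:4], info)) % 2 for row in H]
    return info + parity


def syndrome_ok(codeword):
    """True if every parity-check equation is satisfied."""
    return all(sum(h * b for h, b in zip(row, codeword)) % 2 == 0 for row in H)


def crc_ok(info):
    """True if the last 2 information bits are the CRC of the first 2."""
    return crc_remainder(info[:2], POLY) == info[2:]


sent = encode([1, 0])
decoded = [1, 1, 0, 1, 1, 0, 0]  # a different valid codeword of the same code

print(syndrome_ok(sent), crc_ok(sent[:4]))        # True True
print(syndrome_ok(decoded), crc_ok(decoded[:4]))  # True False
```

The wrong codeword is invisible to the syndrome check because it satisfies every parity-check equation; only the CRC, which constrains the information bits independently, exposes it.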
In [10], an LDPC code concatenated with a CRC was decoded using the ordered statistics decoding (OSD) algorithm, and in [11] a perturbation method for CRC-aided decoding of LDPC codes was proposed. However, the approach in [10] has high decoding complexity because Gaussian elimination of the parity-check matrix is required for the OSD algorithm at every BP decoding, and the perturbation method in [11] demands specific parameters for each LDPC code.
In this letter, we propose a simple decoding algorithm that reduces the average number of iterations (ANIs) of LDPC codes based on the characteristics of erroneous information bits. In the candidate information bit flipping (CIBF) decoder, candidate information bits (CIBs) are selected by identifying the least reliable information bits, and they are used for CIBF decoding at the end of each NMS decoding iteration. The proposed decoding not only reduces the power consumption by increasing the convergence speed of NMS decoding but also improves the FER performance. Simulation results show that the power consumption of the proposed decoder is reduced by 12.7%, although the overall hardware cost is slightly increased. Moreover, it does not require modification of the coding scheme.
Characteristics of erroneous information bits
In Fig. 1, the histograms of the LLR values of the erroneous information bits at iterations 4 and 6 are shown for NMS decoding at SNR = 3 dB, where $\lambda_{\mathrm{out}}^{(i)}$ and $\mathrm{th}$ denote the output information LLR vector at the $i$-th iteration of the NMS decoder and the threshold value for CIB selection, respectively.
In general, the LLR magnitude of the erroneous bits tends to decrease as the number of iterations increases. As shown in Fig. 1, the information bits whose LLR magnitude is below the threshold $\mathrm{th}$ are the most unreliable, and if the decoder declares a failure, these are the bits that remain in error. Their number is also very small. We therefore focus on these characteristics of erroneous information bits. For the simulation, the (2304, 1152) IEEE 802.16e LDPC code is used. BPSK modulation and an additive white Gaussian noise (AWGN) channel are assumed. The scaling factor of the NMS decoder is set to 0.75. Note that the all-zero codeword is assumed to be transmitted for Fig. 1.
Candidate information bit flipping decoding
System description
In this subsection, the structure of the proposed decoder is described. Fig. 2 presents the proposed decoder structure, where $y_{\mathrm{in}}$ is the received LLR vector from the channel, $\lambda_{\mathrm{out}}^{(i)}$ is the output information LLR vector at the $i$-th iteration of the NMS decoder, $\hat{x}_{\mathrm{out}}^{(i)}$ is the decoded information bits at the $i$-th iteration of the NMS decoder, $\hat{x}_{\mathrm{out,CIBF}}^{(w)}$ is the decoded information bits at the $w$-th iteration of the CIBF decoder, and $\hat{u}_{\mathrm{out}}$ is the final decoded information bits.
In general, the conventional LDPC decoder is composed of a normalized MS decoder and a CRC calculator. After the NMS decoder terminates on the syndrome check, the CRC calculator is run to detect any undetected error in the decoded frame. The proposed decoding uses the CRC more efficiently: the CRC calculation is performed at the end of each NMS decoding iteration to check the correctness of the frame (instead of the syndrome check). If the CRC fails, the CIBF decoder is initiated. The output information LLR vector $\lambda_{\mathrm{out}}^{(i)}$ is loaded into the CIBF decoder from the NMS decoder. The CIBF decoder iterates until the CRC succeeds or the given maximum number of CIBF iterations is reached.
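The control flow described above can be sketched as follows. This is a minimal illustration, not the decoder of Fig. 2: the NMS decoder is abstracted as an iterable that yields one output information LLR vector per iteration, `crc_ok` and `cibf_decode` are hypothetical callables, and a positive LLR is assumed to mean bit 0.

```python
def decode_frame(nms_llr_stream, crc_ok, cibf_decode):
    """CRC-checked NMS decoding with a CIBF attempt after each failed CRC.

    nms_llr_stream: iterable yielding the output information LLR vector of
        each NMS iteration (its length plays the role of the iteration limit).
    crc_ok: callable(bits) -> bool.
    cibf_decode: callable(bits, llrs) -> corrected bits, or None on failure.
    Returns (information bits, NMS iterations used, which stage succeeded).
    """
    x_out, i = None, 0
    for i, llr_out in enumerate(nms_llr_stream, start=1):
        # Hard decision on the information bits (assumed: LLR >= 0 means 0).
        x_out = [1 if llr < 0 else 0 for llr in llr_out]
        if crc_ok(x_out):
            return x_out, i, "nms"
        # CRC failed: hand the LLRs to the CIBF decoder before iterating again.
        u_out = cibf_decode(x_out, llr_out)
        if u_out is not None:
            return u_out, i, "cibf"
    return x_out, i, "fail"


# Stand-ins for demonstration only: the "CRC" accepts the all-zero frame, and
# the CIBF stub flips just the single least reliable bit.
def crc_stub(bits):
    return bits == [0, 0, 0]


def flip_least_reliable(x_out, llr_out):
    k = min(range(len(llr_out)), key=lambda j: abs(llr_out[j]))
    trial = list(x_out)
    trial[k] ^= 1
    return trial if crc_stub(trial) else None


stream = [[-0.2, 3.0, 0.4], [0.9, 3.0, 0.4]]
result = decode_frame(stream, crc_stub, flip_least_reliable)
print(result)  # ([0, 0, 0], 1, 'cibf')
```

In this toy run the second NMS iteration would also have produced a correct frame, but the CIBF attempt after the first iteration already succeeds, which is the mechanism by which the average number of NMS iterations is reduced.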
Candidate information bit flipping decoding
In this subsection, the CIBF decoding algorithm is described. If the NMS decoder fails, a few erroneous information bits with small LLR magnitude remain. Among the information bits whose LLR magnitude is below the threshold $\mathrm{th}$, some are selected as the CIBs and efficiently re-decoded by the CIBF decoder.
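A minimal sketch of candidate selection and flipping is given here under stated assumptions. Algorithm 1 is not reproduced in this text, so the order in which flip patterns are tried (fewest flipped bits first, least reliable bits first) is an assumption, as are the function names and the defaults `n_cib=4` and `max_iters=15`. The threshold 1.1 is the value the letter reports for IEEE 802.16e codes.

```python
from itertools import combinations


def select_cibs(llr_out, th, n_cib):
    """Indices of the n_cib least reliable information bits with |LLR| < th."""
    # Thresholding first keeps the list to be sorted short.
    unreliable = [k for k, llr in enumerate(llr_out) if abs(llr) < th]
    # Ascending order of LLR magnitude: least reliable bit first.
    unreliable.sort(key=lambda k: abs(llr_out[k]))
    return unreliable[:n_cib]


def cibf_decode(x_out, llr_out, crc_ok, th=1.1, n_cib=4, max_iters=15):
    """Flip candidate bits until the CRC passes; return None on failure."""
    cibs = select_cibs(llr_out, th, n_cib)
    w = 0  # CIBF iteration counter
    for n_flips in range(1, len(cibs) + 1):
        for pattern in combinations(cibs, n_flips):
            if w == max_iters:
                return None
            w += 1
            trial = list(x_out)
            for k in pattern:
                trial[k] ^= 1  # modulo-2 addition flips the bit
            if crc_ok(trial):
                return trial
    return None


# Demonstration with a stand-in CRC that accepts only the true frame.
true_bits = [0, 0, 0, 1, 0, 0]
llr_out = [2.5, -0.3, 0.9, -4.0, 0.1, 1.0]
x_out = [1 if llr < 0 else 0 for llr in llr_out]  # bit 1 is in error

print(select_cibs(llr_out, 1.1, 4))                          # [4, 1, 2, 5]
print(cibf_decode(x_out, llr_out, lambda b: b == true_bits))  # [0, 0, 0, 1, 0, 0]
```

In the demonstration the erroneous bit is the second least reliable one, so the CRC passes on the second trial. A wrongly decided bit with a large LLR magnitude would never enter the candidate list, which is why the scheme relies on the observation that remaining errors have small magnitudes.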
The CIBF decoder consists of the sorting and the flipping algorithms. The sorting complexity of CIBF decoding grows with the number of information bits. Thus, to reduce the sorting complexity, most of the reliable information bits are excluded by the threshold $\mathrm{th}$. Let $\lambda_{\mathrm{th}}^{(i)}$ denote the unreliable information bits selected by the threshold $\mathrm{th}$. We sort $\lambda_{\mathrm{th}}^{(i)}$ in ascending order of LLR magnitude, and the first $N_{\mathrm{CIB}}$ information bits are selected as the CIBs. Extensive simulation shows that $\mathrm{th} = 1.1$ is a proper value for IEEE 802.16e LDPC codes. Note that $N_{\mathrm{CIB}}$ is determined by considering the trade-off between the decoding performance and the decoding complexity. The flipping algorithm then flips the CIBs and checks the CRC iteratively until the CRC succeeds, as described in Algorithm 1.
Experimental results
Decoding performance and power consumption analysis
Table I shows the computational complexity comparison per iteration. Since CIBF decoding is performed at the end of each NMS decoding iteration, the computational complexity of the proposed decoding consists of that of NMS decoding and that of CIBF decoding. Here, $d_v$ and $d_c$ are the average degrees of the variable nodes (VNs) and check nodes (CNs), respectively. Suppose that the $n_c$ codeword bits comprise $n_k$ information bits and $n_m = n_c - n_k$ check bits, where the information bits consist of $n_d$ data bits and $n_r = n_k - n_d$ CRC bits. Note that the CRC is applied only to the data bits. Although the computational complexity of the proposed decoding is slightly higher than that of conventional decoding because of the CIBF decoding, the ANIs of the proposed decoding are substantially reduced, as shown in Table II.
Table II presents the simulation results for the ANIs of the conventional NMS decoding and the proposed decoding. For a fair comparison, the conventional NMS decoding is also terminated when the CRC succeeds. The ANIs of the CIBF decoding are included in the ANIs of the proposed decoding in Table II. For example, at $E_b/N_0 = 3.0$ dB, the ANIs of the proposed decoding, 4.46, consist of 4.33 from NMS decoding and 0.13 converted from the ANIs of CIBF decoding. The convergence speed of the proposed decoding improves as the SNR increases because the CIBF decoding success rate increases: few information bits are in error in the high-SNR region, whereas many are in error in the low-SNR region. Thus, the power consumption is reduced by 12.7% compared with the conventional NMS decoding at SNR = 3 dB.
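The quantities used in this subsection can be collected in a short worked form. The first line evaluates the frame-size relation for the (2304, 1152) code used in the simulation; the number of CRC bits is not stated here, so $n_r$ is left symbolic. The shorthand $\mathrm{ANI}_{\mathrm{prop}}$, $\mathrm{ANI}_{\mathrm{NMS}}$, $\mathrm{ANI}_{\mathrm{CIBF}}$, and $\mathrm{ANI}_{\mathrm{conv}}$ is introduced only for this illustration, and the last line assumes that power consumption scales in proportion to the ANIs, which is a simplification.

```latex
\begin{align}
  n_m &= n_c - n_k = 2304 - 1152 = 1152, \qquad n_r = n_k - n_d, \\
  \mathrm{ANI}_{\mathrm{prop}} &= \mathrm{ANI}_{\mathrm{NMS}} + \mathrm{ANI}_{\mathrm{CIBF}}
    = 4.33 + 0.13 = 4.46 \quad \text{at } E_b/N_0 = 3.0~\text{dB}, \\
  \text{power reduction} &\approx 1 - \frac{\mathrm{ANI}_{\mathrm{prop}}}{\mathrm{ANI}_{\mathrm{conv}}}
    \quad \text{(assuming power} \propto \text{ANIs)}.
\end{align}
```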
Hardware cost analysis
In this section, the hardware cost of implementing the proposed algorithm is addressed. Implementing the computation module of the CIBF decoder is straightforward when the conventional LDPC decoder already has a CRC calculator and the minimum operators at the CNs [14]. The selection of the information bits whose magnitudes are lower than the threshold $\mathrm{th}$ and the sorting of $\lambda_{\mathrm{th}}^{(i)}$ by LLR magnitude are implemented with subtractors and minimum operators, respectively. The information bit flipping for the CIBs is implemented with modulo-2 adders. To verify the hardware cost of the proposed decoding, the CN unit, VN unit, modulo-2 adder unit, and subtractor unit for the IEEE 802.16e decoder were synthesized using a 0.18-µm CMOS cell library. Table III compares the areas of the CN and VN units with those of the subtractor and modulo-2 adder. For a fair comparison, the area is compared per unit. The area of the subtractors is determined by the number of VNs of the LDPC decoder. Although the overall hardware cost is slightly increased, the CIBF decoder requires only 1.26% of the hardware cost of the CN unit. Thus, the hardware cost of the CIBF decoder is negligible.
Conclusion
In this letter, a novel bit-flipping decoding scheme for systematic LDPC codes is presented. If the NMS decoder fails, the CIBF decoder tries to re-decode the codeword using the CRC at the end of each NMS decoding iteration. At a slightly increased hardware cost, the CIBF decoder can successfully decode most of the codewords that the NMS decoder fails on, because it estimates the most unreliable bits and efficiently re-decodes them using bit flipping and the CRC. Thus, the CIBF decoder reduces the average number of iterations of the decoder, and the power consumption related to the average number of iterations is reduced by up to 12.7%. Moreover, the proposed decoding can be applied easily to existing communication standards and data-storage systems.
Acknowledgments
