A new decoder architecture for nonbinary low-density parity check (LDPC) codes is presented in this paper to reduce the hardware operational complexity and power consumption. Adaptive message control (AMC) is to achieve the low decoding complexity, that dynamically trims the message length of belief information to reduce the amount of memory accesses and arithmetic operations. A new horizontal nonbinary LDPC decoder architecture is developed to implement AMC. Key components in the architecture have been designed with the consideration of variable message lengths to leverage the benefit of the proposed AMC. Simulation results demonstrate that the proposed nonbinary LDPC decoder architecture can significantly reduce hardware operations and power consumption as compared with existing work with negligible performance degradation.
INTRODUCTION
Due to these features, design and implementation of Low-density parity-check (LDPC) codes [1] , [2] emerging applications such as underwater acoustic are considered as one of the most powerful capacitycommunications. approaching codes. LDPC codes can be constructed in A key challenge in the application of both binary domain and Galois fields (i.e., GF (2 ) , where nonbinary LDPC codes is their high decoding m m >1). Binary LDPC codes have been studied extensively complexity, as each symbol in the codeword is and adopted in many communication protocols, such as decoded using a long message. A lot of research effort DVB-T2, WiMax, etc. In general, a very long code length aims at reducing the decoding complexity of nonbinary is necessary for binary LDPC codes to approach the LDPC codes at the algorithm level. To deal with the channel capacity. Low-Density Parity-Check (LDPC) problem that computational complexity increases codes have recently received a lot of attention exponentially with, the extended Min-Sum (EMS) was because of their admirable performance and have been proposed in [3] where only the most significant n entries widely considered as a promising candidate errorin a message were used in the decoding. A decoding correcting coding scheme for many real applications in technique developed in [4] conducted the EMS with a telecommunications and magnetic storage. Nonbinary reduced complexity of o(n log n ) with minor LDPC codes constructed in Galois fields offer improved performance degradation. It should be noted that these performance at a moderate code length. In addition, algorithm-level techniques do not explicitly consider the nonbinary LDPC codes can be combined with high order complexity in the implementation of nonbinary LDPC modulations to increase the bandwidth efficiency.
decoders. Different from these existing works targeting The decoding process of nonbinary LDPC codes hardware implementation cost, the focus of this paper is to reduce the hardware operational complexity in nonbinary LDPC decoder architectures. This enables efficient decoding suitable for emerging applications such as underwater acoustic sensor networks [5] that are under the severe resource (e.g., energy) constraints. It was reported that memory accesses and arithmetic operations are the two major contributors to the operating cost in LDPC decoders. As the amount of memory accesses and arithmetic operations is largely determined by the message length, reducing message length is deemed as an effective way for efficient decoding. Different from the EMS which maintains a constant message length for every symbol; the proposed AMC adjusts the message length adaptively, which can reduce the message length at the required performance.
In this paper, a horizontal sequential VLSI architecture for the nonbinary LDPC decoder employing the AMC is developed. The design of the key components in this architecture, such as variable node and check node update units, is optimized by exploiting the variable length sorters, which can be dynamically configured in different functional units to accommodate variable message lengths. The AMC is implemented by a low-complexity approximation method to avoid hardware overheads and performance impact. A mapping table based approach is proposed to conduct searching operations with low complexity. We apply AMC to EMS to address the memory and throughput issues caused by the worst case message length. Note that AMC can also be employed in other decoding update rules such as the Min-Max algorithm. In addition, the proposed AMC can be generally applied to most existing decoder architectures (sequential, partial parallel and fully parallel). operates on the messages that represent the probability distribution of symbols. A message is a length-2 vector m recording the 2 belief information of a symbol subject to m channel noise, where each belief information indicates the probability of this noisy symbol to be one of the 2 elements in GF (2 ) . The decoding process is essentially m m iterative message exchange and update between the check nodes and variable nodes in the Tanner graph representation. The messages on the Tanner graph exchange and update in two directions. Variable node messages (VNMs), denoted by q, pass from the variable nodes to the check nodes; and check node messages (CNMs), denoted by r, pass from the check nodes to the variable nodes. VNMs are initialized with channel messages, which are the input to the decoder. Then, VNMs are sent to the check nodes to update the CNMs. The new CNMs are then sent back to the variable nodes and used together with the channel messages to update the VNMs. This procedure is known as the belief propagation.
Review of Nonbinary
There are mainly two types of decoding algorithms-sum-product algorithm (SPA) and min-sum (MS), of which the latter one is a mathematical approximation of SPA where the sum of product is replaced by the maximum product term. Due to its relatively simple operation, MS is often chosen for practical applications. Further reduction in the complexity of MS is desirable; one such approach is the extended MS (EMS) algorithm.
Min-Sum with Adaptive Message Control:
In this section, an adaptive message control method that dynamically adjusts the message length of a symbol during the decoding iteration was developed. A truncation scheme is proposed for the proposed AMC-EMS algorithm.
Adaptive Message Control (AMC):
Intuitively, when the distribution of belief information is more concentrated, a message with a smaller number of entries might be sufficient to retain the same amount of the belief information. Exploiting this fact, we propose to adaptively control the message length during the decoding iteration. Our approach leads to two major advantages. First, due to the casual nature of channel noise, channel messages for different symbols may have different statistics. In comparison with the EMS which maintains a fixed number of entries for all channel messages, the proposed 2013 AMC can lower the average message length while complexity, and q(k) indicates the kth entry in the log retaining the same amount of belief information. Second, as the decoding proceeds, the belief information will gradually concentrate around the correct element in the case of convergence. Thus, fewer entries are needed in the message of a symbol, i.e., the message can be truncated even more. These features are essential for reducing the computational complexity of decoding nonbinary LDPC codes.
The major decoding operation at a check node is to refine the estimated message of a symbol based on the messages of other symbols that are correlated by the PCM H. The elementary check node operation in the Min-Sum (MS) decoding algorithm [3], [4] can be expressed as r = q1 q2 (1) where, q and q are the length 2 variable node messages and Which sums up the belief information associated with the same finite field element. The hardware complexity in implementing (1) and (2) is determined by the lengths of variable node messages (VNMs) and check node messages (CNMs). Intuitively, when the distribution of belief information is more concentrated, a shorter message might be sufficient to retain most of the belief information.
The basic idea of AMC is to keep as few entries in a message as possible without incurring much information loss. It has been demonstrated that message truncation can be implemented by finding the minimal n that satisfies
Where ae is the confidence factor that determines the tradeoff between performance and operational domain representation of the message q that is sorted in order. In this case, the truncation criteria can be recast as
where, the threshold is used to truncate messages. The operation of AMC in a nonbinary LDPC decoder can be summarized as follows.
Initialization:
Channel message truncation: ? =AMC(f ) where f is 
VLSI Architecture For Sequential AMC-Based Decoder:
In this section, a sequential nonbinary LDPC decoder architecture for the proposed AMC is presented. We first discuss the top level architecture and then detail the design of several key components such as the variable length sorter, variable node update unit (VNU) and check node update unite (CNU). The proposed AMC can be applied to different non binary LDPC decoding schedules, such as sequential, partially parallel and fully parallel architectures.
Horizontal Nonbinary LDPC Decoder Architecture:
The design of horizontal nonbinary LDPC decoder employing with AMC is shown in fig. 1 . In this above architecture consists of variable node update unit, check node update unit, check sum unit, tentative decoding decoder employing with AMC Random access memory is used to store the data. In this architecture RAM memory is divided into four divisions. RAMb is used to store the channel messages; RAMa is to store some bits, which are given to the RAMd to generate the intermediate check node messages. permutation is used to rearrange the order of inputs, which are coming from the variable node update unit. Inverse permutation is used to arrange the order of bits, which are from the check node update unit. Tentative decoding is used to do the iteration process. Check sum unit is used to check the channel messages with the values in that unit.
At the beginning of the decoding, the truncated Fig. 2 : Implementation of the VNU unit channel messages are loaded into the memory RAMb. Tentative decoding is conducted with the channel messages and the results are stored in the memory RAMc. If tentative decoding suceeds, decoding terminates and RAMc outputs the final result. Otherwise, the decoder initializes the intermediate check node messages (ICNMs) in the memory RAMd with the channel messages. After the initialization, the check node update unit (CNU) reads ICNMs from RAMd to perform the check node update. The CNMs from the CNU are inversely permuted and then passed to the variable node update unit along with the associated channel message to perform variable node update. The variable node update Fig. 3 : Implementation of the CNU unit unit (VNU) also generates the messages for the tentative decoding unit (TDU) to conduct tentative decoding.
in the RAMd. After this, the decoder proceeds to another If tentative decoding does not succeed, the VNMs from variable node. This process continues until the check sum VNU are then permuted and sent back to the CNU to unit (CSU) decides to terminate because of either update the corresponding ICNMs, which are then stored successful decoding or reaching the limit of iterations. fig.2 . The function of VNU is to compute two messages, where the belief information associated with the same Galois field element in the two messages is summed. Variable node update unit is used to truncate the message length. In this unit register array, mapping table, multiplexer, sorter, comparator are presented. The mapping table is utilized to sum up the entries associated with the same finite filed element. The final results are sent to the variable length sorter with the associated Galois field elements and AMC is then applied to the outputs of the sorter, starting from the largest entry in the sorter, when the entry is smaller than the threshold, the following entries are discarded. This reduces the hardware operations in the VNU.
C. Check Node Update Unit:
The CNU produces the largest elements among all the combinations from two input messages. The proposed design of CNU is shown in fig. 3 . Figure shows that the complexity of CNU can be reduced to additions and insertion operations if the two input messages are sorted in the descending order. Here adopt this method in the CNU design. The complexity of the CNU depends on two factors.
The first factor is the number of outputs, which determines how many insertion operations are needed. The second factor is the length of the sorter, which determines how many comparisons and shifting operations are involved in each insertion operation. Employing the proposed AMC, the message length decreases thus becomes smaller during the iteration. This reduces the number of arithmetic operations and insertions. To address the second factor, the sorter is configured to be the same length of the shorter message thereby reducing the complexity of each insertion operation. The CNU also performs additions in the Galois field. However, additions in the Galois field are essentially bit-wise XOR operations, thus the complexity is much lower than the real number additions of belief information.
D. Variable Length Sorter:
Sorting is an important operation in the CNU and VNU. In the CNU, the sorter chooses number of elements with the largest belief information as required for check node update. In the VNU, truncation is carried out and it discards the smaller elements and the remaining elements are sorted in order as the input of CNU.
The basic sorting operation is to insert a new element into a vector according to its magnitude. The proposed variable length sorter is illustrated in fig. 4 , where only the data path of belief information is shown as it determines the shifting operation. The sorter consists of number of stages to accommodate the longest message length. Each stage contains a comparator and a register. Each time new belief information comes in, it is compared with all the belief information in the active stages in parallel. The content of the first stage having larger belief information will be replaced by the new input, whereas its original belief information will be shifted to the right stage and so on. Sorter is an important operation in check node update unit and variable node update unit. Sorter consists of decoder, register, comparator, multiplexer, xor gate. The operation is parallel process. Employing AMC, the message length reduces gradually. Thus only the last stages are enabled. This is controlled by the longer one of the two messages in the check node update unit (CNU) or variable node update unit (VNU).
The major operations in the sorter are comparisons and shiftings. The number of these operations is mainly determined by the length of the message to be processed. The proposed AMC dynamically adjusts the message length during the iteration, thereby reducing the complexity of sorting. E. Searching: Searching is another major operation in the check node update unit (CNU) and variable node update unit (VNU). In the VNU, the belief information in one message needs to search for its counterpart in another message to sums up the belief information. The output of the sorter is considered valid if and only if the corresponding Galois field element has not been generated previously. A searching operation has to be conducted on the current output to compare it with the previous outputs of the sorter.The mapping information between the Galois field elements and their indexes in a message is maintained by a mapping table of words with word length of bits. In the CNU, number of different elements need to be generated according to variable node messages. The output of the sorter is considered valid if and only if the corresponding Galois field element has not been generated previously. A searching operation has to be conducted on the current output to compare it with the previous outputs of the sorter. mapping table is to records the status of Galois field elements, it is referred to as status table in the CNU. The existence of the element needs to be determined, a mapping table with only number of bits can be constructed. The proposed mapping-based searching scheme is naturally low complexity in comparison with direct searching of the message, especially when the message is very long.
RESULT AND DISCUSSION

A. Simulation result
CONCLUSION
A new nonbinary low density parity check (LDPC) decoding architecture is proposed to reduce hardware operation and power consumption. A horizontal sequential VLSI architecture for the nonbinary LDPC decoder employing the AMC. The design of the key components in this architecture, such as variable node and check node update units, is optimized by exploiting the variable length sorters, which can be dynamically configured in different functional units to accommodate variable message lengths. The AMC is implemented by a low-complexity approximation method to avoid hardware overheads and performance impact. Here Adaptive Message Control method is used to reduce the message length. Thus the message length is reduced by applying adaptive message control to the variable node update unit, then the truncation is achieved. Simulation results and synthesis reports are analyzed.
