The analog low-density parity-check (LDPC) decoder, a specific application of probabilistic computing, is considered a promising solution for power-constrained applications. However, owing to the lack of efficient electronic design automation tools and reliable circuit models, analog LDPC decoders suffer from a costly hand-crafted design cycle and are unable to provide enough coding gain for practical applications. In this paper, we present an implementation of a (480,240) CMOS analog LDPC decoder, which is the longest code implemented to date using the analog approach. We first propose an analog LDPC decoder architecture that is constructed from reusable modules and significantly reduces the hardware complexity. We then present a mixed behavioral and structural model for the analog LDPC decoding circuits, which reliably and efficiently predicts the error-correcting performance. Finally, the experimental results show that the decoder prototype, fabricated in a 0.35-µm CMOS technology, achieves a throughput higher than 50 Mbps with a power consumption of 86.3 mW for the decoder core, and offers a 6.3-dB coding gain at a bit error rate of 10^−6 when the tested throughput is 5 Mbps. The proposed analog LDPC decoder is suitable for power-limited applications that require moderate throughput and coding gain.
I. INTRODUCTION
Low-density parity-check (LDPC) codes [1], [2], which belong to an important class of capacity-approaching error-correcting codes, have been adopted by many commercial communication standards, such as Wi-Fi, WiMAX, DVB-S2, and CCSDS. The key driver behind the success of LDPC codes is the use of iterative message-passing algorithms [3], which are powerful decoding algorithms with a manageable complexity. The iterative decoding algorithms are scalable, and hence can be efficiently implemented using digital [4], [5] and analog [6], [7] approaches. The motivation for using analog circuits for decoding is their lower power dissipation and faster processing speed compared with digital designs [8], [9]. Therefore, the analog approach has the potential to meet the energy-efficiency requirements of current and emerging application scenarios. For example, space telecommand applications require low power consumption with high communication reliability [10], and ultra-portable devices need a long battery life with small physical dimensions [11]. Over the past decade, a number of low-power analog LDPC decoding chips have been designed and fabricated [10]-[14], among which the (120,75) Min-Sum (MS) based analog LDPC decoder [12] is the longest code implemented using analog techniques. However, these analog circuits were limited to proof-of-concept decoders with very short block lengths, which are unable to provide enough coding gain for practical applications [15].
Two factors limit the applicability of analog decoders. On the one hand, costly hand-crafted design plays a large part in the design flow of analog LDPC decoders due to the lack of efficient electronic design automation tools [16]. The computation of an analog decoder is performed in a fully parallel fashion, and hence the complexity of the traditional design grows in proportion to the code length. For an analog LDPC decoder aimed at practical applications, the hand-crafted process is tedious, which in turn prolongs the design cycle and raises the likelihood of design errors. On the other hand, it is very difficult to obtain a fast and accurate estimate of the decoding circuit performance for large codes due to the high complexity of the analog circuit [15]. A model of the analog decoder for evaluating the bit error rate (BER) is necessary before performing the physical implementation. However, transistor-level simulations are too time-consuming to obtain the BER of complex decoders, while circuit behavioral models do not consider non-ideal effects such as device mismatch and time delay [14], [17]. For complex decoders, there is a relatively large gap between the simulation results from the behavioral model and the actual measurements from the fabricated circuits [15], [18]. Motivated by this, we attempt to overcome the above limitations by working at three levels of abstraction in this paper, namely, the architecture design level, the circuit simulation level, and the chip implementation level, as shown in Fig. 1. Firstly, a low-complexity decoder architecture with reusable modules is proposed to reduce the hand-crafted design complexity. Secondly, a mixed behavioral and structural model, which relates the device size to the error-correcting performance, is presented to permit pre-layout simulation in an acceptable time.
Finally, a (480,240) CMOS analog LDPC decoder, which is the longest code implemented to date using the analog approach, is realized in a 0.35-µm standard CMOS process. The experimental results show that the prototype chip achieves a throughput higher than 50 Mbps with a power consumption of 86.3 mW for the decoder core, which corresponds to an energy per decoded bit of 1.726 nJ. When the test throughput is 5 Mbps, the coding gain at BER = 10^−4 is about 5.6 dB, which is the highest among all reported analog decoders, and the coding gain at BER = 10^−6 is up to 6.3 dB. The experiments also indicate that the probability stopping criterion employed in the proposed analog decoder can reduce the decoding delay by up to 93%, compared with the experiment-based decoding delay adopted in existing works.
The rest of this paper is organized as follows. The decoding algorithms and basic soft-logic gates for analog LDPC decoders are introduced in Section II. Details on the decoder architecture design are described in Section III. Section IV presents the mixed model for analog decoding circuits and analyzes the simulation results. In Section V, the hardware architecture of the fabricated chip is presented. In Section VI, the experimental results and comparisons with existing analog LDPC decoders are discussed. Conclusions can be found in Section VII.
II. ANALOG LDPC DECODER BASED ON SUM-PRODUCT ALGORITHMS
An LDPC code is defined by its parity-check matrix H, whose structure can be graphically represented by a factor graph [3]. For an (N, K) LDPC code, the factor graph consists of variable nodes (VNs) v_n, n = 1, ..., N, which are connected by edges to check nodes (CNs) c_k, k = 1, ..., N − K. The degree of a node is equal to the number of nodes connected to it. The dimension and structure of the parity-check matrix H not only define the error-correcting performance, but also determine the complexity of the decoding process.
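As a small illustration of the factor-graph view, the following sketch derives the node degrees and the edge list from a parity-check matrix. The matrix `H_TOY` and the helpers `node_degrees` and `edges` are hypothetical names introduced only for this example; the toy matrix is not the (480,240) code of this paper.

```python
# Toy parity-check matrix (hypothetical, NOT the (480,240) code of the paper).
# Rows correspond to check nodes, columns to variable nodes.
H_TOY = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]


def node_degrees(H):
    """Return (variable-node degrees, check-node degrees) of the factor graph of H."""
    cn_deg = [sum(row) for row in H]          # ones per row = edges at each CN
    vn_deg = [sum(col) for col in zip(*H)]    # ones per column = edges at each VN
    return vn_deg, cn_deg


def edges(H):
    """List the factor-graph edges as (check index k, variable index n) pairs."""
    return [(k, n) for k, row in enumerate(H) for n, h in enumerate(row) if h]


vn_deg, cn_deg = node_degrees(H_TOY)
print("VN degrees:", vn_deg)
print("CN degrees:", cn_deg)
print("number of edges:", len(edges(H_TOY)))
```

Every nonzero entry of H contributes one edge, so the VN degrees and the CN degrees always sum to the same edge count.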
The iterative message-passing algorithm [19], which offers near-optimum decoding performance at a manageable complexity, is the most widely used decoding method for LDPC codes. In practice, the standard decoding methods are based on either the sum-product (SP) algorithm [3] or its approximation, commonly referred to as the MS algorithm [20]. The SP algorithm outperforms the MS algorithm in terms of error-correction performance at the cost of higher computational complexity. Both the SP and MS decoding algorithms can be performed by analog circuits.
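To make the difference between the two algorithms concrete, the sketch below compares the textbook check-node update for two incoming messages in the log-likelihood-ratio (LLR) domain. The LLR formulation, the convention L = ln(p(0)/p(1)), and the function names are assumptions of this example; the decoder in this paper operates on probabilities represented by currents rather than on LLRs.

```python
import math


def check_update_sp(l1, l2):
    """Sum-product check-node update for two incoming LLRs (tanh rule)."""
    t = math.tanh(l1 / 2.0) * math.tanh(l2 / 2.0)
    return 2.0 * math.atanh(t)


def check_update_ms(l1, l2):
    """Min-sum approximation: product of the signs times the smaller magnitude."""
    sign = math.copysign(1.0, l1) * math.copysign(1.0, l2)
    return sign * min(abs(l1), abs(l2))


sp = check_update_sp(1.0, 2.0)
ms = check_update_ms(1.0, 2.0)
print("SP output LLR:", sp)
print("MS output LLR:", ms)
```

The MS rule replaces the transcendental functions by a comparison, which is cheaper, but its output magnitude is never smaller than that of the SP rule, so it tends to overestimate the reliability of the message.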
In the iterative decoding process, the basic computation of the SP algorithm can be expressed as [21]
\[ p_Z(z) = \gamma \sum_{x \in X} \sum_{y \in Y} p_X(x)\, p_Y(y)\, f(x, y, z) \tag{1} \]
where p_X and p_Y are the input probability density functions (PDFs) defined on the finite sets X and Y, respectively, p_Z is the output PDF defined on a finite set Z, f(x, y, z) is an indicator function from X × Y × Z into {0, 1}, and γ is a scale factor that guarantees Σ_m p_Z(z_m) = 1 over all z_m ∈ Z. Based on the factor graph of the LDPC code, the updating messages, which are the PDFs of the received bits, are transmitted between VNs and CNs in an iterative manner. The function f(x, y, z) for the updating rule of the variable-to-check message is equal to 1 if and only if x = y = z, and 0 otherwise. As a result, the updating rule can be given by
\[ \begin{bmatrix} p_Z(0) \\ p_Z(1) \end{bmatrix} = \gamma \begin{bmatrix} p_X(0)\, p_Y(0) \\ p_X(1)\, p_Y(1) \end{bmatrix} \tag{2} \]
Eq. (2) is the formulation of the basic soft-equal gate. Similarly, the function f(x, y, z) for the updating rule of the check-to-variable message is equal to 1 if and only if z = x ⊕ y, and 0 otherwise, where ⊕ denotes binary addition. The corresponding updating rule can be given by
\[ \begin{bmatrix} p_Z(0) \\ p_Z(1) \end{bmatrix} = \begin{bmatrix} p_X(0)\, p_Y(0) + p_X(1)\, p_Y(1) \\ p_X(0)\, p_Y(1) + p_X(1)\, p_Y(0) \end{bmatrix} \tag{3} \]
Eq. (3) is the formulation of the basic soft-XOR gate. From (2) and (3), it can be seen that a basic soft-logic gate takes two input PDFs and calculates one output PDF. The updating computation in a degree-3 node can be realized by three basic gates, and nodes with a degree of more than three can be implemented by cascading degree-3 nodes [21]. In the following, we use the two basic gates to design and construct an analog LDPC decoder. The implementation of the basic soft-logic gate can be viewed as a generalization of the well-known Gilbert multiplier. In this approach, currents represent the probability values and are normalized to the "unity" current I_U. There are three operations in the basic gate: multiplication, summation, and normalization. The function of the pairwise multiplication circuit, as shown in Fig. 2, is then given by
where
According to Kirchhoff's current law, the summation is easily accomplished by connecting wires together. The normalization circuit is actually a degenerate version of the Gilbert multiplier with m = 1 and I_{x,1} fixed to I_U, and its function is given by
where I_{z,k} is the output current from the summation circuit.
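The two basic gates described in this section can be sketched numerically as follows. This is a minimal behavioral illustration in the probability domain, derived from the indicator functions stated above (x = y = z for the soft-equal gate, z = x ⊕ y for the soft-XOR gate); the function names and the `cascade` helper are hypothetical, and the sketch says nothing about the transistor-level circuit.

```python
from functools import reduce


def soft_equal(px, py):
    """Basic soft-equal gate: f(x, y, z) = 1 iff x = y = z.

    px and py are (p(0), p(1)) pairs. The inputs must not be fully
    contradictory, e.g. (1, 0) and (0, 1), or the scale factor is undefined.
    """
    z0 = px[0] * py[0]
    z1 = px[1] * py[1]
    gamma = 1.0 / (z0 + z1)          # scale factor so the output sums to 1
    return (gamma * z0, gamma * z1)


def soft_xor(px, py):
    """Basic soft-XOR gate: f(x, y, z) = 1 iff z = x XOR y."""
    z0 = px[0] * py[0] + px[1] * py[1]
    z1 = px[0] * py[1] + px[1] * py[0]
    return (z0, z1)                  # sums to 1 whenever both inputs do


def cascade(gate, messages):
    """Combine more than two incoming messages by chaining basic gates."""
    return reduce(gate, messages)


print(soft_equal((0.9, 0.1), (0.8, 0.2)))
print(soft_xor((0.9, 0.1), (0.8, 0.2)))
print(cascade(soft_xor, [(1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]))
```

In a degree-3 node, the outgoing message on each edge combines the two messages arriving on the other edges with one basic gate, which is why three gates are needed per node. The `cascade` helper mirrors the way higher-degree nodes are built from chained two-input gates.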
III. LOW-COMPLEXITY DECODER ARCHITECTURE DESIGN
An important issue in fabricating an analog LDPC decoder for practical applications is the costly hand-crafted design caused by the lack of efficient automation tools. To mitigate this problem, this section proposes a design approach whose core idea is to take full advantage of the structured nature of architecture-aware LDPC codes to construct an efficient LDPC decoder architecture.
A. ARCHITECTURE-AWARE LDPC CODES
Unstructured, randomly designed LDPC codes potentially achieve the best error-correction performance at the cost of complex routing [22]. Therefore, all the standardized LDPC codes adopt an architecture-aware design [23], and have shown error performance comparable to that of randomly structured codes. As such, we design an architecture-aware LDPC code with a simple description. The parity-check matrix of the proposed (480,240) LDPC code incorporates a common set of features favorable for efficient decoder implementation, and is expressed as [24] , and π_B, π_C, and π_D are obtained by rotating π_A counterclockwise, respectively, as shown in Fig. 3(a).
Similarly, we can derive the sub-matrices H_B, H_C, and H_D.
According to the grouping structure of the parity-check matrix H, there are three parts of the vertex set, as shown in (6), namely, the check node part containing four subsets (C_A, C_B, C_C, and C_D), the variable node part for the information bits containing four subsets (V_A, V_B, V_C, and V_D), and the variable node part for the parity bits containing one subset V_p.
B. REUSABLE MODULES
From (6), we can observe that the parity-check matrix H, which is constructed from circular row vectors and the dual-diagonal pattern, is particularly convenient for building reusable modules.
By performing row transformations of the parity-check matrix H and rearranging the order of the check nodes within each subset, the sub-matrices H_A, H_B, H_C, and H_D are converted to a similar structure. For example, the sub-matrix H_A is converted to
After transforming, the parity-check matrix of the (480,240) LDPC code becomes
where H_p is also a 240 × 240 square matrix with a dual-diagonal pattern, but its corresponding variable node part V_p has been reordered. By fully exploiting the structure of the parity-check matrix H in (9), a modular matrix with a similar structure can be found and expressed as
where P_15 is isolated from H_p and is a 15 × 15 dual-diagonal matrix. Based on the cascade structure, each degree-6 check node in (6) can be constructed by connecting a degree-5 node and a degree-3 node, and each degree-5 variable node for the information bits in (6) can be constructed by connecting a degree-4 node and a degree-3 node. As such, there are also three parts of the vertex set in (10), namely, the check node part containing a degree-5 check node subset C_α and a degree-3 check node subset C_β, the variable node part for the information bits containing a degree-4 variable node subset V_α and a degree-3 variable node subset V_β, and the variable node part for the parity bits containing one subset V_θ. The parity-check matrix H can then be constructed from 16 modular matrices H_m. Based on (10), the corresponding isomorphic sub-graph is shown in Fig. 4(a); it is constructed from a (3,2) protograph by making 15 copies of each variable node and each check node. Because each message is represented by a probability vector with two elements, i.e., p(0) and p(1), each edge in the isomorphic sub-graph represents 2 × 15 lines. According to the probability stopping criterion [25], an edge named "Stopping feedback" is added to the node C_β,¹ whose degree becomes 4. The degree-5 check node C_α is realized by three degree-3 nodes connected in series, and the degree-4 nodes V_α and C_β are implemented with two degree-3 variable nodes and two degree-3 check nodes in series, respectively. The connection "In&Out-α" represents the input probability of the transmitted bits and the output probability of the decoded bits for node V_α, and the connection "In-θ" represents the input probability of the transmitted bits for node V_θ.
The external interfaces "CHK-α", "CHK-β", and "CHK-θ" of the check nodes are connected to the corresponding nodes in other modular matrices, and so are the external interfaces of the variable nodes. The internal routing within the isomorphic sub-graph is connected according to the four permutation matrices (π_A, π_B, π_C, and π_D) and the identity matrix I_15.
¹ For convenience, the node subset C_β is simplified to the node C_β in Fig. 4(a), and the other node subsets in the following are treated similarly.
The implementation of the reusable module is shown in Fig. 4(b), which is a direct mapping of the isomorphic sub-graph to two types of hardware components: the soft-logic gates that compute the update messages, and the routing network that represents the edges of the graph. The reusable module is built from 225 basic soft-XOR gates, which is the total number of basic gates used by the 15 degree-5 nodes C_α and the 15 degree-4 nodes C_β, and 180 basic soft-equal gates, which is the total number of basic gates used by the 15 degree-4 nodes V_α, the 15 degree-3 nodes V_β, and the 15 degree-3 nodes V_θ. The connection "In&Out-α" consists of 2 × 15 inputs and 2 × 15 outputs for node V_α, and the connection "In-θ" consists of 2 × 15 inputs for node V_θ. The connection "Stopping feedback", which consists of 2 × 15 output signals, represents the corresponding probabilities that the parity-check equations are satisfied. Each of the other bi-directional connections out of the module includes 2 × 2 × 15 wires. Note that the routing network within the module is composed of five types of routers, predetermined respectively by the matrices π_A, π_B, π_C, π_D, and I_15. The hand-crafted design within the reusable module is thus reduced to the connections defined by the five routers, and this operation is performed only once in the whole analog design flow.
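The gate counts of the reusable module can be checked with a few lines of arithmetic. The sketch below assumes, as a generalization of the cases stated in the text (degree 5 from three degree-3 nodes, degree 4 from two), that a degree-d node is a chain of d − 2 degree-3 nodes with three basic gates each; the helper `basic_gates` is a hypothetical name.

```python
GATES_PER_DEGREE3_NODE = 3   # three basic gates per degree-3 node (Section II)
COPIES = 15                  # copies of each protograph node in one module


def basic_gates(degree):
    """Basic gates for one node built as a chain of (degree - 2) degree-3 nodes.

    The (degree - 2) rule is an assumption that matches the two cases given
    in the text: degree 5 -> three nodes, degree 4 -> two nodes.
    """
    if degree < 3:
        raise ValueError("the cascade construction needs degree >= 3")
    return GATES_PER_DEGREE3_NODE * (degree - 2)


# Check-node side: 15 degree-5 nodes C_alpha and 15 degree-4 nodes C_beta.
xor_gates = COPIES * (basic_gates(5) + basic_gates(4))

# Variable-node side: 15 degree-4 V_alpha, 15 degree-3 V_beta, 15 degree-3 V_theta.
equal_gates = COPIES * (basic_gates(4) + basic_gates(3) + basic_gates(3))

print("soft-XOR gates per module:", xor_gates)
print("soft-equal gates per module:", equal_gates)
```

Under this rule the counts come out as 15 × (9 + 6) soft-XOR gates and 15 × (6 + 3 + 3) soft-equal gates, in agreement with the numbers quoted for the module.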
C. COMPLEXITY ANALYSIS
The (480,240) analog LDPC decoding network is constructed from 16 reusable modules, as shown in Fig. 5. The design complexity of the analog decoding circuit is measured here by the number of placed wires. There are 1439 bi-directional edges in the factor graph of the (480,240) LDPC code, and each edge maps to four wires (two directions, each carrying the two-element probability vector), so 1439 × 4 = 5756 wires are implemented in the original architecture without reusable modules. In the proposed decoding architecture, the 3180 placed wires are divided into two parts: the local connections, comprising 300 wires within the reusable module, and the global connections, comprising 2880 wires between different reusable modules. After adopting the reusable module to construct the decoding network, the number of wires is thus reduced by 44.8%. Furthermore, the routers within the reusable module encapsulate the irregular local wiring, and hence the global wires between different reusable modules are regular and structured. The combination of the scalable decoding architecture and the routing strategy significantly reduces the wiring overhead and the routing congestion. Hence, constructing the analog LDPC decoding circuit from reusable modules reduces the hand-crafted design complexity to an acceptable level.
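The wire-count comparison can be reproduced directly from the figures quoted in this subsection. In the sketch below, the factor of four per edge is read as two directions times two probability lines, p(0) and p(1), which is an interpretation based on the 2 × 2 × 15 wire count given for the module connections; the function name `wire_reduction` is hypothetical.

```python
EDGES = 1439                 # bi-directional edges in the factor graph
WIRES_PER_EDGE = 2 * 2       # two directions x two probability lines (assumed reading)
LOCAL_WIRES = 300            # hand-routed wires within the reusable module
GLOBAL_WIRES = 2880          # wires between different reusable modules


def wire_reduction(edges=EDGES, local=LOCAL_WIRES, global_=GLOBAL_WIRES):
    """Return (flat wire count, modular wire count, fractional reduction)."""
    flat = edges * WIRES_PER_EDGE
    modular = local + global_
    return flat, modular, 1.0 - modular / flat


flat, modular, reduction = wire_reduction()
print("wires without reusable modules:", flat)
print("wires with reusable modules:", modular)
print("reduction: {:.1%}".format(reduction))
```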
IV. MIXED BEHAVIORAL AND STRUCTURAL MODEL FOR ANALOG DECODERS
Due to the large size of the analog circuit, the classic transistor-level simulation is too time-consuming to predict the system-level specifications for the proposed analog decoder. In order to facilitate the pre-layout simulation, a mixed behavioral and structural model for the analog LDPC decoders, which relates the transistor-level parameters to the system specifications, is presented here.
A. TRANSISTOR-LEVEL DESIGN IN THE SOFT-LOGIC GATE
Before modeling the analog decoding system, a design rule for the transistor size in the soft-logic gate is introduced here. In the gate-level design of the analog LDPC decoder, the key consideration is the Gilbert multiplier circuit, whose MOS transistors should operate in weak inversion. MOS transistors operating in weak inversion should satisfy the following condition [26]
where I_D is the drain current, I_U is the reference unit current, W/L is the transistor size, µ is the mobility of charge carriers in the device, C_ox is the gate capacitance per unit area, U_T ≈ 0.0258 V is the thermal voltage, and κ and V_T0 are constants of the fabrication process. Note that the design parameter I_U not only keeps the transistors in the weak inversion region, but also controls the total power consumption of the decoding core. Qualitatively, higher decoding accuracy and better matching require a larger transistor size W/L and a lower normalizing current I_U, at the cost of a slower circuit response and longer node processing delays. Hence, the choice of the device size in the basic gate is a trade-off between the mismatch considerations and the dynamic requirements.
B. MIXED STRUCTURAL AND BEHAVIORAL MODEL
Circuit-level simulation of the proposed decoder is a very resource-intensive task. As such, we propose a modeling approach, based on the structural description of the decoding network and the behavioral model of the basic gates, that characterizes the mismatch effects and the dynamic features of the transistors. The modeling approach for the analog LDPC decoding network is shown schematically in Fig. 6, where the "Soft XOR Gates" and "Soft EQU Gates" modules denote a column of soft-XOR gates and a column of soft-equal gates, respectively. Based on the physical background of the parameters, we model the mismatch effect of the transistors in the soft-logic gate. As shown in Fig. 2, the fundamental multiplication circuit in the basic gate is divided into two parts: the diode-connected transistors biased in current mode, within the dashed box, and the kernel multiplication transistor matrix biased in voltage mode. The effect of parameter mismatch on the transistor behavior is calculated differently for each biasing part.
In the current biasing part, the gate-source voltage V_GS depends on the imposed current I_{y,j}, and the variance of the gate-source voltage difference ΔV_GS is [27]
where A_VT0 is a process-dependent constant. For simplicity, the absolute difference ΔV_GS is converted into the relative current error ε_j, whose variance is [27]
Each transistor in the current biasing part is affected by the mismatch, which changes the input current from I_{y,j} to I_{y,j}(1 + ε_j).
In the voltage biasing part, the current I_D depends on the imposed voltage V_{y,j}, and the variance of the relative current error ε_{i,j} for the transistor in row j and column i is [27]
According to (13) and (14), the mismatch-affected output current Ĩ_{i,j} from the multiplication circuits is
Similarly, the mismatch-affected output current Ĩ_{z,k} from the normalization circuits is
where ε_U, ε_k, and ε_{1,k} are the current errors in the normalization circuit. According to (15) and (16), mismatch effects are introduced into the decoding network by all basic gates. It has been shown that the error-correcting performance of analog decoding is relatively independent of the distribution of interconnection delays among the nodes [28]. To simplify the analysis, the presented model therefore ignores the propagation delays and considers only the processing delay, which depends strongly on the transistor size.
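The idea of injecting relative current errors into a behavioral gate model can be sketched as follows. This is a simplified illustration under stated assumptions and is not the model of this paper: the ideal product form I_x · I_y / I_U, the ideal (error-free) normalization stage, the Gaussian error distribution, and the value of `sigma` are all assumptions of the example, and all function names are hypothetical.

```python
import random

I_U = 1e-6  # unit current in amperes (1 uA, the reference current of the design)


def multiplier(ix, iy, eps_row=(0.0, 0.0), eps_cell=((0.0, 0.0), (0.0, 0.0))):
    """Pairwise products ix[i] * iy[j] / I_U with relative errors applied.

    eps_row[j] perturbs the current-biased input iy[j] to iy[j] * (1 + eps);
    eps_cell[i][j] perturbs the voltage-biased kernel transistor (i, j).
    The ideal product form is an assumption of this sketch.
    """
    return [[ix[i] * iy[j] * (1 + eps_row[j]) * (1 + eps_cell[i][j]) / I_U
             for j in range(2)] for i in range(2)]


def soft_xor_currents(ix, iy, **eps):
    """Soft-XOR on currents: sum products by wiring, then renormalize to I_U.

    The normalization stage is kept ideal here for brevity.
    """
    m = multiplier(ix, iy, **eps)
    z0 = m[0][0] + m[1][1]
    z1 = m[0][1] + m[1][0]
    total = z0 + z1
    return (I_U * z0 / total, I_U * z1 / total)


def random_errors(sigma, rng):
    """Draw independent zero-mean Gaussian relative errors (hypothetical sigma)."""
    row = tuple(rng.gauss(0.0, sigma) for _ in range(2))
    cell = tuple(tuple(rng.gauss(0.0, sigma) for _ in range(2)) for _ in range(2))
    return {"eps_row": row, "eps_cell": cell}


ix, iy = (0.9e-6, 0.1e-6), (0.8e-6, 0.2e-6)
rng = random.Random(0)
print("ideal     :", soft_xor_currents(ix, iy))
print("mismatched:", soft_xor_currents(ix, iy, **random_errors(0.05, rng)))
```

Repeating the mismatched evaluation over many random draws gives a Monte Carlo picture of how strongly a given error spread disturbs the gate output.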
In Fig. 6, the "Delay" module, which models the processing delay, is embedded after the output of each soft-logic gate in the decoding network. The processing delay can be modeled by a first-order RC time-delay approximation, and is characterized by the following recursive equation
where n is the discrete-time index, α = 1 − exp(−Δt/τ), and Δt is the sampling period. According to (17), the delay model is completely characterized by the time constant τ, which characterizes the response of the soft-logic gate to a step input. The "Initialization" module, which performs the reset function, restores the network to a uniform starting condition. The "DEC" module performs hard decisions on the a posteriori probabilities (APPs) at each step.
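A first-order RC delay of this kind is commonly discretized as y[n] = y[n−1] + α (x[n] − y[n−1]). The sketch below uses that standard form as an assumption, since it may differ in notation from Eq. (17); the function names are hypothetical, and the time constant is the post-layout value reported for the soft-XOR gate in Section IV-C.

```python
import math


def alpha(dt, tau):
    """Smoothing factor of the first-order RC model: 1 - exp(-dt / tau)."""
    return 1.0 - math.exp(-dt / tau)


def delay_step(y_prev, x, a):
    """One update of the assumed recursion y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    return y_prev + a * (x - y_prev)


def step_response(tau, dt, steps, y0=0.0, x=1.0):
    """Response of the delay model to a constant input x applied at n = 0."""
    a = alpha(dt, tau)
    y = y0
    out = []
    for _ in range(steps):
        y = delay_step(y, x, a)
        out.append(y)
    return out


TAU_X = 42.78e-9                                 # soft-XOR gate time constant
resp = step_response(TAU_X, TAU_X / 100, 500)    # simulate five time constants
print("output after 1 tau:", resp[99])
print("output after 5 tau:", resp[499])
```

After one time constant the output reaches 1 − e^−1 of the step, and it needs several time constants to settle, which is the behavior that limits how quickly the decoding network can converge.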
C. SIMULATION RESULTS
The proposed analog decoder circuit is designed and fabricated in a 0.35-µm CMOS technology. Due to the power constraints, the reference current I_U is set to 1 µA. According to (11)
Based on the above constraint, a MOS transistor with size W/L = 16 µm/1 µm is adopted in the gate-level design. From the post-layout simulations, the time constants τ_X for the soft-XOR gate and τ_E for the soft-equal gate are 42.78 ns and 57.04 ns, respectively. The simulated BER performance versus E_b/N_0 is shown in Fig. 7. In the low E_b/N_0 region, the BER performance at different throughputs is similar. However, as E_b/N_0 increases, the higher the throughput, the larger the BER gap between the theoretical curve and the simulation result. This is because the decoder outputs converge to their steady-state values only after a long decoding time, whereas a higher throughput shortens the time available for settling.
V. HARDWARE IMPLEMENTATION
In order to verify the proposed decoding architecture and the modeling approach, an implementation of the (480,240) LDPC analog decoder is carried out using 0.35-µm CMOS technology. The details of this implementation are presented as follows. 
A. ARCHITECTURE OF THE ANALOG DECODING CIRCUIT
The system level schematic of the proposed (480,240) LDPC analog decoding circuit is shown in Fig. 8 . The main features of the analog decoding circuit are summarized in Table 1 . Besides the decoding core circuit, the prototype integrates the following modules to facilitate output buffering and testing.
• The analog input buffer stores the serially input channel output symbols in the analog memory and outputs the symbols in parallel. The storage elements are realized using a two-stage pseudo-differential sample-and-hold (S/H) circuit.
• The differential pair biased in weak inversion converts the differential voltages into a pair of complementary currents, thereby converting the channel output symbols into the channel transition probabilities.
• The current comparators generate the digital decisions by comparing the output currents, which represent the probabilities of '0' and '1', and then latch the decisions.
• The digital output buffer is a shift register chain, which samples the parallel decoded bits from the comparator modules and converts them into a bit-serial format.
The timing diagram for the proposed analog LDPC decoder is presented in Fig. 9. When a string of received serial symbols V_in arrives, it is first converted into parallel current pairs [I 0
B. HARDWARE MAPPING OF THE ANALOG DECODER
The analog decoder, including the interface circuits, is fabricated in a 0.35-µm CMOS process with a single 3.3 V power supply. To improve the matching precision, a mirror-symmetric form is adopted in the layout of the Gilbert multiplier circuit, and one- and two-dimensional common-centroid layout techniques are used in the current mirrors. The corresponding chip photo is provided in Fig. 10. The silicon area including pads is 108.95 mm^2, of which the analog decoding circuitry occupies 45.5 mm^2. The input buffer circuit takes up a large area on the chip because of the two-stage S/H circuits. The testing circuit in the top left corner is used for testing the transistor mismatch and verifying the circuit behavior of the soft-logic gates. In the analog decoding core, the 16 reusable modules can be observed.
VI. EXPERIMENTAL RESULTS AND DISCUSSIONS
The fabricated decoding chip is evaluated using the experimental setup shown in Fig. 11. The received channel output symbols are composite signals from the noise generator and the signal generator. An FPGA working at a frequency of 100 MHz generates all digital control signals and handles the data synchronization. When the 'Early stopping' signal has converged or the pre-determined maximum decoding time is reached, the FPGA retrieves the decoded bits from the decoder chip and records the number of decoding clocks. These data are first transferred to the computer, and then the BER and decoding delay are calculated. Fig. 12 shows the BER performance of the fabricated decoding chip. There is good agreement between the model predictions from Section IV and the experimental results, which confirms the reliability of the proposed mixed behavioral and structural model. The experimental results show that the proposed analog decoder is more resilient to mismatch effects and other imperfections than expected [11]. This is because the modular design of the decoding architecture and the proper device size derived from the reliable model are adopted in the design phase. Moreover, the probabilistic computing paradigm is inherently error-resilient. With a throughput of 5 Mbps, the experimental results show a loss of about 0.5 dB with respect to the benchmark at BER = 10^−6, and the proposed decoder offers a coding gain of 5.6 dB at BER = 10^−4 and 6.3 dB at BER = 10^−6 compared with uncoded BPSK (binary phase-shift keying) transmission. Since the long settling delay of the decoding circuit conflicts with the decreasing maximum decoding delay, the performance of the decoder deteriorates markedly as the throughput increases. For example, with a throughput of 50 Mbps, the degradation of the coding gain is about 2 dB at BER = 10^−4.

VOLUME 5, 2017
FIGURE 8. System level schematic of the proposed analog LDPC decoding circuit.
The average decoding delay of the proposed decoding chip is presented in Fig. 13. Here, the maximum decoding delays are set to (480/throughput) s, i.e., 96 µs and 9.6 µs for the throughputs of 5 Mbps and 50 Mbps, respectively. Compared with the maximum decoding delays, the average decoding delays are reduced by 93% and 50% at E_b/N_0 = 4.5 dB for the two scenarios, respectively. The average decoding delay decreases gradually as E_b/N_0 increases, especially for the scenario with a throughput of 5 Mbps. From Fig. 13, we can conclude that the proposed decoder chip employing early termination saves decoding time at high E_b/N_0. Hence, the probability stopping criterion can reduce the power consumption and improve the data throughput. The performance comparisons of the proposed decoder with previously reported analog decoders are summarized in Table 2. The code length of the proposed decoder is the longest among all reported analog decoders. Due to the large scale of the decoding network and the CMOS technology used for fabrication, the core area and power of the proposed chip are relatively high. When normalized to the state-of-the-art 65-nm CMOS technology [11], the core area scales down to about 1.569 mm^2. Furthermore, the current level of energy efficiency is about 10 pJ/bit in 65-nm CMOS, and the power of the analog implementation can be further lowered with advanced process technologies [29]. The proposed decoder achieves a coding gain of 5.6 dB at BER = 10^−4 with a throughput of 5 Mbps, which is the highest among all existing analog decoders. Hence, the proposed (480,240) CMOS analog LDPC decoder is well suited to energy-limited applications in which moderate throughput and coding gain are required.
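The headline figures in this section follow from simple arithmetic, which the sketch below reproduces. The quadratic area scaling with the feature-size ratio is an assumption of this example (it happens to reproduce the quoted 65-nm value), and the function names are hypothetical.

```python
CODE_LENGTH = 480          # bits per codeword
CORE_POWER_W = 86.3e-3     # decoder core power in watts
CORE_AREA_MM2 = 45.5       # decoder core area in 0.35-um CMOS, in mm^2


def max_decoding_delay(throughput_bps):
    """Maximum decoding delay in seconds: code length divided by throughput."""
    return CODE_LENGTH / throughput_bps


def energy_per_bit(power_w, throughput_bps):
    """Energy per decoded bit in joules."""
    return power_w / throughput_bps


def scaled_area(area_mm2, node_from_nm, node_to_nm):
    """Area scaled by the square of the feature-size ratio (assumed ideal scaling)."""
    return area_mm2 * (node_to_nm / node_from_nm) ** 2


for rate, saving in ((5e6, 0.93), (50e6, 0.50)):
    t_max = max_decoding_delay(rate)
    t_avg = t_max * (1.0 - saving)      # average delay at Eb/N0 = 4.5 dB
    print("{:.0f} Mbps: max {:.2f} us, average {:.2f} us".format(
        rate / 1e6, t_max * 1e6, t_avg * 1e6))

print("energy per bit: {:.3f} nJ".format(energy_per_bit(CORE_POWER_W, 50e6) * 1e9))
print("core area at 65 nm: {:.3f} mm^2".format(scaled_area(CORE_AREA_MM2, 350, 65)))
```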
VII. CONCLUSION
In this paper, a (480,240) CMOS analog LDPC decoder based on the SP algorithm has been realized in a 0.35-µm CMOS technology. To the best of our knowledge, the proposed analog LDPC decoder implements the longest code realized using analog technology to date. The low-complexity decoder architecture, which exploits the structured nature of the target code, minimizes the hardware overhead. More precisely, by using the reusable modules, only 3180 wires need to be routed in the hand-crafted process, a reduction of 44.8%. A mixed behavioral and structural model, which relates transistor-level parameters to system-level specifications, is also presented for the analog LDPC decoding network. The agreement between the simulated and measured results validates the reliability of the proposed model. Furthermore, the model can provide circuit-optimization guidelines for the physical design. The chip area is 108.95 mm^2 including I/O pads (45.5 mm^2 for the decoding core). The achievable throughput of the tested chip is higher than 50 Mbps with a decoding power of 86.3 mW at 3.3 V, which corresponds to an energy per decoded bit of 1.726 nJ. When the tested throughput is 5 Mbps, the decoding chip offers a coding gain of 5.6 dB at BER = 10^−4.
