ABSTRACT: This paper investigates the performance of three different symbol level decoding algorithms for Duo-Binary Turbo codes. Explicit details of the computations involved in the three decoding techniques, and a computational complexity analysis are given. Simulation results with different couple lengths, code-rates, and QPSK modulation reveal that the symbol level decoding with bit-level information outperforms the symbol level decoding by 0.1 dB on average in the error floor region. Moreover, a complexity analysis reveals that symbol level decoding with bit-level information reduces the decoding complexity by 19.6 % in terms of the total number of computations required for each half-iteration as compared to symbol level decoding.
INTRODUCTION
Since their inception in 1993, Turbo codes [1], which allowed the bound established by Shannon in 1948 [2] to be closely approached, have been the center of attention of researchers. The Turbo code community has recently been conducting a lot of research on non-binary Turbo codes. With a comparable implementation complexity, duo-binary Turbo codes can provide better error-correction performance than binary Turbo codes [3]. The excellent performance of duo-binary Circular Recursive Systematic Convolutional (CRSC) codes [4] has led to their adoption in Digital Video Broadcasting with Return Channel via Satellite (DVB-RCS) [5], replacing the conventional scheme that consisted of the serial concatenation of a Reed-Solomon (RS) code and a convolutional code. The DVB-RCS standard specifies an air interface in which many small terminals send return signals via satellite to a central gateway [6, 7].
Non-binary Turbo codes offer several advantages, for example better convergence of iterative decoding, low latency, and reduced sensitivity to puncturing patterns [13]. Puncturing and the sub-optimal Max-Log-MAP decoding algorithm have a less significant influence on duo-binary Turbo codes than on binary Turbo codes [6]. Also, duo-binary Turbo codes allow the latency of the decoder to be halved [7]. Recently, in [14], the performance of transmission systems with Duo-binary Turbo Codes (DBTC) and 16-QAM square modulation over an Additive White Gaussian Noise (AWGN) channel has been investigated for various allocation modes. The author of [15] has investigated the input quantization of low-complexity decoding algorithms and proposed an algorithm for effective decoder quantization that introduces a scale factor into the decoding algorithm, so as to achieve a significant improvement in the hardware implementation of the decoder architecture. In [16], a duo-binary Turbo code incorporating the Quadratic Permutation Polynomial (QPP) interleaver rather than the one defined in the DVB-RCS standard has been proposed, and the parameters and performance of the proposed scheme are described in detail. In [12], a low-memory decoding architecture has been proposed for a double binary convolutional Turbo code. The scheme is based on an improved decoding algorithm that stores part of the state metrics in a state metrics cache. In [13], the algorithm for double-binary Turbo decoding is studied using QPSK modulation over a Rayleigh fading channel. The authors of [17] have compared Turbo-φ codes and 3D-Turbo codes for the next generation DVB-RCS system in terms of error performance and decoder complexity. The decoding algorithms for Turbo codes are the Maximum A-Posteriori Probability (MAP), Logarithmic MAP (Log-MAP) and Maximum Log-MAP (Max Log-MAP) algorithms.
Due to the high computational complexity and numerical instability of the MAP algorithm, researchers have proposed the Log-MAP algorithm [18]. In order to further reduce computational complexity, the Max Log-MAP algorithm was brought forward. This reduction in computational complexity comes with a slight degradation in error performance as a trade-off.
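The difference between the Log-MAP and Max Log-MAP algorithms can be illustrated with the well-known Jacobian logarithm, in which the Max Log-MAP algorithm simply drops the correction term. The following Python sketch is a generic illustration and is not taken from the cited references; the function names max_star and max_approx are hypothetical.

```python
import math


def max_star(a: float, b: float) -> float:
    """Jacobian logarithm: exact value of ln(exp(a) + exp(b)) (Log-MAP)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_approx(a: float, b: float) -> float:
    """Max Log-MAP approximation: the correction term is dropped."""
    return max(a, b)


a, b = 1.2, 0.7
exact = max_star(a, b)
approx = max_approx(a, b)
# The dropped correction term ln(1 + exp(-|a - b|)) lies in (0, ln 2].
correction = exact - approx
```

The correction term is largest when the two arguments are equal and vanishes as they move apart, which is consistent with the slight performance degradation mentioned above.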
In [19] and [20], the authors have presented a symbol-level decoding scheme for duo-binary and triple-binary Turbo codes. This comprehensive study has been carried out over an AWGN channel to demonstrate the good performance of the proposed schemes. In [6], a variant of the symbol-level decoding algorithm for duo-binary Turbo codes has been presented. The performance of the decoding scheme is compared to the Turbo code standard for DVB-RCS over a Gaussian channel at a Frame Error Rate (FER) of $10^{-4}$. The results demonstrate that the performance of the proposed scheme is comparable to that of the scheme used in the DVB-RCS standard. In [21], bit-wise and symbol-wise decoding have been investigated for multi-binary convolutional Turbo codes employing the MAP algorithms. The symbol-wise decoding algorithm presented in that work operates on bit-level LLRs as input and is shown to outperform bit-wise decoding. The advantage of this technique is that the limitation of using only QPSK modulation with duo-binary Turbo codes [19, 20] can be overcome.
Different equations have been used in Turbo decoding algorithms. As such, this survey paper presents the existing Max Log-MAP Turbo decoding algorithm with the different equations used for duo-binary Turbo codes. Explicit details of the computations involved in the three decoding techniques as well as a complexity analysis have been provided. Simulation results with different couple lengths, code-rates and QPSK modulation reveal that symbol level decoding with bit-level information outperforms symbol level decoding by 0.1 dB on average in the error floor region. Moreover, a complexity analysis reveals that symbol level decoding with bit-level information reduces the complexity by 19.6 % as compared to symbol level decoding.
The paper is organized as follows. Section 2 presents the different Max Log-MAP decoding methods for duo-binary Turbo codes. Section 3 presents the simulation results and Section 4 concludes the paper.
DUO-BINARY TURBO CODES
The encoding structure for duo-binary Turbo codes employed in the DVB-RCS standard is shown in Fig. 1. The DVB-RCS standard uses a parallel concatenation of two Circular Recursive Systematic Convolutional (CRSC) codes as the encoder [13], [14], [15], separated by a two-level interleaver. Let $N_c$ be the couple length, that is, the number of couples at the input of the duo-binary Turbo encoder. At the first level, an intra-symbol permutation takes place, and at the second level an inter-symbol permutation takes place. Both levels of interleaving are described in [5].
The modulation scheme used with duo-binary Turbo codes in the DVB-RCS standard is Gray-coded Quadrature Phase Shift Keying (QPSK), as depicted in Fig. 1. The constellation mapping used is shown in Fig. 2. The systematic couple and the parity couples of the first and second constituent encoders are each mapped onto one modulated symbol.
Fig. 2: Bit-mapping of Gray-coded QPSK modulation (I and Q axes; constellation points 00, 01, 10, 11) [20, 22].
After modulation of the symbols, the stream is multiplexed and transmitted over a complex AWGN channel. The received noisy symbol vectors are fed to the Turbo decoder. $r_0$, $r_1$ and $r_2$ are the received noisy vectors of the systematic and parity information, and $\bar{r}_0$ is the interleaved version of the received noisy vector of the systematic information.
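As an illustration of the modulation and channel stage described above, the following minimal Python sketch maps bit couples onto unit-energy QPSK symbols and adds complex Gaussian noise. The bit-to-axis assignment (first bit on the in-phase axis, second bit on the quadrature axis, bit 0 mapped to -1) is an assumption made for illustration and may differ from the constellation of Fig. 2. The names qpsk_modulate, awgn, x0 and sigma are hypothetical.

```python
import math
import random


def qpsk_modulate(couple):
    """Map one bit couple (a, b) to a unit-energy QPSK symbol.

    Assumed mapping: bit 0 -> -1, bit 1 -> +1, a on the I axis, b on the Q axis.
    """
    a, b = couple
    return complex(2 * a - 1, 2 * b - 1) / math.sqrt(2.0)


def awgn(symbol, sigma, rng):
    """Add complex Gaussian noise with standard deviation sigma per dimension."""
    return symbol + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
couples = [(0, 0), (0, 1), (1, 0), (1, 1)]
x0 = [qpsk_modulate(c) for c in couples]   # modulated systematic symbols
r0 = [awgn(x, 0.1, rng) for x in x0]       # received noisy symbols
```

With this assumed mapping each axis carries one bit, so neighbouring constellation points differ in exactly one bit, which is the Gray property.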
The conventional decoding process for duo-binary Turbo codes exchanges symbol-level extrinsic information iteratively between the two Turbo decoders after each half-iteration, as depicted in Fig. 3. In this work, the Max-Log-MAP algorithm has been used for decoding. Let the branch transition probability associated with input symbol $C_t = i$ (where $i$ takes the values 0, 1, 2 and 3 for duo-binary codes) from state $S_{t-1} = l'$ to state $S_t = l$ at time $t$ be denoted by $\gamma_t^{1,i}(l', l)$ for decoder 1. The components used in the decoding equations are the in-phase and quadrature components of these complex symbols, which are described as follows:
Note that the trellis diagram for the second decoder is similar, except that the decoder uses $\bar{r}_0$ and $r_2$ as channel inputs. The following subsections describe three decoding algorithms that can be used with duo-binary Turbo codes. The first two are variants of the symbol-level decoding algorithm, while the third is a symbol-level decoding scheme with bit-level LLRs as inputs.
The parameters shown in Fig. 4 can be defined as follows:
$\alpha^1$ is the forward recursive variable for decoder 1 and $\beta^1$ is the backward recursive variable of decoder 1.
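The role of the forward and backward recursive variables can be sketched with a generic Max-Log-MAP pass over a trellis. The Python code below is a textbook-style illustration under stated assumptions and does not reproduce the exact equations of the three methods: the two-state trellis is a hypothetical toy (the DVB-RCS code has eight states), all start and end states are taken as equally likely, metric normalization is omitted, and the name max_log_map is hypothetical.

```python
NEG_INF = float("-inf")


def max_log_map(gamma, transitions, num_states, num_symbols=4):
    """Generic Max-Log-MAP pass.

    transitions[k] = (previous state, input symbol, next state)
    gamma[t][k]    = branch metric of transitions[k] at time t
    Returns alpha, beta and the LLRs of symbols 1..3 relative to symbol 0.
    """
    steps = len(gamma)
    # Assumption: uniform initial and final state metrics.
    alpha = [[0.0] * num_states] + [[NEG_INF] * num_states for _ in range(steps)]
    beta = [[NEG_INF] * num_states for _ in range(steps)] + [[0.0] * num_states]
    for t in range(steps):  # forward recursion
        for k, (prev, _sym, nxt) in enumerate(transitions):
            alpha[t + 1][nxt] = max(alpha[t + 1][nxt], alpha[t][prev] + gamma[t][k])
    for t in range(steps - 1, -1, -1):  # backward recursion
        for k, (prev, _sym, nxt) in enumerate(transitions):
            beta[t][prev] = max(beta[t][prev], beta[t + 1][nxt] + gamma[t][k])
    llr = []
    for t in range(steps):
        best = [NEG_INF] * num_symbols
        for k, (prev, sym, nxt) in enumerate(transitions):
            metric = alpha[t][prev] + gamma[t][k] + beta[t + 1][nxt]
            best[sym] = max(best[sym], metric)
        llr.append([best[i] - best[0] for i in range(1, num_symbols)])
    return alpha, beta, llr


# Hypothetical toy trellis: 2 states, 4 input symbols per state.
transitions = [(s, i, (s + i) % 2) for s in range(2) for i in range(4)]
gamma = [[0.0] * len(transitions) for _ in range(3)]
for k, (_prev, sym, _nxt) in enumerate(transitions):
    if sym == 2:
        gamma[1][k] = 1.5  # favour symbol 2 at time index 1
alpha, beta, llr = max_log_map(gamma, transitions, num_states=2)
```

In this toy run only the LLR of symbol 2 at time index 1 is non-zero, since that is the only branch metric that favours a symbol.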
Decoding for Duo-Binary Turbo Codes using Method 1
The decoding Method 1 from [20] is a symbol level decoding scheme used in the DVB-RCS standard. The decoding equations are presented next. The first decoder's branch metric is given as:
Where, $\ln(P(C_t^2 = i))$ is the a-priori logarithmic symbol probability of symbol $i$ obtained from the second decoder; its value is set to zero at the beginning of the decoding process.
The number of computations required for the branch metric given in Eqn. (1) is shown in Table 1 . The information provided in Table 1 will be used in the analysis part of section 3. The forward recursive variable for the first decoder is computed as follows [7] :
Where, $M_S^1$ is the total number of states for decoder 1. The number of computations required for the forward recursive variable in Eqn. (2) is shown in Table 2. The information provided in Table 2 will be used in the analysis part of section 3. The backward recursive variable for the first decoder is computed as follows [7]:
The number of computations required for the backward recursive variable shown in equation (3) is shown in Table 3 . The information provided in Table 3 will be used in the analysis part of section 3. The equation for the log likelihood ratio is as follows [7] :
Here, Eqn. (4) gives the Log-Likelihood Ratio (LLR) of symbol $i$, where $i \in \{1, 2, 3\}$ for duo-binary Turbo codes. The number of computations required for the LLR of symbol $i$ at time instant $t$ in Eqn. (4) is shown in Table 4. The information provided in Table 4 will be used in the analysis part of section 3. The a-posteriori LLR comprises three LLRs, namely the a-priori LLR, the intrinsic LLR and the extrinsic LLR, related as follows [23]:
Where, 1, ( ) is the intrinsic LLR for decoder 1. The extrinsic LLR for the decoder 1 is thus calculated as follows:
The number of computations required for the extrinsic LLR of symbol $i$ at time instant $t$ in Eqn. (7) is shown in Table 5. The information provided in Table 5 will be used in the analysis part of section 3. The intrinsic LLR of decoder 1 associated with the systematic bits $(A_t, B_t)$ can be represented as:
Consider, for example, the intrinsic LLR of symbol $i = 01$, which can be expressed as:
Likewise, similar expressions can be obtained for the intrinsic LLRs for the other symbols.
The number of computations required for the intrinsic LLR of symbol $i$ at time instant $t$ in Eqns. (9) and (10) is shown in Table 6. The information provided in Table 6 will be used in the analysis part of section 3. The probability computation to be fed to the next decoder is as follows:
$P(C_t^1 = 00) + P(C_t^1 = 01) + P(C_t^1 = 10) + P(C_t^1 = 11) = 1 \quad (11)$
Therefore, 
The max approximation is defined as:
Applying the max approximation to the computation of the log probabilities:
The number of computations required for the a-posteriori probabilities of symbol $i$ at time instant $t$ in Eqns. (16-19) is shown in Table 7. The information provided in Table 7 will be used in the analysis part of section 3. The equations for decoder 2 are now presented. The branch metric of the second decoder is given as follows:
The number of computations involved in calculating the branch metric for each transition in the second decoder is similar to that shown in Table 1 .
The computation of the forward and backward recursive variables is done as follows:
The number of computations involved in calculating each forward transition metric at each time instant in the second decoder is similar to that shown in Table 2 .
The backward recursive variables for the second decoder are computed as follows:
The number of computations involved in calculating each backward transition metric at each time instant in the second decoder is similar to that shown in Table 3 . The equation for the log likelihood ratio is as follows:
Where, $M_S^2$ denotes the number of states for the second decoder.
The equations for the extrinsic information output from both decoders for double binary and triple binary are explained in [23] . The number of computations involved in calculating each LLR at each time instant in the second decoder is similar to that shown in Table 4 . The extrinsic LLR for the decoder 2 is thus calculated as follows:
The number of computations involved in calculating each extrinsic LLR at each time instant in the second decoder is similar to that shown in Table 5 . The number of computations involved in calculating each intrinsic LLR at each time instant in the second decoder is similar to that shown in Table 6 .
The probability computation to be fed to next decoder is [21] :
$P(C_t^2 = 00) + P(C_t^2 = 01) + P(C_t^2 = 10) + P(C_t^2 = 11) = 1 \quad (25)$
Therefore, ( 2 = 00) = 
Applying the max approximation to the computation of the log probabilities: 
The number of computations involved in calculating each a-posteriori probability at each time instant in the second decoder is similar to that shown in Table 7 .
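The conversion from symbol LLRs to the log-probabilities passed to the next decoder can be sketched as follows. The sketch assumes that each LLR is defined relative to symbol 00, that is, as the logarithm of the ratio of the probability of symbol $i$ to that of symbol 00, and that the four probabilities sum to one. It is an illustration under these assumptions rather than a transcription of the equations of Method 1, and the function names are hypothetical.

```python
import math


def log_probs_exact(llr):
    """Exact log-probabilities of symbols 00, 01, 10, 11 from three LLRs."""
    # P(00) = 1 / (1 + sum_i exp(LLR_i)) follows from the probabilities summing to 1.
    log_p00 = -math.log(1.0 + sum(math.exp(v) for v in llr))
    return [log_p00] + [v + log_p00 for v in llr]


def log_probs_max(llr):
    """Same quantities with the max approximation ln(sum exp(x)) ~ max(x)."""
    log_p00 = -max(0.0, *llr)
    return [log_p00] + [v + log_p00 for v in llr]


llr = [0.4, -1.0, 2.5]            # example LLRs of symbols 01, 10 and 11
exact = log_probs_exact(llr)
approx = log_probs_max(llr)
```

With the max approximation the most likely symbol always receives a log-probability of zero, and every approximated value is at least as large as its exact counterpart.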
Decoding for Duo-Binary Turbo Codes using Method 2 [6]
A variant of the decoding algorithm for duo-binary Turbo codes is described in [6] . The main difference as compared to Method 1 lies in the computations of the branch transition metrics and the intrinsic LLRs. The equations for the decoding algorithm are presented next. The branch transition metric is computed as:
Where, $L_c$ is the channel reliability estimate. The number of computations required for the branch metric in Eqn. (33) is shown in Table 8. The information provided in Table 8 will be used in the analysis part of section 3. The forward recursive variables for the first decoder are computed in the same way as for Method 1, as shown in Eqn. (2). The number of computations involved in each forward recursive variable at each time instant is as shown in Table 2. The backward recursive variables for the first decoder are computed as shown in Eqn. (3). The number of computations involved in each backward recursive variable at each time instant is as shown in Table 3. The equation for the log likelihood ratio is the same as shown in Eqn. (4). The number of computations involved in each log likelihood ratio at each time instant is as shown in Table 4.
The a-posteriori LLR comprises three LLRs, namely the a-priori LLR, the intrinsic LLR and the extrinsic LLR, which are related as shown in Eqns. (4), (5) and (6) [23]. The intrinsic LLRs are computed as follows:
The number of computations required for each intrinsic LLR of symbol $i$ at time instant $t$ in Eqns. (34) to (37) is shown in Table 9. The information provided in Table 9 will be used in the analysis part of section 3. The branch transition metric for the second decoder is computed as:
The number of computations for the branch transition metrics of the second decoder is similar to that for decoder 1, as shown in Table 8. The forward recursive variables for the second decoder are computed in the same way as for Method 1, as shown in Eqn. (2). The number of computations for the forward recursive variable is the same as shown in Table 2. The backward recursive variables for the second decoder are computed as shown in Eqn. (3). The number of computations for the backward recursive variable is the same as shown in Table 3. The equation for the log likelihood ratio is the same as shown in Eqn. (4). The number of computations for the a-posteriori LLR is the same as shown in Table 4.
The a-posteriori LLR comprises three LLRs, namely the a-priori LLR, the intrinsic LLR and the extrinsic LLR, which are related as shown in Eqn. (32) [23].
The number of computations for the extrinsic LLR is the same as shown in Table 5. The intrinsic LLR of decoder 2 associated with the interleaved systematic bits $(\bar{A}_t, \bar{B}_t)$ can be represented as:
The number of computations for the intrinsic LLR is the same as shown in Table 9 .
Decoding for Duo-Binary Turbo Codes using Method 3 [24]
The Turbo decoding equations to perform symbol-level decoding with bit-level LLRs as input were proposed by the authors of [24]. The encoding structure and Gray-coded modulation for the double binary Turbo codes are as shown in Fig. 1 and Fig. 2 respectively. The systematic and parity information are modulated based on the Gray-coded QPSK bit-mapping constellation of Fig. 2.
Where, real() and imag() denote the real and imaginary parts of the input complex arguments $r_0$, $r_1$ and $r_2$. The Turbo decoder is as shown in Fig. 6.
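The idea of feeding bit-level LLRs to a symbol-level decoder can be sketched in Python as follows. The sketch assumes unit-energy Gray-coded QPSK with bit 1 mapped to the positive axis, noise of standard deviation sigma per dimension, and independent bits within a couple. These assumptions, the scaling, and the names bit_llrs and symbol_metrics are illustrative and are not taken from [24].

```python
import math


def bit_llrs(r, sigma):
    """Bit-level LLRs ln(P(bit=1)/P(bit=0)) from one received QPSK symbol.

    Assumes each axis carries one bit with amplitude 1/sqrt(2).
    """
    amplitude = 1.0 / math.sqrt(2.0)
    scale = 2.0 * amplitude / sigma ** 2
    return scale * r.real, scale * r.imag


def symbol_metrics(l_a, l_b):
    """Symbol-level log-metrics (up to a common constant) from two bit LLRs."""
    metrics = {}
    for a in (0, 1):
        for b in (0, 1):
            # Bipolar bit values (-1 or +1) weight the corresponding bit LLRs.
            metrics[(a, b)] = 0.5 * ((2 * a - 1) * l_a + (2 * b - 1) * l_b)
    return metrics


r0 = complex(0.6, -0.8)                 # example received systematic symbol
l_a, l_b = bit_llrs(r0, sigma=0.5)
metrics = symbol_metrics(l_a, l_b)
best = max(metrics, key=metrics.get)    # most likely couple for this sample
```

Because the symbol metrics are built from per-bit quantities, the same construction could in principle be applied to bit LLRs produced by a demapper for another constellation, which is the property that relaxes the QPSK-only restriction of the purely symbol-level methods.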
The parameters shown in Fig. 6 The first decoder's branch transition metric is given as:
The number of computations required for the branch metric in Eqn. (58) is shown in Table 10. The information provided in Table 10 will be used in the analysis part of section 3. The forward recursive variables for the first decoder are computed in the same way as for Method 1, as shown in Eqns. (3) and (4). The number of computations for the forward recursive variable is the same as shown in Table 2. The backward recursive variables for the first decoder are computed in the same way as for Method 1, as shown in Eqns. (5) and (6). The number of computations for the backward recursive variable is the same as shown in Table 3. The log-likelihood ratios for the first decoder are computed in the same way as for Method 1, as shown in Eqn. (4). The number of computations for the a-posteriori LLRs is the same as shown in Table 4. The a-posteriori LLR comprises three LLRs, namely the a-priori LLR, the intrinsic LLR and the extrinsic LLR, which are related as shown in Eqns. (4), (5) and (6) [23]. The number of computations for the extrinsic LLR is the same as shown in Table 5. The intrinsic LLRs are computed as follows:
The number of computations required for each intrinsic LLR of symbol $i$ at time instant $t$ in Eqn. (45) is shown in Table 11. The information provided in Table 11 will be used in the analysis part of section 3.
The operation of the second decoder can now be started. The branch transition metric for the second decoder is computed as:
[( ̅̅̅̅̅ ) . ( )] + [( ̅̅̅̅̅ ) . ( )] + [( 2 ) . ( 2 )] + [( 2 ) . ( 2 )])
(46)
The number of computations for the branch transition metrics of decoder 2 is the same as shown in Table 10 .
The forward recursive variables for the second decoder are computed in the same way as for Method 1, as shown in Eqn. (2). The number of computations for the forward recursive variable of decoder 2 is the same as shown in Table 2. The backward recursive variables for the second decoder are computed as shown in Eqn. (3). The number of computations for the backward recursive variable of decoder 2 is the same as shown in Table 3. The equation for the log likelihood ratio is the same as shown in Eqn. (4). The number of computations for the a-posteriori LLR of decoder 2 is the same as shown in Table 4. The a-posteriori LLR comprises three LLRs, namely the a-priori LLR, the intrinsic LLR and the extrinsic LLR, which are related as shown in Eqn. (5) [23]. The number of computations for the extrinsic LLRs of decoder 2 is the same as shown in Table 5. The intrinsic LLR of decoder 2 associated with the interleaved systematic bits $(\bar{A}_t, \bar{B}_t)$ can be represented as:
The computational equations for the intrinsic LLRs are as follows:
2, 1 = 2 ([( ̅̅̅̅̅ ). (−1)] + [( ̅̅̅̅̅ ). (+1)])

2, 2 = 2 ([( ̅̅̅̅̅ ). (+1)] + [( ̅̅̅̅̅ ). (−1)])
2, 3 = 2 ([( ̅̅̅̅̅ ). (+1)] + [( ̅̅̅̅̅ ). (+1)])
The number of computations for the intrinsic LLRs of decoder 2 is the same as shown in Table 11 .
Computational Complexity Analysis
In this section, the computational complexities of the three decoding methods are compared. The breakdown of the number of computations at each half-iteration for Methods 1, 2, and 3 is shown in Tables 12, 13, and 14, respectively. For Method 1, the computations of the LLRs of Eqn. (4), as shown in Table 4, are multiplied by 4 and $N_c$ for the 4 symbols over the whole couple length. The computations of the extrinsic LLRs of Eqn. (7), as shown in Table 5, are also multiplied by 4 and $N_c$. The computations of the intrinsic LLRs of Eqns. (9) and (10), as shown in Table 6, cater for the 4 symbols and are thus only multiplied by the couple length, $N_c$. Similarly, the computations of the a-posteriori probabilities of Eqns. (11-14), as shown in Table 7, are only multiplied by the couple length, $N_c$. The values obtained for the metrics computed over one half-iteration for Method 2 are explained next. The number of computations for the branch transition metric of Eqn. (33) is shown in Table 8. The computations of the intrinsic LLRs, as shown in Table 9, are multiplied by 4 and $N_c$ to cater for the 4 symbols and the couple length. For Method 3, the computations of the intrinsic LLRs, as shown in Table 11, are likewise multiplied by 4 and $N_c$ to cater for the 4 symbols and the couple length. The total number of computations for each half-iteration for Method 1 is given as:
Here, the quantity on the left-hand side is the total number of computations per half-iteration for Method 3.
It can be observed that the total number of computations per half-iteration for Method 3 is lower than that required for both Methods 1 and 2. In total, Method 2 and Method 3 require 118 and 130 fewer computations than Method 1, respectively. In percentage terms, Method 2 and Method 3 require 17.8 % and 19.6 % fewer computations than Method 1, respectively, at each half-iteration. Note that this comparison assumes that an addition and a multiplication each count as one computation.
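As a quick arithmetic check of these figures, the short Python sketch below backs out the Method 1 total implied by each pair of numbers, assuming that both percentages are taken relative to the same Method 1 total. The implied baseline is an inference from the rounded percentages quoted above and is not a value taken from the tables; the function name implied_baseline is hypothetical.

```python
def implied_baseline(saved: float, percent: float) -> float:
    """Baseline total implied by an absolute saving and a percentage saving."""
    return saved / (percent / 100.0)


base_from_method2 = implied_baseline(118, 17.8)
base_from_method3 = implied_baseline(130, 19.6)
# Additional saving of Method 3 over Method 2, in computations.
extra_saving = 130 - 118
```

Both pairs of figures point to nearly the same baseline once rounding of the percentages is taken into account, so the two reductions are mutually consistent.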
SIMULATION RESULTS
In this section, the performances of the three different decoding methods for duo-binary Turbo codes, both with and without circular states, are compared.
QPSK modulation has been used in all the simulations.
An interleaver size of couple length Nc has been used in all the simulations. The parameters for the duo-binary Turbo code used are as follows [19, 20, 
CONCLUSION
In this paper, an investigation of the state of the art of different iterative decoding techniques based on the Max-Log-MAP algorithm has been presented for duo-binary Turbo codes. Different couple lengths and code-rates have been employed, with and without the incorporation of circular states. Essentially, three different decoding techniques have been identified for this work. The identified schemes were implemented in Matlab and all the simulations were carried out using an AWGN channel, QPSK modulation and the eight-state double binary Turbo encoder of the DVB-RCS standard. The computational complexities of the three methods were analyzed for one half-iteration. It can be observed that Method 2 and Method 3 require 17.8 % and 19.6 % fewer computations than Method 1, respectively, at each half-iteration, under the assumption that an addition and a multiplication each count as one computation. Intensive simulations were then carried out for duo-binary Turbo codes with and without circular states for couple lengths of 48, 212 and 752 and code-rates of 1/3, 1/2 and 2/3. In most results, Methods 2 and 3 outperform Method 1 for the whole BER range. These results are important when low-complexity decoding algorithms for non-binary Turbo codes need to be considered. Compared to previous work, the investigation in this paper is geared towards analyzing three different sets of decoding equations for the Max-Log-MAP algorithm used with duo-binary Turbo codes. Additionally, based on the equations used to compute the different parameters of the iterative process, a computational complexity analysis was also performed. Methods 1 and 2 are variants of a symbol-level decoding mechanism with the symbol-level a-priori LLRs as input.
However, Method 3 performs symbol-level decoding using bit-level LLRs, and the results are significant in the sense that the limitation of using only QPSK modulation with duo-binary Turbo codes is overcome with this technique. Using higher order modulations with duo-binary Turbo codes helps to achieve higher spectral efficiencies. This possibility of using higher order modulations with non-binary Turbo codes opens avenues for the incorporation of prioritization constellation mapping and other schemes with a view to enhancing error performance with decoders having low computational complexity.
