Novel serial code concatenation strategies for error floor mitigation of low-density parity-check and turbo product codes by Morero, Damián Alfonso & Hueda, Mario Rafael
Can. J. Elect. Comput. Eng., Vol. 36, No. 2, Spring 2013
Manuscript received January 27, 2013; accepted March 7, 2013
*  D. A. Morero and M. R. Hueda are with Laboratorio de Comunicaciones
Digitales - Universidad Nacional de Cordoba - CONICET. Av. Velez Sarsfield
1611 - Cordoba (X5016GCA) - Argentina; Email: dmorero@gmail.com,
mhueda@gmail.com
This work was supported in part by Fundación Fulgor.   
Associate Editor managing this paper’s review: S. Yousefi 
Novel serial code concatenation strategies for
error floor mitigation of low-density parity-check
and turbo product codes
Damian A. Morero and Mario R. Hueda* 
This paper presents a novel multiple serial code concatenation (SCC) strategy to combat the error-floor problem in iterated sparse graph-based error correcting
codes such as turbo product codes (TPC) and low-density parity-check (LDPC) codes. Although SCC has been widely used in the past to reduce the
error floor in iterative decoders, the main stumbling block for its practical application in high-speed communication systems has been the need for long and
complex outer codes. Alternatively, short outer block codes with interleaving have been shown to provide a good tradeoff between complexity and performance.
Nevertheless, their application to next-generation high-speed communication systems is still a major challenge because of the careful design of the long,
complex interleavers needed to meet the requirements of these applications. The SCC scheme proposed in this work is based on the use of short outer block
codes. Departing from techniques used in previous proposals, the long outer code and interleaver are replaced by a simple block code combined with a novel
encoding/decoding strategy. This allows the proposed SCC to provide a better tradeoff between performance and complexity than previous techniques.
Several application examples showing the benefits of the proposed SCC are described. In particular, a new coding scheme suitable for high-speed optical
communication is introduced.
Keywords: concatenated codes; error correction codes; high-speed optical communication; product codes; turbo codes.
I Introduction
In future high-speed communication systems (e.g., next-generation optical
transport networks (OTN)), forward error correction (FEC) codes with net
coding gains (NCGs) $\geq 10$ dB at a bit error rate (BER) of $10^{-15}$ and
with an overhead (OH) as low as possible (e.g., $\sim 20\%$) are
mandatory [1]–[3]. Given their superior performance and suitability for
parallel processing, large block size low-density parity-check (LDPC)
codes and turbo product (TP) codes have been considered as FEC
coding schemes of choice for ultra-high-speed transmission systems.
Unfortunately, these iterative coding schemes usually have error floor
problems which significantly degrade their performance at low BER.
Numerous techniques have been proposed in the literature to lower
the error floor [1]–[5]. These techniques can be divided into two
categories. The first one aims at eliminating all weaknesses in the
decoder algorithm that create the error floor, while the second one aims
at correcting the residual error pattern by adding an outer code. The
first category comprises several post-processing [1], [4] and improved
decoding algorithms [5] which are mainly proposed for LDPC codes.
The design and performance evaluation of these algorithms may be
difficult since knowledge of both the weight and the structure of the
dominant error patterns is required. On the other hand, the addition
of an outer code provides a simple and more general solution to the
error floor problem. In particular, its performance can be estimated
based on the knowledge of the weights and the probabilities of the
error patterns. For these reasons, several SCC FEC schemes for 100
Gigabits per second (Gb/s) OTN applications have been proposed
(see [2]–[3] and references therein). In [2], it is experimentally shown
that a concatenated code with 20.5% overhead, based on an inner LDPC
code and an outer Reed-Solomon (RS) code, achieves an NCG of 9 dB at a
BER of $10^{-13}$. The concatenation of two hard-decision block codes with an
LDPC code is another alternative proposed in [2]. The total overhead of
this triple-concatenated approach is 20% and the expected NCG is
10.80 dB at a BER of $10^{-15}$. In [3], the authors present a concatenated
LDPC+RS coding scheme with 20.5% OH and an NCG of 11.3 dB at a
BER of $10^{-15}$.
The use of short outer block codes with interleaving has also been
considered in the past to (i) improve the performance and (ii) reduce
the error floor [6]. Based on the structure of the dominant error
patterns, the interleaving-based SCC solution is able to achieve a good
tradeoff between performance and complexity. However, the evaluation
of the dominant error patterns is highly complex in numerous
codes of practical interest as a result of the very low BER and the high
NCGs required in high-speed applications such as OTN. To avoid the
evaluation of the structure of the error patterns, long pseudo-random
interleavers can be used [6]. Unfortunately, long interleavers
significantly increase not only the implementation complexity but also the
latency. Therefore, the use of SCC with interleaving in future
high-speed transmission systems is still a major challenge.
The present paper describes a novel SCC strategy designed to reduce
the error floor problem in very high-speed communication systems.
The key ingredient of our technique is the replacement of the
long outer code and interleaver by simple block codes in combination
with a novel encoding/decoding strategy [7]. Based on this finding, we
demonstrate that an error floor reduction of LDPC codes in multigigabit
applications can be achieved with a drastic reduction of complexity$^{1}$
(e.g., one order of magnitude) in comparison with existing SCC
solutions. As a second contribution of this work, we extend the new
SCC strategy to combat the error floor caused by both (i) the
near-codewords$^{2}$ with non-zero syndrome [8] and (ii) the minimum distance
codewords. The latter allows the new SCC technique to be efficiently
used with TP codes (TPC). In particular, we show that a TPC composed
of two extended Hamming (EH) codes, in combination with the
proposed SCC algorithm, can achieve an NCG of $\sim 11.2$ dB at
BER $= 10^{-15}$ with $\sim 22\%$ total overhead and an error floor at
$\sim 7 \cdot 10^{-17}$. It is important to highlight that this NCG is
approximately 0.4 dB higher than that achieved by TPC schemes reported
in past literature [2], [9]–[10].
The rest of this paper is organized as follows. Section II introduces
the basic notation used throughout the paper and describes the classical SCC
schemes proposed for ultra-high-speed transmission systems. The new
SCC technique is described and analyzed in Section III. An example of
the use of the proposed SCC for error floor reduction of TPC with
application in high-speed optical communication is presented in Section
IV. Finally, Section V reviews the main conclusions of the paper.
II Background
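This section introduces basic concepts and notation used in the paper.
Let $\Omega$ be the set of all possible error patterns at the output of the inner
decoder. An error pattern is defined as the set of all bits in error that
jointly take place in one received codeword. Let $p(\omega)$ be the probability
of a certain error pattern $\omega \in \Omega$ at a given signal-to-noise ratio
(SNR). The exact word error rate (WER) due to $\Omega$ is defined by

$$P_w(\Omega) = \sum_{\omega \in \Omega} p(\omega), \tag{1}$$

while the transmission BER, $P_b(\Omega)$, and the information BER,
$\bar{P}_b(\Omega)$, are given by

$$P_b(\Omega) = \frac{1}{n} \sum_{\omega \in \Omega} w(\omega)\, p(\omega), \tag{2}$$

$$\bar{P}_b(\Omega) = \frac{1}{k} \sum_{\omega \in \Omega} \bar{w}(\omega)\, p(\omega), \tag{3}$$

where $n$ and $k$ are the length and the dimension of the code,
respectively, while $w(\omega)$ and $\bar{w}(\omega)$ are the weight (including the
redundant bits) and the information-weight (which does not include the
redundant bits) of the error pattern $\omega$, respectively. Set $\Omega$ can be
divided into two disjoint subsets $\tilde{\Omega}$ and $\hat{\Omega}$ (i.e.,
$\Omega = \tilde{\Omega} \cup \hat{\Omega}$ and $\tilde{\Omega} \cap \hat{\Omega} = \emptyset$),
where $\tilde{\Omega}$ is the set of all error patterns that cause the error floor and
$\hat{\Omega}$ contains the non-problematic error patterns. From (1) and (3), it is simple
to derive the following upper bound for the information bit error rate
due to $\tilde{\Omega}$:

$$\bar{P}_b(\tilde{\Omega}) = \frac{1}{k} \sum_{\omega \in \tilde{\Omega}} \bar{w}(\omega)\, p(\omega) \leq \frac{w_{max}}{k}\, P_w(\tilde{\Omega}), \tag{4}$$

where $w_{max} = \max_{\omega \in \tilde{\Omega}} \{\bar{w}(\omega)\}$ [11]. Parameters $w_{max}$ and $P_w(\tilde{\Omega})$
will be used throughout this paper to design various code concatenation
schemes.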
II.A Serial Code Concatenation (SCC)
Code concatenation is a known FEC technique based on the combination
of an inner code and an outer code [6], [12]–[14]. Let $\mathcal{C}_1$ and $\mathcal{C}_2$
denote the inner and outer code, respectively. Each code is defined by
the set of parameters $[n_j, k_j, d_j]$, where $n_j$, $k_j$, and $d_j$ are the block
size, the dimension, and the minimum distance of the code $\mathcal{C}_j$,
respectively. The overhead of the code is defined as $\Theta_j = (n_j - k_j)/k_j$.
Let $C_j^i$ denote the $i$-th codeword of $\mathcal{C}_j$. A codeword $C_j^i$ is composed
of the information data block $D_j^i$ of length $k_j$ and the parity block
$P_j^i$ of length $r_j = n_j - k_j$. In high-speed communication systems, the
inner code $\mathcal{C}_1$ is generally an LDPC or TP code, while the outer code
$\mathcal{C}_2$ is a block code with error correction capability $t_2 = \lfloor (d_2 - 1)/2 \rfloor$
designed to eliminate or reduce the error floor of $\mathcal{C}_1$.
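The code parameters defined above can be captured in a few lines. The following is a minimal sketch, not part of the original work: the class name `BlockCode` and its property names are hypothetical, and the BCH [1320,1221,19] code used as input is the outer code that appears later in Example 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockCode:
    """Hypothetical container for the [n, k, d] parameters of a block code."""
    n: int  # block size
    k: int  # dimension
    d: int  # minimum distance

    @property
    def redundancy(self) -> int:
        # r = n - k parity bits (or symbols) per codeword
        return self.n - self.k

    @property
    def overhead(self) -> float:
        # Theta = (n - k) / k
        return (self.n - self.k) / self.k

    @property
    def t(self) -> int:
        # error correction capability t = floor((d - 1) / 2)
        return (self.d - 1) // 2


bch = BlockCode(n=1320, k=1221, d=19)
print(bch.redundancy, bch.t, f"{100 * bch.overhead:.2f}%")
```

With these parameters the code corrects 9 bit errors per codeword at an overhead of about 8.11%.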
SCC Scheme I (SCC-I)
Figure 1 shows a classical serial code concatenation scheme denoted
here by SCC-I. The encoding process is composed of two steps. First,
the uncoded frame is divided into $m$ blocks of $k_2$ bits denoted $D_2^i$
for $i = 1, \ldots, m$ (e.g., $m = 3$ in Fig. 1). Each $D_2^i$ block is encoded by
$\mathcal{C}_2$, generating the codeword $C_2^i$. In the second step, the codewords
$C_2^i$ are used as the datawords of code $\mathcal{C}_1$ (i.e., $D_1^i = C_2^i$) and they are
encoded by $\mathcal{C}_1$, generating the codewords $C_1^i$ that will be transmitted.
In order to eliminate the error floor of $\mathcal{C}_1$ generated by the error
patterns $\tilde{\Omega}$, $\mathcal{C}_2$ must correct at least $w_{max}$ bits. Note that $\mathcal{C}_2$ will also
correct the error patterns $\omega \in \Omega$ with $\bar{w}(\omega) \leq t_2$, but it may introduce
additional errors in the error patterns $\Omega' = \{\omega \in \Omega : \bar{w}(\omega) > t_2\}$. Since
the decoder for $\mathcal{C}_2$ modifies a maximum of $t_2$ bits, the maximum
number of additional errors introduced over the error patterns $\Omega'$ is
no larger than $t_2$. Therefore, as a worst case scenario, the information
BER $\bar{P}_b(\Omega')$ is increased by a factor $(2t_2 + 1)/(t_2 + 1) < 2$. This
penalty can be neglected at very low BER (such as $10^{-15}$) where the
slope of the BER vs. SNR curve is very high (in particular for codes
with performance close to the Shannon limit).

Figure 1: Encoding of SCC-I.

$^{1}$In this paper we use the equivalent number of gates of different logic cells
(e.g., AND, XOR, etc.) to compute the complexity of a given SCC approach.
$^{2}$Let $\mathcal{C}$ be a binary linear code of length $n$. An $(a, b)$ near-codeword is a
binary vector of length $n$ and Hamming weight $a$ whose syndrome has weight $b$
[8]. In particular, if $b = 0$, the $(a, b)$ near-codeword is a codeword. The error floor
in LDPC codes is typically dominated by $(a, b)$ near-codewords where $b$ is small
but greater than zero, and $a$ is lower than the minimum distance of the code.
Figure 2 shows a variation of SCC-I where an interleaver is introduced
between $\mathcal{C}_1$ and $\mathcal{C}_2$. Depending on the structure of the error
pattern of $\mathcal{C}_1$, it may be possible to design a proper interleaver to
divide each error pattern among several codewords of $\mathcal{C}_2$. In this way, the
correction capability required for $\mathcal{C}_2$ can be relaxed.
SCC Scheme II (SCC-II)
The SCC-I scheme can be generalized as depicted in Fig. 3. This
scheme, denoted SCC-II, uses a longer outer code $\mathcal{C}_2$ to protect a
frame of $m$ inner datawords (e.g., $m = 3$ in Fig. 3). Let $t_2 = \tau \cdot w_{max}$
be the error correction capability of $\mathcal{C}_2$. Then, the error floor is
eliminated if $\tau = m$. On the other hand, if $\tau < m$ the error floor is reduced
but not eliminated. Based on the binomial distribution [11], the residual
error floor $\bar{P}_b^{(II)}(\tilde{\Omega})$ of SCC-II can be approximated by

$$\bar{P}_b^{(II)}(\tilde{\Omega}) \approx \frac{w_{max}}{m\, k_1} \sum_{i=\tau+1}^{m} i \binom{m}{i} [P_w(\tilde{\Omega})]^{i}\, [1 - P_w(\tilde{\Omega})]^{m-i}. \tag{5}$$

In virtually all applications $P_w(\tilde{\Omega}) < 10^{-4}$, $w_{max} < 50$, and
$k_1 > 500$. Therefore, by choosing $m = 20$ and $\tau = 4$ it is possible to
reduce the error floor below $10^{-17}$. Furthermore, because the required
value of $\tau$ is significantly lower than $m$, the outer code overhead
of SCC-II is much lower than that of SCC-I. However, this advantage
comes at the expense of implementing an outer code $m$ times
longer with a correction capability $\tau$ times higher. Unfortunately, this
complexity increase makes the use of SCC-II prohibitive in most
high-speed applications.
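The binomial approximation of the residual error floor can be checked numerically. The sketch below is only an illustration under a stated assumption: when $i > \tau$ error patterns fall in the same frame, each one is taken to contribute $w_{max}$ information bit errors out of the $m \cdot k_1$ information bits of the frame. The function name `residual_error_floor` is hypothetical.

```python
from math import comb


def residual_error_floor(p_w: float, w_max: int, k1: int, m: int, tau: int) -> float:
    """Approximate residual BER when more than tau of the m inner
    codewords in a frame contain an error pattern (binomial model).

    Assumption: each of the i error patterns contributes w_max
    information bit errors, out of m * k1 information bits per frame.
    """
    total = 0.0
    for i in range(tau + 1, m + 1):
        prob_i = comb(m, i) * p_w**i * (1.0 - p_w) ** (m - i)
        total += i * w_max / (m * k1) * prob_i
    return total


# Worst-case values quoted in the text: P_w < 1e-4, w_max < 50, k1 > 500.
floor_generic = residual_error_floor(p_w=1e-4, w_max=50, k1=500, m=20, tau=4)
print(f"{floor_generic:.2e}")
```

Under this model the worst-case parameters quoted in the text give a residual floor of a few times $10^{-18}$, which is consistent with the stated reduction below $10^{-17}$.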
III New serial code concatenation strategy
This section describes a novel SCC scheme to combat the error floor.
The new technique is able to achieve an error floor reduction similar to
that accomplished by SCC-II. However, the new approach builds
on short outer block codes as in SCC-I; therefore, the implementation
complexity can be significantly reduced.
Next, we assume that the error floor of the inner code $\mathcal{C}_1$ is caused
by a set of detectable error patterns $\tilde{\Omega}$, as experienced in most LDPC
codes (i.e., the error patterns are not codewords). The new SCC approach
uses two short outer block codes (denoted as $\mathcal{C}_2$ and $\mathcal{C}_3$)
to combat the error floor of the inner code $\mathcal{C}_1$. The encoding process
comprises three steps (see Fig. 4):
1) The uncoded frame is divided into $m$ datawords of $k_3$ bits denoted
$D_3^i$ for $i = 1, \ldots, m$ (e.g., $m = 3$ in the example of Fig. 4).
Each dataword $D_3^i$ is encoded by $\mathcal{C}_3$, generating the parity bits
$P_3^i$.
2) The $m$ parity blocks $P_3^i$ are grouped together into the dataword
$D_2^1$, which is encoded by $\mathcal{C}_2$, generating the parity bits $P_2^1$.
3) The parity bits $P_2^1$ are divided into $m$ sub-blocks of equal (or
almost equal) size denoted as $P_{1,2}^i$ with $i = 1, \ldots, m$. Each
dataword $D_1^i$ is generated by the concatenation of the dataword $D_3^i$
and the parity bits $P_{1,2}^i$. Finally, each dataword $D_1^i$ is encoded by
code $\mathcal{C}_1$, generating the codeword $C_1^i$ to be transmitted over the
channel.
Figure 5 presents an example of the decoding process when the error
pattern occurs in the third codeword, $C_1^3$. The process comprises
the following steps:
1) Codewords $C_1^i$ for $i = 1, \ldots, m$ are decoded and those containing
uncorrectable errors are detected.
2) From the error-free datawords $D_1^i$ obtained at Step 1, the datawords
$D_3^i$ are extracted and encoded in order to recover the parity
bits $P_3^i$.
3) The dataword $D_2^1$ is reconstructed from the parity bits $P_3^i$
generated at Step 2, while the unavailable parity bits (those that belong
to the corrupted codewords) are marked as erasures (e.g., $P_3^3$ in
the example of Fig. 5). The parity bits $P_2^1$ are extracted from the
error-free datawords $D_1^i$ and those bits related to the corrupted
datawords are marked as erasures. Erasure decoding is carried
out over the codeword $C_2^1$, whereby the parity bits $P_3^i$ of the
corrupted codewords $C_3^i$ are regenerated.
4) Finally, the codewords $C_3^i$ with residual errors are decoded.

Figure 2: Encoding of SCC-I with interleaving.
Figure 3: Encoding of SCC-II.
Figure 4: Encoding of the proposed SCC technique.
III.A Performance and overhead
The proposed SCC achieves a performance similar to the one derived
from SCC-II if $\mathcal{C}_3$ corrects $w_{max}$ bits and $\mathcal{C}_2$ recovers
$\tau \cdot (n_3 - k_3) + (\tau/m) \cdot (n_2 - k_2)$ erased bits. Note that $\tau \cdot (n_3 - k_3)$
are the parity bits of $\tau$ codewords of $\mathcal{C}_3$, while $(\tau/m) \cdot (n_2 - k_2)$
are the parity bits of $\mathcal{C}_2$ that belong to $\tau$ corrupted codewords of $\mathcal{C}_1$.
Assuming that $\mathcal{C}_2$ is a maximum distance separable (MDS) code
(e.g., an RS code), its erasure correction capability is
equal to its redundancy [14]; therefore

$$n_2 - k_2 = \tau \cdot (n_3 - k_3) + \frac{\tau}{m}\,(n_2 - k_2), \tag{6}$$

then,

$$n_2 - k_2 = \frac{m\,\tau}{m - \tau}\,(n_3 - k_3). \tag{7}$$

Finally, the overhead of the new SCC is

$$\Theta = \frac{n_2 - k_2}{m \cdot k_3} = \left[\frac{\tau}{m - \tau}\right] \frac{n_3 - k_3}{k_3}. \tag{8}$$
Numerical evaluation of (8) shows that the overhead of the proposed
SCC scheme is lower than that required by the classical SCC-I and
SCC-II schemes in practical high-speed applications (e.g., $k_1 > 500$,
$\tau \leq 4$, and $m \geq 20$; see [7] for the full set of conditions, which also bounds $w_{max}$).
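As a quick sanity check of the MDS sizing rule and the resulting overhead, the following sketch evaluates the redundancy required from the code protecting the parity blocks and the overhead of the scheme, using the parameters that the paper gives in Examples 1 and 2. The function names are hypothetical; exact rational arithmetic is used so that the integer redundancy is recovered without rounding.

```python
from fractions import Fraction


def c2_redundancy(m: int, tau: int, r3: int) -> Fraction:
    """Redundancy n2 - k2 of an MDS code C2 able to recover the erased
    parity of tau corrupted codewords: m * tau / (m - tau) * (n3 - k3)."""
    return Fraction(m * tau, m - tau) * r3


def scc_overhead(m: int, tau: int, n3: int, k3: int) -> Fraction:
    """Overhead (n2 - k2) / (m * k3) of the proposed SCC scheme."""
    return c2_redundancy(m, tau, n3 - k3) / (m * k3)


# Example 1: BCH [1410,1311,19] as C3, m = 36, tau = 3 (units: bits).
r2_ex1 = c2_redundancy(36, 3, 1410 - 1311)
oh_ex1 = scc_overhead(36, 3, 1410, 1311)

# Example 2: RS [830,794,37] as C3, m = 26, tau = 2 (units: RS symbols).
r2_ex2 = c2_redundancy(26, 2, 830 - 794)
oh_ex2 = scc_overhead(26, 2, 830, 794)

print(r2_ex1, f"{100 * float(oh_ex1):.4f}%")
print(r2_ex2, f"{100 * float(oh_ex2):.3f}%")
```

The results, 324 parity bits with 0.6865% overhead and 78 parity symbols with 0.378% overhead, agree with the RS [432,396,37] and RS [1014,936,79] codes chosen in the two examples.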
III.B Complexity
The proposed SCC scheme is composed of three encoders at the transmitter
side (one for each code) and one $\mathcal{C}_3$ encoder and three decoders
at the receiver side (one decoder for each code), as observed in Fig. 6.
Let $\Phi^e(\mathcal{C}_x, T)$ and $\Phi^d(\mathcal{C}_x, T)$ be the complexity of the encoder and
decoder, respectively, of code $\mathcal{C}_x$ operating at a throughput of $T$ bits/s.
In this work, we use the equivalent number of gates of different logic
cells (e.g., AND, XOR, etc.) as a complexity measure of a given SCC
approach. The equivalent gate count is computed based on the TSMC
cells described in Table 1 [15]. The estimated number of logic gates of
RS codes over GF($2^q$) and BCH codes for a classic $p$-parallel implementation
is summarized in Table 2 [16]–[17].

Figure 6: Encoding and Decoding Block Diagram of the Proposed SCC Scheme.
Table 1
TSMC gate count of basic cells
Table 2
Binary BCH and RS code complexities

The total complexity $\Phi_T$ of the proposed SCC scheme is

$$\Phi_T \approx 2\,\Phi^e(\mathcal{C}_3, T) + \Phi^d(\mathcal{C}_3, (\tau/m)\,T) + \Phi(\mathrm{Buffer}) + \Phi^e(\mathcal{C}_2, (r_3/k_3)\,T) + \Phi^d(\mathcal{C}_2, (r_3/k_3)\,T), \tag{9}$$

where the decoder of $\mathcal{C}_3$ has to correct $\leq \tau$ of the $m$ codewords
per frame and therefore its throughput is reduced to $\approx (\tau/m)\,T$.
Similarly, the decoder of $\mathcal{C}_2$ has to correct one codeword per frame,
which reduces its throughput to $\approx (r_3/k_3)\,T$. Finally, $\Phi(\mathrm{Buffer})$ is
the complexity of approximately $3mk_1$ bits of buffering required to
compensate for the latencies of the different codes. These complexities
depend on the adopted outer codes. A good tradeoff between
performance and complexity can be achieved by using the new SCC with
an RS code over GF($2^q$) as code $\mathcal{C}_2$ and a binary BCH code as
code $\mathcal{C}_3$. As we shall show in the following examples, the complexity
of the proposed SCC scheme is significantly lower than that of the
existing SCC as a result of the throughput reduction of the decoders.
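To give a feel for the throughput reduction, the sketch below evaluates the two reduced throughputs, $(\tau/m)\,T$ for the $\mathcal{C}_3$ decoder and $(r_3/k_3)\,T$ for the $\mathcal{C}_2$ encoder and decoder, with the parameters of Example 1 ($m = 36$, $\tau = 3$, BCH [1410,1311,19] as $\mathcal{C}_3$, $T \approx 100$ Gb/s). The function name and dictionary keys are hypothetical, and no gate counts are computed because the per-code formulas of Table 2 are not reproduced here.

```python
def outer_code_throughputs(T: float, m: int, tau: int, n3: int, k3: int) -> dict:
    """Effective throughputs (bits/s) seen by the outer-code blocks."""
    r3 = n3 - k3
    return {
        "c3_encoder": T,            # runs on every dataword
        "c3_decoder": T * tau / m,  # only <= tau of m codewords need correction
        "c2_codec": T * r3 / k3,    # C2 only processes the C3 parity bits
    }


rates = outer_code_throughputs(T=100e9, m=36, tau=3, n3=1410, k3=1311)
for block, rate in rates.items():
    print(f"{block}: {rate / 1e9:.2f} Gb/s")
```

Both reduced rates are below one tenth of the line rate, which is the reason the decoders of the outer codes can be built with far less parallelism than a decoder running at the full throughput.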
III.C Example 1: Concatenated LDPC+RS+BCH Code
Let $\mathcal{C}_1$ be the [2640,1320] Margulis LDPC code [8]. This code has
an error floor starting at $P_w(\tilde{\Omega}) \approx 10^{-5}$. This error floor is caused by
trapping sets (TS), in particular (12,4) TS and (14,4) TS$^{3}$, which have 6
and 7 information bits, respectively. However, as noticed in [18], there
are also trapping sets of weight 15, 16, 17 and 18 bits. We take into
account these additional trapping sets by designing an SCC scheme to
correct error patterns with $w_{max} = 9$ information bits. The solutions
for the different SCC schemes are described in the following items:
• SCC-I: the outer code $\mathcal{C}_2$ must be the BCH [1320,1221,19] code.
Therefore, the SNR penalty and bandwidth overhead are
$10 \cdot \log_{10}(1320/1221) = 0.3386$ dB and
$(1320 - 1221)/1221 = 8.11\%$, respectively.
• SCC-II: the outer code $\mathcal{C}_2$ must be a binary BCH
[47520,47088,55] code, which corrects up to $\tau = 3$ error patterns
of 9 bits. The error floor is not entirely eliminated, but it is reduced
to $\bar{P}_b^{(II)}(\tilde{\Omega}) \approx 5 \cdot 10^{-19}$ with an SNR penalty and bandwidth
overhead of 0.0397 dB and 0.9174%, respectively.
• Proposed SCC: setting $m = 36$ and $\tau = 3$ as in SCC-II (in this
way, the same residual error floor is achieved), $\mathcal{C}_3$ can be a BCH
[1410,1311,19] code over GF($2^{11}$), and $\mathcal{C}_2$ can be an RS [432,396,37] code
over GF($2^{9}$). The number of additional parity bits is $36 \cdot 9 = 324$.
This OH introduces an SNR penalty and a bandwidth overhead of
0.0297 dB and 0.6865%, respectively.
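The SNR penalties and bandwidth overheads listed above follow directly from the rate of each outer-code stage. The short sketch below recomputes them; the helper names are hypothetical, and for the proposed scheme the frame is taken as $36 \cdot 1311$ data bits plus the 324 additional parity bits stated in the text.

```python
import math


def rate_penalty_db(n: int, k: int) -> float:
    """SNR penalty, in dB, caused by the rate loss k/n of an outer code."""
    return 10.0 * math.log10(n / k)


def overhead(n: int, k: int) -> float:
    """Bandwidth overhead (n - k) / k."""
    return (n - k) / k


data_bits = 36 * 1311  # proposed SCC: m datawords of k3 bits
schemes = {
    "SCC-I": (1320, 1221),
    "SCC-II": (47520, 47088),
    "Proposed SCC": (data_bits + 324, data_bits),
}
for name, (n, k) in schemes.items():
    print(f"{name}: {rate_penalty_db(n, k):.4f} dB, {100 * overhead(n, k):.4f}%")
```

The three printed pairs reproduce the figures quoted in the bullet list, up to rounding in the last digit.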
The complexity of the three SCC schemes was estimated as described
in Section III.B. A base parallelism factor of $p = 160$ bits was
used as a reference to achieve a throughput of $T \approx 100$ Gb/s in 28 nm
CMOS technology operating at a clock frequency of 625 MHz. Table
3 summarizes the relative complexity of the three schemes and their
performances. Note that:
• The performance of SCC-II is better than that of SCC-I at the
expense of a higher implementation complexity (because of the longer
outer BCH code).
• The new SCC achieves an NCG $\sim 0.3$ dB higher than that provided
by SCC-I with a similar complexity.
• The performances of SCC-II and the new SCC algorithm are similar.
However, the implementation complexity of our technique is
approximately one order of magnitude lower than that of
SCC-II.
III.D Example 2: Concatenated LDPC+RS+RS Code
As a second example, we consider the LDPC+RS concatenation
scheme proposed in [19], which has been designed for next-generation
optical communication systems. This scheme comprises an inner
LDPC [9252,7967] code and an outer RS [992,956,37] code.
The total overhead is 20.5% and the NCG at BER $= 10^{-15}$ is 10 dB.
The outer RS code introduces an SNR penalty of 0.1605 dB. Next,
we use the new SCC approach with the same inner LDPC code $\mathcal{C}_1$
defined before (i.e., LDPC [9252,7967]). For the outer codes, we
consider the RS [830,794,37] code as $\mathcal{C}_3$ (a shortened version of the
original RS code), and the RS [1014,936,79] code as $\mathcal{C}_2$, where $m = 26$
and $\tau = 2$. Note that the throughput of $\mathcal{C}_2$ is $k_3/r_3 \approx 22.14$ times
lower than that of the original RS code (i.e., RS [992,956,37]); therefore, the
extra complexity required by the new SCC is negligible. From [19], the
error pattern probability is $P_w(\tilde{\Omega}) \approx 5 \cdot 10^{-7}$. Then, the residual error
floor is $\bar{P}_b(\tilde{\Omega}) \approx 10^{-19}$. The SNR penalty and bandwidth overhead
are 0.0164 dB and 0.378%, respectively. Therefore, the NCG is
increased from 10 dB to 10.1441 dB and the total overhead is reduced
from 20.5% to 16.568%. This not only reduces the SNR requirement
of the system but also increases the spectral efficiency and reduces the
power dissipation (since the sampling rate can be reduced because of
the lower overhead).
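The figures quoted in this example can be retraced step by step. The following worked computation is a reader's check rather than part of the original derivation; it assumes that the NCG changes only through the rate penalty of the outer stage and that the total overhead is the product of the inner and outer rate expansions.

```latex
\begin{align*}
\Theta_{\mathrm{outer}} &= \frac{n_2 - k_2}{m\,k_3}
   = \frac{1014 - 936}{26 \cdot 794} = \frac{78}{20644} \approx 0.378\% \\
\text{SNR penalty} &= 10\log_{10}(1 + \Theta_{\mathrm{outer}}) \approx 0.0164~\text{dB} \\
\text{Original RS penalty} &= 10\log_{10}\!\left(\tfrac{992}{956}\right) \approx 0.1605~\text{dB} \\
\mathrm{NCG} &\approx 10 + 0.1605 - 0.0164 = 10.1441~\text{dB} \\
\Theta_{\mathrm{total}} &= \tfrac{9252}{7967}\,(1 + \Theta_{\mathrm{outer}}) - 1
   \approx 1.16129 \cdot 1.00378 - 1 \approx 16.568\%
\end{align*}
```

The same product with the original outer code, $(9252/7967) \cdot (992/956) - 1 \approx 20.5\%$, recovers the total overhead of the reference scheme.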
III.E Example 3: Concatenated LDPC+SPC+BCH Code
Most LDPC codes proposed for optical applications have a low error
floor (BER $< 10^{-10}$) that can be reduced below $10^{-15}$ by correcting
only one error pattern per frame (i.e., $\tau = 1$). For these cases, it is
possible to lower the overhead penalty and implementation complexity
by using $(n_3 - k_3)$ single parity check (SPC) codes as the outer code
$\mathcal{C}_2$. The new encoding process, as depicted in Fig. 7, comprises the
following steps:
1)  The uncoded frame is divided into $m$ blocks. The first $m - 1$ datawords
$D_3^i$ for $i = 1, \ldots, m - 1$ correspond to the first $m - 1$ blocks.
The last dataword $D_3^m$ is the concatenation of $(n_3 - k_3)$ zeros and
the last block $\tilde{D}_3^m$ of $k_3 - (n_3 - k_3)$ bits.
2)  Each dataword $D_3^i$ is encoded by $\mathcal{C}_3$, generating the parity bits $P_3^i$.
3)  For $i = 1, \ldots, n_3 - k_3$, the $m$-bit dataword $D_2^i$ is the concatenation of
the $i$-th parity bit of $P_3^j$ for $j = 1, \ldots, m$. Each $D_2^i$ is encoded by an
SPC code.
4)  The dataword for $\mathcal{C}_1$ is $D_1^i = D_3^i$ for $i = 1, \ldots, m - 1$. The last
dataword $D_1^m$ is the concatenation of $\tilde{D}_3^m$ and the $n_3 - k_3$ parity bits
generated in Step 3. Finally, each dataword $D_1^i$ is encoded by $\mathcal{C}_1$.

Figure 7: Encoding process of the proposed SCC optimized for $\tau = 1$.
Table 3
Performance and complexity comparison of example 1
$^{3}$In the notation "(e, d) TS", e is the number of wrong bits and d is the
number of unsatisfied check nodes (see [8] and [18] for more details).
The decoding process is similar to that of the original scheme (e.g., see
Fig. 5). The main benefits of this optimized scheme are:
•  The implementation complexity of the encoder and decoder of
the outer code $\mathcal{C}_2$ is very low since they can be implemented
recursively with $n_3 - k_3$ XOR gates and flip-flops, i.e.,
$\Phi^e(\mathcal{C}_2) + \Phi^d(\mathcal{C}_2) = 2(n_3 - k_3)(\Phi(\mathrm{XOR}) + \Phi(\mathrm{FlipFlop})) = 23 \cdot (n_3 - k_3)$
gates.
•  The overhead penalty is reduced from $[1/(m - 1)] \cdot [(n_3 - k_3)/k_3]$
(see eq. (8) with $\tau = 1$) to $(1/m) \cdot [(n_3 - k_3)/k_3]$ because the
corrupted parity bits of $\mathcal{C}_2$ do not have to be erased.
•  The encoder latency is reduced because the $P_2^i$ parity bits are
transmitted in the last codeword of $\mathcal{C}_1$ (i.e., the parity bits can be
computed while the codewords of $\mathcal{C}_1$ are being transmitted). This also
reduces the buffer length from $\approx 3mk_1$ to $\approx 2mk_1$ bits.
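The $\tau = 1$ scheme can be illustrated with a toy model. The sketch below is only an illustration under loud assumptions: a systematic Hamming(7,4) code stands in for $\mathcal{C}_3$ (the paper uses BCH codes), the inner code $\mathcal{C}_1$ is abstracted to a flag telling which block was detected as corrupted, and the SPC parity bits are carried as a separate list instead of being packed into a shortened last dataword. All function names are hypothetical.

```python
def xor_bits(a, b):
    """Bitwise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]


def hamming_parity(d):
    """Parity bits of a systematic Hamming(7,4) code (toy stand-in for C3)."""
    d0, d1, d2, d3 = d
    return [d0 ^ d1 ^ d3, d0 ^ d2 ^ d3, d1 ^ d2 ^ d3]


# Syndrome -> position of the single bit in error within [d0..d3, p0..p2].
SYNDROME_TO_POS = {(1, 1, 0): 0, (1, 0, 1): 1, (0, 1, 1): 2, (1, 1, 1): 3,
                   (1, 0, 0): 4, (0, 1, 0): 5, (0, 0, 1): 6}


def hamming_correct(word):
    """Correct up to one bit error in a 7-bit word and return the 4 data bits."""
    word = list(word)
    syndrome = tuple(xor_bits(hamming_parity(word[:4]), word[4:]))
    if syndrome in SYNDROME_TO_POS:
        word[SYNDROME_TO_POS[syndrome]] ^= 1
    return word[:4]


def scc_encode(blocks):
    """SPC across blocks: i-th SPC bit = XOR of the i-th C3 parity bit of every block."""
    spc = [0, 0, 0]
    for block in blocks:
        spc = xor_bits(spc, hamming_parity(block))
    return spc


def scc_decode(rx_blocks, rx_spc, bad):
    """Regenerate the C3 parity of the flagged block from the SPC bits, then decode it."""
    out = [list(b) for b in rx_blocks]
    if bad is None:
        return out
    parity = list(rx_spc)
    for i, block in enumerate(rx_blocks):
        if i != bad:
            parity = xor_bits(parity, hamming_parity(block))
    out[bad] = hamming_correct(list(rx_blocks[bad]) + parity)
    return out


blocks = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
spc = scc_encode(blocks)
rx = [list(b) for b in blocks]
rx[1][2] ^= 1  # residual error left by the inner decoder in block 1
decoded = scc_decode(rx, spc, bad=1)
print(decoded == blocks)
```

Only the SPC bits are sent as extra redundancy; the parity of the corrupted block is never transmitted and is rebuilt at the receiver from the error-free blocks.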
The performance of the optimized SCC approach for $\tau = 1$ is
evaluated by using computer simulations in Fig. 8. The [2640,1320]
Margulis LDPC code with $m = 10$ is used as the inner code, $\mathcal{C}_1$. Owing
to time constraints, an artificially high error floor is inserted after the
decoding process. As in the real error floor, the artificial BER is generated
from error patterns with 7 information bits and 7 parity bits. Fig. 8 shows
the BER of the inner code $\mathcal{C}_1$ (circles), SCC-II (squares) and the proposed
SCC (triangles). For SCC-II, the outer code $\mathcal{C}_2$ is a BCH [13200,13102,15]
code. On the other hand, $\mathcal{C}_3$ is a BCH [1397,1320,15] code while $\mathcal{C}_2$ is an
SPC [11,10,2] code for the new SCC. From Fig. 8, note that the error floor
of $\mathcal{C}_1$ is reduced by the outer codes to BER $\approx 4.73 \cdot 10^{-8}$. From this
figure we see that the new SCC is able to achieve a performance similar to
that of SCC-II. It is important to realize that our technique achieves this
performance by using short outer block codes. Compared with SCC-II, this
significantly reduces the implementation complexity in integrated
circuits. In particular, using a parallelism factor $p = 160$ to achieve
a throughput $T \approx 100$ Gb/s in 28 nm CMOS technology operating at
a clock frequency of 625 MHz as in Example 1 (see Section III.C), the
complexity of SCC-II is $\approx 575$ Kgates while the complexity of the
proposed scheme is $\approx 107$ Kgates, i.e., 5.38 times lower.
IV Error Floor Reduction in TPC
The SCC strategy introduced previously can be extended to mitigate
the error floor caused by low-weight codewords (i.e., undetectable error
patterns). Note that this feature is particularly useful for decoding
of turbo codes such as turbo product codes (TPC). This approach uses
a subset of $g$ parity check bits of $\mathcal{C}_3$ to detect those inner codewords
with residual errors after the inner code decoder$^{4}$.
IV.A Generalized SCC Scheme
Figure 9 depicts the encoding process of the generalized SCC. Unlike
in the previous strategy, in the generalized SCC a subset of $g$ parity
bits of $P_3^i$, denoted as $\hat{P}_3^i$, is not encoded by $\mathcal{C}_2$. Instead, this subset
is transmitted as a part of the dataword $D_1^i$ of the inner code $\mathcal{C}_1$. In
the decoding process, $\hat{P}_3^i$ is used to detect the corrupted
$\mathcal{C}_1$-codewords. The decoding process starts by
applying the decoder of code $\mathcal{C}_1$ to the $m$ received codewords $C_1^i$.
After that, the dataword $D_1^i$ is extracted from $C_1^i$. For each dataword
$D_1^i$, the dataword $D_3^i$ is extracted and partially encoded with $\mathcal{C}_3$ in
order to regenerate only the parity bits $\hat{P}_3^i$. If these regenerated parity
bits are not equal to the corresponding bits in $D_1^i$, this dataword is
marked as corrupted. Once all corrupted datawords $D_1^i$ are identified,
the rest of the decoding process continues as in the original proposed
SCC.
The generalized SCC scheme may have a residual error floor caused
by the occurrence of more than $\tau$ error patterns of $\tilde{\Omega}$ in the same frame.
This residual error floor can be estimated from
(5). Additionally, a residual error floor caused by an error pattern that
cannot be detected by the $g$ parity bits $\hat{P}_3^i$ is also possible. The
probability $P_u$ that an undetectable error pattern $\omega \in \tilde{\Omega}$ takes place in
the inner codeword $C_1^i$ can be computed as

$$P_u = \sum_{\omega \in \tilde{\Omega}} p(\omega)\, I_v\big(\hat{P}_3(\omega) = 0\big), \tag{10}$$

where $\hat{P}_3(\omega)$ are the first $g$ parity bits of $\mathcal{C}_3$ associated with the
error pattern $\omega$ and $I_v(X)$ is the Iverson operator, which is equal
to 1 if the statement $X$ is true and 0 otherwise. Because the error
patterns of different inner codewords are independent, the probability
of at least one undetectable error pattern in the frame can be computed
based on the binomial distribution as

$$P_{u,m} = 1 - (1 - P_u)^{m}, \tag{11}$$

while the BER can be estimated as

 (12)

Figure 9: Encoding process of the generalized SCC.

$^{4}$It is also possible to use an additional error-detecting code, such as a cyclic
redundancy check (CRC) code, as part of the proposed SCC scheme to detect
those inner codewords with residual errors. Furthermore, in a different approach,
it is also possible to replace the erasure decoder of code $\mathcal{C}_2$ by an
error-correcting decoder at the expense of increasing the minimum distance of $\mathcal{C}_2$.
IV.B Generalized SCC with Turbo Product Codes (TPC)
As mentioned before, powerful FEC codes must be designed to satisfy
the needs of future multigigabit transmission systems. For instance, net
coding gains $> 10$ dB at a BER of $10^{-15}$ and an overhead of $\sim 20\%$
are mandatory for next-generation OTN [2]. In order to meet these
requirements, numerous LDPC and TP codes have been reported in
the literature (e.g., see [2] and references therein). In particular, TPC
based on $\geq 2$-error-correcting BCH codes (or TPC-BCH) with block
sizes $\geq 32$ Kbits have been used in high-speed systems to provide
an acceptable tradeoff between performance and complexity [9]–[10],
[20]. The feasibility of TPC-BCH for commercial applications at 100
Gb/s with an NCG of $\sim 11.4$ dB at BER $= 10^{-15}$ and a total overhead of
$\sim 20\%$ has been demonstrated in [20].
In the following, we consider the use of the proposed generalized
SCC technique to improve the behavior of TPC in high-speed
applications. We demonstrate that a TPC based on simple extended
Hamming codes (TPC-EH) with a block size of 8192 bits and a minimum
distance of 16 can be combined with the new SCC strategy in
order to achieve an NCG of $\sim 11.2$ dB at BER $= 10^{-15}$ with $\sim 22\%$
total overhead and an error floor at $\sim 7 \cdot 10^{-17}$. Notice that this
performance is: (i) 0.45 dB better than the one achieved by the
BCH(144,128,5)×BCH(256,239,6) TPC with a block size of 36864 bits
and minimum distance 30 proposed in [9]; (ii) 0.4 dB better than
the one accomplished by the BCH(128,113,6)×BCH(256,239,6) TPC
with a block size of 32768 bits and minimum distance 36 proposed in
[10]; and (iii) 0.4 dB better than that achieved by the triple-concatenated
codes proposed in [2]. Furthermore, the implementation complexity
of the TPC-based proposed SCC technique is expected to be lower
than that of non-concatenated TPC-BCH schemes. This is mainly because
the component codes in the TPC-EH with the proposed SCC are
much simpler than those required in the non-concatenated TPC-BCH
codes. In particular, the latter require longer BCH codes with an error
correction capability higher than that of the EH code in order to reduce
the error floor. These features make the TPC-EH with the proposed
SCC, introduced here, a suitable option for next-generation optical
fiber communication networks [2].
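IV.C Concatenated TPC-EH + BCH + RS
Let the inner code $C_1$ be a TPC based on two extended Hamming
codes with parameters $[128,120,4]$ and $[64,57,4]$. Let $A_w$ be the
number of codewords with weight $w$ in $C_1$. For $w < 28$, $A_w$ can be
computed as described in [21], obtaining
$$A_w = \begin{cases}
1, & \text{if } w = 0 \\
888943104, & \text{if } w = 16 \\
7154214100992, & \text{if } w = 24
\end{cases} \qquad (13)$$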
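As a sanity check on the inner code parameters, the following minimal sketch recomputes the block size, dimension, minimum distance and minimum-weight multiplicity of the product code. It relies on two standard facts rather than on the method of [21]: the weight-4 codewords of an extended Hamming code of length n form a Steiner system S(3,4,n), so there are n(n-1)(n-2)/24 of them, and the minimum-weight codewords of a product code are the products of minimum-weight component codewords. The function name is ours.

```python
def a4_extended_hamming(n: int) -> int:
    """Number of weight-4 codewords in an extended Hamming code of length n = 2^r.
    These codewords form a Steiner system S(3, 4, n): C(n, 3) / C(4, 3) blocks."""
    return n * (n - 1) * (n - 2) // 24

# Component codes [n, k, d] of the TPC-EH inner code C_1
n1, k1, d1 = 128, 120, 4
n2, k2, d2 = 64, 57, 4

n = n1 * n2      # block size of the product code
k = k1 * k2      # information bits of the (unshortened) product code
d_min = d1 * d2  # minimum distance of the product code

# Minimum-weight multiplicity: product of the component multiplicities
a16 = a4_extended_hamming(n1) * a4_extended_hamming(n2)

print(n, k, d_min, a16)
```

The values obtained (8192 bits, minimum distance 16, and 888943104 codewords of weight 16) agree with the figures quoted in the text.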
Figure 11 shows the BER vs. SNR of this code when it is decoded with
8 turbo iterations between the two component codes. The optimal
maximum a-posteriori probability (MAP) decoder proposed in [22] is used
to decode both EH codes. As observed in Fig. 11, $C_1$ has an error floor
at a BER $\leq 10^{-7}$, which is caused by the $A_{16} = 888943104$ codewords
of minimum weight $w_{min} = 16$ [23]. Fig. 11 also reports the BER
estimations based on the union bound [23] for codewords of weight 16 and 24.
Figure 10: Description of the decoding process of the generalized SCC.
In order to avoid an error floor at BER = $10^{-15}$, we infer from Fig. 11
that codewords with weight 24 must be corrected. Furthermore,
since a suboptimal iterative decoding algorithm is used, non-codeword
stopping sets also have to be analyzed. The latter can be computed
as described in [24], from which the minimum non-codeword stopping
sets are found to have weight 24 and multiplicity 81782765568 (i.e., ≈87
times lower than $A_{24} = 7154214100992$). A frame composed of $m = 19$
shortened TPC (7859,6507) codewords is required to accommodate the
122368 bits of the Optical Channel Data Unit (ODU) of the G.709 OTN
frame [25] and the parity bits of the outer codes. To reduce the error
floor below $10^{-15}$, a correction capability of $\tau = 2$ error patterns of
weight $\leq 24$ and a corrupted-codeword detection based on $g = 32$ bits
are used. The shortened BCH [6753,6441,49] code over GF($2^{13}$), with a
correction capability of 24 bits, is used as code $C_3$. Code $C_2$ is the
shortened RS [596,532,65] code over GF($2^{10}$). Therefore, the total
additional overhead due to $C_2$ and $C_3$ is ≈1.02%, which introduces an
SNR penalty of 0.044 dB. The residual error rate caused by the occurrence
of more than $\tau$ error patterns is , while the residual error rate due to
undetectable error patterns is . These values have been derived from (5)
and (12), respectively, with $P_w(\Omega) \approx 5 \cdot 10^{-6}$, which has
been obtained from computer simulation with SNR = 6.7 dB.
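The overhead and residual error figures of this design can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: (a) the outer redundancy is taken to be the RS parity symbols plus g = 32 bits of $C_3$ per inner codeword, an accounting that is our reading of the scheme; (b) the residual BER caused by more than τ error patterns is taken to have a binomial-tail form in which each affected codeword contributes at most w_max = 24 bit errors among the m·k information bits, with k = 6507. Equation (5) is not reproduced in this section, so this form is an assumption.

```python
from math import comb, log10

m = 19              # inner codewords per frame
g = 32              # detection bits of C_3 kept per inner codeword
odu_bits = 122368   # ODU bits carried by one frame
rs_parity_bits = (596 - 532) * 10  # RS parity symbols over GF(2^10), in bits

# Assumption (a): outer redundancy = RS parity + m * g detection bits
outer_parity_bits = rs_parity_bits + m * g
outer_overhead = outer_parity_bits / odu_bits   # about 1.02 %
snr_penalty_db = 10 * log10(1 + outer_overhead)  # about 0.044 dB

def residual_ber(p: float, m: int, k: int, w_max: int, i_min: int) -> float:
    """Assumption (b): BER from i >= i_min affected codewords out of m,
    each contributing at most w_max bit errors among m * k information bits."""
    return sum(
        comb(m, i) * (i * w_max) / (m * k) * p**i * (1 - p) ** (m - i)
        for i in range(i_min, m + 1)
    )

p_w = 5e-6  # per-codeword probability quoted in the text (simulation, SNR = 6.7 dB)
tau = 2     # number of correctable error patterns per frame
floor = residual_ber(p_w, m, k=6507, w_max=24, i_min=tau + 1)

print(f"outer overhead = {100 * outer_overhead:.2f} %")
print(f"SNR penalty    = {snr_penalty_db:.3f} dB")
print(f"residual BER   = {floor:.2e}")
```

Under these assumptions the overhead and penalty reproduce the quoted ≈1.02% and 0.044 dB, and the residual BER is of the same order as the error floor of about $7 \cdot 10^{-17}$ reported in Section IV.B.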
The error floor problem of the TPC-EH can also be solved with
schemes SCC-I and SCC-II. However, as shown below, the
proposed SCC scheme provides a better performance vs. complexity
tradeoff:
• SCC-I requires a frame of $m = 19$ shortened TPC (8105,6753) codewords
combined with $m$ BCH [6753,6441,49] codes. The total overhead is 25.8%,
the NCG is 11.05 dB and the complexity is 2.02 times higher than
that of the proposed scheme. Therefore, the proposed SCC scheme
has better spectral efficiency, 0.15 dB higher NCG and lower
complexity than SCC-I.
• SCC-II requires a frame of $m = 19$ shortened TPC (7836,6484) codewords
combined with one BCH [123196,122380,97] code. Similarly to the
proposed scheme, the total overhead is 21.7% and the NCG is 11.2
dB. However, the complexity is 6.73 times higher, representing a
significant complexity advantage in favor of the proposed scheme.
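The total overheads compared above can be reproduced from the frame sizes alone. The following minimal sketch assumes that each frame carries exactly the 122368 ODU bits as payload and that the total overhead is the ratio of transmitted bits to payload bits minus one; the dictionary keys are labels of ours.

```python
ODU_BITS = 122368  # payload bits per frame (G.709 ODU)
M = 19             # inner codewords per frame in all three schemes

# Length n of the shortened TPC codeword used by each scheme
tpc_length = {"proposed": 7859, "SCC-I": 8105, "SCC-II": 7836}

# Total overhead = transmitted bits / payload bits - 1
total_overhead = {
    name: M * n / ODU_BITS - 1 for name, n in tpc_length.items()
}

for name, oh in total_overhead.items():
    print(f"{name:9s} total overhead = {100 * oh:.1f} %")
```

With these numbers SCC-I comes out at 25.8% and SCC-II at 21.7%, as stated in the list, while the proposed scheme comes out at about 22.0%.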
The above illustrates the advantages of the proposed scheme for
implementing forward error correction codes for high-speed applications.
In particular, the TPC-EH + RS + BCH scheme proposed here
represents a low-complexity alternative to the non-concatenated
TPC based on $\geq 2$-error-correcting BCH codes [2], [9]–[10], since such
BCH codes have a higher implementation complexity than EH codes.
V Conclusions
We have introduced a novel SCC scheme to combat the error floor
problem experienced in iterated sparse graph-based error correcting
codes. This SCC scheme is based on the use of two short outer codes
combined with a novel encoding/decoding strategy. We have shown
that the new approach significantly reduces the complexity with
negligible penalty. The proposed SCC can be efficiently used with both
LDPC and TP codes. In particular, the new SCC approach can be used
to improve the performance of high-speed optical communication
systems, where high coding gain and very low BER are required. The
SCC technique introduced in this work provides a new general
framework for solving the error floor problem induced by low-weight error
patterns of any coding scheme.
References 
[1]  D.A. Morero, et al., "Non-Concatenated FEC Codes for Ultra-High Speed Optical
Transport Networks," IEEE Global Telecomm. Conf., pp. 1–5, Dec. 2011.
[2]  K. Onohara, et al., "Soft-Decision-Based Forward Error Correction for 100 Gb/s
Transport Systems," IEEE J. Sel. Topics Quantum Electron., vol. 16, no. 5,
pp. 1258–1267, Sept.–Oct. 2010.
[3]  N. Kamiya and S. Shioiri, "Concatenated QC-LDPC and SPC codes for 100 Gbps
ultra long-haul optical transmission systems," Optical Fiber Comm. (OFC),
collocated National Fiber Optic Eng. Conf. (OFC/NFOEC), pp. 1–3, March 2010.
[4]  Z. Zhengya, et al., "Lowering LDPC Error Floors by Postprocessing," IEEE Global
Telecomm. Conf., pp. 1–6, Nov.–Dec. 2008.
[5]  N. Varnica, M. Fossorier, and A. Kavcic, "Augmented Belief-Propagation Decoding
of Low-Density Parity-Check Codes," IEEE Trans. Comm., vol. 54, no. 10, pp. 1896,
Oct. 2006.
[6]  S. Benedetto, et al., "Serial concatenation of interleaved codes: performance
analysis, design, and iterative decoding," IEEE Trans. Inf. Theory, vol. 44, no. 3,
pp. 909–926, May 1998.
[7]  D.A. Morero and M.R. Hueda, "Efficient concatenated coding schemes for error
floor reduction of LDPC and turbo product codes," IEEE Global Telecomm. Conf.,
pp. 1–5, Dec. 2012.
[8]  D.J. MacKay and M.S. Postol, "Weaknesses of Margulis and Ramanujan-Margulis
low-density parity-check codes," Elect. Notes in Theoretical Computer Science,
2003.
[9]  T. Mizuochi, et al., "Forward error correction based on block turbo code with 3-bit
soft decision for 10-Gb/s optical communication systems," IEEE J. Sel. Topics
Quantum Electron., vol. 10, no. 2, pp. 376–386, March–April 2004.
[10]  M. Akita, et al., "Third generation FEC employing turbo product code for long-haul
DWDM transmission systems," Optical Fiber Comm. (OFC), pp. 289–290, Mar. 2002.
[11]  J. G. Proakis, "Digital Communications," McGraw-Hill Higher Education, Third
Edition, 1996.
[12]  W. Ryan and S. Lin, "Channel codes: Classical and modern," Cambridge University
Press, 2009.
[13]  G.D. Forney, "Concatenated codes," Cambridge, MA: MIT Press, 1966.
[14]  W.C. Huffman and V. Pless, "Fundamentals of error-correcting codes," Cambridge
University Press, 2003.
[15]  Taiwan Semiconductor Manufacturing Company Ltd, "N28HP standard cell
library," Datasheet TCBN28HPBWP35, Nov. 2010.
[16]  S. Lin and D. Costello, "Error control coding, fundamental and applications,"
Pearson Prentice Hall, Second Edition, 2004.
[17]  Hsie-Chia Chang, et al., "A Universal VLSI Architecture for Reed-Solomon
Error-and-Erasure Decoders," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 9,
pp. 1960–1967, Sept. 2009.
[18]  H. Yang and W.E. Ryan, "LDPC decoder strategies for achieving low error floors,"
Inf. Theory and Applications Workshop, pp. 277–286, Jan.–Feb. 2008.
[19]  Y. Miyata, et al., "Efficient FEC for Optical Communications using Concatenated
Codes to Combat Error-floor," Optical Fiber Comm. (OFC), collocated National
Fiber Optic Eng. Conf. (OFC/NFOEC), pp. 1–3, Feb. 2008.
[20]  S. Dave, et al., "Soft-decision forward error correction in a 40-nm ASIC for
100-Gbps OTN applications," Optical Fiber Comm. (OFC), collocated National Fiber
Optic Eng. Conf. (OFC/NFOEC), pp. 1–3, Mar. 2011.
[21]  L.M.G.M. Tolhuizen, "More results on the weight enumerator of product codes,"
IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2573–2577, Sep. 2002.
[22]  A. Ashikhmin and S. Litsyn, "Simple MAP decoding of first-order Reed-Muller and
Hamming codes," IEEE Trans. Inf. Theory, vol. 50, no. 8, pp. 1812–1818, Aug. 2004.
[23]  F. Chiaraluce and R. Garello, "Extended Hamming product codes analytical
performance evaluation for low error rate applications," IEEE Trans. Wireless Comm.,
vol. 3, no. 6, pp. 2353–2361, Nov. 2004.
[24]  E. Rosnes, "Stopping Set Analysis of Iterative Row-Column Decoding of Product
Codes," IEEE Trans. Inf. Theory, vol. 54, no. 4, pp. 1551–1560, Apr. 2008.
[25]  Int. Telecomm. Union, "Interfaces for the optical transport network," ITU-T G.709,
Feb. 2010.
Damian A. Morero received with honors the degree in electronic
engineering from the National University of Cordoba
(UNC), Cordoba, Argentina, where he is currently working
toward the Ph.D. degree in Engineering Science. In 2003
and 2005, he received the Academic Excellence Award from
the Engineers Association of Cordoba, Argentina, and the
UNC, respectively. From 2006 to 2009, he received a Ph.D.
fellowship from the Secretary of Science and Technology
(SeCyT), Argentina. He is currently with ClariPhy Argentina
S.A., where he has been engaged in the research and development
of error correction coding schemes for high-speed optical
communications. His research interests include coding,
information theory and signal processing.
Mario R. Hueda received the degree in electrical and electronic
engineering and the Ph.D. degree from the National
University of Cordoba, Cordoba, Argentina, in 1994 and
2002, respectively. From March 1994 to 1996, he received
a fellowship from the Scientific and Technological Research
Council of Cordoba to carry out research and development in
the area of voiceband-data transmission. During the summer
of 1996, he was a Visiting Scholar with Lucent Technologies-
Bell Laboratories, Murray Hill, NJ, where he worked on
code-division multiple-access receivers. Since 1997, he has
been with the Digital Communications Research Laboratory,
Department of Electronic Engineering, National University
of Córdoba. He is currently with the National Scientific and
Technological Research Council (CONICET), Cordoba. His research interests include
digital communications and performance analysis of communication systems.
