We propose a product code with shortened BCH component codes for 100G optical communication systems. Simulation results suggest that a net coding gain of 10 dB is achievable at a post-FEC BER of $10^{-15}$.
Code Construction and Decoding Algorithm
We choose a product code with (391,357) shortened BCH component codes as the candidate code. This code has an overhead of 20%, as recommended by OIF. The code structure is shown in Fig. 1. We first construct a product code with BCH component codes over GF($2^{11}$). In each BCH component codeword, the first 980 bits are fixed to 0, the middle 357 bits are information bits, and the last 34 bits are parity-check bits that enable the BCH code to correct up to 3 errors. The minimum distance of the BCH code is 8, so a word with 4 errors is detected as uncorrectable rather than decoded. The product code over GF($2^{11}$) is then shortened by removing all the fixed 0s, which yields a product code with shortened BCH component codes.
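The stated overhead follows directly from the component code parameters. The short sketch below recomputes it from the (391,357) figures given above; the variable names are illustrative only.

```python
# Component code parameters taken from the text: (n, k) = (391, 357).
n, k = 391, 357

parity_bits = n - k                    # parity-check bits per component codeword
rate = (k * k) / (n * n)               # rate of the product code (rows x columns)
overhead = (n * n) / (k * k) - 1       # redundancy relative to the payload

print(f"parity bits per component codeword: {parity_bits}")
print(f"product code rate: {rate:.4f}")
print(f"overhead: {overhead:.2%}")
```

The overhead comes out just under 20%, consistent with the figure quoted above.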
We use iterative decoding for the product code. In the component code decoding stage, each shortened component code aims to correct up to 3 errors. The component code decoding algorithm consists of three steps: restoring the BCH codeword, decoding the BCH codeword, and checking the shortened bits. A row or a column becomes a BCH codeword over GF($2^{11}$) when it is extended with 980 leading 0s. If there are no more than 3 errors, the decoder returns the correct codeword. However, a decoding error may occur when there are more than 3 errors in a codeword, and such errors heavily degrade the performance of the product code after multiple iterations. We therefore propose a third step in the decoding algorithm to reduce the decoding error probability: if any of the shortened bits has value 1 in the returned BCH codeword, a decoding error is reported.
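The three steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `bch_decode` is a hypothetical callable standing in for any full-length BCH decoder, assumed to return the corrected word or `None` when it gives up, and `num_pad` is the number of leading zeros removed by shortening.

```python
from typing import Callable, List, Optional, Sequence

Bits = List[int]

def decode_shortened(
    received: Sequence[int],
    num_pad: int,
    bch_decode: Callable[[Bits], Optional[Bits]],  # hypothetical full-length decoder
) -> Optional[Bits]:
    """Decode one row or column; return None when a decoding failure is reported."""
    # Step 1: restore the full-length word by prepending the shortened zeros.
    full = [0] * num_pad + list(received)

    # Step 2: run the ordinary BCH decoder on the restored word.
    decoded = bch_decode(full)
    if decoded is None:
        return None  # the decoder itself detected an uncorrectable pattern

    # Step 3: the shortened positions must still be zero; a 1 there means the
    # decoder has moved to a wrong codeword, so a decoding error is reported.
    if any(decoded[:num_pad]):
        return None

    return decoded[num_pad:]
```

In an iterative decoder, a row or column for which `None` is returned would simply be left unchanged in that half-iteration.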
Let t denote the maximum number of errors a code can correct. The decoding error probability of a BCH code is approximately $1/t!$ when there are more than t errors. In the proposed code, a BCH component code can correct up to 3 errors, so t = 3. The errors introduced by a decoding error can be regarded as randomly distributed over the codeword, although the error positions are in fact related to the decoding algorithm. A decoding error escapes the shortened-bit check only if all newly introduced errors fall within the 391 retained positions. As decoding errors with 3 new errors dominate, the theoretical decoding error probability of a BCH component code is $\frac{1}{t!}\left(\frac{391}{2047}\right)^{3} \approx 0.12\%$. Simulation results match the theoretical prediction.
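The estimate can be reproduced numerically. The sketch below only evaluates the expression given in the text, with t = 3, a shortened length of 391, and a full length of 2047; the variable names are illustrative.

```python
from math import factorial

t = 3             # error-correcting capability of the component code
n_short = 391     # length of the shortened component code
n_full = 2047     # length of the BCH code over GF(2^11)

# Probability that the decoder miscorrects, times the probability that all
# t newly introduced errors land inside the retained (non-shortened) positions.
p_undetected = (1 / factorial(t)) * (n_short / n_full) ** t

print(f"undetected decoding error probability: {p_undetected:.4%}")
```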
Performance Results
As the decoding error probability of the component codes is very small, the performance of the product code degrades little compared to the ideal case of no decoding errors. Fig. 2 shows the simulation result of the proposed code on an additive white Gaussian noise (AWGN) channel. One iteration consists of decoding the rows once and the columns once. After 8 iterations, the curve drops sharply between input BERs of $1.2\cdot10^{-2}$ and $1.1\cdot10^{-2}$. This translates to slightly more than 10 dB net coding gain (NCG), as shown in Fig. 2, and no further action is required. Because the number of simulated frames is limited, we consider 10 iterations in the hardware implementation to ensure enough margin to reach a post-FEC BER of $10^{-15}$ at a pre-FEC BER of $1.1\cdot10^{-2}$.
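The relation between the BER threshold and the NCG can be checked with a short calculation. The sketch below assumes the common convention for binary modulation on an AWGN channel, in which the gross gain is the ratio of the Q factors at the post-FEC and pre-FEC BERs and the rate penalty is $10\log_{10}R$; the text does not state which definition was used, so this convention and the function names are assumptions.

```python
from math import log10
from statistics import NormalDist

def q_factor(ber: float) -> float:
    """Q value whose Gaussian tail probability equals `ber`."""
    return -NormalDist().inv_cdf(ber)

def net_coding_gain_db(pre_fec_ber: float, post_fec_ber: float, rate: float) -> float:
    """NCG in dB: gross coding gain plus the (negative) code rate penalty."""
    gross_db = 20 * log10(q_factor(post_fec_ber) / q_factor(pre_fec_ber))
    return gross_db + 10 * log10(rate)

RATE = (357 / 391) ** 2  # rate of the (391,357) x (391,357) product code

ncg = net_coding_gain_db(pre_fec_ber=1.1e-2, post_fec_ber=1e-15, rate=RATE)
print(f"NCG at pre-FEC BER 1.1e-2: {ncg:.2f} dB")
```

Under this convention, a pre-FEC threshold of $1.1\cdot10^{-2}$ gives a value close to 10 dB.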

Hardware Implementation
To ease future hardware implementation, we employ the decoding algorithm in [9] as the component code decoding algorithm. The algorithm takes advantage of the property that a cyclic shift of a BCH codeword is also a codeword. For a codeword of length n, it takes n rounds to obtain the decoding result. In each round, a precomputed lookup table is used to check whether flipping the highest bit would leave 2 or fewer errors. If so, the highest bit is flipped; otherwise the bit keeps its value. The codeword is then shifted by one bit for the next round. After n rounds, the bits in the codeword are back at their original positions. Fig. 3 shows the flow of this decoding method. It is also pointed out in [9] that the ROM, which dominates the circuit area, has a size of $(4m+1)\cdot 2^{m} = (4\cdot 11+1)\cdot 2048 = 90\text{K}$. We use this algorithm to find the first error bit. The remaining error positions are deduced by checking the syndrome values.
We now look at the implementation complexity. In the row decoding of the first iteration, one lookup table is shared by 4 BCH decoders. In total we need 90K·391/4 ≈ 8.7M of ROM. In this step, we check the first 190 bits of a codeword. As 4 decoders share one lookup table, we need 190·4 = 760 clock cycles. The shift registers require 190·391 ≈ 76K flip-flops. The result is forwarded to an interleaver of 391·391 ≈ 0.15M bits. In total, this step requires ~8.7M of ROM and ~0.225M flip-flops. Column decoding is similar to row decoding, so the first iteration costs ~17.5M of ROM and ~0.45M flip-flops.
The same process is repeated in the second iteration, except that we check the last 190 bits when decoding each component codeword. The third iteration repeats the process of the first iteration, and so forth. As most errors are corrected in the first 5 iterations, the major circuit required is ~17.5M·5 = 87.5M of ROM and ~0.45M·5 = 2.25M flip-flops. The last 5 iterations only correct errors in a few codewords, and the added redundancy is limited.
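The cyclic-shift property that the decoding algorithm of [9] relies on can be checked on a small example. The sketch below uses the (7,4) cyclic Hamming code with generator $x^3+x+1$ as a stand-in; this small code is an assumption chosen for brevity and is not the component code discussed here. Polynomials over GF(2) are stored as integers, with bit i holding the coefficient of $x^i$, and all function names are illustrative.

```python
def gf2_mod(a: int, g: int) -> int:
    """Remainder of polynomial a modulo g over GF(2)."""
    deg_g = g.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)  # cancel the leading term of a
    return a

def cyclic_shift(word: int, n: int) -> int:
    """Rotate an n-bit word left by one position (multiply by x modulo x^n + 1)."""
    return ((word << 1) | (word >> (n - 1))) & ((1 << n) - 1)

N = 7
G = 0b1011  # x^3 + x + 1, generator of the (7,4) cyclic Hamming code

def encode(msg: int) -> int:
    """Systematic encoding: append the remainder so that G divides the codeword."""
    shifted = msg << (G.bit_length() - 1)
    return shifted ^ gf2_mod(shifted, G)

word = encode(0b1011)
for _ in range(N):
    word = cyclic_shift(word, N)
    # Every cyclic shift is still divisible by G, i.e. still a codeword.
    assert gf2_mod(word, G) == 0

# After N shifts the bits are back at their original positions.
assert word == encode(0b1011)
```

Because every shift is again a codeword, a decoder only needs the logic to decide about one fixed bit position and can bring each bit to that position in turn.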
It is feasible to carry out this implementation on several state-of-the-art FPGAs.
Each step takes around 760 clock cycles, as mentioned above. This equals 1520 ns if the FPGA implementation runs at a system clock frequency of 500 MHz. Receiving a frame on a 100G interface takes 391·391·10 ps ≈ 1529 ns. The implementation is therefore fast enough for 100G systems.
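The timing argument can be verified with integer arithmetic in picoseconds. The sketch below uses only the figures quoted in the text (190 bits checked per codeword, 4 decoders per lookup table, a 500 MHz clock, and 10 ps per bit on a 100G interface); the constant names are illustrative.

```python
BITS_CHECKED = 190          # bits examined per codeword in one decoding step
DECODERS_PER_TABLE = 4      # decoders sharing one lookup table
CLOCK_PERIOD_PS = 2000      # 500 MHz system clock -> 2 ns per cycle
BIT_TIME_PS = 10            # 100 Gb/s -> 10 ps per bit
FRAME_BITS = 391 * 391      # one product code frame

cycles_per_step = BITS_CHECKED * DECODERS_PER_TABLE
step_time_ps = cycles_per_step * CLOCK_PERIOD_PS
frame_time_ps = FRAME_BITS * BIT_TIME_PS

print(f"cycles per step : {cycles_per_step}")
print(f"step time       : {step_time_ps / 1000} ns")
print(f"frame time      : {frame_time_ps / 1000} ns")
print(f"keeps up        : {step_time_ps <= frame_time_ps}")
```

The margin is small (roughly 9 ns per frame), so the conclusion depends on the 500 MHz clock actually being achieved.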
Conclusion
We proposed a product code with shortened BCH component codes for high-speed optical communication systems. The code potentially provides more than 10 dB NCG at a post-FEC BER of $10^{-15}$. The hardware implementation is also investigated, and FPGA verification is feasible.
