A Compact Low-Latency Systematic Successive Cancellation Polar Decoder
  for Visible Light Communication Systems by Nguyen, Duc-Phuc et al.
Duc-Phuc Nguyen‡†, Dinh-Dung Le†, Thi-Hong Tran†, Takashi Nakada†, Yasuhiko Nakashima†
‡ETIS, UMR-8051, Université Paris Seine, Université de Cergy-Pontoise, France
†Graduate School of Information Science, Nara Institute of Science and Technology, Japan
Email: (nguyen.duc-phuc@ensea.fr),(le.dung.ku9, hong, nakada, nakashim@is.naist.jp)
Abstract—Channel polarization and Polar code are widely
considered as major breakthroughs in coding theory because
they have shown promising features for future wireless standards.
The main drawbacks of Polar code are the high latency of decoding
hardware and unimpressive error-correction performance when the
code length is limited. These two disadvantages limit the use of
Polar code in low-throughput wireless communication systems. In
this paper, we propose a low-complexity, low-latency hardware
architecture for a compact soft-decision (16,11) Systematic
Successive Cancellation Polar Decoder (S-SCD). Experimental
results show that the proposed S-SCD reduces decoding latency by
factors of 3.75 and 2.75 compared with the conventional and 2b-SC
architectures, respectively. It also shows better BER/FER
performance than the RS(15,11) code, which is widely applied in
current VLC-based systems.
Index Terms—Systematic Successive Cancellation Decoder (S-
SCD), Polar Code, Visible Light Communication (VLC)
I. INTRODUCTION
Visible light communication (VLC) refers to short-range optical
wireless communication using the visible light spectrum. VLC
transmits data by intensity-modulating optical sources, such as
light-emitting diodes (LEDs) and laser diodes, faster than the
persistence of the human eye [1]–[3]. LEDs are also increasingly
being adopted in the general illumination market in both the
commercial and residential segments because of their advantages
over competing lighting technologies in energy efficiency,
longevity, color rendering capability, and environmental factors
[4]. However, VLC has certain shortcomings compared with
traditional RF communication. The main drawback is that the
achievable data rate drops sharply with increasing link distance,
which limits the range of high-data-rate VLC use cases [4].
Fortunately, channel reliability and link distance can be
increased by forward error correction (FEC) techniques [2], [5],
[6].
In a communication system, FEC corrects errors by encoding data
with redundant bits at the transmitter. The redundant data enable
the receiver to detect and correct some errors without asking the
transmitter to retransmit the data [7]. FEC techniques are also
known as channel coding methods. Current optical networks employ
FEC based on classical error-correcting codes such as Reed-Solomon
(RS) or Bose-Chaudhuri-Hocquenghem (BCH) codes [8]. Both RS and
BCH codes currently use hard-decision-based receivers, which have
limited coding gain. Fig.1 shows the block diagram of a typical
low-data-rate VLC transmitter/receiver. In this system, a
concatenated FEC scheme is used for channel encoding and decoding:
an RS code serves as the outer code and a convolutional code (CC)
as the inner code. FEC solutions for the different operating modes
of VLC systems are presented in [1]. Polar code was introduced as
a low-complexity channel coding method that can achieve Shannon's
channel capacity for any binary-input symmetric discrete
memoryless channel [9]. Systematic polar codes (SPC) were proposed
by Arıkan and are known for their improved bit error rate (BER)
performance compared with the original non-systematic polar codes
[6], [10]–[12]. The basic decoding algorithm for Polar codes is
the Successive Cancellation (SC) algorithm, a non-iterative
sequential algorithm with complexity O(N log N) for a code of
length N. Owing to its low complexity and high performance, Polar
code is now applied in many systems. Its main drawback is
unimpressive error-correction performance at short code lengths.
Many approaches have therefore been introduced to enhance the
performance of Polar code and make it feasible in systems that
require limited code lengths.
In this paper, we propose applying Polar code as an FEC solution
for VLC systems. Our experimental results show that the Polar code
outperforms the RS code in error-correction performance. We also
propose a low-latency, low-resource architecture for the (16,11)
soft-decision Systematic Successive Cancellation Polar Decoder
(S-SCD).
Fig. 1. A low-data-rate VLC OOK transmitter/receiver
arXiv:1905.05002v1 [eess.SP] 6 May 2019
Fig. 2. Systematic Polar Encoder (8,5)
TABLE I
PROCESSING ELEMENT (PE) INPUTS AND OUTPUT
SEL   û   La   Lb   Output (Out)
0     x   x    x    sign(La·Lb)·min(|La|, |Lb|)
1     0   x    x    Lb + La
1     1   x    x    Lb − La
II. POLAR ENCODING/DECODING
A polar code may be specified completely by $(N, K, \mathcal{F})$,
where N is the length of a codeword in bits, K is the number of
information bits encoded per codeword, and $\mathcal{F}$ is the
set of information bit indices [10], [13]. For an
$(N, K, \mathcal{F})$ polar code, we describe below the encoding
operation for a vector of information bits u of length K. The rate
of the code is R = K/N. Let $n = \log_2 N$ and let
$G = F^{\otimes n} = F \otimes \cdots \otimes F$ (n copies) be the
n-fold Kronecker product of Arıkan's [11], [12] standard
polarizing kernel,
$$F = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.$$
A codeword is then generated as in Equation 1, where d is the
length-N encoder input vector:
$$x = u \cdot G = d \cdot F^{\otimes n} \qquad (1)$$
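As an illustration of Equation 1, the following minimal Python sketch builds the n-fold Kronecker power of the kernel given above and encodes a length-N input vector in two ways: by matrix multiplication over GF(2), and by the equivalent recursive butterfly, which also counts XOR operations. The function names are hypothetical and are not taken from the paper.

```python
# Kernel as printed in Section II (upper-triangular form).
KERNEL = [[1, 1], [0, 1]]

def kron_power(kernel, n):
    """Return the n-fold Kronecker power of a binary matrix."""
    g = [[1]]
    for _ in range(n):
        g = [[a & b for a in row_g for b in row_k]
             for row_g in g for row_k in kernel]
    return g

def encode_matrix(d, g):
    """Compute x = d * G over GF(2)."""
    size = len(d)
    return [sum(d[i] & g[i][j] for i in range(size)) % 2
            for j in range(size)]

def encode_butterfly(d):
    """Recursive form of the same transform; returns (codeword, xor_count)."""
    size = len(d)
    if size == 1:
        return list(d), 0
    half = size // 2
    top, bottom = d[:half], d[half:]
    mixed = [a ^ b for a, b in zip(top, bottom)]  # half XORs at this level
    x_top, c_top = encode_butterfly(top)
    x_bottom, c_bottom = encode_butterfly(mixed)
    return x_top + x_bottom, c_top + c_bottom + half

G8 = kron_power(KERNEL, 3)
codeword, xor_count = encode_butterfly([1, 0, 1, 1, 0, 0, 1, 0])
```

For N = 8 the butterfly uses 12 XORs, that is (N/2)·log2(N), and both functions return the same codeword for every input.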
Polar codes in their standard form are non-systematic codes [12].
In other words, the information bits do not appear transparently
as part of the codeword. A systematic polar code may be described
as equivalent to the original polar code, except that the message
vectors are mapped to codewords such that the message bits are
explicitly visible. Systematic polar encoding of an information
vector u of K bits is the solution of Equation 2:
$$x = y \cdot F^{\otimes n} \qquad (2)$$
where $y_{\mathcal{F}}$ (y at the message bit positions) and
$x_{\mathcal{F}^c}$ (x at the frozen bit positions) are the
unknowns, while $x_{\mathcal{F}} = u$ and $y_{\mathcal{F}^c} = 0$
are fixed. There are exactly N unknowns, shared between x and y.
In this paper, we implement the non-recursive systematic polar
encoder introduced in [10]. Fig.2 shows an example of a (8,5)
Systematic Polar Encoder (SPE). At the frozen bit positions, the
value of y is 0 and the data flow runs from left to right. At the
information bit positions, the data flow runs from right to left.
The encoded bits appear at x. We can see that the encoded data x
include the original information bits u. This encoder requires
only $(N/2) \log_2 N$ XOR computations, the same as the
non-systematic Polar encoder.
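The constraints of Equation 2 can be checked with a small Python sketch. It is not the efficient non-recursive encoder of [10]; instead it solves the linear system directly, using the fact that the Kronecker power of the kernel in Section II is upper triangular with a unit diagonal. The information set below is a hypothetical choice made only for illustration, since the paper does not list the index set of its (8,5) code.

```python
KERNEL = [[1, 1], [0, 1]]

def kron_power(kernel, n):
    """Return the n-fold Kronecker power of a binary matrix."""
    g = [[1]]
    for _ in range(n):
        g = [[a & b for a in row_g for b in row_k]
             for row_g in g for row_k in kernel]
    return g

def systematic_encode(u, info_set, n):
    """Solve x = y * G with x[info] = u and y[frozen] = 0 over GF(2)."""
    size = 1 << n
    g = kron_power(KERNEL, n)
    info = sorted(info_set)
    y = [0] * size
    # G is upper triangular, so x[j] depends only on y[i] with i <= j.
    # Solve for y at the information positions in increasing order.
    for j, bit in zip(info, u):
        acc = bit
        for i in info:
            if i < j:
                acc ^= y[i] & g[i][j]
        y[j] = acc
    x = [sum(y[i] & g[i][j] for i in range(size)) % 2
         for j in range(size)]
    return x, y

# Hypothetical information set for an (8,5) code (illustration only).
INFO_SET_8_5 = [3, 4, 5, 6, 7]
x, y = systematic_encode([1, 0, 1, 1, 0], INFO_SET_8_5, 3)
```

Whatever index set is used, the information bits reappear unchanged at the information positions of x, and y is zero at the frozen positions.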
Fig. 3. Systematic Polar Decoder (N=8)
The most popular decoding algorithm for Polar code is the SC
algorithm, which was first introduced by Arıkan [12]. In fact, we
employ the same SC decoder for both systematic and non-systematic
codes. In both cases, the decoder takes the input y′ and produces
an estimate û of u. For non-systematic coding, the decoder stops
after putting out $\hat{u}_A$. For systematic coding, the decoder
has an extra step of computing an estimate
$\hat{x} = \hat{u} \cdot G$ of x, and produces $\hat{x}_A$ as
output. Fig.3 presents the decoding diagram of an S-SCD. The S-SCD
involves calculations using likelihood ratio (LR) values. Storing
LRs directly in floating-point variables is well known to cause
underflow or overflow. A popular solution to this problem is to
store log-likelihood ratios (LLRs) instead of likelihood ratios.
Real-valued calculations in the log domain are used for the F
function inside each processing element (PE) (Fig.3).
Specifically, the F function in the log domain is given in
Equation 3:
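F function:
$$F(x, y) = \mathrm{add\_log}(x + y, 0) - \mathrm{add\_log}(x, y),$$
$$z = \mathrm{add\_log}(x, y) = \begin{cases} y + \log(1 + \exp(x - y)), & x < y \\ x + \log(1 + \exp(y - x)), & x \ge y \end{cases} \qquad (3)$$
Approximation form of F function:
$$F(x, y) \approx \mathrm{sign}(x \cdot y) \cdot \min(|x|, |y|) \qquad (4)$$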
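The exact and approximate forms of the F function can be compared numerically with the short Python sketch below. It assumes that add_log(x, y) denotes log(e^x + e^y), evaluated in the numerically stable form max(x, y) + log(1 + exp(−|x − y|)); the function names are illustrative.

```python
import math

def add_log(x, y):
    """Numerically stable log(exp(x) + exp(y))."""
    hi, lo = max(x, y), min(x, y)
    return hi + math.log1p(math.exp(lo - hi))

def f_exact(x, y):
    """Exact F function in the log domain."""
    return add_log(x + y, 0.0) - add_log(x, y)

def f_min_sum(x, y):
    """Approximate F function: sign(x*y) * min(|x|, |y|)."""
    sign = -1.0 if (x < 0) != (y < 0) else 1.0
    return sign * min(abs(x), abs(y))

exact = f_exact(2.0, -3.0)
approx = f_min_sum(2.0, -3.0)  # -2.0
```

The approximation keeps the sign of the exact result and never underestimates its magnitude, and it needs only a comparator and a sign XOR instead of logarithm and exponential units.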
In hardware, computations based on logarithms and exponentials
are costly. The F and G functions are therefore normally
implemented with simple logic circuits using the approximation
forms shown in Tab.I and Eq.4.
III. PROPOSED ARCHITECTURE
Fig. 4. Proposed (16,11) S-SCD architecture
Fig. 5. Processing element (PE) architecture
The proposed architecture of the (16,11) S-SCD is shown in
Fig.4. The architecture includes five main parts:
• PEs (Processing Elements): Each PE block implements one F
function and one G function. A SEL signal selects whether the
output of the F-function circuit or of the G-function circuit
becomes the output of the PE. Fig.5 shows the structure of the
PE.
• Modified PE: At the last stage of the PE tree, we modify the
normal PE so that the outputs of both the F-function and
G-function circuits are available as PE outputs. The decoded
bit û_i is forwarded to the input of the G-function circuit to
decode û_{i+1}.
• Control Finite State Machine (FSM): This block implements an
FSM that schedules the whole S-SCD core. It assigns the SEL
and s signals, which control the operation of the PE tree
network.
• Decoding (DEC) and FN transform: The DEC and FN transform
logic is implemented as a simple combinational circuit.
• Registers: In the proposed architecture, the S-SCD finishes
decoding 2 bits in each clock cycle. At each positive clock
edge, new input data are loaded into the PEs of stage 3, and
the two bits decoded in the previous round are stored in data
registers.
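The behaviour of the PE and of the modified last-stage PE can be modelled in a few lines of Python. This is a behavioural sketch only, not the authors' RTL. It assumes that a negative LLR is decided as bit 1 and that a frozen position is forced to 0; neither convention is stated explicitly in the paper, and the `frozen` argument is hypothetical.

```python
def pe(sel, u_hat, la, lb):
    """Processing element: SEL = 0 selects F, SEL = 1 selects G."""
    if sel == 0:
        sign = -1 if (la < 0) != (lb < 0) else 1
        return sign * min(abs(la), abs(lb))
    # G function: the partial sum u_hat chooses between Lb - La and Lb + La.
    return lb - la if u_hat else lb + la

def modified_pe(la, lb, frozen=(False, False)):
    """Last-stage PE: evaluates F and G together and emits two bits."""
    f_out = pe(0, 0, la, lb)
    u0 = 0 if frozen[0] else int(f_out < 0)  # assumed hard-decision rule
    g_out = pe(1, u0, la, lb)                # u0 feeds the G circuit
    u1 = 0 if frozen[1] else int(g_out < 0)
    return u0, u1

bits = modified_pe(3, -5)  # (1, 1)
```

Because the first decision is available combinationally, the G output can be formed in the same evaluation, which is what allows two bits to be produced per clock.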
Current hardware architectures for SCD focus on high throughput
[6], [7], [9], [14], [15] and are expected to be applied in
high-speed systems. VLC systems, on the other hand, work mostly
in low-data-rate (PHY-I) and medium-data-rate (PHY-II, PHY-III)
modes [1]. The data rate ranges from 11.67 kb/s to 5 Mb/s, with
the optical clock rate set up to 7.5 MHz. In this paper, we
propose a low-latency, low-resource architecture for the S-SCD,
in which high throughput is not the highest design priority.
Specifically, we implement a compact (16,11) soft-decision
S-SCD. The decoder shows good performance with low
implementation complexity. Its code rate (K/N = 11/16) is also
close to that of RS(15,11), which is the FEC solution in many
modes of VLC systems. Fig.6 shows the scheduling technique of
the proposed architecture. The (16,11) S-SCD has four stages of
PEs. In the conventional architecture [14], one clock cycle is
dedicated to the processing of each PE stage, and two clock
cycles are spent on the last stage. We propose processing all
PEs of the four stages within one clock cycle. For the last
stage, we make minor modifications so that its PE outputs two
decoded bits.
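The effect of the scheduling on the clock count can be reproduced with a simple counting model. The sketch below assumes a stage-per-clock SC schedule in which a block of N bits costs one clock for its F stage, one clock for its G stage, and the cost of its two half-size sub-blocks; the last stage costs two clocks when F and G are evaluated separately and one clock when they are merged. It is a counting model under these assumptions, not the authors' scheduler.

```python
def sc_clocks(n_bits, leaf_clocks):
    """Clock cycles of a stage-per-clock SC schedule.

    leaf_clocks is the cost of the last stage (a 2-bit sub-block):
    2 if F and G take separate clocks, 1 if they share one clock.
    """
    if n_bits == 2:
        return leaf_clocks
    return 2 * sc_clocks(n_bits // 2, leaf_clocks) + 2

def single_cycle_tree_clocks(n_bits):
    """All stages evaluated in one clock, two bits decoded per clock."""
    return n_bits // 2

conventional = sc_clocks(16, leaf_clocks=2)  # 30
two_bit = sc_clocks(16, leaf_clocks=1)       # 22
proposed = single_cycle_tree_clocks(16)      # 8
```

For N = 16 the model gives 30, 22, and 8 clocks, which agrees with the closed forms 2N − 2 and 3N/2 − 2 usually quoted for the architectures of [14] and [15], and with N/2 for a decoder that completes two bits per clock.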
Furthermore, the proposed architecture is based on fixed-point
calculations, so the choice of the number of quantization bits Q
is very important. Fig.7 shows the BER performance of the 4-bit
and 5-bit fixed-point (16,11) SCD compared with its
floating-point version. At Q=5, the fixed-point decoder shows
BER performance similar to that of the floating-point decoder.
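To make the role of Q concrete, the following Python sketch quantizes an LLR to a Q-bit signed integer and shows the saturating addition needed by the G function. The paper does not specify its quantization scheme, so the unit step size and the symmetric saturation range used here are assumptions made for illustration.

```python
def quantize_llr(llr, q_bits, step=1.0):
    """Round an LLR to a signed q_bits integer with symmetric saturation."""
    q_max = (1 << (q_bits - 1)) - 1  # e.g. 15 for Q = 5, 7 for Q = 4
    level = int(round(llr / step))
    return max(-q_max, min(q_max, level))

def sat_add(a, b, q_bits):
    """Saturating addition, as needed for G = Lb + La or Lb - La."""
    q_max = (1 << (q_bits - 1)) - 1
    return max(-q_max, min(q_max, a + b))

la = quantize_llr(3.4, 5)    # 3
lb = quantize_llr(40.0, 5)   # 15 (saturated)
g_out = sat_add(lb, la, 5)   # 15 (saturated)
```

Only the G function can grow the magnitude of an LLR; the min-based F function never exceeds its inputs, so it needs no saturation logic. A smaller Q clips large LLRs earlier, which is one plausible source of the performance gap between the fixed-point and floating-point decoders.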
IV. EXPERIMENTAL RESULTS
Tab.II summarizes the latency, in clock cycles, of the proposed
S-SCD compared with the conventional architecture and the 2b-SC
concept. The proposed (16,11) S-SCD requires only 8 clocks to
finish decoding 16 bits, which reduces the latency by factors of
3.75 and 2.75 compared with the conventional and 2b-SC
architectures, respectively. However, because all four PE stages
are evaluated within a single clock cycle, the maximum clock
frequency of the proposed decoder is lower than that of the
conventional and 2b-SC architectures. This creates a trade-off
between latency and clock frequency of the decoder.
However, as explained in Section III, for low-data-rate
systems like VLC, latency is given higher priority
Fig. 6. Proposed scheduling technique for the proposed (16,11) S-SCD
TABLE II
LATENCY OF POPULAR SCD ARCHITECTURES (WHEN APPLIED TO THE (16,11) S-SCD)
Architecture        Scheduling for decoding the 1st and 2nd bits   Number of clocks for decoding
Conventional [14]   (F)-(F)-(F)-(F)-(G)                            30 clocks
2b-SC [15]          (F)-(F)-(F)-(F-G)                              22 clocks
Proposed            (F-F-F-F-G)                                    8 clocks
TABLE III
HARDWARE SYNTHESIS RESULTS
Design         Logic Elements   Registers   Memory Bits   Fmax
10-bit S-SCD   1108             181         0             58.11 MHz
6-bit S-SCD    700              117         0             66 MHz
5-bit S-SCD    578              101         0             73.53 MHz
than throughput [16]. Fig.8 and Fig.9 show the bit-error-rate
(BER) and frame-error-rate (FER) performance of the proposed
soft-decision S-SCD. These figures also show the performance of
the reference RS(15,11) code and of the hard-decision SCD and
S-SCD. The soft-decision Polar decoder performs much better than
the hard-decision decoder. Specifically, Fig.8 shows a coding
gain of 2 dB for the soft-decision over the hard-decision
(16,11) SCD at BER = 1E-4. Because RS(15,11) is widely used in
VLC systems, we also compare RS(15,11), in which one symbol
includes 8 bits, with the hard-decision (128,96) S-SCD. In this
case, not only are the BER and FER performances of the
hard-decision (128,96) S-SCD better, but the S-SCD also uses the
channel more efficiently because its code rate is higher
(96/128 = 0.75 compared with 11/15 ≈ 0.73). In summary, we
propose applying the soft-decision (16,11) S-SCD because of its
low complexity and its better BER/FER performance compared with
current RS solutions in VLC systems.
Tab.III summarizes the hardware synthesis results of the
proposed (16,11) soft-decision S-SCD. With Q = 5 bits, the
proposed hardware reaches a maximum frequency of around 73 MHz
while keeping resource usage low, with no memory bits used.
Fig. 7. Performance of the fixed-point SCD for different numbers of
quantization bits Q.
Fig. 8. Bit-error-rate performance of the proposed soft decision S-SCD.
The synthesis results shown in Tab.III were obtained by
synthesizing the proposed design with the Quartus II software.
The selected FPGA device is an Altera Cyclone IV
EP4CE115F29C7N.
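The reported clock counts and Fmax values can be combined into a short worked calculation. The Python sketch below takes the numbers from the latency and synthesis tables and assumes, for illustration only, that the decoder is clocked at its Fmax; an actual VLC receiver would typically run it at a much lower clock.

```python
# Clock counts from the latency table, (16,11) code.
CLOCKS = {"conventional": 30, "2b-SC": 22, "proposed": 8}
# Fmax in MHz from the synthesis table, keyed by quantization bits Q.
FMAX_MHZ = {10: 58.11, 6: 66.0, 5: 73.53}

def latency_reduction(baseline):
    """Ratio of baseline clocks to the proposed decoder's clocks."""
    return CLOCKS[baseline] / CLOCKS["proposed"]

def decode_time_ns(q_bits):
    """Time to decode one 16-bit codeword at Fmax, in nanoseconds."""
    return CLOCKS["proposed"] / FMAX_MHZ[q_bits] * 1e3

def info_throughput_mbps(q_bits, k=11):
    """Information bits per second at Fmax, in Mb/s."""
    return k / CLOCKS["proposed"] * FMAX_MHZ[q_bits]

vs_conventional = latency_reduction("conventional")  # 3.75
vs_two_bit = latency_reduction("2b-SC")              # 2.75
t_q5 = decode_time_ns(5)
rate_q5 = info_throughput_mbps(5)
```

With Q = 5, one codeword takes roughly 109 ns at Fmax and the information throughput is about 101 Mb/s, well above the 5 Mb/s upper data rate quoted in Section III. The comparison with the other architectures is in clock cycles only, since their Fmax values are not reported here.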
V. CONCLUSION
In this paper, we have proposed a low-latency, low-resource
architecture for the compact (16,11) soft-decision S-SCD.
The proposed decoder has shown an improvement in latency
compared with the conventional and 2b-SC architectures. We have
also shown that the BER/FER performance of the proposed S-SCD
is better than that of RS(15,11), which is the current FEC
approach in VLC systems. Moreover, hardware synthesis results
have demonstrated that the proposed S-SCD is a low-complexity
FEC solution. Therefore, the proposed decoder is well suited to
VLC systems. As future work, we are building a full VLC system
based on FPGA and customized VLC front-ends, in which the
proposed S-SCD is also applied.
Fig. 9. Frame-error-rate performance of the proposed soft decision S-SCD.
COMPETING INTERESTS
The authors declare that they have no competing interests.
ACKNOWLEDGEMENTS
This work was supported by JSPS KAKENHI Grant Num-
ber JP16K18105.
REFERENCES
[1] S. Rajagopal, R. D. Roberts, and S.-K. Lim, “IEEE 802.15.7 visible
light communication: modulation schemes and dimming support,” IEEE
Communications Magazine, vol. 50, no. 3, pp. 72–82, 2012.
[2] D. D. Le, D. P. Nguyen, T. H. Tran, and Y. Nakashima, “Joint polar
and run-length limited decoding scheme for visible light communication
systems,” IEICE Communications Express, vol. 7, no. 1, pp. 19–24,
2018.
[3] T. Tuan-Kiet, H.-T. Huynh, D.-P. Nguyen, L. Dinh-Dung, T. Thi-Hong,
and Y. Nakashima, “Demonstration of a visible light receiver
using rolling-shutter smartphone camera,” in 2018 International
Conference on Advanced Technologies for Communications (ATC). IEEE,
2018, pp. 214–219.
[4] A. Jovicic, J. Li, and T. Richardson, “Visible light communication:
opportunities, challenges and the path to market,” IEEE Communications
Magazine, vol. 51, no. 12, pp. 26–32, 2013.
[5] D.-D. Le, D.-P. Nguyen, T.-H. Tran, and Y. Nakashima, “Log-likelihood
ratio calculation using 3-bit soft-decision for error correction in visible
light communication systems,” IEICE Transactions on Fundamentals of
Electronics, Communications and Computer Sciences, vol. 101, no. 12,
pp. 2210–2212, 2018.
[6] D.-P. Nguyen, D.-D. Le, T.-H. Tran, H.-T. Huynh, and Y. Nakashima,
“Hardware implementation of a non-RLL soft-decoding beacon-based
visible light communication receiver,” in 2018 International Conference
on Advanced Technologies for Communications (ATC). IEEE, 2018,
pp. 208–213.
[7] D.-P. Nguyen, D.-D. Le, T.-H. Tran, H.-T. Huynh, and Y. Nakashima,
“VLSI architecture of compact non-RLL beacon-based visible light
communication transmitter and receiver,” arXiv preprint
arXiv:1805.03398, 2018.
[8] M. Sakib, V. Mahalingam, W. Gross, and O. Liboiron-Ladouceur,
“Optical front-end for soft-decision LDPC codes in optical communication
systems,” Journal of Optical Communications and Networking, vol. 3,
no. 6, pp. 533–541, 2011.
[9] O. Dizdar and E. Arıkan, “A high-throughput energy-efficient imple-
mentation of successive cancellation decoder for polar codes using
combinational logic,” IEEE Transactions on Circuits and Systems I:
Regular Papers, vol. 63, no. 3, pp. 436–447, 2016.
[10] H. Vangala, Y. Hong, and E. Viterbo, “Efficient algorithms for systematic
polar encoding,” IEEE Communications Letters, vol. 20, no. 1, pp. 17–20,
2016.
[11] E. Arıkan, “Systematic polar coding,” IEEE Communications Letters,
vol. 15, no. 8, pp. 860–862, 2011.
[12] ——, “Channel polarization: A method for constructing capacity-
achieving codes,” in 2008 IEEE International Symposium on Information
Theory. IEEE, 2008, pp. 1173–1177.
[13] H. Vangala, E. Viterbo, and Y. Hong, “A comparative study of polar code
constructions for the AWGN channel,” arXiv preprint arXiv:1501.02473,
2015.
[14] C. Leroux, I. Tal, A. Vardy, and W. J. Gross, “Hardware architectures
for successive cancellation decoding of polar codes,” in 2011 IEEE
International Conference on Acoustics, Speech and Signal Processing
(ICASSP). IEEE, 2011, pp. 1665–1668.
[15] B. Yuan and K. K. Parhi, “Low-latency successive-cancellation polar
decoder architectures using 2-bit decoding,” IEEE Transactions on
Circuits and Systems I: Regular Papers, vol. 61, no. 4, pp. 1241–1254,
2014.
[16] D.-P. Nguyen and D.-D. Le, “An FPGA-based centralized visible light
beacon network,” arXiv preprint arXiv:1903.06228, 2019.
