A Flip-Syndrome-List Polar Decoder Architecture for Ultra-Low-Latency
  Communications by Zhang, Huazi et al.
ar
X
iv
:1
80
8.
01
75
6v
1 
 [c
s.I
T]
  6
 A
ug
 20
18
1
A Flip-Syndrome-List Polar Decoder Architecture
for Ultra-Low-Latency Communications
Huazi Zhang, Jiajie Tong, Rong Li, Pengcheng Qiu, Yourui Huangfu, Chen Xu, Xianbin Wang, Jun Wang
Huawei Technologies Co. Ltd.
Email: {zhanghuazi,tongjiajie,rongone.li,justin.wangjun}@huawei.com
Abstract—We consider practical hardware implementation
of Polar decoders. To reduce latency due to the serial nature
of successive cancellation (SC), existing optimizations [7]–[13]
improve parallelism with two approaches, i.e., multi-bit decision
or reduced path splitting. In this paper, we combine the two pro-
cedures into one with an error-pattern-based architecture. It si-
multaneously generates a set of candidate paths for multiple bits
with pre-stored patterns. For rate-1 (R1) or single parity-check
(SPC) nodes, we prove that a small number of deterministic
patterns are required to guarantee performance preservation.
For general nodes, low-weight error patterns are indexed by
syndrome in a look-up table and retrieved in O(1) time. The
proposed flip-syndrome-list (FSL) decoder fully parallelizes all
constituent code blocks without sacrificing performance, thus
is suitable for ultra-low-latency applications. Meanwhile, two
code construction optimizations are presented to further reduce
complexity and improve performance, respectively.
Index Terms—Channel coding, Decoding, Hardware, Low
latency.
I. INTRODUCTION
A. Background and related works
Polar codes [1], [2] have been selected for the fifth
generation (5G) wireless standard. With state-of-the-art code
construction techniques [3]–[5] and SC-List (SCL) decoding
algorithm [6]–[13], Polar codes demonstrate competitive per-
formance over LDPC and Turbo codes in terms of block error
rate (BLER). Beyond 5G, ultra-low decoding latency emerges
as a key requirement for applications such as autonomous
driving and virtual reality. The latency of practical Polar
decoders, e.g., an SC-list decoder with list size L = 8, is
relatively long due to the serial processing nature.
Continuous efforts [7]–[13] have been made to signifi-
cantly reduce decoding latency. Among them, we are par-
ticularly interested in hardware implementations, which are
dominant in real-world products, due to better power- and
area-efficiency. According to our cross-validation, three ap-
proaches are shown to be cost-effective, yet incur no or
negligible performance loss compared to the original SCL
decoder, as summarized below:
1) Pruning on the SC decoding tree [7] (parallelizing
constituent code blocks with mult-bit decision)
• Rate-0 (R0), repetition (Rep) nodes [8], [9].
• General (Gen) nodes comprised of consecutive bits
[10], [11].
2) Reduce the number of path splitting
• Rate-1 (R1), single parity-check (SPC) nodes [9].
• Do not split upon the most reliable (good) bits [12],
[13].
3) Reduce the latency of list pruning
• Adopt bitonic sort [14] for efficient pruning.
• Quick list pruning [15].
B. Motivation and our contributions
It is well known that an SC decoder requires 2N − 2 time
steps for a length-N code [1]. The SC decoding factor graph
reveals that, the main source of latency is the left hand side
(LHS, or information bit side) of the graph. In contrast, the
right hand side (RHS, or codeword side) of the graph consists
of independent code blocks and already supports parallel
decoding.
With the above observations, the key to low-latency de-
coding is to parallelize LHS processing. Existing hardware
decoder designs are pioneered by [7]–[11], which view SC
decoding as binary tree search, i.e., a length-N code (a parent
node) is recursively decomposed into two length-N/2 codes
(child nodes). Upon reaching certain special nodes, their
child nodes are not traversed [7] and the corresponding path
metrics are directly updated at the parent node [8]. Even
though, there is still room for further optimizations:
• The processing of an R1/SPC node is not fully parallel
(e.g., a number of sequential path extension & pruning
are still required [9]). A higher degree of parallelism
can be exploited to further reduce latency.
• Optimizations (e.g., parallel processing) are applied to
some special nodes (e.g., R0/Rep/SPC/R1), and the
length of such blocks, denoted by B, is often short
due to insufficient polarization. According to our mea-
surement under typical code lengths, the main source
of latency is now incurred by the general nodes whose
constituent code rates are between 2
B
and B−2
B
.
Motivated by [7]–[11], and thanks to the recent advances in
efficient list pruning [14], [15], we find it profitable to further
improve parallelism for ultra-low-latency applications. Our
contributions are summarized below:
1) We propose to fully parallelize the processing of
R1/SPC nodes via multi-bit hard estimation and flip-
ping at intermediate stages. Only one-time path exten-
sion/pruning per node is required by applying a small
2number of flipping patterns on the raw hard estimation.
Such simplification is proven to preserve performance.
2) For general nodes, we apply flip-syndrome-list (FSL)
decoding to constituent code blocks. Specifically, a
small set of low-weight error patterns are pre-stored
in a table indexed by syndrome. During decoding,
syndrome is calculated per constituent code block.
Its associated error patterns are retrieved from the
syndrome table, and used for bit-flip-based sub-path
generation. Similar to R1/SPC nodes, the FSL decoder
narrows down the candidates for path extension, and
enjoys the simplicity of a hard-input decoder. The
proposed optimization is shown to incur negligible
performance loss.
3) The complexity of an FSL decoder is mainly incurred
by constituent code blocks with medium rates. We
propose to re-adjust the distribution of information bits
in order to avoid certain constituent code rates, such
that decoder complexity can be significantly reduced.
We show that the performance loss can be negligible.
4) With the FSL decoder’s capability to decode arbitrary
linear outer constituent codes, not necessarily Polar
codes, we propose to adopt hybrid outer codes with
optimized distance spectrum. The hybrid-Polar codes
demonstrate better performance than the original Polar
codes.
Paper is organized as follows, Section II introduces the
fundamentals of Polar SCL decoding, Section III provides
the details of FSL decoder including R1/SPC nodes, general
nodes, latency analysis and BLER performance, Section IV
proposes two improved code construction methods that ben-
efit from the FSL decoder architecture, Section V concludes
the paper.
II. POLAR CODES AND SCL DECODING
A binary Polar code of mother code length N = 2n can
be defined by c = uG and a set of information sub-channel
indices I. The information bits are assigned to sub-channels
with indices in I, i.e., uI , and the frozen bits (zero-valued
by default) are assigned to the rest sub-channels. The Polar
kernel matrix is G = F⊗n, where F =
[
1 0
1 1
]
is the kernel
and ⊗ denotes Kronecker power, and c is the code word. The
transmitted BPSK symbols are xN−10 = 1− 2 · c
N−1
0 and the
received vector is yN−10 .
For completeness, the original SCL decoder [6] is briefly
revisited. The SC decoding factor graph of a length-N Polar
code consists of N × (log2N + 1) nodes. The row indices
i = {0, 1, · · · , N − 1} denote the N bit indices. The column
indices s = 0, 1, · · · , log2N denote decoding stages, with
s = 0 labeling the information bit side and s = log2N
labeling the input LLR side (or codeword side). Each node
in the factor graph can be indexed by a (s, i) pair, and is
associated with a soft LLR value αs,i, which is initialized
by αlog
2
N,i = yi, and a hard estimate βs,i.
For all s and i satisfying i mod 2s+1 < 2s, a hardware-
friendly right-to-left updating rule for α is:
αs,i = sgn(αs+1,i)sgn(αs+1,i+2s )min(|αs+1,i|, |αs+1,i+2s |),
αs,i+2s = (1 − 2βs,i)αs+1,i + αs+1,i+2s .
The hard estimate of the i-th bit is β0,i =
1−sgn(α0,i)
2 . The
corresponding left-to-right updating rule for β is:
βs,i = βs−1,i ⊕ βs−1,i+2s−1 ,
βs,i+2s−1 = βs−1,i+2s−1 .
An SCL decoder with list size L executes path split upon
each information bit, and preserves L paths with smallest
path metrics (PM). Given the l-th path with uˆli as the i-th
hard output bit, a hardware-friendly PM updating rule [16]
is
PMli =
{
PMli−1, if uˆ
l
i = β
l
0,i,
PMli−1 + |α
l
0,i|, otherwise,
where PMli denotes the path metric of the l-th path at bit
index i, and αl0,i and β
l
0,i denote its corresponding soft LLR
and hard estimation, respectively.
After decoding the last bit, the first path1 is selected as
decoding output.
III. FLIP-SYNDROME-LIST (FSL) DECODING
SC-based decoding of length-N Polar codes requires
log2N + 1 stages to propagate received signal (s = log2N )
to information bits (s = 0). The degree of parallelism is 2s,
i.e., reduces by half after each decoding stage.
To increase parallelism, we propose to terminate the
LLR propagation at intermediate stage s = log2 B, and
process all length-B constituent code blocks with a hard-
input decoder. The design is detailed throughout this section,
where differences to existing works mainly include (i) fully
parallelized processing for B bits and L paths, and (ii)
supporting arbitrary-rate blocks rather than special ones (e.g.,
R0/Rep/SPC/R1).
A. Multi-bit hard decision at intermediate stage
The indices of a constituent code block is denoted by B ,
{i, i + 1, · · · , i + B − 1}. Once the soft LLRs at the s-th
stage are obtained, where s = log2B, a raw hard estimation
is immediately obtained by
βs,B =
1− sgn(αs,B)
2
. (1)
In contrast to SCL that uses the soft LLR αs,B, a con-
stituent block decoder takes βs,B as its hard input, and
directly generates hard code word βˆs,B as decoded output.
The hard-input decoders for R1, SPC and general nodes
will be described next in Section III-B and III-C. For now,
we assume such a decoder outputs a hard code word βˆs,B
1For CRC-aided Polar, the first path that passes CRC check is selected.
3,
l
s
a ,
l
s
b
,0
,
ˆ l
s
b
,1
,
ˆ l
s
b
,3
,
ˆ l
s
b
Bits with 
smallest LLRs
LLR vector for 
constituent block
Hard 
decision
Flipped vector 1
Flipped vector 2
Flipped vector 4
…
Error-pattern based parallel path extension:
A small set of possible error patterns are pre-stored in LUT
Sub-path 1
Sub-path 2
Sub-path 4
0 1, ,e e
Fig. 1: Error-pattern based parallel path extension.
for each candidate path, and recover the corresponding in-
formation vector by
uˆB = βˆs,BF
⊗s. (2)
Given the soft LLRs αˆs,B and the recovered codeword
βˆs,B , the multi-bit version of PM updating rule [8] is
PMli = PM
l
i−B +
∑
j∈B
(∣∣∣βˆls,j − βls,j
∣∣∣ ∣∣αls,j∣∣
)
. (3)
The remaining updating of α and β is based on the hard
decision βˆs,B rather than the raw estimation βs,B .
B. Parallelized path extension via bit flipping
1) Rate-1 nodes: For an R1 node, the state-of-the-art
decoding method [9] requires min(L − 1, B) times path
extensions. First, the input soft LLRs αls,B for each list path
is sorted. Then, path extensions are performed only on the
min(L − 1, B) least LLR positions to reduce complexity.
Such simplification incurs no performance loss since addi-
tional path extensions are proven to be redundant [9]. The
searching space becomes L×2min(L−1,B), much smaller than
L×2B for conventional SCL [6] and SSCL [8]. Another work
[17] also proposes to reduce searching space for R1 nodes.
But its candidate paths generation is LLR-dependent, thus is
suitable for software implementation as suggested in [17].
In this paper, we focus on hardware implementation and
propose a parallel path extension based on pre-stored error-
patterns. As shown in Fig. 1, only one-time path extension
and pruning is required for a constituent block. The optimiza-
tion exploits the deterministic partial ordering of incremental
path metrics within a block. Accordingly, the search for
survived paths can be narrowed down to a limited set, and
pre-stored in the form of error patterns in a look-up table
(LUT). The LUT is shown to be very small for a practical
list size L = 8. As such, the advantages are:
• B bits are decoded in parallel.
• Sub-paths are generated in parallel.
• The above two procedures are combined into one.
Notation 1 (soft/hard vectors): The soft LLR input of a
constituent block is indexed by ascending reliability order,
i.e., αls,B such that |α
l
s,0| < |α
l
s,1| < · · · < |α
l
s,B−1| for each
list path. The corresponding raw hard estimation is denoted
by β ls,B ,
[
βls,0, β
l
s,1, · · · , β
l
s,B−1
]
.
Notation 2 (sub-paths extension): For a constituent block
with indices B, a sub-path that extends from the i-bit to the
(i + B − 1)-th bit can be well defined by the blockwise
decoding output. For example, the t-th sub-path of the l-th
path is denoted by the vector βˆ
l,t
s,B .
Notation 3 (bit-flipping): Each vector βˆ
l,t
s,B is generated by
flipping β ls,B based on an error pattern e. A single-bit-error
pattern is denoted by ep if it has one at the p-th bit position
(p = 0, 1, · · · ) and zeros otherwise.
For L = 8, we narrow down the searching space per list
path from 2min(L−1,B) to 13 by the following proposition.
Proposition 1: For each path in an SCL with L = 8,
its L maximum-likelihood sub-paths (i.e., with minimum
incremental path metrics) fall into a deterministic set of
size 13. These sub-paths can be obtained by bit flipping
the original hard estimation of each list path based on the
following error patterns:
βˆ
l,t
s,B =


β ls,B, t = 0;
β ls,B ⊕ et−1, 1 ≤ t ≤ 7,
β ls,B ⊕ e0 ⊕ e1, t = 8,
β ls,B ⊕ e0 ⊕ e2, t = 9,
β ls,B ⊕ e1 ⊕ e2, t = 10,
β ls,B ⊕ e0 ⊕ e3, t = 11,
β ls,B ⊕ e0 ⊕ e1 ⊕ e2, t = 12.
(4)
Proof: To survive from the sub-paths of all L paths,
a sub-path must first survive from the sub-paths of its own
parent path. That means for each parent path, we only need
to consider its L maximum-likelihood sub-paths. Altogether,
there are at most L2 sub-path to be considered.
According to (3), the path metric penalty is received only
on the flipped positions. For each sub-path and its associated
error patterns, the incremental path metric is computed by
∆PMl,ti+B−1 , PM
l,t
i+B−1 − PM
l
i (5)
=


0, t = 0;
|αls,t−1|, 1 ≤ t ≤ 7,
|αls,0|+ |α
l
s,1|, t = 8,
|αls,0|+ |α
l
s,2|, t = 9,
|αls,1|+ |α
l
s,2|, t = 10,
|αls,0|+ |α
l
s,3|, t = 11,
|αls,0|+ |α
l
s,1|+ |α
l
s,2|, t = 12,
Since the indices of soft LLRs |αls,B| are ordered according
to Notation 1, the incremental path metrics also satisfy a set
of partial order, as shown in the following directed graph.
The arrow “→” denotes a “smaller than” relationship that
can be easily verified.
0
↓
|αls,0| → |α
l
s,1| → |α
l
s,2| → |α
l
s,3| → |α
l
s,4| → |α
l
s,5| → |α
l
s,6| → · · ·
ւ ց ց
|αls,0|+ |α
l
s,1| → |α
l
s,0|+ |α
l
s,2| → |α
l
s,0|+ |α
l
s,3| → · · ·
↓ ↓
|αls,1|+ |α
l
s,2| → |α
l
s,1|+ |α
l
s,3| → · · ·
↓
|αls,0|+ |α
l
s,1|+ |α
l
s,2| → · · ·
4We prove Proposition 1 with the above directed graph. Any
node with a minimum distance to the root node “0” larger
than L = 8 cannot survive path pruning.
First, if the 8-th smallest incremental path metric is caused
by a single bit error, then it cannot be |αls,7| or larger, oth-
erwise there will be more than 8 sub-paths with incremental
path metrics smaller than the 8-th one, which contradicts the
assumption. The argument is obvious since there are already
8 nodes upstream of |αls,7| in the directed graph.
Similarly, the 8-th smallest incremental path metric caused
by two bit errors cannot be equal to or larger than |αls,1| +
|αls,3|, because there are already more than 8 sub-paths with
smaller path metrics in its upstream.
Finally, the sub-paths with incremental path metric |αls,0|+
|αls,1| + |α
l
s,2| also has 8 nodes in its upstream (including
itself), and any error patterns with larger incremental path
metric (including the 4-bit patterns) will lead to contradiction
if they are included in the surviving set.
Thus, we can reduce the tested error patterns per path to 13
with only one-time path extension without any performance
loss.
Remark 1: The bit-flipping-based path extension is mainly
constituted of binary/LUT operations. The 13 error patterns
are pre-stored. The resulting path metrics for all error patterns
can be computed in parallel according to (3) or (5).
The path extension and pruning are as summarized by
“(13 → 8 → 64 → 8) × 1”, explained as follows. For
each path, the 13 error patterns lead to 13 sub-paths, among
which the 8 with smallest path metrics are pre-selected
(13 → 8). Altogether, there will be 8 × L = 64 extended
paths (8→ 64) for the case of L = 8. The 64 extended paths
are pruned back to 8 (64 → 8). The above procedures are
executed only one time. In contrast, the fast-SSCL decoder
[9] requires L − 1 = 7 times path extension and pruning,
i.e., (8 → 16 → 8) × 7. According to Section III-D, the
minimum number of “cycles” reduces from 49 to 14 in the
case of a length-16 R1 block. To avoid any misunderstanding,
the “cycles” here captures implementation details in our
fabricated ASIC [19], thus should be distinguished from the
“time steps” concept in [9].
Remark 2: The proposition addresses list size L = 8,
but its idea naturally extends to all list sizes as long as
the corresponding error patterns are identified. Among them,
decoders with list size L = 8 are particularly important
since they are widely accepted by the industry during the 5G
standardization process [18]. The conclusion is drawn after
extensive evaluations on the tradeoff among BLER, latency,
throughput and power consumption, in which decoders with
L = 8 achieve the best overall efficiency. The tradeoff in
real hardware is further verified in our implemented decoder
ASIC in [19].
2) SPC nodes: For an SPC node, the state-of-the-art de-
coding method [9] requiresmin(L,B) times path extensions.
In this work, we propose only one-time path extension and
reduce the searching space from 2min(L,B) to 13 as follows.
TABLE I: Partial order of path metrics for a SPC node
Even checksum case for β ls,B
0 – – –
|αls,0|+ |α
l
s,1| – – –
|αls,0|+ |α
l
s,2| |α
l
s,1|+ |α
l
s,2| – –
|αls,0|+ |α
l
s,3| |α
l
s,1|+ |α
l
s,3| |α
l
s,2|+ |α
l
s,3|
3∑
j=0
|αls,i|
|αls,0|+ |α
l
s,4| |α
l
s,1|+ |α
l
s,4| · · · · · ·
· · · · · · · · · · · ·
|αls,0|+ |α
l
s,7| · · · · · · · · ·
Odd checksum case for β ls,B
|αls,0| – – –
|αls,1| – – –
|αls,2|
∑
j=0,1,2
|αls,i| – –
|αls,3|
∑
j=0,1,3
|αls,i|
∑
j=0,2,3
|αls,i|
∑
j=1,2,3
|αls,i|
|αls,4|
∑
j=0,1,4
|αls,i| · · · · · ·
· · · · · · · · · · · ·
|αls,7| · · · · · · · · ·
Proposition 2: For SCL with L = 8, following Notation 1,
if the checksum of β ls,B is even, i.e.,
∑
j∈B
βls,j = 0, then the
L surviving paths can be obtained from bit flipping each list
path based on the following 13 error patterns:
βˆ
l,t
s,B =


β ls,B, t = 0;
β ls,B ⊕ e0 ⊕ et, 1 ≤ t ≤ 7,
β ls,B ⊕ e1 ⊕ e2, t = 8,
β ls,B ⊕ e1 ⊕ e3, t = 9,
β ls,B ⊕ e1 ⊕ e4, t = 10,
β ls,B ⊕ e2 ⊕ e3, t = 11.
β ls,B ⊕ e0 ⊕ e1 ⊕ e2 ⊕ e3, t = 12.
(6)
Otherwise, if the checksum of β ls,B is odd, i.e.,
∑
j∈B
βls,j = 1,
then the L surviving paths can be obtained from bit flipping
each list path based on the following 13 error patterns:
βˆ
l,t
s,B =


β ls,B ⊕ et, 0 ≤ t ≤ 7,
β ls,B ⊕ e0 ⊕ e1 ⊕ e2, t = 8,
β ls,B ⊕ e0 ⊕ e1 ⊕ e3, t = 9,
β ls,B ⊕ e0 ⊕ e2 ⊕ e3, t = 10,
β ls,B ⊕ e1 ⊕ e2 ⊕ e3, t = 11,
β ls,B ⊕ e0 ⊕ e1 ⊕ e4, t = 12,
(7)
Proof: The proof follows that of Proposition 1. For
simplicity, we change the directed graph to Table I, where
the right and lower cells are always larger than the left and
upper ones. As seen, any error patterns other than the those
given in Proposition 2 will lead to more than 8 surviving
paths with path metrics smaller than the 8-th path, which
contradicts the assumption.
Remark 3: According to Section III-D, the latency (cycles)
reduction from [9] is 56→ 15 under L = 8.
C. Error pattern identification via syndrome decoding
Existing optimizations operate on special rates, e.g.,
R0/R1/SPC/Rep nodes. In this work, we suggest a paral-
5TABLE II: Syndrome table for B = 8,KB = 6
Syndrome Error Patterns (in Hex)
00 00 05 11 41
01 01 04 10 40
10 03 09 21 81
11 02 08 20 80
lelization method for arbitrary nodes with larger sizes (e.g.,
B = 8, 16, · · · ).
For general nodes, it is not easy to identify all possible
error patterns as in R1/SPC nodes. However, it is possible to
quickly narrow down to a subset of highly-likely error pat-
terns for parallelized path extension. Syndrome decoding is
particularly suitable here for two reasons, e.g., (i) blockwise
syndrome calculation is simple and reuses the Kronecker
product module, (ii) multiple error patterns (coset) can be
pre-stored and retrieved in parallel.
1) General nodes: As shown in Fig. 2, we first obtain a set
of input vectors via multi-bit hard decision and bit flipping.
The flipped positions are chosen from the flipping set T , i.e.,
the T indices in αs,B with the smallest LLRs. Based on the
hard estimation β ls,B , we flip within T to generate 2
T input
vectors, denoted by β l,ts,B, and t ∈ {0 · · · 2
T − 1}.
For example, if the t-th flipping pattern is ei ⊕ ej ⊕ ek
(clearly {i, j, k} ∈ T ), then
β l,ts,B = β
l
s,B ⊕ ei ⊕ ej ⊕ ek. (8)
Given the flipping pattern, the syndrome-decoding-based
parallel path extension is illustrated in Fig. 2. The key steps,
e.g., syndrome calculation and error pattern retrieval, are
hardware-friendly binary operations and LUT.
Denote by GB , F
⊗ log
2
B the kernel of a general node and
its frozen set FB, the parity-check matrix HB is obtained by
extracting the columns with indices in FB from GB . Thus,
the syndrome of vector β l,ts,B contains B − KB bits and is
calculated by
dl,ts,B = HB × β
l,t
s,B. (9)
For each syndrome, its associated error patterns are com-
puted offline [20] and pre-stored by ascending weight order
in LUT. Since a low-weight error pattern is more likely than
a high-weight one, we only need to store a small number of
lowest-weight patterns to reduce memory.
There are 2B−KB different syndromes for a (B,KB) con-
stituent code block, where KB is the number of information
bits within the block. As a result, the size of a syndrome
table is (2B−KB)×Lsd, where Lsd is a constant number of
error patterns pre-stored for each syndrome.
For example, the syndrome table for a general node with
B = 8, KB = 6 and Lsd = 4 has size 4× 4 and is given in
Table II.
The error patterns retrieved from LUT are used to simul-
taneously generate a set of candidate sub-paths, denoted by{
βˆ
l,t,lsd
s,B
}
= β l,ts,B +
{
error patterns indexed by dl,ts,B
}
,
(10)
where t and lsd are the flipping pattern index and syndrome-
wise error pattern index, respectively.
For each list path, we have 2T × Lsd extended sub-paths.
The path metrics are updated according to (3) except that, the
T smallest LLRs are modified to a large value, i.e., αl,ts,j →
(−1)βˆ
l,t
s,j ×∞, ∀j ∈ T , where βˆl,ts,j is the j-th hard bit after
flipping. This procedure ensures at most one flip for each bit
position and therefore no duplicate paths will survive, which
is crucial to the overall performance.
Similar to R1/SPC nodes, the path extension and pruning is
performed only one time for each block to keep L surviving
paths, i.e., (L→ L× 2T × Lsd → L).
Remark 4: For small KB, an exhaustive-search-based path
extension is more convenient since it generates 2KB paths
[11]. For KB > T + log2 Lsd, it is more efficient to extend
paths by the proposed flip-syndrome method. Therefore,
we recommend to switch between exhaustive-search-based
and syndrome-based path extension depending on the con-
stituent code rate. As such, the maximum path extension is
min
(
2KB , 2T × Lsd
)
.
Remark 5: For a practical list size L = 8, we can set
B = 8, T ≤ 2, Lsd ≤ 4 for 8-bit parallel decoding, or B =
16, T ≤ 3, Lsd ≤ 8 for 16-bit parallel decoding to achieve
a good tradeoff between complexity and latency, yet with
negligible performance loss.
D. Latency analysis
The minimum number of cycles is analyzed with the
assumption that independent operations can be executed in
parallel. In reality, the latency will be different depending on
the number of processing elements available per implemen-
tation. However, the minimum cycle analysis represents the
number of logical steps and provides a hardware-independent
latency evaluation.
For an R1 node, the 13 error patterns in (4) are retrieved
from a pre-stored table, among which 8 are pre-selected
according to path metric. The 13 → 8 path sorting and
pruning logic is shown in Fig. 3. For simplicity, |αls,t| is
abbreviated by αt. All relevant LLR pairs are compared
in cycle 1. Among them, the first 3 pre-selected paths are
β ls,B,β
l
s,B ⊕ e0 and β
l
s,B ⊕ e1. The remaining paths are
sequentially selected according to the comparison results and
their preceding selection choices. Finally, the 8 candidate
paths are pre-selected and sorted by ascending order. The
process only requires 5 cycles.
Combining all sub-paths in an L = 8 list decoder, there
will be 8× 8 = 64 paths for another round of pruning. Since
the 8 sub-paths for each list are already ordered, the pruning
requires an additional 9 cycles to identify the 8 survival paths
[14]. The number of cycles are 14 and 15 for an R1 and SPC
node, respectively.
For comparison, fast-SSCL [9] requires 7 and 8 rounds of
path extension and pruning for a Rate-1 and an SPC node,
respectively. Each round takes a minimum of 7 cycles with
bitonic sort [14]. Overall, a minimum of 7 × 7 = 49 and
7× 8 = 56 cycles are required.
For general nodes, the proposed FSL decoder also has
lower latency since more bits are processed in parallel. The
6,
l
s
a ,
l
s
b
,0
,
ˆ l
s
b
,1
,
ˆ l
s
b
,3
,
ˆ l
s
b
,0,0
,
ˆ l
s
b
,0,1
,
ˆ l
s
b
,0,3
,
ˆ l
s
b
,3,0
,
ˆ l
s
b
,3,1
,
ˆ l
s
b
,3,3
,
ˆ l
s
b
,
,
l t
s
d syndrome
Syndrome table: offline calculated
Bits with T 
smallest LLRs
LLR vector for 
constituent block
Hard 
decision
Flipped vector 1
Flipped vector 2
Flipped vector 4
Error pattern 1,1
Error pattern 1,2
Error pattern 1,4
Error pattern 4,1
Error pattern 4,2
Error pattern 4,4
…
…
…
Sub-path 1
Sub-path 2
Sub-path M
Sub-path 3
Syndromes
Valid error patterns
(indicated by syndrome)
Low-weight 
error patterns
s
o
r
t
L survival 
paths
syndromes
Lsd error patterns
…
Error-pattern based parallel path extension:
0 1, ,e e 2
B K-
Fig. 2: Error pattern identification via syndrome decoding.
min(α0+α1,α2)
sel{α2,min(α0+α1,α3)}
sel{min(α0+α1,α4),min(α0+α2,α3)}
sel{min(α0+α1,α5),min(α0+α1,α4),min(α0+α1,α3)}
sel{min(α0+α1,α6),
min(α0+α2,α5)}
sel{min(α0+α3,α4),
min(α1+α2,α4),
min(α1+α2,α0+α3)}
min(α0+α1+α2,α3)
0, α0, α1
Survival path index Incremental path metrics and corresponding error patterns
0~2
3
4
5
6
7
Step 1: all comparisons
Step 2: select
Step 3: select
Step 4: select
Step 5: select
Fig. 3: Minimum cycle analysis for a rate-1 node.
TABLE III: Number of leaf nodes in an SC decoding tree
Decoder R0 Rep ML Gen SPC R1 Total
4b ML [11] 30 0 154 0 0 0 184
F-SSCL [9] 21 23 0 0 23 24 91
8b FSL 15 0 26 5 11 13 70
16b FSL 7 0 14 11 5 9 46
overall latency is influenced by two factors (i) the number
of leaf nodes in an SC decoding tree, (ii) the degree of
parallelism within a leaf node.
For a rough estimation, the number of leaf nodes of a
N = 1024,K = 512 Polar code is summarized in Table III.
The code is constructed by Polarization Weight (PW) [4]. For
all schemes, the frozen bits before the first information bit are
skipped. For R0/Rep/SPC/R1 nodes, the maximum length of
a parallel processing block is Bmax = 32. For general nodes,
the parallel processing length is 8-bit or 16-bit, denoted by 8b
and 16b FSL, respectively. As seen, 16b FSL only requires
to visit a half of nodes to traverse the SC decoding tree.
To determine real latency, we synthesized the proposed
decoders in TSMC 16nm CMOS with a frequency of 1GHz.
The maximum supported code length is Nmax = 16384, with
LLRs and path metrics quantized to 6 bits. The number of
processing elements is 128. The decoding latency of 4b multi-
bit [11], Fast-SSCL [9], 8b FSL and 16b FSL decoders is
measured at a code rate of 1/3. For N = 1024, the latency
is 1258ns, 1079ns, 870ns and 697ns, respectively. For N =
4096, the latency is 5134ns, 4239ns, 3640ns and 3003ns,
respectively. The latency reduction from [9], [11] is 35% ∼
TABLE IV: Comparison of decoding latency (ns)
N Rate 4b ML [11] Fast-SSCL [9] 8b FSL 16b FSL
1024
1/3 1258 1079 870 697
1/2 1577 1307 1016 776
4096
1/3 5134 4239 3640 3003
1/2 6585 4984 4399 3501
16384
1/3 21717 17230 15879 13461
1/2 27357 19839 18763 15305
Es/N0 (dB)
-2 -1.5 -1 -0.5 0
B
LE
R
10-4
10-3
10-2
10-1
100
Rate = 1/3
N=1024 SCL
N=1024 FSL
N=4096 SCL
N=4096 FSL
N=16384 SCL
N=16384 FSL
Es/N0 (dB)
0.5 1 1.5 2 2.5
B
LE
R
10-4
10-3
10-2
10-1
100
Rate = 1/2
N=1024 SCL
N=1024 FSL
N=4096 SCL
N=4096 FSL
N=16384 SCL
N=16384 FSL
Fig. 4: BLER comparison between SCL and FSL, both with
L = 8 and 16-bit CRC for final path selection.
45% and 29% ∼ 42%, respectively. As seen, even compared
with the most advanced SCL decoders [9], [11] in literature,
the proposed 8b and 16b FSL decoders can further reduce
latency. A detailed latency comparison is given in Table IV.
E. BLER performance
The BLER performance of an FSL decoder is simulated
and compared with its SCL decoder counterpart. For FSL, we
adopt 16-bit parallel processing with B = 16, T ≤ 3, Lsd ≤
8. We simulated a wide range of code rates and lengths, and
observe negligible performance loss. In the interest of space,
only code rates {1/2, 1/3} and lengths {1024, 4096, 16384}
are plotted in Fig. 4. Throughout the paper, 16 CRC bits are
appended to, but not included in, the K payload bits. The
code rate is calculated by K/N .
IV. IMPROVED CODE CONSTRUCTION
Based on the proposed FSL decoder, we propose two code
construction methods to further (i) reduce complexity and
7KB
0 5 10 15
Fr
eq
ue
nc
y
0
5
10
15
20
25
30
35
40
N=2048, K=1024, CRC16 (Original)
KB
0 5 10 15
0
5
10
15
20
25
30
35
40
N=2048, K=1024, CRC16 (Re-Adjusted)
Blocks with KB = 6,7,8
are eliminated
Fig. 5: Block rate distribution before and after information
bit re-adjustment.
(ii) improve performance. The first method re-adjusts the
information bit positions to avoid certain high-complexity
constituent code blocks. The second one replaces outer
constituent codes with optimized block codes to improve
BLER performance.
A. Adjusted Polar codes
The complexity of an FSL decoder mainly arises from the
size of syndrome tables. According to Section III-C, the size
of a syndrome table is (2B−KB)×Lsd for a constituent code
block with KB information bits and Lsd error patterns per
syndrome. According to Remark 4, a rate-dependent path
extension is adopted, where the maximum path extension
is min
(
2KB , 2T × Lsd
)
. In other words, high-complexity
operations are incurred by medium-rate blocks, while high-
rate or low-rate blocks can be processed with low complexity.
Thanks to the polarization effect, most blocks will diverge
to high or low rates as code length increases, which is helpful.
In the following, we show that, even for finite-length codes
with insufficient polarization, we can deliberately eliminate
some of the medium-rate blocks by re-adjusting their infor-
mation bit positions.
For example, a 16-bit parallel FSL decoder with B = 16,
T = 3 and Lsd = 8 is used to decode a N = 2048,K =
1024, CRC16 Polar code. The original block rate distribution
is shown on the left side of Fig. 5. As seen, many code blocks
have already polarized to either high rate or low rate. Among
the medium rate blocks, those with KB = 6 are responsible
for the majority decoding complexity (syndrome table size
1024). However, there are only 3 such blocks. On the right
side of Fig. 5, we eliminate blocks with KB = 6, 7 and 8 by
re-allocating their information bits to blocks with lower and
higher rates. Although the adjusted Polar codes deviate from
the actual polarization, which implies performance loss, they
demand much lower decoding complexity. In particular, the
largest syndrome table size reduces from 1024 to 128, with
the information re-adjustment in Fig. 5.
Algorithm 1 formalizes the above mentioned re-adjustment
procedures. The input parameters are K lowB and K
high
B
,
which indicate that rates between
Klow
B
B
and
K
high
B
B
are to be
eliminated. The algorithm first constructs an original Polar
codes and determine the number of information bits KB in
each constituent code block. If a block has K lowB < KB <
Khigh
B
, the algorithm either adds or removes information bits
within the block, until its rate RB satisfies RB ≥
K
high
B
B
or
RB ≤
Klow
B
B
. Once the rate of a block is adjusted, another
block has to change its rate accordingly to ensure that overall
code rate remains unchanged.
Algorithm 1 An information bit re-adjustment algorithm
Input: N,K,B, I,K lowB ,K
high
B
; Output: Iadj
1) Re-adjust to eliminate medium-rate block.
for each block with K lowB < KB < K
high
B
do
if KB −K
low
B < K
high
B
−KB then
Reduce KB to K
′
B = K
low
B
else
Increase KB to K
′
B = K
high
B
end if
end for
2) Balance overall rate when necessary.
for each block with K ′B = K
low
B (or K
′
B = K
high
B
) do
while Total # info. bits
∑
K ′B > K (or < K) do
Reduce K ′B to K
′
B = K
low
B − 1
(or Increase K ′B to K
′
B = K
high
B
+ 1)
end while
end for
3) Select information bits.
for each constituent code block do
Select K ′B most reliable bit positions to Iadj
end for
Fig. 6 shows the performance of N = 2048,K = 1024
Polar codes and Adjusted Polar codes under both SCL and
FSL decoders with L = 8. For Adjusted Polar codes, a 16b
FSL decoder (B = 16) is implemented, and blocks with
KB = 6, 7 and 8 are eliminated. The syndrome table size thus
reduces from 1024 to 128. The BLER loss due to information
bit re-adjustment is only 0.02dB at BLER 1%. The same
experiment is conducted for N = 8192,K = 4096, whereas
the performance loss becomes negligible as shown in Fig. 7.
This can be well explained: medium-rate blocks reduce as
polarization increases with code length, thus requiring less
re-adjustment and incurring less performance loss.
The proposed construction allows us to trade some per-
formance for significant complexity reduction, thus bears
practical importance.
B. Optimized outer codes
Observe that the proposed hard-input decoder for outer
block codes is no longer an SC decoder, but similar to
an ML decoder. However, the default Polar outer codes
have poor minimum distance and may not be suitable for
the proposed decoder. To obtain a better performance, a
straightforward idea is to adopt outer codes with optimized
distance spectrum.
8Es/N0 (dB)
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2
BL
ER
10-4
10-3
10-2
10-1
100
N=2048, K=1024 (CRC16), QPSK/AWGN
Adjusted Polar, FSL
Adjusted Polar, SCL
Polar, SCL
Fig. 6: Comparison between Polar codes under SCL decoder
(L = 8) and Adjusted-Polar codes under both SCL and FSL
decoders (L = 8, B = 16) with code length N = 2048.
Es/N0 (dB)
0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 1.7
BL
ER
10-4
10-3
10-2
10-1
100
N=8192, K=4096 (CRC16), QPSK/AWGN
Adjusted Polar, FSL
Adjusted Polar, SCL
Polar, SCL
Fig. 7: Comparison between Polar codes under SCL decoder
(L = 8) and Adjusted-Polar codes under both SCL and FSL
decoders (L = 8, B = 16) with code length N = 8192.
Note that the error-pattern-based decoders do not need to
change at all. As long as the generator/parity-check matrices
are defined, the outer decoders only need to update the error
patterns according to that specific code. That means any
linear block codes fit well into the FSL decoding framework,
offering full freedom to optimize the outer codes.
For B = 16, we present a specific outer code design for
each KB. For example, K = 2 simplex codes repeated to
length-16 have a minimum distance 10, which is larger than
8 of (16, 2) Polar codes. Following this idea, we individually
optimize each (B,KB) outer codes with respect to code
distance.
For KB = 2, 3, 4, repetition over simplex codes always
yields higher code distance than the corresponding Polar
codes. Their respective generator matrices GKB are
G2 =
[
S2 S2 S2 S2 S2
1
1
]
, S2 =
[
1 1 0
1 0 1
]
;
G0
G0
G16
G7
...
...
Hybrid outer 
coding
Inner 
polarization
1 0
1 1
n
G
Ä
é ù
= ê ú
ë û
Payload
Codeword
Fig. 8: A Hybrid-Polar encoding flow.
G3 =

S3 S3
1 1
1 1
1 0

 , S3 =

1 1 1 1 0 0 01 1 0 0 1 1 0
1 0 1 0 1 0 1

 ;
G4 =

S4
1
1
1
1

 , S4 =


1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 1 1 1 1 0 0 0
1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

 .
For KB = 6, 7, extended BCH (eBCH) codes also yields
better distance spectrum than the corresponding Polar codes.
Their respective generator matrices GKB are
G6 =


0 0 1 0 0 1 0 0 1 1 1 0 1 0 0 0
1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0
1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0
1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


;
G7 =


0 1 1 1 0 0 1 0 0 0 1 0 1 0 0 0
0 0 1 0 0 1 0 0 1 1 1 0 1 0 0 0
1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0
1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0
1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


.
For KB = 8, 9, the dual of eBCH codes are adopted; for
KB = 12, 13, 14, the dual of simplex codes are adopted. For
the remaining rates, the original Polar codes are adopted.
Depending on KB, the outer codes are combination of
different codes, or hybrid outer codes. The resulting concate-
nated codes are thus called hybrid-Polar codes. Note that the
lengths of the outer codes are not necessarily power of 2,
making the concatenated codes length compatible.
The encoding steps are shown in Fig. 8, and explained as
follows:
1) First, an original (N,K) Polar code is constructed, in
order to determine the rate of each (B,KB) outer code.
2) Second, each block is individually encoded, i.e., mul-
tiplying a length-KB information vector by the corre-
sponding generator matrix.
3) Thirdly, the outer code words are concatenated into a
long intermediate vector, upon which inner polarization
is performed to obtain a single code word.
9Es/N0
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
BL
ER
10-3
10-2
10-1
100
N=256, K=108 (CRC16), QPSK/AWGN
Polar, SCL
Hybrid-Polar, FSL
Fig. 9: Comparison between Polar codes under SCL decoder
(L = 8) and Hybrid-Polar codes under FSL decoder (L =
8, B = 16) with code length N = 256.
Es/N0 (dB)
1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2
BL
ER
10-4
10-3
10-2
10-1
100
N=1024, K=496 (CRC16), QPSK/AWGN
Polar,SCL
Hybrid-Polar, FSL
Fig. 10: Comparison between Polar codes under SCL decoder
(L = 8) and Hybrid-Polar codes under FSL decoder (L =
8, B = 16) with code length N = 1024.
The proposed outer codes have better distance spectrum
than Polar codes. The code weights {w} are enumerated in
Table V. The numbers of code words having a specific weight
are displayed, and those of minimum weight are highlighted
in boldface. As seen, the distance spectrum of the hybrid
codes improves upon Polar codes with the same KB in two
ways:
• The minimum distance increases, e.g., KB = 2, 6, 7;
• The minimum distance remains the same, but the num-
ber of minimum-weight codewords reduces, e.g., KB =
3, 4, 9, 10, 12, 13, 14.
Fig. 9 and Fig. 10 show the performance of N = 256 and
N = 1024 Polar codes, respectively, along with hybrid-Polar
codes of the same length and rate. As seen, a performance
gain between 0.1 ∼ 0.2 dB is demonstrated.
Since such BLER improvement comes with no additional
cost within the FSL decoder architecture, the Hybrid-Polar
codes is considered worthwhile in practical implementations.
V. CONCLUSIONS
In this work, we propose the hardware architecture of a
flip-syndrome-list decoder to reduce decoding latency with
improved parallelism. A limited number of error patterns
are pre-stored, and simultaneously retrieved for bit-flipping-
based path extension. For R1 and SPC nodes, only 13 error
patterns are pre-stored with no performance loss under list
size L = 8; for general nodes, we may further reduce latency
with a syndrome table to quickly identify a set of highly
likely error patterns. Based on the decoder, two code con-
struction optimizations are proposed to either further reduce
complexity or improve performance. The proposed decoder
architecture and code construction are designed particularly
for applications with low-latency requirements.
REFERENCES
[1] E. Arikan, “Channel polarization: A method for constructing capacity-
achieving codes for symmetric binary-input memoryless channels?”
IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, Jul. 2009.
[2] N. Stolte, “Recursive codes with the Plotkin-Construction and their
Decoding”, Ph.D. dissertation, University of Technology Darmstadt,
Germany.
[3] P. Trifonov, “Efficient design and decoding of polar codes”, IEEE
Transactions on Communications vol. 60, no. 11 pp. 3221–3227, Nov.
2012.
[4] G. He et al., “β-expansion A Theoretical Framework for Fast and
Recursive Construction of Polar Codes”, in Proc IEEE Globecom, Dec.
2017.
[5] K. Niu and K. Chen, ”CRC-aided decoding of polar codes”, IEEE
Communications Letters, vol. 16, no. 10 pp. 1668–1671, Oct. 2012.
[6] I. Tal, and A. Vardy, “List decoding of polar codes”, IEEE Trans. Inf.
Theory, vol. 61, no. 5 pp. 2213–2226, May 2015.
[7] A. Alamdar-Yazdi and F. Kschischang, “A simplified successive-
cancellation decoder for polar codes”, IEEE Communications Letters,
vol. 15, no. 12, pp. 1378–1380, Dec. 2011.
[8] S. Hashemi, C. Condo and W. Gross, “Simplified successive-
cancellation list decoding of Polar codes”, in Proc. IEEE ISIT, July
2016.
[9] S. Hashemi, C. Condo and W. Gross, “Fast and Flexible Successive-
Cancellation List Decoders for Polar Codes”, IEEE Transactions on
Signal Processing, vol. 65, no. 21, pp. 5756–5769, Nov. 2017.
[10] G. Sarkis, P. Giard, A. Vardy, C. Thibeault and W. Gross, “Fast polar
decoders: Algorithm and implementation”, IEEE Journal on Selected
Areas in Communications, vol. 32, no. 5, pp. 946–957, May 2014.
[11] B. Yuan and K. Parhi, “Low-latency successive-cancellation list de-
coders for polar codes with multibit decision”, IEEE Transactions on
Very Large Scale Integration (VLSI) Systems, vol. 23, no. 10, pp. 2268–
2280, Oct. 2015.
[12] B. Li, H. Shen and K. Chen, “A Decision-Aided Parallel SC-List
Decoder for Polar Codes”, arXiv preprint arXiv:1506.02955 (2015).
[13] Z. Zhang, L. Zhang, X. Wang, C. Zhong and H. Poor, “A split-reduced
successive cancellation list decoder for polar codes”, IEEE Journal on
Selected Areas in Communications, vol. 34, no. 2, pp. 292–302, Feb.
2016.
[14] J. Lin and Z. Yan, “Efficient list decoder architecture for Polar codes”,
in Proc IEEE ISCAS, June 2014.
[15] Y. Fan, J. Chen, C. Xia, C. Tsui, J. Jin, H. Shen and B. Li, “Low-
latency list decoding of Polar codes with double thresholding”, in Proc
IEEE ICASSP, April 2015.
[16] A. Balatsoukas-Stimming, M. Parizi and A. Burg, “LLR-based succes-
sive cancellation list decoding of polar codes”, in Proc IEEE ICASSP,
May 2014.
[17] G. Sarkis, P. Giard, A. Vardy, C. Thibeault and W. J. Gross, “Fast
List Decoders for Polar Codes”, in IEEE Journal on Selected Areas in
Communications, vol. 34, no. 2, pp. 318–328, Feb. 2016.
[18] “Chairman’s notes: RAN1”, 3GPP TSG RAN WG1 NR #86bis Meet-
ing, Lisbon, Portugal, 10th–14th Oct. 2016.
[19] X. Liu, Q. Zhang, P. Qiu, J. Tong, H. Zhang, C. Zhao and J. Wang, “A
5.16Gbps decoder ASIC for Polar Code in 16nm FinFET”, International
Symposium on Wireless Communication Systems, April 2018.
10
TABLE V: Comparison of distance spectrum between outer codes: Polar vs Hybrid
Code KB, {w} 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Polar 2 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 1
Simplex 2 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 0
Polar 3 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 1
Simplex 3 0 0 0 0 0 0 0 1 4 2 0 0 0 0 0 0
Polar 4 0 0 0 0 0 0 0 14 0 0 0 0 0 0 0 1
Simplex 4 0 0 0 0 0 0 0 7 8 0 0 0 0 0 0 0
Polar 5 0 0 0 0 0 0 0 30 0 0 0 0 0 0 0 1
Polar 6 0 0 0 4 0 0 0 54 0 0 0 4 0 0 0 1
eBCH 6 0 0 0 0 0 16 0 30 0 16 0 0 0 0 0 1
Polar 7 0 0 0 12 0 0 0 102 0 0 0 12 0 0 0 1
eBCH 7 0 0 0 0 0 48 0 30 0 48 0 0 0 0 0 1
Polar 8 0 0 0 28 0 0 0 198 0 0 0 28 0 0 0 1
Polar 9 0 0 0 44 0 64 0 294 0 64 0 44 0 0 0 1
Dual of eBCH 9 0 0 0 20 0 160 0 150 0 160 0 20 0 0 0 1
Polar 10 0 0 0 76 0 192 0 486 0 192 0 76 0 0 0 1
Dual of eBCH 10 0 0 0 60 0 256 0 390 0 256 0 60 0 0 0 1
Polar 11 0 0 0 140 0 448 0 870 0 448 0 140 0 0 0 1
Polar 12 0 8 0 252 0 952 0 1670 0 952 0 252 0 8 0 1
Dual of Simplex 12 0 1 42 133 252 469 750 835 680 483 294 119 28 7 2 0
Polar 13 0 24 0 476 0 1960 0 3270 0 1960 0 476 0 24 0 1
Dual of Simplex 13 0 11 82 233 516 1003 1470 1595 1400 1017 558 219 68 17 2 0
Polar 14 0 56 0 924 0 3976 0 6470 0 3976 0 924 0 56 0 1
Dual of Simplex 14 0 35 150 425 1100 2051 2810 3195 2920 1985 1066 475 140 25 6 0
Polar 15 0 120 0 1820 0 8008 0 12870 0 8008 0 1820 0 120 0 1
[20] W. Huffman and V. Pless. “Fundamentals of error-correcting codes”,
Cambridge university press, 2010.
