LLR-based Successive Cancellation List Decoding of Polar Codes by Balatsoukas-Stimming, Alexios et al.
LLR-BASED SUCCESSIVE CANCELLATION LIST DECODING OF POLAR CODES
Alexios Balatsoukas-Stimming, Mani Bastani Parizi, Andreas Burg
École Polytechnique Fédérale de Lausanne, CH-1015 Ecublens, Switzerland
ABSTRACT
We present an LLR-based implementation of the successive cancellation list (SCL) decoder. To this end, we associate each decoding path with a metric which (i) is a monotone function of the path's likelihood and (ii) can be computed efficiently from the channel LLRs. The LLR-based formulation leads to a more efficient hardware implementation of the decoder compared to the known log-likelihood based implementation. Synthesis results for an SCL decoder with block length N = 1024 and list sizes L = 2 and L = 4 confirm that the LLR-based decoder has considerable area and operating frequency advantages, on the order of 50% and 30%, respectively.
Index Terms— Successive Cancellation List Decoder, Polar
Codes, Hardware Implementation
1. INTRODUCTION
Polar codes are a class of capacity-achieving error correcting codes with low-complexity encoding and decoding algorithms [1]. Specifically, the successive cancellation (SC) decoder has a structured nature which makes its hardware implementation attractive [2, 3, 4]. Moreover, the SC decoder can be implemented in a numerically accurate and stable way by representing the involved transition probabilities as log-likelihood ratios (LLRs).
Successive cancellation list (SCL) decoding was introduced in [5] to improve the finite block-length performance of polar codes. However, the original description of the SCL decoder is given in terms of the path likelihoods, which are not suitable for practical implementation. Hence, the first hardware architecture for SCL decoding [6] uses log-likelihoods (LLs) to partially overcome the numerical stability issues. Unfortunately, the decoder still requires very large and irregular memory elements, as well as processing elements that support large bit-widths, which induce a high cost in terms of hardware resources.
Contribution and Paper Outline: In this paper, we propose an LLR-based formulation of the SCL decoder and show that such a formulation can significantly improve the hardware architecture of [6] by solving all the aforementioned implementation problems. In Section 2, after a brief review of the SCL decoding algorithm, we introduce a path-metric which is iteratively updated as a function of the LLR of the bit being decoded given the past trajectory of the decoding path and its (tentative) value. We then prove that this metric is a monotone function of the path's likelihood, which yields the LLR-based formulation of the SCL decoder. In Section 3, we provide a short review of the SCL decoder hardware architecture of [6] and compare the synthesis results for the LL- and LLR-based decoders.
Notation: Throughout this paper the boldface letters denote vectors. The elements of a vector $\mathbf{x}$ are denoted as $x_i$. By $\mathbf{x}^m$ we mean the sub-vector $[x_0, x_1, \dots, x_m]^T$ if $m \ge 0$ and the null vector otherwise. If $\mathcal{I} = \{i_1, i_2, \dots\}$ is a set of indices (such that $i_1 < i_2 < \dots$), then $\mathbf{x}_{\mathcal{I}}$ denotes the sub-vector $[x_{i_1}, x_{i_2}, \dots]^T$.
2. SC LIST DECODING OF POLAR CODES
A polar code of rate $R < 1$ and block length $N = 2^n$ is constructed by 'appropriately' choosing a subset $\mathcal{A} \subset \{0, 1, \dots, N-1\}$ of cardinality $|\mathcal{A}| = NR$ called the information indices. The transmitter then constructs the vector $\mathbf{u} \in \{0, 1\}^N$ by putting the $NR$ data bits on $\mathbf{u}_{\mathcal{A}}$ and fixing $\mathbf{u}_{\mathcal{F}}$, where $\mathcal{F} \triangleq \{0, 1, \dots, N-1\} \setminus \mathcal{A}$, to known-to-receiver frozen bits. Subsequently, a codeword $\mathbf{x} = \mathbf{G}\mathbf{u}$ is computed and sent over the channel.¹
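The encoding step can be illustrated with a short sketch. It assumes that $\mathbf{G}$ is the $n$-fold Kronecker power of Arıkan's 2×2 kernel applied in natural (non-bit-reversed) order, which the text above does not spell out; the function name `polar_encode` and the N = 8 frozen set are hypothetical and chosen only for illustration.

```python
def polar_encode(u):
    """Apply the polar transform to a length-2^n bit list using XOR butterflies."""
    x = list(u)
    n_bits = len(x)
    assert n_bits > 0 and n_bits & (n_bits - 1) == 0, "length must be a power of two"
    half = 1
    while half < n_bits:
        # Merge neighbouring blocks of size `half` into blocks of size 2 * half.
        for start in range(0, n_bits, 2 * half):
            for j in range(start, start + half):
                x[j] ^= x[j + half]
        half *= 2
    return x


# Hypothetical N = 8 example: frozen positions carry zeros, the rest carry data.
frozen = {0, 1, 2, 4}
data = [1, 0, 1, 1]
info = [i for i in range(8) if i not in frozen]
u = [0] * 8
for idx, bit in zip(info, data):
    u[idx] = bit
x = polar_encode(u)
```

Each of the $\log_2 N$ stages performs $N/2$ XORs, which is one way to see the $O(N \log N)$ cost. Under the assumed form of $\mathbf{G}$ the transform is its own inverse over GF(2), so applying `polar_encode` twice returns the input.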
The receiver observes a noisy version of $\mathbf{x}$ denoted as $\mathbf{y}$ and has to decode $\mathbf{x}$ or, equivalently, $\mathbf{u}$. Since the sub-vector $\mathbf{u}_{\mathcal{F}}$ is already known, the decoder's task reduces to estimating $\mathbf{u}_{\mathcal{A}}$. To this end, Arıkan proposes the SC decoding procedure summarized in Algorithm 1. In line 5 of Algorithm 1, $W_n^{(i)}(\mathbf{y}, \mathbf{u}^{i-1} | u_i)$ represents the likelihood of $u_i$ given the channel output $\mathbf{y}$ and $\mathbf{u}^{i-1}$ considering $u_{i+1}, u_{i+2}, \dots, u_{N-1}$ as unknown bits.
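Algorithm 1: Successive Cancellation Decoding [1].
1  for $i = 0, 1, \dots, N-1$ do
2      if $i \notin \mathcal{A}$ then
3          $\hat{u}_i \leftarrow u_i$;    // Frozen Bits
4      else
5          $\hat{u}_i \leftarrow \arg\max_{u_i \in \{0,1\}} W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1} | u_i)$;
6  return $\hat{\mathbf{u}}_{\mathcal{A}}$;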
SC decoding is sub-optimal since at each step $i \in \mathcal{A}$ the decoder ignores the information it possesses about the future frozen bits $\{u_j : j > i, j \in \mathcal{F}\}$. In return for this sub-optimality, the likelihoods $W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1} | u_i)$ can be computed efficiently and the decoding complexity scales like $O(N \log N)$. Furthermore, Arıkan shows that, as long as the code rate is below the channel capacity, there exists $\mathcal{A} \subset \{0, 1, \dots, N-1\}$ for which the block-error probability of this scheme vanishes as $N$ increases [1, 7].
2.1. SC List Decoding
Unfortunately, the sub-optimality of SC decoding is still significant at the small-to-moderate block lengths used in practice. A successive cancellation list (SCL) decoder has been proposed in [5] to partially compensate for this sub-optimality.
In short, SCL decoding is carried out as follows: at each decoding step $i \in \mathcal{A}$, instead of fixing a decision on $u_i$ (line 5 of Algorithm 1), two decoding paths corresponding to either possible value of $u_i$ are created and decoding is continued in two parallel decoding threads. In order to avoid the exponential growth of the number of decoding paths, at each step only the $L$ most likely paths are retained. Finally, the decoder will end up with a list of $L$ candidates for $\mathbf{u}_{\mathcal{A}}$ out of which the most likely one is declared as the final estimate, $\hat{\mathbf{u}}_{\mathcal{A}}$. This procedure is described in Algorithm 2.

¹ The structure of $\mathbf{G}$ permits this computation to be done using $O(N \log N)$ binary additions.
Algorithm 2: SC List Decoding [5].
1   for $i = 0, 1, \dots, N-1$ do
2       if $i \notin \mathcal{A}$ then
3           $\hat{u}_i[\ell] \leftarrow u_i$, $\forall \ell \in \{0, 1, \dots, L-1\}$;
4       else
5           if fewer than $L$ paths are active then
6               Duplicate all the paths and continue with both possible values of $u_i$;
7           else
8               Sort the likelihoods $\{ W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1}[\ell] | u_i) : \forall \ell \in \{0, 1, \dots, L-1\}, \forall u_i \in \{0, 1\} \}$;
9               Continue along the $L$ most likely paths;
10  $\ell^* \leftarrow$ the index of the most likely path;
11  return $\hat{\mathbf{u}}_{\mathcal{A}}[\ell^*]$;
Simulation results show that, with relatively small list sizes, the SCL decoder's performance is very close to the optimal ML decoder. More importantly, with a clever choice of the data structures, SCL decoding can be done in $O(LN \log N)$ time complexity [5].
2.2. LLR-based Path Metric Computation
Algorithms 1 and 2 are both valid high-level descriptions of SC and
SCL decoding, respectively. However, implementing the decoders
using the likelihoods directly is risky as likelihoods become very
small numbers and the decoder will be prone to underflow errors.
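The underflow risk is easy to reproduce in double-precision arithmetic. The sketch below multiplies N = 1024 per-symbol likelihoods and, in parallel, accumulates their logarithms; the per-symbol value 0.1 is a hypothetical number chosen only to make the effect visible.

```python
import math

N = 1024           # block length used in the synthesis results of this paper
per_symbol = 0.1   # hypothetical likelihood contributed by each channel use

likelihood = 1.0
log_likelihood = 0.0
for _ in range(N):
    likelihood *= per_symbol                 # direct product of likelihoods
    log_likelihood += math.log(per_symbol)   # the same quantity in the log domain

print(likelihood)       # the product has underflowed to 0.0
print(log_likelihood)   # the log-domain value stays finite
```

Once two competing paths have both underflowed to zero they can no longer be ranked, whereas their log-domain values remain comparable.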
A practical SC decoder, therefore, receives the channel output in the form of LLRs ($\ln \frac{W(y_i|0)}{W(y_i|1)}$, $i \in \{0, 1, \dots, N-1\}$, where $W(y|x)$, $x \in \{0, 1\}$, is the channel transition probability) and computes the decision LLRs,
$$L_n^{(i)} \triangleq \ln \frac{W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1} | 0)}{W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1} | 1)},$$
which are sufficient statistics for decisions in line 5 of Algorithm 1. Furthermore, the computations are numerically stable and the decoding involves $O(N \log N)$ arithmetic operations in total [1, Section VIII].
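A recursive LLR-based SC decoder can be sketched in a few lines. The update rules are not written out in this paper, so the sketch assumes the standard ones: a min-sum approximation for the first branch (`f_minsum`) and the partial-sum-dependent addition for the second (`g`). It further assumes natural-order indexing and all-zero frozen bits. All function names are hypothetical.

```python
def f_minsum(a, b):
    """Min-sum approximation of the LLR of the XOR of two bits with LLRs a and b."""
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * min(abs(a), abs(b))


def g(a, b, s):
    """LLR of the second branch once the partial sum s in {0, 1} is known."""
    return b + (1 - 2 * s) * a


def sc_decode(llr, frozen):
    """Return (u_hat, x_hat) for channel LLRs ln W(y|0)/W(y|1).

    frozen[i] is True when u_i is a frozen bit (assumed to be 0 here).
    """
    n = len(llr)
    if n == 1:
        # Hard decision on the sign of the LLR; a tie is resolved to 0.
        u = 0 if frozen[0] or llr[0] >= 0 else 1
        return [u], [u]
    h = n // 2
    upper = [f_minsum(a, b) for a, b in zip(llr[:h], llr[h:])]
    u_a, x_a = sc_decode(upper, frozen[:h])
    lower = [g(a, b, s) for a, b, s in zip(llr[:h], llr[h:], x_a)]
    u_b, x_b = sc_decode(lower, frozen[h:])
    # Re-encode the two halves so the caller can use them as partial sums.
    return u_a + u_b, [p ^ q for p, q in zip(x_a, x_b)] + x_b


# N = 4 toy example with a single information bit at index 3.
u_hat, x_hat = sc_decode([-4.0, -4.0, 1.0, -4.0], [True, True, True, False])
```

In the toy example the third channel LLR has the wrong sign but low reliability, and the decoder still returns the all-ones codeword generated by $u_3 = 1$.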
For the SCL decoder, however, the LLRs seem to be insufficient for choosing the $L$ most likely paths in line 8 of Algorithm 2. In [5] the decoder, therefore, computes a scaled version of the pair of likelihoods $W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1}[\ell] | u_i)$, $u_i \in \{0, 1\}$, assuming the channel output is provided in the form of pairs of likelihoods ($W(y_i|x_i)$, $x_i \in \{0, 1\}$, $i \in \{0, 1, \dots, N-1\}$). In order to avoid the underflows, all likelihoods are scaled by a common factor at each intermediate step of the computations. This normalization step is circumvented in the hardware implementation of [6] by performing the computations in the log-likelihood domain.
Luckily, it turns out that the decoding paths can also be ordered
according to their likelihoods using only the decision LLRs and the
past trajectory of each path as we shall demonstrate in the following.
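Theorem 1. For each path $\ell \in \{0, 1, \dots, L-1\}$ and each step $i \in \{0, 1, \dots, N-1\}$ let the path-metric be defined as:
$$\mathrm{PM}_\ell^{(i)} \triangleq \sum_{j=0}^{i} \ln\left(1 + e^{-(1 - 2\hat{u}_j[\ell]) \cdot L_n^{(j)}[\ell]}\right), \tag{1}$$
where $L_n^{(i)}[\ell] = \ln \frac{W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1}[\ell] | 0)}{W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1}[\ell] | 1)}$.
If all the information bits are uniformly distributed in $\{0, 1\}$, for any pair of paths $\ell, \ell' \in \{0, 1, \dots, L-1\}$,
$$W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1}[\ell] | \hat{u}_i[\ell]) < W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1}[\ell'] | \hat{u}_i[\ell'])$$
if and only if
$$\mathrm{PM}_\ell^{(i)} > \mathrm{PM}_{\ell'}^{(i)}.$$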
In view of Theorem 1, one can implement the SCL decoder using $L$ parallel low-complexity and stable LLR-based SC decoders as the underlying building blocks and, in addition, keep track of $L$ path-metrics. The metrics can be updated iteratively as the decoder proceeds according to
$$\mathrm{PM}_\ell^{(i)} = \mathrm{PM}_\ell^{(i-1)} + \ln\left(1 + e^{-(1 - 2\hat{u}_i[\ell]) L_n^{(i)}[\ell]}\right). \tag{2}$$
Any comparison of the likelihoods of the paths can be done equivalently using the values of the path-metrics.
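To make the procedure concrete, the following sketch runs Algorithm 2 with the path-metric of (2) in place of the likelihoods. It is deliberately naive: each decision LLR is recomputed from scratch for every path, so it does not achieve the $O(LN \log N)$ complexity of [5], and it is not the hardware architecture discussed in Section 3. It assumes natural-order indexing, min-sum LLR updates, and all-zero frozen bits; all function names are hypothetical.

```python
import math


def polar_encode(u):
    """Natural-order polar transform (assumed form of G)."""
    x = list(u)
    half = 1
    while half < len(x):
        for start in range(0, len(x), 2 * half):
            for j in range(start, start + half):
                x[j] ^= x[j + half]
        half *= 2
    return x


def f_minsum(a, b):
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * min(abs(a), abs(b))


def decision_llr(llr, u_prev):
    """Decision LLR of bit i = len(u_prev) given the path's past bits u_prev."""
    n, i = len(llr), len(u_prev)
    if n == 1:
        return llr[0]
    h = n // 2
    if i < h:
        upper = [f_minsum(a, b) for a, b in zip(llr[:h], llr[h:])]
        return decision_llr(upper, u_prev)
    s = polar_encode(u_prev[:h])   # partial sums of the completed first half
    lower = [b + (1 - 2 * p) * a for a, b, p in zip(llr[:h], llr[h:], s)]
    return decision_llr(lower, u_prev[h:])


def metric_increment(llr, bit):
    """ln(1 + exp(-(1 - 2*bit) * llr)) from (2), evaluated without overflow."""
    z = -(1 - 2 * bit) * llr
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def scl_decode(llr, frozen, list_size):
    """Return the bit vector of the surviving path with the smallest metric."""
    paths = [([], 0.0)]   # (decided bits, path metric)
    for i in range(len(llr)):
        candidates = []
        for bits, pm in paths:
            l_i = decision_llr(llr, bits)
            # Frozen bits extend the path with 0 only, but still update the metric.
            for b in ((0,) if frozen[i] else (0, 1)):
                candidates.append((bits + [b], pm + metric_increment(l_i, b)))
        candidates.sort(key=lambda c: c[1])   # smaller metric = more likely path
        paths = candidates[:list_size]
    return paths[0][0]


u_hat = scl_decode([-4.0, -4.0, 1.0, -4.0], [True, True, True, False], list_size=2)
```

Note that the metric is updated at frozen indices as well, since the sum in (1) runs over all $j \le i$. With `list_size=1` the sketch reduces to plain SC decoding.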
Recall that the SC decoder's decisions (in line 5 of an LLR-based implementation of Algorithm 1) would be $\hat{u}_i = \delta(L_n^{(i)})$ where $\delta(x) = \frac{1}{2}(1 - \mathrm{sign}(x))$. Moreover, (2) is well-approximated as
$$\mathrm{PM}_\ell^{(i)} \approx \begin{cases} \mathrm{PM}_\ell^{(i-1)} & \text{if } \hat{u}_i[\ell] = \delta(L_n^{(i)}[\ell]), \\ \mathrm{PM}_\ell^{(i-1)} + |L_n^{(i)}[\ell]| & \text{otherwise.} \end{cases} \tag{3}$$
Hence, our metric has a natural interpretation: If at step $i$, the $\ell$-th path does not follow the direction $\delta(L_n^{(i)}[\ell])$, it will be penalized by an amount of $\approx |L_n^{(i)}[\ell]|$ (which is the reliability of $L_n^{(i)}[\ell]$).
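The quality of the approximation can be checked numerically. The sketch below implements the exact update (2) and the approximate update (3) side by side; the function names are hypothetical, and a zero LLR is resolved to the decision 0 as a convention of this sketch.

```python
import math


def pm_update_exact(pm_prev, llr, bit):
    """Update rule (2): add ln(1 + exp(-(1 - 2*bit) * llr))."""
    z = -(1 - 2 * bit) * llr
    return pm_prev + max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def pm_update_approx(pm_prev, llr, bit):
    """Approximation (3): add |llr| only if the bit disagrees with the LLR sign."""
    hard_decision = 0 if llr >= 0 else 1
    return pm_prev if bit == hard_decision else pm_prev + abs(llr)


for llr in (5.0, -0.3):
    for bit in (0, 1):
        exact = pm_update_exact(0.0, llr, bit)
        approx = pm_update_approx(0.0, llr, bit)
        print(f"llr={llr:+.1f} bit={bit} exact={exact:.4f} approx={approx:.4f}")
```

In both branches the two rules differ by $\ln(1 + e^{-|L_n^{(i)}[\ell]|})$, which is at most $\ln 2$ and decays quickly as the LLR becomes more reliable.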
We devote the rest of this section to proving Theorem 1.
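Lemma 2. If $U_i$ is uniformly distributed in $\{0, 1\}$, then,
$$\frac{W_n^{(i)}(\mathbf{y}, \mathbf{u}^{i-1} | u_i)}{\mathbb{P}[\mathbf{U}^i = \mathbf{u}^i | \mathbf{Y} = \mathbf{y}]} = 2\,\mathbb{P}[\mathbf{Y} = \mathbf{y}].$$
Proof. Since $\mathbb{P}[U_i = u_i] = \frac{1}{2}$ for all $u_i \in \{0, 1\}$,
$$\frac{W_n^{(i)}(\mathbf{y}, \mathbf{u}^{i-1} | u_i)}{\mathbb{P}[\mathbf{U}^i = \mathbf{u}^i | \mathbf{Y} = \mathbf{y}]} = \frac{\mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^i = \mathbf{u}^i]}{\mathbb{P}[U_i = u_i]\,\mathbb{P}[\mathbf{U}^i = \mathbf{u}^i | \mathbf{Y} = \mathbf{y}]} = \frac{\mathbb{P}[\mathbf{Y} = \mathbf{y}]\,\mathbb{P}[\mathbf{U}^i = \mathbf{u}^i | \mathbf{Y} = \mathbf{y}]}{\mathbb{P}[U_i = u_i]\,\mathbb{P}[\mathbf{U}^i = \mathbf{u}^i | \mathbf{Y} = \mathbf{y}]} = 2\,\mathbb{P}[\mathbf{Y} = \mathbf{y}].$$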
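Proof of Theorem 1. We show that
$$\mathrm{PM}_\ell^{(i)} = -\ln\left(\mathbb{P}[\mathbf{U}^i = \hat{\mathbf{u}}^i[\ell] | \mathbf{Y} = \mathbf{y}]\right). \tag{4}$$
Having shown (4), Theorem 1 will follow as an immediate corollary to Lemma 2 (since the channel output $\mathbf{y}$ is fixed for all decoding paths). Since the path index $\ell$ is fixed on both sides of (1) we will drop it in the sequel. Let $\mu(u) \triangleq 1 - 2u$ and
$$\Lambda_n^{(i)} \triangleq \frac{W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1} | 0)}{W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1} | 1)} = \frac{\mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^{i-1} = \hat{\mathbf{u}}^{i-1}, U_i = 0]}{\mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^{i-1} = \hat{\mathbf{u}}^{i-1}, U_i = 1]}$$
(the last equality follows since $\mathbb{P}[U_i = 0] = \mathbb{P}[U_i = 1]$), and observe that showing (4) is equivalent to proving
$$\mathbb{P}[\mathbf{U}^i = \hat{\mathbf{u}}^i | \mathbf{Y} = \mathbf{y}] = \prod_{j=0}^{i} \left(1 + (\Lambda_n^{(j)})^{-\mu(\hat{u}_j)}\right)^{-1}. \tag{5}$$
Now we have
$$\mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^{i-1} = \hat{\mathbf{u}}^{i-1}] = \sum_{u_i \in \{0,1\}} \mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^{i-1} = \hat{\mathbf{u}}^{i-1}, U_i = u_i] = \mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^i = \hat{\mathbf{u}}^i] \left(1 + (\Lambda_n^{(i)})^{-\mu(\hat{u}_i)}\right).$$
Therefore,
$$\mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^i = \hat{\mathbf{u}}^i] = \left(1 + (\Lambda_n^{(i)})^{-\mu(\hat{u}_i)}\right)^{-1} \mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^{i-1} = \hat{\mathbf{u}}^{i-1}]. \tag{6}$$
Repeated application of (6) (for $i-1, i-2, \dots, 0$) yields
$$\mathbb{P}[\mathbf{Y} = \mathbf{y}, \mathbf{U}^i = \hat{\mathbf{u}}^i] = \prod_{j=0}^{i} \left(1 + (\Lambda_n^{(j)})^{-\mu(\hat{u}_j)}\right)^{-1} \mathbb{P}[\mathbf{Y} = \mathbf{y}].$$
Dividing both sides by $\mathbb{P}[\mathbf{Y} = \mathbf{y}]$ proves (5), and taking the negative logarithm of (5) with $\Lambda_n^{(j)} = e^{L_n^{(j)}}$ gives (4).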
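Identity (4) can be checked by brute force on a toy code. The sketch below compares the path-metric of (1), computed from exact decision LLRs, with $-\ln \mathbb{P}[\mathbf{U}^i = \hat{\mathbf{u}}^i | \mathbf{Y} = \mathbf{y}]$ obtained by enumerating all $2^N$ inputs under a uniform prior. It assumes a natural-order polar transform and the standard exact LLR recursion, neither of which is spelled out in this paper; the function names and the channel LLR values are hypothetical.

```python
import itertools
import math


def polar_encode(u):
    """Natural-order polar transform (assumed form of G)."""
    x = list(u)
    half = 1
    while half < len(x):
        for start in range(0, len(x), 2 * half):
            for j in range(start, start + half):
                x[j] ^= x[j + half]
        half *= 2
    return x


def f_exact(a, b):
    """Exact LLR of the XOR of two bits with LLRs a and b."""
    return 2.0 * math.atanh(math.tanh(a / 2.0) * math.tanh(b / 2.0))


def decision_llr(llr, u_prev):
    """Exact decision LLR of bit i = len(u_prev) given the past bits u_prev."""
    n, i = len(llr), len(u_prev)
    if n == 1:
        return llr[0]
    h = n // 2
    if i < h:
        return decision_llr([f_exact(a, b) for a, b in zip(llr[:h], llr[h:])], u_prev)
    s = polar_encode(u_prev[:h])
    lower = [b + (1 - 2 * p) * a for a, b, p in zip(llr[:h], llr[h:], s)]
    return decision_llr(lower, u_prev[h:])


def path_metric(llr, bits):
    """Definition (1) evaluated along the path `bits`."""
    return sum(
        math.log1p(math.exp(-(1 - 2 * b) * decision_llr(llr, bits[:j])))
        for j, b in enumerate(bits)
    )


def prefix_posterior(llr, bits):
    """P[U^i = bits | Y = y] by enumerating all inputs under a uniform prior."""
    def weight(u):
        # Product of W(y_k | x_k), each normalised so that both values sum to one.
        return math.prod(
            1.0 / (1.0 + math.exp(-(1 - 2 * xk) * lk))
            for xk, lk in zip(polar_encode(u), llr)
        )
    total = match = 0.0
    for u in itertools.product((0, 1), repeat=len(llr)):
        w = weight(u)
        total += w
        if list(u[: len(bits)]) == list(bits):
            match += w
    return match / total


channel_llrs = [1.2, -0.7, 0.4, 2.1]   # arbitrary example values
path = [0, 1, 1, 0]
for i in range(1, len(path) + 1):
    lhs = path_metric(channel_llrs, path[:i])
    rhs = -math.log(prefix_posterior(channel_llrs, path[:i]))
    print(i, lhs, rhs)
```

The two printed columns agree up to floating-point rounding for every prefix length, which is what (4) asserts.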
3. SCL DECODER HARDWARE ARCHITECTURE
In the SCL decoder (SCLD) hardware architecture of [6] the SC decoder computations are implemented using pairs of log-likelihoods (LLs), $\ln W_n^{(i)}(\mathbf{y}, \mathbf{u}^{i-1} | u)$, $u \in \{0, 1\}$. LLs provide some numerical stability and reduce the dynamic range of the involved quantities so that a fixed-point implementation is feasible, although a large number of quantization bits is still required for good performance. In this section, we present area and timing results for an LLR-based implementation of [6] in order to highlight the significant area savings and throughput gains that can be achieved by exploiting Theorem 1.
3.1. LL-based SCLD Hardware Architecture
The LL-based SCLD hardware architecture presented in [6] is mainly comprised of three components, namely the metric computation unit (MCU), the path selection component, and the state memories component. The MCU consists of $L$ arrays of processing elements (PEs), which implement $L$ parallel LL-based SC decoders. The state memories store the LLs, the paths $\hat{\mathbf{u}}[\ell]$, as well as the partial sums required by the SC decoders. The path selection component is responsible for sorting the $2L$ LL values, $\{ \ln W_n^{(i)}(\mathbf{y}, \hat{\mathbf{u}}^{i-1}[\ell] | u_i) : \forall \ell \in \{0, 1, \dots, L-1\}, \forall u_i \in \{0, 1\} \}$, and choosing the $L$ most likely paths to follow in each step.
For a low-complexity hardware implementation, the LLs used by the SC decoders have to be quantized. Specifically, in [6] the channel LLs are quantized using an unsigned fixed-point representation with $Q_i^{\mathrm{LL}}$ integer and $Q_f^{\mathrm{LL}}$ fractional bits. The LL-based SC update rules involve additions of LLs [6], which all have the same sign. Thus, in order to prevent catastrophic overflows, at each SC decoding stage the number of integer bits is increased by one. Therefore, the LLs at each stage $s = 0, \dots, n$, must be represented using $Q_i^{\mathrm{LL}} + s$ integer bits. This necessity leads to large and very irregular LL memories, which are not well-suited for hardware implementation. Moreover, the PEs and the metric sorter need to support computations with the maximum bit-width, i.e., $Q^{\mathrm{LL}} = Q_i^{\mathrm{LL}} + Q_f^{\mathrm{LL}} + n$ bits. It has been shown in [6] that in total
$$B_{\mathrm{tot}}^{\mathrm{LL}} = (2L + 2) N Q^{\mathrm{LL}} + 2L(2N - n - Q^{\mathrm{LL}} - 2) \tag{7}$$
bits are required for storing the LLs in this scheme.
3.2. LLR-based SCLD Hardware Architecture
For the LLR-based implementation, the MCUs are modified to implement the Min-Sum LLR-based SC update rules [3, Section VI]. Moreover, the path selection unit implements the approximated path-metric update-rule of (3). Since the rule is iterative, the path selection unit contains a memory with $L$ storage locations to store $\mathrm{PM}_\ell^{(i)}$, $\ell \in \{0, 1, \dots, L-1\}$. The state memory component is modified to support the following quantization scheme of the LLRs.
LLRs are quantized using a signed fixed-point representation with $Q_i^{\mathrm{LLR}}$ integer and $Q_f^{\mathrm{LLR}}$ fractional bits. The total number of bits per LLR is $Q^{\mathrm{LLR}} = Q_i^{\mathrm{LLR}} + Q_f^{\mathrm{LLR}} + 1$. Since the LLRs are signed quantities and the Min-Sum update rules for SC decoding involve both additions and subtractions, the dynamic range of the LLRs used in LLR-based SCLD is generally smaller than that of the LLs used in the LL-based SCLD. Thus, one intuitively expects overflows to happen less frequently and that there is no need to increase the word size by one bit per decoding stage. This intuition is confirmed by our simulation results. This leads to an LLR memory with a fixed word size, which is more suitable for hardware implementation than the irregular memory required by the LL-based decoder. We can guarantee that there will be no overflows in the path metrics by using an unsigned representation with $Q_i^{\mathrm{LLR}} + n$ integer and $Q_f^{\mathrm{LLR}}$ fractional bits, yielding a total of $Q^{M} = Q_i^{\mathrm{LLR}} + Q_f^{\mathrm{LLR}} + n$ bits per path-metric. It turns out that, in practice, far fewer bits are sufficient. Using an approach identical to that of [6, Section IV.C] one can verify that a total of $(N + (N-1)L) Q^{\mathrm{LLR}}$ bits will be needed to store the LLRs. Adding the $L Q^{M}$ bits for storing the path-metrics, we see that this implementation will require
$$B_{\mathrm{tot}}^{\mathrm{LLR}} = (N + (N-1)L) Q^{\mathrm{LLR}} + L Q^{M} \tag{8}$$
bits for the storage of LLRs and path metrics in total.

[Figure 1: FER versus $E_b/N_0$ (dB), from 0 to 4 dB. Curves: SC decoder (approximated with $Q_i^{\mathrm{LLR}} = 5$, $Q_f^{\mathrm{LLR}} = 0$, and exact floating-point); for each of $L = 2$ and $L = 4$: LLR-based approximated ($Q_i^{\mathrm{LLR}} = 5$, $Q_f^{\mathrm{LLR}} = 0$), LL-based approximated ($Q_i^{\mathrm{LL}} = 4$, $Q_f^{\mathrm{LL}} = 0$, [6]), and LLR-based exact floating-point.]
Fig. 1. The performance of floating point LLR-based vs. fixed-point LLR-based and fixed-point LL-based SCL decoders (and that of SC decoder for comparison).
3.3. Implementation Results
In Fig. 1 the frame error rate (FER) of an LLR-based floating-point implementation of an SCL decoder with exact SC decoding and metric update rules is compared to that of a fixed-point implementation of an SCL decoder with approximated SC and metric update rules for an $N = 1024$ polar code of rate 1/2 over a BAWGN channel.²

The fixed-point LLR-based decoder uses $Q_i^{\mathrm{LLR}} = 5$ and $Q_f^{\mathrm{LLR}} = 0$ for quantizing the LLRs. Although 15 bits for quantizing each path-metric $\mathrm{PM}_\ell^{(i)}$ are required to guarantee no overflows, $Q^{M} = 8$ bits are sufficient; the simulated performance for $Q^{M} = 8$ and $Q^{M} = 15$ is the same (while setting $Q^{M} = 7$ degrades the performance). We observe that, using the aforementioned parameters, the performance loss of the fixed-point implementation with approximated update rules is minimal with respect to that of the floating-point implementation with exact update rules. Moreover, the FER of this fixed-point implementation can be seen to be exactly equal to that of an LL-based fixed-point implementation of SCL decoding in [6] using $Q_i^{\mathrm{LL}} = 4$ and $Q_f^{\mathrm{LL}} = 0$. Thus, the following comparison between the two decoders is fair in terms of the FER.

Using these parameters, the PEs in the LL-based decoder have to support bit-widths of up to $Q_i^{\mathrm{LL}} + Q_f^{\mathrm{LL}} + 10 = 14$ bits, whereas for the LLR-based decoder supporting $Q_i^{\mathrm{LLR}} + Q_f^{\mathrm{LLR}} + 1 = 6$ bits suffices. Moreover, the sorting metric in the former implementation is 14 bits wide, while in the latter it is only $Q^{M} = 8$ bits wide. We have compared the memory requirements of the two implementations (Equations (7) and (8)) as a function of the list size $L$ in Table 1. We observe that the LLR-based representation is advantageous in terms of the memory requirements, in particular as the list size increases.³

² The code is optimized for $E_b/N_0 = 2$ dB and constructed using the Monte-Carlo method of [1, Section IX].

L     $B_{\mathrm{tot}}^{\mathrm{LL}}$ (7)     $B_{\mathrm{tot}}^{\mathrm{LLR}}$ (8)     Reduction
2     32,736 bits      22,532 bits      31.2 %
4     57,280 bits      38,920 bits      32.1 %
8     106,368 bits     71,696 bits      32.6 %
16    204,544 bits     137,248 bits     32.9 %
Table 1. Memory requirement for LL- and LLR-based decoder for $N = 1024$, $Q^{\mathrm{LL}} = 4$, $Q^{\mathrm{LLR}} = 6$ and $Q^{M} = 8$.

Cell Area         LLR-based     LL-based      Reduction
List Size L = 1 ([3, Table IV], scaled to 90 nm)
Total             0.592 mm²     n/a           n/a
Memory            0.554 mm²     n/a           n/a
MCU               0.034 mm²     n/a           n/a
Other             0.004 mm²     n/a           n/a
List Size L = 2
Total             0.977 mm²     1.668 mm²     41.4%
Memory            0.749 mm²     1.126 mm²     33.4%
MCU               0.205 mm²     0.504 mm²     59.3%
Path Selection    0.001 mm²     0.002 mm²     50.0%
Other             0.022 mm²     0.036 mm²     38.8%
List Size L = 4
Total             1.743 mm²     3.708 mm²     53.0%
Memory            1.303 mm²     2.348 mm²     44.5%
MCU               0.348 mm²     0.984 mm²     64.6%
Path Selection    0.019 mm²     0.030 mm²     36.7%
Other             0.073 mm²     0.346 mm²     78.9%
Table 2. Cell area breakdown of the LL- and LLR-based SCL decoders for an $N = 1024$ polar code and that of the SC decoder implementation of [3] for comparison.
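The bit-widths quoted above follow directly from the quantization rules of Sections 3.1 and 3.2. The short sketch below reproduces that arithmetic for $n = 10$; the variable names are hypothetical, and the two relative reductions at the end are simply derived from the quoted widths.

```python
n = 10                          # log2 of the block length N = 1024
q_ll_int, q_ll_frac = 4, 0      # LL quantization of [6] used in the comparison
q_llr_int, q_llr_frac = 5, 0    # LLR quantization used in the comparison

# LL-based decoder: one extra integer bit per SC decoding stage s = 0, ..., n.
ll_stage_widths = [q_ll_int + q_ll_frac + s for s in range(n + 1)]
q_ll_max = ll_stage_widths[-1]                 # width the PEs and sorter must support

# LLR-based decoder: fixed word size with one sign bit.
q_llr = q_llr_int + q_llr_frac + 1
q_m_guaranteed = q_llr_int + q_llr_frac + n    # overflow-free path-metric width
q_m_used = 8                                   # width found sufficient in simulation

pe_width_reduction = 1 - q_llr / q_ll_max
metric_width_reduction = 1 - q_m_used / q_ll_max

print(ll_stage_widths)
print(q_ll_max, q_llr, q_m_guaranteed)
print(f"{pe_width_reduction:.1%}", f"{metric_width_reduction:.1%}")
```

The growing list of per-stage widths illustrates why the LL memories are irregular, while the LLR memories use the same word size at every stage.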
In Table 2 we compare synthesis area results for LL- and LLR-based SCL decoders with $L = 2$ and $L = 4$. All designs were synthesized using the same UMC 90 nm library in the typical corner. We observe that by using LLRs the total area is reduced by 41% and 53% for an SCL decoder with $L = 2$ and $L = 4$, respectively. In absolute numbers, the largest gain in both cases comes from the reduced size of the state memories, which are the largest components of both the LL- and the LLR-based decoders. In relative numbers the transition to LLRs is most beneficial for the MCU, which is reduced in size by approximately 60% in both cases. This number is in line with the approximately 60% reduction in bit-width of the quantities involved in the computations (i.e., from 14 bits to 6 bits). The path selection component benefits from the 40% bit-width reduction of the path metrics with an average area reduction of the same order.

³ Our simulation results confirm that the same quantization scheme leads to a fair comparison for $L = 8$ and $L = 16$ as well.

The corresponding post-synthesis timing results are presented in Table 3. We observe that for $L = 2$, the LLR-based decoder can achieve a 31% higher clock frequency than the corresponding LL-based decoder. This significant improvement comes not only from the bit-width reduction of the MCUs, but also from the greatly reduced size of the LLR storage memory. In the $L = 4$ LLR-based decoder the signal path with the highest delay in the hardware goes through the metric sorter contained in the path selection component and not through the MCU. The 40% bit-width reduction of the metrics helps the comparators used in the radix-$2L$ sorter [6], but it does not improve the logic gate tree that follows these comparators and combines their results. Thus, the increase in clock frequency is only 7%.

                     LLR-based     LL-based      Speedup
List Size L = 1 ([3, Table IV])
Clock Frequency      500 MHz       n/a           n/a
Coded Throughput     246 Mbps      n/a           n/a
List Size L = 2
Clock Frequency      558 MHz       427 MHz       30.7%
Coded Throughput     220 Mbps      168 Mbps      30.7%
List Size L = 4
Clock Frequency      412 MHz       386 MHz       6.7%
Coded Throughput     162 Mbps      152 Mbps      6.7%
Table 3. Operating frequency results for the LL- and LLR-based SCL decoders for an $N = 1024$ polar code and that of the SC decoder implementation of [3] for comparison.
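The headline percentages can be recomputed from the tabulated synthesis figures. The sketch below does so for the total cell area (Table 2) and the clock frequency (Table 3); the helper names are hypothetical and the numbers are copied from the tables.

```python
def relative_gain(new, old):
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100.0


def relative_reduction(new, old):
    """Percentage by which `new` is smaller than `old`."""
    return (old - new) / old * 100.0


# (LLR-based, LL-based) pairs, keyed by list size L.
clock_mhz = {2: (558, 427), 4: (412, 386)}                 # Table 3
total_area_mm2 = {2: (0.977, 1.668), 4: (1.743, 3.708)}    # Table 2

for list_size in (2, 4):
    f_llr, f_ll = clock_mhz[list_size]
    a_llr, a_ll = total_area_mm2[list_size]
    print(
        f"L={list_size}: clock +{relative_gain(f_llr, f_ll):.1f}%, "
        f"area -{relative_reduction(a_llr, a_ll):.1f}%"
    )
```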
Remark. The transition to LLRs can significantly reduce the area
and increase the operating frequency of an implementation of SCL
decoding. To be fair, we mention one disadvantage of the iterative
metric update, namely that the simplified SC decoding proposed in
[8] can no longer be taken full advantage of, since the LLRs for all
bits are required to keep the metric updated. It can still be applied to
the first group of consecutive frozen bits, which is usually large.
4. CONCLUSION
In this paper, we derived an LLR-based implementation of the successive cancellation list decoder using a path metric based on which the paths can be ranked according to their likelihoods. This metric can also be used for any other tree-search decoding algorithm that compares the paths according to their likelihoods, such as SC stack decoding [9]. Moreover, we demonstrated the advantages of our implementation by comparing synthesis results for an LL- and an LLR-based hardware successive cancellation list decoder architecture. Specifically, the decoder area was reduced by up to 53% and the clock frequency was increased by up to 31%.
In addition to the gains in the hardware cost, most processing blocks in practical receivers, such as channel equalizers and demodulators, process data in the form of LLRs. Hence, the presented implementation of the decoder can readily be integrated in existing systems, while the LL-based decoder would require an extra preprocessing stage to convert the channel LLRs to LLs.
5. REFERENCES
[1] E. Arıkan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
[2] C. Leroux, I. Tal, A. Vardy, and W. Gross, “Hardware architectures for successive cancellation decoding of polar codes,” in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, pp. 1665–1668.
[3] C. Leroux, A. Raymond, G. Sarkis, and W. Gross, “A semi-parallel successive-cancellation decoder for polar codes,” IEEE Transactions on Signal Processing, vol. 61, no. 2, pp. 289–299, January 2013.
[4] C. Zhang and K. Parhi, “Low-latency sequential and overlapped architectures for successive cancellation polar decoder,” IEEE Transactions on Signal Processing, vol. 61, no. 10, pp. 2429–2441, March 2013.
[5] I. Tal and A. Vardy, “List decoding of polar codes,” in Proceedings of IEEE International Symposium on Information Theory (ISIT), 2011, pp. 1–5.
[6] A. Balatsoukas-Stimming, A. Raymond, W. Gross, and A. Burg, “Tree search architecture for list successive cancellation decoding of polar codes,” IEEE Transactions on Circuits and Systems II: Express Briefs (submitted), 2013. [Online]. Available: http://arxiv.org/abs/1303.7127
[7] E. Arıkan and E. Telatar, “On the rate of channel polarization,” in Proceedings of IEEE International Symposium on Information Theory (ISIT), Jul. 2009, pp. 1493–1495.
[8] A. Alamdar-Yazdi and F. Kschischang, “A simplified successive-cancellation decoder for polar codes,” IEEE Communications Letters, vol. 15, no. 12, pp. 1378–1380, October 2011.
[9] K. Niu and K. Chen, “Stack decoding of polar codes,” Electronics Letters, vol. 48, no. 12, pp. 695–697, June 2012.
