A multi-mode area-efficient SCL polar decoder
Polar codes are of great interest since they are the first provably
capacity-achieving forward error correction codes. To improve throughput and to
reduce decoding latency of polar decoders, maximum likelihood (ML) decoding
units are used by successive cancellation list (SCL) decoders as well as
successive cancellation (SC) decoders. This paper first proposes an approximate
ML (AML) decoding unit for SCL decoders. In particular, we investigate the
distribution of frozen bits of polar codes designed for both the binary erasure
and additive white Gaussian noise channels, and take advantage of the
distribution to reduce the complexity of the AML decoding unit, improving the
area efficiency of SCL decoders. Furthermore, a multi-mode SCL decoder with
variable list sizes and parallelism is proposed. If high throughput or small
latency is required, the decoder decodes multiple received codewords in
parallel with a small list size. However, if error performance is of higher
priority, the multi-mode decoder switches to a serial mode with a bigger list
size. Therefore, the multi-mode SCL decoder provides a flexible tradeoff
between latency, throughput and error performance, and adapts to different
throughput and latency requirements at the expense of small overhead. Hardware
implementation and synthesis results show that our polar decoders not only have
a better area efficiency but also easily adapt to different communication
channels and applications.
Comment: 13 pages, 9 figures, submitted to TVLS
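As a rough illustration of how a frozen-bit distribution arises for the binary
erasure channel, the sketch below applies the standard polarization recursion
for erasure probabilities (z becomes 2z - z^2 and z^2 at each level) and
freezes the least reliable positions. The function names and the example
parameters (N = 8, K = 4, erasure probability 0.5) are assumptions chosen for
illustration; this is not the paper's AML decoding unit.

```python
def bec_erasure_probs(levels: int, eps: float) -> list[float]:
    """Erasure probability of each of the 2**levels polarized bit-channels."""
    z = [eps]
    for _ in range(levels):
        nxt = []
        for zi in z:
            nxt.append(2 * zi - zi * zi)  # degraded ("minus") channel
            nxt.append(zi * zi)           # upgraded ("plus") channel
        z = nxt
    return z


def frozen_indices(levels: int, k: int, eps: float) -> list[int]:
    """Indices of the N - k least reliable bit-channels, in ascending order."""
    z = bec_erasure_probs(levels, eps)
    n = len(z)
    # Hypothetical selection rule: freeze the channels with the largest
    # erasure probability.
    worst_first = sorted(range(n), key=lambda i: z[i], reverse=True)
    return sorted(worst_first[: n - k])


if __name__ == "__main__":
    print(frozen_indices(levels=3, k=4, eps=0.5))
```

For these example parameters the frozen positions are 0, 1, 2 and 4, so they
are not spread uniformly over the codeword but concentrate at low indices.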
High Throughput Polar Decoding Using Two-Staged Adaptive Successive Cancellation List Decoding
Polar codes are the first class of capacity-achieving forward error
correction (FEC) codes. They have been selected as one of the coding schemes
for the 5G communication systems due to their excellent error correction
performance when successive cancellation list (SCL) decoding with cyclic
redundancy check (CRC) is used. A large list size is necessary for SCL decoding
to achieve a low error rate. However, a large list size keeps SCL decoding from
achieving a high throughput because of its very high computational complexity.
In this paper, we propose a two-staged adaptive SCL (TA-SCL)
decoding scheme and the corresponding hardware architecture to accelerate SCL
decoding with a large list size. Constant system latency and data rate are
supported by TA-SCL decoding. To analyse the decoding performance of TA-SCL, an
accurate mathematical model based on a Markov chain is derived, which can be used
to determine the parameters for practical designs. A VLSI architecture
implementing TA-SCL decoding is then proposed. The proposed architecture is
implemented using UMC 90nm technology. Experimental results show that TA-SCL
can achieve throughputs of 3.00 and 2.35 Gbps when the list sizes are 8 and 32,
respectively, which is nearly 3 times that of state-of-the-art SCL
decoding architectures, with negligible performance degradation over a wide
signal-to-noise ratio (SNR) range and small hardware overhead.
Comment: 12 pages, 12 figures, 7 tables, submitted to IEEE Transactions on
Circuits and Systems I: Regular Paper
A Complexity Reduction Method for Successive Cancellation List Decoding
This brief introduces a hardware complexity reduction method for successive
cancellation list (SCL) decoders. Specifically, we propose to use a sorting
scheme so that the L paths with the smallest path metrics are also sorted
according to their path indices for path pruning. We prove that such a sorting
scheme reduces the number of multiplexer inputs in any hardware implementation
of SCL decoding from L to (L/2+1) without any change in the decoding latency. We
also propose sorter architectures for the proposed sorting method. Field
programmable gate array (FPGA) implementations show that the proposed method
significantly reduces the hardware consumption of SCL decoder implementations,
especially for large list sizes and block lengths.
Comment: 6 pages, 3 figures, 6 table
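The pruning step described above can be sketched in software as follows. This
is a minimal illustration only, assuming each of the L paths forks into two
candidates and that a smaller path metric is better; the `Candidate` type and
`prune` function are hypothetical names, and the sketch does not model the
paper's sorter architectures or its multiplexer-input reduction.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    metric: float  # path metric; smaller is assumed to be better
    parent: int    # index of the path this candidate was forked from
    bit: int       # hypothesised value of the current bit


def prune(candidates: list[Candidate], list_size: int) -> list[Candidate]:
    """Keep the list_size best candidates, then order them by parent index."""
    best = sorted(candidates, key=lambda c: c.metric)[:list_size]
    return sorted(best, key=lambda c: (c.parent, c.bit))


if __name__ == "__main__":
    forks = [
        Candidate(0.5, 0, 0),
        Candidate(2.0, 0, 1),
        Candidate(1.0, 1, 0),
        Candidate(0.1, 1, 1),
    ]
    for survivor in prune(forks, list_size=2):
        print(survivor)
```

Ordering the survivors by parent index means each output slot can only be fed
from a restricted set of source paths, which is the intuition behind needing
fewer multiplexer inputs.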
Low-Latency Successive-Cancellation List Decoders for Polar Codes with Multi-bit Decision
Polar codes, as the first provably capacity-achieving error-correcting codes,
have received much attention in recent years. However, the decoding performance
of polar codes with the traditional successive-cancellation (SC) algorithm
cannot match that of low-density parity-check (LDPC) or turbo codes. Because
the SC list (SCL) decoding algorithm can significantly improve the
error-correcting performance of polar codes, the design of SCL decoders is
important for polar codes to be deployed in practical applications. However,
because prior latency reduction approaches for SC decoders are not applicable
to SCL decoders, these list decoders suffer from a long-latency bottleneck. In
this paper, we
propose a multi-bit-decision approach that can significantly reduce latency of
SCL decoders. First, we present a reformulated SCL algorithm that can perform
intermediate decoding of 2 bits together. The proposed approach, referred to as
the 2-bit reformulated SCL (2b-rSCL) algorithm, can reduce the latency of an SCL
decoder from (3n-2) to (2n-2) clock cycles without any performance loss. Then,
we extend the idea of 2-bit decision to the general case, and propose a general
decoding scheme that can perform intermediate decoding of any $2^K$ bits
simultaneously. This general approach, referred to as the $2^K$-bit reformulated
SCL ($2^K$b-rSCL) algorithm, can reduce the overall decoding latency to as short
as $n/2^{K-2}-2$ cycles. Furthermore, based on the proposed algorithms, VLSI
architectures for 2b-rSCL and 4b-rSCL decoders are synthesized. Compared with a
prior SCL decoder, the proposed (1024, 512) 2b-rSCL and 4b-rSCL decoders can
achieve 21% and 60% reduction in latency, 1.66 times and 2.77 times increase in
coded throughput with list size 2, and 2.11 times and 3.23 times increase in
coded throughput with list size 4, respectively.
Comment: submitted to IEEE TVLSI in Feb 2014, accepted in Sep. 201
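The cycle counts quoted in this abstract can be tabulated with a short sketch.
It assumes the general latency expression reads n/2^(K-2) - 2, with K the
base-2 logarithm of the number of bits decided together; that reading is
consistent with the (2n-2) figure for the 2-bit case (K = 1). The function
names are hypothetical, and cycle counts ignore the clock period, so they are
not directly comparable to the latency percentages reported from synthesis.

```python
def scl_cycles(n: int) -> int:
    """Cycle count quoted for the baseline SCL decoder: 3n - 2."""
    return 3 * n - 2


def rscl_cycles(n: int, k: int) -> int:
    """Cycle count for deciding 2**k bits together: n / 2**(k - 2) - 2.

    Computed as 4n / 2**k - 2 so that the arithmetic stays in integers.
    """
    if k < 1 or (4 * n) % (2 ** k) != 0:
        raise ValueError("k must be >= 1 and 2**k must divide 4n")
    return (4 * n) // (2 ** k) - 2


if __name__ == "__main__":
    n = 1024
    print("baseline SCL:", scl_cycles(n))
    for k in (1, 2):
        print(f"{2 ** k}b-rSCL:", rscl_cycles(n, k))
```

For n = 1024 this gives 3070 cycles for the baseline, 2046 for the 2-bit
variant and 1022 for the 4-bit variant.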
Symbol-Decision Successive Cancellation List Decoder for Polar Codes
Polar codes are of great interest because they provably achieve the capacity
of both discrete and continuous memoryless channels while having an explicit
construction. Most existing decoding algorithms of polar codes are based on
bit-wise hard or soft decisions. In this paper, we propose symbol-decision
successive cancellation (SC) and successive cancellation list (SCL) decoders
for polar codes, which use symbol-wise hard or soft decisions for higher
throughput or better error performance. First, we propose to use a recursive
channel combination to calculate symbol-wise channel transition probabilities,
which lead to symbol decisions. Our proposed recursive channel combination also
has a lower complexity than simply combining bit-wise channel transition
probabilities. The similarity between our proposed method and Arikan's channel
transformations also helps to share hardware resources between calculating bit-
and symbol-wise channel transition probabilities. Second, a two-stage list
pruning network is proposed to provide a trade-off between the error
performance and the complexity of the symbol-decision SCL decoder. Third, since
memory is a significant part of SCL decoders, we propose a pre-computation
memory-saving technique to reduce memory requirement of an SCL decoder.
Finally, to evaluate the throughput advantage of our symbol-decision decoders,
we design an architecture based on a semi-parallel successive cancellation list
decoder. In this architecture, different symbol sizes, sorting implementations,
and message scheduling schemes are considered. Our synthesis results show that
in terms of area efficiency, our symbol-decision SCL decoders outperform both
bit- and symbol-decision SCL decoders.
Comment: 13 pages, 17 figure
Fast and Flexible Successive-Cancellation List Decoders for Polar Codes
Polar codes have gained a significant amount of attention during the past few
years and have been selected as a coding scheme for the next-generation
mobile broadband standard. Among decoding schemes, successive-cancellation list
(SCL) decoding provides a reasonable trade-off between the error-correction
performance and hardware implementation complexity when used to decode polar
codes, at the cost of limited throughput. The simplified SCL (SSCL) and its
extension SSCL-SPC increase the speed of decoding by removing redundant
calculations when encountering particular information and frozen bit patterns
(rate one and single parity check codes), while keeping the error-correction
performance unaltered. In this paper, we improve SSCL and SSCL-SPC by proving
that the list size imposes a specific number of bit estimations required to
decode rate one and single parity check codes. Thus, the number of estimations
can be limited while guaranteeing exactly the same error-correction performance
as if all bits of the code were estimated. We call the new decoding algorithms
Fast-SSCL and Fast-SSCL-SPC. Moreover, we show that the number of bit
estimations in a practical application can be tuned to achieve desirable speed,
while keeping the error-correction performance almost unchanged. Hardware
architectures implementing both algorithms are then described and implemented:
it is shown that our design can achieve 1.86 Gb/s throughput, higher than the
best state-of-the-art decoders.
Comment: IEEE Transactions on Signal Processin
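The special patterns named above (rate one and single parity check) are
recognised from the frozen-bit mask of a sub-block. The sketch below is a
hypothetical classifier under the usual conventions that a rate-one node has no
frozen bits and a single-parity-check node has only its first bit frozen; the
all-frozen case is included for completeness. It does not implement the bound
on the number of bit estimations that Fast-SSCL and Fast-SSCL-SPC rely on.

```python
def classify_node(frozen: list[bool]) -> str:
    """Classify a sub-block by its frozen-bit mask (True means frozen)."""
    if all(frozen):
        return "rate-0"   # nothing to decode: every bit is frozen
    if not any(frozen):
        return "rate-1"   # every bit carries information
    if frozen[0] and not any(frozen[1:]):
        return "spc"      # single parity check: only the first bit is frozen
    return "other"        # would be split further by a real decoder


if __name__ == "__main__":
    masks = [
        [False, False, False, False],
        [True, False, False, False],
        [True, True, False, False],
    ]
    for mask in masks:
        print(mask, "->", classify_node(mask))
```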
On Error-Correction Performance and Implementation of Polar Code List Decoders for 5G
Polar codes are a class of capacity-achieving error-correcting codes that have
recently been selected for the next generation of wireless communication
standards (5G). Polar code decoding algorithms have evolved in various
directions, striking different balances between error-correction performance,
speed and complexity. Successive-cancellation list (SCL) and its incarnations
constitute a powerful, well-studied set of algorithms, in constant improvement.
At the same time, different implementation approaches provide a wide range of
area occupations and latency results. 5G puts a focus on improved
error-correction performance, high throughput and low power consumption: a
comprehensive study considering all these metrics is currently lacking in the
literature. In this work, we evaluate SCL-based decoding algorithms in terms of
error-correction performance and compare them to low-density parity-check
(LDPC) codes. Moreover, we consider various decoder implementations, for both
polar and LDPC codes, and compare their area occupation and power and energy
consumption when targeting short code lengths and rates. Our work shows that
among SCL-based decoders, the partitioned SCL (PSCL) provides the lowest area
occupation and power consumption, whereas fast simplified SCL (Fast-SSCL)
yields the lowest energy consumption. Compared to LDPC decoder architectures,
different SCL implementations occupy up to 17.1x less area, dissipate up to
7.35x less power, and consume up to 26x less energy.
Comment: Accepted in 55th Annual Allerton Conference on Communication,
Control, and Computin
Low-Latency SC Decoder Architectures for Polar Codes
Polar codes are becoming one of the most favorable capacity-achieving
error-correction codes because of their low encoding and decoding
complexity. However, due to the large code length required by practical
applications, the few existing successive cancellation (SC) decoder
implementations still suffer from not only high hardware cost but also
long decoding latency. This paper presents several novel approaches to designing
low-latency decoders for polar codes based on look-ahead techniques. Look-ahead
techniques can be employed to reschedule the decoding process of a polar decoder
in numerous ways. However, among those approaches, only well-arranged
ones can achieve good performance in terms of both latency and hardware
complexity. By revealing the recurrence property of the SC decoding chart, the
authors succeed in reducing the decoding latency by 50% with look-ahead
techniques. With the help of VLSI-DSP design techniques such as pipelining,
folding, unfolding, and parallel processing, methodologies for four different
polar decoder architectures are proposed to meet various application
demands. A sub-structure sharing scheme is adopted to design the merged
processing element (PE) for further hardware reduction. In addition, systematic
methods for constructing the refined pipelining decoder (2nd design) and the
input generating circuits (ICG) block are given. Detailed gate-level analysis
demonstrates that the proposed designs show latency advantages over
conventional ones with similar hardware cost.
TC: Throughput Centric Successive Cancellation Decoder Hardware Implementation for Polar Codes
This paper presents a hardware architecture for the fast simplified successive
cancellation (fast-SSC) algorithm for polar codes, which significantly reduces
the decoding latency and dramatically increases the throughput.
Algorithmically, the fast-SSC algorithm suffers from the fact that its decoder
scheduling and the consequent architecture depend on the code rate; this is a
challenge for rate-compatible systems. However, by exploiting the
homogeneity between the decoding processes of fast constituent polar codes
and regular polar codes, the presented design is compatible with any rate. The
scheduling plan and the specially designed processing core are also described.
Results show that, compared with the state-of-the-art decoder, the proposed
design can achieve at least 60% latency reduction for codes with length
N = 1024. Using the Nangate FreePDK 45nm process, the proposed design can reach
throughputs of up to 5.81 Gbps and 2.01 Gbps for (1024, 870) and (1024, 512)
polar codes, respectively.
Comment: submitted to ICASSP 201
Fast Low-Complexity Decoders for Low-Rate Polar Codes
Polar codes are capacity-achieving error-correcting codes with an explicit
construction that can be decoded with low-complexity algorithms. In this work,
we show how the state-of-the-art low-complexity decoding algorithm can be
improved to better accommodate low-rate codes. More constituent codes are
recognized in the updated algorithm and dedicated hardware is added to
efficiently decode these new constituent codes. We also alter the polar code
construction to further decrease the latency and increase the throughput with
little to no noticeable effect on error-correction performance. Rate-flexible
decoders for polar codes of length 1024 and 2048 are implemented on FPGA.
Compared with previous work, they are shown to have 22% to 28% lower latency and 26%
to 34% greater throughput when decoding low-rate codes. On 65 nm ASIC CMOS
technology, the proposed decoder for a (1024, 512) polar code is shown to
compare favorably against the state-of-the-art ASIC decoders. With a clock
frequency of 400 MHz and a supply voltage of 0.8 V, it has a latency of 0.41
μs and an area efficiency of 1.8 Gbps/mm² for an energy efficiency of 77
pJ/info. bit. At 600 MHz with a supply of 1 V, the latency is reduced to 0.27
μs and the area efficiency increased to 2.7 Gbps/mm² at 115 pJ/info.
bit.
Comment: 8 pages, 10 figures, submitted to Springer J. Signal Process. Sys