A 5.16Gbps decoder ASIC for Polar Code in 16nm FinFET
Polar codes have been selected for the 5G standard. However, only a few ASIC
decoders have been fabricated, and none of them supports list size L > 4 and
code length N > 1024. This paper presents an ASIC implementation of three
decoders for polar codes: a successive cancellation (SC) decoder, a flexible
decoder, and an ultra-reliable decoder. All three decoders are SC-based, supporting
list sizes up to 1, 8, and 32 and code lengths up to 2^15, 2^14, and 2^11, respectively. The
chip is fabricated in TSMC 16nm FinFET technology and can be clocked at 1
GHz. Optimization techniques are proposed and employed to increase throughput.
Experimental results show that the throughput reaches up to 5.16Gbps.
Compared with fabricated ASIC decoders and synthesized decoders in the literature,
the flexible decoder achieves higher area efficiency.
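All three decoders above are described as SC-based. As a reminder of what that core does, here is a minimal software sketch of recursive SC decoding with the min-sum check-node rule. It is a textbook-style illustration, not the architecture of the chip; the function names (`polar_encode`, `sc_decode`, `f_minsum`, `g_update`), the natural (non-bit-reversed) bit ordering, and the convention that a positive LLR favors bit 0 are all assumptions made for this example.

```python
import math

def polar_encode(u):
    """Compute x = u * F^(kron n) over GF(2), natural bit order (assumed convention)."""
    n = len(u)
    if n == 1:
        return list(u)
    half = n // 2
    left = polar_encode(u[:half])
    right = polar_encode(u[half:])
    return [left[i] ^ right[i] for i in range(half)] + right

def f_minsum(a, b):
    """Check-node update, min-sum approximation."""
    return math.copysign(1.0, a) * math.copysign(1.0, b) * min(abs(a), abs(b))

def g_update(a, b, bit):
    """Variable-node update once the partial-sum bit of the left branch is known."""
    return b + (1 - 2 * bit) * a

def sc_decode(llr, frozen):
    """Depth-first SC decoding.

    llr    : channel LLRs, length N = 2^n (positive favors bit 0)
    frozen : frozen[i] is True if u_i is fixed to 0
    Returns (u_hat, x_hat): decided input bits and their re-encoded codeword.
    """
    n = len(llr)
    if n == 1:
        bit = 0 if frozen[0] or llr[0] >= 0 else 1
        return [bit], [bit]
    half = n // 2
    # Left subtree first: it only needs the check-node LLRs.
    left_llr = [f_minsum(llr[i], llr[i + half]) for i in range(half)]
    u_left, x_left = sc_decode(left_llr, frozen[:half])
    # Right subtree depends on the left decisions; this is the serial bottleneck.
    right_llr = [g_update(llr[i], llr[i + half], x_left[i]) for i in range(half)]
    u_right, x_right = sc_decode(right_llr, frozen[half:])
    x_hat = [x_left[i] ^ x_right[i] for i in range(half)] + x_right
    return u_left + u_right, x_hat
```

The right subtree cannot start until the left one has finished, which is the serial dependency that hardware decoders try to hide with pipelining, unrolling, or list-based parallelism.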
Tb/s Polar Successive Cancellation Decoder 16nm ASIC Implementation
This work presents an efficient ASIC implementation of a successive
cancellation (SC) decoder for polar codes. SC is a low-complexity depth-first
search decoding algorithm, favorable for beyond-5G applications that require
extremely high throughput and low power. The ASIC implementation of SC in this
work exploits many techniques including pipelining and unrolling to achieve
Tb/s data throughput without compromising power and area metrics. To reduce the
complexity of the implementation, an adaptive log-likelihood ratio (LLR)
quantization scheme is used. This scheme optimizes bit precision of the
internal LLRs within the range of 1-5 bits by considering irregular
polarization and the entropy of the LLR distribution in the SC decoder. The performance
cost of this scheme is less than 0.2 dB when the code block length is 1024 bits
and the payload is 854 bits. Furthermore, some computations in SC take up a large
space with a high degree of parallelization, while others take more time steps.
To optimize these computations and reduce both memory and latency, a register
reduction/balancing (R-RB) method is used. The final decoder architecture is
called optimized polar SC (OPSC). The post-placement-and-routing results in 16nm
FinFET ASIC technology show that the OPSC decoder achieves 1.2 Tb/s coded
throughput on 0.79 mm^2 area with 0.95 pJ/bit energy efficiency.
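The headline figures in this abstract can be combined into a few derived metrics. The sketch below is plain arithmetic under stated assumptions: the area is taken to be in mm^2, the energy figure is taken to be per coded bit, and the 1024/854 block parameters (which the abstract quotes for the quantization loss) are assumed to apply to the throughput figure as well. The function name `derived_metrics` is made up for this example.

```python
def derived_metrics(coded_tput_bps, area_mm2, energy_j_per_bit, n_coded, k_payload):
    """Derive secondary figures from the quoted headline numbers."""
    rate = k_payload / n_coded
    return {
        "code_rate": rate,
        # Payload (information) throughput, if the coded figure applies to this rate.
        "info_tput_bps": coded_tput_bps * rate,
        # Area efficiency in Gb/s per mm^2 (area unit is an assumption).
        "area_eff_gbps_per_mm2": coded_tput_bps / area_mm2 / 1e9,
        # Power = energy per bit x bits per second (per coded bit is an assumption).
        "power_w": energy_j_per_bit * coded_tput_bps,
        # Codewords completed per second at this block length.
        "codewords_per_s": coded_tput_bps / n_coded,
    }

m = derived_metrics(1.2e12, 0.79, 0.95e-12, 1024, 854)
for name, value in m.items():
    print(f"{name}: {value:.6g}")
```

Under these assumptions the code rate is about 0.83, the power is about 1.14 W, and the area efficiency is roughly 1.5 Tb/s per mm^2. The codeword rate of about 1.17e9 per second would correspond to one codeword per cycle at a clock slightly above 1 GHz if the unrolled pipeline emits one codeword per cycle, which the abstract does not state.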
On the Construction of G_N-coset Codes for Parallel Decoding
In this paper, we propose a type of G_N-coset codes for a highly parallel
stage-permuted turbo-like decoder. The decoder exploits the equivalence between
two stage-permuted factor graphs of G_N-coset codes. Specifically, the inner
codes of a G_N-coset code consist of independent component codes, and thus are
decoded in parallel. The extrinsic information of the code bits is obtained and
iteratively exchanged between the two graphs until convergence. Accordingly, we
explore a heuristic and flexible code construction method (information set
selection) for various information lengths and coding rates. Simulations show
that the proposed G_N-coset codes can achieve a coding performance
comparable with polar codes but enjoy higher decoding parallelism.
Comment: 6 pages, 6 figures
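The "independent component codes" in this abstract follow from the Kronecker structure of the polar transform matrix (written G_N in the polar-coding literature). The sketch below illustrates only that structure, not the paper's decoder or its code construction: a length-(A*B) transform is computed as A independent length-B transforms on the rows of an A x B array, followed by B independent length-A transforms on its columns. The helper names (`kernel_power`, `vec_mat`, `two_stage_encode`) and the row-major layout are assumptions for this example.

```python
def kernel_power(n):
    """Return F^(kron n) over GF(2) as a list of rows, with F = [[1, 0], [1, 1]]."""
    g = [[1]]
    for _ in range(n):
        size = len(g)
        top = [row + [0] * size for row in g]   # [G 0]
        bottom = [row + row for row in g]       # [G G]
        g = top + bottom
    return g

def vec_mat(u, g):
    """Row vector times matrix over GF(2)."""
    return [sum(u[i] & g[i][j] for i in range(len(u))) % 2 for j in range(len(g[0]))]

def two_stage_encode(u, a_bits, b_bits):
    """Encode with F^(kron (a_bits + b_bits)) via two stages of short transforms."""
    a, b = 1 << a_bits, 1 << b_bits
    ga, gb = kernel_power(a_bits), kernel_power(b_bits)
    rows = [u[r * b:(r + 1) * b] for r in range(a)]
    # Stage 1: each row is an independent length-b transform.
    rows = [vec_mat(row, gb) for row in rows]
    # Stage 2: each column is an independent length-a transform.
    cols = [vec_mat([rows[r][c] for r in range(a)], ga) for c in range(b)]
    # Read the array back out in row-major order.
    return [cols[c][r] for r in range(a) for c in range(b)]

u = [1, 0, 1, 1, 0, 0, 1, 0]
print(two_stage_encode(u, 1, 2) == vec_mat(u, kernel_power(3)))
```

Because the two stages are matrix products on opposite sides of the array, they can be applied in either order. This is one way to see why two differently ordered graphs can describe the same code.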
Toward Terabits-per-second Communications: A High-Throughput Hardware Implementation of G_N-Coset Codes
Recently, a parallel decoding algorithm of G_N-coset codes was proposed. The
algorithm exploits two equivalent decoding graphs. For each graph, the inner
code part, which consists of independent component codes, is decoded in
parallel. The extrinsic information of the code bits is obtained and
iteratively exchanged between the graphs until convergence. This algorithm
enjoys a higher decoding parallelism than the previous successive cancellation
algorithms, due to the avoidance of serial outer code processing. In this work,
we present a hardware implementation of the parallel decoding algorithm, it can
support maximum . We complete the decoder's physical layout in TSMC
process and the size is . The decoder's area efficiency and power consumption are evaluated
for the cases of and . Scaled to
process, the decoder's throughput is higher than and
with five iterations.
Comment: 5 pages, 6 figures
An Asymmetric Adaptive SCL Decoder Hardware for Ultra-Low-Error-Rate Polar Codes
In theory, Polar codes do not exhibit an error floor under
successive-cancellation (SC) decoding. In practice, frame error rate (FER) down
to has not been reported with a real SC list (SCL) decoder hardware.
This paper presents an asymmetric adaptive SCL (A2SCL) decoder, implemented in
real hardware, for high-throughput and ultra-reliable communications. We
propose to concatenate multiple SC decoders with an SCL decoder, in which the
numbers of SC/SCL decoders are balanced with respect to their area and latency.
In addition, a novel unequal-quantization technique is adopted. The two
optimizations are crucial for improving SCL throughput within limited chip
area. As an application, we build a link-level FPGA emulation platform to
measure ultra-low FERs of 3GPP NR Polar codes (with parity-check and CRC bits).
It is flexible to support all list sizes up to , code lengths up to
and arbitrary code rates. With the proposed hardware, decoding speed is 7000
times faster than a CPU core. For the first time, FER as low as is
measured and the quantization effect is analyzed.
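The abstract does not spell out the control flow between the SC and SCL stages. A common adaptive scheme, sketched below as an assumption rather than as the A2SCL design, tries the cheap SC decoder first and passes a frame to the SCL decoder only when the CRC check fails. The function `adaptive_decode` and its callable parameters (`sc_decode`, `scl_decode`, `crc_ok`) are hypothetical placeholders to be supplied by the caller.

```python
def adaptive_decode(llrs, sc_decode, scl_decode, crc_ok, list_size):
    """Two-stage adaptive decoding (illustrative control flow only).

    sc_decode(llrs)             -> candidate bit sequence
    scl_decode(llrs, list_size) -> candidate bit sequence
    crc_ok(bits)                -> True if the candidate passes the CRC
    Returns (estimate, stage) with stage in {"SC", "SCL", "FAIL"}.
    """
    estimate = sc_decode(llrs)
    if crc_ok(estimate):
        # Most frames at a reasonable SNR are expected to stop here.
        return estimate, "SC"
    # Only frames that fail the CRC pay for the list decoder.
    estimate = scl_decode(llrs, list_size)
    if crc_ok(estimate):
        return estimate, "SCL"
    return estimate, "FAIL"
```

With a scheme like this, the SCL stage sees only the fraction of frames that SC fails on. That is one plausible reading of why the numbers of SC and SCL decoders would be balanced against each other.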
Toward Terabits-per-second Communications: Low-Complexity Parallel Decoding of G_N-Coset Codes
Recently, a parallel decoding framework of G_N-coset codes was proposed.
High throughput is achieved by decoding the independent component polar codes
in parallel. Various algorithms can be employed to decode these component
codes, enabling a flexible throughput-performance tradeoff. In this work, we
adopt SC as the component decoders to achieve the highest-throughput end of the
tradeoff. The benefits over soft-output component decoders are reduced
complexity and simpler (binary) interconnections among component decoders. To
reduce performance degradation, we integrate an error detector and a
log-likelihood ratio (LLR) generator into each component decoder. The LLR
generator, specifically the damping factors therein, is designed by a genetic
algorithm. This low-complexity design can achieve an area efficiency of
under 7nm technology.
Comment: 5 pages, 6 figures
A Flip-Syndrome-List Polar Decoder Architecture for Ultra-Low-Latency Communications
We consider practical hardware implementation of Polar decoders. To reduce
latency due to the serial nature of successive cancellation (SC), existing
optimizations improve parallelism with two approaches, i.e., multi-bit decision
or reduced path splitting. In this paper, we combine the two procedures into
one with an error-pattern-based architecture. It simultaneously generates a set
of candidate paths for multiple bits with pre-stored patterns. For rate-1 (R1)
or single parity-check (SPC) nodes, we prove that a small number of
deterministic patterns are required to guarantee performance preservation. For
general nodes, low-weight error patterns are indexed by syndrome in a look-up
table and retrieved in O(1) time. The proposed flip-syndrome-list (FSL) decoder
fully parallelizes all constituent code blocks without sacrificing performance,
thus is suitable for ultra-low-latency applications. Meanwhile, two code
construction optimizations are presented to further reduce complexity and
improve performance, respectively.
Comment: 10 pages, submitted to IEEE Access (Special Issue on Advances in
Channel Coding for 5G and Beyond)
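The syndrome-indexed look-up described for general nodes can be sketched in software as follows. This is a generic illustration for a small block code given by a parity-check matrix, not the FSL hardware: low-weight error patterns are enumerated once, stored by syndrome in a dictionary, and then retrieved with one hash lookup per received word. The names `syndrome`, `build_table`, and `candidates`, and the caps `max_weight` and `max_per_syndrome`, are assumptions for this example. A real decoder would also rank patterns by LLR reliability, which is omitted here.

```python
from itertools import combinations

def syndrome(h_rows, bits):
    """Syndrome of a bit vector with respect to parity-check rows, over GF(2)."""
    return tuple(sum(h & b for h, b in zip(row, bits)) % 2 for row in h_rows)

def build_table(h_rows, max_weight, max_per_syndrome):
    """Map each syndrome to a short list of low-weight error patterns (lowest weight first)."""
    n = len(h_rows[0])
    table = {}
    for weight in range(max_weight + 1):
        for positions in combinations(range(n), weight):
            pattern = [0] * n
            for p in positions:
                pattern[p] = 1
            bucket = table.setdefault(syndrome(h_rows, pattern), [])
            if len(bucket) < max_per_syndrome:
                bucket.append(tuple(pattern))
    return table

def candidates(h_rows, hard_bits, table):
    """Flip the pre-stored patterns for this syndrome to get candidate codewords."""
    patterns = table.get(syndrome(h_rows, hard_bits), [])
    return [tuple(b ^ e for b, e in zip(hard_bits, pat)) for pat in patterns]
```

Every returned candidate has a zero syndrome, because each stored pattern has the same syndrome as the received hard decisions and the two cancel when XORed.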