24 research outputs found
A High-Performance and Low-Complexity 5G LDPC Decoder: Algorithm and Implementation
5G New Radio (NR) has stringent demands on both performance and complexity
for the design of low-density parity-check (LDPC) decoding algorithms and
corresponding VLSI implementations. Furthermore, decoders must fully support
the wide range of all 5G NR blocklengths and code rates, which is a significant
challenge. In this paper, we present a high-performance and low-complexity LDPC
decoder, tailor-made to fulfill the 5G requirements. First, to close the gap
between belief propagation (BP) decoding and its approximations in hardware, we
propose an extension of adjusted min-sum decoding, called generalized adjusted
min-sum (GA-MS) decoding. This decoding algorithm flexibly truncates the
incoming messages at the check node level and carefully approximates the
non-linear functions of BP decoding to balance the error-rate and hardware
complexity. Numerical results demonstrate that the proposed fixed-point GAMS
has only a minor gap of 0.1 dB compared to floating-point BP under various
scenarios of 5G standard specifications. Secondly, we present a fully
reconfigurable 5G NR LDPC decoder implementation based on GA-MS decoding. Given
that memory occupies a substantial portion of the decoder area, we adopt
multiple data compression and approximation techniques to reduce 42.2% of the
memory overhead. The corresponding 28nm FD-SOI ASIC decoder has a core area of
1.823 mm2 and operates at 895 MHz. It is compatible with all 5G NR LDPC codes
and achieves a peak throughput of 24.42 Gbps and a maximum area efficiency of
13.40 Gbps/mm2 at 4 decoding iterations.Comment: 14 pages, 14 figure
Design of hardware accelerators for demanding applications.
This paper focuses on mastering the architecture development of hardware accelerators. It presents the results of our analysis of the main issues that have to be addressed when designing accelerators for modern demanding applications, when using as an example the accelerator design for LDPC decoding for the newest demanding communication system standards. Based on the results of our analysis, we formulate the main requirements that have to be satisfied by an adequate accelerator design methodology, and propose a design approach which satisfies these requirements
A survey of FPGA-based LDPC decoders
Low-Density Parity Check (LDPC) error correction decoders have become popular in communications systems, as a benefit of their strong error correction performance and their suitability to parallel hardware implementation. A great deal of research effort has been invested into LDPC decoder designs that exploit the flexibility, the high processing speed and the parallelism of Field-Programmable Gate Array (FPGA) devices. FPGAs are ideal for design prototyping and for the manufacturing of small-production-run devices, where their in-system programmability makes them far more cost-effective than Application-Specific Integrated Circuits (ASICs). However, the FPGA-based LDPC decoder designs published in the open literature vary greatly in terms of design choices and performance criteria, making them a challenge to compare. This paper explores the key factors involved in FPGA-based LDPC decoder design and presents an extensive review of the current literature. In-depth comparisons are drawn amongst 140 published designs (both academic and industrial) and the associated performance trade-offs are characterised, discussed and illustrated. Seven key performance characteristics are described, namely their processing throughput, latency, hardware resource requirements, error correction capability, processing energy efficiency, bandwidth efficiency and flexibility. We offer recommendations that will facilitate fairer comparisons of future designs, as well as opportunities for improving the design of FPGA-based LDPC decoder
Towards Quantum Belief Propagation for LDPC Decoding in Wireless Networks
We present Quantum Belief Propagation (QBP), a Quantum Annealing (QA) based
decoder design for Low Density Parity Check (LDPC) error control codes, which
have found many useful applications in Wi-Fi, satellite communications, mobile
cellular systems, and data storage systems. QBP reduces the LDPC decoding to a
discrete optimization problem, then embeds that reduced design onto quantum
annealing hardware. QBP's embedding design can support LDPC codes of block
length up to 420 bits on real state-of-the-art QA hardware with 2,048 qubits.
We evaluate performance on real quantum annealer hardware, performing
sensitivity analyses on a variety of parameter settings. Our design achieves a
bit error rate of in 20 s and a 1,500 byte frame error rate of
in 50 s at SNR 9 dB over a Gaussian noise wireless channel.
Further experiments measure performance over real-world wireless channels,
requiring 30 s to achieve a 1,500 byte 99.99 frame delivery rate at
SNR 15-20 dB. QBP achieves a performance improvement over an FPGA based soft
belief propagation LDPC decoder, by reaching a bit error rate of and
a frame error rate of at an SNR 2.5--3.5 dB lower. In terms of
limitations, QBP currently cannot realize practical protocol-sized
( Wi-Fi, WiMax) LDPC codes on current QA processors. Our
further studies in this work present future cost, throughput, and QA hardware
trend considerations
Research on energy-efficient VLSI decoder for LDPC code
制度:新 ; 報告番号:甲3742号 ; 学位の種類:博士(工学) ; 授与年月日:2012/9/15 ; 早大学位記番号:新6113Waseda Universit
Ultra-low power LDPC decoder design with high parallelism for wireless communication system
制度:新 ; 報告番号:甲3423号 ; 学位の種類:博士(工学) ; 授与年月日:2011/9/15 ; 早大学位記番号:新574
Design of High Throughput Reconfigurable LDPC CODEC
Channel coding is an essential part of communication systems, which significantly reduces the error rate of receiving messages. Nowadays, iterative decoding methods play an important role in wireless communication such as 5G, Wi-Fi etc. Low-Density Parity-Check (LDPC) codes are one of the most used iterative decoding codes, which attract lots of interest in a wide range of applications. LDPC codes have a channel approaching capacity, which is practical for implementation as well. The thesis focuses on the design of high throughput reconfigurable LDPC channel codec with good performance.
The main focus of this thesis is the design of a novel decoding algorithm for LDPC codes. The new decoding algorithm is configurable to adjust its performance and complexity, which is very flexible for applications. Its error correction capability is close to the sum-product algorithm but with significantly lower complexity. We further implement the LDPC encoder/decoder on FPGA, which is reconfigurable for 5G NR or user-defined LDPC codes. In particular, we apply the new decoding algorithm to the decoder and analyse its performance on hardware.
Moreover, we compared the error detection performance of 5G NR CRC and LDPC Syndrome to investigate the necessity of using CRC decoding or LDPC syndrome check, or both in practical systems. At last, a 5G NR physical layer simulating SoC embedded system is built on FPGA for the verification of the encoder and decoder
Recommended from our members
Low-Density Parity-Check Code Decoder Design and Error Characterization on an FPGA Based Framework
Low-Density Parity-Check (LDPC) codes have gained popularity in communication systems and standards due to their capacity approaching error correction performance. Among all the hard-decision based LDPC decoders, Gallager B (GaB), due to simplicity of its operations, poses as the most hardware friendly algorithm and an attractive solution for meeting the high-throughput demand in communication systems. However, GaB sufferers from poor error correction performance. In this work, we first propose a resource efficient GaB hardware architecture that delivers the best throughput while using fewest Field Programmable Gate Array (FPGA) resources with respect to the state of the art comparable LDPC decoding algorithms. We then introduce a Probabilistic GaB (PGaB) algorithm that disturbs the decisions made during the decoding iterations randomly with a probability value determined based on experimental studies. We achieve up to four orders of magnitude better error correction performance than the GaB with a 3.4% improvement in normalized throughput performance. PGaB requires around 40% less energy than GaB as the probabilistic execution results with reducing the average iteration count by up to 62% compared to the GaB. We also show that our PGaB consistently results with an improvement in maximum operational clock rate compared to the state of the art implementations.
In this dissertation, we also present a high throughput FPGA based framework to accelerate error characterization of the LDPC codes. Our flexible framework allows the end user adjust the simulation parameters and rapidly study various LDPC codes and decoders. We first show that the connection intensive bipartite graph based LDPC decoder hardware architecture creates routing stress for longer codewords that are utilized in today's communications systems and standards. We address this problem by partitioning each processing element (PE) in the bipartite graph in such a way that the inputs of a PE are evenly distributed over its partitions. This allows depopulating the Loo Up Table (LUT) resources utilized for the decoder architecture by spreading the logic across the FPGA. We show that even though LUT usage increases, critical path delay reduces with the depopulation. More importantly, with the depopulation technique an unroutable design becomes routable, which allows longer codewords to be mapped on the FPGA. We then conduct two experiments on error correction performance analysis for the GaB and PGaB algorithms, demonstrate our framework's ability to reach a resolution level that is not attainable with general purpose processor (GPP) based simulations, which reduces the time scale of simulations to 24 hours from an estimated 199 years. We finally conduct the first study on identifying all possible codewords that are not correctable by the GaB for the case where a codeword has four errors. We reduce the time scale of this simulation that requires processing 117 billion codewords to 4 hours and 38 minutes with our framework from an estimated 7800 days on a single GPP
Area and energy efficient VLSI architectures for low-density parity-check decoders using an on-the-fly computation
The VLSI implementation complexity of a low density parity check (LDPC)
decoder is largely influenced by the interconnect and the storage requirements. This
dissertation presents the decoder architectures for regular and irregular LDPC codes that
provide substantial gains over existing academic and commercial implementations. Several
structured properties of LDPC codes and decoding algorithms are observed and are used to
construct hardware implementation with reduced processing complexity. The proposed
architectures utilize an on-the-fly computation paradigm which permits scheduling of the
computations in a way that the memory requirements and re-computations are reduced.
Using this paradigm, the run-time configurable and multi-rate VLSI architectures for the
rate compatible array LDPC codes and irregular block LDPC codes are designed. Rate
compatible array codes are considered for DSL applications. Irregular block LDPC codes
are proposed for IEEE 802.16e, IEEE 802.11n, and IEEE 802.20. When compared with a
recent implementation of an 802.11n LDPC decoder, the proposed decoder reduces the
logic complexity by 6.45x and memory complexity by 2x for a given data throughput.
When compared to the latest reported multi-rate decoders, this decoder design has an area efficiency of around 5.5x and energy efficiency of 2.6x for a given data throughput. The
numbers are normalized for a 180nm CMOS process.
Properly designed array codes have low error floors and meet the requirements of
magnetic channel and other applications which need several Gbps of data throughput. A
high throughput and fixed code architecture for array LDPC codes has been designed. No
modification to the code is performed as this can result in high error floors. This parallel
decoder architecture has no routing congestion and is scalable for longer block lengths.
When compared to the latest fixed code parallel decoders in the literature, this design has
an area efficiency of around 36x and an energy efficiency of 3x for a given data throughput.
Again, the numbers are normalized for a 180nm CMOS process. In summary, the design
and analysis details of the proposed architectures are described in this dissertation. The
results from the extensive simulation and VHDL verification on FPGA and ASIC design
platforms are also presented