Search CORE

4 research outputs found

A Massively Parallel Implementation of QC-LDPC Decoder on GPU

Author: Cavallaro Joseph R.
Sun Yang
Wang Guohui
Wu Michael
Publication venue: IEEE
Publication date: 01/06/2011
Field of study

The graphics processor unit (GPU) is able to provide a low-cost and flexible software-based multi-core architecture for high performance computing. However, it is still very challenging to efficiently map the real-world applications to GPU and fully utilize the computational power of GPU. As a case study, we present a GPU-based implementation of a real-world digital signal processing (DSP) application: low-density parity-check (LDPC) decoder. The paper shows the efforts we made to map the algorithm onto the massively parallel architecture of GPU and fully utilize GPU’s computational resources to significantly boost the performance. Moreover, several efficient data structures have been proposed to reduce the memory access latency and the memory bandwidth requirement. Experimental results show that the proposed GPU-based LDPC decoding accelerator can take advantage of the multi-core computational power provided by GPU and achieve high throughput up to 100.3Mbps.Renesas MobileTexas InstrumentsXilinxNational Science Foundatio

Crossref

DSpace at Rice University

High-throughput multi-rate LDPC decoder based on architecture-oriented parity check matrices

Author: Cavallaro Joseph
de Baynast Alexandre
Karkooti Marjan
Radosavljevic Predrag
Publication venue
Publication date
Field of study

Publication in the conference proceedings of EUSIPCO, Florence, Italy, 200

ZENODO

Loosely coupled memory-based decoding architecture for low density parity check codes

Author: In-Cheol Park
Se-Hyeon Kang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Recommended from our members

Low-complexity high-speed VLSI design of low-density parity-check decoders

Author: Cui Zhiqiang
Publication venue: 'Oregon State University'
Publication date
Field of study

Low-Density Parity-check (LDPC) codes have attracted considerable attention due to their capacity approaching performance over AWGN channel and highly parallelizable decoding schemes. They have been considered in a variety of industry standards for the next generation communication systems. In general, LDPC codes achieve outstanding performance with large codeword lengths (e.g., N>1000 bits), which lead to a linear increase of the size of memory for storing all the soft messages in LDPC decoding. In the next generation communication systems, the target data rates range from a few hundred Mbit/sec to several Gbit/sec. To achieve those very high decoding throughput, a large amount of computation units are required, which will significantly increase the hardware cost and power consumption of LDPC decoders. LDPC codes are decoded using iterative decoding algorithms. The decoding latency and power consumption are linearly proportional to the number of decoding iterations. A decoding approach with fast convergence speed is highly desired in practice. This thesis considers various VLSI design issues of LDPC decoder and develops efficient approaches for reducing memory requirement, low complexity implementation, and high speed decoding of LDPC codes. We propose a memory efficient partially parallel decoder architecture suited for quasi-cyclic LDPC (QC-LDPC) codes using Min-Sum decoding algorithm. We develop an efficient architecture for general permutation matrix based LDPC codes. We have explored various approaches to linearly increase the decoding throughput with a small amount of hardware overhead. We develop a multi-Gbit/sec LDPC decoder architecture for QC-LDPC codes and prototype an enhanced partially parallel decoder architecture for a Euclidian geometry based LDPC code on FPGA. We propose an early stopping scheme and an extended layered decoding method to reduce the number of decoding iterations for undecodable and decodable sequence received from channel. We also propose a low-complexity optimized 2-bit decoding approach which requires comparable implementation complexity to weighted bit flipping based algorithms but has much better decoding performance and faster convergence speed

ScholarsArchive@OSU