202 research outputs found

    A 2.0 Gb/s Throughput Decoder for QC-LDPC Convolutional Codes

    Full text link
    This paper proposes a decoder architecture for low-density parity-check convolutional codes (LDPCCCs). Specifically, the LDPCCC is derived from a quasi-cyclic (QC) LDPC block code. By exploiting the quasi-cyclic structure, the proposed LDPCCC decoder adopts dynamic message storage in memory and uses a simple address controller. The decoder efficiently combines the memories of the pipelined processors into one large memory block so as to take advantage of the data width of the embedded memory in a modern field-programmable gate array (FPGA). A rate-5/6 QC-LDPCCC has been implemented on an Altera Stratix FPGA. It achieves a throughput of up to 2.0 Gb/s at a clock frequency of 100 MHz. Moreover, the decoder displays an excellent error performance of lower than $10^{-13}$ at a bit-energy-to-noise-power-spectral-density ratio ($E_b/N_0$) of 3.55 dB. Comment: accepted to IEEE Transactions on Circuits and Systems.
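
    A minimal sketch of the quasi-cyclic structure such a decoder exploits (illustrative only; the base matrix, expansion factor, and shift values below are placeholders, not the rate-5/6 code from the paper): every block of the parity-check matrix is a cyclically shifted identity, so a single shift value per block is enough to drive a simple address counter instead of a full connection table.

```python
# Illustrative sketch: expanding a base matrix of circulant shifts into a
# binary QC-LDPC parity-check matrix. All sizes/shifts are placeholders.
import numpy as np

def qc_parity_check(base, z):
    """base[i][j] = -1 means an all-zero z-by-z block; otherwise it is the
    cyclic shift applied to the z-by-z identity matrix."""
    rows, cols = len(base), len(base[0])
    H = np.zeros((rows * z, cols * z), dtype=np.uint8)
    for i in range(rows):
        for j in range(cols):
            s = base[i][j]
            if s < 0:
                continue
            # The shift value alone determines the block, which is what lets
            # a decoder use a simple address counter for message storage.
            H[i*z:(i+1)*z, j*z:(j+1)*z] = np.roll(np.eye(z, dtype=np.uint8), s, axis=1)
    return H

# Toy example: 2x4 base matrix with expansion factor z = 4
H = qc_parity_check([[0, 1, -1, 2],
                     [3, -1, 0, 1]], z=4)
print(H.shape)  # (8, 16)
```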

    Hardware Implementations of CCSDS Deep Space LDPC Codes for a Satellite Transponder

    Get PDF
    Error-correction coding is a technique that adds mathematical structure to a message, allowing corruptions to be detected and corrected when the message is received. This is especially important for deep-space satellite communications, since the long distances and low signal power levels often corrupt messages. A very strong class of error-correction codes known as LDPC codes was recently standardized for use in space communications. This project implements the encoding and decoding algorithms required for a small satellite radio to use these LDPC codes. Several decoder architectures are implemented and compared in terms of performance, speed, and complexity. Using these LDPC decoders requires knowledge of the received signal and noise levels, so an appropriate algorithm for estimating these parameters is developed and implemented. The LDPC encoder is implemented using a flexible architecture that allows the entire standardized family of ten LDPC codes to be encoded with the same hardware.
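
    The decoders need channel LLRs, which in turn need estimates of the received signal amplitude and noise level. A minimal sketch under the assumption of BPSK over an AWGN channel (the project's actual estimator may differ):

```python
# Illustrative sketch (assumed BPSK/AWGN model): estimate signal amplitude and
# noise variance from received samples, then form channel LLRs for the decoder.
import numpy as np

def estimate_llrs(r):
    """r: received real-valued samples for BPSK symbols (+A/-A plus noise)."""
    amp = np.mean(np.abs(r))              # crude amplitude estimate
    var = np.mean(r**2) - amp**2          # noise-variance estimate (moment method)
    var = max(var, 1e-12)                 # guard against divide-by-zero
    return 2.0 * amp * r / var            # LLR = 2*A*r / sigma^2 for BPSK/AWGN

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 1000)
r = (1 - 2*bits) * 0.8 + rng.normal(0, 0.4, 1000)   # A = 0.8, sigma = 0.4
llr = estimate_llrs(r)
print(np.mean((llr < 0) == bits))   # fraction of hard decisions that match
```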

    Power Characterization of a Gbit/s FPGA Convolutional LDPC Decoder

    Get PDF
    In this thesis, we present an FPGA implementation of a parallel-node low-density parity-check convolutional-code (PN-LDPC-CC) encoder and decoder. A 2.4 Gbit/s rate-1/2 (3, 6) PN-LDPC-CC encoder and decoder were implemented on an Altera development and education board (DE4). Detailed power measurements of the FPGA board for various configurations of the design were conducted to characterize the power consumption of the decoder module. For an $E_b/N_0$ of 5 dB, the decoder with 9 processor cores (pipelined decoder iteration stages) has a bit-error-rate performance of $10^{-10}$ and achieves an energy-per-coded-bit of 1.683 nJ based on raw power measurements. Increasing $E_b/N_0$ effectively reduces the decoder power and energy-per-coded-bit for configurations with 5 or more processor cores, for $E_b/N_0$ < 5 dB. The incremental decoder power and incremental energy-per-coded-bit also follow a linearly decreasing trend with each additional processor core. Additional experiments account for the effect of the efficiency of the DC/DC converter circuitry on the raw power measurements. Further experiments quantify the effect of clipping thresholds and the bit width of each processor core on the bit-error-rate (BER) performance, power consumption, and logic utilization of the decoder. A “6Core” decoder with growing-bit-width log-likelihood ratios (LLRs) was found to have a BER performance near that of a 6-bit “6Core” decoder while consuming power and logic resources similar to those of a 5-bit “6Core” decoder.
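
    The clipping-threshold and bit-width experiments come down to how real-valued LLRs are saturated and quantized before entering a processor core. A minimal, illustrative sketch (the thresholds and widths below are placeholders, not the thesis's fixed-point format):

```python
# Illustrative sketch: saturating quantization of LLRs to a signed fixed-point
# word, showing how the clipping threshold and bit width interact.
import numpy as np

def quantize_llr(llr, bits, clip):
    """Map real LLRs to signed integers with `bits` total bits, saturating at +/-clip."""
    levels = 2**(bits - 1) - 1                 # e.g. 5 bits -> +/-15
    scaled = np.clip(llr, -clip, clip) / clip * levels
    return np.round(scaled).astype(int)

llr = np.array([-9.3, -2.1, 0.4, 3.7, 12.0])
print(quantize_llr(llr, bits=5, clip=8.0))     # [-15  -4   1   7  15]
print(quantize_llr(llr, bits=6, clip=8.0))     # [-31  -8   2  14  31]
```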

    Mixed Precision Multi-frame Parallel Low-Density Parity-Check Code Decoder

    Get PDF
    As the demand for high-speed, high-quality connectivity increases exponentially, channels are becoming more and more crowded, and the need for a high-performance, low-error-floor channel decoder is apparent. Low-density parity-check (LDPC) codes are linear error-correction codes that can approach the Shannon limit. In this work, LDPC code construction and decoding algorithms are discussed, fully parallel and partially parallel LDPC decoders are implemented, and the features and issues of the corresponding architectures are analyzed. Furthermore, a multi-frame processing approach based on pipelining and out-of-order processing is proposed. The implemented decoder achieves 12.6 Gbps at 3.0 dB SNR. A mixed-precision scheme is explored by adding precision-control and alignment units before and after the check node units (CNUs) to improve performance as well as the error floor. By mixing 6-bit and 5-bit precision CNUs at a 1:1 ratio, the decoder gains roughly 0.5 dB in FER and BER performance while retaining a low error floor.
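
    The CNU mentioned above performs the check-node update; a minimal sketch of the min-sum form of that update (an assumption here, since the abstract does not name the exact algorithm) shows the operation whose input and output precision the mixed 6-bit/5-bit scheme varies:

```python
# Illustrative sketch: one min-sum check-node update, the operation a CNU
# implements in hardware (assumed algorithm, not taken from the thesis).
import numpy as np

def check_node_update(vtoc):
    """vtoc: variable-to-check LLRs on one check node's edges.
    Returns the check-to-variable messages under the min-sum rule."""
    vtoc = np.asarray(vtoc, dtype=float)
    signs = np.sign(vtoc)
    total_sign = np.prod(signs)
    mags = np.abs(vtoc)
    out = np.empty_like(mags)
    for k in range(len(vtoc)):
        # Exclude edge k: sign and minimum magnitude over the remaining edges.
        out[k] = total_sign * signs[k] * np.min(np.delete(mags, k))
    return out

print(check_node_update([2.0, -1.5, 4.0, -0.5]))   # [ 0.5 -0.5  0.5 -1.5]
```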

    Domain specific high performance reconfigurable architecture for a communication platform

    Get PDF

    1.5 Gbit/s FPGA implementation of a fully-parallel turbo decoder designed for mission-critical machine-type communication applications

    No full text
    In wireless communication schemes, turbo codes facilitate near-capacity transmission throughputs by achieving reliable forward error correction. However, owing to the serial data dependencies imposed by the underlying Logarithmic Bahl-Cocke-Jelinek-Raviv (Log-BCJR) algorithm, the limited processing throughputs of conventional turbo decoder implementations impose a severe bottleneck upon the overall throughputs of real-time wireless communication schemes. Motivated by this, we recently proposed a Fully Parallel Turbo Decoder (FPTD) algorithm, which eliminates these serial data dependencies, allowing parallel processing and hence offering a significantly higher processing throughput. In this paper, we propose a novel resource-efficient version of the FPTD algorithm, which reduces its computational resource requirement by 50%, enhancing its suitability for Field-Programmable Gate Array (FPGA) implementation, and we present a model FPGA implementation. When using a Stratix IV FPGA, the proposed FPTD FPGA implementation achieves an average throughput of 1.53 Gbit/s and an average latency of 0.56 µs when decoding frames comprising N = 720 bits. These are respectively 13.2 times and 11.1 times superior to those of the state-of-the-art FPGA implementation of the Log-BCJR Long-Term Evolution (LTE) turbo decoder, when decoding frames of the same length at the same error-correction capability. Furthermore, our proposed FPTD FPGA implementation achieves a normalized resource usage of 0.42 kALUTs per Mbit/s, which is 5.2 times superior to that of the benchmarker decoder. Finally, when decoding the shortest N = 40-bit LTE frames, the proposed FPTD FPGA implementation achieves an average throughput of 442 Mbit/s and an average latency of 0.18 µs, which are respectively 21.1 times and 10.6 times superior to those of the benchmarker decoder. In this case, the normalized resource usage of 0.08 kALUTs per Mbit/s is 146.4 times superior to that of the benchmarker decoder.
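
    The serial dependency that the FPTD removes is visible in the Log-BCJR forward recursion, where stage k cannot be computed until stage k-1 is finished. A conceptual toy sketch (not the FPTD algorithm itself, and using a made-up two-state trellis) contrasts that serial recursion with a fully parallel schedule that reads the previous iteration's metrics:

```python
# Conceptual toy: serial max-log forward recursion vs. a fully parallel sweep
# that updates every trellis stage from the previous iteration's metrics.
import numpy as np

def forward_serial(gamma, alpha0):
    """gamma[k]: 2x2 state-transition metrics for stage k (toy trellis)."""
    alpha = [alpha0]
    for g in gamma:                       # inherently serial: stage k waits for k-1
        alpha.append(np.max(g + alpha[-1][:, None], axis=0))
    return alpha

def forward_parallel_iteration(gamma, alpha_prev):
    """One fully parallel sweep: every stage reads last iteration's metrics."""
    return [alpha_prev[0]] + [np.max(g + a[:, None], axis=0)
                              for g, a in zip(gamma, alpha_prev[:-1])]

rng = np.random.default_rng(1)
gamma = [rng.normal(size=(2, 2)) for _ in range(6)]
alpha = [np.zeros(2)] * 7
for _ in range(6):                        # enough sweeps to propagate end to end
    alpha = forward_parallel_iteration(gamma, alpha)
print(np.allclose(np.array(alpha),
                  np.array(forward_serial(gamma, np.zeros(2)))))   # True
```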