152 research outputs found
Comparison of Polar Decoders with Existing Low-Density Parity-Check and Turbo Decoders
Polar codes are a recently proposed family of provably capacity-achieving
error-correction codes that received a lot of attention. While their
theoretical properties render them interesting, their practicality compared to
other types of codes has not been thoroughly studied. Towards this end, in this
paper, we perform a comparison of polar decoders against LDPC and Turbo
decoders that are used in existing communications standards. More specifically,
we compare both the error-correction performance and the hardware efficiency of
the corresponding hardware implementations. This comparison enables us to
identify applications where polar codes are superior to existing
error-correction coding solutions as well as to determine the most promising
research direction in terms of the hardware implementation of polar decoders.Comment: Fixes small mistakes from the paper to appear in the proceedings of
IEEE WCNC 2017. Results were presented in the "Polar Coding in Wireless
Communications: Theory and Implementation" Worksho
Configurable and Scalable Turbo Decoder for 4G Wireless Receivers
The increasing requirements of high data rates and quality of service (QoS) in fourth-generation (4G) wireless communication require the implementation of practical capacity approaching codes. In this chapter, the application of Turbo coding schemes that have recently been adopted in the IEEE 802.16e WiMax standard and 3GPP Long Term Evolution (LTE) standard are reviewed. In order to process several 4G wireless standards with a common hardware module, a reconfigurable and scalable Turbo decoder architecture is presented. A parallel Turbo decoding scheme with scalable parallelism tailored to the target throughput is applied to support high data rates in 4G applications. High-level decoding parallelism is achieved by employing contention-free interleavers. A multi-banked memory structure and routing network among memories and MAP decoders are designed to operate at full speed with parallel interleavers. A new on-line address generation technique is introduced to support multiple Turbo
interleaving patterns, which avoids the interleaver address memory that is typically necessary in the traditional designs. Design trade-offs in terms of area and power efficiency are analyzed for different parallelism and clock frequency goals
A Flexible LDPC/Turbo Decoder Architecture
Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern
communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches
for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo
codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP)
algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler
trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo
codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to
support LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a
flexible LDPC/Turbo decoder has been synthesized on a TSMC 90 nm CMOS technology with a core area of 3.2 mm2. The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or
450 Mbps Turbo decoding.NokiaNokia Siemens Networks (NSN)XilinxTexas InstrumentsNational Science Foundatio
Configurable and Scalable High Throughput Turbo Decoder Architecture for Multiple 4GWireless Standards
In this paper, we propose a novel multi-code turbo decoder architecture for 4G wireless systems. To support various 4G standards, a configurable multi-mode MAP (maximum a posteriori) decoder is designed for both binary and duo-binary turbo codes with small resource overhead (less than 10%) compared to the single-mode architecture. To achieve high data rates in 4G, we present a parallel turbo decoder architecture with scalable parallelism tailored to the given throughput requirements. High-level parallelism is achieved by employing contention-free interleavers. Multi-banked memory structure and routing network among memories and MAP decoders are designed to operate at full speed with parallel interleavers. We designed a very low-complexity recursive on-line address generator supporting multiple interleaving patterns, which avoids the interleaver address memory. Design trade-offs in terms of area and power efficiency are explored to find the optimal architectures. A 711 Mbps data rate is feasible with 32 Radix-4 MAP decoders running at 200 MHz clock rate.Texas Instruments Incorporate
On Complexity, Energy- and Implementation-Efficiency of Channel Decoders
Future wireless communication systems require efficient and flexible baseband
receivers. Meaningful efficiency metrics are key for design space exploration
to quantify the algorithmic and the implementation complexity of a receiver.
Most of the current established efficiency metrics are based on counting
operations, thus neglecting important issues like data and storage complexity.
In this paper we introduce suitable energy and area efficiency metrics which
resolve the afore-mentioned disadvantages. These are decoded information bit
per energy and throughput per area unit. Efficiency metrics are assessed by
various implementations of turbo decoders, LDPC decoders and convolutional
decoders. New exploration methodologies are presented, which permit an
appropriate benchmarking of implementation efficiency, communications
performance, and flexibility trade-offs. These exploration methodologies are
based on efficiency trajectories rather than a single snapshot metric as done
in state-of-the-art approaches.Comment: Submitted to IEEE Transactions on Communication
Implementation of a High Throughput 3GPP Turbo Decoder on GPU
Turbo code is a computationally intensive channel code that is widely used in current and upcoming wireless standards. General-purpose graphics processor unit (GPGPU) is a programmable commodity processor that achieves high performance computation power by using many simple cores. In this paper,
we present a 3GPP LTE compliant Turbo decoder accelerator that takes advantage of the processing power of GPU to offer fast Turbo decoding throughput. Several
techniques are used to improve the performance of the decoder. To fully utilize the computational resources on GPU, our decoder can decode multiple codewords simultaneously, divide the workload for a single codeword across multiple cores, and pack multiple codewords to fit the single instruction multiple
data (SIMD) instruction width. In addition, we use shared memory judiciously to enable hundreds of concurrent multiple threads while keeping frequently used
data local to keep memory access fast. To improve efficiency of the decoder in the high SNR regime, we also present a low complexity early termination scheme
based on average extrinsic LLR statistics. Finally, we examine how different workload partitioning choices affect the error correction performance and the decoder throughput.Renesas MobileTexas InstrumentsXilinxNational Science Foundatio
- …