172 research outputs found

    System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing

    Get PDF
    This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications. Hardware implementation of Low-Density Parity-Check decoders is approached at both the algorithmic and the architecture level. Low-Density Parity-Check codes are a promising coding scheme for future communication standards due to their outstanding error correction performance. This work proposes a methodology for analyzing effects of finite precision arithmetic on error correction performance and hardware complexity. The methodology is throughout employed for co-designing the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cells library. Results demonstrate implementation loss below 0.2 dB down to BER of 10^{-8} and a saving in complexity up to 59% with respect to other works in recent literature. High-throughput and low-latency issues are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed requiring down to 20% of memory with respect to the traditional two-phase flooding decoding. Additionally, throughput is doubled and logic complexity reduced of 12%. These advantages are traded-off with error correction performance, thus making the solution attractive only for long codes, as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to those codes not specifically conceived for this technique. Proposed architectures exhibit complexity savings in the order of 40% for both area and power consumption figures, while implementation loss is smaller than 0.05 dB. Most modern communication standards employ Orthogonal Frequency Division Multiplexing as part of their physical layer. The core of OFDM is the Fast Fourier Transform and its inverse in charge of symbols (de)modulation. Requirements on throughput and energy efficiency call for FFT hardware implementation, while ubiquity of FFT suggests the design of parametric, re-configurable and re-usable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited for implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine which automatically profiles the internal arithmetic and generates a core with minimum operands bit-width and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point and convergent block floating-point) using the numerical accuracy budget given by the user as a reference point. The flexibility and re-usability of the proposed macrocell are illustrated through several case studies which encompass all current state-of-the-art OFDM communications standards (WLAN, WMAN, xDSL, DVB-T/H, DAB and UWB). Implementations results are presented for two deep sub-micron standard-cells libraries (65 and 90 nm) and commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and same system level performance (throughput, transform size and numerical accuracy). The final part of this dissertation focuses on the Network-on-Chip design paradigm whose goal is building scalable communication infrastructures connecting hundreds of core. A low-complexity link architecture for mesochronous on-chip communication is discussed. The link enables skew constraint looseness in the clock tree synthesis, frequency speed-up, power consumption reduction and faster back-end turnarounds. The proposed architecture reaches a maximum clock frequency of 1 GHz on 65 nm low-leakage CMOS standard-cells library. In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption. Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction technology independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an Object Oriented framework, integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the configurations to verify up to three orders of magnitude

    Improve the Usability of Polar Codes: Code Construction, Performance Enhancement and Configurable Hardware

    Full text link
    Error-correcting codes (ECC) have been widely used for forward error correction (FEC) in modern communication systems to dramatically reduce the signal-to-noise ratio (SNR) needed to achieve a given bit error rate (BER). Newly invented polar codes have attracted much interest because of their capacity-achieving potential, efficient encoder and decoder implementation, and flexible architecture design space.This dissertation is aimed at improving the usability of polar codes by providing a practical code design method, new approaches to improve the performance of polar code, and a configurable hardware design that adapts to various specifications. State-of-the-art polar codes are used to achieve extremely low error rates. In this work, high-performance FPGA is used in prototyping polar decoders to catch rare-case errors for error-correcting performance verification and error analysis. To discover the polarization characteristics and error patterns of polar codes, an FPGA emulation platform for belief-propagation (BP) decoding is built by a semi-automated construction flow. The FPGA-based emulation achieves significant speedup in large-scale experiments involving trillions of data frames. The platform is a key enabler of this work. The frozen set selection of polar codes, known as bit selection, is critical to the error-correcting performance of polar codes. A simulation-based in-order bit selection method is developed to evaluate the error rate of each bit using Monte Carlo simulations. The frozen set is selected based on the bit reliability ranking. The resulting code construction exhibits up to 1 dB coding gain with respect to the conventional bit selection. To further improve the coding gain of BP decoder for low-error-rate applications, the decoding error mechanisms are studied and analyzed, and the errors are classified based on their distinct signatures. Error detection is enabled by low-cost CRC concatenation, and post-processing algorithms targeting at each type of the error is designed to mitigate the vast majority of the decoding errors. The post-processor incurs only a small implementation overhead, but it provides more than an order of magnitude improvement of the error-correcting performance. The regularity of the BP decoder structure offers many hardware architecture choices. Silicon area, power consumption, throughput and latency can be traded to reach the optimal design points for practical use cases. A comprehensive design space exploration reveals several practical architectures at different design points. The scalability of each architecture is also evaluated based on the implementation candidates. For dynamic communication channels, such as wireless channels in the upcoming 5G applications, multiple codes of different lengths and code rates are needed to t varying channel conditions. To minimize implementation cost, a universal decoder architecture is proposed to support multiple codes through hardware reuse. A 40nm length- and rate-configurable polar decoder ASIC is demonstrated to fit various communication environments and service requirements.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/140817/1/shuangsh_1.pd

    Design Trade‐Offs for FPGA Implementation of LDPC Decoders

    Get PDF
    Low density parity check (LDPC) decoders represent important throughput bottlenecks, as well as major cost and power-consuming components in today\u27s digital circuits for wireless communication and storage. They present a wide range of architectural choices, with different throughput, cost, and error correction capability trade-offs. In this book chapter, we will present an overview of the main design options in the architecture and implementation of these circuits on field programmable gate array (FPGA) devices. We will present the mapping of the main units within the LDPC decoders on the specific embedded components of FPGA device. We will review architectural trade-offs for both flooded and layered scheduling strategies in their FPGA implementation

    Configurable LDPC Decoder Architecture for Regular and Irregular Codes

    Get PDF
    Low Density Parity Check (LDPC) codes are one of the best error correcting codes that enable the future generations of wireless devices to achieve higher data rates with excellent quality of service. This paper presents two novel flexible decoder architectures. The first one supports (3, 6) regular codes of rate 1/2 that can be used for different block lengths. The second decoder is more general and supports both regular and irregular LDPC codes with twelve combinations of code lengths −648, 1296, 1944-bits and code rates-1/2, 2/3, 3/4, 5/6- based on the IEEE 802.11n standard. All codes correspond to a block-structured parity check matrix, in which the sub-blocks are either a shifted identity matrix or a zero matrix. Prototype architectures for both LDPC decoders have been implemented and tested on a Xilinx field programmable gate array.NokiaNational Science Foundatio

    A survey of FPGA-based LDPC decoders

    No full text
    Low-Density Parity Check (LDPC) error correction decoders have become popular in communications systems, as a benefit of their strong error correction performance and their suitability to parallel hardware implementation. A great deal of research effort has been invested into LDPC decoder designs that exploit the flexibility, the high processing speed and the parallelism of Field-Programmable Gate Array (FPGA) devices. FPGAs are ideal for design prototyping and for the manufacturing of small-production-run devices, where their in-system programmability makes them far more cost-effective than Application-Specific Integrated Circuits (ASICs). However, the FPGA-based LDPC decoder designs published in the open literature vary greatly in terms of design choices and performance criteria, making them a challenge to compare. This paper explores the key factors involved in FPGA-based LDPC decoder design and presents an extensive review of the current literature. In-depth comparisons are drawn amongst 140 published designs (both academic and industrial) and the associated performance trade-offs are characterised, discussed and illustrated. Seven key performance characteristics are described, namely their processing throughput, latency, hardware resource requirements, error correction capability, processing energy efficiency, bandwidth efficiency and flexibility. We offer recommendations that will facilitate fairer comparisons of future designs, as well as opportunities for improving the design of FPGA-based LDPC decoder

    The 5G channel code contenders

    No full text

    Research on energy-efficient VLSI decoder for LDPC code

    Get PDF
    制度:新 ; 報告番号:甲3742号 ; 学位の種類:博士(工学) ; 授与年月日:2012/9/15 ; 早大学位記番号:新6113Waseda Universit

    Design of High Throughput Reconfigurable LDPC CODEC

    Full text link
    Channel coding is an essential part of communication systems, which significantly reduces the error rate of receiving messages. Nowadays, iterative decoding methods play an important role in wireless communication such as 5G, Wi-Fi etc. Low-Density Parity-Check (LDPC) codes are one of the most used iterative decoding codes, which attract lots of interest in a wide range of applications. LDPC codes have a channel approaching capacity, which is practical for implementation as well. The thesis focuses on the design of high throughput reconfigurable LDPC channel codec with good performance. The main focus of this thesis is the design of a novel decoding algorithm for LDPC codes. The new decoding algorithm is configurable to adjust its performance and complexity, which is very flexible for applications. Its error correction capability is close to the sum-product algorithm but with significantly lower complexity. We further implement the LDPC encoder/decoder on FPGA, which is reconfigurable for 5G NR or user-defined LDPC codes. In particular, we apply the new decoding algorithm to the decoder and analyse its performance on hardware. Moreover, we compared the error detection performance of 5G NR CRC and LDPC Syndrome to investigate the necessity of using CRC decoding or LDPC syndrome check, or both in practical systems. At last, a 5G NR physical layer simulating SoC embedded system is built on FPGA for the verification of the encoder and decoder

    Efficient Decoder for Optical Transport Networks Achieving Near Capacity Performance

    Get PDF
    Today’s optical transport networks (OTNs) support a plethora of services such as video streaming, cloud computing, social networking and many more. To make such a wide assortment of services possible, a tremendous amount of data needs to be carried over the internet backbone supported by these optical transport networks. In order to cope with this increase in traffic, data rate on OTNs has increased significantly. Product codes (PC) are a class of codes that provide good coding gain at reasonable decoding complexity and, hence, have been a popular choice for OTNs in recent times. The key goal of this thesis is to implement a decoder for a Product Code (PC) on a Virtex7 Field Programmable Gate Array(FPGA). The product code of choice for this project is based on a (1023,993) BCH code as a component code. The conventional decoder for BCH codes has a computationally expensive step for finding the roots of error locator polynomial. The BCH decoder implemented as a part of this project is optimized to speed up the decoding process while at the same time also simplifying the hardware complexity of the design. The implementation is parallelized and pipelined to achieve high throughputs. This provides a hardware platform to evaluate the performance of product codes at low bit error rates that is infeasible using software simulations

    Power Characterization of a Digit-Online FPGA Implementation of a Low-Density Parity-Check Decoder for WiMAX Applications

    Get PDF
    Low-density parity-check (LDPC) codes are a class of easily decodable error-correcting codes. Published parallel LDPC decoders demonstrate high throughput and low energy-per-bit but require a lot of silicon area. Decoders based on digit-online arithmetic (processing several bits per fundamental operation) process messages in a digit-serial fashion, reducing the area requirements, and can process multiple frames in frame-interlaced fashion. Implementations on Field-Programmable Gate Array (FPGA) are usually power- and area-hungry, but provide flexibility compared with application-specific integrated circuit implementations. With the penetration of mobile devices in the electronics industry the power considerations have become increasingly important. The power consumption of a digit-online decoder depends on various factors, like input log-likelihood ratio (LLR) bit precision, signal-to-noise ratio (SNR) and maximum number of iterations. The design is implemented on an Altera Stratix IV GX EP4SGX230 FPGA, which comes on an Altera DE4 Development and Education Board. In this work, both parallel and digit-online block LDPC decoder implementations on FPGAs for WiMAX 576-bit, rate-3/4 codes are studied, and power measurements from the DE4 board are reported. Various components of the system include a random-data generator, WiMAX Encoder, shift-out register, additive white Gaussian noise (AWGN) generator, channel LLR buffer, WiMAX Decoder and bit-error rate (BER) Calculator. The random-data generator outputs pseudo-random bit patterns through an implemented linear-feedback shift register (LFSR). Digit-online decoders with input LLR precisions ranging from 6 to 13 bits and parallel decoders with input LLR precisions ranging from 3 to 6 bits are synthesized in a Stratix IV FPGA. The digit-online decoders can be clocked at higher frequency for higher LLR precisions. A digit-online decoder can be used to decode two frames simultaneously in frame-interlaced mode. For the 6-bit implementation of digit-online decoder in single-frame mode, the minimum throughput achieved is 740 Mb/s at low SNRs. For the case of 11-bit LLR digit-online decoder in frame-interlaced mode, the minimum throughput achieved is 1363 Mb/s. Detailed analysis such as effect of SNR and LLR precision on decoder power is presented. Also, the effect of changing LLR precision on max clock frequency and logic utilization on the parallel and the digit-online decoders is studied. Alongside, power per iteration for a 6-bit LLR input digit-online decoder is also reported
    corecore