9 research outputs found
Ultra-low power LDPC decoder design with high parallelism for wireless communication system
System: new ; Report number: 甲3423号 ; Degree type: Doctor of Engineering ; Date conferred: 2011/9/15 ; Waseda University diploma number: 新574
Area and energy efficient VLSI architectures for low-density parity-check decoders using an on-the-fly computation
The VLSI implementation complexity of a low density parity check (LDPC)
decoder is largely influenced by the interconnect and the storage requirements. This
dissertation presents the decoder architectures for regular and irregular LDPC codes that
provide substantial gains over existing academic and commercial implementations. Several
structured properties of LDPC codes and decoding algorithms are observed and are used to
construct hardware implementations with reduced processing complexity. The proposed
architectures utilize an on-the-fly computation paradigm which permits scheduling of the
computations in a way that the memory requirements and re-computations are reduced.
Using this paradigm, the run-time configurable and multi-rate VLSI architectures for the
rate compatible array LDPC codes and irregular block LDPC codes are designed. Rate
compatible array codes are considered for DSL applications. Irregular block LDPC codes
are proposed for IEEE 802.16e, IEEE 802.11n, and IEEE 802.20. When compared with a
recent implementation of an 802.11n LDPC decoder, the proposed decoder reduces the
logic complexity by 6.45x and memory complexity by 2x for a given data throughput.
When compared to the latest reported multi-rate decoders, this decoder design has an area efficiency of around 5.5x and energy efficiency of 2.6x for a given data throughput. The
numbers are normalized for a 180nm CMOS process.
Properly designed array codes have low error floors and meet the requirements of
magnetic channel and other applications which need several Gbps of data throughput. A
high throughput and fixed code architecture for array LDPC codes has been designed. No
modification to the code is performed as this can result in high error floors. This parallel
decoder architecture has no routing congestion and is scalable for longer block lengths.
When compared to the latest fixed code parallel decoders in the literature, this design has
an area efficiency of around 36x and an energy efficiency of 3x for a given data throughput.
Again, the numbers are normalized for a 180nm CMOS process. In summary, the design
and analysis details of the proposed architectures are described in this dissertation. The
results from the extensive simulation and VHDL verification on FPGA and ASIC design
platforms are also presented.
New Algorithms for High-Throughput Decoding with Low-Density Parity-Check Codes using Fixed-Point SIMD Processors
Most digital signal processors contain one or more functional units with a single-instruction, multiple-data architecture that supports saturating fixed-point arithmetic with two or more options for the arithmetic precision. The processors designed for the highest performance contain many such functional units connected through an on-chip network. The selection of the arithmetic precision provides a trade-off between the task-level throughput and the quality of the output of many signal-processing algorithms, and utilization of the interconnection network during execution of the algorithm introduces a latency that can also limit the algorithm's throughput. In this dissertation, we consider the turbo-decoding message-passing algorithm for iterative decoding of low-density parity-check codes and investigate its performance in parallel execution on a processor of interconnected functional units employing fast, low-precision fixed-point arithmetic. It is shown that the frequent occurrence of saturation when 8-bit signed arithmetic is used severely degrades the performance of the algorithm compared with decoding using higher-precision arithmetic. A technique of limiting the magnitude of certain intermediate variables of the algorithm, the extrinsic values, is proposed and shown to eliminate most occurrences of saturation, resulting in performance with 8-bit decoding nearly equal to that achieved with higher-precision decoding. We show that the interconnection latency can have a significant detrimental effect on the throughput of the turbo-decoding message-passing algorithm, which is illustrated for a type of high-performance digital signal processor known as a stream processor. Two alternatives to the standard schedule of message-passing and parity-check operations are proposed for the algorithm.
Both alternatives markedly reduce the interconnection latency, and both result in substantially greater throughput than the standard schedule with no increase in the probability of error.
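The effect of saturation described in this abstract can be illustrated with a minimal Python sketch. It is not the dissertation's method: the helper names (sat_add_int8, limit_extrinsic, accumulate) and the limit of 31 are assumptions chosen for illustration only. The sketch shows that 8-bit saturating accumulation can give order-dependent, distorted sums, and that bounding the extrinsic magnitudes beforehand keeps the running sum inside the int8 range.

```python
INT8_MIN, INT8_MAX = -128, 127


def sat_add_int8(a, b):
    """Add two integers and clamp the result to the signed 8-bit range."""
    return max(INT8_MIN, min(INT8_MAX, a + b))


def limit_extrinsic(values, max_magnitude):
    """Clamp each extrinsic value to [-max_magnitude, +max_magnitude].

    max_magnitude is a hypothetical tuning parameter, not a value
    taken from the source.
    """
    return [max(-max_magnitude, min(max_magnitude, v)) for v in values]


def accumulate(channel_llr, extrinsics):
    """Sum a channel LLR and its extrinsic values with saturating adds."""
    total = channel_llr
    for value in extrinsics:
        total = sat_add_int8(total, value)
    return total


# Unlimited extrinsics: the result depends on the order of the additions,
# because intermediate sums hit the +127 ceiling in one ordering only.
saturated_a = accumulate(40, [90, 90, -100])
saturated_b = accumulate(40, [-100, 90, 90])

# Limited extrinsics: no intermediate sum saturates, so order is irrelevant.
limited_a = accumulate(40, limit_extrinsic([90, 90, -100], 31))
limited_b = accumulate(40, limit_extrinsic([-100, 90, 90], 31))
```

In this toy case the two unlimited orderings disagree, while the two limited orderings agree. Whether a given limit preserves decoding performance is an empirical question that the sketch does not address.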
Flexible LDPC Decoder Architectures
Flexible channel decoding is gaining significance with the increase in the number of wireless standards and of modes within a standard. A flexible channel decoder is a solution providing interstandard and intrastandard support without a change in hardware. However, the efficient implementation of flexible low-density parity-check (LDPC) code decoders satisfying area, speed, and power constraints is a challenging task and still requires considerable research effort. This paper provides an overview of the state of the art in the design of flexible LDPC decoders. The published solutions are evaluated at two levels of architectural design: the processing element (PE) and the interconnection structure. A qualitative and quantitative analysis of different design choices is carried out, and a comparison is provided in terms of achieved flexibility, throughput, decoding efficiency, and area (power) consumption.
Acceleration of High-Fidelity Wireless Network Simulations
Network simulation with bit-accurate modeling of modulation, coding and channel properties is typically computationally intensive. Simple link-layer models that are frequently used in network simulations sacrifice accuracy to decrease simulation time. We investigate the performance and simulation time of link models that use analytical bounds on link performance and of bit-accurate link models executed on Graphics Processing Units (GPUs). We show that properly chosen analytical bounds on link performance can yield simulation results close to those of bit-level simulation while providing a significant reduction in simulation time. We also show that bit-accurate decoding in link models can be expedited using parallel processing on GPUs, decreasing the overall simulation time without compromising accuracy.
Research on energy-efficient VLSI decoder for LDPC code
System: new ; Report number: 甲3742号 ; Degree type: Doctor of Engineering ; Date conferred: 2012/9/15 ; Waseda University diploma number: 新6113 ; Waseda University
Low-complexity high-speed VLSI design of low-density parity-check decoders
Low-density parity-check (LDPC) codes have attracted considerable attention due to their capacity-approaching performance over the AWGN channel and their highly parallelizable decoding schemes. They have been considered in a variety of industry standards for next-generation communication systems. In general, LDPC codes achieve outstanding performance with large codeword lengths (e.g., N>1000 bits), which leads to a linear increase in the size of the memory needed to store all the soft messages in LDPC decoding. In next-generation communication systems, the target data rates range from a few hundred Mbit/sec to several Gbit/sec. To achieve such high decoding throughput, a large number of computation units is required, which significantly increases the hardware cost and power consumption of LDPC decoders. LDPC codes are decoded using iterative decoding algorithms. The decoding latency and power consumption are linearly proportional to the number of decoding iterations. A decoding approach with fast convergence is therefore highly desirable in practice.
This thesis considers various VLSI design issues of LDPC decoders and develops efficient approaches for reducing the memory requirement, lowering implementation complexity, and achieving high-speed decoding of LDPC codes. We propose a memory-efficient partially parallel decoder architecture suited for quasi-cyclic LDPC (QC-LDPC) codes using the Min-Sum decoding algorithm. We develop an efficient architecture for general permutation-matrix-based LDPC codes. We have explored various approaches to linearly increase the decoding throughput with a small amount of hardware overhead. We develop a multi-Gbit/sec LDPC decoder architecture for QC-LDPC codes and prototype an enhanced partially parallel decoder architecture for a Euclidean-geometry-based LDPC code on an FPGA. We propose an early stopping scheme and an extended layered decoding method to reduce the number of decoding iterations for undecodable and decodable sequences received from the channel. We also propose a low-complexity optimized 2-bit decoding approach which requires implementation complexity comparable to that of weighted bit-flipping-based algorithms but has much better decoding performance and faster convergence.
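For readers unfamiliar with the Min-Sum algorithm named in this abstract, the check node update can be sketched in a few lines of Python. This is the textbook Min-Sum rule, not the thesis's architecture, and the function name min_sum_check_update is an assumption. Each outgoing message takes the product of the signs and the minimum of the magnitudes of all other incoming messages. Tracking only the two smallest magnitudes avoids recomputing a minimum per edge.

```python
def min_sum_check_update(incoming):
    """Return the check-to-variable messages for one check node.

    incoming: variable-to-check messages (LLRs), one per connected edge.
    """
    if len(incoming) < 2:
        raise ValueError("a check node needs at least two edges")

    # Track the two smallest magnitudes and the position of the smallest.
    min1 = min2 = float("inf")
    min1_index = -1
    sign_product = 1
    for i, value in enumerate(incoming):
        magnitude = abs(value)
        if magnitude < min1:
            min2 = min1
            min1, min1_index = magnitude, i
        elif magnitude < min2:
            min2 = magnitude
        if value < 0:
            sign_product = -sign_product

    outgoing = []
    for i, value in enumerate(incoming):
        # The edge holding the smallest magnitude sees the second smallest.
        magnitude = min2 if i == min1_index else min1
        # Multiplying by the edge's own sign removes it from the product.
        own_sign = -1 if value < 0 else 1
        outgoing.append(sign_product * own_sign * magnitude)
    return outgoing


messages = min_sum_check_update([2.0, -3.0, 1.0, 4.0])
```

Storing two minima, one index, and the sign bits per check node, rather than every outgoing message, is one common way to reduce message memory in hardware decoders. Whether the thesis uses exactly this compression is not stated in the abstract.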
Research on high performance LDPC decoder
System: new ; Report number: 甲3272号 ; Degree type: Doctor of Engineering ; Date conferred: 2011/3/15 ; Waseda University diploma number: 新557
Hardware Design of Decoder for Low-Density Parity Check Codes
A hardware decoder architecture is presented in this thesis for quasi-cyclic (QC) low-density parity check (LDPC) codes.
The decoder is real-time configurable and supports 15 codes, formed by combining 3 rates with 5 lengths. The partly parallel architecture implements layered decoding. The check node decoder is serial and implements a min-sum correction algorithm. The proposed design techniques include out-of-order memory writes, a two-stage multi-size shifter, and serial decoding termination.
The decoder uses about half of the logic resources of the Xilinx FPGA chip XC2VP50-5F1152. The worst-case throughput at 20 iterations ranges from 5 Mbit/s to 60 Mbit/s (information bits). Higher throughput can be obtained with the proposed optimisation. Reuse for similar codes is possible.
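The layered decoding schedule mentioned in this abstract can be sketched in software as follows. This is a generic layered min-sum loop under stated assumptions, not the thesis's hardware design: the names check_update and layered_decode, the small parity-check structure, and the sign convention (a positive LLR means bit 0) are all illustrative. Each check row is processed in turn and the posterior values are updated immediately, so later rows in the same iteration already see the refreshed values. Decoding stops as soon as all parity checks are satisfied.

```python
def check_update(v2c):
    """Plain min-sum: sign product and minimum magnitude of the other edges."""
    out = []
    for i in range(len(v2c)):
        others = v2c[:i] + v2c[i + 1:]
        negatives = sum(1 for x in others if x < 0)
        sign = -1 if negatives % 2 else 1
        out.append(sign * min(abs(x) for x in others))
    return out


def layered_decode(checks, channel_llr, max_iterations=20):
    """Decode with a layered schedule; returns (hard_bits, iterations_used).

    checks: list of rows, each a list of variable indices in that check.
    channel_llr: one LLR per variable (positive means bit 0, an assumption).
    """
    posterior = list(channel_llr)
    c2v = [[0.0] * len(row) for row in checks]
    hard = [1 if x < 0 else 0 for x in posterior]
    for iteration in range(1, max_iterations + 1):
        for r, row in enumerate(checks):
            # Remove this row's previous contribution from the posteriors.
            v2c = [posterior[v] - c2v[r][j] for j, v in enumerate(row)]
            c2v[r] = check_update(v2c)
            # Write the refreshed posteriors back before the next row runs.
            for j, v in enumerate(row):
                posterior[v] = v2c[j] + c2v[r][j]
        hard = [1 if x < 0 else 0 for x in posterior]
        # Terminate early once every parity check is satisfied.
        if all(sum(hard[v] for v in row) % 2 == 0 for row in checks):
            return hard, iteration
    return hard, max_iterations


# Hypothetical 3-check, 7-variable example with one unreliable bit (index 2).
checks = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
bits, used = layered_decode(checks, [4.0, 4.0, -1.0, 4.0, 4.0, 4.0, 4.0])
```

In this toy case the all-zero codeword is sent and the weak wrong value at index 2 is corrected while the first row is processed.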