34 research outputs found

    Improve the Usability of Polar Codes: Code Construction, Performance Enhancement and Configurable Hardware

    Full text link
    Error-correcting codes (ECC) have been widely used for forward error correction (FEC) in modern communication systems to dramatically reduce the signal-to-noise ratio (SNR) needed to achieve a given bit error rate (BER). Newly invented polar codes have attracted much interest because of their capacity-achieving potential, efficient encoder and decoder implementation, and flexible architecture design space.This dissertation is aimed at improving the usability of polar codes by providing a practical code design method, new approaches to improve the performance of polar code, and a configurable hardware design that adapts to various specifications. State-of-the-art polar codes are used to achieve extremely low error rates. In this work, high-performance FPGA is used in prototyping polar decoders to catch rare-case errors for error-correcting performance verification and error analysis. To discover the polarization characteristics and error patterns of polar codes, an FPGA emulation platform for belief-propagation (BP) decoding is built by a semi-automated construction flow. The FPGA-based emulation achieves significant speedup in large-scale experiments involving trillions of data frames. The platform is a key enabler of this work. The frozen set selection of polar codes, known as bit selection, is critical to the error-correcting performance of polar codes. A simulation-based in-order bit selection method is developed to evaluate the error rate of each bit using Monte Carlo simulations. The frozen set is selected based on the bit reliability ranking. The resulting code construction exhibits up to 1 dB coding gain with respect to the conventional bit selection. To further improve the coding gain of BP decoder for low-error-rate applications, the decoding error mechanisms are studied and analyzed, and the errors are classified based on their distinct signatures. Error detection is enabled by low-cost CRC concatenation, and post-processing algorithms targeting at each type of the error is designed to mitigate the vast majority of the decoding errors. The post-processor incurs only a small implementation overhead, but it provides more than an order of magnitude improvement of the error-correcting performance. The regularity of the BP decoder structure offers many hardware architecture choices. Silicon area, power consumption, throughput and latency can be traded to reach the optimal design points for practical use cases. A comprehensive design space exploration reveals several practical architectures at different design points. The scalability of each architecture is also evaluated based on the implementation candidates. For dynamic communication channels, such as wireless channels in the upcoming 5G applications, multiple codes of different lengths and code rates are needed to t varying channel conditions. To minimize implementation cost, a universal decoder architecture is proposed to support multiple codes through hardware reuse. A 40nm length- and rate-configurable polar decoder ASIC is demonstrated to fit various communication environments and service requirements.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/140817/1/shuangsh_1.pd

    System Development and VLSI Implementation of High Throughput and Hardware Efficient Polar Code Decoder

    Get PDF
    Polar code is the first channel code which is provable to achieve the Shannon capacity. Additionally, it has a very good performance in terms of low error floor. All these merits make it a potential candidate for the future standard of wireless communication or storage system. Polar code is received increasing research interest these years. However, the hardware implementation of hardware decoder still has not meet the expectation of practical applications, no matter from neither throughput aspect nor hardware efficient aspect. This dissertation presents several system development approaches and hardware structures for three widely known decoding algorithms. These algorithms are successive cancellation (SC), list successive cancellation (LSC) and belief propagation (BP). All the efforts are in order to maximize the throughput meanwhile minimize the hardware cost. Throughput centric successive cancellation (TCSC) decoder is proposed for SC decoding. By introducing the concept of constituent code, the decoding latency is significantly reduced with a negligible decoding performance loss. However, the specifically designed computation unites dramatically increase the hardware cost, and how to handle the conventional polar code sets and constituent codes sets makes the hardware implementation more complicated. By exploiting the natural property of conventional SC decoder, datapaths for decoding constituent codes are compatibly built via computation units sharing technique. This approach does not incur additional hardware cost expect some multiplexer logic, but can significantly increase the decoding throughput. Other techniques such as pre-computing and gate-level optimization are used as well in order to further increase the decoding throughput. A specific designed partial sum generator (PSG) is also investigated in this dissertation. This PSG is hardware efficient and timing compatible with proposed TCSC decoder. Additionally, a polar code construction scheme with constituent codes optimization is also presents. This construction scheme aims to reduce the constituent codes based SC decoding latency. Results show that, compared with the state-of-art decoder, TCSC can achieve at least 60% latency reduction for the codes with length n = 1024. By using Nangate FreePDK 45nm process, TCSC decoder can reach throughput up to 5.81 Gbps and 2.01 Gbps for (1024, 870) and (1024, 512) polar code, respectively. Besides, with the proposed construction scheme, the TCSC decoder generally is able to further achieve at least around 20% latency deduction with an negligible gain loss. Overlapped List Successive Cancellation (OLSC) is proposed for LSC decoding as a design approach. LSC decoding has a better performance than LS decoding at the cost of hardware consumption. With such approach, the l (l > 1) instances of successive cancellation (SC) decoder for LSC with list size l can be cut down to only one. This results in a dramatic reduction of the hardware complexity without any decoding performance loss. Meanwhile, approaches to reduce the latency associated with the pipeline scheme are also investigated. Simulation results show that with proposed design approach the hardware efficiency is increased significantly over the recently proposed LSC decoders. Express Journey Belief Propagation (XJBP) is proposed for BP decoding. This idea origins from extending the constituent codes concept from SC to BP decoding. Express journey refers to the datapath of specific constituent codes in the factor graph, which accelerates the belief information propagation speed. The XJBP decoder is able to achieve 40.6% computational complexity reduction with the conventional BP decoding. This enables an energy efficient hardware implementation. In summary, all the efforts to optimize the polar code decoder are presented in this dissertation, supported by the careful analysis, precise description, extensively numerical simulations, thoughtful discussion and RTL implementation on VLSI design platforms

    System Development and VLSI Implementation of High Throughput and Hardware Efficient Polar Code Decoder

    Get PDF
    Polar code is the first channel code which is provable to achieve the Shannon capacity. Additionally, it has a very good performance in terms of low error floor. All these merits make it a potential candidate for the future standard of wireless communication or storage system. Polar code is received increasing research interest these years. However, the hardware implementation of hardware decoder still has not meet the expectation of practical applications, no matter from neither throughput aspect nor hardware efficient aspect. This dissertation presents several system development approaches and hardware structures for three widely known decoding algorithms. These algorithms are successive cancellation (SC), list successive cancellation (LSC) and belief propagation (BP). All the efforts are in order to maximize the throughput meanwhile minimize the hardware cost. Throughput centric successive cancellation (TCSC) decoder is proposed for SC decoding. By introducing the concept of constituent code, the decoding latency is significantly reduced with a negligible decoding performance loss. However, the specifically designed computation unites dramatically increase the hardware cost, and how to handle the conventional polar code sets and constituent codes sets makes the hardware implementation more complicated. By exploiting the natural property of conventional SC decoder, datapaths for decoding constituent codes are compatibly built via computation units sharing technique. This approach does not incur additional hardware cost expect some multiplexer logic, but can significantly increase the decoding throughput. Other techniques such as pre-computing and gate-level optimization are used as well in order to further increase the decoding throughput. A specific designed partial sum generator (PSG) is also investigated in this dissertation. This PSG is hardware efficient and timing compatible with proposed TCSC decoder. Additionally, a polar code construction scheme with constituent codes optimization is also presents. This construction scheme aims to reduce the constituent codes based SC decoding latency. Results show that, compared with the state-of-art decoder, TCSC can achieve at least 60% latency reduction for the codes with length n = 1024. By using Nangate FreePDK 45nm process, TCSC decoder can reach throughput up to 5.81 Gbps and 2.01 Gbps for (1024, 870) and (1024, 512) polar code, respectively. Besides, with the proposed construction scheme, the TCSC decoder generally is able to further achieve at least around 20% latency deduction with an negligible gain loss. Overlapped List Successive Cancellation (OLSC) is proposed for LSC decoding as a design approach. LSC decoding has a better performance than LS decoding at the cost of hardware consumption. With such approach, the l (l > 1) instances of successive cancellation (SC) decoder for LSC with list size l can be cut down to only one. This results in a dramatic reduction of the hardware complexity without any decoding performance loss. Meanwhile, approaches to reduce the latency associated with the pipeline scheme are also investigated. Simulation results show that with proposed design approach the hardware efficiency is increased significantly over the recently proposed LSC decoders. Express Journey Belief Propagation (XJBP) is proposed for BP decoding. This idea origins from extending the constituent codes concept from SC to BP decoding. Express journey refers to the datapath of specific constituent codes in the factor graph, which accelerates the belief information propagation speed. The XJBP decoder is able to achieve 40.6% computational complexity reduction with the conventional BP decoding. This enables an energy efficient hardware implementation. In summary, all the efforts to optimize the polar code decoder are presented in this dissertation, supported by the careful analysis, precise description, extensively numerical simulations, thoughtful discussion and RTL implementation on VLSI design platforms

    Analog information decoding of bosonic quantum LDPC codes

    Full text link
    Quantum error correction is crucial for scalable quantum information processing applications. Traditional discrete-variable quantum codes that use multiple two-level systems to encode logical information can be hardware-intensive. An alternative approach is provided by bosonic codes, which use the infinite-dimensional Hilbert space of harmonic oscillators to encode quantum information. Two promising features of bosonic codes are that syndrome measurements are natively analog and that they can be concatenated with discrete-variable codes. In this work, we propose novel decoding methods that explicitly exploit the analog syndrome information obtained from the bosonic qubit readout in a concatenated architecture. Our methods are versatile and can be generally applied to any bosonic code concatenated with a quantum low-density parity-check (QLDPC) code. Furthermore, we introduce the concept of quasi-single-shot protocols as a novel approach that significantly reduces the number of repeated syndrome measurements required when decoding under phenomenological noise. To realize the protocol, we present a first implementation of time-domain decoding with the overlapping window method for general QLDPC codes, and a novel analog single-shot decoding method. Our results lay the foundation for general decoding algorithms using analog information and demonstrate promising results in the direction of fault-tolerant quantum computation with concatenated bosonic-QLDPC codes.Comment: 30 pages, 15 figure

    Polar coding for optical wireless communication

    Get PDF

    Energy-Efficient Decoders of Near-Capacity Channel Codes.

    Full text link
    Channel coding has become essential in state-of-the-art communication and storage systems for ensuring reliable transmission and storage of information. Their goal is to achieve high transmission reliability while keeping the transmit energy consumption low by taking advantage of the coding gain provided by these codes. The lowest total system energy is achieved with a decoder that provides both good coding gain and high energy-efficiency. This thesis demonstrates the VLSI implementation of near-capacity channel decoders using the LDPC, nonbinary LDPC (NB-LDPC) and polar codes with an emphasis of reducing the decode energy. LDPC code is a widely used channel code due to its excellent error-correcting performance. However, memory dominates the power of high-throughput LDPC decoders. Therefore, these memories are replaced with a novel non-refresh embedded DRAM (eDRAM) taking advantage of the deterministic memory access pattern and short access window of the decoding algorithm to trade off retention time for faster access speed. The resulting LDPC decoder with integrated eDRAMs achieves state-of-the-art area- and energy-efficiency. NB-LDPC code achieves better error-correcting performance than LDPC code at the cost of higher decoding complexity. However, the factor graph is simplified, permitting a fully parallel architecture with low wiring overhead. To reduce the dynamic power of the decoder, a fine-grained dynamic clock gating technique is applied based on node-level convergence. This technique greatly reduces dynamic power allowing the decoder to achieve high energy-efficiency while achieving high throughput. The recently invented polar code has a similar error-correcting performance to LDPC code of comparable block length. However, the easy reconfigurability of code rate as well as block length makes it desirable in numerous applications where LDPC is not competitive. In addition, the regular structure and simple processing enables a highly efficient decoder in terms of area and power. Using the belief propagation algorithm with architectural and memory improvements, a polar decoder is demonstrated achieving high throughput and high energy- and area-efficiency. The demonstrated energy-efficient decoders have advanced the state-of-the-art. The decoders will allow the continued reduction of decode energy for the latest communication and storage applications. The developed techniques are widely applicable to designing low-power DSP processors.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/108731/1/parkyoun_1.pd

    Domain specific high performance reconfigurable architecture for a communication platform

    Get PDF

    Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

    Full text link
    With the increasing digital services demand, performance and power-efficiency become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling has been facing significant challenges of device uncertainties, such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made in each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safe margins can hardly be properly exploited. To tackle the challenge, it is therefore advised in this Ph.D. thesis to perform a cross-layer optimization for digital signal processing circuits and systems, to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. These sparsely occurred circuit errors produced by aggressive voltage over-scaling are mitigated by higher level error resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (soft-errors and timing-errors, respectively). Applications like Massive MIMO systems are robust against lower level errors, thanks to the intrinsically redundant antennas. This property makes it applicable to embrace digital hardware that trades quality for power savings.Comment: 190 page

    Advances in Bosonic Quantum Error Correction with Gottesman-Kitaev-Preskill Codes: Theory, Engineering and Applications

    Full text link
    Encoding quantum information into a set of harmonic oscillators is considered a hardware efficient approach to mitigate noise for reliable quantum information processing. Various codes have been proposed to encode a qubit into an oscillator -- including cat codes, binomial codes and Gottesman-Kitaev-Preskill (GKP) codes. These bosonic codes are among the first to reach a break-even point for quantum error correction. Furthermore, GKP states not only enable close-to-optimal quantum communication rates in bosonic channels, but also allow for error correction of an oscillator into many oscillators. This review focuses on the basic working mechanism, performance characterization, and the many applications of GKP codes, with emphasis on recent experimental progress in superconducting circuit architectures and theoretical progress in multimode GKP qubit codes and oscillators-to-oscillators (O2O) codes. We begin with a preliminary continuous-variable formalism needed for bosonic codes. We then proceed to the quantum engineering involved to physically realize GKP states. We take a deep dive into GKP stabilization and preparation in superconducting architectures and examine proposals for realizing GKP states in the optical domain (along with a concise review of GKP realization in trapped-ion platforms). Finally, we present multimode GKP qubits and GKP-O2O codes, examine code performance and discuss applications of GKP codes in quantum information processing tasks such as computing, communication, and sensing.Comment: 77+5 pages, 31 figures. Minor bugs fixed in v2. comments are welcome

    Reply to “Comments on Techniques and Architectures for Hazard-Free Semi-Parallel Decoding of LDPC Codes”

    No full text
    This is a reply to the comments by Gunnam et al.“Comments on ‘Techniques and architectures for hazard-free semi-parallel decoding of LDPC codes” EURASIP Journal on Embedded Systems,vol.2009,ArticleID704174 on our recent work “Techniques and architectures for hazard-free semi-parallel decoding of LDPC codes”,EURASIP Journal on Embedded Systems,vol.2009,ArticleID723465
    corecore