509 research outputs found

    Partial Sums Generation Architecture for Successive Cancellation Decoding of Polar Codes

    Full text link
    Polar codes are a new family of error correction codes for which efficient hardware architectures have to be defined for the encoder and the decoder. Polar codes are decoded using the successive cancellation decoding algorithm that includes partial sums computations. We take advantage of the recursive structure of polar codes to introduce an efficient partial sums computation unit that can also implements the encoder. The proposed architecture is synthesized for several codelengths in 65nm ASIC technology. The area of the resulting design is reduced up to 26% and the maximum working frequency is improved by ~25%.Comment: Submitted to IEEE Workshop on Signal Processing Systems (SiPS)(26 April 2012). Accepted (28 June 2013

    Partial Sums Computation In Polar Codes Decoding

    Full text link
    Polar codes are the first error-correcting codes to provably achieve the channel capacity but with infinite codelengths. For finite codelengths the existing decoder architectures are limited in working frequency by the partial sums computation unit. We explain in this paper how the partial sums computation can be seen as a matrix multiplication. Then, an efficient hardware implementation of this product is investigated. It has reduced logic resources and interconnections. Formalized architectures, to compute partial sums and to generate the bits of the generator matrix k^n, are presented. The proposed architecture allows removing the multiplexing resources used to assigned to each processing elements the required partial sums.Comment: Accepted to ISCAS 201

    Algorithm Development and VLSI Implementation of Energy Efficient Decoders of Polar Codes

    Get PDF
    With its low error-floor performance, polar codes attract significant attention as the potential standard error correction code (ECC) for future communication and data storage. However, the VLSI implementation complexity of polar codes decoders is largely influenced by its nature of in-series decoding. This dissertation is dedicated to presenting optimal decoder architectures for polar codes. This dissertation addresses several structural properties of polar codes and key properties of decoding algorithms that are not dealt with in the prior researches. The underlying concept of the proposed architectures is a paradigm that simplifies and schedules the computations such that hardware is simplified, latency is minimized and bandwidth is maximized. In pursuit of the above, throughput centric successive cancellation (TCSC) and overlapping path list successive cancellation (OPLSC) VLSI architectures and express journey BP (XJBP) decoders for the polar codes are presented. An arbitrary polar code can be decomposed by a set of shorter polar codes with special characteristics, those shorter polar codes are referred to as constituent polar codes. By exploiting the homogeneousness between decoding processes of different constituent polar codes, TCSC reduces the decoding latency of the SC decoder by 60% for codes with length n = 1024. The error correction performance of SC decoding is inferior to that of list successive cancellation decoding. The LSC decoding algorithm delivers the most reliable decoding results; however, it consumes most hardware resources and decoding cycles. Instead of using multiple instances of decoding cores in the LSC decoders, a single SC decoder is used in the OPLSC architecture. The computations of each path in the LSC are arranged to occupy the decoder hardware stages serially in a streamlined fashion. This yields a significant reduction of hardware complexity. The OPLSC decoder has achieved about 1.4 times hardware efficiency improvement compared with traditional LSC decoders. The hardware efficient VLSI architectures for TCSC and OPLSC polar codes decoders are also introduced. Decoders based on SC or LSC algorithms suffer from high latency and limited throughput due to their serial decoding natures. An alternative approach to decode the polar codes is belief propagation (BP) based algorithm. In BP algorithm, a graph is set up to guide the beliefs propagated and refined, which is usually referred to as factor graph. BP decoding algorithm allows decoding in parallel to achieve much higher throughput. XJBP decoder facilitates belief propagation by utilizing the specific constituent codes that exist in the conventional factor graph, which results in an express journey (XJ) decoder. Compared with the conventional BP decoding algorithm for polar codes, the proposed decoder reduces the computational complexity by about 40.6%. This enables an energy-efficient hardware implementation. To further explore the hardware consumption of the proposed XJBP decoder, the computations scheduling is modeled and analyzed in this dissertation. With discussions on different hardware scenarios, the optimal scheduling plans are developed. A novel memory-distributed micro-architecture of the XJBP decoder is proposed and analyzed to solve the potential memory access problems of the proposed scheduling strategy. The register-transfer level (RTL) models of the XJBP decoder are set up for comparisons with other state-of-the-art BP decoders. The results show that the power efficiency of BP decoders is improved by about 3 times

    A Multi-Kernel Multi-Code Polar Decoder Architecture

    Get PDF
    Polar codes have received increasing attention in the past decade, and have been selected for the next generation of wireless communication standard. Most research on polar codes has focused on codes constructed from a 2×22\times2 polarization matrix, called binary kernel: codes constructed from binary kernels have code lengths that are bound to powers of 22. A few recent works have proposed construction methods based on multiple kernels of different dimensions, not only binary ones, allowing code lengths different from powers of 22. In this work, we design and implement the first multi-kernel successive cancellation polar code decoder in literature. It can decode any code constructed with binary and ternary kernels: the architecture, sized for a maximum code length NmaxN_{max}, is fully flexible in terms of code length, code rate and kernel sequence. The decoder can achieve frequency of more than 11 GHz in 6565 nm CMOS technology, and a throughput of 615615 Mb/s. The area occupation ranges between 0.110.11 mm2^2 for Nmax=256N_{max}=256 and 2.012.01 mm2^2 for Nmax=4096N_{max}=4096. Implementation results show an unprecedented degree of flexibility: with Nmax=4096N_{max}=4096, up to 5555 code lengths can be decoded with the same hardware, along with any kernel sequence and code rate

    Algorithm Development and VLSI Implementation of Energy Efficient Decoders of Polar Codes

    Get PDF
    With its low error-floor performance, polar codes attract significant attention as the potential standard error correction code (ECC) for future communication and data storage. However, the VLSI implementation complexity of polar codes decoders is largely influenced by its nature of in-series decoding. This dissertation is dedicated to presenting optimal decoder architectures for polar codes. This dissertation addresses several structural properties of polar codes and key properties of decoding algorithms that are not dealt with in the prior researches. The underlying concept of the proposed architectures is a paradigm that simplifies and schedules the computations such that hardware is simplified, latency is minimized and bandwidth is maximized. In pursuit of the above, throughput centric successive cancellation (TCSC) and overlapping path list successive cancellation (OPLSC) VLSI architectures and express journey BP (XJBP) decoders for the polar codes are presented. An arbitrary polar code can be decomposed by a set of shorter polar codes with special characteristics, those shorter polar codes are referred to as constituent polar codes. By exploiting the homogeneousness between decoding processes of different constituent polar codes, TCSC reduces the decoding latency of the SC decoder by 60% for codes with length n = 1024. The error correction performance of SC decoding is inferior to that of list successive cancellation decoding. The LSC decoding algorithm delivers the most reliable decoding results; however, it consumes most hardware resources and decoding cycles. Instead of using multiple instances of decoding cores in the LSC decoders, a single SC decoder is used in the OPLSC architecture. The computations of each path in the LSC are arranged to occupy the decoder hardware stages serially in a streamlined fashion. This yields a significant reduction of hardware complexity. The OPLSC decoder has achieved about 1.4 times hardware efficiency improvement compared with traditional LSC decoders. The hardware efficient VLSI architectures for TCSC and OPLSC polar codes decoders are also introduced. Decoders based on SC or LSC algorithms suffer from high latency and limited throughput due to their serial decoding natures. An alternative approach to decode the polar codes is belief propagation (BP) based algorithm. In BP algorithm, a graph is set up to guide the beliefs propagated and refined, which is usually referred to as factor graph. BP decoding algorithm allows decoding in parallel to achieve much higher throughput. XJBP decoder facilitates belief propagation by utilizing the specific constituent codes that exist in the conventional factor graph, which results in an express journey (XJ) decoder. Compared with the conventional BP decoding algorithm for polar codes, the proposed decoder reduces the computational complexity by about 40.6%. This enables an energy-efficient hardware implementation. To further explore the hardware consumption of the proposed XJBP decoder, the computations scheduling is modeled and analyzed in this dissertation. With discussions on different hardware scenarios, the optimal scheduling plans are developed. A novel memory-distributed micro-architecture of the XJBP decoder is proposed and analyzed to solve the potential memory access problems of the proposed scheduling strategy. The register-transfer level (RTL) models of the XJBP decoder are set up for comparisons with other state-of-the-art BP decoders. The results show that the power efficiency of BP decoders is improved by about 3 times

    System Development and VLSI Implementation of High Throughput and Hardware Efficient Polar Code Decoder

    Get PDF
    Polar code is the first channel code which is provable to achieve the Shannon capacity. Additionally, it has a very good performance in terms of low error floor. All these merits make it a potential candidate for the future standard of wireless communication or storage system. Polar code is received increasing research interest these years. However, the hardware implementation of hardware decoder still has not meet the expectation of practical applications, no matter from neither throughput aspect nor hardware efficient aspect. This dissertation presents several system development approaches and hardware structures for three widely known decoding algorithms. These algorithms are successive cancellation (SC), list successive cancellation (LSC) and belief propagation (BP). All the efforts are in order to maximize the throughput meanwhile minimize the hardware cost. Throughput centric successive cancellation (TCSC) decoder is proposed for SC decoding. By introducing the concept of constituent code, the decoding latency is significantly reduced with a negligible decoding performance loss. However, the specifically designed computation unites dramatically increase the hardware cost, and how to handle the conventional polar code sets and constituent codes sets makes the hardware implementation more complicated. By exploiting the natural property of conventional SC decoder, datapaths for decoding constituent codes are compatibly built via computation units sharing technique. This approach does not incur additional hardware cost expect some multiplexer logic, but can significantly increase the decoding throughput. Other techniques such as pre-computing and gate-level optimization are used as well in order to further increase the decoding throughput. A specific designed partial sum generator (PSG) is also investigated in this dissertation. This PSG is hardware efficient and timing compatible with proposed TCSC decoder. Additionally, a polar code construction scheme with constituent codes optimization is also presents. This construction scheme aims to reduce the constituent codes based SC decoding latency. Results show that, compared with the state-of-art decoder, TCSC can achieve at least 60% latency reduction for the codes with length n = 1024. By using Nangate FreePDK 45nm process, TCSC decoder can reach throughput up to 5.81 Gbps and 2.01 Gbps for (1024, 870) and (1024, 512) polar code, respectively. Besides, with the proposed construction scheme, the TCSC decoder generally is able to further achieve at least around 20% latency deduction with an negligible gain loss. Overlapped List Successive Cancellation (OLSC) is proposed for LSC decoding as a design approach. LSC decoding has a better performance than LS decoding at the cost of hardware consumption. With such approach, the l (l > 1) instances of successive cancellation (SC) decoder for LSC with list size l can be cut down to only one. This results in a dramatic reduction of the hardware complexity without any decoding performance loss. Meanwhile, approaches to reduce the latency associated with the pipeline scheme are also investigated. Simulation results show that with proposed design approach the hardware efficiency is increased significantly over the recently proposed LSC decoders. Express Journey Belief Propagation (XJBP) is proposed for BP decoding. This idea origins from extending the constituent codes concept from SC to BP decoding. Express journey refers to the datapath of specific constituent codes in the factor graph, which accelerates the belief information propagation speed. The XJBP decoder is able to achieve 40.6% computational complexity reduction with the conventional BP decoding. This enables an energy efficient hardware implementation. In summary, all the efforts to optimize the polar code decoder are presented in this dissertation, supported by the careful analysis, precise description, extensively numerical simulations, thoughtful discussion and RTL implementation on VLSI design platforms
    • …
    corecore