13 research outputs found

    Polar Code decoder exploration framework

    The increasing demand for fast wireless communications requires sophisticated baseband signal processing. One of the computationally intensive tasks here is advanced Forward Error Correction (FEC), especially the decoding. Finding efficient hardware implementations for sophisticated FEC decoding algorithms that fulfill throughput demands under strict implementation constraints is an active research topic due to increasing throughput, low latency, and high energy efficiency requirements. This paper focuses on Polar Codes, which are currently the subject of intense research. We present a modular framework to automatically generate and evaluate a wide range of Polar Code decoders, with emphasis on design space exploration for efficient hardware architectures. To demonstrate the efficiency of our framework, we show an automatically generated, very high throughput Soft Cancellation (SCAN) Polar Code decoder. This decoder is, to the best of our knowledge, the fastest SCAN Polar Code decoder published so far.

    Hardware implementation aspects of polar decoders and ultra high-speed LDPC decoders

    The goal of channel coding is to detect and correct errors that appear during the transmission of information. In the past few decades, channel coding has become an integral part of most communications standards as it improves the energy efficiency of transceivers manyfold while only requiring a modest investment in terms of the required digital signal processing capabilities. The most commonly used channel codes in modern standards are low-density parity-check (LDPC) codes and Turbo codes, which were the first two types of codes to approach the capacity of several channels while still being practically implementable in hardware. The decoding algorithms for LDPC codes, in particular, are highly parallelizable and suitable for high-throughput applications. A new class of channel codes, called polar codes, was introduced recently. Polar codes have an explicit construction and low-complexity encoding and successive cancellation (SC) decoding algorithms. Moreover, polar codes are provably capacity achieving over a wide range of channels, making them very attractive from a theoretical perspective. Unfortunately, polar codes under standard SC decoding cannot compete with the LDPC and Turbo codes that are used in current standards in terms of their error-correcting performance. For this reason, several improved SC-based decoding algorithms have been introduced. The most prominent SC-based decoding algorithm is the successive cancellation list (SCL) decoding algorithm, which is powerful enough to approach the error-correcting performance of LDPC codes. The original SCL decoding algorithm was described in an arithmetic domain that is not well suited for hardware implementations, and it is not clear how an efficient SCL decoder architecture can be implemented.
    To this end, in this thesis, we re-formulate the SCL decoding algorithm in two distinct arithmetic domains, we describe efficient hardware architectures to implement the resulting SCL decoders, and we compare the decoders with existing LDPC and Turbo decoders in terms of their error-correcting performance and their implementation efficiency. Due to the ongoing technology scaling, the feature sizes of integrated circuits keep shrinking at a remarkable pace. As transistors and memory cells keep shrinking, it becomes increasingly difficult and costly (in terms of both area and power) to ensure that the implemented digital circuits always operate correctly. Thus, manufactured digital signal processing circuits, including channel decoder circuits, may not always operate correctly. Instead of discarding these faulty dies or using costly circuit-level fault mitigation mechanisms, an alternative approach is to try to live with certain malfunctions, provided that the algorithm implemented by the circuit is sufficiently fault-tolerant. In this spirit, in this thesis we examine decoding of polar codes and LDPC codes under the assumption that the memories that are used within the decoders are not fully reliable. We show that, in both cases, there is inherent fault tolerance, and we also propose some methods to reduce the effect of memory faults on the error-correcting performance of the considered decoders.
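
The low-complexity encoding and successive cancellation (SC) decoding mentioned in this abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the natural (non-bit-reversed) bit ordering, the min-sum approximation of the check-node update, the LLR sign convention (positive means bit 0), and the N = 8 frozen set are all assumptions chosen for illustration.

```python
import math

def polar_encode(u):
    """Encode bit list u (length a power of two) with the 2x2 polar kernel,
    natural bit order: x = [(u_a xor u_b) G, u_b G]."""
    n = len(u)
    if n == 1:
        return list(u)
    half = n // 2
    upper = polar_encode([u[i] ^ u[i + half] for i in range(half)])
    lower = polar_encode(u[half:])
    return upper + lower

def f_minsum(a, b):
    """Min-sum approximation of the LLR of the XOR of two bits."""
    return math.copysign(1.0, a) * math.copysign(1.0, b) * min(abs(a), abs(b))

def sc_decode(llr, frozen):
    """Recursive SC decoder. Returns (u_hat, x_hat), where x_hat is the
    re-encoded estimate (the partial sums) for this sub-block."""
    n = len(llr)
    if n == 1:
        # Frozen bits are known to be 0; otherwise take a hard decision.
        bit = 0 if (frozen[0] or llr[0] >= 0) else 1
        return [bit], [bit]
    half = n // 2
    a, b = llr[:half], llr[half:]
    # First half of u: combine the two channel halves with the f function.
    u_up, v1 = sc_decode([f_minsum(a[i], b[i]) for i in range(half)],
                         frozen[:half])
    # Second half of u: g function, using the partial sums v1.
    u_lo, v2 = sc_decode([b[i] + (1 - 2 * v1[i]) * a[i] for i in range(half)],
                         frozen[half:])
    x_hat = [v1[i] ^ v2[i] for i in range(half)] + v2
    return u_up + u_lo, x_hat

# Hypothetical rate-1/2 example with N = 8 (frozen positions 0, 1, 2, 4).
frozen = [True, True, True, False, True, False, False, False]
u = [0, 0, 0, 1, 0, 1, 0, 1]
x = polar_encode(u)
llr = [4.0 if bit == 0 else -4.0 for bit in x]  # noiseless channel LLRs
u_hat, x_hat = sc_decode(llr, frozen)
print(u_hat == u)
```

The strictly sequential dependence of the second recursive call on the partial sums of the first is what makes SC decoding inherently serial; list decoding (SCL) keeps several candidate paths alive at each information bit instead of committing to one hard decision.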

    Energy Consumption Analysis of Software Polar Decoders on Low Power Processors

    This paper presents a new dynamic and fully generic implementation of a Successive Cancellation (SC) decoder (multi-precision support and intra-/inter-frame strategy support). This fully generic SC decoder is used to compare the different configurations in terms of throughput, latency and energy consumption. Special emphasis is placed on energy consumption on low-power embedded processors for software defined radio (SDR) systems. A rate-1/2 software SC decoder with code length N = 4096 consumes only 14 nJ per bit on an ARM Cortex-A57 core, while achieving 65 Mbps. Some design guidelines are given in order to adapt the configuration to the application context.
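
To put the two reported figures in relation, energy per bit multiplied by throughput gives an implied average power. The short Python sketch below performs that arithmetic; it assumes that both figures refer to the same decoder configuration and operating point, which the abstract does not state explicitly, and the function name is hypothetical.

```python
def average_power_watts(energy_per_bit_joules, throughput_bits_per_second):
    """Implied average power: (J/bit) * (bit/s) = J/s = W."""
    return energy_per_bit_joules * throughput_bits_per_second

# Figures quoted in the abstract: 14 nJ per bit at 65 Mbps.
power = average_power_watts(14e-9, 65e6)
print(f"{power:.2f} W")  # prints "0.91 W"
```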

    Polar coding for optical wireless communication

    Algorithm Development and VLSI Implementation of Energy Efficient Decoders of Polar Codes

    With their low error-floor performance, polar codes attract significant attention as a potential standard error correction code (ECC) for future communication and data storage. However, the VLSI implementation complexity of polar code decoders is largely determined by the inherently serial nature of the decoding. This dissertation is dedicated to presenting optimal decoder architectures for polar codes. It addresses several structural properties of polar codes and key properties of decoding algorithms that are not dealt with in prior research. The underlying concept of the proposed architectures is a paradigm that simplifies and schedules the computations such that hardware is simplified, latency is minimized and bandwidth is maximized. In pursuit of the above, throughput centric successive cancellation (TCSC) and overlapping path list successive cancellation (OPLSC) VLSI architectures and express journey BP (XJBP) decoders for polar codes are presented. An arbitrary polar code can be decomposed into a set of shorter polar codes with special characteristics; those shorter polar codes are referred to as constituent polar codes. By exploiting the homogeneity between the decoding processes of different constituent polar codes, TCSC reduces the decoding latency of the SC decoder by 60% for codes with length n = 1024. The error correction performance of SC decoding is inferior to that of list successive cancellation (LSC) decoding. The LSC decoding algorithm delivers the most reliable decoding results; however, it consumes the most hardware resources and decoding cycles. Instead of using multiple instances of decoding cores as in conventional LSC decoders, a single SC decoder is used in the OPLSC architecture. The computations of each path in the LSC are arranged to occupy the decoder hardware stages serially in a streamlined fashion. This yields a significant reduction of hardware complexity.
    The OPLSC decoder achieves about 1.4 times better hardware efficiency than traditional LSC decoders. Hardware-efficient VLSI architectures for TCSC and OPLSC polar code decoders are also introduced. Decoders based on SC or LSC algorithms suffer from high latency and limited throughput due to their serial decoding nature. An alternative approach to decoding polar codes is the belief propagation (BP) algorithm. In the BP algorithm, a graph, usually referred to as a factor graph, is set up to guide how the beliefs are propagated and refined. The BP decoding algorithm allows decoding in parallel to achieve much higher throughput. The XJBP decoder facilitates belief propagation by utilizing the specific constituent codes that exist in the conventional factor graph, which results in an express journey (XJ) decoder. Compared with the conventional BP decoding algorithm for polar codes, the proposed decoder reduces the computational complexity by about 40.6%. This enables an energy-efficient hardware implementation. To further explore the hardware consumption of the proposed XJBP decoder, the scheduling of computations is modeled and analyzed in this dissertation. Optimal scheduling plans are developed for different hardware scenarios. A novel memory-distributed micro-architecture of the XJBP decoder is proposed and analyzed to solve the potential memory access problems of the proposed scheduling strategy. Register-transfer level (RTL) models of the XJBP decoder are set up for comparisons with other state-of-the-art BP decoders. The results show that the power efficiency of BP decoders is improved by about 3 times.
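
The decomposition of a polar code into constituent codes, as described in this abstract, can be sketched in Python by recursively splitting the frozen-bit pattern until each sub-block matches a simple node type. The four node types used here (rate-0, rate-1, repetition, single-parity-check) are the ones commonly used in the simplified SC decoding literature; whether the dissertation uses exactly this set is an assumption, and the function names and the N = 8 frozen pattern are hypothetical.

```python
def classify(frozen):
    """Return the constituent-code type of a sub-block, or None if the
    sub-block must be split further. frozen[i] is True for a frozen bit."""
    n = len(frozen)
    if all(frozen):
        return "rate-0"
    if not any(frozen):
        return "rate-1"
    if n >= 2 and all(frozen[:-1]) and not frozen[-1]:
        return "repetition"
    if n >= 2 and frozen[0] and not any(frozen[1:]):
        return "single-parity-check"
    return None

def decompose(frozen, start=0):
    """Split the frozen pattern into (start, length, type) constituent codes."""
    kind = classify(frozen)
    if kind is not None:
        return [(start, len(frozen), kind)]
    half = len(frozen) // 2
    return (decompose(frozen[:half], start)
            + decompose(frozen[half:], start + half))

# Hypothetical N = 8, rate-1/2 pattern (frozen positions 0, 1, 2, 4).
pattern = [True, True, True, False, True, False, False, False]
for node in decompose(pattern):
    print(node)
```

For this pattern the code splits into a length-4 repetition code followed by a length-4 single-parity-check code, each of which can be decoded in one step rather than by traversing its full sub-tree.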

    Embedded electronic systems driven by run-time reconfigurable hardware

    This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology, available through SRAM-based FPGA/SoC devices, with the aim of helping to improve people's quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration, in order to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space. This optimizes its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) in comparison with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for its exploitation in industry.

    Exploration architecturale pour le décodage de codes polaires

    Applications in the field of digital communications are becoming increasingly complex and diversified. As a result, correcting errors in transmitted messages has become an issue that must be dealt with. To address this problem, error correcting codes are used, in particular Polar Codes, which are the subject of this thesis. They were discovered recently (2008) by Arıkan and are considered an important discovery in the field of error correcting codes. Their practicality goes hand in hand with the ability to propose a hardware implementation of a decoder. This thesis focuses on the architectural exploration of Polar Code decoders implementing particular decoding algorithms. It revolves around two decoding algorithms: one returning hard decisions and one returning soft decisions.
    The first decoding algorithm treated in this thesis is the hard decision algorithm called "successive cancellation" (SC), as originally proposed. Analysis of implementations of SC decoders shows that the partial sum computation unit is complex. Moreover, this analysis shows that the amount of memory required limits the implementation of large decoders. The research conducted to solve these problems presents an original architecture, based on shift registers, to compute the partial sums. This architecture makes it possible to reduce the complexity and increase the maximum working frequency of this unit. We also propose a new methodology to redesign an existing decoder architecture, relatively simply, in order to reduce memory requirements. ASIC and FPGA syntheses were performed to characterize these contributions.
    The second decoding algorithm treated in this thesis is the soft decision algorithm called SCAN. The study of the state of the art shows that the only other implemented soft decision algorithm is the BP algorithm. However, it requires about fifty iterations to reach the decoding performance of the SC algorithm. In addition, its memory requirements make it impractical to implement for large code sizes. The interest of the SCAN algorithm lies in its performance, which is better than that of the BP algorithm with only two iterations. In addition, its lower memory footprint makes it more convenient and allows the implementation of larger decoders. We propose in this thesis a first implementation of this algorithm on FPGA targets. FPGA syntheses were carried out in order to compare the SCAN decoder with state-of-the-art BP decoders.
    The contributions proposed in this thesis reduce the complexity of the partial sum computation unit. Moreover, the amount of memory required by an SC decoder has been decreased. Finally, a SCAN decoder has been proposed that can be used in a communication chain with other blocks requiring soft inputs. This broadens the application field of Polar Codes.

    Optical Communication

    Optical communication is very useful in telecommunication systems, data processing and networking. An optical communication system consists of a transmitter that encodes a message into an optical signal, a channel that carries the signal to its desired destination, and a receiver that reproduces the message from the received optical signal. This book presents up-to-date results on communication systems, along with explanations of their relevance, from leading researchers in this field. The chapters cover general concepts of optical communication, components, systems, networks, signal processing and MIMO systems. Optical components and other enhanced signal processing functions, which have received attention in recent years, are also considered in depth for optical communication systems. The contributors also concentrate on optical devices, networking, signal processing, MIMO systems and other enhanced functions for optical communication. This book is targeted at research, development and design engineers in the manufacturing industry, academia and the telecommunication industry.