11 research outputs found

    A Parallel Radix-Sort-Based VLSI Architecture for Finding the First W Maximum/Minimum Values

    Get PDF
    Very-large-scale integration (VLSI) architectures for finding the first W (W>2) maximum (or minimum) values are required in the implementation of several applications such as nonbinary low-density-parity-check decoders, K-best multiple-input–multiple-output (MIMO) detectors, and turbo product codes. In this brief, a parallel radix-sort-based VLSI architecture for finding the first W maximum (or minimum) values is proposed. The described architecture, called Bit-Wise-And (BWA) architecture, relies on analyzing input data from the most significant bit to the least significant one, with very simple logic circuits. One key feature in the BWA architecture is its high level of scalability, which enables the adoption of this solution in a large spectrum of applications, corresponding to large ranges for both W and the size of the input data set. Experimental results, achieved by implementing the proposed architecture on a high-speed 90-nm CMOS standard-cell technology, show that BWA architecture requires significantly less area than other solutions available in the literature, i.e., less than or about 50% in all the considered cases and about 50% in the worst case. Moreover, the BWA architecture exhibits the lowest area–delay product among almost all considered cases

    A Simplified Min-Sum Decoding Algorithm for Non-Binary LDPC Codes

    Full text link
    Non-binary low-density parity-check codes are robust to various channel impairments. However, based on the existing decoding algorithms, the decoder implementations are expensive because of their excessive computational complexity and memory usage. Based on the combinatorial optimization, we present an approximation method for the check node processing. The simulation results demonstrate that our scheme has small performance loss over the additive white Gaussian noise channel and independent Rayleigh fading channel. Furthermore, the proposed reduced-complexity realization provides significant savings on hardware, so it yields a good performance-complexity tradeoff and can be efficiently implemented.Comment: Partially presented in ICNC 2012, International Conference on Computing, Networking and Communications. Accepted by IEEE Transactions on Communication

    One minimum only trellis decoder for non-binary low-density parity-check codes

    Full text link
    © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.A one minimum only decoder for Trellis-EMS (OMO T-EMS) and for Trellis-Min-max (OMO T-MM) is proposed in this paper. In this novel approach, we avoid computing the second minimum in messages of the check node processor, and propose efficient estimators to infer the second minimum value. By doing so, we greatly reduce the complexity and at the same time improve latency and throughput of the derived architectures compared to the existing implementations of EMS and Min-max decoders. This solution has been applied to various NB-LDPC codes constructed over different Galois fields and with different degree distributions showing in all cases negligible performance loss compared to the ideal EMS and Min-max algorithms. In addition, two complete decoders for OMO T-EMS and OMO T-MM were implemented for the (837,726) NB-LDPC code over GF(32) for comparison proposals. A 90 nm CMOS process was applied, achieving a throughput of 711 Mbps and 818 Mbps respectively at a clock frequency of 250 MHz, with an area of 19.02 rmmm2{rm mm}^{2} and 16.10 rmmm2{rm mm}^{2} after place and route. To the best knowledge of the authors, the proposed decoders have higher throughput and area-time efficiency than any other solution for high-rate NB-LDPC codes with high Galois field order.This work was supported in part by the Spanish Ministerio de Ciencia e Innovacion under Grant TEC2011-27916 and in part by the Universitat Politecnica de Valencia under Grant PAID-06-2012-SP20120625. The work of F. Garcia-Herrero was supported by the Spanish Ministerio de Educacion under Grant AP2010-5178. David Declercq has been funded by the Institut Universitaire de France for this project. This paper was recommended by Associate Editor Z. Zhang.Lacruz, JO.; García Herrero, FM.; Valls Coquillat, J.; Declercq, D. (2015). One minimum only trellis decoder for non-binary low-density parity-check codes. IEEE Transactions on Circuits and Systems I: Regular Papers. 62(1):177-184. https://doi.org/10.1109/TCSI.2014.2354753S17718462

    Low Complexity Reliability Based Message Passing Decoder Architecture For Non Binary LDPC Codes

    Get PDF
    Non-binary low-density parity-check (NB-LDPC) codes can achieve better error-correcting performance than their binary counterparts at the cost of higher decoding complexity when the codeword length is moderate. The recently developed iterative reliability-based majority-logic NB-LDPC decoding has better performance complexity tradeoffs than previous algorithms. This paper first proposes enhancement schemes to the iterative hard reliability-based majority-logic decoding (IHRB-MLGD). Compared to the IHRB algorithm, our enhanced (E)-IHRB algorithm can achieve significant coding gain with small hardware overhead. Then low-complexity partial-parallel NB-LDPC decoder architectures are developed based on these two algorithms.  Moreover, novel schemes are developed to keep a small proportion of messages in order to reduce the memory requirement without causing noticeable performance loss. In addition, a shift-message structure is proposed by using memories concatenated with variable node units to enable efficient partial-parallel decoding for cyclic NB-LDPC codes.  our proposed decoders have at least tens of times lower complexity with moderate coding gain loss

    High-Performance NB-LDPC Decoder With Reduction of Message Exchange

    Full text link
    © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.This paper presents a novel algorithm based on trellis min-max for decoding non-binary low-density parity-check (NB-LDPC) codes. This decoder reduces the number of messages exchanged between check node and variable node processors, which decreases the storage resources and the wiring congestion and, thus, increases the throughput of the decoder. Our frame error rate performance simulations show that the proposed algorithm has a negligible performance loss for high-rate codes with GF(16) and GF(32), and a performance loss smaller than 0.07 dB for high-rate codes over GF(64). In addition, a layered decoder architecture is presented and implemented on a 90-nm CMOS process for the following high-rate NB-LDPC codes: (2304, 2048) over GF(16), (837, 726) over GF(32), and (1536, 1344) over GF(64). In all cases, the achieved throughput is higher than 1 Gb/s.This work was supported in part by the Spanish Ministerio de Ciencia e Innovacion under Grant TEC2011-27916 and Grant TEC2012-38558-C02-02, and in part by Generalitat Valenciana under Grant GV/2014/011.Lacruz, JO.; García Herrero, FM.; Canet Subiela, MJ.; Valls Coquillat, J. (2016). High-Performance NB-LDPC Decoder With Reduction of Message Exchange. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 24(5):1950-1961. https://doi.org/10.1109/TVLSI.2015.2493041S1950196124

    VLSI architectures design for encoders of High Efficiency Video Coding (HEVC) standard

    Get PDF
    The growing popularity of high resolution video and the continuously increasing demands for high quality video on mobile devices are producing stronger needs for more efficient video encoder. Concerning these desires, HEVC, a newest video coding standard, has been developed by a joint team formed by ISO/IEO MPEG and ITU/T VCEG. Its design goal is to achieve a 50% compression gain over its predecessor H.264 with an equal or even higher perceptual video quality. Motion Estimation (ME) being as one of the most critical module in video coding contributes almost 50%-70% of computational complexity in the video encoder. This high consumption of the computational resources puts a limit on the performance of encoders, especially for full HD or ultra HD videos, in terms of coding speed, bit-rate and video quality. Thus the major part of this work concentrates on the computational complexity reduction and improvement of timing performance of motion estimation algorithms for HEVC standard. First, a new strategy to calculate the SAD (Sum of Absolute Difference) for motion estimation is designed based on the statistics on property of pixel data of video sequences. This statistics demonstrates the size relationship between the sum of two sets of pixels has a determined connection with the distribution of the size relationship between individual pixels from the two sets. Taking the advantage of this observation, only a small proportion of pixels is necessary to be involved in the SAD calculation. Simulations show that the amount of computations required in the full search algorithm is reduced by about 58% on average and up to 70% in the best case. Secondly, from the scope of parallelization an enhanced TZ search for HEVC is proposed using novel schemes of multiple MVPs (motion vector predictor) and shared MVP. Specifically, resorting to multiple MVPs the initial search process is performed in parallel at multiple search centers, and the ME processing engine for PUs within one CU are parallelized based on the MVP sharing scheme on CU (coding unit) level. Moreover, the SAD module for ME engine is also parallelly implemented for PU size of 32×32. Experiments indicate it achieves an appreciable improvement on the throughput and coding efficiency of the HEVC video encoder. In addition, the other part of this thesis is contributed to the VLSI architecture design for finding the first W maximum/minimum values targeting towards high speed and low hardware cost. The architecture based on the novel bit-wise AND scheme has only half of the area of the best reference solution and its critical path delay is comparable with other implementations. While the FPCG (full parallel comparison grid) architecture, which utilizes the optimized comparator-based structure, achieves 3.6 times faster on average on the speed and even 5.2 times faster at best comparing with the reference architectures. Finally the architecture using the partial sorting strategy reaches a good balance on the timing performance and area, which has a slightly lower or comparable speed with FPCG architecture and a acceptable hardware cost

    Etude des décodeurs LDPC non-binaires

    Get PDF
    Binary Low-Density Parity-Check (LDPC) codes and turbo-codes are known to have near-capacity performance for long code lengths. However, these codes are less efficient for short and moderate code lengths. In addition, the combination of binary codes with high-order modulations requires a marginalization step to extract bits reliabilities from symbols reliablities. Thus, binary demodulation suffers from a loss of information that can be recovered using iterative demodulators at the expense of higher complexity. LDPC codes defined over finite fields of order q > 2 can be considered as a solution to these problems. Nevertheless, optimal decoding of non-binary LDPC codes suffers from extremely highcomplexity which almost prevents practical implementation. In this thesis we aim at proving the feasibility of using non-binary LDPC codes in modern communication systems by proposing on the one hand a low-complexity decoder architecture based on a sub-optimal decoding algorithm, and showing on the other hand the advantages of combining such codes with high-order modulations. In the first part of our thesis, we propose to simplify the Extended Min-Sum (EMS) algorithm by considering a limited number n_α 2 permettent de résoudre ces problèmes. Toutefois, lesdécodeurs optimaux associés ont une complexité très importante qui rend leur utilisation problématique.L’objectif de cette thèse est de valoriser les codes LDPC non binaires en proposant d’une part une architecture d’un décodeur à complexité réduite et en montrant d’autre part l’intérêt de les associer à des modulations d’ordre élevé. Dans la première partie de notre thèse, nous proposons de simplifier l’algorithme de décodage Extended Min-Sum (EMS) en considérant un nombre limité n_α << n_m des fiabilités intrinsèques lors de la mise à jour des messages par les nœuds de variable. Cette approche permet de réduire la taille de la mémoire dédiée au stockage des messages intrinsèques. De plus, pour améliorer l’effcacité des nœuds de parité nous proposons une variante simplifiée de l’algorithme L-Bubble Check et l’architecture associée. Enfin, nous montrons par l’intermédiaire d’un prototype sur une carte FPGA (Field Programmable Gate Array) que notre décodeur possède une faible complexité en le comparant avec un ancien décodeur EMS conçu par notre laboratoire de recherche dans le cadre du projet européen DAVINCI. Dans la deuxième partie, nous étudions l’association des codes LDPC non binaires avec une modulation par décalage cyclique de code (Cyclic Code-shift Keying, CCSK) de même ordre. Nous avons choisi cette modulation pour ses propriétés qui permettent de réduire la complexité du démodulateur. En effet, nous montrons qu’il est possible dans le cas d’un système de transmission mono-porteuse avec préfixe cyclique de fusionner le démodulateur et l’égaliseur dans un même bloc comportant une seule transformée de Fourier rapide et une seule transformée de Fourier rapide inverse. Les simulations montrent que ce système possède des performances comparables à un système de transmission multiporteuses de type OFDM (Orthogonal Frequency-Division Multiplexing). Elles montrent aussi que la modulation CCSK donne des performances meilleures que la modulation de Hadamard dans un canal en environnement intérieur sélectif en fréquence. Enfin, les simulations montrent que les codes LDPC non binaires sont nettement plus effcaces avec la modulation CCSK que les codes LDPC binaires même en considérant une démodulation itérative

    Architectures for soft-decision decoding of non-binary codes

    Full text link
    En esta tesis se estudia el dise¿no de decodificadores no-binarios para la correcci'on de errores en sistemas de comunicaci'on modernos de alta velocidad. El objetivo es proponer soluciones de baja complejidad para los algoritmos de decodificaci'on basados en los c'odigos de comprobaci'on de paridad de baja densidad no-binarios (NB-LDPC) y en los c'odigos Reed-Solomon, con la finalidad de implementar arquitecturas hardware eficientes. En la primera parte de la tesis se analizan los cuellos de botella existentes en los algoritmos y en las arquitecturas de decodificadores NB-LDPC y se proponen soluciones de baja complejidad y de alta velocidad basadas en el volteo de s'¿mbolos. En primer lugar, se estudian las soluciones basadas en actualizaci'on por inundaci 'on con el objetivo de obtener la mayor velocidad posible sin tener en cuenta la ganancia de codificaci'on. Se proponen dos decodificadores diferentes basados en clipping y t'ecnicas de bloqueo, sin embargo, la frecuencia m'axima est'a limitada debido a un exceso de cableado. Por este motivo, se exploran algunos m'etodos para reducir los problemas de rutado en c'odigos NB-LDPC. Como soluci'on se propone una arquitectura basada en difusi'on parcial para algoritmos de volteo de s'¿mbolos que mitiga la congesti'on por rutado. Como las soluciones de actualizaci 'on por inundaci'on de mayor velocidad son sub-'optimas desde el punto de vista de capacidad de correci'on, decidimos dise¿nar soluciones para la actualizaci'on serie, con el objetivo de alcanzar una mayor velocidad manteniendo la ganancia de codificaci'on de los algoritmos originales de volteo de s'¿mbolo. Se presentan dos algoritmos y arquitecturas de actualizaci'on serie, reduciendo el 'area y aumentando de la velocidad m'axima alcanzable. Por 'ultimo, se generalizan los algoritmos de volteo de s'¿mbolo y se muestra como algunos casos particulares puede lograr una ganancia de codificaci'on cercana a los algoritmos Min-sum y Min-max con una menor complejidad. Tambi'en se propone una arquitectura eficiente, que muestra que el 'area se reduce a la mitad en comparaci'on con una soluci'on de mapeo directo. En la segunda parte de la tesis, se comparan algoritmos de decodificaci'on Reed- Solomon basados en decisi'on blanda, concluyendo que el algoritmo de baja complejidad Chase (LCC) es la soluci'on m'as eficiente si la alta velocidad es el objetivo principal. Sin embargo, los esquemas LCC se basan en la interpolaci'on, que introduce algunas limitaciones hardware debido a su complejidad. Con el fin de reducir la complejidad sin modificar la capacidad de correcci'on, se propone un esquema de decisi'on blanda para LCC basado en algoritmos de decisi'on dura. Por 'ultimo se dise¿na una arquitectura eficiente para este nuevo esquemaGarcía Herrero, FM. (2013). Architectures for soft-decision decoding of non-binary codes [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/33753TESISPremiad

    High Performance Decoder Architectures for Error Correction Codes

    Get PDF
    Due to the rapid development of the information industry, modern communication and storage systems require much higher data rates and reliability to server various demanding applications. However, these systems suffer from noises from the practical channels. Various error correction codes (ECCs), such as Reed-Solomon (RS) codes, convolutional codes, turbo codes, Low-Density Parity-Check (LDPC) codes and so on, have been adopted in lots of current standards. With the increasing data rate, the research of more advanced ECCs and the corresponding efficient decoders will never stop.Binary LDPC codes have been adopted in lots of modern communication and storage applications due their superior error performance and efficient hardware decoder implementations. Non-binary LDPC (NB-LDPC) codes are an important extension of traditional binary LDPC codes. Compared with its binary counterpart, NB-LDPC codes show better error performance under short to moderate block lengths and higher order modulations. Moreover, NB-LDPC codes have lower error floor than binary LDPC codes. In spite of the excellent error performance, it is hard for current communication and storage systems to adopt NB-LDPC codes due to complex decoding algorithms and decoder architectures. In terms of hardware implementation, current NB-LDPC decoders need much larger area and achieve much lower data throughput.Besides the recently proposed NB-LDPC codes, polar codes, discovered by Ar{\i}kan, appear as a very promising candidate for future communication and storage systems. Polar codes are considered as a major breakthrough in recent coding theory society. Polar codes are proved to be capacity achieving codes over binary input symmetric memoryless channels. Besides, polar codes can be decoded by the successive cancelation (SC) algorithm with of complexity of O(Nlog2N)\mathcal{O}(N\log_2 N), where NN is the block length. The main sticking point of polar codes to date is that their error performance under short to moderate block lengths is inferior compared with LDPC codes or turbo codes. The list decoding technique can be used to improve the error performance of SC algorithms at the cost higher computational and memory complexities. Besides, the hardware implementation of current SC based decoders suffer from long decoding latency which is unsuitable for modern high speed communications.ECCs also find their applications in improving the reliability of network coding. Random linear network coding is an efficient technique for disseminating information in networks, but it is highly susceptible to errors. K\ {o}tter-Kschischang (KK) codes and Mahdavifar-Vardy (MV) codes are two important families of subspace codes that provide error control in noncoherent random linear network coding. List decoding has been used to decode MV codes beyond half distance. Existing hardware implementations of the rank metric decoder for KK codes suffer from limited throughput, long latency and high area complexity. The interpolation-based list decoding algorithm for MV codes still has high computational complexity, and its feasibility for hardware implementations has not been investigated.In this exam, we present efficient decoding algorithms and hardware decoder architectures for NB-LDPC codes, polar codes, KK and MV codes. For NB-LDPC codes, an efficient shuffled decoder architecture is presented to reduce the number of average iterations and improve the throughput. Besides, a fully parallel decoder architecture for NB-LDPC codes with short or moderate block lengths is also presented. Our fully parallel decoder architecture achieves much higher throughput and area efficiency compared with the state-of-art NB-LDPC decoders. For polar codes, a memory efficient list decoder architecture is first presented. Based on our reduced latency list decoding algorithm for polar codes, a high throughput list decoder architecture is also presented. At last, we present efficient decoder architectures for both KK and MV codes
    corecore