7 research outputs found

    An efficient GPU implementation of fixed-complexity sphere decoders for MIMO wireless systems

    The use of many-core processors such as general-purpose graphics processing units (GPUs) has recently become attractive for the efficient implementation of signal processing algorithms for communication systems, owing to the cost-effectiveness of GPUs and their capacity for parallel processing. This paper presents a GPU implementation of the widely employed fixed-complexity sphere decoder, which considerably decreases the computation time required for the data detection stage in multiple-input multiple-output (MIMO) systems. Both the hard- and soft-output versions of the method have been implemented. Speedup results show that the proposed GPU implementation runs faster than a parallel execution of the methods on a high-performance multi-core CPU. In addition, the throughput of the algorithm is evaluated and shown to outperform other recent implementations and to fulfill the real-time requirements of several LTE configurations. © 2012 IOS Press and the authors. All rights reserved.

    This work was partially funded by the TEC2009-13741 project of the Spanish Ministry of Science and by the PROMETEO/2009/013 project of the Generalitat Valenciana.

    Roger Varea, S.; Ramiro Sánchez, C.; González Salvador, A.; Almenar Terré, V.; Vidal Maciá, AM. (2012). An efficient GPU implementation of fixed-complexity sphere decoders for MIMO wireless systems. Integrated Computer-Aided Engineering. 19(4):341-350. https://doi.org/10.3233/ICA-2012-0410

    Linear-time encoding and decoding of low-density parity-check codes

    Low-density parity-check (LDPC) codes had a renaissance when they were rediscovered in the 1990s. Since then, LDPC codes have been an important part of the field of error-correcting codes and have been shown to approach the Shannon capacity, the limit at which information can be reliably transmitted over noisy channels. Following this, many modern communications standards have adopted LDPC codes. Error correction is as important in protecting data from corruption on a hard drive as it is in deep-space communications. It is most commonly used, for example, for the reliable wireless transmission of data to mobile devices. For practical purposes, both encoding and decoding need to be of low complexity to achieve high throughput and low power consumption. This thesis provides a literature review of the current state of the art in encoding and decoding of LDPC codes. Message-passing decoders are still capable of achieving the best error-correcting performance, while the more recently considered bit-flipping decoders provide a low-complexity alternative, albeit with some loss in error-correcting performance. An implementation of a low-complexity stochastic bit-flipping decoder is also presented. It is implemented in parallel for graphics processing units (GPUs), providing a peak throughput of 1.2 Gb/s, which is significantly higher than that of previous decoder implementations on GPUs. The error-correcting performance of a range of decoders has also been tested, showing that the stochastic bit-flipping decoder provides relatively good error-correcting performance with low complexity. Finally, a brief comparison of encoding complexities for two code ensembles is presented.
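    As a rough illustration of the bit-flipping idea mentioned in this abstract, the sketch below shows a plain hard-decision bit-flipping decoder. It is not the thesis's stochastic decoder and it does not run on a GPU. The function names (`product_code_H`, `syndrome`, `bit_flip_decode`) and the toy 3x3 row/column parity code are assumptions chosen only to keep the example small.

```python
def product_code_H(n=3):
    """Parity-check matrix of a toy n x n product code (an assumption for
    illustration): bit j sits at row j // n, column j % n of a grid, and
    every grid row and every grid column must have even parity."""
    row_checks = [[1 if j // n == r else 0 for j in range(n * n)] for r in range(n)]
    col_checks = [[1 if j % n == c else 0 for j in range(n * n)] for c in range(n)]
    return row_checks + col_checks


def syndrome(H, bits):
    """One entry per parity check; 1 means that check currently fails."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]


def bit_flip_decode(H, word, max_iters=20):
    """Hard-decision bit flipping: in each iteration, flip the bits that
    take part in the largest number of failed parity checks."""
    bits = list(word)
    for _ in range(max_iters):
        s = syndrome(H, bits)
        if not any(s):
            break  # all checks satisfied
        # votes[j] = number of failed checks that involve bit j
        votes = [sum(sj for sj, row in zip(s, H) if row[j]) for j in range(len(bits))]
        worst = max(votes)
        bits = [b ^ 1 if v == worst else b for b, v in zip(bits, votes)]
    return bits, not any(syndrome(H, bits))


H = product_code_H(3)
codeword = [1, 1, 0, 1, 0, 1, 0, 1, 1]   # every row and column has even parity
received = list(codeword)
received[7] ^= 1                          # inject a single bit error
decoded, ok = bit_flip_decode(H, received)
```

    In this toy code each bit takes part in exactly two checks and any two bits share at most one, so a single error is the only bit with two failed checks and is flipped back in one iteration. Practical LDPC codes use much larger sparse matrices, and the decoder described in the thesis adds randomness to the flipping rule.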

    Design and Implementation of Efficient Algorithms for Wireless MIMO Communication Systems

    Over the last decade, one of the most important technological advances behind the new generation of wireless broadband has been communication through multiple-input multiple-output (MIMO) systems. MIMO technologies have been adopted by many wireless standards such as LTE, WiMAX and WLAN. This is mainly due to their ability to increase the maximum transmission rate, together with the reliability and coverage of current wireless communications, without the need for extra bandwidth or additional transmit power. However, the advantages provided by MIMO systems come at the expense of a substantial increase in the cost of deploying multiple antennas and in receiver complexity, which has a great impact on power consumption. For this reason, the design of low-complexity receivers is an important topic that is addressed throughout this thesis. First, the use of MIMO channel matrix preprocessing techniques is investigated, either to decrease the computational cost of optimal decoders or to improve the performance of suboptimal linear, SIC or tree-search detectors. A detailed description is given of two widely used preprocessing techniques: the Lenstra, Lenstra, Lovász (LLL) method for lattice reduction (LR) and the VBLAST ZF-DFE algorithm. Both the complexity and the performance of the two methods have been evaluated and compared. In addition, a low-cost implementation of the VBLAST ZF-DFE algorithm is proposed and included in the evaluation. Second, a low-complexity tree-search MIMO detector has been developed, called the variable-breadth K-Best detector (VB K-Best). The main idea of this method is to exploit the impact of the channel matrix condition number on data detection in order to decrease the complexity of the systems

    Roger Varea, S. (2012). Design and Implementation of Efficient Algorithms for Wireless MIMO Communication Systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16562

    Techniques for Low-latency in Software-defined Radio-based Networks

    Decreased budgets have pushed the United States Air Force towards using existing systems in new ways. The use of unmanned aerial vehicle swarms is one example of reuse of existing systems. One problem with the increased utilization of these swarms is the congestion of the electromagnetic spectrum. Software-defined or cognitive radios have been proposed as a basis for a potential robust communications solution. The present research aims to develop and test a genetic algorithm-based cognitive engine to begin looking at real-time engines that could be used in future swarms. Here, latency is the optimization objective of primary importance. In testing the engine, particular items of interest include the number of solutions evaluated in a given bound and the engine's reliability in yielding acceptable network performance. Initial experiments indicate the engine can consider significant portions of the search space within a relatively small bound and that the engine is efficient at finding highly fit solutions. Future work for this research includes evaluating how well high fitness correlates to acceptable performance and testing the engine with additional noise floors.

    Tridimensional block multiword LDPC decoding on GPUs

    In this paper, we describe a parallel algorithm for low-density parity-check (LDPC) decoding on a graphics processing unit (GPU) using CUDA (Compute Unified Device Architecture). The design strategy for the kernel grid and blocks is shown, and the multiword decoding operation is described using tridimensional blocks. The speedup of the proposed parallel algorithm is slightly better than that reported in the literature in cases where the latter is already relatively good, and shows a great improvement in cases with previously reported moderate or poor performance. © 2011 Springer Science+Business Media, LLC.

    This work was financially supported by the Spanish Ministerio de Ciencia e Innovación (Projects TIN2008-06570-C04-02, TEC2009-13741 and TEC2008-06787), Universidad Politécnica de Valencia through “Programa de Apoyo a la Investigación y Desarrollo (PAID-05-09)” and Generalitat Valenciana through project PROMETEO/2009/013.

    Martínez Zaldívar, FJ.; Vidal Maciá, AM.; González Salvador, A.; Almenar Terré, V. (2011). Tridimensional block multiword LDPC decoding on GPUs. Journal of Supercomputing. 58(3):314-322. https://doi.org/10.1007/s11227-011-0587-3

    A GPU implementation of an iterative receiver for energy saving MIMO ID-BICM systems

    Iterative detection and decoding in communication systems with multiple transmitter and receiver antennas suffer from a significant increase in computational cost and energy consumption. The application of specific high-performance computing techniques to signal processing in communication systems is currently receiving considerable attention. In this paper, we present an accelerated and efficient iterative receiver, which has been implemented following two strategies. First, we reduce the computational cost using parallelized algorithms executed on a graphics processing unit. Second, our receiver allows selection between two types of detectors with different complexity and performance. The selection can be made to fulfill a given compromise between bit error rate and power consumption.

    This work has been supported by European Union ERDF and Spanish Government through TEC2012-38142-C04 project and Generalitat Valenciana through PROMETEO/2009/013 project.

    Ramiro Sánchez, C.; Simarro Haro, MDLA.; Martínez Zaldívar, FJ.; Vidal Maciá, AM.; Gonzalez, A. (2014). A GPU implementation of an iterative receiver for energy saving MIMO ID-BICM systems. The Journal of Supercomputing. 70(2):541-551. https://doi.org/10.1007/s11227-013-1081-x