
    A Scalable VLSI Architecture for Soft-Input Soft-Output Depth-First Sphere Decoding

    Multiple-input multiple-output (MIMO) wireless transmission imposes huge challenges on the design of efficient hardware architectures for iterative receivers. A major challenge is soft-input soft-output (SISO) MIMO demapping, which is often approached by sphere decoding (SD). In this paper, we introduce the, to the best of our knowledge, first VLSI architecture for SISO SD applying a single tree-search approach. Compared with a soft-output-only base architecture similar to the one proposed by Studer et al. in IEEE J-SAC 2008, the architectural modifications for soft input still allow one-node-per-cycle execution. For a 4x4 16-QAM system, the area increases by 57% and the operating frequency degrades by only 34%.
    Comment: Accepted for IEEE Transactions on Circuits and Systems II Express Briefs, May 2010. This draft from April 2010 will not be updated any more; please refer to IEEE Xplore for the final version. The final publication will appear under the modified title "A Scalable VLSI Architecture for Soft-Input Soft-Output Single Tree-Search Sphere Decoding".
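    As an illustration of the depth-first tree search that underlies such sphere decoders, the following sketch shows a plain hard-output, Schnorr-Euchner-style depth-first search with radius pruning in Python. It is a simplified software model only: it assumes the usual triangularized system y = R s after QR decomposition of the channel (y here denotes Q^H times the received vector), and it does not model the paper's soft-input handling, LLR generation, or single tree-search bookkeeping. The function name and interface are hypothetical.

        import numpy as np

        def depth_first_sphere_decode(R, y, symbols):
            """Hard-output depth-first sphere decoder for y = R s + n with R upper-triangular.

            Simplified illustration: children of each tree node are visited in
            ascending partial-distance order (Schnorr-Euchner enumeration) and a
            branch is pruned when its partial Euclidean distance exceeds the
            current search radius (the best leaf metric found so far).
            """
            n = R.shape[0]
            best_dist = np.inf
            best_s = None

            def search(level, partial, dist):
                nonlocal best_dist, best_s
                if dist >= best_dist:          # prune: outside the current radius
                    return
                if level < 0:                  # leaf reached: shrink the radius
                    best_dist, best_s = dist, partial.copy()
                    return
                # interference-cancelled observation for this layer
                r = y[level] - sum(R[level, j] * partial[j] for j in range(level + 1, n))
                for s in sorted(symbols, key=lambda s: abs(r - R[level, level] * s) ** 2):
                    partial[level] = s
                    search(level - 1, partial, dist + abs(r - R[level, level] * s) ** 2)

            search(n - 1, np.zeros(n, dtype=complex), 0.0)
            return best_s, best_dist

    For a 4x4 16-QAM system as in the abstract, symbols would be the 16 complex constellation points and R the upper-triangular factor of the channel's QR decomposition.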

    Turbo NOC: a framework for the design of Network On Chip based turbo decoder architectures

    This work proposes a general framework for the design and simulation of network-on-chip (NoC) based turbo decoder architectures. Several parameters in the design space are investigated, namely the network topology, the parallelism degree, the rate at which messages are sent by processing nodes over the network, and the routing strategy. The main results of this analysis are: i) the topologies best suited to achieving high throughput with a limited complexity overhead are the generalized de Bruijn and generalized Kautz topologies; ii) depending on the throughput requirements, different parallelism degrees, message injection rates, and routing algorithms can be used to minimize the network area overhead.
    Comment: Submitted to IEEE Trans. on Circuits and Systems I (submission date 27 May 2009).
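    The two topologies singled out above have compact arithmetic definitions (the Imase-Itoh constructions). The sketch below is only an illustration of those definitions, not of the paper's framework; it builds the directed adjacency lists for an assumed node count n and out-degree d.

        def generalized_de_bruijn(n, d):
            """Arcs of the generalized de Bruijn digraph: i -> (d*i + j) mod n, j = 0..d-1."""
            return {i: [(d * i + j) % n for j in range(d)] for i in range(n)}

        def generalized_kautz(n, d):
            """Arcs of the generalized Kautz digraph: i -> (-d*i - j) mod n, j = 1..d."""
            return {i: [(-d * i - j) % n for j in range(1, d + 1)] for i in range(n)}

        # Example: an 8-node, degree-2 network. Both digraphs have diameter on the
        # order of log_d(n), which keeps the latency of extrinsic-information
        # exchange between processing nodes low.
        print(generalized_de_bruijn(8, 2))
        print(generalized_kautz(8, 2))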

    A Flexible LDPC/Turbo Decoder Architecture

    Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP) algorithm as a bridge between LDPC and Turbo code decoding. We view the LDPC code as a concatenation of n super-codes, where each super-code has a simpler trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to support LDPC/Turbo code decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a flexible LDPC/Turbo decoder has been synthesized in a TSMC 90 nm CMOS technology with a core area of 3.2 mm². The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at a 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or 450 Mbps Turbo decoding.
    Funding: Nokia, Nokia Siemens Networks (NSN), Xilinx, Texas Instruments, National Science Foundation.
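    The trellis-based MAP bridge described above can be sketched in software. The following is a generic max-log-MAP forward/backward recursion over precomputed log-domain branch metrics, an assumed simplification of a full log-MAP SISO module rather than a model of the paper's FFU datapath; the trellis encoding and function interface are hypothetical.

        import numpy as np

        def max_log_map(gammas, n_states):
            """Generic max-log-MAP SISO over a trellis.

            gammas: list over trellis steps; each step is a list of transitions
                    (from_state, to_state, input_bit, branch_metric) in the log domain.
            Returns one a-posteriori LLR per trellis step (positive favors bit 1).
            """
            K = len(gammas)
            NEG = -1e30
            # forward recursion (alpha), trellis assumed to start in state 0
            alpha = np.full((K + 1, n_states), NEG)
            alpha[0, 0] = 0.0
            for k, step in enumerate(gammas):
                for s, t, _, g in step:
                    alpha[k + 1, t] = max(alpha[k + 1, t], alpha[k, s] + g)
            # backward recursion (beta), all ending states allowed
            beta = np.full((K + 1, n_states), NEG)
            beta[K, :] = 0.0
            for k in range(K - 1, -1, -1):
                for s, t, _, g in gammas[k]:
                    beta[k, s] = max(beta[k, s], g + beta[k + 1, t])
            # per-bit LLRs, with the max* operator approximated by max
            llrs = []
            for k, step in enumerate(gammas):
                m = {0: NEG, 1: NEG}
                for s, t, b, g in step:
                    m[b] = max(m[b], alpha[k, s] + g + beta[k + 1, t])
                llrs.append(m[1] - m[0])
            return llrs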

    UNIFIED DECODER ARCHITECTURE FOR LDPC/TURBO CODES

    Low-density parity-check (LDPC) codes and convolutional turbo codes (CTC) are two of the most powerful error correction codes, known to perform very close to the Shannon limit. However, their different code structures usually lead to different hardware implementations. In this paper, we propose a unified decoder architecture that is capable of decoding both LDPC and turbo codes with a limited hardware overhead. We employ the maximum a posteriori (MAP) algorithm as a bridge between LDPC and turbo codes. We represent LDPC codes as parallel concatenated single parity check (PCSPC) codes and propose a group sub-trellis (GST) decoding algorithm for the efficient decoding of PCSPC codes. This algorithm achieves about a 2x improvement in convergence speed and is more numerically robust than the classical "tanh" algorithm. Moreover, a unified trellis decoding algorithm for LDPC and turbo codes can be generalized from their trellis structures. We propose a reconfigurable computation kernel for log-MAP decoding of LDPC and turbo codes at a cost of ~15% hardware overhead. Small lookup tables (LUTs) with 9 entries of 2-bit data are designed to implement the log-MAP algorithm. Fixed-point (6:2) simulation results show negligible performance loss from this LUT approximation compared to the ideal case. The proposed architecture results in scalable and flexible datapath units enabling parallel decoding of LDPC/turbo codes.
    Funding: Nokia, National Science Foundation.
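    The small lookup table mentioned above serves the correction term of the Jacobian logarithm (the max* operation) used in log-MAP decoding. The sketch below illustrates a LUT-based max* in Python; the sampling step and quantization are assumptions for illustration, not the paper's exact 9-entry, 2-bit table.

        import math

        # Correction term log(1 + exp(-|a - b|)) sampled at 9 points spaced 0.5 apart,
        # rounded to 2 fractional bits (an assumed quantization, for illustration only).
        STEP = 0.5
        LUT = [round(math.log(1.0 + math.exp(-i * STEP)) * 4) / 4 for i in range(9)]

        def max_star(a, b):
            """log(exp(a) + exp(b)) = max(a, b) + correction(|a - b|), correction read from the LUT."""
            d = abs(a - b)
            idx = int(d / STEP)
            corr = LUT[idx] if idx < len(LUT) else 0.0   # correction is ~0 for large |a - b|
            return max(a, b) + corr

        # The max-log-MAP variant simply drops the correction: max_star(a, b) ~ max(a, b).
        print(max_star(1.0, 1.2), math.log(math.exp(1.0) + math.exp(1.2)))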

    Mapping the SISO module of the Turbo decoder to a FPFA

    In the CHAMELEON project, a reconfigurable system architecture, the Field Programmable Function Array (FPFA), is introduced. FPFAs are reminiscent of FPGAs, but have a matrix of ALUs and lookup tables instead of Configurable Logic Blocks (CLBs). The FPFA can be regarded as a low-power reconfigurable accelerator for an application-specific domain. In this paper, we show how the SISO (Soft Input Soft Output) module of the Turbo decoding algorithm can be mapped onto the reconfigurable FPFA.

    Modified Distributive Arithmetic based 2D-DWT for Hybrid (Neural Network-DWT) Image Compression

    Artificial Neural Networks (ANN) are widely used in signal and image processing for pattern recognition and template matching. The Discrete Wavelet Transform (DWT) is combined with a neural network to achieve higher compression of 2D data such as images. Image compression using a neural network and DWT has shown superior results over classical techniques, with 70% higher compression and a 20% improvement in Mean Square Error (MSE). Hardware complexity and power dissipation are the major challenges addressed in this work for VLSI implementation. In this work, a modified distributive arithmetic (DA) DWT and a multiplexer-based DWT architecture are designed to reduce the computation complexity of the hybrid architecture for image compression. A 2D DWT architecture is designed from the 1D DWT architecture and is implemented on an FPGA that operates at 268 MHz while consuming power less than 1
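    The 2D DWT built from 1D stages, as described above, is separable: a 1D transform is applied along the rows and then along the columns. The sketch below is a minimal software model of one decomposition level using the Haar filter pair; the paper's modified distributive-arithmetic datapath and its actual filter choice are not modeled, and the function names are hypothetical.

        import numpy as np

        def dwt_1d_haar(x):
            """One level of the 1D Haar DWT: returns (approximation, detail) coefficients."""
            x = np.asarray(x, dtype=float)
            a = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass / approximation
            d = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-pass / detail
            return a, d

        def dwt_2d_haar(image):
            """One level of the separable 2D DWT: 1D transform on rows, then on columns.

            Returns the four sub-bands (LL, LH, HL, HH); LL carries most of the
            energy and is what a hybrid DWT + neural-network compressor would
            typically hand to the network.
            """
            rows_lo, rows_hi = zip(*(dwt_1d_haar(r) for r in image))
            lo, hi = np.array(rows_lo), np.array(rows_hi)
            ll, lh = zip(*(dwt_1d_haar(c) for c in lo.T))
            hl, hh = zip(*(dwt_1d_haar(c) for c in hi.T))
            return np.array(ll).T, np.array(lh).T, np.array(hl).T, np.array(hh).T

        # Example on an 8x8 block (dimensions must be even for this simple model)
        blk = np.arange(64, dtype=float).reshape(8, 8)
        LL, LH, HL, HH = dwt_2d_haar(blk)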