A Scalable VLSI Architecture for Soft-Input Soft-Output Depth-First Sphere Decoding
Multiple-input multiple-output (MIMO) wireless transmission imposes huge
challenges on the design of efficient hardware architectures for iterative
receivers. A major challenge is soft-input soft-output (SISO) MIMO demapping,
often approached by sphere decoding (SD). In this paper, we introduce what is,
to the best of our knowledge, the first VLSI architecture for SISO SD applying a
single tree-search approach. Compared with a soft-output-only base architecture
similar to the one proposed by Studer et al. in IEEE J-SAC 2008, the
architectural modifications for soft input still allow a one-node-per-cycle
execution. For a 4x4 16-QAM system, the area increases by 57% and the operating
frequency degrades by 34% only.
Comment: Accepted for IEEE Transactions on Circuits and Systems II Express
Briefs, May 2010. This draft from April 2010 will not be updated any more.
Please refer to IEEE Xplore for the final version. *) The final publication
will appear with the modified title "A Scalable VLSI Architecture for
Soft-Input Soft-Output Single Tree-Search Sphere Decoding".
On chip interconnects for multiprocessor turbo decoding architectures
Turbo NOC: a framework for the design of Network On Chip based turbo decoder architectures
This work proposes a general framework for the design and simulation of
network on chip based turbo decoder architectures. Several parameters in the
design space are investigated, namely the network topology, the parallelism
degree, the rate at which messages are sent by processing nodes over the
network and the routing strategy. The main results of this analysis are: i) the
most suited topologies to achieve high throughput with a limited complexity
overhead are generalized de-Bruijn and generalized Kautz topologies; ii)
depending on the throughput requirements different parallelism degrees, message
injection rates and routing algorithms can be used to minimize the network area
overhead.
Comment: submitted to IEEE Trans. on Circuits and Systems I (submission date
27 May 2009)
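The abstract above names generalized de-Bruijn and generalized Kautz topologies without defining them. The sketch below uses the standard textbook adjacency rules (node u connects to (d*u + k) mod n for de-Bruijn, and to (-d*u - k) mod n for Kautz); the function names and the breadth-first diameter check are illustrative assumptions, not part of the Turbo NOC framework itself.

```python
from collections import deque


def generalized_de_bruijn(n, d):
    """Successors of each node u: (d*u + k) mod n for k = 0..d-1."""
    return {u: [(d * u + k) % n for k in range(d)] for u in range(n)}


def generalized_kautz(n, d):
    """Successors of each node u: (-d*u - k) mod n for k = 1..d."""
    return {u: [(-d * u - k) % n for k in range(1, d + 1)] for u in range(n)}


def diameter(graph):
    """Longest shortest-path length (in hops) over all ordered node pairs."""
    worst = 0
    for src in graph:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(graph):
            raise ValueError("graph is not strongly connected")
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    print(diameter(generalized_de_bruijn(8, 2)))
    print(diameter(generalized_kautz(6, 2)))
```

A small diameter at a fixed out-degree d is what makes these graphs attractive for a network on chip: each router needs only d output ports, while any message reaches its destination in few hops.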
A Flexible LDPC/Turbo Decoder Architecture
Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern
communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches
for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo
codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP)
algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler
trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo
codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to
support LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a
flexible LDPC/Turbo decoder has been synthesized in a TSMC 90 nm CMOS technology with a core area of 3.2 mm². The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or
450 Mbps Turbo decoding.
Nokia, Nokia Siemens Networks (NSN), Xilinx, Texas Instruments, National Science Foundation
UNIFIED DECODER ARCHITECTURE FOR LDPC/TURBO CODES
Low-density parity-check (LDPC) codes and convolutional turbo codes (CTC) are two of the most powerful error correction codes, known to perform very close to the Shannon limit. However, their different code structures usually
lead to different hardware implementations. In this paper, we propose a unified decoder architecture that is capable of decoding both LDPC and turbo codes with a limited hardware overhead. We employ the maximum a posteriori (MAP) algorithm
as a bridge between LDPC and turbo codes. We represent LDPC codes as parallel concatenated single parity check (PCSPC) codes and propose a group sub-trellis (GST) decoding algorithm for the efficient decoding of PCSPC codes. This algorithm achieves about a 2X improvement in convergence speed and is more numerically robust than the classical "tanh" algorithm. More interestingly, we can generalize a unified trellis decoding algorithm for LDPC and turbo codes based on their trellis structures. We propose a
reconfigurable computation kernel for log-MAP decoding of LDPC and turbo codes at a cost of ∼15% hardware overhead.
Small lookup tables (LUTs) with 9 entries of 2-bit data are
designed to implement the log-MAP algorithm. Fixed point
(6:2) simulation results show that there is negligible or nearly
no performance loss by using this LUT approximation compared
to the ideal case. The proposed architecture results in
scalable and flexible datapath units enabling parallel decoding
of LDPC/turbo codes.
Nokia, National Science Foundation
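The log-MAP algorithm mentioned above relies on the Jacobian logarithm, max*(a, b) = max(a, b) + ln(1 + e^(-|a - b|)), whose correction term is what a small lookup table typically replaces. The sketch below is one plausible reading of "9 entries of 2-bit data", assuming two fractional bits (steps of 0.25) and a table indexed by |a - b|; the abstract does not give the actual table contents or indexing, so every name and value here is an assumption.

```python
import math

FRAC_BITS = 2            # assumed: two fractional bits, as in a (6:2) format
SCALE = 1 << FRAC_BITS   # one LSB equals 1/4

# Assumed table: correction ln(1 + e^-x) sampled at x = 0, 0.25, ..., 2.0
# and rounded to the nearest quarter. Every entry fits in 2 bits (0..3).
LUT = [round(math.log1p(math.exp(-i / SCALE)) * SCALE) for i in range(9)]


def max_star_exact(a, b):
    """Exact Jacobian logarithm ln(e^a + e^b) on real-valued inputs."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_star_lut(a_q, b_q):
    """LUT approximation; inputs and result are integers in units of 1/SCALE."""
    diff = abs(a_q - b_q)
    correction = LUT[diff] if diff < len(LUT) else 0
    return max(a_q, b_q) + correction


if __name__ == "__main__":
    print(LUT)
    # Compare the two versions on a = 1.0, b = 0.5 (quantized to 4 and 2).
    print(max_star_lut(4, 2) / SCALE, max_star_exact(1.0, 0.5))
```

Under these assumptions the correction rounds to zero for any difference larger than 2.0, so nine entries cover the whole non-zero range, and the approximation stays within half an LSB of the exact value.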
Modified Distributive Arithmetic based 2D-DWT for Hybrid (Neural Network-DWT) Image Compression
Artificial neural networks (ANNs) are widely used in signal and image processing for pattern recognition and template matching. The discrete wavelet transform (DWT) is combined with a neural network to achieve higher compression of 2D data such as images. Image compression using neural networks and the DWT has shown superior results over classical techniques, with 70% higher compression and a 20% improvement in mean square error (MSE). Hardware complexity and power dissipation are the major challenges addressed in this work for VLSI implementation. In this work, modified distributive arithmetic DWT and multiplexer-based DWT architectures are designed to reduce the computational complexity of the hybrid architecture for image compression. A 2D DWT architecture is designed from the 1D DWT architecture and implemented on an FPGA that operates at 268 MHz, consuming power less than 1
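Building a 2D DWT from a 1D DWT, as the abstract above describes, usually means applying the 1D transform along the rows and then along the columns. The sketch below illustrates that separable structure with a one-level Haar wavelet in plain floating-point arithmetic; the abstract does not name the wavelet, and this is not the distributive-arithmetic hardware design, so the filter choice and function names are assumptions.

```python
def haar_1d(signal):
    """One-level Haar step: pairwise averages followed by pairwise differences.

    Assumes an even-length input; the averaging normalization is a choice
    made for this sketch.
    """
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx + detail


def haar_2d(image):
    """Separable 2D transform: 1D Haar on every row, then on every column."""
    row_pass = [haar_1d(row) for row in image]
    col_pass = [haar_1d(list(col)) for col in zip(*row_pass)]
    # Transpose back so the result is again stored row by row.
    return [list(row) for row in zip(*col_pass)]


if __name__ == "__main__":
    for row in haar_2d([[1, 2], [3, 4]]):
        print(row)
```

The top-left quadrant of the result holds the low-pass approximation, which is the part a compression scheme would typically keep at the highest precision.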
Mapping the SISO module of the Turbo decoder to a FPFA
In the CHAMELEON project, a reconfigurable systems architecture, the Field Programmable Function Array (FPFA), is introduced. FPFAs are reminiscent of FPGAs, but have a matrix of ALUs and lookup tables instead of Configurable Logic Blocks (CLBs). The FPFA can be regarded as a low-power reconfigurable accelerator for an application-specific domain. In this paper we show how the SISO (Soft Input Soft Output) module of the Turbo decoding algorithm can be mapped on the reconfigurable FPFA
- …