Search CORE

210 research outputs found

A Programmable, Scalable-Throughput Interleaver

Author: CH van Berkel
EJC Rijshouwer
Publication venue: Springer Nature
Publication date: 01/01/2010
Field of study

The interleaver stages of digital communication standards show a surprisingly large variation in throughput, state sizes, and permutation functions. Furthermore, data rates for 4G standards such as LTE-Advanced will exceed typical baseband clock frequencies of handheld devices. Multistream operation for Software Defined Radio and iterative decoding algorithms will call for ever higher interleave data rates. Our interleave machine is built around 8 single-port SRAM banks and can be programmed to generate up to 8 addresses every clock cycle. The scalable architecture combines SIMD and VLIW concepts with an efficient resolution of bank conflicts. A wide range of cellular, connectivity, and broadcast interleavers have been mapped on this machine, with throughputs up to more than 0.5 Gsymbol/second. Although it was designed for channel interleaving, the application domain of the interleaver extends also to Turbo interleaving. The presented configuration of the architecture is designed as a part of a programmable outer receiver on a prototype board. It offers (near) universal programmability to enable the implementation of new interleavers. The interleaver measures 2.09 mm2 in 65 nm CMOS (including memories) and proves functional on silicon

Springer - Publisher Connector

Pure OAI Repository

Directory of Open Access Journals

A programmable, scalable-throughput interleaver

Author: Berkel
C H Van
E J C Rijshouwer
Publication venue
Publication date: 02/04/2020
Field of study

Abstract The interleaver stages of digital communication standards show a surprisingly large variation in throughput, state sizes, and permutation functions. Furthermore, data rates for 4G standards such as LTE-Advanced will exceed typical baseband clock frequencies of handheld devices. Multistream operation for Software Defined Radio and iterative decoding algorithms will call for ever higher interleave data rates. Our interleave machine is built around 8 single-port SRAM banks and can be programmed to generate up to 8 addresses every clock cycle. The scalable architecture combines SIMD and VLIW concepts with an efficient resolution of bank conflicts. A wide range of cellular, connectivity, and broadcast interleavers have been mapped on this machine, with throughputs up to more than 0.5 Gsymbol/second. Although it was designed for channel interleaving, the application domain of the interleaver extends also to Turbo interleaving. The presented configuration of the architecture is designed as a part of a programmable outer receiver on a prototype board. It offers (near) universal programmability to enable the implementation of new interleavers. The interleaver measures 2.09 mm2 in 65 nm CMOS (including memories) and proves functional on silicon

CiteSeerX

VLSI Implementation of WiMax Convolutional Turbo Code Encoder and Decoder

Author: Martina Maurizio
Masera Guido
Nicola M.
Publication venue: World Scientific Publishing
Publication date: 01/01/2009
Field of study

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Configurable and Scalable Turbo Decoder for 4G Wireless Receivers

Author: Cavallaro Joseph R.
Manish Goel
Sun Yang
Zhu Yuming
Publication venue: IGI-Global Press
Publication date: 01/01/2010
Field of study

The increasing requirements of high data rates and quality of service (QoS) in fourth-generation (4G) wireless communication require the implementation of practical capacity approaching codes. In this chapter, the application of Turbo coding schemes that have recently been adopted in the IEEE 802.16e WiMax standard and 3GPP Long Term Evolution (LTE) standard are reviewed. In order to process several 4G wireless standards with a common hardware module, a reconfigurable and scalable Turbo decoder architecture is presented. A parallel Turbo decoding scheme with scalable parallelism tailored to the target throughput is applied to support high data rates in 4G applications. High-level decoding parallelism is achieved by employing contention-free interleavers. A multi-banked memory structure and routing network among memories and MAP decoders are designed to operate at full speed with parallel interleavers. A new on-line address generation technique is introduced to support multiple Turbo interleaving patterns, which avoids the interleaver address memory that is typically necessary in the traditional designs. Design trade-offs in terms of area and power efficiency are analyzed for different parallelism and clock frequency goals

DSpace at Rice University

Configurable and Scalable High Throughput Turbo Decoder Architecture for Multiple 4GWireless Standards

Author: Cavallaro Joseph R.
Goel Manish
Sun Yang
Zhu Yuming
Publication venue: IEEE
Publication date: 01/01/2008
Field of study

In this paper, we propose a novel multi-code turbo decoder architecture for 4G wireless systems. To support various 4G standards, a configurable multi-mode MAP (maximum a posteriori) decoder is designed for both binary and duo-binary turbo codes with small resource overhead (less than 10%) compared to the single-mode architecture. To achieve high data rates in 4G, we present a parallel turbo decoder architecture with scalable parallelism tailored to the given throughput requirements. High-level parallelism is achieved by employing contention-free interleavers. Multi-banked memory structure and routing network among memories and MAP decoders are designed to operate at full speed with parallel interleavers. We designed a very low-complexity recursive on-line address generator supporting multiple interleaving patterns, which avoids the interleaver address memory. Design trade-offs in terms of area and power efficiency are explored to find the optimal architectures. A 711 Mbps data rate is feasible with 32 Radix-4 MAP decoders running at 200 MHz clock rate.Texas Instruments Incorporate

CiteSeerX

Crossref

DSpace at Rice University

Scalable Architecture of MIMO Multi-carrier CDMA System on Programmable Logic

Author: Cavallaro Joseph R.
Guo Yuanbin
Publication venue: IEEE
Publication date: 01/11/2007
Field of study

In this paper, a scalable architecture of the multicarrier CDMA system using Multiple-Input-Multiple-Output (MIMO) technology is designed in the programmable logic array. The system-level partitioning with different architecture design entries is described. The overall computing architecture for complex signal processing blocks, e.g., channel estimation, frequency domain equalization, demodulation etc is described. The MIMO architecture is easily extended from a SISO system with single antenna. This scalable architecture demonstrates resource utilization efficiency and easy extension to MIMO configurations

Crossref

DSpace at Rice University

VLSI Architectures for WIMAX Channel Decoders

Author: Martina Maurizio
Masera Guido
Publication venue
Publication date: 01/01/2009
Field of study

This chapter describes the main architectures proposed in the literature to implement the channel decoders required by the WiMax standard, namely convolutional codes, turbo codes (both block and convolutional) and LDPC. Then it shows a complete design of a convolutional turbo code encoder/decoder system for WiMax.Comment: To appear in the book "WIMAX, New Developments", M. Upena, D. Dalal, Y. Kosta (Ed.), ISBN978-953-7619-53-

arXiv.org e-Print Archive

IntechOpen

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Implementation of a 3GPP LTE Turbo Decoder Accelerator on GPU

Author: Cavallaro Joseph R.
Wu Michael
Publication venue: IEEE
Publication date: 01/10/2010
Field of study

This paper presents a 3GPP LTE compliant turbo decoder accelerator on GPU. The challenge of implementing a turbo decoder is finding an efficient mapping of the decoder algorithm on GPU, e.g. finding a good way to parallelize workload across cores and allocate and use fast on-die memory to improve throughput. In our implementation, we increase throughput through 1) distributing the decoding workload for a codeword across multiple cores, 2) decoding multiple codewords simultaneously to increase concurrency and 3) employing memory optimization techniques to reduce memory bandwidth requirements. In addition, we analyze how different MAP algorithm approximations affect both throughput and bit error rate (BER) performance of this decoder

DSpace at Rice University