70 research outputs found

Algorithm-Architecture Co-Design for Digital Front-Ends in Mobile Receivers

Author: Diaz Isael
Publication date: 01/01/2014

The methodology behind this work has been to use the concept of algorithm-hardware co-design to achieve efficient solutions related to the digital front-end in mobile receivers. It has been shown that, by looking at algorithms and hardware architectures together, more efficient solutions can be found, i.e., efficient with respect to some design measure. In this thesis the main focus has been placed on two such measures: first, reduced-complexity algorithms that lower energy consumption at limited performance degradation; second, handling the increasing number of wireless standards that preferably should run on the same hardware platform. To perform this task it is crucial to understand both sides of the table, i.e., both the algorithms and concepts for wireless communication and their implications for the hardware architecture. It is easier to handle the high complexity by separating those disciplines through layered abstraction. However, this representation is imperfect, since many interconnected "details" belonging to different layers are lost in the attempt to handle the complexity. This results in poor implementations, and the design of mobile terminals is no exception. Wireless communication standards are often designed based on mathematical algorithms with theoretical boundaries, with little consideration of actual implementation constraints such as energy consumption, silicon area, etc. This thesis does not try to remove the layered abstraction model, given its undeniable advantages, but rather uses those cross-layer "details" that went missing during the abstraction. This is done in three ways. In the first part, the cross-layer optimization is carried out from the algorithm perspective. Important circuit design parameters, such as quantization, are taken into consideration when designing the algorithm for OFDM symbol timing, CFO, and SNR estimation with a single bit, namely the Sign-Bit. Proof-of-concept circuits were fabricated and showed high potential for low-end receivers. In the second part, the cross-layer optimization is accomplished from the opposite side, i.e., the hardware-architectural side. An SDR architecture is known for its flexibility and scalability over many applications. In this work a filtering application is mapped onto software instructions in the SDR architecture in order to make filtering-specific modules redundant and thus save silicon area. In the third and last part, the optimization is done from an intermediate point within the algorithm-architecture spectrum. Here, a heterogeneous architecture with a combination of highly efficient and highly flexible modules is used to accomplish initial synchronization in at least two concurrent OFDM standards. A demonstrator was built that is capable of performing synchronization in any two standards, including LTE, WiFi, and DVB-H.

Lund University Publications

Adaptive Baseband Processing and Configurable Hardware for Wireless Communication

Author: Gangarajaiah Rakesh
Publication venue: Department of Electrical and Information Technology, Lund University
Publication date: 01/01/2017

The world of information is literally at one’s fingertips, allowing access to previously unimaginable amounts of data, thanks to advances in wireless communication. The growing demand for high speed data has necessitated the use of wider bandwidths, and wireless technologies such as Multiple-Input Multiple-Output (MIMO) have been adopted to increase spectral efficiency. These advanced communication technologies require sophisticated signal processing, often leading to higher power consumption and reduced battery life. Therefore, increasing energy efficiency of baseband hardware for MIMO signal processing has become extremely vital. High Quality of Service (QoS) requirements invariably lead to a larger number of computations and a higher power dissipation. However, recognizing the dynamic nature of the wireless communication medium, in which only some channel scenarios require complex signal processing, and that not all situations call for high data rates, allows the use of an adaptive channel-aware signal processing strategy to provide a desired QoS. Information such as interference conditions, coherence bandwidth and Signal to Noise Ratio (SNR) can be used to reduce algorithmic computations in favorable channels. Hardware circuits which run these algorithms need flexibility and easy reconfigurability to switch between multiple designs for different parameters. These parameters can be used to tune the operations of different components in a receiver based on feedback from the digital baseband. This dissertation focuses on the optimization of digital baseband circuitry of receivers which use feedback to trade power and performance. A co-optimization approach, where designs are optimized starting from the algorithmic stage through the hardware architectural stage to the final circuit implementation, is adopted to realize energy efficient digital baseband hardware for mobile 4G devices. These concepts are also extended to the next generation 5G systems where the energy efficiency of the base station is improved. This work includes six papers that examine digital circuits in MIMO wireless receivers. Several key blocks in these receivers include analog circuits that have residual non-linearities, leading to signal intermodulation and distortion. Paper-I introduces a digital technique to detect such non-linearities and calibrate analog circuits to improve signal quality. The concept of a digital nonlinearity tuning system developed in Paper-I is implemented and demonstrated in hardware. The performance of this implementation is tested with an analog channel select filter, and results are presented in Paper-II. MIMO systems, such as the ones used in 4G, may employ QR Decomposition (QRD) processors to simplify the implementation of tree search based signal detectors. However, the small form factor of the mobile device increases spatial correlation, which is detrimental to signal multiplexing. Consequently, a QRD processor capable of handling high spatial correlation is presented in Paper-III. The algorithm and hardware implementation are optimized for carrier aggregation, which increases requirements on signal processing throughput, leading to higher power dissipation. Paper-IV presents a method to perform channel-aware processing with a simple interpolation strategy to adaptively reduce QRD computation count. Channel properties such as coherence bandwidth and SNR are used to reduce multiplications by 40% to 80%. These concepts are extended to use time domain correlation properties, and a full QRD processor for 4G systems fabricated in 28 nm FD-SOI technology is presented in Paper-V. The design is implemented with a configurable architecture and measurements show that circuit tuning results in a highly energy efficient processor, requiring 0.2 nJ to 1.3 nJ for each QRD. Finally, these adaptive channel-aware signal processing concepts are examined in the scope of the next generation of communication systems. Massive MIMO systems increase spectral efficiency by using a large number of antennas at the base station. Consequently, the signal processing at the base station has a high computational count. Paper-VI presents a configurable detection scheme which reduces this complexity by using techniques such as selective user detection and interpolation based signal processing. Hardware is optimized for resource sharing, resulting in a highly reconfigurable and energy efficient uplink signal detector.

Lund University Publications

Exploring the design space of a VLIW processor for LTE and LTE-A

Author: Bosveld M.
Publication date: 01/01/2011

Repository TU/e

5G – Wireless Communications for 2020

Author: Amorim Rafhael Medeiros de
Barreto André
Faria Bruno
Lauridsen Mads
Portela Lopes de Almeida Erika
Rodriguez Ignacio
Vieira Robson
Publication venue: 'Sociedad Brasileira de Telecomunicacoes'
Publication date: 28/06/2016

VBN

Hardware co-processor to enable MIMO in next generation wireless networks

Author: Horner Nathaniel
Publication venue: RIT Scholar Works
Publication date: 01/02/2010

One prevailing technology in wireless communication is Multiple Input, Multiple Output (MIMO) communication. MIMO communication simultaneously transmits several data streams, each from its own antenna, within the same frequency channel. This technique can increase data bandwidth by up to a factor equal to the number of transmitting antennas, but comes at the cost of a much higher computational complexity for the wireless receiver. MIMO communication exploits differing channel effects caused by physical distances between antennas to differentiate between transmitting antennas, an intrinsically two-dimensional operation. Current Digital Signal Processors (DSPs), on the other hand, are designed to perform computations on one-dimensional vectors of incoming data. To compensate for the lack of native support for these higher-dimensional operations, current base stations are forced to add multiple new processing elements, while many mobile devices cannot support MIMO communication. In order to give wireless clients and stations native support for the two-dimensional operations required by MIMO communication, a hardware co-processor was designed that allows the DSP to offload these operations onto another processor to reduce computation time.

RIT Scholar Works

Efficient DSP and Circuit Architectures for Massive MIMO: State-of-the-Art and Future Directions

Author: Larsson Erik G.
Liu Liang
Van der Perre Liesbet
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Massive MIMO is a compelling wireless access concept that relies on the use of an excess number of base-station antennas, relative to the number of active terminals. This technology is a main component of 5G New Radio (NR) and addresses all important requirements of future wireless standards: a great capacity increase, the support of many simultaneous users, and improvement in energy efficiency. Massive MIMO requires the simultaneous processing of signals from many antenna chains, and computational operations on large matrices. The complexity of the digital processing has been viewed as a fundamental obstacle to the feasibility of Massive MIMO in the past. Recent advances in system-algorithm-hardware co-design have led to extremely energy-efficient implementations. These exploit opportunities in deeply-scaled silicon technologies and perform partly distributed processing to cope with the bottlenecks encountered in the interconnection of many signals. For example, prototype ASIC implementations have demonstrated zero-forcing precoding in real time at a 55 mW power consumption (20 MHz bandwidth, 128 antennas, multiplexing of 8 terminals). Coarse and even error-prone digital processing in the antenna paths permits a reduction of consumption by a factor of 2 to 5. This article summarizes the fundamental technical contributions to efficient digital signal processing for Massive MIMO. The opportunities and constraints on operating on low-complexity RF and analog hardware chains are clarified. It illustrates how terminals can benefit from improved energy efficiency. The status of technology and real-life prototypes is discussed. Open challenges and directions for future research are suggested. Comment: submitted to IEEE Transactions on Signal Processing

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Lund University Publications

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Reconfigurable Real-time MIMO Detector on GPU

Author: Cavallaro Joseph R.
Sun Yang
Wu Michael
Publication venue: IEEE
Publication date: 01/01/2009

In a high-performance multiple-input multiple-output (MIMO) system, a soft-output MIMO detector combined with a channel decoder is often used at the receiver to maximize performance gain. A graphics processing unit (GPU) is a low-cost parallel programmable co-processor that can deliver extremely high computation throughput and is well suited for signal processing applications. We propose and implement a novel soft MIMO detection algorithm and show that it meets real-time performance requirements while maintaining flexibility on a GPU. Nokia, Nokia Siemens Networks (NSN), Texas Instruments, Xilinx, National Science Foundation

CiteSeerX

On the application of graphics processor to wireless receiver design

Author: Wu Michael
Publication date: 01/01/2010

In many 4G and beyond wireless standards, a Turbo decoder is often combined with a soft-output multiple-input multiple-output (MIMO) detector at the receiver to maximize performance. Although custom application-specific designs are usually used to meet this challenge, programmable graphics processing units (GPUs) have become an alternative to the traditional ASIC and FPGA solutions for wireless applications. However, careful architecture-aware algorithm design and mapping are required to maximize the performance of a communication block on a GPU. For MIMO soft detection, we implemented a new algorithm, multi-pass trellis traversal (MTT). For Turbo decoding, we used a parallel window algorithm. We showed that our implementations can achieve high throughput while maintaining good performance. This work will allow us to implement a complete iterative MIMO receiver in software on a GPU in the future.

Singular value decomposition based pipeline architecture for MIMO communication systems

Author: Wang Yue
Publication venue: Drexel University
Publication date: 01/06/2010

This thesis presents the design, implementation and performance benchmark of custom hardware for computing the Singular Value Decomposition (SVD) of the radio communication channel characteristic matrix. Software Defined Radio (SDR) is a concept in which the radio transceiver is implemented by software programs running on a processor. SVD of the channel characteristic matrix is used in pre-coding, equalization and beamforming for Multiple Input Multiple Output (MIMO) and Orthogonal Frequency Division Multiplexing (OFDM) communication systems (e.g., IEEE 802.11n). Since SVD is computationally intensive, it may require custom hardware to reduce the computing time. The pipeline processor developed in this thesis is suitable for computing the SVD of a sequence of 2×2 matrices. A stream of 2×2 matrices is sent to the custom hardware, which returns the corresponding streams of singular values and unitary matrices. The architecture is based on the two-sided Jacobi method utilizing Coordinate Rotation Digital Computer (CORDIC) algorithms. A 2×2 SVD prototype was implemented on a Field-Programmable Gate Array (FPGA) for SDR applications. The 2×2 SVD prototype design can output the singular values and the corresponding unitary matrices in pipeline while operating at a data rate of 324 MHz on a Virtex 6 (xc6vlx240t-lff1156) FPGA. The prototype design consists of fifty-five CORDIC cores, which take 32 percent of the available logic on the FPGA. It achieves the optimal pipeline rate, equal to the maximum hardware clock rate. The depth of the pipeline (latency) is 173 clock cycles for 16-bit data hardware. The proposed architecture provides performance gains over standard software libraries, such as the ZGESVD function of the Linear Algebra PACKage (LAPACK) library, which is based on the Golub-Kahan-Reinsch SVD algorithm, when running on standard processors. The ZGESVD function of LAPACK implemented in Intel’s Math Kernel Library (MKL) will achieve a projected data rate of 40 MHz on a 2.50 GHz Intel Quad (Q9300) CPU. The pipeline SVD hardware bandwidth equals the clock frequency, and the data rate can reach 324 MHz on the ML605 board (Virtex 6 xc6vlx240t). The proposed architecture also has the potential to be easily extended to solve 4×4 SVD problems used in pre-coding and equalization schemes. The proposed algorithm and design have better performance for small matrices, even though the general timing complexity is n^2, compared with the n log(n) complexity of the Brent-Luk-Van Loan (BLV) systolic array using non-pipeline 2×2 processors. The performance gain of the proposed design comes at the cost of increased circuit area. M.S., Computer Engineering -- Drexel University, 201

Drexel Libraries E-Repository and Archives
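
The two-sided rotation idea behind the 2×2 SVD in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis design: it handles real-valued matrices only (the hardware is compared against ZGESVD, which is complex-valued), it uses floating-point `math.atan2` and `math.hypot` where a hardware pipeline would use fixed-point CORDIC vectoring, and the names `rot`, `matmul` and `svd2x2_real` are hypothetical helpers introduced here.

```python
import math


def rot(angle):
    """2x2 rotation matrix [[cos, -sin], [sin, cos]]."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def svd2x2_real(a, b, c, d):
    """Return (U, (s1, s2), Vt) with [[a, b], [c, d]] = U * diag(s1, s2) * Vt.

    Illustrative real-valued case only. The matrix is split into a scaled
    rotation (e, h) and a scaled reflection (f, g); each part needs one
    magnitude and one angle, which is the job CORDIC vectoring does in
    hardware (assumption: atan2/hypot stand in for it here).
    """
    e, f = (a + d) / 2.0, (a - d) / 2.0
    g, h = (c + b) / 2.0, (c - b) / 2.0
    q = math.hypot(e, h)        # magnitude of the rotation part
    r = math.hypot(f, g)        # magnitude of the reflection part
    a1 = math.atan2(h, e)       # angle of the rotation part
    a2 = math.atan2(g, f)       # angle of the reflection part
    u = rot((a1 + a2) / 2.0)    # left rotation
    vt = rot((a1 - a2) / 2.0)   # right rotation
    s1, s2 = q + r, q - r
    if s2 < 0.0:
        # Keep singular values non-negative: move the sign into Vt.
        s2 = -s2
        vt[1] = [-v for v in vt[1]]
    return u, (s1, s2), vt


if __name__ == "__main__":
    u, (s1, s2), vt = svd2x2_real(3.0, 0.0, 4.0, 5.0)
    print("singular values:", s1, s2)
    print("reconstruction:", matmul(matmul(u, [[s1, 0.0], [0.0, s2]]), vt))
```

Because each matrix needs only two angle computations and two rotations, a stream of 2×2 matrices maps naturally onto a fixed-depth pipeline, which is consistent with the streaming behaviour the abstract describes.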

5G – Wireless Communications for 2020

Author
Publication venue: 'Sociedad Brasileira de Telecomunicacoes'
Publication date: 01/01/2016