8 research outputs found

    BASEBAND RADIO MODEM DESIGN USING GRAPHICS PROCESSING UNITS

    Get PDF
    A modern radio or wireless communications transceiver is programmed via software and firmware to change its functionalities at the baseband. However, the actual implementation of the radio circuits relies on dedicated hardware, and the design and implementation of such devices are time consuming and challenging. Due to the need for real-time operation, dedicated hardware is preferred in order to meet stringent requirements on throughput and latency. With increasing need for higher throughput and shorter latency, while supporting increasing bandwidth across a fragmented spectrum, dedicated subsystems are developed in order to service individual frequency bands and specifications. Such a dedicated-hardware-intensive approach leads to high resource costs, including costs due to multiple instantiations of mixers, filters, and samplers. Such increases in hardware requirements in turn increases device size, power consumption, weight, and financial cost. If it can meet the required real-time constraints, a more flexible and reconfigurable design approach, such as a software-based solution, is often more desirable over a dedicated hardware solution. However, significant challenges must be overcome in order to meet constraints on throughput and latency while servicing different frequency bands and bandwidths. Graphics processing unit (GPU) technology provides a promising class of platforms for addressing these challenges. GPUs, which were originally designed for rendering images and video sequences, have been adapted as general purpose high-throughput computation engines for a wide variety of application areas beyond their original target domains. Linear algebra and signal processing acceleration are examples of such application areas. In this thesis, we apply GPUs as software-based, baseband radios and demonstrate novel, software-based implementations of key subsystems in modern wireless transceivers. In our work, we develop novel implementation techniques that allow communication system designers to use GPUs as accelerators for baseband processing functions, including real-time filtering and signal transformations. More specifically, we apply GPUs to accelerate several computationally-intensive, frontend radio subsystems, including filtering, signal mixing, sample rate conversion, and synchronization. These are critical subsystems that must operate in real-time to reliably receive waveforms. The contributions of this thesis can be broadly organized into 3 major areas: (1) channelization, (2) arbitrary resampling, and (3) synchronization. 1. Channelization: a wideband signal is shared between different users and channels, and a channelizer is used to separate the components of the shared signal in the different channels. A channelizer is often used as a pre-processing step in selecting a specific channel-of-interest. A typical channelization process involves signal conversion, resampling, and filtering to reject adjacent channels. We investigate GPU acceleration for a particularly efficient form of channelizer called a polyphase filterbank channelizer, and demonstrate a real-time implementation of our novel channelizer design. 2. Arbitrary resampling: following a channelization process, a signal is often resampled to at least twice the data rate in order to further condition the signal. Since different communication standards require different resampling ratios, it is desirable for a resampling subsystem to support a variety of different ratios. We investigate optimized, GPU-based methods for resampling using polyphase filter structures that are mapped efficiently into GPU hardware. We investigate these GPU implementation techniques in the context of interpolation (integer-factor increases in sampling rate), decimation (integer-factor decreases in sampling rate), and rational resampling. Finally, we demonstrate an efficient implementation of arbitrary resampling using GPUs. This implementation exploits specialized hardware units within the GPU to enable efficient and accurate resampling processes involving arbitrary changes in sample rate. 3. Synchronization: incoming signals in a wireless communications transceiver must be synchronized in order to recover the transmitted data properly from complex channel effects such as thermal noise, fading, and multipath propagation. We investigate timing recovery in GPUs to accelerate the most computationally intensive part of the synchronization process, and correctly align the incoming data symbols in the receiver. Furthermore, we implement fully-parallel timing error detection to accelerate maximum likelihood estimation

    Channelization for Multi-Standard Software-Defined Radio Base Stations

    Get PDF
    As the number of radio standards increase and spectrum resources come under more pressure, it becomes ever less efficient to reserve bands of spectrum for exclusive use by a single radio standard. Therefore, this work focuses on channelization structures compatible with spectrum sharing among multiple wireless standards and dynamic spectrum allocation in particular. A channelizer extracts independent communication channels from a wideband signal, and is one of the most computationally expensive components in a communications receiver. This work specifically focuses on non-uniform channelizers suitable for multi-standard Software-Defined Radio (SDR) base stations in general and public mobile radio base stations in particular. A comprehensive evaluation of non-uniform channelizers (existing and developed during the course of this work) shows that parallel and recombined variants of the Generalised Discrete Fourier Transform Modulated Filter Bank (GDFT-FB) represent the best trade-off between computational load and flexibility for dynamic spectrum allocation. Nevertheless, for base station applications (with many channels) very high filter orders may be required, making the channelizers difficult to physically implement. To mitigate this problem, multi-stage filtering techniques are applied to the GDFT-FB. It is shown that these multi-stage designs can significantly reduce the filter orders and number of operations required by the GDFT-FB. An alternative approach, applying frequency response masking techniques to the GDFT-FB prototype filter design, leads to even bigger reductions in the number of coefficients, but computational load is only reduced for oversampled configurations and then not as much as for the multi-stage designs. Both techniques render the implementation of GDFT-FB based non-uniform channelizers more practical. Finally, channelization solutions for some real-world spectrum sharing use cases are developed before some final physical implementation issues are considered

    Energy-efficient hardware architecture and vlsi implementation of a polyphase channelizer with applications to subband adaptive filtering

    Get PDF
    Abstract Polyphase channelizer is an important component of subband adaptive filtering systems. This paper presents an energy-efficient hardware architecture and VLSI implementation of polyphase channelizer, integrating algorithmic, architectural and circuit level design techniques. At algorithm level, low complexity polyphase channelizer architecture is derived using multirate signal processing approach. To reduce the computational complexity in polyphase filters, computation sharing differential coefficient (CSDC) method is effectively used as an architectural level technique. The main idea of CSDC is to combine the strength of augmented differential coefficient method and subexpression sharing. Efficient circuitlevel techniques: low power commutator implementation, dual-VDD scheme and novel level-converting flip-flop (LCFF), are also used to further reduce the power dissipation. The proposed polyphase channelizer consumes 352 mW power with throughput of 480 million samples per second (MSPS). A test chip has been fabricated in 0.18 μm CMOS technology and its functionality is verified. Chip measurement results show that the dual-VDD implementation achieves a total power saving of 2.7 X

    Design of large polyphase filters in the Quadratic Residue Number System

    Full text link

    Temperature aware power optimization for multicore floating-point units

    Full text link
    corecore