40 research outputs found

    FPGA Prototyping of A High Data Rate LTE Uplink Baseband Receiver

    Get PDF
    The Third Generation Partnership Project (3GPP) Long Term Evolution (LTE) standard is becoming the appropriate choice to pave the way for the next generation wireless and cellular standards. While the popular OFDM technique has been adopted and implemented in previous standards and also in the LTE downlink, it suffers from high peak-to-average-power ratio (PAPR). High PAPR requires more sophisticated power amplifiers (PAs) in the handsets and would result in lower efficiency PAs. In order to combat such effects, the LTE uplink choice of transmission is the novel Single Carrier Frequency Division Multiple Access (SC-FDMA) scheme which has lower PAPR due to its inherent signal structure. While reducing the PAPR, the SC-FDMA requires a more complicated detector structure in the base station for multi-antenna and multi-user scenarios. Since the multi-antenna and multi-user scenarios are critical parts of the LTE standard to deliver high performance and data rate, it is important to design novel architectures to ensure high reliability and data rate in the receiver. In this paper, we propose a flexible architecture of a high data rate LTE uplink receiver with multiple receive antennas and implemented a single FPGA prototype of this architecture. The architecture is verified on the WARPLab (a software defined radio platform based on Rice Wireless Open-access Research Platform) and tested in the real over-the-air indoor channel.NokiaNokia Siemens Networks (NSN)XilinxAzimuth SystemsNational Science Foundatio

    Spectrum Optimisation in Wireless Communication Systems: Technology Evaluation, System Design and Practical Implementation

    Get PDF
    Two key technology enablers for next generation networks are examined in this thesis, namely Cognitive Radio (CR) and Spectrally Efficient Frequency Division Multiplexing (SEFDM). The first part proposes the use of traffic prediction in CR systems to improve the Quality of Service (QoS) for CR users. A framework is presented which allows CR users to capture a frequency slot in an idle licensed channel occupied by primary users. This is achieved by using CR to sense and select target spectrum bands combined with traffic prediction to determine the optimum channel-sensing order. The latter part of this thesis considers the design, practical implementation and performance evaluation of SEFDM. The key challenge that arises in SEFDM is the self-created interference which complicates the design of receiver architectures. Previous work has focused on the development of sophisticated detection algorithms, however, these suffer from an impractical computational complexity. Consequently, the aim of this work is two-fold; first, to reduce the complexity of existing algorithms to make them better-suited for application in the real world; second, to develop hardware prototypes to assess the feasibility of employing SEFDM in practical systems. The impact of oversampling and fixed-point effects on the performance of SEFDM is initially determined, followed by the design and implementation of linear detection techniques using Field Programmable Gate Arrays (FPGAs). The performance of these FPGA based linear receivers is evaluated in terms of throughput, resource utilisation and Bit Error Rate (BER). Finally, variants of the Sphere Decoding (SD) algorithm are investigated to ameliorate the error performance of SEFDM systems with targeted reduction in complexity. The Fixed SD (FSD) algorithm is implemented on a Digital Signal Processor (DSP) to measure its computational complexity. Modified sorting and decomposition strategies are then applied to this FSD algorithm offering trade-offs between execution speed and BER

    Baseband Processing for 5G and Beyond: Algorithms, VLSI Architectures, and Co-design

    Get PDF
    In recent years the number of connected devices and the demand for high data-rates have been significantly increased. This enormous growth is more pronounced by the introduction of the Internet of things (IoT) in which several devices are interconnected to exchange data for various applications like smart homes and smart cities. Moreover, new applications such as eHealth, autonomous vehicles, and connected ambulances set new demands on the reliability, latency, and data-rate of wireless communication systems, pushing forward technology developments. Massive multiple-input multiple-output (MIMO) is a technology, which is employed in the 5G standard, offering the benefits to fulfill these requirements. In massive MIMO systems, base station (BS) is equipped with a very large number of antennas, serving several users equipments (UEs) simultaneously in the same time and frequency resource. The high spatial multiplexing in massive MIMO systems, improves the data rate, energy and spectral efficiencies as well as the link reliability of wireless communication systems. The link reliability can be further improved by employing channel coding technique. Spatially coupled serially concatenated codes (SC-SCCs) are promising channel coding schemes, which can meet the high-reliability demands of wireless communication systems beyond 5G (B5G). Given the close-to-capacity error correction performance and the potential to implement a high-throughput decoder, this class of code can be a good candidate for wireless systems B5G. In order to achieve the above-mentioned advantages, sophisticated algorithms are required, which impose challenges on the baseband signal processing. In case of massive MIMO systems, the processing is much more computationally intensive and the size of required memory to store channel data is increased significantly compared to conventional MIMO systems, which are due to the large size of the channel state information (CSI) matrix. In addition to the high computational complexity, meeting latency requirements is also crucial. Similarly, the decoding-performance gain of SC-SCCs also do come at the expense of increased implementation complexity. Moreover, selecting the proper choice of design parameters, decoding algorithm, and architecture will be challenging, since spatial coupling provides new degrees of freedom in code design, and therefore the design space becomes huge. The focus of this thesis is to perform co-optimization in different design levels to address the aforementioned challenges/requirements. To this end, we employ system-level characteristics to develop efficient algorithms and architectures for the following functional blocks of digital baseband processing. First, we present a fast Fourier transform (FFT), an inverse FFT (IFFT), and corresponding reordering scheme, which can significantly reduce the latency of orthogonal frequency-division multiplexing (OFDM) demodulation and modulation as well as the size of reordering memory. The corresponding VLSI architectures along with the application specific integrated circuit (ASIC) implementation results in a 28 nm CMOS technology are introduced. In case of a 2048-point FFT/IFFT, the proposed design leads to 42% reduction in the latency and size of reordering memory. Second, we propose a low-complexity massive MIMO detection scheme. The key idea is to exploit channel sparsity to reduce the size of CSI matrix and eventually perform linear detection followed by a non-linear post-processing in angular domain using the compressed CSI matrix. The VLSI architecture for a massive MIMO with 128 BS antennas and 16 UEs along with the synthesis results in a 28 nm technology are presented. As a result, the proposed scheme reduces the complexity and required memory by 35%–73% compared to traditional detectors while it has better detection performance. Finally, we perform a comprehensive design space exploration for the SC-SCCs to investigate the effect of different design parameters on decoding performance, latency, complexity, and hardware cost. Then, we develop different decoding algorithms for the SC-SCCs and discuss the associated decoding performance and complexity. Also, several high-level VLSI architectures along with the corresponding synthesis results in a 12 nm process are presented, and various design tradeoffs are provided for these decoding schemes

    Exploration of the scalability of SIMD processing for software defined radio

    Get PDF
    The idea of software defined radio (SDR) describes a signal processing system for wireless communications that allows performing major parts of the physical layer processing in software. SDR systems are more flexible and have lower development costs than traditional systems based on application-specific integrated circuits (ASICs). Yet, SDR requires programmable processor architectures that can meet the throughput and energy efficiency requirements of current third generation (3G) and future fourth generation (4G) wireless standards for mobile devices. Single instruction, multiple data (SIMD) processors operate on long data vectors in parallel data lanes and can achieve a good ratio of computing power to energy consumption. Hence, SIMD processors could be the basis of future SDR systems. Yet, SIMD processors only achieve a high efficiency if all parallel data lanes can be utilized. This thesis investigates the scalability of SIMD processing for algorithms required in 4G wireless systems; i. e. the scaling of performance and energy consumption with increasing SIMD vector lengths is explored. The basis of the exploration is a scalable SIMD processor architecture, which also supports long instruction word (LIW) execution and can be configured with four different permutation networks for vector element permutations. Radix-2 and mixed-radix fast Fourier transform (FFT) algorithms, sphere decoding for multiple input, multiple output (MIMO) systems, and the decoding of quasi-cyclic lowdensity parity check (LDPC) codes have been examined, as these are key algorithms for 4G wireless systems. The results show that the performance of all algorithms scales with the SIMD vector length, yet there are different constraints on the ratios between algorithm and architecture parameters. The radix-2 FFT algorithm allows close to linear speedups if the FFT size is at least twice the SIMD vector length, the mixed-radix FFT algorithm requires the FFT size to be a multiple of the squared SIMD width. The performance of the implemented sphere decoding algorithm scales linearly with the SIMD vector length. The scalability of LDPC decoding is determined by the expansion factor of the quasicyclic code. Wider SIMD processors offer better performance and also require less energy than processors with a shorter vector length for all considered algorithms. The results for different permutations networks show that a simple permutation network is sufficient for most applications

    Datacenter Design for Future Cloud Radio Access Network.

    Full text link
    Cloud radio access network (C-RAN), an emerging cloud service that combines the traditional radio access network (RAN) with cloud computing technology, has been proposed as a solution to handle the growing energy consumption and cost of the traditional RAN. Through aggregating baseband units (BBUs) in a centralized cloud datacenter, C-RAN reduces energy and cost, and improves wireless throughput and quality of service. However, designing a datacenter for C-RAN has not yet been studied. In this dissertation, I investigate how a datacenter for C-RAN BBUs should be built on commodity servers. I first design WiBench, an open-source benchmark suite containing the key signal processing kernels of many mainstream wireless protocols, and study its characteristics. The characterization study shows that there is abundant data level parallelism (DLP) and thread level parallelism (TLP). Based on this result, I then develop high performance software implementations of C-RAN BBU kernels in C++ and CUDA for both CPUs and GPUs. In addition, I generalize the GPU parallelization techniques of the Turbo decoder to the trellis algorithms, an important family of algorithms that are widely used in data compression and channel coding. Then I evaluate the performance of commodity CPU servers and GPU servers. The study shows that the datacenter with GPU servers can meet the LTE standard throughput with 4× to 16× fewer machines than with CPU servers. A further energy and cost analysis show that GPU servers can save on average 13× more energy and 6× more cost. Thus, I propose the C-RAN datacenter be built using GPUs as a server platform. Next I study resource management techniques to handle the temporal and spatial traffic imbalance in a C-RAN datacenter. I propose a “hill-climbing” power management that combines powering-off GPUs and DVFS to match the temporal C-RAN traffic pattern. Under a practical traffic model, this technique saves 40% of the BBU energy in a GPU-based C-RAN datacenter. For spatial traffic imbalance, I propose three workload distribution techniques to improve load balance and throughput. Among all three techniques, pipelining packets has the most throughput improvement at 10% and 16% for balanced and unbalanced loads, respectively.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120825/1/qizheng_1.pd

    Bandwidth Compressed Waveform and System Design for Wireless and Optical Communications: Theory and Practice

    Get PDF
    This thesis addresses theoretical and practical challenges of spectrally efficient frequency division multiplexing (SEFDM) systems in both wireless and optical domains. SEFDM improves spectral efficiency relative to the well-known orthogonal frequency division multiplexing (OFDM) by non-orthogonally multiplexing overlapped sub-carriers. However, the deliberate violation of orthogonality results in inter carrier interference (ICI) and associated detection complexity, thus posing many challenges to practical implementations. This thesis will present solutions for these issues. The thesis commences with the fundamentals by presenting the existing challenges of SEFDM, which are subsequently solved by proposed transceivers. An iterative detection (ID) detector iteratively removes self-created ICI. Following that, a hybrid ID together with fixed sphere decoding (FSD) shows an optimised performance/complexity trade-off. A complexity reduced Block-SEFDM can subdivide the signal detection into several blocks. Finally, a coded Turbo-SEFDM is proved to be an efficient technique that is compatible with the existing mobile standards. The thesis also reports the design and development of wireless and optical practical systems. In the optical domain, given the same spectral efficiency, a low-order modulation scheme is proved to have a better bit error rate (BER) performance when replacing a higher order one. In the wireless domain, an experimental testbed utilizing the LTE-Advanced carrier aggregation (CA) with SEFDM is operated in a realistic radio frequency (RF) environment. Experimental results show that 40% higher data rate can be achieved without extra spectrum occupation. Additionally, a new waveform, termed Nyquist-SEFDM, which compresses bandwidth and suppresses out-of-band power leakage is investigated. A 4th generation (4G) and 5th generation (5G) coexistence experiment is followed to verify its feasibility. Furthermore, a 60 GHz SEFDM testbed is designed and built in a point-to-point indoor fiber wireless experiment showing 67% data rate improvement compared to OFDM. Finally, to meet the requirements of future networks, two simplified SEFDM transceivers are designed together with application scenarios and experimental verifications

    Energy efficient design of an adaptive switching algorithm for the iterative-MIMO receiver

    Get PDF
    An efficient design dedicated for iterative-multiple-input multiple-output (MIMO) receiver systems is now imperative in our world since data demands are increasing tremendously in wireless networks. This puts a massive burden on the signal processing power especially in small receiver systems where power sources are often shared or limited. This thesis proposes an attractive solution to both the wireless signal processing and the architectural implementation design sides of the problem. A novel algorithm, dubbed the Adaptive Switching Algorithm, is proven to not only save more than a third of the energy consumption in the algorithmic design, but is also able to achieve an energy reduction of more than 50% in terms of processing power when the design is mapped onto state-of-the-art programmable hardware. Simulations are based in MatlabTM using the Monte Carlo approach, where multiple additive white Gaussian noise (AWGN) and Rayleigh fading channels for both fast and slow fading environments were investigated. The software selects the appropriate detection algorithm depending on the current channel conditions. The design for the hardware is based on the latest field programmable gate arrays (FPGA) hardware from Xilinx R , specifically the Virtex-5 and Virtex-7 chipsets. They were chosen during the experimental phase to verify the results in order to examine trends for energy consumption in the proposed algorithm design. Savings come from dynamic allocation of the hardware resources by implementing power minimization techniques depending on the processing requirements of the system. Having demonstrated the feasibility of the algorithm in controlled environments, realistic channel conditions were simulated using spatially correlated MIMO channels to test the algorithm’s readiness for real-world deployment. The proposed algorithm is placed in both the MIMO detector and the iterative-decoder blocks of the receiver. When the final full receiver design setup is implemented, it shows that the key to energy saving lies in the fact that both software and hardware components of the Adaptive Switching Algorithm adopt adaptivity in the respective designs. The detector saves energy by selecting suitable detection schemes while the decoder provides adaptivity by limiting the number of decoding iterations, both of which are updated in real-time. The overall receiver can achieve more than 70% energy savings in comparison to state-of-the-art iterative-MIMO receivers and thus it can be concluded that this level of ‘intelligence’ is an important direction towards a more efficient iterative-MIMO receiver designs in the future

    Timing and Carrier Synchronization in Wireless Communication Systems: A Survey and Classification of Research in the Last 5 Years

    Get PDF
    Timing and carrier synchronization is a fundamental requirement for any wireless communication system to work properly. Timing synchronization is the process by which a receiver node determines the correct instants of time at which to sample the incoming signal. Carrier synchronization is the process by which a receiver adapts the frequency and phase of its local carrier oscillator with those of the received signal. In this paper, we survey the literature over the last 5 years (2010–2014) and present a comprehensive literature review and classification of the recent research progress in achieving timing and carrier synchronization in single-input single-output (SISO), multiple-input multiple-output (MIMO), cooperative relaying, and multiuser/multicell interference networks. Considering both single-carrier and multi-carrier communication systems, we survey and categorize the timing and carrier synchronization techniques proposed for the different communication systems focusing on the system model assumptions for synchronization, the synchronization challenges, and the state-of-the-art synchronization solutions and their limitations. Finally, we envision some future research directions
    corecore