45 research outputs found

    A High-Speed QR Decomposition Processor for Carrier-Aggregated LTE-A Downlink Systems

    This paper presents a high-speed QR decomposition (QRD) processor targeting the carrier-aggregated 4 × 4 Long Term Evolution-Advanced (LTE-A) receiver. The processor provides robustness in spatially correlated channels with reduced complexity by using modifications to the Householder transform, such as decomposing-target redefinition and matrix real-valued decomposition. In terms of hardware design, we extensively explore flexibilities in systolic architectures using a high-level synthesis tool to achieve area-power efficiency. In a 65 nm CMOS technology, the processor occupies a core area of 0.77 mm² and produces 72 MQRD per second, the highest reported throughput. The processor consumes 219 mW.
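
The abstract above names the Householder transform and a real-valued matrix decomposition without spelling either out. As a rough illustration only, the sketch below shows the textbook forms of both: a complex matrix is mapped to the real block form [[Re, -Im], [Im, Re]], and a plain Householder QR is applied to it. The function names `real_valued` and `householder_qr` are hypothetical, and the paper's own modifications (such as decomposing-target redefinition) and its systolic hardware mapping are not reproduced here.

```python
import math


def real_valued(h):
    """Map a complex m x n matrix to the real 2m x 2n form [[Re, -Im], [Im, Re]]."""
    top = [[z.real for z in row] + [-z.imag for z in row] for row in h]
    bottom = [[z.imag for z in row] + [z.real for z in row] for row in h]
    return top + bottom


def householder_qr(a):
    """Textbook Householder QR of a real matrix: returns (q, r) with a = q r."""
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for k in range(min(m - 1, n)):
        # Column k from the diagonal downwards.
        x = [r[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(v * v for v in x))
        if norm_x == 0.0:
            continue  # nothing to eliminate in this column
        # Reflect x onto alpha * e1; the sign choice avoids cancellation.
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)
        # r <- H r, with H = I - 2 v v^T / (v^T v) acting on rows k..m-1.
        for j in range(n):
            dot = sum(v[i] * r[k + i][j] for i in range(len(v)))
            scale = 2.0 * dot / vtv
            for i in range(len(v)):
                r[k + i][j] -= scale * v[i]
        # q <- q H (H is symmetric), so that q accumulates H_1 H_2 ... H_k.
        for i in range(m):
            dot = sum(q[i][k + t] * v[t] for t in range(len(v)))
            scale = 2.0 * dot / vtv
            for t in range(len(v)):
                q[i][k + t] -= scale * v[t]
    return q, r


# Hypothetical 2 x 2 complex channel matrix, used only to exercise the code.
channel = [[1 + 2j, 3 - 1j], [0.5j, 2 + 0j]]
q, r = householder_qr(real_valued(channel))
```

With this mapping a 4 × 4 complex channel matrix becomes an 8 × 8 real matrix, so all arithmetic in the decomposition is real-valued.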

    Adaptive Baseband Processing and Configurable Hardware for Wireless Communication

    The world of information is literally at one’s fingertips, allowing access to previously unimaginable amounts of data, thanks to advances in wireless communication. The growing demand for high speed data has necessitated the use of wider bandwidths, and wireless technologies such as Multiple-Input Multiple-Output (MIMO) have been adopted to increase spectral efficiency. These advanced communication technologies require sophisticated signal processing, often leading to higher power consumption and reduced battery life. Therefore, increasing energy efficiency of baseband hardware for MIMO signal processing has become vital. High Quality of Service (QoS) requirements invariably lead to a larger number of computations and a higher power dissipation. However, recognizing the dynamic nature of the wireless communication medium in which only some channel scenarios require complex signal processing, and that not all situations call for high data rates, allows the use of an adaptive channel aware signal processing strategy to provide a desired QoS. Information such as interference conditions, coherence bandwidth and Signal to Noise Ratio (SNR) can be used to reduce algorithmic computations in favorable channels. Hardware circuits which run these algorithms need flexibility and easy reconfigurability to switch between multiple designs for different parameters. These parameters can be used to tune the operations of different components in a receiver based on feedback from the digital baseband. This dissertation focuses on the optimization of digital baseband circuitry of receivers which use feedback to trade power and performance. A co-optimization approach, where designs are optimized starting from the algorithmic stage through the hardware architectural stage to the final circuit implementation, is adopted to realize energy efficient digital baseband hardware for mobile 4G devices. 
These concepts are also extended to the next generation 5G systems where the energy efficiency of the base station is improved. This work includes six papers that examine digital circuits in MIMO wireless receivers. Several key blocks in these receivers include analog circuits that have residual non-linearities, leading to signal intermodulation and distortion. Paper-I introduces a digital technique to detect such non-linearities and calibrate analog circuits to improve signal quality. The concept of a digital nonlinearity tuning system developed in Paper-I is implemented and demonstrated in hardware. The performance of this implementation is tested with an analog channel select filter, and results are presented in Paper-II. MIMO systems such as the ones used in 4G may employ QR Decomposition (QRD) processors to simplify the implementation of tree search based signal detectors. However, the small form factor of the mobile device increases spatial correlation, which is detrimental to signal multiplexing. Consequently, a QRD processor capable of handling high spatial correlation is presented in Paper-III. The algorithm and hardware implementation are optimized for carrier aggregation, which increases requirements on signal processing throughput, leading to higher power dissipation. Paper-IV presents a method to perform channel-aware processing with a simple interpolation strategy to adaptively reduce QRD computation count. Channel properties such as coherence bandwidth and SNR are used to reduce multiplications by 40% to 80%. These concepts are extended to use time domain correlation properties, and a full QRD processor for 4G systems fabricated in 28 nm FD-SOI technology is presented in Paper-V. The design is implemented with a configurable architecture and measurements show that circuit tuning results in a highly energy efficient processor, requiring 0.2 nJ to 1.3 nJ for each QRD. 
Finally, these adaptive channel-aware signal processing concepts are examined in the scope of the next generation of communication systems. Massive MIMO systems increase spectral efficiency by using a large number of antennas at the base station. Consequently, the signal processing at the base station has a high computational count. Paper-VI presents a configurable detection scheme which reduces this complexity by using techniques such as selective user detection and interpolation based signal processing. Hardware is optimized for resource sharing, resulting in a highly reconfigurable and energy efficient uplink signal detector.

    Efficient FPGA implementation and power modelling of image and signal processing IP cores

    Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number of image and signal processing application areas such as consumer electronics, instrumentation, medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power leading to a number of issues including reduced mobility, reliability concerns and increased design cost among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. 
A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining, has been performed. Initial results are very promising and indicate significant potential for future research in this area. A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Spectrum Optimisation in Wireless Communication Systems: Technology Evaluation, System Design and Practical Implementation

    Two key technology enablers for next generation networks are examined in this thesis, namely Cognitive Radio (CR) and Spectrally Efficient Frequency Division Multiplexing (SEFDM). The first part proposes the use of traffic prediction in CR systems to improve the Quality of Service (QoS) for CR users. A framework is presented which allows CR users to capture a frequency slot in an idle licensed channel occupied by primary users. This is achieved by using CR to sense and select target spectrum bands combined with traffic prediction to determine the optimum channel-sensing order. The latter part of this thesis considers the design, practical implementation and performance evaluation of SEFDM. The key challenge that arises in SEFDM is the self-created interference which complicates the design of receiver architectures. Previous work has focused on the development of sophisticated detection algorithms; however, these suffer from an impractical computational complexity. Consequently, the aim of this work is two-fold: first, to reduce the complexity of existing algorithms to make them better-suited for application in the real world; second, to develop hardware prototypes to assess the feasibility of employing SEFDM in practical systems. The impact of oversampling and fixed-point effects on the performance of SEFDM is initially determined, followed by the design and implementation of linear detection techniques using Field Programmable Gate Arrays (FPGAs). The performance of these FPGA based linear receivers is evaluated in terms of throughput, resource utilisation and Bit Error Rate (BER). Finally, variants of the Sphere Decoding (SD) algorithm are investigated to ameliorate the error performance of SEFDM systems with targeted reduction in complexity. The Fixed SD (FSD) algorithm is implemented on a Digital Signal Processor (DSP) to measure its computational complexity. 
Modified sorting and decomposition strategies are then applied to this FSD algorithm, offering trade-offs between execution speed and BER.

    Forward diffraction modelling: analysis and application to grating reconstruction

    The semiconductor industry uses lithography machines for manufacturing complex integrated circuits (also called ICs) onto wafers. Because an IC is built up layer by layer and feature sizes get smaller and smaller, tight control of the lithography process is required to guarantee a fast production of working ICs. Typically a lot of information on the lithography process can be obtained by measuring test structures or gratings which are scattered over the wafer. These gratings are tiny periodic structures much smaller than ICs. First these gratings are illuminated and their response (a scattered intensity) is measured. For certain applications like overlay metrology the asymmetry in this measured signal (due to an offset between two gratings) can be used to align the lithographic process. For other applications like critical dimension (CD) metrology one is interested in the shape of the grating lines that produced the measured signal. Since this information is not directly available but encoded in the measurement, a reconstruction algorithm is used to extract it. The reconstructed values like height, width and sidewall angle can then be related to machine settings like dose and focus which control the lithographic process. In particular the CD metrology application requires rigorous mathematical models that solve optical diffraction problems for periodic gratings in combination with advanced reconstruction algorithms. This thesis focuses on the optical diffraction problem for 1D periodic gratings. Starting from Maxwell's equations a reduced model is derived by simplifying both the grating and the incident electromagnetic field. The former is approximated with an infinitely periodic layered structure with isotropic non-magnetic materials. The latter is approximated with a time-harmonic incident plane wave. The reduced model is discretised using two different mode expansion methods, Bloch and the Rigorous Coupled-Wave Analysis (RCWA). 
Bloch expands the electromagnetic field in each layer in terms of the exact eigenfunctions whereas RCWA only uses approximate eigenfunctions. After truncation of the involved series a transmission problem is derived by matching the fields at the layer interfaces. Having solved the resulting linear system, the scattered field can be computed easily. Both mode expansion methods solve a similar linear system containing a large but sparse block-structured coefficient matrix. However, special care needs to be taken when solving this system stably and efficiently. Therefore a stable condensation algorithm is derived based on Riccati transformations that decouples the exponentially growing and decaying terms that are present in the solution. This separation or decoupling is the key feature explaining the stability which is not always clear in alternative condensation algorithms. Furthermore the algorithm is optimised for speed by using a two-stage approach. Finally it is shown that the resulting stable recursions are identical to those used in the 'enhanced transmittance matrix approach' (a frequently used condensation algorithm), thereby confirming its stability as well. This thesis also examines and extends both mode expansion methods. The Bloch method is generalised to deal with multiple material transitions inside a grating layer covering a wider range of applications. However, lossy or fully asymmetric gratings are still hard to solve. On the other hand the Fourier discretisation used in RCWA is much more flexible but only approximates the more exact discretisation of Bloch. Therefore two RCWA modifications have been investigated to improve the accuracy while keeping its flexibility and relatively straightforward implementation. Adaptive Spatial Resolution applies an additional layer specific coordinate transformation before Fourier discretising the problem again. A good transformation not only refines near a material interface but also does this in a smooth way. 
A significant improvement in accuracy is observed that approaches and sometimes outperforms the results obtained with the Bloch method. The second modification removes the Fourier discretisation completely and uses a finite difference approximation in the periodic direction. Although this approach allows for a better discretisation near a material interface, the sparsity of the resulting matrices could not be exploited to make a competitive implementation within the standard RCWA framework. Finally the integration of the forward diffraction model in the CD reconstruction application is discussed. Either a library based or real-time regression approach can be used for this reconstruction. Both approaches rely heavily on having an accurate and fast forward model. By exploiting additional symmetries and smart reuse of information, acceptable library fill times and real-time reconstructions are now feasible.

    Modelling and measurement analysis of the satellite MIMO radio channel

    The increasing demand for terrestrial and satellite delivered digital multimedia services has precipitated the problem of spectrum scarcity in recent years. This has resulted in deployment of spectrally efficient technologies such as MIMO for terrestrial systems. However, MIMO cannot be easily deployed for the satellite channel using conventional spatial multiplexing as the channel conditions here are very different from the terrestrial case, and it is often dominated by line of sight fading. Orthogonal circular polarization, which has long been used for increasing both frequency reuse and the power spectral density available to earth-bound satellite terminals, has recently been recommended for directly increasing the throughput available to such devices. Following that theme, this thesis proposes a novel dual circular polarisation multiplexing (DCPM) technique, which is aimed at the burgeoning area of throughput-hungry digital video broadcasting via satellite to handheld devices (DVB-SH) and digital video broadcast to the next generation of hand held (DVB-NGH) systems. In determining the working limits of DCPM, a series of measurement campaigns have been performed, from which extensive dual circular polarised land mobile satellite (LMS) channel data has been derived. Using the newly available channel data and with the aid of statistical channel modelling tools found in literature, a new dual circular polarised LMS MIMO channel model has been developed. This model, in contrast with previously available LMS MIMO channel models, is simpler to implement since it uses a distinct state-based empirical-stochastic approach. The model has been found to be robust and it easily lends itself to rapid implementation for system level MIMO and DCPM analysis. Finally, by way of bit error rate (BER) analysis in different channel fading conditions, it has been determined when best to implement polarisation multiplexing or conventional 
MIMO techniques for DVB-type land mobile receivers. It is recommended that DCPM be used when the channel is predominantly Ricean, with co-polar channel Rice factors and sub-channel cross correlation values greater than 1 dB and 0.40 respectively. The recommendations provided by this research are valuable contributions, which may help shape the evolving DVB-NGH standardisation process. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Low-Complexity Algorithms for Channel Estimation in Optimised Pilot-Assisted Wireless OFDM Systems

    Orthogonal frequency division multiplexing (OFDM) has recently become a dominant transmission technology considered for the next generation fixed and mobile broadband wireless communication systems. OFDM has an advantage of lessening the severe effects of the frequency-selective (multipath) fading due to the band splitting into relatively flat fading subchannels, and allows for low-complexity transceiver implementation based on the fast Fourier transform algorithms. Combining OFDM modulation with multilevel frequency-domain symbol mapping (e.g., QAM) and spatial multiplexing (SM) over the multiple-input multiple-output (MIMO) channels, can theoretically achieve near Shannon capacity of the communication link. However, the high-rate and spectrum-efficient system implementation requires coherent detection at the receiving end that is possible only when accurate channel state information (CSI) is available. Since in practice, the response of the wireless channel is unknown and is subject to random variation with time, the receiver typically employs a channel estimator for CSI acquisition. The channel response information retrieved by the estimator is then used by the data detector and can also be fed back to the transmitter by means of in-band or out-of-band signalling, so the latter could adapt power loading, modulation and coding parameters according to the channel conditions. Thus, design of an accurate and robust channel estimator is a crucial requirement for reliable communication through the channel, which is selective in time and frequency. In a MIMO configuration, a separate channel estimator has to be associated with each transmit/receive antenna pair, making the estimation algorithm complexity a primary concern. Pilot-assisted methods, relying on the insertion of reference symbols in certain frequencies and time slots, have been found attractive for identification of the doubly-selective radio channels from both the complexity and performance standpoint. 
In this dissertation, a family of the reduced-complexity estimators for the single and multiple-antenna OFDM systems is developed. The estimators are based on the transform-domain processing and have the same order of computational complexity, irrespective of the number of pilot subcarriers and their positioning. The common estimator structure represents a cascade of successive small-dimension filtering modules. The number of modules, as well as their order inside the cascade, is determined by the class of the estimator (one or two-dimensional) and availability of the channel statistics (correlation and signal-to-noise power ratio). For fine precision estimation in the multipath channels with statistics not known a priori, we propose recursive design of the filtering modules. Simulation results show that in the steady state, performance of the recursive estimators approaches that of their theoretical counterparts, which are optimal in the minimum mean square error (MMSE) sense. In contrast to the majority of the channel estimators developed so far, our modular-type architectures are suitable for the reconfigurable OFDM transceivers where the actual channel conditions influence the decision of what class of filtering algorithm to use, and how to allot pilot subcarrier positions in the band. In the pilot-assisted transmissions, channel estimation and detection are performed separately from each other over the distinct subcarrier sets. The estimator output is used only to construct the detector transform, but not as the detector input. Since performance of both channel estimation and detection depends on the signal-to-noise power ratio (SNR) at the corresponding subcarriers, there is a dilemma of the optimal power allocation between the data and the pilot symbols as these are conflicting requirements under the total transmit power constraint. The problem is exacerbated by the variety of channel estimators. 
Each kind of estimation algorithm is characterised by its own SNR gain, which in general can vary depending on the channel correlation. In this dissertation, we optimise pilot-data power allocation for the case of developed low-complexity one and two-dimensional MMSE channel estimators. The resultant contribution is manifested by the closed-form analytical expressions of the upper bound (suboptimal approximate value) on the optimal pilot-to-data power ratio (PDR) as a function of a number of design parameters (number of subcarriers, number of pilots, number of transmit antennas, effective order of the channel model, maximum Doppler shift, SNR, etc.). The resultant PDR equations can be applied to the MIMO-OFDM systems with arbitrary arrangement of the pilot subcarriers, operating in an arbitrary multipath fading channel. These properties and relatively simple functional representation of the derived analytical PDR expressions are designated to alleviate the challenging task of on-the-fly optimisation of the adaptive SM-MIMO-OFDM system, which is capable of adjusting transmit signal configuration (e.g., block length, number of pilot subcarriers or antennas) according to the established channel conditions.
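
The abstract above describes pilot-assisted channel estimation in general terms. As a point of reference only, the sketch below shows the simplest pilot-assisted baseline for a single OFDM symbol: a least-squares estimate at each pilot subcarrier (received sample divided by the known pilot symbol), followed by linear interpolation across the data subcarriers. This is not the dissertation's transform-domain or MMSE estimator; the function name `estimate_channel` and all values are hypothetical and chosen purely for illustration.

```python
def estimate_channel(rx, pilot_idx, pilot_sym):
    """Least-squares pilot estimates, linearly interpolated over all subcarriers.

    rx        -- received frequency-domain samples, one per subcarrier
    pilot_idx -- sorted subcarrier indices that carry pilots
    pilot_sym -- known transmitted pilot symbols, same order as pilot_idx
    """
    # LS estimate at each pilot: H[k] = Y[k] / X[k].
    h_pilot = [rx[k] / s for k, s in zip(pilot_idx, pilot_sym)]
    h = [0j] * len(rx)
    for k in range(len(rx)):
        if k <= pilot_idx[0]:
            h[k] = h_pilot[0]            # hold the first estimate at the band edge
        elif k >= pilot_idx[-1]:
            h[k] = h_pilot[-1]           # hold the last estimate at the band edge
        else:
            # Index of the nearest pilot at or below subcarrier k.
            j = max(i for i, p in enumerate(pilot_idx) if p <= k)
            left, right = pilot_idx[j], pilot_idx[j + 1]
            w = (k - left) / (right - left)
            h[k] = (1 - w) * h_pilot[j] + w * h_pilot[j + 1]
    return h


# Hypothetical noiseless example: a channel that varies linearly with frequency.
n_sub = 9
h_true = [(1 + 0.5j) + k * (0.1 - 0.05j) for k in range(n_sub)]
pilot_idx = [0, 4, 8]
pilot_sym = [1 + 1j, 1 + 1j, 1 + 1j]
tx = [1 + 0j] * n_sub                     # placeholder data symbols
for k, s in zip(pilot_idx, pilot_sym):
    tx[k] = s
rx = [h_true[k] * tx[k] for k in range(n_sub)]
h_est = estimate_channel(rx, pilot_idx, pilot_sym)
```

In this noiseless, linearly varying example the interpolated estimate matches the true channel; with noise or a more frequency-selective channel the interpolation error grows, which is where estimators that use channel statistics become relevant.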

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG) presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.