14 research outputs found

    Power-Efficient Hardware Architecture for Computing Split-Radix FFT on Highly Sparse Spectrum

    SUMMARY This thesis addresses the problem of efficiently transforming signals from the time domain to the frequency domain when the frequency spectrum is sparsely populated. The well-known fast Fourier transform (FFT) is the preferred signal processing algorithm for observing the frequency content of incoming signals in telecommunication transceivers such as cognitive radio or software-defined radio, which typically use it to analyze the spectrum within a frequency band. Implementing the ordinary FFT can place a heavy computational burden on processors and can therefore entail considerable power consumption. Energy supply is a limited resource in mobile devices and can therefore be critical for mobile telecommunication devices. To develop a power-efficient processor for time-frequency transformation, a Fourier transform algorithm that is more efficient in terms of the number of complex multiplications and additions is selected. Indeed, the Split-Radix Fast Fourier Transform (SRFFT) outperforms the conventional FFT by reducing the number of complex multiplications required, and it can therefore lead to lower power consumption. Pruning the unnecessary computations, i.e. complex multiplications with zero inputs or outputs, throughout the algorithm can reduce power consumption. A power-efficient hardware architecture based on this pruning is thus developed for computing the SRFFT. To exploit the potential of the SRFFT, a new configurable SRFFT processor architecture is first designed, and the architecture is then extended to eliminate the unnecessary computations. This is done through the appropriate use of a pruning matrix.
    ABSTRACT The problem of transferring a time domain signal into the frequency domain in an efficient manner, when the frequency contents are sparsely distributed, is the research topic covered in this thesis. The well-known Fast Fourier Transform (FFT) is the most common signal processing algorithm for observing the frequency contents of incoming signals in telecommunication transceivers. It is notably used in cognitive or software defined radio, which usually requires monitoring the spectrum in a wide frequency band. This may impose a heavy computational burden on processors when the ordinary FFT algorithm is implemented, and hence lead to considerable power consumption. Power and energy supply is a limited resource in mobile devices, and efficient execution of the Fourier transform has therefore turned out to be critical for mobile telecommunication devices. With the purpose of developing a power-efficient processor for time-frequency transformation, the most computationally efficient Fourier transform algorithm is selected among the existing Fourier transform algorithms after studying them in terms of required arithmetic operations, i.e. complex multiplications and additions. Indeed, the Split-Radix Fast Fourier Transform (SRFFT) performs better than the conventional FFT in terms of a reduced number of complex multiplications and hence can reduce power consumption. Applying the concept of pruning the unnecessary computations, i.e. complex multiplications with either zero inputs or outputs, throughout the whole algorithm may reduce the power consumption even further.
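    The pruning idea mentioned in the abstract above (skipping complex multiplications whose operand is zero) can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the thesis architecture: it uses a plain recursive radix-2 decimation-in-time FFT rather than the split-radix algorithm, it prunes only on zero inputs, and the names `pruned_fft`, `naive_dft` and the `counter` dictionary are hypothetical.

```python
import cmath


def pruned_fft(x, counter):
    """Radix-2 DIT FFT (length must be a power of two).

    Illustrative assumption: a twiddle multiplication is skipped whenever
    its operand is exactly zero; counter["done"] and counter["skipped"]
    record how many butterfly multiplications were performed or pruned.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = pruned_fft(x[0::2], counter)
    odd = pruned_fft(x[1::2], counter)
    out = [0j] * n
    for k in range(n // 2):
        if odd[k] == 0:
            # Zero operand: the product is zero, so no multiplication is needed.
            t = 0j
            counter["skipped"] += 1
        else:
            t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
            counter["done"] += 1
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def naive_dft(x):
    """Direct O(n^2) DFT, used only as a reference for checking results."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


# Usage: a length-8 unit impulse has mostly zero inputs.
counter = {"done": 0, "skipped": 0}
spectrum = pruned_fft([1, 0, 0, 0, 0, 0, 0, 0], counter)
```

    For the length-8 unit impulse, all 12 butterfly multiplications of the radix-2 flow graph are skipped, while a dense input performs all 12. A hardware design would decide what to prune ahead of time rather than testing operands at run time; the run-time check here only keeps the sketch short.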

    Design and Implementation of Software Defined Radios on a Homogeneous Multi-Processor Architecture

    In the wireless communications domain, multi-mode and multi-standard platforms are increasingly becoming the central focus of system architects. In fact, mobile terminal users require more and more mobility and throughput, pushing towards a fully integrated radio system able to support different communication protocols running concurrently on the platform. A new concept of radio system was introduced to meet the users' expectations. Flexible radio platforms have become an indispensable requirement to meet the expectations of the users today and in the future. This thesis deals with issues related to the design of flexible radio platforms. In particular, the flexibility of the radio system is achieved through the concept of software defined radios (SDRs). The research work focuses on the utilization of homogeneous multi-processor (MP) architectures as a feasible way to efficiently implement SDR platforms. In fact, platforms based on MP architectures are able to deliver high performance together with a high degree of flexibility. Moreover, homogeneous MP platforms are able to reduce design and verification costs as well as provide high scalability in terms of software and hardware. However, homogeneous MP architectures provide less computational efficiency than heterogeneous solutions. This thesis can be divided into two parts: the first part is related to the implementation of a reference platform, while the second part introduces the design and implementation of flexible, high-performance, power- and energy-efficient algorithms for wireless communications. The proposed reference platform, Ninesilica, is a homogeneous MP architecture composed of a 3x3 mesh of processing nodes (PNs), interconnected by a hierarchical Network-on-Chip (NoC). Each PN hosts a processor core as its Processing Element (PE). To improve the computational efficiency of the platform, different power and energy saving techniques have been investigated. 
In the design, implementation and mapping of the algorithms, the following constraints were considered: energy and power efficiency, high scalability of the platform, portability of the solutions across similar platforms, and parallelization efficiency. The Ninesilica architecture, together with the proposed algorithm implementations, showed that homogeneous MP architectures are highly scalable platforms, both in terms of hardware and software. Furthermore, the Ninesilica architecture demonstrated that homogeneous MPs are able to achieve high parallelization efficiency as well as high energy and power savings, meeting the requirements of SDRs as well as enabling cognitive radios. Ninesilica can be utilized as a stand-alone block or as an elementary building block to realize clustered many-core architectures. Moreover, the obtained results, in terms of parallelization efficiency as well as power and energy efficiency, are independent of the type of PE utilized, ensuring the portability of the results to similar architectures based on a different type of processing element.

    Low power FFT processor design considerations for OFDM communications

    Today's emerging communication technologies require fast processing as well as efficient use of resources. This project specifically addresses the power-efficient design of an FFT processor as it relates to OFDM communications such as cognitive radio. The Fast Fourier Transform (FFT) processor is what enables efficient modulation in OFDM. As the FFT processor is the most computationally intensive component in OFDM communication, improving the power efficiency of this component can have a great impact on the overall system. This impact is significant considering the number of mobile and remote communication devices that rely on limited battery-powered operation. This project explores current FFT processor algorithms and architectures as well as optimization techniques that aim to reduce the power consumption of these devices. Both a floating-point and a fixed-point dynamically size-configurable FFT processor were designed in VHDL for FPGA applications, and power-saving modifications were implemented and their results analyzed.

    Low-Complexity Multicarrier Waveform Processing Schemes for Future Wireless Communications

    Wireless communication systems deliver an enormous variety of services and applications. Nowadays, wireless communications play a key role in many fields, such as industry, social life, education, and home automation. The growing demand for wireless services and applications has motivated the development of the next generation cellular radio access technology called fifth-generation new radio (5G-NR). The future networks are required to increase the delivered user data rates to gigabits per second, reduce the communication latency below 1 ms, and enable communications for a massive number of simple devices. Those main features of the future networks come with new demands for the wireless communication systems, such as enhancing the efficiency of radio spectrum use in frequency bands below 6 GHz, while supporting various services with quite different requirements for the waveform-related key parameters. The current wireless systems lack the capabilities to handle those requirements. For example, long-term evolution (LTE) employs the cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) waveform, which has critical drawbacks in the 5G-NR context. The basic drawback of the CP-OFDM waveform is the lack of spectral localization. Therefore, spectrally enhanced variants of CP-OFDM or other multicarrier waveforms with well-localized spectrum should be considered. This thesis investigates spectrally enhanced CP-OFDM (E-OFDM) schemes to suppress the out-of-band (OOB) emissions, which are normally produced by CP-OFDM. Commonly, the weighted overlap-and-add (WOLA) scheme applies a smooth time-domain window to the CP-OFDM waveform, providing spectrally enhanced subcarriers and reducing the OOB emissions with very low additional computational complexity. Nevertheless, the suppression performance of WOLA-OFDM is not sufficient near the active subband. Another technique is based on filtering the CP-OFDM waveform, which is referred to as F-OFDM. 
F-OFDM is able to provide a well-localized spectrum, but with a significant increase in computational complexity in the basic scheme with time-domain filters. Filter-bank multicarrier (FBMC) waveforms are also included in this study. FBMC has been widely studied as a potential post-OFDM scheme with nearly ideal subcarrier spectrum localization. However, this scheme has quite high computational complexity while being limited to uniformly distributed subbands. Filter-bank based waveform processing is nevertheless one of the main topics of this work. Instead of traditional polyphase network (PPN) based uniform filter banks, the focus is on fast-convolution filter banks (FC-FBs), which utilize fast Fourier transform (FFT) domain processing to effectively realize filter banks with high flexibility in terms of subcarrier bandwidths and center frequencies. FC-FBs are applied for both FBMC and F-OFDM waveform generation and processing with greatly increased flexibility and significantly reduced computational complexity. This study proposes novel structures for FC-FB processing based on decomposition of the FC-FB structure consisting of forward and inverse discrete Fourier transforms (DFT and IDFT). The decomposition of multirate FC provides a means of reducing the computational complexity in some important specific scenarios. A generic FC decomposition model is proposed and analyzed. This scheme is mathematically equivalent to the corresponding direct FC implementation, with exactly the same performance. The benefits of the optimized decomposition structure appear mainly in communication scenarios with a relatively narrow active transmission band, resulting in significantly reduced computational complexity compared to the direct FC structure. The narrowband scenarios find their place in the recent 3GPP specification of the cellular low-power wide-area (LPWA) access technology called narrowband internet-of-things (NB-IoT). 
NB-IoT aims at introducing the IoT to LTE and GSM frequency bands in coexistence with those technologies. NB-IoT uses CP-OFDM based waveforms with parameters compatible with LTE. However, additional means are also needed for NB-IoT transmitters to improve the spectrum localization. For NB-IoT user devices, it is important to consider ultra-low complexity solutions, and a look-up table (LUT) based approach is proposed to implement NB-IoT uplink transmitters with filtered waveforms. This approach provides completely multiplication-free digital baseband implementations, and the addition rates are similar to or smaller than in the basic NB-IoT waveform generation without the elements needed for spectrum enhancement. The basic idea includes storing full or partial waveforms for all possible data symbol combinations. The transmitted waveform is then composed through summation of the needed stored partial waveforms and trivial phase rotations. The LUT based scheme is developed with different variants tackling practical implementation issues of NB-IoT device transmitters, considering also the effects of a nonlinear power amplifier. Moreover, a completely multiplication- and addition-free LUT variant is proposed and found to be feasible for very narrowband transmission, with up to 3 subcarriers. The finite-wordlength performance of the LUT variants is evaluated through simulations.
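    The look-up table idea described in the abstract above (stored partial waveforms combined by summation and trivial phase rotations) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: the symbol length `N`, the active bins in `SUBCARRIERS`, and the QPSK mapping are assumed values, and the cyclic prefix, the spectrum-enhancing filtering and finite-wordlength effects are all omitted. The names `LUT`, `rotate90`, `lut_waveform` and `direct_waveform` are hypothetical.

```python
import cmath
from itertools import product

N = 16                   # samples per symbol (assumed value)
SUBCARRIERS = [1, 2, 3]  # active subcarrier bins (assumed values)
QPSK = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]  # QPSK[q] == (1 + 1j) * 1j**q

# Offline step: one stored partial waveform per subcarrier, for symbol 1+1j.
LUT = [
    [(1 + 1j) * cmath.exp(2j * cmath.pi * k * n / N) / N for n in range(N)]
    for k in SUBCARRIERS
]


def rotate90(z, q):
    """Multiply z by 1j**q using only swaps and sign changes."""
    q %= 4
    if q == 0:
        return z
    if q == 1:
        return complex(-z.imag, z.real)
    if q == 2:
        return complex(-z.real, -z.imag)
    return complex(z.imag, -z.real)


def lut_waveform(symbol_indices):
    """Run-time step: additions and trivial rotations only, no multiplications."""
    out = [0j] * N
    for stored, q in zip(LUT, symbol_indices):
        out = [acc + rotate90(s, q) for acc, s in zip(out, stored)]
    return out


def direct_waveform(symbol_indices):
    """Reference: direct inverse-DFT style synthesis with multiplications."""
    return [
        sum(
            QPSK[q] * cmath.exp(2j * cmath.pi * k * n / N)
            for q, k in zip(symbol_indices, SUBCARRIERS)
        ) / N
        for n in range(N)
    ]


# Usage: check the LUT output against direct synthesis for every symbol combination.
max_error = max(
    abs(a - b)
    for combo in product(range(4), repeat=len(SUBCARRIERS))
    for a, b in zip(lut_waveform(combo), direct_waveform(combo))
)
```

    Storing one waveform per subcarrier and rotating it keeps the table small; storing a full waveform for every symbol combination would also remove the run-time additions, at the cost of a table that grows exponentially with the number of subcarriers.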

    Digital and Mixed Domain Hardware Reduction Algorithms and Implementations for Massive MIMO

    Emerging 5G and 6G based wireless communications systems largely rely on multiple-input-multiple-output (MIMO) systems to mitigate inherently high path losses and to provide high data rates and high spatial diversity. Massive MIMO systems used in mmWave and sub-THz applications consist of hundreds, perhaps thousands, of antenna elements at base stations. Digital beamforming techniques provide the highest flexibility and more degrees of freedom for phased antenna arrays than their analog and hybrid alternatives, but have the highest hardware complexity. Conventional digital beamformers at the receiver require a dedicated analog-to-digital converter (ADC) for every antenna element, leading to N ADCs for N elements. The number of ADCs is the key determining factor for the power consumption of an antenna array system. The digital hardware consists of fast Fourier transform (FFT) cores with a multiplier complexity of O(N log2 N) for an N-element system to generate multiple beams. The mixed-domain and digital hardware complexities of MIMO systems must be reduced to lower cost and power consumption while maintaining high performance. The well-known concept has been in use for ADCs to achieve reduced complexities. An extension of the architecture to the multi-dimensional domain is explored in this dissertation to implement a single-port ADC to replace N ADCs in an N-element system, using the correlation of received signals in the spatial domain. This concept has applications in conventional uniform linear arrays (ULAs) as well as in focal plane array (FPA) receivers. Our analysis has shown that sparsity in the spatio-temporal frequency domain can be exploited to reduce the number of ADCs from N to where . By using the limited field of view of practical antennas, multiple sub-arrays are combined without interference to achieve a K-fold increase in the information-carrying capacity of the ADC systems. 
Applications of this concept include ULAs and rectangular array systems. Experimental verifications were done for a element, 1.8-2.1 GHz wideband array system to sample using ADCs. This dissertation proposes that frequency division multiplexing (FDM) receiver outputs at an intermediate frequency (IF) can pack multiple (M) narrowband channels with a guard band to avoid interference. The combined output is then sampled using a single wideband ADC, and the baseband channels are retrieved in the digital domain. Measurement results were obtained by employing a element, 28 GHz antenna array system to combine channels together to achieve a 75% reduction of the ADC requirement. Implementation of FFT cores in the digital domain is not always exact because of finite precision. Therefore, this dissertation explores the possibility of approximating the discrete Fourier transform (DFT) matrix to achieve reduced hardware complexity at an allowable cost in accuracy. A point approximate DFT (ADFT) core was implemented on digital hardware using radix-32 to achieve savings in cost, size, weight and power (C-SWaP) and synthesized for ASIC at 45-nm technology.

    Spectrum Optimisation in Wireless Communication Systems: Technology Evaluation, System Design and Practical Implementation

    Two key technology enablers for next generation networks are examined in this thesis, namely Cognitive Radio (CR) and Spectrally Efficient Frequency Division Multiplexing (SEFDM). The first part proposes the use of traffic prediction in CR systems to improve the Quality of Service (QoS) for CR users. A framework is presented which allows CR users to capture a frequency slot in an idle licensed channel occupied by primary users. This is achieved by using CR to sense and select target spectrum bands, combined with traffic prediction to determine the optimum channel-sensing order. The latter part of this thesis considers the design, practical implementation and performance evaluation of SEFDM. The key challenge that arises in SEFDM is the self-created interference, which complicates the design of receiver architectures. Previous work has focused on the development of sophisticated detection algorithms; however, these suffer from impractical computational complexity. Consequently, the aim of this work is two-fold: first, to reduce the complexity of existing algorithms to make them better suited for application in the real world; second, to develop hardware prototypes to assess the feasibility of employing SEFDM in practical systems. The impact of oversampling and fixed-point effects on the performance of SEFDM is initially determined, followed by the design and implementation of linear detection techniques using Field Programmable Gate Arrays (FPGAs). The performance of these FPGA-based linear receivers is evaluated in terms of throughput, resource utilisation and Bit Error Rate (BER). Finally, variants of the Sphere Decoding (SD) algorithm are investigated to improve the error performance of SEFDM systems with a targeted reduction in complexity. The Fixed SD (FSD) algorithm is implemented on a Digital Signal Processor (DSP) to measure its computational complexity. 
Modified sorting and decomposition strategies are then applied to this FSD algorithm, offering trade-offs between execution speed and BER.