96 research outputs found

    Channel-Aware Energy Optimization of OFDM Receivers Using Dynamic Precision Scaling in FPGAs

    Get PDF
    International audienceTo reduce the energy consumption of Orthogonal Frequency-Division Multiplexing (OFDM) systems, a new variable word-length method is presented in this paper. A simulation based approach is used: the optimized fixed-point implementation of an OFDM receiver is found for different simulated channel conditions, depending on the Signal-to-Noise Ratio (SNR) and the channel type. During the execution, the receiver estimates the channel conditions and chooses the optimum word-length to decode the received information. A realistic energy consumption of the receiver is estimated with a library that contains the energy consumption of Field-Programmable Gate Array (FPGA) basic operators depending on the bit-width, obtained from experimental data. Up to 57% of the dynamic energy can be saved using this method

    Energy-Aware Computing via Adaptive Precision under Performance Constraints in OFDM Wireless Receivers

    Get PDF
    International audienceTo cope with rapid variations of channel parameters , wireless receivers are designed with a significant performance margin to reach a given Bit Error Rate (BER), even for the worst-case channel conditions. Indeed, one of the steps during the design phase is the choice of the architecture bit-width, and the smallest wordlength that ensures the correct behaviour of the receiver is usually chosen. In this paper, an adaptive precision OFDM receiver is proposed. Significant energy savings come from varying at run time processing bit-width, based on estimation of channel conditions, without compromising the BER constraints. To validate the energy savings, the energy consumption of basic operators has been obtained from real measurements for different bit-widths on a FPGA and a processor using soft SIMD. Results show that up to 63% of the dynamic energy consumption can be saved using this adaptive technique

    FPGA based technical solutions for high throughput data processing and encryption for 5G communication: A review

    Get PDF
    The field programmable gate array (FPGA) devices are ideal solutions for high-speed processing applications, given their flexibility, parallel processing capability, and power efficiency. In this review paper, at first, an overview of the key applications of FPGA-based platforms in 5G networks/systems is presented, exploiting the improved performances offered by such devices. FPGA-based implementations of cloud radio access network (C-RAN) accelerators, network function virtualization (NFV)-based network slicers, cognitive radio systems, and multiple input multiple output (MIMO) channel characterizers are the main considered applications that can benefit from the high processing rate, power efficiency and flexibility of FPGAs. Furthermore, the implementations of encryption/decryption algorithms by employing the Xilinx Zynq Ultrascale+MPSoC ZCU102 FPGA platform are discussed, and then we introduce our high-speed and lightweight implementation of the well-known AES-128 algorithm, developed on the same FPGA platform, and comparing it with similar solutions already published in the literature. The comparison results indicate that our AES-128 implementation enables efficient hardware usage for a given data-rate (up to 28.16 Gbit/s), resulting in higher efficiency (8.64 Mbps/slice) than other considered solutions. Finally, the applications of the ZCU102 platform for high-speed processing are explored, such as image and signal processing, visual recognition, and hardware resource management

    Adaptive Baseband Pro cessing and Configurable Hardware for Wireless Communication

    Get PDF
    The world of information is literally at one’s fingertips, allowing access to previously unimaginable amounts of data, thanks to advances in wireless communication. The growing demand for high speed data has necessitated theuse of wider bandwidths, and wireless technologies such as Multiple-InputMultiple-Output (MIMO) have been adopted to increase spectral efficiency.These advanced communication technologies require sophisticated signal processing, often leading to higher power consumption and reduced battery life.Therefore, increasing energy efficiency of baseband hardware for MIMO signal processing has become extremely vital. High Quality of Service (QoS)requirements invariably lead to a larger number of computations and a higherpower dissipation. However, recognizing the dynamic nature of the wirelesscommunication medium in which only some channel scenarios require complexsignal processing, and that not all situations call for high data rates, allowsthe use of an adaptive channel aware signal processing strategy to provide adesired QoS. Information such as interference conditions, coherence bandwidthand Signal to Noise Ratio (SNR) can be used to reduce algorithmic computations in favorable channels. Hardware circuits which run these algorithmsneed flexibility and easy reconfigurability to switch between multiple designsfor different parameters. These parameters can be used to tune the operations of different components in a receiver based on feedback from the digitalbaseband. This dissertation focuses on the optimization of digital basebandcircuitry of receivers which use feedback to trade power and performance. Aco-optimization approach, where designs are optimized starting from the algorithmic stage through the hardware architectural stage to the final circuitimplementation is adopted to realize energy efficient digital baseband hardwarefor mobile 4G devices. These concepts are also extended to the next generation5G systems where the energy efficiency of the base station is improved.This work includes six papers that examine digital circuits in MIMO wireless receivers. Several key blocks in these receiver include analog circuits thathave residual non-linearities, leading to signal intermodulation and distortion.Paper-I introduces a digital technique to detect such non-linearities and calibrate analog circuits to improve signal quality. The concept of a digital nonlinearity tuning system developed in Paper-I is implemented and demonstratedin hardware. The performance of this implementation is tested with an analogchannel select filter, and results are presented in Paper-II. MIMO systems suchas the ones used in 4G, may employ QR Decomposition (QRD) processors tosimplify the implementation of tree search based signal detectors. However,the small form factor of the mobile device increases spatial correlation, whichis detrimental to signal multiplexing. Consequently, a QRD processor capableof handling high spatial correlation is presented in Paper-III. The algorithm and hardware implementation are optimized for carrier aggregation, which increases requirements on signal processing throughput, leading to higher powerdissipation. Paper-IV presents a method to perform channel-aware processingwith a simple interpolation strategy to adaptively reduce QRD computationcount. Channel properties such as coherence bandwidth and SNR are used toreduce multiplications by 40% to 80%. These concepts are extended to usetime domain correlation properties, and a full QRD processor for 4G systemsfabricated in 28 nm FD-SOI technology is presented in Paper-V. The designis implemented with a configurable architecture and measurements show thatcircuit tuning results in a highly energy efficient processor, requiring 0.2 nJ to1.3 nJ for each QRD. Finally, these adaptive channel-aware signal processingconcepts are examined in the scope of the next generation of communicationsystems. Massive MIMO systems increase spectral efficiency by using a largenumber of antennas at the base station. Consequently, the signal processingat the base station has a high computational count. Paper-VI presents a configurable detection scheme which reduces this complexity by using techniquessuch as selective user detection and interpolation based signal processing. Hardware is optimized for resource sharing, resulting in a highly reconfigurable andenergy efficient uplink signal detector

    Multipass communication systems for tiled processor architectures

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.Includes bibliographical references (p. 191-202).Multipass communication systems utilize multiple sets of parallel baseband receiver functions to balance communication data rates and available computation capabilities. This is achieved by spatially pipelining baseband functions across parallel resources to perform multiple processing passes on the same set of received values, thus allowing the system to simultaneously convey multiple sequences of data using a single wireless link. The use of multiple passes mitigates the effects of data rate on receiver processing bottlenecks, making the use of general-purpose processing elements for high data rate communication functions viable. The flexibility of general-purpose processing, in turn, allows the receiver composition to trade-off resource usage and required processing rate. For instance, a communication system could be distributed across 2 passes using 2x the overall area, but reducing the data rate for each pass and the resultant overall required processing rate, and hence clock speed, by 1/2. Lowering the clock speed can also be leveraged to reduce power through voltage scaling and/or the use of higher Vt devices. The characteristics of general-purpose parallel processors for communications processing are explored, as well as the applicability of specific parallel designs to communications processing.(Cont.) In particular, an in depth look is taken of the Raw processor's tiled architecture as a general-purpose parallel processor particularly well suited to portable communications processing. An example of a multipass system, based on the 802.11a baseband, implemented on the Raw processor along with the accompanying hardware implementation is presented as both a proof-of-concept, as well as a means to explore some of the advantages and trade-offs of such a system. A bit-error rate study is presented which shows this multipass system to be within a small fraction of dB of the performance of an equivalent data rate single pass system, thus demonstrating the viability of the multipass algorithm. In addition, the capability of tiled processors to maximize processing capabilities at the system block level, as well as the system architectural level, is shown. Parallel implementations of two processing intensive functions: the FFT and the Viterbi decoder are shown. A parallelized assembly language FFT utilizing 16 tiles is shown to have a 1,000x improvement , and a parallelized 48-tile assembly language Viterbi decoder is shown to have a 10, 000x improvement over corresponding serial C implementations.by Nathan Robert Shnidman.Ph.D

    Design of large polyphase filters in the Quadratic Residue Number System

    Full text link

    Algorithm-Architecture Co-Design for Digital Front-Ends in Mobile Receivers

    Get PDF
    The methodology behind this work has been to use the concept of algorithm-hardware co-design to achieve efficient solutions related to the digital front-end in mobile receivers. It has been shown that, by looking at algorithms and hardware architectures together, more efficient solutions can be found; i.e., efficient with respect to some design measure. In this thesis the main focus have been placed on two such parameters; first reduced complexity algorithms to lower energy consumptions at limited performance degradation, secondly to handle the increasing number of wireless standards that preferably should run on the same hardware platform. To be able to perform this task it is crucial to understand both sides of the table, i.e., both algorithms and concepts for wireless communication as well as the implications arising on the hardware architecture. It is easier to handle the high complexity by separating those disciplines in a way of layered abstraction. However, this representation is imperfect, since many interconnected "details" belonging to different layers are lost in the attempt of handling the complexity. This results in poor implementations and the design of mobile terminals is no exception. Wireless communication standards are often designed based on mathematical algorithms with theoretical boundaries, with few considerations to actual implementation constraints such as, energy consumption, silicon area, etc. This thesis does not try to remove the layer abstraction model, given its undeniable advantages, but rather uses those cross-layer "details" that went missing during the abstraction. This is done in three manners: In the first part, the cross-layer optimization is carried out from the algorithm perspective. Important circuit design parameters, such as quantization are taken into consideration when designing the algorithm for OFDM symbol timing, CFO, and SNR estimation with a single bit, namely, the Sign-Bit. Proof-of-concept circuits were fabricated and showed high potential for low-end receivers. In the second part, the cross-layer optimization is accomplished from the opposite side, i.e., the hardware-architectural side. A SDR architecture is known for its flexibility and scalability over many applications. In this work a filtering application is mapped into software instructions in the SDR architecture in order to make filtering-specific modules redundant, and thus, save silicon area. In the third and last part, the optimization is done from an intermediate point within the algorithm-architecture spectrum. Here, a heterogeneous architecture with a combination of highly efficient and highly flexible modules is used to accomplish initial synchronization in at least two concurrent OFDM standards. A demonstrator was build capable of performing synchronization in any two standards, including LTE, WiFi, and DVB-H

    Reconfigurable Antenna Systems: Platform implementation and low-power matters

    Get PDF
    Antennas are a necessary and often critical component of all wireless systems, of which they share the ever-increasing complexity and the challenges of present and emerging trends. 5G, massive low-orbit satellite architectures (e.g. OneWeb), industry 4.0, Internet of Things (IoT), satcom on-the-move, Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles, all call for highly flexible systems, and antenna reconfigurability is an enabling part of these advances. The terminal segment is particularly crucial in this sense, encompassing both very compact antennas or low-profile antennas, all with various adaptability/reconfigurability requirements. This thesis work has dealt with hardware implementation issues of Radio Frequency (RF) antenna reconfigurability, and in particular with low-power General Purpose Platforms (GPP); the work has encompassed Software Defined Radio (SDR) implementation, as well as embedded low-power platforms (in particular on STM32 Nucleo family of micro-controller). The hardware-software platform work has been complemented with design and fabrication of reconfigurable antennas in standard technology, and the resulting systems tested. The selected antenna technology was antenna array with continuously steerable beam, controlled by voltage-driven phase shifting circuits. Applications included notably Wireless Sensor Network (WSN) deployed in the Italian scientific mission in Antarctica, in a traffic-monitoring case study (EU H2020 project), and into an innovative Global Navigation Satellite Systems (GNSS) antenna concept (patent application submitted). The SDR implementation focused on a low-cost and low-power Software-defined radio open-source platform with IEEE 802.11 a/g/p wireless communication capability. In a second embodiment, the flexibility of the SDR paradigm has been traded off to avoid the power consumption associated to the relevant operating system. Application field of reconfigurable antenna is, however, not limited to a better management of the energy consumption. The analysis has also been extended to satellites positioning application. A novel beamforming method has presented demonstrating improvements in the quality of signals received from satellites. Regarding those who deal with positioning algorithms, this advancement help improving precision on the estimated position

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    Get PDF
    ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability

    Power and Energy Aware Heterogeneous Computing Platform

    Get PDF
    During the last decade, wireless technologies have experienced significant development, most notably in the form of mobile cellular radio evolution from GSM to UMTS/HSPA and thereon to Long-Term Evolution (LTE) for increasing the capacity and speed of wireless data networks. Considering the real-time constraints of the new wireless standards and their demands for parallel processing, reconfigurable architectures and in particular, multicore platforms are part of the most successful platforms due to providing high computational parallelism and throughput. In addition to that, by moving toward Internet-of-Things (IoT), the number of wireless sensors and IP-based high throughput network routers is growing at a rapid pace. Despite all the progression in IoT, due to power and energy consumption, a single chip platform for providing multiple communication standards and a large processing bandwidth is still missing.The strong demand for performing different sets of operations by the embedded systems and increasing the computational performance has led to the use of heterogeneous multicore architectures with the help of accelerators for computationally-intensive data-parallel tasks acting as coprocessors. Currently, highly heterogeneous systems are the most power-area efficient solution for performing complex signal processing systems. Additionally, the importance of IoT has increased significantly the need for heterogeneous and reconfigurable platforms.On the other hand, subsequent to the breakdown of the Dennardian scaling and due to the enormous heat dissipation, the performance of a single chip was obstructed by the utilization wall since all cores cannot be clocked at their maximum operating frequency. Therefore, a thermal melt-down might be happened as a result of high instantaneous power dissipation. In this context, a large fraction of the chip, which is switched-off (Dark) or operated at a very low frequency (Dim) is called Dark Silicon. The Dark Silicon issue is a constraint for the performance of computers, especially when the up-coming IoT scenario will demand a very high performance level with high energy efficiency. Among the suggested solution to combat the problem of Dark-Silicon, the use of application-specific accelerators and in particular Coarse-Grained Reconfigurable Arrays (CGRAs) are the main motivation of this thesis work.This thesis deals with design and implementation of Software Defined Radio (SDR) as well as High Efficiency Video Coding (HEVC) application-specific accelerators for computationally intensive kernels and data-parallel tasks. One of the most important data transmission schemes in SDR due to its ability of providing high data rates is Orthogonal Frequency Division Multiplexing (OFDM). This research work focuses on the evaluation of Heterogeneous Accelerator-Rich Platform (HARP) by implementing OFDM receiver blocks as designs for proof-of-concept. The HARP template allows the designer to instantiate a heterogeneous reconfigurable platform with a very large amount of custom-tailored computational resources while delivering a high performance in terms of many high-level metrics. The availability of this platform lays an excellent foundation to investigate techniques and methods to replace the Dark or Dim part of chip with high-performance silicon dissipating very low power and energy. Furthermore, this research work is also addressing the power and energy issues of the embedded computing systems by tailoring the HARP for self-aware and energy-aware computing models. In this context, the instantaneous power dissipation and therefore the heat dissipation of HARP are mitigated on FPGA/ASIC by using Dynamic Voltage and Frequency Scaling (DVFS) to minimize the dark/dim part of the chip. Upgraded HARP for self-aware and energy-aware computing can be utilized as an energy-efficient general-purpose transceiver platform that is cognitive to many radio standards and can provide high throughput while consuming as little energy as possible. The evaluation of HARP has shown promising results, which makes it a suitable platform for avoiding Dark Silicon in embedded computing platforms and also for diverse needs of IoT communications.In this thesis, the author designed the blocks of OFDM receiver by crafting templatebased CGRA devices and then attached them to HARP’s Network-on-Chip (NoC) nodes. The performance of application-specific accelerators generated from templatebased CGRAs, the performance of the entire platform subsequent to integrating the CGRA nodes on HARP and the NoC traffic are recorded in terms of several highlevel performance metrics. In evaluating HARP on FPGA prototype, it delivers a performance of 0.012 GOPS/mW. Because of the scalability and regularity in HARP, the author considered its value as architectural constant. In addition to showing the gain and the benefits of maximizing the number of reconfigurable processing resources on a platform in comparison to the scaled performance of several state-of-the-art platforms, HARP’s architectural constant ensures application-independent figure of merit. HARP is further evaluated by implementing various sizes of Discrete Cosine transform (DCT) and Discrete Sine Transform (DST) dedicated for HEVC standard, which showed its ability to sustain Full HD 1080p format at 30 fps on FPGA. The author also integrated self-aware computing model in HARP to mitigate the power dissipation of an OFDM receiver. In the case of FPGA implementation, the total power dissipation of the platform showed 16.8% reduction due to employing the Feedback Control System (FCS) technique with Dynamic Frequency Scaling (DFS). Furthermore, by moving to ASIC technology and scaling both frequency and voltage simultaneously, significant dynamic power reduction (up to 82.98%) was achieved, which proved the DFS/DVFS techniques as one step forward to mitigate the Dark Silicon issue
    corecore