Continuous-Time and Companding Digital Signal Processors Using Adaptivity and Asynchronous Techniques
The fully synchronous approach has been the norm for digital signal processors (DSPs) for many decades. Due to its simplicity, the classical DSP structure has been used in many applications. However, due to its rigid discrete-time operation, a classical DSP has limited efficiency or inadequate resolution for some emerging applications, such as processing of multimedia and biological signals. This thesis proposes fundamentally new approaches to designing DSPs, which are different from the classical scheme. The defining characteristic of all new DSPs examined in this thesis is the notion of "adaptivity" or "adaptability." Adaptive DSPs dynamically change their behavior to adjust to some property of their input stream, for example the rate of change of the input. This thesis presents both enhancements to existing adaptive DSPs and new adaptive DSPs. The main class of DSPs examined throughout the thesis is continuous-time (CT) DSPs. CT DSPs are clock-less and event-driven; they naturally adapt their activity and power consumption to the rate of their inputs. The absence of a clock also avoids aliasing in the frequency domain entirely, hence improving signal fidelity. The core of this thesis deals with the complete and systematic design of a truly general-purpose CT DSP. A scalable design methodology for CT DSPs is presented. This leads to the main contribution of this thesis, namely a new CT DSP chip. This chip is the first general-purpose CT DSP chip, able to process many different classes of CT and synchronous signals. The chip can handle various types of signals, i.e., different digital modulations, both synchronous and asynchronous, without requiring any reconfiguration; this property is presented for the first time in CT DSPs and is impossible for classical DSPs.
As opposed to previous CT DSPs, which were limited to using only one type of digital format, and whose design was hard to scale for different bandwidths and bit-widths, this chip has a formal, robust and scalable design, due to the systematic usage of asynchronous design techniques. The second contribution of this thesis is a complete methodology to design adaptive delay lines. In particular, it is shown how to make the granularity, i.e., the number of stages, adaptive in a real-time delay line. Adaptive granularity brings about a significant improvement in the line's power consumption, up to 70% as reported by simulations on two design examples. This enhancement can have a direct, large power impact on any CT DSP, since a delay line consumes the majority of a CT DSP's power. The robust methodology presented in this thesis allows safe dynamic reconfiguration of the line's granularity, on-the-fly and according to the input traffic. As a final contribution, the thesis also examines two additional DSPs: one operating in the CT domain and one using the companding technique. The former operates only on level-crossing samples; the proposed methodology shows a potential for high-quality outputs by using a complex interpolation function. Finally, a companding DSP is presented for MPEG audio. Companding DSPs adapt their dynamic range to the amplitude of their input; the resulting DSP can offer high-quality outputs even for small inputs. By applying companding to MPEG DSPs, it is shown how the DSP distortion can be made almost inaudible, without requiring complex arithmetic hardware.
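The level-crossing sampling mentioned in this abstract can be illustrated with a short sketch. This is not the thesis's design: the function name `level_crossing_samples`, the uniform level spacing `step`, and the linear interpolation between input points are all assumptions made for illustration. The sketch emits an event only when the input crosses a quantization level, so a slowly varying input produces few events and a constant input produces none.

```python
import math


def level_crossing_samples(signal, times, step):
    """Return (time, level) events where the input crosses a multiple of `step`.

    Assumption: the input is piecewise linear between the given points, so
    each crossing time is found by linear interpolation.
    """
    events = []
    for i in range(1, len(signal)):
        prev, cur = signal[i - 1], signal[i]
        if prev == cur:
            continue  # no change in this segment, hence no events
        lo, hi = min(prev, cur), max(prev, cur)
        # Levels in the half-open interval (lo, hi].
        k_first = math.floor(lo / step) + 1
        k_last = math.floor(hi / step)
        levels = [k * step for k in range(k_first, k_last + 1)]
        if cur < prev:
            levels.reverse()  # a falling input crosses higher levels first
        for level in levels:
            frac = (level - prev) / (cur - prev)
            t = times[i - 1] + frac * (times[i] - times[i - 1])
            events.append((t, level))
    return events


ramp_events = level_crossing_samples([0.0, 1.0], [0.0, 1.0], 0.25)
flat_events = level_crossing_samples([0.3, 0.3, 0.3], [0.0, 1.0, 2.0], 0.25)
```

Here the ramp yields four events while the constant input yields none, which is the sense in which activity follows the rate of change of the input.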
Baseband analog front-end and digital back-end for reconfigurable multi-standard terminals
Multimedia applications are driving wireless network operators to add high-speed data services such as Edge (E-GPRS), WCDMA (UMTS) and WLAN (IEEE 802.11a,b,g) to the existing GSM network. This creates the need for multi-mode cellular handsets that support a wide range of communication standards, each with a different RF frequency, signal bandwidth, modulation scheme etc. This in turn generates several design challenges for the analog and digital building blocks of the physical layer. In addition to the above-mentioned protocols, mobile devices often include Bluetooth, GPS, FM-radio and TV services that can work concurrently with data and voice communication. Multi-mode, multi-band, and multi-standard mobile terminals must satisfy all these different requirements. Sharing and/or switching transceiver building blocks in these handsets is mandatory in order to extend battery life and/or reduce cost. Only adaptive circuits that are able to reconfigure themselves within the handover time can meet the design requirements of a single receiver or transmitter covering all the different standards while ensuring seamless interoperability. This paper presents analog and digital base-band circuits that are able to support GSM (with Edge), WCDMA (UMTS), WLAN and Bluetooth using reconfigurable building blocks. The blocks can trade off power consumption for performance on the fly, depending on the standard to be supported and the required QoS (Quality of Service) level.
Channelization for Multi-Standard Software-Defined Radio Base Stations
As the number of radio standards increases and spectrum resources come under more pressure, it becomes ever less efficient to reserve bands of spectrum for exclusive use by a single radio standard. Therefore, this work focuses on channelization structures compatible with spectrum sharing among multiple wireless standards and dynamic spectrum allocation in particular. A channelizer extracts independent communication channels from a wideband signal, and is one of the most computationally expensive components in a communications receiver. This work specifically focuses on non-uniform channelizers suitable for multi-standard Software-Defined Radio (SDR) base stations in general and public mobile radio base stations in particular.
A comprehensive evaluation of non-uniform channelizers (existing and developed during the course of this work) shows that parallel and recombined variants of the Generalised Discrete Fourier Transform Modulated Filter Bank (GDFT-FB) represent the best trade-off between computational load and flexibility for dynamic spectrum allocation. Nevertheless, for base station applications (with many channels) very high filter orders may be required, making the channelizers difficult to physically implement.
To mitigate this problem, multi-stage filtering techniques are applied to the GDFT-FB. It is shown that these multi-stage designs can significantly reduce the filter orders and number of operations required by the GDFT-FB. An alternative approach, applying frequency response masking techniques to the GDFT-FB prototype filter design, leads to even bigger reductions in the number of coefficients, but computational load is only reduced for oversampled configurations and then not as much as for the multi-stage designs. Both techniques render the implementation of GDFT-FB based non-uniform channelizers more practical.
Finally, channelization solutions for some real-world spectrum sharing use cases are developed before some final physical implementation issues are considered.
In Car Audio
This chapter presents implementations of advanced in-car audio applications. The system is composed of three main applications concerning the in-car listening and communication experience. Starting from a high-level description of the algorithms, several implementations at different levels of hardware abstraction are presented, along with empirical results on both the design process followed and the performance achieved.
X-Rel: Energy-Efficient and Low-Overhead Approximate Reliability Framework for Error-Tolerant Applications Deployed in Critical Systems
Triple Modular Redundancy (TMR) is one of the most common techniques in
fault-tolerant systems, in which the output is determined by a majority voter.
However, the design diversity of replicated modules and/or soft errors that are
more likely to happen in the nanoscale era may affect the majority voting
scheme. Besides, the significant overheads of the TMR scheme may limit its
usage in energy consumption and area-constrained critical systems. However, for
most inherently error-resilient applications such as image processing and
vision deployed in critical systems (like autonomous vehicles and robotics),
achieving a given level of reliability has more priority than precise results.
Therefore, these applications can benefit from the approximate computing
paradigm to achieve higher energy efficiency and a lower area. This paper
proposes an energy-efficient approximate reliability (X-Rel) framework to
overcome the aforementioned challenges of the TMR systems and get the full
potential of approximate computing without sacrificing the desired reliability
constraint and output quality. The X-Rel framework relies on relaxing the
precision of the voter based on a systematic error bounding method that
leverages user-defined quality and reliability constraints. Afterward, the size
of the achieved voter is used to approximate the TMR modules such that the
overall area and energy consumption are minimized. The effectiveness of
employing the proposed X-Rel technique in a TMR structure, for different
quality constraints as well as various reliability bounds, is evaluated in
a 15-nm FinFET technology. The results of the X-Rel voter show delay, area, and
energy consumption reductions of up to 86%, 87%, and 98%, respectively, when
compared to those of the state-of-the-art approximate TMR voters.
Comment: This paper has been published in IEEE Transactions on Very Large
Scale Integration (VLSI) Systems
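For readers unfamiliar with the majority voter this abstract refers to, a minimal sketch of an exact bitwise TMR voter follows. It shows only the conventional voter, not the relaxed-precision X-Rel voter, whose construction the abstract does not detail. The function name `majority_vote` and the example values are hypothetical.

```python
def majority_vote(a, b, c):
    """Exact bitwise majority of three module outputs (non-negative integers).

    Each output bit is 1 when at least two of the three inputs have that bit
    set, so a fault confined to one module is masked.
    """
    return (a & b) | (a & c) | (b & c)


correct = 0b1010
faulty = 0b0110  # one module disagrees in two bit positions
voted = majority_vote(correct, correct, faulty)
```

The voter masks any error confined to a single module, but it cannot recover when two modules fail in the same bit position.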
iURBAN
iURBAN: Intelligent Urban Energy Tool introduces an urban energy tool integrating different ICT energy management systems (both hardware and software) in two European cities, providing useful data to a novel decision support system that makes available the necessary parameters for the generation and further operation of associated business models. The business models contribute at a global level to efficiently managing and distributing the energy produced and consumed at a local level (city or neighbourhood), incorporating behavioural aspects of the users, and of prosumers in general, into the software platform. iURBAN integrates a smart Decision Support System (smartDSS) that collects real-time or near real-time data, then aggregates, analyses and suggests actions on energy consumption and production from different buildings, renewable energy production resources, combined heat and power plants, electric vehicle (EV) charge stations, storage systems, sensors and actuators. The consumption and production data are collected via heterogeneous data communication protocols and networks. Through a Local Decision Support System, the iURBAN smartDSS allows citizens to analyse the consumption and production that they are generating, and to receive information about CO2 savings, advice on demand response, and the possibility to participate actively in the energy market. Through a Centralised Decision Support System, it allows utilities, ESCOs, municipalities or other authorised third parties to: get a continuous snapshot of city energy consumption and production; manage energy consumption and production; forecast energy consumption; plan new energy "producers" for the future needs of the city; and visualise, analyse and take decisions on all the end points that are consuming or producing energy at city level, permitting them to forecast and plan the renewable power generation available in the city.
A methodology for the design of dynamic accuracy operators by runtime back bias
Mobile and IoT applications must balance increasing processing demands with limited power and cost budgets. Approximate computing achieves this goal by leveraging the error tolerance common in many emerging applications to reduce power consumption. In particular, adequate (i.e., energy/quality-configurable) hardware operators are key components in an error-tolerant system. Existing implementations of these operators require significant architectural modifications; hence they are often design-specific and tend to have large overheads compared to accurate units. In this paper, we propose a methodology to design adequate data-path operators in an automatic way, which uses threshold voltage scaling as a knob to dynamically control the power/accuracy tradeoff. The method overcomes the limitations of previous solutions based on supply voltage scaling, in that it introduces lower overheads and allows fine-grain regulation of this tradeoff. We demonstrate our approach on a state-of-the-art 28nm FDSOI technology, exploiting the strong effect of back biasing on threshold voltage. Results show a power consumption reduction of as much as 39% compared to solutions based only on supply voltage scaling, at iso-accuracy.
A Dynamically Reconfigurable Parallel Processing Framework with Application to High-Performance Video Processing
Digital video processing demands have grown, and will continue to grow, at unprecedented rates. Growth comes from an ever-increasing volume of data, demand for higher resolution and higher frame rates, and the need for high-capacity communications. Moreover, economic realities force continued reductions in size, weight and power requirements. The ever-changing needs and complexities associated with effective video processing systems lead to the consideration of dynamically reconfigurable systems. The goal of this dissertation research was to develop and demonstrate the viability of an integrated parallel processing system that effectively and efficiently applies pre-optimized hardware cores for processing video streamed data. Digital video is decomposed into packets which are then distributed over a group of parallel video processing cores. Real-time processing requires an effective task scheduler that distributes video packets efficiently to any of the reconfigurable distributed processing nodes across the framework, with the nodes running on FPGA reconfigurable logic in an inherently 'Virtual' mode. The developed framework, coupled with the use of hardware techniques for dynamic processing optimization, achieves an optimal cost/power/performance realization for video processing applications. The system is evaluated by testing processor utilization relative to I/O bandwidth and algorithm latency using a separable 2-D FIR filtering system and a dynamic pixel processor. For these applications, the system can achieve performance of hundreds of 640x480 video frames per second across an eight-lane Gen I PCIe bus. Overall, optimal performance is achieved in the sense that video data is processed at the maximum possible rate that can be streamed through the processing cores.
This performance, coupled with the inherent ability to dynamically add new algorithms to the described dynamically reconfigurable distributed processing framework, creates new opportunities for realizable and economic hardware virtualization.
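The separable 2-D FIR filtering used as a benchmark in this abstract can be sketched in software as follows. This is only an illustration of the general technique, not the dissertation's hardware cores. The function names, the causal zero-padded boundary handling, and the example taps are assumptions. A separable kernel is the outer product of a column filter and a row filter, so the image can be filtered one row at a time and then one column at a time.

```python
def fir_1d(samples, taps):
    """Causal 1-D FIR filter with zero initial state; output length equals input length."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out


def separable_fir_2d(image, row_taps, col_taps):
    """Filter every row with row_taps, then every column with col_taps."""
    rows_done = [fir_1d(row, row_taps) for row in image]
    cols_done = [fir_1d(list(col), col_taps) for col in zip(*rows_done)]
    return [list(row) for row in zip(*cols_done)]  # transpose back to rows


def direct_fir_2d(image, row_taps, col_taps):
    """Reference: direct 2-D filtering with the outer-product kernel."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for j, ct in enumerate(col_taps):
                for i, rt in enumerate(row_taps):
                    if y - j >= 0 and x - i >= 0:
                        acc += ct * rt * image[y - j][x - i]
            out[y][x] = acc
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
separable_result = separable_fir_2d(image, [1, 2], [1, 1])
direct_result = direct_fir_2d(image, [1, 2], [1, 1])
```

With M row taps and N column taps, the separable form needs roughly M + N multiplications per pixel instead of M × N, which is one reason separable filters suit streaming hardware.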
The use of a reconfigurable functional cache in a digital signal processor: power and performance
Due to the computationally intensive nature of the tasks that digital signal processors (DSPs) are required to perform, it is desirable to decrease the time required to execute these tasks. Minimizing the execution time required for the various algorithms that are commonly and frequently executed (e.g., FIR filters) will improve the overall performance. It is known that hardware is able to execute algorithms faster than software; however, due to the size limitations of embedded DSPs, not all of the necessary algorithms can be implemented in hardware. A reconfigurable cache architecture in combination with a DSP is proposed as an alternative to increase algorithm performance by using reconfigurable hardware rather than dedicated hardware. Another important issue to consider for embedded processors is the power consumption of the DSP. Because most embedded processors operate on battery power, energy efficiency is a necessity. This study looks at the power requirements of a DSP with reconfigurable cache to determine the viability of such an architecture in an embedded system. Others have shown that reconfigurable cache in conjunction with a general-purpose processor improves performance for some DSP benchmarks. This study shows that a DSP/reconfigurable cache combination can achieve kernel performance gains ranging from 10 to 350 times that of a DSP architecture operating alone and can achieve overall benchmark speedups ranging from 1.02 to 1.91 times that of the existing DSP architecture. Further, relative power consumption results show that the power consumption of the reconfigurable architecture is approximately 85 to 95% of the current architecture (5-15% power savings) and attains energy savings ranging from approximately 14 to 50%.