
422 research outputs found

Low power JPEG2000 5/3 discrete wavelet transform algorithm and architecture

Author: Tan Kay-Chuan Benny
Publication venue: The University of Edinburgh
Publication date: 01/01/2004

Energy autonomous systems : future trends in devices, technology, and systems

Author: Belleville M.; Cantatore E.; Fanet H.; Fiorini P.; Hahn R.; Nicole P.; Pelgrom M.; Piguet C.; Tartagni M.; Van Hoof C.; Vullers R.J.M.
Publication venue: CATRENE
Publication date: 01/01/2009

The rapid evolution of electronic devices since the beginning of the nanoelectronics era has brought about exceptional computational power in an ever shrinking system footprint. This has enabled among others the wealth of nomadic battery powered wireless systems (smart phones, mp3 players, GPS, …) that society currently enjoys. Emerging integration technologies enabling even smaller volumes and the associated increased functional density may bring about a new revolution in systems targeting wearable healthcare, wellness, lifestyle and industrial monitoring applications

Repository TU/e

Pure OAI Repository

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

In recent years, the field of Deep Learning has seen many disruptive and impactful advancements. Given the increasing complexity of deep neural networks, the need for efficient hardware accelerators has become more and more pressing to design heterogeneous HPC platforms. The design of Deep Learning accelerators requires a multidisciplinary approach, combining expertise from several areas, spanning from computer architecture to approximate computing, computational models, and machine learning algorithms. Several methodologies and tools have been proposed to design accelerators for Deep Learning, including hardware-software co-design approaches, high-level synthesis methods, specific customized compilers, and methodologies for design space exploration, modeling, and simulation. These methodologies aim to maximize the exploitable parallelism and minimize data movement to achieve high performance and energy efficiency. This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years to implement Deep Learning accelerators, offering the reader a wide perspective in this rapidly evolving field. In particular, this work complements the previous survey proposed by the same authors in [203], which focuses on Deep Learning hardware accelerators for heterogeneous HPC platforms

arXiv.org e-Print Archive

PowerBit - Power aware arithmetic bit-width optimization

Author: Clarke JA; Constantinides GA; Gaffar AA
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2006

Published version

Crossref

Spiral - Imperial College Digital Repository

NONLINEAR OPERATORS FOR IMAGE PROCESSING: DESIGN, IMPLEMENTATION AND MODELING TECHNIQUES FOR POWER ESTIMATION

Author: BERNACCHIA GIUSEPPE
Publication venue: Università degli studi di Trieste
Publication date: 01/01/2000

1998/1999. In the past few years, multimedia applications have grown very quickly and have been applied in a wide variety of fields. Applications such as video conferencing, medical diagnostics, mobile telephony, and military systems must handle large amounts of data at high rates. Image and voice data processing are therefore very important, and considerable effort has gone into finding ever faster and more effective algorithms. Among image processing algorithms, we believe that rational operators play an important role, owing to their versatility and effectiveness in data processing. In recent years several algorithms have been proposed, demonstrating that these operators can be well suited to different applications and give good results. The aim of this work is to implement some of these algorithms and thereby demonstrate that rational filters, in particular, can be implemented without requiring large systems and can operate at very high frequencies. Once the basic building block of a system based on rational operators has been implemented, it can be successfully reused in many other applications. From the designer's point of view, it is important to have a general framework that makes it possible to study various configurations of the system to be implemented and to analyse the trade-offs among the design variables. In particular, to meet the need for versatile tools for power estimation, we developed a new macro-modelling technique that allows the designer to estimate the power dissipated by a circuit quickly and accurately. The thesis is organized as follows. Chapter 1 presents some of the algorithms that were studied for implementation; only a brief overview is given, and the interested reader is referred to the literature. Chapter 2 discusses the basic architectures used for the implementations; pipelined structures were mainly used, but an overview of the currently available approaches to timing optimization is also presented. Chapter 3 presents two of the implementations designed for this thesis. The approaches followed are ASIC-driven and FPGA-driven; they require different techniques and solutions for the system design, so it is interesting to see what can be done in both cases. Finally, Chapter 4 describes our macro-modelling technique for power estimation, giving a brief overview of the techniques proposed so far and showing the advantages our method brings to the design. XII Ciclo. 1969. Digitized version of the printed doctoral thesis.

OpenGrey Repository

OpenstarTs

High-level power optimisation for Digital Signal Processing in Reconfigurable Logic

Author: Clarke Jonathan A
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/01/2009

This thesis is concerned with the optimisation of Digital Signal Processing (DSP) algorithm implementations on reconfigurable hardware via the selection of appropriate word-lengths for the signals in these algorithms, in order to minimise system power consumption. Whilst existing word-length optimisation work has concentrated on the minimisation of the area of algorithm implementations, this work introduces the first set of power consumption models that can be evaluated quickly enough to be used within the search of the enormous design space of multiple word-length optimisation problems. These models achieve their speed by estimating both the power consumed within the arithmetic components of an algorithm and the power in the routing wires that connect these components, using only a high-level description of the algorithm itself. Trading off a small reduction in power model accuracy for a large increase in speed is one of the major contributions of this thesis. In addition to the work on power consumption modelling, this thesis also develops a new technique for selecting the appropriate word-lengths for an algorithm implementation in order to minimise its cost in terms of power (or some other metric for which models are available). The method developed is able to provide tight lower and upper bounds on the optimal cost that can be obtained for a particular word-length optimisation problem and can, as a result, find provably near-optimal solutions to word-length optimisation problems without resorting to an NP-hard search of the design space. Finally, the costs of systems optimised via the proposed technique are compared to those obtainable by word-length optimisation for minimisation of other metrics (such as logic area), providing greater insight into the nature of word-length optimisation problems and the extent of the improvements obtainable by them
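
To make the idea of multiple word-length optimisation concrete, the following is a minimal sketch under stated assumptions, not the method of the thesis. It quantises each input of a small weighted sum to its own number of fractional bits and shrinks the word-lengths with a plain greedy search while a worst-case output error bound still holds. The function names, the `toy_cost` proxy (sum of squared word-lengths), and the greedy search are all hypothetical stand-ins; the thesis itself describes fast power models and a bounding technique rather than this heuristic.

```python
def quantize(value, frac_bits):
    """Round value to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def max_error(word_lengths, samples, coeffs):
    """Worst-case absolute output error of the quantised weighted sum."""
    worst = 0.0
    for xs in samples:
        exact = sum(c * x for c, x in zip(coeffs, xs))
        approx = sum(c * quantize(x, w)
                     for c, x, w in zip(coeffs, xs, word_lengths))
        worst = max(worst, abs(exact - approx))
    return worst


def toy_cost(word_lengths):
    """Hypothetical cost proxy (assumption): sum of squared word-lengths."""
    return sum(w * w for w in word_lengths)


def greedy_word_lengths(samples, coeffs, error_bound, start_bits=16):
    """Drop one bit at a time from each signal while the bound still holds."""
    lengths = [start_bits] * len(coeffs)
    improved = True
    while improved:
        improved = False
        for i in range(len(lengths)):
            if lengths[i] == 0:
                continue
            trial = lengths.copy()
            trial[i] -= 1
            if max_error(trial, samples, coeffs) <= error_bound:
                lengths = trial
                improved = True
    return lengths


if __name__ == "__main__":
    # Deterministic illustrative inputs; not data from the thesis.
    samples = [(i / 37.0, (i * 7 % 11) / 13.0) for i in range(20)]
    coeffs = [0.5, 2.0]
    chosen = greedy_word_lengths(samples, coeffs, error_bound=1e-3)
    print(chosen, toy_cost(chosen), max_error(chosen, samples, coeffs))
```

A greedy search like this gives no optimality guarantee, which is precisely the gap that bounds on the optimal cost are meant to close.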

Spiral - Imperial College Digital Repository

Hybrid DDS-PLL based reconfigurable oscillators with high spectral purity for cognitive radio

Author: Mazumdar Dipayan
Publication venue
Publication date: 01/01/2018

Analytical, design and simulation studies on the performance optimization of reconfigurable architecture of a Hybrid DDS – PLL are presented in this thesis. The original contributions of this thesis are aimed towards the DDS, the dithering (spur suppression) scheme and the PLL. A new design of Taylor series-based DDS that reduces the dynamic power and number of multipliers is a significant contribution of this thesis. This thesis compares dynamic power and SFDR achieved in the design of varieties of DDS such as Quartic, Cubic, Linear and LHSC. This thesis proposes two novel schemes namely “Hartley Image Suppression” and “Adaptive Sinusoidal Interference Cancellation” overcoming the low noise floor of traditional dithering schemes. The simulation studies on a Taylor series-based DDS reveal an improvement in SFDR from 74 dB to 114 dB by using Least Mean Squares -Sinusoidal Interference Canceller (LM-SIC) with the noise floor maintained at -200 dB. Analytical formulations have been developed for a second order PLL to relate the phase noise to settling time and Phase Margin (PM) as well as to relate jitter variance and PM. New expressions relating phase noise to PM and lock time to PM are derived. This thesis derives the analytical relationship between the roots of the characteristic equation of a third order PLL and its performance metrics like PM, Gardner’s stability factor, jitter variance, spur gain and ratio of noise power to carrier power. This thesis presents an analysis to relate spur gain and capacitance ratio of a third order PLL. This thesis presents an analytical relationship between the lock time and the roots of its characteristic equation of a third order PLL. Through Vieta’s circle and Vieta’s angle, the performance metrics of a third order PLL are related to the real roots of its characteristic equation
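
As a rough illustration of what a Taylor series-based DDS computes, the sketch below splits a phase accumulator word into a coarse part, which would index a small sine/cosine table, and a fine part, which drives a Taylor correction around the coarse angle. It is a floating-point model under assumed parameters (16-bit accumulator, 6-bit coarse index, hypothetical function names), not the hardware design or the multiplier-saving scheme of the thesis. Here `order=1` corresponds to a linear correction and `order=3` to a cubic one.

```python
import math


def taylor_dds_sample(phase_word, acc_bits=16, lut_bits=6, order=1):
    """Approximate sin(2*pi*phase_word / 2**acc_bits) by a Taylor expansion
    around the coarse angle selected by the top lut_bits of the phase word."""
    fine_bits = acc_bits - lut_bits
    coarse = phase_word >> fine_bits
    fine = phase_word & ((1 << fine_bits) - 1)
    theta0 = 2.0 * math.pi * coarse / (1 << lut_bits)
    delta = 2.0 * math.pi * fine / (1 << acc_bits)
    # math.sin/math.cos stand in for the coarse lookup table (assumption).
    s, c = math.sin(theta0), math.cos(theta0)
    derivs = [s, c, -s, -c]  # successive derivatives of sin at theta0
    out = 0.0
    term = 1.0  # holds delta**k / k!
    for k in range(order + 1):
        out += derivs[k % 4] * term
        term *= delta / (k + 1)
    return out


def dds_wave(tuning_word, n, acc_bits=16, lut_bits=6, order=1):
    """Generate n samples by stepping the phase accumulator by tuning_word."""
    mask = (1 << acc_bits) - 1
    acc = 0
    samples = []
    for _ in range(n):
        samples.append(taylor_dds_sample(acc, acc_bits, lut_bits, order))
        acc = (acc + tuning_word) & mask
    return samples


def worst_case_error(order, acc_bits=16, lut_bits=6):
    """Largest deviation from math.sin over every phase word."""
    return max(
        abs(taylor_dds_sample(p, acc_bits, lut_bits, order)
            - math.sin(2.0 * math.pi * p / (1 << acc_bits)))
        for p in range(1 << acc_bits)
    )
```

Calling `worst_case_error(1)` and `worst_case_error(3)` shows how much a higher-order correction tightens the approximation for the same table size. Spur levels and SFDR would additionally depend on output quantisation, which this model leaves out.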

Coventry University Pure Portal


Array Architectures and Physical Layer Design for Millimeter-Wave Communications Beyond 5G

Author: Yan Han
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Ever-increasing demands in mobile data rates have resulted in exploration of millimeter-wave (mmW) frequencies for the next generation (5G) wireless networks. Communication at mmW frequencies presents two key challenges. Firstly, high propagation loss requires base stations (BSs) and user equipment (UEs) to use a large number of antennas and narrow beams to close the link with sufficient received signal power. Consequently, communication using narrow beams creates a new challenge in channel estimation and link establishment based on fine angular probing. Current mmW systems use analog phased arrays that can probe only one angle at a time, which results in high latency during link establishment and channel tracking. It is desirable to design low-latency beam training by exploring both physical layer designs and array architectures that could replace current 5G approaches and pave the way to communications in the higher mmW bands and the sub-THz region, where larger antenna arrays and communication bandwidths can be exploited. To this end, we propose novel signal processing techniques that exploit unique properties of the mmW channel, and show theoretically, in simulation, and in experiments their advantages over conventional approaches. Secondly, we explore different array architecture designs and analyze their trade-offs between spectral efficiency, power consumption, and area. For a comprehensive comparison, we have developed a methodology for optimal design of system parameters for different array architecture candidates based on the spectral efficiency target, and use these parameters to estimate the array area and power consumption based on the circuits reported in the literature. 
We show that the hybrid analog and digital architectures have severe scalability concerns in radio frequency signal distribution with increased array size and spatial multiplexing levels, while the fully-digital array architectures have the best performance and power/area trade-offs. The developed approaches are based on cross-disciplinary research that combines innovation in model-based signal processing, machine learning, and radio hardware. This work is the first to apply compressive sensing (CS), a signal processing tool that exploits the sparsity of the mmW channel model, to accelerate beam training of mmW cellular systems. The algorithm is designed to address practical issues, including the requirement of cell discovery and synchronization, which involves estimating the angular channel together with carrier frequency offset and timing offsets. We have analyzed the algorithm performance in a 5G-compliant simulation and showed that an order of magnitude saving is achieved in initial access latency for the desired channel estimation accuracy. Moreover, we are the first to develop and implement a neural network assisted compressive beam alignment to deal with hardware impairments in mmW radios. We have used a 60 GHz mmW testbed to perform experiments and show that the neural network approach enhances the alignment rate compared to CS. To further accelerate beam training, we proposed novel frequency-selective probing beams using the true-time-delay (TTD) analog array architecture. Our approach utilizes different subcarriers to scan different directions and achieves single-shot beam alignment, the fastest approach reported to date. Our comprehensive analysis of different array architectures and exploration of emerging architectures enabled us to develop approaches for initial access and channel estimation in mmW systems that are an order of magnitude faster and more energy efficient

eScholarship - University of California

System-level power optimization: techniques and tools

Author: Benini Luca; De Micheli Giovanni
Publication venue: Association for Computing Machinery (ACM)
Publication date: 05/05/2011

This tutorial surveys design methods for energy-efficient system-level design. We consider electronic systems consisting of a hardware platform and software layers. We consider the three major constituents of hardware that consume energy, namely computation, communication, and storage units, and we review methods of reducing their energy consumption. We also study models for analyzing the energy cost of software, and methods for energy-efficient software design and compilation. This survey is organized around three main phases of a system design: conceptualization and modeling, design and implementation, and runtime management. For each phase, we review recent techniques for energy-efficient design of both hardware and software

Infoscience - École polytechnique fédérale de Lausanne