
98 research outputs found

Energy efficient enabling technologies for semantic video processing on mobile devices

Author: Larkin Daniel
Publication venue: Dublin City University. Centre for Digital Video Processing (CDVP)
Publication date: 01/11/2008

Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before its applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable from both a real-time operation and a battery life perspective. This thesis attempts to tackle these issues by first constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which was designed from the outset to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry-save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy-efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.
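As a rough illustration of why binary data lends itself to low-cost motion estimation: when each pixel is a single bit (for example, a shape mask), the block-matching cost reduces to an XOR followed by a bit count. The sketch below is not the architecture proposed in the thesis. The packing of each row into one integer, the function names and the exhaustive search are all assumptions made for illustration.

```python
def binary_sad(block_a, block_b):
    """Number of differing pixels between two binary blocks.

    Each block is a list of ints; each int packs one row of 1-bit pixels.
    """
    return sum(bin(row_a ^ row_b).count("1") for row_a, row_b in zip(block_a, block_b))


def extract_block(frame, top, left, size, width):
    """Cut a size x size block out of a frame stored as packed row integers.

    `width` is the frame width in pixels; bit (width - 1) is the leftmost pixel.
    """
    shift = width - left - size
    mask = (1 << size) - 1
    return [(frame[r] >> shift) & mask for r in range(top, top + size)]


def best_match(current_block, reference, size, width):
    """Exhaustive search: return (cost, top, left) of the best-matching block."""
    height = len(reference)
    best = None
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            candidate = extract_block(reference, top, left, size, width)
            cost = binary_sad(current_block, candidate)
            # Strict comparison keeps the first position found on ties.
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best
```

For a 4 x 4 reference frame `[0b0000, 0b0110, 0b0110, 0b0000]` and a 2 x 2 block `[0b11, 0b11]`, `best_match(..., size=2, width=4)` finds a zero-cost match at row 1, column 1. A hardware design would typically avoid recomputing identical XOR results across overlapping candidates, which is the kind of redundancy the abstract refers to.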

DCU Online Research Access Service

MODLEX: A Multi Objective Data Layout EXploration Framework for Embedded Systems-on-Chip

Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Video Compression from the Hardware Perspective

Author: Grzegorz Pastuszak
Publication venue: IntechOpen
Publication date: 05/04/2012

IntechOpen

Energy efficient hardware acceleration of multimedia processing tools

Author: Kinane Andrew
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 01/01/2006

The world of mobile devices is experiencing an ongoing trend of feature enhancement and general-purpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks. Based on the survey that this thesis presents on modern video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work in this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high-level techniques such as redundant computation elimination, parallelism and low-switching computation structures. Both architectures compare favourably against the relevant prior art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation: both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions. Results show an improvement on state-of-the-art algorithms, with future potential for even greater savings.

Irish Universities

DCU Online Research Access Service

Parallelism and the software-hardware interface in embedded systems

Author: Chouliaras V A
Publication date: 01/01/2005

This thesis by publications addresses issues in the architecture and microarchitecture of next-generation, high-performance streaming Systems-on-Chip by quantifying the most important forms of parallelism in current and emerging embedded system workloads. The work consists of three major research tracks, relating to data-level parallelism, thread-level parallelism and the software-hardware interface, which together reflect the research interests of the author as they have formed over the last nine years. Published works confirm that parallelism at the data level is widely accepted as the most important performance lever for the efficient execution of embedded media and telecom applications and has been exploited via a number of approaches, the most efficient being vector/SIMD architectures. A further, complementary and substantial form of parallelism exists at the thread level, but this has not been researched to the same extent in the context of embedded workloads. For the efficient execution of such applications, exploitation of both forms of parallelism is of paramount importance. This calls for a new architectural approach to the software-hardware interface, as its rigidity, manifested in all desktop-based and the majority of embedded CPUs, directly affects the performance of vectorized, threaded codes. The author advocates a holistic, mature approach where parallelism is extracted via automatic means while, at the same time, the traditionally rigid hardware-software interface is optimized to match the temporal and spatial behaviour of the embedded workload. This ultimate goal calls for the precise study of these forms of parallelism for a number of applications executing on theoretical models such as instruction set simulators and parallel RAM machines, as well as the development of highly parametric microarchitectural frameworks to encapsulate that functionality.

EThOS - Electronic Theses Online Service

Loughborough University Institutional Repository

OpenGrey Repository

R-DVB: Software Defined Radio implementation of DVB-T signal detection functions for digital terrestrial television

Author: ROSE LUCA
Publication venue: Pisa University Press
Publication date: 25/04/2009

This thesis describes the implementation steps of an ETSI DVB-T compliant software-defined radio bench receiver, built with the GNU Radio framework. It also analyzes the receiver's performance and suggests future optimization tasks needed to achieve real-time operation.

Electronic Thesis and Dissertation Archive - Università di Pisa

Characterization, modeling and simulation of 4H-SiC power diodes

Author: Freda Albanese Loredana
Publication venue: Universita degli studi di Salerno
Publication date: 18/05/2011

2009 - 2010

Exploring the attractive electrical properties of Silicon Carbide (SiC) for power devices, this Ph.D. thesis focuses on the characterization and analysis of 4H-SiC pin diodes. In particular, the thesis concerns the development of a self-consistent, analytical, physics-based model that accurately replicates power diode behavior in both on-state and transient conditions. At present, fabricating SiC devices with a specified performance is not straightforward, because knowledge of the physical properties of the material is still incomplete, especially those related to carrier transport and their dependence on process parameters. These include the degree of doping activation, the carrier lifetime in the epitaxial layers to be employed, and the sensitivity of some physical parameters to temperature changes. A set of investigative tools designed specifically for SiC devices therefore cannot be regarded as a secondary objective. Such tools are useful both for process monitoring, where they are essential to tuning the technological processes used to build the final devices, and for proper diagnostics of the fabricated devices. To meet this need, our research first develops a predictive, static analytical model that includes temperature dependence. It explains the carrier transport in diffused regions as a function of the injection level, and it also helps clarify how physical parameters that depend significantly on the processed material influence device performance. The model solves the continuity equation in double-carrier conditions, taking into account the effects of the varying doping profile of the junction, the spatial dependence of physical parameters on both doping and injection level, and the modification of the electric field in the region with the injection regime. 
The model also includes device characterization at high temperatures, in order to analyze the influence of thermal issues on the overall behavior up to a temperature of 250 °C. The accuracy of the static model has been extensively demonstrated by numerous comparisons with numerical results obtained with the SILVACO commercial simulator. Secondly, in order to properly account for the dynamic electrical behavior of a diode with a generic structure, the static model has been incorporated into a more general, self-consistent model that allows the device behavior to be analyzed when it is switched from an arbitrary forward-bias condition. In particular, attention is focused on an abrupt variation of the diode voltage due to an instantaneous interruption of the conduction current. This situation is of notable interest for studying the switching behavior of diodes, and the voltage transient is also traditionally used in various investigation techniques to extract information about the mean carrier lifetime. This occurs, for example, in the conventional Open Circuit Voltage Decay (OCVD) technique, where the voltage decay caused by the current interruption provides an indirect measure of the minority carrier lifetime in the epitaxial layer. Because of its heavy dependence on processing, the carrier lifetime is an important parameter to monitor, especially in bipolar devices, and it cannot be neglected. Given the existing uncertainty about this parameter in SiC epi-layers, the OCVD method proves to be a practical way of overcoming this limitation. In detail, our self-consistent model, which exploits an improved version of the traditional OCVD technique, makes it possible to characterize the carrier lifetime in the 4H-SiC epitaxial layer of a generic diode under test, yielding the spatial distributions of the minority carrier concentration and the carrier lifetime at any injection regime. 
The overall model performance is compared with both device simulations and experimental results obtained on Si and 4H-SiC rectifier structures with various physical and electrical characteristics. The comparisons show that the model has good predictive capabilities for describing the spatial-temporal variation of carriers and currents along the whole epi-layer. They also confirm the validity of the approximations used and help resolve some ambiguities reported in the literature, such as the stated inapplicability of the OCVD method to thick epitaxial layers, the reasons for the observed nonlinear decay of the voltage with time, and the effects of junction properties on the voltage transient. Finally, by imposing appropriate boundary conditions, the versatility of the developed model can be used to extend the analysis and obtain physical insight into any arbitrary switching condition of 4H-SiC power diodes. [edited by author] IX n.s
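For reference, the conventional OCVD relation that the abstract alludes to is usually written in the textbook form below. This is not the improved method developed in the thesis, and the symbols are assumptions introduced here: $\tau$ is the minority carrier lifetime, $V(t)$ the open-circuit voltage after the current is interrupted, $k_B T / q$ the thermal voltage, and $\eta$ a factor that depends on the injection level.

```latex
\tau \approx -\,\eta\,\frac{k_B T}{q}\left(\frac{dV}{dt}\right)^{-1},
\qquad
\eta =
\begin{cases}
1 & \text{low-level injection} \\
2 & \text{high-level injection}
\end{cases}
```

Under this idealised relation a constant lifetime gives a linear voltage decay, so a nonlinear decay such as the one mentioned in the abstract signals that the simple picture does not hold.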

EleA@UniSA - Università degli Studi di Salerno

Exploration and Design of Power-Efficient Networked Many-Core Systems

Author: Rahmani-Sane Amir-Mohammad
Publication venue: Turku Centre for Computer Science
Publication date: 14/12/2012

Multiprocessing is a promising solution to meet the requirements of near-future applications. To get the full benefit from parallel processing, a many-core system needs an efficient on-chip communication architecture. Network-on-Chip (NoC) is a general-purpose communication concept that offers high throughput and reduced power consumption, and keeps complexity in check through a regular composition of basic building blocks. This thesis presents power-efficient communication approaches for networked many-core systems. We address a range of issues that are important for designing power-efficient many-core systems at two different levels: the network level and the router level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today's and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. 
Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots, with performance similar to that of the latest NoC architectures. The thesis concludes that carefully co-designed elements from different network levels enable considerable power savings for many-core systems.

Transferred from Doria

UTUPub

Performance and Energy Consumption Characterization and Modeling of Video Decoding on Multi-core Heterogenous SoC and their Applications

Author: Benmoussa Yahia
Publication venue: HAL CCSD
Publication date: 16/06/2015

To meet the increasing complexity of mobile multimedia applications, the Systems on Chip (SoC) equipping modern mobile devices integrate powerful heterogeneous processing elements, among which General Purpose Processors (GPP), Digital Signal Processors (DSP) and hardware accelerators are the most common. Due to the ever-growing gap between battery lifetime and hardware/software complexity, in addition to the computing power needs of applications, energy saving becomes crucial in the design of such systems. In this context, we propose a study aiming to enhance the understanding of the energy consumption behavior of video decoding on these kinds of systems. Accordingly, an end-to-end methodology for characterizing and modeling the performance and the energy consumption of video decoding on GPP and DSP is proposed. The characterization step is based on an exhaustive experimental methodology for evaluating, at different abstraction levels, the performance and the energy consumption of video decoding. It was carried out on embedded platforms on which a wide range of video decoding configurations were executed. This step highlighted the importance of considering different parameters, which may pertain to different abstraction levels, when evaluating the overall energy efficiency of a given system. The measurements obtained in this step were used to empirically build performance and energy models for video decoding on both GPP and DSP. The proposed models gave very accurate estimates (R² = 97%) of both the performance and the energy consumption of video decoding in terms of a rich set of parameters, including the video quality and the processor frequency. Moreover, based on multi-level characterization and sub-model decomposition approaches, we show how the developed models, unlike classic empirical models, are easily and rapidly generalizable to other platforms. Some possible applications of the developed models, in the context of adaptive video decoding, were proposed. 
In general, these consist in using the capability of the proposed performance model to predict the decoding time of a given video quality when dimensioning or scheduling the processing resources. Due to the increasing demand for High Definition (HD), the characterization methodology was extended to consider HD video decoding on both parallel multi-cores and a hardware video accelerator. This part highlighted the potential of parallel video decoding to increase the energy efficiency of video decoding and pointed out some open issues in this domain.

To meet the growing complexity of mobile multimedia applications, the systems-on-chip in modern mobile devices integrate powerful, heterogeneous processing units. These include general-purpose processors, digital signal processors and hardware accelerators. Because of the ever-widening gap between battery life and the growing demand for computing power, saving energy has become a crucial issue in the design of mobile systems. The problem is compounded by the increasing complexity of the software and hardware architectures used. In this context, we propose a study that aims to improve the understanding of the energy considerations of video decoding on such systems. We propose a methodology for characterizing and modeling the performance and energy consumption of video decoding, both on ARM general-purpose processors and on a digital signal processor. The characterization step is based on an experimental methodology for exhaustively evaluating, at different abstraction levels, the performance and energy consumption of video decoding. 
This characterization was carried out on embedded platforms on which a wide range of video decoding configurations were executed. This step highlighted the importance of examining different parameters, which may relate to different abstraction levels, when evaluating the overall energy efficiency of a given system. The measurements obtained in this step were used to empirically build performance and energy consumption models for video decoding on both ARM general-purpose processors and a digital signal processor. The proposed models can estimate with high accuracy (R² = 97%) the performance and energy consumption of video decoding as a function of a number of parameters, including the video quality and the processor frequency. In addition, based on a multi-level characterization and a modeling approach that decomposes into sub-models, we show how the developed models, unlike classic empirical models, can be easily and rapidly generalized to other platforms. We also propose some possible applications of the developed models in the context of adaptive video decoding. In general, these consist in exploiting the ability of the proposed performance model to predict the decoding time of a given video quality in order to better dimension computing resources, with the aim of reducing their energy consumption.

Thèses en Ligne

HAL-Université de Bretagne Occidentale


Efficient FPGA implementation and power modelling of image and signal processing IP cores

Author: Chandrasekaran Shrutisagar
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2007

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number of image and signal processing application areas such as consumer electronics, instrumentation, medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power, leading to a number of issues including reduced mobility, reliability concerns and increased design cost, among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. 
A pioneering study into the influence of supply voltage scaling for FPGA-based designs, used in conjunction with performance-enhancing strategies such as parallelism and pipelining, has been performed. Initial results are very promising and indicate significant potential for future research in this area. A key contribution of this work is the development of a novel high-level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed.
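To make the Distributed Arithmetic (DA) idea mentioned in this abstract concrete, the sketch below computes an inner product with constant coefficients using only table look-ups, shifts and additions. It is a minimal software model under stated assumptions, not the thesis's FPGA implementation: inputs are taken to be unsigned integers of a known bit width, and the function names are hypothetical.

```python
def build_da_lut(coeffs):
    """Precompute every partial sum of the constant coefficients.

    Entry `addr` holds the sum of coeffs[k] for each bit k set in addr.
    """
    lut = [0] * (1 << len(coeffs))
    for addr in range(1, len(lut)):
        low = addr & -addr              # lowest set bit of addr
        k = low.bit_length() - 1        # index of that bit
        lut[addr] = lut[addr ^ low] + coeffs[k]
    return lut


def da_inner_product(coeffs, inputs, bits):
    """Inner product of constant coeffs with unsigned `bits`-wide inputs.

    Processes one bit plane per step: look up, shift, accumulate.
    Assumes every input lies in the range 0 .. 2**bits - 1.
    """
    lut = build_da_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Gather bit b of every input into a LUT address.
        addr = 0
        for k, x in enumerate(inputs):
            addr |= ((x >> b) & 1) << k
        acc += lut[addr] << b
    return acc
```

For example, `da_inner_product([3, -1, 4], [5, 2, 7], bits=4)` returns 41, the same as `3*5 - 1*2 + 4*7`. The table grows as 2^N with the number of coefficients. Keeping that table small is one reason a design might use variations such as the offset binary coding the abstract mentions.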

Brunel University Research Archive