98 research outputs found

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Get PDF
    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object based paradigm has many undeniable benefits, numerous technical challenges remain before the applications becomes pervasive, particularly on computational constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery powered mobile computing devices, the additional algorithmic complexity of semantic object based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to be offloaded from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourable with the relevant prior art

    MODLEX: A Multi Objective Data Layout EXploration Framework for Embedded Systems-on-Chip

    Full text link

    Video Compression from the Hardware Perspective

    Get PDF

    Energy efficient hardware acceleration of multimedia processing tools

    Get PDF
    The world of mobile devices is experiencing an ongoing trend of feature enhancement and generalpurpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks Based on the survey that this thesis presents on modem video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work m this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high level techniques such as redundant computation elimination, parallelism and low switching computation structures. Both architectures compare favourably against the relevant pnor art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation - namely, both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions .Results show an improvement on state of the art algorithms with future potential for even greater savings

    Parallelism and the software-hardware interface in embedded systems

    Get PDF
    This thesis by publications addresses issues in the architecture and microarchitecture of next generation, high performance streaming Systems-on-Chip through quantifying the most important forms of parallelism in current and emerging embedded system workloads. The work consists of three major research tracks, relating to data level parallelism, thread level parallelism and the software-hardware interface which together reflect the research interests of the author as they have been formed in the last nine years. Published works confirm that parallelism at the data level is widely accepted as the most important performance leverage for the efficient execution of embedded media and telecom applications and has been exploited via a number of approaches the most efficient being vectorlSIMD architectures. A further, complementary and substantial form of parallelism exists at the thread level but this has not been researched to the same extent in the context of embedded workloads. For the efficient execution of such applications, exploitation of both forms of parallelism is of paramount importance. This calls for a new architectural approach in the software-hardware interface as its rigidity, manifested in all desktop-based and the majority of embedded CPU's, directly affects the performance ofvectorized, threaded codes. The author advocates a holistic, mature approach where parallelism is extracted via automatic means while at the same time, the traditionally rigid hardware-software interface is optimized to match the temporal and spatial behaviour of the embedded workload. This ultimate goal calls for the precise study of these forms of parallelism for a number of applications executing on theoretical models such as instruction set simulators and parallel RAM machines as well as the development of highly parametric microarchitectural frameworks to encapSUlate that functionality.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    R-DVB: Software Defined Radio implementation of DVB-T signal detection functions for digital terrestrial television

    Get PDF
    This thesis describes the implementation steps of ETSI DVB-T compliant software defined radio bench receiver, using the GNU Radio framework. It also analyzes its performances and suggest futures optimization tasks in order to achieve the real-time goal

    Characterization, modeling and simulation of 4H-SiC power diodes

    Get PDF
    2009 - 2010Exploring the attractive electrical properties of the Silicon Carbide (SiC) for power devices, the characterization and the analysis of 4H-SiC pin diodes is the main topic of this Ph.D. document. In particular, the thesis concerns the development of an auto consistent, analytical, physics based model, created for accurately replicating the power diodes behavior, including both on-state and transient conditions. At the present, the fabrication of SiC devices with the given performances is not completely obvious because of the lack of knowledge still existing in the physical properties of the material, especially of those related to carrier transport and of their dependences on process parameters. Among these, one can cite the degree of doping activation, the carrier lifetime into epitaxial layers that will be employed and the sensitivity of some physical parameters to temperature changes. Therefore, a set of investigative tools, designed especially for SiC devices, cannot be regarded as secondary objective. It will be useful both for process monitoring, becoming essential to the tuning of technological processes used for the implementation of the final devices, and for a proper diagnostics of the realized devices. Following this need, in our research activity firstly a predictive, static analytical model, including temperature dependence, is developed. It is able to explain the carrier transport in diffused regions as function of the injection level and turns also useful for better understanding the influence of physical parameters, which depend in a significant way from the processed material, on device performances. The model solves the continuity equation in double carrier conditions, taking into account the effects due to varying doping profile of the junction, the spatial dependence of physical parameters on both doping and injection level and the modification of the electric field of the region with the injection regime. The model includes also the device characterization at high temperatures to analyze the influence of thermal issues on the overall behavior up to temperature of 250°C. The accuracy of the static model has been extensively demonstrated by numerous comparisons with numerical results obtained by the SILVACO commercial simulator. Secondly, with the aim to properly account for the dynamic electrical behavior of a diode with generic structure, the static model has been incorporated in a more general, self-consistent model, allowing the analysis of the device behavior when it is switched from an arbitrary forward-bias condition. In particular, the attention is focused on an abrupt variation of diode voltage due to an instantaneous interruption of the conduction current: although this situation is notably interesting for the study of the switching behavior of diodes, the voltage transitory is also traditionally used in different techniques of investigation to extract more information about the mean carrier lifetime. This occurs, for example, in the conventional Open Circuit Voltage Decay (OCVD) technique, where the voltage decay due to the current interruption is useful for an indirect measure of minority carrier lifetime in the epitaxial layer. Because of its heavy dependence on processes, the carrier lifetime is an important parameter to be monitored, especially in the case of bipolar devices, and it cannot be neglected. Due to the existent uncertainty about this parameter in SiC epi-layers, the OCVD method reveals itself a practical way to overcoming this limit. In detail, by using our self-consistent model, that exploits an improved method of the traditional OCVD technique, it is possible to characterize the carrier lifetime into 4H-SiC epitaxial layer of a generic diode under test, obtaining the spatial distributions of the minority carrier concentration and carrier lifetime at any injection regime. The overall model performances are compared to both device simulations and experimental results performed on Si and 4H-SiC rectifier structures with various physical and electrical characteristics. From the comparisons, the model results to have good predictive capabilities for describing the spatial–temporal variation of carriers and currents along the whole epi-layer, proving contextually the validity of the used approximations and allowing also to resolve some ambiguities reported in the literature, such as the stated inapplicability of the OCVD method on thick epitaxial layers, the reasons of the observed non linear decay of the voltage with time, and the effects of junction properties on voltage transient. Finally, with the imposition of right boundary conditions, it is possible to use the versatility of the developed model for extending the analysis and obtaining a physical insight of any arbitrary switching condition of 4H-SiC power diodes. [edited by author]IX n.s

    Exploration and Design of Power-Efficient Networked Many-Core Systems

    Get PDF
    Multiprocessing is a promising solution to meet the requirements of near future applications. To get full benefit from parallel processing, a manycore system needs efficient, on-chip communication architecture. Networkon- Chip (NoC) is a general purpose communication concept that offers highthroughput, reduced power consumption, and keeps complexity in check by a regular composition of basic building blocks. This thesis presents power efficient communication approaches for networked many-core systems. We address a range of issues being important for designing power-efficient manycore systems at two different levels: the network-level and the router-level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/ Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today’s and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the beneficial feature of a negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots with similar performance compared to latest NoC architectures. The thesis concludes that careful codesigned elements from different network levels enable considerable power savings for many-core systems.Siirretty Doriast

    Performance and Energy Consumption Characterization and Modeling of Video Decoding on Multi-core Heterogenous SoC and their Applications

    Get PDF
    To meet the increasing complexity of mobile multimedia applications, the System on Chip (SoC) equipping modern mobile devices integrate powerful heterogeneous processing elements among which General Purpose Processors (GPP), Digital Signal Processors (DSP), hardware accelerator are the most common ones.Due to the ever-growing gap between battery lifetime and hardware/software complexity in addition to application computing power needs, the energy saving issue becomes crucial in the design of such systems. In this context, we propose a study aiming to enhance the understanding of the energy consumption behavior of video decoding on these kinds of systems. Accordingly, an end-to-end methodology for characterizing and modeling the performance and the energy consumption of video decoding on GPP and DSP is proposed. The characterization step is based on an exhaustive experimental methodology for evaluating, at different abstraction levels, the performance and the energy consumption of video decoding. It was achieved on embedded platforms on which were executed a wide range of video decoding configurations. This step highlighted the importance to consider different parameters which may pertain to different abstraction levels in evaluating the overall energy efficiency of a given system. The measurements obtained in this step were used to build empirically performance and energy models for video decoding on both GPP and DSP. The proposed models gave very accurate estimation (R 2 = 97%) of both the performance and the energy consumption of video decoding in terms of a rich set of parameters including the video quality and the processor frequency. Moreover, based on a multi-level characterization and sub-model decomposition approaches, we show how the developed models, unlike classic empirical models, are easily and rapidly generalizable to other platforms.Some possible applications using the developed models, in the context of adaptive video decoding, were proposed. In general, it consists to use the capability of the proposed performance model to predict the decoding time of a given video quality in dimensioning/scheduling the processing resources. Due to the increasing demand on High Definition (HD), the characterization methodology was extended to consider HD video decoding on both parallel multi-cores and hardware video accelerator. This part highlighted the potential of parallelism video decoding to increase the energy efficiency of video decoding and point out some open issues in this domain.Pour rĂ©pondre Ă  la complexitĂ© croissante des applications multimĂ©dia mobiles, les systĂšmes sur puce Ă©quipant les appareils mobiles modernes intĂšgrent des unitĂ©s de calcul puissantes et hĂ©tĂ©rogĂšne. Parmi ces units de calcul, on peut trouver des processeurs Ă  usage gĂ©nĂ©ral, des processeur de traitement de signal et des accĂ©lĂ©rateurs matĂ©riels. En raison de l’écart toujours croissant entre la durĂ©e de vie des batteries et la demande de plus en plus importante en puissance de calcul, l’économie d’énergie devient un enjeu crucial dans la conception des systĂšmes mobiles. Cette problĂ©matique est accentuĂ©e par l’augmentation de la complexitĂ© des logiciels et architectures matĂ©riels utilisĂ©s. Dans ce contexte, nous proposons une Ă©tude visant Ă  amĂ©liorer la comprĂ©hension des considĂ©rations Ă©nergĂ©tiques du dĂ©codage vidĂ©o sur ce genre de systĂšmes. Nous proposerons ainsi une mĂ©thodologie pour la caractĂ©risation et la modĂ©lisation des performances et de la consommation d’énergie du dĂ©codage vidĂ©o, aussi bien sur des processeurs Ă  usage gĂ©nĂ©ral de type ARM que sur un processeurde traitement de signal. L’étape de caractĂ©risation est basĂ©e sur une mĂ©thodologie expĂ©rimentale pour Ă©valuer de façon exhaustive et Ă  diffĂ©rents niveaux d’abstraction, les performances et la consommation d’énergie du dĂ©codage vidĂ©o. Cette caractĂ©risation a Ă©tĂ© rĂ©alisĂ©e sur des plates-formes embarquĂ©es sur lesquels ont Ă©tĂ© exĂ©cutĂ©s un large Ă©ventail de configurations du dĂ©codage vidĂ©o. Cette Ă©tape a soulignĂ© l’importance d’examiner diffĂ©rents paramĂštres qui peuvent se rapporter Ă  diffĂ©rents niveaux d’abstraction dans l’évaluation de l’efficacitĂ© Ă©nergĂ©tique globale d’un systĂšme donnĂ©. Les mesures obtenues dans cette Ă©tape ont Ă©tĂ© utilisĂ©es pour construire empiriquement des modĂšles de performance et de consommation d’énergie pour le dĂ©codage vidĂ©o Ă  la fois sur des processeurs Ă  usage gĂ©nĂ©ral type ARM et sur un processeur de traitement de signal. Les modĂšles proposĂ©s peuvent estimer avec une grande prĂ©cision (R 2 = 97%) la performance et la consommation d’énergie de dĂ©codage vidĂ©o en fonction d’un nombre de paramĂštres comprenant la qualitĂ© de la vidĂ©o et la frĂ©quence du processeur. En plus, en se basant sur une caractĂ©risation multi-niveaux et une approches de modĂ©lisation par dĂ©composition en sous-modĂšles, nous montrons comment les modĂšles dĂ©veloppĂ©s, contrairement aux modĂšles empiriques classiques, sont facilement et rapidement gĂ©nĂ©ralisables Ă  d’autres plates-formes. Nous proposerons Ă©galement certaines applications possibles des modĂšles dĂ©veloppĂ©s, dans le cadre du dĂ©codage vidĂ©o adaptatif. En gĂ©nĂ©ral, cela consiste Ă  exploiter la capacitĂ© du modĂšle de performance proposĂ© pour prĂ©dire le temps de dĂ©codage d’une qualitĂ© vidĂ©o donnĂ©e afin de mieux dimensionner les ressources de calculs dans un but de rĂ©duire leur consommationd’énergie
    • 

    corecore