Search CORE

6,793 research outputs found

Data Cache-Energy and Throughput Models: Design Exploration for Embedded Processors

Author: McDonald-Maier Klaus
Qadri Muhammad Yasir
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2009

Most modern 16-bit and 32-bit embedded processors contain cache memories to further increase the instruction throughput of the device. Embedded processors that contain cache memories open an opportunity for the low-power research community to model the impact of cache energy consumption and throughput gains. Mathematical models for optimal cache memory configuration have been proposed in the past. Most of these models are too complex to be adapted for modern applications like run-time cache reconfiguration. This paper improves and validates previously proposed energy and throughput models for a data cache, which could be used for overhead analysis for various cache types with a relatively small number of inputs. These models analyze the energy and throughput of a data cache on a per-application basis, thus providing the hardware and software designer with the feedback vital to tune the cache or application for a given energy budget. The models are suitable for use at design time in the cache optimization process for embedded processors, considering time and energy overhead, or could be employed at runtime for reconfigurable architectures.

University of Essex Research Repository

Springer - Publisher Connector

Directory of Open Access Journals

Recommended from our members

Design Space Exploration in Cyber-Physical Systems

Author: Amir Maral
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Cyber-physical systems (CPS) integrate a variety of engineering areas such as control, mechanical and computer engineering in a holistic design effort. While interdependencies between the different disciplines are key attributes of CPS design science, little is known about the impact of design decisions of the cyber part on the overall system qualities. To investigate these interdependencies, this paper proposes a simulation-based Design Space Exploration (DSE) framework that considers detailed cyber system parameters such as cache size, bus width, and voltage levels in addition to physical and control parameters of the CPS. We propose an exploration algorithm that surfs the parameter configurations in the cyber-physical sub-systems in order to approximate the Pareto-optimal design points with regard to the trade-offs among the design objectives, such as energy consumption and control stability. We apply the proposed framework to a network control system for an inverted-pendulum application. The presented holistic evaluation of the identified Pareto points reveals the presence of non-trivial trade-offs, which are imposed by the control, physical, and detailed cyber parameters. For instance, the identified energy- and control-optimal design points comprise configurations with a wide range of CPU speeds, sample times and cache configurations following non-trivial zig-zag patterns. The proposed framework could identify and manage those trade-offs and, as a result, is an imperative first step to automate the search for superior CPS configurations.
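The notion of Pareto-optimal design points used in this abstract can be illustrated with a minimal sketch. This is not the exploration algorithm from the paper; it is a generic brute-force filter, and the two objectives (an energy value and a control-instability value, both to be minimized) as well as the names `dominates` and `pareto_front` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical design point: (energy, instability), both to be minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (brute force, O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Made-up sample values, not data from the paper.
    candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
    print(pareto_front(candidates))
```

In this made-up sample, the point `(3.0, 4.0)` is dropped because `(2.0, 3.0)` is better in both objectives; the remaining points each represent a different trade-off.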

eScholarship - University of California

Performance analysis and optimization of automotive GPUs

Author: Abella Ferrer Jaume
Benedicte Illescas Pedro
Cazorla Almeida Francisco Javier
Kosmidis Leonidas
Mazzocchetti Fabio
Tabani Hamid
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD) have drastically increased the performance demands of automotive systems. Suitable high-performance platforms building upon Graphics Processing Units (GPUs) have been developed to respond to this demand, with the NVIDIA Jetson TX2 being a relevant representative. However, whether high-performance GPU configurations are appropriate for automotive setups remains an open question. This paper aims to shed light on this question by modelling an automotive GPU (Jetson TX2), analyzing its microarchitectural parameters against relevant benchmarks, and identifying specific configurations able to meaningfully increase performance within similar cost envelopes, or to decrease costs while preserving original performance levels. Overall, our analysis opens the door to the optimization of automotive GPUs for further system efficiency.

This work has been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant TIN2015-65316-P, the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 772773) and the HiPEAC Network of Excellence. Pedro Benedicte and Jaume Abella have been partially supported by the MINECO under FPU15/01394 grant and Ramón y Cajal postdoctoral fellowship number RYC-2013-14717, respectively, and Leonidas Kosmidis under Juan de la Cierva-Formación postdoctoral fellowship (FJCI-2017-34095).

Peer Reviewed. Postprint (author's final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC

Scipedia

Cycle Accurate Energy and Throughput Estimation for Data Cache

Author: McDonald-Maier KD
Qadri M
Publication venue: 'TECSI'
Publication date: 01/01/2009

Resource optimization in energy-constrained real-time adaptive embedded systems depends heavily on accurate energy and throughput estimates of processor peripherals. Such applications require lightweight, accurate mathematical models to profile energy and timing requirements on the go. This paper presents enhanced mathematical models for data cache energy and throughput estimation. The energy and throughput models were found to be within 95% accuracy of the per-instruction energy model of a processor and a full system simulator's timing model, respectively. Furthermore, the possible application of these models in various scenarios is discussed in this paper.

University of Essex Research Repository

Improving early design stage timing modeling in multicore based real-time systems

Author: Abella Ferrer Jaume
Cazorla Almeida Francisco Javier
Fernández Mikel
Jalle Ibarra Javier
Trilla David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

This paper presents a modelling approach for the timing behavior of real-time embedded systems (RTES) in early design phases. The model focuses on multicore processors - accepted as the next computing platform for RTES - and in particular it predicts the contention that tasks suffer when accessing multicore on-chip shared resources. The model has the key properties of not requiring the application's source code or binary and of having high accuracy and low overhead. The former is of paramount importance in those common scenarios in which several software suppliers work in parallel implementing different applications for a system integrator, subject to different intellectual property (IP) constraints. Our model helps reduce the risk of exceeding the assigned budgets for each application in late design stages, and the associated costs.

This work has received funding from the European Space Agency under Project Reference AO=17722=13=NL=LvH, and has also been supported by the Spanish Ministry of Science and Innovation grant TIN2015-65316-P. Jaume Abella has been partially supported by the MINECO under Ramón y Cajal postdoctoral fellowship number RYC-2013-14717.

Peer Reviewed. Postprint (author's final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC

Modeling and visualizing networked multi-core embedded software energy consumption

Author: Eder Kerstin
Kerrison Steve
Publication venue
Publication date: 09/09/2015

In this report we present a network-level multi-core energy model and a software development workflow that allows software developers to estimate the energy consumption of multi-core embedded programs. This work focuses on a high-performance, cache-less and timing-predictable embedded processor architecture, XS1. Prior modelling work is improved to increase accuracy, then extended to be parametric with respect to voltage and frequency scaling (VFS), and then integrated into a larger-scale model of a network of interconnected cores. The modelling is supported by enhancements to an open source instruction set simulator to provide the first network-timing-aware simulations of the target architecture. Simulation-based modelling techniques are combined with methods of results presentation to demonstrate how such work can be integrated into a software developer's workflow, enabling the developer to make informed, energy-aware coding decisions. A set of single-threaded, multi-threaded and multi-core benchmarks is used to exercise and evaluate the models and to provide use case examples of how results can be presented and interpreted. The models all yield accuracy within an average ±5% error margin.

arXiv.org e-Print Archive

Explore Bristol Research