8,184 research outputs found

An improved instruction-level power model for ARM11 microprocessor

Author: Wang Wei
Zwolinski Mark
Publication date: 23/01/2014

Power and energy consumption have become the primary design constraint for embedded systems, which has led to much work on hardware design techniques such as clock gating and power gating. Software also affects the power usage of a chip, so good software design can reduce power further. In this paper we present an instruction-level power model based on an ARM1176JZF-S processor to predict the power of software applications. Our model takes substantially less input data than existing high-accuracy models and does not need to consider each instruction individually. We show that the power is related to both the distribution of instruction types and the operations per clock cycle (OPC) of the program. Our model does not need to consider the effect of two adjacent instructions, which saves many calculations and measurements. Pipeline stall effects are captured through OPC rather than cache misses, because many other causes can stall the pipeline. The model shows good performance, with a maximum estimation error of -8.28% and an average absolute estimation error of 4.88% over six benchmarks. Finally, we prove that energy per operation (EPO) decreases with increasing operations per clock cycle, and we confirm the relationship empirically.

Southampton (e-Prints Soton)
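
The EPO claim in the ARM11 power-model abstract above can be illustrated with a small sketch. It assumes, purely for illustration, that average power grows affinely with OPC; the function names, the coefficients, and the 700 MHz clock are hypothetical and are not taken from the paper, whose model also depends on the instruction-type mix.

```python
def energy_per_operation(power_w, clock_hz, opc):
    """Energy per operation in joules: power divided by operations per second."""
    if clock_hz <= 0 or opc <= 0:
        raise ValueError("clock_hz and opc must be positive")
    return power_w / (clock_hz * opc)


def modelled_power(opc, base_w=0.20, per_opc_w=0.10):
    """Hypothetical affine power model: a fixed part plus a part that scales with OPC."""
    return base_w + per_opc_w * opc


CLOCK_HZ = 700e6  # assumed clock frequency, for illustration only

# Sweep OPC and compute the energy spent per operation at each point.
opc_points = (0.25, 0.5, 0.75, 1.0)
epo_values = [
    energy_per_operation(modelled_power(opc), CLOCK_HZ, opc) for opc in opc_points
]

for opc, epo in zip(opc_points, epo_values):
    print(f"OPC={opc:.2f}  EPO={epo * 1e9:.3f} nJ")
```

Under this assumption EPO equals `base_w / (clock_hz * opc) + per_opc_w / clock_hz`, so the fixed part of the power is amortized over more operations as OPC rises.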

On-chip micro-evaporation: Experimental evaluation of liquid pumping and vapor compression cooling systems

Author: Marcinichen JB
Thome JR
Publication venue: Brunel University
Publication date: 01/01/2011

This paper was presented at the 3rd Micro and Nano Flows Conference (MNF2011), which was held at the Makedonia Palace Hotel, Thessaloniki in Greece. The conference was organised by Brunel University and supported by the Italian Union of Thermofluiddynamics, Aristotle University of Thessaloniki, University of Thessaly, IPEM, the Process Intensification Network, the Institution of Mechanical Engineers, the Heat Transfer Society, HEXAG - the Heat Exchange Action Group, and the Energy Institute.

Thermal designers of data centers and server manufacturers are greatly concerned about the cooling of new-generation data centers, which are more compact and dissipate more power than conventional air conditioning systems can currently cool. With very large data centers exceeding 100 000 servers, some consume more than 50 MW [1] of electrical energy to operate, energy which is directly converted to heat and then simply wasted as it is dissipated into the atmosphere. A potentially much better solution would be to make use of on-chip two-phase cooling [2], which, besides improving the cooling performance at the chip level, also adds the capability to reuse the waste heat in a convenient manner, since higher evaporating and condensing temperatures of the two-phase cooling system (60-95°C) are possible with such a new green cooling technology. In the present project, two such two-phase cooling cycles using micro-evaporation technology were experimentally evaluated, with specific attention being paid to energy consumption, overall exergetic efficiency and controllability. The main difference between the two cooling cycles is the driver: a mini-compressor in one and a gear pump in the other. The former is more attractive for energy recovery, since its exergy potential is higher and the waste heat is exported at a higher temperature for reuse.

This study is supported by: the Swiss Commission for Technology and Innovation (CTI) contract number 6862.2; the LTCM laboratory; IBM Zürich Research Laboratory (Switzerland) and Embraco (Brazil).

Brunel University Research Archive

Optimizing the flash-RAM energy trade-off in deeply embedded systems

Author: Eder Kerstin
Hollis Simon
Pallister James
Publication date: 03/06/2014

Deeply embedded systems often have the tightest constraints on energy consumption, requiring that they consume tiny amounts of current and run on batteries for years. However, they typically execute code directly from flash, instead of the more energy-efficient RAM. We implement a novel compiler optimization that exploits the relative efficiency of RAM by statically moving carefully selected basic blocks from flash to RAM. Our technique uses integer linear programming with an energy cost model to select a good set of basic blocks to place into RAM, without impacting stack or data storage. We evaluate our optimization on a common ARM microcontroller and succeed in reducing the average power consumption by up to 41% and reducing energy consumption by up to 22%, while increasing execution time. A case study is presented in which an application executes code and then sleeps for a period of time. For this example we show that our optimization could allow the application to run on battery for up to 32% longer. We also show that for this scenario the total application energy can be reduced, even if the optimization increases the execution time of the code.

arXiv.org e-Print Archive

Explore Bristol Research
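
The block-selection idea in the flash-RAM abstract above can be sketched as a 0/1 selection problem: pick the set of basic blocks with the largest estimated energy saving that still fits in the spare RAM. The sketch below is a stand-in, not the authors' method: it uses exhaustive search instead of an integer linear programming solver, and the block names, sizes, saving figures, and RAM budget are all hypothetical.

```python
from itertools import combinations


def select_blocks(blocks, ram_bytes):
    """Return (chosen block names, total saving) maximizing saving within ram_bytes.

    blocks maps a block name to (size_in_bytes, estimated_energy_saving).
    Exhaustive search is only practical for a handful of blocks.
    """
    best_set, best_saving = frozenset(), 0.0
    names = sorted(blocks)
    for count in range(1, len(names) + 1):
        for subset in combinations(names, count):
            size = sum(blocks[name][0] for name in subset)
            saving = sum(blocks[name][1] for name in subset)
            # Keep the subset only if it fits and beats the best one so far.
            if size <= ram_bytes and saving > best_saving:
                best_set, best_saving = frozenset(subset), saving
    return best_set, best_saving


# Hypothetical basic blocks: (size in bytes, estimated energy saving in microjoules).
example_blocks = {
    "loop_body": (64, 9.0),
    "init": (128, 1.0),
    "filter": (96, 7.5),
    "error_path": (32, 0.2),
}

chosen, total_saving = select_blocks(example_blocks, ram_bytes=160)
print(sorted(chosen), total_saving)
```

A real formulation would also account for the cost of branching between flash and RAM, which is one reason a solver with an explicit cost model is the more natural tool.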

An Energy and Performance Exploration of Network-on-Chip Architectures

Author: Arnab Banerjee
Gerard J. M. Smit
Pascal T. Wolkotte
Robert D. Mullins
Simon W. Moore
Publication venue: IEEE Circuits and Systems Society
Publication date: 01/01/2009

In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router and a speculative virtual channel router, and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle-state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router was shown to have a very similar efficiency to the wormhole router while providing better performance, supporting its use for general-purpose designs. Finally, area metrics are also presented to allow a comparison of implementation costs.

CiteSeerX

University of Twente Research Information
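
The energy-latency product mentioned in the network-on-chip abstract above can be sketched as a simple figure of merit, where a lower product means a more efficient design. The design names and numbers below are hypothetical placeholders, not results from the paper.

```python
def energy_latency_product(energy_per_packet_j, latency_s):
    """Figure of merit: energy per packet multiplied by packet latency (lower is better)."""
    return energy_per_packet_j * latency_s


# Hypothetical designs: (energy per packet in joules, average latency in seconds).
designs = {
    "design_a": (2.0e-9, 40e-9),
    "design_b": (2.5e-9, 30e-9),
}

# Rank designs from most to least efficient under this metric.
ranked = sorted(designs, key=lambda name: energy_latency_product(*designs[name]))

for name in ranked:
    print(name, energy_latency_product(*designs[name]))
```

In this made-up case the design that spends more energy per packet still ranks first, because its lower latency outweighs the extra energy.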

Energy Consumption Saving in Embedded Microprocessors Using Hardware Accelerators

Author: Cardarilli Gian Carlo
Fazzolari Rocco
Nunzio Luca Di
Re Marco
Silvestri Francesca
Spanò Sergio
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/06/2018

This paper deals with the reduction of power consumption in embedded microprocessors. Computing power and energy efficiency are becoming the main challenges for embedded system applications. This is particularly the case for wearable systems. When the power supply is provided by batteries, an important requirement for these systems is a long service life. This work investigates a method for reducing microprocessor energy consumption based on the use of hardware accelerators. Their use reduces the execution time and allows the clock frequency to be lowered, thereby reducing the power consumption. To provide experimental results, the authors analyze a case study in the field of wearable devices for the processing of ECG signals. The experimental results show that the use of a hardware accelerator significantly reduces the power consumption.

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System
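
The reasoning in the hardware-accelerator abstract above (fewer cycles, so a lower clock, so less power) can be sketched with the standard CMOS dynamic power relation P = C·V²·f. The cycle counts, deadline, effective capacitance, and supply voltage below are hypothetical, and the sketch ignores static power and the accelerator's own consumption.

```python
def required_clock_hz(cycles, deadline_s):
    """Lowest clock frequency that finishes the given cycle count within the deadline."""
    return cycles / deadline_s


def dynamic_power_w(c_eff_f, vdd_v, clock_hz):
    """Dynamic power of CMOS logic: effective switched capacitance * Vdd^2 * frequency."""
    return c_eff_f * vdd_v ** 2 * clock_hz


DEADLINE_S = 0.010   # assumed processing deadline per data window
C_EFF_F = 1.0e-10    # assumed effective switched capacitance in farads
VDD_V = 1.2          # assumed supply voltage, held constant here

# Hypothetical cycle counts for the same task in software and with an accelerator.
software_clock = required_clock_hz(2_000_000, DEADLINE_S)
accel_clock = required_clock_hz(500_000, DEADLINE_S)

software_power = dynamic_power_w(C_EFF_F, VDD_V, software_clock)
accel_power = dynamic_power_w(C_EFF_F, VDD_V, accel_clock)

print(f"software: {software_clock / 1e6:.0f} MHz, {software_power * 1e3:.2f} mW")
print(f"accelerated: {accel_clock / 1e6:.0f} MHz, {accel_power * 1e3:.2f} mW")
```

With the supply voltage fixed, power scales linearly with the clock; lowering the voltage along with the frequency would reduce it further.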

Modeling and Energy Optimization of LDPC Decoder Circuits with Timing Violations

Author: Gross Warren J.
Kschischang Frank R.
Leduc-Primeau François
Publication date: 17/11/2017

This paper proposes a "quasi-synchronous" design approach for signal processing circuits, in which timing violations are permitted, but without the need for a hardware compensation mechanism. The case of a low-density parity-check (LDPC) decoder is studied, and a method for accurately modeling the effect of timing violations at a high level of abstraction is presented. The error-correction performance of code ensembles is then evaluated using density evolution while taking into account the effect of timing faults. Following this, several quasi-synchronous LDPC decoder circuits based on the offset min-sum algorithm are optimized, providing a 23%-40% reduction in energy consumption or energy-delay product, while achieving the same performance and occupying the same area as conventional synchronous circuits.

Comment: To appear in IEEE Transactions on Communications

arXiv.org e-Print Archive

HAL-Université de Bretagne Occidentale

PolyPublie

An Early-Stage Statement-Level Metric for Energy Characterization of Embedded Processors

Author: Luigi Pomante
Paolo Giammatteo
Vincenzo Stoico
Vittoriano Muttillo
Publication date: 01/01/2020

This work presents an early-stage statement-level metric for energy characterization of embedded processors. Its definition and a framework for evaluating it are provided. In particular, the metric is based on an existing assembly-level analysis and some profiling activities performed on a given C benchmark, and it is related to the average energy consumption of a generic C statement for a given target processor. Its evaluation is performed with a one-time effort and, once available, it can be used to rapidly estimate the energy consumption of a given C function for all the considered processors. Two reference embedded processors are then considered in order to show an example of usage of the proposed metric and framework.

VU Research Portal

Archivio della Ricerca - Università degli Studi di Teramo

Open Access Repository
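
The statement-level metric in the last abstract above can be sketched as an average energy per executed C statement, obtained once from a profiled benchmark and then reused to estimate other functions. The function names and all numbers below are hypothetical; the paper's actual framework derives the metric from an assembly-level analysis that is not reproduced here.

```python
def energy_per_statement(measured_energy_j, executed_statements):
    """One-time characterization: average energy of a generic executed C statement."""
    if executed_statements <= 0:
        raise ValueError("executed_statements must be positive")
    return measured_energy_j / executed_statements


def estimate_function_energy(metric_j, executed_statements):
    """Early-stage estimate: executed statement count times the per-statement metric."""
    return metric_j * executed_statements


# Hypothetical benchmark profile for one target processor.
metric = energy_per_statement(measured_energy_j=3.0e-3, executed_statements=1_500_000)

# Hypothetical C function that executes 40,000 statements for a given input.
estimate = estimate_function_energy(metric, executed_statements=40_000)

print(f"metric: {metric * 1e9:.2f} nJ/statement, estimate: {estimate * 1e6:.1f} uJ")
```

Because the characterization is done once per processor, comparing candidate processors reduces to multiplying the same statement count by each processor's metric.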