Search CORE

1,977 research outputs found

Dynamic Energy Management for Chip Multi-processors under Performance Constraints

Author: Ababei Cristinel
Moghaddam Milad Ghorbani
Publication venue: e-Publications@Marquette
Publication date: 01/10/2017
Field of study

We introduce a novel algorithm for dynamic energy management (DEM) under performance constraints in chip multi-processors (CMPs). Using the novel concept of delayed instructions count, performance loss estimations are calculated at the end of each control period for each core. In addition, a Kalman filtering based approach is employed to predict workload in the next control period for which voltage-frequency pairs must be selected. This selection is done with a novel dynamic voltage and frequency scaling (DVFS) algorithm whose objective is to reduce energy consumption but without degrading performance beyond the user set threshold. Using our customized Sniper based CMP system simulation framework, we demonstrate the effectiveness of the proposed algorithm for a variety of benchmarks for 16 core and 64 core network-on-chip based CMP architectures. Simulation results show consistent energy savings across the board. We present our work as an investigation of the tradeoff between the achievable energy reduction via DVFS when predictions are done using the effective Kalman filter for different performance penalty thresholds

epublications@Marquette

Investigation of LSTM Based Prediction for Dynamic Energy Management in Chip Multiprocessors

Author: Ababei Cristinel
Guan Wenkai
Moghaddam Milad Ghorbani
Publication venue: e-Publications@Marquette
Publication date: 26/03/2018
Field of study

In this paper, we investigate the effectiveness of using long short-term memory (LSTM) instead of Kalman filtering to do prediction for the purpose of constructing dynamic energy management (DEM) algorithms in chip multi-processors (CMPs). Either of the two prediction methods is employed to estimate the workload in the next control period for each of the processor cores. These estimates are then used to select voltage-frequency (VF) pairs for each core of the CMP during the next control period as part of a dynamic voltage and frequency scaling (DVFS) technique. The objective of the DVFS technique is to reduce energy consumption under performance constraints that are set by the user. We conduct our investigation using a custom Sniper system simulation framework. Simulation results for 16 and 64 core network-on-chip based CMP architectures and using several benchmarks demonstrate that the LSTM is slightly better than Kalman filtering

epublications@Marquette

Investigation of LSTM Based Prediction for Dynamic Energy Management in Chip Multiprocessors

Author: Moghaddam Milad Ghorbani
Guan Wenkai
Ababei Cristinel
Publication venue: e-Publications@Marquette
Publication date: 26/03/2018
Field of study

epublications@Marquette

Saint Louis University Libraries Digital Collections

Recommended from our members

A RISC-V Vector Processor With Simultaneous-Switching Switched-Capacitor DC-DC Converters in 28 nm FDSOI

Author: Alon E
Asanović K
Avizienis R
Bailey S
Blagojević M
Chen PH
Chiu PF
Flatresse P
Jevtić R
Keller B
Kwak J
Le HP
Lee Y
Nikolić B
Puggelli A
Richards B
Sutardja N
Waterman A
Zimmer B
Publication venue: eScholarship, University of California
Publication date: 01/04/2016
Field of study

This work demonstrates a RISC-V vector microprocessor implemented in 28 nm FDSOI with fully integrated simultaneous-switching switched-capacitor DC-DC (SC DC-DC) converters and adaptive clocking that generates four on-chip voltages between 0.45 and 1 V using only 1.0 V core and 1.8 V IO voltage inputs. The converters achieve high efficiency at the system level by switching simultaneously to avoid charge-sharing losses and by using an adaptive clock to maximize performance for the resulting voltage ripple. Details about the implementation of the DC-DC switches, DC-DC controller, and adaptive clock are provided, and the sources of conversion loss are analyzed based on measured results. This system pushes the capabilities of dynamic voltage scaling by enabling fast transitions (20 ns), simple packaging (no off-chip passives), low area overhead (16%), high conversion efficiency (80%-86%), and high energy efficiency (26.2 DP GFLOPS/W) for mobile devices

eScholarship - University of California

RUNTIME METHODS TO IMPROVE ENERGY EFFICIENCY IN SUPERCOMPUTING APPLICATIONS

Author: Bhalachandra Sridutt
Publication venue: University of North Carolina at Chapel Hill Graduate School
Publication date: 01/01/2018
Field of study

Energy efficiency in supercomputing is critical to limit operating costs and carbon footprints. While the energy efficiency of future supercomputing centers needs to improve at all levels, the energy consumed by the processing units is a large fraction of the total energy consumed by High Performance Computing (HPC) systems. HPC applications use a parallel programming paradigm like the Message Passing Interface (MPI) to coordinate computation and communication among thousands of processors. With dynamically-changing factors both in hardware and software affecting energy usage of processors, there exists a need for power monitoring and regulation at runtime to achieve savings in energy. This dissertation highlights an adaptive runtime framework that enables processors with core-specific power control by dynamically adapting to workload characteristics to reduce power with little or no performance impact. Two opportunities to improve the energy efficiency of processors running MPI applications are identified - computational workload imbalance and waiting on memory. Monitoring of performance and power regulation is performed by the framework transparently within the MPI runtime system, eliminating the need for code changes to MPI applications. The effect of enforcing power limits (capping) on processors is also investigated. Experiments on 32 nodes (1024 cores) show that in presence of workload imbalance, the runtime reduces Central Processing Unit (CPU) frequency on cores not on the critical path, thereby reducing power and hence energy usage without deteriorating performance. Using this runtime, six MPI mini-applications and a full MPI application show an overall 20% decrease in energy use with less than 1% increase in execution time. In addition, the lowering of frequency on non-critical cores reduces run-to-run performance variation and improves performance. For the full application, an average speedup of 11% is seen, while the power is lowered by about 31% for an energy savings of up to 42%. Another experiment on 16 nodes (256 cores) that are power capped also shows performance improvement along with power reduction. Thus, energy optimization can also be a performance optimization. For applications that are limited by memory access times, memory metrics identified facilitate lowering of power by up to 32% without adversely impacting performance.Doctor of Philosoph

Carolina Digital Repository

A Unified Infrastructure for Monitoring and Tuning the Energy Efficiency of HPC Applications

Author: Schöne Robert
Publication venue
Publication date: 19/09/2017
Field of study

High Performance Computing (HPC) has become an indispensable tool for the scientific community to perform simulations on models whose complexity would exceed the limits of a standard computer. An unfortunate trend concerning HPC systems is that their power consumption under high-demanding workloads increases. To counter this trend, hardware vendors have implemented power saving mechanisms in recent years, which has increased the variability in power demands of single nodes. These capabilities provide an opportunity to increase the energy efficiency of HPC applications. To utilize these hardware power saving mechanisms efficiently, their overhead must be analyzed. Furthermore, applications have to be examined for performance and energy efficiency issues, which can give hints for optimizations. This requires an infrastructure that is able to capture both, performance and power consumption information concurrently. The mechanisms that such an infrastructure would inherently support could further be used to implement a tool that is able to do both, measuring and tuning of energy efficiency. This thesis targets all steps in this process by making the following contributions: First, I provide a broad overview on different related fields. I list common performance measurement tools, power measurement infrastructures, hardware power saving capabilities, and tuning tools. Second, I lay out a model that can be used to define and describe energy efficiency tuning on program region scale. This model includes hardware and software dependent parameters. Hardware parameters include the runtime overhead and delay for switching power saving mechanisms as well as a contemplation of their scopes and the possible influence on application performance. Thus, in a third step, I present methods to evaluate common power saving mechanisms and list findings for different x86 processors. Software parameters include their performance and power consumption characteristics as well as the influence of power-saving mechanisms on these. To capture software parameters, an infrastructure for measuring performance and power consumption is necessary. With minor additions, the same infrastructure can later be used to tune software and hardware parameters. Thus, I lay out the structure for such an infrastructure and describe common components that are required for measuring and tuning. Based on that, I implement adequate interfaces that extend the functionality of contemporary performance measurement tools. Furthermore, I use these interfaces to conflate performance and power measurements and further process the gathered information for tuning. I conclude this work by demonstrating that the infrastructure can be used to manipulate power-saving mechanisms of contemporary x86 processors and increase the energy efficiency of HPC applications

Technische Universität Dresden: Qucosa

Processor evaluation for low power frequency converter product family

Author: Kukkonen Lauri
Publication venue: Aalto-yliopisto
Publication date: 01/01/2010
Field of study

Tässä työssä tutkitaan markkinoilla olevia tai lähitulevaisuudessa markkinoille saapuvia prosessoreja käytettäväksi pienitehoisissa taajuusmuuttajissa. Tutkimuksen tarkoitus on selvittää prosessorin sopivuutta sovellukseen, jossa hinta on merkittävä tekijä. Tutkimuksessa esitettyjen vaatimusten perusteella houkuttelevimmat prosessorit otetaan tarkempaan tutkimukseen. Tarkemman selvityksen jälkeen vaatimuksia teknisesti mahdollisimman tarkasti vastaavat prosessorit pyydettiin valmistajalta testattavaksi. Testaaminen suoritettiin lopulta viidelle eri prosessorille, joista kaksi perustui samaan ytimeen. Testaamisen tavoitteena on selvittää prosessorin sopivuus käyttökohteeseensa. Sopivuus testattiin suorittamalla prosessoreissa taajuusmuuttajakäyttöä mallintavaa testikoodia. Tuloksina testikoodin ajamisesta saatiin tietyissä aliohjelmissa kulutettu aika sekä kulutetut kellosyklit. Suorituskyvyn lisäksi testaukseen kuului prosessorikohtaisen kääntäjän aikaansaaman koodin koko. Aliohjelmat sisälsivät sekä aritmeettisia, että loogisia operaatioita, joiden kombinaationa mahdollisimman hyvä sopivuus saatiin selvitettyä.The aim of this thesis is to study processors to be used in a low power frequency converter. Processors under investigation must be currently or in the near future in the market. The purpose is to examine suitability of a processor to an application in which price is an essential factor. The requirements presented in this study will determine which processor will be reviewed more closely. After a precise review, processor vendors was asked to provide as corresponding device as possible to a test. Testing was accomplished eventually with five different processors of which two were based on a same core. The aim of the testing was to investigate suitability of the processors to their target task. Suitability was tested by executing code that models frequency converter application. As a result, spent time and clock cycles are presented in certain functions. In addition to performance, the testing included evaluation of the size of the output code the compilers created. Functions under test consisted of a combination of arithmetic and logic operations that was used to interpret the suitability of the processor

Aaltodoc Publication Archive

Recommended from our members

High efficiency smart voltage regulating module for green mobile computing

Author: Tapou Monaf Sabri
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2014
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.In this thesis a design for a smart high efficiency voltage regulating module capable of supplying the core of modern microprocessors incorporating dynamic voltage and frequency scaling (DVS) capability is accomplished using a RISC based microcontroller to facilitate all the functions required to control, protect, and supply the core with the required variable operating voltage as set by the DVS management system. Normally voltage regulating modules provide maximum power efficiency at designed peak load, and the efficiency falls off as the load moves towards lesser values. A mathematical model has been derived for the main converter and small signal analysis has been performed in order to determine system operation stability and select a control scheme that would improve converter operation response to transients and not requiring intense computational power to realize. A Simulation model was built using Matlab/Simulink and after experimenting with tuned PID controller and fuzzy logic controllers, a simple fuzzy logic control scheme was selected to control the pulse width modulated converter and several methods were devised to reduce the requirements for computational power making the whole system operation realizable using a low power RISC based microcontroller. The same microcontroller provides circuit adaptations operation in addition to providing protection to load in terms of over voltage and over current protection. A novel circuit technique and operation control scheme enables the designed module to selectively change some of the circuit elements in the main pulse width modulated buck converter so as to improve efficiency over a wider range of loads. In case of very light loads as the case when the device goes into standby, sleep or hibernation mode, a secondary converter starts operating and the main converter stops. The secondary converter adapts a different operation scheme using switched capacitor technique which provides high efficiency at low load currents. A fuzzy logic control scheme was chosen for the main converter for its lighter computational power requirement promoting implementation using ultra low power embedded controllers. Passive and active components were carefully selected to augment operational efficiency. These aspects enabled the designed voltage regulating module to operate with efficiency improvement in off peak load region in the range of 3% to 5%. At low loads as the case when the computer system goes to standby or sleep mode, the efficiency improvent is better than 13% which will have noticeable contribution in extending battery run time thus contributing to lowering the carbon footprint of human consumption

Brunel University Research Archive