Search CORE

6 research outputs found

queue: Customized large-scale clock frequency scaling

Author: Allan Snavely
Ananta Tiwari
Joshua Peraza
Laura Carrington
Michael Laurenzano
Publication venue
Publication date: 01/01/2012
Field of study

Abstract-We examine the scalability of a set of techniques related to Dynamic Voltage-Frequency Scaling (DVFS) on HPC systems to reduce the energy consumption of scientific applications through an application-aware analysis and runtime framework, Green Queue. Green Queue supports making CPU clock frequency changes in response to intra-node and internode observations about application behavior. Our intra-node approach reduces CPU clock frequencies and therefore power consumption while CPUs lacks computational work due to inefficient data movement. Our inter-node approach reduces clock frequencies for MPI ranks that lack computational work. We investigate these techniques on a set of large scientific applications on 1024 cores of Gordon, an Intel Sandybridgebased supercomputer at the San Diego Supercomputer Center. Our optimal intra-node technique showed an average measured energy savings of 10.6% and a maximum of 21.0% over regular application runs. Our optimal inter-node technique showed an average 17.4% and a maximum of 31.7% energy savings

CiteSeerX

Enabling fair pricing on HPC systems with node sharing

Author: Blagodurov S.
Breslow A. D.
Garey M. R.
Lin Z.
Wang H.
Xu C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/04/2013
Field of study

Abstract not provide

Crossref

UNT Digital Library

Selective Core Boosting: The Return of the Turbo Button

Author: Dice Dave
Diestelhorst Stephan
Felber Pascal
Fetzer Christof
Marlier Patrick
Wamhoff Jons-Tobias
Publication venue: Technische Universität Dresden
Publication date: 26/11/2013
Field of study

Several modern multi-core architectures support the dynamic control of the CPU's clock rate, allowing processor cores to temporarily operate at speeds exceeding the operational base frequency. Conversely, cores can operate at a lower speed or be disabled altogether to save power. Such facilities are notably provided by Intel's Turbo Boost and AMD's Turbo CORE technologies. Frequency control is typically driven by the operating system which requests changes to the performance state of the processor based on the current load of the system. In this paper, we investigate the use of dynamic frequency scaling from user space to speed up multi-threaded applications that must occasionally execute time-critical tasks or to solve problems that have heterogeneous computing requirements. We propose a general-purpose library that allows selective control of the frequency of the cores - subject to the limitations of the target architecture. We analyze the performance trade-offs and illustrate its benefits using several benchmarks and real-world workloads when temporarily boosting selected cores executing time-critical operations. While our study primarily focuses on AMD's architecture, we also provide a comparative evaluation of the features, limitations, and runtime overheads of both Turbo Boost and Turbo CORE technologies. Our results show that we can successful exploit these new hardware facilities to accelerate the execution of key sections of code (critical paths) improving overall performance of some multi-threaded applications. Unlike prior research, we focus on performance instead of power conservation. Our results further can give guidelines for the design of hardware power management facilities and the operating system interfaces to those facilities

Technische Universität Dresden: Qucosa

Application aware performance, power consumption, and reliability tradeoff

Author: Gorti Naga Pavan Kumar
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2014
Field of study

There has been an unprecedented increase in the drive for microprocessor performance. This drive is motivated by the increase in software complexity, opportunity to solve previously unattempted problems especially in scientific domain, and a need to crunch the ever growing `Big Data\u27 to enable a multitude of technological advances to benefit mankind. A consequence of these phenomena is the ever increasing transistor count in deployed computing systems. Although technology scaling leads to lower power consumption per transistor, the overall system level power consumption is on the rise. This leads to a variety of power supply related issues. As the chip die area is not increasing significantly, and the supply voltage reduction is not keeping on par with the reduction in device dimensions, an increase in power density is observed. This manifests as an increased temperature profile on the chip floorplan. A rise in temperature necessitates aggressive and costly cooling mechanisms adding to the design complexity and manufacturing efforts. It also triggers various failure mechanisms leading to reduction in the expected chip lifetime/reliability. Given the conflicting trends in Performance, Power consumption, and chip Reliability (PPR), it is imperative to balance them in a fine-grained fashion to meet system level goals and expectations. Sole dependence on the advancements in manufacturing technology is no longer sufficient. Alternate venues for PPR management are being increasingly paid attention to. On the other hand, the PPR demands are usually time dependent. For example, the constraint on power consumption in a green data center is dictated by the energy reserve. The demand on performance in a cloud based platform depends on the agreed Quality of Service (QOS) requirements. The reliability of a microprocessor is dependent on the deployment domain. The goal of our research is to address the issue of growing microprocessor power consumption subject to performance and/or reliability goals. Through our developed schemes, we tailor the execution context to match application requirements. This leads to judicious use of power while adhering to aforementioned constraints. It is to be noted that the actual demands on performance, power consumption, and reliability are highly variant, and depend upon executing applications and operating conditions. As such, we develop schemes to cater to these variant demands. To meet these demands efficiently, the solutions developed are tailored to current hardware-software interaction characteristics. Two techniques that are very relevant in this area, namely dynamic voltage and frequency scaling (DVFS) and microarchitectural adaptation, are leveraged to produce expected PPR characteristics when executing a wide variety of tasks. In this dissertation, we demonstrate how the expected chip lifetime can be augmented in a real-time setting using DVFS while paying heed to performance constraints modeled as QoS requirements. Individual tasks in a task queue are assigned specific voltage and frequency pairs to utilize for their execution. This assignment is empowered by knowledge of application-wise hardware-software interactions to reach solutions that are tailored to the current execution scenario. Our observations indicate that a 2 to 18 fold improvement in chip lifetime can be expected by the utilization of the schemes we develop in this regard. Capitalizing on the power of microarchitectural adaptation, we further improve chip lifetime expectations 2-8 times, based on the failure mechanism investigated. This increase in expected chip lifetime directly translates to reduction of both operational and replacement costs. We also provide mechanisms to co-manage performance and power consumption constraints. Comprehensive microarchitectural adaptation space is very complex and its usage thus leads to significant runtime overhead. To tackle this, we devote a fair bit of attention to its pruning so as to narrow down on and utilize only the most effective adaptations. A two stage adaptation process is provided to a) improve optimality of the solutions delivered, and b) to keep the runtime overhead in check. We observe that our schemes provide 20\% higher normalized energy efficiency compared to the state of the art techniques proposed, while using just a very small fraction of the configuration space. We also find that our schemes effectively cater to a wide variety of demands on performance and power consumption, providing the necessary hardware characteristics within 10\% bound. Since only the most useful configuration space is retained for adaptation, occurrence of a fault that prohibits the usage of a certain adaptive control can lead to the inability to satisfy a subset of hardware demands. A detailed analysis has been carried out to understand how the remaining active configurations can preserve the expected hardware behavior. To a good extent, we observe that the system behavior under a failure closely tracks (with less than 5\% tracking error) the obtainable behavior without the presence of the fault. We believe that application tailored schemes for PPR management become increasingly relevant as the microprocessor design advancements saturate in the future. They will be extremely relevant to extract every possible ounce of performance while confirming to constraints on power consumption and reliability. Given the effectiveness of our schemes, we are confident that such schemes are applicable in different markets like embedded computing, desktop computing, cloud platforms and high performance computing. Insights drawn from our research will guide chip designers in the provision of effective adaptive controls to combat increasing demands on PPR characteristics

Digital Repository @ Iowa State University (ISU)

XSEDE: eXtreme Science and Engineering Discovery Environment Third Quarter 2012 Report

Author
Publication venue
Publication date: 30/09/2012
Field of study

The Extreme Science and Engineering Discovery Environment (XSEDE) is the most advanced, powerful, and robust collection of integrated digital resources and services in the world. It is an integrated cyberinfrastructure ecosystem with singular interfaces for allocations, support, and other key services that researchers can use to interactively share computing resources, data, and expertise.This a report of project activities and highlights from the third quarter of 2012.National Science Foundation, OCI-105357

Illinois Digital Environment for Access to Learning and Scholarship Repository