Multiprocessor Global Scheduling on Frame-Based DVFS Systems
In this ongoing work, we are interested in multiprocessor energy efficient
systems, where task durations are not known in advance, but are known
stochastically. More precisely, we consider global scheduling algorithms for
frame-based multiprocessor stochastic DVFS (Dynamic Voltage and Frequency
Scaling) systems. Moreover, we consider processors with a discrete set of
available frequencies.
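As an illustration of frequency selection over a discrete set when durations are only known as a distribution, here is a minimal sketch. It is not the authors' algorithm: the cycle-count distribution, the frequency list, the frame length, and the names `miss_probability` and `lowest_frequency` are all hypothetical, and a single task on a single processor is assumed.

```python
def miss_probability(cycle_dist, freq_hz, frame_s):
    """Probability that the task overruns the frame at the given frequency.

    cycle_dist maps a cycle count to its probability (assumed to sum to 1).
    """
    return sum(p for cycles, p in cycle_dist.items() if cycles / freq_hz > frame_s)


def lowest_frequency(cycle_dist, freqs_hz, frame_s, max_miss=0.0):
    """Return the slowest available frequency whose miss probability is acceptable."""
    for f in sorted(freqs_hz):
        if miss_probability(cycle_dist, f, frame_s) <= max_miss:
            return f
    return None  # no available frequency satisfies the bound


# Hypothetical platform and workload, chosen only for illustration.
FREQS_HZ = [200_000_000, 400_000_000, 600_000_000, 800_000_000, 1_000_000_000]
CYCLE_DIST = {8_000_000: 0.5, 18_000_000: 0.4, 35_000_000: 0.1}
FRAME_S = 0.05

hard = lowest_frequency(CYCLE_DIST, FREQS_HZ, FRAME_S)                # no miss allowed
soft = lowest_frequency(CYCLE_DIST, FREQS_HZ, FRAME_S, max_miss=0.1)  # tolerate 10% misses
print(hard, soft)
```

With a zero miss bound the choice is driven by the largest cycle count in the distribution; relaxing the bound lets the sketch pick a slower frequency.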
MORA: an Energy-Aware Slack Reclamation Scheme for Scheduling Sporadic Real-Time Tasks upon Multiprocessor Platforms
In this paper, we address the global and preemptive energy-aware scheduling
problem of sporadic constrained-deadline tasks on DVFS-identical multiprocessor
platforms. We propose an online slack reclamation scheme which profits from the
discrepancy between the worst- and actual-case execution time of the tasks by
slowing down the speed of the processors in order to save energy. Our algorithm
called MORA takes into account the application-specific consumption profile of
the tasks. We demonstrate that MORA does not jeopardize the system
schedulability and we show by performing simulations that it can save up to 32%
of energy (on average) compared to execution without using any energy-aware
algorithm.
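The general idea behind slack reclamation can be sketched as follows. This is a generic single-job illustration, not MORA itself: speeds are normalized so that 1.0 is full speed, and the function name `reclaimed_speed` and the `min_speed` parameter are assumptions made for the example.

```python
def reclaimed_speed(wcet, slack, min_speed=0.0):
    """Normalized speed at which a job's worst-case work still fits in wcet + slack.

    wcet:  worst-case execution time at full speed (1.0)
    slack: time left unused by earlier jobs that finished before their WCET
    """
    if wcet <= 0 or slack < 0:
        raise ValueError("wcet must be positive and slack non-negative")
    # Stretching wcet units of work over wcet + slack units of time
    # still meets the original completion bound in the worst case.
    return max(min_speed, wcet / (wcet + slack))


# A previous job budgeted 6 time units but finished after 2, leaving 4 units of slack.
slack = 6.0 - 2.0
print(reclaimed_speed(4.0, slack))                 # next job can run at half speed
print(reclaimed_speed(2.0, 6.0, min_speed=0.4))    # clamped to the slowest available speed
```

Because the stretched job is sized for its worst case, slowing it down does not delay anything beyond the time that was already budgeted.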
Green computing: power optimisation of VFI-based real-time multiprocessor dataflow applications (extended version)
Execution time is no longer the only performance metric for computer systems. In fact, a trend is emerging to trade raw performance for energy savings. Techniques like Dynamic Power Management (DPM, switching to a low-power state) and Dynamic Voltage and Frequency Scaling (DVFS, throttling processor frequency) help modern systems to reduce their power consumption while adhering to performance requirements. To balance flexibility and design complexity, the concept of Voltage and Frequency Islands (VFIs) was recently introduced for power optimisation. It achieves fine-grained system-level power management by operating all processors in the same VFI at a common frequency/voltage. This paper presents a novel approach to compute a power management strategy combining DPM and DVFS. In our approach, applications (modelled in full synchronous dataflow, SDF) are mapped on heterogeneous multiprocessor platforms (partitioned in voltage and frequency islands). We compute an energy-optimal schedule, meeting minimal throughput requirements. We demonstrate that the combination of DPM and DVFS provides an energy reduction beyond considering DVFS or DPM separately. Moreover, we show that by clustering processors in VFIs, DPM can be combined with any granularity of DVFS. Our approach uses model checking, by encoding the optimisation problem as a query over priced timed automata. The model checker Uppaal Cora extracts a cost-minimal trace, representing a power-minimal schedule. We illustrate our approach with several case studies on commercially available hardware.
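Why combining DPM and DVFS can beat either technique alone can be seen in a small worked calculation. The sketch below is not the paper's priced-timed-automata model: it assumes a single processor, a hypothetical power model (static power plus a cubic dynamic term), normalized frequencies, and a sleep state entered for the idle remainder of each period. All constants are illustrative.

```python
def energy(freq, work, period, k=1.0, p_static=1.0, p_sleep=0.05):
    """Energy for one period: run `work` at `freq`, then sleep until the period ends.

    Assumed power model while busy: p_static + k * freq**3.
    Returns infinity if the work does not fit in the period.
    """
    busy = work / freq
    if busy > period:
        return float("inf")
    return (p_static + k * freq ** 3) * busy + p_sleep * (period - busy)


def best_frequency(freqs, work, period, **model):
    """Pick the frequency with the lowest energy for the period."""
    return min(freqs, key=lambda f: energy(f, work, period, **model))


FREQS = [0.25, 0.5, 0.75, 1.0]   # hypothetical normalized frequency levels
WORK, PERIOD = 0.4, 1.0          # 0.4 time units of work at full speed per period

for f in FREQS:
    print(f, energy(f, WORK, PERIOD))
print("best:", best_frequency(FREQS, WORK, PERIOD))
```

Under these assumed constants the slowest feasible frequency pays too much static power, while running at full speed and sleeping pays too much dynamic power, so an intermediate frequency followed by sleep uses the least energy.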
A Survey of Fault-Tolerance Techniques for Embedded Systems from the Perspective of Power, Energy, and Thermal Issues
The relentless technology scaling has provided a significant increase in processor performance, but on the other hand, it has led to adverse impacts on system reliability. In particular, technology scaling increases the processor susceptibility to radiation-induced transient faults. Moreover, technology scaling with the discontinuation of Dennard scaling increases the power densities, and thereby temperatures, on the chip. High temperature, in turn, accelerates transistor aging mechanisms, which may ultimately lead to permanent faults on the chip. To assure a reliable system operation, despite these potential reliability concerns, fault-tolerance techniques have emerged. Specifically, fault-tolerance techniques employ some form of redundancy to satisfy specific reliability requirements. However, the integration of fault-tolerance techniques into real-time embedded systems complicates preserving timing constraints. As a remedy, many task mapping/scheduling policies have been proposed to consider the integration of fault-tolerance techniques and enforce both timing and reliability guarantees for real-time embedded systems. More advanced techniques aim additionally at minimizing power and energy while at the same time satisfying timing and reliability constraints. Recently, some scheduling techniques have started to tackle a new challenge, which is the temperature increase induced by employing fault-tolerance techniques. These emerging techniques aim at satisfying temperature constraints besides timing and reliability constraints. This paper provides an in-depth survey of the emerging research efforts that exploit fault-tolerance techniques while considering timing, power/energy, and temperature from the real-time embedded systems’ design perspective. In particular, the task mapping/scheduling policies for fault-tolerant real-time embedded systems are reviewed and classified according to their considered goals and constraints.
Moreover, the employed fault-tolerance techniques, application models, and hardware models are considered as additional dimensions of the presented classification. Lastly, this survey gives deep insights into the main achievements and shortcomings of the existing approaches and highlights the most promising ones.
Markov Decision Process Based Energy-Efficient On-Line Scheduling for Slice-Parallel Video Decoders on Multicore Systems
We consider the problem of energy-efficient on-line scheduling for
slice-parallel video decoders on multicore systems. We assume that each of the
processors is Dynamic Voltage Frequency Scaling (DVFS) enabled such that it
can independently trade off performance for power, while taking the video
decoding workload into account. In the past, scheduling and DVFS policies in
multi-core systems have been formulated heuristically due to the inherent
complexity of the on-line multicore scheduling problem. The key contribution of
this report is that we rigorously formulate the problem as a Markov decision
process (MDP), which simultaneously takes into account the on-line scheduling
and per-core DVFS capabilities; the power consumption of the processor cores
and caches; and the loss-tolerant and dynamic nature of the video decoder's
traffic. In particular, we model the video traffic using a Directed Acyclic Graph
(DAG) to capture the precedence constraints among frames in a Group of Pictures
(GOP) structure, while also accounting for the fact that frames have different
display/decoding deadlines and non-deterministic decoding complexities. The
objective of the MDP is to minimize long-term power consumption subject to a
minimum Quality of Service (QoS) constraint related to the decoder's
throughput. Although MDPs notoriously suffer from the curse of dimensionality,
we show that, with appropriate simplifications and approximations, the
complexity of the MDP can be mitigated. We implement a slice-parallel version
of H.264 on a multiprocessor ARM (MPARM) virtual platform simulator, which
provides cycle-accurate and bus signal-accurate simulation for different
processors. We use this platform to generate realistic video decoding traces
with which we evaluate the proposed on-line scheduling algorithm in MATLAB.
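To make the MDP formulation concrete, here is a minimal value-iteration sketch on a toy model. It is far simpler than the report's formulation and is not taken from it: the state is just a decoder backlog, the two actions stand in for two frequency levels, the objective is a discounted cost rather than a constrained long-term average, and every constant and name (`ACTIONS`, `ARRIVAL_PROB`, `DROP_PENALTY`, and so on) is hypothetical.

```python
def q_value(s, a, values, transitions, cost, gamma):
    """Expected discounted cost of taking action a in state s, then following `values`."""
    return cost(s, a) + gamma * sum(p * values[s2] for s2, p in transitions(s, a))


def value_iteration(states, actions, transitions, cost, gamma=0.9, tol=1e-9):
    """Return (values, policy) minimising expected discounted cost.

    transitions(s, a) returns a list of (next_state, probability) pairs.
    """
    values = {s: 0.0 for s in states}
    while True:
        new_values = {
            s: min(q_value(s, a, values, transitions, cost, gamma) for a in actions)
            for s in states
        }
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break
    policy = {
        s: min(actions, key=lambda a: q_value(s, a, values, transitions, cost, gamma))
        for s in states
    }
    return values, policy


# Toy decoder model (all numbers are illustrative assumptions).
MAX_BACKLOG = 3
STATES = list(range(MAX_BACKLOG + 1))
ACTIONS = {"slow": (1, 1.0), "fast": (2, 4.0)}  # name -> (frames decoded per step, power)
ARRIVAL_PROB = 0.7                               # chance that one new frame arrives per step
DROP_PENALTY = 10.0                              # cost charged while the buffer is full


def transitions(backlog, action):
    served, _ = ACTIONS[action]
    remaining = max(0, backlog - served)
    return [(min(MAX_BACKLOG, remaining + 1), ARRIVAL_PROB),
            (remaining, 1.0 - ARRIVAL_PROB)]


def cost(backlog, action):
    _, power = ACTIONS[action]
    return power + (DROP_PENALTY if backlog == MAX_BACKLOG else 0.0)


values, policy = value_iteration(STATES, list(ACTIONS), transitions, cost)
print(policy)
```

Even this toy version shows where the curse of dimensionality comes from: adding per-core states, frame types, and deadlines multiplies the number of states that each sweep must visit.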
Cross-Stack Predictive Control Framework for Multicore Real-Time Applications
Many of the next-generation applications in entertainment, human-computer interaction, infrastructure, security and medical systems are computationally intensive, always-on, and have soft real-time (SRT) requirements. While failure to meet deadlines is not catastrophic in SRT systems, missing deadlines can result in an unacceptable degradation in the quality of service (QoS). To ensure acceptable QoS under dynamically changing operating conditions such as changes in the workload, energy availability, and thermal constraints, systems are typically designed for worst-case conditions. Unfortunately, such over-designing of systems increases costs and overall power consumption.
In this dissertation we formulate the real-time task execution as a Multiple-Input, Single-Output (MISO) optimal control problem involving tracking a desired system utilization set point with control inputs derived from across the computing stack. We assume that an arbitrary number of SRT tasks may join and leave the system at arbitrary times. The tasks are scheduled on multiple cores by a dynamic priority multiprocessor scheduling algorithm. We use a model predictive controller (MPC) to realize optimal control. MPCs are easy to tune and can handle multiple control variables as well as constraints on both the dependent and independent variables. We experimentally demonstrate the operation of our controller on a video encoder application and a computer vision application executing on a dual-socket quad-core Xeon processor with a total of 8 processing cores. We establish that the use of DVFS and application quality as control variables enables operation at a lower power operating point while meeting real-time constraints as compared to non-cross-stack control approaches. We also evaluate the role of scheduling algorithms in the control of homogeneous and heterogeneous workloads. Additionally, we propose a novel adaptive control technique for time-varying workloads.
PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications
Energy efficiency is a major concern in modern high-performance computing system design. In the past few years, there has been mounting evidence that power usage limits system scale and computing density, and thus, ultimately system performance. However, despite the impact of power and energy on the computer systems community, few studies provide insight into where and how power is consumed on high-performance systems and applications. In previous work, we designed a framework called PowerPack that was the first tool to isolate the power consumption of devices including disks, memory, NICs, and processors in a high-performance cluster and correlate these measurements to application functions. In this work, we extend our framework to support systems with multicore, multiprocessor-based nodes, and then provide in-depth analyses of the energy consumption of parallel applications on clusters of these systems. These analyses include the impacts of chip multiprocessing on power and energy efficiency, and its interaction with application executions. In addition, we use PowerPack to study the power dynamics and energy efficiencies of dynamic voltage and frequency scaling (DVFS) techniques on clusters. Our experiments reveal conclusively how intelligent DVFS scheduling can enhance system energy efficiency while maintaining performance.
A Survey of Prediction and Classification Techniques in Multicore Processor Systems
In multicore processor systems, being able to accurately predict the future provides new optimization opportunities, which otherwise could not be exploited. For example, an oracle able to predict a certain application's behavior running on a smartphone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling modes that would guarantee minimum levels of desired performance while saving energy and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction has evolved from simple forecasting to sophisticated machine-learning-based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that can span all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques on prediction and classification in the general context of computing systems with emphasis on multicore processors. The paper is far from comprehensive, but it will help the reader interested in employing prediction in optimization of multicore processor systems.
A Power-Efficient Methodology for Mapping Applications on Multi-Processor System-on-Chip Architectures
This work introduces an application mapping methodology and case study for multi-processor on-chip architectures. Starting from the description of an application in standard sequential code (e.g. in C), first the application is profiled, parallelized when possible, then its components are moved to hardware implementation when necessary to satisfy performance and power constraints. After mapping, with the use of hardware objects to handle concurrency, the application power consumption can be further optimized by a task-based scheduler for the remaining software part, without the need for operating system support. The key contributions of this work are: a methodology for high-level hardware/software partitioning that allows the designer to use the same code for both hardware and software models for simulation, providing nevertheless preliminary estimations for timing and power consumption; and a task-based scheduling algorithm that does not require operating system support. The methodology has been applied to the co-exploration of an industrial case study: an MPEG4 VGA real-time encoder.