59,287 research outputs found
ATMP: An Adaptive Tolerance-based Mixed-criticality Protocol for Multi-core Systems
© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted ncomponent of this work in other works.The challenge of mixed-criticality scheduling is to keep tasks of higher criticality running in case of resource shortages caused by faults. Traditionally, mixedcriticality scheduling has focused on methods to handle faults where tasks overrun their optimistic worst-case execution time (WCET) estimate. In this paper we present the Adaptive Tolerance based Mixed-criticality Protocol (ATMP), which generalises the concept of mixed-criticality scheduling to handle also faults of other nature, like failure of cores in a multi-core system. ATMP is an adaptation method triggered by resource shortage at runtime. The first step of ATMP is to re-partition the task to the available cores and the second step is to optimise the utility at each core using the tolerance-based real-time computing model (TRTCM). The evaluation shows that the utility optimisation of ATMP can achieve a smoother degradation of service compared to just abandoning tasks
An Evaluation of Adaptive Partitioning of Real-Time Workloads on Linux
This paper provides an open implementation and an experimental evaluation of an adaptive partitioning approach for scheduling real-time tasks on symmetric multicore systems. The proposed technique is based on combining partitioned EDF scheduling with an adaptive migration policy that moves tasks across processors only when strictly needed to respect their temporal constraints. The implementation of the technique within the Linux kernel, via modifications to the SCHED_DEADLINE code base, is presented. An extensive experimentation-has been conducted by applying the technique on a real multi-core platform with several randomly generated synthetic task sets. The obtained experimental results highlight that the approach exhibits a promising performance to schedule real-time workloads on a real system, with a greatly reduced number of migrations compared to the original global EDF available in SCHED_DEADLINE
Energy Efficient Task Mapping and Resource Management on Multi-core Architectures
Reducing energy consumption of parallel applications executing on chip multi- processors (CMPs) is important for green computing. Hardware vendors have been developing a variety of system features to support energy efficient computing, for example, integrating asymmetric core types on a single chip referred to as static asymmetry and supporting dynamic voltage and frequency scaling (DVFS) referred to as dynamic asymmetry.A common parallelization scheme to exploit CMPs is task parallelism, which can express a wide range of computations in the form of task directed acyclic graphs (DAGs). Existing studies that target energy efficient task scheduling have demonstrated the benefits of leveraging DVFS, particularly per-core DVFS. Their scheduling decisions are mainly based on heuristics, such as task criticality, task dependencies and workload sizes. To enable energy efficient task scheduling, we identify multiple crucial factors that influence energy consumption - varying task characteristics, exploitation of intra-task parallelism (task moldability), and task granularity - which we collectively refer to as task heterogeneity. Task heterogeneity and architecture asymmetry features together complicate the task scheduling problem, since the most energy efficient configuration of resource allocation and frequency setting varies with each task. Our analysis shows that leveraging task heterogeneity in conjunction with static and dynamic asymmetry provides significant opportunities for energy reduction.This thesis contributes two scheduling techniques - ERASE and STEER - that target different scenarios. ERASE focuses on fine-grained tasking and in environments where DVFS is not under user control. It leverages the insights of task characteristics, task moldability, and instantaneous task parallelism detection for guiding scheduling decisions. ERASE comprises four modules: online performance modeling, power profiling, core activity tracing and a task scheduler. Online performance modeling and power profiling provide runtime with execution time and power predictions. Core activity tracing offers the instantaneous task parallelism and the task scheduler combines these information to enable the energy predictions and dynamically determine the best resource allocation for each task during runtime. STEER focuses on environments where DVFS is under user control and where the platform comprises multiple asymmetric cores grouped into clusters. STEER explores how much energy could be potentially saved by leveraging static asymmetry, dynamic asymmetry and task heterogeneity in conjunction. STEER comprises two predictive models for performance and power predictions, and a task scheduler that utilizes models for energy predictions and then identifies the best resource allocation and frequency settings for tasks. Moreover, it applies adaptive scheduling techniques based on task granularity to manage DVFS overheads, and coordinates the cluster frequency settings to reduce interference from concurrent running tasks on cluster-based architectures.The evaluation on an NVIDIA Jetson TX2 shows that ERASE achieves 10% energy savings on average compared to the state-of-the-art DVFS-based schedulers and can adapt to external DVFS changes, and STEER consumes 38% less energy on average than both the state-of-the-art and ERASE
Adaptive Partition Strategies for Loop Parallelism in Heterogeneous Architectures
Este trabajo describe nuestra contribución para la ejecución de bucles paralelos en arquitecturas multi-core/multi-GPU de forma que la carga computacional se distribuya de forma balanceada entre todas las unidades de computación.This paper explores the possibility of efficiently using multicores
in conjunction with multiple GPU accelerators under a parallel task
programming paradigm. In particular, we address the challenge of
extending a parallel_for template to allow its
exploitation on heterogeneous systems. The extension is based on a
two-stages pipeline engine which is responsible for partitioning and
scheduling the chunks into the computational resources. Under this
engine, we propose a dynamic scheduling strategy coupled with an
adaptive partitioning heuristic that resizes chunks to prevent
underutilization and load unbalance of CPUs and GPUs. In this paper
we introduce the adaptive
partitioning heuristic which is derived from an analytical model that
minimizes the load unbalance while maximizes the throughput in the
system. Using two benchmarks we evaluate the
overhead introduced by our template extensions finding that it is
negligible. We also evaluate the efficiency of our adaptive
partitioning strategies and compared them with related work.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. TIN2010-16144, P08-TIC-3500, P11-TIC-0814
CoreTSAR: Task Scheduling for Accelerator-aware Runtimes
Heterogeneous supercomputers that incorporate computational accelerators
such as GPUs are increasingly popular due to their high
peak performance, energy efficiency and comparatively low cost.
Unfortunately, the programming models and frameworks designed
to extract performance from all computational units still lack the
flexibility of their CPU-only counterparts. Accelerated OpenMP
improves this situation by supporting natural migration of OpenMP
code from CPUs to a GPU. However, these implementations currently
lose one of OpenMP’s best features, its flexibility: typical
OpenMP applications can run on any number of CPUs. GPU implementations
do not transparently employ multiple GPUs on a node
or a mix of GPUs and CPUs. To address these shortcomings, we
present CoreTSAR, our runtime library for dynamically scheduling
tasks across heterogeneous resources, and propose straightforward
extensions that incorporate this functionality into Accelerated
OpenMP. We show that our approach can provide nearly linear
speedup to four GPUs over only using CPUs or one GPU while
increasing the overall flexibility of Accelerated OpenMP
Mixed-Criticality Scheduling with I/O
This paper addresses the problem of scheduling tasks with different
criticality levels in the presence of I/O requests. In mixed-criticality
scheduling, higher criticality tasks are given precedence over those of lower
criticality when it is impossible to guarantee the schedulability of all tasks.
While mixed-criticality scheduling has gained attention in recent years, most
approaches typically assume a periodic task model. This assumption does not
always hold in practice, especially for real-time and embedded systems that
perform I/O. For example, many tasks block on I/O requests until devices signal
their completion via interrupts; both the arrival of interrupts and the waking
of blocked tasks can be aperiodic. In our prior work, we developed a scheduling
technique in the Quest real-time operating system, which integrates the
time-budgeted management of I/O operations with Sporadic Server scheduling of
tasks. This paper extends our previous scheduling approach with support for
mixed-criticality tasks and I/O requests on the same processing core. Results
show the effective schedulability of different task sets in the presence of I/O
requests is superior in our approach compared to traditional methods that
manage I/O using techniques such as Sporadic Servers.Comment: Second version has replaced simulation experiments with real machine
experiments, third version fixed minor error in Equation 5 (missing a plus
sign
- …