Search CORE

233 research outputs found

An adaptive, utilization-based approach to schedule real-time tasks for ARM big. LITTLE architectures

Author: Cucinotta Tommaso
Marinoni Mauro
Mascitti Agostino
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

ARM big.LITTLE architectures are spreading more and more in the mobile world thanks to their power-saving capabilities due to the use of two ISA-compatible islands, one focusing on energy efficiency and the other one on computational power. This architecture makes the problem of energy-aware task scheduling particularly challenging, due to the number of variables to take into account and the need for having lightweight mechanisms that can be readily computed in an operating system kernel scheduler. This paper presents a novel task scheduler for big.LITTLE platforms, combining the well-known Constant Bandwidth Server algorithm with a power-aware per-job migration policy. This achieves real-time adaptation of the CPU islands' frequencies based on the individual cores' overall utilization, as available in the scheduler thanks to the use of the resource reservation paradigm. Preliminary results obtained by simulations based on modifications to the open-source RTSim tool show that the proposed technique is able to achieve interesting performance/energy trade-offs

Archivio della ricerca della Scuola Superiore Sant'Anna

Group-Based Parallel Multi-scheduling Methods for Grid Computing

Author: Abraham Goodhead Tomvie
Publication venue
Publication date: 01/01/2016
Field of study

Coventry University Pure Portal

Per-task energy metering and accounting in the multicore era

Author: Liu Qixiao
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2016
Field of study

Chip multi-core processors (CMPs) are the preferred processing platform across different domains such as data centers, real-time systems and mobile devices. In all those domains, energy is arguably the most expensive resource in a computing system, in particular, with fastest growth. Therefore, measuring the energy usage draws vast attention. Current studies mostly focus on obtaining finer-granularity energy measurement, such as measuring power in smaller time intervals, distributing energy to hardware components or software components. Such studies focus on scenarios where system energy is measured under the assumption that only one program is running in the system. So far, there is no hardware-level mechanism proposed to distribute the system energy to multiple running programs in a resource sharing multi-core system in an exact way. For the first time, we have formalized the need for per-task energy measurement in multicore by establishing a two-fold concept: Per-Task Energy Metering (PTEM) and Sensible Energy Accounting (SEA). In the scenario where many tasks running in parallel in a multicore system: For each task, the target of PTEM is to provide estimate of the actual energy consumption at runtime based on its resource usage during execution; and SEA aims at providing estimates on the energy it would have consumed when running in isolation with a particular fraction of system's resources. Accurately determining the energy consumed by each task in a system will become of prominent importance in future multi-core based systems as it offers several benefits including (i) Selection of appropriate co-runners, (ii) improved energy-aware task scheduling and (iii) energy-aware billing in data centers. We have shown how these two concepts can be applied to the main components of a computing system: the processor and the memory system. At first, we have applied PTEM to the processor by means of tracking the activities and occupancy of all the resources in a per-task basis. Secondly, we have applied PTEM to the memory system by means of tracking the activities and the state switches of memory banks. Then, we have applied SEA to the processor by predicting the activities and execution time for each task when they run with an fraction of chip resources alone. And last, we apply SEA to the memory system, by means of predicting activities, execution time and the time invoking memory system for each task. As for all these works, by trading-off the hardware cost with the estimation accuracy, we have obtained the implementable and affordable cost mechanisms with high accuracy. We have also shown how these techniques can be applied in different scenarios, such as, to detect significant energy usage variations for any particular task and to develop more energy efficient scheduling policy for the multi-core system. These works in this thesis have been published into IEEE/ACM journals and conferences proceedings that can be found in the publication chapter of this thesis.Los "Chip Multi-core Processors" (CMPs) son la plataforma de procesado preferida en diferentes dominios, tales como los centros de datos, sistemas de tiempo real y dispositivos móviles. En todos estos dominios, la energía puede ser el recurso más caro en el sistema de computación, concretamente, lo rápido que está creciendo. Por lo tanto, como medir el consumo energético está ganando mucha atención. Los estudios actuales se centran mayormente en cómo obtener medidas muy detalladas (finer granularity). Por ejemplo, tomar medidas de potencia en pequeños intervalos de tiempo, usando medidores de energía hardware o software. Estos estudios se centran en escenarios donde el consumo del sistema se mide bajo la suposición de que solo un programa se está ejecutando en el sistema. Aun no hay ninguna propuesta de un mecanismo a nivel de hardware para medir el consumo entre múltiples programas ejecutándose a la vez en un sistema multi-core con recursos compartidos. Por primera vez, hemos formalizado la necesidad de medir el consumo energético por-tarea en un multi-core estableciendo un concepto dual: Per-Taks Energy Metering (PTEM) y Sensible Energy Accounting (SEA). En un escenario donde varias tareas se ejecutan en paralelo en un sistema multi-core, por cada tarea, el objetivo de PTEM es estimar el consumo real energético durante tiempo de ejecución basándose en los recursos usados durante la ejecución, y SEA trata de proveer una estimación del consumo que tendría en solitario con solo una fracción concreta de los recursos del sistema. Determinar el consumo energético con precisión para cada tarea en un sistema tomara gran importancia en el futuro de los sistemas basados en multi-cores, ya que ofrecen varias ventajas tales como: (i) determinar los co-runners apropiados, (ii) mejorar la planificación de tareas teniendo en cuenta su consumo y (iii) facturación de los servicios de los data centers basada en el consumo. Hemos mostrado como estos dos conceptos pueden aplicarse a los principales componentes de un sistema de computación: el procesador y el sistema de memoria. Para empezar, hemos aplicado PTEM al procesador para registrar la actividad y la ocupación de todos los recursos por cada tarea. Luego, hemos aplicado SEA al procesador prediciendo la actividad y tiempo de ejecución por tarea cuando se ejecutan con solo una parte de los recursos del chip. Por último, hemos aplicado SEA al sistema de memoria para predecir la activada, el tiempo ejecución y cuando el sistema de memoria es invocado por cada tarea. Con todo ello, hemos alcanzado un compromiso entre el coste del hardware y la precisión en las estimaciones para obtener mecanismos implementables con un coste aceptable y una alta precisión. Durante nuestros estudios mostramos como esas técnicas pueden ser aplicadas a diferente escenarios, tales como: detectar variaciones significativas en el consumo energético por una tarea en concreto o como desarrollar políticas de planificación energéticamente más eficientes para sistemas multi-core. Los trabajos que hemos publicado durante el desarrollo de esta tesis en los IEEE/ACM journals y en varias conferencias pueden encontrarse en el capítulo de "publicaciones" de este documentoPostprint (published version

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Parcus: Energy-Aware and Robust Parallelization of AUTOSAR Legacy Applications

Author: Böddeker Bert
Kehr Sebastian
Langen Dominik
Quiñones Eduardo
Schäfer Günter
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/06/2017
Field of study

Embedded multicore processors are an attractive alternative to sophisticated single-core processors for the use in automobile electronic control units (ECUs), due to their expected higher performance and energy efficiency. Parallelization approaches for AUTOSAR legacy software exploit these benefits. Nevertheless, these approaches focus on extracting performance neglecting the system's worst-case sensor/actuator latency and energy consumption. This paper presents Parcus, an energy-and latency-aware parallelization technique that combines both runnable-and tasklevel parallelism. Parcus explicitly models the traversal of data from sensor to actuator through task instances, enabling to consider the latency imposed by parallelization techniques. The parallel schedule quality (PSQ) metric quantifies the success of the parallelization, for which it takes the latency and the processor frequency into account. We demonstrate the applicability of Parcus with an automotive case study. The results show that Parcus can fully utilize the processor's energy-saving potential.This research received funding from the EU FP7 no. 287519 (parMERASA), the ARTEMIS-JU no. 621429 (EMC2), and the German Federal Ministry of Education and Research.Peer ReviewedPostprint (author's final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

Empirical characterization and modeling of power consumption and energy aware scheduling in data centers

Author: Muraña Jonathan
Publication venue: Udelar.FI.
Publication date: 01/01/2019
Field of study

Energy-efficient management is key in modern data centers in order to reduce operational cost and environmental contamination. Energy management and renewable energy utilization are strategies to optimize energy consumption in high-performance computing. In any case, understanding the power consumption behavior of physical servers in datacenter is fundamental to implement energy-aware policies effectively. These policies should deal with possible performance degradation of applications to ensure quality of service. This thesis presents an empirical evaluation of power consumption for scientific computing applications in multicore systems. Three types of applications are studied, in single and combined executions on Intel and AMD servers, for evaluating the overall power consumption of each application. The main results indicate that power consumption behavior has a strong dependency with the type of application. Additional performance analysis shows that the best load of the server regarding energy efficiency depends on the type of the applications, with efficiency decreasing in heavily loaded situations. These results allow formulating models to characterize applications according to power consumption, efficiency, and resource sharing, which provide useful information for resource management and scheduling policies. Several scheduling strategies are evaluated using the proposed energy model over realistic scientific computing workloads. Results confirm that strategies that maximize host utilization provide the best energy efficiency.Agencia Nacional de Investigación e Innovación FSE_1_2017_1_14478

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Contention in multicore hardware shared resources: Understanding of the state of the art

Author: Abella Ferrer Jaume
Cazorla Almeida Francisco Javier
Fernández Gabriel
Quiñones Eduardo
Rochange Christine
Vardanega Tullio
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/01/2014
Field of study

The real-time systems community has over the years devoted considerable attention to the impact on execution timing that arises from contention on access to hardware shared resources. The relevance of this problem has been accentuated with the arrival of multicore processors. From the state of the art on the subject, there appears to be considerable diversity in the understanding of the problem and in the “approach” to solve it. This sparseness makes it difficult for any reader to form a coherent picture of the problem and solution space. This paper draws a tentative taxonomy in which each known approach to the problem can be categorised based on its specific goals and assumptions.Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Dagstuhl Research Online Publication Server

Recommended from our members

Achieving Accurate Predictions of Future Events Under Hardware Heterogeneity

Author: Prodromou Andreas
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Heterogeneous hardware is becoming increasingly available in modern hardware, while research breakthroughs enforce the expectation that heterogeneity will keep increasing in the future. Significant gains can be achieved via appropriate utilization of heterogeneity, in terms of performance and power consumption, however, poor utilization can have a detrimental effect. Intelligent scheduling and resource management is a crucial challenge we need to overcome in order to harvest the full potential of heterogeneous hardware. As systems become larger and include greater levels of hardware diversity, the importance of intelligent scheduling and resource management is further accentuated.This dissertation presents techniques that aid the process of scheduling and resource management in the presence of heterogeneous hardware, via accurately predicting upcoming runtime events. With a proactive and accurate view of the near future, schedulers can utilize the underlying hardware more efficiently, and fully take advantage of the available benefits.By adapting a majority element heuristic, this dissertation significantly improves the accuracy of predicting memory addresses about to be accessed, while reducing prediction-related costs by a factor of ten thousand compared to previously proposed predictive approaches. Coupled with novel microarchitectural modifications, accurate address predictions are shown to improve the performance of heterogeneous memory architectures.Machine learning-based performance predictors are further presented, capable of predicting a program's performance when executed on a given general-purpose core. Trained to model the subtleties of the interaction between hardware and software, these predictors are capable of generating highly accurate predictions even for cores with varied Instruction Set Architectures. Utilizing these performance predictions for job scheduling, is shown to improve overall system performance. The trained predictors are further examined and interpreted in order to visualize the correlations between features picked up and amplified during training.Finally, this dissertation demonstrates that scheduling algorithms cannot guarantee deriving an optimal schedule during realistic execution scenarios due to the underlying hardware heterogeneity, the wide range of runtime requirements of software, as well as prediction error from performance predictors. In response, deep neural networks are trained to select one scheduling approach from a list of options with varied overheads and correctness guarantees. The scheduling approach chosen, is the one which will most likely return the highest-performance schedule with the lowest overhead, given a particular instance of the job-to-core assignment problem

eScholarship - University of California

Heuristic partitioning of real-time tasks on multi-processors

Author: A. Mascitti
L. Abeni
T. Cucinotta
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

This paper tackles the problem of admitting real-time tasks onto a symmetric multi-processor platform, where a partitioned EDF-based scheduler is used. We propose to combine a well-known utilization-based test for the first-fit partitioning strategy, with a simple heuristic based on the number of tasks and exact knowledge of the utilization of the first few biggest tasks. This results in an effective and efficient test improving on the state of the art in terms of admitted tasks, as shown by extensive tests performed on task sets generated using the widely adopted randfixedsum algorithm

Crossref

Archivio della ricerca della Scuola Superiore Sant'Anna

Parallel Continuous Preference Queries over Out-of-Order and Bursty Data Streams

Author: DANELUTTO MARCO
DE MATTEIS TIZIANO
MENCAGLI GABRIELE
TORQUATI MASSIMO
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Techniques to handle traffic bursts and out-of-order arrivals are of paramount importance to provide real-time sensor data analytics in domains like traffic surveillance, transportation management, healthcare and security applications. In these systems the amount of raw data coming from sensors must be analyzed by continuous queries that extract value-added information used to make informed decisions in real-time. To perform this task with timing constraints, parallelism must be exploited in the query execution in order to enable the real-time processing on parallel architectures. In this paper we focus on continuous preference queries, a representative class of continuous queries for decision making, and we propose a parallel query model targeting the efficient processing over out-of-order and bursty data streams. We study how to integrate punctuation mechanisms in order to enable out-of-order processing. Then, we present advanced scheduling strategies targeting scenarios with different burstiness levels, parameterized using the index of dispersion quantity. Extensive experiments have been performed using synthetic datasets and real-world data streams obtained from an existing real-time locating system. The experimental evaluation demonstrates the efficiency of our parallel solution and its effectiveness in handling the out-of-orderness degrees and burstiness levels of real-world applications

Crossref

Archivio della Ricerca - Università di Pisa

A survey of techniques for reducing interference in real-time applications on multicore platforms

Author: Carretero Pérez Jesús
Fernández Muñoz Javier
Lozano Santiago
Lugo Tamara
Publication venue: IEEE
Publication date: 15/02/2022
Field of study

This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.This work was supported in part by the Comunidad de Madrid Government "Nuevas Técnicas de Desarrollo de Software de Tiempo Real Embarcado Para Plataformas. MPSoC de Próxima Generación" under Grant IND2019/TIC-17261

Universidad Carlos III de Madrid e-Archivo