99 research outputs found

    Developing power‐aware scheduling mechanisms for computing systems virtualized by Xen

    Get PDF
    Cloud computing emerges as one of the most important technologies for interconnecting people and building the so‐called Internet of People (IoP). In such a cloud‐based IoP, the virtualization technique provides the key supporting environments for running the IoP jobs such as performing data analysis and mining personal information. Nowadays, energy consumption in such a system is a critical metric to measure the sustainability and eco‐friendliness of the system. This paper develops three power‐aware scheduling strategies in virtualized systems managed by Xen, which is a popular virtualization technique. These three strategies are the Least performance Loss Scheduling strategy, the No performance Loss Scheduling strategy, and the Best Frequency Match scheduling strategy. These power‐aware strategies are developed by identifying the limitation of Xen in scaling the CPU frequency and aim to reduce the energy waste without sacrificing the jobs running performance in the computing systems virtualized by Xen. Least performance Loss Scheduling works by re‐arranging the execution order of the virtual machines (VMs). No performance Loss Scheduling works by setting a proper initial CPU frequency for running the VMs. Best Frequency Match reduces energy waste and performance loss by allowing the VMs to jump the queue so that the VM that is put into execution best matches the current CPU frequency. Scheduling for both single core and multicore processors is considered in this paper. The evaluation experiments have been conducted, and the results show that compared with the original scheduling strategy in Xen, the developed power‐aware scheduling algorithm is able to reduce energy consumption without reducing the performance for the jobs running in Xen

    Resource-aware scheduling for 2D/3D multi-/many-core processor-memory systems

    Get PDF
    This dissertation addresses the complexities of 2D/3D multi-/many-core processor-memory systems, focusing on two key areas: enhancing timing predictability in real-time multi-core processors and optimizing performance within thermal constraints. The integration of an increasing number of transistors into compact chip designs, while boosting computational capacity, presents challenges in resource contention and thermal management. The first part of the thesis improves timing predictability. We enhance shared cache interference analysis for set-associative caches, advancing the calculation of Worst-Case Execution Time (WCET). This development enables accurate assessment of cache interference and the effectiveness of partitioned schedulers in real-world scenarios. We introduce TCPS, a novel task and cache-aware partitioned scheduler that optimizes cache partitioning based on task-specific WCET sensitivity, leading to improved schedulability and predictability. Our research explores various cache and scheduling configurations, providing insights into their performance trade-offs. The second part focuses on thermal management in 2D/3D many-core systems. Recognizing the limitations of Dynamic Voltage and Frequency Scaling (DVFS) in S-NUCA many-core processors, we propose synchronous thread migrations as a thermal management strategy. This approach culminates in the HotPotato scheduler, which balances performance and thermal safety. We also introduce 3D-TTP, a transient temperature-aware power budgeting strategy for 3D-stacked systems, reducing the need for Dynamic Thermal Management (DTM) activation. Finally, we present 3QUTM, a novel method for 3D-stacked systems that combines core DVFS and memory bank Low Power Modes with a learning algorithm, optimizing response times within thermal limits. This research contributes significantly to enhancing performance and thermal management in advanced processor-memory systems

    Energy Aware Runtime Systems for Elastic Stream Processing Platforms

    Get PDF
    Following an invariant growth in the required computational performance of processors, the multicore revolution started around 20 years ago. This revolution was mainly an answer to power dissipation constraints restricting the increase of clock frequency in single-core processors. The multicore revolution not only brought in the challenge of parallel programming, i.e. being able to develop software exploiting the entire capabilities of manycore architectures, but also the challenge of programming heterogeneous platforms. The question of “on which processing element to map a specific computational unit?”, is well known in the embedded community. With the introduction of general-purpose graphics processing units (GPGPUs), digital signal processors (DSPs) along with many-core processors on different system-on-chip platforms, heterogeneous parallel platforms are nowadays widespread over several domains, from consumer devices to media processing platforms for telecom operators. Finding mapping together with a suitable hardware architecture is a process called design-space exploration. This process is very challenging in heterogeneous many-core architectures, which promise to offer benefits in terms of energy efficiency. The main problem is the exponential explosion of space exploration. With the recent trend of increasing levels of heterogeneity in the chip, selecting the parameters to take into account when mapping software to hardware is still an open research topic in the embedded area. For example, the current Linux scheduler has poor performance when mapping tasks to computing elements available in hardware. The only metric considered is CPU workload, which as was shown in recent work does not match true performance demands from the applications. Doing so may produce an incorrect allocation of resources, resulting in a waste of energy. The origin of this research work comes from the observation that these approaches do not provide full support for the dynamic behavior of stream processing applications, especially if these behaviors are established only at runtime. This research will contribute to the general goal of developing energy-efficient solutions to design streaming applications on heterogeneous and parallel hardware platforms. Streaming applications are nowadays widely spread in the software domain. Their distinctive characiteristic is the retrieving of multiple streams of data and the need to process them in real time. The proposed work will develop new approaches to address the challenging problem of efficient runtime coordination of dynamic applications, focusing on energy and performance management.Efter en oförĂ€nderlig tillvĂ€xt i prestandakrav hos processorer, började den flerkĂ€rniga processor-revolutionen för ungefĂ€r 20 Ă„r sedan. Denna revolution skedde till största del som en lösning till begrĂ€nsningar i energieffekten allt eftersom klockfrekvensen kontinuerligt höjdes i en-kĂ€rniga processorer. Den flerkĂ€rniga processor-revolutionen medförde inte enbart utmaningen gĂ€llande parallellprogrammering, m.a.o. förmĂ„gan att utveckla mjukvara som anvĂ€nder sig av alla delelement i de flerkĂ€rniga processorerna, men ocksĂ„ utmaningen med programmering av heterogena plattformar. FrĂ„gestĂ€llningen ”pĂ„ vilken processorelement skall en viss berĂ€kning utföras?” Ă€r vĂ€l kĂ€nt inom ramen för inbyggda datorsystem. Efter introduktionen av grafikprocessorer för allmĂ€nna berĂ€kningar (GPGPU), signalprocesserings-processorer (DSP) samt flerkĂ€rniga processorer pĂ„ olika system-on-chip plattformar, Ă€r heterogena parallella plattformar idag omfattande inom mĂ„nga domĂ€ner, frĂ„n konsumtionsartiklar till mediaprocesseringsplattformar för telekommunikationsoperatörer. Processen att placera berĂ€kningarna pĂ„ en passande hĂ„rdvaruplattform kallas för utforskning av en designrymd (design-space exploration). Denna process Ă€r mycket utmanande för heterogena flerkĂ€rniga arkitekturer, och kan medföra fördelar nĂ€r det gĂ€ller energieffektivitet. Det största problemet Ă€r att de olika valmöjligheterna i designrymden kan vĂ€xa exponentiellt. Enligt den nuvarande trenden som förespĂ„r ökad heterogeniska aspekter i processorerna Ă€r utmaningen att hitta den mest passande placeringen av berĂ€kningarna pĂ„ hĂ„rdvaran Ă€nnu en forskningsfrĂ„ga inom ramen för inbyggda datorsystem. Till exempel, den nuvarande schemalĂ€ggaren i Linux operativsystemet Ă€r inkapabel att hitta en effektiv placering av berĂ€kningarna pĂ„ den underliggande hĂ„rdvaran. Det enda mĂ€tsĂ€ttet som anvĂ€nds Ă€r processorns belastning vilket, som visats i tidigare forskning, inte motsvarar den verkliga prestandan i applikationen. AnvĂ€ndning av detta mĂ€tsĂ€tt vid resursallokering resulterar i slöseri med energi. Denna forskning hĂ€rstammar frĂ„n observationerna att dessa tillvĂ€gagĂ„ngssĂ€tt inte stöder det dynamiska beteendet hos ström-processeringsapplikationer (stream processing applications), speciellt om beteendena bara etableras vid körtid. Denna forskning kontribuerar till det allmĂ€nna mĂ„let att utveckla energieffektiva lösningar för ström-applikationer (streaming applications) pĂ„ heterogena flerkĂ€rniga hĂ„rdvaruplattformar. Ström-applikationer Ă€r numera mycket vanliga i mjukvarudomĂ€n. Deras distinkta karaktĂ€r Ă€r inlĂ€sning av flertalet dataströmmar, och behov av att processera dem i realtid. Arbetet i denna forskning understöder utvecklingen av nya sĂ€tt för att lösa det utmanade problemet att effektivt koordinera dynamiska applikationer i realtid och fokus pĂ„ energi- och prestandahantering

    Thermal aware task assignment for multicore processors using genetic algorithm

    Get PDF
    Microprocessor power and thermal density are increasing exponentially. The reliability of the processor declined, cooling costs rose, and the processor's lifespan was shortened due to an overheated processor and poor thermal management like thermally unbalanced processors. Thus, the thermal management and balancing of multi-core processors are extremely crucial. This work mostly focuses on a compact temperature model of multicore processors. In this paper, a novel task assignment is proposed using a genetic algorithm to maintain the thermal balance of the cores, by considering the energy expended by each task that the core performs. And expecting the cores’ temperature using the hotspot simulator. The algorithm assigns tasks to the processors depending on the task parameters and current cores’ temperature in such a way that none of the tasks’ deadlines are lost for the earliest deadline first (EDF) scheduling algorithm. The mathematical model was derived, and the simulation results showed that the highest temperature difference between the cores is 8 °C for approximately 14 seconds of simulation. These results validate the effectiveness of the proposed algorithm in managing the hotspot and reducing both temperature and energy consumption in multicore processors

    CROSS-STACK PREDICTIVE CONTROL FRAMEWORK FOR MULTICORE REAL-TIME APPLICATIONS

    Get PDF
    Many of the next generation applications in entertainment, human computer interaction, infrastructure, security and medical systems are computationally intensive, always-on, and have soft real time (SRT) requirements. While failure to meet deadlines is not catastrophic in SRT systems, missing deadlines can result in an unacceptable degradation in the quality of service (QoS). To ensure acceptable QoS under dynamically changing operating conditions such as changes in the workload, energy availability, and thermal constraints, systems are typically designed for worst case conditions. Unfortunately, such over-designing of systems increases costs and overall power consumption. In this dissertation we formulate the real-time task execution as a Multiple-Input, Single- Output (MISO) optimal control problem involving tracking a desired system utilization set point with control inputs derived from across the computing stack. We assume that an arbitrary number of SRT tasks may join and leave the system at arbitrary times. The tasks are scheduled on multiple cores by a dynamic priority multiprocessor scheduling algorithm. We use a model predictive controller (MPC) to realize optimal control. MPCs are easy to tune, can handle multiple control variables, and constraints on both the dependent and independent variables. We experimentally demonstrate the operation of our controller on a video encoder application and a computer vision application executing on a dual socket quadcore Xeon processor with a total of 8 processing cores. We establish that the use of DVFS and application quality as control variables enables operation at a lower power op- erating point while meeting real-time constraints as compared to non cross-stack control approaches. We also evaluate the role of scheduling algorithms in the control of homo- geneous and heterogeneous workloads. Additionally, we propose a novel adaptive control technique for time-varying workloads

    DeadPool: Performance Deadline Based Frequency Pooling and Thermal Management Agent in DVFS Enabled MPSoCs

    Get PDF
    High operating temperature and frequent thermal cycles in a multi-processor system-on-chip, which is now popularly utilized in mobile/Edge devices, harm the overall lifespan and reliability of such devices. In this paper, we propose an intelligent software agent that works alongside other resource mapping and partitioning mechanism in order to monitor and reduce the operating temperature of the system by regulating the operating frequency of the CPU cores while catering for performance constraint at the same time. Our proposed approach?, DeadPool thermal management agent, is able to reduce the overall operating temperature of the system by 24.21% and reduce thermal cycle by 67.42% at the most when compared to the state-of-the-art methods

    P-EdgeCoolingMode: An Agent Based Performance Aware Thermal Management Unit for DVFS Enabled Heterogeneous MPSoCs

    Get PDF
    Thermal cycling as well as spatial and thermal gradient affects the lifetime reliability and performance of heterogeneous multiprocessor systems-on-chips (MPSoCs). Conventional temperature management techniques are not intelligent enough to cater for performance, energy efficiency as well as operating temperature of the system. In this paper we propose a light-weight novel thermal management mechanism (P-EdgeCoolingMode) in the form of intelligent software agent, which monitors and regulates the operating temperature of the CPU cores to improve reliability of the system while catering for performance requirements. P-EdgeCoolingMode is capable of pro-actively monitoring performance and based on the user’s demand the agent takes necessary action, making the proposed methodology highly suitable for implementation on existing as well as conceptual Edge devices utilizing heterogeneous MPSoCs with dynamic voltage and frequency scaling (DVFS) capabilities. We validated our methodology on the Odroid-XU4 MPSoC and Huawei P20 Lite (HiSilicon Kirin 659 MPSoC). P-EdgeCoolingMode has been successful to reduce the operating temperature while improving performance and reducing power consumption for chosen test cases than the state-of-the-art. For applications with demanding performance requirement P-EdgeCoolingMode has been found to improve the power consumption by 30.62% at the most in comparison to existing state-of-the-art power management methodologies
    • 

    corecore