400 research outputs found

    A Survey of Prediction and Classification Techniques in Multicore Processor Systems

    Get PDF
    In multicore processor systems, being able to accurately predict the future provides new optimization opportunities, which otherwise could not be exploited. For example, an oracle able to predict a certain application\u27s behavior running on a smart phone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling modes that would guarantee minimum levels of desired performance while saving energy consumption and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction transforms from simple forecasting to sophisticated machine learning based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that can span across all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques on prediction and classification in the general context of computing systems with emphasis on multicore processors. The paper is far from comprehensive, but, it will help the reader interested in employing prediction in optimization of multicore processor systems

    Hybrid dynamic energy and thermal management in heterogeneous embedded multiprocessor SoCs

    Full text link

    Thermal-aware resource allocation in earliest deadline first using fluid scheduling

    Get PDF
    Thermal issues in microprocessors have become a major design constraint because of their adverse effects on the reliability, performance and cost of the system. This article proposes an improvement in earliest deadline first, a uni-processor scheduling algorithm, without compromising its optimality in order to reduce the thermal peaks and variations. This is done by introducing a factor of fairness to earliest deadline first algorithm, which introduces idle intervals during execution and allows uniform distribution of workload over the time. The technique notably lowers the number of context switches when compare with the previous thermal-aware scheduling algorithm based on the same amount of fairness. Although, the algorithm is proposed for uni-processor environment, it is also applicable to partitioned scheduling in multi-processor environment, which primarily converts the multi-processor scheduling problem to a set of uni-processor scheduling problem and thereafter uses a uni-processor scheduling technique for scheduling. The simulation results show that the proposed approach reduces up to 5% of the temperature peaks and variations in a uni-processor environment while reduces up to 7% and 6% of the temperature spatial gradient and the average temperature in multi-processor environment, respectively

    Understanding the thermal implications of multicore architectures

    Get PDF
    Multicore architectures are becoming the main design paradigm for current and future processors. The main reason is that multicore designs provide an effective way of overcoming instruction-level parallelism (ILP) limitations by exploiting thread-level parallelism (TLP). In addition, it is a power and complexity-effective way of taking advantage of the huge number of transistors that can be integrated on a chip. On the other hand, today's higher than ever power densities have made temperature one of the main limitations of microprocessor evolution. Thermal management in multicore architectures is a fairly new area. Some works have addressed dynamic thermal management in bi/quad-core architectures. This work provides insight and explores different alternatives for thermal management in multicore architectures with 16 cores. Schemes employing both energy reduction and activity migration are explored and improvements for thread migration schemes are proposed.Peer ReviewedPostprint (published version

    Proactive temperature balancing for low cost thermal management in MPSoCs

    Full text link
    Abstract — Designing thermal management strategies that reduce the impact of hot spots and on-die temperature variations at low performance cost is a very significant challenge for multiprocessor system-on-chips (MPSoCs). In this work, we present a proactive MPSoC thermal man-agement approach, which predicts the future temperature and adjusts the job allocation on the MPSoC to minimize the impact of thermal hot spots and temperature variations without degrading performance. In addition, we implement and compare several reactive and proactive management strategies, and demonstrate that our proactive temperature-aware MPSoC job allocation technique is able to dramatically reduce the adverse effects of temperature at very low performance cost. We show experimental results using a simulator as well as an implementation on an UltraSPARC T1 system. I

    DYNAMIC THERMAL MANAGEMENT FOR MICROPROCESSORS THROUGH TASK SCHEDULING

    Get PDF
    With continuous IC(Integrated Circuit) technology size scaling, more and more transistors are integrated in a tiny area of the processor. Microprocessors experience unprecedented high power and high temperatures on chip, which can easily violate the thermal constraint. High temperature on the chip, if not controlled, can damage or even burn the chip. There are also emerging technologies which can exacerbate the thermal condition on modern processors. For example, 3D stacking is an IC technology that stacks several die layers together, in order to shorten the communication path between the dies to improve the chip performance. This technology unfortunately increases the power density per unit volumn, and the heat from each layer needs to dissipate vertically through the same heat sink. Another example is chip multi-processor. A chip multi-processor(CMP) integrates two or more independent actual processors (called “cores”), onto a single integrated circuit die. As IC technology nodes continually scale down to 45nm and below, there is significant within-die process variation(PV) in the current and near-future CMPs. Process variation makes the cores in the chip differ in their maximum operable frequency, and the amount of leakage power they consume. This can result in the immense spatial variation of the temperatures of the cores on the same chip, which means the temperatures of some cores can be much higher than other cores. One of the most commonly used methods to constrain a CPU from overheating is hardware dynamic thermal management(HW DTM), due to the high cost and inefficiency of current mechanical cooling techniques. Dynamic voltage/frequency scaling(DVFS) is such a broad-spectrum dynamic thermal management technique that can be applied to all types of processors, so we adopt DVFS as the HW DTM method in this thesis to simplify problem discussion. DVFS lowers the CPU power consumption by reducing CPU frequency or voltage when temperature overshoots, which constrains the temperature at the price of performance loss, in terms of reduced CPU throughput, or longer execution time of the programs. This thesis mainly addresses this problem, with the goal of eliminating unnecessary hardware-level DVFS and improving chip performance. The methodology of the experiments in this thesis are based on the accurate estimation of power and temperature on the processor. The CPU power usage of different benchmarks are estimated by reading the performance counters on a real P4 chip, and measuring the activities of different CPU functional units. The jobs are then categorized into powerintensive(hot) ones and power non-intensive(cool) ones. Many combinations of the jobs with mixed power(thermal) characteristics are used to evaluate the effectiveness of the algorithms we propose. When the experiments are conducted on a single-core processor, a compact dynamic thermal model embedded in Linux kernel is used to calculate the CPU temperature. When the experiments are conducted on the CMP with 3D stacked dies, or the CMP affected by significant process variation, a thermal simulation tool well recognized in academia is used. The contribution of the thesis is that it proposes new software-level task scheduling algorithms to avoid unnecessary hardware-level DVFS. New task scheduling algorithms are proposed not only for the single-core processor, but aslo for the CMP with 3D stacked dies, and the CMP under process variation. Compared with the state-of-the-art algorithms proposed by other researchers, the new algorithms we propose all show significant performance improvement. To improve the performance of the single-core processors, which is harmed by the thermal overshoots and the HW DTMs, we propose a heuristic algorithm named ThreshHot, which judiciously schedules hot jobs before cool jobs, to make the future temperature lower. Furthermore, it always makes the temperature stay as close to the threshold as possible while not overshooting. In the CMPs with 3D stacked dies, three heuristics are proposed and combined as one algorithm. First, the vertically stacked cores are treated as a core stack. The power of jobs is balanced among the core stacks instead of the individual cores. Second, the hot jobs are moved close to the heat sink to expedite heat dissipation. Third, when the thermal emergencies happen, the most power-intensive job in a core stack is penalized in order to lower the temperature quickly. When CMPs are under significant process variation, each core on the CMP has distinct maximum frequency and leakage power. Maximizing the overall CPU throughput on all the cores is in conflict with satisfying on-chip thermal constraints imposed on each core. A maximum bipartite matching algorithm is used to solve this dilemma, to exploit the maximum performance of the chip

    Dynamic Thermal Management for Microprocessors

    Get PDF
    In deep submicron era, thermal hot spots and large temperature gradients significantly impact system reliability, performance, cost and leakage power. Dynamic thermal management techniques are designed to tackle the problems and control the chip temperature as well as power consumption. They refer to those techniques which enable the chip to autonomously modify the task execution and power dissipation characteristics so that lower-cost cooling solutions could be adopted while still guaranteeing safe temperature regulation. As long as the temperature is regulated, the system reliability can be improved, leakage power can be reduced and cooling system lifetime can be extended significantly. Multimedia applications are expected to form the largest portion of workload in general purpose PC and portable devices. The ever-increasing computation intensity of multimedia applications elevates the processor temperature and consequently impairs the reliability and performance of the system. In this thesis, we propose to perform dynamic thermal management using reinforcement learning algorithm for multimedia applications. The presented learning model does not need any prior knowledge of the workload information or the system thermal and power characteristics. It learns the temperature change and workload switching patterns by observing the temperature sensor and event counters on the processor, and finds the management policy that provides good performance-thermal tradeoff during the runtime. As the system complexity increases, it is more and more difficult to perform thermal management in a centralized manner because of state explosion and the overhead of monitoring the entire chip. In this thesis, we present a framework for distributed thermal management in many-core systems where balanced thermal profile can be achieved by proactive task migration among neighboring cores. The framework has a low cost agent residing in each core that observes the local workload and temperature and communicates with its nearest neighbor for task migration and exchange. By choosing only those migration requests that will result in balanced workload without generating thermal emergency, the presented framework maintains workload balance across the system and avoids unnecessary migration. Experimental results show that, our distributed management policy achieves almost the same performance as a global management policy when the tasks are initially randomly distributed. Compared with existing proactive task migration technique, our approach generates less hotspot, less migration overhead with negligible performance overhead. Temperature affects the leakage power and cooling power. In this thesis, we address the impact of task allocation on a processor\u27s leakage power and cooling fan power. Although the leakage power is determined by the average die temperature and the fan power is determined by the peak temperature, our analysis shows that the overall power can be minimized if a task allocation with minimum peak temperature is adopted together with an intelligent fan speed adjustment technique that finds the optimal tradeoff between fan power and leakage power. We further present a multi-agent distributed task migration technique that searches for the best task allocation during runtime. By choosing only those migration requests that will result chip maximum temperature reduction, the presented framework achieves large fan power savings as well as overall power reduction

    Resource-aware scheduling for 2D/3D multi-/many-core processor-memory systems

    Get PDF
    This dissertation addresses the complexities of 2D/3D multi-/many-core processor-memory systems, focusing on two key areas: enhancing timing predictability in real-time multi-core processors and optimizing performance within thermal constraints. The integration of an increasing number of transistors into compact chip designs, while boosting computational capacity, presents challenges in resource contention and thermal management. The first part of the thesis improves timing predictability. We enhance shared cache interference analysis for set-associative caches, advancing the calculation of Worst-Case Execution Time (WCET). This development enables accurate assessment of cache interference and the effectiveness of partitioned schedulers in real-world scenarios. We introduce TCPS, a novel task and cache-aware partitioned scheduler that optimizes cache partitioning based on task-specific WCET sensitivity, leading to improved schedulability and predictability. Our research explores various cache and scheduling configurations, providing insights into their performance trade-offs. The second part focuses on thermal management in 2D/3D many-core systems. Recognizing the limitations of Dynamic Voltage and Frequency Scaling (DVFS) in S-NUCA many-core processors, we propose synchronous thread migrations as a thermal management strategy. This approach culminates in the HotPotato scheduler, which balances performance and thermal safety. We also introduce 3D-TTP, a transient temperature-aware power budgeting strategy for 3D-stacked systems, reducing the need for Dynamic Thermal Management (DTM) activation. Finally, we present 3QUTM, a novel method for 3D-stacked systems that combines core DVFS and memory bank Low Power Modes with a learning algorithm, optimizing response times within thermal limits. This research contributes significantly to enhancing performance and thermal management in advanced processor-memory systems
    • …
    corecore