217 research outputs found

    A Survey of Prediction and Classification Techniques in Multicore Processor Systems

    In multicore processor systems, accurately predicting future behavior opens optimization opportunities that could not otherwise be exploited. For example, an oracle able to predict the behavior of an application running on a smartphone could direct the power manager to switch to dynamic voltage and frequency scaling modes that guarantee minimum levels of desired performance while saving energy and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate reactively. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and multicore processor systems. Prediction has evolved from simple forecasting to sophisticated machine-learning-based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. Novel optimization techniques spanning all layers of the computing stack can exploit this. In this survey paper, we discuss the most popular prediction and classification techniques in the general context of computing systems, with emphasis on multicore processors. The paper is far from comprehensive, but it will help readers interested in employing prediction to optimize multicore processor systems.

    Resource Management for Multicores to Optimize Performance under Temperature and Aging Constraints

    Combined on-line lifetime-energy optimization for asymmetric multicores

    In this paper we present an architectural and on-line resource management solution that optimizes the lifetime reliability of asymmetric multicores while minimizing system energy consumption, targeting both single nodes (multicores) and multiple nodes (clusters of multicores). The solution exploits the different characteristics of the computing resources to achieve the desired performance while optimizing the lifetime/energy trade-off. The experimental results show that combined optimization of energy and lifetime achieves an extended lifetime (similar to that pursued by lifetime-only optimization solutions) with a marginal energy penalty (less than 2%) with respect to energy-aware but aging-unaware systems.

    Dynamic Energy Management for Chip Multi-processors under Performance Constraints

    We introduce a novel algorithm for dynamic energy management (DEM) under performance constraints in chip multi-processors (CMPs). Using the novel concept of delayed instruction count, performance loss estimates are calculated for each core at the end of each control period. In addition, a Kalman-filtering-based approach is employed to predict the workload in the next control period, for which voltage-frequency pairs must be selected. This selection is done with a novel dynamic voltage and frequency scaling (DVFS) algorithm whose objective is to reduce energy consumption without degrading performance beyond the user-set threshold. Using our customized Sniper-based CMP system simulation framework, we demonstrate the effectiveness of the proposed algorithm on a variety of benchmarks for 16-core and 64-core network-on-chip-based CMP architectures. Simulation results show consistent energy savings across the board. We present our work as an investigation of the tradeoff between the energy reduction achievable via DVFS, with predictions made by the Kalman filter, and the performance penalty threshold.
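    The predict-then-select loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the random-walk state model, the noise variances, the linear slowdown estimate, and the names `ScalarKalman` and `pick_frequency` are all assumptions made for illustration, and the slowdown estimate is only a simplified stand-in for the delayed instruction count.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model (assumed)."""
    x: float = 0.0   # current workload estimate
    p: float = 1.0   # variance of the estimate
    q: float = 0.01  # process-noise variance (hypothetical tuning value)
    r: float = 0.1   # measurement-noise variance (hypothetical tuning value)

    def update(self, z: float) -> float:
        """Fold in measurement z; return the prediction for the next period."""
        p_prior = self.p + self.q           # predict: state unchanged, variance grows
        k = p_prior / (p_prior + self.r)    # Kalman gain
        self.x = self.x + k * (z - self.x)  # correct using the innovation
        self.p = (1.0 - k) * p_prior
        return self.x                       # random walk: prediction equals estimate


def pick_frequency(pred_util, freqs_ghz, max_loss):
    """Return the lowest frequency whose estimated slowdown is within max_loss.

    pred_util is the predicted compute-bound fraction of the next control
    period at the top frequency (an assumed, simplified workload measure).
    """
    f_max = max(freqs_ghz)
    for f in sorted(freqs_ghz):
        # Only the compute-bound share stretches when the clock slows down.
        loss = pred_util * (f_max / f - 1.0)
        if loss <= max_loss:
            return f
    return f_max


if __name__ == "__main__":
    kf = ScalarKalman()
    for measured in [0.22, 0.18, 0.21, 0.19, 0.20]:
        predicted = kf.update(measured)
    print(pick_frequency(predicted, [1.0, 1.5, 2.0], max_loss=0.10))
```

    In this sketch each core would own one filter and call `update` once per control period; a lower `max_loss` pushes the selection toward higher frequencies, which is the tradeoff the abstract investigates.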

    Mapping parallelism to heterogeneous processors

    Most embedded devices are based on heterogeneous Multiprocessor Systems-on-Chip (MPSoCs). These contain a variety of processors such as CPUs, micro-controllers, DSPs, GPUs and specialised accelerators. The heterogeneity of these systems helps in achieving good performance and energy efficiency but makes programming inherently difficult. There is no single programming language or runtime to program such platforms. This thesis makes three contributions to these problems. First, it presents a framework that allows code in Single Program Multiple Data (SPMD) form to be mapped to a heterogeneous platform. The mapping space is explored, and it is shown that the best mapping depends on the metric used. Next, a compiler framework is presented which bridges the gap between the high-level programming model of OpenMP and the heterogeneous resources of MPSoCs. It takes OpenMP programs and generates code which runs on all processors. It delivers programming ease while exploiting heterogeneous resources. Finally, a compiler-based approach to runtime power management for heterogeneous cores is presented. Given an externally provided budget, the approach generates heterogeneous, partitioned code that attempts to give the best performance within that budget.

    Multi-core devices for safety-critical systems: a survey

    Multi-core devices are envisioned to support the development of next-generation safety-critical systems, enabling the on-chip integration of functions of different criticality. This integration provides multiple potential system-level benefits, such as reductions in cost, size, power, and weight. However, safety certification becomes a challenge, and several fundamental technical safety requirements must be addressed, such as temporal and spatial independence, reliability, and diagnostic coverage. This survey provides a categorization and overview, at different device abstraction levels (nanoscale, component, and device), of selected key research contributions that support compliance with these fundamental safety requirements. This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-65316-P, Basque Government under grant KK-2019-00035 and the HiPEAC Network of Excellence. The Spanish Ministry of Economy and Competitiveness has also partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC-2013-14717). Peer reviewed. Postprint (author's final draft).

    Self-Aware resource management in embedded systems

    Resource management for modern embedded systems is challenging in the presence of dynamic workloads, limited energy and power budgets, and application and user requirements. These diverse and dynamic requirements often result in conflicting objectives that need to be handled by intelligent and self-aware resource management. State-of-the-art resource management approaches leverage offline and online machine learning techniques for handling such complexity. However, these approaches focus on fixed objectives, limiting their adaptability to dynamically evolving requirements at run-time. In this dissertation, we first propose resource management approaches with fixed objectives for handling concurrent dynamic workload scenarios, mixed-sensitivity workloads, and user requirements and battery constraints. Then, we propose comprehensive self-aware resource management for handling multiple dynamic objectives at run-time. The proposed resource management approaches use machine learning techniques for offline modeling and online control. In each approach, we consider a dynamic set of requirements that had not been considered in state-of-the-art approaches, and we improve the self-awareness of resource management by learning application characteristics, users' habits, and battery patterns. We characterize applications through offline data collection to handle the conflicting requirements of multiple concurrent applications. Further, we consider users' activities and battery patterns for user- and battery-aware resource management. Finally, we propose a comprehensive resource management approach that considers dynamic variation in embedded systems and formulates a resource management goal based on it. The approaches presented in this dissertation focus on dynamic variation in embedded systems and on responding to that variation efficiently. The approaches consider minimizing energy consumption, satisfying the performance requirements of the applications, respecting power constraints, satisfying user requirements, and maximizing battery cycle life. Each resource management approach is evaluated and compared against the relevant state-of-the-art resource management frameworks.

    Machine Learning for Resource-Constrained Computing Systems

    The resources available in information processing systems such as processors are usually constrained. This includes, for example, electrical power draw, energy consumption, heat dissipation, and chip area. Optimizing the management of the available resources is therefore of utmost importance for reaching goals such as maximum performance. System-level resource management in particular, through the (dynamic) assignment of applications to processor cores and through dynamic voltage and frequency scaling (DVFS), has a large impact on performance, power, and temperature during application execution. The main challenges in resource management are the high complexity of applications and platforms, unforeseen applications or platform configurations (not known at design time), proactive optimization, and minimizing run-time overhead. Existing techniques based on simple heuristics or analytical models address these challenges only inadequately. For this reason, the main contribution of this dissertation is the use of machine learning (ML) for resource management. ML-based solutions make it possible to tackle these challenges by predicting the impact of potential resource management decisions, by estimating hidden (unobservable) properties of applications, or by directly learning a resource management policy. This dissertation develops several novel ML-based resource management techniques for different platforms, goals, and constraints. First, a prediction-based technique is presented for maximizing the performance of multicore processors with a distributed last-level cache and a limited maximum temperature. It uses a neural network (NN) to predict the impact on performance of potential application migrations between processor cores. These predictions allow the best possible migration to be determined and enable proactive management. The NN is trained to cope with unknown applications and different temperature limits. Second, a boosting technique is presented that uses DVFS to maximize the performance of homogeneous multicore processors with a limited maximum temperature. It is based on a novel boostability metric that combines the dependencies of performance, power, and temperature on voltage/frequency changes into a single metric. The dependencies of performance and power vary with the application and cannot be directly observed (measured) at run time. Therefore, an NN is used to estimate these values for unknown applications and thereby cope with the complexity of the boosting optimization. Third, a technique is presented for minimizing the temperature of heterogeneous multicore processors with quality-of-service targets. It uses imitation learning to learn an application migration policy from optimal oracle demonstrations. An NN is employed to cope with the complexity of the platform and of application behavior. NN inference is accelerated using an existing generic accelerator, a neural processing unit (NPU). The ML algorithms themselves must also be executed with limited resources. Finally, a technique for resource-aware training on distributed devices is presented that maintains a constant training throughput when the availability of compute resources changes rapidly, as occurs, for example, due to contention on shared resources. This technique uses structured dropout, which omits random parts of the NN during training. This allows the resources required for training to be adjusted dynamically, with negligible overhead but at the cost of slower training convergence. The Pareto-optimal dropout parameters for each NN layer are determined by design space exploration. Evaluations of these techniques are carried out both in simulation and on real hardware and show significant improvements over the state of the art at negligible run-time overhead. In summary, this dissertation shows that ML is a key technology for optimizing the management of limited resources at the system level by addressing the associated challenges.