1,014 research outputs found

    Energy-Efficient, Reliable and QoS-Aware Task Mapping on Cyber-Physical Systems

    Cyber-Physical Systems (CPS) usually consist of a set of embedded systems (CPS nodes) connected through wireless communication, providing multiple functionalities that support different types of applications. During CPS deployment, application tasks are mapped on the CPS nodes with the objective of enhancing real-time performance, energy efficiency, and execution reliability. To satisfy these requirements, effective task mapping approaches should be designed based on different types of tasks, platforms, applications, and system requirements. In this paper, we provide a comprehensive survey regarding the task mapping methods in CPS

    Energy-Quality-Time Optimized Task Mapping on DVFS-enabled Multicores

    Multicore architectures have great potential for energy-constrained embedded systems, such as energy-harvesting wireless sensor networks. Some embedded applications, especially real-time ones, can be modeled as imprecise computation tasks. A task is divided into a mandatory subtask that provides a baseline Quality-of-Service (QoS) and an optional subtask that refines the result to increase the QoS. By combining dynamic voltage and frequency scaling, task allocation, and task adjustment, we can maximize the system QoS under real-time and energy supply constraints. However, the nonlinear and combinatorial nature of this problem makes it difficult to solve. This work first formulates a mixed-integer non-linear programming problem to concurrently carry out task-to-processor allocation, frequency-to-task assignment, and optional task adjustment. We provide a mixed-integer linear programming form of this formulation without performance degradation, and we propose a novel decomposition algorithm that provides an optimal solution with reduced computation time compared to state-of-the-art optimal approaches (22.6% on average). We also propose a heuristic version that has negligible computation time
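    The imprecise computation model in the abstract above can be illustrated with a minimal single-task sketch. This is not the paper's MILP or its decomposition algorithm: the `Task` fields, the `best_setting` helper, and the energy model (energy = `c_eff` × f² × cycles, a common textbook approximation for dynamic power) are all assumptions introduced here for illustration. The sketch simply enumerates discrete frequency levels and keeps the one that allows the most optional cycles within the deadline and the energy budget.

```python
from dataclasses import dataclass


@dataclass
class Task:
    mandatory: float     # cycles that must execute (baseline QoS)
    optional_max: float  # upper bound on optional cycles (QoS refinement)
    deadline: float      # seconds


def best_setting(task, freqs, energy_budget, c_eff=1.0):
    """Return (frequency, optional_cycles) maximizing optional cycles, or None.

    Assumed model (hypothetical, for illustration only):
      execution time = (mandatory + optional) / f
      energy         = c_eff * f**2 * (mandatory + optional)
    """
    best = None
    for f in freqs:
        # Most optional cycles that still meet the deadline at frequency f.
        by_deadline = f * task.deadline - task.mandatory
        # Most optional cycles that still fit in the energy budget at f.
        by_energy = energy_budget / (c_eff * f * f) - task.mandatory
        optional = min(task.optional_max, by_deadline, by_energy)
        if optional < 0:
            continue  # the mandatory part alone is infeasible at this frequency
        if best is None or optional > best[1]:
            best = (f, optional)
    return best
```

    Under this assumed model, a higher frequency relaxes the deadline constraint but tightens the energy constraint, which is the trade-off the abstract describes. The multi-task, multi-core version couples these choices across tasks, which is why the original work formulates it as a mixed-integer program.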

    A Survey of Fault-Tolerance Techniques for Embedded Systems from the Perspective of Power, Energy, and Thermal Issues

    Relentless technology scaling has provided a significant increase in processor performance, but it has also had adverse impacts on system reliability. In particular, technology scaling increases processor susceptibility to radiation-induced transient faults. Moreover, technology scaling combined with the end of Dennard scaling increases power densities, and thereby temperatures, on the chip. High temperature, in turn, accelerates transistor aging mechanisms, which may ultimately lead to permanent faults on the chip. Fault-tolerance techniques have emerged to ensure reliable system operation despite these concerns. Specifically, fault-tolerance techniques employ some form of redundancy to satisfy specific reliability requirements. However, integrating fault-tolerance techniques into real-time embedded systems makes it harder to preserve timing constraints. As a remedy, many task mapping/scheduling policies have been proposed that account for the integration of fault-tolerance techniques and enforce both timing and reliability guarantees for real-time embedded systems. More advanced techniques additionally aim to minimize power and energy while still satisfying timing and reliability constraints. Recently, some scheduling techniques have started to tackle a new challenge: the temperature increase induced by employing fault-tolerance techniques. These emerging techniques aim to satisfy temperature constraints in addition to timing and reliability constraints. This paper provides an in-depth survey of the emerging research efforts that exploit fault-tolerance techniques while considering timing, power/energy, and temperature from the real-time embedded systems' design perspective. In particular, the task mapping/scheduling policies for fault-tolerant real-time embedded systems are reviewed and classified according to their goals and constraints. Moreover, the employed fault-tolerance techniques, application models, and hardware models are considered as additional dimensions of the presented classification. Lastly, this survey gives deep insights into the main achievements and shortcomings of the existing approaches and highlights the most promising ones

    Automatic synthesis and optimization of chip multiprocessors

    Microprocessor technology has experienced enormous growth during the last decades. Rapid downscaling of CMOS technology has led to higher operating frequencies and performance densities, facing the fundamental issue of power dissipation. Chip Multiprocessors (CMPs) have become the latest paradigm to improve the power-performance efficiency of computing systems by exploiting the parallelism inherent in applications. Industrial and prototype implementations have already demonstrated the benefits achieved by CMPs with hundreds of cores.

    CMP architects are challenged to take many complex design decisions. Only a few of them are:
    - What should be the ratio between the core and cache areas on a chip?
    - Which core architectures to select?
    - How many cache levels should the memory subsystem have?
    - Which interconnect topologies provide efficient on-chip communication?

    These and many other aspects create a complex multidimensional space for architectural exploration. Design Automation tools become essential to make the architectural exploration feasible under hard time-to-market constraints. The exploration methods have to be efficient and scalable to handle future generation on-chip architectures with hundreds or thousands of cores.

    Furthermore, once a CMP has been fabricated, the need for efficient deployment of the many-core processor arises. Intelligent techniques for task mapping and scheduling onto CMPs are necessary to guarantee the full usage of the benefits brought by many-core technology. These techniques have to consider the peculiarities of modern architectures, such as the availability of enhanced power saving techniques and the presence of complex memory hierarchies.

    This thesis has several objectives. The first objective is to elaborate methods for efficient analytical modeling and architectural design space exploration of CMPs. The efficiency is achieved by using analytical models instead of simulation, and by replacing exhaustive exploration with an intelligent search strategy. Additionally, these methods incorporate high-level models for physical planning. The related contributions are described in Chapters 3, 4 and 5 of the document.

    The second objective of this work is to propose a scalable algorithm for mapping tasks onto general-purpose CMPs with power management techniques, for efficient deployment of many-core systems. This contribution is explained in Chapter 6 of this document.

    Finally, the third objective of this thesis is to address the issues of on-chip interconnect design and exploration by developing a model for simultaneous topology customization and deadlock-free routing in Networks-on-Chip. The developed methodology can be applied to various classes of on-chip systems, ranging from general-purpose chip multiprocessors to application-specific solutions. Chapter 7 describes the proposed model.

    The presented methods have been thoroughly tested experimentally and the results are described in this dissertation. At the end of the document several possible directions for future research are proposed

    Performance, Power Modeling and Optimization for High-Performance Computing Systems

    University of Minnesota Ph.D. dissertation. October 2016. Major: Electrical/Computer Engineering. Advisor: John Sartori. 1 computer file (PDF); xi, 154 pages. Heterogeneity abounds in modern high-performance computing systems. Applications are heterogeneous, containing time-varying unbalanced utilization for different resources, and system architectures have become heterogeneous in order to achieve higher levels of performance and energy efficiency. The most powerful, and also the most energy-efficient, high-performance computing systems today consist of many-core CPUs and GPGPUs with a variety of specialized on-chip and off-chip memories. These heterogeneous systems provide a huge amount of computing resources, but it is becoming increasingly challenging to use them effectively and efficiently to maximize their potential. This becomes an even more pressing challenge as energy efficiency becomes the primary barrier to achieving higher levels of performance. This thesis addresses the challenges of performance modeling and optimization in heterogeneous high-performance computing systems. Effective system optimization requires understanding of how performance and power change in response to optimizations. Therefore, we begin by summarizing the impact of modern architectural advances on performance and power modeling for chip multiprocessors (CMPs). We present two models that estimate the performance and power in such systems. The first model, CAMP, is a fast and accurate cache-aware performance model that estimates the performance degradation due to cache contention of processes running on cache-sharing cores. We then propose a system-level power model for a multi-programmed CMP environment that accounts for cache contention. We explain how to integrate the two models to enable power-aware process assignment. 
Then, we propose an off-chip memory access-aware runtime DVFS control technique that minimizes energy consumption subject to a constraint on application execution time. The second part of the dissertation focuses on improving performance for GPGPUs. After a thorough analysis on CPI breakdown, we lay out all the key factors that govern GPU throughput. In order to improve overall performance for GPGPUs, we propose two approaches that address the key factors, without introducing extra congestion and degradation to the system. We first propose a new two-level priority scheduling policy to improve overall performance by optimizing effective degree of parallelism. Then, we propose ICMT, a full, detailed solution for intra-core multitasking for GPGPUs, including architectural support and a contention-aware workload scheduling algorithm that improves all the key factors in a balanced fashion. Furthermore, we propose a new contention-aware analytical performance model that provides fine-grained workload scheduling decisions for intra-core multitasking, including detailed resource allocation from co-scheduled workloads

    Energy-aware scheduling in distributed computing systems

    Distributed computing systems, such as data centers, are key for supporting modern computing demands. However, the energy consumption of data centers has become a major concern over the last decade. Worldwide data center energy consumption in 2012 was estimated to be around 270 TWh, and grim forecasts predict it will quadruple by 2030. Maximizing energy efficiency while also maximizing computing efficiency is a major challenge for modern data centers. This work addresses this challenge by scheduling the operation of modern data centers, considering a multi-objective approach for simultaneously optimizing both efficiency objectives. Multiple data center scenarios are studied, such as scheduling a single data center and scheduling a federation of several geographically distributed data centers. Mathematical models are formulated for each scenario, considering the modeling of their most relevant components such as computing resources, computing workload, cooling system, networking, and green energy generators, among others. A set of accurate heuristic and metaheuristic algorithms is designed for addressing the scheduling problem. These scheduling algorithms are comprehensively studied and compared with each other, using statistical tools to evaluate their efficacy when addressing realistic workloads and scenarios. Experimental results show the designed scheduling algorithms are able to significantly increase the energy efficiency of data centers when compared to traditional scheduling methods, while providing a diverse set of trade-off solutions regarding the computing efficiency of the data center. These results confirm the effectiveness of the proposed algorithmic approaches for data center infrastructures.
    Distributed computing systems, such as data centers, are key to meeting modern computing demand. However, their energy consumption has become a major concern. Their worldwide energy consumption is estimated to have been around 270 TWh in 2012, and some predict that this consumption will quadruple by 2030. Simultaneously maximizing the energy and computing efficiency of data centers is a critical challenge. This thesis addresses that challenge by scheduling data center operation, taking a multi-objective approach to optimize both efficiency objectives simultaneously. The thesis studies multiple variants of the problem, from scheduling a single data center to scheduling a federation of multiple geographically distributed data centers. To this end, mathematical models are formulated for each variant of the problem, modeling its most relevant components, such as computing resources, workload, cooling, networking, green energy, etc. To solve the scheduling problem posed, a set of heuristic and metaheuristic algorithms is designed. These are studied exhaustively and their efficiency is evaluated using a battery of statistical tools. The experimental results show that the designed scheduling algorithms are able to significantly increase the energy efficiency of a data center compared with traditional scheduling methods. In turn, the proposed methods provide a diverse set of solutions with different levels of trade-off regarding the computing efficiency of the data center. These results confirm the effectiveness of the proposed algorithmic approach

    Power and memory optimization techniques in embedded systems design

    Embedded systems incur tight constraints on power consumption and memory (which impacts size) in addition to other constraints such as weight and cost. This dissertation addresses two key factors in embedded system design, namely minimization of power consumption and of memory requirements. The first part of this dissertation considers the problem of optimizing power consumption (peak power as well as average power) in high-level synthesis (HLS). The second part deals with memory usage optimization, mainly targeting a restricted class of computations expressed as loops accessing large data arrays, which arise in scientific computing, for example in the coupled cluster and configuration interaction methods in quantum chemistry. First, a mixed-integer linear programming (MILP) formulation is presented for the scheduling problem in HLS using multiple supply voltages in order to optimize peak power as well as average power and energy consumption. For large designs, the MILP formulation may not be suitable; therefore, a two-phase iterative linear programming formulation and a power-resource-saving heuristic are presented to solve this problem. In addition, a new heuristic that adapts the well-known force-directed scheduling heuristic is presented for the same problem. Next, this work considers the problem of module selection simultaneously with scheduling for minimizing peak and average power consumption. Then, the problem of power consumption (peak and average) in synchronous sequential designs is addressed. A solution integrating basic retiming and multiple-voltage scheduling (MVS) is proposed and evaluated. A two-stage algorithm, namely power-oriented retiming followed by an MVS technique, is presented for peak and/or average power optimization. Memory optimization is addressed next. Dynamic memory usage optimization during the evaluation of a special class of interdependent large data arrays is considered. Finally, this dissertation develops a novel integer-linear programming (ILP) formulation for static memory optimization using the well-known loop fusion technique, encoding the legality rules for loop fusion of a special class of loops as logical constraints over binary decision variables, together with a highly effective approximation of memory usage

    Stochastic Performance Throttling for Multicore Architectures under Spatial and Temporal Dependencies


    Using Imprecise Computing for Improved Real-Time Scheduling

    Conventional hard real-time scheduling is often overly pessimistic due to worst-case execution time estimation. The pessimism can be mitigated by exploiting imprecise computing in applications where occasional small errors are acceptable. This idea has been investigated in a few previous works, which are restricted to preemptive cases. We study how to make use of imprecise computing in uniprocessor non-preemptive real-time scheduling, which is known to be more difficult than its preemptive counterpart. Several heuristic algorithms are developed for periodic tasks with independent or cumulative errors due to imprecision. Simulation results show that the proposed techniques can significantly improve task schedulability and achieve the desired accuracy-schedulability tradeoff. The benefit of considering imprecise computing is further confirmed by a prototype implementation on a Linux system. The mixed-criticality system is a popular model for reducing pessimism in real-time scheduling while providing guarantees for critical tasks in the presence of unexpected overruns. However, it is controversial due to some drawbacks. First, all low-criticality tasks are dropped in high-criticality mode, although they are still needed. Second, a single high-criticality job overrun leads to the pessimistic high-criticality mode for all high-criticality tasks, and consequently resource utilization becomes inefficient. We attempt to tackle these two limitations of mixed-criticality systems simultaneously in multiprocessor scheduling, whereas several recent works address them mostly for uniprocessor scheduling. We study how to achieve graceful degradation of low-criticality tasks by continuing their execution with imprecise computing, or even precise computing if there is sufficient utilization slack. Schedulability conditions under this Variable-Precision Mixed-Criticality (VPMC) system model are investigated for partitioned scheduling and global fpEDF-VD scheduling. A deferred switching protocol is introduced so that the chance of switching to high-criticality mode is significantly reduced. Moreover, we develop a precision optimization approach that maximizes precise computing of low-criticality tasks through a 0-1 knapsack formulation. Experiments are performed through both software simulations and Linux prototyping with consideration of overhead. Schedulability of the proposed methods is studied so that the Quality-of-Service for low-criticality tasks is improved while guaranteeing that all deadline constraints are satisfied. The proposed precision optimization can largely reduce computing errors compared to constantly executing low-criticality tasks with imprecise computing in high-criticality mode
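    The 0-1 knapsack formulation mentioned in the abstract above can be sketched as follows. The abstract does not give the item weights or values, so this sketch assumes that each low-criticality task has an integer extra utilization cost for running precisely (for example, in percent of a processor) and a benefit equal to the error it avoids; `choose_precise` and its tuple layout are hypothetical names introduced here, not the authors' method.

```python
def choose_precise(tasks, slack):
    """Pick the low-criticality tasks to run precisely (standard 0-1 knapsack DP).

    tasks: list of (name, extra_util, benefit), where extra_util is a
           non-negative integer cost (assumed unit: percent utilization).
    slack: integer utilization slack available for precise execution.
    Returns (total_benefit, names_of_chosen_tasks).
    """
    # best[c] holds the best (benefit, chosen names) using at most c slack.
    best = [(0, [])] * (slack + 1)
    for name, extra, benefit in tasks:
        # Iterate capacity downward so each task is selected at most once.
        for c in range(slack, extra - 1, -1):
            candidate = best[c - extra][0] + benefit
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - extra][1] + [name])
    return best[slack]
```

    For instance, `choose_precise([("t1", 10, 5), ("t2", 20, 9), ("t3", 15, 7)], 30)` selects `t1` and `t2`, which together use all 30 units of slack. Tasks left out of the selection would continue in their imprecise version under the model the abstract describes.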