233 research outputs found

    Decoupling contention management from scheduling

    Many parallel applications exhibit unpredictable communication between threads, leading to contention for shared objects. The choice of contention management strategy strongly impacts the performance and scalability of these applications: spinning provides maximum performance but wastes significant processor resources, while blocking-based approaches conserve processor resources but introduce high overheads on the critical path of computation. Under high or changing load, the operating system complicates matters further with arbitrary scheduling decisions that often preempt lock holders, leading to long serialization delays until the preempted thread resumes execution. We observe that contention management is orthogonal to the problems of scheduling and load management and propose to decouple them so that each may be solved independently and effectively. To this end, we propose a load control mechanism that manages the number of active threads in the system separately from any contention that may exist. By isolating contention management from damaging interactions with the OS scheduler, we combine the efficiency of spinning with the robustness of blocking. The proposed load control mechanism results in stable, high performance for both lightly and heavily loaded systems, requires no special privileges or modifications at the OS level, and can be implemented as a library that benefits existing code.
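
    The decoupling described in the abstract can be pictured as two separate mechanisms: a plain spinlock for contention and an independent admission gate that bounds how many threads are active at once. The C sketch below only illustrates that idea under assumed parameters (MAX_ACTIVE, the function names); it is not the paper's library.

        /* Minimal sketch: contention handled by pure spinning, load handled
         * separately by an admission semaphore that caps active threads.
         * MAX_ACTIVE and the function names are illustrative assumptions. */
        #include <semaphore.h>
        #include <stdatomic.h>

        #define MAX_ACTIVE 8                          /* assumed cap on active threads */

        static sem_t load_ctl;                        /* load control, independent of the lock */
        static atomic_flag guard = ATOMIC_FLAG_INIT;  /* contention: spin only */

        void contention_lib_init(void) { sem_init(&load_ctl, 0, MAX_ACTIVE); }

        void run_critical(void (*work)(void *), void *arg)
        {
            sem_wait(&load_ctl);                      /* admission: join the active set */
            while (atomic_flag_test_and_set_explicit(&guard, memory_order_acquire))
                ;                                     /* spin; the scheduler is not involved */
            work(arg);                                /* protected work */
            atomic_flag_clear_explicit(&guard, memory_order_release);
            sem_post(&load_ctl);                      /* leave the active set */
        }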

    Task Activity Vectors: A Novel Metric for Temperature-Aware and Energy-Efficient Scheduling

    This thesis introduces the abstraction of the task activity vector to characterize applications by the processor resources they utilize. Based on activity vectors, the thesis presents scheduling policies for improving the temperature distribution on the processor chip and for increasing energy efficiency by reducing contention for the shared resources of multicore and multithreaded processors.
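
    As a rough illustration of the vector idea (the resource categories and the placement rule below are assumptions, not the thesis's policies), a task's activity vector can be compared against each core's aggregate vector and the task placed where the overlap is smallest:

        /* Sketch: place a task on the core whose current activity vector
         * overlaps it least (smallest dot product), to spread contention
         * for shared resources. NRES and the categories are assumed. */
        #include <float.h>
        #include <stddef.h>

        #define NRES 4   /* e.g. ALU, FPU, cache bandwidth, memory bandwidth */

        typedef struct { double v[NRES]; } activity_vec;

        static double overlap(const activity_vec *a, const activity_vec *b)
        {
            double s = 0.0;
            for (size_t i = 0; i < NRES; i++)
                s += a->v[i] * b->v[i];
            return s;
        }

        int place_task(const activity_vec *task,
                       const activity_vec core_load[], int ncores)
        {
            int best = 0;
            double best_overlap = DBL_MAX;
            for (int c = 0; c < ncores; c++) {
                double o = overlap(task, &core_load[c]);
                if (o < best_overlap) { best_overlap = o; best = c; }
            }
            return best;
        }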

    Fewer Cores, More Hertz: Leveraging High-Frequency Cores in the OS Scheduler for Improved Application Performance

    In modern server CPUs, individual cores can run at different frequencies, which allows for fine-grained control of the performance/energy tradeoff. Adjusting the frequency, however, incurs a high latency. We find that this can lead to a problem of frequency inversion, whereby the Linux scheduler places a newly active thread on an idle core that takes dozens to hundreds of milliseconds to reach a high frequency, just before another core already running at a high frequency becomes idle. In this paper, we first illustrate the significant performance overhead of repeated frequency inversion through a case study of scheduler behavior during the compilation of the Linux kernel on an 80-core Intel Xeon-based machine. Following this, we propose two strategies to reduce the likelihood of frequency inversion in the Linux scheduler. When benchmarked over 60 diverse applications on the Intel Xeon, the better-performing strategy, Smove, improves performance by more than 5% (at most 56%, with no energy overhead) for 23 applications, and worsens performance by more than 5% (at most 8%) for only 3 applications. On a 4-core AMD Ryzen we obtain performance improvements of up to 56%.
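
    The frequency-inversion problem lends itself to a simple user-space illustration: among idle cores, prefer one that is already clocked high, reading current frequencies from the cpufreq sysfs files. This sketch is not the paper's in-kernel Smove strategy, and idle-core detection is assumed to be done elsewhere.

        /* Sketch: pick, among the given idle cores, the one currently running
         * at the highest frequency, using /sys/.../cpufreq/scaling_cur_freq. */
        #include <stdio.h>

        static long current_freq_khz(int cpu)     /* returns 0 on failure */
        {
            char path[128];
            long khz = 0;
            snprintf(path, sizeof path,
                     "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq", cpu);
            FILE *f = fopen(path, "r");
            if (f) {
                if (fscanf(f, "%ld", &khz) != 1)
                    khz = 0;
                fclose(f);
            }
            return khz;
        }

        int pick_fastest_idle_core(const int *idle_cores, int n)
        {
            int best = idle_cores[0];
            long best_khz = current_freq_khz(best);
            for (int i = 1; i < n; i++) {
                long khz = current_freq_khz(idle_cores[i]);
                if (khz > best_khz) { best_khz = khz; best = idle_cores[i]; }
            }
            return best;
        }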

    Energy Demand Response for High-Performance Computing Systems

    The growing computational demand of scientific applications has greatly motivated the development of large-scale high-performance computing (HPC) systems in the past decade. To accommodate the increasing demand of applications, HPC systems have been going through dramatic architectural changes (e.g., the introduction of many-core and multi-core systems and the rapid growth of complex interconnection networks for efficient communication between thousands of nodes), as well as a significant increase in size (e.g., modern supercomputers consist of hundreds of thousands of nodes). With such changes in architecture and size, the energy consumption of these systems has increased significantly. With the advent of exascale supercomputers in the next few years, power consumption of HPC systems will surely increase; some systems may even consume hundreds of megawatts of electricity. Demand response programs are designed to help energy service providers stabilize the power system by reducing the energy consumption of participating systems during periods of high power demand or temporary shortage in power supply. This dissertation focuses on developing energy-efficient demand-response models and algorithms to enable HPC systems' demand response participation. In the first part, we present interconnection network models for performance prediction of large-scale HPC applications. They are based on interconnection topologies widely used in HPC systems: dragonfly, torus, and fat-tree. Our interconnect models are fully integrated with an implementation of the message-passing interface (MPI) that can mimic most of its functions with packet-level accuracy. Extensive experiments show that our integrated models provide good accuracy for predicting network behavior, while at the same time allowing for good parallel scaling performance. In the second part, we present an energy-efficient demand-response model to reduce HPC systems' energy consumption during demand response periods. We propose HPC job scheduling and resource provisioning schemes to enable HPC systems' emergency demand response participation. In the final part, we propose an economic demand-response model that allows both the HPC operator and HPC users to jointly reduce the HPC system's energy cost. Our proposed model allows the participation of HPC systems in economic demand-response programs through a contract-based rewarding scheme that can incentivize HPC users to participate in demand response.
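
    One way to picture the scheduling side of demand response (the structures and budget fields below are illustrative assumptions, not the dissertation's models) is an admission test that only starts a queued job if its estimated power fits under the temporarily reduced budget:

        /* Sketch: during a demand-response event, admit a job only if its
         * estimated power keeps the system under the reduced power cap. */
        #include <stdbool.h>

        typedef struct {
            double est_power_kw;      /* estimated power draw of the job */
        } job_t;

        typedef struct {
            double power_in_use_kw;   /* power drawn by running jobs */
            double normal_cap_kw;     /* normal facility power budget */
            double dr_cap_kw;         /* reduced budget during demand response */
            bool   dr_active;         /* demand-response event in progress? */
        } system_state_t;

        bool may_start(const job_t *job, const system_state_t *sys)
        {
            double cap = sys->dr_active ? sys->dr_cap_kw : sys->normal_cap_kw;
            return sys->power_in_use_kw + job->est_power_kw <= cap;
        }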

    Co-Simulation of Cyber-Physical System with Distributed Embedded Control


    Real time control in Linux

    In this thesis, approaches to achieving real-time control on the Linux platform are presented and four different real-time control applications are discussed. A driver for the ADS 12 data acquisition card is programmed to extend the hardware supported by COMEDI, through which the connection between the computer and DAQ boards is built. A simple project combining RTAI (Real Time Application Interface) with COMEDI is introduced, together with a discussion of one SISO (Single-Input Single-Output) control project and two SIMO (Single-Input Multi-Output) control projects based on different controllers; RTLab is selected to provide real-time functionality, as it combines COMEDI with RTAI or RTLinux well under Linux. Furthermore, to enhance the observability and maneuverability of RTLab, additional custom plugin graphic windows have been made for every application in the project.
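
    For readers unfamiliar with the pattern, a periodic control loop of the kind such projects build can be sketched on stock Linux as follows; this is a generic illustration, not the thesis's RTAI/RTLab code, and read_sensor()/write_actuator() are stubs standing in for the COMEDI data calls.

        /* Sketch: a fixed-period control loop under SCHED_FIFO, woken by
         * clock_nanosleep with an absolute deadline. Period, priority, and
         * the controller gain are assumed values. */
        #define _GNU_SOURCE
        #include <sched.h>
        #include <time.h>

        #define PERIOD_NS 1000000L                          /* 1 ms period (assumed) */

        static double read_sensor(void)  { return 0.0; }    /* stub for a DAQ read  */
        static void   write_actuator(double u) { (void)u; } /* stub for a DAQ write */

        void control_loop(void)
        {
            struct sched_param sp = { .sched_priority = 80 };
            sched_setscheduler(0, SCHED_FIFO, &sp);         /* real-time priority */

            struct timespec next;
            clock_gettime(CLOCK_MONOTONIC, &next);
            for (;;) {
                write_actuator(-0.5 * read_sensor());       /* toy proportional control */
                next.tv_nsec += PERIOD_NS;                  /* next absolute deadline */
                if (next.tv_nsec >= 1000000000L) { next.tv_nsec -= 1000000000L; next.tv_sec++; }
                clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            }
        }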

    A survey of design techniques for system-level dynamic power management


    Lottery and stride scheduling : flexible proportional-share resource management

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. By Carl A. Waldspurger. Includes bibliographical references (p. 145-151).