920 research outputs found

    Smartlocks: Self-Aware Synchronization through Lock Acquisition Scheduling

    Get PDF
    As multicore processors become increasingly prevalent, system complexity is skyrocketing. The advent of the asymmetric multicore compounds this -- it is no longer practical for an average programmer to balance the system constraints associated with today's multicores and worry about new problems like asymmetric partitioning and thread interference. Adaptive, or self-aware, computing has been proposed as one method to help application and system programmers confront this complexity. These systems take some of the burden off of programmers by monitoring themselves and optimizing or adapting to meet their goals. This paper introduces an open-source self-aware synchronization library for multicores and asymmetric multicores called Smartlocks. Smartlocks is a spin-lock library that adapts its internal implementation during execution using heuristics and machine learning to optimize toward a user-defined goal, which may relate to performance, power, or other problem-specific criteria. Smartlocks builds upon adaptation techniques from prior work like reactive locks, but introduces a novel form of adaptation designed for asymmetric multicores that we term lock acquisition scheduling. Lock acquisition scheduling is optimizing which waiter will get the lock next for the best long-term effect when multiple threads (or processes) are spinning for a lock. Our results demonstrate empirically that lock scheduling is important for asymmetric multicores and that Smartlocks significantly outperform conventional and reactive locks for asymmetries like dynamic variations in processor clock frequencies caused by thermal throttling events

    A Novel Approach to Multiagent based Scheduling for Multicore Architecture

    Get PDF
    In a Multicore architecture, eachpackage consists of large number of processors. Thisincrease in processor core brings new evolution inparallel computing. Besides enormous performanceenhancement, this multicore package injects lot ofchallenges and opportunities on the operating systemscheduling point of view. We know that multiagentsystem is concerned with the development andanalysis of optimization problems. The main objectiveof multiagent system is to invent some methodologiesthat make the developer to build complex systems thatcan be used to solve sophisticated problems. This isdifficult for an individual agent to solve. In this paperwe combine the AMAS theory of multiagent systemwith the scheduler of operating system to develop anew process scheduling algorithm for multicorearchitecture. This multiagent based schedulingalgorithm promises in minimizing the average waitingtime of the processes in the centralized queue and alsoreduces the task of the scheduler. We actuallymodified and simulated the linux 2.6.11 kernel processscheduler to incorporate the multiagent systemconcept. The comparison is made for different numberof cores with multiple combinations of process and theresults are shown for average waiting time Vs numberof cores in the centralized queue

    Synthesis of application specific processor architectures for ultra-low energy consumption

    No full text
    In this paper we suggest that further energy savings can be achieved by a new approach to synthesis of embedded processor cores, where the architecture is tailored to the algorithms that the core executes. In the context of embedded processor synthesis, both single-core and many-core, the types of algorithms and demands on the execution efficiency are usually known at the chip design time. This knowledge can be utilised at the design stage to synthesise architectures optimised for energy consumption. Firstly, we present an overview of both traditional energy saving techniques and new developments in architectural approaches to energy-efficient processing. Secondly, we propose a picoMIPS architecture that serves as an architectural template for energy-efficient synthesis. As a case study, we show how the picoMIPS architecture can be tailored to an energy efficient execution of the DCT algorithm

    Intelligent Management of Mobile Systems through Computational Self-Awareness

    Full text link
    Runtime resource management for many-core systems is increasingly complex. The complexity can be due to diverse workload characteristics with conflicting demands, or limited shared resources such as memory bandwidth and power. Resource management strategies for many-core systems must distribute shared resource(s) appropriately across workloads, while coordinating the high-level system goals at runtime in a scalable and robust manner. To address the complexity of dynamic resource management in many-core systems, state-of-the-art techniques that use heuristics have been proposed. These methods lack the formalism in providing robustness against unexpected runtime behavior. One of the common solutions for this problem is to deploy classical control approaches with bounds and formal guarantees. Traditional control theoretic methods lack the ability to adapt to (1) changing goals at runtime (i.e., self-adaptivity), and (2) changing dynamics of the modeled system (i.e., self-optimization). In this chapter, we explore adaptive resource management techniques that provide self-optimization and self-adaptivity by employing principles of computational self-awareness, specifically reflection. By supporting these self-awareness properties, the system can reason about the actions it takes by considering the significance of competing objectives, user requirements, and operating conditions while executing unpredictable workloads

    Reinforcement learning based multi core scheduling (RLBMCS) for real time systems

    Get PDF
    Embedded systems with multi core processors are increasingly popular because of the diversity of applications that can be run on it. In this work, a reinforcement learning based scheduling method is proposed to handle the real time tasks in multi core systems with effective CPU usage and lower response time. The priority of the tasks is varied dynamically to ensure fairness with reinforcement learning based priority assignment and Multi Core MultiLevel Feedback queue (MCMLFQ) to manage the task execution in multi core system

    Heterogeneity-aware scheduling and data partitioning for system performance acceleration

    Get PDF
    Over the past decade, heterogeneous processors and accelerators have become increasingly prevalent in modern computing systems. Compared with previous homogeneous parallel machines, the hardware heterogeneity in modern systems provides new opportunities and challenges for performance acceleration. Classic operating systems optimisation problems such as task scheduling, and application-specific optimisation techniques such as the adaptive data partitioning of parallel algorithms, are both required to work together to address hardware heterogeneity. Significant effort has been invested in this problem, but either focuses on a specific type of heterogeneous systems or algorithm, or a high-level framework without insight into the difference in heterogeneity between different types of system. A general software framework is required, which can not only be adapted to multiple types of systems and workloads, but is also equipped with the techniques to address a variety of hardware heterogeneity. This thesis presents approaches to design general heterogeneity-aware software frameworks for system performance acceleration. It covers a wide variety of systems, including an OS scheduler targeting on-chip asymmetric multi-core processors (AMPs) on mobile devices, a hierarchical many-core supercomputer and multi-FPGA systems for high performance computing (HPC) centers. Considering heterogeneity from on-chip AMPs, such as thread criticality, core sensitivity, and relative fairness, it suggests a collaborative based approach to co-design the task selector and core allocator on OS scheduler. Considering the typical sources of heterogeneity in HPC systems, such as the memory hierarchy, bandwidth limitations and asymmetric physical connection, it proposes an application-specific automatic data partitioning method for a modern supercomputer, and a topological-ranking heuristic based schedule for a multi-FPGA based reconfigurable cluster. Experiments on both a full system simulator (GEM5) and real systems (Sunway Taihulight Supercomputer and Xilinx Multi-FPGA based clusters) demonstrate the significant advantages of the suggested approaches compared against the state-of-the-art on variety of workloads."This work is supported by St Leonards 7th Century Scholarship and Computer Science PhD funding from University of St Andrews; by UK EPSRC grant Discovery: Pattern Discovery and Program Shaping for Manycore Systems (EP/P020631/1)." -- Acknowledgement

    Performance Controlled Power Optimization for Virtualized Internet Datacenters

    Get PDF
    Modern data centers must provide performance assurance for complex system software such as web applications. In addition, the power consumption of data centers needs to be minimized to reduce operating costs and avoid system overheating. In recent years, more and more data centers start to adopt server virtualization strategies for resource sharing to reduce hardware and operating costs by consolidating applications previously running on multiple physical servers onto a single physical server. In this dissertation, several power efficient algorithms are proposed to effectively reduce server power consumption while achieving the required application-level performance for virtualized servers. First, at the server level this dissertation proposes two control solutions based on dynamic voltage and frequency scaling (DVFS) technology and request batching technology. The two solutions share a performance balancing technique that maintains performance balancing among all virtual machines so that they can have approximately the same performance level relative to their allowed peak values. Then, when the workload intensity is light, we adopt the request batching technology by using a controller to determine the time length for periodically batching incoming requests and putting the processor into sleep mode. When the workload intensity changes from light to moderate, request batching is automatically switched to DVFS to increase the processor frequency for performance guarantees. Second, at the datacenter level, this dissertation proposes a performance-controlled power optimization solution for virtualized server clusters with multi-tier applications. The solution utilizes both DVFS and server consolidation strategies for maximized power savings by integrating feedback control with optimization strategies. At the application level, a multi-input-multi-output controller is designed to achieve the desired performance for applications spanning multiple VMs, on a short time scale, by reallocating the CPU resources and DVFS. At the cluster level, a power optimizer is proposed to incrementally consolidate VMs onto the most power-efficient servers on a longer time scale. Finally, this dissertation proposes a VM scheduling algorithm that exploits core performance heterogeneity to optimize the overall system energy efficiency. The four algorithms at the three different levels are demonstrated with empirical results on hardware testbeds and trace-driven simulations and compared against state-of-the-art baselines
    corecore