1,423 research outputs found

    Preemptive Software Transactional Memory

    Get PDF
    In state-of-the-art Software Transactional Memory (STM) systems, threads carry out the execution of transactions as non-interruptible tasks. Hence, a thread can react to the injection of a higher priority transactional task and take care of its processing only at the end of the currently executed transaction. In this article we pursue a paradigm shift where the execution of an in-memory transaction is carried out as a preemptable task, so that a thread can start processing a higher priority transactional task before finalizing its current transaction. We achieve this goal in an application-transparent manner, by only relying on Operating System facilities we include in our preemptive STM architecture. With our approach we are able to re-evaluate CPU assignment across transactions along a same thread every few tens of microseconds. This is mandatory for an effective priority-aware architecture given the typically finer-grain nature of in-memory transactions compared to their counterpart in database systems. We integrated our preemptive STM architecture with the TinySTM package, and released it as open source. We also provide the results of an experimental assessment of our proposal based on running a port of the TPC-C benchmark to the STM environment

    Enabling preemptive multiprogramming on GPUs

    Get PDF
    GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments from mobile systems to cloud computing. These systems are usually running multiple applications, from one or several users. However GPUs do not provide the support for resource sharing traditionally expected in these scenarios. Thus, such systems are unable to provide key multiprogrammed workload requirements, such as responsiveness, fairness or quality of service. In this paper, we propose a set of hardware extensions that allow GPUs to efficiently support multiprogrammed GPU workloads. We argue for preemptive multitasking and design two preemption mechanisms that can be used to implement GPU scheduling policies. We extend the architecture to allow concurrent execution of GPU kernels from different user processes and implement a scheduling policy that dynamically distributes the GPU cores among concurrently running kernels, according to their priorities. We extend the NVIDIA GK110 (Kepler) like GPU architecture with our proposals and evaluate them on a set of multiprogrammed workloads with up to eight concurrent processes. Our proposals improve execution time of high-priority processes by 15.6x, the average application turnaround time between 1.5x to 2x, and system fairness up to 3.4x.We would like to thank the anonymous reviewers, Alexan- der Veidenbaum, Carlos Villavieja, Lluis Vilanova, Lluc Al- varez, and Marc Jorda on their comments and help improving our work and this paper. This work is supported by Euro- pean Commission through TERAFLUX (FP7-249013), Mont- Blanc (FP7-288777), and RoMoL (GA-321253) projects, NVIDIA through the CUDA Center of Excellence program, Spanish Government through Programa Severo Ochoa (SEV-2011-0067) and Spanish Ministry of Science and Technology through TIN2007-60625 and TIN2012-34557 projects.Peer ReviewedPostprint (author’s final draft

    Flexible Scheduling in Multimedia Kernels: an Overview

    Get PDF
    Current Hard Real-Time (HRT) kernels have their timely behaviour guaranteed on the cost of a rather restrictive use of the available resources. This makes current HRT scheduling techniques inadequate for use in a multimedia environment where we can make a considerable profit by a better and more flexible use of the resources. We will show that we can improve the flexibility and efficiency of multimedia kernels. Therefore we introduce Real Time Transactions (RTT) with Deadline Inheritance policies for a small class of scheduling algorithms and we will evaluate these algorithms for use in a multimedia environmen

    Optimal Selection of Preemption Points to Minimize Preemption Overhead

    Get PDF
    A central issue for verifying the schedulability of hard real-time systems is the correct evaluation of task execution times. These values are significantly influenced by the preemption overhead, which mainly includes the cache related delays and the context switch times introduced by each preemption. Since such an overhead significantly depends on the particular point in the code where preemption takes place, this paper proposes a method for placing suitable preemption points in each task in order to maximize the chances of finding a schedulable solution. In a previous work, we presented a method for the optimal selection of preemption points under the restrictive assumption of a fixed preemption cost, identical for each preemption point. In this paper, we remove such an assumption, exploring a more realistic and complex scenario where the preemption cost varies throughout the task code. Instead of modeling the problem with an integer programming formulation, with exponential worst-case complexity, we derive an optimal algorithm that has a linear time and space complexity. This somewhat surprising result allows selecting the best preemption points even in complex scenarios with a large number of potential preemption locations. Experimental results are also presented to show the effectiveness of the proposed approach in increasing the system schedulability

    Effective And Efficient Preemption Placement For Cache Overhead Minimization In Hard Real-Time Systems

    Get PDF
    Schedulability analysis for real-time systems has been the subject of prominent research over the past several decades. One of the key foundations of schedulability analysis is an accurate worst case execution time (WCET) for each task. In preemption based real-time systems, the CRPD can represent a significant component (up to 44% as documented in research literature) of variability to overall task WCET. Several methods have been employed to calculate CRPD with significant levels of pessimism that may result in a task set erroneously declared as non-schedulable. Furthermore, they do not take into account that CRPD cost is inherently a function of where preemptions actually occur. Our approach for computing CRPD via loaded cache blocks (LCBs) is more accurate in the sense that cache state reflects which cache blocks and the specific program locations where they are reloaded. Limited preemption models attempt to minimize preemption overhead (CRPD) by reducing the number of allowed preemptions and/or allowing preemption at program locations where the CRPD effect is minimized. These algorithms rely heavily on accurate CRPD measurements or estimation models in order to identify an optimal set of preemption points. Our approach improves the effectiveness of limited optimal preemption point placement algorithms by calculating the LCBs for each pair of adjacent preemptions to more accurately model task WCET and maximize schedulability as compared to existing preemption point placement approaches. We utilize dynamic programming technique to develop an optimal preemption point placement algorithm. Lastly, we will demonstrate, using a case study, improved task set schedulability and optimal preemption point placement via our new LCB characterization. We propose a new CRPD metric, called loaded cache blocks (LCB) which accurately characterizes the CRPD a real-time task may be subjected to due to the preemptive execution of higher priority tasks. We show how to integrate our new LCB metric into our newly developed algorithms that automatically place preemption points supporting linear control flow graphs (CFGs) for limited preemption scheduling applications. We extend the derivation of loaded cache blocks (LCB), that was proposed for linear control flow graphs (CFGs) to conditional CFGs. We show how to integrate our revised LCB metric into our newly developed algorithms that automatically place preemption points supporting conditional control flow graphs (CFGs) for limited preemption scheduling applications. For future work, we will verify the correctness of our framework through other measurable physical and hardware constraints. Also, we plan to complete our work on developing a generalized framework that can be seamlessly integrated into real-time schedulability analysis

    A fine-grain time-sharing Time Warp system

    Get PDF
    Although Parallel Discrete Event Simulation (PDES) platforms relying on the Time Warp (optimistic) synchronization protocol already allow for exploiting parallelism, several techniques have been proposed to further favor performance. Among them we can mention optimized approaches for state restore, as well as techniques for load balancing or (dynamically) controlling the speculation degree, the latter being specifically targeted at reducing the incidence of causality errors leading to waste of computation. However, in state of the art Time Warp systems, events’ processing is not preemptable, which may prevent the possibility to promptly react to the injection of higher priority (say lower timestamp) events. Delaying the processing of these events may, in turn, give rise to higher incidence of incorrect speculation. In this article we present the design and realization of a fine-grain time-sharing Time Warp system, to be run on multi-core Linux machines, which makes systematic use of event preemption in order to dynamically reassign the CPU to higher priority events/tasks. Our proposal is based on a truly dual mode execution, application vs platform, which includes a timer-interrupt based support for bringing control back to platform mode for possible CPU reassignment according to very fine grain periods. The latter facility is offered by an ad-hoc timer-interrupt management module for Linux, which we release, together with the overall time-sharing support, within the open source ROOT-Sim platform. An experimental assessment based on the classical PHOLD benchmark and two real world models is presented, which shows how our proposal effectively leads to the reduction of the incidence of causality errors, as compared to traditional Time Warp, especially when running with higher degrees of parallelism

    HeteroCore GPU to exploit TLP-resource diversity

    Get PDF

    Limited Preemptive Scheduling for Real-Time Systems: a Survey

    Get PDF
    The question whether preemptive algorithms are better than nonpreemptive ones for scheduling a set of real-time tasks has been debated for a long time in the research community. In fact, especially under fixed priority systems, each approach has advantages and disadvantages, and no one dominates the other when both predictability and efficiency have to be taken into account in the system design. Recently, limited preemption models have been proposed as a viable alternative between the two extreme cases of fully preemptive and nonpreemptive scheduling. This paper presents a survey of the existing approaches for reducing preemptions and compares them under different metrics, providing both qualitative and quantitative performance evaluations
    • …
    corecore