64 research outputs found

    MCSH, a lock with the standard interface

    The MCS lock of Mellor-Crummey and Scott (1991) is a very efficient first-come first-served (FCFS) mutual-exclusion algorithm that uses the atomic hardware primitives fetch-and-store and compare-and-swap. However, it has the disadvantage that the calling thread must provide a pointer to an allocated record. This additional parameter violates the standard locking interface, which has only the lock as a parameter. Hence, it is impossible to switch to MCS without editing and recompiling an application that uses locks. This article provides a variation of MCS with the standard interface, called MCSH, which remains FCFS. One key ingredient is to stack-allocate the necessary record in the acquire procedure of the lock, so that its lifetime spans only the delay to enter a critical section. A second key ingredient is communicating the allocated record between the acquire and release procedures through the lock, to maintain the standard locking interface. Both of these practices are known to practitioners, but our solution combines them in a unique way. Furthermore, when these practices are used in prior papers, their correctness is often argued informally. The correctness of MCSH is verified rigorously with the proof assistant PVS, and experiments are run to compare its performance with MCS and similar locks.

    Correctness and concurrent complexity of the Black-White Bakery Algorithm

    Lamport’s Bakery Algorithm (Commun ACM 17:453–455, 1974) implements mutual exclusion for a fixed number of threads with the first-come first-served property. It has the disadvantage, however, that it uses integer communication variables that can become arbitrarily large. Taubenfeld’s Black-White Bakery Algorithm (Proceedings of the DISC. LNCS, vol 3274, pp 56–70, 2004) keeps the integers bounded, and is adaptive in the sense that the time complexity depends only on the number of competing threads, say N. The present paper offers an assertional proof of correctness and shows that the concurrent complexity for throughput is linear in N, and for individual progress is quadratic in N. This is proved with a bounded version of UNITY, i.e., by assertional means.

    Recoverable, Abortable, and Adaptive Mutual Exclusion with Sublogarithmic RMR Complexity

    We present the first recoverable mutual exclusion (RME) algorithm that is simultaneously abortable, adaptive to point contention, and with sublogarithmic RMR complexity. Our algorithm has O(min(K, log_W N)) RMR passage complexity and O(F + min(K, log_W N)) RMR super-passage complexity, where K is the number of concurrent processes (point contention), W is the size (in bits) of registers, and F is the number of crashes in a super-passage. Under the standard assumption that W = Θ(log N), these bounds translate to worst-case O((log N)/(log log N)) passage complexity and O(F + (log N)/(log log N)) super-passage complexity. Our key building blocks are:
    - A D-process abortable RME algorithm, for D ≤ W, with O(1) passage complexity and O(1+F) super-passage complexity. We obtain this algorithm by using the Fetch-And-Add (FAA) primitive, unlike prior work on RME that uses Fetch-And-Store (FAS/SWAP).
    - A generic transformation that transforms any abortable RME algorithm with passage complexity of B < W into an abortable RME lock with passage complexity of O(min(K, B)).
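
    The step from the log_W N bounds to the (log N)/(log log N) bounds in this abstract can be sketched as a change of logarithm base. This is a reader's reconstruction, not taken from the paper, and it assumes the register-size condition reads W = Θ(log N):

    ```latex
    % Assumption: W = \Theta(\log N), so \log W = \Theta(\log\log N).
    \[
      \log_W N \;=\; \frac{\log N}{\log W}
               \;=\; \frac{\log N}{\Theta(\log\log N)}
               \;=\; \Theta\!\left(\frac{\log N}{\log\log N}\right)
    \]
    % Since \min(K, \log_W N) \le \log_W N for every K, the worst case is
    \[
      O\bigl(\min(K, \log_W N)\bigr) \;\subseteq\; O\!\left(\frac{\log N}{\log\log N}\right),
      \qquad
      O\bigl(F + \min(K, \log_W N)\bigr) \;\subseteq\; O\!\left(F + \frac{\log N}{\log\log N}\right).
    \]
    ```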

    A complexity separation between the cache-coherent and distributed shared memory models

    A comprehensive survey on cooperative intersection management for heterogeneous connected vehicles

    Nowadays, with the advancement of technology, the world is trending toward high mobility and dynamics. In this context, intersection management (IM), as one of the most crucial elements of the transportation sector, demands close attention. Today, road entities including infrastructure, vulnerable road users (VRUs) such as motorcycles, mopeds, scooters, pedestrians, and bicycles, and other types of vehicles such as trucks, buses, cars, emergency vehicles, and railway vehicles like trains or trams are able to communicate cooperatively using vehicle-to-everything (V2X) communications and provide traffic safety, efficiency, infotainment and ecological improvements. In this paper, we take into account different types of intersections, namely signalized, semi-autonomous (hybrid) and autonomous intersections, and conduct a comprehensive survey on various intersection management methods for heterogeneous connected vehicles (CVs). We consider heterogeneous classes of vehicles such as road and rail vehicles as well as VRUs including bicycles, scooters and motorcycles. All kinds of intersection goals, modeling, coordination architectures, and scheduling policies are thoroughly discussed. Signalized and semi-autonomous intersections are assessed with respect to these parameters. We especially focus on autonomous intersection management (AIM) and categorize this section based on four major goals involving safety, efficiency, infotainment and environment. Each intersection goal provides an in-depth investigation of the corresponding literature from the aforementioned perspectives. Moreover, robustness and resiliency of IM are explored from diverse points of view encompassing sensors, information management and sharing, planning universal scheme, heterogeneous collaboration, vehicle classification, quality measurement, external factors, intersection types, localization faults, communication anomalies and channel optimization, synchronization, vehicle dynamics and model mismatch, model uncertainties, recovery, security and privacy.

    A distributed scheduling algorithm for quality of service support in multiaccess networks

    Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (leaves 94-95). This thesis presents a distributed scheduling algorithm for the support of quality of service in multiaccess networks. Unlike most contention-based multiaccess protocols, which offer no quality of service guarantee and suffer the problems of fairness and low throughput at high load, our algorithm provides fairness and bandwidth reservation in an integrated services environment and at the same time achieves high throughput. Moreover, while most reservation-based multiaccess protocols require a centralized scheduler and a separate channel for arbitration, our algorithm is truly distributed in the sense that network nodes coordinate their transmissions only via headers in the packets. We derive theoretical bounds illustrating how our distributed algorithm approximates the optimal centralized algorithm. Simulation results are also presented to justify our claims. By Craig Ian Barrack. S.B. and M.Eng.

    Tournaments for mutual exclusion: verification and concurrent complexity

    Given a mutual exclusion algorithm MXd for d ≥ 2 threads, a mutual exclusion algorithm for N > d threads can be built in a tree of degree d with N leaves, with the critical section at the root of the tree. This tournament solution seems obviously correct and efficient. The present note proves the correctness, and formalizes the efficiency in terms of concurrent complexity by means of Bounded Unity. If the tree is balanced, the throughput is logarithmic in N. If moreover MXd satisfies FCFS (first-come first-served), the worst case individual delay of the tournament algorithm is of order N. This is optimal.

    GPU Resource Optimization and Scheduling for Shared Execution Environments

    General purpose graphics processing units have become a computing workhorse for a variety of data- and compute-intensive applications, from large supercomputing systems for massive data analytics to small, mobile embedded devices for autonomous vehicles. Making effective and efficient use of these processors traditionally relies on extensive programmer expertise to design and develop kernel methods which simultaneously trade off task decomposition and resource exploitation. Often, new architecture designs force code refinements in order to continue to achieve optimal performance. At the same time, not all applications require full utilization of the system to achieve that optimal performance. In this case, the increased capability of new architectures introduces an ever-widening gap between the level of resources necessary for optimal performance and the level necessary to maintain system efficiency. The ability to schedule and execute multiple independent tasks on a GPU, known generally as concurrent kernel execution, enables application programmers and system developers to balance application performance and system efficiency. Various approaches to develop both coarse- and fine-grained scheduling mechanisms to achieve a high degree of resource utilization and improved application performance have been studied. Most of these works focus on mechanisms for the management of compute resources, while a small percentage consider the data transfer channels. In this dissertation, we propose a pragmatic approach to scheduling and managing both types of resources – data transfer and compute – that is transparent to an application programmer and capable of providing near-optimal system performance. Furthermore, the approaches described herein rely on reinforcement learning methods, which enable the scheduling solutions to be flexible to a variety of factors, such as transient application behaviors, changing system designs, and tunable objective functions. Finally, we describe a framework for the practical implementation of learned scheduling policies to achieve high resource utilization and efficient system performance.