150 research outputs found

RCU Semantics: A First Attempt

Author: McKenney Paul E.
Walpole Jonathan
Publication venue: PDXScholar
Publication date: 01/01/2005

There is not yet a formal statement of RCU (read-copy update) semantics. While this lack has thus far not been an impediment to adoption and use of RCU, it is quite possible that formal semantics would point the way towards tools that automatically validate uses of RCU or that permit RCU algorithms to be automatically generated by a parallel compiler. This paper is a first attempt to supply a formal definition of RCU. Or at least a semi-formal definition: although RCU does not yet wear a tux (though it does run in Linux), at least it might yet wear some clothes

PDXScholar (Portland State University)

The acceptability of promotional policies to teachers and administrators in five selected communities.

Author: Dodge Edmund E.
Lund Marion G.
McKenney Helen F.
Ruthman Paul E.
Scott Ronald P.
Publication venue: Boston University
Publication date: 01/01/1953

Thesis (Ed.M.)--Boston University

Boston University Institutional Repository (OpenBU)

Resizable, Scalable, Concurrent Hash Tables

Author: McKenney Paul E.
Triplett Josh
Walpole Jonathan
Publication venue: PDXScholar
Publication date: 01/06/2011

We present algorithms for shrinking and expanding a hash table while allowing concurrent, wait-free, linearly scalable lookups. These resize algorithms allow the hash table to maintain constant-time performance as the number of entries grows, and reclaim memory as the number of entries decreases, without delaying or disrupting readers. We implemented our algorithms in the Linux kernel, to test their performance and scalability. Benchmarks show lookup scalability improved 125x over reader-writer locking, and 56% over the current state-of-the-art for Linux, with no performance degradation for lookups during a resize. To achieve this performance, this hash table implementation uses a new concurrent programming methodology known as relativistic programming. In particular, we make use of an existing synchronization primitive which waits for all current readers to finish, with little to no reader overhead; careful use of this primitive allows ordering of updates without read-side synchronization or memory barriers

PDXScholar (Portland State University)

When Do Real Time Systems Need Multiple CPUs?

Author: Paul E Mckenney
Publication date: 24/04/2020

Until recently, real-time systems were always single-CPU systems. The prospect of multiprocessing has arrived with the advent of low-cost and readily available multi-core systems. Now many RTOSes, perhaps most notably Linux™, provide real-time response on multiprocessor systems. However, this begs the question as to whether your real-time application should avail itself of parallelism. Furthermore, if the answer is "yes," the next question is what form of parallelism your application should avail itself of: shared memory parallelism with locking and threads, process pipelines, multiple cooperating processes, or one of a number of other approaches. This paper will examine these questions, providing rules of thumb to help you choose whether your real-time application should be parallel, and, if so, what sort of parallelism is best for you

CiteSeerX

Universal Wait-Free Memory Reclamation

Author: Evans Jason
McKenney Paul E.
Treiber R. K.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/01/2020

In this paper, we present a universal memory reclamation scheme, Wait-Free Eras (WFE), for deleted memory blocks in wait-free concurrent data structures. WFE's key innovation is that it is completely wait-free. Although some prior techniques provide similar guarantees for certain data structures, they lack support for arbitrary wait-free data structures. Consequently, developers are typically forced to marry their wait-free data structures with lock-free Hazard Pointers or (potentially blocking) epoch-based memory reclamation. Since both these schemes provide weaker progress guarantees, they essentially forfeit the strong progress guarantee of wait-free data structures. Though making the original Hazard Pointers scheme or epoch-based reclamation completely wait-free seems infeasible, we achieved this goal with a more recent, (lock-free) Hazard Eras scheme, which we extend to guarantee wait-freedom. As this extension is non-trivial, we discuss all challenges pertaining to the construction of universal wait-free memory reclamation. WFE is implementable on ubiquitous x86_64 and AArch64 (ARM) architectures. Its API is mostly compatible with Hazard Pointers, which allows easy transitioning of existing data structures into WFE. Our experimental evaluations show that WFE's performance is close to epoch-based reclamation and almost matches the original Hazard Eras scheme, while providing the stronger wait-free progress guarantee

arXiv.org e-Print Archive

Crossref

Position estimation

Author: McKenney Paul E.
Publication venue: Oregon State University. Department of Computer Science

Acoustic position estimation is used where high accuracy navigation is required over a small area, such as for searching or for collecting gravitational or geomagnetic data. In this position estimation method, a surface ship or submersible periodically sends out a high-frequency acoustic 'ping' at a prearranged frequency. This ping is received by an array of transponders attached to the ocean floor; each of these transponders 'replies' to the ping with another ping at its own prearranged frequency. The ship records the times elapsed from when it sent out its ping to when it received each of the transponders' replies. The ship can then convert these elapsed 'round trip times' into distances, and can compute its position relative to the transponder array. However, the relative positions of the transponders must be known. If the ship had some way of accurately determining its position when it deployed the transponders, it would not have needed to deploy them in the first place (since the only purpose of the transponders is to determine the ship's position accurately). Furthermore, even if the ship did know its position when it deployed a given transponder, there are many forces (such as ocean currents) that would prevent the transponder from descending exactly straight down. This paper presents an algorithm that can determine the relative positions of the transponders in an array from acoustic measurements collected by the ship. This algorithm makes use of a second-order Newton's method with exact linesearch to minimize an error function whose domain is the set of coordinates of all the transponder and ship positions involved in the acoustic measurements. This algorithm will be qualitatively compared to an older algorithm that has been used to solve the problem

ScholarsArchive@OSU

Concurrent Search Data Structures Can Be Blocking and Practically Wait-Free

Author: Abramson Morton
Intel
John
Kivity Avi
McKenney Paul E.
Timothy
Uhlig Volkmar
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/09/2016

We argue that there is virtually no practical situation in which one should seek a "theoretically wait-free" algorithm at the expense of a state-of-the-art blocking algorithm in the case of search data structures: blocking algorithms are simple, fast, and can be made "practically wait-free". We draw this conclusion based on the most exhaustive study of blocking search data structures to date. We consider (a) different search data structures of different sizes, (b) numerous uniform and non-uniform workloads, representative of a wide range of practical scenarios, with different percentages of update operations, (c) with and without delayed threads, (d) on different hardware technologies, including processors providing HTM instructions. We explain our claim that blocking search data structures are practically wait-free through an analogy with the birthday paradox, revealing that, in state-of-the-art algorithms implementing such data structures, the probability of conflicts is extremely small. When conflicts occur as a result of context switches and interrupts, we show that HTM-based locks enable blocking algorithms to cope with the

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Regular Topologies for Gigabit Wide-Area Networks: Congestion Avoidance Testbed Experiments

Author: Denny Barbara A.
Lee Danny
McKenney Paul E., Sr.

This document is Volume 3 of the final technical report on the work performed by SRI International (SRI) on SRI Project 8600. The document includes source listings for all software developed by SRI under this effort. Since some of our work involved the use of ST-II and the Sun Microsystems, Inc. (Sun) High-Speed Serial Interface (HSI/S) driver, we have included some of the source developed by LBL and BBN as well. In most cases, our decision to include source developed by other contractors depended on whether it was necessary to modify the original code. If we have modified the software in any way, it is included in this document. In the case of the Traffic Generator (TG), however, we have included all the ST-II software, even though BBN performed the integration, because the ST-II software is part of the standard TG release. It is important to note that all the code developed by other contractors is in the public domain, so that all software developed under this effort can be re-created from the source included here

NASA Technical Reports Server

Congestion Avoidance Testbed Experiments

Author: Denny Barbara A.
Lee Danny
Lee Diane S.
McKenney Paul E., Sr.

DARTnet provides an excellent environment for executing networking experiments. Since the network is private and spans the continental United States, it gives researchers a great opportunity to test network behavior under controlled conditions. However, this opportunity is not available very often, and therefore a support environment for such testing is lacking. To help remedy this situation, part of SRI's effort in this project was devoted to advancing the state of the art in the techniques used for benchmarking network performance. The second objective of SRI's effort in this project was to advance networking technology in the area of traffic control, and to test our ideas on DARTnet, using the tools we developed to improve benchmarking networks. Networks are becoming more common and are being used by more and more people. The applications, such as multimedia conferencing and distributed simulations, are also placing greater demand on the resources the networks provide. Hence, new mechanisms for traffic control must be created to enable their networks to serve the needs of their users. SRI's objective, therefore, was to investigate a new queueing and scheduling approach that will help to meet the needs of a large, diverse user population in a "fair" way

NASA Technical Reports Server

Asynchronized Concurrency: The Secret to Scaling Concurrent Search Data Structures

Author: Arcangeli Andrea
Boyd-Wickizer Silas
David Tudor
Fan Bin
Herlihy Maurice
Intel
McKenney Paul E
Nishtala Rajesh
Timothy
Triplett Josh
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/04/2015

We introduce "asynchronized concurrency (ASCY)," a paradigm consisting of four complementary programming patterns. ASCY calls for the design of concurrent search data structures (CSDSs) to resemble that of their sequential counterparts. We argue that ASCY leads to implementations which are portably scalable: they scale across different types of hardware platforms, including single and multi-socket ones, for various classes of workloads, such as read-only and read-write, and according to different performance metrics, including throughput, latency, and energy. We substantiate our thesis through the most exhaustive evaluation of CSDSs to date, involving 6 platforms, 22 state-of-the-art CSDS algorithms, 10 re-engineered state-of-the-art CSDS algorithms following the ASCY patterns, and 2 new CSDS algorithms designed with ASCY in mind. We observe up to 30% improvements in throughput in the re-engineered algorithms, while our new algorithms out-perform the state-of-the-art alternatives

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Crossref