Fast Nonblocking Persistence for Concurrent Data Structures
We present a fully lock-free variant of our recent Montage system for persistent data structures. The variant, nbMontage, adds persistence to almost any nonblocking concurrent structure without introducing significant overhead or blocking of any kind. Like its predecessor, nbMontage is buffered durably linearizable: it guarantees that the state recovered in the wake of a crash will represent a consistent prefix of pre-crash execution. Unlike its predecessor, nbMontage ensures wait-free progress of the persistence frontier, thereby bounding the number of recent updates that may be lost on a crash, and allowing a thread to force an update of the frontier (i.e., to perform a sync operation) without the risk of blocking. As an extra benefit, the helping mechanism employed by our wait-free sync significantly reduces its latency.
Performance results for nonblocking queues, skip lists, trees, and hash tables rival custom data structures in the literature: they are dramatically faster than those achieved with prior general-purpose systems, and generally within 50% of equivalent non-persistent structures placed in DRAM.
Defining and Verifying Durable Opacity: Correctness for Persistent Software Transactional Memory
Non-volatile memory (NVM), aka persistent memory, is a new paradigm for
memory that preserves its contents even after power loss. The expected ubiquity
of NVM has stimulated interest in the design of novel concepts ensuring
correctness of concurrent programming abstractions in the face of persistency.
So far, this has led to the design of a number of persistent concurrent data
structures, built to satisfy an associated notion of correctness: durable
linearizability.
In this paper, we transfer the principle of durable concurrent correctness to
the area of software transactional memory (STM). Software transactional memory
algorithms allow for concurrent access to shared state. Like linearizability
for concurrent data structures, opacity is the established notion of
correctness for STMs. First, we provide a novel definition of durable opacity
extending opacity to handle crashes and recovery in the context of NVM. Second,
we develop a durably opaque version of an existing STM algorithm, namely the
Transactional Mutex Lock (TML). Third, we design a proof technique for durable
opacity based on refinement between TML and an operational characterisation of
durable opacity by adapting the TMS2 specification. Finally, we apply this
proof technique to show that the durable version of TML is indeed durably
opaque. The correctness proof is mechanized within Isabelle.
Comment: This is the full version of the paper that is to appear in FORTE 2020
(https://www.discotec.org/2020/forte
Persistency semantics of the Intel-x86 architecture
Emerging non-volatile memory (NVM) technologies promise the durability of disks with the performance of RAM. To describe the persistency guarantees of NVM, several memory persistency models have been proposed in the literature. However, the persistency semantics of the ubiquitous x86 architecture remains unexplored to date. To close this gap, we develop the Px86 (‘persistent x86’) model, formalising the persistency semantics of Intel-x86 for the first time. We formulate Px86 both operationally and declaratively, and prove that the two characterisations are equivalent. To demonstrate the application of Px86, we develop two persistent libraries over Px86: a persistent transactional library, and a persistent variant of the Michael–Scott queue. Finally, we encode our declarative Px86 model in Alloy and use it to generate persistency litmus tests automatically.
The Fence Complexity of Persistent Sets
We study the psync complexity of concurrent sets in the non-volatile shared
memory model. Flush instructions are used in non-volatile memory to force
shared state to be written back to non-volatile memory and must typically be
accompanied by the use of expensive fence instructions to enforce ordering
among such flushes. Collectively we refer to a flush and a fence as a psync.
The safety property of strict linearizability forces crashed operations to take
effect before the crash or not take effect at all; the weaker property of
durable linearizability enforces this requirement only for operations that have
completed prior to the crash event. We consider lock-free implementations of
list-based sets and prove two lower bounds. We prove that for any durable
linearizable lock-free set there must exist an execution where some process
must perform at least one redundant psync as part of an update operation. We
introduce an extension to strict linearizability specialized for persistent
sets that we call strict limited effect (SLE) linearizability. SLE
linearizability explicitly ensures that operations do not take effect after a
crash, which better reflects the original intentions of strict linearizability.
We show that it is impossible to implement SLE linearizable lock-free sets in
which read-only (or search) operations do not flush or fence. We undertake an
empirical study of persistent sets that examines various algorithmic design
techniques and the impact of flush instructions in practice. We present
concurrent set algorithms that provide matching upper bounds and rigorously
evaluate them against existing persistent sets to expose the impact of
algorithmic design and safety properties on psync complexity in practice as
well as the cost of recovering the data structure following a system crash.
Persistent Owicki-Gries reasoning: a program logic for reasoning about persistent programs on Intel-x86
The advent of non-volatile memory (NVM) technologies is expected to fundamentally transform how software systems are structured, making the task of correct programming significantly harder. This is because ensuring that memory stores persist in the correct order is challenging, and requires low-level programming to flush the cache at appropriate points. This has in turn resulted in a noticeable verification gap. To address this, we study the verification of NVM programs, and present Persistent Owicki-Gries (POG), the first program logic for reasoning about such programs. We prove the soundness of POG over the recent Intel-x86 model, which formalises the out-of-order persistence of memory stores and the semantics of the Intel cache line flush instructions. We then use POG to verify several programs that interact with NVM.
Infrastructure for Performance Monitoring and Analysis of Systems and Applications
The growth of High Performance Computer (HPC) systems increases the complexity of understanding resource utilization, system management, and performance issues. HPC performance monitoring tools need to collect information at both the application and system levels to yield a complete performance picture. Existing approaches limit the ability of users to do meaningful analysis on an actionable timescale. Efficient infrastructures are required to support large-scale systems performance data analysis for both run-time troubleshooting and post-run processing modes. In this dissertation, we present methods to fill these gaps in the infrastructure for HPC performance monitoring and analysis. First, we enhance the architecture of a monitoring system to integrate streaming analysis capabilities at arbitrary locations within its data collection, transport, and aggregation facilities. Next, we present an approach to streaming collection of application performance data. We integrate these methods with a monitoring system used on large-scale computational platforms. Finally, we present a new approach for constructing durable transactional linked data structures that takes advantage of byte-addressable non-volatile memory technologies. Transactional data structures are building blocks of in-memory databases that are used by HPC monitoring systems to store and retrieve data efficiently. We evaluate the presented approaches on a series of case studies. The experimental results demonstrate the impact of our tools, while keeping the overhead within an acceptable margin.