Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency
Persistent memory provides high-performance data persistence at main memory.
Memory writes need to be performed in strict order to satisfy storage
consistency requirements and enable correct recovery from system crashes.
Unfortunately, adhering to such a strict order significantly degrades system
performance and persistent memory endurance. This paper introduces a new
mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering
requirements at significantly lower performance and endurance loss. LOC
consists of two key techniques. First, Eager Commit eliminates the need to
perform a persistent commit record write within a transaction. We do so by
ensuring that we can determine the status of all committed transactions during
recovery by storing necessary metadata information statically with blocks of
data written to memory. Second, Speculative Persistence relaxes the write
ordering between transactions by allowing writes to be speculatively written to
persistent memory. A speculative write is made visible to software only after
its associated transaction commits. To enable this, our mechanism supports the
tracking of committed transaction ID and multi-versioning in the CPU cache. Our
evaluations show that LOC reduces the average performance overhead of memory
persistence from 66.9% to 34.9% and the memory write traffic overhead from
17.1% to 3.4% on a variety of workloads.

Comment: This paper has been accepted by IEEE Transactions on Parallel and
Distributed Systems.
Weak persistency semantics from the ground up: formalising the persistency semantics of ARMv8 and transactional models
Emerging non-volatile memory (NVM) technologies promise the durability of disks with the performance of volatile memory (RAM). To describe the persistency guarantees of NVM, several memory persistency models have been proposed in the literature. However, the formal persistency semantics of mainstream hardware remains unexplored to date. To close this gap, we present a formal declarative framework for describing concurrency models in the NVM context, and then develop the PARMv8 persistency model as an instance of our framework, formalising the persistency semantics of the ARMv8 architecture for the first time. To facilitate correct persistent programming, we study transactions as a simple abstraction for concurrency and persistency control. We thus develop the PSER (persistent serialisability) persistency model, formalising transactional semantics in the NVM context for the first time, and demonstrate that PSER correctly compiles to PARMv8. This enables programmers to write correct, concurrent and persistent programs without having to understand the low-level, architecture-specific persistency semantics of the underlying hardware.
Architectural support for persistent memory systems
The long stated vision of persistent memory is set to be realized with the release of
3D XPoint memory by Intel and Micron. Persistent memory, as the name suggests,
amalgamates the persistence (non-volatility) property of storage devices (like disks)
with byte-addressability and low latency of memory. These properties of persistent
memory coupled with its accessibility through the processor load/store interface enable
programmers to design in-memory persistent data structures.
An important challenge in designing persistent memory systems is to provide support
for maintaining crash consistency of these in-memory data structures. Crash consistency
is necessary to ensure the correct recovery of program state after a crash. Ordering
is a primitive that can be used to design crash-consistent programs: it provides
guarantees on the order of updates to persistent memory. Atomicity can also be used
to design crash-consistent programs, via two primitives. The first is an atomic
durability primitive, which guarantees that updates are made durable atomically in
the presence of system crashes: either all of the updates become durable or none do.
The second is ACID transactions, which guarantee both atomic visibility and atomic
durability.
Existing systems do not support ordering, let alone atomic durability or ACID
transactions. In fact, these systems implement various performance-enhancing optimizations that
deliberately reorder updates to memory. Moreover, software in these systems cannot
explicitly control the movement of data from volatile cache to persistent memory.
Therefore, any ordering requirement has to be enforced synchronously, which degrades
performance because program execution stalls while waiting for updates to reach persistent
memory. This thesis aims to provide the design principles and efficient implementations
for three crash consistency primitives: ordering, atomic durability and ACID
transactions.
Several persistency models that support the ordering primitive have been proposed
recently. This thesis extends the taxonomy of these models by adding buffering, which
allows the hardware to enforce ordering in the background, as a new layer of
classification. It then goes on to show how the existing implementation of a
buffered model degenerates into a performance-inefficient non-buffered model in the
presence of conflicts, and proposes efficient solutions that eliminate or limit the
impact of these conflicts with minimal hardware modifications. This thesis also proposes
the first implementation of a buffered model for a server-class processor with
multi-banked caches and multiple memory controllers.
Write-ahead logging (WAL) is a commonly used approach to providing atomic durability.
This thesis argues that existing software implementations of WAL are not only
inefficient, because of their fine-grained ordering dependencies, but also waste precious
execution cycles on what is fundamentally a data-movement task. It then proposes
ATOM, a hardware log manager based on undo logging that performs the logging operation
out of the critical path. This thesis presents the design principles behind ATOM
and two techniques that optimize its performance. These techniques enable the memory
controller to enforce the fine-grained ordering required for logging and, in some cases,
even to perform the logging itself. In doing so, ATOM significantly reduces processor stall cycles
and improves performance.
The most commonly used abstraction employed to atomically update persistent
data is that of durable transactions with ACID (Atomicity, Consistency, Isolation
and Durability) semantics that make updates within a transaction both visible and
durable atomically. As a final contribution, this thesis tackles the problem of providing
efficient support for durable transactions in hardware by integrating hardware
support for atomic durability with hardware transactional memory (HTM). It proposes
DHTM (durable hardware transactional memory) in which durability is considered as
a first class design constraint. DHTM guarantees atomic durability via hardware redo-logging,
and integrates this logging support with a commercial HTM to provide atomic
visibility. Furthermore, DHTM leverages the same logging infrastructure to extend the
supported transaction size from L1-limited to LLC-limited, with minor changes to
the coherence protocol.