WiscSort: External Sorting For Byte-Addressable Storage
We present WiscSort, a new approach to high-performance concurrent sorting
for existing and future byte-addressable storage (BAS) devices. WiscSort
carefully reduces writes, exploits random reads by splitting keys and values
during sorting, and performs interference-aware scheduling with thread pool
sizing to avoid I/O bandwidth degradation. We introduce the BRAID model, which
encompasses the unique characteristics of BAS devices. Many state-of-the-art
sorting systems do not comply with the BRAID model and deliver sub-optimal
performance, whereas WiscSort demonstrates the effectiveness of complying with
BRAID. We show that WiscSort is 2-7x faster than competing approaches on a
standard sort benchmark. We evaluate the effectiveness of key-value separation
on different key-value sizes and compare our concurrency optimizations with
various other concurrency models. Finally, we emulate generic BAS devices and
show how our techniques perform well with various combinations of hardware
properties.
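
The key-value separation mentioned in this abstract can be illustrated with a minimal sketch. This is not WiscSort's implementation: the fixed record layout (`RECORD_SIZE`, `KEY_SIZE`), the in-memory `bytes` object standing in for a BAS device, and the function name are all assumptions made for illustration. The idea shown is only that keys and offsets are sorted instead of whole records, and each value is then fetched by one random read and written once.

```python
# Hypothetical layout: fixed-width records with the key at the front.
KEY_SIZE = 10       # assumed key width in bytes
RECORD_SIZE = 100   # assumed record width in bytes (key + value)


def sort_with_key_value_separation(data: bytes) -> bytes:
    """Sort fixed-width records by key while moving each value only once."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("input is not a whole number of records")
    count = len(data) // RECORD_SIZE

    # Pass 1: read only the keys and remember where each record lives.
    index = [
        (data[i * RECORD_SIZE : i * RECORD_SIZE + KEY_SIZE], i * RECORD_SIZE)
        for i in range(count)
    ]

    # Sort the small (key, offset) pairs instead of the full records.
    index.sort()

    # Pass 2: gather records in key order with random reads, writing each once.
    out = bytearray()
    for _, offset in index:
        out += data[offset : offset + RECORD_SIZE]
    return bytes(out)
```

The sketch leaves out everything that makes the problem hard on real devices, such as inputs larger than memory, concurrency, and interference-aware scheduling.
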
Instant restore after a media failure
Media failures usually leave database systems unavailable for several hours
until recovery is complete, especially in applications with large devices and
high transaction volume. Previous work introduced a technique called
single-pass restore, which increases restore bandwidth and thus substantially
decreases time to repair. Instant restore goes further as it permits read/write
access to any data on a device undergoing restore--even data not yet
restored--by restoring individual data segments on demand. Thus, the restore
process is guided primarily by the needs of applications, and the observed mean
time to repair is effectively reduced from several hours to a few seconds.
This paper presents an implementation and evaluation of instant restore. The
technique is incrementally implemented on a system starting with the
traditional ARIES design for logging and recovery. Experiments show that the
transaction latency perceived after a media failure can be cut down to less
than a second and that the overhead imposed by the technique on normal
processing is minimal. The net effect is that a few "nines" of availability are
added to the system using simple and low-overhead software techniques.
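
The on-demand restore of individual segments described in this abstract can be sketched as follows. This is a toy model, not the paper's ARIES-based implementation: the class `RestoringDevice`, the dictionary-based backup, and the flat list of log records are hypothetical stand-ins chosen for illustration. It shows only the control flow in which an access to a segment that has not yet been restored triggers the restore of that segment first, while a background pass handles the rest.

```python
class RestoringDevice:
    """Toy model of a replacement device that restores segments on demand."""

    def __init__(self, backup, log):
        self.backup = backup    # assumed shape: segment id -> {page id: value}
        self.log = log          # assumed shape: [(segment id, page id, value), ...]
        self.segments = {}      # segments restored so far

    def _ensure_restored(self, seg):
        if seg in self.segments:
            return
        # Start from the backup image of this segment only.
        pages = dict(self.backup.get(seg, {}))
        # Replay, in order, the logged updates that belong to this segment.
        for log_seg, page, value in self.log:
            if log_seg == seg:
                pages[page] = value
        self.segments[seg] = pages

    def read(self, seg, page):
        self._ensure_restored(seg)   # restore on first access
        return self.segments[seg][page]

    def write(self, seg, page, value):
        self._ensure_restored(seg)
        self.segments[seg][page] = value

    def restore_next(self):
        """Background step: restore one segment nobody has asked for yet."""
        known = set(self.backup) | {seg for seg, _, _ in self.log}
        for seg in sorted(known):
            if seg not in self.segments:
                self._ensure_restored(seg)
                return seg
        return None
```

In this model, a read of a segment returns as soon as that one segment is rebuilt, which is the property the abstract credits for reducing the observed time to repair.
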