Storage Management with Multi-Version Partitioned B-Trees
Database Management Systems and K/V-Stores operate on updatable datasets that
massively exceed the size of available main memory. Tree-based K/V storage
management structures have become particularly popular in storage engines. B+ Trees
allow constant search performance; however, write-heavy workloads result in
inefficient write patterns to secondary storage devices and poor performance
characteristics. LSM-Trees overcome this issue by horizontally partitioning
the data into fractions small enough to fully reside in main memory, but require
frequent maintenance to sustain search performance.
Firstly, we propose Multi-Version Partitioned B-Trees (MV-PBT) as the sole storage
and index management structure in key-sorted storage engines like K/V-Stores.
Secondly, we compare MV-PBT against LSM-Trees. The logical horizontal
partitioning in MV-PBT allows leveraging recent advances in modern B-Tree
techniques in a small, transparent, and memory-resident portion of the structure.
Structural properties sustain steady read performance, yield efficient write
patterns, and reduce write amplification.
We integrated MV-PBT into the WiredTiger KV storage engine. MV-PBT offers up
to 2x higher steady throughput than LSM-Trees and several orders of magnitude
higher throughput than B+ Trees in a YCSB workload.
Comment: Extended Version, ADBIS 202
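The logical horizontal partitioning described in this abstract can be pictured with a small sketch. This is not the MV-PBT implementation: it is a toy sorted index in which every key carries an artificial leading partition number, all writes go to the newest partition, and lookups probe partitions from newest to oldest. The class and method names (`PartitionedSortedIndex`, `seal_partition`) are hypothetical, a sorted Python list stands in for the B-Tree, and multi-versioning, partition merging, and filters are left out.

```python
import bisect


class PartitionedSortedIndex:
    """Toy stand-in for a partitioned B-Tree (assumption: one sorted list
    of (partition_no, key) pairs replaces the real tree nodes)."""

    def __init__(self):
        self.keys = []      # sorted list of (partition_no, key)
        self.values = []    # values, parallel to self.keys
        self.current = 0    # partition that currently receives writes

    def put(self, key, value):
        # New records are only ever placed in the newest partition.
        probe = (self.current, key)
        i = bisect.bisect_left(self.keys, probe)
        if i < len(self.keys) and self.keys[i] == probe:
            self.values[i] = value
        else:
            self.keys.insert(i, probe)
            self.values.insert(i, value)

    def seal_partition(self):
        # Freeze the current partition (e.g. when it is written out)
        # and open a fresh one for subsequent writes.
        self.current += 1

    def get(self, key):
        # Probe partitions from newest to oldest so the latest write wins.
        for partition in range(self.current, -1, -1):
            probe = (partition, key)
            i = bisect.bisect_left(self.keys, probe)
            if i < len(self.keys) and self.keys[i] == probe:
                return self.values[i]
        return None


index = PartitionedSortedIndex()
index.put("a", 1)
index.seal_partition()
index.put("a", 2)
index.put("b", 3)
print(index.get("a"), index.get("b"), index.get("c"))
```

In this sketch the older entry for "a" stays untouched in partition 0, which mirrors the idea that sealed partitions are no longer modified while the newest one absorbs all updates.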
Dynamic Physiological Partitioning on a Shared-nothing Database Cluster
Traditional DBMS servers are usually over-provisioned for most of their daily
workloads and, because they do not show good-enough energy proportionality,
waste a lot of energy while underutilized. A cluster of small (wimpy) servers,
whose size can be dynamically adjusted to the current workload, offers
better energy characteristics for these workloads. Yet, data migration,
necessary to balance utilization among the nodes, is a non-trivial and
time-consuming task that may consume the energy saved. For this reason, a
sophisticated and easy-to-adjust partitioning scheme fostering dynamic
reorganization is needed. In this paper, we adapt a technique originally
created for SMP systems, called physiological partitioning, to distribute data
among nodes in a way that allows data to be repartitioned easily without
interrupting transactions. We dynamically partition DB tables based on the nodes'
utilization and given energy constraints and compare our approach with physical
partitioning and logical partitioning methods. To quantify possible energy
saving and its conceivable drawback on query runtimes, we evaluate our
implementation on an experimental cluster and compare the results w.r.t.
performance and energy consumption. Depending on the workload, we can
substantially save energy without sacrificing too much performance.
DeltaTree: A Practical Locality-aware Concurrent Search Tree
Like other fundamental programming abstractions in energy-efficient computing,
search trees are expected to support both high parallelism and data locality.
However, existing highly-concurrent search trees such as red-black trees and
AVL trees do not consider data locality, while existing locality-aware search
trees, such as those based on the van Emde Boas layout (vEB-based trees), poorly
support concurrent (update) operations.
This paper presents DeltaTree, a practical locality-aware concurrent search
tree that combines both locality-optimisation techniques from vEB-based trees
and concurrency-optimisation techniques from non-blocking highly-concurrent
search trees. DeltaTree is a $k$-ary leaf-oriented tree of DeltaNodes in which
each DeltaNode is a size-fixed tree-container with the van Emde Boas layout.
The expected memory transfer costs of DeltaTree's Search, Insert, and Delete
operations are $O(\log_B N)$, where $N$ and $B$ are the tree size and the unknown
memory block size in the ideal cache model, respectively. DeltaTree's Search
operation is wait-free, providing prioritised lanes for Search operations, the
dominant operation in search trees. Its Insert and Delete operations are
non-blocking to other Search, Insert, and Delete operations, but they may be
occasionally blocked by maintenance operations that are sometimes triggered to
keep DeltaTree in good shape. Our experimental evaluation using the latest
implementation of AVL, red-black, and speculation-friendly trees from the
Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than
all of the three concurrent search trees for searching operations and up to 1.6
times faster for update operations when the update contention is not too high.
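The van Emde Boas layout mentioned in this abstract can be illustrated with a short sketch. It is not taken from DeltaTree: it only computes the order in which the nodes of a complete binary tree would be stored under the standard recursive vEB layout. Nodes are identified by their breadth-first index (root = 1, children of i are 2i and 2i+1), and giving the top subtree the floor of half the height is an assumption, since conventions differ on how odd heights are split. The function name `veb_order` is hypothetical.

```python
def veb_order(root, height):
    """Return the BFS indices of a complete binary subtree of the given
    height (in levels), listed in van Emde Boas storage order."""
    if height == 1:
        return [root]
    top = height // 2          # levels in the top subtree (assumed split)
    bottom = height - top      # levels in each bottom subtree
    order = veb_order(root, top)
    # The bottom subtrees are rooted at the descendants of `root`
    # that sit `top` levels below it, taken from left to right.
    first = root << top
    for subtree_root in range(first, first + (1 << top)):
        order.extend(veb_order(subtree_root, bottom))
    return order


print(veb_order(1, 4))
```

For a tree of four levels, the three nodes of the top subtree are stored first, followed by each three-node bottom subtree as a contiguous run, which is what keeps a root-to-leaf search within few memory blocks regardless of the block size.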
A file-based linked data fragments approach to prefix search
Text fields that need to look up specific entities in a dataset can be equipped with autocompletion functionality. When a dataset becomes too large to be embedded in the page, setting up a full-text search API is not the only alternative. Alternative API designs that balance different trade-offs, such as archivability, cacheability and privacy, may not require setting up a new back-end architecture. In this paper, we propose to perform prefix search over a fragmentation of the dataset, enabling the client to take part in the query execution by navigating through the fragmented dataset. Our proposal consists of (i) a self-describing fragmentation strategy, (ii) a client search algorithm, and (iii) an evaluation of the proposed solution, based on a small dataset of 73k entities and a large dataset of 3.87 million entities. We found that the server cache hit ratio is three times higher compared to a server-side prefix search API, at the cost of higher bandwidth consumption. Nevertheless, an acceptable user-perceived performance has been measured: assuming 150 ms as an acceptable waiting time between keystrokes, this approach allows 15 entities per prefix to be retrieved in this interval. We conclude that an alternative set of trade-offs has been established for specific prefix search use cases: by adding more choice to the spectrum of Web APIs for autocompletion, a file-based approach enables more datasets to afford prefix search.
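A client that navigates a fragmented dataset, as this abstract describes, might look roughly like the sketch below. The fragment format is an assumption made for illustration, not the paper's actual fragmentation strategy: each fragment is identified by a prefix, lists some entities, and links to child fragments by their prefixes. The names `FRAGMENTS` and `prefix_search` are hypothetical, and the `fetch` argument stands in for an HTTP GET of a static file. The default limit of 15 echoes the figure quoted in the abstract.

```python
# Hypothetical fragmentation: prefix -> entities stored there + child prefixes.
FRAGMENTS = {
    "":   {"entities": [], "children": ["a", "b"]},
    "a":  {"entities": ["Aalst", "Antwerp"], "children": []},
    "b":  {"entities": ["Bilzen"], "children": ["br"]},
    "br": {"entities": ["Bruges", "Brussels"], "children": []},
}


def prefix_search(fetch, query, limit=15):
    """Collect entities starting with `query` by walking fragments from the root."""
    q = query.lower()
    results = []
    pending = [""]  # start at the root fragment
    while pending and len(results) < limit:
        fragment = fetch(pending.pop())
        results.extend(e for e in fragment["entities"] if e.lower().startswith(q))
        for child in fragment["children"]:
            # Follow a child only if it can still contain matches: either it
            # lies on the path towards the query, or it refines the query.
            if q.startswith(child) or child.startswith(q):
                pending.append(child)
    return sorted(results)[:limit]


print(prefix_search(FRAGMENTS.__getitem__, "br"))
```

Because every fragment is a plain file addressed by its prefix, repeated keystrokes from different users request the same small set of resources, which is the property that makes such a design cache-friendly.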
Optimizing Key Distribution in Peer to Peer Network Using B-Trees
Peer-to-peer network architecture introduces many desirable features, including self-scalability, which leads to higher efficiency than the traditional client-server architecture. This is attributed to the highly distributed architecture of peer-to-peer networks. Meanwhile, the lack of a centralized control unit in a peer-to-peer network introduces some challenges. One of these challenges is key distribution and management in such an architecture. This research explores the possibility of developing a novel scheme for distributing and managing keys efficiently in a peer-to-peer network architecture.