25,279 research outputs found
DeltaTree: A Practical Locality-aware Concurrent Search Tree
As other fundamental programming abstractions in energy-efficient computing,
search trees are expected to support both high parallelism and data locality.
However, existing highly-concurrent search trees such as red-black trees and
AVL trees do not consider data locality while existing locality-aware search
trees such as those based on the van Emde Boas layout (vEB-based trees), poorly
support concurrent (update) operations.
This paper presents DeltaTree, a practical locality-aware concurrent search
tree that combines both locality-optimisation techniques from vEB-based trees
and concurrency-optimisation techniques from non-blocking highly-concurrent
search trees. DeltaTree is a -ary leaf-oriented tree of DeltaNodes in which
each DeltaNode is a size-fixed tree-container with the van Emde Boas layout.
The expected memory transfer costs of DeltaTree's Search, Insert, and Delete
operations are , where are the tree size and the unknown
memory block size in the ideal cache model, respectively. DeltaTree's Search
operation is wait-free, providing prioritised lanes for Search operations, the
dominant operation in search trees. Its Insert and {\em Delete} operations are
non-blocking to other Search, Insert, and Delete operations, but they may be
occasionally blocked by maintenance operations that are sometimes triggered to
keep DeltaTree in good shape. Our experimental evaluation using the latest
implementation of AVL, red-black, and speculation friendly trees from the
Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than
all of the three concurrent search trees for searching operations and up to 1.6
times faster for update operations when the update contention is not too high
Fast Dynamic Arrays
We present a highly optimized implementation of tiered vectors, a data
structure for maintaining a sequence of elements supporting access in time
and insertion and deletion in time for
while using extra space. We consider several different implementation
optimizations in C++ and compare their performance to that of vector and
multiset from the standard library on sequences with up to elements. Our
fastest implementation uses much less space than multiset while providing
speedups of for access operations compared to multiset and speedups
of compared to vector for insertion and deletion operations
while being competitive with both data structures for all other operations
Strengthened Lazy Heaps: Surpassing the Lower Bounds for Binary Heaps
Let denote the number of elements currently in a data structure. An
in-place heap is stored in the first locations of an array, uses
extra space, and supports the operations: minimum, insert, and extract-min. We
introduce an in-place heap, for which minimum and insert take worst-case
time, and extract-min takes worst-case time and involves at most
element comparisons. The achieved bounds are optimal to within
additive constant terms for the number of element comparisons. In particular,
these bounds for both insert and extract-min -and the time bound for insert-
surpass the corresponding lower bounds known for binary heaps, though our data
structure is similar. In a binary heap, when viewed as a nearly complete binary
tree, every node other than the root obeys the heap property, i.e. the element
at a node is not smaller than that at its parent. To surpass the lower bound
for extract-min, we reinforce a stronger property at the bottom levels of the
heap that the element at any right child is not smaller than that at its left
sibling. To surpass the lower bound for insert, we buffer insertions and allow
nodes to violate heap order in relation to their parents
POPE: Partial Order Preserving Encoding
Recently there has been much interest in performing search queries over
encrypted data to enable functionality while protecting sensitive data. One
particularly efficient mechanism for executing such queries is order-preserving
encryption/encoding (OPE) which results in ciphertexts that preserve the
relative order of the underlying plaintexts thus allowing range and comparison
queries to be performed directly on ciphertexts. In this paper, we propose an
alternative approach to range queries over encrypted data that is optimized to
support insert-heavy workloads as are common in "big data" applications while
still maintaining search functionality and achieving stronger security.
Specifically, we propose a new primitive called partial order preserving
encoding (POPE) that achieves ideal OPE security with frequency hiding and also
leaves a sizable fraction of the data pairwise incomparable. Using only O(1)
persistent and non-persistent client storage for
, our POPE scheme provides extremely fast batch insertion
consisting of a single round, and efficient search with O(1) amortized cost for
up to search queries. This improved security and
performance makes our scheme better suited for today's insert-heavy databases.Comment: Appears in ACM CCS 2016 Proceeding
Random Indexing K-tree
Random Indexing (RI) K-tree is the combination of two algorithms for
clustering. Many large scale problems exist in document clustering. RI K-tree
scales well with large inputs due to its low complexity. It also exhibits
features that are useful for managing a changing collection. Furthermore, it
solves previous issues with sparse document vectors when using K-tree. The
algorithms and data structures are defined, explained and motivated. Specific
modifications to K-tree are made for use with RI. Experiments have been
executed to measure quality. The results indicate that RI K-tree improves
document cluster quality over the original K-tree algorithm.Comment: 8 pages, ADCS 2009; Hyperref and cleveref LaTeX packages conflicted.
Removed clevere
Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing
This paper presents a general technique for optimally transforming any
dynamic data structure that operates on atomic and indivisible keys by
constant-time comparisons, into a data structure that handles unbounded-length
keys whose comparison cost is not a constant. Examples of these keys are
strings, multi-dimensional points, multiple-precision numbers, multi-key data
(e.g.~records), XML paths, URL addresses, etc. The technique is more general
than what has been done in previous work as no particular exploitation of the
underlying structure of is required. The only requirement is that the insertion
of a key must identify its predecessor or its successor.
Using the proposed technique, online suffix tree can be constructed in worst
case time per input symbol (as opposed to amortized
time per symbol, achieved by previously known algorithms). To our knowledge,
our algorithm is the first that achieves worst case time per input
symbol. Searching for a pattern of length in the resulting suffix tree
takes time, where is the
number of occurrences of the pattern. The paper also describes more
applications and show how to obtain alternative methods for dealing with suffix
sorting, dynamic lowest common ancestors and order maintenance
- …