Using Hashing to Solve the Dictionary Problem (In External Memory)
We consider the dictionary problem in external memory and improve the update
time of the well-known buffer tree by roughly a logarithmic factor. For any
\lambda >= max {lg lg n, log_{M/B} (n/B)}, we can support updates in time
O(\lambda / B) and queries in sublogarithmic time, O(log_\lambda n). We also
present a lower bound in the cell-probe model showing that our data structure
is optimal.
In the RAM, hash tables have been used to solve the dictionary problem faster
than binary search for more than half a century. By contrast, our data
structure is the first to beat the comparison barrier in external memory. Ours
is also the first data structure to depart convincingly from the indivisibility
paradigm.
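To make the tradeoff in this abstract concrete, the following minimal sketch evaluates the stated bounds for sample parameters. It drops the constants hidden by the O-notation and assumes M > B; the function names `min_lambda` and `tradeoff` are hypothetical and do not come from the paper.

```python
import math

def min_lambda(n, M, B):
    """Smallest lambda the abstract allows: max{lg lg n, log_{M/B}(n/B)}."""
    return max(math.log2(math.log2(n)), math.log(n / B, M / B))

def tradeoff(n, M, B, lam=None):
    """Return (lambda, update cost, query cost) with O-constants dropped.

    Update cost is lambda / B and query cost is log_lambda n, as quoted
    in the abstract. Raises ValueError if lambda is below the allowed minimum.
    """
    floor = min_lambda(n, M, B)
    lam = floor if lam is None else lam
    if lam < floor:
        raise ValueError("lambda is below max{lg lg n, log_{M/B}(n/B)}")
    return lam, lam / B, math.log(n, lam)

if __name__ == "__main__":
    lam, update, query = tradeoff(n=2**40, M=2**30, B=2**10)
    print(f"lambda={lam:.2f} update={update:.4f} query={query:.2f}")
```

For these sample values the update cost is well below one I/O per operation, while the query cost stays below lg n = 40. Raising lambda lowers the query cost and raises the update cost.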
Cache-oblivious dynamic dictionaries with update/query tradeoffs
Several existing cache-oblivious dynamic dictionaries achieve O(log_B N) (or slightly better O(log_B (N/M))) memory transfers per operation, where N is the number of items stored, M is the memory size, and B is the block size, which matches the classic B-tree data structure. One recent structure achieves the same query bound and a sometimes-better amortized update bound of O(...) memory transfers. This paper presents a new data structure, the xDict, implementing predecessor queries in O(...) worst-case memory transfers and insertions and deletions in O(...) amortized memory transfers, for any constant epsilon with 0 < epsilon < 1. For example, the xDict achieves subconstant amortized update cost when N = ..., whereas the B-tree’s ... is subconstant only when ... is subconstant only when N = .... The xDict attains the optimal tradeoff between insertions and queries, even in the broader external-memory model, for the range where inserts cost between (...) and O(1/lg^3 N) memory transfers.
Danish National Research Foundation (MADALGO (Center for Massive Data Algorithmics)); National Science Foundation (U.S.) (NSF Grants CCF-0541209); Computing Innovation Fellow
External-Memory Dictionaries with Worst-Case Update Cost
The B^ϵ-tree [Brodal and Fagerberg 2003] is a simple I/O-efficient external-memory-model data structure that supports updates orders of magnitude faster than the B-tree with a query performance comparable to the B-tree: for any positive constant ϵ < 1, insertions and deletions take O((1/B^(1-ϵ)) log_B N) time (rather than O(log_B N) time for the classic B-tree), queries take O(log_B N) time, and range queries returning k items take O(log_B N + k/B) time. Although the B^ϵ-tree has an optimal update/query tradeoff, the runtimes are amortized. Another structure, the write-optimized skip list, introduced by Bender et al. [PODS 2017], has the same performance as the B^ϵ-tree but with runtimes that are randomized rather than amortized. In this paper, we present a variant of the B^ϵ-tree with deterministic worst-case running times that are identical to the original’s amortized running times.
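As a rough illustration of the gap between the two update bounds, the sketch below plugs sample values into the formulas, reading the update bound as O((1/B^(1-ϵ)) log_B N) and the range-query bound as O(log_B N + k/B). It ignores the constants hidden by the O-notation, and the function names are hypothetical helpers, not part of any library or of the paper.

```python
import math

def btree_update_cost(N, B):
    """Classic B-tree update bound: log_B N (constants dropped)."""
    return math.log(N, B)

def beps_update_cost(N, B, eps):
    """B^eps-tree amortized update bound: (1 / B^(1 - eps)) * log_B N."""
    if not 0 < eps < 1:
        raise ValueError("eps must satisfy 0 < eps < 1")
    return math.log(N, B) / B ** (1 - eps)

def range_query_cost(N, B, k):
    """Range query returning k items: log_B N + k / B."""
    return math.log(N, B) + k / B

if __name__ == "__main__":
    N, B, eps = 2**30, 2**10, 0.5
    print("B-tree update:    ", btree_update_cost(N, B))
    print("B^eps-tree update:", beps_update_cost(N, B, eps))
    print("range query, k=2048:", range_query_cost(N, B, 2048))
```

With N = 2^30, B = 2^10 and ϵ = 1/2, the B-tree bound is about 3 I/Os per update, while the B^ϵ-tree bound is about 3/32, a factor of B^(1-ϵ) = 32 smaller.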
External Memory Planar Point Location with Fast Updates
We study dynamic planar point location in the External Memory Model or Disk Access Model (DAM). Previous work in this model achieves polylog query and polylog amortized update time. We present a data structure with O(log_B^2 N) query time and O((1/B^(1-epsilon)) log_B N) amortized update time, where N is the number of segments, B is the block size, and epsilon is a small positive constant, under the assumption that all faces have constant size. This is a B^(1-epsilon) factor faster for updates than the fastest previous structure, and brings the cost of insertion and deletion down to subconstant amortized time for reasonable choices of N and B. Our structure solves the problem of vertical ray-shooting queries among a dynamic set of interior-disjoint line segments; this is well known to solve dynamic planar point location for a connected subdivision of the plane with faces of constant size.
What Does Dynamic Optimality Mean in External Memory?
A data structure A is said to be dynamically optimal over a class of data structures \mathcal{C} if A is constant-competitive with every data structure C \in \mathcal{C}. Much of the research on binary search trees in the past forty years has focused on studying dynamic optimality over the class of binary search trees that are modified via rotations (and indeed, the question of whether splay trees are dynamically optimal has gained notoriety as the so-called dynamic-optimality conjecture). Recently, researchers have extended this to consider dynamic optimality over certain classes of external-memory search trees. In particular, Demaine, Iacono, Koumoutsos, and Langerman propose a class of external-memory trees that support a notion of tree rotations, and then give an elegant data structure, called the Belga B-tree, that is within an O(log log N)-factor of being dynamically optimal over this class.
In this paper, we revisit the question of how dynamic optimality should be defined in external memory. A defining characteristic of external-memory data structures is that there is a stark asymmetry between queries and inserts/updates/deletes: by making the former slightly asymptotically slower, one can make the latter significantly asymptotically faster (even allowing for operations with sub-constant amortized I/Os). This asymmetry makes it so that rotation-based search trees are not optimal (or even close to optimal) in insert/update/delete-heavy external-memory workloads. To study dynamic optimality for such workloads, one must consider a different class of data structures.
The natural class of data structures to consider is what we call buffered-propagation trees. Such trees can adapt dynamically to the locality properties of an input sequence in order to optimize the interactions between different inserts/updates/deletes and queries. We also present a new form of beyond-worst-case analysis that allows us to formally study a continuum between static and dynamic optimality. Finally, we give a novel data structure, called the Jεllo Tree, that is statically optimal and that achieves dynamic optimality for a large natural class of inputs defined by our beyond-worst-case analysis.
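The sub-constant amortized I/O cost mentioned in this abstract can be illustrated with a toy model in which inserts are staged in memory and written to disk one full block at a time. This is only a sketch of the buffering idea under simplifying assumptions; it is not the buffered-propagation tree or the data structure from the paper, and the class and function names are hypothetical.

```python
class BufferedInserter:
    """Toy model: inserts are staged in memory and written out one block at a time."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.buffer = []        # in-memory staging area (costs no I/O)
        self.disk_blocks = []   # simulated disk: a list of written blocks
        self.ios = 0            # number of block writes performed

    def insert(self, key):
        self.buffer.append(key)
        if len(self.buffer) == self.block_size:
            self.flush()

    def flush(self):
        # Writing one block costs one I/O, however many keys it holds.
        if self.buffer:
            self.disk_blocks.append(self.buffer)
            self.buffer = []
            self.ios += 1

def amortized_ios(n, block_size):
    """Amortized block writes per insert after n inserts."""
    writer = BufferedInserter(block_size)
    for key in range(n):
        writer.insert(key)
    writer.flush()  # write out any partially filled block
    return writer.ios / n

if __name__ == "__main__":
    print(amortized_ios(10_000, 100))
```

In this model each insert costs about 1/B block writes, but a query would have to scan unsorted blocks. That is the asymmetry the abstract describes: cheaper updates are paid for with slower queries unless the buffers are organized more carefully.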