A Framework for Index Bulk Loading and Dynamization
In this paper we investigate automated methods for externalizing
internal memory data structures. We consider a class of balanced trees that we
call weight-balanced partitioning trees (or wp-trees) for indexing a set of points
in R^d. Well-known examples of wp-trees include kd-trees, BBD-trees, pseudo
quad trees, and BAR trees. These trees are defined with fixed degree and are
thus suited for internal memory implementations. Given an efficient wp-tree
construction algorithm, we present a general framework for automatically obtaining
a new dynamic external data structure. Using this framework together
with a new general construction (bulk loading) technique of independent interest,
we obtain data structures with guaranteed good update performance in
terms of I/O transfers. Our approach gives considerably improved construction
and update I/O bounds for, e.g., kd-trees and BBD-trees.
Revised version of "Efficient Cross-Trees for External Memory"
Due to a printing problem, the revised version of our
paper (Efficient cross-trees for external memory,
in "External Memory Algorithms and Visualization",
James Abello and Jeffrey Scott Vitter eds.,
DIMACS Series in Discrete Mathematics and Theoretical Computer
Science, American Mathematical Society Press, Providence, RI, 1999)
was replaced by the (shorter) submitted
version. This technical report contains the revised version,
which by mistake was not published. In particular,
we describe efficient methods for organizing and maintaining large
multidimensional data sets in external memory. This is particularly
important because access to external memory is currently several orders of
magnitude slower than access to main memory, and current technology
advances are likely to make this gap even wider. We focus
particularly on multidimensional data sets which must be kept
simultaneously sorted under several total orderings: these orderings
may be defined by the user, and may also be changed dynamically by
the user throughout the lifetime of the data structures, according
to the application at hand. Besides standard insertions and
deletions of data, our proposed solution can efficiently perform
split and concatenate operations on whole data
sets according to any ordering. This allows the user (1) to
dynamically rearrange any ordering of a segment of data in less
time than recomputing the new ordering from scratch, and
(2) to efficiently answer queries related to the data contained in a
particular range of the current orderings. Our solution fully
generalizes the notion of B-trees to higher dimensions by carefully
combining space-driven and data-driven partitions. Balancing is easy
as we introduce a new multidimensional data structure, the
cross-tree, that is the cross product of balanced trees. As a
result, the cross-tree is competitive with other popular index data
structures that require linear space (including k-d trees,
quad-trees, grid files, space filling curves, hB-trees, and
R-trees).