Faster deterministic sorting and priority queues in linear space
The RAM complexity of deterministic linear space sorting of integers in words is improved from to . No better bounds are known for polynomial space. In fact, the techniques give a deterministic linear space priority queue supporting insert and delete in amortized time and find-min in constant time. The priority queue can be implemented using addition, shift, and bit-wise boolean operations.
Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications
In this paper we identify a new class of sparse near-quadratic random Boolean matrices that have full row rank over F_2 = {0,1} with high probability and can be transformed into echelon form in almost linear time by a simple version of Gauss elimination. The random matrix with dimensions n(1-epsilon) x n is generated as follows: In each row, identify a block of length L = O((log n)/epsilon) at a random position. The entries outside the block are 0, the entries inside the block are given by fair coin tosses. Sorting the rows according to the positions of the blocks transforms the matrix into a kind of band matrix, on which, as it turns out, Gauss elimination works very efficiently with high probability. For the proof, the effects of Gauss elimination are interpreted as a ("coin-flipping") variant of Robin Hood hashing, whose behaviour can be captured in terms of a simple Markov model from queuing theory. Bounds for expected construction time and high success probability follow from results in this area. They readily extend to larger finite fields in place of F_2.
By employing hashing, this matrix family leads to a new implementation of a retrieval data structure, which represents an arbitrary function f: S -> {0,1} for some set S of m = (1-epsilon)n keys. It requires m/(1-epsilon) bits of space, construction takes O(m/epsilon^2) expected time on a word RAM, while queries take O(1/epsilon) time and access only one contiguous segment of O((log m)/epsilon) bits in the representation (O(1/epsilon) consecutive words on a word RAM). The method is readily implemented and highly practical, and it is competitive with state-of-the-art methods. In a more theoretical variant, which works only for unrealistically large S, we can even achieve construction time O(m/epsilon) and query time O(1), accessing O(1) contiguous memory words for a query. By well-established methods the retrieval data structure leads to efficient constructions of (static) perfect hash functions and (static) Bloom filters with almost optimal space and very local storage access patterns for queries.
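The matrix construction described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the choice to pass the block length directly (rather than deriving it from n and epsilon), and the representation of each row as a Python integer with column i stored in bit i are all assumptions made for illustration. The elimination is a plain lowest-bit-pivot reduction over F_2, applied after sorting the rows by block position.

```python
import random


def random_block_rows(num_rows, n, block_len, rng):
    """Return (start, block) pairs: one random block of block_len coin tosses per row.

    Hypothetical helper: `start` is the block's first column, `block` holds the
    coin tosses as an integer. Entries outside the block are implicitly 0.
    """
    pairs = []
    for _ in range(num_rows):
        start = rng.randrange(n - block_len + 1)
        block = rng.getrandbits(block_len)
        pairs.append((start, block))
    return pairs


def to_sorted_rows(pairs):
    """Sort rows by block position and expand each to an n-column bit vector."""
    return [block << start for start, block in sorted(pairs)]


def rank_f2(rows):
    """Rank over F_2 of rows given as integers (bit i = column i).

    Each row is reduced against the pivots found so far; a row that reduces
    to 0 is linearly dependent on the earlier rows.
    """
    pivots = {}  # pivot column -> reduced row whose lowest set bit is that column
    for row in rows:
        while row:
            col = (row & -row).bit_length() - 1  # index of the lowest set bit
            if col not in pivots:
                pivots[col] = row
                break
            row ^= pivots[col]  # clears bit `col`, touches only higher columns
    return len(pivots)


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = random_block_rows(num_rows=900, n=1000, block_len=60, rng=rng)
    rows = to_sorted_rows(pairs)
    print("rows:", len(rows), "rank:", rank_f2(rows))
```

Because the rows are sorted by block start, each reduction only combines rows whose blocks lie close together, which is the band-matrix effect the abstract refers to. The sketch reports the rank only; it makes no claim about the running-time bounds stated above.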
A Time-Space Tradeoff for Triangulations of Points in the Plane
In this paper, we consider time-space trade-offs for reporting a triangulation of points in the plane. The goal is to minimize the amount of working space while keeping the total running time small. We present the first multi-pass algorithm for the problem that returns the edges of a triangulation with their adjacency information. This even improves the previously best known random-access algorithm.
Selection from read-only memory with limited workspace
Given an unordered array of elements drawn from a totally ordered set and
an integer in the range from to , in the classic selection problem
the task is to find the -th smallest element in the array. We study the
complexity of this problem in the space-restricted random-access model: The
input array is stored on read-only memory, and the algorithm has access to a
limited amount of workspace. We prove that the linear-time prune-and-search
algorithm---presented in most textbooks on algorithms---can be modified to use
bits instead of words of extra space. Prior to our
work, the best known algorithm by Frederickson could perform the task with
bits of extra space in time. Our result separates
the space-restricted random-access model and the multi-pass streaming model,
since we can surpass the lower bound known for the latter
model. We also generalize our algorithm for the case when the size of the
workspace is bits, where . The running time
of our generalized algorithm is ,
slightly improving over the
bound of Frederickson's algorithm. To obtain the improvements mentioned above,
we developed a new data structure, called the wavelet stack, that we use for
repeated pruning. We expect the wavelet stack to be a useful tool in other
applications as well.
Comment: 16 pages, 1 figure, Preliminary version appeared in COCOON-201
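To make the space-restricted model concrete, here is a minimal sketch of selection from a read-only array using a constant number of extra words. It is not the algorithm of this paper and does not use the wavelet stack; it is a simple randomized filter-narrowing scheme, which in general takes many more passes than the prune-and-search approach the abstract describes. The function name and the tie-breaking by index are assumptions made for illustration.

```python
import random


def select_read_only(a, k, rng=None):
    """Return the k-th smallest element (1-based) of `a` without modifying it.

    Only a constant number of extra variables is kept: two filter keys, one
    pivot, and counters. Ties are broken by index so that all keys are distinct.
    """
    rng = rng or random.Random(0)
    n = len(a)
    if not 1 <= k <= n:
        raise ValueError("k must be in the range 1..len(a)")
    lo = hi = None  # exclusive filters; the answer always lies strictly between them
    while True:
        # Pass 1: pick a uniformly random key between the filters (reservoir sampling).
        pivot, seen = None, 0
        for i in range(n):
            key = (a[i], i)
            if (lo is None or key > lo) and (hi is None or key < hi):
                seen += 1
                if rng.randrange(seen) == 0:
                    pivot = key
        # Pass 2: compute the rank of the pivot in the whole array.
        rank = 1
        for i in range(n):
            if (a[i], i) < pivot:
                rank += 1
        if rank == k:
            return pivot[0]
        if rank < k:
            lo = pivot  # the answer is larger than the pivot
        else:
            hi = pivot  # the answer is smaller than the pivot


if __name__ == "__main__":
    data = (7, 3, 9, 3, 1, 8)  # a tuple, so the input cannot be modified
    print([select_read_only(data, k) for k in range(1, len(data) + 1)])
```

Each round scans the input twice and shrinks the open interval between the two filters by at least one element, so the loop always terminates.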
Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes
Bitmap indexes must be compressed to reduce input/output costs and minimize
CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use
techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid
(WAH) compression. These techniques are sensitive to the order of the rows: a
simple lexicographical sort can divide the index size by 9 and make indexes
several times faster. We investigate reordering heuristics based on computed
attribute-value histograms. Simply permuting the columns of the table based on
these histograms can increase the sorting efficiency by 40%.
Comment: To appear in proceedings of DOLAP 200
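The sensitivity to row order can be seen in a small sketch that counts runs of equal values per column before and after a lexicographic sort. This is only a proxy: it does not implement WAH or any other word-aligned scheme, and the function name and the toy table are assumptions made for illustration. Fewer runs in a column generally means its bitmaps are more compressible under run-length encoding.

```python
def column_runs(rows):
    """Return, for each column, the number of maximal runs of equal values down the rows."""
    if not rows:
        return []
    runs = [1] * len(rows[0])
    for prev, cur in zip(rows, rows[1:]):
        for j, (p, c) in enumerate(zip(prev, cur)):
            if p != c:
                runs[j] += 1  # a value change starts a new run in column j
    return runs


if __name__ == "__main__":
    table = [(1, "a"), (0, "b"), (1, "a"), (0, "a"), (1, "b"), (0, "b")]
    print("unsorted:", column_runs(table))          # [6, 4]
    print("sorted:  ", column_runs(sorted(table)))  # [2, 4]
```

In this toy table the lexicographic sort reduces the first column from six runs to two and leaves the second column at four. The first column benefits most because it is the primary sort key, which is why the order of the columns matters for heuristics of the kind the abstract investigates.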