Concurrent rebalancing on hyperred-black trees
HyperRed-Black trees are a relaxed version of Red-Black trees that admits
a high degree of concurrency. In Red-Black trees, consecutive red nodes
are forbidden. This restriction has been withdrawn in Chromatic trees,
which were introduced by O. Nurmi and E. Soisalon-Soininen to work in a
concurrent environment. A Chromatic tree can have big clusters of red
nodes surrounded by black nodes. Nevertheless, concurrent rebalancing of
Chromatic trees into Red-Black trees has a serious drawback: in a big
cluster of red nodes only the top node can be updated, and direct
updating inside the cluster is forbidden. This yields only a limited
degree of concurrency. HyperRed-Black trees have been designed to solve
this problem: it is possible to update red nodes in the inside of a red
cluster. In a HyperRed-Black tree, nodes can have a multiplicity of
colors; they can be red, black or hyper-red.
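As a minimal sketch of the invariant being relaxed (my own illustration, not
code from the paper; the `Node` class and color strings are assumptions), the
classic Red-Black rule forbids a red node with a red child, which is exactly
what Chromatic and HyperRed-Black trees permit inside red clusters:

```python
# Hypothetical node type for illustration; colors are plain strings.
class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def violates_red_red(root):
    """Return True if some red node has a red child: forbidden in
    Red-Black trees, but allowed inside a Chromatic tree's red clusters."""
    if root is None:
        return False
    for child in (root.left, root.right):
        if child is not None and root.color == "red" and child.color == "red":
            return True
    return violates_red_red(root.left) or violates_red_red(root.right)
```

A tree with a red node whose child is also red fails this check, while any
valid Red-Black coloring passes it.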
Fast Dynamic Arrays
We present a highly optimized implementation of tiered vectors, a data
structure for maintaining a sequence of n elements supporting access in O(1)
time and insertion and deletion in O(n^ε) time for a constant ε > 0,
while using o(n) extra space. We consider several different implementation
optimizations in C++ and compare their performance to that of vector and
multiset from the standard library on large sequences. Our fastest
implementation uses much less space than multiset while providing significant
speedups for access operations compared to multiset, and significant speedups
compared to vector for insertion and deletion operations, while being
competitive with both data structures for all other operations.
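A minimal two-level sketch of the tiered-vector idea (my simplification, not
the paper's C++ implementation): fixed-capacity blocks, modeled here with
collections.deque so that carrying an overflow element into the next block
costs O(1). With block size B near sqrt(n), an insertion shifts at most B
elements locally and then performs O(n/B) constant-time carries.

```python
from collections import deque

class TieredVector:
    def __init__(self, block_size=4):
        self.B = block_size
        self.blocks = [deque()]  # every block except the last holds exactly B items

    def __len__(self):
        return sum(len(b) for b in self.blocks)

    def get(self, i):
        # All blocks but the last are full, so index arithmetic finds the block.
        # (Indexing a deque costs O(B); a real circular array makes this O(1).)
        return self.blocks[i // self.B][i % self.B]

    def insert(self, i, x):
        b, off = i // self.B, i % self.B
        if b == len(self.blocks):            # appending just past the last full block
            self.blocks.append(deque())
        self.blocks[b].insert(off, x)        # local shift, O(B)
        # Carry the overflowing last element to the front of each following block.
        for j in range(b, len(self.blocks)):
            if len(self.blocks[j]) <= self.B:
                break
            carry = self.blocks[j].pop()             # O(1)
            if j + 1 == len(self.blocks):
                self.blocks.append(deque())
            self.blocks[j + 1].appendleft(carry)     # O(1)
```

The real structure generalizes this to l levels of nested circular buffers;
this sketch only shows why a carry chain is cheap when blocks never shift
their interiors.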
Top-Down Skiplists
We describe todolists (top-down skiplists), a variant of skiplists (Pugh
1990) that can execute searches using close to the information-theoretic
minimum number of binary comparisons per search while supporting fast
amortized updates. A variant of todolists, called working-todolists, can
execute a search for any element using a number of comparisons that depends
on the element's "working-set number" (roughly, the number of distinct
elements accessed since that element was last accessed). No previous data
structure is known to achieve a comparable comparison bound. We show through
experiments that, if implemented carefully, todolists are comparable to other
common dictionary implementations in terms of insertion times and outperform
them in terms of search times.
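Todolists refine the basic skiplist search. As a hedged illustration (a
deterministic static skiplist of my own, not the todolist structure from the
paper), each level keeps every second key of the level below, so the upper
levels halve the remaining search range with each comparison:

```python
def build_levels(sorted_keys):
    """Level 0 holds all keys; each higher level keeps every second key."""
    levels = [list(sorted_keys)]
    while len(levels[-1]) > 1:
        levels.append(levels[-1][::2])   # promote every other key
    return levels[::-1]                  # sparsest level first

def contains(levels, key):
    lo = 0                               # left boundary within the current level
    for level in levels:
        # Advance within the short window exposed by the level above.
        while lo + 1 < len(level) and level[lo + 1] <= key:
            lo += 1
        if level[lo] == key:
            return True
        lo *= 2                          # same key sits at index 2*lo one level down
    return False
```

Each descent doubles the resolution while the search window stays constant
size, which is the property randomized skiplists achieve in expectation and
todolists maintain top-down.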
Fast Parallel Operations on Search Trees
Using (a,b)-trees as an example, we show how to perform a parallel split with
logarithmic latency and parallel join, bulk updates, intersection, union (or
merge), and (symmetric) set difference with logarithmic latency and with
information-theoretically optimal work. We present both asymptotically optimal
solutions and simplified versions that perform well in practice: they are
several times faster than previous implementations.
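The parallel algorithms match the work of the familiar sequential approach
while adding only logarithmic latency. As a hedged baseline sketch (my own,
not the paper's algorithm; note that when one input is much smaller than the
other, the information-theoretic optimum is below a full merge), set union of
two sorted key sequences reduces to a single merge pass:

```python
def sorted_union(xs, ys):
    """Union of two sorted, duplicate-free key sequences in O(n + m) time."""
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] < ys[j]:
            out.append(xs[i]); i += 1
        elif ys[j] < xs[i]:
            out.append(ys[j]); j += 1
        else:                        # equal keys: keep a single copy
            out.append(xs[i]); i += 1; j += 1
    out.extend(xs[i:])
    out.extend(ys[j:])
    return out
```

Intersection and set difference follow the same two-pointer pattern with
different emission rules.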
Computing LZ77 in Run-Compressed Space
In this paper, we show that the LZ77 factorization of a text T ∈ Σ^n
can be computed in O(R log n) bits of working space and O(n log R) time, R
being the number of runs in the Burrows-Wheeler transform of T reversed. For
extremely repetitive inputs, the working space can be as low as O(log n) bits:
exponentially smaller than the text itself. As a direct consequence of our
result, we show that a class of repetition-aware self-indexes based on a
combination of run-length encoded BWT and LZ77 can be built in asymptotically
optimal O(R + z) words of working space, z being the size of the LZ77 parsing.
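For contrast with the run-compressed-space algorithm, the quantity being
computed can be sketched with a naive quadratic-time LZ77 baseline (my own
illustration, not the paper's method; the (pos, length, next_char) phrase
format is one common convention):

```python
def lz77_factorize(t):
    """Greedy LZ77: at each position emit the longest match that starts
    earlier in the text (overlapping, self-referential matches allowed),
    plus the next fresh character. O(n^2)-ish time, for illustration only."""
    phrases, i = [], 0
    while i < len(t):
        best_len, best_pos = 0, 0
        for j in range(i):                    # candidate earlier start
            l = 0
            while i + l < len(t) and t[j + l] == t[i + l]:
                l += 1
            if l > best_len:
                best_len, best_pos = l, j
        nxt = t[i + best_len] if i + best_len < len(t) else ""
        phrases.append((best_pos, best_len, nxt))
        i += best_len + 1
    return phrases
```

The number of phrases this returns is the z appearing in the space bound
above; the paper's contribution is computing the same factorization without
ever holding the text or a full index in memory.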
Efficient estimation of AUC in a sliding window
In many applications, monitoring area under the ROC curve (AUC) in a sliding
window over a data stream is a natural way of detecting changes in the system.
The drawback is that computing AUC in a sliding window is expensive, especially
if the window size is large and the data flow is significant.
In this paper we propose a scheme for maintaining an approximate AUC in a
sliding window. More specifically, we propose an algorithm that, given an
allowed approximation error, estimates AUC within that error, and can
maintain this estimate cheaply, per update, as the window slides.
This provides a speed-up over the exact computation of AUC, which is
considerably more expensive per update; the speed-up becomes more significant
as the size of the window increases. Our estimate is based on grouping the
data points together, and using these groups to calculate AUC. The grouping
is designed carefully such that (i) the groups are small enough that the
error stays small, (ii) the number of groups is small enough that enumerating
them is not expensive, and (iii) the definition is flexible enough that we
can maintain the groups efficiently.
Our experimental evaluation demonstrates that the average approximation error
in practice is much smaller than the approximation guarantee, and that we can
achieve significant speed-ups with only a modest sacrifice in accuracy.
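The exact quantity being approximated can be sketched as follows (my own
baseline illustration, not the paper's algorithm): the AUC of one window
equals the normalized Mann-Whitney U statistic, the fraction of
(positive, negative) pairs the scores rank correctly, with ties counted as
half. Recomputing this on every slide is what the approximation avoids.

```python
def window_auc(scores, labels):
    """Exact AUC of one window: scores are real-valued, labels are 0/1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None                  # AUC is undefined without both classes
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))
```

The pairwise sum makes the cost of the exact computation explicit: it touches
every positive-negative pair in the window, which is what grouping the points
sidesteps.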