Robin Hood Hashing really has constant average search cost and variance in full tables
Thirty years ago, the Robin Hood collision resolution strategy was introduced
for open addressing hash tables, and a recurrence equation was found for the
distribution of its search cost. Although this recurrence could not be solved
analytically, it allowed for numerical computations that, remarkably, suggested
that the variance of the search cost approached a value of 1.883 when the
table was full. Furthermore, by using a non-standard mean-centered search
algorithm, this would imply that searches could be performed in expected
constant time even in a full table.
In spite of the time elapsed since these observations were made, no progress
has been made in proving them. In this paper we introduce a technique to work
around the intractability of the recurrence equation by solving instead an
associated differential equation. While this does not provide an exact
solution, it is sufficiently powerful to prove a bound for the variance, and
thus obtain a proof that the variance of Robin Hood is bounded by a small
constant for load factors arbitrarily close to 1. As a corollary, this proves
that the mean-centered search algorithm runs in expected constant time.
We also use this technique to study the performance of Robin Hood hash tables
under a long sequence of insertions and deletions, where deletions are
implemented by marking elements as deleted. We prove that, in this case, the
variance is bounded by 1/(1 - α) + O(1), where α is the load factor.
To model the behavior of these hash tables, we use a unified approach that
can be applied also to study the First-Come-First-Served and
Last-Come-First-Served collision resolution disciplines, both with and without
deletions.
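
As a rough illustration of the quantities discussed above, the following is a minimal simulation sketch of Robin Hood insertion, not the method of the paper. It assumes a random probing model in which every key has its own pseudo-random probe sequence; the names probe, rh_insert and simulate are hypothetical, and the table size, load factor and seed are arbitrary choices. The search cost of a key is taken to be the number of probes needed to reach it, and the sketch reports the empirical mean and variance of that cost.

```python
import random
from statistics import mean, pvariance


def probe(key, i, m):
    """Hypothetical probe function: slot visited by `key` on its i-th probe (i >= 1).

    Assumption: random probing, simulated by seeding a generator with (key, i).
    """
    return random.Random(f"{key}:{i}").randrange(m)


def rh_insert(table, key):
    """Insert `key` into `table`, which must have at least one empty slot.

    Each slot holds None or a pair (key, probes_used).
    """
    m = len(table)
    i = 1
    while True:
        s = probe(key, i, m)
        if table[s] is None:
            table[s] = (key, i)
            return
        other, j = table[s]
        if i > j:
            # Robin Hood rule: the key that has probed longer keeps the slot,
            # and the displaced key resumes its own probe sequence.
            table[s] = (key, i)
            key, i = other, j
        i += 1


def simulate(m=509, load=0.95, seed=1):
    """Fill a table of size m up to the given load; return table, mean, variance."""
    rng = random.Random(seed)
    table = [None] * m
    for key in rng.sample(range(10**9), int(load * m)):
        rh_insert(table, key)
    costs = [entry[1] for entry in table if entry is not None]
    return table, mean(costs), pvariance(costs)


if __name__ == "__main__":
    _, mu, var = simulate()
    print(f"mean search cost = {mu:.3f}, variance = {var:.3f}")
```

Running simulate with load factors closer to 1 gives an informal feel for how the mean grows while the variance stays small, although a single small simulation is no substitute for the bounds proved in the paper.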