A Memory-optimized Bloom Filter using An Additional Hashing Function
Abstract: A Bloom filter is a simple, space-efficient randomized data structure that represents a set of items in order to support membership queries. In recent years, Bloom filters have grown in popularity in database and networking applications. In this paper, we introduce a new extension that optimizes memory utilization for regular Bloom filters, called the Bloom filter with an additional hashing function (BFAH). The regular Bloom filter stores each item of a set k times, in k memory locations determined by the k addresses stored in the bit-array structure. The k addresses to use are given by the positions in the structure to which the k (regular) hashing functions point. Using the additional hashing function, only one of these k memory addresses is selected, and the item is stored only once. Consequently, there is no longer any need to store the k − 1 redundant copies. We implemented our approach in a software packet classifier based on tuple space search with the H3 class of universal hashing functions. Our results show that our approach reduces the number of collisions compared to a regular Bloom filter.
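The selection step described in this abstract can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the authors' implementation: the SHA-256 based `_hash` helper stands in for the H3 hash functions, and the table layout, the function names, and the parameters `K` and `M` are all hypothetical.

```python
import hashlib


def _hash(item: str, seed: int) -> int:
    """Hypothetical seeded hash; a stand-in for the H3 functions, not H3 itself."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def candidate_addresses(item: str, k: int, m: int) -> list:
    """Addresses produced by the k regular hash functions (seeds 0..k-1)."""
    return [_hash(item, seed) % m for seed in range(k)]


def store_regular(table: list, item: str, k: int) -> None:
    """Regular scheme as the abstract describes it: write the item at all k addresses."""
    for addr in candidate_addresses(item, k, len(table)):
        table[addr] = item


def selected_address(item: str, k: int, m: int) -> int:
    """The additional hash (seed k, assumed) picks one of the k candidate addresses."""
    choice = _hash(item, k) % k
    return candidate_addresses(item, k, m)[choice]


def store_selected(table: list, item: str, k: int) -> None:
    """Assumed BFAH-style scheme: write the item once, at the selected address."""
    table[selected_address(item, k, len(table))] = item


def occupied(table: list) -> int:
    """Number of slots that currently hold an item."""
    return sum(slot is not None for slot in table)


K, M = 4, 64  # assumed number of hash functions and table size
items = [f"rule-{i}" for i in range(8)]
regular_table = [None] * M
selected_table = [None] * M
for it in items:
    store_regular(regular_table, it, K)
    store_selected(selected_table, it, K)

print("slots used, regular scheme: ", occupied(regular_table))
print("slots used, selected scheme:", occupied(selected_table))
```

Because the selected address is always one of the k candidates, the second scheme never occupies more slots than the first; how far this lowers collisions in practice depends on the hash functions and the load, which this sketch does not measure.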
Optimizing Bloom Filter: Challenges, Solutions, and Comparisons
The Bloom filter (BF) has been widely used to support membership queries, i.e., to
judge whether a given element x is a member of a given set S or not. Recent
years have seen an explosion of BF designs, owing to the BF's
space efficiency and constant-time membership queries. The
existing reviews or surveys mainly focus on the applications of BF, but fall
short in covering the current trends, thereby lacking intrinsic understanding
of their design philosophy. To this end, this survey provides an overview of BF
and its variants, with an emphasis on the optimization techniques. Specifically,
we survey the existing variants along two dimensions, i.e., performance and
generalization. To improve performance, dozens of variants aim to reduce
false positives and implementation costs. In addition,
tens of variants generalize the BF framework to more scenarios by diversifying
the input sets and enriching the output functionalities. To summarize the
existing efforts, we conduct an in-depth study of the existing literature on BF
optimization, covering more than 60 variants. We unearth the design philosophy
of these variants and elaborate on how the employed optimization techniques
improve BF. Furthermore, comprehensive analysis and qualitative comparison are
conducted from the perspectives of BF components. Lastly, we highlight the
future trends of designing BFs. This is, to the best of our knowledge, the
first survey that accomplishes such goals.
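For readers unfamiliar with the membership query and the false positives this abstract refers to, here is a minimal sketch of a textbook Bloom filter in Python. It does not reproduce any variant covered by the survey; the class name, the SHA-256 based hashing, and the parameters m (bits), k (hash functions), and n (inserted elements) are assumptions made for illustration. The rate function uses the standard approximation (1 - e^(-kn/m))^k.

```python
import hashlib
import math


class BloomFilter:
    """Textbook Bloom filter over strings (illustrative, not a surveyed variant)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m                # number of bits in the array
        self.k = k                # number of hash functions
        self.bits = bytearray(m)  # one byte per bit, for clarity over compactness

    def _positions(self, item: str):
        # Derive k positions from seeded SHA-256 digests (an assumed hash choice).
        for seed in range(self.k):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # True means "possibly in S"; False means "definitely not in S".
        return all(self.bits[pos] for pos in self._positions(item))


def approx_false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard approximation of the false positive rate after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k


bf = BloomFilter(m=1024, k=3)
members = [f"x{i}" for i in range(100)]
for x in members:
    bf.add(x)

print(all(x in bf for x in members))  # inserted elements are never reported absent
print(approx_false_positive_rate(1024, 3, 100))
```

A query touches exactly k positions regardless of how many elements are stored, which is the constant-time property mentioned above; the price is that an element outside S can be reported as present when all k of its positions happen to be set.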