6,079 research outputs found

    Retouched Bloom Filters: Allowing Networked Applications to Flexibly Trade Off False Positives Against False Negatives

    Where distributed agents must share voluminous set membership information, Bloom filters provide a compact, though lossy, way for them to do so. Numerous recent networking papers have examined the trade-off between the bandwidth consumed by the transmission of Bloom filters and the error rate, which takes the form of false positives and which rises the more the filters are compressed. In this paper, we introduce the retouched Bloom filter (RBF), an extension that makes the Bloom filter more flexible by permitting the removal of selected false positives at the expense of generating random false negatives. We analytically show that RBFs created through a random process maintain an overall error rate, expressed as a combination of the false positive rate and the false negative rate, that is equal to the false positive rate of the corresponding Bloom filters. We further provide simple heuristics and improved algorithms that, when creating RBFs, decrease the false positive rate by more than the corresponding increase in the false negative rate. Finally, we demonstrate the advantages of an RBF over a Bloom filter in a distributed network topology measurement application, where information about large stop sets must be shared among route tracing monitors. Comment: This is a new version of the technical report, with improved algorithms and a theoretical analysis of the algorithms.
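
    The core mechanism lends itself to a short sketch: a plain Bloom filter gains a retouch step that clears one of a reported false positive's bits, removing that false positive but possibly turning stored items that share the cleared bit into false negatives. The hash construction and the random choice of which bit to clear below are illustrative assumptions, not the paper's exact algorithms.

```python
import hashlib
import random

class RetouchedBloomFilter:
    """Bloom filter with a 'retouch' step that trades selected false
    positives for random false negatives (a minimal sketch)."""

    def __init__(self, m, k):
        self.m = m              # filter length in bits
        self.k = k              # number of hash functions
        self.bits = [0] * m

    def _positions(self, item):
        # k bit positions from salted SHA-256 digests (an illustrative
        # choice; the paper does not prescribe this hash scheme)
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def query(self, item):
        return all(self.bits[p] for p in self._positions(item))

    def retouch(self, false_positive):
        # Clear one randomly chosen bit of the false positive: it no
        # longer matches, but any stored item sharing that bit becomes
        # a false negative -- the trade-off the paper analyses.
        p = random.choice(list(self._positions(false_positive)))
        self.bits[p] = 0
```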

    An approximate dynamic programming approach for improving accuracy of lossy data compression by Bloom filters

    Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents the yes-no Bloom filter, a data structure consisting of two parts: the yes-filter, which is a standard Bloom filter, and the no-filter, another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance to reject it, which improves the accuracy of data recognition in comparison with a standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses the objects to include in the no-filter so that it recognises as many false positives as possible but no true positives, thus producing the most accurate yes-no Bloom filter of all. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, under the constraint that it recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large, making the optimal solution intractable. Given the similarity of the ILP to the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed that uses a reduced ILP for the value function approximation. Numerical results show that the ADP model performs best in comparison with a number of heuristics as well as the CPLEX built-in branch-and-bound solver, and it is the approach we recommend for constructing yes-no Bloom filters. In the wider context of the study of lossy compression algorithms, our research is an example of how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
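
    The two-filter structure itself is compact enough to sketch; the ILP/ADP machinery for choosing which false positives to store is not reproduced here, and the class and parameter names below are my own.

```python
import hashlib

class BloomFilter:
    """Plain Bloom filter over salted SHA-256 hashes (illustrative)."""

    def __init__(self, m, k):
        self.m, self.k, self.bits = m, k, [0] * m

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

class YesNoBloomFilter:
    """Yes-filter stores the set; no-filter stores selected false
    positives of the yes-filter, letting queries reject them."""

    def __init__(self, m_yes, m_no, k):
        self.yes = BloomFilter(m_yes, k)
        self.no = BloomFilter(m_no, k)

    def add(self, item):
        self.yes.add(item)

    def add_false_positive(self, item):
        # The caller must check that the no-filter still rejects every
        # true positive after this insertion -- the constraint under
        # which the paper's ILP/ADP selects items to include.
        self.no.add(item)

    def __contains__(self, item):
        return item in self.yes and item not in self.no
```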

    Weightless: Lossy Weight Encoding For Deep Neural Network Compression

    The large memory requirements of deep neural networks limit their deployment and adoption on many devices. Model compression methods effectively reduce the memory requirements of these models, usually through applying transformations such as weight pruning or quantization. In this paper, we present a novel scheme for lossy weight encoding which complements conventional compression techniques. The encoding is based on the Bloomier filter, a probabilistic data structure that can save space at the cost of introducing random errors. By leveraging the ability of neural networks to tolerate these imperfections and re-training around the errors, the proposed technique, Weightless, can compress DNN weights by up to 496x with the same model accuracy. This results in up to a 1.51x improvement over the state-of-the-art.
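
    A full Bloomier filter is involved, but its character shows up in a simplified relative: a peeling-based static function (in the MWHC/XOR-filter family) that returns the exact stored value for every inserted key, e.g. a non-zero quantized weight indexed by position, and an arbitrary value for any other key -- the "random errors" that retraining absorbs. The sketch below is a hedged stand-in, not the paper's encoder.

```python
import random

def build_static_function(pairs, m, k=3, seed=0):
    """pairs: dict from hashable keys (e.g. weight indices) to int
    values. m: table size; roughly 1.3x len(pairs) is needed for k=3,
    otherwise the peeling below may fail."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(32) for _ in range(k)]
    block = m // k

    def positions(key):
        # one cell per disjoint block, so the k positions are distinct
        return [i * block + hash((salts[i], key)) % block for i in range(k)]

    cell_keys = [set() for _ in range(m)]
    for key in pairs:
        for p in positions(key):
            cell_keys[p].add(key)

    # Peel: repeatedly take a cell touched by exactly one remaining key.
    stack = []
    single = [p for p in range(m) if len(cell_keys[p]) == 1]
    while single:
        p = single.pop()
        if len(cell_keys[p]) != 1:
            continue
        (key,) = cell_keys[p]
        stack.append((key, p))
        for q in positions(key):
            cell_keys[q].discard(key)
            if len(cell_keys[q]) == 1:
                single.append(q)
    if len(stack) != len(pairs):
        raise ValueError("peeling failed; increase m or change the seed")

    # Assign in reverse peel order so each key's XOR resolves exactly.
    table = [0] * m
    for key, p in reversed(stack):
        acc = pairs[key]
        for q in positions(key):
            if q != p:
                acc ^= table[q]
        table[p] = acc

    def lookup(key):
        acc = 0
        for q in positions(key):
            acc ^= table[q]
        return acc  # exact for stored keys; arbitrary for other keys

    return table, lookup
```

    The space saving comes from storing only m small cells instead of explicit key-value pairs; the price, as in the paper, is that unstored keys silently return garbage rather than "not found".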

    Opportunistic Linked Data querying through approximate membership metadata

    Between URI dereferencing and the SPARQL protocol lies a largely unexplored axis of possible interfaces to Linked Data, each with its own combination of trade-offs. One of these interfaces is Triple Pattern Fragments, which allows clients to execute SPARQL queries against low-cost servers, at the cost of higher bandwidth. Increasing a client's efficiency means lowering the number of requests, which can, among other means, be achieved through additional metadata in responses. We noted that typical SPARQL query evaluations against Triple Pattern Fragments involve a significant portion of membership subqueries, which check for the presence of a specific triple rather than a variable pattern. This paper studies the impact of providing approximate membership functions, i.e., Bloom filters and Golomb-coded sets, as extra metadata. In addition to reducing HTTP requests, such functions make it possible to achieve full result recall earlier by temporarily allowing lower precision. Half of the tested queries from a WatDiv benchmark test set could be executed with up to a third fewer HTTP requests, at only marginally higher server cost. Query times, however, did not improve, likely due to slower metadata generation and transfer. This indicates that approximate membership functions can partly improve the client-side query process with minimal impact on the server and its interface.
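
    On the client side the mechanism reduces, roughly, to consulting the filter shipped as metadata before issuing a membership subquery over HTTP. The names below, and the optimistic low-precision mode, are illustrative assumptions rather than the actual Triple Pattern Fragments vocabulary.

```python
import hashlib

class BloomFilterMetadata:
    """Client-side view of an approximate membership function received
    as response metadata (a sketch; field names are assumptions)."""

    def __init__(self, bits, k):
        self.bits, self.k, self.m = bits, k, len(bits)

    def might_contain(self, triple):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{triple}".encode()).hexdigest()
            if not self.bits[int(digest, 16) % self.m]:
                return False   # definitely absent: no HTTP request needed
        return True            # possibly present

def resolve_membership(triple, amf, fetch, strict=True):
    """Resolve a membership subquery. With strict=False the client
    accepts the filter's 'maybe' directly, temporarily trading
    precision for fewer requests, as the paper describes."""
    if not amf.might_contain(triple):
        return False           # saved one HTTP request
    if strict:
        return fetch(triple)   # confirm with a single HTTP request
    return True                # optimistic mode: treat 'maybe' as yes
```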

    Design of a multiple Bloom filter for distributed navigation routing

    Unmanned navigation of vehicles and mobile robots can be greatly simplified by providing environmental intelligence through dispersed wireless sensors. The wireless sensors can act as active landmarks for vehicle localization and routing. However, wireless sensors are often resource-scarce and require a resource-saving design. In this paper, a multiple Bloom filter scheme is proposed to compress a global routing table for a wireless sensor. It is used as a lookup table for routing a vehicle to any destination, but requires significantly less memory space and search effort. An error-expectation-based design for a multiple Bloom filter is proposed as an improvement to the conventional false-positive-rate-based design. The new design is shown to provide an equal relative error expectation for all branched paths, which ensures better network load balance and uses less memory space. The scheme is implemented in a project for wheelchair navigation using wireless camera motes.
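
    In sketch form, the scheme keeps one Bloom filter per outgoing branch and routes a destination along the branch whose filter claims it. Per-branch filter sizes would be chosen by the paper's error-expectation rule, which is not reproduced here; all names below are illustrative.

```python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k, self.bits = m, k, [0] * m

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

class MultiBloomRoutingTable:
    """One filter per branch at a sensor node; a destination is routed
    along the branch(es) whose filter matches it."""

    def __init__(self, branch_sizes, k=4):
        # branch_sizes: {branch_id: filter length in bits}; unequal
        # lengths allow equalising error expectation across branches
        self.filters = {b: BloomFilter(m, k) for b, m in branch_sizes.items()}

    def learn(self, branch, destination):
        self.filters[branch].add(destination)

    def next_hops(self, destination):
        # False positives may return more than one candidate branch.
        return [b for b, f in self.filters.items() if destination in f]
```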

    Delta Bloom filter compression using stochastic learning-based weak estimation

    Substantial research has been done, and still continues, on reducing bandwidth requirements and on reliable, space-efficient access to stored and transmitted data. Bloom filters and their variants have achieved widespread acceptance in various fields due to their ability to satisfy these requirements. As this need has grown, especially for applications that make heavy use of transmission bandwidth, for distributed computing environments such as databases or proxy servers, and for applications that are sensitive to accessing frequently modified information, this thesis proposes a solution in the form of a compressed delta Bloom filter. The thesis proposes delta Bloom filter compression using stochastic learning-based weak estimation and prediction by partial matching, achieving lossless compression with high compression gain and thereby reducing the volume of large, frequently transferred data.
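
    The two building blocks named in the abstract can be sketched directly: the delta of two filter versions is a sparse XOR of their bit arrays, and a stochastic learning-based weak estimator tracks the probability of a 1 bit with a multiplicative update rather than a frequency count. Feeding these estimates into a prediction-by-partial-matching coder, as the thesis does, is omitted; parameter names are illustrative.

```python
def bloom_delta(old_bits, new_bits):
    """XOR delta between two versions of a Bloom filter's bit array.
    Only flipped positions are set, so the delta is typically sparse
    and compresses far better than either full filter."""
    return [a ^ b for a, b in zip(old_bits, new_bits)]

def apply_delta(bits, delta):
    """Reconstruct the new filter from the old one plus the delta."""
    return [b ^ d for b, d in zip(bits, delta)]

class SLWEBitEstimator:
    """Stochastic learning-based weak estimator for the probability
    that the next delta bit is 1: p <- lam * p + (1 - lam) * bit.
    Values of lam close to 1 adapt slowly; the estimate would drive an
    adaptive entropy coder (not shown here)."""

    def __init__(self, lam=0.99, p1=0.5):
        self.lam, self.p1 = lam, p1

    def update(self, bit):
        self.p1 = self.lam * self.p1 + (1 - self.lam) * bit
        return self.p1
```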