MIHash: Online Hashing with Mutual Information
Learning-based hashing methods are widely used for nearest neighbor
retrieval, and recently, online hashing methods have demonstrated good
performance-complexity trade-offs by learning hash functions from streaming
data. In this paper, we first address a key challenge for online hashing: the
binary codes for indexed data must be recomputed to keep pace with updates to
the hash functions. We propose an efficient quality measure for hash functions,
based on an information-theoretic quantity, mutual information, and use it
successfully as a criterion to eliminate unnecessary hash table updates. Next,
we also show how to optimize the mutual information objective using stochastic
gradient descent. We thus develop a novel hashing method, MIHash, that can be
used in both online and batch settings. Experiments on image retrieval
benchmarks (including a 2.5M image dataset) confirm the effectiveness of our
formulation, both in reducing hash table recomputations and in learning
high-quality hash functions.
Comment: International Conference on Computer Vision (ICCV), 201
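To illustrate the kind of quality measure the abstract describes, here is a minimal sketch that estimates the mutual information between Hamming distances and neighbor labels for a single query. It is a plain histogram estimate written for illustration, not the paper's implementation; the function names `entropy`, `mutual_information`, and `hamming` are assumptions, and the paper's gradient-based optimization is not shown.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a histogram given as a list of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def hamming(a, b):
    """Hamming distance between two binary codes stored as Python ints."""
    return bin(a ^ b).count("1")


def mutual_information(distances, is_neighbor):
    """Estimate I(D; C) = H(D) - H(D | C).

    distances   -- Hamming distances from one query to each indexed item
    is_neighbor -- matching list of booleans (True if the item is a neighbor)
    """
    n = len(distances)
    h_d = entropy(list(Counter(distances).values()))
    h_d_given_c = 0.0
    for label in (True, False):
        subset = [d for d, c in zip(distances, is_neighbor) if c == label]
        if subset:
            weight = len(subset) / n
            h_d_given_c += weight * entropy(list(Counter(subset).values()))
    return h_d - h_d_given_c
```

A hash function that places all neighbors at small distances and all non-neighbors at large distances yields a high value, while one whose distances are unrelated to the labels yields a value near zero. Under that reading, a table update could be skipped when a new hash function does not raise the score.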
EFFICIENT SELF-ADJUSTING HASH TABLE
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an efficient self-adjusting hash table. An aspect may include hash tables that include hints that describe where objects may be found in the hash table. Additionally or alternatively, an aspect may include hash flooding of the hash tables by using multiple hash functions. Additionally or alternatively, an aspect may include iterating through elements of a hash table using iteration information that may be stored when a hash table has a low density.
PARALLEL, SPACE-EFFICIENT HASH TABLE RESIZE
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for parallel, space-efficient hash table resize. An aspect may include a hash table that may be resized by incrementally de-allocating buckets of an old hash table and incrementally allocating buckets of a new hash table. Additionally or alternatively, an aspect may include a hash table that may be resized by re-allocating buckets from the old hash table to the new hash table and then re-arranging the buckets of the new hash table. Additionally or alternatively, an aspect may include a hash table with chaining that may be resized by copying the elements of the old hash table to corresponding buckets of the new hash table and indicating which elements are not necessarily in a final position. After copying, final positions may be determined for the buckets that are indicated as not necessarily in a final position. Additionally or alternatively, an aspect may include parallelizing algorithms for resizing hash tables.
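The general idea of spreading a resize over many operations can be sketched as follows. This is a generic, single-threaded incremental rehash for a chained table, written under simplifying assumptions; it is not the method claimed in the abstract above. The new bucket array is allocated in one step, and only the draining of the old table is incremental. The class name `IncrementalHashTable` and its growth threshold are hypothetical.

```python
class IncrementalHashTable:
    """Chained hash table that migrates one old bucket per operation."""

    def __init__(self, capacity=4):
        self.new = [[] for _ in range(capacity)]
        self.old = None        # bucket array being drained, if any
        self.migrate_pos = 0   # next old bucket to move
        self.size = 0

    def _bucket(self, table, key):
        return table[hash(key) % len(table)]

    def _migrate_step(self):
        """Move a single bucket from the old table into the new one."""
        if self.old is None:
            return
        for key, value in self.old[self.migrate_pos]:
            self._bucket(self.new, key).append((key, value))
        self.old[self.migrate_pos] = []  # release the drained bucket
        self.migrate_pos += 1
        if self.migrate_pos == len(self.old):
            self.old = None  # resize finished

    def put(self, key, value):
        self._migrate_step()
        # Remove any existing entry for this key from either table.
        for table in (self.new, self.old):
            if table is None:
                continue
            bucket = self._bucket(table, key)
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    del bucket[i]
                    self.size -= 1
                    break
        self._bucket(self.new, key).append((key, value))
        self.size += 1
        # Start a new resize only when no resize is in progress.
        if self.old is None and self.size > 2 * len(self.new):
            self.old = self.new
            self.new = [[] for _ in range(2 * len(self.old))]
            self.migrate_pos = 0

    def get(self, key, default=None):
        self._migrate_step()
        for table in (self.new, self.old):
            if table is None:
                continue
            for k, v in self._bucket(table, key):
                if k == key:
                    return v
        return default
```

Because each `put` or `get` moves at most one bucket, no single operation pays for the whole resize, at the cost of looking in two tables while a resize is in progress.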
Data structures for set manipulation - hash table, 1986
The most important issue addressed in this thesis is the efficient implementation of hash table methods. Any implementation involves trade-offs, which are discussed in terms of hash addressing, collision handling, hash table layout, and bucket overflow. The criterion for a good hash function is that it provides an even distribution of keys. Collisions are the major problem in hash table methods. Two major hash table methods are discussed. The Open Addressing Method places synonymous items elsewhere within the table. The Chaining Method, by contrast, chains all synonyms together and stores them outside the table, in what is called the overflow area. Hash tables are widely used by system software as an ideal data structure. Hash table applications can be found in compiler symbol tables, databases, and directories of file organizations, as well as in problem-solving application programs.
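The two collision-handling methods named in the abstract can be contrasted with a minimal sketch. The class names `ChainedTable` and `LinearProbingTable` are hypothetical, linear probing is only one of several open addressing schemes, and Python lists stand in for the overflow area. Deletion and resizing are omitted.

```python
class ChainedTable:
    """Chaining: colliding keys are kept in a per-bucket list."""

    def __init__(self, capacity=8):
        self.buckets = [[] for _ in range(capacity)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        return default


class LinearProbingTable:
    """Open addressing: a colliding key goes to the next free slot."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity

    def put(self, key, value):
        n = len(self.slots)
        start = hash(key) % n
        for step in range(n):
            i = (start + step) % n
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
        raise RuntimeError("table is full")

    def get(self, key, default=None):
        n = len(self.slots)
        start = hash(key) % n
        for step in range(n):
            i = (start + step) % n
            if self.slots[i] is None:
                return default  # an empty slot ends the probe sequence
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return default
```

With a capacity of 8, the integer keys 1, 9, and 17 all hash to index 1. The chained table stores all three in one bucket, while the probing table spreads them over slots 1, 2, and 3.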
Boosting Multi-Core Reachability Performance with Shared Hash Tables
This paper focuses on data structures for multi-core reachability, which is a
key component in model checking algorithms and other verification methods. A
cornerstone of an efficient solution is the storage of visited states. In
related work, static partitioning of the state space was combined with
thread-local storage and resulted in reasonable speedups, but left open whether
improvements are possible. In this paper, we present a scaling solution for
shared state storage which is based on a lockless hash table implementation.
The solution is specifically designed for the cache architecture of modern
CPUs. Because model checking algorithms impose loose requirements on the hash
table operations, their design can be streamlined substantially compared to
related work on lockless hash tables. Still, an implementation of the hash
table presented here has dozens of sensitive performance parameters (bucket
size, cache line size, data layout, probing sequence, etc.). We analyzed their
impact and compared the resulting speedups with related tools. Our
implementation outperforms two state-of-the-art multi-core model checkers (SPIN
and DiVinE) by a substantial margin, while placing fewer constraints on the
load balancing and search algorithms.
Comment: preliminary report
Efficient end-to-end learning for quantizable representations
Embedding representation learning via neural networks is at the core of
modern similarity-based search. While much effort has been put into
developing algorithms for learning binary Hamming code representations for
search efficiency, this approach still requires a linear scan of the entire
dataset for each query and sacrifices search accuracy through binarization. To
this end, we consider the problem of directly learning a quantizable embedding
representation and the sparse binary hash code end-to-end, which can be used to
construct an efficient hash table that not only significantly reduces the
number of data points searched but also achieves state-of-the-art search
accuracy, outperforming previous state-of-the-art deep metric learning methods.
We also show that finding the optimal sparse binary hash code in a mini-batch
can be computed exactly in polynomial time by solving a minimum cost flow
problem. Our results on the CIFAR-100 and ImageNet datasets show
state-of-the-art search accuracy in precision@k and NMI metrics while providing
up to 98X and 478X search speedup, respectively, over exhaustive linear search.
The source code is available at
https://github.com/maestrojeong/Deep-Hash-Table-ICML18
Comment: Accepted and to appear at ICML 2018. Camera-ready version