
    Linear Hashing is Awesome

    We consider the hash function $h(x) = ((ax+b) \bmod p) \bmod n$ where $a, b$ are chosen uniformly at random from $\{0, 1, \ldots, p-1\}$. We prove that when we use $h(x)$ in hashing with chaining to insert $n$ elements into a table of size $n$, the expected length of the longest chain is $\tilde{O}(n^{1/3})$. The proof also generalises to give the same bound when we use the multiply-shift hash function by Dietzfelbinger et al. [Journal of Algorithms 1997].
    Comment: A preliminary version appeared at FOCS'1
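
    As an illustration of the scheme analysed above, here is a minimal Python sketch of hashing with chaining under this hash function; the prime P and the key range below are illustrative choices, not taken from the paper.

```python
import random
from collections import defaultdict

P = 2**31 - 1  # a Mersenne prime; keys are assumed to be smaller than P

def make_linear_hash(n):
    """Sample h(x) = ((a*x + b) mod P) mod n with a, b uniform in {0, ..., P-1}."""
    a, b = random.randrange(P), random.randrange(P)
    return lambda x: ((a * x + b) % P) % n

def longest_chain(keys, n):
    """Insert the keys into n chains and return the length of the longest one."""
    h = make_linear_hash(n)
    chains = defaultdict(list)
    for x in keys:
        chains[h(x)].append(x)
    return max(len(c) for c in chains.values())

n = 1024
keys = random.sample(range(P), n)   # n distinct keys into a table of size n
print(longest_chain(keys, n))
```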

    Wear Minimization for Cuckoo Hashing: How Not to Throw a Lot of Eggs into One Basket

    We study wear-leveling techniques for cuckoo hashing, showing that it is possible to achieve a memory wear bound of $\log\log n + O(1)$ after the insertion of $n$ items into a table of size $Cn$, for a suitable constant $C$. Moreover, we study our cuckoo hashing method empirically, showing that in practice it significantly improves on the memory wear of classic cuckoo hashing and linear probing.
    Comment: 13 pages, 1 table, 7 figures; to appear at the 13th Symposium on Experimental Algorithms (SEA 2014)
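
    Since the paper's wear-levelling builds on classic cuckoo hashing, here is a minimal sketch of that baseline (two tables, evict-and-retry insertion); the wear-levelled variant itself is not reproduced, and the class name, seeds and max_kicks parameter are illustrative.

```python
import random

class CuckooHash:
    """Classic two-table cuckoo hashing (not the paper's wear-levelled variant)."""

    def __init__(self, size, max_kicks=500):
        self.size, self.max_kicks = size, max_kicks
        self.tables = [[None] * size, [None] * size]
        self.seeds = [random.random(), random.random()]

    def _slot(self, key, i):
        # Derive the two candidate slots from Python's built-in hash and a seed.
        return hash((self.seeds[i], key)) % self.size

    def insert(self, key):
        i = 0
        for _ in range(self.max_kicks):
            s = self._slot(key, i)
            if self.tables[i][s] is None:
                self.tables[i][s] = key
                return True
            # Slot occupied: evict the occupant and try the other table with it.
            key, self.tables[i][s] = self.tables[i][s], key
            i = 1 - i
        return False  # give up; a full implementation would rehash and retry

    def lookup(self, key):
        return any(self.tables[i][self._slot(key, i)] == key for i in (0, 1))

t = CuckooHash(size=256)
for k in range(100):          # modest load factor across both tables
    t.insert(k)
print(all(t.lookup(k) for k in range(100)))
```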

    Fast Supervised Hashing with Decision Trees for High-Dimensional Data

    Supervised hashing aims to map the original features to compact binary codes that preserve label-based similarity in the Hamming space. Non-linear hash functions have demonstrated an advantage over linear ones owing to their greater generalization capability. In the literature, kernel functions are typically used to achieve non-linearity in hashing; they deliver encouraging retrieval performance at the price of slow evaluation and training. Here we propose to use boosted decision trees to achieve non-linearity in hashing; they are fast to train and evaluate, and hence better suited to hashing high-dimensional data. In our approach, we first propose submodular formulations for the hashing binary code inference problem and an efficient GraphCut-based block search method for solving large-scale inference. We then learn hash functions by training boosted decision trees to fit the binary codes. Experiments demonstrate that our method significantly outperforms most state-of-the-art methods in retrieval precision and training time. Especially for high-dimensional data, our method is orders of magnitude faster than many methods in terms of training time.
    Comment: Appearing in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2014, Ohio, US
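
    As a hedged sketch of the two-step scheme described above, the snippet below fits one boosted-tree classifier per bit of given binary codes and then ranks by Hamming distance. The inference step (the submodular formulation solved via GraphCut-based block search) is beyond a short sketch, so random target codes stand in for inferred ones, and scikit-learn's GradientBoostingClassifier stands in for the paper's boosted trees.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def fit_hash_functions(X, codes, n_trees=50, depth=2):
    """Step 2 of the two-step scheme: fit one boosted-tree classifier per bit
    of the target binary codes (codes in {0, 1}, shape [n_samples, n_bits])."""
    return [
        GradientBoostingClassifier(n_estimators=n_trees, max_depth=depth)
        .fit(X, codes[:, b])
        for b in range(codes.shape[1])
    ]

def encode(models, X):
    """Map features to binary codes with the learned tree ensembles."""
    return np.stack([m.predict(X) for m in models], axis=1).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code."""
    return np.argsort(np.count_nonzero(db_codes != query_code, axis=1))

# Toy usage: random targets stand in for the GraphCut-inferred codes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))
codes = rng.integers(0, 2, size=(200, 8))
models = fit_hash_functions(X, codes)
db = encode(models, X)
print(hamming_rank(db[0], db)[:5])
```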

    Linear Hashing

    Consider the set $H$ of all linear (or affine) transformations between two vector spaces over a finite field $F$. We study how good $H$ is as a class of hash functions, namely we consider hashing a set $S$ of size $n$ into a range having the same cardinality $n$ by a randomly chosen function from $H$ and look at the expected size of the largest hash bucket. $H$ is a universal class of hash functions for any finite field, but with respect to our measure different fields behave differently.
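
    A small experiment in the spirit of this abstract, assuming the field is $\mathbb{F}_2$: hash a set $S$ of $n = 2^k$ random bit-vectors into a range of the same cardinality with a uniformly random linear map and report the size of the largest bucket. All sizes are illustrative.

```python
import numpy as np

def largest_bucket_gf2(S, k, rng):
    """Hash the bit-vectors in S (shape [m, d]) with a uniformly random
    linear map over GF(2) into a range of size 2**k and return the size
    of the largest bucket."""
    A = rng.integers(0, 2, size=(k, S.shape[1]))   # random k x d matrix over F_2
    codes = (S @ A.T) % 2                          # h(x) = Ax over GF(2)
    buckets = codes @ (1 << np.arange(k))          # pack each k-bit image into an int
    _, counts = np.unique(buckets, return_counts=True)
    return counts.max()

rng = np.random.default_rng(1)
d, k = 20, 10                             # domain F_2^20, range F_2^10 (n = 1024)
S = rng.integers(0, 2, size=(2**k, d))    # n random keys (repeats are fine here)
print(largest_bucket_gf2(S, k, rng))
```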

    Hashing protocol for distilling multipartite CSS states

    We present a hashing protocol for distilling multipartite CSS states by means of local Clifford operations, Pauli measurements and classical communication. It is shown that this hashing protocol outperforms previous versions by exploiting information theory to the full extent, and not only applying CNOTs as local Clifford operations. Using the information-theoretical notion of a strongly typical set, we calculate the asymptotic yield of the protocol as the solution of a linear programming problem.
    Comment: 13 pages, 3 figures, RevTeX

    Improved Asymmetric Locality Sensitive Hashing (ALSH) for Maximum Inner Product Search (MIPS)

    Recently it was shown that the problem of Maximum Inner Product Search (MIPS) admits provably sub-linear hashing algorithms. Asymmetric transformations applied before hashing were the key to solving MIPS, which is otherwise hard. In the prior work, the authors used asymmetric transformations that convert the problem of approximate MIPS into the problem of approximate near-neighbor search, which can be solved efficiently with hashing. In this work, we provide a different transformation which converts the problem of approximate MIPS into the problem of approximate cosine similarity search, which can be solved efficiently with signed random projections. Theoretical analysis shows that the new scheme is significantly better than the original scheme for MIPS. Experimental evaluations strongly support the theoretical findings.
    Comment: arXiv admin note: text overlap with arXiv:1405.586
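
    To illustrate the kind of reduction the abstract describes, here is a sketch using one simple asymmetric transform (append $\sqrt{1 - \lVert x \rVert^2}$ to scaled data points and a $0$ to queries), which makes cosine similarity monotone in the inner product; this conveys the general idea, not the paper's exact construction, with signed random projections providing the LSH codes.

```python
import numpy as np

def mips_to_cosine(X, queries):
    """Asymmetric transform reducing MIPS to cosine similarity search.
    After the transform, <P(x), Q(q)> = <x, q> / scale and ||P(x)|| = 1,
    so ranking by cosine with Q(q) matches ranking by inner product."""
    scale = np.linalg.norm(X, axis=1).max()
    Xs = X / scale                                   # now ||x|| <= 1
    tail = np.sqrt(np.clip(1 - np.linalg.norm(Xs, axis=1, keepdims=True) ** 2,
                           0, None))                 # clip guards float error
    P = np.hstack([Xs, tail])
    Q = np.hstack([queries, np.zeros((len(queries), 1))])
    return P, Q

def srp_codes(V, R):
    """Signed random projections: the classic LSH family for cosine similarity."""
    return (V @ R >= 0).astype(np.uint8)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
q = rng.normal(size=(1, 16))
P, Q = mips_to_cosine(X, q)
R = rng.normal(size=(P.shape[1], 64))   # projection directions shared by both sides
db_codes, q_code = srp_codes(P, R), srp_codes(Q, R)
# Candidates: database points whose codes are nearest the query's in Hamming distance.
print(np.argsort(np.count_nonzero(db_codes != q_code, axis=1))[:5])
```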