2,732 research outputs found

    Strongly universal string hashing is fast

    Get PDF
    We present fast strongly universal string hashing families: they can process data at a rate of 0.2 CPU cycle per byte. Maybe surprisingly, we find that these families---though they require a large buffer of random numbers---are often faster than popular hash functions with weaker theoretical guarantees. Moreover, conventional wisdom is that hash functions with fewer multiplications are faster. Yet we find that they may fail to be faster due to operation pipelining. We present experimental results on several processors including low-powered processors. Our tests include hash functions designed for processors with the Carry-Less Multiplication (CLMUL) instruction set. We also prove, using accessible proofs, the strong universality of our families.Comment: Software is available at http://code.google.com/p/variablelengthstringhashing/ and https://github.com/lemire/StronglyUniversalStringHashin

    The universality of iterated hashing over variable-length strings

    Get PDF
    Iterated hash functions process strings recursively, one character at a time. At each iteration, they compute a new hash value from the preceding hash value and the next character. We prove that iterated hashing can be pairwise independent, but never 3-wise independent. We show that it can be almost universal over strings much longer than the number of hash values; we bound the maximal string length given the collision probability

    Regular and almost universal hashing: an efficient implementation

    Get PDF
    Random hashing can provide guarantees regarding the performance of data structures such as hash tables---even in an adversarial setting. Many existing families of hash functions are universal: given two data objects, the probability that they have the same hash value is low given that we pick hash functions at random. However, universality fails to ensure that all hash functions are well behaved. We further require regularity: when picking data objects at random they should have a low probability of having the same hash value, for any fixed hash function. We present the efficient implementation of a family of non-cryptographic hash functions (PM+) offering good running times, good memory usage as well as distinguishing theoretical guarantees: almost universality and component-wise regularity. On a variety of platforms, our implementations are comparable to the state of the art in performance. On recent Intel processors, PM+ achieves a speed of 4.7 bytes per cycle for 32-bit outputs and 3.3 bytes per cycle for 64-bit outputs. We review vectorization through SIMD instructions (e.g., AVX2) and optimizations for superscalar execution.Comment: accepted for publication in Software: Practice and Experience in September 201

    Faster 64-bit universal hashing using carry-less multiplications

    Get PDF
    Intel and AMD support the Carry-less Multiplication (CLMUL) instruction set in their x64 processors. We use CLMUL to implement an almost universal 64-bit hash family (CLHASH). We compare this new family with what might be the fastest almost universal family on x64 processors (VHASH). We find that CLHASH is at least 60% faster. We also compare CLHASH with a popular hash function designed for speed (Google's CityHash). We find that CLHASH is 40% faster than CityHash on inputs larger than 64 bytes and just as fast otherwise

    On an almost-universal hash function family with applications to authentication and secrecy codes

    Get PDF
    Universal hashing, discovered by Carter and Wegman in 1979, has many important applications in computer science. MMH∗^*, which was shown to be Δ\Delta-universal by Halevi and Krawczyk in 1997, is a well-known universal hash function family. We introduce a variant of MMH∗^*, that we call GRDH, where we use an arbitrary integer n>1n>1 instead of prime pp and let the keys x=⟹x1,
,xk⟩∈Znk\mathbf{x}=\langle x_1, \ldots, x_k \rangle \in \mathbb{Z}_n^k satisfy the conditions gcd⁥(xi,n)=ti\gcd(x_i,n)=t_i (1≀i≀k1\leq i\leq k), where t1,
,tkt_1,\ldots,t_k are given positive divisors of nn. Then via connecting the universal hashing problem to the number of solutions of restricted linear congruences, we prove that the family GRDH is an Δ\varepsilon-almost-Δ\Delta-universal family of hash functions for some Δ<1\varepsilon<1 if and only if nn is odd and gcd⁥(xi,n)=ti=1\gcd(x_i,n)=t_i=1 (1≀i≀k)(1\leq i\leq k). Furthermore, if these conditions are satisfied then GRDH is 1p−1\frac{1}{p-1}-almost-Δ\Delta-universal, where pp is the smallest prime divisor of nn. Finally, as an application of our results, we propose an authentication code with secrecy scheme which strongly generalizes the scheme studied by Alomair et al. [{\it J. Math. Cryptol.} {\bf 4} (2010), 121--148], and [{\it J.UCS} {\bf 15} (2009), 2937--2956].Comment: International Journal of Foundations of Computer Science, to appea

    Attacks on quantum key distribution protocols that employ non-ITS authentication

    Full text link
    We demonstrate how adversaries with unbounded computing resources can break Quantum Key Distribution (QKD) protocols which employ a particular message authentication code suggested previously. This authentication code, featuring low key consumption, is not Information-Theoretically Secure (ITS) since for each message the eavesdropper has intercepted she is able to send a different message from a set of messages that she can calculate by finding collisions of a cryptographic hash function. However, when this authentication code was introduced it was shown to prevent straightforward Man-In-The-Middle (MITM) attacks against QKD protocols. In this paper, we prove that the set of messages that collide with any given message under this authentication code contains with high probability a message that has small Hamming distance to any other given message. Based on this fact we present extended MITM attacks against different versions of BB84 QKD protocols using the addressed authentication code; for three protocols we describe every single action taken by the adversary. For all protocols the adversary can obtain complete knowledge of the key, and for most protocols her success probability in doing so approaches unity. Since the attacks work against all authentication methods which allow to calculate colliding messages, the underlying building blocks of the presented attacks expose the potential pitfalls arising as a consequence of non-ITS authentication in QKD-postprocessing. We propose countermeasures, increasing the eavesdroppers demand for computational power, and also prove necessary and sufficient conditions for upgrading the discussed authentication code to the ITS level.Comment: 34 page

    Key recycling in authentication

    Full text link
    In their seminal work on authentication, Wegman and Carter propose that to authenticate multiple messages, it is sufficient to reuse the same hash function as long as each tag is encrypted with a one-time pad. They argue that because the one-time pad is perfectly hiding, the hash function used remains completely unknown to the adversary. Since their proof is not composable, we revisit it using a composable security framework. It turns out that the above argument is insufficient: if the adversary learns whether a corrupted message was accepted or rejected, information about the hash function is leaked, and after a bounded finite amount of rounds it is completely known. We show however that this leak is very small: Wegman and Carter's protocol is still Ï”\epsilon-secure, if Ï”\epsilon-almost strongly universal2_2 hash functions are used. This implies that the secret key corresponding to the choice of hash function can be reused in the next round of authentication without any additional error than this Ï”\epsilon. We also show that if the players have a mild form of synchronization, namely that the receiver knows when a message should be received, the key can be recycled for any arbitrary task, not only new rounds of authentication.Comment: 17+3 pages. 11 figures. v3: Rewritten with AC instead of UC. Extended the main result to both synchronous and asynchronous networks. Matches published version up to layout and updated references. v2: updated introduction and reference
    • 

    corecore