
    On randomness in Hash functions

    In the talk, we shall discuss quality measures for hash functions used in data structures and algorithms, and survey positive and negative results. (This talk is not about cryptographic hash functions.) For the analysis of algorithms involving hash functions, it is often convenient to assume that the hash functions used behave fully randomly; in some cases no analysis is known that avoids this assumption. In practice, one needs to get by with weaker hash functions that can be generated by randomized algorithms. A well-studied range of applications concerns realizations of dynamic dictionaries (linear probing, chained hashing, dynamic perfect hashing, cuckoo hashing and its generalizations) or Bloom filters and their variants. A particularly successful and useful means of classification is Carter and Wegman's notion of universal or k-wise independent classes, introduced in 1977. A natural and widely used approach to analyzing an algorithm involving hash functions is to show that it works if a sufficiently strong universal class of hash functions is used, and to substitute one of the known constructions of such classes. This invites research into the question of just how much independence in the hash functions is necessary for an algorithm to work. Some recent analyses that gave impossibility results constructed rather artificial classes that would not work; other results pointed out natural, widely used hash classes that would not work in a particular application. Only recently was it shown that, under certain assumptions about the entropy present in the set of keys, even 2-wise independent hash classes lead to strong randomness properties in the hash values. The negative results show that this may not be taken as justification for using weak hash classes indiscriminately, in particular for key sets with structure. When stronger independence properties are needed for a theoretical analysis, one may resort to classic constructions. Only in 2003 was it discovered how full randomness can be simulated using only linear space overhead (which is optimal). The "split-and-share" approach can be used to justify the full randomness assumption in some situations in which full randomness is needed for the analysis to go through, as in many applications involving multiple hash functions (e.g., generalized versions of cuckoo hashing with multiple hash functions or larger bucket sizes, load balancing, Bloom filters and variants, or minimal perfect hash function constructions). In practice, efficiency considerations beyond constant factors are important. It is not hard to construct very efficient 2-wise independent classes. Using k-wise independent classes for constant k larger than 3 has become feasible in practice only through new constructions involving tabulation. This fits well with the relatively recent result that linear probing works with 5-independent hash functions. Recent developments suggest that classifying hash function constructions by their degree of independence alone may not be adequate in some cases. Thus, one may want to analyze the behavior of specific hash classes in specific applications, circumventing the concept of k-wise independence. Several such results were recently achieved concerning hash functions that utilize tabulation. In particular, if the analysis of the application involves randomness properties of graphs and hypergraphs (generalized cuckoo hashing, also in the version with a "stash", or load balancing), a hash class combining k-wise independence with tabulation has turned out to be very powerful.
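    For illustration, the following is a minimal sketch of simple tabulation hashing, one of the tabulation-based constructions mentioned above. It is only a toy version under assumed parameters (32-bit keys split into four 8-bit characters, random lookup tables); the class name and sizes are illustrative and not taken from the talk.

```python
# Minimal sketch of simple tabulation hashing (illustrative parameters only):
# a 32-bit key is split into four 8-bit characters, each character indexes
# its own table of random 32-bit words, and the looked-up words are XORed.
import random

class TabulationHash:
    def __init__(self, seed=None):
        rng = random.Random(seed)
        # One table of 256 random 32-bit words per key character.
        self.tables = [[rng.getrandbits(32) for _ in range(256)]
                       for _ in range(4)]

    def __call__(self, key: int) -> int:
        h = 0
        for i in range(4):
            byte = (key >> (8 * i)) & 0xFF
            h ^= self.tables[i][byte]   # XOR of independent table entries
        return h

h = TabulationHash(seed=42)
print(h(123456789))
```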

    Research of collision properties of the modified UMAC algorithm on crypto-code constructions

    The transfer of information over telecommunication channels is accompanied by message hashing to control the integrity of the data and confirm its authenticity. When a reliable hash function is used, it is computationally difficult to create a fake message with a pre-existing hash code; however, due to weaknesses in specific hashing algorithms, this threat can become feasible. To increase the cryptographic strength of messages transmitted over telecommunication channels, various ways of creating hash codes exist, which, according to practical research, are imperfect in terms of both the speed of their formation and their degree of cryptographic strength. The collision properties of hash functions formed using the modified UMAC algorithm are investigated using the methodology for assessing the universality and strict universality of hash codes. Based on the results of the research, an assessment is presented of the impact of the proposed modifications at the last stage of authentication-code generation on the preservation of universal hashing properties. The advantages and disadvantages that accompany hash-code formation by previously known methods are analyzed. The scheme of cascading generation of data integrity and authenticity control codes using the UMAC algorithm on crypto-code constructions has been improved. Schemes of algorithms for checking hash codes were developed to meet the requirements of universality and strict universality. The calculation and analysis of collision search in the set of generated hash codes was carried out according to the requirements of a universal and strictly universal class of hash codes.
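    For context, the sketch below shows how collision probability is typically estimated when checking universality: for a universal class mapping into m values, distinct keys should collide with probability at most 1/m over the random choice of hash function. This is a generic Carter-Wegman example, not the modified UMAC construction from the paper; all names and parameters are illustrative.

```python
# Empirical check of the universality bound Pr_h[h(x) == h(y)] <= 1/m
# for a classic Carter-Wegman family h(x) = ((a*x + b) mod p) mod m.
# Illustrative sketch only; not the paper's modified UMAC algorithm.
import random
from itertools import combinations

P = 2**61 - 1          # prime larger than any key
M = 256                # range of the hash codes

def make_hash(rng):
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % M

def estimate_collision_prob(keys, trials=2000, seed=0):
    rng = random.Random(seed)
    pairs = list(combinations(keys, 2))
    collisions = 0
    for _ in range(trials):
        h = make_hash(rng)
        hv = {x: h(x) for x in keys}            # hash every key once
        collisions += sum(hv[x] == hv[y] for x, y in pairs)
    return collisions / (trials * len(pairs))

keys = random.sample(range(10**6), 20)
print(estimate_collision_prob(keys), "vs universality bound", 1 / M)
```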

    A Fast Single-Key Two-Level Universal Hash Function

    Universal hash functions based on univariate polynomials are well known, e.g. Poly1305 and GHASH. Using Horner’s rule to evaluate such hash functions requires l − 1 field multiplications for hashing a message consisting of l blocks, where each block is one field element. A faster method is based on the class of Bernstein-Rabin-Winograd (BRW) polynomials, which require ⌊l/2⌋ multiplications and ⌊lg l⌋ squarings for l ≥ 3 blocks. Though this is significantly smaller than Horner’s rule based hashing, implementing BRW polynomials for variable-length messages presents significant difficulties. In this work, we propose a two-level hash function where BRW polynomial based hashing is done at the lower level and Horner’s rule based hashing is done at the higher level. The BRW polynomial based hashing is applied to a fixed number of blocks, and hence the difficulties in handling variable-length messages are avoided. Even though the hash function has two levels, we show that it is sufficient to use a single field element as the hash key. The basic idea is instantiated to propose two new hash functions, one that hashes a single binary string and another that hashes a vector of binary strings. We describe two actual implementations, one over F_{2^128} and the other over F_{2^256}, both using the pclmulqdq instruction available in modern Intel processors. On both the Haswell and Skylake processors, the implementation over F_{2^128} is faster than both an implementation of GHASH by Gueron and a highly optimised implementation, also by Gueron, of another polynomial based hash function called POLYVAL. We further show that the Fast Fourier Transform based field multiplication over F_{2^256} proposed by Bernstein and Chou can be used to evaluate the new hash function at a cost of at most about 46 bit operations per bit of digest, but, unlike the Bernstein-Chou analysis, there is no hidden cost of generating the hash key. More generally, the new idea of building a two-level hash function with a single field element as the hash key can be applied to other finite fields to build new hash functions.
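    As a small illustration of the baseline in this comparison, the sketch below evaluates a polynomial hash with Horner’s rule, using l − 1 field multiplications for l message blocks. It works over a prime field purely to keep the example short; the paper’s constructions operate over F_{2^128} and F_{2^256}, and the names and modulus here are illustrative.

```python
# Horner's-rule evaluation of a polynomial hash
#   H_k(m_1, ..., m_l) = m_1*k^(l-1) + m_2*k^(l-2) + ... + m_l
# over a prime field chosen only for brevity (illustrative sketch).
P = 2**127 - 1   # field modulus (assumption for this example)

def poly_hash(blocks, key):
    acc = blocks[0] % P
    for m in blocks[1:]:
        acc = (acc * key + m) % P   # l - 1 multiplications for l blocks
    return acc

blocks = [3, 1, 4, 1, 5, 9]          # message blocks as field elements
print(poly_hash(blocks, key=123456789))
```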

    Increasing the power efficiency of Bloom filters for network string matching

    Although software-based techniques are widely accepted in computer security systems, there is growing interest in exploiting hardware in order to keep up with increasing network bandwidth. Recently, hardware-based virus protection systems have started to emerge. These types of hardware systems work by identifying malicious content and removing it from network streams. In principle, they rely on string matching: bit by bit, they compare virus signatures with the bit strings in the network traffic. Bloom filters are ideal data structures for string matching. Nonetheless, they consume considerable power when many of them are used in parallel to match different virus signatures. In this paper, we propose a new type of Bloom filter architecture that exploits the well-known pipelining technique. © 2006 IEEE
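    To make the role of the data structure concrete, the sketch below is a minimal software Bloom filter: k salted hash functions set or test k bit positions, so membership queries can return false positives but never false negatives. It illustrates the data structure only, not the pipelined hardware architecture proposed in the paper; names and parameters are illustrative.

```python
# Minimal Bloom filter sketch for signature matching (illustrative only).
# k "independent" hash functions are simulated by salting SHA-256.
import hashlib

class BloomFilter:
    def __init__(self, m_bits=8192, k=4):
        self.m = m_bits
        self.k = k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter()
bf.add(b"EICAR-test-signature")
print(bf.might_contain(b"EICAR-test-signature"))   # True
print(bf.might_contain(b"benign payload"))          # almost certainly False
```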