Search CORE

125 research outputs found

On randomness in Hash functions

Author: Dietzfelbinger Martin
Publication venue
Publication date: 01/01/2012
Field of study

In the talk, we shall discuss quality measures for hash functions used in data structures and algorithms, and survey positive and negative results. (This talk is not about cryptographic hash functions.) For the analysis of algorithms involving hash functions, it is often convenient to assume the hash functions used behave fully randomly; in some cases there is no analysis known that avoids this assumption. In practice, one needs to get by with weaker hash functions that can be generated by randomized algorithms. A well-studied range of applications concern realizations of dynamic dictionaries (linear probing, chained hashing, dynamic perfect hashing, cuckoo hashing and its generalizations) or Bloom filters and their variants. A particularly successful and useful means of classification are Carter and Wegman's universal or k-wise independent classes, introduced in 1977. A natural and widely used approach to analyzing an algorithm involving hash functions is to show that it works if a sufficiently strong universal class of hash functions is used, and to substitute one of the known constructions of such classes. This invites research into the question of just how much independence in the hash functions is necessary for an algorithm to work. Some recent analyses that gave impossibility results constructed rather artificial classes that would not work; other results pointed out natural, widely used hash classes that would not work in a particular application. Only recently it was shown that under certain assumptions on some entropy present in the set of keys even 2-wise independent hash classes will lead to strong randomness properties in the hash values. The negative results show that these results may not be taken as justification for using weak hash classes indiscriminately, in particular for key sets with structure. When stronger independence properties are needed for a theoretical analysis, one may resort to classic constructions. Only in 2003 it was found out how full randomness can be simulated using only linear space overhead (which is optimal). The "split-and-share" approach can be used to justify the full randomness assumption in some situations in which full randomness is needed for the analysis to go through, like in many applications involving multiple hash functions (e.g., generalized versions of cuckoo hashing with multiple hash functions or larger bucket sizes, load balancing, Bloom filters and variants, or minimal perfect hash function constructions). For practice, efficiency considerations beyond constant factors are important. It is not hard to construct very efficient 2-wise independent classes. Using k-wise independent classes for constant k bigger than 3 has become feasible in practice only by new constructions involving tabulation. This goes together well with the quite new result that linear probing works with 5-independent hash functions. Recent developments suggest that the classification of hash function constructions by their degree of independence alone may not be adequate in some cases. Thus, one may want to analyze the behavior of specific hash classes in specific applications, circumventing the concept of k-wise independence. Several such results were recently achieved concerning hash functions that utilize tabulation. In particular if the analysis of the application involves using randomness properties in graphs and hypergraphs (generalized cuckoo hashing, also in the version with a "stash", or load balancing), a hash class combining k-wise independence with tabulation has turned out to be very powerful

Dagstuhl Research Online Publication Server

Digitale Bibliothek Thüringen

Scalable data placement of data-intensive services in geo-distributed clouds

Author: Atrey Ankita
De Turck Filip
Van Seghbroeck Gregory
Volckaert Bruno
Publication venue: 'Scitepress'
Publication date: 01/01/2018
Field of study

Ghent University Academic Bibliography

Computational Utilities for the Game of Simplicial Nim

Author: Penn Nelson
Publication venue: UKnowledge
Publication date: 01/01/2021
Field of study

Simplicial nim games, a class of impartial games, have very interesting mathematical properties. Winning strategies on a simplicial nim game can be determined by the set of positions in the game whose Sprague-Grundy values are zero (also zero positions). In this work, I provide two major contributions to the study of simplicial nim games. First, I provide a modern and efficient implementation of the Sprague-Grundy function for an arbitrary simplicial complex, and discuss its performance and scope of viability. Secondly, I provide a method to find a simple mathematical expression to model that function if it exists. I show the effectiveness of this method on determining mathematical expressions that classify the set of zero positions onseveral simplicial nim games

University of Kentucky

Incremental Edge Orientation in Forests

Author: Bender Michael A.
Kopelowitz Tsvi
Kuszmaul William
Porat Ely
Stein Clifford
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 29th Annual European Symposium on Algorithms (ESA 2021)
Publication date: 01/01/2021
Field of study

For any forest G = (V, E) it is possible to orient the edges E so that no vertex in V has out-degree greater than 1. This paper considers the incremental edge-orientation problem, in which the edges E arrive over time and the algorithm must maintain a low-out-degree edge orientation at all times. We give an algorithm that maintains a maximum out-degree of 3 while flipping at most O(log log n) edge orientations per edge insertion, with high probability in n. The algorithm requires worst-case time O(log n log log n) per insertion, and takes amortized time O(1). The previous state of the art required up to O(log n / log log n) edge flips per insertion. We then apply our edge-orientation results to the problem of dynamic Cuckoo hashing. The problem of designing simple families ? of hash functions that are compatible with Cuckoo hashing has received extensive attention. These families ? are known to satisfy static guarantees, but do not come typically with dynamic guarantees for the running time of inserts and deletes. We show how to transform static guarantees (for 1-associativity) into near-state-of-the-art dynamic guarantees (for O(1)-associativity) in a black-box fashion. Rather than relying on the family ? to supply randomness, as in past work, we instead rely on randomness within our table-maintenance algorithm

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Recommended from our members

ALGORITHMS FOR MASSIVE, EXPENSIVE, OR OTHERWISE INCONVENIENT GRAPHS

Author: Tench David
Publication venue: ScholarWorks@UMass Amherst
Publication date: 18/12/2020
Field of study

A long-standing assumption common in algorithm design is that any part of the input is accessible at any time for unit cost. However, as we work with increasingly large data sets, or as we build smaller devices, we must revisit this assumption. In this thesis, I present some of my work on graph algorithms designed for circumstances where traditional assumptions about inputs do not apply. 1. Classical graph algorithms require direct access to the input graph and this is not feasible when the graph is too large to fit in memory. For computation on massive graphs we consider the dynamic streaming graph model. Given an input graph defined by as a stream of edge insertions and deletions, our goal is to approximate properties of this graph using space that is sublinear in the size of the stream. In this thesis, I present algorithms for approximating vertex connectivity, hypergraph edge connectivity, maximum coverage, unique coverage, and temporal connectivity in graph streams. 2. In certain applications the input graph is not explicitly represented, but its edges may be discovered via queries which require costly computation or measurement. I present two open-source systems which solve real-world problems via graph algorithms which may access their inputs only through costly edge queries. M ESH is a memory manager which compacts memory efficiently by finding an approximate graph matching subject to stringent time and edge query restrictions. PathCache is an efficiently scalable network measurement platform that outperforms the current state of the art

ScholarWorks@UMass Amherst

Applications of Derandomization Theory in Coding

Author: Cheraghchi Mahdi
Publication venue
Publication date: 27/05/2010
Field of study

Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions. In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory. First, we consider the wiretap channel problem which involves a communication system in which an intruder can eavesdrop a limited portion of the transmissions, and construct efficient and information-theoretically optimal communication protocols for this model. Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives pass a certain threshold. Finally, we design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for construction of explicit capacity achieving codes. [This is a shortened version of the actual abstract in the thesis.]Comment: EPFL Phd Thesi

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CERN Document Server

Randomness Extraction in AC0 and with Small Locality

Author: Cheng Kuan
Li Xin
Publication venue
Publication date: 01/01/2018
Field of study

Randomness extractors, which extract high quality (almost-uniform) random bits from biased random sources, are important objects both in theory and in practice. While there have been significant progress in obtaining near optimal constructions of randomness extractors in various settings, the computational complexity of randomness extractors is still much less studied. In particular, it is not clear whether randomness extractors with good parameters can be computed in several interesting complexity classes that are much weaker than P. In this paper we study randomness extractors in the following two models of computation: (1) constant-depth circuits (AC0), and (2) the local computation model. Previous work in these models, such as [Vio05a], [GVW15] and [BG13], only achieve constructions with weak parameters. In this work we give explicit constructions of randomness extractors with much better parameters. As an application, we use our AC0 extractors to study pseudorandom generators in AC0, and show that we can construct both cryptographic pseudorandom generators (under reasonable computational assumptions) and unconditional pseudorandom generators for space bounded computation with very good parameters. Our constructions combine several previous techniques in randomness extractors, as well as introduce new techniques to reduce or preserve the complexity of extractors, which may be of independent interest. These include (1) a general way to reduce the error of strong seeded extractors while preserving the AC0 property and small locality, and (2) a seeded randomness condenser with small locality.Comment: 62 page

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Query-Efficient Computation in Property Testing and Learning Theory

Author: Garcia Soriano D. (David)
Publication venue
Publication date: 25/04/2012
Field of study

CWI's Institutional Repository

Communication Steps for Parallel Query Processing

Author: Beame Paul
Koutris Paraschos
Suciu Dan
Publication venue
Publication date: 01/01/2013
Field of study

We consider the problem of computing a relational query

q

on a large input database of size

n

, using a large number

p

of servers. The computation is performed in rounds, and each server can receive only

O(n/p^{1-\varepsilon})

bits of data, where

\varepsilon \in [0,1]

is a parameter that controls replication. We examine how many global communication steps are needed to compute

q

. We establish both lower and upper bounds, in two settings. For a single round of communication, we give lower bounds in the strongest possible model, where arbitrary bits may be exchanged; we show that any algorithm requires

\varepsilon \geq 1-1/\tau^*

, where

\tau^*

is the fractional vertex cover of the hypergraph of

q

. We also give an algorithm that matches the lower bound for a specific class of databases. For multiple rounds of communication, we present lower bounds in a model where routing decisions for a tuple are tuple-based. We show that for the class of tree-like queries there exists a tradeoff between the number of rounds and the space exponent

\varepsilon

. The lower bounds for multiple rounds are the first of their kind. Our results also imply that transitive closure cannot be computed in O(1) rounds of communication

arXiv.org e-Print Archive

CiteSeerX

Crossref