The Multiple-orientability Thresholds for Random Hypergraphs
A k-uniform hypergraph H = (V, E) is called l-orientable if there is an
assignment of each edge e in E to one of its vertices v in e such that
no vertex is assigned more than l edges. Let H_{n,m,k} be a hypergraph,
drawn uniformly at random from the set of all k-uniform hypergraphs
with n vertices and m edges. In this paper we establish the threshold
for the l-orientability of H_{n,m,k} for all k >= 3 and l >= 2, i.e.,
we determine a critical quantity c*_{k,l} such that with probability
1-o(1) the graph H_{n,cn,k} has an l-orientation if c < c*_{k,l}, but
fails to have one if c > c*_{k,l}.
Our result has various applications including sharp load thresholds for
cuckoo hashing, load balancing with guaranteed maximum load, and massive
parallel access to hard disk arrays.
Comment: An extended abstract appeared in the proceedings of SODA 201
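Deciding whether a given hypergraph is l-orientable reduces to bipartite matching: treat each vertex as l unit-capacity slots, and each edge as a node that must be matched to a slot of one of its vertices. The following is an illustrative sketch (not the paper's proof technique) using Kuhn's augmenting-path algorithm:

```python
def max_assignment(edges, l):
    """Kuhn's augmenting-path matching: each hyperedge is assigned to one
    (vertex, slot) pair among its vertices; each vertex offers l slots."""
    match = {}  # (vertex, slot) -> index of the edge occupying it

    def try_assign(i, visited):
        for v in edges[i]:
            for s in range(l):
                if (v, s) in visited:
                    continue
                visited.add((v, s))
                # take a free slot, or displace its edge along an augmenting path
                if (v, s) not in match or try_assign(match[(v, s)], visited):
                    match[(v, s)] = i
                    return True
        return False

    return sum(try_assign(i, set()) for i in range(len(edges)))

def is_l_orientable(edges, l):
    # orientable iff every edge can be assigned without exceeding capacity l
    return max_assignment(edges, l) == len(edges)
```

For instance, four parallel edges on two vertices are 2-orientable (total capacity 4), but five are not.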
Multiple choice allocations with small maximum loads
The idea of using multiple choices to improve allocation schemes is now well understood and is often illustrated by the following example. Suppose n balls are allocated to n bins, with each ball choosing a bin independently and uniformly at random. The \emph{maximum load}, or the number of balls in the most loaded bin, will then be approximately log n / log log n with high probability. Suppose now the balls are allocated sequentially by placing each ball in the least loaded bin among k >= 2 bins chosen independently and uniformly at random. Azar, Broder, Karlin, and Upfal showed that in this scenario the maximum load drops to log log n / log k + O(1) with high probability, which is an exponential improvement over the previous case.
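The gap between the two schemes is easy to observe in simulation. A minimal sketch (function names are illustrative):

```python
import random

def one_choice_loads(n, rng):
    # each of n balls picks a single bin uniformly at random
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return loads

def k_choice_loads(n, k, rng):
    # each ball is placed in the least loaded of k uniformly random bins
    loads = [0] * n
    for _ in range(n):
        best = min((rng.randrange(n) for _ in range(k)), key=loads.__getitem__)
        loads[best] += 1
    return loads
```

For n around 10^5, the one-choice maximum load typically lands around 8-10, while two choices bring it down to about 4, in line with the log n / log log n versus log log n / log k behavior.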
In this thesis we investigate multiple choice allocations from a slightly different perspective. Instead of minimizing the maximum load, we fix the bin capacities and focus on maximizing the number of balls that can be allocated without overloading any bin. In the process that we consider we have m = ⌊cn⌋ balls and n bins. Each ball chooses k bins independently and uniformly at random. \emph{Is it possible to assign each ball to one of its choices such that no bin receives more than l balls?} For all k >= 3 and l >= 2 we give a critical value, c*_{k,l}, such that when c < c*_{k,l} such an assignment exists with high probability, and when c > c*_{k,l} this is not the case.
In case such an allocation exists, \emph{how quickly can we find it?} Previous work on total allocation time for the case k >= 3 and l = 1 has analyzed a \emph{breadth first strategy}, which is shown to be linear only in expectation. We give a simple and efficient algorithm, which we call \emph{local search allocation} (LSA), to find an allocation for all k >= 3 and l >= 1. Provided the number of balls is below (but arbitrarily close to) the theoretically achievable load threshold, we give a \emph{linear} bound for the total allocation time that holds with high probability.
We demonstrate, through simulations, an order of magnitude improvement for total and maximum allocation times when compared to the state of the art method.
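The breadth first strategy mentioned above can be sketched for the l = 1 case as follows: to place a ball, run a BFS over bins, where "the occupant of bin b may move to another of its choices" defines the edges, until a free bin is found; then shift balls back along the path. This is an illustrative sketch, not the thesis's LSA algorithm:

```python
from collections import deque

def bfs_insert(choices, occupant, ball):
    """Place `ball` by BFS over the eviction graph: find a shortest
    chain of moves ending at a free bin (capacity l = 1 per bin)."""
    parent = {}  # bin -> (previous bin on the path, ball that moves into it)
    queue = deque()
    for b in choices[ball]:
        if b not in parent:
            parent[b] = (None, ball)
            queue.append(b)
    while queue:
        b = queue.popleft()
        if occupant.get(b) is None:
            # free bin found: walk back, shifting balls along the path
            while b is not None:
                prev, moved = parent[b]
                occupant[b] = moved
                b = prev
            return True
        y = occupant[b]  # the ball in b may move to its other choices
        for b2 in choices[y]:
            if b2 not in parent:
                parent[b2] = (b, y)
                queue.append(b2)
    return False
```

Because the BFS finds a shortest eviction chain, each successful insertion moves as few balls as possible.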
Our results find applications in many areas including hashing, load balancing, data management, orientability of random hypergraphs, and maximum matchings in a special class of bipartite graphs.
On randomness in Hash functions
In the talk, we shall discuss quality measures for hash functions used in data structures and algorithms, and survey positive and negative results. (This talk is not about cryptographic hash functions.) For the analysis of algorithms involving hash functions, it is often convenient to assume the hash functions used behave fully randomly; in some cases there is no analysis known that avoids this assumption. In practice, one needs to get by with weaker hash functions that can be generated by randomized algorithms. A well-studied range of applications concerns realizations of dynamic dictionaries (linear probing, chained hashing, dynamic perfect hashing, cuckoo hashing and its generalizations) or Bloom filters and their variants. A particularly successful and useful means of classification is Carter and Wegman's universal or k-wise independent classes, introduced in 1977. A natural and widely used approach to analyzing an algorithm involving hash functions is to show that it works if a sufficiently strong universal class of hash functions is used, and to substitute one of the known constructions of such classes. This invites research into the question of just how much independence in the hash functions is necessary for an algorithm to work. Some recent analyses that gave impossibility results constructed rather artificial classes that would not work; other results pointed out natural, widely used hash classes that would not work in a particular application. Only recently was it shown that, under certain assumptions about the entropy present in the set of keys, even 2-wise independent hash classes lead to strong randomness properties in the hash values. The negative results show that such positive results may not be taken as justification for using weak hash classes indiscriminately, in particular for key sets with structure. When stronger independence properties are needed for a theoretical analysis, one may resort to classic constructions.
Only in 2003 was it shown how full randomness can be simulated with only linear space overhead (which is optimal). The "split-and-share" approach can be used to justify the full randomness assumption in some situations in which full randomness is needed for the analysis to go through, like in many applications involving multiple hash functions (e.g., generalized versions of cuckoo hashing with multiple hash functions or larger bucket sizes, load balancing, Bloom filters and variants, or minimal perfect hash function constructions). For practice, efficiency considerations beyond constant factors are important. It is not hard to construct very efficient 2-wise independent classes. Using k-wise independent classes for constant k greater than 3 has become practical only through new constructions involving tabulation. This pairs well with the relatively recent result that linear probing works with 5-independent hash functions. Recent developments suggest that the classification of hash function constructions by their degree of independence alone may not be adequate in some cases. Thus, one may want to analyze the behavior of specific hash classes in specific applications, circumventing the concept of k-wise independence. Several such results were recently achieved concerning hash functions that utilize tabulation. In particular, if the analysis of the application involves using randomness properties in graphs and hypergraphs (generalized cuckoo hashing, also in the version with a "stash", or load balancing), a hash class combining k-wise independence with tabulation has turned out to be very powerful.
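For concreteness, the classic Carter-Wegman style construction referred to above can be sketched as follows: a uniformly random polynomial of degree k-1 over a prime field GF(p) evaluates to k-wise independent values (the final fold into a table of size m introduces a small non-uniformity that is tolerated in practice):

```python
import random

def make_kwise_hash(k, p=(1 << 61) - 1, m=1024, seed=0):
    """A degree-(k-1) polynomial with uniformly random coefficients over
    GF(p) gives a k-wise independent family on keys x with 0 <= x < p."""
    rng = random.Random(seed)
    coeffs = [rng.randrange(p) for _ in range(k)]

    def h(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial mod p
            acc = (acc * x + c) % p
        return acc % m              # fold into the table size
    return h
```

The prime p here is the Mersenne prime 2^61 - 1, a common choice because reduction mod p is cheap; both it and the parameter names are illustrative.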
Thresholds for Extreme Orientability
Multiple-choice load balancing has been a topic of intense study since the
seminal paper of Azar, Broder, Karlin, and Upfal. Questions in this area can be
phrased in terms of orientations of a graph, or more generally a k-uniform
random hypergraph. A (d,b)-orientation is an assignment of each edge to d of
its vertices, such that no vertex has more than b edges assigned to it.
Conditions for the existence of such orientations have been completely
documented except for the "extreme" case of (k-1,1)-orientations. We consider
this remaining case, and establish:
- The density threshold below which an orientation exists with high
probability, and above which it does not exist with high probability.
- An algorithm for finding an orientation that runs in linear time with high
probability, with explicit polynomial bounds on the failure probability.
Previously, the only known algorithms for constructing (k-1,1)-orientations
worked for k <= 3, and were only shown to have expected linear running time.
Comment: Corrected description of relationship to the work of Lelarge
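On tiny instances, the definition of a (d,b)-orientation can be verified by brute force, which makes the object of study concrete (exponential time; for illustration only, not the paper's linear-time algorithm):

```python
from itertools import combinations, product

def has_db_orientation(edges, n, d, b):
    """Exhaustively test whether each k-edge can be assigned to d of its
    vertices so that no vertex receives more than b assigned edges."""
    options = [list(combinations(e, d)) for e in edges]
    for pick in product(*options):      # one d-subset per edge
        load = [0] * n
        for chosen in pick:
            for v in chosen:
                load[v] += 1
        if max(load) <= b:
            return True
    return False
```

For example, one 3-edge admits a (2,1)-orientation, but two copies of the same 3-edge do not: four assignments cannot fit into three vertices of capacity one.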
Load thresholds for cuckoo hashing with overlapping blocks
Dietzfelbinger and Weidling [DW07] proposed a natural variation of cuckoo
hashing where each of ⌊cn⌋ objects is assigned two intervals of size l
in a linear (or cyclic) hash table of size n, and both start points are chosen
independently and uniformly at random. Each object must be placed into a table
cell within its intervals, but each cell can only hold one object. Experiments
suggested that this scheme outperforms the variant with blocks, in which
intervals are aligned at multiples of l. In particular, the load threshold
is higher, i.e., the load that can be achieved with high probability. For
instance, Lehman and Panigrahy [LP09] empirically observed the threshold for
l = 2 to be around 0.965, as compared to roughly 0.897 using blocks.
They managed to pin down the asymptotics of the thresholds for large l,
but the precise values resisted rigorous analysis.
We establish a method to determine these load thresholds for all l >= 2,
and, in fact, for general k. For instance, for l = 2 we
get c* ≈ 0.965. The key tool we employ is an insightful and general
theorem due to Leconte, Lelarge, and Massouli\'e [LLM13], which adapts methods
from statistical physics to the world of hypergraph orientability. In effect,
the orientability thresholds for our graph families are determined by belief
propagation equations for certain graph limits. As a side note we provide
experimental evidence suggesting that placements can be constructed in linear
time with loads close to the threshold using an adapted version of an algorithm
by Khosla [Kho13].
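The two addressing schemes compared above differ only in how an object's candidate cells are derived from a hash value. A minimal sketch, assuming a cyclic table of size n (function names are illustrative):

```python
def window_cells(start, l, n):
    # overlapping scheme: the interval may start anywhere and wraps cyclically
    return [(start + j) % n for j in range(l)]

def block_cells(start, l, n):
    # aligned scheme: the interval is snapped back to a multiple of l
    base = (start // l) * l
    return [(base + j) % n for j in range(l)]
```

Each object draws two independent start points and may be placed in any of the resulting 2l cells; the overlapping windows give smoother choice sets, which the experiments above suggest is what raises the threshold.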
Load thresholds for cuckoo hashing with double hashing
In k-ary cuckoo hashing, each of cn objects is associated with k random buckets in a hash table of size n. An l-orientation is an assignment of objects to associated buckets such that each bucket receives at most l objects. Several works have determined load thresholds c^* = c^*(k,l) for k-ary cuckoo hashing; that is, for c < c^* an l-orientation exists with high probability, and for c > c^* no l-orientation exists with high probability.
A natural variant of k-ary cuckoo hashing utilizes double hashing, where, when the buckets are numbered 0,1,...,n-1, the k choices of random buckets form an arithmetic progression modulo n. Double hashing simplifies implementation and requires less randomness, and it has been shown that double hashing has the same behavior as fully random hashing in several other data structures that similarly use multiple hashes for each object. Interestingly, previous work has come close to but has not fully shown that the load threshold for k-ary cuckoo hashing is the same when using double hashing as when using fully random hashing. Specifically, previous work has shown that the thresholds for both settings coincide, except that for double hashing it was possible that o(n) objects would have been left unplaced. Here we close this open question by showing the thresholds are indeed the same, providing a combinatorial argument that reconciles this stubborn difference.
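Concretely, double hashing derives the k bucket choices from just two hash values, h1 (the start) and h2 (the stride). A minimal sketch:

```python
def double_hash_buckets(h1, h2, k, n):
    """The k bucket choices form an arithmetic progression modulo n.
    If n is prime, 1 <= h2 < n, and k <= n, they are pairwise distinct."""
    return [(h1 + i * h2) % n for i in range(k)]
```

For example, double_hash_buckets(3, 5, 4, 11) yields the buckets 3, 8, 2, 7; only two random values per object are consumed, versus k under fully random hashing.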
The densest subgraph problem in sparse random graphs
We determine the asymptotic behavior of the maximum subgraph density of large
random graphs with a prescribed degree sequence. The result applies in
particular to the Erd\H{o}s-R\'{e}nyi model, where it settles a conjecture of
Hajek [IEEE Trans. Inform. Theory 36 (1990) 1398-1414]. Our proof consists in
extending the notion of balanced loads from finite graphs to their local weak
limits, using unimodularity. This is a new illustration of the objective method
described by Aldous and Steele [In Probability on Discrete Structures (2004)
1-72 Springer].
Comment: Published at http://dx.doi.org/10.1214/14-AAP1091 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)
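The quantity in question, the maximum subgraph density rho(G) = max over nonempty S of |E(S)|/|S|, can be computed by brute force on tiny graphs, which makes the definition concrete (exponential time; for illustration only):

```python
from fractions import Fraction
from itertools import combinations

def max_subgraph_density(vertices, edges):
    # exhaustively maximize |E(S)| / |S| over all nonempty vertex subsets S
    best = Fraction(0)
    for r in range(1, len(vertices) + 1):
        for S in combinations(vertices, r):
            s = set(S)
            inner = sum(1 for u, v in edges if u in s and v in s)
            best = max(best, Fraction(inner, len(s)))
    return best
```

For a K4 with one pendant vertex attached, the densest subgraph is the K4 itself, with density 6/4 = 3/2.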
On the Insertion Time of Cuckoo Hashing
Cuckoo hashing is an efficient technique for creating large hash tables with
high space utilization and guaranteed constant access times. There, each item
can be placed in a location given by any one out of k different hash functions.
In this paper we further investigate the random walk heuristic for
inserting new items into the hash table in an online fashion. Provided
that k > 2 and that the number of items in the table is below (but
arbitrarily close to) the theoretically achievable load threshold, we
show a polylogarithmic bound for the maximum insertion time that holds
with high probability.
Comment: 27 pages, final version accepted by the SIAM Journal on Computing
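The random walk heuristic analyzed here admits a short description: if one of the item's k choices is free, use it; otherwise evict the occupant of a uniformly random choice and continue with the evicted item. A minimal sketch (parameter names and the step cap are illustrative):

```python
import random

def random_walk_insert(ball, choices, occupant, rng, max_steps=1000):
    """Online insertion by random walk: place in a free choice if any,
    otherwise evict a random occupant and reinsert it, up to max_steps."""
    cur = ball
    for _ in range(max_steps):
        free = [b for b in choices[cur] if b not in occupant]
        if free:
            occupant[rng.choice(free)] = cur
            return True
        b = rng.choice(choices[cur])
        cur, occupant[b] = occupant[b], cur  # evict and continue with the victim
    return False
```

The step cap guards against the (rare, below the threshold) event of a very long eviction walk; the paper's result bounds the maximum insertion time polylogarithmically with high probability.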