Search CORE

65 research outputs found

Thresholds for Extreme Orientability

Author: Loh Po-Shen
Pagh Rasmus
Publication venue
Publication date: 01/02/2012
Field of study

Multiple-choice load balancing has been a topic of intense study since the seminal paper of Azar, Broder, Karlin, and Upfal. Questions in this area can be phrased in terms of orientations of a graph, or more generally a k-uniform random hypergraph. A (d,b)-orientation is an assignment of each edge to d of its vertices, such that no vertex has more than b edges assigned to it. Conditions for the existence of such orientations have been completely documented except for the "extreme" case of (k-1,1)-orientations. We consider this remaining case, and establish: - The density threshold below which an orientation exists with high probability, and above which it does not exist with high probability. - An algorithm for finding an orientation that runs in linear time with high probability, with explicit polynomial bounds on the failure probability. Previously, the only known algorithms for constructing (k-1,1)-orientations worked for k<=3, and were only shown to have expected linear running time.Comment: Corrected description of relationship to the work of LeLarg

arXiv.org e-Print Archive

The IT University of Copenhagen's Repository

FigShare

Ramsey games with giants

Author: Alan Frieze
Alan Frieze
Benny Sudakov
Benny Sudakov
Michael Krivelevich
Michael Krivelevich
Po-shen Loh
Po-shen Loh
Ramsey Games Giants
Tom Bohman
Tom Bohman
Publication venue
Publication date: 01/01/2009
Field of study

The classical result in the theory of random graphs, proved by Erdos and Renyi in 1960, concerns the threshold for the appearance of the giant component in the random graph process. We consider a variant of this problem, with a Ramsey flavor. Now, each random edge that arrives in the sequence of rounds must be colored with one of R colors. The goal can be either to create a giant component in every color class, or alternatively, to avoid it in every color. One can analyze the offline or online setting for this problem. In this paper, we consider all these variants and provide nontrivial upper and lower bounds; in certain cases (like online avoidance) the obtained bounds are asymptotically tight.Comment: 29 pages; minor revision

arXiv.org e-Print Archive

CiteSeerX

Random hypergraphs for hashing-based data structures

Author: Walzer Stefan
Publication venue
Publication date: 01/01/2020
Field of study

This thesis concerns dictionaries and related data structures that rely on providing several random possibilities for storing each key. Imagine information on a set S of m = |S| keys should be stored in n memory locations, indexed by [n] = {1,…,n}. Each object x [ELEMENT OF] S is assigned a small set e(x) [SUBSET OF OR EQUAL TO] [n] of locations by a random hash function, independent of other objects. Information on x must then be stored in the locations from e(x) only. It is possible that too many objects compete for the same locations, in particular if the load c = m/n is high. Successfully storing all information may then be impossible. For most distributions of e(x), however, success or failure can be predicted very reliably, since the success probability is close to 1 for loads c less than a certain load threshold c^* and close to 0 for loads greater than this load threshold. We mainly consider two types of data structures: • A cuckoo hash table is a dictionary data structure where each key x [ELEMENT OF] S is stored together with an associated value f(x) in one of the memory locations with an index from e(x). The distribution of e(x) is controlled by the hashing scheme. We analyse three known hashing schemes, and determine their exact load thresholds. The schemes are unaligned blocks, double hashing and a scheme for dynamically growing key sets. • A retrieval data structure also stores a value f(x) for each x [ELEMENT OF] S. This time, the values stored in the memory locations from e(x) must satisfy a linear equation that characterises the value f(x). The resulting data structure is extremely compact, but unusual. It cannot answer questions of the form “is y [ELEMENT OF] S?”. Given a key y it returns a value z. If y [ELEMENT OF] S, then z = f(y) is guaranteed, otherwise z may be an arbitrary value. We consider two new hashing schemes, where the elements of e(x) are contained in one or two contiguous blocks. This yields good access times on a word RAM and high cache efficiency. An important question is whether these types of data structures can be constructed in linear time. The success probability of a natural linear time greedy algorithm exhibits, once again, threshold behaviour with respect to the load c. We identify a hashing scheme that leads to a particularly high threshold value in this regard. In the mathematical model, the memory locations [n] correspond to vertices, and the sets e(x) for x [ELEMENT OF] S correspond to hyperedges. Three properties of the resulting hypergraphs turn out to be important: peelability, solvability and orientability. Therefore, large parts of this thesis examine how hyperedge distribution and load affects the probabilities with which these properties hold and derive corresponding thresholds. Translated back into the world of data structures, we achieve low access times, high memory efficiency and low construction times. We complement and support the theoretical results by experiments.Diese Arbeit behandelt Wörterbücher und verwandte Datenstrukturen, die darauf aufbauen, mehrere zufällige Möglichkeiten zur Speicherung jedes Schlüssels vorzusehen. Man stelle sich vor, Information über eine Menge S von m = |S| Schlüsseln soll in n Speicherplätzen abgelegt werden, die durch [n] = {1,…,n} indiziert sind. Jeder Schlüssel x [ELEMENT OF] S bekommt eine kleine Menge e(x) [SUBSET OF OR EQUAL TO] [n] von Speicherplätzen durch eine zufällige Hashfunktion unabhängig von anderen Schlüsseln zugewiesen. Die Information über x darf nun ausschließlich in den Plätzen aus e(x) untergebracht werden. Es kann hierbei passieren, dass zu viele Schlüssel um dieselben Speicherplätze konkurrieren, insbesondere bei hoher Auslastung c = m/n. Eine erfolgreiche Speicherung der Gesamtinformation ist dann eventuell unmöglich. Für die meisten Verteilungen von e(x) lässt sich Erfolg oder Misserfolg allerdings sehr zuverlässig vorhersagen, da für Auslastung c unterhalb eines gewissen Auslastungsschwellwertes c* die Erfolgswahrscheinlichkeit nahezu 1 ist und für c jenseits dieses Auslastungsschwellwertes nahezu 0 ist. Hauptsächlich werden wir zwei Arten von Datenstrukturen betrachten: • Eine Kuckucks-Hashtabelle ist eine Wörterbuchdatenstruktur, bei der jeder Schlüssel x [ELEMENT OF] S zusammen mit einem assoziierten Wert f(x) in einem der Speicherplätze mit Index aus e(x) gespeichert wird. Die Verteilung von e(x) wird hierbei vom Hashing-Schema festgelegt. Wir analysieren drei bekannte Hashing-Schemata und bestimmen erstmals deren exakte Auslastungsschwellwerte im obigen Sinne. Die Schemata sind unausgerichtete Blöcke, Doppel-Hashing sowie ein Schema für dynamisch wachsenden Schlüsselmengen. • Auch eine Retrieval-Datenstruktur speichert einen Wert f(x) für alle x [ELEMENT OF] S. Diesmal sollen die Werte in den Speicherplätzen aus e(x) eine lineare Gleichung erfüllen, die den Wert f(x) charakterisiert. Die entstehende Datenstruktur ist extrem platzsparend, aber ungewöhnlich: Sie ist ungeeignet um Fragen der Form „ist y [ELEMENT OF] S?“ zu beantworten. Bei Anfrage eines Schlüssels y wird ein Ergebnis z zurückgegeben. Falls y [ELEMENT OF] S ist, so ist z = f(y) garantiert, andernfalls darf z ein beliebiger Wert sein. Wir betrachten zwei neue Hashing-Schemata, bei denen die Elemente von e(x) in einem oder in zwei zusammenhängenden Blöcken liegen. So werden gute Zugriffszeiten auf Word-RAMs und eine hohe Cache-Effizienz erzielt. Eine wichtige Frage ist, ob Datenstrukturen obiger Art in Linearzeit konstruiert werden können. Die Erfolgswahrscheinlichkeit eines naheliegenden Greedy-Algorithmus weist abermals ein Schwellwertverhalten in Bezug auf die Auslastung c auf. Wir identifizieren ein Hashing-Schema, das diesbezüglich einen besonders hohen Schwellwert mit sich bringt. In der mathematischen Modellierung werden die Speicherpositionen [n] als Knoten und die Mengen e(x) für x [ELEMENT OF] S als Hyperkanten aufgefasst. Drei Eigenschaften der entstehenden Hypergraphen stellen sich dann als zentral heraus: Schälbarkeit, Lösbarkeit und Orientierbarkeit. Weite Teile dieser Arbeit beschäftigen sich daher mit den Wahrscheinlichkeiten für das Vorliegen dieser Eigenschaften abhängig von Hashing Schema und Auslastung, sowie mit entsprechenden Schwellwerten. Eine Rückübersetzung der Ergebnisse liefert dann Datenstrukturen mit geringen Anfragezeiten, hoher Speichereffizienz und geringen Konstruktionszeiten. Die theoretischen Überlegungen werden dabei durch experimentelle Ergebnisse ergänzt und gestützt

Digitale Bibliothek Thüringen

Answering Spatial Multiple-Set Intersection Queries Using 2-3 Cuckoo Hash-Filters

Author: Eppstein David
Hagerup Torben
Kopelowitz Tsvi
Yang Xiao
Publication venue
Publication date: 29/08/2017
Field of study

We show how to answer spatial multiple-set intersection queries in O(n(log w)/w + kt) expected time, where n is the total size of the t sets involved in the query, w is the number of bits in a memory word, k is the output size, and c is any fixed constant. This improves the asymptotic performance over previous solutions and is based on an interesting data structure, known as 2-3 cuckoo hash-filters. Our results apply in the word-RAM model (or practical RAM model), which allows for constant-time bit-parallel operations, such as bitwise AND, OR, NOT, and MSB (most-significant 1-bit), as exist in modern CPUs and GPUs. Our solutions apply to any multiple-set intersection queries in spatial data sets that can be reduced to one-dimensional range queries, such as spatial join queries for one-dimensional points or sets of points stored along space-filling curves, which are used in GIS applications.Comment: Full version of paper from 2017 ACM SIGSPATIAL International Conference on Advances in Geographic Information System

arXiv.org e-Print Archive

Crossref

Generation and properties of random graphs and analysis of randomized algorithms

Author: Gao Pu
Publication venue: 'University of Waterloo'
Publication date: 01/01/2009
Field of study

We study a new method of generating random

d

-regular graphs by repeatedly applying an operation called pegging. The pegging algorithm, which applies the pegging operation in each step, is a method of generating large random regular graphs beginning with small ones. We prove that the limiting joint distribution of the numbers of short cycles in the resulting graph is independent Poisson. We use the coupling method to bound the total variation distance between the joint distribution of short cycle counts and its limit and thereby show that

O(\epsilon^{-1})

is an upper bound of the \eps-mixing time. The coupling involves two different, though quite similar, Markov chains that are not time-homogeneous. We also show that the

\epsilon

-mixing time is not

o(\epsilon^{-1})

. This demonstrates that the upper bound is essentially tight. We study also the connectivity of random

d

-regular graphs generated by the pegging algorithm. We show that these graphs are asymptotically almost surely

d

-connected for any even constant

d\ge 4

. The problem of orientation of random hypergraphs is motivated by the classical load balancing problem. Let

h>w>0

be two fixed integers. Let \orH be a hypergraph whose hyperedges are uniformly of size

h

. To {\em

w

-orient} a hyperedge, we assign exactly

w

of its vertices positive signs with respect to this hyperedge, and the rest negative. A

(w,k)

-orientation of \orH consists of a

w

-orientation of all hyperedges of \orH, such that each vertex receives at most

k

positive signs from its incident hyperedges. When

k

is large enough, we determine the threshold of the existence of a

(w,k)

-orientation of a random hypergraph. The

(w,k)

-orientation of hypergraphs is strongly related to a general version of the off-line load balancing problem. The other topic we discuss is computing the probability of induced subgraphs in a random regular graph. Let

0<s<n

and

H

be a graph on

s

vertices. For any

S\subset [n]

with

|S|=s

, we compute the probability that the subgraph of

\mathcal{G}_{n,d}

induced by

S

H

. The result holds for any

d=o(n^{1/3})

and is further extended to

\mathcal{G}_{n,{\bf d}}

, the probability space of random graphs with given degree sequence

\bf d

. This result provides a basic tool for studying properties, for instance the existence or the counts, of certain types of induced subgraphs

CiteSeerX

University of Waterloo's Institutional Repository

Private Set Intersection with Linear Communication from General Assumptions

Author: Brett Hemenway Falk
Daniel Noble
Rafail Ostrovsky
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 01/01/2019
Field of study

This work presents a hashing-based algorithm for Private Set Intersection (PSI) in the honest-but-curious setting. The protocol is generic, modular and provides both asymptotic and concrete efficiency improvements over existing PSI protocols. If each player has

m

elements, our scheme requires only O(m \secpar) communication between the parties, where \secpar is a security parameter. Our protocol builds on the hashing-based PSI protocol of Pinkas et al. (USENIX 2014, USENIX 2015), but we replace one of the sub-protocols (handling the cuckoo ``stash\u27\u27) with a special-purpose PSI protocol that is optimized for comparing sets of unbalanced size. This brings the asymptotic communication complexity of the overall protocol down from \omega(m \secpar) to O(m\secpar), and provides concrete performance improvements (10-15\% reduction in communication costs) over Kolesnikov et al. (CCS 2016) under real-world parameter choices. Our protocol is simple, generic and benefits from the permutation-hashing optimizations of Pinkas et al. (USENIX 2015) and the Batched, Relaxed Oblivious Pseudo Random Functions of Kolesnikov et al. (CCS 2016)

Crossref

Cryptology ePrint Archive