Tight Lower Bounds for Greedy Routing in Higher-Dimensional Small-World Grids
We consider Kleinberg's celebrated small world graph model (Kleinberg, 2000),
in which a D-dimensional grid {0,...,n-1}^D is augmented with a constant number
of additional unidirectional edges leaving each node. These long range edges
are determined at random according to a probability distribution (the
augmenting distribution), which is the same for each node. Kleinberg suggested
using the inverse D-th power distribution, in which node v is the long range
contact of node u with a probability proportional to ||u-v||^(-D). He showed
that such an augmenting distribution allows a message to be routed efficiently
in the resulting random graph: The greedy algorithm, in which each intermediate
node forwards the message over a link that brings it closest to the target
w.r.t. the Manhattan distance, finds a path of expected length O(log^2 n)
between any two nodes. In this paper we prove that greedy routing does not
perform asymptotically better for any uniform and isotropic augmenting
distribution, i.e., the probability that node u has a particular long range
contact v is independent of the labels of u and v and only a function of
||u-v||.
In order to obtain the result, we introduce a novel proof technique: We
define a budget game, in which a token travels over a game board, while the
player manages a "probability budget". In each round, the player bets part of
her remaining probability budget on step sizes. A step size is then chosen at
random according to the probability distribution given by the player's bet. The
token then
makes progress as determined by the chosen step size, while some of the
player's bet is removed from her probability budget. We prove a tight lower
bound for such a budget game, and then obtain a lower bound for greedy routing
in the D-dimensional grid by a reduction.
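To make the routing rule concrete, here is a minimal Python sketch of greedy routing on an augmented grid; the data layout (a dict long_range mapping each node tuple to its long-range contacts) and the brute-force sampler for the inverse D-th power distribution are illustrative choices, suitable only for small grids, not the paper's construction.
```python
import itertools
import random

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def grid_neighbors(u, n):
    """Axis-aligned neighbors of u inside the grid {0,...,n-1}^D."""
    for i, x in enumerate(u):
        for d in (-1, 1):
            if 0 <= x + d < n:
                yield u[:i] + (x + d,) + u[i + 1:]

def sample_long_range_contact(u, nodes, D):
    """Pick one contact v != u with probability proportional to ||u-v||^(-D)."""
    others = [v for v in nodes if v != u]
    weights = [manhattan(u, v) ** (-D) for v in others]
    return random.choices(others, weights=weights, k=1)[0]

def greedy_route(source, target, n, long_range):
    """Forward the message to the neighbor closest to the target (Manhattan)."""
    hops, current = 0, source
    while current != target:
        candidates = list(grid_neighbors(current, n)) + list(long_range.get(current, []))
        # A grid neighbor always strictly reduces the distance, so this terminates.
        current = min(candidates, key=lambda v: manhattan(v, target))
        hops += 1
    return hops

# Tiny example: a 2-dimensional 16 x 16 grid with one long-range edge per node.
n, D = 16, 2
nodes = list(itertools.product(range(n), repeat=D))
contacts = {u: [sample_long_range_contact(u, nodes, D)] for u in nodes}
print(greedy_route((0, 0), (n - 1, n - 1), n, contacts))
```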
Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications
In this paper we identify a new class of sparse near-quadratic random Boolean matrices that have full row rank over F_2 = {0,1} with high probability and can be transformed into echelon form in almost linear time by a simple version of Gauss elimination. The random matrix with dimensions n(1-epsilon) x n is generated as follows: In each row, identify a block of length L = O((log n)/epsilon) at a random position. The entries outside the block are 0, the entries inside the block are given by fair coin tosses. Sorting the rows according to the positions of the blocks transforms the matrix into a kind of band matrix, on which, as it turns out, Gauss elimination works very efficiently with high probability. For the proof, the effects of Gauss elimination are interpreted as a ("coin-flipping") variant of Robin Hood hashing, whose behaviour can be captured in terms of a simple Markov model from queuing theory. Bounds for expected construction time and high success probability follow from results in this area. They readily extend to larger finite fields in place of F_2.
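As a rough illustration (not the paper's implementation), the following Python sketch generates rows with one random block of length L, sorts them by block position, and runs the resulting band-style elimination over F_2, with each row stored as an integer bit vector; all names and parameter choices are illustrative.
```python
import random

def random_block_row(n, L):
    """A length-n row over F_2 that is zero outside one random block of length L."""
    start = random.randrange(n - L + 1)
    block = random.getrandbits(L)          # fair coin tosses inside the block
    return block << start                  # row encoded as a Python int

def echelonize(rows):
    """Bring the rows into echelon form over F_2; returns column -> pivot row."""
    # Sorting by the leftmost 1-bit approximates sorting by block position,
    # which turns the matrix into a band matrix and keeps elimination local.
    rows = sorted((r for r in rows if r), key=lambda r: (r & -r).bit_length())
    pivots = {}
    for row in rows:
        while row:
            col = (row & -row).bit_length() - 1   # leading column of this row
            if col in pivots:
                row ^= pivots[col]                # eliminate within the band
            else:
                pivots[col] = row
                break
    return pivots

# Full row rank holds iff no row was eliminated to zero.
rows = [random_block_row(1000, 64) for _ in range(800)]
print(len(echelonize(rows)) == 800)   # typically True for a generous block length
```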
By employing hashing, this matrix family leads to a new implementation of a retrieval data structure, which represents an arbitrary function f: S -> {0,1} for some set S of m = (1-epsilon)n keys. It requires m/(1-epsilon) bits of space, construction takes O(m/epsilon^2) expected time on a word RAM, while queries take O(1/epsilon) time and access only one contiguous segment of O((log m)/epsilon) bits in the representation (O(1/epsilon) consecutive words on a word RAM). The method is readily implemented and highly practical, and it is competitive with state-of-the-art methods. In a more theoretical variant, which works only for unrealistically large S, we can even achieve construction time O(m/epsilon) and query time O(1), accessing O(1) contiguous memory words for a query. By well-established methods the retrieval data structure leads to efficient constructions of (static) perfect hash functions and (static) Bloom filters with almost optimal space and very local storage access patterns for queries
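A hedged sketch of the corresponding retrieval query, assuming the construction has already built one row per key, set the right-hand side to f(key), solved the system (e.g. with the band elimination sketched above), and stored the solution as one bit vector solution_bits; the hash function and the names key_row, query are illustrative, not the paper's.
```python
import hashlib

def key_row(key, n, L):
    """Map a key to (block start, L-bit pattern), i.e. to one row of the system."""
    h = int.from_bytes(hashlib.blake2b(key.encode(), digest_size=16).digest(), "big")
    start = h % (n - L + 1)
    pattern = (h >> 64) & ((1 << L) - 1)
    return start, pattern

def query(key, solution_bits, n, L):
    """f(key) = <row(key), solution> over F_2; touches one contiguous bit segment."""
    start, pattern = key_row(key, n, L)
    window = (solution_bits >> start) & ((1 << L) - 1)
    return bin(pattern & window).count("1") & 1
```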
A More Reliable Greedy Heuristic for Maximum Matchings in Sparse Random Graphs
We propose a new greedy algorithm for the maximum cardinality matching
problem. We give experimental evidence that this algorithm is likely to find a
maximum matching in random graphs with constant expected degree c>0,
independent of the value of c. This is contrary to the behavior of commonly
used greedy matching heuristics, which are known to have some range of c where
they probably fail to compute a maximum matching.
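The proposed algorithm is not specified in the abstract; for context, the sketch below shows a standard Karp-Sipser-style greedy heuristic of the kind such experiments are typically compared against (vertices of degree 1 are matched first, otherwise a random edge is taken). It is a baseline illustration, not the paper's method.
```python
import random

def greedy_matching(adj):
    """adj: dict vertex -> set of neighbors (undirected). Returns matched pairs."""
    adj = {v: set(ns) for v, ns in adj.items()}    # work on a copy
    matching = {}
    while True:
        live = [v for v, ns in adj.items() if ns]
        if not live:
            break
        # Karp-Sipser rule: matching the unique edge of a degree-1 vertex is safe.
        deg1 = [v for v in live if len(adj[v]) == 1]
        u = random.choice(deg1) if deg1 else random.choice(live)
        v = random.choice(list(adj[u]))
        matching[u], matching[v] = v, u
        for w in (u, v):                           # delete both matched endpoints
            for x in adj.pop(w, set()):
                adj[x].discard(w)
    return matching
```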
Orientability thresholds for random hypergraphs
Let h > w > 0 be two fixed integers. Let H be a random hypergraph whose
hyperedges are all of cardinality h. To w-orient a hyperedge, we assign
exactly w of its vertices positive signs with respect to the hyperedge, and
the rest negative. A (w,k)-orientation of H consists of a w-orientation of all
hyperedges of H, such that each vertex receives at most k positive signs from
its incident hyperedges. When k is large enough, we determine the threshold of
the existence of a (w,k)-orientation of a random hypergraph. The
(w,k)-orientation of hypergraphs is strongly related to a general version of
the off-line load balancing problem. The graph case, when h = 2 and w = 1, was
solved recently by Cain, Sanders and Wormald and independently by Fernholz and
Ramachandran, which settled a conjecture of Karp and Saks.
Comment: 47 pages, 1 figure; the journal version of [16]
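A small Python sketch that just encodes this definition, assuming hyperedges are given as vertex tuples and the chosen positive sides as sets (input format is an illustrative assumption):
```python
from collections import Counter

def is_wk_orientation(hyperedges, positives, w, k):
    """positives[i] is the set of vertices of hyperedges[i] given a positive sign.
    Returns True iff this is a w-orientation in which every vertex gets at most
    k positive signs from its incident hyperedges."""
    counts = Counter()
    for edge, pos in zip(hyperedges, positives):
        if len(pos) != w or not pos <= set(edge):
            return False                  # not a valid w-orientation of this edge
        counts.update(pos)
    return all(c <= k for c in counts.values())
```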
Tight Thresholds for Cuckoo Hashing via XORSAT
We settle the question of tight thresholds for offline cuckoo hashing. The
problem can be stated as follows: we have n keys to be hashed into m buckets
each capable of holding a single key. Each key has k >= 3 (distinct) associated
buckets chosen uniformly at random and independently of the choices of other
keys. A hash table can be constructed successfully if each key can be placed
into one of its buckets. We seek thresholds alpha_k such that, as n goes to
infinity, if n/m <= alpha for some alpha < alpha_k then a hash table can be
constructed successfully with high probability, and if n/m >= alpha for some
alpha > alpha_k a hash table cannot be constructed successfully with high
probability. Here we are considering the offline version of the problem, where
all keys and hash values are given, so the problem is equivalent to previous
models of multiple-choice hashing. We find the thresholds for all values of k >
2 by showing that they are in fact the same as the previously known thresholds
for the random k-XORSAT problem. We then extend these results to the setting
where keys can have differing numbers of choices, and provide evidence in the
form of an algorithm for a conjecture extending this result to cuckoo hash
tables that store multiple keys in a bucket.
Comment: Revision 3 contains missing details of proofs, as an appendix
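Since the offline problem is exactly a bipartite matching question (every key must receive one of its candidate buckets, and each bucket holds one key), feasibility of a concrete instance can be checked with a simple augmenting-path matcher. A minimal sketch, with the hypothetical input format choices[i] = list of candidate buckets of key i:
```python
def can_build_table(choices, m):
    """True iff every key can be placed into one of its candidate buckets."""
    owner = [-1] * m                 # owner[b] = key currently placed in bucket b

    def place(key, visited):
        for b in choices[key]:
            if b in visited:
                continue
            visited.add(b)
            # Use a free bucket, or try to relocate its current owner.
            if owner[b] == -1 or place(owner[b], visited):
                owner[b] = key
                return True
        return False

    return all(place(key, set()) for key in range(len(choices)))
```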
Towards Optimal Degree-distributions for Left-perfect Matchings in Random Bipartite Graphs
Consider a random bipartite multigraph G with n left nodes and m right nodes.
Each left node x has d_x random right neighbors. The average left degree Delta
is fixed. We ask whether for the probability that G has a left-perfect
matching it is advantageous not to fix d_x for each left node x but rather
choose it at random according to some (cleverly chosen) distribution. We show
the following, provided that the degrees of the left nodes are independent: If
Delta is an integer then it is optimal to use a fixed degree of Delta for all
left nodes. If Delta is non-integral then an optimal degree-distribution has
the property that each left node x has two possible degrees, floor(Delta) and
ceil(Delta), with probability p_x and 1 - p_x, respectively, where p_x is from
the closed interval [0,1] and the average over all p_x equals
ceil(Delta) - Delta. Furthermore, if n/m = c and Delta is constant, then each
distribution of the left degrees that meets the conditions above determines
the same threshold c^*(Delta) that has the following property as n goes to
infinity: If c < c^*(Delta) then there exists a left-perfect matching with
high probability. If c > c^*(Delta) then there exists no left-perfect matching
with high probability. The threshold c^*(Delta) is the same as the known
threshold for offline Delta-ary cuckoo hashing for integral or non-integral
Delta.
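A minimal sketch of such a two-point degree distribution, here with the same probability for every left node (the abstract allows the probability to vary per node as long as it averages to ceil(Delta) - Delta); the function name is illustrative.
```python
import math
import random

def sample_left_degrees(num_left, avg_degree):
    """Draw left degrees from {floor(Delta), ceil(Delta)} with mean avg_degree."""
    lo, hi = math.floor(avg_degree), math.ceil(avg_degree)
    if lo == hi:                 # integral Delta: a fixed degree is optimal
        return [lo] * num_left
    p_lo = hi - avg_degree       # Pr[degree = lo], so the expected degree is Delta
    return [lo if random.random() < p_lo else hi for _ in range(num_left)]
```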
How Good Is Multi-Pivot Quicksort?
Multi-Pivot Quicksort refers to variants of classical quicksort where in the
partitioning step k pivots are used to split the input into k+1 segments.
For many years, multi-pivot quicksort was regarded as impractical, but in 2009
a 2-pivot approach by Yaroslavskiy, Bentley, and Bloch was chosen as the
standard sorting algorithm in Sun's Java 7. In 2014 at ALENEX, Kushagra et al.
introduced an even faster algorithm that uses three pivots. This paper studies
what possible advantages multi-pivot quicksort might offer in general. The
contributions are as follows: Natural comparison-optimal algorithms for
multi-pivot quicksort are devised and analyzed. The analysis shows that the
benefits of using multiple pivots with respect to the average comparison count
are marginal and these strategies are inferior to simpler strategies such as
the well known median-of-k approach. A substantial part of the partitioning
cost is caused by rearranging elements. A rigorous analysis of an algorithm for
rearranging elements in the partitioning step is carried out, observing mainly
how often array cells are accessed during partitioning. The algorithm behaves
best if 3 to 5 pivots are used. Experiments show that this translates into good
cache behavior and is closest to predicting observed running times of
multi-pivot quicksort algorithms. Finally, it is studied how choosing pivots
from a sample affects sorting cost. The study is theoretical in the sense that
although the findings motivate design recommendations for multipivot quicksort
algorithms that lead to running time improvements over known algorithms in an
experimental setting, these improvements are small.
Comment: Submitted to a journal, v2: Fixed statement of Gibbs' inequality, v3:
Revised version, especially improving on the experiments in Section
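For illustration, a simplified list-based 2-pivot quicksort (two pivots, three segments); it shows only the partitioning idea, not Yaroslavskiy's in-place scheme or the cache-conscious rearrangement whose cost the paper analyzes.
```python
def dual_pivot_quicksort(a):
    """Sort a list using two pivots and three segments (simplified sketch)."""
    if len(a) <= 1:
        return a
    p, q = min(a[0], a[-1]), max(a[0], a[-1])     # two pivots, p <= q
    rest = a[1:-1]
    left   = [x for x in rest if x < p]           # segment 1: below both pivots
    middle = [x for x in rest if p <= x <= q]     # segment 2: between the pivots
    right  = [x for x in rest if x > q]           # segment 3: above both pivots
    return (dual_pivot_quicksort(left) + [p] +
            dual_pivot_quicksort(middle) + [q] +
            dual_pivot_quicksort(right))

print(dual_pivot_quicksort([5, 1, 4, 1, 5, 9, 2, 6]))
```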
Fast Scalable Construction of (Minimal Perfect Hash) Functions
Recent advances in random linear systems on finite fields have paved the way
for the construction of constant-time data structures representing static
functions and minimal perfect hash functions using less space with respect to
existing techniques. The main obstruction for any practical application of
these results is the cubic-time Gaussian elimination required to solve these
linear systems: even though these systems can be made very small, the
computation is still too slow to be feasible.
In this paper we describe in detail a number of heuristics and programming
techniques to speed up the resolution of these systems by several orders of
magnitude, making the overall construction competitive with the standard and
widely used MWHC technique, which is based on hypergraph peeling. In
particular, we introduce broadword programming techniques for fast equation
manipulation and a lazy Gaussian elimination algorithm. We also describe a
number of technical improvements to the data structure which further reduce
space usage and improve lookup speed.
Our implementation of these techniques yields a minimal perfect hash function
data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based
ones, and a static function data structure which reduces the multiplicative
overhead from 1.23 to 1.03.
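As a minimal sketch of the broadword idea only (the lazy elimination order and the actual data-structure layout of the paper are not reproduced here), an equation over F_2 can be packed into 64-bit words so that adding one equation to another costs one XOR per machine word:
```python
WORD = 64

def pack_equation(coeff_positions, num_vars):
    """Pack the variables with coefficient 1 into a list of 64-bit words."""
    words = [0] * ((num_vars + WORD - 1) // WORD)
    for pos in coeff_positions:
        words[pos // WORD] |= 1 << (pos % WORD)
    return words

def xor_into(target, source):
    """Add `source` to `target` over F_2: one XOR per machine word."""
    for i, w in enumerate(source):
        target[i] ^= w

def leading_variable(words):
    """Index of the first variable with a nonzero coefficient, or -1 if none."""
    for i, w in enumerate(words):
        if w:
            return i * WORD + (w & -w).bit_length() - 1
    return -1
```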
