416 research outputs found
Tight Lower Bounds for Greedy Routing in Higher-Dimensional Small-World Grids
We consider Kleinberg's celebrated small world graph model (Kleinberg, 2000),
in which a D-dimensional grid {0,...,n-1}^D is augmented with a constant number
of additional unidirectional edges leaving each node. These long range edges
are determined at random according to a probability distribution (the
augmenting distribution), which is the same for each node. Kleinberg suggested
using the inverse D-th power distribution, in which node v is the long range
contact of node u with a probability proportional to ||u-v||^(-D). He showed
that such an augmenting distribution makes it possible to route messages
efficiently in the resulting random graph: the greedy algorithm, in which each
intermediate node forwards the message over the link that brings it closest to
the target with respect to the Manhattan distance, finds a path of expected length O(log^2
n) between any two nodes. In this paper we prove that greedy routing does not
perform asymptotically better for any uniform and isotropic augmenting
distribution, i.e., the probability that node u has a particular long range
contact v is independent of the labels of u and v and only a function of
||u-v||.
In order to obtain the result, we introduce a novel proof technique: We
define a budget game, in which a token travels over a game board, while the
player manages a "probability budget". In each round, the player bets part of
her remaining probability budget on step sizes. A step size is chosen at random
according to a probability distribution determined by the player's bet. The token then
makes progress as determined by the chosen step size, while some of the
player's bet is removed from her probability budget. We prove a tight lower
bound for such a budget game, and then obtain a lower bound for greedy routing
in the D-dimensional grid by a reduction.
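For concreteness, the following is a minimal sketch (not taken from the paper) of Kleinberg-style greedy routing on a 2D grid with one long-range contact per node drawn from the inverse D-th power distribution; all names and parameters are our own choices.

```python
# A minimal sketch (not from the paper) of Kleinberg-style greedy routing on a
# 2D n x n grid (D = 2), with one long-range contact per node drawn from the
# inverse D-th power distribution. Names and parameters are illustrative.
import random

def manhattan(u, v):
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def long_range_contact(u, n, D=2):
    """Pick v != u with probability proportional to ||u - v||^(-D)."""
    nodes = [(x, y) for x in range(n) for y in range(n) if (x, y) != u]
    weights = [manhattan(u, v) ** (-D) for v in nodes]
    return random.choices(nodes, weights=weights, k=1)[0]

def greedy_route(source, target, n, D=2):
    """Greedy routing: always forward to the neighbor closest to the target.
    Long-range contacts are sampled lazily; since the distance to the target
    strictly decreases, no node is visited twice, so this matches the model."""
    current, hops = source, 0
    while current != target:
        x, y = current
        neighbors = [(x + dx, y + dy)
                     for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= x + dx < n and 0 <= y + dy < n]
        neighbors.append(long_range_contact(current, n, D))
        current = min(neighbors, key=lambda v: manhattan(v, target))
        hops += 1
    return hops

if __name__ == "__main__":
    random.seed(0)
    n = 32
    print(greedy_route((0, 0), (n - 1, n - 1), n))   # typically O(log^2 n) hops
```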
Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications
In this paper we identify a new class of sparse near-quadratic random Boolean matrices that have full row rank over F_2 = {0,1} with high probability and can be transformed into echelon form in almost linear time by a simple version of Gauss elimination. The random matrix with dimensions n(1-epsilon) x n is generated as follows: In each row, identify a block of length L = O((log n)/epsilon) at a random position. The entries outside the block are 0, the entries inside the block are given by fair coin tosses. Sorting the rows according to the positions of the blocks transforms the matrix into a kind of band matrix, on which, as it turns out, Gauss elimination works very efficiently with high probability. For the proof, the effects of Gauss elimination are interpreted as a ("coin-flipping") variant of Robin Hood hashing, whose behaviour can be captured in terms of a simple Markov model from queuing theory. Bounds for expected construction time and high success probability follow from results in this area. They readily extend to larger finite fields in place of F_2.
By employing hashing, this matrix family leads to a new implementation of a retrieval data structure, which represents an arbitrary function f: S -> {0,1} for some set S of m = (1-epsilon)n keys. It requires m/(1-epsilon) bits of space, construction takes O(m/epsilon^2) expected time on a word RAM, while queries take O(1/epsilon) time and access only one contiguous segment of O((log m)/epsilon) bits in the representation (O(1/epsilon) consecutive words on a word RAM). The method is readily implemented and highly practical, and it is competitive with state-of-the-art methods. In a more theoretical variant, which works only for unrealistically large S, we can even achieve construction time O(m/epsilon) and query time O(1), accessing O(1) contiguous memory words for a query. By well-established methods the retrieval data structure leads to efficient constructions of (static) perfect hash functions and (static) Bloom filters with almost optimal space and very local storage access patterns for queries.
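To make the linear-algebra view concrete, here is a minimal sketch (not the paper's implementation) of generating rows with one short random block each, sorting them by block position, and running Gauss elimination over F_2 with rows stored as machine integers; all names and parameters are illustrative.

```python
# A minimal sketch (not the paper's implementation) of the matrix family and of
# Gauss elimination over F_2: each row has a short random block at a random
# position, rows are sorted by block position (a band-like matrix), and rows are
# stored as Python integers used as bit vectors. Names and parameters are ours.
import random

def random_block_rows(num_rows, num_cols, block_len, rng):
    """Each row: zeros except a block of block_len fair coin flips at a random offset."""
    rows = []
    for _ in range(num_rows):
        start = rng.randrange(num_cols - block_len + 1)
        block = rng.getrandbits(block_len) | 1            # keep the row nonzero
        rows.append((start, block << start))
    rows.sort(key=lambda r: r[0])                         # band-matrix order
    return [bits for _, bits in rows]

def solve_f2(rows, rhs):
    """Solve rows * z = rhs over F_2; returns z as an int, or None if inconsistent."""
    pivots = {}                                           # pivot column -> (row, rhs bit)
    for row, b in zip(rows, rhs):
        while row:
            col = (row & -row).bit_length() - 1           # lowest set bit
            if col not in pivots:
                pivots[col] = (row, b)
                break
            prow, pb = pivots[col]
            row ^= prow                                   # eliminate that column
            b ^= pb
        else:
            if b:                                         # 0 = 1: no solution
                return None
    z = 0
    for col in sorted(pivots, reverse=True):              # back-substitution
        row, b = pivots[col]
        if (bin(row & z).count("1") + b) % 2:
            z |= 1 << col
    return z

if __name__ == "__main__":
    rng = random.Random(1)
    n, eps = 200, 0.1
    rows = random_block_rows(int((1 - eps) * n), n, block_len=24, rng=rng)
    rhs = [rng.randrange(2) for _ in rows]
    z = solve_f2(rows, rhs)
    ok = z is not None and all(bin(r & z).count("1") % 2 == b for r, b in zip(rows, rhs))
    print("solved:", ok)
```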
A More Reliable Greedy Heuristic for Maximum Matchings in Sparse Random Graphs
We propose a new greedy algorithm for the maximum cardinality matching
problem. We give experimental evidence that this algorithm is likely to find a
maximum matching in random graphs with constant expected degree c>0,
independent of the value of c. This is contrary to the behavior of commonly
used greedy matching heuristics which are known to have some range of c where
they probably fail to compute a maximum matching.
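For context, here is a sketch of one classic greedy matching heuristic of the kind referred to above (a Karp-Sipser-style rule that prefers degree-1 vertices); it is not the new algorithm proposed in the paper, and the graph representation is our own.

```python
# A sketch of one classic greedy matching heuristic of the kind mentioned above
# (a Karp-Sipser-style rule: match a degree-1 vertex if one exists, otherwise a
# random edge). This is NOT the new algorithm of the paper; representation is ours.
import random

def greedy_matching(adj, rng=random):
    """adj: {vertex: set(neighbors)}. Returns a matching as a set of frozenset edges."""
    adj = {v: set(ns) for v, ns in adj.items()}      # work on a copy
    matching = set()
    while any(adj.values()):
        deg1 = [v for v, ns in adj.items() if len(ns) == 1]
        if deg1:
            u = deg1[0]                              # a degree-1 vertex: its edge is safe
            v = next(iter(adj[u]))
        else:
            u = rng.choice([v for v, ns in adj.items() if ns])
            v = rng.choice(sorted(adj[u]))           # otherwise pick a random edge
        matching.add(frozenset((u, v)))
        for w in (u, v):                             # remove both endpoints
            for x in adj.pop(w, set()):
                adj.get(x, set()).discard(w)
    return matching

if __name__ == "__main__":
    g = {1: {2, 3}, 2: {1}, 3: {1, 4}, 4: {3}}
    print(greedy_matching(g))                        # {{1, 2}, {3, 4}} is maximum here
```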
Ex 17,8-16 und Dt 25,17-19 beim Wort genommen
In this article the texts of Ex 17,8-16 and Dt 25,17-19 are compared. Both texts deal with the attack by Amalek upon Israel. Although the description is different in each text, the consequence is the same: to make clear the purpose of destroying the foreign nation and, according to the will of Yahveh, erasing the memory of Amalek. There is, however, a crucial difference in the presentation of the two episodes: while the Exodus text refrains from any moral judgement of the Amalek attack, the Deuteronomist author makes an explicit judgement. It seems that the spontaneous and somewhat natural reaction presented in Ex 17 was not enough for him. Could this moral justification be regarded as historical progress?
On Randomness in Hash Functions
In the talk, we shall discuss quality measures for hash functions used in data structures and algorithms, and survey positive and negative results. (This talk is not about cryptographic hash functions.) For the analysis of algorithms involving hash functions, it is often convenient to assume that the hash functions used behave fully randomly; in some cases there is no known analysis that avoids this assumption. In practice, one needs to get by with weaker hash functions that can be generated by randomized algorithms. A well-studied range of applications concerns realizations of dynamic dictionaries (linear probing, chained hashing, dynamic perfect hashing, cuckoo hashing and its generalizations) or Bloom filters and their variants. A particularly successful and useful means of classification is given by Carter and Wegman's universal or k-wise independent classes, introduced in 1977. A natural and widely used approach to analyzing an algorithm involving hash functions is to show that it works if a sufficiently strong universal class of hash functions is used, and to substitute one of the known constructions of such classes. This invites research into the question of just how much independence in the hash functions is necessary for an algorithm to work. Some recent analyses that gave impossibility results constructed rather artificial classes that would not work; other results pointed out natural, widely used hash classes that would not work in a particular application. Only recently was it shown that, under certain assumptions on the entropy present in the set of keys, even 2-wise independent hash classes lead to strong randomness properties in the hash values. The negative results show that this may not be taken as a justification for using weak hash classes indiscriminately, in particular for key sets with structure. When stronger independence properties are needed for a theoretical analysis, one may resort to classic constructions. Only in 2003 was it discovered how full randomness can be simulated using only linear space overhead (which is optimal). The "split-and-share" approach can be used to justify the full randomness assumption in some situations in which full randomness is needed for the analysis to go through, as in many applications involving multiple hash functions (e.g., generalized versions of cuckoo hashing with multiple hash functions or larger bucket sizes, load balancing, Bloom filters and variants, or minimal perfect hash function constructions). For practice, efficiency considerations beyond constant factors are important. It is not hard to construct very efficient 2-wise independent classes. Using k-wise independent classes for constant k larger than 3 has become feasible in practice only through new constructions involving tabulation. This fits well with the relatively new result that linear probing works with 5-independent hash functions. Recent developments suggest that classifying hash function constructions by their degree of independence alone may not be adequate in some cases. Thus, one may want to analyze the behavior of specific hash classes in specific applications, circumventing the concept of k-wise independence. Several such results were recently achieved concerning hash functions that utilize tabulation.
In particular, if the analysis of the application involves randomness properties of graphs and hypergraphs (generalized cuckoo hashing, also in the version with a "stash", or load balancing), a hash class combining k-wise independence with tabulation has turned out to be very powerful.
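For concreteness, here is a minimal sketch (not from the talk) of two of the hash classes mentioned above: the classic Carter-Wegman multiply-mod-prime class, which is 2-wise independent over keys below the prime (approximately uniform after the final reduction mod m), and simple tabulation hashing; all parameters and names are our own.

```python
# A minimal sketch (not from the talk) of two hash classes discussed above: the
# classic Carter-Wegman multiply-mod-prime class (2-wise independent over keys
# below the prime, approximately uniform after the final mod m) and simple
# tabulation hashing. All parameters and names are our own choices.
import random

PRIME = (1 << 61) - 1                      # Mersenne prime, assumed larger than all keys

def random_2independent(m, rng=random):
    """h(x) = ((a*x + b) mod p) mod m."""
    a = rng.randrange(1, PRIME)
    b = rng.randrange(PRIME)
    return lambda x: ((a * x + b) % PRIME) % m

def random_tabulation(key_bytes=4, m=1 << 20, rng=random):
    """Simple tabulation: one random table per key byte, XOR of the lookups.
    m is a power of two so the XOR of table entries stays in range."""
    tables = [[rng.randrange(m) for _ in range(256)] for _ in range(key_bytes)]
    def h(x):
        out = 0
        for i in range(key_bytes):
            out ^= tables[i][(x >> (8 * i)) & 0xFF]
        return out
    return h

if __name__ == "__main__":
    random.seed(42)
    h1 = random_2independent(1 << 20)
    h2 = random_tabulation()
    print(h1(123456789), h2(123456789))
```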
Towards Optimal Degree-distributions for Left-perfect Matchings in Random Bipartite Graphs
Consider a random bipartite multigraph G with n left nodes and m >= n >= 1
right nodes. Each left node x has d_x >= 1 random right neighbors. The average
left degree \Delta is fixed, \Delta >= 2. We ask whether for the probability
that G has a left-perfect matching it is advantageous not to fix d_x for each
left node x but rather choose it at random according to some (cleverly chosen)
distribution. We show the following, provided that the degrees of the left
nodes are independent: If \Delta is an integer then it is optimal to use a
fixed degree of \Delta for all left nodes. If \Delta is non-integral then an
optimal degree-distribution has the property that each left node x has two
possible degrees, \floor{\Delta} and \ceil{\Delta}, with probability p_x and
1-p_x, respectively, where p_x is from the closed interval [0,1] and the
average over all p_x equals \ceil{\Delta}-\Delta. Furthermore, if c = n/m and
\Delta is constant, then each distribution of the left degrees that meets the
conditions above determines the same threshold c^*(\Delta) that has the
following property as n goes to infinity: If c < c^*(\Delta) then there exists
a left-perfect matching with high probability. If c > c^*(\Delta) then there
exists no left-perfect matching with high probability. The threshold
c^*(\Delta) is the same as the known threshold for offline \Delta-ary cuckoo
hashing for integral or non-integral \Delta.
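A small illustration (ours, not from the paper) of the two-point degree distribution described above, using the same probability for every left node, which is one admissible choice:

```python
# A small illustration (ours, not from the paper) of the two-point degree
# distribution described above: a left node gets degree floor(Delta) with
# probability ceil(Delta) - Delta and degree ceil(Delta) otherwise, so the
# expected degree is exactly Delta.
import math
import random

def sample_degree(delta, rng=random):
    lo, hi = math.floor(delta), math.ceil(delta)
    p = hi - delta                       # probability of the lower degree
    return lo if rng.random() < p else hi

if __name__ == "__main__":
    random.seed(0)
    delta = 3.3
    degrees = [sample_degree(delta) for _ in range(100000)]
    print(sum(degrees) / len(degrees))   # close to 3.3
```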
Orientability thresholds for random hypergraphs
Let h > w > 0 be two fixed integers. Let H be a random hypergraph whose
hyperedges are all of cardinality h. To w-orient a hyperedge, we assign exactly
w of its vertices positive signs with respect to the hyperedge, and the rest
negative. A (w,k)-orientation of H consists of a w-orientation of all
hyperedges of H, such that each vertex receives at most k positive signs from
its incident hyperedges. When k is large enough, we determine the threshold of
the existence of a (w,k)-orientation of a random hypergraph. The
(w,k)-orientation of hypergraphs is strongly related to a general version of
the off-line load balancing problem. The graph case, when h = 2 and w = 1, was
solved recently by Cain, Sanders and Wormald and independently by Fernholz and
Ramachandran, which settled a conjecture of Karp
and Saks.
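The (w,k)-orientation condition itself is easy to state in code; the following checker is our own illustration of the definition above, not part of the paper.

```python
# A minimal sketch (ours) of the (w,k)-orientation condition defined above: every
# hyperedge designates exactly w of its own vertices as positive, and no vertex
# is positive in more than k of its incident hyperedges.
def is_wk_orientation(hyperedges, positives_per_edge, w, k):
    """hyperedges: list of vertex tuples; positives_per_edge: list of vertex sets."""
    load = {}
    for edge, positives in zip(hyperedges, positives_per_edge):
        if len(positives) != w or not positives <= set(edge):
            return False                              # must pick exactly w vertices of the edge
        for v in positives:
            load[v] = load.get(v, 0) + 1
    return all(c <= k for c in load.values())         # at most k positive signs per vertex

if __name__ == "__main__":
    edges = [(1, 2, 3), (2, 3, 4), (1, 3, 4)]         # h = 3
    orientation = [{1, 2}, {3, 4}, {1, 4}]            # w = 2
    print(is_wk_orientation(edges, orientation, w=2, k=2))   # True
```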
Succinct Data Structures for Retrieval and Approximate Membership
The retrieval problem is the problem of associating data with keys in a set.
Formally, the data structure must store a function f: U -> {0,1}^r that has
specified values on the elements of a given set S, a subset of U, |S|=n, but
may have any value on elements outside S. Minimal perfect hashing makes it
possible to avoid storing the set S, but this induces a space overhead of
Theta(n) bits in addition to the nr bits needed for function values. In this
paper we show how to eliminate this overhead. Moreover, we show that for any k,
query time O(k) can be achieved using space that is within a factor 1+e^{-k} of
optimal, asymptotically for large n. If we allow logarithmic evaluation time,
the additive overhead can be reduced to O(log log n) bits whp. The time to
construct the data structure is O(n), expected. A main technical ingredient is
to utilize existing tight bounds on the probability that almost square random
matrices with rows of low weight have full row rank. In addition to direct
constructions, we point out a close connection between retrieval structures and
hash tables where keys are stored in an array and some kind of probing scheme
is used. Further, we propose a general reduction that transfers the results on
retrieval into analogous results on approximate membership, a problem
traditionally addressed using Bloom filters. Again, we show how to eliminate
the space overhead present in previously known methods, and get arbitrarily
close to the lower bound. The evaluation procedures of our data structures are
extremely simple (similar to a Bloom filter). For the results stated above we
assume free access to fully random hash functions. However, we show how to
justify this assumption using extra space o(n) to simulate full randomness on a
RAM.
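The reduction from retrieval to approximate membership mentioned above can be sketched as follows (our illustration, not the paper's construction): store an r-bit fingerprint g(x) for every key x in S in a retrieval structure, and report a query y as present iff the retrieved value equals g(y); this gives no false negatives and a false-positive rate of about 2^-r.

```python
# A minimal sketch (not the paper's construction) of the reduction from approximate
# membership to retrieval: store the r-bit fingerprint g(x) of every key x in S in a
# retrieval structure for f(x) = g(x); report y as "possibly in S" iff the retrieved
# value equals g(y). Outside S a retrieval structure returns an essentially arbitrary
# r-bit value, so the false-positive rate is about 2^-r. The retrieval structure is
# abstracted as a callable here; names and parameters are illustrative.
import random

R = 8                                                  # fingerprint bits r
SEED = 12345

def fingerprint(x):
    """Stand-in for a random hash g: U -> {0,1}^R."""
    return hash((SEED, x)) & ((1 << R) - 1)

def build_membership_filter(keys, build_retrieval):
    """build_retrieval: maps {key: value} to a callable retrieve(key) -> value."""
    retrieve = build_retrieval({x: fingerprint(x) for x in keys})
    return lambda y: retrieve(y) == fingerprint(y)     # no false negatives on S

if __name__ == "__main__":
    # Toy retrieval "structure": a dict with random answers outside S. A real retrieval
    # data structure would not store the keys themselves.
    def toy_retrieval(mapping):
        return lambda y: mapping.get(y, random.getrandbits(R))

    S = {"apple", "pear", "plum"}
    contains = build_membership_filter(S, toy_retrieval)
    print(all(contains(x) for x in S))                            # True: no false negatives
    print(sum(contains(f"other{i}") for i in range(10000)) / 10000)  # about 2^-R = 1/256
```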
Tight Thresholds for Cuckoo Hashing via XORSAT
We settle the question of tight thresholds for offline cuckoo hashing. The
problem can be stated as follows: we have n keys to be hashed into m buckets
each capable of holding a single key. Each key has k >= 3 (distinct) associated
buckets chosen uniformly at random and independently of the choices of other
keys. A hash table can be constructed successfully if each key can be placed
into one of its buckets. We seek thresholds alpha_k such that, as n goes to
infinity, if n/m <= alpha for some alpha < alpha_k then a hash table can be
constructed successfully with high probability, and if n/m >= alpha for some
alpha > alpha_k a hash table cannot be constructed successfully with high
probability. Here we are considering the offline version of the problem, where
all keys and hash values are given, so the problem is equivalent to previous
models of multiple-choice hashing. We find the thresholds for all values of k >
2 by showing that they are in fact the same as the previously known thresholds
for the random k-XORSAT problem. We then extend these results to the setting
where keys can have differing numbers of choices, and provide evidence in
form of an algorithm for a conjecture extending this result to cuckoo hash
tables that store multiple keys in a bucket.
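As an illustration of the setting (ours; the paper analyzes the offline threshold, not this procedure), here is a sketch of cuckoo hashing with k bucket choices per key and random-walk eviction; bucket choices are simulated with a seeded PRNG, and all names are assumptions.

```python
# An illustrative sketch (ours; the paper analyzes the offline threshold, not this
# procedure) of cuckoo hashing with k >= 3 bucket choices per key and random-walk
# eviction. Bucket choices are simulated with a seeded PRNG; names are assumptions.
import random

def bucket_choices(key, m, k, seed=0):
    rng = random.Random(hash((seed, key)))        # deterministic k distinct buckets per key
    return rng.sample(range(m), k)

def cuckoo_insert_all(keys, m, k=3, max_kicks=500, seed=0):
    table = [None] * m
    rng = random.Random(seed)
    for key in keys:
        cur = key
        for _ in range(max_kicks):
            choices = bucket_choices(cur, m, k, seed)
            free = [b for b in choices if table[b] is None]
            if free:
                table[rng.choice(free)] = cur
                break
            b = rng.choice(choices)               # evict a random occupant and reinsert it
            cur, table[b] = table[b], cur
        else:
            return None                           # gave up; load is probably above threshold
    return table

if __name__ == "__main__":
    keys = list(range(850))
    # Load n/m = 0.85 is below the k = 3 threshold (about 0.918), so this usually succeeds.
    print(cuckoo_insert_all(keys, m=1000) is not None)
```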