
    Tight Lower Bounds for Greedy Routing in Higher-Dimensional Small-World Grids

    We consider Kleinberg's celebrated small-world graph model (Kleinberg, 2000), in which a D-dimensional grid {0,...,n-1}^D is augmented with a constant number of additional unidirectional edges leaving each node. These long-range edges are determined at random according to a probability distribution (the augmenting distribution), which is the same for each node. Kleinberg suggested using the inverse D-th power distribution, in which node v is the long-range contact of node u with probability proportional to ||u-v||^(-D). He showed that such an augmenting distribution allows messages to be routed efficiently in the resulting random graph: the greedy algorithm, in which each intermediate node forwards the message over the link that brings it closest to the target w.r.t. the Manhattan distance, finds a path of expected length O(log^2 n) between any two nodes. In this paper we prove that greedy routing does not perform asymptotically better for any uniform and isotropic augmenting distribution, i.e., any distribution in which the probability that node u has a particular long-range contact v is independent of the labels of u and v and is only a function of ||u-v||. To obtain the result, we introduce a novel proof technique: we define a budget game, in which a token travels over a game board while the player manages a "probability budget". In each round, the player bets part of her remaining probability budget on step sizes. A step size is chosen at random according to the probability distribution given by the player's bet. The token then makes progress as determined by the chosen step size, while some of the player's bet is removed from her probability budget. We prove a tight lower bound for such a budget game, and then obtain a lower bound for greedy routing in the D-dimensional grid by a reduction to this budget game.
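
    For concreteness, here is a minimal Python sketch of the setup described above: a D-dimensional grid augmented with one long-range contact per node drawn from the inverse D-th power distribution, routed over greedily by Manhattan distance. Enumerating all nodes is only workable for small n and D; this is an illustration of the model, not an efficient implementation.

```python
import random
from itertools import product

def all_nodes(n, D):
    return list(product(range(n), repeat=D))

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def grid_neighbours(u, n):
    # The up-to-2D local grid edges of node u (two per dimension at interior nodes).
    out = []
    for i, c in enumerate(u):
        for d in (-1, 1):
            if 0 <= c + d < n:
                out.append(u[:i] + (c + d,) + u[i + 1:])
    return out

def long_range_contact(u, nodes, D):
    # One unidirectional long-range edge per node, drawn from the inverse
    # D-th power distribution: P(v) proportional to ||u - v||^(-D).
    others = [v for v in nodes if v != u]
    weights = [manhattan(u, v) ** (-D) for v in others]
    return random.choices(others, weights=weights, k=1)[0]

def greedy_route(source, target, n, contacts):
    # Greedy routing: always forward over the edge (local or long-range)
    # that minimises the remaining Manhattan distance to the target.
    u, hops = source, 0
    while u != target:
        candidates = grid_neighbours(u, n) + [contacts[u]]
        u = min(candidates, key=lambda v: manhattan(v, target))
        hops += 1
    return hops

if __name__ == "__main__":
    n, D = 16, 2
    nodes = all_nodes(n, D)
    contacts = {u: long_range_contact(u, nodes, D) for u in nodes}
    print(greedy_route((0, 0), (n - 1, n - 1), n, contacts))
```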

    Efficient Gauss Elimination for Near-Quadratic Matrices with One Short Random Block per Row, with Applications

    In this paper we identify a new class of sparse near-quadratic random Boolean matrices that have full row rank over F_2 = {0,1} with high probability and can be transformed into echelon form in almost linear time by a simple version of Gauss elimination. A random matrix of dimensions (1-epsilon)n x n is generated as follows: in each row, a block of length L = O((log n)/epsilon) is placed at a uniformly random position; the entries outside the block are 0, and the entries inside the block are given by fair coin tosses. Sorting the rows according to the positions of their blocks transforms the matrix into a kind of band matrix, on which, as it turns out, Gauss elimination works very efficiently with high probability. For the proof, the effects of Gauss elimination are interpreted as a ("coin-flipping") variant of Robin Hood hashing, whose behaviour can be captured by a simple Markov model from queuing theory. Bounds for the expected construction time and a high success probability follow from results in this area, and they readily extend to larger finite fields in place of F_2. By employing hashing, this matrix family leads to a new implementation of a retrieval data structure, which represents an arbitrary function f: S -> {0,1} for some set S of m = (1-epsilon)n keys. It requires m/(1-epsilon) bits of space, construction takes O(m/epsilon^2) expected time on a word RAM, and queries take O(1/epsilon) time and access only one contiguous segment of O((log m)/epsilon) bits in the representation (O(1/epsilon) consecutive words on a word RAM). The method is readily implemented, highly practical, and competitive with state-of-the-art methods. In a more theoretical variant, which works only for unrealistically large S, we can even achieve construction time O(m/epsilon) and query time O(1), accessing O(1) contiguous memory words per query. By well-established methods, the retrieval data structure leads to efficient constructions of (static) perfect hash functions and (static) Bloom filters with almost optimal space and very local storage access patterns for queries.
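
    A small Python sketch of the matrix family and the elimination step may help; rows over F_2 are stored as Python integers, and sorting by block position produces the band structure described above. The parameter choices and bit tricks below are illustrative, not the paper's implementation.

```python
import math
import random

def build_block_rows(n, eps, L):
    # (1 - eps) * n rows over F_2, each with one block of L fair coin tosses
    # at a uniformly random position and zeros elsewhere.  A row is stored
    # as a Python int: bit i of the int is column i of the row.
    m = int((1 - eps) * n)
    rows = []
    for _ in range(m):
        start = random.randrange(n - L + 1)
        rows.append((start, random.getrandbits(L) << start))
    rows.sort()                                  # sort by block position -> band matrix
    return [r for _, r in rows]

def eliminate(rows):
    # Plain Gauss elimination over F_2: reduce each row against the pivots
    # found so far (pivot = lowest set bit).  On the sorted band matrix the
    # reduction chains tend to stay short, which is what makes this fast.
    pivot_row = {}                               # pivot column -> reduced row
    for row in rows:
        while row:
            col = (row & -row).bit_length() - 1  # index of the lowest set bit
            if col in pivot_row:
                row ^= pivot_row[col]
            else:
                pivot_row[col] = row
                break
        else:
            return None                          # linearly dependent row
    return pivot_row                             # full row rank: echelon form found

if __name__ == "__main__":
    n, eps = 2000, 0.1
    L = int(math.log2(n) / eps)                  # L = O((log n)/eps), demo choice
    rows = build_block_rows(n, eps, L)
    print("full row rank:", eliminate(rows) is not None)
```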

    A More Reliable Greedy Heuristic for Maximum Matchings in Sparse Random Graphs

    We propose a new greedy algorithm for the maximum cardinality matching problem. We give experimental evidence that this algorithm is likely to find a maximum matching in random graphs with constant expected degree c > 0, independent of the value of c. This is in contrast to the behavior of commonly used greedy matching heuristics, which are known to have some range of c where they likely fail to compute a maximum matching.
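
    The abstract does not spell out the new heuristic; as background, the following sketch shows the simplest kind of greedy matching heuristic that such algorithms are typically compared against (a plain random-order greedy, not the algorithm proposed in the paper).

```python
import random

def random_order_greedy_matching(edges):
    # Baseline greedy heuristic: scan the edges in random order and take an
    # edge whenever both endpoints are still unmatched.  This yields a
    # maximal matching, which need not be a maximum one.
    random.shuffle(edges)
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

print(random_order_greedy_matching([(1, 2), (2, 3), (3, 4)]))
```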

    Orientability thresholds for random hypergraphs

    Let h > w > 0 be two fixed integers. Let H be a random hypergraph whose hyperedges all have cardinality h. To w-orient a hyperedge, we assign exactly w of its vertices positive signs with respect to that hyperedge, and the rest negative. A (w,k)-orientation of H consists of a w-orientation of all hyperedges of H such that each vertex receives at most k positive signs from its incident hyperedges. When k is large enough, we determine the threshold for the existence of a (w,k)-orientation of a random hypergraph. The (w,k)-orientation of hypergraphs is strongly related to a general version of the offline load balancing problem. The graph case, h = 2 and w = 1, was solved recently by Cain, Sanders and Wormald and independently by Fernholz and Ramachandran, which settled a conjecture of Karp and Saks. Comment: 47 pages, 1 figure; journal version of [16].
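
    To make the definition concrete, a candidate (w,k)-orientation can be validated as in the following sketch; this is only a checker for the notion defined above, with hypothetical names, not part of the paper.

```python
from collections import Counter

def is_wk_orientation(hyperedges, positives, w, k):
    # hyperedges[i] is the vertex set of the i-th hyperedge (all of size h);
    # positives[i] is the set of its vertices that receive a positive sign.
    # A (w, k)-orientation assigns exactly w positive signs per hyperedge
    # and lets every vertex collect at most k positive signs overall.
    load = Counter()
    for edge, pos in zip(hyperedges, positives):
        if len(pos) != w or not pos <= set(edge):
            return False
        load.update(pos)
    return all(c <= k for c in load.values())

# The graph case h = 2, w = 1 is ordinary edge orientation with an in-degree bound k.
print(is_wk_orientation([{1, 2}, {2, 3}], [{2}, {2}], w=1, k=1))  # False: vertex 2 gets 2 signs
```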

    Tight Thresholds for Cuckoo Hashing via XORSAT

    We settle the question of tight thresholds for offline cuckoo hashing. The problem can be stated as follows: we have n keys to be hashed into m buckets, each capable of holding a single key. Each key has k >= 3 (distinct) associated buckets chosen uniformly at random and independently of the choices of other keys. A hash table can be constructed successfully if each key can be placed into one of its buckets. We seek thresholds alpha_k such that, as n goes to infinity, if n/m <= alpha for some alpha < alpha_k then a hash table can be constructed successfully with high probability, and if n/m >= alpha for some alpha > alpha_k then a hash table cannot be constructed successfully with high probability. Here we consider the offline version of the problem, where all keys and hash values are given, so the problem is equivalent to previous models of multiple-choice hashing. We find the thresholds for all values of k > 2 by showing that they are in fact the same as the previously known thresholds for the random k-XORSAT problem. We then extend these results to the setting where keys can have differing numbers of choices, and we provide evidence, in the form of an algorithm, for a conjecture extending this result to cuckoo hash tables that store multiple keys in a bucket. Comment: Revision 3 adds previously missing proof details as an appendix.
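
    Since the offline problem asks whether every key can be placed into one of its k buckets, it is exactly a left-perfect matching question, and small instances can be checked directly. The sketch below is an illustration, not the paper's method: it uses a standard augmenting-path matching, and running it at loads slightly below and above roughly 0.918 (the known k = 3 threshold, matching the XORSAT correspondence above) lets one observe the transition empirically.

```python
import random
import sys

def offline_cuckoo_feasible(choices, m):
    # choices[i]: the k buckets available to key i; each bucket holds one key.
    # Offline cuckoo hashing succeeds iff the key/bucket bipartite graph has a
    # matching covering all keys, found here by repeated augmenting-path search.
    key_of_bucket = [None] * m

    def augment(key, visited):
        for b in choices[key]:
            if b not in visited:
                visited.add(b)
                if key_of_bucket[b] is None or augment(key_of_bucket[b], visited):
                    key_of_bucket[b] = key
                    return True
        return False

    return all(augment(key, set()) for key in range(len(choices)))

if __name__ == "__main__":
    sys.setrecursionlimit(10000)
    m, k = 1000, 3
    for load in (0.90, 0.94):            # just below / above the k = 3 threshold
        n = int(load * m)
        choices = [random.sample(range(m), k) for _ in range(n)]
        print(load, offline_cuckoo_feasible(choices, m))
```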

    Towards Optimal Degree-distributions for Left-perfect Matchings in Random Bipartite Graphs

    Consider a random bipartite multigraph G with n left nodes and m >= n >= 2 right nodes. Each left node x has d_x >= 1 random right neighbors. The average left degree Delta is fixed, Delta >= 2. We ask whether, for the probability that G has a left-perfect matching, it is advantageous not to fix d_x for each left node x but rather to choose it at random according to some (cleverly chosen) distribution. We show the following, provided that the degrees of the left nodes are independent: If Delta is an integer, then it is optimal to use a fixed degree of Delta for all left nodes. If Delta is non-integral, then an optimal degree distribution has the property that each left node x has one of two possible degrees, floor(Delta) and ceil(Delta), with probability p_x and 1-p_x, respectively, where p_x is from the closed interval [0,1] and the average over all p_x equals ceil(Delta) - Delta. Furthermore, if n = c*m and Delta > 2 is constant, then each distribution of the left degrees that meets the conditions above determines the same threshold c^*(Delta), which has the following property as n goes to infinity: if c < c^*(Delta) then there exists a left-perfect matching with high probability, and if c > c^*(Delta) then there exists no left-perfect matching with high probability. The threshold c^*(Delta) is the same as the known threshold for offline k-ary cuckoo hashing for integral or non-integral k = Delta.
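
    One member of the optimal family described above (taking all p_x equal to their required average) can be sampled as follows; the function name is illustrative.

```python
import math
import random

def sample_left_degrees(n, delta):
    # Each left node gets degree floor(delta) or ceil(delta).  Choosing the
    # lower degree with probability p = ceil(delta) - delta makes the expected
    # degree exactly delta, the condition stated in the abstract (here all
    # p_x are taken equal to that average value).
    lo, hi = math.floor(delta), math.ceil(delta)
    p = hi - delta                      # 0 when delta is an integer
    return [lo if random.random() < p else hi for _ in range(n)]

print(sum(sample_left_degrees(100000, 2.4)) / 100000)   # close to 2.4
```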

    How Good Is Multi-Pivot Quicksort?

    Multi-pivot quicksort refers to variants of classical quicksort where the partitioning step uses k pivots to split the input into k + 1 segments. For many years, multi-pivot quicksort was regarded as impractical, but in 2009 a 2-pivot approach by Yaroslavskiy, Bentley, and Bloch was chosen as the standard sorting algorithm in Sun's Java 7. In 2014 at ALENEX, Kushagra et al. introduced an even faster algorithm that uses three pivots. This paper studies what possible advantages multi-pivot quicksort might offer in general. The contributions are as follows: Natural comparison-optimal algorithms for multi-pivot quicksort are devised and analyzed. The analysis shows that the benefits of using multiple pivots with respect to the average comparison count are marginal, and that these strategies are inferior to simpler strategies such as the well-known median-of-k approach. A substantial part of the partitioning cost is caused by rearranging elements. A rigorous analysis of an algorithm for rearranging elements in the partitioning step is carried out, observing mainly how often array cells are accessed during partitioning. The algorithm behaves best if 3 to 5 pivots are used. Experiments show that this translates into good cache behavior and comes closest to predicting observed running times of multi-pivot quicksort algorithms. Finally, it is studied how choosing pivots from a sample affects sorting cost. The study is theoretical in the sense that although the findings motivate design recommendations for multi-pivot quicksort algorithms that lead to running time improvements over known algorithms in an experimental setting, these improvements are small. Comment: Submitted to a journal; v2: fixed statement of Gibbs' inequality; v3: revised version, especially improving on the experiments in Section
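
    As a simple illustration of the k-pivot scheme (comparison structure only; the in-place rearrangement and cache behaviour analysed in the paper are not modelled), here is one out-of-place variant.

```python
import bisect
import random

def multi_pivot_quicksort(a, k=3):
    # Pick k pivots, classify every remaining element into one of the k + 1
    # segments the sorted pivots delimit, and recurse on each segment.
    if len(a) <= k:
        return sorted(a)
    pivot_idx = set(random.sample(range(len(a)), k))
    pivots = sorted(a[i] for i in pivot_idx)
    rest = [x for i, x in enumerate(a) if i not in pivot_idx]
    segments = [[] for _ in range(k + 1)]
    for x in rest:
        # bisect_left counts the pivots strictly smaller than x, which is
        # exactly the index of x's segment.
        segments[bisect.bisect_left(pivots, x)].append(x)
    out = []
    for i, seg in enumerate(segments):
        out.extend(multi_pivot_quicksort(seg, k))
        if i < k:
            out.append(pivots[i])
    return out

data = [random.randrange(1000) for _ in range(200)]
assert multi_pivot_quicksort(data) == sorted(data)
```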

    Fast Scalable Construction of (Minimal Perfect Hash) Functions

    Recent advances in random linear systems over finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space than existing techniques. The main obstruction to any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: even though the systems can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques that speed up the solution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative space overhead from 1.23 to 1.03.
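
    The connection between static functions and linear systems over finite fields mentioned above can be sketched as follows. This is a rough illustration under assumptions of our own (plain, non-lazy elimination over F_2, hash positions simulated with Python's random module, an arbitrary space factor of 1.23): it shows the solve-then-XOR query structure, not the paper's broadword or lazy-elimination implementation.

```python
import random

def build_retrieval(keys, values, k=3, c=1.23, seed=0):
    # Static-function retrieval as a random linear system over F_2: each key
    # hashes to k positions of a bit vector z, and we solve for z so that the
    # XOR of z at those positions equals the key's value bit.  (c = 1.23
    # mirrors the peeling-based overhead quoted above; lazy elimination
    # pushes it towards 1.03.)
    m = max(k + 1, int(c * len(keys)) + 1)

    def positions(key):
        rng = random.Random(hash((key, seed)))
        return rng.sample(range(m), k)

    rows = []
    for key in keys:
        mask = 0
        for p in positions(key):
            mask |= 1 << p
        rows.append([mask, values[key] & 1])

    pivot = {}                                   # pivot column -> [mask, rhs]
    for row in rows:
        while row[0]:
            col = row[0].bit_length() - 1        # leading (highest) set bit
            if col in pivot:
                row[0] ^= pivot[col][0]
                row[1] ^= pivot[col][1]
            else:
                pivot[col] = row
                break
        else:
            if row[1]:
                raise ValueError("system unsolvable, retry with another seed")

    z = [0] * m
    for col in sorted(pivot):                    # back-substitute, small pivots first
        mask, rhs = pivot[col]
        bit, rest = rhs, mask & ~(1 << col)
        while rest:
            b = rest.bit_length() - 1
            bit ^= z[b]
            rest &= ~(1 << b)
        z[col] = bit
    return z, positions

def query(z, positions, key):
    # A lookup touches only the key's k positions and XORs their bits.
    bit = 0
    for p in positions(key):
        bit ^= z[p]
    return bit

keys = ["key%d" % i for i in range(100)]
values = {key: random.randrange(2) for key in keys}
z, positions, attempt = None, None, 0
while z is None:
    try:
        z, positions = build_retrieval(keys, values, seed=attempt)
    except ValueError:
        attempt += 1
assert all(query(z, positions, key) == values[key] for key in keys)
```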