28 research outputs found
Convergence of multivariate belief propagation, with applications to cuckoo hashing and load balancing.
This paper is motivated by two applications, namely i) generalizations of cuckoo hashing, a computationally simple approach to assigning keys to objects, and ii) load balancing in content distribution networks, where one is interested in determining the impact of content replication on performance. These two problems admit a common abstraction: in both scenarios, performance is characterized by the maximum weight of a generalization of a matching in a bipartite graph, featuring node and edge capacities. Our main result is a law of large numbers characterizing the asymptotic maximum weight matching in the limit of large bipartite random graphs, when the graphs admit a local weak limit that is a tree. This result specializes to the two application scenarios, yielding new results in both contexts. In contrast with previous results, the key novelty is the ability to handle edge capacities with arbitrary integer values. An analysis of belief propagation algorithms (BP) with multivariate belief vectors underlies the proof. In particular, we show convergence of the corresponding BP by exploiting monotonicity of the belief vectors with respect to the so-called upshifted likelihood ratio stochastic order. This auxiliary result can be of independent interest, providing a new set of structural conditions which ensure convergence of BP.
Load thresholds for cuckoo hashing with overlapping blocks
Dietzfelbinger and Weidling [DW07] proposed a natural variation of cuckoo
hashing where each of cn objects is assigned k = 2 intervals of size ℓ
in a linear (or cyclic) hash table of size n and both start points are chosen
independently and uniformly at random. Each object must be placed into a table
cell within its intervals, but each cell can only hold one object. Experiments
suggested that this scheme outperforms the variant with blocks in which
intervals are aligned at multiples of ℓ. In particular, the load threshold
is higher, i.e. the load c that can be achieved with high probability. For
instance, Lehman and Panigrahy [LP09] empirically observed the threshold for
to be around as compared to roughly using blocks.
They managed to pin down the asymptotics of the thresholds for large ℓ,
but the precise values resisted rigorous analysis.
We establish a method to determine these load thresholds for all , and, in fact, for general . For instance, for we
get . The key tool we employ is an insightful and general
theorem due to Leconte, Lelarge, and Massoulié [LLM13], which adapts methods
from statistical physics to the world of hypergraph orientability. In effect,
the orientability thresholds for our graph families are determined by belief
propagation equations for certain graph limits. As a side note we provide
experimental evidence suggesting that placements can be constructed in linear
time with loads close to the threshold using an adapted version of an algorithm
by Khosla [Kho13].
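The difference between the two variants described above (unaligned intervals versus aligned blocks) can be illustrated with a minimal sketch. The function names, the parameter names `n` and `size`, and the assumption that `size` divides `n` in the block variant are illustrative choices, not taken from the paper.

```python
import random
from typing import List


def cyclic_interval(start: int, size: int, n: int) -> List[int]:
    """Cells of a cyclic interval of the given size starting at `start`."""
    return [(start + i) % n for i in range(size)]


def overlapping_choice(n: int, size: int, rng: random.Random) -> List[List[int]]:
    """Two intervals whose start points are independent and uniform (unaligned)."""
    return [cyclic_interval(rng.randrange(n), size, n) for _ in range(2)]


def block_choice(n: int, size: int, rng: random.Random) -> List[List[int]]:
    """Block variant: start points are multiples of `size` (assumes size divides n)."""
    blocks = n // size
    return [cyclic_interval(rng.randrange(blocks) * size, size, n) for _ in range(2)]
```

In the unaligned variant the intervals of different objects can overlap partially, which is the feature the abstract credits for the higher load threshold.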
Load thresholds for cuckoo hashing with double hashing
In k-ary cuckoo hashing, each of cn objects is associated with k random buckets in a hash table of size n. An l-orientation is an assignment of objects to associated buckets such that each bucket receives at most l objects. Several works have determined load thresholds c^* = c^*(k,l) for k-ary cuckoo hashing; that is, for c < c^* an l-orientation exists with high probability, and for c > c^* no l-orientation exists with high probability.
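As a concrete illustration of the definition above, the following sketch decides whether an l-orientation exists for a given list of bucket choices, using a standard augmenting-path search for bipartite matching with bucket capacities. It is not the method of the paper; the function name `find_l_orientation` and its argument layout are hypothetical.

```python
from typing import List, Optional, Set


def find_l_orientation(choices: List[List[int]], n: int, l: int) -> Optional[List[int]]:
    """Return a bucket for every object, or None if no l-orientation exists.

    choices[i] lists the buckets (0..n-1) that object i may use;
    each bucket may receive at most l objects.
    """
    load: List[List[int]] = [[] for _ in range(n)]  # objects stored per bucket
    placed = [-1] * len(choices)                    # bucket chosen per object

    def try_place(obj: int, visited: Set[int]) -> bool:
        for b in choices[obj]:
            if b in visited:
                continue
            visited.add(b)
            if len(load[b]) < l:
                # Bucket has a free slot: place the object directly.
                load[b].append(obj)
                placed[obj] = b
                return True
            # Bucket is full: try to move one occupant elsewhere.
            for other in list(load[b]):
                if try_place(other, visited):
                    load[b].remove(other)
                    load[b].append(obj)
                    placed[obj] = b
                    return True
        return False

    for obj in range(len(choices)):
        if not try_place(obj, set()):
            return None
    return placed
```

For example, three objects that all choose between buckets 0 and 1 cannot be placed with l = 1, but can be placed with l = 2.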
A natural variant of k-ary cuckoo hashing utilizes double hashing, where, when the buckets are numbered 0,1,...,n-1, the k choices of random buckets form an arithmetic progression modulo n. Double hashing simplifies implementation and requires less randomness, and it has been shown that double hashing has the same behavior as fully random hashing in several other data structures that similarly use multiple hashes for each object. Interestingly, previous work has come close to but has not fully shown that the load threshold for k-ary cuckoo hashing is the same when using double hashing as when using fully random hashing. Specifically, previous work has shown that the thresholds for both settings coincide, except that for double hashing it was possible that o(n) objects would be left unplaced. Here we close this open question by showing the thresholds are indeed the same, by providing a combinatorial argument that reconciles this stubborn difference.
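The arithmetic-progression structure of double hashing can be sketched as follows. The helper names are hypothetical, and the way the start and step are drawn (uniform start, uniform non-zero step) is an assumption for illustration; the paper may impose further conditions, for instance on n, so that the k buckets are always distinct.

```python
import random
from typing import List


def progression(start: int, step: int, k: int, n: int) -> List[int]:
    """The k buckets start, start + step, ..., start + (k-1)*step, taken modulo n."""
    return [(start + i * step) % n for i in range(k)]


def double_hashing_choices(n: int, k: int, rng: random.Random) -> List[int]:
    """Draw only two random values per object instead of k (assumed sampling scheme)."""
    start = rng.randrange(n)
    step = rng.randrange(1, n)  # non-zero step; distinct buckets need e.g. prime n
    return progression(start, step, k, n)


def fully_random_choices(n: int, k: int, rng: random.Random) -> List[int]:
    """Baseline: k distinct buckets chosen uniformly at random."""
    return rng.sample(range(n), k)
```

Each object needs two random values under double hashing rather than k, which is the reduction in randomness the abstract refers to.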
The Multiple-orientability Thresholds for Random Hypergraphs
A k-uniform hypergraph H = (V, E) is called l-orientable, if there
is an assignment of each edge e ∈ E to one of its vertices v ∈ e such
that no vertex is assigned more than l edges. Let H_{n,m,k} be a
hypergraph, drawn uniformly at random from the set of all k-uniform
hypergraphs with n vertices and m edges. In this paper we establish the
threshold for the l-orientability of H_{n,m,k} for all k ≥ 3 and
l ≥ 2, i.e., we determine a critical quantity c*_{k,l} such that
with probability 1-o(1) the graph H_{n,cn,k} has an l-orientation if
c < c*_{k,l}, but fails to have one if c > c*_{k,l}.
Our result has various applications including sharp load thresholds for
cuckoo hashing, load balancing with guaranteed maximum load, and massive
parallel access to hard disk arrays. Comment: An extended abstract appeared in the proceedings of SODA 201
The densest subgraph problem in sparse random graphs
We determine the asymptotic behavior of the maximum subgraph density of large
random graphs with a prescribed degree sequence. The result applies in
particular to the Erdős-Rényi model, where it settles a conjecture of
Hajek [IEEE Trans. Inform. Theory 36 (1990) 1398-1414]. Our proof consists in
extending the notion of balanced loads from finite graphs to their local weak
limits, using unimodularity. This is a new illustration of the objective method
described by Aldous and Steele [In Probability on Discrete Structures (2004)
1-72 Springer]. Comment: Published at http://dx.doi.org/10.1214/14-AAP1091 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org
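For reference, the quantity studied above can be computed exactly on small graphs. The sketch below uses the standard definition of maximum subgraph density (the largest ratio of induced edges to vertices over all non-empty vertex subsets) and a brute-force search. It is only an illustration of the definition, unrelated to the paper's proof technique, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Tuple


def max_subgraph_density(n: int, edges: List[Tuple[int, int]]) -> Fraction:
    """Maximum over non-empty vertex subsets S of |E(S)| / |S| (exponential time)."""
    best = Fraction(0)
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            # Count the edges with both endpoints inside the subset.
            inside = sum(1 for u, v in edges if u in s and v in s)
            best = max(best, Fraction(inside, size))
    return best
```

The search enumerates all 2^n - 1 subsets, so it is only usable for very small n.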
The Multiple-Orientability Thresholds for Random Hypergraphs
A k-uniform hypergraph H = (V, E) is called l-orientable if there is an assignment of each edge e ∈ E to one of its vertices v ∈ e such that no vertex is assigned more than l edges. Let H_{n,m,k} be a hypergraph, drawn uniformly at random from the set of all k-uniform hypergraphs with n vertices and m edges. In this paper we establish the threshold for the l-orientability of H_{n,m,k} for all k ≥ 3 and l ≥ 2, that is, we determine a critical quantity c*_{k,l} such that with probability 1-o(1) the graph H_{n,cn,k} has an l-orientation if c < c*_{k,l}, but fails to have one if c > c*_{k,l}. Our result has various applications, including sharp load thresholds for cuckoo hashing, load balancing with guaranteed maximum load, and massive parallel access to hard disk arrays.
Topics in random graphs, combinatorial optimization, and statistical inference
The manuscript consists of three chapters presenting three different topics on which I worked with Ph.D. students. Each chapter can be read independently of the others and should be relatively self-contained. Chapter 1 is a gentle introduction to the theory of random graphs with an emphasis on contagions on such networks. In Chapter 2, I explain the main ideas of the objective method developed by Aldous and Steele applied to the spectral measure of random graphs and the monomer-dimer problem. This topic is dear to me and I hope that this chapter will convince the reader that it is an exciting field of research. Chapter 3 deals with problems in high-dimensional statistics which now occupy a large proportion of my time. Unlike Chapters 1 and 2, which could easily be extended into lecture notes, I felt that the material in Chapter 3 was not ready for such a treatment. This field of research is currently very active and I decided to present two of my recent contributions.
Applied Metaheuristic Computing
For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning. This is partly because the classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.