64 research outputs found
Tight Load Balancing via Randomized Local Search
We consider the following balls-into-bins process with n bins and m
balls: each ball is equipped with a mutually independent exponential clock of
rate 1. Whenever a ball's clock rings, the ball samples a random bin and moves
there if the number of balls in the sampled bin is smaller than in its current
bin. This simple process models a typical load balancing problem where users
(balls) seek a selfish improvement of their assignment to resources (bins).
From a game theoretic perspective, this is a randomized approach to the
well-known Koutsoupias-Papadimitriou model, while it is known as randomized
local search (RLS) in the load balancing literature. Up to now, the best bound on
the expected time to reach perfect balance was due to Ganesh, Lilienthal, Manjunath, Proutiere, and Simatos
(Load balancing via random local search in closed and open systems, Queueing
Systems, 2012). We improve this bound to one that is asymptotically
tight. Our analysis is based on the crucial observation
that performing "destructive moves" (reversals of RLS moves) cannot decrease
the balancing time. This allows us to simplify problem instances and to ignore
"inconvenient moves" in the analysis.
Comment: 24 pages, 3 figures, preliminary version appeared in proceedings of
the 2017 IEEE International Parallel and Distributed Processing Symposium
(IPDPS'17).
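The RLS process described above is easy to simulate: since every ball carries an independent rate-1 exponential clock, the embedded jump chain simply picks a uniformly random ball at each ring. A minimal sketch (the function name `rls_balance` and the ring cap are illustrative, not from the paper):

```python
import random

def rls_balance(loads, rng=None, max_rings=100_000):
    """Simulate randomized local search (RLS) on a load vector.

    Exponential rate-1 clocks mean the jump chain picks a uniformly
    random ball each step.  The ball samples a uniform bin and moves
    there only if that bin is strictly less loaded (a selfish
    improvement).  Returns the number of rings until the load vector
    is perfectly balanced (max load - min load <= 1).
    """
    rng = rng or random.Random(0)
    n = len(loads)
    loads = list(loads)
    # ball index -> current bin (one entry per ball)
    balls = [b for b, load in enumerate(loads) for _ in range(load)]
    for t in range(max_rings):
        if max(loads) - min(loads) <= 1:
            return t
        i = rng.randrange(len(balls))      # whose clock rings
        cur = balls[i]
        dst = rng.randrange(n)             # sample a uniform bin
        if loads[dst] < loads[cur]:        # move only if it improves
            loads[cur] -= 1
            loads[dst] += 1
            balls[i] = dst
    return max_rings
```

For example, `rls_balance([8, 0, 0, 0])` counts the rings until the four bins hold two balls each, while an already balanced vector needs zero rings.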
Self-stabilizing Balls & Bins in Batches: The Power of Leaky Bins
A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modelled as static balls-into-bins processes, where m balls (tasks) are to be distributed to n bins (servers). In a seminal work, [Azar et al.; JoC'99] proposed the sequential strategy Greedy[d] for n = m. When thrown, a ball queries the load of d random bins and is allocated to a least loaded of these. [Azar et al.; JoC'99] showed that d = 2 yields an exponential improvement compared to d = 1. [Berenbrink et al.; JoC'06] extended this to m ≥ n, showing that the maximal load difference is independent of m for d = 2 (in contrast to d = 1). We propose a new variant of an infinite balls-into-bins process. In each round an expected number of λn new balls arrive and are distributed (in parallel) to the bins, and each non-empty bin deletes one of its balls. This setting models a set of servers processing incoming requests, where clients can query a server's current load but receive no information about parallel requests. We study the Greedy[d] distribution scheme in this setting and show a strong self-stabilizing property: for any arrival rate λ = λ(n) < 1, the system load is time-invariant. Moreover, for any (even super-exponential) round t, the maximum system load is (w.h.p.) O((1/(1−λ)) · log(n/(1−λ))) for d = 1 and O(log(n/(1−λ))) for d = 2. In particular, Greedy[2] has an exponentially smaller system load for high arrival rates.
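One round of this batched process can be sketched as follows. The abstract only fixes the expected number of arrivals at λn; modelling arrivals as Poisson(λn) and the helper names (`poisson`, `batched_round`) are assumptions for illustration. The key modelling point is that all arrivals in a round query the same start-of-round snapshot, since parallel requests learn nothing about each other:

```python
import math
import random

def poisson(rng, mean):
    """Sample Poisson(mean) via Knuth's method (fine for moderate means)."""
    threshold = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def batched_round(loads, lam, d, rng):
    """One round of the batched process: ~lam*n new balls arrive, each
    queries the load of d random bins *as seen at the start of the
    round* and joins a least loaded sampled bin; then every non-empty
    bin deletes one ball."""
    n = len(loads)
    snapshot = list(loads)                 # loads visible to clients
    for _ in range(poisson(rng, lam * n)):
        choices = [rng.randrange(n) for _ in range(d)]
        best = min(choices, key=lambda b: snapshot[b])
        loads[best] += 1
    for b in range(n):                     # each non-empty bin serves one ball
        if loads[b] > 0:
            loads[b] -= 1
    return loads
```

Iterating `batched_round` with d = 1 versus d = 2 at a high arrival rate such as λ = 0.9 makes the exponential gap in maximum system load visible empirically.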
An Improved Drift Theorem for Balanced Allocations
In the balanced allocations framework, there are m jobs (balls) to be
allocated to n servers (bins). The goal is to minimize the gap, the
difference between the maximum load and the average load.
Peres, Talwar and Wieder (RSA 2015) used the hyperbolic cosine potential
function to analyze a large family of allocation processes including the
(1+β)-process and graphical balanced allocations. The key ingredient was
to prove that the potential drops in every step, i.e., a drift inequality.
In this work we improve the drift inequality so that (i) it is asymptotically
tighter, (ii) it assumes weaker preconditions, (iii) it applies not only to
processes allocating to a single bin but also to processes allocating to more
than one bin in a single step, and (iv) to processes allocating a varying
number of balls depending on the sampled bin.
Our applications include the processes of (RSA 2015), but also several new
processes, and we believe that our techniques may lead to further results in
future work.
Comment: This paper refines and extends the content on the drift theorem and
applications in arXiv:2203.13902. It consists of 38 pages, 7 figures, 1 table.
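To make the role of the hyperbolic cosine potential concrete: it is Γ = Σᵢ cosh(α(xᵢ − t/n)), a sum that penalizes both overloaded and underloaded bins. A drift inequality of the form E[ΔΓ] ≤ −cΓ + c′ keeps E[Γ] = O(n), and any bound on Γ immediately bounds the gap. A small sketch (function names and the choice α = 0.5 are illustrative, not from the paper):

```python
import math

def hyperbolic_cosine_potential(loads, alpha=0.5):
    """Gamma = sum_i cosh(alpha * (x_i - average load)).

    Overload and underload contribute symmetrically, which is what
    lets one potential control the gap from both sides."""
    avg = sum(loads) / len(loads)
    return sum(math.cosh(alpha * (x - avg)) for x in loads)

def gap_bound_from_potential(loads, alpha=0.5):
    """Since cosh(y) >= exp(|y|) / 2, each bin satisfies
    exp(alpha * |x_i - avg|) / 2 <= Gamma, hence
    |x_i - avg| <= log(2 * Gamma) / alpha for every bin i."""
    gamma = hyperbolic_cosine_potential(loads, alpha)
    return math.log(2 * gamma) / alpha
```

So a step-wise drift bound on Γ translates, via the second function, into a gap bound of order (1/α)·log n when E[Γ] = O(n).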
Communication Patterns for Randomized Algorithms
Examples of large-scale networks include the Internet, peer-to-peer networks, parallel computing systems, cloud computing systems, sensor networks, and social networks. Efficient dissemination of information in large networks such as these is a fundamental problem. In many scenarios the gathering of information by a centralised controller can be impractical. When designing and analysing distributed algorithms we must consider the limitations imposed by the heterogeneity of devices in the networks. Devices may have limited computational ability or space. This makes randomised algorithms attractive solutions: they can often be simpler and easier to implement than their deterministic counterparts. This thesis analyses the effect of communication patterns on the performance of distributed randomised algorithms. We study randomised algorithms with application to three different areas.
Firstly, we study a generalization of the balls-into-bins game. Balls-into-bins games have been used to analyse randomised load balancing. Under the Greedy[d] allocation scheme each ball queries the load of d random bins and is then allocated to the least loaded of them. We consider an infinite, parallel setting where an expected number of λn balls are allocated in parallel according to the Greedy[d] allocation scheme into n bins, and subsequently each non-empty bin removes a ball. Our results show that for d = 1, 2, the Greedy[d] allocation scheme is self-stabilizing, and that in any round the maximum system load for high arrival rates is exponentially smaller for d = 2 compared to d = 1 (w.h.p.).
Secondly, we introduce protocols that solve the plurality consensus problem on arbitrary graphs for arbitrarily small bias. Typically, protocols depend heavily on the employed communication mechanism. Our protocols are based on an interesting relationship between plurality consensus and distributed load balancing. This relationship allows us to design protocols that are both time- and space-efficient and generalize the state of the art for a large range of problem parameters.
Finally, we investigate the effect of restricting the communication of the classical PULL algorithm for randomised rumour spreading. Rumour spreading (broadcast) is a fundamental task in distributed computing. Under the classical PULL algorithm, a node with the rumour that receives multiple requests is able to respond to all of them in a given round. Our model restricts nodes such that they can respond to at most one request per round. Our results show that the restricted PULL algorithm is optimal for several graph classes such as complete graphs, expanders, random graphs, and several Cayley graphs.
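The restriction in the third part can be sketched on the complete graph. Below, every uninformed node pulls from one uniformly random node per round, but an informed node answers at most one of the requests it receives; processing requests in node order (rather than choosing a random request to answer) is a simplification, and the function name is illustrative:

```python
import random

def restricted_pull_rounds(n, rng=None):
    """Restricted PULL rumour spreading on the complete graph K_n.

    Each round, every uninformed node pulls from a uniformly random
    node; an informed node may answer at most ONE request per round.
    Returns the number of rounds until all n nodes are informed."""
    rng = rng or random.Random(0)
    informed = {0}                 # node 0 starts with the rumour
    rounds = 0
    while len(informed) < n:
        rounds += 1
        answered = set()           # informed nodes that already replied
        newly = set()
        for v in range(n):
            if v in informed:
                continue
            u = rng.randrange(n)   # v pulls from a random node u
            if u in informed and u not in answered:
                answered.add(u)    # u serves at most one request
                newly.add(v)
        informed |= newly
    return rounds
```

Because each informed node informs at most one new node per round, the informed set can at most double, so the process needs at least log₂ n rounds; optimality on complete graphs means matching this up to constants.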
Self-stabilizing balls and bins in batches: The power of leaky bins
A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modelled as static balls-into-bins processes, where m balls (tasks) are to be distributed among n bins (servers). In a seminal work, Azar et al. [4] proposed the sequential strategy Greedy[d] for n = m. Each ball queries the load of d random bins and is allocated to a least loaded of them. Azar et al. showed that d = 2 yields an exponential improvement compared to d = 1. Berenbrink et al. [7] extended this to m ≥ n, showing that for d = 2 the maximal load difference is independent of m (in contrast to the d = 1 case). We propose a new variant of an infinite balls-into-bins process. In each round an expected number of λn new balls arrive and are distributed (in parallel) to the bins, and each non-empty bin deletes one of its balls. This setting models a set of servers processing incoming requests, where clients can query a server’s current load but receive no information about parallel requests. We study the Greedy[d] distribution scheme in this setting and show a strong self-stabilizing property: for any arrival rate λ = λ(n) < 1, the system load is time-invariant. Moreover, for any (even super-exponential) round t, the maximum system load is (w.h.p.) O((1/(1−λ)) · log(n/(1−λ))) for d = 1 and O(log(n/(1−λ))) for d = 2. In particular, Greedy[2] has an exponentially smaller system load for high arrival rates.
HIGH-DIMENSIONAL PROBLEMS IN STATISTICS AND PROBABILITY: CORRELATION MINING AND DISTRIBUTED LOAD BALANCING
Technological progress has encouraged the study of various high-dimensional systems through the lens of statistics and probability. In this dissertation, we consider two such high-dimensional problems: the first arising in the integration of genomic data, and the second arising in probabilistic models for load balancing. A brief description follows. It is now common across many scientific and engineering disciplines to have multiple types of features measured on the same set of samples. In the first part of this dissertation, in the context of two measurement types, we focus on the exploratory problem of finding bimodules: these are sets of features from the two data types that have significant aggregate cross-correlation. Based on the iterative-testing framework that has been recently used in other settings, we design a new methodology to find bimodules. We apply this methodology to the problem of eQTL analysis in genomics to identify gene-SNP association networks. In the second part of this dissertation, motivated by load balancing problems in large data centers, we study a processing system with multiple queues known as the Supermarket model. In this system, each incoming job is routed into one of the n available queues based on the following randomized scheme: d out of the n queues are sampled at random and the job is assigned to the shortest of the d sampled queues. Hence, when d = 1, each job joins a random queue, while for d = n, each job joins the shortest of all n queues. Here we prove functional central limit theorems for this system in various regimes where d and n scale to infinity and the system load approaches criticality.
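The Supermarket model's routing rule is simple to state in code. The sketch below is a toy discrete-time caricature (one arrival with probability λ per step, one uniformly random server completing a job per step), not the continuous-time Markov chain analyzed in the dissertation; the function names and parameters are illustrative:

```python
import random

def route(queues, d, rng):
    """Power-of-d-choices: sample d of the n queues uniformly at random
    (with replacement) and return the index of a shortest sampled queue."""
    sampled = [rng.randrange(len(queues)) for _ in range(d)]
    return min(sampled, key=lambda q: queues[q])

def supermarket(n, d, lam, steps, rng=None):
    """Toy discrete-time supermarket model: per step, a job arrives with
    probability lam and joins the shortest of d sampled queues; then one
    uniformly random server completes a job if it has any.  Total arrival
    rate lam < 1 (relative to total service rate 1) keeps it stable.
    Returns the final vector of queue lengths."""
    rng = rng or random.Random(0)
    queues = [0] * n
    for _ in range(steps):
        if rng.random() < lam:
            queues[route(queues, d, rng)] += 1
        s = rng.randrange(n)
        if queues[s] > 0:
            queues[s] -= 1
    return queues
```

Setting d = 1 recovers uniform random routing and d = n recovers join-the-shortest-queue; the regimes studied in the dissertation interpolate between these as d and n grow together.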
Unit Operations of Particulate Solids
Suitable for practicing engineers and engineers in training, this book covers the most important operations involving particulate solids. Through clear explanations of theoretical principles and practical laboratory exercises, the text provides an understanding of the behavior of powders and pulverized systems. It also helps readers develop skills for operating, optimizing, and innovating particle processing technologies and machinery in order to carry out industrial operations. The author explores common bulk solids processing operations, including milling, agglomeration, fluidization, mixing, and solid-fluid separation
On deep learning in physics
Machine learning, and most notably deep neural networks, have seen unprecedented success in recent years due to their ability to learn complex nonlinear mappings by ingesting large amounts of data through the process of training. This learning-by-example approach has slowly made its way into the physical sciences in recent years. In this dissertation I present a collection of contributions at the intersection of the fields of physics and deep learning. These contributions constitute some of the earlier introductions of deep learning to the physical sciences, and comprises a range of machine learning techniques, such as feed forward neural networks, generative models, and reinforcement learning. A focus will be placed on the lessons and techniques learned along the way that would influence future research projects