Distributed Data Summarization in Well-Connected Networks
We study distributed algorithms for some fundamental problems in data summarization. Given a communication graph $G$ of $n$ nodes, each of which may hold a value initially, we focus on computing $\sum_{i=1}^{N} g(f_i)$, where $f_i$ is the number of occurrences of value $i$ and $g$ is some fixed function. This includes important statistics such as the number of distinct elements, frequency moments, and the empirical entropy of the data.
In the CONGEST model, a simple adaptation from streaming lower bounds shows that it requires $\tilde{\Omega}(D + n)$ rounds, where $D$ is the diameter of the graph, to compute some of these statistics exactly. However, these lower bounds do not hold for graphs that are well-connected. We give an algorithm that computes $\sum_{i=1}^{N} g(f_i)$ exactly in $\tau_G \cdot 2^{O(\sqrt{\log n})}$ rounds, where $\tau_G$ is the mixing time of $G$. This also has applications in computing the top $k$ most frequent elements.
We demonstrate that there is a high similarity between the GOSSIP model and the CONGEST model in well-connected graphs. In particular, we show that each round of the GOSSIP model can be simulated almost perfectly in $\tilde{O}(\tau_G)$ rounds of the CONGEST model. To this end, we develop a new algorithm for the GOSSIP model that $(1 \pm \epsilon)$-approximates the $p$-th frequency moment $F_p = \sum_{i=1}^{N} f_i^p$ in $\tilde{O}(\epsilon^{-2} n^{1-k/p})$ rounds for $p \geq 2$, when the number of distinct elements $F_0$ is at most $O(n^{1/(k-1)})$. This result can be translated back to the CONGEST model with a factor $\tilde{O}(\tau_G)$ blow-up in the number of rounds.
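To make the quantity sum_i g(f_i) concrete, the following is a minimal centralized sketch (not the distributed algorithm from the abstract) that evaluates it for three choices of g: distinct elements, a frequency moment, and the empirical entropy. The function names and the sample data are illustrative assumptions.

```python
import math
from collections import Counter


def summarize(values, g):
    """Return the sum of g(f_i) over all values i that occur, where f_i is the count of i."""
    freqs = Counter(values)
    return sum(g(f) for f in freqs.values())


def distinct_elements(values):
    # F_0: g(f) = 1 for every value that occurs at least once.
    return summarize(values, lambda f: 1)


def frequency_moment(values, p):
    # F_p: g(f) = f^p.
    return summarize(values, lambda f: f ** p)


def empirical_entropy(values):
    # g(f) = -(f/m) * log2(f/m), where m is the total number of values held.
    m = len(values)
    return summarize(values, lambda f: -(f / m) * math.log2(f / m))


# Hypothetical data: the values held by six nodes.
data = ["a", "a", "b", "c", "c", "c"]
print(distinct_elements(data))    # 3
print(frequency_moment(data, 2))  # 2^2 + 1^2 + 3^2 = 14
print(empirical_entropy(data))
```

All three statistics reduce to one pass over the frequency table, which is why a single routine for sum_i g(f_i) covers them.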
Algorithms for Fundamental Problems in Computer Networks.
Traditional studies of algorithms consider the sequential setting, where the whole input is fed into a single device that computes the solution. Today, a network such as the Internet contains a vast amount of information. The overhead of aggregating all the information into a single device is too expensive, so a distributed approach to solving the problem is often preferable. In this thesis, we aim to develop efficient algorithms for the following fundamental graph problems that arise in networks, in both sequential and distributed settings.
Graph coloring is a basic symmetry-breaking problem in distributed computing. Each node is to be assigned a color such that adjacent nodes are assigned different colors. Both the efficiency and the quality of the coloring are important measures of an algorithm. One of our main contributions is providing tools for obtaining colorings of good quality whose existence is non-trivial. We also consider other optimization problems in the distributed setting. For example, we investigate efficient methods for identifying the connectivity as well as the bottleneck edges in a distributed network. Our approximation algorithm is almost tight in the sense that the running time matches the known lower bound up to a poly-logarithmic factor. As another example, we model how task allocation can be done in ant colonies when the ants may have different capabilities for different tasks.
Matching problems are among the classic combinatorial optimization problems. We study weighted matching problems in the sequential setting. We give a new scaling algorithm for finding the maximum weight perfect matching in general graphs, which improves on the long-standing Gabow-Tarjan algorithm (1991) and matches the running time of the best weighted bipartite perfect matching algorithm (Gabow and Tarjan, 1989). Furthermore, for the maximum weight matching problem in bipartite graphs, we give a scaling algorithm that runs faster than Gabow and Tarjan's weighted bipartite perfect matching algorithm.
PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/113540/1/hsinhao_1.pd
Ant-Inspired Density Estimation via Random Walks
Many ant species employ distributed population density estimation in
applications ranging from quorum sensing [Pra05], to task allocation [Gor99],
to appraisal of enemy colony strength [Ada90]. It has been shown that ants
estimate density by tracking encounter rates -- the higher the population
density, the more often the ants bump into each other [Pra05,GPT93].
We study distributed density estimation from a theoretical perspective. We
prove that a group of anonymous agents randomly walking on a grid are able to
estimate their density within a small multiplicative error in few steps by
measuring their rates of encounter with other agents. Despite dependencies
inherent in the fact that nearby agents may collide repeatedly (and, worse,
cannot recognize when this happens), our bound nearly matches what would be
required to estimate density by independently sampling grid locations.
From a biological perspective, our work helps shed light on how ants and
other social insects can obtain relatively accurate density estimates via
encounter rates. From a technical perspective, our analysis provides new tools
for understanding complex dependencies in the collision probabilities of
multiple random walks. We bound the strength of these dependencies using
local mixing properties of the underlying graph. Our results extend beyond
the grid to more general graphs and we discuss applications to size estimation
for social networks and density estimation for robot swarms.
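The encounter-rate idea can be illustrated with a small simulation. This is only a sketch under assumed parameters (a 21 x 21 torus, 40 agents, 2000 steps, a fixed seed), not the analysis or exact model of the paper: each agent counts how many other agents share its cell after every step and reports encounters per step as its density estimate.

```python
import random
from collections import Counter


def estimate_density(side, num_agents, steps, seed=0):
    """Simulate random walks on a side x side torus; return each agent's encounter rate."""
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    # Agents start at independent, uniformly random cells.
    positions = [(rng.randrange(side), rng.randrange(side)) for _ in range(num_agents)]
    encounters = [0] * num_agents
    for _ in range(steps):
        new_positions = []
        for (x, y) in positions:
            dx, dy = rng.choice(moves)
            new_positions.append(((x + dx) % side, (y + dy) % side))
        positions = new_positions
        occupancy = Counter(positions)
        for agent, pos in enumerate(positions):
            # Number of *other* agents currently on the same cell.
            encounters[agent] += occupancy[pos] - 1
    return [count / steps for count in encounters]


# Illustrative parameters (assumptions, not taken from the paper).
side, num_agents, steps = 21, 40, 2000
estimates = estimate_density(side, num_agents, steps)
true_density = (num_agents - 1) / side ** 2  # other agents per cell
mean_estimate = sum(estimates) / num_agents
print(true_density, mean_estimate)
```

Agents in this sketch cannot tell a repeated collision with the same neighbor from a fresh one, which is the dependency the abstract refers to; the estimate is still unbiased because every agent remains uniformly distributed at every step.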
Optimal Gossip Algorithms for Exact and Approximate Quantile Computations
This paper gives drastically faster gossip algorithms to compute exact and
approximate quantiles.
Gossip algorithms, which allow each node to contact a uniformly random other
node in each round, have been intensely studied and adopted in many
applications due to their fast convergence and their robustness to failures.
Kempe et al. [FOCS'03] gave gossip algorithms to compute important aggregate
statistics if every node is given a value. In particular, they gave a beautiful
$O(\log n + \log \frac{1}{\epsilon})$ round algorithm to $\epsilon$-approximate
the sum of all values and an $O(\log^2 n)$ round algorithm to compute the exact
$\phi$-quantile, i.e., the $\lceil \phi n \rceil$-th smallest value.
We give a quadratically faster and in fact optimal gossip algorithm for the
exact $\phi$-quantile problem which runs in $O(\log n)$ rounds. We furthermore
show that one can achieve an exponential speedup if one allows for an
$\epsilon$-approximation. We give an $O(\log \log n + \log \frac{1}{\epsilon})$
round gossip algorithm which computes a value of rank between $\phi n$ and
$(\phi + \epsilon) n$ at every node. Our algorithms are extremely simple and
very robust: they can
be operated with the same running times even if every transmission fails with
a, potentially different, constant probability. We also give a matching
$\Omega(\log \log n + \log \frac{1}{\epsilon})$ lower bound which shows that
our algorithm is optimal for all values of $\epsilon$.
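As background for the gossip model described above, here is a minimal sketch of the push-sum idea of Kempe et al. for aggregate computation (it is not the quantile algorithm of this paper). Each node keeps a (sum, weight) pair, and in every round keeps half of it and pushes the other half to a uniformly random other node. The function name, input values, and round count are illustrative assumptions.

```python
import random


def push_sum_average(values, rounds, seed=0):
    """Run push-sum gossip; return every node's estimate of the average of values."""
    rng = random.Random(seed)
    n = len(values)  # assumes n >= 2
    s = [float(v) for v in values]  # running sums
    w = [1.0] * n                   # running weights
    for _ in range(rounds):
        new_s = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # Pick a uniformly random node other than i.
            target = rng.randrange(n - 1)
            if target >= i:
                target += 1
            # Keep one half, push the other half to the target.
            for dest in (i, target):
                new_s[dest] += s[i] / 2
                new_w[dest] += w[i] / 2
        s, w = new_s, new_w
    return [si / wi for si, wi in zip(s, w)]


values = list(range(1, 51))  # hypothetical inputs, one per node
estimates = push_sum_average(values, rounds=100)
print(min(estimates), max(estimates))  # both should be close to 25.5
```

The total sum and total weight are conserved in every round, so each ratio s/w converges to the true average. Starting with weight 1 at a single node and 0 elsewhere would make the ratios converge to the sum instead.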
Distributed $(\Delta+1)$-Coloring in Sublogarithmic Rounds
We give a new randomized distributed algorithm for $(\Delta+1)$-coloring in
the LOCAL model, running in $O(\sqrt{\log \Delta}) + 2^{O(\sqrt{\log \log n})}$
rounds in a graph of maximum degree $\Delta$. This implies that the
$(\Delta+1)$-coloring problem is easier than the maximal independent set
problem and the maximal matching problem, due to the lower bounds for those
problems by Kuhn, Moscibroda, and Wattenhofer [PODC'04].
Our algorithm also extends to list-coloring where the palette of each node
contains $\Delta+1$ colors. We extend the set of distributed symmetry-breaking
techniques by performing a decomposition of graphs into dense and sparse parts.
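For orientation, the classic random color trial below is a simple baseline for coloring with (maximum degree + 1) colors, simulated round by round. It is not the sublogarithmic algorithm of the paper and does not use its dense/sparse decomposition. The graph and the names are illustrative assumptions.

```python
import random


def random_color_trial(adj, seed=0):
    """Color a graph with colors 0..max_degree by repeated random trials.

    adj maps each node to the set of its neighbors (symmetric adjacency).
    Returns (color assignment, number of rounds used).
    """
    rng = random.Random(seed)
    max_degree = max((len(nbrs) for nbrs in adj.values()), default=0)
    color = {}
    rounds = 0
    while len(color) < len(adj):
        rounds += 1
        # Each uncolored node proposes a color not used by its colored neighbors.
        proposal = {}
        for v in adj:
            if v in color:
                continue
            used = {color[u] for u in adj[v] if u in color}
            palette = [c for c in range(max_degree + 1) if c not in used]
            proposal[v] = rng.choice(palette)  # never empty: at most max_degree colors are used
        # A node keeps its proposal only if no neighbor proposed the same color.
        for v, c in proposal.items():
            if all(proposal.get(u) != c for u in adj[v]):
                color[v] = c
    return color, rounds


# Hypothetical example: a 10-cycle with chords to the opposite node (degree 3).
adj = {i: {(i - 1) % 10, (i + 1) % 10, (i + 5) % 10} for i in range(10)}
coloring, rounds = random_color_trial(adj)
print(coloring, rounds)
```

Each round uses only information from direct neighbors, so it maps onto one or two rounds of the LOCAL model. This baseline is known to finish in O(log n) rounds with high probability, which is the bound that sublogarithmic algorithms improve on.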
Lower Bounds for Dynamic Distributed Task Allocation
We study the problem of distributed task allocation in multi-agent systems. Suppose there is a collection of agents, a collection of tasks, and a demand vector, which specifies the number of agents required to perform each task. The goal of the agents is to cooperatively allocate themselves to the tasks to satisfy the demand vector. We study the dynamic version of the problem where the demand vector changes over time. Here, the goal is to minimize the switching cost, which is the number of agents that change tasks in response to a change in the demand vector. The switching cost is an important metric since changing tasks may incur significant overhead.
We study a mathematical formalization of the above problem introduced by Su, Su, Dornhaus, and Lynch [Su et al., 2017], which can be reformulated as a question of finding a low-distortion embedding from symmetric difference to Hamming distance. In this model it is trivial to prove that the switching cost is at least 2. We present the first non-trivial lower bounds for the switching cost, by giving lower bounds of 3 and 4 for different ranges of the parameters.