123 research outputs found

    Distributed Planar Reachability in Nearly Optimal Time

    Being Fast Means Being Chatty: The Local Information Cost of Graph Spanners

    We introduce a new measure for quantifying the amount of information that the nodes in a network need to learn to jointly solve a graph problem. We show that the local information cost ($\textsf{LIC}$) presents a natural lower bound on the communication complexity of distributed algorithms. For the synchronous CONGEST-KT1 model, where each node has initial knowledge of its neighbors' IDs, we prove that $\Omega(\textsf{LIC}_\gamma(P)/ \log\tau \log n)$ bits are required for solving a graph problem $P$ with a $\tau$-round algorithm that errs with probability at most $\gamma$. Our result is the first lower bound that yields a general trade-off between communication and time for graph problems in the CONGEST-KT1 model. We demonstrate how to apply the local information cost by deriving a lower bound on the communication complexity of computing a $(2t-1)$-spanner that consists of at most $O(n^{1+1/t+\epsilon})$ edges, where $\epsilon = \Theta(1/t^2)$. Our main result is that any $O(\textsf{poly}(n))$-time algorithm must send at least $\tilde\Omega((1/t^2) n^{1+1/2t})$ bits in the CONGEST model under the KT1 assumption. Previously, only a trivial lower bound of $\tilde\Omega(n)$ bits was known for this problem. A consequence of our lower bound is that achieving both time- and communication-optimality is impossible when designing a distributed spanner algorithm. In light of the work of King, Kutten, and Thorup (PODC 2015), this shows that computing a minimum spanning tree can be done significantly faster than finding a spanner when considering algorithms with $\tilde O(n)$ communication complexity. Our result also implies time complexity lower bounds for constructing a spanner in the node-congested clique of Augustine et al. (2019) and in the push-pull gossip model with limited bandwidth.
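
    To make the object in this abstract concrete, the sketch below builds a $(2t-1)$-spanner of an unweighted graph with the classical centralized greedy rule: keep an edge only if its endpoints are currently more than $2t-1$ hops apart. This is not the distributed setting the abstract studies and says nothing about communication cost; the function names `bfs_distance` and `greedy_spanner` and the edge-list input format are assumptions made for illustration.

```python
from collections import deque


def bfs_distance(adj, source, target, limit):
    """Return the hop distance from source to target, or None if it exceeds limit."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] >= limit:
            # Nodes at the hop limit are not expanded any further.
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == target:
                    return dist[v]
                queue.append(v)
    return None


def greedy_spanner(n, edges, t):
    """Greedy (2t-1)-spanner of an unweighted graph on nodes 0..n-1 (illustrative)."""
    stretch = 2 * t - 1
    adj = {v: set() for v in range(n)}
    kept = []
    for u, v in edges:
        # Keep the edge only if the spanner built so far has no u-v path
        # of at most `stretch` hops.
        if bfs_distance(adj, u, v, stretch) is None:
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(greedy_spanner(4, k4, 2))
```

    On the complete graph with four nodes and $t = 2$, the rule keeps only the three edges at node 0, since every other pair is then already within 3 hops.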

    Efficient concurrent data structure access parallelism techniques for increasing scalability

    Multi-core processors have revolutionised the way data structures are designed by bringing parallelism to mainstream computing. Key to exploiting hardware parallelism available in multi-core processors are concurrent data structures. However, some concurrent data structure abstractions are inherently sequential and incapable of harnessing the parallelism performance of multi-core processors. Designing and implementing concurrent data structures to harness hardware parallelism is challenging due to the requirement of correctness, efficiency and practicability under various application constraints. In this thesis, our research contribution is towards improving concurrent data structure access parallelism to increase data structure performance. We propose new design frameworks that improve access parallelism of already existing concurrent data structure designs. Also, we propose new concurrent data structure designs with significant performance improvements. To give an insight into the interplay between hardware and concurrent data structure access parallelism, we give a detailed analysis and model the performance scalability with varying parallelism. In the first part of the thesis, we focus on data structure semantic relaxation. By relaxing the semantics of a data structure, a bigger design space, one that allows weaker synchronization and more useful parallelism, is unveiled. Investigating new data structure designs, capable of trading semantics for achieving better performance in a monotonic way, is a major challenge in the area. We algorithmically address this challenge in this part of the thesis. We present an efficient, lock-free, concurrent data structure design framework for out-of-order semantic relaxation. We introduce a new two-dimensional algorithmic design that uses multiple instances of a given data structure to improve access parallelism.
    In the second part of the thesis, we propose an efficient priority queue that improves access parallelism by reducing the number of synchronization points for each operation. Priority queues are fundamental abstract data types, often used to manage limited resources in parallel systems. Typical proposed parallel priority queue implementations are based on heaps or skip lists. In recent literature, skip lists have been shown to be the most efficient design choice for implementing priority queues. Though numerous intricate implementations of skip-list-based queues have been proposed in the literature, their performance is constrained by the high number of global atomic updates per operation and the high memory consumption, which are proportional to the number of sub-lists in the queue. In this part of the thesis, we propose an alternative approach for designing lock-free linearizable priority queues that significantly improves memory efficiency and throughput performance by reducing the number of global atomic updates and memory consumption as compared to skip-list-based queues. To achieve this, our new design combines two structures, a search tree and a linked list, forming what we call a Tree Search List Queue (TSLQueue). Subsequently, we analyse and introduce a model for lock-free concurrent data structure access parallelism. The major impediment to scaling concurrent data structures is memory contention when accessing shared data structure access points, leading to thread serialisation and hindering parallelism. Aiming to address this challenge, a significant amount of work in the literature has proposed multi-access techniques that improve concurrent data structure parallelism. However, there is little work on analysing and modelling the execution behaviour of concurrent multi-access data structures, especially in a shared memory setting.
    In this part of the thesis, we analyse and model the general execution behaviour of concurrent multi-access data structures in the shared memory setting. We study and analyse the behaviour of the two popular random access patterns, shared (Remote) and exclusive (Local) access, and the behaviour of the two most commonly used atomic primitives for designing lock-free data structures: Compare and Swap, and Fetch and Add.

    Broadcast CONGEST Algorithms against Adversarial Edges

    We consider the cornerstone broadcast task with an adaptive adversary that controls a fixed number of $t$ edges in the input communication graph. In this model, the adversary sees the entire communication in the network and the random coins of the nodes, while maliciously manipulating the messages sent through a set of $t$ edges (unknown to the nodes). Since the influential work of [Pease, Shostak and Lamport, JACM'80], broadcast algorithms against a wide range of adversarial models have been studied in both theory and practice for more than four decades. Despite this extensive research, there is no round-efficient broadcast algorithm for general graphs in the CONGEST model of distributed computing. We provide the first round-efficient broadcast algorithms against adaptive edge adversaries. Our two key results for $n$-node graphs of diameter $D$ are as follows: 1. For $t=1$, there is a deterministic algorithm that solves the problem within $\widetilde{O}(D^2)$ rounds, provided that the graph is 3 edge-connected. This round complexity beats the natural barrier of $O(D^3)$ rounds, the existential lower bound on the maximal length of $3$ edge-disjoint paths between a given pair of nodes in $G$. This algorithm can be extended to a $\widetilde{O}(D^{O(t)})$-round algorithm against $t$ adversarial edges in $(2t+1)$ edge-connected graphs. 2. For expander graphs with minimum degree of $\Omega(t^2\log n)$, there is an improved broadcast algorithm with $O(t \log^2 n)$ rounds against $t$ adversarial edges. This algorithm exploits the connectivity and conductance properties of $G$-subgraphs obtained by employing Karger's edge sampling technique. Our algorithms mark a new connection between the areas of fault-tolerant network design and reliable distributed communication. Comment: accepted to DISC2

    General CONGEST Compilers against Adversarial Edges

    The Complexity of Symmetry Breaking in Massive Graphs

    The goal of this paper is to understand the complexity of symmetry breaking problems, specifically maximal independent set (MIS) and the closely related beta-ruling set problem, in two computational models suited for large-scale graph processing, namely the k-machine model and the graph streaming model. We present a number of results. For MIS in the k-machine model, we improve the O~(m/k^2 + Delta/k)-round upper bound of Klauck et al. (SODA 2015) by presenting an O~(m/k^2)-round algorithm. We also present an Omega~(n/k^2) round lower bound for MIS, the first lower bound for a symmetry breaking problem in the k-machine model. For beta-ruling sets, we use hierarchical sampling to obtain more efficient algorithms in the k-machine model and also in the graph streaming model. More specifically, we obtain a k-machine algorithm that runs in O~(beta n Delta^{1/beta}/k^2) rounds and, by using a similar hierarchical sampling technique, we obtain one-pass algorithms for both insertion-only and insertion-deletion streams that use O(beta * n^{1+1/2^{beta-1}}) space. The latter result establishes a clear separation between MIS, which is known to require Omega(n^2) space (Cormode et al., ICALP 2019), and beta-ruling sets, even for beta = 2. Finally, we present an even faster 2-ruling set algorithm in the k-machine model, one that runs in O~(n/k^{2-epsilon} + k^{1-epsilon}) rounds for any epsilon, 0 <= epsilon <= 1. For a wide range of values of k this round complexity simplifies to O~(n/k^2) rounds, which we conjecture is optimal. Our results use a variety of techniques. For our upper bounds, we prove and use simulation theorems for beeping algorithms, hierarchical sampling, and L_0-sampling, whereas for our lower bounds we use information-theoretic arguments and reductions to 2-party communication complexity problems.
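
    As a reminder of the two objects compared in this abstract, the sketch below uses the usual definition (assumed here, not quoted from the paper): a beta-ruling set is an independent set such that every node is within beta hops of a member, so an MIS is a 1-ruling set. The code is a plain sequential illustration, unrelated to the k-machine or streaming algorithms; the names `greedy_mis` and `is_ruling_set` and the adjacency-dict input are hypothetical.

```python
from collections import deque


def greedy_mis(adj):
    """Sequential greedy MIS: scan nodes in sorted order, pick a node if no neighbor is picked."""
    chosen = set()
    for v in sorted(adj):
        if not any(u in chosen for u in adj[v]):
            chosen.add(v)
    return chosen


def is_ruling_set(adj, candidate, beta):
    """Check that candidate is independent and every node is within beta hops of it."""
    # Independence: no two candidate nodes are adjacent.
    if any(u in candidate for v in candidate for u in adj[v]):
        return False
    # Multi-source BFS from all candidate nodes at once.
    dist = {v: 0 for v in candidate}
    queue = deque(candidate)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return all(v in dist and dist[v] <= beta for v in adj)


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 - 4.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    print(greedy_mis(path))
    print(is_ruling_set(path, {0, 4}, 2), is_ruling_set(path, {0, 4}, 1))
```

    On the five-node path, the set {0, 4} is a 2-ruling set but not an MIS, because node 2 is two hops from both members. A weaker covering requirement of this kind is what the space separation in the abstract is about.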

    Performance Analysis and Modelling of Concurrent Multi-access Data Structures

    The major impediment to scaling concurrent data structures is memory contention when accessing shared data structure access-points, leading to thread serialisation and hindering parallelism. Aiming to address this challenge, a significant amount of work in the literature has proposed multi-access techniques that improve concurrent data structure parallelism. However, there is little work on analysing and modelling the execution behaviour of concurrent multi-access data structures, especially in a shared memory setting. In this paper, we analyse and model the general execution behaviour of concurrent multi-access data structures in the shared memory setting. We study and analyse the behaviour of the two popular random access patterns, shared (Remote) and exclusive (Local) access, and the behaviour of the two most commonly used atomic primitives for designing lock-free data structures: Compare and Swap, and Fetch and Add. We model the concurrent multi-accesses by splitting the thread execution procedure into five logical sessions: i) side-work, ii) access-point search, iii) access-point acquisition, iv) access-point data acquisition and v) access-point data operation. We model the acquisition of an access-point as a system of closed queuing networks with parallel servers, and data acquisition in terms of where the data is located within the memory system. We evaluate our model on a set of concurrent data structure designs including a counter, a stack and a FIFO queue. The evaluation is carried out on two state-of-the-art multi-core processors: Intel Xeon Phi CPU 7290 with 72 physical cores and Intel Xeon E5-2695 with 14 physical cores. Our model is able to predict the throughput performance of the given concurrent data structures with 80% to 100% accuracy on both architectures.

    A Simplicial Model for KB4_n: Epistemic Logic with Agents That May Die

    A Simplicial Model for $KB4_n$: Epistemic Logic with Agents that May Die

    The standard semantics of multi-agent epistemic logic S5 is based on Kripke models whose accessibility relations are reflexive, symmetric and transitive. This one-dimensional structure contains implicit higher-dimensional information beyond pairwise interactions, which we formalized as pure simplicial models in a previous work (Information and Computation, 2021). Here we extend the theory to encompass simplicial models that are not necessarily pure. The corresponding class of Kripke models are those where the accessibility relation is symmetric and transitive, but might not be reflexive. Such models correspond to the epistemic logic KB4. Impure simplicial models arise in situations where two possible worlds may not have the same set of agents. We illustrate this with distributed computing examples of synchronous systems where processes may crash.
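
    The relation properties named in this abstract can be checked mechanically on a small example. The sketch below is only an illustration under an assumed encoding: one agent's accessibility relation is a set of ordered pairs of worlds, and a world where that agent has crashed has no pairs at all. The world names `w1` and `w2` and the helper names are hypothetical, not taken from the paper.

```python
def is_reflexive(worlds, rel):
    """Every world is related to itself."""
    return all((w, w) in rel for w in worlds)


def is_symmetric(rel):
    """Whenever (u, v) is in the relation, so is (v, u)."""
    return all((v, u) in rel for (u, v) in rel)


def is_transitive(rel):
    """Whenever (u, v) and (v, x) are in the relation, so is (u, x)."""
    return all((u, x) in rel for (u, v) in rel for (w, x) in rel if v == w)


if __name__ == "__main__":
    worlds = {"w1", "w2"}
    # Assumed encoding: the agent is alive in w1 and has crashed in w2,
    # so w2 has no accessibility pairs for this agent.
    rel = {("w1", "w1")}
    print(is_reflexive(worlds, rel), is_symmetric(rel), is_transitive(rel))
```

    Under this encoding the relation is symmetric and transitive but not reflexive, which is the KB4-style situation the abstract describes; adding the pair `("w2", "w2")` would restore reflexivity and give an S5-style equivalence relation.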

    An Almost Singularly Optimal Asynchronous Distributed MST Algorithm

    A singularly (near) optimal distributed algorithm is one that is (near) optimal in \emph{two} criteria, namely, its time and message complexities. For \emph{synchronous} CONGEST networks, such algorithms are known for fundamental distributed computing problems such as leader election [Kutten et al., JACM 2015] and Minimum Spanning Tree (MST) construction [Pandurangan et al., STOC 2017, Elkin, PODC 2017]. However, it is open whether a singularly (near) optimal bound can be obtained for the MST construction problem in general \emph{asynchronous} CONGEST networks. We present a randomized distributed MST algorithm that, with high probability, computes an MST in \emph{asynchronous} CONGEST networks and takes $\tilde{O}(D^{1+\epsilon} + \sqrt{n})$ time and $\tilde{O}(m)$ messages, where $n$ is the number of nodes, $m$ the number of edges, $D$ is the diameter of the network, and $\epsilon > 0$ is an arbitrarily small constant (both time and message bounds hold with high probability). Our algorithm is message optimal (up to a polylog$(n)$ factor) and almost time optimal (except for a $D^{\epsilon}$ factor). Our result answers an open question raised in Mashregi and King [DISC 2019] by giving the first known asynchronous MST algorithm that has sublinear time (for all $D = O(n^{1-\epsilon})$) and uses $\tilde{O}(m)$ messages. Using a result of Mashregi and King [DISC 2019], this also yields the first asynchronous MST algorithm that is sublinear in both time and messages in the $KT_1$ CONGEST model. A key tool in our algorithm is the construction of a low diameter rooted spanning tree in asynchronous CONGEST that has depth $\tilde{O}(D^{1+\epsilon})$ (for an arbitrarily small constant $\epsilon > 0$) in $\tilde{O}(D^{1+\epsilon})$ time and $\tilde{O}(m)$ messages. To the best of our knowledge, this is the first such construction that is almost singularly optimal in the asynchronous setting. Comment: 27 pages, accepted to DISC 202