38,229 research outputs found

    Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

    Full text link
    As massive graphs become more prevalent, there is a rapidly growing need for scalable algorithms that solve classical graph problems, such as maximum matching and minimum vertex cover, on large datasets. For massive inputs, several different computational models have been introduced, including the streaming model, the distributed communication model, and the massively parallel computation (MPC) model that is a common abstraction of MapReduce-style computation. In each model, algorithms are analyzed in terms of resources such as space used or rounds of communication needed, in addition to the more traditional approximation ratio. In this paper, we give a single unified approach that yields better approximation algorithms for matching and vertex cover in all these models. The highlights include: * The first one pass, significantly-better-than-2-approximation for matching in random arrival streams that uses subquadratic space, namely a (1.5+ϵ)(1.5+\epsilon)-approximation streaming algorithm that uses O(n1.5)O(n^{1.5}) space for constant ϵ>0\epsilon > 0. * The first 2-round, better-than-2-approximation for matching in the MPC model that uses subquadratic space per machine, namely a (1.5+ϵ)(1.5+\epsilon)-approximation algorithm with O(mn+n)O(\sqrt{mn} + n) memory per machine for constant ϵ>0\epsilon > 0. By building on our unified approach, we further develop parallel algorithms in the MPC model that give a (1+ϵ)(1 + \epsilon)-approximation to matching and an O(1)O(1)-approximation to vertex cover in only O(loglogn)O(\log\log{n}) MPC rounds and O(n/polylog(n))O(n/poly\log{(n)}) memory per machine. These results settle multiple open questions posed in the recent paper of Czumaj~et.al. [STOC 2018]

    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    Full text link
    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature report results on much smaller graphs, and the ones for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

    Robust randomized matchings

    Full text link
    The following game is played on a weighted graph: Alice selects a matching MM and Bob selects a number kk. Alice's payoff is the ratio of the weight of the kk heaviest edges of MM to the maximum weight of a matching of size at most kk. If MM guarantees a payoff of at least α\alpha then it is called α\alpha-robust. In 2002, Hassin and Rubinstein gave an algorithm that returns a 1/21/\sqrt{2}-robust matching, which is best possible. We show that Alice can improve her payoff to 1/ln(4)1/\ln(4) by playing a randomized strategy. This result extends to a very general class of independence systems that includes matroid intersection, b-matchings, and strong 2-exchange systems. It also implies an improved approximation factor for a stochastic optimization variant known as the maximum priority matching problem and translates to an asymptotic robustness guarantee for deterministic matchings, in which Bob can only select numbers larger than a given constant. Moreover, we give a new LP-based proof of Hassin and Rubinstein's bound

    Distributed Connectivity Decomposition

    Full text link
    We present time-efficient distributed algorithms for decomposing graphs with large edge or vertex connectivity into multiple spanning or dominating trees, respectively. As their primary applications, these decompositions allow us to achieve information flow with size close to the connectivity by parallelizing it along the trees. More specifically, our distributed decomposition algorithms are as follows: (I) A decomposition of each undirected graph with vertex-connectivity kk into (fractionally) vertex-disjoint weighted dominating trees with total weight Ω(klogn)\Omega(\frac{k}{\log n}), in O~(D+n)\widetilde{O}(D+\sqrt{n}) rounds. (II) A decomposition of each undirected graph with edge-connectivity λ\lambda into (fractionally) edge-disjoint weighted spanning trees with total weight λ12(1ε)\lceil\frac{\lambda-1}{2}\rceil(1-\varepsilon), in O~(D+nλ)\widetilde{O}(D+\sqrt{n\lambda}) rounds. We also show round complexity lower bounds of Ω~(D+nk)\tilde{\Omega}(D+\sqrt{\frac{n}{k}}) and Ω~(D+nλ)\tilde{\Omega}(D+\sqrt{\frac{n}{\lambda}}) for the above two decompositions, using techniques of [Das Sarma et al., STOC'11]. Moreover, our vertex-connectivity decomposition extends to centralized algorithms and improves the time complexity of [Censor-Hillel et al., SODA'14] from O(n3)O(n^3) to near-optimal O~(m)\tilde{O}(m). As corollaries, we also get distributed oblivious routing broadcast with O(1)O(1)-competitive edge-congestion and O(logn)O(\log n)-competitive vertex-congestion. Furthermore, the vertex connectivity decomposition leads to near-time-optimal O(logn)O(\log n)-approximation of vertex connectivity: centralized O~(m)\widetilde{O}(m) and distributed O~(D+n)\tilde{O}(D+\sqrt{n}). The former moves toward the 1974 conjecture of Aho, Hopcroft, and Ullman postulating an O(m)O(m) centralized exact algorithm while the latter is the first distributed vertex connectivity approximation

    Optimal Dynamic Distributed MIS

    Full text link
    Finding a maximal independent set (MIS) in a graph is a cornerstone task in distributed computing. The local nature of an MIS allows for fast solutions in a static distributed setting, which are logarithmic in the number of nodes or in their degrees. The result trivially applies for the dynamic distributed model, in which edges or nodes may be inserted or deleted. In this paper, we take a different approach which exploits locality to the extreme, and show how to update an MIS in a dynamic distributed setting, either \emph{synchronous} or \emph{asynchronous}, with only \emph{a single adjustment} and in a single round, in expectation. These strong guarantees hold for the \emph{complete fully dynamic} setting: Insertions and deletions, of edges as well as nodes, gracefully and abruptly. This strongly separates the static and dynamic distributed models, as super-constant lower bounds exist for computing an MIS in the former. Our results are obtained by a novel analysis of the surprisingly simple solution of carefully simulating the greedy \emph{sequential} MIS algorithm with a random ordering of the nodes. As such, our algorithm has a direct application as a 33-approximation algorithm for correlation clustering. This adds to the important toolbox of distributed graph decompositions, which are widely used as crucial building blocks in distributed computing. Finally, our algorithm enjoys a useful \emph{history-independence} property, meaning the output is independent of the history of topology changes that constructed that graph. This means the output cannot be chosen, or even biased, by the adversary in case its goal is to prevent us from optimizing some objective function.Comment: 19 pages including appendix and reference

    Edge-Stable Equimatchable Graphs

    Full text link
    A graph GG is \emph{equimatchable} if every maximal matching of GG has the same cardinality. We are interested in equimatchable graphs such that the removal of any edge from the graph preserves the equimatchability. We call an equimatchable graph GG \emph{edge-stable} if GeG\setminus {e}, that is the graph obtained by the removal of edge ee from GG, is also equimatchable for any eE(G)e \in E(G). After noticing that edge-stable equimatchable graphs are either 2-connected factor-critical or bipartite, we characterize edge-stable equimatchable graphs. This characterization yields an O(min(n3.376,n1.5m))O(\min(n^{3.376}, n^{1.5}m)) time recognition algorithm. Lastly, we introduce and shortly discuss the related notions of edge-critical, vertex-stable and vertex-critical equimatchable graphs. In particular, we emphasize the links between our work and the well-studied notion of shedding vertices, and point out some open questions
    corecore