33,079 research outputs found

    Constant Factor Approximation for Balanced Cut in the PIE model

    We propose and study a new semi-random semi-adversarial model for Balanced Cut, a planted model with permutation-invariant random edges (PIE). Our model is much more general than planted models considered previously. Consider a set of vertices V partitioned into two clusters L and R of equal size. Let G be an arbitrary graph on V with no edges between L and R. Let E_random be a set of edges sampled from an arbitrary permutation-invariant distribution (a distribution that is invariant under permutations of the vertices in L and in R). Then we say that G + E_random is a graph with permutation-invariant random edges. We present an approximation algorithm for the Balanced Cut problem that finds a balanced cut of cost O(|E_random|) + n polylog(n) in this model. In the regime when |E_random| = Ω(n polylog(n)), this is a constant factor approximation with respect to the cost of the planted cut.
    Comment: Full version of the paper at the 46th ACM Symposium on the Theory of Computing (STOC 2014). 32 pages.
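    To make the model concrete, here is a small, hypothetical Python sketch of how a PIE instance could be generated. The helper name, the parameters, and the choice of i.i.d. noise edges (one simple permutation-invariant distribution; the model allows far more general ones) are illustrative assumptions, not anything specified in the paper.

```python
import random
import itertools

def pie_instance(n, planted_edge_prob=0.3, noise_edge_prob=0.05, seed=0):
    """Toy generator for a PIE instance (hypothetical helper, not from the paper).

    Vertices 0..n-1 form cluster L and n..2n-1 form cluster R.  G is an
    arbitrary graph with no edges crossing the (L, R) cut (here: random edges
    inside each side), and E_random is drawn from one particular
    permutation-invariant distribution: i.i.d. edges, which is invariant
    under permuting the vertices of L and of R.
    """
    rng = random.Random(seed)
    L = list(range(n))
    R = list(range(n, 2 * n))

    # Arbitrary graph G with no edges between L and R.
    G = set()
    for side in (L, R):
        for u, v in itertools.combinations(side, 2):
            if rng.random() < planted_edge_prob:
                G.add((u, v))

    # E_random: each remaining edge appears independently -- one example of a
    # permutation-invariant distribution.
    E_random = set()
    for u, v in itertools.combinations(L + R, 2):
        if (u, v) not in G and rng.random() < noise_edge_prob:
            E_random.add((u, v))

    return G, E_random, (set(L), set(R))
```

    Note that the planted cut (L, R) cuts only edges of E_random, so a balanced cut of cost O(|E_random|) + n polylog(n) is within a constant factor of the planted cut's cost whenever |E_random| = Ω(n polylog(n)), as the abstract states.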

    Matching Is as Easy as the Decision Problem, in the NC Model

    Is matching in NC, i.e., is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in TCS for over three decades, ever since the discovery of randomized NC matching algorithms [KUW85, MVV87]. Over the last five years, the theoretical computer science community has launched a relentless attack on this question, leading to the discovery of several powerful ideas. We give what appears to be the culmination of this line of work: an NC algorithm for finding a minimum-weight perfect matching in a general graph with polynomially bounded edge weights, provided it is given an oracle for the decision problem. Consequently, for settling the main open problem, it suffices to obtain an NC algorithm for the decision problem. We believe this new fact has qualitatively changed the nature of this open problem. All known efficient matching algorithms for general graphs follow one of two approaches, given by Edmonds [Edm65] and Lovász [Lov79]. Our oracle-based algorithm follows a new approach and uses many of the ideas discovered in the last five years. The difficulty of obtaining an NC perfect matching algorithm led researchers to study matching vis-à-vis clever relaxations of the class NC. In this vein, Goldwasser and Grossman [GG15] recently gave a pseudo-deterministic RNC algorithm for finding a perfect matching in a bipartite graph, i.e., an RNC algorithm with the additional requirement that, on the same graph, it should return the same (i.e., unique) perfect matching for almost all choices of random bits. A corollary of our reduction is an analogous algorithm for general graphs.
    Comment: Appeared in ITCS 2020.
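    The abstract's central point is that a decision oracle suffices to recover a matching. As a point of comparison, the classic sequential search-to-decision self-reduction is sketched below in Python (using networkx for the oracle). This is emphatically not the paper's NC construction, which must avoid exactly this kind of long chain of adaptive oracle calls; it only illustrates why a decision oracle is enough.

```python
# pip install networkx
import networkx as nx

def has_perfect_matching(G):
    """Decision oracle: does G contain a perfect matching?  Implemented here
    with networkx's blossom-based matching; in the paper this role would be
    played by a hypothetical NC decision oracle."""
    if G.number_of_nodes() % 2:
        return False
    matching = nx.max_weight_matching(G, maxcardinality=True)
    return 2 * len(matching) == G.number_of_nodes()

def find_perfect_matching(G, oracle=has_perfect_matching):
    """Textbook sequential self-reduction from search to decision.

    Not the NC algorithm of the paper -- it makes |E| adaptive oracle calls
    one after another -- but it shows how a decision oracle yields a matching.
    """
    if not oracle(G):
        return None
    H = G.copy()
    for u, v in list(G.edges()):
        H.remove_edge(u, v)
        if not oracle(H):        # every perfect matching of H needs (u, v),
            H.add_edge(u, v)     # so keep the edge
    return set(H.edges())        # the surviving edges form a perfect matching

if __name__ == "__main__":
    G = nx.petersen_graph()      # 3-regular bridgeless graph; has a perfect matching
    print(find_perfect_matching(G))
```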

    Clustering and Community Detection in Directed Networks: A Survey

    Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of the aforementioned cases, graphs are directed, in the sense that the edges carry directionality, making their semantics non-symmetric. An interesting feature that real networks present is the clustering or community structure property, under which the graph topology is organized into modules commonly called communities or clusters. The essence here is that nodes within the same community are highly similar, while nodes across communities exhibit low similarity. Revealing the underlying community structure of directed complex networks has become a crucial and interdisciplinary topic with a plethora of applications. Naturally, therefore, there has been a recent wealth of research on mining directed graphs, with clustering being the primary method and tool for community detection and evaluation. The goal of this paper is to offer an in-depth review of the methods presented so far for clustering directed networks, along with the necessary methodological background and related applications. The survey commences by offering a concise review of the fundamental concepts and methodological base on which graph clustering algorithms capitalize. Then we present the relevant work along two orthogonal classifications. The first is mostly concerned with the methodological principles of the clustering algorithms, while the second approaches the methods from the viewpoint of the properties of a good cluster in a directed network. Further, we present methods and metrics for evaluating graph clustering results, demonstrate interesting application domains and provide promising future research directions.
    Comment: 86 pages, 17 figures. Physics Reports Journal (to appear).

    GraphX: Unifying Data-Parallel and Graph-Parallel Analytics

    From social networks to language modeling, the growing scale and importance of graph data has driven the development of numerous new graph-parallel systems (e.g., Pregel, GraphLab). By restricting the computation that can be expressed and introducing new techniques to partition and distribute the graph, these systems can efficiently execute iterative graph algorithms orders of magnitude faster than more general data-parallel systems. However, the same restrictions that enable the performance gains also make it difficult to express many of the important stages in a typical graph-analytics pipeline: constructing the graph, modifying its structure, or expressing computation that spans multiple graphs. As a consequence, existing graph analytics pipelines compose graph-parallel and data-parallel systems using external storage systems, leading to extensive data movement and a complicated programming model. To address these challenges, we introduce GraphX, a distributed graph computation framework that unifies graph-parallel and data-parallel computation. GraphX provides a small, core set of graph-parallel operators expressive enough to implement the Pregel and PowerGraph abstractions, yet simple enough to be cast in relational algebra. GraphX uses a collection of query optimization techniques such as automatic join rewrites to efficiently implement these graph-parallel operators. We evaluate GraphX on real-world graphs and workloads and demonstrate that GraphX achieves performance comparable to specialized graph computation systems, while outperforming them in end-to-end graph pipelines. Moreover, GraphX achieves a balance between expressiveness, performance, and ease of use.
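    GraphX itself exposes a Scala API on Spark; the following is only a single-machine, pure-Python toy (the names and signatures are invented for illustration, not GraphX's actual API) of the idea the abstract describes: store the graph as a vertex table and an edge table, and implement a graph-parallel neighborhood aggregation as a join followed by a group-by, which is what makes such operators expressible in relational algebra.

```python
def aggregate_neighbors(vertices, edges, send, merge):
    """Toy, single-machine sketch of a join-based neighborhood aggregation.

    vertices: dict {vertex_id: attribute}        (the "vertex table")
    edges:    list of (src, dst, edge_attr)      (the "edge table")
    send:     (src_attr, dst_attr, edge_attr) -> message for dst
    merge:    (msg, msg) -> combined msg
    """
    inbox = {}
    # "Join" each edge with the attributes of its endpoints, then group the
    # resulting messages by destination: a join followed by an aggregation.
    for src, dst, eattr in edges:
        msg = send(vertices[src], vertices[dst], eattr)
        inbox[dst] = merge(inbox[dst], msg) if dst in inbox else msg
    return inbox

# Example: one gather step summing in-neighbor values over a tiny graph.
vertices = {1: 1.0, 2: 1.0, 3: 1.0}
edges = [(1, 2, None), (1, 3, None), (2, 3, None)]
sums = aggregate_neighbors(
    vertices, edges,
    send=lambda s, d, e: s,        # each source sends its current value
    merge=lambda a, b: a + b)      # destination sums what it receives
print(sums)                        # {2: 1.0, 3: 2.0}
```

    GraphX applies this same pattern to distributed vertex and edge collections and, per the abstract, relies on query-optimization techniques such as automatic join rewrites to execute it efficiently.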

    Improved bounds and algorithms for graph cuts and network reliability

    Karger (SIAM Journal on Computing, 1999) developed the first fully-polynomial approximation scheme to estimate the probability that a graph G becomes disconnected, given that its edges are removed independently with probability p. This algorithm runs in n^{5+o(1)} ε^{-3} time to obtain an estimate within relative error ε. We improve this run-time through algorithmic and graph-theoretic advances. First, there is a certain key sub-problem encountered by Karger for which a generic estimation procedure is employed; we show that this sub-problem has a special structure for which a much more efficient algorithm can be used. Second, we show better bounds on the number of edge cuts which are likely to fail. Here, Karger's analysis uses a variety of bounds for various graph parameters; we show that these bounds cannot be simultaneously tight. We describe a new graph parameter which simultaneously influences all the bounds used by Karger, and obtain much tighter estimates of the cut structure of G. These techniques allow us to improve the runtime to n^{3+o(1)} ε^{-2}; our results also rigorously prove certain experimental observations of Karger & Tai (Proc. ACM-SIAM Symposium on Discrete Algorithms, 1997). Our rigorous proofs are motivated by certain non-rigorous differential-equation approximations which, however, provably track the worst-case trajectories of the relevant parameters. A key driver of Karger's approach (and other cut-related results) is a bound on the number of small cuts: we improve these estimates when the min-cut size is "small" and odd, augmenting, in part, a result of Bixby (Bulletin of the AMS, 1974).
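    For orientation, the quantity being estimated is Pr[G becomes disconnected] when every edge fails independently with probability p. The following naive Monte Carlo baseline (a Python sketch with invented names; it is not Karger's FPRAS nor the improved algorithm) makes the ε-dependence concrete: when the true failure probability is tiny, plain sampling needs far more trials than the n^{3+o(1)} ε^{-2} algorithm to reach relative error ε.

```python
import random

def _connected(n, edges):
    """Union-find connectivity test on vertices 0..n-1."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1

def mc_unreliability(n, edges, p, samples=100_000, seed=0):
    """Naive Monte Carlo estimate of Pr[G disconnects] when each edge is
    removed independently with probability p.  Brute-force baseline only."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(samples):
        surviving = [e for e in edges if rng.random() >= p]
        if not _connected(n, surviving):
            failures += 1
    return failures / samples

if __name__ == "__main__":
    # 4-cycle: it disconnects iff at least 2 edges fail, so for p = 0.1 the
    # exact answer is 1 - 0.9**4 - 4 * 0.1 * 0.9**3 ≈ 0.0523.
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(mc_unreliability(4, cycle, p=0.1))
```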