33,079 research outputs found
Constant Factor Approximation for Balanced Cut in the PIE model
We propose and study a new semi-random semi-adversarial model for Balanced
Cut, a planted model with permutation-invariant random edges (PIE). Our model
is much more general than planted models considered previously. Consider a set
of vertices $V$ partitioned into two clusters $L$ and $R$ of equal size. Let $G$
be an arbitrary graph on $V$ with no edges between $L$ and $R$. Let
$E_{random}$ be a set of edges sampled from an arbitrary permutation-invariant
distribution (a distribution that is invariant under permutation of vertices in
$L$ and in $R$). Then we say that $G + E_{random}$ is a graph with
permutation-invariant random edges.
We present an approximation algorithm for the Balanced Cut problem that finds
a balanced cut of cost $O(|E_{random}|) + n \, \mathrm{polylog}(n)$ in this model.
In the regime when $|E_{random}| = \Omega(n \, \mathrm{polylog}(n))$, this is a
constant factor approximation with respect to the cost of the planted cut.
Comment: Full version of the paper at the 46th ACM Symposium on the Theory of
Computing (STOC 2014). 32 pages
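The planted setup in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only builds a toy instance with two equal-size clusters and measures the cost of the planted cut. The function names, the restriction of random edges to cluster-crossing pairs, and the choice of independent coin flips (one simple permutation-invariant distribution among many) are all assumptions made for illustration.

```python
import random

def planted_instance(n_per_side, inner_edges, cross_prob, seed=0):
    """Toy planted instance: fixed edges inside each cluster plus random crossing edges.

    Hypothetical helper: the first n_per_side vertices form one cluster,
    the next n_per_side vertices form the other.
    """
    rng = random.Random(seed)
    left = list(range(n_per_side))
    right = list(range(n_per_side, 2 * n_per_side))
    left_set = set(left)
    # Keep only edges whose endpoints lie in the same cluster (the arbitrary part).
    fixed = {(u, v) for u, v in inner_edges if (u in left_set) == (v in left_set)}
    # Independent coin flips per crossing pair: invariant under relabeling
    # vertices within each cluster.
    random_edges = {(u, v) for u in left for v in right if rng.random() < cross_prob}
    return left, right, fixed, random_edges

def cut_cost(edges, side):
    """Number of edges with exactly one endpoint in `side`."""
    side = set(side)
    return sum((u in side) != (v in side) for u, v in edges)
```

In this sketch the planted cut is crossed only by the random edges, so its cost equals the number of random edges sampled; an approximation algorithm would be judged against that value.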
Matching Is as Easy as the Decision Problem, in the NC Model
Is matching in NC, i.e., is there a deterministic fast parallel algorithm for
it? This has been an outstanding open question in TCS for over three decades,
ever since the discovery of randomized NC matching algorithms [KUW85, MVV87].
Over the last five years, the theoretical computer science community has
launched a relentless attack on this question, leading to the discovery of
several powerful ideas. We give what appears to be the culmination of this line
of work: An NC algorithm for finding a minimum-weight perfect matching in a
general graph with polynomially bounded edge weights, provided it is given an
oracle for the decision problem. Consequently, for settling the main open
problem, it suffices to obtain an NC algorithm for the decision problem. We
believe this new fact has qualitatively changed the nature of this open
problem.
All known efficient matching algorithms for general graphs follow one of two
approaches, given by Edmonds [Edm65] and Lovász [Lov79]. Our oracle-based
algorithm follows a new approach and uses many of the ideas discovered in the
last five years.
The difficulty of obtaining an NC perfect matching algorithm led researchers
to study matching vis-a-vis clever relaxations of the class NC. In this vein,
recently Goldwasser and Grossman [GG15] gave a pseudo-deterministic RNC
algorithm for finding a perfect matching in a bipartite graph, i.e., an RNC
algorithm with the additional requirement that on the same graph, it should
return the same (i.e., unique) perfect matching for almost all choices of
random bits. A corollary of our reduction is an analogous algorithm for general
graphs.
Comment: Appeared in ITCS 202
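To make the idea of reducing search to decision concrete, here is a minimal sketch of the classical sequential self-reduction for perfect matching. It is explicitly not the NC algorithm of the abstract (it makes its oracle calls one after another and ignores edge weights), and the brute-force oracle below is only an exponential-time stand-in. All function names are hypothetical.

```python
def has_perfect_matching(vertices, edges):
    """Brute-force decision oracle (exponential time; a stand-in only)."""
    if not vertices:
        return True
    v = min(vertices)  # the smallest remaining vertex must be matched to someone
    for a, b in edges:
        if v in (a, b):
            u = b if a == v else a
            if u != v and u in vertices:
                if has_perfect_matching(vertices - {u, v}, edges):
                    return True
    return False

def find_perfect_matching(vertices, edges, oracle=has_perfect_matching):
    """Build a perfect matching using only yes/no answers from `oracle`.

    An edge is committed when the graph left after deleting its endpoints
    still has a perfect matching. Returns None if no perfect matching exists.
    """
    remaining = set(vertices)
    if not oracle(remaining, edges):
        return None
    matching = []
    for a, b in edges:
        if a != b and a in remaining and b in remaining:
            if oracle(remaining - {a, b}, edges):
                matching.append((a, b))
                remaining -= {a, b}
    return matching
```

For example, on the path 0-1-2-3 the middle edge (1, 2) is rejected because deleting its endpoints leaves vertices 0 and 3 with no edge between them, while (0, 1) and (2, 3) are accepted.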
Clustering and Community Detection in Directed Networks: A Survey
Networks (or graphs) appear as dominant structures in diverse domains,
including sociology, biology, neuroscience and computer science. In most of the
aforementioned cases graphs are directed, in the sense that there is
directionality on the edges, making the semantics of the edges non-symmetric.
An interesting feature that real networks present is the clustering or
community structure property, under which the graph topology is organized into
modules commonly called communities or clusters. The essence here is that nodes
of the same community are highly similar while on the contrary, nodes across
communities present low similarity. Revealing the underlying community
structure of directed complex networks has become a crucial and
interdisciplinary topic with a plethora of applications. Naturally, there has
been a recent wealth of research in the area of mining directed
graphs, with clustering being the primary method and tool for community
detection and evaluation. The goal of this paper is to offer an in-depth review
of the methods presented so far for clustering directed networks along with the
relevant necessary methodological background and also related applications. The
survey commences by offering a concise review of the fundamental concepts and
methodological base on which graph clustering algorithms capitalize. Then we
present the relevant work along two orthogonal classifications. The first one
is mostly concerned with the methodological principles of the clustering
algorithms, while the second one approaches the methods from the viewpoint
of the properties of a good cluster in a directed network. Further, we
present methods and metrics for evaluating graph clustering results,
demonstrate interesting application domains and provide promising future
research directions.
Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear)
GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
From social networks to language modeling, the growing scale and importance
of graph data has driven the development of numerous new graph-parallel systems
(e.g., Pregel, GraphLab). By restricting the computation that can be expressed
and introducing new techniques to partition and distribute the graph, these
systems can efficiently execute iterative graph algorithms orders of magnitude
faster than more general data-parallel systems. However, the same restrictions
that enable the performance gains also make it difficult to express many of the
important stages in a typical graph-analytics pipeline: constructing the graph,
modifying its structure, or expressing computation that spans multiple graphs.
As a consequence, existing graph analytics pipelines compose graph-parallel and
data-parallel systems using external storage systems, leading to extensive data
movement and a complicated programming model.
To address these challenges we introduce GraphX, a distributed graph
computation framework that unifies graph-parallel and data-parallel
computation. GraphX provides a small, core set of graph-parallel operators
expressive enough to implement the Pregel and PowerGraph abstractions, yet
simple enough to be cast in relational algebra. GraphX uses a collection of
query optimization techniques such as automatic join rewrites to efficiently
implement these graph-parallel operators. We evaluate GraphX on real-world
graphs and workloads and demonstrate that GraphX achieves performance
comparable to that of specialized graph computation systems, while outperforming them
in end-to-end graph pipelines. Moreover, GraphX achieves a balance between
expressiveness, performance, and ease of use.
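The graph-parallel abstraction mentioned above can be illustrated with a small single-machine sketch of a Pregel-style loop. This is not GraphX code and does not use its API; the function names and the `send` / `merge` / `update` split are assumptions loosely modeled on the vertex-program pattern of Pregel-style systems. The example computes connected components by propagating the smallest vertex label.

```python
def pregel(vertices, edges, initial, send, merge, update, max_steps=50):
    """Run synchronous supersteps until no messages are produced.

    Hypothetical mini-framework: `send` yields (target, message) pairs per edge,
    `merge` combines messages bound for the same vertex, and `update` computes
    the new vertex state from the old state and the merged message.
    """
    state = {v: initial(v) for v in vertices}
    for _ in range(max_steps):
        inbox = {}
        for src, dst in edges:
            for target, msg in send(src, state[src], dst, state[dst]):
                inbox[target] = merge(inbox[target], msg) if target in inbox else msg
        if not inbox:
            break  # no vertex received a message: the computation has converged
        for v, msg in inbox.items():
            state[v] = update(v, state[v], msg)
    return state

def connected_components(vertices, edges):
    """Label each vertex with the smallest vertex id in its component."""
    def send(src, src_label, dst, dst_label):
        # Push the smaller label across the edge, treating it as undirected.
        if src_label < dst_label:
            yield dst, src_label
        elif dst_label < src_label:
            yield src, dst_label
    return pregel(vertices, edges, initial=lambda v: v, send=send, merge=min,
                  update=lambda v, old, msg: min(old, msg))
```

Restricting computation to this message-passing shape is what lets a real system partition and distribute the work; the same restriction is what makes steps such as graph construction awkward to express, which is the gap the abstract describes.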
Improved bounds and algorithms for graph cuts and network reliability
Karger (SIAM Journal on Computing, 1999) developed the first fully-polynomial
approximation scheme to estimate the probability that a graph $G$ becomes
disconnected, given that its edges are removed independently with probability
$p$. This algorithm runs in $n^{5+o(1)} \epsilon^{-3}$ time to obtain an
estimate within relative error $\epsilon$.
We improve this run-time through algorithmic and graph-theoretic advances.
First, there is a certain key sub-problem encountered by Karger, for which a
generic estimation procedure is employed; we show that this has a special
structure for which a much more efficient algorithm can be used. Second, we
show better bounds on the number of edge cuts which are likely to fail. Here,
Karger's analysis uses a variety of bounds for various graph parameters; we
show that these bounds cannot be simultaneously tight. We describe a new graph
parameter, which simultaneously influences all the bounds used by Karger, and
obtain much tighter estimates of the cut structure of $G$. These techniques
allow us to improve the runtime to $n^{3+o(1)} \epsilon^{-2}$; our results also
rigorously prove certain experimental observations of Karger & Tai (Proc.
ACM-SIAM Symposium on Discrete Algorithms, 1997). Our rigorous proofs are
motivated by certain non-rigorous differential-equation approximations which,
however, provably track the worst-case trajectories of the relevant parameters.
A key driver of Karger's approach (and other cut-related results) is a bound
on the number of small cuts: we improve these estimates when the min-cut size
is "small" and odd, augmenting, in part, a result of Bixby (Bulletin of the
AMS, 1974).
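The quantity being estimated in this abstract, the probability that a graph becomes disconnected when each edge fails independently, can be illustrated with a naive Monte Carlo sketch. This is not Karger's approximation scheme or the improved algorithm described above; plain sampling needs a number of trials that grows roughly like the inverse of the failure probability, which is why it breaks down for highly reliable graphs. Function names and defaults are assumptions for illustration.

```python
import random

def is_connected(n, edges):
    """Check connectivity of vertices 0..n-1 with a union-find structure."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1

def estimate_disconnect_probability(n, edges, p, trials=10000, seed=0):
    """Fraction of trials in which the graph is disconnected after edge failures.

    Each edge is removed independently with probability p.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        surviving = [e for e in edges if rng.random() >= p]
        if not is_connected(n, surviving):
            failures += 1
    return failures / trials
```

For a single edge between two vertices the exact answer is p itself, which gives a quick sanity check for the estimator.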