7,185 research outputs found
Fully Dynamic Algorithm for Top- Densest Subgraphs
Given a large graph, the densest-subgraph problem asks to find a subgraph
with maximum average degree. When considering the top- version of this
problem, a na\"ive solution is to iteratively find the densest subgraph and
remove it in each iteration. However, such a solution is impractical due to
high processing cost. The problem is further complicated when dealing with
dynamic graphs, since adding or removing an edge requires re-running the
algorithm. In this paper, we study the top- densest-subgraph problem in the
sliding-window model and propose an efficient fully-dynamic algorithm. The
input of our algorithm consists of an edge stream, and the goal is to find the
node-disjoint subgraphs that maximize the sum of their densities. In contrast
to existing state-of-the-art solutions that require iterating over the entire
graph upon any update, our algorithm profits from the observation that updates
only affect a limited region of the graph. Therefore, the top- densest
subgraphs are maintained by only applying local updates. We provide a
theoretical analysis of the proposed algorithm and show empirically that the
algorithm often generates denser subgraphs than state-of-the-art competitors.
Experiments show an improvement in efficiency of up to five orders of magnitude
compared to state-of-the-art solutions.Comment: 10 pages, 8 figures, accepted at CIKM 201
Approximating ATSP by Relaxing Connectivity
The standard LP relaxation of the asymmetric traveling salesman problem has
been conjectured to have a constant integrality gap in the metric case. We
prove this conjecture when restricted to shortest path metrics of node-weighted
digraphs. Our arguments are constructive and give a constant factor
approximation algorithm for these metrics. We remark that the considered case
is more general than the directed analog of the special case of the symmetric
traveling salesman problem for which there were recent improvements on
Christofides' algorithm.
The main idea of our approach is to first consider an easier problem obtained
by significantly relaxing the general connectivity requirements into local
connectivity conditions. For this relaxed problem, it is quite easy to give an
algorithm with a guarantee of 3 on node-weighted shortest path metrics. More
surprisingly, we then show that any algorithm (irrespective of the metric) for
the relaxed problem can be turned into an algorithm for the asymmetric
traveling salesman problem by only losing a small constant factor in the
performance guarantee. This leaves open the intriguing task of designing a
"good" algorithm for the relaxed problem on general metrics.Comment: 25 pages, 2 figures, fixed some typos in previous versio
Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs
As massive graphs become more prevalent, there is a rapidly growing need for
scalable algorithms that solve classical graph problems, such as maximum
matching and minimum vertex cover, on large datasets. For massive inputs,
several different computational models have been introduced, including the
streaming model, the distributed communication model, and the massively
parallel computation (MPC) model that is a common abstraction of
MapReduce-style computation. In each model, algorithms are analyzed in terms of
resources such as space used or rounds of communication needed, in addition to
the more traditional approximation ratio.
In this paper, we give a single unified approach that yields better
approximation algorithms for matching and vertex cover in all these models. The
highlights include:
* The first one pass, significantly-better-than-2-approximation for matching
in random arrival streams that uses subquadratic space, namely a
-approximation streaming algorithm that uses space
for constant .
* The first 2-round, better-than-2-approximation for matching in the MPC
model that uses subquadratic space per machine, namely a
-approximation algorithm with memory per
machine for constant .
By building on our unified approach, we further develop parallel algorithms
in the MPC model that give a -approximation to matching and an
-approximation to vertex cover in only MPC rounds and
memory per machine. These results settle multiple open
questions posed in the recent paper of Czumaj~et.al. [STOC 2018]
Detecting High Log-Densities -- an O(n^1/4) Approximation for Densest k-Subgraph
In the Densest k-Subgraph problem, given a graph G and a parameter k, one
needs to find a subgraph of G induced on k vertices that contains the largest
number of edges. There is a significant gap between the best known upper and
lower bounds for this problem. It is NP-hard, and does not have a PTAS unless
NP has subexponential time algorithms. On the other hand, the current best
known algorithm of Feige, Kortsarz and Peleg, gives an approximation ratio of
n^(1/3-epsilon) for some specific epsilon > 0 (estimated at around 1/60).
We present an algorithm that for every epsilon > 0 approximates the Densest
k-Subgraph problem within a ratio of n^(1/4+epsilon) in time n^O(1/epsilon). In
particular, our algorithm achieves an approximation ratio of O(n^1/4) in time
n^O(log n). Our algorithm is inspired by studying an average-case version of
the problem where the goal is to distinguish random graphs from graphs with
planted dense subgraphs. The approximation ratio we achieve for the general
case matches the distinguishing ratio we obtain for this planted problem.
At a high level, our algorithms involve cleverly counting appropriately
defined trees of constant size in G, and using these counts to identify the
vertices of the dense subgraph. Our algorithm is based on the following
principle. We say that a graph G(V,E) has log-density alpha if its average
degree is Theta(|V|^alpha). The algorithmic core of our result is a family of
algorithms that output k-subgraphs of nontrivial density whenever the
log-density of the densest k-subgraph is larger than the log-density of the
host graph.Comment: 23 page
A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem
Many graph mining applications rely on detecting subgraphs which are
near-cliques. There exists a dichotomy between the results in the existing work
related to this problem: on the one hand the densest subgraph problem (DSP)
which maximizes the average degree over all subgraphs is solvable in polynomial
time but for many networks fails to find subgraphs which are near-cliques. On
the other hand, formulations that are geared towards finding near-cliques are
NP-hard and frequently inapproximable due to connections with the Maximum
Clique problem.
In this work, we propose a formulation which combines the best of both
worlds: it is solvable in polynomial time and finds near-cliques when the DSP
fails. Surprisingly, our formulation is a simple variation of the DSP.
Specifically, we define the triangle densest subgraph problem (TDSP): given
, find a subset of vertices such that , where is the number of triangles induced
by the set . We provide various exact and approximation algorithms which the
solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to
the more general problem of maximizing the -clique average density. Finally,
we provide empirical evidence that the TDSP should be used whenever the output
of the DSP fails to output a near-clique.Comment: 42 page
Approximate Closest Community Search in Networks
Recently, there has been significant interest in the study of the community
search problem in social and information networks: given one or more query
nodes, find densely connected communities containing the query nodes. However,
most existing studies do not address the "free rider" issue, that is, nodes far
away from query nodes and irrelevant to them are included in the detected
community. Some state-of-the-art models have attempted to address this issue,
but not only are their formulated problems NP-hard, they do not admit any
approximations without restrictive assumptions, which may not always hold in
practice.
In this paper, given an undirected graph G and a set of query nodes Q, we
study community search using the k-truss based community model. We formulate
our problem of finding a closest truss community (CTC), as finding a connected
k-truss subgraph with the largest k that contains Q, and has the minimum
diameter among such subgraphs. We prove this problem is NP-hard. Furthermore,
it is NP-hard to approximate the problem within a factor , for
any . However, we develop a greedy algorithmic framework,
which first finds a CTC containing Q, and then iteratively removes the furthest
nodes from Q, from the graph. The method achieves 2-approximation to the
optimal solution. To further improve the efficiency, we make use of a compact
truss index and develop efficient algorithms for k-truss identification and
maintenance as nodes get eliminated. In addition, using bulk deletion
optimization and local exploration strategies, we propose two more efficient
algorithms. One of them trades some approximation quality for efficiency while
the other is a very efficient heuristic. Extensive experiments on 6 real-world
networks show the effectiveness and efficiency of our community model and
search algorithms
- …