127 research outputs found
A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem
Many graph mining applications rely on detecting subgraphs which are
near-cliques. There exists a dichotomy between the results in the existing work
related to this problem: on the one hand the densest subgraph problem (DSP)
which maximizes the average degree over all subgraphs is solvable in polynomial
time but for many networks fails to find subgraphs which are near-cliques. On
the other hand, formulations that are geared towards finding near-cliques are
NP-hard and frequently inapproximable due to connections with the Maximum
Clique problem.
In this work, we propose a formulation which combines the best of both
worlds: it is solvable in polynomial time and finds near-cliques when the DSP
fails. Surprisingly, our formulation is a simple variation of the DSP.
Specifically, we define the triangle densest subgraph problem (TDSP): given
, find a subset of vertices such that , where is the number of triangles induced
by the set . We provide various exact and approximation algorithms which the
solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to
the more general problem of maximizing the -clique average density. Finally,
we provide empirical evidence that the TDSP should be used whenever the output
of the DSP fails to output a near-clique.Comment: 42 page
Densest Subgraph in Dynamic Graph Streams
In this paper, we consider the problem of approximating the densest subgraph
in the dynamic graph stream model. In this model of computation, the input
graph is defined by an arbitrary sequence of edge insertions and deletions and
the goal is to analyze properties of the resulting graph given memory that is
sub-linear in the size of the stream. We present a single-pass algorithm that
returns a approximation of the maximum density with high
probability; the algorithm uses O(\epsilon^{-2} n \polylog n) space,
processes each stream update in \polylog (n) time, and uses \poly(n)
post-processing time where is the number of nodes. The space used by our
algorithm matches the lower bound of Bahmani et al.~(PVLDB 2012) up to a
poly-logarithmic factor for constant . The best existing results for
this problem were established recently by Bhattacharya et al.~(STOC 2015). They
presented a approximation algorithm using similar space and
another algorithm that both processed each update and maintained a
approximation of the current maximum density in \polylog (n)
time per-update.Comment: To appear in MFCS 201
Fully Dynamic Algorithm for Top- Densest Subgraphs
Given a large graph, the densest-subgraph problem asks to find a subgraph
with maximum average degree. When considering the top- version of this
problem, a na\"ive solution is to iteratively find the densest subgraph and
remove it in each iteration. However, such a solution is impractical due to
high processing cost. The problem is further complicated when dealing with
dynamic graphs, since adding or removing an edge requires re-running the
algorithm. In this paper, we study the top- densest-subgraph problem in the
sliding-window model and propose an efficient fully-dynamic algorithm. The
input of our algorithm consists of an edge stream, and the goal is to find the
node-disjoint subgraphs that maximize the sum of their densities. In contrast
to existing state-of-the-art solutions that require iterating over the entire
graph upon any update, our algorithm profits from the observation that updates
only affect a limited region of the graph. Therefore, the top- densest
subgraphs are maintained by only applying local updates. We provide a
theoretical analysis of the proposed algorithm and show empirically that the
algorithm often generates denser subgraphs than state-of-the-art competitors.
Experiments show an improvement in efficiency of up to five orders of magnitude
compared to state-of-the-art solutions.Comment: 10 pages, 8 figures, accepted at CIKM 201
Space- and Time-Efficient Algorithm for Maintaining Dense Subgraphs on One-Pass Dynamic Streams
While in many graph mining applications it is crucial to handle a stream of
updates efficiently in terms of {\em both} time and space, not much was known
about achieving such type of algorithm. In this paper we study this issue for a
problem which lies at the core of many graph mining applications called {\em
densest subgraph problem}. We develop an algorithm that achieves time- and
space-efficiency for this problem simultaneously. It is one of the first of its
kind for graph problems to the best of our knowledge.
In a graph , the "density" of a subgraph induced by a subset of
nodes is defined as , where is the set of
edges in with both endpoints in . In the densest subgraph problem, the
goal is to find a subset of nodes that maximizes the density of the
corresponding induced subgraph. For any , we present a dynamic
algorithm that, with high probability, maintains a -approximation
to the densest subgraph problem under a sequence of edge insertions and
deletions in a graph with nodes. It uses space, and has an
amortized update time of and a query time of . Here,
hides a O(\poly\log_{1+\epsilon} n) term. The approximation ratio
can be improved to at the cost of increasing the query time to
. It can be extended to a -approximation
sublinear-time algorithm and a distributed-streaming algorithm. Our algorithm
is the first streaming algorithm that can maintain the densest subgraph in {\em
one pass}. The previously best algorithm in this setting required
passes [Bahmani, Kumar and Vassilvitskii, VLDB'12]. The space required by our
algorithm is tight up to a polylogarithmic factor.Comment: A preliminary version of this paper appeared in STOC 201
Equivalence Classes and Conditional Hardness in Massively Parallel Computations
The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ? NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems.
In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model
Robust Densest Subgraph Discovery
Dense subgraph discovery is an important primitive in graph mining, which has
a wide variety of applications in diverse domains. In the densest subgraph
problem, given an undirected graph with an edge-weight vector
, we aim to find that maximizes the density,
i.e., , where is the sum of the weights of the edges in the
subgraph induced by . Although the densest subgraph problem is one of the
most well-studied optimization problems for dense subgraph discovery, there is
an implicit strong assumption; it is assumed that the weights of all the edges
are known exactly as input. In real-world applications, there are often cases
where we have only uncertain information of the edge weights. In this study, we
provide a framework for dense subgraph discovery under the uncertainty of edge
weights. Specifically, we address such an uncertainty issue using the theory of
robust optimization. First, we formulate our fundamental problem, the robust
densest subgraph problem, and present a simple algorithm. We then formulate the
robust densest subgraph problem with sampling oracle that models dense subgraph
discovery using an edge-weight sampling oracle, and present an algorithm with a
strong theoretical performance guarantee. Computational experiments using both
synthetic graphs and popular real-world graphs demonstrate the effectiveness of
our proposed algorithms.Comment: 10 pages; Accepted to ICDM 201
- …