23 research outputs found
Linear Time Subgraph Counting, Graph Degeneracy, and the Chasm at Size Six
We consider the problem of counting all k-vertex subgraphs in an input graph, for any constant k. This problem (denoted SUB-CNT_k) has been studied extensively in both theory and practice. In a classic result, Chiba and Nishizeki (SICOMP 85) gave linear time algorithms for clique and 4-cycle counting for bounded degeneracy graphs. This is a rich class of sparse graphs that contains, for example, all minor-free families and preferential attachment graphs. The techniques from this result have inspired a number of recent practical algorithms for SUB-CNT_k. Towards a better understanding of the limits of these techniques, we ask: for what values of k can SUB_CNT_k be solved in linear time?
We discover a chasm at k=6. Specifically, we prove that for k < 6, SUB_CNT_k can be solved in linear time. Assuming a standard conjecture in fine-grained complexity, we prove that for all k ? 6, SUB-CNT_k cannot be solved even in near-linear time
Parallel Five-Cycle Counting Algorithms
Counting the frequency of subgraphs in large networks is a classic research question that reveals the underlying substructures of these networks for important applications. However, subgraph counting is a challenging problem, even for subgraph sizes as small as five, due to the combinatorial explosion in the number of possible occurrences. This paper focuses on the five-cycle, which is an important special case of five-vertex subgraph counting and one of the most difficult to count efficiently.
We design two new parallel five-cycle counting algorithms and prove that they are work-efficient and achieve polylogarithmic span. Both algorithms are based on computing low out-degree orientations, which enables the efficient computation of directed two-paths and three-paths, and the algorithms differ in the ways in which they use this orientation to eliminate double-counting. We develop fast multicore implementations of the algorithms and propose a work scheduling optimization to improve their performance. Our experiments on a variety of real-world graphs using a 36-core machine with two-way hyper-threading show that our algorithms achieves 10-46x self-relative speed-up, outperform our serial benchmarks by 10-32x, and outperform the previous state-of-the-art serial algorithm by up to 818x
Counting Subgraphs in Somewhere Dense Graphs
We study the problems of counting copies and induced copies of a small pattern graph H in a large host graph G. Recent work fully classified the complexity of those problems according to structural restrictions on the patterns H. In this work, we address the more challenging task of analysing the complexity for restricted patterns and restricted hosts. Specifically we ask which families of allowed patterns and hosts imply fixed-parameter tractability, i.e., the existence of an algorithm running in time f(H)?|G|^O(1) for some computable function f. Our main results present exhaustive and explicit complexity classifications for families that satisfy natural closure properties. Among others, we identify the problems of counting small matchings and independent sets in subgraph-closed graph classes ? as our central objects of study and establish the following crisp dichotomies as consequences of the Exponential Time Hypothesis:
- Counting k-matchings in a graph G ? ? is fixed-parameter tractable if and only if ? is nowhere dense.
- Counting k-independent sets in a graph G ? ? is fixed-parameter tractable if and only if ? is nowhere dense. Moreover, we obtain almost tight conditional lower bounds if ? is somewhere dense, i.e., not nowhere dense. These base cases of our classifications subsume a wide variety of previous results on the matching and independent set problem, such as counting k-matchings in bipartite graphs (Curticapean, Marx; FOCS 14), in F-colourable graphs (Roth, Wellnitz; SODA 20), and in degenerate graphs (Bressan, Roth; FOCS 21), as well as counting k-independent sets in bipartite graphs (Curticapean et al.; Algorithmica 19).
At the same time our proofs are much simpler: using structural characterisations of somewhere dense graphs, we show that a colourful version of a recent breakthrough technique for analysing pattern counting problems (Curticapean, Dell, Marx; STOC 17) applies to any subgraph-closed somewhere dense class of graphs, yielding a unified view of our current understanding of the complexity of subgraph counting
Computing complexity measures of degenerate graphs
We show that the VC-dimension of a graph can be computed in time , where is the degeneracy of the input graph. The core idea
of our algorithm is a data structure to efficiently query the number of
vertices that see a specific subset of vertices inside of a (small) query set.
The construction of this data structure takes time , afterwards
queries can be computed efficiently using fast M\"obius inversion.
This data structure turns out to be useful for a range of tasks, especially
for finding bipartite patterns in degenerate graphs, and we outline an
efficient algorithms for counting the number of times specific patterns occur
in a graph. The largest factor in the running time of this algorithm is
, where is a parameter of the pattern we call its left covering
number.
Concrete applications of this algorithm include counting the number of
(non-induced) bicliques in linear time, the number of co-matchings in quadratic
time, as well as a constant-factor approximation of the ladder index in linear
time.
Finally, we supplement our theoretical results with several implementations
and run experiments on more than 200 real-world datasets -- the largest of
which has 8 million edges -- where we obtain interesting insights into the
VC-dimension of real-world networks.Comment: Accepted for publication in the 18th International Symposium on
Parameterized and Exact Computation (IPEC 2023
Parallel Algorithms for Small Subgraph Counting
Subgraph counting is a fundamental problem in analyzing massive graphs, often
studied in the context of social and complex networks. There is a rich
literature on designing efficient, accurate, and scalable algorithms for this
problem. In this work, we tackle this challenge and design several new
algorithms for subgraph counting in the Massively Parallel Computation (MPC)
model:
Given a graph over vertices, edges and triangles, our first
main result is an algorithm that, with high probability, outputs a
-approximation to , with optimal round and space complexity
provided any space per machine, assuming
.
Our second main result is an -rounds
algorithm for exactly counting the number of triangles, parametrized by the
arboricity of the input graph. The space per machine is
for any constant , and the total space is ,
which matches the time complexity of (combinatorial) triangle counting in the
sequential model. We also prove that this result can be extended to exactly
counting -cliques for any constant , with the same round complexity and
total space . Alternatively, allowing space per
machine, the total space requirement reduces to .
Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri
(ITCS 2020) for exactly counting all subgraphs of size at most , can be
implemented in the MPC model in rounds,
space per machine and total space. Therefore,
this result also exhibits the phenomenon that a time bound in the sequential
model translates to a space bound in the MPC model