
3,908 research outputs found

Faster Subgraph Counting in Sparse Graphs

Author: Bressan Marco
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 14th International Symposium on Parameterized and Exact Computation (IPEC 2019)
Publication date: 01/01/2019

A fundamental graph problem asks to compute the number of induced copies of a k-node pattern graph H in an n-node graph G. The fastest algorithm to date is still the 35-year-old algorithm by Nešetřil and Poljak [Nešetřil and Poljak, 1985], with running time f(k) * O(n^{omega floor[k/3] + 2}), where omega <= 2.373 is the matrix multiplication exponent. In this work we show that, if one takes into account the degeneracy d of G, then the picture becomes substantially richer and leads to faster algorithms when G is sufficiently sparse. More precisely, after introducing a novel notion of graph width, the DAG-treewidth, we prove the following. If H has DAG-treewidth tau(H) and G has degeneracy d, then the induced copies of H in G can be counted in time f(d,k) * O~(n^{tau(H)}); and, under the Exponential Time Hypothesis, no algorithm can solve the problem in time f(d,k) * n^{o(tau(H)/ln tau(H))} for all H. This result characterises the complexity of counting subgraphs in a d-degenerate graph. By developing bounds on tau(H), we then obtain natural generalisations of classic results and faster algorithms for sparse graphs. For example, when d = O(poly log(n)) we can count the induced copies of any H in time f(k) * O~(n^{floor[k/4] + 2}), beating the Nešetřil-Poljak algorithm by essentially a cubic factor in n.
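The degeneracy d that parameterises these bounds can be computed by repeatedly removing a vertex of minimum remaining degree; d is the largest degree seen at removal time. The following is a minimal sketch of that standard peeling procedure, not code from the paper. The function name `degeneracy` and the adjacency-dictionary input format are assumptions made for illustration, and the simple minimum search makes it quadratic rather than linear time.

```python
def degeneracy(adj):
    """Return (d, order) for an undirected graph.

    adj is assumed to map each vertex to an iterable of its neighbours
    (hypothetical input format chosen for this sketch).
    """
    # Work on a private copy so the caller's graph is not modified.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    d = 0
    order = []
    while remaining:
        # Pick a vertex of minimum degree in the remaining graph.
        v = min(remaining, key=lambda u: len(remaining[u]))
        d = max(d, len(remaining[v]))
        # Remove v and update its neighbours' degrees.
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        order.append(v)
    return d, order


if __name__ == "__main__":
    # A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(degeneracy(graph)[0])
```

The returned removal order is a degeneracy ordering: every vertex has at most d neighbours that appear later in it.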

Dagstuhl Research Online Publication Server

Combinatorial algorithm for counting small induced graphs and orbits

Author: Demšar Janez
Hočevar Tomaž
Publication venue: Public Library of Science (PLoS)
Publication date: 25/01/2016

Graphlet analysis is an approach to network analysis that is particularly popular in bioinformatics. We show how to set up a system of linear equations that relate the orbit counts and can be used in an algorithm that is significantly faster than the existing approaches based on direct enumeration of graphlets. The algorithm requires the existence of a vertex with certain properties; we show that such a vertex exists for graphlets of arbitrary size, except for complete graphs and $C_4$, which are treated separately. Empirical analysis of running time agrees with the theoretical results.

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Shared-memory Graph Truss Decomposition

Author: Kabir Humayun
Madduri Kamesh
Publication date: 06/07/2017

We present PKT, a new shared-memory parallel algorithm and OpenMP implementation for the truss decomposition of large sparse graphs. A k-truss is a dense subgraph definition that can be considered a relaxation of a clique. Truss decomposition refers to a partitioning of all the edges in the graph based on their k-truss membership. The truss decomposition of a graph has many applications. We show that our new approach PKT consistently outperforms other truss decomposition approaches for a collection of large sparse graphs and on a 24-core shared-memory server. PKT is based on a recently proposed algorithm for k-core decomposition. Comment: 10 pages, conference submission
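To make the notion of truss decomposition concrete, here is a minimal serial sketch of the textbook edge-peeling approach. It is not PKT and has none of its parallelism. It assumes the common convention that every edge of a k-truss lies in at least k - 2 triangles of that subgraph. The function name `truss_decomposition` and the edge-list input are hypothetical, and vertex labels are assumed to be mutually comparable (for example, integers).

```python
def truss_decomposition(edges):
    """Map each edge (u, v) with u < v to the largest k such that the
    edge belongs to a k-truss (serial peeling sketch, not PKT)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def key(a, b):
        return (a, b) if a < b else (b, a)

    # Support of an edge = number of triangles that contain it.
    support = {
        (u, v): len(adj[u] & adj[v])
        for u in adj for v in adj[u] if u < v
    }

    trussness = {}
    k = 2
    while support:
        # Edges that cannot be part of a (k + 1)-truss.
        peel = [e for e, s in support.items() if s <= k - 2]
        if not peel:
            k += 1
            continue
        while peel:
            u, v = peel.pop()
            if (u, v) not in support:
                continue  # already removed
            # Removing (u, v) destroys one triangle per common neighbour.
            for w in adj[u] & adj[v]:
                for e in (key(u, w), key(v, w)):
                    support[e] -= 1
                    if support[e] == k - 2:
                        peel.append(e)  # newly dropped to the threshold
            adj[u].discard(v)
            adj[v].discard(u)
            del support[(u, v)]
            trussness[(u, v)] = k
    return trussness


if __name__ == "__main__":
    # K4 on vertices 0..3 plus a pendant edge (3, 4).
    k4_plus_tail = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
    print(truss_decomposition(k4_plus_tail))
```

In the example, the six edges of the 4-clique receive trussness 4 and the pendant edge receives trussness 2, which is the partition of edges that the abstract refers to.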

arXiv.org e-Print Archive

Crossref

Distributed Estimation of Graph 4-Profiles

Author: Borokhovich Michael
Dimakis Alexandros G.
Elenberg Ethan R.
Shanmugam Karthikeyan
Publication date: 04/04/2016

We present a novel distributed algorithm for counting all four-node induced subgraphs in a big graph. These counts, called the 4-profile, describe a graph's connectivity properties and have found several uses ranging from bioinformatics to spam detection. We also study the more complicated problem of estimating the local 4-profiles centered at each vertex of the graph. The local 4-profile embeds every vertex in an 11-dimensional space that characterizes the local geometry of its neighborhood: vertices that connect different clusters will have different local 4-profiles compared to those that are only part of one dense cluster. Our algorithm is a local, distributed message-passing scheme on the graph and computes all the local 4-profiles in parallel. We rely on two novel theoretical contributions: we show that local 4-profiles can be calculated using compressed two-hop information and also establish novel concentration results that show that graphs can be substantially sparsified and still retain good approximation quality for the global 4-profile. We empirically evaluate our algorithm using a distributed GraphLab implementation that we scaled up to 640 cores. We show that our algorithm can compute global and local 4-profiles of graphs with millions of edges in a few minutes, significantly improving upon the previous state of the art. Comment: To appear in part at WWW'1
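As a reference point for what a global 4-profile contains, the sketch below counts induced four-node subgraphs by brute force over all vertex quadruples. It is not the distributed algorithm described above and only suits tiny graphs. It relies on the fact that the 11 non-isomorphic graphs on four vertices all have distinct sorted degree sequences, so the degree sequence of the induced subgraph serves as the class label. The function name `four_profile` and the dictionary-of-sets input are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def four_profile(adj):
    """Count induced 4-node subgraphs, keyed by sorted degree sequence.

    adj is assumed to map each vertex to the set of its neighbours.
    Runs in O(n^4) time, so it is only a reference implementation.
    """
    profile = Counter()
    for quad in combinations(sorted(adj), 4):
        # Degree of each vertex inside the subgraph induced by quad.
        degrees = sorted(
            sum(1 for u in quad if u != v and u in adj[v]) for v in quad
        )
        profile[tuple(degrees)] += 1
    return profile


if __name__ == "__main__":
    # A 5-cycle: removing any one vertex leaves an induced 4-node path.
    cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
    print(four_profile(cycle5))
```

For example, the key `(2, 2, 2, 2)` corresponds to an induced 4-cycle and `(3, 3, 3, 3)` to a 4-clique.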

arXiv.org e-Print Archive

Crossref

A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem

Author: Tsourakakis Charalampos E.
Publication date: 20/05/2014

Many graph mining applications rely on detecting subgraphs which are near-cliques. There exists a dichotomy between the results in the existing work related to this problem: on the one hand, the densest subgraph problem (DSP), which maximizes the average degree over all subgraphs, is solvable in polynomial time but for many networks fails to find subgraphs which are near-cliques. On the other hand, formulations that are geared towards finding near-cliques are NP-hard and frequently inapproximable due to connections with the Maximum Clique problem. In this work, we propose a formulation which combines the best of both worlds: it is solvable in polynomial time and finds near-cliques when the DSP fails. Surprisingly, our formulation is a simple variation of the DSP. Specifically, we define the triangle densest subgraph problem (TDSP): given $G(V,E)$, find a subset of vertices $S^*$ such that $\tau(S^*)=\max_{S \subseteq V} \frac{t(S)}{|S|}$, where $t(S)$ is the number of triangles induced by the set $S$. We provide various exact and approximation algorithms that solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to the more general problem of maximizing the $k$-clique average density. Finally, we provide empirical evidence that the TDSP should be used whenever the DSP fails to output a near-clique. Comment: 42 pages
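The objective t(S)/|S| can be illustrated with a small sketch that evaluates the triangle density of a vertex set and finds the maximiser by exhaustive search. This is exponential in the number of vertices and is not one of the paper's algorithms; it only makes the definition concrete on toy inputs. The function names and the dictionary-of-sets input format are assumptions for illustration.

```python
from itertools import combinations


def triangle_count(adj, S):
    """Number of triangles induced by the vertex set S."""
    return sum(
        1
        for a, b, c in combinations(sorted(S), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def triangle_density(adj, S):
    """The TDSP objective t(S) / |S| for a non-empty vertex set S."""
    return triangle_count(adj, S) / len(S)


def triangle_densest_brute_force(adj):
    """Exhaustively search all non-empty subsets (toy graphs only)."""
    best_density, best_set = -1.0, None
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for S in combinations(vertices, size):
            density = triangle_density(adj, S)
            if density > best_density:
                best_density, best_set = density, set(S)
    return best_density, best_set


if __name__ == "__main__":
    # K4 on vertices 0..3 with a pendant vertex 4 attached to vertex 3.
    graph = {
        0: {1, 2, 3},
        1: {0, 2, 3},
        2: {0, 1, 3},
        3: {0, 1, 2, 4},
        4: {3},
    }
    print(triangle_densest_brute_force(graph))
```

On this example the 4-clique has 4 triangles on 4 vertices (density 1), while including the pendant vertex lowers the density to 4/5, so the search returns the clique.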

arXiv.org e-Print Archive

CiteSeerX