Search CORE

17 research outputs found

On Approximating the Number of $k$ -cliques in Sublinear Time

Author: Avron H.
Curvature
Eden T.
New
On
Onak K.
Portes Alejandro
Seshadhri C.
Publication venue
Publication date: 12/03/2018
Field of study

We study the problem of approximating the number of

k

-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let

n

denote the number of vertices in the graph,

m

the number of edges, and

C_k

the number of

k

-cliques. We design an algorithm that outputs a

(1+\varepsilon)

-approximation (with high probability) for

C_k

, whose expected query complexity and running time are O\left(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}\right)\poly(\log n,1/\varepsilon,k). Hence, the complexity of the algorithm is sublinear in the size of the graph for

C_k = \omega(m^{k/2-1})

. Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on

\log n

1/\varepsilon

and

k

). The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting (

k=2

) and by Eden et al. (FOCS 2015) for triangle counting (

k=3

). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting, and does not generalize for larger cliques. We obtain a general algorithm that works for any

k\geq 3

by designing a procedure that samples each

k

-clique incident to a given set

S

of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighbors has a low success probability. This is achieved by an algorithm that samples uniform random high degree vertices and a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex

arXiv.org e-Print Archive

Crossref

Counting Butterfies from a Large Bipartite Graph Stream

Author: Sanei-Mehri Seyed-Vahid
Sariyuce Ahmet
Tirthapura Srikanta
Tirthapura Srikanta
Zhang Yu
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2018
Field of study

We consider the estimation of properties on massive bipartite graph streams, where each edge represents a connection between entities in two different partitions. We present sublinear-space one-pass algorithms for accurately estimating the number of butterflies in the graph stream. Our estimates have provable guarantees on their quality, and experiments show promising tradeoffs between space and accuracy. We also present extensions to sliding windows. While there are many works on counting subgraphs within unipartite graph streams, our work seems to be one of the few to effectively handle bipartite graph streams

Digital Repository @ Iowa State University (ISU)

Butterfly Counting in Bipartite Networks

Author: Sanei-Mehri Seyed-Vahid
Sariyuce Ahmet Erdem
Tirthapura Srikanta
Publication venue
Publication date: 15/03/2018
Field of study

We consider the problem of counting motifs in bipartite affiliation networks, such as author-paper, user-product, and actor-movie relations. We focus on counting the number of occurrences of a "butterfly", a complete

2 \times 2

biclique, the simplest cohesive higher-order structure in a bipartite graph. Our main contribution is a suite of randomized algorithms that can quickly approximate the number of butterflies in a graph with a provable guarantee on accuracy. An experimental evaluation on large real-world networks shows that our algorithms return accurate estimates within a few seconds, even for networks with trillions of butterflies and hundreds of millions of edges.Comment: 28 pages, 5 tables, 6 figure

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)

Crossref