On Approximating the Number of $k$-cliques in Sublinear Time
We study the problem of approximating the number of $k$-cliques in a graph
when given query access to the graph.
We consider the standard query model for general graphs via (1) degree
queries, (2) neighbor queries and (3) pair queries. Let $n$ denote the number
of vertices in the graph, $m$ the number of edges, and $C_k$ the number of
$k$-cliques. We design an algorithm that outputs a
$(1+\varepsilon)$-approximation (with high probability) for $C_k$, whose
expected query complexity and running time are
$O\left(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}\right)\cdot\mathrm{poly}(\log n, 1/\varepsilon, k)$.
Hence, the complexity of the algorithm is sublinear in the size of the graph
for $C_k = \omega(m^{k/2-1})$. Furthermore, we prove a lower bound showing that
the query complexity of our algorithm is essentially optimal (up to the
dependence on $\log n$, $1/\varepsilon$, and $k$).
The previous results in this vein are by Feige (SICOMP 06) and by Goldreich
and Ron (RSA 08) for edge counting ($k=2$) and by Eden et al. (FOCS 2015) for
triangle counting ($k=3$). Our result matches the complexities of these
results.
The previous result by Eden et al. hinges on a certain amortization technique
that works only for triangle counting, and does not generalize for larger
cliques. We obtain a general algorithm that works for any $k \geq 3$ by
designing a procedure that samples each $k$-clique incident to a given set $S$
of vertices with approximately equal probability. The primary difficulty is in
finding cliques incident to purely high-degree vertices, since random sampling
within neighbors has a low success probability. This is achieved by an
algorithm that samples uniformly random high-degree vertices and a careful
tradeoff between estimating cliques incident purely to high-degree vertices and
those that include a low-degree vertex.
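To make the query model in this abstract concrete, here is a minimal Python sketch of an oracle that answers degree, neighbor, and pair queries while counting how many queries are made. The class name `GraphOracle` and the helper `leading_term` are hypothetical names chosen for illustration and do not come from the paper. `leading_term` only evaluates the stated bound with the polylogarithmic and $1/\varepsilon$ factors dropped; it is not the approximation algorithm itself.

```python
class GraphOracle:
    """Query access to an undirected graph; every query is counted."""

    def __init__(self, n, edges):
        self.adj = [[] for _ in range(n)]
        self.edge_set = set()
        for u, v in edges:
            self.adj[u].append(v)
            self.adj[v].append(u)
            self.edge_set.add((min(u, v), max(u, v)))
        self.queries = 0

    def degree(self, v):
        """Degree query: number of neighbors of v."""
        self.queries += 1
        return len(self.adj[v])

    def neighbor(self, v, i):
        """Neighbor query: the i-th neighbor of v (0-based), or None if absent."""
        self.queries += 1
        return self.adj[v][i] if i < len(self.adj[v]) else None

    def pair(self, u, v):
        """Pair query: is {u, v} an edge?"""
        self.queries += 1
        return (min(u, v), max(u, v)) in self.edge_set


def leading_term(n, m, c_k, k):
    """n / C_k^(1/k) + m^(k/2) / C_k, ignoring poly(log n, 1/eps, k)."""
    return n / c_k ** (1.0 / k) + m ** (k / 2.0) / c_k
```

For $k=2$ the number of 2-cliques equals $m$, so `leading_term(n, m, m, 2)` reduces to $n/\sqrt{m} + 1$, in line with the abstract's remark that the bound matches the earlier edge-counting results.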
Counting Butterflies from a Large Bipartite Graph Stream
We consider the estimation of properties on massive bipartite graph streams, where each edge represents a connection between entities in two different partitions. We present sublinear-space one-pass algorithms for accurately estimating the number of butterflies in the graph stream. Our estimates have provable guarantees on their quality, and experiments show promising tradeoffs between space and accuracy. We also present extensions to sliding windows. While there are many works on counting subgraphs within unipartite graph streams, our work seems to be one of the few to effectively handle bipartite graph streams.
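As a rough illustration of what a one-pass, reduced-space estimator can look like, the sketch below keeps each arriving edge independently with probability `p` and rescales the butterfly count of the sample by $1/p^4$, since all four edges of a butterfly must survive. This is a generic edge-sampling estimator, not necessarily the algorithm of the paper. The function names are hypothetical, edges are assumed to be `(left, right)` tuples with no duplicates in the stream, and `p` must lie in $(0, 1]$.

```python
import random
from collections import Counter
from itertools import combinations


def count_butterflies(edges):
    """Exact count: two left vertices sharing c right neighbors form C(c, 2) butterflies."""
    right_to_left = {}
    for u, v in edges:
        right_to_left.setdefault(v, set()).add(u)
    shared = Counter()
    for lefts in right_to_left.values():
        for pair in combinations(sorted(lefts), 2):
            shared[pair] += 1
    return sum(c * (c - 1) // 2 for c in shared.values())


def stream_estimate(edge_stream, p, seed=0):
    """One pass over the stream; expected memory is about p * (number of edges)."""
    rng = random.Random(seed)
    sample = set()
    for edge in edge_stream:
        if rng.random() < p:  # keep the edge with probability p
            sample.add(edge)
    # Each butterfly survives with probability p**4, so rescale by 1 / p**4.
    return count_butterflies(sample) / p ** 4
```

The estimate is unbiased under these assumptions, but its variance grows quickly as `p` shrinks, which is one face of the space-accuracy tradeoff the abstract mentions.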
Butterfly Counting in Bipartite Networks
We consider the problem of counting motifs in bipartite affiliation networks,
such as author-paper, user-product, and actor-movie relations. We focus on
counting the number of occurrences of a "butterfly", a complete $2 \times 2$
biclique, the simplest cohesive higher-order structure in a bipartite graph.
Our main contribution is a suite of randomized algorithms that can quickly
approximate the number of butterflies in a graph with a provable guarantee on
accuracy. An experimental evaluation on large real-world networks shows that
our algorithms return accurate estimates within a few seconds, even for
networks with trillions of butterflies and hundreds of millions of edges.
Comment: 28 pages, 5 tables, 6 figures
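One simple way to approximate a butterfly count by random sampling is sketched below. It draws left vertices uniformly at random, counts the butterflies containing each sampled vertex exactly, and rescales. Every butterfly contains exactly two left vertices, so the per-vertex counts sum to twice the total. This is a minimal sketch under those assumptions, not necessarily one of the algorithms in the paper's suite. The names `butterflies_at` and `estimate_butterflies` are hypothetical, and edges are assumed to be `(left, right)` tuples.

```python
import random
from collections import Counter


def butterflies_at(v, left_adj, right_adj):
    """Number of butterflies that contain the left vertex v."""
    shared = Counter()
    for r in left_adj[v]:
        for w in right_adj[r]:
            if w != v:
                shared[w] += 1  # w shares right neighbor r with v
    return sum(c * (c - 1) // 2 for c in shared.values())


def estimate_butterflies(edges, samples, seed=0):
    """Estimate the butterfly count from uniformly sampled left vertices."""
    left_adj, right_adj = {}, {}
    for u, v in edges:
        left_adj.setdefault(u, set()).add(v)
        right_adj.setdefault(v, set()).add(u)
    rng = random.Random(seed)
    lefts = sorted(left_adj)
    total = sum(
        butterflies_at(rng.choice(lefts), left_adj, right_adj)
        for _ in range(samples)
    )
    # Per-vertex counts sum to 2 * B, so B is about |L| * mean / 2.
    return len(lefts) * total / (2 * samples)
```

On a complete bipartite graph every left vertex lies in the same number of butterflies, so the estimate is exact for any number of samples; on skewed real-world graphs the variance depends heavily on the degree distribution.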