
1,897 research outputs found

On Approximating the Number of $k$-cliques in Sublinear Time

Author: Avron H.
Eden T.
Onak K.
Portes Alejandro
Seshadhri C.
Publication date: 12/03/2018

We study the problem of approximating the number of $k$-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let $n$ denote the number of vertices in the graph, $m$ the number of edges, and $C_k$ the number of $k$-cliques. We design an algorithm that outputs a $(1+\varepsilon)$-approximation (with high probability) for $C_k$, whose expected query complexity and running time are $O\left(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}\right)\cdot\mathrm{poly}(\log n,1/\varepsilon,k)$. Hence, the complexity of the algorithm is sublinear in the size of the graph for $C_k = \omega(m^{k/2-1})$. Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on $\log n$, $1/\varepsilon$ and $k$). The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting ($k=2$) and by Eden et al. (FOCS 2015) for triangle counting ($k=3$). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting, and does not generalize for larger cliques. We obtain a general algorithm that works for any $k\geq 3$ by designing a procedure that samples each $k$-clique incident to a given set $S$ of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighbors has a low success probability. This is achieved by an algorithm that samples uniform random high degree vertices and a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex.
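The three query types named in the abstract above can be pictured with a small sketch. This is only an illustration of the access model, not the authors' algorithm: the class name `QueryOracle`, its method names, and the query counter are hypothetical, and the graph is assumed to be simple and undirected with vertices numbered from 0.

```python
class QueryOracle:
    """Hypothetical wrapper exposing degree, neighbor and pair queries.

    Assumption: the graph is simple and undirected; self-loops and
    duplicate edges in the input are ignored.
    """

    def __init__(self, n, edges):
        self._adj = [[] for _ in range(n)]
        self._edge_set = set()
        for u, v in edges:
            key = (min(u, v), max(u, v))
            if u == v or key in self._edge_set:
                continue  # skip self-loops and repeated edges
            self._adj[u].append(v)
            self._adj[v].append(u)
            self._edge_set.add(key)
        self.queries = 0  # total number of queries issued so far

    def degree(self, v):
        """Degree query: how many neighbors does v have?"""
        self.queries += 1
        return len(self._adj[v])

    def neighbor(self, v, i):
        """Neighbor query: the i-th neighbor of v (0-indexed), or None."""
        self.queries += 1
        return self._adj[v][i] if 0 <= i < len(self._adj[v]) else None

    def pair(self, u, v):
        """Pair query: is {u, v} an edge?"""
        self.queries += 1
        return (min(u, v), max(u, v)) in self._edge_set
```

A sublinear-time algorithm in this model is charged only for the calls it makes, which is what the `queries` counter tracks in this sketch.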

arXiv.org e-Print Archive

Crossref

Estimating the weight of metric minimum spanning trees in sublinear time

Author: Artur Czumaj
Christian Sohler
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 05/02/2008

In this paper we present a sublinear-time $(1+\varepsilon)$-approximation randomized algorithm to estimate the weight of the minimum spanning tree of an $n$-point metric space. The running time of the algorithm is $\widetilde{\mathcal{O}}(n/\varepsilon^{\mathcal{O}(1)})$. Since the full description of an $n$-point metric space is of size $\Theta(n^2)$, the complexity of our algorithm is sublinear with respect to the input size. Our algorithm is almost optimal as it is not possible to approximate in $o(n)$ time the weight of the minimum spanning tree to within any factor. We also show that no deterministic algorithm can achieve a $B$-approximation in $o(n^2/B^3)$ time. Furthermore, it has been previously shown that no $o(n^2)$ algorithm exists that returns a spanning tree whose weight is within a constant times the optimum.

CiteSeerX

Crossref

Warwick Research Archives Portal Repository

Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

Author: A. Czumaj
B. Chazelle
C. Bordenave
C.D. Godsil
D. Gamarnik
L.E. Blume
M. Gonen
M. Jerrum
M. Mézard
M. Parnas
O. Goldreich
O.J. Heilmann
R. Canetti
R. Rubinfeld
S.P. Vadhan
U. Feige
Publication date: 01/01/2013

For a graph $G$, let $Z(G,\lambda)$ be the partition function of the monomer-dimer system defined by $\sum_k m_k(G)\lambda^k$, where $m_k(G)$ is the number of matchings of size $k$ in $G$. We consider graphs of bounded degree and develop a sublinear-time algorithm for estimating $\log Z(G,\lambda)$ at an arbitrary value $\lambda>0$ within additive error $\epsilon n$ with high probability. The query complexity of our algorithm does not depend on the size of $G$ and is polynomial in $1/\epsilon$, and we also provide a lower bound quadratic in $1/\epsilon$ for this problem. This is the first analysis of a sublinear-time approximation algorithm for a #P-complete problem. Our approach is based on the correlation decay of the Gibbs distribution associated with $Z(G,\lambda)$. We show that our algorithm approximates the probability for a vertex to be covered by a matching, sampled according to this Gibbs distribution, in a near-optimal sublinear time. We extend our results to approximate the average size and the entropy of such a matching within an additive error with high probability, where again the query complexity is polynomial in $1/\epsilon$ and the lower bound is quadratic in $1/\epsilon$. Our algorithms are simple to implement and of practical use when dealing with massive datasets. Our results extend to other systems where the correlation decay is known to hold as for the independent set problem up to the critical activity.
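To make the definition of $Z(G,\lambda)$ concrete, here is a brute-force sketch that enumerates every matching of a tiny graph. It is exponential-time and is not the sublinear algorithm of the abstract. The function names `matching_counts` and `partition_function` are hypothetical, and the edge list is assumed to describe a simple graph (repeated edges would be counted as distinct).

```python
def matching_counts(edges):
    """Return [m_0, m_1, ...], where m_k is the number of matchings of size k.

    Brute force: extend a partial matching by any later edge that shares
    no endpoint with the edges already chosen.
    """
    edges = list(edges)
    counts = [0] * (len(edges) + 1)

    def extend(start, used, size):
        counts[size] += 1  # the current edge set is itself a matching
        for i in range(start, len(edges)):
            u, v = edges[i]
            if u != v and u not in used and v not in used:
                extend(i + 1, used | {u, v}, size + 1)

    extend(0, frozenset(), 0)
    while len(counts) > 1 and counts[-1] == 0:
        counts.pop()  # drop sizes that no matching attains
    return counts


def partition_function(edges, lam):
    """Evaluate Z(G, lam) = sum_k m_k(G) * lam**k by direct enumeration."""
    return sum(m_k * lam ** k for k, m_k in enumerate(matching_counts(edges)))
```

For the path on four vertices, `matching_counts([(0, 1), (1, 2), (2, 3)])` gives `[1, 3, 1]`: the empty matching, three single edges, and the pair of end edges.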

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Distributed Approximation Algorithms for Weighted Shortest Paths

Author: Abram J.
Demetrescu C.
Lynch N. A.
Santoro N.
Publication date: 22/05/2014

A distributed network is modeled by a graph having $n$ nodes (processors) and diameter $D$. We study the time complexity of approximating weighted (undirected) shortest paths on distributed networks with an $O(\log n)$ bandwidth restriction on edges (the standard synchronous CONGEST model). The question of whether approximation algorithms help speed up shortest-paths computation (more precisely, distance computation) was raised at least as early as 2004 by Elkin (SIGACT News 2004). The unweighted case of this problem is well understood, while its weighted counterpart is a fundamental problem in the area of distributed approximation algorithms and remains widely open. We present new algorithms for computing both single-source shortest paths (SSSP) and all-pairs shortest paths (APSP) in the weighted case. Our main result is an algorithm for SSSP. Previous results are the classic $O(n)$-time Bellman-Ford algorithm and an $\tilde O(n^{1/2+1/2k}+D)$-time $(8k\lceil \log (k+1) \rceil -1)$-approximation algorithm, for any integer $k\geq 1$, which follows from the result of Lenzen and Patt-Shamir (STOC 2013). (Note that Lenzen and Patt-Shamir in fact solve a harder problem, and we use $\tilde O(\cdot)$ to hide the $O(\mathrm{poly}\log n)$ term.) We present an $\tilde O(n^{1/2}D^{1/4}+D)$-time $(1+o(1))$-approximation algorithm for SSSP. This algorithm is sublinear-time as long as $D$ is sublinear, thus yielding a sublinear-time algorithm with almost optimal solution. When $D$ is small, our running time matches the lower bound of $\tilde \Omega(n^{1/2}+D)$ by Das Sarma et al. (SICOMP 2012), which holds even when $D=\Theta(\log n)$, up to a $\mathrm{poly}\log n$ factor. Comment: Full version of STOC 201
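The classic Bellman-Ford baseline mentioned in the abstract above can be sketched as a centralized simulation of synchronous rounds. This is not the paper's approximation algorithm. The function name `bellman_ford_rounds` is hypothetical, edge weights are assumed non-negative, and the simulation ignores the bandwidth limit, which is harmless here because each node only forwards a single distance value per round.

```python
import math


def bellman_ford_rounds(n, weighted_edges, source):
    """Simulate synchronous Bellman-Ford on an undirected weighted graph.

    In every round each node offers its current distance estimate to all
    of its neighbors; estimates are updated from the values held at the
    start of the round. Returns (distances, rounds that changed something).
    Assumption: weights are non-negative.
    """
    dist = [math.inf] * n
    dist[source] = 0
    rounds = 0
    for _ in range(n - 1):  # n - 1 rounds always suffice
        new_dist = dist[:]
        for u, v, w in weighted_edges:
            # The edge is undirected, so relax it in both directions.
            if dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
            if dist[v] + w < new_dist[u]:
                new_dist[u] = dist[v] + w
        if new_dist == dist:
            break  # no estimate changed, so the distances are final
        dist = new_dist
        rounds += 1
    return dist, rounds
```

The number of rounds can reach $n-1$ even when the diameter $D$ is small, because a shortest weighted path may use many more hops than $D$. That gap is what makes round complexities expressed in terms of $n^{1/2}$ and $D$ interesting.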

arXiv.org e-Print Archive

Crossref

Approximately Counting Triangles in Sublinear Time

Author: Eden Talya
Levi Amit
Ron Dana
Seshadhri C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/09/2015

We consider the problem of estimating the number of triangles in a graph. This problem has been extensively studied in both theory and practice, but all existing algorithms read the entire graph. In this work we design a sublinear-time algorithm for approximating the number of triangles in a graph, where the algorithm is given query access to the graph. The allowed queries are degree queries, vertex-pair queries and neighbor queries. We show that for any given approximation parameter $0<\epsilon<1$, the algorithm provides an estimate $\widehat{t}$ such that with high constant probability, $(1-\epsilon)\cdot t< \widehat{t}<(1+\epsilon)\cdot t$, where $t$ is the number of triangles in the graph $G$. The expected query complexity of the algorithm is $\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)\cdot \mathrm{poly}(\log n, 1/\epsilon)$, where $n$ is the number of vertices in the graph and $m$ is the number of edges, and the expected running time is $\left(\frac{n}{t^{1/3}} + \frac{m^{3/2}}{t}\right)\cdot \mathrm{poly}(\log n, 1/\epsilon)$. We also prove that $\Omega\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)$ queries are necessary, thus establishing that the query complexity of this algorithm is optimal up to polylogarithmic factors in $n$ (and the dependence on $1/\epsilon$). Comment: To appear in the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015).
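To get a feel for the bound in the abstract above, the following sketch evaluates only its leading term $\frac{n}{t^{1/3}} + \min\{m, \frac{m^{3/2}}{t}\}$ for concrete values. The function name `triangle_query_bound` is hypothetical, and the $\mathrm{poly}(\log n, 1/\epsilon)$ factor and all constants are deliberately left out, so the result indicates scale and is not a query count.

```python
def triangle_query_bound(n, m, t):
    """Leading term n / t^(1/3) + min(m, m^(3/2) / t) of the query bound.

    Assumption: poly(log n, 1/epsilon) factors and constants are ignored.
    """
    if t <= 0:
        raise ValueError("the bound is stated for graphs with t >= 1 triangles")
    return n / t ** (1.0 / 3.0) + min(m, m ** 1.5 / t)
```

For example, with $n = 10^6$, $m = 10^7$ and $t = 10^9$, the first term is about $10^3$ and the second about $32$, far below the $10^7$ edges a full read of the graph would touch.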

arXiv.org e-Print Archive

Crossref

Sublinear algorithms for local graph centrality estimation

Author: Bressan Marco
Peserico Enoch
Pretto Luca
Publication date: 01/01/2018

We study the complexity of local graph centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, which we apply to the PageRank and Heat Kernel centralities, for building a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $m$ arcs, with probability $(1-\delta)$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(m^{2/3} \Delta^{1/3} d^{-2/3},\, m^{4/5} d^{-3/5}))$ nodes/arcs, where $\Delta$ and $d$ are respectively the maximum and average outdegree of the graph (omitting for readability $\operatorname{poly}(\epsilon^{-1})$ and $\operatorname{polylog}(\delta^{-1})$ factors). A similar bound holds for computational complexity. We also prove a lower bound of $\Omega(\min(m^{1/2} \Delta^{1/2} d^{-1/2}, \, m^{2/3} d^{-1/3}))$ for both query complexity and computational complexity. Moreover, our technique yields a $\tilde{O}(n^{2/3})$ query complexity algorithm for the graph access model of [Brautbar et al., 2010], widely used in social network mining; we show this algorithm is optimal up to a sublogarithmic factor. These are the first algorithms yielding worst-case sublinear bounds for general directed graphs and any choice of the target node. Comment: 29 pages, 1 figure
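For contrast with the local estimation described in the abstract above, here is a sketch of the global baseline that reads every arc: PageRank by power iteration. It is not the authors' estimator. The function name `pagerank`, the damping value 0.85, the fixed iteration count, and the choice to spread the score of nodes without out-arcs uniformly over all nodes are all assumptions made for this illustration.

```python
def pagerank(n, arcs, damping=0.85, iterations=100):
    """Global PageRank by power iteration on a directed graph.

    Assumptions: nodes are 0..n-1, and a node with no outgoing arcs
    distributes its score uniformly over all nodes.
    """
    out = [[] for _ in range(n)]
    for u, v in arcs:
        out[u].append(v)
    score = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [(1.0 - damping) / n] * n  # teleportation mass
        for u in range(n):
            targets = out[u] if out[u] else range(n)
            share = damping * score[u] / len(targets)
            for v in targets:
                nxt[v] += share
        score = nxt
    return score
```

Each iteration touches all $m$ arcs, so scoring even a single target node this way costs at least linear work. A local estimator of the kind the abstract describes aims to avoid exactly that cost.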

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Archivio istituzionale della ricerca - Università di Padova