    On Constant-Size Graphs That Preserve the Local Structure of High-Girth Graphs

    Let G=(V,E) be an undirected graph with maximum degree d. The k-disc of a vertex v is the rooted subgraph induced by all vertices whose distance to v is at most k. The k-disc frequency vector of G, freq(G), is a vector indexed by all isomorphism types of k-discs. For each such isomorphism type Gamma, it records the fraction of vertices whose k-disc is isomorphic to Gamma. Thus, freq(G) captures the local structure of G. A natural question is whether one can construct a much smaller graph H with a similar local structure. N. Alon proved that for any epsilon>0 there always exists a graph H whose size is independent of |V| and whose frequency vector satisfies ||freq(G) - freq(H)||_1 <= epsilon. However, his proof is only existential and gives neither an explicit bound on the size of H nor an efficient algorithm. He posed the open problem of finding such explicit bounds. In this paper, we solve this problem for the special case of high-girth graphs. We show how to efficiently compute a graph H with the above properties when G has girth at least 2k+2, and we give explicit bounds on the size of H.
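To make the definition of freq(G) concrete, here is a minimal Python sketch that computes the k-disc frequency vector and the L1 distance between two such vectors. It is a toy illustration, not the construction from the paper. It assumes every k-disc is a tree, which holds when the girth is at least 2k+2, so each disc can be identified by a canonical parenthesis encoding of a rooted tree. The function names (`disc_type`, `disc_frequencies`, `l1_distance`, `cycle`) and the adjacency-dictionary representation are choices made for this example.

```python
from collections import Counter


def disc_type(adj, root, k):
    """Canonical encoding of the k-disc of `root`.

    Assumes the k-disc is a tree (true when the girth is at least 2k+2).
    Two tree-shaped discs are isomorphic as rooted graphs exactly when
    their encodings are equal.
    """
    def encode(v, parent, depth):
        if depth == k:
            return "()"
        # Sorting the child encodings makes the result independent of
        # the order in which neighbours are listed.
        children = sorted(encode(u, v, depth + 1) for u in adj[v] if u != parent)
        return "(" + "".join(children) + ")"

    return encode(root, None, 0)


def disc_frequencies(adj, k):
    """Map each k-disc type to the fraction of vertices having that type."""
    counts = Counter(disc_type(adj, v, k) for v in adj)
    n = len(adj)
    return {t: c / n for t, c in counts.items()}


def l1_distance(f, g):
    """L1 distance between two frequency vectors stored as dictionaries."""
    return sum(abs(f.get(t, 0.0) - g.get(t, 0.0)) for t in set(f) | set(g))


def cycle(n):
    """Adjacency dictionary of the cycle on n vertices (hypothetical helper)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

For example, with k=2 the call `l1_distance(disc_frequencies(cycle(1000), 2), disc_frequencies(cycle(12), 2))` compares a 1000-vertex cycle with a 12-vertex cycle. Both have girth at least 6 and every 2-disc is a path on five vertices rooted at its middle, so the small cycle reproduces the local structure of the large one exactly.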

    Approximating the Spectrum of a Graph

    The spectrum of a network or graph $G=(V,E)$ with adjacency matrix $A$ consists of the eigenvalues of the normalized Laplacian $L = I - D^{-1/2} A D^{-1/2}$. This set of eigenvalues encapsulates many aspects of the structure of the graph, including the extent to which the graph possesses community structures at multiple scales. We study the problem of approximating the spectrum $\lambda = (\lambda_1,\dots,\lambda_{|V|})$, $0 \le \lambda_1 \le \dots \le \lambda_{|V|} \le 2$, of $G$ in the regime where the graph is too large to explicitly calculate the spectrum. We present a sublinear time algorithm that, given the ability to query a random node in the graph and select a random neighbor of a given node, computes a succinct representation of an approximation $\widetilde \lambda = (\widetilde \lambda_1,\dots,\widetilde \lambda_{|V|})$, $0 \le \widetilde \lambda_1 \le \dots \le \widetilde \lambda_{|V|} \le 2$, such that $\|\widetilde \lambda - \lambda\|_1 \le \epsilon |V|$. Our algorithm has query complexity and running time $\exp(O(1/\epsilon))$, independent of the size of the graph, $|V|$. We demonstrate the practical viability of our algorithm on 15 different real-world graphs from the Stanford Large Network Dataset Collection, including social networks, academic collaboration graphs, and road networks. For the smallest of these graphs, we are able to validate the accuracy of our algorithm by explicitly calculating the true spectrum; for the larger graphs, such a calculation is computationally prohibitive. In addition, we study the implications of our algorithm for property testing in the bounded degree graph model.
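The abstract does not spell out the algorithm, so the following Python sketch only illustrates one spectral quantity that can be estimated with the two stated query types (sample a random node, sample a random neighbor). It is not claimed to be the authors' method. It relies on a standard identity: the random-walk matrix $D^{-1}A$ has the same eigenvalues as $D^{-1/2} A D^{-1/2} = I - L$, so the average of $(1-\lambda_i)^\ell$ over all eigenvalues equals the probability that an $\ell$-step random walk from a uniformly random start node returns to its start. The name `estimate_moment`, its parameters, and the adjacency-dictionary input are assumptions made for this example.

```python
import random


def estimate_moment(adj, length, samples, rng):
    """Estimate (1/|V|) * sum_i (1 - lambda_i)**length.

    Uses only two primitives: pick a uniformly random node, and pick a
    uniformly random neighbour of a given node. Assumes every node has
    at least one neighbour. The cost depends on `length` and `samples`
    but not on the number of nodes (apart from building the node list,
    which stands in for a random-node oracle here).
    """
    nodes = list(adj)
    returns = 0
    for _ in range(samples):
        start = rng.choice(nodes)
        v = start
        for _ in range(length):
            v = rng.choice(adj[v])
        if v == start:
            returns += 1
    return returns / samples
```

On the 4-cycle, for instance, the eigenvalues of $I - L$ are 1, 0, 0, -1, so the second moment is 1/2, and `estimate_moment(c4, 2, 20000, random.Random(1))` with `c4` the 4-cycle's adjacency dictionary should land close to that value. Recovering an approximate spectrum from several such moment estimates is a separate step that this sketch does not attempt.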