    Spectral density of the non-backtracking operator

    The non-backtracking operator was recently shown to provide a significant improvement when used for spectral clustering of sparse networks. In this paper we analyze its spectral density on large random sparse graphs using a mapping to the correlation functions of a certain interacting quantum disordered system on the graph. On sparse, tree-like graphs, this can be solved efficiently by the cavity method and a belief propagation algorithm. We show that there exists a paramagnetic phase, leading to zero spectral density, that is stable outside a circle of radius $\sqrt{\rho}$, where $\rho$ is the leading eigenvalue of the non-backtracking operator. We observe a second-order phase transition at the edge of this circle, between a zero and a non-zero spectral density. The fact that this phase transition is absent in the spectral density of other matrices commonly used for spectral clustering provides a physical justification for the performance of the non-backtracking operator in spectral clustering. Comment: 6 pages, 6 figures, submitted to EP
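
To make the quantities in this abstract concrete, the following minimal sketch builds the non-backtracking operator of a small graph and estimates its leading eigenvalue $\rho$ and the radius $\sqrt{\rho}$. It is not the cavity-method computation described in the paper; the function names, the shifted power iteration, and the complete graph K4 used as input are all assumptions chosen for illustration.

```python
import math

def non_backtracking_successors(edges):
    """Map each directed edge (u, v) to the directed edges (v, w) with w != u.

    This adjacency-list form encodes the non-backtracking operator B:
    B[(u, v), (v, w)] = 1 whenever w != u, and 0 otherwise.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {(u, v): [(v, w) for w in adj[v] if w != u]
            for u in adj for v in adj[u]}

def leading_eigenvalue(succ, iterations=500):
    """Estimate the leading eigenvalue of B by power iteration on B + I.

    Shifting by the identity (an illustrative choice) avoids oscillation
    on bipartite graphs; the shift is subtracted from the final estimate.
    """
    x = {e: 1.0 for e in succ}
    shifted = 1.0
    for _ in range(iterations):
        y = {e: x[e] + sum(x[f] for f in succ[e]) for e in succ}
        shifted = max(y.values())          # converges to rho + 1
        x = {e: val / shifted for e, val in y.items()}
    return shifted - 1.0

# Hypothetical toy input: the complete graph K4, which is 3-regular,
# so every directed edge has exactly 2 non-backtracking successors.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
succ = non_backtracking_successors(k4_edges)
rho = leading_eigenvalue(succ)
radius = math.sqrt(rho)
print(f"rho = {rho:.6f}, circle radius sqrt(rho) = {radius:.6f}")
```

For a d-regular graph such as K4 the leading eigenvalue is d - 1, which gives a quick sanity check on the construction. For large sparse graphs a sparse eigensolver would be the more usual tool.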

    Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification

    Eigenvalue decomposition of Laplacian matrices for large nearest-neighbor (NN) graphs is the major computational bottleneck in spectral clustering (SC). To fundamentally address this computational challenge in SC, we propose a scalable spectral sparsification framework that enables the construction of nearly-linear-sized ultra-sparse NN graphs with guaranteed preservation of key eigenvalues and eigenvectors of the original Laplacian. The proposed method is based on the latest theoretical results in spectral graph theory and thus can be applied to robustly handle general undirected graphs. By leveraging a nearly-linear time spectral graph topology sparsification phase and a subgraph scaling phase via stochastic gradient descent (SGD) iterations, our approach allows computing tree-like NN graphs that can serve as high-quality proxies of the original NN graphs, leading to highly scalable and accurate SC of large data sets. Our extensive experimental results on a variety of public domain data sets show dramatically improved performance when compared with state-of-the-art SC methods.

    Properties of dense partially random graphs

    We study the properties of random graphs where for each vertex a neighbourhood has been previously defined. The probability of an edge joining two vertices depends on whether the vertices are neighbours or not, as happens in Small World Graphs (SWGs). But we consider the case where the average degree of each node is of the order of the size of the graph (unlike SWGs, which are sparse). This allows us to calculate the mean distance and clustering, which are qualitatively similar (although not in such a dramatic scale range) to the case of SWGs. We also obtain analytically the distribution of eigenvalues of the corresponding adjacency matrices. This distribution is discrete for large eigenvalues and continuous for small eigenvalues. The continuous part of the distribution follows a semicircle law, whose width is proportional to the "disorder" of the graph, whereas the discrete part is simply a rescaling of the spectrum of the substrate. We apply our results to the calculation of the mixing rate and the synchronizability threshold. Comment: 14 pages. To be published in Physical Review

    Scale free effects in world currency exchange network

    A large collection of daily time series for 60 world currencies' exchange rates is considered. The correlation matrices are calculated and the corresponding Minimal Spanning Tree (MST) graphs are constructed for each of those currencies used as reference for the remaining ones. It is shown that the multiplicity of the MST graphs' nodes, to a good approximation, develops a power-like, scale-free distribution with a scaling exponent similar to that of several other complex systems studied so far. Furthermore, quantitative arguments in favor of the hierarchical organization of the world currency exchange network are provided by relating the structure of the above MST graphs and their scaling exponents to those that are derived from an exactly solvable hierarchical network model. A special status of the USD during the period considered can be attributed to some departures of the MST features, when this currency (or some other tied to it) is used as reference, from characteristics typical of such a hierarchical clustering of nodes towards those that correspond to random graphs. Even though in general the basic structure of the MST is robust with respect to changing the reference currency, some trace of a systematic transition from a somewhat dispersed topology (as in the USD case) towards a more compact MST topology can be observed when correlations increase. Comment: Eur. Phys. J. B (2008) in pres
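
The pipeline this abstract outlines (correlation matrix, then MST, then node multiplicities) can be sketched as follows. The abstract does not state which distance is used; the metric $d_{ij} = \sqrt{2(1 - C_{ij})}$ below is a common convention in this literature and is an assumption here, as are the function names and the made-up 4x4 correlation matrix, which is toy data and not taken from the paper.

```python
import math

def mst_edges(corr):
    """Prim's algorithm on distances d_ij = sqrt(2 * (1 - corr_ij)).

    The distance formula is an assumed convention, not taken from the abstract.
    Returns the MST as a list of (i, j) index pairs.
    """
    n = len(corr)

    def dist(i, j):
        return math.sqrt(2.0 * (1.0 - corr[i][j]))

    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        _, i, j = min((dist(i, j), i, j)
                      for i in in_tree
                      for j in range(n) if j not in in_tree)
        edges.append((i, j))
        in_tree.add(j)
    return edges

def node_multiplicity(edges, n):
    """Number of MST edges attached to each node (its degree in the tree)."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return degree

# Hypothetical toy correlations: series 0 is strongly correlated with all others.
toy_corr = [
    [1.0, 0.9, 0.8, 0.7],
    [0.9, 1.0, 0.1, 0.1],
    [0.8, 0.1, 1.0, 0.1],
    [0.7, 0.1, 0.1, 1.0],
]
tree = mst_edges(toy_corr)
multiplicity = node_multiplicity(tree, len(toy_corr))
print(tree, multiplicity)
```

In this toy case the tree is a star centred on series 0. The scale-free claim in the abstract concerns the distribution of such multiplicities over a much larger set of currencies.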

    When local and global clustering of networks diverge

    The average Watts-Strogatz clustering coefficient and the network transitivity are widely used descriptors for characterizing the transitivity of relations in real-world graphs (networks). These indices are bounded between zero and one, with low values indicating poor transitivity and large ones indicating a high proportion of closed triads in the graphs. Here, we prove that these two indices diverge for windmill graphs when the number of nodes tends to infinity. We also give evidence that this divergence occurs in many real-world networks, especially in citation and collaboration graphs. We obtain analytic expressions for the eigenvalues and eigenvectors of the adjacency and the Laplacian matrices of the windmill graphs. Using this information we show the main characteristics of two dynamical processes when taking place on windmill graphs: synchronization and epidemic spreading. Finally, we show that many of the structural and dynamical properties of a real-world citation network are well reproduced by the appropriate windmill graph, showing the potential of these graphs as models for certain classes of real-world networks.
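
The divergence described above can be checked numerically on small cases. The sketch below assumes the usual definition of a windmill graph (k copies of the complete graph K_n sharing one central node) and computes both indices by brute force; the function names and the chosen sizes are illustrative and do not come from the paper.

```python
from itertools import combinations

def windmill_graph(n, k):
    """Adjacency sets for k copies of K_n that all share node 0."""
    adj = {0: set()}
    next_id = 1
    for _ in range(k):
        blade = [0] + list(range(next_id, next_id + n - 1))
        next_id += n - 1
        for u, v in combinations(blade, 2):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj

def triples_at(adj, v):
    """Return (closed, total) counts of neighbour pairs centred on v."""
    nbrs = sorted(adj[v])
    total = len(nbrs) * (len(nbrs) - 1) // 2
    closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed, total

def average_clustering(adj):
    """Mean of the local (Watts-Strogatz) clustering coefficients."""
    values = []
    for v in adj:
        closed, total = triples_at(adj, v)
        values.append(closed / total if total else 0.0)
    return sum(values) / len(values)

def transitivity(adj):
    """Closed triples over all connected triples (3 * triangles / triples)."""
    closed = total = 0
    for v in adj:
        c, t = triples_at(adj, v)
        closed += c
        total += t
    return closed / total if total else 0.0

# As the number of triangles k grows, the two indices move apart.
for k in (2, 10, 50):
    g = windmill_graph(3, k)
    print(k, round(average_clustering(g), 4), round(transitivity(g), 4))
```

The central node accounts for the gap: its open triples grow quadratically in k and dominate the transitivity, while it contributes only a single term to the average of local coefficients.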

    Compressive PCA for Low-Rank Matrices on Graphs

    We introduce a novel framework for an approximate recovery of data matrices which are low-rank on graphs, from sampled measurements. The rows and columns of such matrices belong to the span of the first few eigenvectors of the graphs constructed between their rows and columns. We leverage this property to recover the non-linear low-rank structures efficiently from sampled data measurements, with a low cost (linear in $n$). First, a Restricted Isometry Property (RIP) condition is introduced for efficient uniform sampling of the rows and columns of such matrices based on the cumulative coherence of graph eigenvectors. Secondly, a state-of-the-art fast low-rank recovery method is suggested for the sampled data. Finally, several efficient, parallel and parameter-free decoders are presented along with their theoretical analysis for decoding the low-rank and cluster indicators for the full data matrix. Thus, we overcome the computational limitations of the standard linear low-rank recovery methods for big datasets. Our method can also be seen as a major step towards efficient recovery of non-linear low-rank structures. For a matrix of size $n \times p$, on a single-core machine, our method gains a speed-up of $p^2/k$ over Robust Principal Component Analysis (RPCA), where $k \ll p$ is the subspace dimension. Numerically, we can recover a low-rank matrix of size $10304 \times 1000$, 100 times faster than Robust PCA.