Spectral density of the non-backtracking operator
The non-backtracking operator was recently shown to provide a significant
improvement when used for spectral clustering of sparse networks. In this paper
we analyze its spectral density on large random sparse graphs using a mapping
to the correlation functions of a certain interacting quantum disordered system
on the graph. On sparse, tree-like graphs, this can be solved efficiently by
the cavity method and a belief propagation algorithm. We show that there exists
a paramagnetic phase, leading to zero spectral density, that is stable outside
a circle of radius $\sqrt{\rho}$, where $\rho$ is the leading eigenvalue of the
non-backtracking operator. We observe a second-order phase transition at the
edge of this circle, between a zero and a non-zero spectral density. The fact
that this phase transition is absent in the spectral density of other matrices
commonly used for spectral clustering provides a physical justification of the
performance of the non-backtracking operator in spectral clustering.
Comment: 6 pages, 6 figures, submitted to EP
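As a rough illustration of the operator this abstract studies, the sketch below builds the non-backtracking matrix of a small graph and estimates its leading eigenvalue, the quantity the abstract uses to set the radius of the circle. It is not the cavity-method computation of the paper. The function names, the indexing convention B[(u->v), (v->w)] = 1 iff w != u, and the plain power iteration are assumptions made for the example; power iteration can fail to settle on periodic (e.g. bipartite) graphs.

```python
from itertools import combinations
import math


def non_backtracking(edges):
    """Build the non-backtracking matrix of an undirected graph.

    Rows and columns are indexed by directed edges. The entry for
    (u -> v), (x -> w) is 1 when x == v and w != u, i.e. the walk
    continues from v without immediately returning to u.
    """
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for i, (u, v) in enumerate(directed):
        for j, (x, w) in enumerate(directed):
            if x == v and w != u:
                B[i][j] = 1
    return directed, B


def leading_eigenvalue(B, iterations=200):
    """Estimate the leading eigenvalue of B by power iteration (a sketch)."""
    size = len(B)
    x = [1.0 / math.sqrt(size)] * size  # unit-norm starting vector
    rho = 0.0
    for _ in range(iterations):
        y = [sum(b * xj for b, xj in zip(row, x)) for row in B]
        rho = math.sqrt(sum(v * v for v in y))  # growth factor, since |x| = 1
        if rho == 0.0:
            break
        x = [v / rho for v in y]
    return rho


# Hypothetical example graph: the complete graph on 4 nodes (3-regular).
k4_edges = list(combinations(range(4), 2))
directed_edges, B = non_backtracking(k4_edges)
rho = leading_eigenvalue(B)
```

For a d-regular graph every row of B sums to d - 1, so on the complete graph with 4 nodes the estimate should be close to 2.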
Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification
Eigenvalue decomposition of Laplacian matrices for large nearest-neighbor (NN) graphs is the major computational bottleneck in spectral clustering (SC). To address this computational challenge, we propose a scalable spectral sparsification framework that enables the construction of nearly-linear-sized ultra-sparse NN graphs with guaranteed preservation of key eigenvalues and eigenvectors of the original Laplacian. The proposed method is based on the latest theoretical results in spectral graph theory and thus can be applied to robustly handle general undirected graphs. By leveraging a nearly-linear time spectral graph topology sparsification phase and a subgraph scaling phase via stochastic gradient descent (SGD) iterations, our approach allows computing tree-like NN graphs that can serve as high-quality proxies of the original NN graphs, leading to highly scalable and accurate SC of large data sets. Our extensive experimental results on a variety of public domain data sets show dramatically improved performance when compared with state-of-the-art SC methods.
Properties of dense partially random graphs
We study the properties of random graphs where for each vertex a
neighbourhood has been previously defined. The probability of an edge joining
two vertices depends on whether the vertices are neighbours or not, as happens
in Small World Graphs (SWGs). But we consider the case where the average degree
of each node is of order of the size of the graph (unlike SWGs, which are
sparse). This allows us to calculate the mean distance and clustering, which are
qualitatively similar (although not in such a dramatic scale range) to the case
of SWGs. We also obtain analytically the distribution of eigenvalues of the
corresponding adjacency matrices. This distribution is discrete for large
eigenvalues and continuous for small eigenvalues. The continuous part of the
distribution follows a semicircle law, whose width is proportional to the
"disorder" of the graph, whereas the discrete part is simply a rescaling of the
spectrum of the substrate. We apply our results to the calculation of the
mixing rate and the synchronizability threshold.
Comment: 14 pages. To be published in Physical Review
Scale free effects in world currency exchange network
A large collection of daily time series for 60 world currencies' exchange
rates is considered. The correlation matrices are calculated and the
corresponding Minimal Spanning Tree (MST) graphs are constructed for each of
those currencies used as reference for the remaining ones. It is shown that
the multiplicity of the MST graphs' nodes, to a good approximation, follows a
power-law-like, scale-free distribution with a scaling exponent similar to that
of several other complex systems studied so far. Furthermore, quantitative arguments in
favor of the hierarchical organization of the world currency exchange network
are provided by relating the structure of the above MST graphs and their
scaling exponents to those that are derived from an exactly solvable
hierarchical network model. A special status of the USD during the period
considered can be attributed to some departures of the MST features, when this
currency (or some other tied to it) is used as reference, from characteristics
typical of such a hierarchical clustering of nodes towards those that
correspond to the random graphs. Even though in general the basic structure of
the MST is robust with respect to changing the reference currency, some trace of
a systematic transition from a somewhat dispersed topology, as in the USD case,
towards a more compact MST topology can be observed when correlations increase.
Comment: Eur. Phys. J. B (2008) in pres
When local and global clustering of networks diverge
The average Watts-Strogatz clustering coefficient and the network transitivity are widely used descriptors for characterizing the transitivity of relations in real-world graphs (networks). These indices are bounded between zero and one, with low values indicating poor transitivity and large ones indicating a high proportion of closed triads in the graphs. Here, we prove that these two indices diverge for windmill graphs when the number of nodes tends to infinity. We also give evidence that this divergence occurs in many real-world networks, especially in citation and collaboration graphs. We obtain analytic expressions for the eigenvalues and eigenvectors of the adjacency and the Laplacian matrices of the windmill graphs. Using this information we show the main characteristics of two dynamical processes when taking place on windmill graphs: synchronization and epidemic spreading. Finally, we show that many of the structural and dynamical properties of a real-world citation network are well reproduced by the appropriate windmill graph, showing the potential of these graphs as models for certain classes of real-world networks.
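The divergence described in this abstract can be checked numerically on small cases. The sketch below is a brute-force illustration, not the paper's analytic derivation. It assumes the windmill graph is a set of cliques that all share one extra hub node, and the function names are hypothetical. It computes the average local clustering coefficient and the transitivity (three times the number of triangles divided by the number of connected triples).

```python
from itertools import combinations


def windmill(copies, clique_size):
    """Adjacency sets for `copies` cliques of `clique_size` nodes, each fully joined to hub node 0."""
    adj = {0: set()}
    next_id = 1
    for _ in range(copies):
        blade = list(range(next_id, next_id + clique_size))
        next_id += clique_size
        members = blade + [0]  # each blade plus the hub forms a complete graph
        for u in members:
            adj.setdefault(u, set())
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def clustering_indices(adj):
    """Return (average local clustering coefficient, transitivity)."""
    local = []
    closed = 0   # linked neighbour pairs summed over nodes = 3 * triangles
    triples = 0  # connected triples = neighbour pairs summed over nodes
    for node, nbrs in adj.items():
        degree = len(nbrs)
        pairs = degree * (degree - 1) // 2
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        local.append(links / pairs if pairs else 0.0)
        closed += links
        triples += pairs
    return sum(local) / len(local), closed / triples


# Compare a small and a larger windmill with the same clique size.
avg_small, trans_small = clustering_indices(windmill(5, 3))
avg_large, trans_large = clustering_indices(windmill(50, 3))
```

With the clique size held fixed, adding more cliques pushes the average clustering coefficient towards one while the transitivity falls towards zero, because the hub contributes a number of open triples that grows quadratically with the number of cliques.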
Compressive PCA for Low-Rank Matrices on Graphs
We introduce a novel framework for approximate recovery of data matrices
which are low-rank on graphs, from sampled measurements. The rows and columns
of such matrices belong to the span of the first few eigenvectors of the graphs
constructed between their rows and columns. We leverage this property to
recover the non-linear low-rank structures efficiently from sampled data
measurements, with a low cost (linear in n). First, a Restricted Isometry
Property (RIP) condition is introduced for efficient uniform sampling of the
rows and columns of such matrices based on the cumulative coherence of graph
eigenvectors. Secondly, a state-of-the-art fast low-rank recovery method is
suggested for the sampled data. Finally, several efficient, parallel and
parameter-free decoders are presented along with their theoretical analysis for
decoding the low-rank and cluster indicators for the full data matrix. Thus, we
overcome the computational limitations of the standard linear low-rank recovery
methods for big datasets. Our method can also be seen as a major step towards
efficient recovery of non-linear low-rank structures. For a matrix of size n X
p, on a single core machine, our method gains a speed up of $p^2/k$ over Robust
Principal Component Analysis (RPCA), where k << p is the subspace dimension.
Numerically, we can recover a low-rank matrix of size 10304 X 1000, 100 times
faster than Robust PCA.