
    Learning multifractal structure in large networks

    Generating random graphs to model networks has a rich history. In this paper, we analyze and improve upon the multifractal network generator (MFNG) introduced by Palla et al. We provide a new result on the probability of subgraphs existing in graphs generated with MFNG. From this result it follows that we can quickly compute moments of an important set of graph properties, such as the expected number of edges, stars, and cliques. Specifically, we show how to compute these moments in time complexity independent of the size of the graph and the number of recursive levels in the generative model. We leverage this theory to develop a new method-of-moments algorithm for fitting large networks to MFNG. Empirically, this new approach effectively simulates properties of several social and information networks. In terms of matching subgraph counts, our method outperforms similar algorithms used with the Stochastic Kronecker Graph model. Furthermore, we present a fast approximation algorithm to generate graph instances following the multifractal structure. The approximation scheme is an improvement over previous methods, which ran in time complexity quadratic in the number of vertices. Combined, our method of moments and fast sampling scheme provide the first scalable framework for effectively modeling large networks with MFNG.

    Degree-based goodness-of-fit tests for heterogeneous random graph models : independent and exchangeable cases

    The degrees are a classical and relevant way to study the topology of a network. They can be used to assess the goodness-of-fit for a given random graph model. In this paper we introduce goodness-of-fit tests for two classes of models. First, we consider the case of independent graph models such as the heterogeneous Erdős-Rényi model in which the edges have different connection probabilities. Second, we consider a generic model for exchangeable random graphs called the W-graph. The stochastic block model and the expected degree distribution model fall within this framework. We prove the asymptotic normality of the degree mean square under these independent and exchangeable models and derive formal tests. We study the power of the proposed tests and we prove the asymptotic normality under specific sparsity regimes. The tests are illustrated on real networks from social sciences and ecology, and their performance is assessed via a simulation study.

    An Ensemble Framework for Detecting Community Changes in Dynamic Networks

    Dynamic networks, especially those representing social networks, undergo constant evolution of their community structure over time. Nodes can migrate between different communities, communities can split into multiple new communities, communities can merge together, etc. In order to represent dynamic networks with evolving communities it is essential to use a dynamic model rather than a static one. Here we use a dynamic stochastic block model where the underlying block model is different at different times. In order to represent the structural changes expressed by this dynamic model, the network is split into discrete time segments and a clustering algorithm assigns block memberships for each segment. In this paper we show that using an ensemble of clustering assignments accommodates the variance in scalable clustering algorithms and produces superior results in terms of pairwise-precision and pairwise-recall. We also demonstrate that the dynamic clustering produced by the ensemble can be visualized as a flowchart which encapsulates the community evolution succinctly. Comment: 6 pages, under submission to HPEC Graph Challenge
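
The abstract above does not define pairwise-precision or pairwise-recall. As a minimal sketch, assuming the common pair-counting definition (a pair of nodes counts as positive when both nodes share a block), the two scores could be computed as follows. The function name `pairwise_precision_recall` and the label-list input format are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def pairwise_precision_recall(predicted, truth):
    """Pair-counting precision and recall for two flat clusterings.

    predicted, truth: sequences of block labels, one per node, in the
    same node order (an assumed input format for this sketch).
    """
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")

    same_pred = 0  # pairs placed together by the predicted clustering
    same_true = 0  # pairs placed together by the reference clustering
    same_both = 0  # pairs placed together by both

    for i, j in combinations(range(len(truth)), 2):
        in_pred = predicted[i] == predicted[j]
        in_true = truth[i] == truth[j]
        same_pred += in_pred
        same_true += in_true
        same_both += in_pred and in_true

    # Convention assumed here: a score is 1.0 when its denominator is empty.
    precision = same_both / same_pred if same_pred else 1.0
    recall = same_both / same_true if same_true else 1.0
    return precision, recall


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    predicted = [0, 0, 1, 1, 1, 1]
    print(pairwise_precision_recall(predicted, truth))
```

The loop visits every node pair, so it is quadratic in the number of nodes; for large graphs the same counts can be obtained from a contingency table of the two labelings instead.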

    Covariate-assisted spectral clustering

    Biological and social systems consist of myriad interacting units. The interactions can be represented in the form of a graph or network. Measurements of these graphs can reveal the underlying structure of these interactions, which provides insight into the systems that generated the graphs. Moreover, in applications such as connectomics, social networks, and genomics, graph data are accompanied by contextualizing measures on each node. We utilize these node covariates to help uncover latent communities in a graph, using a modification of spectral clustering. Statistical guarantees are provided under a joint mixture model that we call the node-contextualized stochastic blockmodel, including a bound on the mis-clustering rate. The bound is used to derive conditions for achieving perfect clustering. For most simulated cases, covariate-assisted spectral clustering yields results superior to regularized spectral clustering without node covariates and to an adaptation of canonical correlation analysis. We apply our clustering method to large brain graphs derived from diffusion MRI data, using the node locations or neurological region membership as covariates. In both cases, covariate-assisted spectral clustering yields clusters that are easier to interpret neurologically. Comment: 28 pages, 4 figures, includes substantial changes to theoretical results

    Connectivity of Random Annulus Graphs and the Geometric Block Model

    We provide new connectivity results for vertex-random graphs or random annulus graphs, which are significant generalizations of random geometric graphs. Random geometric graphs (RGG) are one of the most basic models of random graphs for spatial networks, proposed by Gilbert in 1961, shortly after the introduction of the Erdős-Rényi random graphs. They resemble social networks in many ways (e.g. by spontaneously creating clusters of nodes with high modularity). The connectivity properties of RGGs have been studied since their introduction, and analyzing them has been significantly harder than for their Erdős-Rényi counterparts due to correlated edge formation. Our next contribution is in using the connectivity of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for the geometric block model (GBM). The GBM is a probabilistic model for community detection defined over an RGG in a similar spirit as the popular stochastic block model, which is defined over an Erdős-Rényi random graph. The geometric block model inherits the transitivity properties of RGGs and thus models communities better than a stochastic block model. However, analyzing it requires fresh perspectives as all prior tools fail due to correlation in edge formation. We provide a simple and efficient algorithm that can recover communities in GBM exactly with high probability in the regime of connectivity.