    Finding local community structure in networks

    Although the inference of global community structure in networks has recently become a topic of great interest in the physics community, all such algorithms require that the graph be completely known. Here, we define both a measure of local community structure and an algorithm that infers the hierarchy of communities that enclose a given vertex by exploring the graph one vertex at a time. This algorithm runs in time O(d*k^2) for general graphs, where d is the mean degree and k is the number of vertices to be explored. For graphs where exploring a new vertex is time-consuming, the running time is linear, O(k). We show that on computer-generated graphs this technique compares favorably to algorithms that require global knowledge. We also use this algorithm to extract meaningful local clustering information in the large recommender network of an online retailer and show the existence of mesoscopic structure. Comment: 7 pages, 6 figures
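    The abstract above names a local measure and a one-vertex-at-a-time exploration but does not spell out either. The sketch below is one possible reading, not the authors' implementation. It assumes the measure is R = I / T, where T counts the edges touching the boundary of the explored set and I counts those boundary edges that stay inside the set. The names `local_modularity`, `grow_community`, and `adj` are hypothetical. For simplicity it recomputes R from scratch and looks up the adjacency of each candidate, so it does not achieve the running times quoted in the abstract.

```python
def local_modularity(adj, community):
    """Assumed measure R = I / T over edges that touch the community boundary."""
    # Boundary: community members with at least one neighbour outside the community.
    boundary = {v for v in community if any(n not in community for n in adj[v])}
    seen = set()   # distinct edges with an endpoint in the boundary (T = len(seen))
    inside = 0     # those edges whose other endpoint is also in the community (I)
    for v in boundary:
        for n in adj[v]:
            edge = frozenset((v, n))
            if edge in seen:
                continue
            seen.add(edge)
            if n in community:
                inside += 1
    # With no boundary there is nothing left to explore; returning 1.0 is a
    # convention of this sketch, not something stated in the source.
    return inside / len(seen) if seen else 1.0


def grow_community(adj, start, k):
    """Greedily add the neighbouring vertex that gives the largest R, up to k vertices."""
    community = {start}
    order = [(start, local_modularity(adj, community))]
    while len(community) < k:
        frontier = {n for v in community for n in adj[v]} - community
        if not frontier:
            break
        best = max(sorted(frontier),
                   key=lambda v: local_modularity(adj, community | {v}))
        community.add(best)
        order.append((best, local_modularity(adj, community)))
    return order


# Toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

order = grow_community(adj, start=0, k=3)
print(order)
```

    Starting from vertex 0, the sketch fills in the first triangle before crossing the bridge edge, which is the behaviour one would hope for from a local method on this toy graph.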

    Revisiting a summer vacation: digital restoration and typesetter forensics

    In 1979 the Computing Science Research Center (‘Center 127’) at Bell Laboratories bought a Linotron 202 typesetter from the Mergenthaler company. This was a ‘third generation’ digital machine that used a CRT to image characters onto photographic paper. The intent was to use existing Linotype fonts and also to develop new ones to exploit the 202’s line-drawing capabilities. Use of the 202 was hindered by Mergenthaler’s refusal to reveal the inner structure and encoding mechanisms of the font files. The particular 202 was further dogged by extreme hardware and software unreliability. A memorandum describing the experience was written in early 1980 but was deemed to be too “sensitive” to release. The original troff input for the memorandum exists and now, more than 30 years later, the memorandum can be released. However, the only available record of its visual appearance was a poor-quality scanned photocopy of the original printed version. This paper details our efforts in rebuilding a faithful retypeset replica of the original memorandum, given that the Linotron 202 disappeared long ago, and that this episode at Bell Labs occurred 5 years before the dawn of PostScript (and later PDF) as de facto standards for digital document preservation. The paper concludes with some lessons for digital archiving policy drawn from this rebuilding exercise.

    Finding community structure in very large networks

    The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O(m d log n) where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with m ~ n and d ~ log n, in which case our algorithm runs in essentially linear time, O(n log^2 n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the web-site of a large online retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400,000 vertices and 2 million edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.
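    To make the idea of hierarchical agglomeration concrete, here is a deliberately naive sketch. It assumes the quantity being optimized is the standard modularity Q = sum_c (e_cc - a_c^2), which the abstract does not state, and it rescans every connected pair of communities at each step. It therefore does not use the data structures behind the O(m d log n) bound quoted above. The function name `greedy_modularity` is hypothetical, and the input is assumed to be a simple undirected graph without self-loops.

```python
def greedy_modularity(edges):
    """Merge the connected pair of communities with the largest change in Q, repeatedly."""
    m = len(edges)
    e = {}  # e[c][d]: fraction of edge ends joining distinct communities c and d
    a = {}  # a[c]: fraction of all edge ends attached to community c
    for u, v in edges:
        for x, y in ((u, v), (v, u)):
            e.setdefault(x, {})
            e[x][y] = e[x].get(y, 0.0) + 1.0 / (2 * m)
            a[x] = a.get(x, 0.0) + 1.0 / (2 * m)
    members = {v: {v} for v in a}
    q = -sum(x * x for x in a.values())  # Q when every vertex is its own community
    best_q, best = q, [set(s) for s in members.values()]
    while True:
        # Change in Q from merging communities i and j: 2 * (e_ij - a_i * a_j).
        pairs = [(2 * (w - a[i] * a[j]), i, j) for i in e for j, w in e[i].items()]
        if not pairs:
            break  # no inter-community edges remain
        dq, i, j = max(pairs, key=lambda t: t[0])
        # Merge j into i: redirect j's links to i, dropping the i-j link itself.
        for k, w in e.pop(j).items():
            del e[k][j]
            if k != i:
                e[i][k] = e[i].get(k, 0.0) + w
                e[k][i] = e[k].get(i, 0.0) + w
        a[i] += a.pop(j)
        members[i] |= members.pop(j)
        q += dq
        if q > best_q:
            best_q, best = q, [set(s) for s in members.values()]
    return best_q, best


# Toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
best_q, communities = greedy_modularity(edges)
print(best_q, communities)
```

    The sketch keeps the partition with the highest Q seen along the way, which corresponds to cutting the dendrogram at its best level.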

    Stochastic blockmodels and community structure in networks

    Stochastic blockmodels have been proposed as a tool for detecting community structure in networks as well as for generating synthetic networks for use as benchmarks. Most blockmodels, however, ignore variation in vertex degree, making them unsuitable for applications to real-world networks, which typically display broad degree distributions that can significantly distort the results. Here we demonstrate how the generalization of blockmodels to incorporate this missing element leads to an improved objective function for community detection in complex networks. We also propose a heuristic algorithm for community detection using this objective function or its non-degree-corrected counterpart and show that the degree-corrected version dramatically outperforms the uncorrected one in both real-world and synthetic networks. Comment: 11 pages, 3 figures

    Efficient modularity optimization by multistep greedy algorithm and vertex mover refinement

    Identifying strongly connected substructures in large networks provides insight into their coarse-grained organization. Several approaches based on the optimization of a quality function, e.g., the modularity, have been proposed. We present here a multistep extension of the greedy algorithm (MSG) that allows the merging of more than one pair of communities at each iteration step. The essential idea is to prevent the premature condensation into few large communities. Upon convergence of the MSG a simple refinement procedure called "vertex mover" (VM) is used for reassigning vertices to neighboring communities to improve the final modularity value. With an appropriate choice of the step width, the combined MSG-VM algorithm is able to find solutions of higher modularity than those reported previously. The multistep extension does not alter the scaling of computational cost of the greedy algorithm. Comment: 7 pages, parts of text rewritten, illustrations and pseudocode representation of algorithms added

    The practice of programming

    Analysis of weighted networks

    The connections in many networks are not merely binary entities, either present or not, but have associated weights that record their strengths relative to one another. Recent studies of networks have, by and large, steered clear of such weighted networks, which are often perceived as being harder to analyze than their unweighted counterparts. Here we point out that weighted networks can in many cases be analyzed using a simple mapping from a weighted network to an unweighted multigraph, allowing us to apply standard techniques for unweighted graphs to weighted ones as well. We give a number of examples of the method, including an algorithm for detecting community structure in weighted networks and a new and simple proof of the max-flow/min-cut theorem. Comment: 9 pages, 3 figures
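    The mapping mentioned in the abstract can be illustrated with a small sketch. It assumes non-negative integer weights and replaces an edge of weight w with w parallel unweighted edges. The abstract does not say how non-integer weights are handled, so that case is left out. The helper names `to_multigraph` and `degrees` are hypothetical.

```python
from collections import Counter


def to_multigraph(weighted_edges):
    """Expand each (u, v, w) into w parallel unweighted edges (u, v)."""
    multi = []
    for u, v, w in weighted_edges:
        if w < 0 or w != int(w):
            raise ValueError("this sketch assumes non-negative integer weights")
        multi.extend([(u, v)] * int(w))
    return multi


def degrees(edges):
    """Ordinary vertex degree, counting parallel edges separately."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


weighted = [("a", "b", 3), ("b", "c", 1), ("a", "c", 2)]
multi = to_multigraph(weighted)
deg = degrees(multi)
print(len(multi), dict(deg))
```

    Under this mapping the plain degree of a vertex in the multigraph equals the sum of the weights on its edges in the original network, which is why unweighted techniques can be carried over.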

    An efficient and principled method for detecting communities in networks

    A fundamental problem in the analysis of network data is the detection of network communities, groups of densely interconnected nodes, which may be overlapping or disjoint. Here we describe a method for finding overlapping communities based on a principled statistical approach using generative network models. We show how the method can be implemented using a fast, closed-form expectation-maximization algorithm that allows us to analyze networks of millions of nodes in reasonable running times. We test the method both on real-world networks and on synthetic benchmarks and find that it gives results competitive with previous methods. We also show that the same approach can be used to extract nonoverlapping community divisions via a relaxation method, and demonstrate that the algorithm is competitively fast and accurate for the nonoverlapping problem. Comment: 14 pages, 5 figures, 1 table

    d_c=4 is the upper critical dimension for the Bak-Sneppen model

    Numerical results are presented indicating d_c=4 as the upper critical dimension for the Bak-Sneppen evolution model. This finding agrees with previous theoretical arguments, but contradicts a recent Letter [Phys. Rev. Lett. 80, 5746-5749 (1998)] that placed d_c as high as d=8. In particular, we find that avalanches are compact for all dimensions d<=4, and are fractal for d>4. Under those conditions, scaling arguments predict a d_c=4, where hyperscaling relations hold for d<=4. Other properties of avalanches, studied for 1<=d<=6, corroborate this result. To this end, an improved numerical algorithm is presented that is based on the equivalent branching process. Comment: 4 pages, RevTex4, as to appear in Phys. Rev. Lett., related papers available at http://userwww.service.emory.edu/~sboettc

    Detecting community structure in networks using edge prediction methods

    Community detection and edge prediction are both forms of link mining: they are concerned with discovering the relations between vertices in networks. Some of the vertex similarity measures used in edge prediction are closely related to the concept of community structure. We use this insight to propose a novel method for improving existing community detection algorithms by using a simple vertex similarity measure. We show that this new strategy can be more effective in detecting communities than the basic community detection algorithms. Comment: 5 pages, 2 figures