An efficient and principled method for detecting communities in networks
A fundamental problem in the analysis of network data is the detection of
network communities, groups of densely interconnected nodes, which may be
overlapping or disjoint. Here we describe a method for finding overlapping
communities based on a principled statistical approach using generative network
models. We show how the method can be implemented using a fast, closed-form
expectation-maximization algorithm that allows us to analyze networks of
millions of nodes in reasonable running times. We test the method both on
real-world networks and on synthetic benchmarks and find that it gives results
competitive with previous methods. We also show that the same approach can be
used to extract nonoverlapping community divisions via a relaxation method, and
demonstrate that the algorithm is competitively fast and accurate for the
nonoverlapping problem.
Comment: 14 pages, 5 figures, 1 table
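The closed-form expectation-maximization updates mentioned in the abstract can be illustrated with a small sketch. It assumes the Poisson link-community formulation, in which the expected number of edges of community z between nodes i and j is theta[i, z] * theta[j, z]. The function name `fit_link_communities`, its parameters, and the dense-array layout are illustrative choices, not taken from the abstract. A dense version like this uses O(n^2 K) memory and is only suitable for small graphs; an implementation that scales to millions of nodes would iterate over edges only.

```python
import numpy as np

def fit_link_communities(A, K, n_iter=200, seed=0):
    """Dense EM sketch for an overlapping-community (link-community) model.

    A      : symmetric (n, n) adjacency matrix (hypothetical input layout)
    K      : number of communities, assumed known
    returns: theta, an (n, K) array of nonnegative node-community strengths
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    theta = rng.random((n, K)) + 0.1  # strictly positive random start

    for _ in range(n_iter):
        # E-step: q[i, j, z] is the probability that an edge between i and j
        # belongs to community z, proportional to theta[i, z] * theta[j, z].
        prod = theta[:, None, :] * theta[None, :, :]        # shape (n, n, K)
        denom = prod.sum(axis=2, keepdims=True)
        q = prod / np.maximum(denom, 1e-300)                # avoid 0/0

        # M-step (closed form):
        # theta[i, z] = sum_j A[i, j] q[i, j, z] / sqrt(sum_ij A[i, j] q[i, j, z])
        w = A[:, :, None] * q
        per_node = w.sum(axis=1)                            # shape (n, K)
        per_comm = w.sum(axis=(0, 1))                       # shape (K,)
        theta = per_node / np.sqrt(np.maximum(per_comm, 1e-300))

    return theta

# Toy usage: two triangles joined by a single edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
theta = fit_link_communities(A, K=2)
```

One way to read overlapping memberships from the result is to normalize each row of `theta` and treat the entries as the fraction of a node's edges attributed to each community. The thresholding or relaxation step used to obtain nonoverlapping divisions is not shown here.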
Blockmodeling Techniques for Complex Networks.
The class of network models known as stochastic blockmodels has recently been gaining popularity. In this dissertation, we present new work that uses blockmodels to answer questions about networks. We create a blockmodel based on the idea of link communities, which naturally gives rise to overlapping vertex communities. We derive a fast and accurate algorithm to fit the model to networks. This model can be related to another blockmodel, which allows the method to efficiently find nonoverlapping communities as well. We then create a heuristic, based on the link community model, for finding the correct number of communities in a network. The heuristic is based on intuitive corrections to likelihood ratio tests. It performs well at finding the correct number of communities in both real networks and synthetic networks generated from the link communities model.

Two commonly studied types of networks are citation networks, where research papers cite other papers, and coauthorship networks, where authors are connected if they have written a paper together. We study a multi-modal network, built from a large dataset of physics publications, that combines the two: directed links between papers represent citations, and an undirected edge connects a scientist and a paper if the scientist helped to write it. This allows for new insights into the relation between social interaction and scientific production. We also have the publication dates of papers, which lets us track our measures over time.

Finally, we create a stochastic model for ranking vertices in a semi-directed network. The probability of connection between two vertices depends on the difference of their ranks. When this model is fit to high school friendship networks, the ranks appear to correspond with a measure of social status. Students have reciprocated, and some unreciprocated, edges with other students of closely similar rank, which correspond to true friendship, and a fraction of the time they claim an aspirational friendship with a much higher-ranked individual. In general, students with more friends have higher ranks than those with fewer friends, and older students have higher ranks than younger students.

PhD, Physics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/108855/1/briball_1.pd