Algorithms For Discovering Communities In Complex Networks
It has been observed that real-world random networks like the WWW, Internet, social networks, citation networks, etc., organize themselves into closely-knit groups that are locally dense and globally sparse. These closely-knit groups are termed communities. Nodes within a community are similar in some aspect. For example, in a WWW network, communities might consist of web pages that share similar content. Mining these communities facilitates a better understanding of their evolution and topology, and is of great theoretical and commercial significance. Community-related research has focused on two main problems: community discovery and community identification. Community discovery is the problem of extracting all the communities in a given network, whereas community identification is the problem of identifying the community to which a given set of nodes belongs. We make a comparative study of various existing community-discovery algorithms. We then propose a new algorithm based on bibliographic metrics, which addresses the drawbacks of existing approaches. Bibliographic metrics are used to study similarities between publications in a citation network. Our algorithm classifies nodes in the network based on the similarity of their neighborhoods. One of the drawbacks of current community-discovery algorithms is their computational complexity: they do not scale to the enormous size of real-world networks. We propose a hash-table-based technique that helps us compute the bibliometric similarity between nodes in O(mΔ) time, where m is the number of edges in the graph and Δ is the largest degree. Next, we investigate different centrality metrics. Centrality metrics are used to portray the importance of a node in the network. We propose an algorithm that utilizes centrality metrics of the nodes to compute the importance of the edges in the network. 
Removal of the edges in ascending order of their importance breaks the network into components, each of which represents a community. We compare the performance of the algorithm on synthetic networks with a known community structure using several centrality metrics. Performance was measured as the percentage of nodes that were correctly classified. As an illustration, we model the ucf.edu domain as a web graph and analyze the changes in its properties, such as the densification power law, edge density, degree distribution, and diameter, over a five-year period. Our results show super-linear growth in the number of edges with time. We observe (and explain) that despite the increase in average degree of the nodes, the edge density decreases with time.
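The abstract above does not spell out its hash-table technique, so the following is only a minimal sketch of one way to stay within an O(mΔ) bound, not the author's method. It assumes that the similarity of two nodes is the number of neighbors they share (in the spirit of bibliographic coupling and co-citation), and the function name `common_neighbor_counts` is hypothetical. Every node adds one count to each pair of its own neighbors, so the total work is the sum of squared degrees, which is at most 2mΔ.

```python
from collections import defaultdict
from itertools import combinations


def common_neighbor_counts(edges):
    """Count shared neighbors for every node pair that has at least one.

    Assumption: an undirected, simple graph given as (u, v) pairs whose
    node labels can be sorted.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    counts = defaultdict(int)  # hash table keyed by the node pair (a, b), a < b
    for nbrs in adj.values():
        # A node is a shared neighbor of every pair of its own neighbors.
        for a, b in combinations(sorted(nbrs), 2):
            counts[(a, b)] += 1
    return dict(counts)


# A 4-cycle 1-2-3-4-1: opposite corners share two neighbors each.
square = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(common_neighbor_counts(square))
```

Pairs with no shared neighbor never enter the table, which is what keeps the cost tied to the edge count rather than to the number of node pairs.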
Finding and evaluating community structure in networks
We propose and study a set of algorithms for discovering community structure
in networks -- natural divisions of network nodes into densely connected
subgroups. Our algorithms all share two definitive features: first, they
involve iterative removal of edges from the network to split it into
communities, the edges removed being identified using one of a number of
possible "betweenness" measures, and second, these measures are, crucially,
recalculated after each removal. We also propose a measure for the strength of
the community structure found by our algorithms, which gives us an objective
metric for choosing the number of communities into which a network should be
divided. We demonstrate that our algorithms are highly effective at discovering
community structure in both computer-generated and real-world network data, and
show how they can be used to shed light on the sometimes dauntingly complex
structure of networked systems.
Comment: 16 pages, 13 figures
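The procedure described in this abstract can be sketched roughly as follows. This is an illustrative, unoptimized version under stated assumptions: it uses shortest-path edge betweenness (only one of the possible "betweenness" measures the abstract mentions), assumes an unweighted, undirected graph without self-loops, and scores each split with the quantity commonly called modularity. All function names are hypothetical.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness via BFS path counting (Brandes-style)."""
    bet = defaultdict(float)
    for s in adj:
        dist, sigma, preds = {s: 0}, defaultdict(float), defaultdict(list)
        sigma[s] = 1.0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):  # accumulate from the farthest nodes back
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += share
                delta[v] += share
    return bet  # each pair is counted from both ends; fine for ranking


def components(adj):
    """Connected components of the current graph, as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def modularity(orig_adj, comps):
    """Fraction of edges inside groups minus the fraction expected by degree."""
    m = sum(len(nbrs) for nbrs in orig_adj.values()) / 2
    q = 0.0
    for c in comps:
        inside = sum(1 for v in c for w in orig_adj[v] if w in c) / 2
        degree = sum(len(orig_adj[v]) for v in c)
        q += inside / m - (degree / (2 * m)) ** 2
    return q


def split_by_betweenness(edges):
    """Remove the highest-betweenness edge repeatedly; keep the best-scoring split."""
    orig = defaultdict(set)
    for u, v in edges:
        orig[u].add(v)
        orig[v].add(u)
    adj = {v: set(nbrs) for v, nbrs in orig.items()}
    best = components(adj)
    best_q = modularity(orig, best)
    while any(adj.values()):
        bet = edge_betweenness(adj)  # recalculated after every removal
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(adj)
        q = modularity(orig, comps)
        if q > best_q:
            best, best_q = comps, q
    return best, best_q


# Two triangles joined by a single bridge edge (2, 3).
bridge_graph = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
groups, score = split_by_betweenness(bridge_graph)
print(groups, round(score, 4))
```

Recomputing betweenness inside the loop is the expensive part, but it is the step the abstract singles out as crucial: once an edge is removed, the shortest paths it carried are rerouted and the ranking of the remaining edges can change.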
Discovering Communities of Community Discovery
Discovering communities in complex networks means grouping nodes similar to
each other, to uncover latent information about them. There are hundreds of
different algorithms to solve the community detection task, each with its own
understanding and definition of what a "community" is. Dozens of review works
attempt to order such a diverse landscape -- classifying community discovery
algorithms by the process they employ to detect communities, by their
explicitly stated definition of community, or by their performance on a
standardized task. In this paper, we classify community discovery algorithms
according to a fourth criterion: the similarity of their results. We create an
Algorithm Similarity Network (ASN), whose nodes are the community detection
approaches, connected if they return similar groupings. We then perform
community detection on this network, grouping algorithms that consistently
return the same partitions or overlapping coverage over a span of more than one
thousand synthetic and real-world networks. This paper is an attempt to create
a similarity-based classification of community detection algorithms based on
empirical data. It improves over the state of the art by comparing more than
seventy approaches, discovering that the ASN contains well-separated groups,
making it a sensible tool for practitioners, aiding their choice of algorithms
fitting their analytic needs.
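As a rough illustration of the idea behind such a similarity network, the sketch below links two algorithms when the groupings they return on one network are similar. It rests on assumptions that the abstract does not confirm: normalized mutual information (NMI) as the similarity measure, a hypothetical threshold of 0.8, disjoint partitions only, and a single test network rather than an aggregate over many. The function names and the toy labels are illustrative only.

```python
from collections import Counter
from itertools import combinations
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same nodes."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    if h_a == 0 and h_b == 0:
        return 1.0  # both labelings put every node in a single group
    return 2 * mutual / (h_a + h_b)


def similarity_network(results, threshold=0.8):
    """Link algorithm names whose labelings agree at or above the threshold.

    results maps an algorithm name to its list of community labels, one
    label per node, with nodes in the same order for every algorithm.
    """
    return [
        (p, q)
        for p, q in combinations(sorted(results), 2)
        if nmi(results[p], results[q]) >= threshold
    ]


# Toy labelings of four nodes: A and B agree up to renaming, C cuts across both.
toy_results = {"A": [0, 0, 1, 1], "B": [1, 1, 0, 0], "C": [0, 1, 0, 1]}
print(similarity_network(toy_results))
```

NMI ignores the label names themselves, so two algorithms that find the same groups under different numbering still count as identical. Running a community-detection method on the resulting edge list would then group algorithms that tend to agree.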