
    Efficient modularity density heuristics in graph clustering and their applications

    Modularity Density Maximization is a graph clustering problem which avoids the resolution limit degeneracy of the Modularity Maximization problem. This thesis aims to solve larger instances than current Modularity Density heuristics can handle and to show how close the obtained solutions are to the expected clustering. Three main contributions arise from this objective. The first is a set of theoretical results about properties of Modularity Density based prioritizers. The second is the development of eight Modularity Density Maximization heuristics. Our heuristics are compared with optimal results from the literature and with the GAOD, iMeme-Net, HAIN, and BMD- heuristics. Our results are also compared with CNM and Louvain, heuristics for Modularity Maximization that solve instances with thousands of nodes. The tests were carried out on graphs from the “Stanford Large Network Dataset Collection”. The experiments showed that our eight heuristics found solutions for graphs with hundreds of thousands of nodes, and that five of them surpassed the current state-of-the-art Modularity Density Maximization heuristic solvers for large graphs. The third contribution is the proposal of six column generation methods. These methods use exact and heuristic auxiliary solvers and an initial variable generator. Comparisons between our proposed column generation methods and state-of-the-art algorithms were also carried out. The results showed that (i) two of our methods surpassed the state-of-the-art algorithms in terms of time, and (ii) our methods proved the optimal value for larger instances than current approaches can tackle. Our results suggest clear improvements over the state of the art for the Modularity Density Maximization problem.
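
    For readers unfamiliar with the objective itself, the following is a minimal Python sketch of one common formulation of modularity density (the Li et al. definition; variants such as Qds differ). The graph, partition, and function names are illustrative and not taken from the thesis.

    # Minimal sketch of the modularity density objective in the Li et al. (2008)
    # formulation: D = sum over communities c of (2*e_in(c) - e_out(c)) / |c|,
    # where e_in(c) counts edges inside c and e_out(c) counts edges leaving c.
    # Illustration only; the thesis's heuristics and solvers are not reproduced here.
    import networkx as nx

    def modularity_density(G, communities):
        """communities: an iterable of disjoint node sets covering G."""
        D = 0.0
        for c in communities:
            c = set(c)
            e_in = G.subgraph(c).number_of_edges()
            e_out = sum(1 for u, v in G.edges() if (u in c) != (v in c))
            D += (2.0 * e_in - e_out) / len(c)
        return D

    G = nx.karate_club_graph()
    partition = [set(range(17)), set(range(17, 34))]
    print(modularity_density(G, partition))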

    Community Detection via Maximization of Modularity and Its Variants

    In this paper, we first discuss the definition of modularity (Q) used as a metric for community quality, and then review the modularity maximization approaches used for community detection over the last decade. We then discuss two opposite yet coexisting problems of modularity optimization: in some cases it tends to favor small communities over large ones, while in others it favors large communities over small ones (the so-called resolution limit problem). Next, we review several community quality metrics proposed to solve the resolution limit problem and discuss Modularity Density (Qds), which simultaneously avoids both problems of modularity. Finally, we introduce two novel fine-tuned community detection algorithms that iteratively attempt to improve the community quality measurements by splitting and merging the given network community structure. The first of them, referred to as Fine-tuned Q, is based on modularity (Q), while the second, denoted Fine-tuned Qds, is based on Modularity Density (Qds). We then compare the greedy algorithm of modularity maximization (denoted Greedy Q), Fine-tuned Q, and Fine-tuned Qds on four real networks, as well as on the classical clique network and the LFR benchmark networks, each instantiated over a wide range of parameters. The results indicate that Fine-tuned Qds is the most effective of the three algorithms. Moreover, we show that Fine-tuned Qds can be applied to the communities detected by other algorithms to significantly improve their results.
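
    As background for the metrics compared above, here is a small sketch of computing the standard modularity Q of a hard partition; names and the example partition are ours, and the paper's fine-tuned split/merge algorithms are not reproduced.

    # Sketch of the standard modularity Q = sum_c [ e_c/m - (d_c/(2m))^2 ],
    # where m is the total number of edges, e_c the number of edges inside
    # community c, and d_c the total degree of the nodes in c. Illustration only.
    import networkx as nx
    from networkx.algorithms.community import modularity as nx_modularity

    def modularity_Q(G, communities):
        m = G.number_of_edges()
        Q = 0.0
        for c in communities:
            c = set(c)
            e_c = G.subgraph(c).number_of_edges()
            d_c = sum(deg for _, deg in G.degree(c))
            Q += e_c / m - (d_c / (2.0 * m)) ** 2
        return Q

    G = nx.karate_club_graph()
    partition = [set(range(17)), set(range(17, 34))]
    print(modularity_Q(G, partition))
    print(nx_modularity(G, partition))  # library implementation, should agree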

    Finding community structure in networks using the eigenvectors of matrices

    We consider the problem of detecting communities or modules in networks, groups of vertices with a higher-than-average density of edges connecting them. Previous work indicates that a robust approach to this problem is the maximization of the benefit function known as "modularity" over possible divisions of a network. Here we show that this maximization process can be written in terms of the eigenspectrum of a matrix we call the modularity matrix, which plays a role in community detection similar to that played by the graph Laplacian in graph partitioning calculations. This result leads us to a number of possible algorithms for detecting community structure, as well as several other results, including a spectral measure of bipartite structure in networks and a new centrality measure that identifies those vertices that occupy central positions within the communities to which they belong. The algorithms and measures proposed are illustrated with applications to a variety of real-world complex networks.
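
    A minimal numerical sketch of the leading-eigenvector idea described above follows: one bisection step using the modularity matrix B = A - kk^T/(2m). The function name and example graph are ours, and the refinement and recursive subdivision steps of the full method are omitted.

    # Sketch of one bisection step of the leading-eigenvector method:
    # build the modularity matrix B = A - k k^T / (2m) and split the nodes
    # by the sign of its dominant eigenvector. Illustration only.
    import numpy as np
    import networkx as nx

    def spectral_bisection(G):
        nodes = list(G.nodes())
        A = nx.to_numpy_array(G, nodelist=nodes)
        k = A.sum(axis=1)                       # degree vector
        m = k.sum() / 2.0                       # number of edges
        B = A - np.outer(k, k) / (2.0 * m)      # modularity matrix
        eigvals, eigvecs = np.linalg.eigh(B)    # B is symmetric
        leading = eigvecs[:, -1]                # eigenvector of the largest eigenvalue
        group1 = {n for n, v in zip(nodes, leading) if v >= 0}
        return group1, set(nodes) - group1

    G = nx.karate_club_graph()
    print(spectral_bisection(G))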

    COMMUNITY DETECTION IN COMPLEX NETWORKS AND APPLICATION TO DENSE WIRELESS SENSOR NETWORKS LOCALIZATION

    Complex network analysis is applied in numerous research areas. Features and characteristics of complex networks provide information about a structural property called community structure. Naturally, nodes with similar attributes are more likely to form a community. Community detection is the process by which complex network data are analyzed to uncover organizational properties and structure, and ultimately to enable extraction of useful information. Analysis of Wireless Sensor Networks (WSNs) is considered one of the most important categories of network analysis due to their numerous and emerging applications. Most WSN applications are location-aware, which entails precise localization of the deployed sensor nodes. However, localization of sensor nodes in very dense networks is a challenging task. Among the various challenges associated with localization of dense WSNs, anchor node selection stands out as a prominent open problem. Optimum anchor selection impacts overall sensor node localization in terms of accuracy and consumed energy. In this thesis, various approaches are developed to address both overlapping and non-overlapping community detection. The proposed approaches target small to very large networks in near-linear time, which is important for very large, densely connected networks. Performance of the proposed techniques is evaluated over real-world datasets with up to 10^6 nodes and synthetic networks via Newman's Modularity and Normalized Mutual Information (NMI). Moreover, the proposed community detection approaches are extended to develop a novel criterion for range-free anchor selection in WSNs. Our approach uses novel objective functions based on nodes' community memberships to reveal a set of anchors among all available permutations of anchor-selection sets. The performance of the proposed approach, measured as the mean and variance of the localization error, is evaluated for a variety of node deployment scenarios and compared with random anchor selection and the full-ranging approach. In order to study the effectiveness of our algorithm, the performance is evaluated over several simulations that randomly generate network configurations. By incorporating our proposed criteria, the accuracy of the position estimate is improved significantly relative to random anchor selection localization methods. Simulation results show that the proposed technique significantly improves both the accuracy and the precision of the location estimation.
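
    Since NMI is used above to score detected partitions on synthetic benchmarks, the short sketch below shows how a detected labeling can be compared against a ground-truth labeling with scikit-learn; the label vectors are invented for illustration, not data from the thesis.

    # Sketch: scoring a detected partition against ground truth with
    # Normalized Mutual Information (NMI). The label vectors are made-up
    # per-node community ids, not results from the thesis.
    from sklearn.metrics import normalized_mutual_info_score

    ground_truth = [0, 0, 0, 1, 1, 1, 2, 2]   # true community of each node
    detected     = [0, 0, 1, 1, 1, 1, 2, 2]   # labels produced by some algorithm
    print(normalized_mutual_info_score(ground_truth, detected))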

    Unbiased sampling of network ensembles

    Sampling random graphs with given properties is a key step in the analysis of networks, as random ensembles represent basic null models required to identify patterns such as communities and motifs. An important requirement is that the sampling process is unbiased and efficient. The main approaches are microcanonical, i.e. they sample graphs that match the enforced constraints exactly. Unfortunately, when applied to strongly heterogeneous networks (like most real-world examples), the majority of these approaches become biased and/or time-consuming. Moreover, the algorithms defined in the simplest cases, such as binary graphs with given degrees, are not easily generalizable to more complicated ensembles. Here we propose a solution to the problem via the introduction of a "Maximize and Sample" ("Max & Sam" for short) method to correctly sample ensembles of networks where the constraints are 'soft', i.e. realized as ensemble averages. Our method is based on exact maximum-entropy distributions and is therefore unbiased by construction, even for strongly heterogeneous networks. It is also more computationally efficient than most microcanonical alternatives. Finally, it works for both binary and weighted networks with a variety of constraints, including combined degree-strength sequences and full reciprocity structure, for which no alternative method exists. Our canonical approach can in principle be turned into an unbiased microcanonical one, via a restriction to the relevant subset. Importantly, the analysis of the fluctuations of the constraints suggests that the microcanonical and canonical versions of all the ensembles considered here are not equivalent. We show various real-world applications and provide a code implementing all our algorithms. (MATLAB code available at http://www.mathworks.it/matlabcentral/fileexchange/46912-max-sam-package-zi)
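
    To make the "soft constraint" idea concrete, the sketch below fits a maximum-entropy ensemble for a binary undirected graph with given expected degrees and draws one sample from it. It is a simplified illustration with invented function names and a heuristic fixed-point solver, not the authors' Max & Sam implementation (see the MATLAB package linked above).

    # Sketch of canonical (soft-constraint) sampling for a binary undirected
    # graph with given expected degrees: fit hidden variables x_i so that
    # <k_i> = sum_{j != i} x_i*x_j / (1 + x_i*x_j) matches the target degrees,
    # then draw each edge independently with probability p_ij = x_i*x_j / (1 + x_i*x_j).
    import numpy as np

    def fit_hidden_variables(degrees, n_iter=5000):
        k = np.asarray(degrees, dtype=float)
        x = k / np.sqrt(k.sum())                    # common starting guess
        for _ in range(n_iter):
            denom = x[None, :] / (1.0 + np.outer(x, x))
            np.fill_diagonal(denom, 0.0)
            x = k / denom.sum(axis=1)               # fixed-point update
        return x

    def sample_graph(x, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        p = np.outer(x, x) / (1.0 + np.outer(x, x)) # edge probabilities
        n = len(x)
        A = np.zeros((n, n), dtype=int)
        iu = np.triu_indices(n, k=1)
        A[iu] = (rng.random(len(iu[0])) < p[iu]).astype(int)
        return A + A.T                              # symmetric adjacency matrix

    x = fit_hidden_variables([3, 3, 2, 2, 1, 1])
    A = sample_graph(x)
    print(A.sum(axis=1))  # realized degrees fluctuate around the targets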