57 research outputs found

    Additive Approximation Algorithms for Modularity Maximization

    Modularity is a quality function for community detection, introduced by Newman and Girvan (2004). Community detection in graphs is now often conducted through modularity maximization: given an undirected graph $G=(V,E)$, we are asked to find a partition $\mathcal{C}$ of $V$ that maximizes the modularity. Although numerous algorithms have been developed to date, most of them have no theoretical approximation guarantee. Recently, to overcome this issue, the design of modularity maximization algorithms with provable approximation guarantees has attracted significant attention in the computer science community. In this study, we further investigate the approximability of modularity maximization. More specifically, we propose a polynomial-time $\left(\cos\left(\frac{3-\sqrt{5}}{4}\pi\right) - \frac{1+\sqrt{5}}{8}\right)$-additive approximation algorithm for the modularity maximization problem. Note that $\cos\left(\frac{3-\sqrt{5}}{4}\pi\right) - \frac{1+\sqrt{5}}{8} < 0.42084$ holds. This improves the current best additive approximation error of $0.4672$, which was recently provided by Dinh, Li, and Thai (2015). Interestingly, our analysis also demonstrates that the proposed algorithm obtains a nearly optimal solution for any instance with a very high modularity value. Moreover, we propose a polynomial-time $0.16598$-additive approximation algorithm for the maximum modularity cut problem. It should be noted that this is the first non-trivial approximability result for the problem. Finally, we demonstrate that our approximation algorithm can be extended to some related problems. Comment: 23 pages, 4 figures
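
    The closed-form additive error quoted in this abstract can be checked numerically. The short Python sketch below is not part of the paper; it only evaluates the stated expression and compares it with the two constants the abstract quotes. The variable names are illustrative.

```python
import math

# Additive error stated in the abstract:
#   cos((3 - sqrt(5)) / 4 * pi) - (1 + sqrt(5)) / 8
sqrt5 = math.sqrt(5.0)
new_bound = math.cos((3.0 - sqrt5) / 4.0 * math.pi) - (1.0 + sqrt5) / 8.0

# Previous best additive error quoted in the abstract (Dinh, Li, and Thai, 2015).
previous_bound = 0.4672

print(f"new bound      = {new_bound:.6f}")
print(f"previous bound = {previous_bound:.4f}")
print(f"improvement    = {previous_bound - new_bound:.6f}")
```

    Running it confirms that the expression evaluates to a value below 0.42084, consistent with the inequality stated in the abstract.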

    Community Detection in Cyber Networks

    Community detection has been widely studied and implemented across various research domains such as social networks, biological networks, neuroscience, and cybersecurity. In the context of cyber networks, it involves identifying groups of network nodes such that connections are dense within each group and sparser between groups. Various community detection algorithms can be used to detect the underlying community structure of a given network. However, a given network can be partitioned into communities in a number of ways, so a quality evaluation metric is needed to determine the best partitioning. Modularity is one such measure, and when evaluating the modularity index, researchers have considered null models for graphs with specific structures or characteristics. However, most real-world complex networks as a whole do not exhibit one specific characteristic but instead consist of various identifiable subgraphs that each exhibit particular characteristics. Accordingly, formulating a null model for these individual subgraphs may improve the modularity value and thereby improve the quality of the partitioning, that is, of the detected communities. This research investigates the extent to which the modularity value increases when a bipartite subgraph is taken into consideration while performing community detection. This is accomplished by designing and developing an empirical setting that first identifies the presence of a bipartite subgraph and then uses it to perform community detection. Our empirical study and results suggest that the quality of the detected communities is enhanced by leveraging the presence of a bipartite subnetwork in the given real-world complex network. Furthermore, we present the applicability of this research in the cybersecurity domain to alleviate the consequences of a worm attack. We can achieve this by employing our technique to obtain a better underlying community structure for identifying the most vulnerable set of nodes in the compromised network.
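
    To make the role of modularity as a quality metric concrete, here is a minimal Python sketch that scores two partitions of the same small graph. It assumes the standard Newman-Girvan definition with the configuration-model null model, Q = sum over communities c of [e_c / m - (d_c / (2m))^2], for an undirected, unweighted graph without self-loops or duplicate edges. It is not the bipartite-aware null model this study investigates, and the function name, graph, and partitions are illustrative only.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected, unweighted graph.

    edges        -- list of (u, v) pairs, each undirected edge listed once
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    intra = defaultdict(int)   # e_c: edges with both endpoints in community c
    degree = defaultdict(int)  # d_c: sum of node degrees in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Partition A: one community per triangle. Partition B: everything together.
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "all" for node in range(6)}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(f"two triangles as communities: Q = {q_split:.4f}")
print(f"single community:             Q = {q_merged:.4f}")
```

    The partition that separates the two triangles scores higher than the single-community partition, which is the sense in which modularity ranks candidate partitionings.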

    Community detection in graphs

    The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i.e. the organization of vertices in clusters, with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters. Such clusters, or communities, can be considered as fairly independent compartments of a graph, playing a role similar to that of, e.g., the tissues or the organs in the human body. Detecting communities is of great importance in sociology, biology and computer science, disciplines where systems are often represented as graphs. This problem is very hard and not yet satisfactorily solved, despite the huge effort of a large interdisciplinary community of scientists working on it over the past few years. We will attempt a thorough exposition of the topic, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists, from the discussion of crucial issues like the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks. Comment: Review article. 103 pages, 42 figures, 2 tables. Two sections expanded + minor modifications. Three figures + one table + references added. Final version published in Physics Reports

    Mesoscopic descriptions of complex networks

    The aim of this thesis is to study the substructures that appear at a mesoscopic level of resolution in complex networks. These substructures, called communities in the field of complex networks, group the nodes of a network so that nodes belonging to the same community are more densely connected to each other than to the rest of the network. Analyzing these structures matters because they help us understand complex networks better by giving us information about the function of the communities that compose them. We have studied these mesoscopic structures using the topological information of the networks. The methods used fall into two large families, commonly known as hierarchical clustering and modular clustering. Within the first family, we identified a non-uniqueness problem in agglomerative hierarchical clustering and proposed a solution based on a new classification tool that we call the multidendrogram. We then applied the result of a hierarchical classification to a problem in financial complex networks: specifically, we used a partition into clusters to solve the portfolio optimization problem more efficiently. The second family of clustering methods is based on optimizing an objective function called modularity. The drawback of modularity optimization is its high computational cost, which led us to devise an analytical reduction of the size of complex networks that preserves all the information in the original network needed to find the community structure that optimizes modularity. We then used this simplification of the calculations to analyze the entire topological mesoscale of complex networks. We studied this mesoscale by adding to every node of a network the same value, which measures its resistance to being part of communities. Optimizing modularity for these new instances of the original network, obtained from analytically bounded resistance values, allows us to analyze the topological mesoscale of the networks. Finally, we proposed a generalization of the modularity function in which the building blocks are no longer only edges but can be different types of motifs. This yields more general descriptions of groups of nodes, which include communities as a particular case.

    Structural and dynamical interdependencies in complex networks at meso- and macroscale: nestedness, modularity, and in-block nestedness

    Many real systems, like the brain, are considered to be complex, i.e. they are made of several interacting components and display a collective behaviour that cannot be inferred from how the individual parts behave. They are usually described as networks, with the components represented as nodes and the interactions between them as links. Research into networks mainly focuses on exploring how a network's dynamic behaviour is constrained by the nature and topology of the interactions between its elements. Analyses of this sort are performed on three scales: the microscale, based on single nodes; the macroscale, which explores the whole network; and the mesoscale, which studies groups of nodes. Nonetheless, most studies so far have focused on only one scale, despite increasing evidence suggesting that networks exhibit structure on several scales. In our thesis, we apply structural analysis to a variety of synthetic and empirical networks on multiple scales. We focus on the examination of nested, modular, and in-block nested patterns, and the effects that they impose on each other. Finally, we introduce a theoretical model to help us better understand some of the mechanisms that enable such patterns to emerge. Information and network technologies

    Community Detection In Evolving Networks

    Most social networks are characterized by the presence of community structure, viz. the existence of clusters of nodes with a much higher proportion of links within the clusters than between the clusters. Community detection has many applications in many kinds of networks, including social networks and biological networks. Many different approaches have been proposed to solve the problem. An approach that has been shown to scale well to large networks is the Louvain method, based on maximizing modularity, which is a quality function of a partition of the nodes. In this thesis, we address the problem of community detection in evolving social networks. As social networks evolve, the community structure of the network can change. How can the community structure be updated in an efficient way? How often should community structure be updated? In this thesis, we give two methods based on the Louvain algorithm, to determine when to update the community structure. The first method, called the Edge-Distribution-Analysis algorithm, analyzes the newly added edges in order to make this decision. The second method, called the Modularity-Change-Rate algorithm, finds the rate of modularity change in a given network, and uses it to predict whether an update is required. Due to the sparsity of real datasets of evolving networks, we propose three models to generate evolving networks: a Random model, a model based on the well-known phenomenon of homophily in social networks, and another based on the phenomenon of triadic and cyclic closure. Starting with real-world data sets, we used these models to generate evolving networks. We evaluated the Edge-Distribution-Analysis algorithm and Modularity-Change-Rate algorithm on these data sets. Our results show that both our methods predict quite well when the community structure should be updated. 
They result in significant computational savings compared to approaches that would update the community structure after a fixed number of edge additions, while ensuring that the quality of the community structure is comparable.