
    Clustering and Community Detection in Directed Networks: A Survey

    Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of these cases the graphs are directed, in the sense that the edges carry a direction, which makes their semantics non-symmetric. An interesting feature of real networks is the clustering or community structure property, under which the graph topology is organized into modules commonly called communities or clusters. The essence is that nodes of the same community are highly similar, while nodes across communities present low similarity. Revealing the underlying community structure of directed complex networks has become a crucial and interdisciplinary topic with a plethora of applications. Naturally, there has been a recent wealth of research on mining directed graphs, with clustering being the primary method and tool for community detection and evaluation. The goal of this paper is to offer an in-depth review of the methods presented so far for clustering directed networks, along with the necessary methodological background and related applications. The survey commences with a concise review of the fundamental concepts and the methodological base on which graph clustering algorithms capitalize. Then we present the relevant work along two orthogonal classifications. The first is mostly concerned with the methodological principles of the clustering algorithms, while the second approaches the methods from the viewpoint of the properties of a good cluster in a directed network. Further, we present methods and metrics for evaluating graph clustering results, demonstrate interesting application domains and provide promising future research directions. Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear)

    Characterization of complex networks: A survey of measurements

    Each complex network (or class of networks) presents specific topological features which characterize its connectivity and highly influence the dynamics of processes executed on the network. The analysis, discrimination, and synthesis of complex networks therefore rely on the use of measurements capable of expressing the most relevant topological features. This article presents a survey of such measurements. It includes general considerations about complex network characterization, a brief review of the principal models, and the presentation of the main existing measurements. Important related issues covered in this work comprise the representation of the evolution of complex networks in terms of trajectories in several measurement spaces, the analysis of the correlations between some of the most traditional measurements, perturbation analysis, and the use of multivariate statistics for feature selection and network classification. Depending on the network and the analysis task one has in mind, a specific set of features may be chosen. It is hoped that the present survey will help the proper application and interpretation of measurements. Comment: A working manuscript with 78 pages, 32 figures. Suggestions of measurements for inclusion are welcomed by the author

    Detecting Core-Periphery Structures by Surprise

    Detecting the presence of mesoscale structures in complex networks is of primary importance. This is especially true for financial networks, whose structural organization deeply affects their resilience to events like default cascades, shock propagation, etc. Several methods have been proposed so far to detect communities, i.e. groups of nodes whose connectivity is significantly large. Communities, however, do not represent the only kind of mesoscale structure characterizing real-world networks: other examples are provided by bow-tie structures, core-periphery structures and bipartite structures. Here we propose a novel method to detect statistically significant bimodular structures, i.e. either bipartite or core-periphery ones. It is based on a modification of the surprise, recently proposed for detecting communities. Our variant allows bimodular node partitions to be revealed, by letting links be placed either 1) within the core part and between the core and the periphery parts or 2) just between the (empty) layers of a bipartite network. From a technical point of view, this is achieved by employing a multinomial hypergeometric distribution instead of the traditional (binomial) hypergeometric one; as in the latter case, this allows a p-value to be assigned to any given (bi)partition of the nodes. To illustrate the performance of our method, we report the results of its application to several real-world networks, including social, economic and financial ones. Comment: 11 pages, 10 figures. Python code freely available at https://github.com/jeroenvldj/bimodular_surpris
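To make the p-value idea in the abstract above concrete, here is a minimal sketch of the traditional (binomial) hypergeometric surprise for an ordinary community partition. It is not the paper's multinomial variant and is not taken from the linked repository. The function name `surprise_p_value` and its argument layout are assumptions made for illustration, and the graph is assumed to be undirected and simple.

```python
from math import comb


def surprise_p_value(n_nodes, edges, membership):
    """Hypergeometric tail probability of observing at least as many
    intra-community links as the given partition does.

    n_nodes    -- number of nodes, labelled 0 .. n_nodes - 1
    edges      -- list of undirected (u, v) pairs, no self-loops or duplicates
    membership -- membership[i] is the community label of node i
    """
    total_pairs = comb(n_nodes, 2)              # all possible links

    # Count node pairs that fall inside a single community.
    sizes = {}
    for label in membership:
        sizes[label] = sizes.get(label, 0) + 1
    intra_pairs = sum(comb(size, 2) for size in sizes.values())

    n_links = len(edges)                        # observed links
    intra_links = sum(1 for u, v in edges if membership[u] == membership[v])

    # Sum the hypergeometric probabilities from the observed value upward.
    upper = min(n_links, intra_pairs)
    tail = sum(
        comb(intra_pairs, i) * comb(total_pairs - intra_pairs, n_links - i)
        for i in range(intra_links, upper + 1)
    )
    return tail / comb(total_pairs, n_links)


if __name__ == "__main__":
    edges = [(0, 1), (2, 3)]
    print(surprise_p_value(4, edges, [0, 0, 1, 1]))  # partition matching the links
    print(surprise_p_value(4, edges, [0, 1, 0, 1]))  # partition cutting every link
```

A smaller p-value means the partition concentrates links inside communities more than chance would; the surprise score is usually reported as the negative logarithm of this value.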

    Mal-Netminer: Malware Classification Approach based on Social Network Analysis of System Call Graph

    As the security landscape evolves over time, with thousands of species of malicious code seen every day, antivirus vendors strive to detect and classify malware families for efficient and effective responses against malware campaigns. To enrich this effort, and by capitalizing on ideas from the social network analysis domain, we build a tool that can help classify malware families using features derived from the graph structure of their system calls. To achieve that, we first construct a system call graph that consists of system calls found in the execution of the individual malware families. To explore distinguishing features of various malware species, we study social network properties as applied to the call graph, including the degree distribution, degree centrality, average distance, clustering coefficient, network density, and component ratio. We utilize features derived from those properties to build a classifier for malware families. Our experimental results show that influence-based graph metrics such as the degree centrality are effective for classifying malware, whereas the general structural metrics are less effective. Our experiments demonstrate that the proposed system performs well in detecting and classifying malware families within each malware class, with accuracy greater than 96%. Comment: Mathematical Problems in Engineering, Vol 201
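As an illustration of the graph properties named in the abstract above, the sketch below computes degree centrality, network density and the average clustering coefficient for a tiny graph. It is not the authors' tool. The edge list, the node labels and all function names are hypothetical, and the call graph is treated as undirected and free of self-loops for simplicity.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree, n - 1."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def density(adj):
    """Fraction of possible undirected links that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def feature_vector(edges):
    """Small feature dictionary that a downstream classifier could consume."""
    adj = build_adjacency(edges)
    centrality = degree_centrality(adj)
    return {
        "max_degree_centrality": max(centrality.values()),
        "density": density(adj),
        "avg_clustering": sum(local_clustering(adj, v) for v in adj) / len(adj),
    }


if __name__ == "__main__":
    # Hypothetical system-call transitions, not data from the paper.
    calls = [("open", "read"), ("read", "write"),
             ("open", "write"), ("write", "close")]
    print(feature_vector(calls))
```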

    Identifying communities by influence dynamics in social networks

    Communities are not static; they evolve, split and merge, appear and disappear, i.e. they are the product of dynamical processes that govern the evolution of the network. A good algorithm for community detection should not only quantify the topology of the network, but also incorporate the dynamical processes that take place on the network. We present a novel algorithm for community detection that combines network structure with processes that support the creation and/or evolution of communities. The algorithm does not embrace the universal approach but instead focuses on social networks and models the dynamic social interactions that occur on those networks. It identifies leaders, and communities that form around those leaders. It naturally supports overlapping communities by associating each node with a membership vector that describes the node's involvement in each community. This way, in addition to overlapping communities, we can identify nodes that are good followers of their leader, and also nodes with no clear community involvement that serve as a proxy between several communities and are equally important. We run the algorithm on several real social networks, which we believe represent a good fraction of the wide body of social networks, and discuss the results, including other possible applications. Comment: 10 pages, 6 figures

    Overlapping modularity at the critical point of k-clique percolation

    One of the most remarkable social phenomena is the formation of communities in social networks corresponding to families, friendship circles, work teams, etc. Since people usually belong to several different communities at the same time, the induced overlaps result in an extremely complicated web of the communities themselves. Thus, uncovering the intricate community structure of social networks is a non-trivial task with great potential for practical applications, and it has gained notable interest in recent years. The Clique Percolation Method (CPM) is one of the earliest overlapping community finding methods, and it has already been used in the analysis of several different social networks. In this approach the communities correspond to k-clique percolation clusters, and the general heuristic for setting the parameters of the method is to tune the system just below the critical point of k-clique percolation. However, this rule is based on simple physical principles and its validity was never subject to quantitative analysis. Here we examine the quality of the partitioning in the vicinity of the critical point using recently introduced overlapping modularity measures. According to our results on real social and other networks, the overlapping modularities show a maximum close to the critical point, justifying the original criteria for the optimal parameter settings. Comment: 20 pages, 6 figures
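The notion of k-clique percolation clusters used in the abstract above can be sketched in a few lines. In the usual CPM definition, two k-cliques are adjacent when they share k - 1 nodes, and a community is the union of all k-cliques reachable from one another through such adjacencies. The brute-force version below is an illustrative assumption, not the authors' implementation: the function name `k_clique_communities` is hypothetical, and enumerating every k-subset of nodes is only practical for very small graphs.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Overlapping communities as k-clique percolation clusters.

    edges -- list of undirected (u, v) pairs with sortable node labels
    k     -- clique size (k >= 2)
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate all k-cliques by checking every k-subset of nodes.
    nodes = sorted(adj)
    cliques = [
        frozenset(subset)
        for subset in combinations(nodes, k)
        if all(b in adj[a] for a, b in combinations(subset, 2))
    ]

    # Union-find over cliques: merge cliques that share k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each community is the union of the nodes of its merged cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
             (3, 4), (3, 5), (4, 5)]
    print(k_clique_communities(edges, 3))
```

In this toy graph node 3 belongs to both communities, which is the kind of overlap the method is designed to capture; a node that sits in no k-clique is left unassigned.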