    Multi-scale Laplacian community detection in heterogeneous networks

    Heterogeneous and complex networks represent the intertwined interactions between real-world elements or agents. A fundamental problem of complex network theory involves finding inherent partitions, clusters, or communities. By taking advantage of the recent Laplacian Renormalization Group approach, we scrutinize information diffusion pathways throughout networks to shed further light on this issue. Based on inter-node communicability, our definition provides a unifying framework for multiple partitioning measures: the multi-scale Laplacian (MSL) community detection algorithm. This new framework makes it possible to introduce a scale-dependent optimal partition into communities and to determine the existence of a particular class of nodes, called metastable nodes, which switch community at different scales and are therefore expected to play a central role in the communication between different communities and, consequently, in the control of the whole network. Comment: 14 pages, 12 figures

    A network approach to topic models

    One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach that infers the latent topical structure of a collection of documents. Despite their success (in particular that of their most widely used variant, Latent Dirichlet Allocation (LDA)) and numerous applications in sociology, history, and linguistics, topic models are known to suffer from severe conceptual and practical problems, e.g. a lack of justification for the Bayesian priors, discrepancies with statistical properties of real texts, and the inability to properly choose the number of topics. Here we obtain a fresh view on the problem of identifying topical structures by relating it to the problem of finding communities in complex networks. This is achieved by representing text corpora as bipartite networks of documents and words. By adapting existing community-detection methods, using a stochastic block model (SBM) with non-parametric priors, we obtain a more versatile and principled framework for topic modeling (e.g., it automatically detects the number of topics and hierarchically clusters both the words and documents). The analysis of artificial and real corpora demonstrates that our SBM approach leads to better topic models than LDA in terms of statistical model selection. More importantly, our work shows how to formally relate methods from community detection and topic modeling, opening the possibility of cross-fertilization between these two fields. Comment: 22 pages, 10 figures, code available at https://topsbm.github.io
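
The bipartite document-word representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and lower-casing, which are choices made here for illustration and not taken from the paper; the function name `bipartite_edges` is hypothetical, and the SBM inference step itself is not shown.

```python
from collections import Counter


def bipartite_edges(documents):
    """Return weighted edges {(doc_index, word): count}.

    Each document is one node type and each word the other; the edge
    weight is how often the word occurs in that document.
    Tokenization by whitespace is an assumption of this sketch.
    """
    edges = {}
    for doc_id, text in enumerate(documents):
        for word, count in Counter(text.lower().split()).items():
            edges[(doc_id, word)] = count
    return edges


# Toy corpus, invented for illustration only.
docs = ["network community network", "topic model community"]
edges = bipartite_edges(docs)
```

Community detection on such a graph would then group document nodes and word nodes simultaneously, which is the link to topic modeling that the abstract describes.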

    Combined node and link partitions method for finding overlapping communities in complex networks

    Community detection in complex networks is a fundamental data analysis task in various domains, and how to effectively find overlapping communities in real applications is still a challenge. In this work, we propose a new unified model and method for finding the best overlapping communities on the basis of the associated node and link partitions derived from the same framework. Specifically, we first describe a unified model that accommodates node and link communities (partitions) together, and then present a nonnegative matrix factorization method to learn the parameters of the model. Thereafter, we infer the overlapping communities based on the derived node and link communities, i.e., we determine each overlapping community between the corresponding node and link community with a greedy optimization of a local community function, conductance. Finally, we introduce a model selection method based on consensus clustering to determine the number of communities. We have evaluated our method on both synthetic and real-world networks with ground truths, and compared it with seven state-of-the-art methods. The experimental results demonstrate the superior performance of our method over the competing ones in detecting overlapping communities for all analysed data sets. Improved performance is particularly pronounced in cases of more complicated networked community structures.
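
Conductance, the local community function named in this abstract, is commonly defined for a node set S as the number of edges leaving S divided by the smaller of the total degree inside S and outside S. The sketch below computes that standard quantity for an undirected, unweighted graph; the function name, the adjacency format, and the zero-denominator convention are assumptions of this example, and the greedy optimization from the paper is not reproduced.

```python
def conductance(adjacency, community):
    """Conductance of `community` in an undirected, unweighted graph.

    adjacency: dict mapping each node to the set of its neighbours.
    Returns cut(S, rest) / min(vol(S), vol(rest)).
    """
    community = set(community)
    cut = 0        # edges with exactly one endpoint in the community
    vol_in = 0     # sum of degrees inside the community
    vol_out = 0    # sum of degrees outside the community
    for node, neighbours in adjacency.items():
        if node in community:
            vol_in += len(neighbours)
            cut += sum(1 for n in neighbours if n not in community)
        else:
            vol_out += len(neighbours)
    denom = min(vol_in, vol_out)
    # Returning 0.0 for an empty side is a convention chosen for this sketch.
    return cut / denom if denom else 0.0


# Two triangles joined by the single edge 2-3 (toy graph for illustration).
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
phi_triangle = conductance(adj, {0, 1, 2})  # 1 cut edge over volume 7
phi_pair = conductance(adj, {0, 1})         # 2 cut edges over volume 4
```

Lower values indicate a better-separated community, so the full triangle scores better than the pair of nodes taken from it.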

    Topology-Agnostic Detection of Temporal Money Laundering Flows in Billion-Scale Transactions

    Money launderers exploit the weaknesses in detection systems by purposefully placing their ill-gotten money into multiple accounts at different banks. That money is then layered and moved around among mule accounts to obscure the origin and the flow of transactions. Consequently, the money is integrated into the financial system without raising suspicion. Path-finding algorithms that aim at tracking suspicious flows of money usually struggle with scale and complexity. Existing community detection techniques also fail to properly capture the time-dependent relationships. This is particularly evident when performing analytics over massive transaction graphs. We propose a framework (called FaSTMAN), adapted for domain-specific constraints, to efficiently construct a temporal graph of sequential transactions. The framework includes a weighting method, using a second-order graph representation, to quantify the significance of the edges. This method enables us to distribute complex queries on smaller and densely connected networks of flows. Finally, based on those queries, we can effectively identify networks of suspicious flows. We extensively evaluate the scalability and the effectiveness of our framework against two state-of-the-art solutions for detecting suspicious flows of transactions. For a dataset of over 1 billion transactions from multiple large European banks, the results show a clear superiority of our framework in both efficiency and usefulness.

    The Power Of Locality In Network Algorithms

    Over the last decade we have witnessed the rapid proliferation of large-scale complex networks, spanning many social, information and technological domains. While many of the tasks which users of such networks face are essentially global and involve the network as a whole, the size of these networks is huge and the information available to users is only local. In this dissertation we show that even when faced with stringent locality constraints, one can still effectively solve prominent algorithmic problems on such networks. In the first part of the dissertation we present a natural algorithmic framework designed to model the behaviour of an external agent trying to solve a network optimization problem with limited access to the network data. Our study focuses on local information algorithms: sequential algorithms where the network topology is initially unknown and is revealed only within a local neighborhood of vertices that have been irrevocably added to the output set. We address both network coverage problems and network search problems. Our results include local information algorithms for coverage problems whose performance closely matches the best possible even when information about network structure is unrestricted. We also demonstrate a sharp threshold on the level of visibility required: at a certain visibility level it is possible to design algorithms that nearly match the best approximation possible even with full access to the network structure, but with any less information it is impossible to achieve a reasonable approximation. For preferential attachment networks, we obtain polylogarithmic approximations to the problem of finding the smallest subgraph that connects a subset of nodes and the problem of finding the highest-degree nodes. This is achieved by addressing a decade-old open question of Bollobás and Riordan on locally finding the root in a preferential attachment process.
In the second part of the dissertation we focus on designing highly time-efficient local algorithms for central mining problems on complex networks that have been in the focus of the research community for over a decade: finding a small set of influential nodes in the network, and fast ranking of nodes. Among our results is an essentially runtime-optimal local algorithm for the influence maximization problem in the standard independent cascades model of information diffusion and an essentially runtime-optimal local algorithm for the problem of returning all nodes with PageRank bigger than a given threshold. Our work demonstrates that locality is powerful enough to allow efficient solutions to many central algorithmic problems on complex networks.
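
To make the last problem concrete, here is a minimal sketch of the task of returning all nodes whose PageRank exceeds a threshold. It uses plain global power iteration, which is not the local algorithm of the dissertation; the function names, the damping factor of 0.85, and the fixed iteration count are assumptions of this example.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Global power-iteration PageRank (a baseline, not a local algorithm).

    out_links: dict mapping each node to a list of nodes it links to.
    Every link target must also appear as a key of out_links.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


def nodes_above(rank, threshold):
    """Return, in sorted order, the nodes whose PageRank exceeds threshold."""
    return sorted(v for v, r in rank.items() if r > threshold)


# Toy graph invented for illustration: three pages link to a hub,
# and the hub links back to one of them.
links = {"hub": ["a"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
rank = pagerank(links)
important = nodes_above(rank, 0.25)
```

This baseline touches every node on every iteration, which is the cost a local algorithm of the kind described above aims to avoid.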

    Evolving network structure of academic institutions

    Today’s colleges and universities consist of highly complex structures that dictate interactions between the administration, faculty, and student body. These structures can play a role in dictating the efficiency of policy enacted by the administration and determine the effect that curriculum changes in one department have on other departments. Although the features of these complex structures have a strong impact on the institutions, they remain by and large unknown in many cases. In this paper we study the academic structure of our home institution, Trinity College in Hartford, CT, using the major and minor patterns among graduating students to build a temporal multiplex network describing the interactions between different departments. Using recent network science techniques developed for such temporal networks, we identify the evolving community structures that organize departments’ interactions and quantify the interdisciplinary centrality of each department. We implement this framework for Trinity College, finding practical insights and applications, but also present it as a general framework for colleges and universities to better understand their own structural makeup in order to better inform academic and administrative policy.

    Community Detection in Complex Networks

    Finding communities of connected individuals in social networks is essential for understanding our society and interactions within the network. Recently, attention has turned to analysing these communities in complex network systems. In this thesis, we study three challenges. Firstly, analysing and evaluating the robustness of new and existing score functions, as these functions are used to assess the community structure for a given network. Secondly, unfolding community structures in static social networks. Finally, detecting the dynamics of communities that change over time. The score functions are evaluated on different community structures. The behaviour of these functions is studied by migrating nodes randomly from their community to a random community in a given true partition until all nodes have been migrated away from their communities. Then Multi-Objective Evolutionary Algorithm Based Community Detection in Social Networks (MOEA-CD) is used to capture the intuition of community identification: dense connections within the community and sparse connections with others. This algorithm redirects the design of objective functions according to the nodes' relations within their community and with other communities. This new model includes two new contradictory objectives: the first is to maximise the internal neighbours for each node within a community, and the second is to minimise the maximum external links for each node within a community with respect to its internal neighbours. Both of these objectives are optimised simultaneously to find a set of estimated Pareto-optimal solutions where each solution corresponds to a network partition. Moreover, we propose a new local heuristic search, namely the Neighbour Node Centrality (NNC) strategy, which is combined with the proposed model to improve the performance of MOEA-CD in finding a locally optimal solution. We also design an algorithm which produces community structures that evolve over time.
Recognising that there may be many possible community structures that explain the observed social network at each time step, in contrast to existing methods, which generally treat this as a coupled optimisation problem, we formulate the problem in a Hidden Markov Model framework, which allows the most likely sequence of communities to be found using the Viterbi algorithm, where the many candidate community structures are generated using the Multi-Objective Evolutionary Algorithm. To demonstrate that our study is effective, it is evaluated on synthetic and real-life dynamic networks and it is used to discover the changing Twitter communities of MPs preceding the Brexit referendum.
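
The Viterbi step described here can be sketched generically: hidden states stand for candidate community structures and observations stand for network snapshots. The state names, observation labels, and all probabilities below are invented for illustration; how the thesis actually derives transition and emission probabilities is not stated in the abstract and is not assumed here.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor of s on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Hypothetical candidates and made-up probabilities, for illustration only.
states = ["partition_A", "partition_B"]
snapshots = ["snap0", "snap1", "snap2"]
start_p = {"partition_A": 0.6, "partition_B": 0.4}
trans_p = {
    "partition_A": {"partition_A": 0.7, "partition_B": 0.3},
    "partition_B": {"partition_A": 0.4, "partition_B": 0.6},
}
emit_p = {
    "partition_A": {"snap0": 0.5, "snap1": 0.4, "snap2": 0.1},
    "partition_B": {"snap0": 0.1, "snap1": 0.3, "snap2": 0.6},
}
path = viterbi(snapshots, states, start_p, trans_p, emit_p)
```

For long sequences, summing log-probabilities instead of multiplying raw probabilities would avoid numerical underflow; the multiplicative form is kept here for readability.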

    Motif-based communities in complex networks

    Community definitions usually focus on edges, inside and between the communities. However, the high density of edges within a community determines correlations between nodes that go beyond nearest neighbours and are indicated by the presence of motifs. We show how motifs can be used to define general classes of nodes, including communities, by extending the mathematical expression of Newman-Girvan modularity. We then construct a general framework and apply it to some synthetic and real networks.
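
For reference, the Newman-Girvan modularity that this abstract extends is usually written as follows for an undirected network with adjacency matrix $A$, node degrees $k_i$, $m$ edges, and community labels $c_i$. This is the standard edge-based definition only; the motif-based extension proposed in the paper is not reproduced here.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)
```

Here $\delta(c_i, c_j)$ equals 1 when nodes $i$ and $j$ belong to the same community and 0 otherwise, so $Q$ compares the observed weight of within-community edges with the value expected when edges are placed at random with the same degrees.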