4 research outputs found

    Finding maximal bicliques in bipartite networks using node similarity

    In real-world complex networks, communities are usually both overlapping and hierarchical. Bipartite networks are a very important class of complex networks, and maximal bicliques are the strongest possible structural communities within them. Here we consider overlapping communities in bipartite networks and propose a method that detects an order-limited number of overlapping maximal bicliques covering the network. We formalise a measure of relative community strength by which communities can be categorised, compared and ranked. There are very few real bipartite datasets for which any external ground truth about overlapping communities is known. Here we test three such datasets. We categorise and rank the maximal biclique communities found by our algorithm according to our measure of strength. Deeper analysis of these bicliques shows they accord with ground truth and give useful additional insight. Based on this we suggest our algorithm can find true communities at the first level of a hierarchy. We add a heuristic merging stage to the maximal biclique algorithm to produce a second-level hierarchy with fewer communities and obtain positive results when compared with other overlapping community detection algorithms for bipartite networks.
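    To make the notion of a maximal biclique concrete, here is a minimal sketch on a toy bipartite network. It is a brute-force closure enumeration that is exponential in the size of the primary set; it is not the order-limited, similarity-based method the abstract describes. The function name `maximal_bicliques` and the toy data are assumptions made for illustration.

```python
from itertools import combinations

def maximal_bicliques(adj):
    """Enumerate maximal bicliques of a small bipartite graph by brute force.

    adj maps each primary node to the set of secondary nodes it links to.
    Only suitable for tiny graphs: every subset of primary nodes is tried.
    """
    primary = sorted(adj)
    found = set()
    for size in range(1, len(primary) + 1):
        for subset in combinations(primary, size):
            # Secondary nodes adjacent to every primary node in the subset.
            common = set.intersection(*(adj[p] for p in subset))
            if not common:
                continue
            # Close the primary side: all primary nodes adjacent to all of `common`.
            closed = frozenset(p for p in primary if common <= adj[p])
            found.add((closed, frozenset(common)))
    return found

# Hypothetical toy network: primary nodes a, b, c; secondary nodes x, y, z.
adj = {
    "a": {"x", "y"},
    "b": {"x", "y"},
    "c": {"y", "z"},
}
bicliques = maximal_bicliques(adj)
```

    On this toy input the bicliques overlap: node `y` belongs to all of them, which is the kind of overlap the abstract refers to.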

    Analysis of two crime-related networks derived from bipartite social networks

    In this paper we investigate two real crime-related networks, both of which are bipartite: a spatial network in which crimes of various types are committed in different local government areas, and a dark terrorist network in which individuals attend events or have common affiliations. In each case we analyse the communities found by a random-walk-based algorithm in the primary weighted projection network. We demonstrate that the identified communities represent meaningful information and, in particular, that the small communities found in the terrorist network represent meaningful cliques.
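    A weighted projection onto the primary node set can be sketched as follows. The abstract does not say how edges are weighted, so weighting each projected edge by the number of shared secondary nodes is an assumption here, as are the function name `weighted_projection` and the toy attendance data.

```python
from itertools import combinations

def weighted_projection(memberships):
    """Project a bipartite network onto its primary node set.

    memberships maps each primary node to the set of secondary nodes
    it is linked to. Assumed weighting: the weight of a projected edge
    is the number of secondary nodes the two endpoints share.
    """
    weights = {}
    for u, v in combinations(sorted(memberships), 2):
        shared = len(memberships[u] & memberships[v])
        if shared:  # keep only pairs with at least one common neighbour
            weights[(u, v)] = shared
    return weights

# Hypothetical data: individuals (primary) attending events (secondary).
attendance = {
    "p1": {"e1", "e2"},
    "p2": {"e1", "e2", "e3"},
    "p3": {"e3"},
}
projection = weighted_projection(attendance)
```

    The resulting weighted one-mode network is what a community detection algorithm would then be run on.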

    Complex network tools to enable identification of a criminal community

    Retrieving criminal ties and mining evidence from an organised crime incident, for example money laundering, has been a difficult task for crime investigators due to the involvement of different groups of people and their complex relationships. Extracting the criminal associations from an enormous amount of raw data and representing them explicitly is tedious and time consuming. A study of the complex networks literature reveals that graph-based detection methods have not, as yet, been used for money laundering detection. In this research, I explore the use of complex network analysis to identify the money laundering criminals’ communication associations, that is, the important people who communicate between known criminals and the reliance of the known criminals on the other individuals in a communication path. For this purpose, I use the publicly available Enron email database that happens to contain the communications of 10 criminals who were convicted of a money laundering crime. I show that my new shortest paths network search algorithm (SPNSA), which combines shortest paths and network centrality measures, is better able to isolate and identify criminals’ connections when compared with existing community detection algorithms and k-neighbourhood detection. The SPNSA is validated using three different investigative scenarios, and in each scenario the criminal network graphs formed are small and sparse and hence suitable for further investigation. My research starts with isolating emails that have at least two ‘BCC’ recipients. ‘BCC’ recipients are inherently secretive and the email connections imply a trust relationship between sender and ‘BCC’ recipients. There are no studies on the usage of only those emails that have ‘BCC’ recipients to form a trust network, which leads me to analyse the ‘BCC’ email group separately. SPNSA is able to identify the group of criminals and their active intermediaries in this ‘BCC’ trust network.
Corroborating this information with published information about the crimes that led to the collapse of Enron yields the discovery of persons of interest who were hidden between criminals and could have contributed to the money laundering activity. For validation, larger email datasets that comprise all ‘BCC’ and ‘TO/CC’ email transactions are used. On comparison with existing community detection algorithms, SPNSA is found to perform much better with regard to isolating the sub-networks that contain criminals. I have adapted the betweenness centrality measure to develop a reliance measure. This measure calculates the reliance of a criminal on an intermediate node and ranks the importance level of each intermediate node based on this reliance value. Both SPNSA and the reliance measure could be used as primary investigation tools to investigate connections between criminals in a complex network.
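    The abstract does not define the reliance measure, so the following is only one possible reading of a betweenness-style quantity, not the thesis's formula: the share of shortest paths from a given source node to all other nodes that pass through a given intermediate node. The undirected toy graph, the function names `count_paths` and `reliance`, and the definition itself are all assumptions.

```python
from collections import deque

def count_paths(graph, source):
    """BFS from source: distance and number of shortest paths per reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:          # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def reliance(graph, source, via):
    """Assumed definition: fraction of shortest paths from `source` to all
    other nodes that pass through `via` (undirected graph)."""
    d_s, s_s = count_paths(graph, source)
    if via not in d_s:
        return 0.0
    d_v, s_v = count_paths(graph, via)
    total = through = 0
    for t in d_s:
        if t in (source, via):
            continue
        total += s_s[t]
        if d_s[via] + d_v[t] == d_s[t]:  # via lies on a shortest source-t path
            through += s_s[via] * s_v[t]
    return through / total if total else 0.0

# Hypothetical communication graph: a reaches b only through m, and c directly.
graph = {"a": {"m", "c"}, "m": {"a", "b"}, "b": {"m"}, "c": {"a"}}
score = reliance(graph, "a", "m")
```

    Ranking every intermediate node by such a score for a fixed source would give an ordering of intermediaries of the kind the abstract describes.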

    Complex information networks – detecting community structure in bipartite networks

    The last decade has witnessed great expansion in research and study of complex networks. A complex network is a large-scale network that reflects the interactions between objects or components of complicated systems. These components, known as clusters, communities or modules, perform together in order to provide one or more functions of the system. A vast number of systems, from the brain to ecosystems, power grids and the Internet, criminal relationships and financial transactions, can all be described as large complex networks. For most complex networks, the complexity arises from the fact that the structure is highly irregular, complex and dynamically evolving in time, and that the observed patterns of interactions strongly influence the behaviour of the entire system. One of the topological properties that can expose the hierarchical structure of complex networks is community structure. Community detection is a common problem in complex networks that consists in general of finding groups of densely connected nodes with few connections to nodes outside of a group. The lack of consensus on a definition for a community leads to extensive studies on community structure of complex networks in order to provide improved community detection methods. Community structure is a common and important topological characteristic of many real-world complex networks. In particular, identifying communities in bipartite networks is an important task in many scientific domains. In a bipartite network, the node set consists of two disjoint sets of nodes, a primary set (P) and a secondary set (S), such that links between nodes may occur only if the nodes belong to different sets. There are two approaches to identifying clusters in a bipartite network: the first, and more common, applies when our real interest is in community structure within the primary node set P; the second applies when our real interest is in bipartite communities within the whole network.
Thus, in this research we investigate and study the state of the art in community detection algorithms, in particular those that identify communities in bipartite networks, in order to provide us with a more complete understanding of the relationship between communities. The practical aim is to derive a coarse-grain description of the network topology that will aid understanding of its hierarchical structure. The research of the thesis consists of four main phases. First, one of the best algorithms for community detection in classical networks, Infomap, has not been adapted to the large and important class of bipartite networks. This research gap is one focus of the thesis. We integrate the weighted projection method for bipartite networks based on common neighbors similarity into Infomap, to acquire a weighted one-mode network that can be clustered by this random-walk technique. We apply this method to a number of real-world bipartite networks to detect significant community structure. We measure the performance of our approach based on the ground truth. This requires deep knowledge of the formation of relations within and between clusters in these real-world networks. Although such investigation is excessively time consuming, and impractical or impossible in large networks, the result is much more accurate and more meaningful and gives us confidence that this method can be usefully applied to large networks where ground truth is not known. Second, several possible edge additions are conducted to test how the random-walk-based algorithm Infomap performs when a minimal modification is made to convert a bipartite network to a nearly bipartite (but unipartite) network. The experiments on small bipartite networks obtain encouraging results. Third, we shift focus from community detection based on random walks to community detection based on the strongest communities possible in a bipartite network, which are bicliques.
We develop a novel algorithm to identify overlapping communities at the base level of hierarchy in bipartite networks. We combine existing techniques (bicliques, cliques, structural equivalence) into a novel method to solve this new research problem. We classify the output communities into five categories based on community strength. From this base level, we apply the Jaccard index as a threshold in order to reduce the redundancy of overlapping communities and obtain higher levels of the hierarchy. We compare results from our overlapping approach with competing approaches, not only directly against the ground truth but also using a widely accepted measure for evaluating the quality of partitions, Normalized Mutual Information (NMI). In the last phase of the thesis, a large financial bipartite network collected during six months of fieldwork is analysed and tested in order to reveal its hierarchical structure. We apply all methods presented in Chapter 3, Chapter 4 and Chapter 5. The main contribution of this thesis is an improved method to detect the hierarchical and overlapping community structure in bipartite complex networks based on structural equivalence of nodes. More generally, it aims to derive a coarse-grain depiction of real large-scale networks through structural properties of their identified communities as well as their performance with respect to the known ground truth.
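    The use of the Jaccard index as a merging threshold can be illustrated with a short sketch. The abstract gives no procedural detail, so the greedy pairwise merge below, the function names `jaccard` and `merge_communities`, the threshold value, and the toy communities are all assumptions rather than the thesis's procedure.

```python
def jaccard(a, b):
    """Jaccard index of two non-empty sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)

def merge_communities(communities, threshold):
    """Greedily merge any two communities whose Jaccard index reaches
    the threshold, repeating until no pair qualifies (assumed scheme)."""
    merged = [set(c) for c in communities]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if jaccard(merged[i], merged[j]) >= threshold:
                    merged[i] |= merged[j]  # absorb community j into i
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan after every merge
    return merged

# Hypothetical base-level overlapping communities.
base = [{"a", "b", "c"}, {"b", "c", "d"}, {"x", "y"}]
level_two = merge_communities(base, threshold=0.5)
```

    Lowering the threshold merges more aggressively and yields fewer, larger communities, which is one way to move up a level in the hierarchy.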