
    Semi-supervised stochastic blockmodel for structure analysis of signed networks

    © 2020 Elsevier B.V. Finding hidden structural patterns is a critical problem for all types of networks, including signed networks. Among the methods for structural analysis of complex networks, the stochastic blockmodel (SBM) is an important research tool because it is flexible and can generate networks with many different types of structures. However, most existing SBM learning methods for signed networks are unsupervised, leading to poor performance in finding hidden structural patterns, especially when handling noisy and sparse networks. Learning an SBM in a semi-supervised way is a promising avenue for overcoming this difficulty. In this type of model, a small number of labelled nodes and a large number of unlabelled nodes, coupled with their network structures, are used simultaneously to train the SBM. We propose a novel semi-supervised signed stochastic blockmodel and its learning algorithm based on variational Bayesian inference, with the goal of discovering both assortative (nodes connect more densely within the same cluster than across different clusters) and disassortative (nodes link more sparsely within the same cluster than across different clusters) structures in signed networks. The proposed model is validated through a number of experiments in which it is compared with state-of-the-art methods on both synthetic and real-world data. The carefully designed tests, which account for different scenarios, show that our method outperforms the other approaches in this space. It is especially relevant for noisy and sparse networks, as they constitute the majority of real-world networks.
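As a rough illustration of how a blockmodel can generate signed networks with a chosen structure, the sketch below samples a symmetric signed adjacency matrix from block-level edge probabilities. It is not the model or the learning algorithm of the paper above; the function name `sample_signed_sbm` and the parameters `pos_prob` and `neg_prob` are assumptions made for this example.

```python
import numpy as np

def sample_signed_sbm(labels, pos_prob, neg_prob, seed=0):
    """Sample a symmetric signed adjacency matrix with entries in {-1, 0, +1}.

    Hypothetical parameterization (not taken from the paper):
    pos_prob[k, l] / neg_prob[k, l] is the probability of a positive /
    negative edge between a node in block k and a node in block l.
    """
    labels = np.asarray(labels)
    pos_prob = np.asarray(pos_prob, dtype=float)
    neg_prob = np.asarray(neg_prob, dtype=float)
    rng = np.random.default_rng(seed)
    n = labels.size

    # Look up the block-level probabilities for every node pair.
    P = pos_prob[labels[:, None], labels[None, :]]
    N = neg_prob[labels[:, None], labels[None, :]]

    # One uniform draw per pair decides between +1, -1 and no edge.
    U = rng.random((n, n))
    A = np.where(U < P, 1, np.where(U < P + N, -1, 0))

    # Keep the strict upper triangle and mirror it: undirected, no self-loops.
    A = np.triu(A, k=1)
    return A + A.T

# Mostly positive edges inside blocks, mostly negative edges between blocks.
labels = [0, 0, 0, 1, 1, 1]
A = sample_signed_sbm(labels,
                      pos_prob=[[0.8, 0.05], [0.05, 0.8]],
                      neg_prob=[[0.05, 0.6], [0.6, 0.05]])
```

Changing the two probability tables changes which kind of block structure the sampled network shows, which is the flexibility the abstract refers to.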

    SSBM: A Signed Stochastic Block Model for Multiple Structure Discovery in Large-Scale Exploratory Signed Networks

    Signed network structure discovery has received extensive attention and has become a research focus in the field of network science. However, most existing studies focus on networks with a single structure, e.g., community or bipartite, while ignoring multiple structures, e.g., the coexistence of community and bipartite structures. Furthermore, existing methods face challenges on large-scale signed networks due to their high time complexity, especially when determining the number of clusters in the observed network without any prior knowledge. In view of this, we propose a mathematically principled method for signed network multiple structure discovery named the Signed Stochastic Block Model (SSBM). The SSBM can capture the multiple structures contained in signed networks, e.g., community, bipartite, and their coexistence, by adopting a probabilistic model. Moreover, by integrating the minimum message length (MML) criterion and the component-wise EM (CEM) algorithm, a scalable learning algorithm capable of model selection is proposed to handle large-scale signed networks. In comparisons with state-of-the-art methods on synthetic and real-world signed networks, extensive experimental results demonstrate the effectiveness and efficiency of SSBM in exploring large-scale signed networks with multiple structures.

    Beyond the arithmetic mean : extensions of spectral clustering and semi-supervised learning for signed and multilayer graphs via matrix power means

    In this thesis we present extensions of spectral clustering and semi-supervised learning to signed and multilayer graphs. These extensions are based on a one-parameter family of matrix functions called Matrix Power Means. In the scalar case, this family has the arithmetic, geometric and harmonic means as particular cases. We study the effectiveness of this family of matrix functions through suitable versions of the stochastic block model for signed and multilayer graphs. We provide provable properties in expectation and further identify regimes where the state of the art fails whereas our approach provably performs well. Some of the settings that we analyze are as follows: first, the case where each layer presents a reliable approximation to the overall clustering; second, the case where one single layer has information about the clusters whereas the remaining layers are potentially just noise; third, the case where each layer has only partial information but all together show global information about the underlying clustering structure. We present extensive numerical verifications of all our results and provide matrix-free numerical schemes. With these numerical schemes we are able to show that our proposed approach based on matrix power means is scalable to large sparse signed and multilayer graphs. Finally, we evaluate our methods on real-world datasets. For instance, we show that our approach consistently identifies clustering structure in a real signed network where previous approaches failed. This further verifies that our methods are competitive with the state of the art.
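The scalar power mean mentioned in the thesis abstract above can be written down directly. The sketch below shows it for two positive numbers, together with a matrix analogue for symmetric positive definite matrices computed by eigendecomposition. The function names are assumptions made for this example, the dense eigendecomposition is not the matrix-free scheme the thesis describes, and the matrix formula is only meant as the natural extension of the scalar one.

```python
import math
import numpy as np

def power_mean(a, b, p):
    """Scalar power mean of two positive numbers.

    p = 1 gives the arithmetic mean, p = -1 the harmonic mean,
    and the limit p -> 0 is the geometric mean.
    """
    if p == 0:
        return math.sqrt(a * b)
    return ((a ** p + b ** p) / 2) ** (1 / p)

def spd_power(A, p):
    """A**p for a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w ** p) @ V.T

def matrix_power_mean(A, B, p):
    """Illustrative matrix analogue ((A^p + B^p) / 2)^(1/p) for p != 0.

    Assumes A and B are symmetric positive definite.
    """
    return spd_power((spd_power(A, p) + spd_power(B, p)) / 2, 1 / p)

arithmetic = power_mean(1.0, 4.0, 1)    # 2.5
geometric = power_mean(1.0, 4.0, 0)     # 2.0
harmonic = power_mean(1.0, 4.0, -1)     # 1.6
```

For diagonal (hence commuting) matrices the matrix version reduces to the scalar power mean applied entry by entry on the diagonal, which is a convenient sanity check.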

    An MBO scheme for clustering and semi-supervised clustering of signed networks

    We introduce a principled method for the signed clustering problem, where the goal is to partition a weighted undirected graph whose edge weights take both positive and negative values, such that edges within the same cluster are mostly positive, while edges spanning across clusters are mostly negative. Our method relies on a graph-based diffuse interface model formulation utilizing the Ginzburg–Landau functional, based on an adaptation of the classic numerical Merriman–Bence–Osher (MBO) scheme for minimizing such graph-based functionals. The proposed objective function aims to minimize the total weight of inter-cluster positively-weighted edges, while maximizing the total weight of the inter-cluster negatively-weighted edges. Our method scales to large sparse networks, and can be easily adjusted to incorporate labelled data information, as is often the case in the context of semi-supervised learning. We tested our method on a number of both synthetic stochastic block models and real-world data sets (including financial correlation matrices), and obtained promising results that compare favourably against a number of state-of-the-art approaches from the recent literature.
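The two quantities that the objective in the abstract above trades off can be computed for any given partition. The sketch below only evaluates them; it does not implement the MBO scheme or the Ginzburg–Landau functional, and the function name `signed_cut_weights` is an assumption made for this example.

```python
import numpy as np

def signed_cut_weights(W, labels):
    """Return (positive, negative) inter-cluster edge weight for a partition.

    W is a symmetric weight matrix with positive and negative entries and
    labels assigns a cluster index to each node. A good signed clustering
    makes the first value small and the second value large.
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)

    # Node pairs in different clusters, each undirected edge counted once.
    across = np.triu(labels[:, None] != labels[None, :], k=1)

    pos_cut = W[across & (W > 0)].sum()
    neg_cut = -W[across & (W < 0)].sum()
    return float(pos_cut), float(neg_cut)

W = [[0, 1, -1],
     [1, 0, -2],
     [-1, -2, 0]]
good = signed_cut_weights(W, [0, 0, 1])  # (0.0, 3.0)
bad = signed_cut_weights(W, [0, 1, 1])   # (1.0, 1.0)
```

In this toy graph, grouping nodes 0 and 1 together cuts no positive weight and all of the negative weight, so it is the better partition under the stated objective.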

    Clustering and Community Detection in Directed Networks: A Survey

    Networks (or graphs) appear as dominant structures in diverse domains, including sociology, biology, neuroscience and computer science. In most of these cases graphs are directed, in the sense that there is directionality on the edges, making the semantics of the edges non-symmetric. An interesting feature of real networks is the clustering or community structure property, under which the graph topology is organized into modules commonly called communities or clusters. The essence here is that nodes of the same community are highly similar while, on the contrary, nodes across communities present low similarity. Revealing the underlying community structure of directed complex networks has become a crucial and interdisciplinary topic with a plethora of applications. Therefore, there has naturally been a recent wealth of research in the area of mining directed graphs, with clustering being the primary method and tool for community detection and evaluation. The goal of this paper is to offer an in-depth review of the methods presented so far for clustering directed networks, along with the necessary methodological background and related applications. The survey commences with a concise review of the fundamental concepts and methodological base on which graph clustering algorithms capitalize. Then we present the relevant work along two orthogonal classifications. The first is mostly concerned with the methodological principles of the clustering algorithms, while the second approaches the methods from the viewpoint of the properties of a good cluster in a directed network. Further, we present methods and metrics for evaluating graph clustering results, demonstrate interesting application domains and provide promising future research directions. Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear)