11 research outputs found

    Federated Deep Multi-View Clustering with Global Self-Supervision

    Federated multi-view clustering has the potential to learn a global clustering model from data distributed across multiple devices. In this setting, label information is unknown and data privacy must be preserved, leading to two major challenges. First, views on different clients often have feature heterogeneity, and mining their complementary cluster information is not trivial. Second, the storage and usage of data from multiple clients in a distributed environment can lead to incompleteness of multi-view data. To address these challenges, we propose a novel federated deep multi-view clustering method that can mine complementary cluster structures from multiple clients, while dealing with data incompleteness and privacy concerns. Specifically, in the server environment, we propose sample alignment and data extension techniques to explore the complementary cluster structures of multiple views. The server then distributes global prototypes and global pseudo-labels to each client as global self-supervised information. In the client environment, multiple clients use the global self-supervised information and deep autoencoders to learn view-specific cluster assignments and embedded features, which are then uploaded to the server for refining the global self-supervised information. Finally, the results of our extensive experiments demonstrate that our proposed method exhibits superior performance in addressing the challenges of incomplete multi-view data in distributed environments.

    A Survey of Deep Graph Clustering: Taxonomy, Challenge, and Application

    Graph clustering, which aims to divide the nodes in the graph into several distinct clusters, is a fundamental and challenging task. In recent years, deep graph clustering methods have been increasingly proposed and have achieved promising performance. However, survey papers on the topic are scarce, and a summary of the field is urgently needed. Motivated by this, the paper presents the first comprehensive survey of deep graph clustering. First, the detailed definition of deep graph clustering and the important baseline methods are introduced. Second, a taxonomy of deep graph clustering methods is proposed based on four criteria: graph type, network architecture, learning paradigm, and clustering method. In addition, through careful analysis of existing works, the challenges and opportunities are summarized from five perspectives. Finally, the applications of deep graph clustering in four domains are presented. It is worth mentioning that a collection of state-of-the-art deep graph clustering methods including papers, codes, and datasets is available on GitHub. We hope this work will serve as a quick guide and help researchers to overcome challenges in this vibrant field. Comment: 13 pages, 13 figures

    Deep Clustering: A Comprehensive Survey

    Cluster analysis plays an indispensable role in machine learning and data mining. Learning a good data representation is crucial for clustering algorithms. Recently, deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks. Existing surveys for deep clustering mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering. To address this issue, in this paper we provide a comprehensive survey for deep clustering in views of data sources. With different data sources and initial conditions, we systematically distinguish the clustering methods in terms of methodology, prior knowledge, and architecture. Concretely, deep clustering methods are introduced according to four categories, i.e., traditional single-view deep clustering, semi-supervised deep clustering, deep multi-view clustering, and deep transfer clustering. Finally, we discuss the open challenges and potential future opportunities in different fields of deep clustering.

    Redundancy-Free Self-Supervised Relational Learning for Graph Clustering

    Graph clustering, which learns the node representations for effective cluster assignments, is a fundamental yet challenging task in data analysis and has received considerable attention accompanied by graph neural networks in recent years. However, most existing methods overlook the inherent relational information among the non-independent and non-identically distributed nodes in a graph. Due to the lack of exploration of relational attributes, the semantic information of the graph-structured data fails to be fully exploited which leads to poor clustering performance. In this paper, we propose a novel self-supervised deep graph clustering method named Relational Redundancy-Free Graph Clustering (R²FGC) to tackle the problem. It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder. To obtain effective representations of the semantic information, we preserve the consistent relation among augmented nodes, whereas the redundant relation is further reduced for learning discriminative embeddings. In addition, a simple yet valid strategy is utilized to alleviate the over-smoothing issue. Extensive experiments are performed on widely used benchmark datasets to validate the superiority of our R²FGC over state-of-the-art baselines. Our codes are available at https://github.com/yisiyu95/R2FGC. Comment: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS 2024)

    A Comprehensive Survey on Graph Summarization with Graph Neural Networks

    As large-scale graphs become more widespread, more and more computational challenges with extracting, processing, and interpreting large graph data are being exposed. It is therefore natural to search for ways to summarize these expansive graphs while preserving their key characteristics. In the past, most graph summarization techniques sought to capture the most important part of a graph statistically. However, today, the high dimensionality and complexity of modern graph data are making deep learning techniques more popular. Hence, this paper presents a comprehensive survey of progress in deep learning summarization techniques that rely on graph neural networks (GNNs). Our investigation includes a review of the current state-of-the-art approaches, including recurrent GNNs, convolutional GNNs, graph autoencoders, and graph attention networks. A new burgeoning line of research is also discussed where graph reinforcement learning is being used to evaluate and improve the quality of graph summaries. Additionally, the survey provides details of benchmark datasets, evaluation metrics, and open-source tools that are often employed in experimentation settings, along with a discussion on the practical uses of graph summarization in different fields. Finally, the survey concludes with a number of open research challenges to motivate further study in this area. Comment: 20 pages, 4 figures, 3 tables, Journal of IEEE Transactions on Artificial Intelligence

    High-order Multi-view Clustering for Generic Data

    Graph-based multi-view clustering has achieved better performance than most non-graph approaches. However, in many real-world scenarios, the graph structure of data is not given or the quality of initial graph is poor. Additionally, existing methods largely neglect the high-order neighborhood information that characterizes complex intrinsic interactions. To tackle these problems, we introduce an approach called high-order multi-view clustering (HMvC) to explore the topology structure information of generic data. Firstly, graph filtering is applied to encode structure information, which unifies the processing of attributed graph data and non-graph data in a single framework. Secondly, up to infinity-order intrinsic relationships are exploited to enrich the learned graph. Thirdly, to explore the consistent and complementary information of various views, an adaptive graph fusion mechanism is proposed to achieve a consensus graph. Comprehensive experimental results on both non-graph and attributed graph data show the superior performance of our method with respect to various state-of-the-art techniques, including some deep learning methods.

    Graph Fuzzy System: Concepts, Models and Algorithms

    Fuzzy systems (FSs) have enjoyed wide applications in various fields, including pattern recognition, intelligent control, data mining and bioinformatics, owing to their strong interpretability and learning ability. In traditional application scenarios, FSs are mainly applied to model Euclidean space data and cannot be used to handle graph data that is non-Euclidean in nature, such as social networks and traffic route maps. Therefore, developing an FS modeling method that is suitable for graph data and retains the advantages of traditional FSs is an important research problem. To meet this challenge, a new type of FS for graph data modeling called Graph Fuzzy System (GFS) is proposed in this paper, where the concepts, modeling framework and construction algorithms are systematically developed. First, GFS-related concepts, including graph fuzzy rule base, graph fuzzy sets and graph consequent processing unit (GCPU), are defined. A GFS modeling framework is then constructed and the antecedents and consequents of the GFS are presented and analyzed. Finally, a learning framework of GFS is proposed, in which a kernel K-prototype graph clustering (K2PGC) algorithm is proposed to develop the construction algorithm for the GFS antecedent generation, and then, based on graph neural networks (GNNs), a consequent parameter learning algorithm is proposed for GFS. Specifically, three different versions of the GFS implementation algorithm are developed for comprehensive evaluations with experiments on various benchmark graph classification datasets. The results demonstrate that the proposed GFS inherits the advantages of both existing mainstream GNN methods and conventional FS methods while achieving better performance than its counterparts. Comment: This paper has been submitted to a journal

    Investigating and Mitigating the Side Effects of Noisy Views in Multi-view Clustering in Practical Scenarios

    Multi-view clustering (MvC) aims at exploring category structures among multi-view data without label supervision. Multiple views provide more information than single views and thus existing MvC methods can achieve satisfactory performance. However, their performance might seriously degenerate when the views are noisy in practical scenarios. In this paper, we first formally investigate the drawback of noisy views and then propose a theoretically grounded deep MvC method (namely MvCAN) to address this issue. Specifically, we propose a novel MvC objective that enables un-shared parameters and inconsistent clustering predictions across multiple views to reduce the side effects of noisy views. Furthermore, a non-parametric iterative process is designed to generate a robust learning target for mining multiple views' useful information. Theoretical analysis reveals that MvCAN works by achieving the multi-view consistency, complementarity, and noise robustness. Finally, experiments on extensive public datasets demonstrate that MvCAN outperforms state-of-the-art methods and is robust against the existence of noisy views.