Federated Deep Multi-View Clustering with Global Self-Supervision
Federated multi-view clustering has the potential to learn a global
clustering model from data distributed across multiple devices. In this
setting, label information is unknown and data privacy must be preserved,
leading to two major challenges. First, views on different clients often have
feature heterogeneity, and mining their complementary cluster information is
not trivial. Second, the storage and usage of data from multiple clients in a
distributed environment can lead to incompleteness of multi-view data. To
address these challenges, we propose a novel federated deep multi-view
clustering method that can mine complementary cluster structures from multiple
clients, while dealing with data incompleteness and privacy concerns.
Specifically, in the server environment, we propose sample alignment and data
extension techniques to explore the complementary cluster structures of
multiple views. The server then distributes global prototypes and global
pseudo-labels to each client as global self-supervised information. In the
client environment, multiple clients use the global self-supervised information
and deep autoencoders to learn view-specific cluster assignments and embedded
features, which are then uploaded to the server for refining the global
self-supervised information. Finally, the results of our extensive experiments
demonstrate that our proposed method exhibits superior performance in
addressing the challenges of incomplete multi-view data in distributed
environments.
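The server step described in the abstract above (deriving global prototypes and global pseudo-labels from client uploads) could look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' method: it assumes that sample alignment has already been done, that the server simply concatenates the clients' embedded features, and that plain k-means stands in for whatever clustering the paper actually uses. The names `kmeans` and `server_round` are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [features[0]]
    for _ in range(1, k):
        # Distance from every sample to its nearest already-chosen center.
        diffs = features[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.min(np.linalg.norm(diffs, axis=2), axis=1)
        centers.append(features[np.argmax(nearest)])
    centers = np.array(centers, dtype=float)

    labels = np.zeros(len(features), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return centers, labels


def server_round(client_embeddings, k):
    """Assumed server step: rows of every client array refer to the same samples."""
    global_features = np.concatenate(client_embeddings, axis=1)
    prototypes, pseudo_labels = kmeans(global_features, k)
    return prototypes, pseudo_labels


# Toy data (made up): two clients, six aligned samples, two clusters.
client_a = np.array([[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
                     [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]])
client_b = np.array([[1.0], [1.1], [0.9], [-1.0], [-1.1], [-0.9]])

prototypes, pseudo_labels = server_round([client_a, client_b], k=2)
print(prototypes.shape, pseudo_labels)
```

In a federated setting, each client would then receive `prototypes` and `pseudo_labels` as its self-supervised training signal; how the paper handles missing views and data extension is not covered by this sketch.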
A Survey of Deep Graph Clustering: Taxonomy, Challenge, and Application
Graph clustering, which aims to divide the nodes in the graph into several
distinct clusters, is a fundamental and challenging task. In recent years, deep
graph clustering methods have been increasingly proposed and achieved promising
performance. However, surveys of this area remain scarce, and a summary of the
field is urgently needed. Motivated by this, the paper
presents the first comprehensive survey of deep graph clustering. Firstly, the
detailed definition of deep graph clustering and the important baseline methods
are introduced. Besides, the taxonomy of deep graph clustering methods is
proposed based on four different criteria including graph type, network
architecture, learning paradigm, and clustering method. In addition, through
the careful analysis of the existing works, the challenges and opportunities
from five perspectives are summarized. At last, the applications of deep graph
clustering in four domains are presented. It is worth mentioning that a
collection of state-of-the-art deep graph clustering methods including papers,
codes, and datasets is available on GitHub. We hope this work will serve as a
quick guide and help researchers to overcome challenges in this vibrant field.
Comment: 13 pages, 13 figures
Deep Clustering: A Comprehensive Survey
Cluster analysis plays an indispensable role in machine learning and data
mining. Learning a good data representation is crucial for clustering
algorithms. Recently, deep clustering, which can learn clustering-friendly
representations using deep neural networks, has been broadly applied in a wide
range of clustering tasks. Existing surveys of deep clustering mainly focus on
single-view settings and network architectures, ignoring the complex
application scenarios of clustering. To address this issue, in this paper we
provide a comprehensive survey of deep clustering from the perspective of data sources.
With different data sources and initial conditions, we systematically
distinguish the clustering methods in terms of methodology, prior knowledge,
and architecture. Concretely, deep clustering methods are introduced according
to four categories, i.e., traditional single-view deep clustering,
semi-supervised deep clustering, deep multi-view clustering, and deep transfer
clustering. Finally, we discuss the open challenges and potential future
opportunities in different fields of deep clustering.
Redundancy-Free Self-Supervised Relational Learning for Graph Clustering
Graph clustering, which learns the node representations for effective cluster
assignments, is a fundamental yet challenging task in data analysis and has
received considerable attention accompanied by graph neural networks in recent
years. However, most existing methods overlook the inherent relational
information among the non-independent and non-identically distributed nodes in
a graph. Due to the lack of exploration of relational attributes, the semantic
information of the graph-structured data fails to be fully exploited which
leads to poor clustering performance. In this paper, we propose a novel
self-supervised deep graph clustering method named Relational Redundancy-Free
Graph Clustering (R2FGC) to tackle the problem. It extracts the attribute-
and structure-level relational information from both global and local views
based on an autoencoder and a graph autoencoder. To obtain effective
representations of the semantic information, we preserve the consistent
relation among augmented nodes, whereas the redundant relation is further
reduced for learning discriminative embeddings. In addition, a simple yet effective
strategy is utilized to alleviate the over-smoothing issue. Extensive
experiments are performed on widely used benchmark datasets to validate the
superiority of our R2FGC over state-of-the-art baselines. Our code is
available at https://github.com/yisiyu95/R2FGC.
Comment: Accepted by IEEE Transactions on Neural Networks and Learning Systems
(TNNLS 2024)
A Comprehensive Survey on Graph Summarization with Graph Neural Networks
As large-scale graphs become more widespread, more and more computational
challenges with extracting, processing, and interpreting large graph data are
being exposed. It is therefore natural to search for ways to summarize these
expansive graphs while preserving their key characteristics. In the past, most
graph summarization techniques sought to capture the most important part of a
graph statistically. However, today, the high dimensionality and complexity of
modern graph data are making deep learning techniques more popular. Hence, this
paper presents a comprehensive survey of progress in deep learning
summarization techniques that rely on graph neural networks (GNNs). Our
investigation includes a review of the current state-of-the-art approaches,
including recurrent GNNs, convolutional GNNs, graph autoencoders, and graph
attention networks. A new burgeoning line of research is also discussed where
graph reinforcement learning is being used to evaluate and improve the quality
of graph summaries. Additionally, the survey provides details of benchmark
datasets, evaluation metrics, and open-source tools that are often employed in
experimentation settings, along with a discussion on the practical uses of
graph summarization in different fields. Finally, the survey concludes with a
number of open research challenges to motivate further study in this area.
Comment: 20 pages, 4 figures, 3 tables, Journal of IEEE Transactions on
Artificial Intelligence
High-order Multi-view Clustering for Generic Data
Graph-based multi-view clustering has achieved better performance than most
non-graph approaches. However, in many real-world scenarios, the graph
structure of the data is not given or the quality of the initial graph is poor.
Additionally, existing methods largely neglect the high-order neighborhood
information that characterizes complex intrinsic interactions. To tackle these
problems, we introduce an approach called high-order multi-view clustering
(HMvC) to explore the topology structure information of generic data. Firstly,
graph filtering is applied to encode structure information, which unifies the
processing of attributed graph data and non-graph data in a single framework.
Secondly, up to infinity-order intrinsic relationships are exploited to enrich
the learned graph. Thirdly, to explore the consistent and complementary
information of various views, an adaptive graph fusion mechanism is proposed to
achieve a consensus graph. Comprehensive experimental results on both non-graph
and attributed graph data show the superior performance of our method with
respect to various state-of-the-art techniques, including some deep learning
methods.
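The graph filtering step mentioned in the abstract above can be illustrated with a small sketch. The abstract does not state which filter HMvC uses, so the code below assumes a low-pass filter of the form (I - L/2)^k built from the symmetrically normalized Laplacian L, a form that appears in related graph-filtering clustering work. The function name `low_pass_filter` and the toy graph are hypothetical.

```python
import numpy as np


def low_pass_filter(adjacency, features, order=2):
    """Smooth node features with the assumed filter (I - L/2)^order."""
    n = adjacency.shape[0]
    a = adjacency + np.eye(n)                      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))      # D^{-1/2} as a vector
    a_norm = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    laplacian = np.eye(n) - a_norm                 # normalized Laplacian L
    h = np.eye(n) - 0.5 * laplacian                # one low-pass step
    filtered = features.astype(float)
    for _ in range(order):
        filtered = h @ filtered
    return filtered


# Toy graph (made up): two disconnected triangles, one scalar feature per node.
triangle = np.ones((3, 3)) - np.eye(3)
zeros = np.zeros((3, 3))
adjacency = np.block([[triangle, zeros], [zeros, triangle]])
features = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])

filtered = low_pass_filter(adjacency, features, order=2)
print(filtered.ravel())
```

On this toy graph each filtering step halves every node's deviation from its component mean, so features within a triangle are pulled together while the two triangles stay apart. For non-graph data, a graph would first have to be constructed (for example from nearest neighbors), which this sketch leaves out.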
Graph Fuzzy System: Concepts, Models and Algorithms
Fuzzy systems (FSs) have enjoyed wide applications in various fields,
including pattern recognition, intelligent control, data mining and
bioinformatics, owing to their strong interpretability and learning
ability. In traditional application scenarios, FSs are mainly applied to model
Euclidean data and cannot handle graph data, which is inherently non-Euclidean
in structure, such as social networks and traffic route maps. Therefore,
developing an FS modeling method that is suitable for graph data and
retains the advantages of traditional FSs is an important research problem. To meet this
challenge, a new type of FS for graph data modeling called Graph Fuzzy System
(GFS) is proposed in this paper, where the concepts, modeling framework and
construction algorithms are systematically developed. First, GFS related
concepts, including graph fuzzy rule base, graph fuzzy sets and graph
consequent processing unit (GCPU), are defined. A GFS modeling framework is
then constructed and the antecedents and consequents of the GFS are presented
and analyzed. Finally, a learning framework for GFS is proposed, in which a
kernel K-prototype graph clustering (K2PGC) algorithm is used to construct
the GFS antecedents, and a consequent-parameter learning algorithm based on
graph neural networks (GNNs) is
proposed for GFS. Specifically, three different versions of the GFS
implementation algorithm are developed for comprehensive evaluations with
experiments on various benchmark graph classification datasets. The results
demonstrate that the proposed GFS inherits the advantages of both existing
mainstream GNN methods and conventional FS methods while achieving better
performance than its counterparts.
Comment: This paper has been submitted to a journal
Investigating and Mitigating the Side Effects of Noisy Views in Multi-view Clustering in Practical Scenarios
Multi-view clustering (MvC) aims at exploring category structures among
multi-view data without label supervision. Multiple views provide more
information than single views and thus existing MvC methods can achieve
satisfactory performance. However, their performance might seriously degrade
when the views are noisy in practical scenarios. In this paper, we first
formally investigate the drawback of noisy views and then propose a
theoretically grounded deep MvC method (namely MvCAN) to address this issue.
Specifically, we propose a novel MvC objective that enables un-shared
parameters and inconsistent clustering predictions across multiple views to
reduce the side effects of noisy views. Furthermore, a non-parametric iterative
process is designed to generate a robust learning target for mining multiple
views' useful information. Theoretical analysis reveals that MvCAN works by
achieving the multi-view consistency, complementarity, and noise robustness.
Finally, experiments on extensive public datasets demonstrate that MvCAN
outperforms state-of-the-art methods and is robust against the existence of
noisy views.