17 research outputs found
Transforming Graphs for Enhanced Attribute Clustering: An Innovative Graph Transformer-Based Method
Graph Representation Learning (GRL) is an influential methodology, enabling a
more profound understanding of graph-structured data and aiding graph
clustering, a critical task across various domains. The recent incursion of
attention mechanisms, originally an artifact of Natural Language Processing
(NLP), into the realm of graph learning has spearheaded a notable shift in
research trends. Consequently, Graph Attention Networks (GATs) and Graph
Attention Auto-Encoders have emerged as preferred tools for graph clustering
tasks. Yet, these methods primarily employ a local attention mechanism, thereby
curbing their capacity to apprehend the intricate global dependencies between
nodes within graphs. Addressing these impediments, this study introduces an
innovative method known as the Graph Transformer Auto-Encoder for Graph
Clustering (GTAGC). By melding the Graph Auto-Encoder with the Graph
Transformer, GTAGC is adept at capturing global dependencies between nodes.
This integration amplifies the graph representation and surmounts the
constraints posed by the local attention mechanism. The architecture of GTAGC
encompasses graph embedding, integration of the Graph Transformer within the
autoencoder structure, and a clustering component. It strategically alternates
between graph embedding and clustering, thereby tailoring the Graph Transformer
for clustering tasks, whilst preserving the graph's global structural
information. Through extensive experimentation on diverse benchmark datasets,
GTAGC has exhibited superior performance against existing state-of-the-art
graph clustering methodologies.
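The contrast between local (GAT-style) and global (transformer-style) attention that motivates GTAGC can be illustrated with a minimal numpy sketch. This single-head, unparameterized attention is an assumption for brevity, not GTAGC's actual architecture: masking attention to the adjacency matrix gives the local variant, while omitting the mask lets every node attend to every other node.

```python
import numpy as np

def global_attention(X, adj=None):
    """One self-attention step over node pairs (transformer-style).

    X: (n, d) node features. If adj is given, attention is masked to
    edges only (the local, GAT-style variant); if None, every node
    attends to every other node, capturing global dependencies.
    """
    scores = X @ X.T / np.sqrt(X.shape[1])        # (n, n) pairwise scores
    if adj is not None:                           # local variant: mask non-edges
        scores = np.where(adj > 0, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True) # row-wise softmax
    return weights @ X                            # attention-weighted mixture

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
adj = np.eye(5)                                   # self-loops only, for illustration
Z_global = global_attention(X)                    # every node pair contributes
Z_local = global_attention(X, adj)                # degenerates to the identity here
```

With a self-loop-only adjacency the local variant cannot mix information across nodes at all, which is an extreme version of the limitation the abstract describes.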
Self-supervised Heterogeneous Graph Variational Autoencoders
Heterogeneous Information Networks (HINs), which consist of various types of
nodes and edges, have recently demonstrated excellent performance in graph
mining. However, most existing heterogeneous graph neural networks (HGNNs)
ignore the problems of missing attributes, inaccurate attributes and scarce
labels for nodes, which limits their expressiveness. In this paper, we propose
a generative self-supervised model SHAVA to address these issues
simultaneously. Specifically, SHAVA first initializes all the nodes in the
graph with a low-dimensional representation matrix. After that, based on the
variational graph autoencoder framework, SHAVA learns both node-level and
attribute-level embeddings in the encoder, which can provide fine-grained
semantic information to construct node attributes. In the decoder, SHAVA
reconstructs both links and attributes. Instead of directly reconstructing raw
features for attributed nodes, SHAVA generates the initial low-dimensional
representation matrix for all the nodes, based on which raw features of
attributed nodes are further reconstructed to leverage accurate attributes. In
this way, SHAVA can not only complete informative features for non-attributed
nodes, but also rectify inaccurate ones for attributed nodes. Finally, we
conduct extensive experiments to show the superiority of SHAVA in tackling
HINs with missing and inaccurate attributes.
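The two-stage decoding described above can be sketched as follows. The linear maps, dimensions, and attributed-node mask are illustrative assumptions, not SHAVA's actual decoder: stage one produces a low-dimensional representation for every node, and stage two reconstructs raw features only where real attributes exist.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_low, d_raw = 6, 3, 8
attributed = np.array([True, True, False, True, False, True])  # which nodes carry raw features

Z = rng.normal(size=(n, d_low))          # latent node embeddings (encoder output)
W_low = rng.normal(size=(d_low, d_low))  # decoder stage 1: low-dim matrix for ALL nodes
W_raw = rng.normal(size=(d_low, d_raw))  # decoder stage 2: raw features, attributed nodes only

H = Z @ W_low                            # completed low-dim representations (all nodes)
X_hat = H @ W_raw                        # reconstructed raw attributes
X_hat_attr = X_hat[attributed]           # reconstruction loss applies only to these rows
A_hat = 1 / (1 + np.exp(-(Z @ Z.T)))     # link reconstruction (inner-product decoder)
```

Because `H` exists for every node, non-attributed nodes still receive completed features, while the raw-feature loss on `X_hat_attr` anchors the reconstruction to accurate attributes.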
Unsupervised Graph Attention Autoencoder for Attributed Networks using K-means Loss
Several natural phenomena and complex systems are often represented as
networks. Discovering their community structure is a fundamental task for
understanding these networks. Many algorithms have been proposed, but recently,
Graph Neural Networks (GNN) have emerged as a compelling approach for enhancing
this task. In this paper, we introduce a simple, efficient, and
clustering-oriented model based on an unsupervised Graph Attention
AutoEncoder for community detection in attributed networks
(GAECO). The proposed model adeptly learns representations from both the
network's topology and attribute information, simultaneously addressing dual
objectives: reconstruction and community discovery. It places a particular
emphasis on discovering compact communities by robustly minimizing clustering
errors. The model employs k-means as an objective function and utilizes a
multi-head Graph Attention Auto-Encoder for decoding the representations.
Experiments conducted on three datasets of attributed networks show that our
method surpasses state-of-the-art algorithms in terms of NMI and ARI.
Additionally, our approach scales effectively with the size of the network,
making it suitable for large-scale applications. The implications of our
findings extend beyond biological network interpretation and social network
analysis, where knowledge of the fundamental community structure is essential.
Comment: 7 pages, 5 Figures
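A joint objective of the kind GAECO optimizes, reconstruction plus a k-means term on the embeddings, can be sketched in a small numpy toy. The inner-product decoder, fixed centroids, and 0.5 weighting are assumptions for illustration, not GAECO's actual multi-head attention architecture.

```python
import numpy as np

def kmeans_loss(Z, centroids):
    """Sum of squared distances from each embedding to its nearest centroid."""
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (n, k)
    assign = d2.argmin(1)                                        # hard assignments
    return d2[np.arange(len(Z)), assign].sum(), assign

rng = np.random.default_rng(2)
# Two well-separated toy communities in embedding space.
Z = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(3, 0.1, (5, 2))])
A = np.eye(10)                                   # toy adjacency
A_hat = 1 / (1 + np.exp(-(Z @ Z.T)))             # inner-product link decoder
recon = ((A - A_hat) ** 2).mean()                # reconstruction objective
centroids = np.array([[0.0, 0.0], [3.0, 3.0]])
cluster, assign = kmeans_loss(Z, centroids)
total = recon + 0.5 * cluster                    # joint objective (lambda = 0.5 assumed)
```

Minimizing the combined `total` pulls embeddings toward compact clusters while still requiring them to explain the graph structure, which is the dual objective the abstract describes.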
Reconstructing Humpty Dumpty: Multi-feature Graph Autoencoder for Open Set Action Recognition
Most action recognition datasets and algorithms assume a closed world, where
all test samples are instances of the known classes. In open set problems, test
samples may be drawn from either known or unknown classes. Existing open set
action recognition methods are typically based on extending closed set methods
by adding post hoc analysis of classification scores or feature distances and
do not capture the relations among all the video clip elements. Our approach
uses the reconstruction error to determine the novelty of the video since
unknown classes are harder to put back together and thus have a higher
reconstruction error than videos from known classes. We refer to our solution
to the open set action recognition problem as "Humpty Dumpty", due to its
reconstruction abilities. Humpty Dumpty is a novel graph-based autoencoder that
accounts for contextual and semantic relations among the clip pieces for
improved reconstruction. A larger reconstruction error leads to an increased
likelihood that the action can not be reconstructed, i.e., can not put Humpty
Dumpty back together again, indicating that the action has never been seen
before and is novel/unknown. Extensive experiments are performed on two
publicly available action recognition datasets including HMDB-51 and UCF-101,
showing state-of-the-art performance for open set action recognition.
Comment: Accepted to WACV 202
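Reconstruction-error novelty detection of this kind can be illustrated with a linear autoencoder fit via PCA in place of the paper's graph autoencoder. The low-dimensional "known" data, the subspace dimension, and the 95th-percentile threshold are all assumptions for the sketch: samples the autoencoder cannot put back together get flagged as unknown.

```python
import numpy as np

rng = np.random.default_rng(3)
basis = rng.normal(size=(3, 10))                 # known classes live on a 3-dim subspace
known = rng.normal(size=(200, 3)) @ basis        # training samples (known classes)
unknown = rng.normal(size=(50, 10)) * 3          # off-subspace samples (unknown classes)

# "Train" a linear autoencoder on known data via PCA (top-3 components).
mu = known.mean(0)
U, S, Vt = np.linalg.svd(known - mu, full_matrices=False)
W = Vt[:3]                                       # shared encoder/decoder weights

def recon_error(X):
    Z = (X - mu) @ W.T                           # encode
    X_hat = Z @ W + mu                           # decode
    return ((X - X_hat) ** 2).sum(1)             # per-sample reconstruction error

tau = np.percentile(recon_error(known), 95)      # threshold set from known data
novel = recon_error(unknown) > tau               # flag likely-unknown samples
```

Known samples reconstruct almost perfectly because they lie on the learned subspace, while off-subspace samples incur large error, mirroring the "cannot put Humpty Dumpty back together" intuition.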
Motif-aware Attribute Masking for Molecular Graph Pre-training
Attribute reconstruction is used to predict node or edge features in the
pre-training of graph neural networks. Given a large number of molecules, these
networks learn to capture structural knowledge, which is transferable to various
downstream property prediction tasks and vital in chemistry, biomedicine, and
material science. Previous strategies that randomly select nodes for
attribute masking leverage the information of local neighbors. However,
over-reliance on these neighbors inhibits the model's ability to learn from
higher-level substructures. For example, the model would learn little from
predicting three carbon atoms in a benzene ring based on the other three but
could learn more from the inter-connections between the functional groups, or
called chemical motifs. In this work, we propose and investigate motif-aware
attribute masking strategies to capture inter-motif structures by leveraging
the information of atoms in neighboring motifs. Once each graph is decomposed
into disjoint motifs, the features for every node within a sample motif are
masked. The graph decoder then predicts the masked features of each node within
the motif for reconstruction. We evaluate our approach on eight molecular
property prediction datasets and demonstrate its advantages.
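The masking scheme can be sketched as follows. The motif decomposition is taken as given (in practice it would come from a chemical fragmentation method), and the zero-vector mask token and 50% motif sampling ratio are assumptions: the key point is that whole motifs are masked rather than independently sampled nodes.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(9, 4))                      # node (atom) features
motifs = [[0, 1, 2], [3, 4, 5, 6], [7, 8]]       # disjoint motif decomposition (assumed given)

def motif_mask(X, motifs, ratio=0.5, rng=rng):
    """Mask the features of EVERY node inside each sampled motif."""
    X_masked = X.copy()
    k = max(1, int(ratio * len(motifs)))         # number of motifs to mask
    picked = rng.choice(len(motifs), size=k, replace=False)
    masked_nodes = [n for m in picked for n in motifs[m]]
    X_masked[masked_nodes] = 0.0                 # mask token = zero vector here
    return X_masked, masked_nodes

X_masked, masked_nodes = motif_mask(X, motifs)
# A graph decoder would then be trained to predict X[masked_nodes]
# from X_masked plus the graph structure, forcing it to use inter-motif context.
```

Because all atoms of a motif vanish together, the model cannot fill in a masked atom from its immediate neighbors inside the same motif and must rely on connections to neighboring motifs.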
NCAGC: A Neighborhood Contrast Framework for Attributed Graph Clustering
Attributed graph clustering is one of the most fundamental tasks in the graph
learning field; its goal is to group nodes with similar representations into
the same cluster without human annotations. Recent studies based on graph
contrastive learning have achieved remarkable results when exploiting
graph-structured data. However, most existing methods 1) do not directly
address the clustering task, since the representation learning and clustering
processes are separated; 2) depend too heavily on data augmentation, which
greatly limits the capability of contrastive learning; and 3) ignore the
contrastive message for clustering tasks, which degrades the clustering
results. In this paper, we propose a Neighborhood Contrast Framework for
Attributed Graph Clustering, namely NCAGC, which seeks to overcome the
aforementioned limitations. Specifically, by leveraging the Neighborhood
Contrast Module, the representations of neighboring nodes are 'pushed closer'
and become clustering-oriented via the neighborhood contrast loss. Moreover, a
Contrastive Self-Expression Module is built by minimizing the difference
between the node representations before and after the self-expression layer to
constrain the learning of the self-expression matrix. All the modules of NCAGC
are optimized in a unified framework, so the learned node representations
contain clustering-oriented information. Extensive experimental results on four attributed
graph datasets demonstrate the promising performance of NCAGC compared with 16
state-of-the-art clustering methods. The code is available at
https://github.com/wangtong627/NCAGC
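A neighborhood contrast loss of the kind described, with adjacent nodes as positives and all other nodes as negatives, can be sketched as an InfoNCE-style objective. The temperature value and the exact positive/negative construction are assumptions here, not NCAGC's precise loss; see the linked repository for the real implementation.

```python
import numpy as np

def neighborhood_contrast_loss(Z, adj, tau=0.5):
    """Neighbors as positives, all other nodes as negatives (InfoNCE-style)."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)  # cosine similarity space
    sim = np.exp(Zn @ Zn.T / tau)                      # exponentiated similarities
    np.fill_diagonal(sim, 0.0)                         # exclude self-pairs
    pos = (sim * adj).sum(1)                           # similarity mass on neighbors
    return -np.log(pos / sim.sum(1)).mean()            # maximized when neighbors dominate

rng = np.random.default_rng(5)
Z = rng.normal(size=(6, 4))                            # toy node representations
adj = np.zeros((6, 6))                                 # three disjoint edges
adj[0, 1] = adj[1, 0] = adj[2, 3] = adj[3, 2] = adj[4, 5] = adj[5, 4] = 1
loss = neighborhood_contrast_loss(Z, adj)
```

Minimizing this loss increases each node's similarity to its neighbors relative to everything else, which is the "push closer" effect the abstract attributes to the Neighborhood Contrast Module.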
Learning Persistent Community Structures in Dynamic Networks via Topological Data Analysis
Dynamic community detection methods often lack effective mechanisms to ensure
temporal consistency, hindering the analysis of network evolution. In this
paper, we propose a novel deep graph clustering framework with temporal
consistency regularization on inter-community structures, inspired by the
concept of minimal network topological changes within short intervals.
Specifically, to address the representation collapse problem, we first
introduce MFC, a matrix factorization-based deep graph clustering algorithm
that preserves node embeddings. Based on static clustering results, we construct
probabilistic community networks and compute their persistence homology, a
robust topological measure, to assess structural similarity between them.
Moreover, a novel neural network regularization TopoReg is introduced to ensure
the preservation of topological similarity between inter-community structures
over time intervals. Our approach enhances temporal consistency and clustering
accuracy on real-world datasets with both fixed and varying numbers of
communities. It is also a pioneering application of TDA to temporally
persistent community detection, offering an insightful contribution to the
field of network analysis. Code and data are available at the public git
repository: https://github.com/kundtx/MFC_TopoReg
Comment: AAAI 202
STGIC: a graph and image convolution-based method for spatial transcriptomic clustering
Spatial transcriptomic (ST) clustering employs spatial and transcription
information to group spatially coherent and transcriptionally similar spots
into the same spatial domain. Graph convolutional networks (GCNs) and graph
attention networks (GATs), fed with an adjacency matrix derived from spatial
coordinates and a feature matrix derived from transcription profiles, are
often used to solve this problem. Our proposed method STGIC (spatial
transcriptomic clustering with graph and image convolution) utilizes adaptive
graph convolution (AGC) to obtain high-quality pseudo-labels and then applies
a dilated convolution framework (DCF) to a virtual image constructed from the
gene expression information and spatial coordinates of spots. The dilation
rates and kernel sizes are set appropriately, and the kernel weight updates
are made subject to the spatial distance from each element's position to the
kernel center, so that feature extraction for each spot is better guided by
its spatial distance to neighboring spots. Self-supervision, realized by
KL-divergence, a spatial continuity loss, and cross-entropy calculated among
spots with high-confidence pseudo-labels, makes up the training objective of
the DCF. STGIC attains state-of-the-art (SOTA) clustering performance on the
benchmark dataset of the human dorsolateral prefrontal cortex (DLPFC).
Besides, it is capable of depicting fine structures of other tissues from
other species, as well as guiding the identification of marker genes. Also,
STGIC is extendable to Stereo-seq data with high spatial resolution.
Comment: A major revision has been made to generate the current version as
follows: 1. The writing style has been thoroughly changed. 2. Four more
datasets have been added. 3. Contrastive learning has been removed since it
does not make a significant difference to the performance. 4. Two more
authors are added.