Transformer and Snowball Graph Convolution Learning for Biomedical Graph Classification
Graphs and networks are widely used to describe and model complex
systems in biomedicine. Deep learning methods, especially graph neural networks
(GNNs), have been developed to learn and predict with such structured data. In
this paper, we propose a novel transformer and snowball encoding network
(TSEN) for biomedical graph classification, which introduces a transformer
architecture with graph snowball connections into GNNs for learning whole-graph
representations. TSEN combines graph snowball connections with a graph
transformer through snowball encoding layers, strengthening its ability to
capture multi-scale information and global patterns when learning whole-graph
features. TSEN also uses snowball graph convolution as the position embedding
in the transformer structure, a simple yet effective way to capture local
patterns naturally. Experiments on four graph classification datasets
demonstrate that TSEN outperforms typical state-of-the-art GNN models as well
as graph-transformer-based GNN models.
Comment: Prepared for submission to TB
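The snowball mechanism the abstract describes can be pictured with a short sketch. This is only an illustration under the common reading of snowball connections, where each layer consumes the concatenation of all earlier layers' outputs; the class names and dimensions are hypothetical, not the authors' code.

```python
# Minimal sketch of a "snowball" graph convolution used as a transformer
# position embedding. Hypothetical names; not the TSEN implementation.
import torch
import torch.nn as nn

class SnowballGCN(nn.Module):
    def __init__(self, in_dim, hidden_dim, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        dims = in_dim
        for _ in range(num_layers):
            self.layers.append(nn.Linear(dims, hidden_dim))
            dims += hidden_dim  # the next layer sees all previous outputs

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) normalized adjacency
        outs = [x]
        for layer in self.layers:
            h = torch.cat(outs, dim=-1)       # snowball concatenation
            h = torch.relu(layer(adj @ h))    # one graph-convolution step
            outs.append(h)
        return outs[-1]

class SnowballEncoderLayer(nn.Module):
    """Transformer layer using the snowball convolution as position embedding."""
    def __init__(self, dim, heads, gcn_layers=2):
        super().__init__()
        self.pos = SnowballGCN(dim, dim, gcn_layers)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, adj):
        h = (x + self.pos(x, adj)).unsqueeze(0)  # structure-aware positions
        a, _ = self.attn(h, h, h)
        return self.norm(x + a.squeeze(0))
```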
Transformers over Directed Acyclic Graphs
Transformer models have recently gained popularity in graph representation
learning as they have the potential to learn complex relationships beyond the
ones captured by regular graph neural networks. The main research question is
how to inject the structural bias of graphs into the transformer architecture,
and several proposals have been made for undirected molecular graphs and,
recently, also for larger network graphs. In this paper, we study transformers
over directed acyclic graphs (DAGs) and propose architecture adaptations
tailored to DAGs: (1) An attention mechanism that is considerably more
efficient than the regular quadratic complexity of transformers and at the same
time faithfully captures the DAG structure, and (2) a positional encoding of
the DAG's partial order, complementing the former. We rigorously evaluate our
approach over various types of tasks, ranging from classifying source-code
graphs to classifying nodes in citation networks, and show that it is effective
in two important aspects: making graph transformers generally outperform graph
neural networks tailored to DAGs, and improving state-of-the-art (SOTA) graph
transformer performance in terms of both quality and efficiency.
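As a hedged illustration of the two adaptations, the sketch below builds a boolean attention mask that only lets a node attend to its DAG ancestors (plus itself) and a positional id from each node's depth in the partial order. The helper names are hypothetical, and the paper's attention is more efficient than materializing the transitive closure as done here.

```python
import torch
import networkx as nx

def dag_attention_mask(g: nx.DiGraph):
    nodes = list(nx.topological_sort(g))
    idx = {n: i for i, n in enumerate(nodes)}
    mask = torch.eye(len(nodes), dtype=torch.bool)   # self-attention allowed
    for u, v in nx.transitive_closure_dag(g).edges:
        mask[idx[v], idx[u]] = True                  # v attends to ancestor u
    return nodes, mask

def depth_positional_ids(g: nx.DiGraph, nodes):
    # nodes arrive in topological order, so predecessors are already scored
    depth = {}
    for v in nodes:
        depth[v] = 1 + max((depth[u] for u in g.predecessors(v)), default=-1)
    return torch.tensor([depth[v] for v in nodes])

g = nx.DiGraph([(0, 1), (0, 2), (1, 3), (2, 3)])
nodes, mask = dag_attention_mask(g)
print(depth_positional_ids(g, nodes))  # tensor([0, 1, 1, 2])
# With nn.MultiheadAttention, pass ~mask as attn_mask (True means "blocked").
```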
Long Range Graph Benchmark
Graph Neural Networks (GNNs) that are based on the message passing (MP)
paradigm generally exchange information between 1-hop neighbors to build node
representations at each layer. In principle, such networks are not able to
capture long-range interactions (LRI) that may be desired or necessary for
learning a given task on graphs. Recently, there has been increasing
interest in the development of Transformer-based methods for graphs that can
consider full node connectivity beyond the original sparse structure, thus
enabling the modeling of LRI. However, MP-GNNs that simply rely on 1-hop
message passing often fare better in several existing graph benchmarks when
combined with positional feature representations, among other innovations,
hence limiting the perceived utility and ranking of Transformer-like
architectures. Here, we present the Long Range Graph Benchmark (LRGB) with 5
graph learning datasets: PascalVOC-SP, COCO-SP, PCQM-Contact, Peptides-func and
Peptides-struct that arguably require LRI reasoning to achieve strong
performance in a given task. We benchmark both baseline GNNs and Graph
Transformer networks to verify that the models which capture long-range
dependencies perform significantly better on these tasks. Therefore, these
datasets are suitable for benchmarking and exploration of MP-GNNs and Graph
Transformer architectures that are intended to capture LRI.
Comment: Added reference to Tönshoff et al., 2023 in Sec. 4.1; NeurIPS 2022 Track on D&B; open-sourced at https://github.com/vijaydwivedi75/lrg
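For readers who want to try the benchmark, recent PyTorch Geometric releases ship an `LRGBDataset` loader; a minimal sketch (verify against your installed PyG version):

```python
from torch_geometric.datasets import LRGBDataset
from torch_geometric.loader import DataLoader

# 'Peptides-func' is multi-label graph classification; the other names are
# 'PascalVOC-SP', 'COCO-SP', 'PCQM-Contact' and 'Peptides-struct'.
train = LRGBDataset(root='data/lrgb', name='Peptides-func', split='train')
loader = DataLoader(train, batch_size=32, shuffle=True)

batch = next(iter(loader))
print(batch.num_graphs, batch.x.shape, batch.y.shape)
```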
Dynamic Graph Representation Learning via Graph Transformer Networks
Dynamic graph representation learning is an important task with widespread
applications. Previous methods for dynamic graph learning are usually sensitive
to noisy graph information, such as missing or spurious connections, which can
degrade performance and generalization. To overcome this challenge,
we propose a Transformer-based dynamic graph learning method named Dynamic
Graph Transformer (DGT) with spatial-temporal encoding to effectively learn
graph topology and capture implicit links. To improve the generalization
ability, we introduce two complementary self-supervised pre-training tasks and
show that jointly optimizing the two pre-training tasks results in a smaller
Bayesian error rate via an information-theoretic analysis. We also propose a
temporal-union graph structure and a target-context node sampling strategy for
efficient and scalable training. Extensive experiments on real-world datasets
show that DGT achieves superior performance compared with several
state-of-the-art baselines.
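A minimal sketch of what a spatial-temporal encoding can look like, assuming degree-bucket spatial and snapshot-index temporal embeddings; all names and feature choices here are assumptions for illustration, not the DGT implementation.

```python
import torch
import torch.nn as nn

class SpatialTemporalEncoding(nn.Module):
    def __init__(self, feat_dim, dim, max_degree=64, max_steps=128):
        super().__init__()
        self.proj = nn.Linear(feat_dim, dim)
        self.spatial = nn.Embedding(max_degree + 1, dim)  # degree buckets
        self.temporal = nn.Embedding(max_steps, dim)      # snapshot index

    def forward(self, x, degree, t):
        # x: (N, feat_dim); degree, t: (N,) integer tensors
        d = degree.clamp(max=self.spatial.num_embeddings - 1)
        return self.proj(x) + self.spatial(d) + self.temporal(t)

enc = SpatialTemporalEncoding(feat_dim=16, dim=64)
tokens = enc(torch.randn(10, 16),
             torch.randint(0, 5, (10,)),
             torch.zeros(10, dtype=torch.long))
print(tokens.shape)  # torch.Size([10, 64])
```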
GTNet: Graph Transformer Network for 3D Point Cloud Classification and Semantic Segmentation
Recently, graph-based and Transformer-based deep learning networks have
demonstrated excellent performance on various point cloud tasks. Most existing
graph methods are based on static graphs, which take a fixed input to
establish graph relations. Moreover, many graph methods aggregate neighboring
features by max or average pooling, so that either only a single neighboring
point affects the centroid's feature or all neighboring points influence it
equally, ignoring the correlations and differences between points. Most
Transformer-based methods extract point cloud features with global attention
and lack feature learning on local neighborhoods. To address the problems of
these two types of models, we propose a new feature extraction block named
Graph Transformer and construct a 3D point cloud learning network called GTNet
to learn features of point clouds on local and global patterns. Graph
Transformer integrates the advantages of graph-based and Transformer-based
methods and consists of Local Transformer and Global Transformer modules. Local
Transformer uses a dynamic graph to compute all neighboring point weights by
intra-domain cross-attention with dynamically updated graph relations, so that
every neighboring point can affect the centroid's features with its own weight;
Global Transformer enlarges the receptive field of Local Transformer with
global self-attention. In addition, to avoid vanishing gradients caused by
increasing network depth, we apply residual connections to centroid features in
GTNet; we also use the features of centroids and neighbors to generate local
geometric descriptors in Local Transformer, strengthening the model's ability
to learn local information. Finally, we use GTNet for shape classification,
part segmentation, and semantic segmentation tasks in this paper.
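A rough sketch of the local-attention idea, assuming a kNN graph rebuilt from the current features (a dynamic graph) so that each neighbor receives its own attention weight; this is an illustration, not the GTNet code.

```python
import torch
import torch.nn.functional as F

def knn(x, k):
    # x: (N, C) point features -> indices of the k nearest neighbors per point
    dist = torch.cdist(x, x)                                # (N, N)
    return dist.topk(k + 1, largest=False).indices[:, 1:]   # drop self

def local_attention(x, k=16):
    N, C = x.shape
    idx = knn(x, k)                     # dynamically updated graph relations
    q = x.unsqueeze(1)                  # (N, 1, C) centroid queries
    kv = x[idx]                         # (N, k, C) neighbor keys/values
    attn = F.softmax(q @ kv.transpose(1, 2) / C ** 0.5, dim=-1)  # (N, 1, k)
    return (attn @ kv).squeeze(1)       # per-neighbor weighted aggregation

pts = torch.randn(256, 32)
print(local_attention(pts).shape)  # torch.Size([256, 32])
```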
NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning
As more deep learning models are being applied in real-world applications,
there is a growing need for modeling and learning the representations of neural
networks themselves. An efficient representation can be used to predict target
attributes of networks without the need for actual training and deployment
procedures, facilitating efficient network deployment and design. Recently,
inspired by the success of Transformer, some Transformer-based representation
learning frameworks have been proposed and achieved promising performance in
handling cell-structured models. However, graph neural network (GNN) based
approaches still dominate the field of representation learning for entire
networks. In this paper, we revisit the Transformer and compare it with GNNs to
analyse their different architecture characteristics. We then propose a
modified Transformer-based universal neural network representation learning
model NAR-Former V2. It can learn efficient representations from both
cell-structured networks and entire networks. Specifically, we first take the
network as a graph and design a straightforward tokenizer to encode the network
into a sequence. Then, we incorporate the inductive representation learning
capability of GNNs into the Transformer, enabling it to generalize better
when encountering unseen architectures. Additionally, we introduce a series of
simple yet effective modifications to enhance the Transformer's ability to
learn representations from graph structures. Our proposed method surpasses
the GNN-based method NNLP by a significant margin in latency estimation on the
NNLQP dataset. Furthermore, regarding accuracy prediction on the NASBench101
and NASBench201 datasets, our method achieves highly comparable performance to
other state-of-the-art methods.
Comment: 9 pages, 2 figures, 6 tables. Code is available at https://github.com/yuny220/NAR-Former-V
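A hedged sketch of the tokenize-then-encode pipeline the abstract outlines. The tokenizer here, which sums an op-type embedding with a projection of in/out degrees, is hypothetical and much simpler than NAR-Former V2's, as are all names and dimensions.

```python
import torch
import torch.nn as nn

class ArchTokenizer(nn.Module):
    def __init__(self, num_op_types, dim):
        super().__init__()
        self.op_emb = nn.Embedding(num_op_types, dim)
        self.struct = nn.Linear(2, dim)  # [in_degree, out_degree]

    def forward(self, op_ids, degrees):
        # op_ids: (L,) operator types; degrees: (L, 2) in/out degrees
        return self.op_emb(op_ids) + self.struct(degrees.float())

class AttributePredictor(nn.Module):
    def __init__(self, num_op_types=16, dim=64):
        super().__init__()
        self.tok = ArchTokenizer(num_op_types, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 1)    # e.g. latency regression

    def forward(self, op_ids, degrees):
        tokens = self.tok(op_ids, degrees).unsqueeze(0)   # (1, L, dim)
        return self.head(self.encoder(tokens).mean(dim=1))

model = AttributePredictor()
out = model(torch.randint(0, 16, (10,)), torch.randint(0, 4, (10, 2)))
print(out.shape)  # torch.Size([1, 1])
```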
Transforming Graphs for Enhanced Attribute Clustering: An Innovative Graph Transformer-Based Method
Graph Representation Learning (GRL) is an influential methodology, enabling a
more profound understanding of graph-structured data and aiding graph
clustering, a critical task across various domains. The recent adoption of
attention mechanisms, originally developed for Natural Language Processing
(NLP), in graph learning has driven a notable shift in
research trends. Consequently, Graph Attention Networks (GATs) and Graph
Attention Auto-Encoders have emerged as preferred tools for graph clustering
tasks. Yet, these methods primarily employ a local attention mechanism, thereby
curbing their capacity to apprehend the intricate global dependencies between
nodes within graphs. Addressing these impediments, this study introduces an
innovative method known as the Graph Transformer Auto-Encoder for Graph
Clustering (GTAGC). By melding the Graph Auto-Encoder with the Graph
Transformer, GTAGC is adept at capturing global dependencies between nodes.
This integration amplifies the graph representation and surmounts the
constraints posed by the local attention mechanism. The architecture of GTAGC
encompasses graph embedding, integration of the Graph Transformer within the
autoencoder structure, and a clustering component. It strategically alternates
between graph embedding and clustering, thereby tailoring the Graph Transformer
for clustering tasks, whilst preserving the graph's global structural
information. Through extensive experimentation on diverse benchmark datasets,
GTAGC has exhibited superior performance against existing state-of-the-art
graph clustering methodologies.
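A simplified sketch of the alternating scheme only: a one-layer graph auto-encoder stands in for the Graph Transformer, and k-means stands in for the clustering component. Everything here is a hypothetical stand-in, not GTAGC itself.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class TinyGraphAE(nn.Module):
    def __init__(self, in_dim, hid):
        super().__init__()
        self.enc = nn.Linear(in_dim, hid)
        self.dec = nn.Linear(hid, in_dim)

    def forward(self, x, adj):
        z = torch.relu(self.enc(adj @ x))  # one-layer graph encoder
        return z, self.dec(z)

def train_alternating(x, adj, k, rounds=5, steps=50):
    model = TinyGraphAE(x.size(1), 16)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    labels = None
    for _ in range(rounds):
        for _ in range(steps):             # embedding phase
            z, recon = model(x, adj)
            loss = ((recon - x) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
        z, _ = model(x, adj)               # clustering phase
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(z.detach().numpy())
    return labels
```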