Understanding microbiome dynamics via interpretable graph representation learning
Large-scale perturbations in the microbiome constitution are strongly correlated, whether as a driver or a consequence, with the health and functioning of human physiology. However, understanding the difference in the microbiome profiles of healthy and ill individuals can be complicated due to the large number of complex interactions among microbes. We propose to model these interactions as a time-evolving graph where nodes represent microbes and edges are interactions among them. Motivated by the need to analyse such complex interactions, we develop a method that can learn a low-dimensional representation of the time-evolving graph while maintaining the dynamics occurring in the high-dimensional space. Through our experiments, we show that we can extract graph features, such as clusters of nodes or edges, that have the greatest influence on how the model learns the low-dimensional representation. This information is crucial for identifying microbes, and interactions among them, that are strongly correlated with clinical diseases. We conduct our experiments on both synthetic and real-world microbiome datasets.
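The time-evolving graph described above can be pictured as a sequence of snapshots, one per time window. The sketch below is only an illustration under stated assumptions: the abstract does not say how edges are derived, so the use of Pearson correlation over a sliding window, the `threshold` parameter, and the function names `pearson` and `snapshots` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def snapshots(abundance, window, threshold):
    """Build one edge set per sliding window.

    abundance: dict mapping a microbe name to its abundance time series.
    An edge (u, v) is added when |correlation| >= threshold in that window.
    This edge rule is an assumption made for illustration only.
    """
    names = sorted(abundance)
    length = len(abundance[names[0]])
    graphs = []
    for start in range(length - window + 1):
        edges = set()
        for i, u in enumerate(names):
            for v in names[i + 1:]:
                r = pearson(abundance[u][start:start + window],
                            abundance[v][start:start + window])
                if abs(r) >= threshold:
                    edges.add((u, v))
        graphs.append(edges)
    return graphs
```

Each element of the returned list is one snapshot of the graph; a representation learner would consume the sequence in order.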
Adaptive-Step Graph Meta-Learner for Few-Shot Graph Classification
Graph classification aims to extract accurate information from
graph-structured data for classification and is becoming more and more
important in the graph learning community. Although Graph Neural Networks (GNNs)
have been successfully applied to graph classification tasks, most of them
overlook the scarcity of labeled graph data in many applications. For example,
in bioinformatics, obtaining protein graph labels usually needs laborious
experiments. Recently, few-shot learning has been explored to alleviate this
problem when only a few labeled graph samples of the test classes are given. The shared
sub-structures between training classes and test classes are essential in
few-shot graph classification. Existing methods assume that the test classes
belong to the same set of super-classes clustered from training classes.
However, according to our observations, the label spaces of training classes
and test classes usually do not overlap in real-world scenarios. As a result,
existing methods capture the local structures of unseen test classes poorly.
To overcome this limitation, we propose a direct method
to capture the sub-structures with a well-initialized meta-learner within a few
adaptation steps. More specifically, (1) we propose a novel framework
consisting of a graph meta-learner, which uses GNN-based modules for fast
adaptation on graph data, and a step controller for the robustness and
generalization of the meta-learner; (2) we provide a quantitative analysis of the
framework and give a graph-dependent upper bound of the generalization error
based on our framework; (3) extensive experiments on real-world datasets
demonstrate that our framework achieves state-of-the-art results on several
few-shot graph classification tasks compared to baselines.
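The few-shot setting described above, in which training and test label spaces do not overlap, is usually organised into N-way K-shot episodes. The sketch below shows only that data protocol, not the authors' meta-learner or step controller. The function names, the `(graph, label)` tuple layout, and the parameters `n_way`, `k_shot`, and `n_query` are assumptions made for illustration.

```python
import random


def split_classes(labels, n_test, seed=0):
    """Partition the label set into disjoint training and test classes."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    rng.shuffle(classes)
    return set(classes[n_test:]), set(classes[:n_test])


def sample_episode(dataset, classes, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode from the given classes.

    dataset: list of (graph, label) pairs; the graph can be any object.
    Returns (support, query), each a list of (graph, label) pairs.
    """
    chosen = rng.sample(sorted(classes), n_way)
    support, query = [], []
    for c in chosen:
        items = [g for g, y in dataset if y == c]
        picked = rng.sample(items, k_shot + n_query)
        support += [(g, c) for g in picked[:k_shot]]
        query += [(g, c) for g in picked[k_shot:]]
    return support, query
```

A model would adapt on `support` and be scored on `query`; at test time the episodes are drawn only from the held-out classes.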
Contrastive Brain Network Learning via Hierarchical Signed Graph Pooling Model
Recently, brain networks have been widely adopted to study brain dynamics,
brain development and brain diseases. Graph representation learning techniques
on brain functional networks can facilitate the discovery of novel biomarkers
for clinical phenotypes and neurodegenerative diseases. However, current graph
learning techniques have several limitations for brain network mining. First, most
current graph learning models are designed for unsigned graphs, which hinders
the analysis of signed network data (e.g., brain functional networks).
Second, the scarcity of brain network data limits model performance
on clinical phenotype prediction. Third, few current graph learning
models are interpretable, so they may be unable to provide biological insights
into model outcomes. Here, we propose an interpretable hierarchical signed graph
representation learning model to extract graph-level representations from brain
functional networks, which can be used for different prediction tasks. In order
to further improve the model performance, we also propose a new strategy to
augment functional brain network data for contrastive learning. We evaluate
this framework on different classification and regression tasks using the data
from HCP and OASIS. Our results from extensive experiments demonstrate the
superiority of the proposed model compared to several state-of-the-art
techniques. Additionally, we use graph saliency maps, derived from these
prediction tasks, to demonstrate detection and interpretation of phenotypic
biomarkers.
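To make the signed-graph point concrete: a signed network can carry both positive and negative edge weights, which models built for unsigned graphs cannot use directly. The sketch below shows one simple way to separate the two signs. It is an assumption for illustration, not the pooling model proposed in the paper, and the function names are hypothetical.

```python
def split_signed(adj):
    """Split a signed adjacency matrix into non-negative positive and negative parts.

    adj: square list of lists with signed edge weights.
    Returns (pos, neg) such that adj[i][j] == pos[i][j] - neg[i][j].
    """
    pos = [[max(w, 0.0) for w in row] for row in adj]
    neg = [[max(-w, 0.0) for w in row] for row in adj]
    return pos, neg


def signed_degrees(adj):
    """Per-node (positive degree, negative degree) computed from edge weights."""
    pos, neg = split_signed(adj)
    return [(sum(p), sum(n)) for p, n in zip(pos, neg)]
```

Keeping the two parts separate lets a model treat positive and negative connections differently instead of discarding or taking the absolute value of negative weights.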
Impact-Oriented Contextual Scholar Profiling using Self-Citation Graphs
Quantitatively profiling a scholar's scientific impact is important to the modern
research community. Current practices with bibliometric indicators (e.g.,
h-index), lists, and networks perform well at scholar ranking, but do not
provide structured context for scholar-centric, analytical tasks such as
profile reasoning and understanding. This work presents GeneticFlow (GF), a
suite of novel graph-based scholar profiles that fulfill three essential
requirements: structured-context, scholar-centric, and evolution-rich. We
propose a framework to compute GF over large-scale academic data sources with
millions of scholars. The framework encompasses a new unsupervised
advisor-advisee detection algorithm, a well-engineered citation type classifier
using interpretable features, and a fine-tuned graph neural network (GNN)
model. Evaluations are conducted on the real-world task of scientific award
inference. Experimental results show that the F1 score of the best GF profile
significantly exceeds that of alternative methods based on impact indicators and
bibliometric networks in all 6 computer science fields considered.
Moreover, the core GF profiles, with 63.6%-66.5% of the nodes and 12.5%-29.9% of the edges of
the full profile, still significantly outperform existing methods in 5 out of 6
fields studied. Visualization of the GF profiling results also reveals
human-explainable patterns for high-impact scholars.
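For reference, the h-index mentioned above as a typical bibliometric indicator is the largest h such that a scholar has at least h papers with at least h citations each. A minimal sketch of that standard definition follows; it is unrelated to the GF computation itself, and the function name is chosen for illustration.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h
```

The indicator collapses a whole publication record into a single number, which is why it supports ranking but offers little structured context.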
Projective Ranking-based GNN Evasion Attacks
Graph neural networks (GNNs) offer promising learning methods for
graph-related tasks. However, GNNs are at risk of adversarial attacks. Two
primary limitations of the current evasion attack methods are highlighted: (1)
GradArgmax ignores the "long-term" benefit of a perturbation and
suffers from zero gradients and invalid benefit estimates in certain
situations. (2) In the reinforcement learning-based attack methods, the learned
attack strategies might not be transferable when the attack budget changes. To
this end, we first formulate the perturbation space and propose an evaluation
framework and the projective ranking method. We aim to learn a powerful attack
strategy and then adapt it as little as possible to generate adversarial samples
under dynamic budget settings. In our method, based on mutual information, we
rank and assess the attack benefits of each perturbation for an effective
attack strategy. By projecting the strategy, our method dramatically reduces
the cost of learning a new attack strategy when the attack budget changes. In
the comparative assessment with GradArgmax and RL-S2V, the results show that our
method achieves high attack performance and effective transferability. The
visualization of our method also reveals various attack patterns in the
generation of adversarial samples.
Comment: Accepted by IEEE Transactions on Knowledge and Data Engineering
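The appeal of a ranking-based strategy under changing budgets can be shown with a toy example: once candidate perturbations are ranked, a new budget only changes how many of them are taken. The sketch below is not the projective ranking method, whose mutual-information scoring and projection step are not specified in the abstract. The benefit scores, the function names, and the edge-flip representation are all hypothetical.

```python
def rank_perturbations(scores):
    """Order candidate edge flips by descending estimated benefit.

    scores: dict mapping an edge (u, v) to a hypothetical benefit estimate.
    Ties are broken by the edge tuple so the order is deterministic.
    """
    return sorted(scores, key=lambda e: (-scores[e], e))


def select_under_budget(ranking, budget):
    """Take the top `budget` perturbations from a precomputed ranking."""
    return ranking[:budget]


def apply_flips(edges, flips):
    """Flip each selected edge: remove it if present, add it otherwise."""
    result = set(edges)
    for e in flips:
        result ^= {e}
    return result
```

Changing the budget from 1 to 2 here reuses the same ranking and only extends the selected prefix, with no new scores to learn.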
On Exploring Node-feature and Graph-structure Diversities for Node Drop Graph Pooling
A pooling operation is essential for effective graph-level representation
learning, and node drop pooling has become a mainstream graph pooling
technique. However, current node drop pooling methods usually keep the top-k
nodes according to their significance scores, which ignores graph diversity
in terms of node features and graph structures and thus results in
suboptimal graph-level representations. To address this issue, we
propose a novel plug-and-play score scheme, referred to as MID, which
consists of a Multidimensional score space with two operations,
i.e., flIpscore and Dropscore. Specifically, the
multidimensional score space depicts the significance of nodes through multiple
criteria; the flipscore encourages the maintenance of dissimilar node features;
and the dropscore forces the model to notice diverse graph structures instead
of being stuck in significant local structures. To evaluate the effectiveness
of our proposed MID, we perform extensive experiments by applying it to a wide
variety of recent node drop pooling methods, including TopKPool, SAGPool,
GSAPool, and ASAP. Specifically, the proposed MID efficiently and
consistently achieves an average improvement of about 2.8% over the above four
methods on seventeen real-world graph classification datasets, including four
social datasets (IMDB-BINARY, IMDB-MULTI, REDDIT-BINARY, and COLLAB) and
thirteen biochemical datasets (D&D, PROTEINS, NCI1, MUTAG, PTC-MR, NCI109,
ENZYMES, MUTAGENICITY, FRANKENSTEIN, HIV, BBBP, TOXCAST, and TOX21). Code is
available at https://github.com/whuchuang/mid.
Comment: 14 pages, 14 figures
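The baseline behaviour that the abstract criticises, keeping only the top-k nodes by significance score, can be sketched in a few lines. This is a plain illustration of generic top-k node drop pooling, not of MID or of any specific library implementation. The `ratio` parameter, the function name, and the edge-list representation are assumptions.

```python
import math


def topk_node_drop(scores, edges, ratio):
    """Keep the ceil(ratio * n) highest-scoring nodes and the edges among them.

    scores: list of per-node significance scores (index = node id).
    edges: list of (u, v) pairs over the original node ids.
    Returns (kept_nodes, pooled_edges), with pooled edges re-indexed from 0.
    """
    k = max(1, math.ceil(ratio * len(scores)))
    # Rank nodes by descending score, breaking ties by node id.
    top = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    kept = sorted(top)
    remap = {old: new for new, old in enumerate(kept)}
    pooled = [(remap[u], remap[v]) for u, v in edges if u in remap and v in remap]
    return kept, pooled
```

Because selection depends on a single score per node, nodes with similar features or from the same local structure can crowd out the rest of the graph, which is the diversity problem the abstract describes.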