Subgraph-level Universal Prompt Tuning
In the evolving landscape of machine learning, the adaptation of pre-trained
models through prompt tuning has become increasingly prominent. This trend is
particularly observable in the graph domain, where diverse pre-training
strategies present unique challenges in developing effective prompt-based
tuning methods for graph neural networks. Previous approaches have been
limited, focusing on specialized prompting functions tailored to models with
edge prediction pre-training tasks. These methods, however, suffer from a lack
of generalizability across different pre-training strategies. Recently, a
simple prompt tuning method has been designed for any pre-training strategy,
functioning within the input graph's feature space. This allows it to
theoretically emulate any type of prompting function, thereby significantly
increasing its versatility for a range of downstream applications.
Nevertheless, the capacity of such simple prompts to fully grasp the complex
contexts found in graphs remains an open question, necessitating further
investigation. Addressing this challenge, our work introduces the
Subgraph-level Universal Prompt Tuning (SUPT) approach, focusing on the
detailed context within subgraphs. In SUPT, prompt features are assigned at the
subgraph-level, preserving the method's universal capability. This requires
far fewer tuning parameters than fine-tuning-based methods, while outperforming
them in 42 out of 45 full-shot experiments with an average improvement
of over 2.5%. In few-shot scenarios, it excels in 41 out of 45 experiments,
achieving an average performance increase of more than 6.6%.
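As a rough illustration of the idea, a subgraph-level prompt can be realized as one learnable feature vector per subgraph that is added to the node features before they enter a frozen pre-trained GNN, so that only the prompt vectors are tuned. The sketch below assumes subgraph assignments come from an external partitioner; the class name and interface are hypothetical, not the paper's code.

```python
import torch
import torch.nn as nn

class SubgraphPrompt(nn.Module):
    """Hypothetical sketch: one learnable prompt vector per subgraph,
    injected into the input feature space of a frozen pre-trained GNN."""
    def __init__(self, num_subgraphs: int, feat_dim: int):
        super().__init__()
        # One prompt feature vector per subgraph (cluster) of the input graph.
        self.prompts = nn.Parameter(torch.zeros(num_subgraphs, feat_dim))

    def forward(self, x: torch.Tensor, subgraph_id: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, feat_dim]; subgraph_id: [num_nodes] cluster assignment.
        # Each node receives the prompt of the subgraph it belongs to.
        return x + self.prompts[subgraph_id]

# Only the prompt parameters are trained; the pre-trained GNN stays frozen.
x = torch.randn(6, 16)                          # node features
subgraph_id = torch.tensor([0, 0, 1, 1, 2, 2])  # e.g., from a graph partitioner
prompted_x = SubgraphPrompt(num_subgraphs=3, feat_dim=16)(x, subgraph_id)
```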
Non-equilibrium steady state phases of the interacting Aubry-Andre-Harper model
Here we study the phase diagram of the Aubry-Andre-Harper model in the
presence of strong interactions as the strength of the quasiperiodic potential
is varied. Previous work has established the existence of a many-body localized
phase at large potential strength; here, we find a rich phase diagram in the
delocalized regime characterized by spin transport and unusual correlations. We
calculate the non-equilibrium steady states of a boundary-driven strongly
interacting Aubry-Andre-Harper model by employing the time-evolving block
decimation algorithm on matrix product density operators. From these steady
states, we extract spin transport as a function of system size and
quasiperiodic potential strength. These data show spin transport going from
superdiffusive to subdiffusive well before the localization transition;
comparing with previous results, we also find that the transport transition is
distinct from a transition observed in the speed of operator growth in the
model. We also investigate the correlation structure of the steady state and
find an unusual oscillation pattern for intermediate values of the potential
strength. The unusual spin transport and quantum correlation structure suggest
multiple dynamical phases between the much-studied thermal and
many-body-localized phases.
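For concreteness, one common spin-chain form of the interacting AAH model reads as follows; conventions and couplings vary across the literature, so this is illustrative rather than the paper's exact definition.

```latex
% Interacting AAH chain in spin language (one common convention):
H = J \sum_{i} \left( S^{x}_{i} S^{x}_{i+1} + S^{y}_{i} S^{y}_{i+1}
    + \Delta \, S^{z}_{i} S^{z}_{i+1} \right)
  + \lambda \sum_{i} \cos\!\left( 2 \pi \beta i + \phi \right) S^{z}_{i}
```

Here $J$ sets the hopping, $\Delta$ the interaction, $\lambda$ the quasiperiodic potential strength, $\phi$ a phase offset, and $\beta$ is irrational (often related to the golden ratio), which makes the cosine potential quasiperiodic rather than commensurate.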
Towards Flexible Time-to-event Modeling: Optimizing Neural Networks via Rank Regression
Time-to-event analysis, also known as survival analysis, aims to predict the
time of occurrence of an event, given a set of features. One of the major
challenges in this area is dealing with censored data, which can make learning
algorithms more complex. Traditional methods such as Cox's proportional hazards
model and the accelerated failure time (AFT) model have been popular in this
field, but they often require assumptions such as proportional hazards and
linearity. In particular, AFT models often require pre-specified parametric
distributional assumptions. To improve predictive performance and alleviate
strict assumptions, there have been many deep learning approaches for
hazard-based models in recent years. However, representation learning for AFT
has not been widely explored in the neural network literature, despite its
simplicity and interpretability in comparison to hazard-focused methods. In
this work, we introduce the Deep AFT Rank-regression model for Time-to-event
prediction (DART). This model uses an objective function based on Gehan's rank
statistic, which is efficient and reliable for representation learning. On top
of eliminating the requirement to establish a baseline event time distribution,
DART retains the advantages of directly predicting event time in standard AFT
models. The proposed method is a semiparametric approach to AFT modeling that
does not impose any distributional assumptions on the survival time
distribution. This also eliminates the need for additional hyperparameters or
complex model architectures, unlike existing neural network-based AFT models.
Through quantitative analysis on various benchmark datasets, we have shown that
DART has significant potential for modeling high-throughput censored
time-to-event data.
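To make the objective concrete, a Gehan-type rank loss over AFT residuals can be written in a few lines. This sketch follows the classical rank-based AFT formulation; DART's exact weighting and normalization may differ from it.

```python
import torch

def gehan_rank_loss(pred, log_time, event):
    """Gehan-type rank loss for AFT models (minimal sketch, not DART's code).
    pred:     [n] network estimate f(x) of the log event time
    log_time: [n] log of the observed (event or censoring) time
    event:    [n] 1.0 if the event was observed, 0.0 if censored
    """
    e = log_time - pred                      # residuals e_i = log t_i - f(x_i)
    diff = e.unsqueeze(0) - e.unsqueeze(1)   # diff[i, j] = e_j - e_i
    # Penalize pairs where an observed event i has a smaller residual than j:
    # L = (1/n^2) * sum_{i,j} delta_i * max(0, e_j - e_i)
    return (event.unsqueeze(1) * torch.relu(diff)).mean()

# Toy usage with a linear "network":
x = torch.randn(8, 4)
f = torch.nn.Linear(4, 1)
loss = gehan_rank_loss(f(x).squeeze(-1), torch.randn(8),
                       (torch.rand(8) < 0.7).float())
loss.backward()
```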
Breaking the Spurious Causality of Conditional Generation via Fairness Intervention with Corrective Sampling
To capture the relationship between samples and labels, conditional
generative models often inherit spurious correlations from the training
dataset. This can result in label-conditional distributions that are imbalanced
with respect to another latent attribute. To mitigate this issue, which we call
spurious causality of conditional generation, we propose a general two-step
strategy. (a) Fairness Intervention (FI): emphasize the minority samples that
are hard to generate due to the spurious correlation in the training dataset.
(b) Corrective Sampling (CS): explicitly filter the generated samples and
ensure that they follow the desired latent attribute distribution. We have
designed the fairness intervention to work for various degrees of supervision
on the spurious attribute, including unsupervised, weakly-supervised, and
semi-supervised scenarios. Our experimental results demonstrate that FICS can
effectively resolve spurious causality of conditional generation across various
datasets.
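The corrective-sampling step can be pictured as rejection filtering with an attribute classifier until each latent-attribute bucket meets a target quota. The function below is an assumed sketch with hypothetical names (`generator`, `attr_clf`), not the paper's implementation.

```python
import torch

def corrective_sampling(generator, attr_clf, label, target_probs, n_samples):
    """Sketch of corrective sampling: keep generated samples only as needed
    to match a desired latent-attribute distribution for a given label.
    Assumes the generator eventually produces every attribute value."""
    kept = []
    counts = torch.zeros_like(target_probs)
    quota = (target_probs * n_samples).long()  # per-attribute sample budget
    while int(counts.sum()) < int(quota.sum()):
        x = generator(label)                   # one conditional sample
        a = int(attr_clf(x).argmax())          # predicted latent attribute
        if counts[a] < quota[a]:               # accept only if under budget
            kept.append(x)
            counts[a] += 1
    return torch.stack(kept)
```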
Spread Spurious Attribute: Improving Worst-group Accuracy with Spurious Attribute Estimation
The paradigm of worst-group loss minimization has shown promise in
avoiding the learning of spurious correlations, but it requires costly additional
supervision on spurious attributes. To resolve this, recent works focus on
developing weaker forms of supervision -- e.g., hyperparameters discovered with
a small number of validation samples with spurious attribute annotation -- but
none of the methods retain comparable performance to methods using full
supervision on the spurious attribute. In this paper, instead of searching for
weaker forms of supervision, we ask: Given access to a fixed number of samples with
spurious attribute annotations, what is the best achievable worst-group loss if
we "fully exploit" them? To this end, we propose a pseudo-attribute-based
algorithm, coined Spread Spurious Attribute (SSA), for improving the
worst-group accuracy. In particular, we leverage samples both with and without
spurious attribute annotations to train a model to predict the spurious
attribute, then use the pseudo-attribute predicted by the trained model as
supervision on the spurious attribute to train a new robust model having
minimal worst-group loss. Our experiments on various benchmark datasets show
that our algorithm consistently outperforms the baseline methods using the same
number of validation samples with spurious attribute annotations. We also
demonstrate that the proposed SSA can achieve performance comparable to
methods using full (100%) spurious attribute supervision while using far
fewer annotated samples -- between 0.6% and 1.5%, depending on the dataset.
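A minimal sketch of the final training stage, assuming pseudo-attributes have already been predicted for all samples; the interface is hypothetical, and SSA additionally prescribes how the attribute predictor itself is trained from the small annotated set.

```python
import torch
import torch.nn.functional as F

def worst_group_loss(logits, labels, pseudo_attr, num_attrs):
    """Group-DRO-style objective over (class, pseudo-attribute) groups
    (assumed interface; a sketch, not SSA's full algorithm)."""
    losses = F.cross_entropy(logits, labels, reduction="none")
    group = labels * num_attrs + pseudo_attr   # (class, attribute) group id
    group_losses = []
    for g in group.unique():
        group_losses.append(losses[group == g].mean())
    return torch.stack(group_losses).max()     # minimize the worst group
```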
Co-attention Graph Pooling for Efficient Pairwise Graph Interaction Learning
Graph Neural Networks (GNNs) have proven to be effective in processing and
learning from graph-structured data. However, previous works mainly focused on
understanding single graph inputs, while many real-world applications require
pairwise analysis of graph-structured data (e.g., scene graph matching, code
searching, and drug-drug interaction prediction). To this end, recent works
have shifted their focus to learning the interaction between pairs of graphs.
Despite their improved performance, these works were still limited in that the
interactions were considered at the node level, resulting in high computational
cost and suboptimal performance. To address this issue, we propose a novel and
efficient graph-level approach for extracting interaction representations using
co-attention in graph pooling. Our method, Co-Attention Graph Pooling
(CAGPool), exhibits competitive performance relative to existing methods in
both classification and regression tasks using real-world datasets, while
maintaining lower computational complexity.
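An illustrative co-attention pooling layer is sketched below; it is a simplified stand-in under assumptions (bilinear cross-graph scoring, mean summary, top-k selection), not the published CAGPool layer.

```python
import torch
import torch.nn as nn

class CoAttentionPool(nn.Module):
    """Illustrative co-attention pooling (not the exact CAGPool layer):
    node scores for one graph are computed against the other graph's
    summary vector, then used to pool a graph-level representation."""
    def __init__(self, dim: int, ratio: float = 0.5):
        super().__init__()
        self.ratio = ratio
        self.att = nn.Bilinear(dim, dim, 1)  # cross-graph node scoring

    def forward(self, h_a: torch.Tensor, h_b: torch.Tensor) -> torch.Tensor:
        # h_a: [n_a, d] node embeddings of graph A; h_b: [n_b, d] of graph B.
        s_b = h_b.mean(dim=0)                                # summary of B
        scores = self.att(h_a, s_b.expand(h_a.size(0), -1)).squeeze(-1)
        k = max(1, int(self.ratio * h_a.size(0)))
        idx = scores.topk(k).indices                         # keep top-k nodes
        # Gate kept nodes by softmaxed co-attention scores, then read out.
        w = torch.softmax(scores[idx], dim=0).unsqueeze(-1)
        return (h_a[idx] * w).sum(dim=0)                     # graph-A embedding
```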