Predict the Future from the Past? On the Temporal Data Distribution Shift in Financial Sentiment Classifications
Temporal data distribution shift is prevalent in financial text. How can
a financial sentiment analysis system be trained in a volatile market
environment so that it infers sentiment accurately and remains robust to
temporal data distribution shifts? In this paper, we conduct an empirical
study of financial sentiment analysis systems under temporal data
distribution shifts, using a real-world financial social media dataset
that spans three years. We
find that the fine-tuned models suffer from general performance degradation in
the presence of temporal distribution shifts. Furthermore, motivated by the
unique temporal nature of financial text, we propose a novel method that
combines out-of-distribution detection with time series modeling for temporal
financial sentiment analysis. Experimental results show that the proposed
method enhances the model's capability to adapt to evolving temporal shifts in
a volatile financial market. (Comment: EMNLP 2023 main conference)
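The abstract does not spell out the method's components, so the following is only a minimal sketch of the general idea of pairing out-of-distribution detection with temporal smoothing: embeddings that score far from the training distribution (here via a Mahalanobis distance, one common choice) lean more heavily on a running temporal prior. The function names, the score-to-weight mapping, and the blending rule are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def mahalanobis_ood_score(z, mu, cov_inv):
    """Score how far an embedding z falls from the training
    distribution (mean mu, inverse covariance cov_inv)."""
    d = z - mu
    return float(d @ cov_inv @ d)

def temporally_smoothed_logits(logits_by_month, ood_by_month, alpha=0.5):
    """Blend each month's raw logits with a running temporal prior;
    the higher the OOD score, the more we trust the prior."""
    ema, blended_all = None, []
    for logits, ood in zip(logits_by_month, ood_by_month):
        if ema is None:
            ema = logits
        w = 1.0 - np.exp(-alpha * ood)      # OOD score -> prior weight
        blended = w * ema + (1.0 - w) * logits
        ema = 0.5 * ema + 0.5 * blended     # update the temporal prior
        blended_all.append(blended)
    return blended_all

# Toy usage: three months of 3-class sentiment logits.
rng = np.random.default_rng(0)
months = [rng.normal(size=3) for _ in range(3)]
mu, cov_inv = np.zeros(3), np.eye(3)
scores = [mahalanobis_ood_score(m, mu, cov_inv) for m in months]
print(temporally_smoothed_logits(months, scores))
```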
Reducing Spurious Correlations for Aspect-Based Sentiment Analysis with Variational Information Bottleneck and Contrastive Learning
Deep learning techniques have dominated the literature on aspect-based
sentiment analysis (ABSA), yielding state-of-the-art results. However, these
deep models generally suffer from spurious correlation problems between input
features and output labels, which creates significant barriers to robustness
and generalization capability. In this paper, we propose a novel Contrastive
Variational Information Bottleneck framework (called CVIB) to reduce spurious
correlations for ABSA. The proposed CVIB framework is composed of an original
network and a self-pruned network, and these two networks are optimized
simultaneously via contrastive learning. Concretely, we employ the Variational
Information Bottleneck (VIB) principle to learn an informative and compressed
network (self-pruned network) from the original network, which discards the
superfluous patterns or spurious correlations between input features and
prediction labels. Then, self-pruning contrastive learning is devised to pull
together semantically similar positive pairs and push away dissimilar pairs,
where the representations of the anchor learned by the original and self-pruned
networks respectively are regarded as a positive pair while the representations
of two different sentences within a mini-batch are treated as a negative pair.
To verify the effectiveness of our CVIB method, we conduct extensive
experiments on five benchmark ABSA datasets and the experimental results show
that our approach outperforms strong competitors in terms of overall
prediction performance, robustness, and generalization.
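As a rough illustration of the two ingredients the abstract names, here is a minimal PyTorch sketch of a VIB compression head plus an InfoNCE-style self-pruning contrastive loss, where each sentence's original and self-pruned representations form the positive pair and other sentences in the mini-batch act as negatives. The layer shapes, temperature, and KL weight are assumed for the example and do not come from the paper.

```python
import torch
import torch.nn.functional as F

def vib_compress(h, fc_mu, fc_logvar):
    """VIB head: sample a compressed code z via reparameterization and
    return the KL term that penalizes information kept beyond a
    standard normal prior."""
    mu, logvar = fc_mu(h), fc_logvar(h)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    return z, kl

def self_pruning_contrastive_loss(z_orig, z_pruned, tau=0.1):
    """InfoNCE: a sentence's original/self-pruned representations are
    the positive pair; other sentences in the batch are negatives."""
    a = F.normalize(z_orig, dim=-1)
    b = F.normalize(z_pruned, dim=-1)
    logits = a @ b.t() / tau            # (B, B) similarity matrix
    targets = torch.arange(a.size(0))   # diagonal = positive pairs
    return F.cross_entropy(logits, targets)

# Toy usage with placeholder 768-d sentence representations.
B, D, Dz = 8, 768, 128
fc_mu, fc_logvar = torch.nn.Linear(D, Dz), torch.nn.Linear(D, Dz)
z, kl = vib_compress(torch.randn(B, D), fc_mu, fc_logvar)
loss = self_pruning_contrastive_loss(torch.randn(B, Dz), z) + 1e-3 * kl
print(loss.item())
```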
Distributionally Robust Optimization with Probabilistic Group
Modern machine learning models may be susceptible to learning spurious
correlations that hold on average but not for the atypical group of samples. To
address the problem, previous approaches minimize the empirical worst-group
risk. Despite the promise, they often assume that each sample belongs to one
and only one group, which does not allow expressing the uncertainty in group
labeling. In this paper, we propose a novel framework PG-DRO, which explores
the idea of probabilistic group membership for distributionally robust
optimization. Key to our framework, we consider soft group membership instead
of hard group annotations. The group probabilities can be flexibly generated
using either supervised learning or zero-shot approaches. Our framework
accommodates samples with group membership ambiguity, offering stronger
flexibility and generality than the prior art. We comprehensively evaluate
PG-DRO on both image classification and natural language processing benchmarks,
establishing superior performance. (Comment: Published at AAAI 2023)
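To make the soft-membership idea concrete, the sketch below computes a probability-weighted per-group risk and takes the worst group, in the spirit of worst-group DRO with probabilistic group labels; the exact PG-DRO objective may differ, and the tensors and epsilon here are assumptions for illustration only.

```python
import torch

def probabilistic_group_dro_loss(per_sample_loss, group_probs, eps=1e-8):
    """Soft-group worst-case risk: each sample contributes to every
    group in proportion to its membership probability, and the
    objective is the largest probability-weighted group risk."""
    # group_probs: (B, G) rows summing to 1; per_sample_loss: (B,)
    mass = group_probs.sum(dim=0) + eps               # (G,)
    group_risk = (group_probs * per_sample_loss[:, None]).sum(0) / mass
    return group_risk.max()

# Toy usage: 4 samples with soft membership over 2 groups.
losses = torch.tensor([0.2, 1.5, 0.3, 0.9])
probs = torch.tensor([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]])
print(probabilistic_group_dro_loss(losses, probs))
```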
Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference
It has been shown that NLI models are usually biased with respect to the
word-overlap between premise and hypothesis; they take this feature as a
primary cue for predicting the entailment label. In this paper, we focus on an
overlooked aspect of the overlap bias in NLI models: the reverse word-overlap
bias. Our experimental results demonstrate that current NLI models are highly
biased towards the non-entailment label on instances with low overlap, and the
existing debiasing methods, though reportedly successful on challenge
datasets, are generally ineffective in addressing this category of
bias. We investigate the reasons for the emergence of the overlap bias and the
role of minority examples in its mitigation. For the former, we find that the
word-overlap bias does not stem from pre-training, and for the latter, we
observe that, contrary to the common assumption, eliminating minority
examples does not affect the generalizability of debiasing methods with respect
to the overlap bias. (Comment: Accepted at EMNLP 2022)
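For readers who want to reproduce this style of analysis, a minimal sketch of measuring premise-hypothesis word overlap and bucketing examples by overlap level follows; the whitespace tokenization and bucket thresholds are illustrative choices, not the paper's.

```python
def word_overlap(premise: str, hypothesis: str) -> float:
    """Fraction of hypothesis tokens that also appear in the premise
    (naive whitespace tokenization, punctuation left attached)."""
    p = set(premise.lower().split())
    h = hypothesis.lower().split()
    return sum(tok in p for tok in h) / max(len(h), 1)

def overlap_bucket(score: float) -> str:
    """Bucket examples so accuracy can be reported per overlap regime;
    the thresholds are arbitrary for this example."""
    return "low" if score < 0.3 else "medium" if score < 0.7 else "high"

pair = ("A man is playing a guitar on stage.", "A man sleeps at home.")
s = word_overlap(*pair)
print(s, overlap_bucket(s))
```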
Learning Stable Classifiers by Transferring Unstable Features
While unbiased machine learning models are essential for many applications,
bias is a human-defined concept that can vary across tasks. Given only
input-label pairs, algorithms may lack sufficient information to distinguish
stable (causal) features from unstable (spurious) features. However, related
tasks often share similar biases -- an observation we may leverage to develop
stable classifiers in the transfer setting. In this work, we explicitly inform
the target classifier about unstable features in the source tasks.
Specifically, we derive a representation that encodes the unstable features by
contrasting different data environments in the source task. We achieve
robustness by clustering data of the target task according to this
representation and minimizing the worst-case risk across these clusters. We
evaluate our method on both text and image classifications. Empirical results
demonstrate that our algorithm is able to maintain robustness on the target
task, outperforming the best baseline by 22.9% in absolute accuracy across 12
transfer settings. Our code is available at https://github.com/YujiaBao/Tofu.
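A minimal sketch of the clustering-plus-worst-case-risk step described above, assuming a representation of the unstable features is already available: target data are clustered on that representation (k-means here, one possible choice) and the largest per-cluster loss is minimized. The model, cluster count, and data are placeholders; see the repository above for the authors' actual implementation.

```python
import torch
from sklearn.cluster import KMeans

def cluster_worst_case_loss(model, x, y, unstable_repr, k=4):
    """Cluster target data by the unstable-feature representation and
    take the worst per-cluster loss, so the classifier cannot rely on
    features that vary across clusters."""
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(
        unstable_repr.detach().cpu().numpy())
    labels = torch.as_tensor(labels)
    per_sample = torch.nn.functional.cross_entropy(
        model(x), y, reduction="none")
    risks = [per_sample[labels == c].mean()
             for c in range(k) if (labels == c).any()]
    return torch.stack(risks).max()

# Toy usage with a placeholder linear classifier.
model = torch.nn.Linear(16, 2)
x, y = torch.randn(64, 16), torch.randint(0, 2, (64,))
print(cluster_worst_case_loss(model, x, y, torch.randn(64, 8)))
```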
Confounder Balancing in Adversarial Domain Adaptation for Pre-Trained Large Models Fine-Tuning
The excellent generalization, in-context learning, and emergent abilities
of pre-trained large models (PLMs) allow them to handle specific tasks
without direct training data, which makes them strong foundation models
for adversarial domain adaptation (ADA) methods that transfer knowledge
learned from a source domain to target domains. However, existing ADA
methods fail to account properly for the confounder, which is the root
cause of the distribution difference between the source and target
domains. This study proposes an adversarial domain adaptation method with
confounder balancing for PLM fine-tuning (ADA-CBF). ADA-CBF comprises a
PLM serving as the foundation model for a feature extractor, a domain
classifier, and a confounder classifier, all jointly trained with an
adversarial loss. This loss is designed to improve domain-invariant
representation learning by diluting the discrimination power of the
domain classifier. At the same time, the adversarial loss balances the
confounder distribution across the source and target domains during
training. Compared to existing ADA methods, ADA-CBF can correctly
identify confounders in domain-invariant features, thereby eliminating
confounder biases in the features extracted from PLMs. The confounder
classifier in ADA-CBF is designed to be plug-and-play and can be applied
in environments where confounders are measurable, unmeasurable, or
partially measurable. Empirical results on natural language processing
and computer vision downstream tasks show that ADA-CBF outperforms the
latest GPT-4, LLaMA2, ViT and ADA methods.
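The joint adversarial training the abstract describes can be illustrated with the standard gradient-reversal construction: identity on the forward pass, negated gradients on the backward pass, placed in front of both the domain and confounder classifiers. This is a hedged sketch of the general mechanism; the heads, dimensions, and equal loss weighting are assumptions rather than ADA-CBF's exact architecture.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity forward, negated gradient
    backward -- the usual trick for adversarial domain adaptation."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, g):
        return -g

def ada_cbf_style_losses(feat, dom_head, conf_head, dom_y, conf_y):
    """Sketch: both classifiers sit behind gradient reversal, so the
    feature extractor is pushed toward representations that carry no
    domain or confounder information."""
    rev = GradReverse.apply(feat)
    ce = torch.nn.functional.cross_entropy
    return ce(dom_head(rev), dom_y) + ce(conf_head(rev), conf_y)

# Toy usage with placeholder heads over 32-d extracted features.
feat = torch.randn(16, 32, requires_grad=True)
dom_head, conf_head = torch.nn.Linear(32, 2), torch.nn.Linear(32, 3)
loss = ada_cbf_style_losses(feat, dom_head, conf_head,
                            torch.randint(0, 2, (16,)),
                            torch.randint(0, 3, (16,)))
loss.backward()
print(feat.grad.norm())  # reversed gradients reach the features
```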
Knowledge is Power: Understanding Causality Makes Legal Judgment Prediction Models More Generalizable and Robust
Legal Judgment Prediction (LJP), which aims to predict a judgment from
fact descriptions, serves as a form of legal assistance that mitigates
the heavy workload of the limited pool of legal practitioners. Most
existing methods apply various large-scale pre-trained language models
(PLMs) fine-tuned on LJP tasks to obtain consistent improvements.
However, we discover that the state-of-the-art (SOTA) model makes
judgment predictions according to irrelevant (or non-causal) information,
which not only weakens the model's generalization capability but also
leads to severe social problems such as discrimination. Here, we analyze
the causal mechanism that misleads the LJP model into learning spurious
correlations, and then propose a framework to guide the model toward the
underlying causal knowledge in the legal texts. Specifically, we first
perform open information extraction (OIE) to refine the text into content
with a high proportion of causal information, from which we generate a
new dataset. Then, we design a model that learns weights for the refined
data and the raw data during LJP model training. Extensive experimental
results show that our model is more generalizable and robust than the
baselines and achieves new SOTA performance on two commonly used
legal-specific datasets.
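As a sketch of the final reweighting step, the loss below blends a term on the raw data with a term on the OIE-refined data through a single weight w; in the paper a dedicated model learns these weights, so the fixed scalar here is a simplifying assumption.

```python
import torch

def reweighted_ljp_loss(logits_raw, y_raw, logits_refined, y_refined, w):
    """Sketch: interpolate between the loss on raw data and the loss
    on OIE-refined (causal-information-rich) data; w in [0, 1] shifts
    training toward the refined subset."""
    ce = torch.nn.functional.cross_entropy
    return (1 - w) * ce(logits_raw, y_raw) + w * ce(logits_refined, y_refined)

# Toy usage: 3-way judgment labels, weight favoring refined data.
lr, lf = torch.randn(8, 3), torch.randn(8, 3)
yr, yf = torch.randint(0, 3, (8,)), torch.randint(0, 3, (8,))
print(reweighted_ljp_loss(lr, yr, lf, yf, w=0.7))
```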