4 research outputs found
On Robustness and Bias Analysis of BERT-based Relation Extraction
Fine-tuned pre-trained models have achieved impressive performance on
standard natural language processing benchmarks. However, the generalizability
of the resulting models remains poorly understood. We do not know, for example,
whether excellent benchmark performance translates into models that generalize well. In
this study, we analyze a fine-tuned BERT model from different perspectives
using relation extraction. We also characterize the differences in
generalization techniques according to our proposed improvements. From
empirical experimentation, we find that BERT suffers from a robustness
bottleneck under randomization, adversarial, and counterfactual tests, and from
biases (i.e., selection and semantic biases). These findings highlight opportunities
for future improvements. Our open-sourced testbed DiagnoseRE is available at
https://github.com/zjunlp/DiagnoseRE.
Comment: work in progress
Normal vs. Adversarial: Salience-based Analysis of Adversarial Samples for Relation Extraction
Recent neural relation extraction approaches, though achieving
promising improvements on benchmark datasets, have been reported to be vulnerable
to adversarial attacks. Thus far, efforts have mostly focused on generating
adversarial samples or defending against adversarial attacks, but little is known about
the difference between normal and adversarial samples. In this work, we take
the first step to leverage the salience-based method to analyze those
adversarial samples. We observe that salient tokens correlate directly
with adversarial perturbations. We further find that the adversarial perturbations
are either tokens that do not appear in the training set or superficial cues
associated with relation labels. To some extent, our approach unveils the
characteristics of adversarial samples. We release an open-source testbed,
"DiagnoseAdv", at https://github.com/zjunlp/DiagnoseAdv.
Comment: IJCKG 202
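The salience analysis described in the abstract above can be illustrated with a minimal sketch. It is not the method from the paper or the DiagnoseAdv code: it assumes a hypothetical linear relation scorer, score = w . sum_i e_i, for which the gradient of the score with respect to each token embedding e_i is simply w, so a gradient-times-input salience reduces to |w . e_i|. The function name, the embeddings, and the weights are all made up for illustration.

```python
def token_salience(tokens, embeddings, relation_weights):
    """Gradient-times-input salience for a toy linear relation scorer.

    Assumed model (hypothetical): score = w . sum_i e_i, so the gradient of
    the score with respect to every token embedding e_i equals w, and the
    salience of token i is |w . e_i|.
    """
    grad = relation_weights  # d(score)/d(e_i) for the linear scorer
    return [
        (tok, abs(sum(g * e for g, e in zip(grad, embeddings[tok]))))
        for tok in tokens
    ]


# Made-up two-dimensional embeddings and relation weights.
EMBEDDINGS = {
    "Jobs": [0.2, 0.1],
    "founded": [1.0, 0.0],
    "the": [0.0, 0.1],
    "company": [0.3, 0.2],
}
WEIGHTS = [2.0, 0.5]

ranked = sorted(
    token_salience(["Jobs", "founded", "the", "company"], EMBEDDINGS, WEIGHTS),
    key=lambda pair: pair[1],
    reverse=True,
)
```

Comparing such a ranking for a normal sentence and its adversarially perturbed version is one way to check whether the perturbed tokens are also the most salient ones.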
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
Large-scale pre-trained language models have contributed significantly to
natural language processing by demonstrating remarkable abilities as few-shot
learners. However, their effectiveness depends mainly on scaling the model
parameters and prompt design, hindering their implementation in most real-world
applications. This study proposes a novel pluggable, extensible, and efficient
approach named DifferentiAble pRompT (DART), which can convert small language
models into better few-shot learners without any prompt engineering. The main
principle behind this approach involves reformulating potential natural
language processing tasks into the language modeling task of a pre-trained model and
differentiably optimizing the prompt template as well as the target label with
backpropagation. Furthermore, the proposed approach can be: (i) plugged into any
pre-trained language model; (ii) extended to a wide range of classification tasks.
A comprehensive evaluation on standard NLP tasks demonstrates that the proposed
approach achieves better few-shot performance. Code is available at
https://github.com/zjunlp/DART.
Comment: Accepted by ICLR 202
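The idea of optimizing a prompt and the label representations with backpropagation, as the DART abstract describes, can be sketched on a toy problem. This is not the DART implementation: the "language model" here is an assumed fixed 2x2 matrix W, the prompt is a single continuous vector averaged with the input, and the gradients are derived by hand. All names and numbers are hypothetical. Only the prompt and the label embeddings are updated.

```python
import math

# Frozen toy "language model" weights (hypothetical values); never updated.
W = [[1.0, 0.5], [-0.5, 1.0]]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matvec_t(m, v):
    # Multiply by the transpose of m.
    return [sum(m[r][c] * v[r] for r in range(len(m))) for c in range(len(m[0]))]


def forward(x, prompt, labels):
    """Return hidden state h = W (prompt + x) / 2 and softmax class probabilities."""
    u = [(p + xi) / 2.0 for p, xi in zip(prompt, x)]
    h = matvec(W, u)
    logits = [sum(a * b for a, b in zip(h, lab)) for lab in labels]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return h, [e / total for e in exps]


def mean_loss(data, prompt, labels):
    """Average cross-entropy over (input, class index) pairs."""
    return sum(-math.log(forward(x, prompt, labels)[1][y]) for x, y in data) / len(data)


def train(data, prompt, labels, lr=0.2, steps=300):
    """Full-batch gradient descent on the prompt and label embeddings only."""
    prompt = list(prompt)
    labels = [list(lab) for lab in labels]
    n = len(data)
    for _ in range(steps):
        g_prompt = [0.0] * len(prompt)
        g_labels = [[0.0] * len(lab) for lab in labels]
        for x, y in data:
            h, probs = forward(x, prompt, labels)
            # Gradient of cross-entropy with respect to the logits.
            dz = [p - (1.0 if c == y else 0.0) for c, p in enumerate(probs)]
            # Backpropagate through logits = h . label_c and h = W u.
            dh = [sum(dz[c] * labels[c][j] for c in range(len(labels)))
                  for j in range(len(h))]
            du = matvec_t(W, dh)
            for j in range(len(prompt)):
                g_prompt[j] += 0.5 * du[j] / n  # u = (prompt + x) / 2
            for c in range(len(labels)):
                for j in range(len(h)):
                    g_labels[c][j] += dz[c] * h[j] / n
        prompt = [p - lr * g for p, g in zip(prompt, g_prompt)]
        labels = [[v - lr * g for v, g in zip(lab, gl)]
                  for lab, gl in zip(labels, g_labels)]
    return prompt, labels


# Two made-up training examples: (input embedding, class index).
DATA = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
INIT_PROMPT = [0.0, 0.0]
INIT_LABELS = [[0.1, 0.0], [0.0, 0.1]]

loss_before = mean_loss(DATA, INIT_PROMPT, INIT_LABELS)
tuned_prompt, tuned_labels = train(DATA, INIT_PROMPT, INIT_LABELS)
loss_after = mean_loss(DATA, tuned_prompt, tuned_labels)
```

In this sketch the loss falls while W stays untouched, which mirrors the abstract's claim that the prompt template and target labels, rather than a hand-written prompt, are what get optimized.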