4 research outputs found

On Robustness and Bias Analysis of BERT-based Relation Extraction

Authors: Bi Zhen, Chen Huajun, Chen Xiang, Deng Shumin, Li Luoqiu, Ye Hongbin, Zhang Ningyu
Publication date: 06/04/2021

Fine-tuned pre-trained models have achieved impressive performance on standard natural language processing benchmarks. However, the generalizability of the resulting models remains poorly understood. For example, it is unclear whether excellent benchmark performance translates into models that generalize well. In this study, we analyze a fine-tuned BERT model from different perspectives using relation extraction. We also characterize the differences in generalization techniques according to our proposed improvements. From empirical experimentation, we find that BERT suffers from a robustness bottleneck under randomization, adversarial, and counterfactual tests, and from biases (i.e., selection and semantic). These findings highlight opportunities for future improvements. Our open-sourced testbed DiagnoseRE is available at https://github.com/zjunlp/DiagnoseRE.

Comment: work in progress

arXiv.org e-Print Archive

Normal vs. Adversarial: Salience-based Analysis of Adversarial Samples for Relation Extraction

Authors: Bi Zhen, Chen Huajun, Chen Mosha, Chen Xiang, Deng Shumin, Li Luoqiu, Tan Chuanqi, Xie Xin, Zhang Ningyu
Publication date: 25/11/2021

Recent neural relation extraction approaches, though achieving promising improvements on benchmark datasets, have been reported to be vulnerable to adversarial attacks. Thus far, efforts have mostly focused on generating adversarial samples or defending against adversarial attacks, but little is known about the difference between normal and adversarial samples. In this work, we take a first step in leveraging salience-based methods to analyze those adversarial samples. We observe that salience tokens correlate directly with adversarial perturbations. We further find that the adversarial perturbations are either tokens that do not exist in the training set or superficial cues associated with relation labels. To some extent, our approach reveals the characteristics of adversarial samples. We release an open-source testbed, "DiagnoseAdv", at https://github.com/zjunlp/DiagnoseAdv.

Comment: IJCKG 202

arXiv.org e-Print Archive

Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

Authors: Bi Zhen, Chen Huajun, Chen Xiang, Deng Shumin, Huang Fei, Li Luoqiu, Tan Chuanqi, Zhang Ningyu
Publication date: 04/05/2022

Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and on prompt design, which hinders their use in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach involves reformulating potential natural language processing tasks into the task of a pre-trained language model and differentially optimizing the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) plugged into any pre-trained language model; (ii) extended to a wide range of classification tasks. A comprehensive evaluation on standard NLP tasks demonstrates that the proposed approach achieves better few-shot performance. Code is available at https://github.com/zjunlp/DART.

Comment: Accepted by ICLR 202

arXiv.org e-Print Archive