Meta-Learning with Dynamic-Memory-Based Prototypical Network for Few-Shot Event Detection
Event detection (ED), a sub-task of event extraction, involves identifying
triggers and categorizing event mentions. Existing methods primarily rely upon
supervised learning and require large-scale labeled event datasets which are
unfortunately not readily available in many real-life applications. In this
paper, we consider and reformulate the ED task with limited labeled data as a
Few-Shot Learning problem. We propose a Dynamic-Memory-Based Prototypical
Network (DMB-PN), which exploits Dynamic Memory Network (DMN) to not only learn
better prototypes for event types, but also produce more robust sentence
encodings for event mentions. Unlike vanilla prototypical networks, which
compute event prototypes by simple averaging and consume each event mention
only once, our model is more robust and can distill contextual information
from event mentions multiple times through the multi-hop mechanism of DMNs.
The experiments show that DMB-PN not only deals with sample
scarcity better than a series of baseline models but also performs more
robustly when the variety of event types is relatively large and the instance
quantity is extremely small.
Comment: Accepted by WSDM 202
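The vanilla prototypical-network baseline that the abstract contrasts against can be sketched as follows. This is only an illustration of prototype averaging and nearest-prototype classification, not the DMB-PN model: the sentence encoder is omitted, and the function names, the toy 2-D embeddings, and the event type labels are all assumptions made for the example.

```python
from collections import defaultdict
from math import dist


def class_prototypes(embeddings, labels):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest (Euclidean)."""
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))


# Hypothetical 2-way 2-shot support set with made-up 2-D embeddings.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
support_labels = ["Attack", "Attack", "Meet", "Meet"]

prototypes = class_prototypes(support, support_labels)
prediction = classify([5.0, 3.0], prototypes)
```

Each support mention contributes to its prototype exactly once here, which is the single-pass behavior the abstract says the multi-hop mechanism improves on.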
Normal vs. Adversarial: Salience-based Analysis of Adversarial Samples for Relation Extraction
Recent neural relation extraction approaches, though achieving
promising improvements on benchmark datasets, have been reported to be vulnerable
to adversarial attacks. Thus far, efforts have mostly focused on generating
adversarial samples or defending against adversarial attacks, but little is known about
the difference between normal and adversarial samples. In this work, we take
the first step to leverage the salience-based method to analyze those
adversarial samples. We observe that salience tokens have a direct correlation
with adversarial perturbations. We further find that the adversarial perturbations
are either tokens that do not exist in the training set or superficial cues
associated with relation labels. To some extent, our approach unveils the
characteristics of adversarial samples. We release an open-source testbed,
"DiagnoseAdv", at https://github.com/zjunlp/DiagnoseAdv.
Comment: IJCKG 202
On Robustness and Bias Analysis of BERT-based Relation Extraction
Fine-tuned pre-trained models have achieved impressive performance on
standard natural language processing benchmarks. However, the resulting models'
generalizability remains poorly understood. We do not know, for example, whether
excellent benchmark performance translates into good generalization. In
this study, we analyze a fine-tuned BERT model from different perspectives
using relation extraction. We also characterize the differences in
generalization techniques according to our proposed improvements. From
empirical experimentation, we find that BERT suffers from a robustness
bottleneck, as revealed by randomization, adversarial, and counterfactual tests, and
from biases (i.e., selection and semantic). These findings highlight opportunities
for future improvements. Our open-sourced testbed DiagnoseRE is available at
https://github.com/zjunlp/DiagnoseRE.
Comment: work in progress
MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection
Event detection (ED) aims to identify the key trigger words in
unstructured text and predict the event types accordingly. Traditional ED
models are too data-hungry to accommodate real applications with scarce labeled
data. Moreover, typical ED models face context-bypassing and disabled
generalization issues caused by the trigger bias stemming from ED datasets.
Therefore, we focus on the true few-shot paradigm to address low-resource
scenarios. In particular, we propose a multi-step prompt learning model
(MsPrompt) for debiasing few-shot event detection that consists of the
following three components: an under-sampling module that constructs a
novel training set accommodating the true few-shot setting; a multi-step
prompt module equipped with a knowledge-enhanced ontology to fully leverage the event
semantics and latent prior knowledge in the PLMs for tackling the
context-bypassing problem; and a prototypical module that compensates for the
weakness of classifying events with sparse data and boosts generalization
performance. Experiments on two public datasets, ACE-2005 and FewEvent, show that
MsPrompt can outperform the state-of-the-art models, especially in strict
low-resource scenarios, reporting an 11.43% improvement in weighted
F1-score over the best-performing baseline and achieving outstanding
debiasing performance.
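For readers unfamiliar with the metric, weighted F1 is commonly computed as the per-class F1 averaged with weights proportional to each class's number of gold instances. The sketch below follows that common definition; the abstract does not spell out its exact formula, so treating this as the paper's metric is an assumption, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def weighted_f1(gold, pred):
    """Per-class F1, averaged with weights equal to each class's gold support."""
    support = Counter(gold)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(gold)


# Toy example: class "A" has three gold instances, class "B" has one.
gold = ["A", "A", "A", "B"]
pred = ["A", "A", "B", "B"]
score = weighted_f1(gold, pred)
```

In this toy case class "A" has F1 = 0.8 and class "B" has F1 = 2/3, so the more frequent class dominates the weighted average.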
AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba
Conceptual graphs, which are a particular type of knowledge graph, play an
essential role in semantic search. Prior conceptual graph construction
approaches typically extract high-frequency, coarse-grained, and time-invariant
concepts from formal texts. In real applications, however, it is necessary to
extract less-frequent, fine-grained, and time-varying conceptual knowledge and
build a taxonomy in an evolving manner. In this paper, we introduce an approach
to implementing and deploying the conceptual graph at Alibaba. Specifically, we
propose a framework called AliCG, which is capable of a) extracting fine-grained
concepts by a novel bootstrapping with alignment consensus approach, b) mining
long-tail concepts with a novel low-resource phrase mining approach, and c)
updating the graph dynamically via a concept distribution estimation method
based on implicit and explicit user behaviors. We have deployed the framework
at Alibaba UC Browser. Extensive offline evaluation as well as online A/B
testing demonstrate the efficacy of our approach.
Comment: Accepted by KDD 2021 (Applied Data Science Track)