When is multitask learning effective? Semantic sequence prediction under varying data conditions
Multitask learning (MTL) has been applied successfully to a range of tasks, mostly
morphosyntactic. However, little is known about when MTL works and whether there
are data characteristics that help to determine its success. In this paper we
evaluate a range of semantic sequence labeling tasks in an MTL setup. We examine
different auxiliary tasks, including a novel setup, and correlate their
impact with data-dependent conditions. Our results show that MTL is not always
effective: significant improvements are obtained for only 1 out of 5 tasks.
When MTL is successful, auxiliary tasks with compact and more uniform label
distributions are preferable.
Comment: In EACL 2017
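For readers unfamiliar with the setup, the hard-parameter-sharing architecture such MTL experiments typically build on can be sketched as follows: a shared encoder updated by every task, plus one classification head per task. This is a generic illustration under assumed names (e.g. `MultitaskTagger`, `num_labels_per_task`), not the paper's exact model:

```python
# Minimal sketch of hard parameter sharing for multitask sequence labeling:
# one shared encoder, one softmax head per task. All names are illustrative.
import torch
import torch.nn as nn

class MultitaskTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, num_labels_per_task):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Shared BiLSTM encoder: every task's gradients update these weights.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # One task-specific head per task (main task + auxiliary tasks).
        self.heads = nn.ModuleList(
            nn.Linear(2 * hidden_dim, n) for n in num_labels_per_task
        )

    def forward(self, token_ids, task_id):
        states, _ = self.encoder(self.embed(token_ids))
        return self.heads[task_id](states)  # per-token label logits

# Training typically alternates batches across tasks, e.g.:
#   logits = model(batch_tokens, task_id)            # (batch, seq, labels)
#   loss = nn.functional.cross_entropy(logits.transpose(1, 2), batch_labels)
```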
Syntax-Aware Graph-to-Graph Transformer for Semantic Role Labelling
Recent models have shown that incorporating syntactic knowledge into the
semantic role labelling (SRL) task leads to significant improvements. In this
paper, we propose the Syntax-aware Graph-to-Graph Transformer (SynG2G-Tr) model,
which encodes syntactic structure using a novel way of inputting graph
relations as embeddings directly into the self-attention mechanism of the
Transformer. This approach adds a soft bias towards attention patterns that
follow the syntactic structure, while still allowing the model to use this
information to learn alternative patterns. We evaluate our model on both
span-based and dependency-based SRL benchmarks, and outperform previous
methods in both in-domain and out-of-domain settings on the CoNLL 2005
and CoNLL 2009 datasets.
Comment: Accepted to Rep4NLP at ACL 2023
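The core idea of a soft syntactic bias in self-attention can be illustrated briefly: relation-type embeddings contribute an additive term to the attention logits, so attention is nudged along parse-graph edges without any pattern being hard-masked out. The module below is a generic relation-aware attention layer under assumed names, not the published SynG2G-Tr implementation:

```python
# Sketch of self-attention with graph relations input as embeddings.
# `num_relations` is a hypothetical relation inventory (incl. a "no edge" id).
import math
import torch
import torch.nn as nn

class RelationAwareAttention(nn.Module):
    def __init__(self, d_model, num_relations):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # One learned embedding per syntactic relation type.
        self.rel_k = nn.Embedding(num_relations, d_model)

    def forward(self, x, relations):
        # x: (batch, seq, d_model); relations: (batch, seq, seq) relation ids.
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Content-based attention logits.
        scores = q @ k.transpose(-2, -1)
        # Soft bias: a query-relation interaction term raises or lowers the
        # logit for token pairs connected in the parse graph, without
        # forbidding attention to unconnected pairs.
        rel = self.rel_k(relations)                      # (b, q, k, d)
        scores = scores + torch.einsum('bqd,bqkd->bqk', q, rel)
        attn = torch.softmax(scores / math.sqrt(x.size(-1)), dim=-1)
        return attn @ v
```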
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus
Much research effort is devoted to semantic role labeling (SRL), which is
crucial for natural language understanding. Supervised approaches have achieved
impressive performance when large-scale corpora are available, as for
resource-rich languages such as English; for low-resource languages
with no annotated SRL dataset, however, it is still challenging to obtain competitive
performance. Cross-lingual SRL is one promising way to address the problem,
and it has achieved great advances with the help of model transfer and
annotation projection. In this paper, we propose a novel alternative based on
corpus translation, constructing high-quality training datasets for the target
languages from the source gold-standard SRL annotations. Experimental results
on the Universal Proposition Bank show that the translation-based method is highly
effective, and that the automatic pseudo datasets can significantly improve
target-language SRL performance.
Comment: Accepted at ACL 2020
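The projection step at the heart of such translation-based pipelines is simple to state: translate the source sentence, word-align it to the translation, and copy each token's SRL label along the alignment links. Here is a minimal sketch, with all names hypothetical and the quality filtering that real pipelines apply omitted:

```python
# Sketch of projecting SRL labels through a word alignment onto a translation.
def project_srl_labels(src_labels, alignment, tgt_len, default="O"):
    """Map token-level SRL labels through a word alignment.

    src_labels: labels for source tokens, e.g. ["B-ARG0", "O", ...]
    alignment:  (src_idx, tgt_idx) pairs produced by a word aligner
    tgt_len:    number of target-language tokens
    """
    tgt_labels = [default] * tgt_len
    for src_idx, tgt_idx in alignment:
        # Keep the first projected label; a real system would resolve
        # conflicts (e.g. by alignment probability) instead.
        if tgt_labels[tgt_idx] == default:
            tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels

# Example with a reordered argument in the translation (alignment is made up):
labels = project_srl_labels(
    ["B-ARG0", "B-V", "B-ARG1"],   # source: "She ate apples"
    [(0, 1), (1, 2), (2, 0)],      # hypothetical aligner output
    tgt_len=3,
)
# -> ["B-ARG1", "B-ARG0", "B-V"]
```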