1,324 research outputs found

    Predicate Informed Syntax-Guidance for Semantic Role Labeling

    In this thesis, we consider neural network approaches to the semantic role labeling (SRL) task in semantic parsing. Recent state-of-the-art results for semantic role labeling were achieved by combining LSTM neural networks with pre-trained features. This work offers a simple BERT-based model which shows that, contrary to the popular belief that more complexity means better performance, removing the LSTM improves the state of the art for span-based semantic role labeling. The model improves F1 scores on both the CoNLL-2012 test set and the Brown test set of CoNLL-2005 by at least 3 percentage points.

    In addition to this refinement of existing architectures, we also propose a new mechanism. There has been an active line of research on incorporating syntax information into the attention mechanism for semantic parsing, but existing models do not make use of which sub-clause a given token belongs to or where the boundary of that sub-clause lies. We propose a predicate-aware attention mechanism that explicitly incorporates the portion of the parse spanning from the predicate. The proposed Syntax-Guidance (SG) mechanism further improves model performance. We compare the predicate-informed method with three other SG mechanisms in a detailed error analysis, showing the advantages and potential research directions of the proposed method.
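    The abstract does not spell out how predicate-aware attention would be wired up. The sketch below shows one plausible hard-masking variant in PyTorch, under the assumption that tokens attend only within the sub-clause span containing the predicate; the function name, tensor shapes, and the span_mask input are illustrative, not the thesis's actual implementation.

        import torch

        def predicate_span_attention(q, k, v, span_mask):
            # q, k, v: (batch, seq_len, d_model); span_mask: (batch, seq_len) bool,
            # True for tokens inside the predicate's sub-clause (the predicate
            # itself is always inside, so no softmax row is fully masked).
            d = q.size(-1)
            scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5
            # Hard syntax guidance: keys outside the predicate's sub-clause are
            # masked out before the softmax, so every query attends only to the
            # span rooted at the predicate.
            scores = scores.masked_fill(~span_mask.unsqueeze(1), float("-inf"))
            attn = torch.softmax(scores, dim=-1)
            return torch.matmul(attn, v)

    A soft variant would instead add a learned bias to out-of-span scores rather than masking them to -inf; the abstract does not say which flavor the thesis adopts.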

    Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus

    Much research effort has been devoted to semantic role labeling (SRL), which is crucial for natural language understanding. Supervised approaches have achieved impressive performance when large-scale annotated corpora are available, as for resource-rich languages such as English, but for low-resource languages with no annotated SRL dataset it remains challenging to obtain competitive performance. Cross-lingual SRL is one promising way to address the problem and has achieved great advances with the help of model transfer and annotation projection. In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. Experimental results on the Universal Proposition Bank show that the translation-based method is highly effective, and that the automatic pseudo datasets significantly improve target-language SRL performance.

    Comment: Accepted at ACL 2020
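    The paper's pipeline translates a gold-annotated source corpus and carries the SRL annotations over to the translations. Below is a minimal sketch of the projection step alone, assuming token-level BIO role tags and word alignments from an automatic aligner; the function name and data shapes are hypothetical, not the authors' code.

        def project_srl_labels(src_labels, alignment, tgt_len):
            # src_labels: BIO tags for the source sentence, e.g. ["B-A0", "I-A0", "O"]
            # alignment: (src_idx, tgt_idx) pairs from an automatic word aligner
            # tgt_len: number of tokens in the translated sentence
            tgt_labels = ["O"] * tgt_len
            for s, t in alignment:
                if src_labels[s] != "O":
                    # Copy the role label across the alignment link; unaligned
                    # target tokens keep the default "O".
                    tgt_labels[t] = src_labels[s]
            return tgt_labels

        # Example: a two-token A0 span projected onto a three-token translation
        print(project_srl_labels(["B-A0", "I-A0", "O"], [(0, 0), (1, 1)], 3))
        # -> ['B-A0', 'I-A0', 'O']

    In the actual paper, translation quality and alignment filtering are central to producing high-quality pseudo datasets; this sketch omits those steps.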