26,254 research outputs found
Towards Semi-Supervised Learning for Deep Semantic Role Labeling
Neural models have achieved state-of-the-art performance on Semantic
Role Labeling (SRL). However, these models require immense amounts of
semantic-role-annotated corpora and are thus not well suited for low-resource
languages or domains. This paper proposes a semi-supervised semantic role
labeling method that outperforms the state of the art when SRL training
corpora are limited. The
method is based on explicitly enforcing syntactic constraints by augmenting the
training objective with a syntactic-inconsistency loss component and uses
SRL-unlabeled instances to train a joint-objective LSTM. On the CoNLL-2012
English section, the proposed semi-supervised training with 1% and 10%
SRL-labeled data and varying amounts of SRL-unlabeled data achieves +1.58 and
+0.78 F1, respectively, over pre-trained models trained on a state-of-the-art
architecture with ELMo on the same SRL-labeled data. Additionally, by applying
the syntactic-inconsistency loss at inference time, the proposed model achieves
+3.67 and +2.1 F1 over the pre-trained model on 1% and 10% SRL-labeled data,
respectively.
Comment: EMNLP 201
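The joint objective described above can be rendered as a minimal sketch. This is illustrative only, not the paper's implementation: the span-score representation, the function names, and the weighting factor `lam` are all assumptions.

```python
def syntactic_inconsistency_loss(span_scores, syntactic_spans):
    """Sum of model scores assigned to predicted argument spans that are
    not syntactic constituents (illustrative penalty formulation)."""
    return sum(score for span, score in span_scores.items()
               if span not in syntactic_spans)

def joint_objective(supervised_loss, span_scores, syntactic_spans, lam=0.5):
    """Supervised SRL loss plus a weighted syntactic-inconsistency penalty.
    The penalty term needs no gold role labels, so it can also be computed
    on SRL-unlabeled instances."""
    return supervised_loss + lam * syntactic_inconsistency_loss(
        span_scores, syntactic_spans)

# Example: two of three predicted argument spans are syntactic constituents,
# so only the score of span (1, 4) is penalized.
scores = {(0, 2): 0.7, (3, 5): 0.2, (1, 4): 0.1}
constituents = {(0, 2), (3, 5)}
print(joint_objective(1.0, scores, constituents))  # 1.05
```

The design point is that the inconsistency term acts as a soft constraint: it discourages, rather than forbids, argument spans that cross constituent boundaries.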
From Natural Language Specifications to Program Input Parsers
We present a method for automatically generating input parsers from English specifications of input file formats. We use a Bayesian generative model to capture relevant natural language phenomena and translate the English specification into a specification tree, which is then translated into a C++ input parser. We model the problem as a joint dependency parsing and semantic role labeling task. Our method is based on two sources of information: (1) the correlation between the text and the specification tree and (2) noisy supervision as determined by the success of the generated C++ parser in reading input examples.
Our results show that our approach achieves 80.0% F-score accuracy, compared to an F-score of 66.7% produced by a state-of-the-art semantic parser, on a dataset of input format specifications from the ACM International Collegiate Programming Contest (which were written in English for humans, with no intention of supporting automated processing).
National Science Foundation (U.S.) (Grant IIS-0835652); Battelle Memorial Institute (PO #300662
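The second supervision signal, the success of the generated parser in reading input examples, can be sketched as a simple scoring loop. This is a hedged illustration: `parse_fn` standing in for a generated parser is an assumption, since the paper's method emits C++ parsers rather than Python callables.

```python
def noisy_supervision_score(parse_fn, examples):
    """Fraction of input examples the candidate parser reads without error.
    This is a noisy signal for ranking candidate specification trees: a tree
    whose generated parser reads more examples is more likely correct."""
    ok = 0
    for ex in examples:
        try:
            parse_fn(ex)   # stand-in for running the generated C++ parser
            ok += 1
        except Exception:
            pass
    return ok / len(examples)

# Example: a candidate "parser" that expects a single integer per input.
print(noisy_supervision_score(int, ["1", "2", "x", "3"]))  # 0.75
```

The signal is noisy because a wrong parser may still happen to read some inputs, which is why it is combined with the text-to-tree correlation model rather than used alone.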