987 research outputs found
Evaluating Scoped Meaning Representations
Semantic parsing offers many opportunities to improve natural language
understanding. We present a semantically annotated parallel corpus for English,
German, Italian, and Dutch where sentences are aligned with scoped meaning
representations in order to capture the semantics of negation, modals,
quantification, and presupposition triggers. The semantic formalism is based on
Discourse Representation Theory, but concepts are represented by WordNet
synsets and thematic roles by VerbNet relations. Translating scoped meaning
representations to sets of clauses enables us to compare them for the purpose
of semantic parser evaluation and checking translations. This is done by
computing precision and recall on matching clauses, in a similar way as is done
for Abstract Meaning Representations. We show that our matching tool for
evaluating scoped meaning representations is both accurate and efficient.
Applying this matching tool to three baseline semantic parsers yields F-scores
between 43% and 54%. A pilot study is performed to automatically find changes
in meaning by comparing meaning representations of translations. This
comparison turns out to be an additional way of (i) finding annotation mistakes
and (ii) finding instances where our semantic analysis needs to be improved.Comment: Camera-ready for LREC 201
Matching Natural Language Sentences with Hierarchical Sentence Factorization
Semantic matching of natural language sentences or identifying the
relationship between two sentences is a core research problem underlying many
natural language tasks. Depending on whether training data is available, prior
research has proposed both unsupervised distance-based schemes and supervised
deep learning schemes for sentence matching. However, previous approaches
either omit or fail to fully utilize the ordered, hierarchical, and flexible
structures of language objects, as well as the interactions between them. In
this paper, we propose Hierarchical Sentence Factorization---a technique to
factorize a sentence into a hierarchical representation, with the components at
each different scale reordered into a "predicate-argument" form. The proposed
sentence factorization technique leads to the invention of: 1) a new
unsupervised distance metric which calculates the semantic distance between a
pair of text snippets by solving a penalized optimal transport problem while
preserving the logical relationship of words in the reordered sentences, and 2)
new multi-scale deep learning models for supervised semantic training, based on
factorized sentence hierarchies. We apply our techniques to text-pair
similarity estimation and text-pair relationship classification tasks, based on
multiple datasets such as STSbenchmark, the Microsoft Research paraphrase
identification (MSRP) dataset, the SICK dataset, etc. Extensive experiments
show that the proposed hierarchical sentence factorization can be used to
significantly improve the performance of existing unsupervised distance-based
metrics as well as multiple supervised deep learning models based on the
convolutional neural network (CNN) and long short-term memory (LSTM).Comment: Accepted by WWW 2018, 10 page
Content Differences in Syntactic and Semantic Representations
Syntactic analysis plays an important role in semantic parsing, but the
nature of this role remains a topic of ongoing debate. The debate has been
constrained by the scarcity of empirical comparative studies between syntactic
and semantic schemes, which hinders the development of parsing methods informed
by the details of target schemes and constructions. We target this gap, and
take Universal Dependencies (UD) and UCCA as a test case. After abstracting
away from differences of convention or formalism, we find that most content
divergences can be ascribed to: (1) UCCA's distinction between a Scene and a
non-Scene; (2) UCCA's distinction between primary relations, secondary ones and
participants; (3) different treatment of multi-word expressions, and (4)
different treatment of inter-clause linkage. We further discuss the long tail
of cases where the two schemes take markedly different approaches. Finally, we
show that the proposed comparison methodology can be used for fine-grained
evaluation of UCCA parsing, highlighting both challenges and potential sources
for improvement. The substantial differences between the schemes suggest that
semantic parsers are likely to benefit downstream text understanding
applications beyond their syntactic counterparts.Comment: NAACL-HLT 2019 camera read
A Type-coherent, Expressive Representation as an Initial Step to Language Understanding
A growing interest in tasks involving language understanding by the NLP
community has led to the need for effective semantic parsing and inference.
Modern NLP systems use semantic representations that do not quite fulfill the
nuanced needs for language understanding: adequately modeling language
semantics, enabling general inferences, and being accurately recoverable. This
document describes underspecified logical forms (ULF) for Episodic Logic (EL),
which is an initial form for a semantic representation that balances these
needs. ULFs fully resolve the semantic type structure while leaving issues such
as quantifier scope, word sense, and anaphora unresolved; they provide a
starting point for further resolution into EL, and enable certain structural
inferences without further resolution. This document also presents preliminary
results of creating a hand-annotated corpus of ULFs for the purpose of training
a precise ULF parser, showing a three-person pairwise interannotator agreement
of 0.88 on confident annotations. We hypothesize that a divide-and-conquer
approach to semantic parsing starting with derivation of ULFs will lead to
semantic analyses that do justice to subtle aspects of linguistic meaning, and
will enable construction of more accurate semantic parsers.Comment: Accepted for publication at The 13th International Conference on
Computational Semantics (IWCS 2019
- …