Towards Universal Semantic Tagging
The paper proposes the task of universal semantic tagging---tagging word
tokens with language-neutral, semantically informative tags. We argue that the
task, with its independent nature, contributes to better semantic analysis for
wide-coverage multilingual text. We present the initial version of the semantic
tagset and show that (a) the tags provide semantically fine-grained
information, and (b) they are suitable for cross-lingual semantic parsing. An
application of the semantic tagging in the Parallel Meaning Bank supports both
of these points as the tags contribute to formal lexical semantics and their
cross-lingual projection. As a part of the application, we annotate a small
corpus with the semantic tags and present new baseline results for universal
semantic tagging.
Comment: 9 pages, International Conference on Computational Semantics (IWCS)
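To make the task concrete, here is a minimal, purely illustrative Python sketch of what sem-tagged output looks like: a sentence represented as a sequence of (token, semantic tag) pairs. The tag names below are simplified stand-ins, not the tagset proposed in the paper.

```python
# Illustrative only: the tags are invented stand-ins, not the official
# universal semantic tagset. A sem-tagged sentence is a sequence of
# (token, semantic_tag) pairs with language-neutral, fine-grained tags.
tagged_sentence = [
    ("Nobody",  "NEG_QUANT"),   # negated quantifier
    ("saw",     "PAST_EVENT"),  # past-tense eventuality
    ("the",     "DEFINITE"),    # definite determiner
    ("comet",   "CONCEPT"),     # ordinary concept word
    (".",       "NIL"),         # semantically empty token
]

for token, tag in tagged_sentence:
    print(f"{token}\t{tag}")
```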
Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations
We evaluate the character-level translation method for neural semantic
parsing on a large corpus of sentences annotated with Abstract Meaning
Representations (AMRs). Using a sequence-to-sequence model, and some trivial
preprocessing and postprocessing of AMRs, we obtain a baseline accuracy of 53.1
(F-score on AMR-triples). We examine five different approaches to improve this
baseline result: (i) reordering AMR branches to match the word order of the
input sentence increases performance to 58.3; (ii) adding part-of-speech tags
(automatically produced) to the input also shows improvement (57.2); (iii)
introducing super characters (conflating frequent character sequences into a
single character) likewise helps, reaching 57.4; (iv) optimizing the training
process by using pre-training and averaging a set of models increases
performance to 58.7; (v) adding silver-standard training data obtained by an
off-the-shelf parser yields the biggest improvement, resulting in an F-score of
64.0. Combining all five techniques leads to an F-score of 71.0 on holdout
data, which is state-of-the-art in AMR parsing. This is remarkable because of
the relative simplicity of the approach.
Comment: Camera-ready for CLIN 2017 journal
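Of the five techniques, the super-character preprocessing (iii) is the easiest to sketch. Below is a minimal Python illustration that treats the most frequent character trigrams as single private-use code points; the fixed n-gram length and top-k selection are our assumptions, not the authors' exact procedure.

```python
from collections import Counter

def find_super_characters(corpus, ngram_len=3, top_k=50):
    """Pick the most frequent character n-grams to treat as 'super characters'."""
    counts = Counter()
    for sentence in corpus:
        for i in range(len(sentence) - ngram_len + 1):
            counts[sentence[i:i + ngram_len]] += 1
    return [ngram for ngram, _ in counts.most_common(top_k)]

def apply_super_characters(sentence, super_chars, offset=0xE000):
    """Replace each super character with a single private-use Unicode code point."""
    for idx, ngram in enumerate(super_chars):
        sentence = sentence.replace(ngram, chr(offset + idx))
    return sentence

corpus = ["the boy wants to visit new york city",
          "the girl wants to go to the city"]
super_chars = find_super_characters(corpus)
encoded = apply_super_characters(corpus[0], super_chars)
print(encoded)                      # shorter sequence for the seq2seq model
print(len(corpus[0]), len(encoded))
```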
The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing
We evaluate a semantic parser based on a character-based sequence-to-sequence
model in the context of the SemEval-2017 shared task on semantic parsing for
AMRs. With data augmentation, super characters, and POS-tagging we gain major
improvements in performance compared to a baseline character-level model.
Although we improve on previous character-based neural semantic parsing models,
the overall accuracy is still lower than a state-of-the-art AMR parser. An
ensemble combining our neural semantic parser with an existing, traditional
parser yields a small gain in performance.
Comment: To appear in Proceedings of SemEval 2017 (camera-ready)
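One plausible way to feed POS information to a character-level model, as a rough sketch: interleave each word's characters with its POS tag kept as a single extra symbol. The marker and injection scheme below are assumptions for illustration, not the system's actual input format.

```python
def char_input_with_pos(tokens_with_pos, pos_marker="+"):
    """Turn (word, POS) pairs into a character sequence for a seq2seq encoder,
    inserting the POS tag as one extra atomic symbol after each word."""
    symbols = []
    for word, pos in tokens_with_pos:
        symbols.extend(list(word))          # ordinary characters
        symbols.append(pos_marker + pos)    # POS tag kept as a single symbol
        symbols.append(" ")                 # word boundary
    return symbols[:-1]

# POS tags here are hard-coded for illustration; in practice they would come
# from an automatic tagger.
sentence = [("The", "DT"), ("boy", "NN"), ("sleeps", "VBZ")]
print(char_input_with_pos(sentence))
# ['T', 'h', 'e', '+DT', ' ', 'b', 'o', 'y', '+NN', ' ',
#  's', 'l', 'e', 'e', 'p', 's', '+VBZ']
```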
Semantic Tagging with Deep Residual Networks
We propose a novel semantic tagging task, sem-tagging, tailored for the
purpose of multilingual semantic parsing, and present the first tagger using
deep residual networks (ResNets). Our tagger uses both word and character
representations and includes a novel residual bypass architecture. We evaluate
the tagset both intrinsically, on the new task of semantic tagging, and on
Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an
auxiliary loss function predicting our semantic tags, significantly outperforms
prior results on English Universal Dependencies POS tagging (95.71% accuracy on
UD v1.2 and 95.67% accuracy on UD v1.3).
Comment: COLING 2016, camera-ready version
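A minimal PyTorch sketch of the two ingredients named here, a residual block with a bypass connection and a tagger trained with an auxiliary semantic-tag loss, is given below. Layer types, dimensions and the loss combination are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """1-D convolutional block whose input is added back to its output (the bypass)."""
    def __init__(self, dim):
        super().__init__()
        self.conv1 = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.conv2 = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):                 # x: (batch, dim, seq_len)
        h = self.relu(self.conv1(x))
        h = self.conv2(h)
        return self.relu(x + h)           # residual bypass

class Tagger(nn.Module):
    """Shared encoder with a main POS head and an auxiliary semantic-tag head."""
    def __init__(self, vocab, dim, n_pos, n_semtags, n_blocks=3):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.blocks = nn.Sequential(*[ResidualBlock(dim) for _ in range(n_blocks)])
        self.pos_head = nn.Linear(dim, n_pos)
        self.sem_head = nn.Linear(dim, n_semtags)  # auxiliary objective

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, dim, seq_len)
        h = self.blocks(x).transpose(1, 2)         # (batch, seq_len, dim)
        return self.pos_head(h), self.sem_head(h)

# Training would combine both losses, e.g.:
#   loss = ce(pos_logits, pos_gold) + lambda_aux * ce(sem_logits, sem_gold)
```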
Applying automated deduction to natural language understanding
Very few natural language understanding applications employ methods from automated deduction. This is mainly because (i) a high level of interdisciplinary knowledge is required, (ii) there is a huge gap between formal semantic theory and practical implementation, and (iii) statistical rather than symbolic approaches dominate the current trends in natural language processing. Moreover, abduction rather than deduction is generally viewed as a promising way to apply reasoning in natural language understanding. We describe three applications where we show how first-order theorem proving and finite model construction can efficiently be employed in language understanding.

The first is a text understanding system building semantic representations of texts, developed in the late 1990s. Theorem provers are here used to signal inconsistent interpretations and to check whether new contributions to the discourse are informative or not. This application shows that it is feasible to use general-purpose theorem provers for first-order logic, and that it pays off to use a battery of different inference engines, as in practice they complement each other in terms of performance.

The second application is a spoken-dialogue interface to a mobile robot and an automated home. We use the first-order theorem prover SPASS for checking inconsistencies and newness of information, but the inference tasks are complemented with the finite model builder MACE, used in parallel to the prover. The model builder is used to check for satisfiability of the input; in addition, the produced finite and minimal models are used to determine the actions that the robot or automated house has to execute. When the semantic representation of the dialogue as well as the number of objects in the context are kept fairly small, response times are acceptable to human users.

The third demonstration of successful use of first-order inference engines comes from the task of recognising entailment between two (short) texts. We run a robust parser producing semantic representations for both texts, and use the theorem prover Vampire to check whether one text entails the other. For many examples it is hard to compute the appropriate background knowledge in order to produce a proof, and the model builders MACE and Paradox are used to estimate the likelihood of an entailment.
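The two prover calls behind the consistency and informativeness checks can be sketched as follows. The prove callable is a hypothetical stand-in for invoking an off-the-shelf first-order prover such as SPASS or Vampire; the formula encoding and the toy demonstration are purely illustrative.

```python
def check_contribution(context, new_info, prove):
    """Classify a new discourse contribution against the context.

    `prove(premises, goal)` must return True iff the premises entail the goal;
    in the system described above it would call a first-order theorem prover
    such as SPASS or Vampire (hypothetical wrapper, not shown here)."""
    if prove(context, f"not ({new_info})"):
        return "inconsistent"          # context entails the negation
    if prove(context, new_info):
        return "uninformative"         # context already entails it
    return "consistent and informative"

# Toy demonstration with a trivial 'prover' that only knows literal matches:
def toy_prove(premises, goal):
    return goal in premises

print(check_contribution(["robot(r1)", "in(r1, kitchen)"],
                         "in(r1, kitchen)", toy_prove))   # -> 'uninformative'

# In the dialogue system, a model builder (MACE) would run in parallel on
# context + new_info: a finite, minimal model both witnesses satisfiability
# and supplies the objects/actions the robot or automated home executes.
```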
Predicate logic unplugged
In this paper we describe the syntax and semantics of a description language for underspecified semantic representations. This concept is discussed in general and in particular applied to Predicate Logic and Discourse Representation Theory. The reason for exploring underspecified representations as suitable semantic representations for natural language expressions emerges directly from practical natural language processing applications. The so-called Combinatorial Explosion Puzzle, a well-known problem in this area, can successfully be tackled by using underspecified representations. The source of this problem, scopal ambiguities in natural language expressions, is discussed in section 2. The core of the paper presents Hole Semantics. This is a general proposal for a framework, in principle suitable for any logic, where underspecified representations play a central role. There is a clear separation between the object language (the logical language one is interested in) and the meta language (the language that describes and interprets underspecified structures). It has been noted by various authors that the meaning of an underspecified semantic representation cannot be expressed in terms of a disjunction of denotations, but rather as a set of denotations (cf. Poesio 1994). We support this view, and use it as the underlying principle for the definition of the semantic interpretation function of underspecified structures. Section 3 is an informal introduction to Hole Semantics, and in section 4 things are formally defined. In section 5 we apply Hole Semantics to Predicate Logic, resulting in an "unplugged" version of (static and dynamic) Predicate Logic. In section 6 we show that this idea easily carries over to Discourse Representation Structures. A lot of attention has been paid..
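The central mechanism, pluggings of holes with labelled fragments filtered by dominance constraints, can be illustrated with a toy Python sketch of the classic two-reading sentence "Every man loves a woman". The encoding of fragments, holes and constraints below is our own simplification, not the formal definitions of sections 3 and 4.

```python
from itertools import permutations

# Toy underspecified representation of "Every man loves a woman":
# three labelled fragments, each listing the holes it contains.
fragments = {
    "l1": {"formula": "every(x, man(x), h1)", "holes": ["h1"]},
    "l2": {"formula": "some(y, woman(y), h2)", "holes": ["h2"]},
    "l3": {"formula": "love(x, y)", "holes": []},
}
holes = ["h0", "h1", "h2"]     # h0 is the top hole
# Dominance constraints: (label, hole) means the label must end up below the hole.
constraints = [("l1", "h0"), ("l2", "h0"), ("l3", "h1"), ("l3", "h2")]

def labels_below(hole, plugging):
    """Labels reachable from a hole once every hole is plugged with a label."""
    reached, agenda, seen = set(), [hole], set()
    while agenda:
        h = agenda.pop()
        if h in seen:
            continue
        seen.add(h)
        label = plugging[h]
        reached.add(label)
        agenda.extend(fragments[label]["holes"])
    return reached

readings = []
for perm in permutations(fragments):            # bijections holes -> labels
    plugging = dict(zip(holes, perm))
    if all(lab in labels_below(h, plugging) for lab, h in constraints):
        readings.append(plugging)

for p in readings:
    print(p)   # two pluggings: the 'every > some' and 'some > every' readings
```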
A semantically annotated corpus of tombstone inscriptions
The digital preservation of funerary material is of interest to many different scientific disciplines. Textual information found on tombstones often goes far beyond the expected (name of the deceased, dates of birth and death), and may include information about commemorators, family roles, occupations, references to biblical or other texts, places of birth and death, cause of death, epitaphs and poems. Gravestones are multi-modal media, and besides text are often decorated with artistic symbols. To capture this information in a systematic way and make it available on a large scale for research purposes, a meaning representation based on linking entities by relations has been designed that will extend search capabilities beyond simple string matches. Concepts are represented as WordNet synsets, and a vocabulary of 32 relations makes connections between concepts. This formalisation has been developed and evaluated based on a dataset of more than 1,000 Dutch tombstones.
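As a rough illustration of the relation-based encoding, the snippet below represents a fictional inscription as entities linked by relation triples, with concepts given as WordNet synset identifiers. The relation names and the inscription are invented for illustration; the actual vocabulary comprises 32 relations.

```python
# Hypothetical encoding of a fictional inscription:
#   "Here rests Anna, beloved daughter of Jan, born 1854, died 1901."
# Concepts point to WordNet synset identifiers; relation names are invented
# stand-ins for the 32-relation vocabulary described in the abstract.
entities = {
    "x1": {"name": "Anna", "concept": "daughter.n.01"},
    "x2": {"name": "Jan",  "concept": "parent.n.01"},
}
triples = [
    ("x1", "Role",        "daughter.n.01"),
    ("x1", "ChildOf",     "x2"),
    ("x1", "YearOfBirth", "1854"),
    ("x1", "YearOfDeath", "1901"),
]

# A relation-based encoding supports structured queries that plain string
# matching cannot, e.g. "find all tombstones commemorating a daughter":
matches = [s for s, rel, o in triples if rel == "Role" and o == "daughter.n.01"]
print([entities[e]["name"] for e in matches])   # ['Anna']
```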
The Sequence Notation: Catching Complex Meanings in Simple Graphs
Current symbolic semantic representations proposed to capture the semantics of human language have served well to give us insight into how meaning is expressed. But they are either too complicated for large-scale annotation tasks or lack expressive power to play a role in inference tasks. What we propose is a meaning representation system that is interlingual, model-theoretic, and variable-free. It divides the labour involved in representing meaning along three levels: concepts, roles, and contexts. As natural languages are expressed as sequences of phonemes or words, the meaning representations that we propose are likewise sequential. However, the resulting meaning representations can also be visualised as directed acyclic graphs.
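As a purely illustrative sketch of the sequence-to-graph idea, the snippet below reads a flat, variable-free sequence of concept and role items and recovers a small directed acyclic graph from it. The token format is invented for illustration and is not the paper's actual notation.

```python
# Invented toy format, not the paper's notation: a meaning is a flat sequence
# of items; a concept item introduces a node, and a role item 'Role -k' draws
# a labelled edge from the most recent concept to a concept introduced earlier.
# Because edges only ever point backwards, the result is a directed acyclic graph.
sequence = ["boy.n.01", "sleep.v.01", "Agent -1"]

nodes, edges = [], []
for item in sequence:
    parts = item.split()
    if len(parts) == 1:                      # a concept introduces a node
        nodes.append(parts[0])
    else:                                    # a role links two existing nodes
        role, offset = parts[0], int(parts[1])
        edges.append((nodes[-1], role, nodes[-1 + offset]))

print(nodes)   # ['boy.n.01', 'sleep.v.01']
print(edges)   # [('sleep.v.01', 'Agent', 'boy.n.01')]
```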