
    Context-driven natural language interpretation

    Refining Implicit Argument Annotation for UCCA

    Predicate-argument structure analysis is a central component in meaning representations of text. The fact that some arguments are not explicitly mentioned in a sentence gives rise to ambiguity in language understanding, and makes it difficult for machines to interpret text correctly. However, only a few resources represent implicit roles for NLU, and existing NLP studies make only coarse distinctions between categories of arguments omitted from linguistic form. This paper proposes a typology for fine-grained implicit argument annotation on top of the foundational layer of Universal Conceptual Cognitive Annotation (UCCA). The proposed implicit argument categorisation is driven by theories of implicit role interpretation and consists of six types: Deictic, Generic, Genre-based, Type-identifiable, Non-specific, and Iterated-set. We exemplify our design by revisiting part of the UCCA EWT corpus, providing a new dataset annotated with the refinement layer, and making a comparative analysis with other schemes. Comment: DMR 202

    Irish treebanking and parsing: a preliminary evaluation

    Language resources are essential for linguistic research and the development of NLP applications. Low-density languages, such as Irish, have few such resources and therefore lack significant research in this area. This paper describes the early stages in the development of new language resources for Irish, namely the first Irish dependency treebank and the first Irish statistical dependency parser. We present the methodology behind building our new treebank and the steps we take to leverage the few existing resources. We discuss language-specific choices made when defining our dependency labelling scheme, and describe interesting Irish language characteristics such as prepositional attachment, the copula, and clefting. We manually develop a small treebank of 300 sentences based on an existing POS-tagged corpus and report an inter-annotator agreement of 0.7902. We train MaltParser to achieve preliminary parsing results for Irish and describe a bootstrapping approach for further stages of development.

    Combining ontologies and neural networks for analyzing historical language varieties: a case study in Middle Low German

    In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties, using the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time. To the best of our knowledge, this is the first experiment in automatically producing morphosyntactic annotations for Middle Low German, and accordingly, no part-of-speech (POS) tagset is currently agreed upon. In our experiment, we illustrate how ontology-based specifications of projected annotations can be employed to circumvent this issue: instead of training and evaluating against a given tagset, we decompose it into independent features, which are predicted independently by a neural network. The predicted feature probabilities are then decoded into a sound ontological representation using consistency constraints (axioms) from an ontology. Using these representations, we can finally bootstrap a POS tagset capturing only morphosyntactic features that could be reliably predicted. In this way, our approach is capable of optimizing the precision and recall of morphosyntactic annotations simultaneously with bootstrapping a tagset, rather than performing iterative cycles.

    Nodalida 2005 - proceedings of the 15th NODALIDA conference

    References to graphical objects in interactive multimodal queries

    This thesis describes a computational model for interpreting natural language expressions in an interactive multimodal query system integrating both natural language text and graphic displays. The primary concern of the model is to interpret expressions that might involve graphical attributes, and expressions whose referents could be objects on the screen. Graphical objects on the screen are used to visualise entities in the application domain and their attributes (in short, domain entities and domain attributes). This is why graphical objects are treated as descriptions of those domain entities/attributes in the literature. However, graphical objects and their attributes are visible during the interaction, and are thus known by the participants of the interaction. Therefore, they themselves should be part of the mutual knowledge of the interaction. This poses some interesting problems in language processing. As part of the mutual knowledge, graphical attributes could be used in expressions, and graphical objects could be referred to by expressions. In consequence, there could be ambiguities about whether an attribute in an expression belongs to a graphical object or to a domain entity. There could also be ambiguities about whether the referent of an expression is a graphical object or a domain entity. The main contributions of this thesis consist of analysing the above ambiguities, and designing, implementing and testing a computational model and a demonstration system for resolving these ambiguities. Firstly, a structure and corresponding terminology are set up, so these ambiguities can be clarified as ambiguities derived from referring to different databases, the screen or the application domain (source ambiguities). Secondly, a meaning representation language is designed which explicitly represents the information about which database an attribute/entity comes from. Several linguistic regularities inside and among referring expressions are described so that they can be used as heuristics in the ambiguity resolution. Thirdly, a computational model based on constraint satisfaction is constructed to resolve simultaneously some reference ambiguities and source ambiguities. Then, a demonstration system integrating natural language text and graphics is implemented, whose core is the computational model. This thesis ends with an evaluation of the computational model. It provides some concrete evidence about the advantages and disadvantages of the above approach.

    Proceedings

    Proceedings of the NODALIDA 2011 Workshop Constraint Grammar Applications. Editors: Eckhard Bick, Kristin Hagen, Kaili Müürisep, Trond Trosterud. NEALT Proceedings Series, Vol. 14 (2011), vi+69 pp. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/19231