8 research outputs found

    Cross-lingual Semantic Parsing with Categorial Grammars

    Humans communicate using natural language. We need to make sure that computers can understand us so that they can act on our spoken commands or independently gain new insights from knowledge that is written down as text. A “semantic parser” is a program that translates natural-language sentences into computer commands or logical formulas: something a computer can work with. Despite much recent progress on semantic parsing, most research focuses on English, and semantic parsers for other languages cannot keep up with these developments. My thesis aims to help close this gap. It investigates “cross-lingual learning”: methods by which a computer can automatically adapt a semantic parser to another language, such as Dutch. The computer learns by looking at example sentences and their translations, e.g., “She likes to read books”/“Ze leest graag boeken”. Even with many such examples, learning which word means what and how word meanings combine into sentence meanings is a challenge, because translations are rarely word-for-word: they exhibit grammatical differences and non-literalities. My thesis presents a method for tackling these challenges based on the grammar formalism Combinatory Categorial Grammar (CCG). It shows that CCG is a suitable formalism for this purpose, that many structural differences between sentences and their translations can be dealt with in this framework, and that a (rudimentary) semantic parser for Dutch can be learned cross-lingually from one for English. I also investigate methods for building large corpora of texts annotated with logical formulas to further study and improve semantic parsers.
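    The way word meanings combine into sentence meanings in CCG can be illustrated with its core operation, function application. The sketch below is a toy illustration only: the mini-lexicon, the string encoding of categories, and the naive parenthesis handling are my own simplifying assumptions, not the thesis's implementation.

```python
# Toy CCG: categories as strings, combined by forward/backward application,
# searched with a CKY-style chart. Illustrative only.

def strip_parens(cat):
    """Remove one pair of outer parentheses.
    Naive: assumes no category like (A)/(B) where stripping breaks balance."""
    if cat.startswith("(") and cat.endswith(")"):
        return cat[1:-1]
    return cat

def forward(left, right):
    """Forward application: X/Y  Y  =>  X."""
    if "/" in left:
        x, y = left.rsplit("/", 1)
        if strip_parens(y) == right:
            return strip_parens(x)
    return None

def backward(left, right):
    """Backward application: Y  X\\Y  =>  X."""
    if "\\" in right:
        x, y = strip_parens(right).rsplit("\\", 1)
        if y == left:
            return strip_parens(x)
    return None

# Hypothetical mini-lexicon (not from the thesis).
LEXICON = {
    "she": {"NP"},
    "reads": {"(S\\NP)/NP"},  # transitive verb: takes NP on the right, then NP on the left
    "books": {"NP"},
}

def parse(tokens):
    """CKY chart over the two application rules; returns the set of
    categories spanning the whole sentence (S on success)."""
    n = len(tokens)
    chart = {(i, i + 1): set(LEXICON[t]) for i, t in enumerate(tokens)}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):
                for l in chart[(i, k)]:
                    for r in chart[(k, j)]:
                        for res in (forward(l, r), backward(l, r)):
                            if res:
                                cell.add(res)
            chart[(i, j)] = cell
    return chart[(0, n)]

print(parse("she reads books".split()))  # {'S'}
```

Here "reads" first consumes "books" by forward application, yielding S\NP, which then consumes "she" by backward application, yielding S; cross-lingual learning must discover such lexical categories for the target language from translation pairs alone.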

    Projection in discourse: A data-driven formal semantic analysis

    A sentence like "Bertrand, a famous linguist, wrote a book" contains different contributions: there is a person named "Bertrand", he is a famous linguist, and he wrote a book. These contributions convey different types of information; while the existence of Bertrand is presented as given information (it is presupposed), the other contributions signal new information. Moreover, the contributions are affected differently by linguistic constructions. The inference that Bertrand wrote a book disappears when the sentence is negated or turned into interrogative form, while the other contributions survive; this is called 'projection'. In this thesis, I investigate the relation between different types of contributions in a sentence from a theoretical and empirical perspective. I focus on projection phenomena, which include presuppositions ('Bertrand exists' in the aforementioned example) and conventional implicatures ('Bertrand is a famous linguist'). I argue that the differences between the contributions can be explained in terms of information status, which describes how content relates to the unfolding discourse context. Based on this analysis, I extend the widely used formal representational system Discourse Representation Theory (DRT) with an explicit representation of the different contributions made by projection phenomena; this extension is called 'Projective Discourse Representation Theory' (PDRT). I present a data-driven computational analysis based on data from the Groningen Meaning Bank, a corpus of semantically annotated texts. This analysis shows how PDRT can be used to learn more about different kinds of projection behaviour. These results can be used to improve linguistically oriented computational applications such as automatic translation systems.
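    The separation of projected and asserted content can be sketched as a data structure. The representation below is a loose, hypothetical simplification of PDRT: the field names, the integer-pointer convention, and the example conditions are my own illustration, not the thesis's formalism.

```python
# Sketch of a projective DRS: every referent and condition carries a pointer
# to the context in which it is interpreted. Content whose pointer targets
# the global context "projects" past embedding operators such as negation.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PDRS:
    label: int                        # this context's label
    referents: List[Tuple[int, str]]  # (pointer, discourse referent)
    conditions: List[Tuple[int, str]] # (pointer, condition)

# "Bertrand, a famous linguist, did not write a book."
# Context 1 is global; context 2 is the locally embedded (negated) context.
sentence = PDRS(
    label=1,
    referents=[(1, "x"), (2, "y")],
    conditions=[
        (1, "named(x, Bertrand)"),         # presupposition: points to context 1
        (1, "famous_linguist(x)"),         # conventional implicature: points to context 1
        (2, "not(book(y) & write(x, y))"), # asserted content: stays local
    ],
)

def projected(p):
    """Conditions whose pointer equals the global label survive embedding."""
    return [cond for ptr, cond in p.conditions if ptr == p.label]

print(projected(sentence))  # ['named(x, Bertrand)', 'famous_linguist(x)']
```

Under negation the asserted content (pointer 2) is cancelled, while the presupposition and the conventional implicature (pointer 1) still follow; this is the projection behaviour the pointers make explicit.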

    Scope Disambiguation as a Tagging Task

    No full text
