thesis

Towards a Wide-Coverage Grammar for Swedish Using GF

Abstract

This thesis describes work towards a wide-coverage grammar for parsing and generating Swedish text. We use the dependently typed grammar formalism GF, a functional programming language specialized for describing grammars. The idea is to combine existing language resources with new techniques, with the aim of achieving a parser for unrestricted Swedish. To reach this goal, problems of both a computational and a linguistic nature had to be solved. The work includes the development of the grammar, that is, identifying and formalizing grammatical constructions frequent in Swedish, as well as methods for importing a large-scale lexicon and for evaluating the parser. We present the methods and technologies used and discuss the advantages and problems of using GF for modeling large-scale grammars. We further discuss how our long-term goal can be reached by combining our rule-based grammar with statistical methods. Our contributions are a wide-coverage GF lexicon, a translation of a Swedish treebank into the GF notation, and an extended Swedish grammar implementation. The grammar is based on the multilingual abstract syntax given in the GF resource library, and now also covers constructions specific to Swedish. We further give an example of the advantage of using dependent types when describing grammar and syntax, in this case for dealing with reflexive pronouns.
