Building a lexical functional grammar for Turkish

Abstract

Large-scale, deep grammars with structurally rich output are basic resources for complex tools in human-computer interaction and also for exploring the linguistic phenomena of a language. In this thesis, we introduce a large-scale grammar for Turkish implemented in the Lexical Functional Grammar formalism. Developing a large-scale grammar requires that several issues be solved, both linguistically and computationally. As the language to be dealt with is Turkish, rich morphological structures play an important role in constructing the basis of the representation. In encoding the rules of the grammar, we follow an approach based on building units that are larger than a morpheme but smaller than a word, in order to explain the linguistic phenomena in a more formal and accurate way. Our implementation covers rules ranging from basic constituents such as adjective, adverbial, or prepositional phrases to more complex types with derivations such as sentential complements, sentential adjuncts, and relative clauses. The noun phrase subgrammar is the core of the system. Other important rules deal with several types of sentence structures, free word order, and coordination. Also, a date-time grammar developed earlier is integrated into our system. Some of the frequently occurring phenomena, such as causatives, passives, noun-verb compounds, and non-canonical objects, are also important from a theoretical perspective. We first examine their linguistic representation and then analyze the details of different types of causatives and non-canonical objects by conducting several tests. We then provide their implementation. To evaluate our grammar, we have experimented with real-world data. Results show that we have reasonably high coverage of noun phrases (85.5%). We have also integrated our system into a tool called LingBrowser.
