    Lexicon Acquisition with and for Symbolic NLP-Systems -- a Bootstrapping Approach

    We present a method of applying a broad-coverage LFG grammar of German in the process of semi-automatic lexicon acquisition from corpora. The identification of corpus instances that illustrate a certain subcategorization frame uniquely is done by a comparison of the numbers of analyses the grammar assigns to the corpus instances, under the assumption of different hypothetical lexicon entries for the candidate verb. Filtering conditions expressed on the feature representation output by the grammar further restrict the sentences that the automatic extraction step is based on. Experiments show that the grammar-based method produces better results than a method based on patterns in a corpus query language.

    1. Background

    This paper reports ongoing research activities in methods for semi-automatic lexicon acquisition from corpora (cf. also Eckle and Heid, 1996; Eckle-Kohler, 1998). As a test application, the lexical resources constructed with various methods are being used in a broad-coverage LFG grammar of German under development at the IMS Stuttgart. With the method reported in this paper, a bootstrapping cycle is closed: the lexical resources are no longer just applied in the LFG grammar, but application of the grammar also feeds back into the construction of further resources. The grammar development activities are part of a research project on grammar engineering and the Parallel Grammar Project (in collaboration with Xerox PARC and the Xero