10,371 research outputs found
Acquiring Word-Meaning Mappings for Natural Language Interfaces
This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted
Examples), that acquires a semantic lexicon from a corpus of sentences paired
with semantic representations. The lexicon learned consists of phrases paired
with meaning representations. WOLFIE is part of an integrated system that
learns to transform sentences into representations such as logical database
queries. Experimental results are presented demonstrating WOLFIE's ability to
learn useful lexicons for a database interface in four different natural
languages. The usefulness of the lexicons learned by WOLFIE are compared to
those acquired by a similar system, with results favorable to WOLFIE. A second
set of experiments demonstrates WOLFIE's ability to scale to larger and more
difficult, albeit artificially generated, corpora. In natural language
acquisition, it is difficult to gather the annotated data needed for supervised
learning; however, unannotated data is fairly plentiful. Active learning
methods attempt to select for annotation and training only the most informative
examples, and therefore are potentially very useful in natural language
applications. However, most results to date for active learning have only
considered standard classification tasks. To reduce annotation effort while
maintaining accuracy, we apply active learning to semantic lexicons. We show
that active learning can significantly reduce the number of annotated examples
required to achieve a given level of performance
Irish treebanking and parsing: a preliminary evaluation
Language resources are essential for linguistic research and the development of NLP applications. Low- density languages, such as Irish, therefore lack significant research in this area. This paper describes the early stages in the development of new language resources for Irish – namely the first Irish dependency treebank and the first Irish statistical dependency parser. We present the methodology behind building our new treebank and the steps we take to leverage upon the few existing resources. We discuss language specific choices made when defining our dependency labelling scheme, and describe interesting Irish language characteristics such as prepositional attachment, copula and clefting. We manually develop a small treebank of 300 sentences based on an existing POS-tagged corpus and report an inter-annotator agreement of 0.7902. We train MaltParser to achieve preliminary parsing results for Irish and describe a bootstrapping approach for further stages of development
- …