Learning grammars for noun phrase extraction by partition search

By Anja Belz

Abstract

This paper describes an application of Grammar Learning by Partition Search to noun phrase extraction, an essential task in information extraction and many other NLP applications. Grammar Learning by Partition Search is a general method for automatically constructing grammars for a range of parsing tasks; it constructs an optimised probabilistic context-free grammar by searching a space of nonterminal set partitions, looking for a partition that maximises parsing performance and minimises grammar size. The idea is that the considerable time and cost involved in building new grammars can be avoided if instead existing grammars can be automatically adapted to new parsing tasks and new domains. This paper presents results for applying Partition Search to the tasks of (i) identifying flat NP chunks, and (ii) identifying all NPs in a text. For NP chunking, Partition Search improves a general baseline result by 12.7%, and a method-specific baseline by 2.2%. For NP identification, Partition Search improves the general baseline by 21.45%, and the method-specific one by 3.48%. Even though the grammars are nonlexicalised, results for NP identification closely match the best existing results for lexicalised approaches.

Topics: Q100 Linguistics
Publisher: John Benjamins Publishing Company
Year: 2002
OAI identifier: oai:eprints.brighton.ac.uk:3206
