Article thumbnail

Faster parsing and supertagging model estimation

By Jonathan K. Kummerfeld, James R. Curran and Jessika Roesner

Abstract

Parsers are often the bottleneck for data acquisition, processing text too slowly to be widely applied. One way to improve the efficiency of parsers is to construct more confident statistical models. More training data would enable the use of more sophisticated features and also provide more evidence for current features, but gold standard annotated data is limited and expensive to produce. We demonstrate faster methods for training a supertagger using hundreds of millions of automatically annotated words, constructing statistical models that further constrain the number of derivations the parser must consider. By introducing new features and using an automatically annotated corpus we are able to double parsing speed on Wikipedia and the Wall Street Journal, and gain accuracy slightly when parsing Sectio

Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.383.866
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://aclweb.org/anthology//U... (external link)
  • Suggested articles


    To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.