Edit Detection and Parsing for Transcribed Speech

By Eugene Charniak and Mark Johnson

Abstract

We present a simple architecture for parsing transcribed speech in which an edited-word detector first removes such words from the sentence string, and then a standard statistical parser trained on transcribed speech parses the remaining words. The edit detector achieves a misclassification rate on edited words of 2.2%. (The NULL-model, which marks everything as not edited, has an error rate of 5.9%.) To evaluate our parsing results we introduce a new evaluation metric, the purpose of which is to make evaluation of a parse tree relatively indifferent to the exact tree position of EDITED nodes. By this metric the parser achieves 85.3% precision and 86.5% recall.
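
The first stage of the architecture described above, together with the NULL-model baseline, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`remove_edited`, `null_model`, `misclassification_rate`), the toy sentence, and its gold labels are all assumptions made for illustration, and the error rate is assumed here to be the fraction of all words whose edited/not-edited label is wrong. The detector itself and the statistical parser are not modeled.

```python
from typing import List, Sequence


def remove_edited(words: Sequence[str], is_edited: Sequence[bool]) -> List[str]:
    """Drop every word flagged as edited, keeping the remaining words in order."""
    if len(words) != len(is_edited):
        raise ValueError("words and flags must have the same length")
    return [w for w, flag in zip(words, is_edited) if not flag]


def null_model(words: Sequence[str]) -> List[bool]:
    """Baseline that marks every word as not edited."""
    return [False] * len(words)


def misclassification_rate(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Fraction of words whose predicted label differs from the gold label
    (assumed definition, computed over all words)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    wrong = sum(p != g for p, g in zip(predicted, gold))
    return wrong / len(gold)


# Hypothetical example: the speaker restarts, so "to Boston" is the edited span.
words = ["I", "want", "a", "flight", "to", "Boston", "to", "Denver"]
gold = [False, False, False, False, True, True, False, False]

# Stage 1: strip the edited words; the result would be passed to the parser.
cleaned = remove_edited(words, gold)
print(cleaned)  # ['I', 'want', 'a', 'flight', 'to', 'Denver']

# The baseline misses both edited words in this toy sentence.
baseline_error = misclassification_rate(null_model(words), gold)
print(baseline_error)  # 0.25
```

Under this assumed definition the NULL-model's error rate equals the proportion of edited words in the data, which is why it serves as a natural baseline for the detector.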

Year: 2001
OAI identifier: oai:CiteSeerX.psu:10.1.1.19.7955
Provided by: CiteSeerX