    Towards Named Entity Extraction and Translation in Spoken Language Translation

    In this paper we propose a new method for detecting and translating named entities (NEs) in spoken language, e.g., Chinese broadcast news. The approach uses confidence measures to detect possible NE regions among less reliably recognized hypotheses. Each possible NE boundary within a region is compared with candidate NEs from retrieved documents on the basis of acoustic similarity and semantic correlation. The candidate NEs are then re-ranked by additionally incorporating general and topic-specific language models, which measure NE context consistency. Combined with HMM-based NE extraction on confidently recognized words, this approach improves the NE extraction F-score from 66% to 71% and NE translation quality from 69% to 73% over the baseline method. Systematic comparisons of NE translation quality across different levels of speech input quality are also presented.
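The re-ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the scores are combined, so the weighted linear combination below is an assumption, as are the `Candidate` fields, the `score` and `rerank` names, the weights, and the example values. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate NE drawn from retrieved documents (all fields hypothetical)."""
    text: str
    acoustic_sim: float   # similarity to the recognized region, assumed in [0, 1]
    semantic_corr: float  # correlation with the surrounding context, assumed in [0, 1]
    lm_general: float     # log-probability under a general language model
    lm_topic: float       # log-probability under a topic-specific language model


def score(c: Candidate, w_ac: float = 1.0, w_sem: float = 1.0,
          w_gen: float = 0.5, w_topic: float = 0.5) -> float:
    # Assumed combination: a weighted sum of the four evidence sources.
    return (w_ac * c.acoustic_sim + w_sem * c.semantic_corr
            + w_gen * c.lm_general + w_topic * c.lm_topic)


def rerank(candidates: list[Candidate], **weights: float) -> list[Candidate]:
    # Highest combined score first.
    return sorted(candidates, key=lambda c: score(c, **weights), reverse=True)


if __name__ == "__main__":
    cands = [
        Candidate("A", acoustic_sim=0.9, semantic_corr=0.2, lm_general=-3.0, lm_topic=-2.0),
        Candidate("B", acoustic_sim=0.7, semantic_corr=0.8, lm_general=-2.0, lm_topic=-1.0),
    ]
    for c in rerank(cands):
        print(c.text, round(score(c), 3))
```

In this toy input, candidate "A" matches better acoustically, but "B" ranks first once the semantic and language-model evidence is included, which is the effect the re-ranking is meant to capture.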