Language Identification in the Context of a Human Machine Dialog System

By E. Nöth, S. Harbeck and H. Niemann

Abstract

We present two concepts for systems with language identification in the context of multilingual information retrieval dialogs. The first has an explicit module for language identification. It is based on training a common codebook for all the languages and integrating over the output probabilities of language-specific n-gram models trained on the codebook sequences. The system can decide on a language either after a predefined time interval or as soon as the difference between the probabilities of the languages exceeds a certain threshold. This approach makes it possible to recognize languages that the system cannot process and to play a prerecorded message in that language. In the second approach, the trained recognizers of the languages to be recognized, the lexicons, and the language models are combined into one multilingual recognizer. Because transitions are allowed only between words of the same language, each hypothesized word chain contains words from only one language, and language identification is an implicit by-product of the speech recognizer. First results for both language identification approaches are presented.
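
The decision rule of the first approach (stop after a fixed interval, or earlier once one language is clearly ahead) can be illustrated with a small sketch. This is not the authors' implementation. It assumes bigram models, accumulated log-probabilities as the "integration" step, a floor probability for unseen bigrams, and a comparison of the best against the second-best language. The names identify_language, FLOOR, threshold and max_frames are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical bigram model: (previous_symbol, symbol) -> probability.
Bigram = Dict[Tuple[int, int], float]

FLOOR = 1e-6  # assumed probability for bigrams a model has never seen


def identify_language(
    symbols: List[int],
    models: Dict[str, Bigram],
    threshold: float,
    max_frames: int,
) -> Tuple[str, int]:
    """Return (best language, number of codebook symbols consumed)."""
    scores = {lang: 0.0 for lang in models}
    prev = None
    used = 0
    for sym in symbols[:max_frames]:  # time-out: never look past max_frames
        used += 1
        if prev is not None:
            # Accumulate each language's log-probability for this bigram.
            for lang, model in models.items():
                scores[lang] += math.log(model.get((prev, sym), FLOOR))
        prev = sym
        # Early decision: the leader is ahead of the runner-up by more
        # than the threshold (measured in log-probability).
        ranked = sorted(scores.values(), reverse=True)
        if len(ranked) > 1 and ranked[0] - ranked[1] > threshold:
            break
    best = max(scores, key=scores.get)
    return best, used
```

With a small threshold the function returns as soon as the languages separate; with a large one it falls back to the time-out and reports whichever language leads at that point.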

Year: 1996
OAI identifier: oai:CiteSeerX.psu:10.1.1.18.8263
Provided by: CiteSeerX