Structural Representation of pronunciation and its application for classifying Japanese learners of English

Authors: K. Hirose, K. Kamata, N. Minematsu, S. Asakawa, T. Makino

One of the most fundamental unsolved problems in speech recognition is the mismatch problem. Speech systems trained on a specific group of speakers, e.g. adults, do not work well with another group, e.g. children. In the case of CALL (computer-assisted language learning), when a student receives a bad score from a system, it may simply be because the student is an outlier to the system. The problem is that the student cannot know whether this is the case. Recently, a speaker-invariant structural and holistic representation of speech was proposed [1], in which only the interrelations among speech sounds are extracted to form their external structure. Speech variation caused by speaker individuality was modeled mathematically and, based on the model, speaker invariance was guaranteed. This structural representation has already been applied to describe the pronunciations of language learners [2]. Since the non-linguistic factors are well removed, the representation purely shows non-nativeness in the individual pronunciations. In this paper, using the new representation, language learners are automatically classified irrespective of speaker individuality. The same classification is also done by an expert phonetician, and a high correlation is found between the two classifications.
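
The idea of keeping only the interrelations among speech sounds can be illustrated with a small sketch. The abstract does not say which distance measure or which model of speaker variation is used, so everything below is an assumption made for illustration: each speech sound is modeled as a Gaussian distribution, the interrelation between two sounds is measured with the Bhattacharyya distance, and speaker individuality is modeled as an invertible affine transform of the feature space. The function names (`bhattacharyya`, `structure_matrix`, `random_sounds`, `apply_affine`) and the random data are hypothetical. Under these assumptions, the matrix of pairwise distances does not change when the same transform is applied to every sound.

```python
import numpy as np


def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussians (assumed distance measure)."""
    cov = (cov1 + cov2) / 2.0
    diff = mu1 - mu2
    # Mahalanobis-like term using the averaged covariance.
    term1 = diff @ np.linalg.solve(cov, diff) / 8.0
    # Log-determinant term, computed stably with slogdet.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    term2 = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return term1 + term2


def structure_matrix(means, covs):
    """Pairwise distance matrix: the 'structure' of a set of speech sounds."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = bhattacharyya(means[i], covs[i], means[j], covs[j])
            dist[i, j] = dist[j, i] = d
    return dist


def random_sounds(rng, n_sounds, dim):
    """Hypothetical toy data: one Gaussian per speech sound."""
    means = rng.normal(size=(n_sounds, dim))
    covs = []
    for _ in range(n_sounds):
        m = rng.normal(size=(dim, dim))
        covs.append(m @ m.T + dim * np.eye(dim))  # symmetric positive definite
    return means, np.array(covs)


def apply_affine(means, covs, a, b):
    """Map every Gaussian through x -> a @ x + b (assumed speaker variation)."""
    new_means = means @ a.T + b
    new_covs = np.array([a @ c @ a.T for c in covs])
    return new_means, new_covs


rng = np.random.default_rng(0)
means, covs = random_sounds(rng, n_sounds=5, dim=3)

# A "different speaker": the same sounds seen through an affine transform.
a = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)
b = rng.normal(size=3)
other_means, other_covs = apply_affine(means, covs, a, b)

original = structure_matrix(means, covs)
transformed = structure_matrix(other_means, other_covs)
print(np.allclose(original, transformed))
```

In this toy setting the two distance matrices agree up to floating-point error, because both terms of the Bhattacharyya distance are unchanged by an invertible affine map. Any difference that remains between two learners' matrices would then reflect how their sounds are arranged relative to one another rather than who is speaking, which is the property the classification described above relies on.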

CiteSeerX