Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition

Abstract

We have been developing a reliable method of prosodic word boundary detection for Japanese continuous speech, based on statistical modeling of the mora transitions of the fundamental frequency contours of prosodic words. Modifications to the codebook sizes and the HMM topologies improved the boundary detection performance. When mora boundary information obtainable from the phoneme recognition process was used, the detection rate reached about 73% with 12.5% insertion errors in speaker-open experiments. The method was then integrated into a continuous speech recognition system with an unlimited vocabulary. The integrated system conducts recognition in two stages: the first stage detects mora boundaries without prosodic information, and the second stage increases the mora recognition rate using prosodic word boundary information. Slight improvements in mora recognition rates were observed in both speaker-closed and speaker-open experiments.
