    Neural Network vs. Rule-Based G2P: A Hybrid Approach to Stress Prediction and Related Vowel Reduction in Bulgarian

    An effective grapheme-to-phoneme (G2P) conversion system is a critical element of speech synthesis. Rule-based systems were an early method for G2P conversion. In recent years, machine learning tools have been shown to outperform rule-based approaches in G2P tasks. We investigate neural network sequence-to-sequence modeling for the prediction of syllable stress and the resulting vowel reductions in the Bulgarian language. We then develop a hybrid G2P approach that combines manually written grapheme-to-phoneme mapping rules with neural network-enabled syllable stress predictions: stress markers are inserted at the predicted stress position in the transcription produced by the rule-based finite-state transducer. Finally, we apply vowel reduction rules relative to the position of the stress marker to yield the predicted phonetic transcription of the source Bulgarian word written in Cyrillic graphemes. We compare word error rates for the neural network sequence-to-sequence modeling approach and the hybrid approach and find no significant difference between the two. We conclude that our hybrid approach to syllable stress, vowel reduction, and transcription performs as well as the exclusively machine-learning-powered approach.
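
The three stages described in the abstract (rule-based transcription, stress-marker insertion, stress-dependent vowel reduction) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mapping table `G2P`, the reduction table `REDUCE`, and all function names are hypothetical, the tables cover only a handful of letters, and the neural stress predictor is replaced by a caller-supplied function `predict_stress` that returns the 0-based index of the stressed vowel.

```python
# Toy grapheme-to-phoneme table (illustrative assumption, not the paper's rules).
G2P = {
    "а": "a", "б": "b", "в": "v", "д": "d", "е": "ɛ", "и": "i",
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "ɔ", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ъ": "ɤ",
}
VOWELS = {"a", "ɛ", "i", "ɔ", "u", "ɤ"}
# Simplified reduction of unstressed vowels (assumed for illustration only).
REDUCE = {"a": "ɐ", "ɤ": "ɐ", "ɔ": "o"}
STRESS = "ˈ"


def rule_based_g2p(word):
    """Map each Cyrillic letter to a phoneme; raises KeyError for unmapped letters."""
    return [G2P[ch] for ch in word.lower()]


def insert_stress(phonemes, stressed_vowel_index):
    """Insert the stress marker directly before the n-th vowel (0-based)."""
    out, seen = [], 0
    for p in phonemes:
        if p in VOWELS:
            if seen == stressed_vowel_index:
                out.append(STRESS)
            seen += 1
        out.append(p)
    return out


def reduce_vowels(marked):
    """Reduce every vowel that is not immediately preceded by the stress marker."""
    out, stressed_next = [], False
    for p in marked:
        if p == STRESS:
            stressed_next = True
            out.append(p)
        elif p in VOWELS:
            out.append(p if stressed_next else REDUCE.get(p, p))
            stressed_next = False
        else:
            out.append(p)
    return out


def hybrid_g2p(word, predict_stress):
    """Chain the three stages; predict_stress stands in for the neural model."""
    phonemes = rule_based_g2p(word)
    marked = insert_stress(phonemes, predict_stress(word))
    return "".join(reduce_vowels(marked))


# Example with a stub predictor that always stresses the first vowel.
print(hybrid_g2p("мама", lambda w: 0))
```

Keeping the stress predictor behind a plain function argument mirrors the division of labor in the abstract: the hand-written rules stay deterministic, and only the stress position comes from a learned model.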