107 research outputs found

    Estimating articulatory parameters from the acoustic speech signal

    Lip Synchronization by Acoustic Inversion

    On Generating Combilex Pronunciations via Morphological Analysis

    Combilex is a high-quality lexicon that has been developed specifically for speech technology purposes and recently released by CSTR. Combilex benefits from many advanced features. This paper explores one of these: the ability to generate fully specified transcriptions for morphologically derived words automatically. This functionality was originally implemented to encode the pronunciations of derived words in terms of their constituent morphemes, thus accelerating lexicon development and ensuring a high level of consistency. In this paper, we propose that this method of modelling pronunciations can be exploited further by combining it with a morphological parser, thus yielding a method to generate full transcriptions for unknown derived words. Not only could this accelerate adding new derived words to Combilex, but it could also serve as an alternative to conventional letter-to-sound rules. This paper presents preliminary work indicating that this is a promising direction.

    Comparison of HMM and TMDN Methods for Lip Synchronisation

    This paper presents a comparison between a hidden Markov model (HMM) based method and a novel artificial neural network (ANN) based method for lip synchronisation. Both model types were trained on motion tracking data, and a perceptual evaluation was carried out comparing the output of the models, both to each other and to the original tracked data. It was found that the ANN-based method was judged significantly better than the HMM-based method. Furthermore, the original data was not judged significantly better than the output of the ANN method.

    Enhancing Sequence-to-Sequence Text-to-Speech with Morphology

    Confidence Intervals for ASR-based TTS Evaluation

    Generating gestural timing from EMA data using articulatory resynthesis

    As part of ongoing work to integrate an articulatory synthesizer into a modular TTS platform, a method is presented that allows gestural timings to be generated automatically from EMA data. Further work is outlined which will adapt the vocal tract model and phoneset to English using new articulatory data, and use statistical trajectory models.

    Comparison of HMM and TMDN Methods for Lip Synchronisation

    This paper presents a comparison between a hidden Markov model (HMM) based method and a novel artificial neural network (ANN) based method for lip synchronisation. Both model types were trained on motion tracking data, and a perceptual evaluation was carried out comparing the output of the models, both to each other and to the original tracked data. It was found that the ANN-based method was judged significantly better than the HMM-based method. Furthermore, the original data was not judged significantly better than the output of the ANN method. Index Terms: hidden Markov model, mixture density network, lip synchronisation, inversion mapping