14 research outputs found

    Using Tone Information in Thai Spelling Speech Recognition

    Get PDF

    A Hidden Conditional Random Field-Based Approach for Thai Tone Classification

    Get PDF
    In Thai, tonal information is a crucial component for identifying the lexical meaning of a word. Consequently, Thai tone classification can obviously improve performance of Thai speech recognition system. In this article, we therefore reported our study of Thai tone classification. Based on our investigation, most of Thai tone classification studies relied on statistical machine learning approaches, especially the Artificial Neural Network (ANN)-based approach and the Hidden Markov Model (HMM)-based approach. Although both approaches gave reasonable performances, they had some limitations due to their mathematical models. We therefore introduced a novel approach for Thai tone classification using a Hidden Conditional Random Field (HCRF)-based approach. In our study, we also investigated tone configurations involving tone features, frequency scaling and normalization techniques in order to fine tune performances of Thai tone classification. Experiments were conducted in both isolated word scenario and continuous speech scenario. Results showed that the HCRF-based approach with the feature F_dF_aF, ERB-rate scaling and a z-score normalization technique yielded the highest performance and outperformed a baseline using the ANN-based approach, which had been reported as the best for the Thai tone classification, in both scenarios. The best performance of HCRF-based approach provided the error rate reduction of 10.58% and 12.02% for isolated word scenario and continuous speech scenario respectively when comparing with the best result of baselines

    An Automatic Real-time Synchronization of Live speech with Its Transcription Approach

    Get PDF
    Most studies in automatic synchronization of speech and transcription focus on the synchronization at the sentence level or the phrase level. Nevertheless, in some languages, like Thai, boundaries of such levels are difficult to linguistically define, especially in case of the synchronization of speech and its transcription. Consequently, the synchronization at a finer level like the syllabic level is promising. In this article, an approach to synchronize live speech with its corresponding transcription in real time at the syllabic level is proposed. Our approach employs the modified real-time syllable detection procedure from our previous work and the transcription verification procedure then adopts to verify correctness and to recover errors caused by the real-time syllable detection procedure. In experiments, the acoustic features and the parameters are customized empirically. Results are compared with two baselines which have been applied to the Thai scenario. Experimental results indicate that, our approach outperforms two baselines with error rate reduction of 75.9% and 41.9% respectively and also can provide results in the real-time situation. Besides, our approach is applied to the practical application, namely ChulaDAISY. Practical experiments show that ChulaDAISY applied with our approach could reduce time consumption for producing audio books

    CHULA TTS: A Modularized Text-To-Speech Framework

    Get PDF

    An Automatic Knowledge Graph Creation Framework from Natural Language Text

    No full text

    A Heuristic Expansion Framework for Mapping Instances to Linked Open Data

    No full text

    Using Tone Information in Thai Spelling Speech Recognition

    No full text
    corecore