
    The Pronunciation Accuracy of Interactive Dialog System for Malaysian Primary School Students

    This project examines the accuracy of an existing speech recognition engine in an interactive dialog system (IDS) for the literacy education of Malaysian primary school students who learn English as a second language (ESL). Students are interested in learning literacy on a computer that incorporates spoken dialog, as it motivates them to be more confident in reading and pronunciation without depending solely on teachers. This computer-assisted learning aims to improve students' oral reading ability through the speech recognition in the IDS. Using the system, students can learn to read and pronounce a word correctly on their own, without seeking help from teachers. The study was conducted at Sungai Berembang Primary School and involved all 16 female and 18 male Standard 2 students, aged 8. These students vary in reading ability, pronunciation, and experience with English, and have Malay as their first language. The main objective of the study is to examine the accuracy of an existing speech recognition engine for ESL Malaysian students in literacy education. The specific objectives are to identify the requirements for, and to evaluate, a speech-recognition-based dialog system for reading accuracy. This kind of speech recognition technology aims to provide teacher-like tutoring in children's phonemic awareness, vocabulary building, word comprehension, and fluent reading. The method used in this study is the System Development Research Method, which has five stages: construct a framework; develop the system architecture; analyze and design the system; build the prototype; and, lastly, observe and test the system. According to the results of the study and the implementation of the IDS, 85% of the students found that the system had helped their English.

    Improving generalisation to new speakers in spoken dialogue state tracking

    Users with disabilities can greatly benefit from personalised voice-enabled environmental-control interfaces, but for users with speech impairments (e.g. dysarthria) poor ASR performance poses a challenge to successful dialogue. Statistical dialogue management has shown resilience against high ASR error rates, which makes it useful for improving the performance of these interfaces. However, little research has so far been devoted to personalising dialogue management to specific users. Recently, data-driven discriminative models have been shown to yield the best performance in dialogue state tracking (the inference of the user goal from the dialogue history). However, because each speaker has unique characteristics, training a system for a new user when no user-specific data is available can be challenging, owing to the mismatch between training and working conditions. This work investigates two methods to improve the performance of an LSTM-based personalised state tracker with new speakers: the use of speaker-specific acoustic and ASR-related features, and dropout regularisation. It is shown that in an environmental control system for dysarthric speakers, the combination of both techniques yields improvements of 3.5% absolute in state tracking accuracy. Further analysis explores the effect of using different amounts of speaker-specific data to train the tracking system.

    Using phone features to improve dialogue state tracking generalisation to unseen states

    The generalisation of dialogue state tracking to unseen dialogue states can be very challenging. In a slot-based dialogue system, dialogue states lie in a discrete space where distances between states cannot be computed. Therefore, the model parameters to track states unseen in the training data can only be estimated from more general statistics, under the assumption that every dialogue state will have the same underlying state tracking behaviour. However, this assumption is not valid. For example, two values whose associated concepts have different ASR accuracy may have different state tracking performance. Therefore, if the ASR performance of the concepts related to each value can be estimated, such estimates can be used as general features. The features will help to relate unseen dialogue states to states seen in the training data with similar ASR performance. Furthermore, if two phonetically similar concepts have similar ASR performance, the features extracted from the phonetic structure of the concepts can be used to improve generalisation. In this paper, ASR and phonetic structure-related features are used to improve the dialogue state tracking generalisation to unseen states of an environmental control system developed for dysarthric speakers.

    Survey on Evaluation Methods for Dialogue Systems

    In this paper we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part of the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods that reduce the involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods for this class.

    A Survey of Available Corpora For Building Data-Driven Dialogue Systems: The Journal Version

    During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and quite promising. To facilitate research in this area, we have carried out a wide survey of publicly available datasets suitable for data-driven learning of dialogue systems. We discuss important characteristics of these datasets, how they can be used to learn diverse dialogue strategies, and their other potential uses. We also examine methods for transfer learning between datasets and the use of external knowledge. Finally, we discuss the appropriate choice of evaluation metrics for the learning objective.