8 research outputs found

    DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech

    When beginners learn to speak a non-native language, it is difficult for them to judge for themselves whether they are speaking well. Computer-assisted pronunciation training systems are therefore used to detect learners' mispronunciations. These systems typically compare the user's speech with that of a specific native speaker, taken as a model, in units of rhythm, phonemes, or words, and calculate the differences. However, they require extensive speech data with detailed annotations or can only compare against one specific native speaker. To overcome these problems, we propose a new language-learning support system that calculates speech scores and detects beginners' mispronunciations from a small amount of unannotated speech data, without comparison to a specific person. The proposed system uses deep-learning-based speech processing to display the pronunciation score of the learner's speech and the difference/distance between the learner's pronunciation and that of a group of model speakers in an intuitive, visual manner. Learners can gradually improve their pronunciation by eliminating the differences and shortening the distance to the models until they become sufficiently proficient. Furthermore, since the pronunciation score and the difference/distance are not calculated against specific sentences from a particular model, users are free to study the sentences they wish to study. We also built an application to help non-native speakers learn English and confirmed that it can improve users' speech intelligibility.
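    As a rough illustration of the scoring idea described above, the sketch below computes a pronunciation score from the distance between a learner's speech embedding and the centroid of a group of model speakers' embeddings. The embeddings, the exponential scoring, and the function name are assumptions made for illustration; the abstract does not specify the paper's actual deep-learning pipeline.

        import numpy as np

        def score_pronunciation(learner_emb, model_embs):
            """Illustrative scoring (not the authors' method): distance of a
            learner's speech embedding from the centroid of a group of model
            speakers' embeddings.

            learner_emb: (D,) embedding of the learner's utterance.
            model_embs:  (N, D) embeddings of N model utterances.
            Returns a 0-100 score (higher = closer to the models) and the
            difference vector that could drive the visual feedback.
            """
            centroid = model_embs.mean(axis=0)        # "model" pronunciation
            diff = learner_emb - centroid             # direction of the deviation
            distance = np.linalg.norm(diff)           # how far from the models
            # Normalise by the spread of the model group so scores stay
            # comparable across freely chosen sentences.
            spread = np.linalg.norm(model_embs - centroid, axis=1).mean() + 1e-8
            score = 100.0 * np.exp(-distance / spread)
            return score, diff

        # Example with random placeholder embeddings (real ones would come
        # from a deep speech model):
        rng = np.random.default_rng(0)
        models = rng.normal(size=(20, 256))
        learner = models.mean(axis=0) + rng.normal(scale=0.5, size=256)
        score, diff = score_pronunciation(learner, models)
        print(f"pronunciation score: {score:.1f}/100")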

    The Effects of a Digital Articulatory Game on the Ability to Perceive Speech-Sound Contrasts in Another Language

    Digital and mobile devices enable easy access to applications for learning foreign languages. However, experimental studies on the effectiveness of these applications are scarce. Moreover, it is not understood whether the effects of speech and language training generalize to features that are not trained. To this end, we conducted a four-week intervention focused on articulatory training and the learning of English words in 6- to 7-year-old Finnish-speaking children who used the digital language-learning game app Pop2talk. An essential part of the app is automatic speech recognition, which enables assessing children's utterances and giving instant feedback to the players. The generalization of the effects of such training in English was explored using discrimination tasks before and after training (or after the same period of time in a control group). The stimuli of the discrimination tasks represented phonetic contrasts from two non-trained languages: Russian sibilant consonants and Mandarin tones. We found some improvement on the Russian sibilant contrast in the gamers, but it was not statistically significant. No improvement was observed for the tone contrast in the gaming group. A control group with no training showed no improvement in either contrast. The pattern of results suggests that the game may have improved the perception of non-trained speech sounds in some but not all individuals, yet the effects of motivation and attention span on performance could not be excluded with the current methods. Children's perceptual skills were linked to their word learning in the control group but not in the gaming group, where recurrent exposure enabled learning even for children with poorer perceptual skills. Together, the results demonstrate beneficial effects of learning via a digital application, yet raise the need for further research on individual differences in learning.
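    The abstract does not state how discrimination performance was scored; a common measure for such same/different tasks is the sensitivity index d'. The Python sketch below only illustrates that measure with hypothetical hit and false-alarm counts; it is not the study's actual analysis.

        from scipy.stats import norm

        def d_prime(hits, misses, false_alarms, correct_rejections):
            """Sensitivity index d' for a same/different discrimination task.
            A log-linear correction (+0.5 / +1) avoids infinite z-scores when
            a rate would be exactly 0 or 1."""
            hit_rate = (hits + 0.5) / (hits + misses + 1)
            fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
            return norm.ppf(hit_rate) - norm.ppf(fa_rate)

        # Hypothetical pre/post counts for one child on a sibilant contrast:
        pre = d_prime(hits=12, misses=8, false_alarms=9, correct_rejections=11)
        post = d_prime(hits=16, misses=4, false_alarms=6, correct_rejections=14)
        print(f"d' before: {pre:.2f}, after: {post:.2f}")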

    Non-game like training benefits spoken foreign-language processing in children with dyslexia

    Children with dyslexia often face difficulties in learning foreign languages, which is reflected in weaker neural activation. However, digital language-learning applications could support learning-induced plastic changes in the brain. Here we aimed to investigate whether plastic changes occur more readily in children with dyslexia after targeted training with a digital language-learning game or after similar training without game-like elements. We used auditory event-related potentials (ERPs), specifically the mismatch negativity (MMN), to study learning-induced changes in brain responses. Participants were 24 school-aged Finnish-speaking children with dyslexia and 24 age-matched typically reading control children. They trained English speech sounds and words with the “Say it again, kid!” (SIAK) language-learning game for 5 weeks between ERP measurements. During the game, the players explored game boards and produced English words aloud to score stars as feedback from an automatic speech recognizer. To compare the effectiveness of the training types (game vs. non-game), we embedded in the game some non-game levels stripped of all game-like elements. In the dyslexia group, the non-game training increased the MMN amplitude more than the game training, whereas in the control group the game training increased the MMN response more than the non-game training. In the dyslexia group, the MMN increase with the non-game training correlated with phonological awareness: children with poorer phonological awareness showed a larger increase in the MMN response. Improved neural processing of foreign speech sounds, as indicated by the MMN increase, suggests that targeted training with a simple application could alleviate some spoken foreign-language learning difficulties related to phonological processing in children with dyslexia.
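    The MMN referred to above is obtained by subtracting the averaged response to the repeated (standard) sound from the averaged response to the rare (deviant) sound and quantifying the difference in a latency window. The numpy sketch below shows that quantification under assumed inputs (one electrode, a 150-250 ms window); it is not the study's analysis pipeline.

        import numpy as np

        def mmn_amplitude(erp_standard, erp_deviant, times, window=(0.15, 0.25)):
            """Mean deviant-minus-standard difference in a latency window
            (a more negative value indicates a larger MMN).

            erp_standard, erp_deviant: averaged ERP waveforms (microvolts)
            from one electrode, e.g. Fz; times: matching time axis in seconds.
            """
            difference = erp_deviant - erp_standard
            mask = (times >= window[0]) & (times <= window[1])
            return float(difference[mask].mean())

        # A training effect could then be summarised per child as the
        # post-training MMN minus the pre-training MMN:
        # mmn_change = mmn_amplitude(std_post, dev_post, t) - mmn_amplitude(std_pre, dev_pre, t)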

    BioVisualSpeech: Deployment of an Interactive Platform for Speech Therapy Sessions With Children

    Sigmatism is a speech sound disorder (SSD) that prevents people from correctly pronouncing sibilant consonant sounds ([Z], [z], [S] and [s]). If left untreated, it can negatively impact children's ability to communicate and socialize. Parents are advised to seek speech therapy for their children whenever they are not reaching the milestones expected for their age, and while the exercises employed in speech therapy sessions are vital to the treatment of these disorders, they can also become repetitive. BioVisualSpeech is a research project that explores ways to provide biofeedback in speech therapy sessions through the use of serious games. An example of this is the BioVisualSpeech Therapy Support Platform, an interactive tool that gathers many types of games in one place, which children can play in therapy sessions and at home, using the computer's microphone to capture their voices. However, because the platform was developed in an academic context, it was important for us to adapt this system to a real-world context in collaboration with speech-language pathologists (SLPs). To achieve this, we set the goal of deploying the platform on SLPs' computers. We first re-engineered the system into an application focused on in-session use, rather than one in which children practice both with SLPs and at home. In addition, we integrated Windows Speech Recognition into the platform, made the system easier to install, and made it capable of collecting data from players, such as voice productions that could later be used to train better classification models, as well as other objective parameters concerning game performance. Our deployment with SLPs was accompanied by the questionnaires, documentation, and data-collection protocol needed to proceed with, first, the further validation of the platform along with two of its games and, second, the design of a user study focused on gathering voice productions from children. In the end, not only did we obtain promising results regarding the validation of the platform, but SLPs also gained a system that can continue to be used, and distributed by future researchers, even after the end of this project.
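    As a minimal sketch of the capture-and-feedback loop such a platform needs, the Python snippet below records one production attempt from the microphone and passes it to a placeholder classifier. The sounddevice capture is generic; classify_sibilant is a hypothetical stand-in, since the deployed platform relies on Windows Speech Recognition and its own classification models rather than on this code.

        import numpy as np
        import sounddevice as sd

        FS = 16_000  # sample rate in Hz

        def record_attempt(seconds=2.0):
            """Record one production attempt from the default microphone."""
            audio = sd.rec(int(seconds * FS), samplerate=FS, channels=1, dtype="float32")
            sd.wait()
            return audio.squeeze()

        def classify_sibilant(audio):
            """Hypothetical placeholder: the real platform delegates this step
            to Windows Speech Recognition and trained classifiers."""
            # Crude energy check just so the sketch runs end to end.
            return "ok" if np.abs(audio).mean() > 0.01 else "too quiet, try again"

        if __name__ == "__main__":
            attempt = record_attempt()
            print("feedback:", classify_sibilant(attempt))
            # In the platform, the raw audio and objective game metrics would
            # also be stored for later training of better classification models.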