14 research outputs found

Mispronunciation Detection and Diagnosis in Mandarin Accented English Speech

Author: Khanal Subash
Publication venue: UKnowledge
Publication date: 01/01/2020

This work presents the development, implementation, and evaluation of a Mispronunciation Detection and Diagnosis (MDD) system, with application to pronunciation evaluation of Mandarin-accented English speech. A comprehensive detection and diagnosis of errors in the Electromagnetic Articulography corpus of Mandarin-Accented English (EMA-MAE) was performed using expert phonetic transcripts and an Automatic Speech Recognition (ASR) system. Articulatory features derived from the parallel kinematic data available in the EMA-MAE corpus were used to identify the most significant articulatory error patterns seen in L2 speakers during common mispronunciations. Using both acoustic and articulatory information, an ASR-based MDD system was built and evaluated across different feature combinations and Deep Neural Network (DNN) architectures. The MDD system captured mispronunciation errors with a detection accuracy of 82.4%, a diagnostic accuracy of 75.8%, and a false rejection rate of 17.2%. The results demonstrate the advantage of using articulatory features in revealing the significant contributors to mispronunciation as well as in improving the performance of MDD systems.

University of Kentucky

Apraxia World: Deploying a Mobile Game and Automatic Speech Recognition for Independent Child Speech Therapy

Author: Hair Adam
Publication date: 27/04/2021

Children with speech sound disorders typically improve pronunciation quality by undergoing speech therapy, which must be delivered frequently and with high intensity to be effective. As such, clinic sessions are supplemented with home practice, often under caregiver supervision. However, traditional home practice can grow boring for children due to monotony. Furthermore, practice frequency is limited by caregiver availability, making it difficult for some children to reach therapy dosage. To address these issues, this dissertation presents a novel speech therapy game to increase engagement, and explores automatic pronunciation evaluation techniques to afford children independent practice. The therapy game, called Apraxia World, delivers customizable, repetition-based speech therapy while children play through platformer-style levels using typical on-screen tablet controls; children complete in-game speech exercises to collect assets required to progress through the levels. Additionally, Apraxia World provides pronunciation feedback according to an automated pronunciation evaluation system running locally on the tablet.
Apraxia World offers two advantages over current commercial and research speech therapy games: first, the game provides extended gameplay to support long therapy treatments; second, it affords some therapy practice independence via automatic pronunciation evaluation, allowing caregivers to lightly supervise instead of directly administer the practice. Pilot testing indicated that children enjoyed the game-based therapy much more than traditional practice and that the exercises did not interfere with gameplay. During a longitudinal study, children made clinically significant pronunciation improvements while playing Apraxia World at home. Furthermore, children remained engaged in the game-based therapy over the two-month testing period, and some even wanted to continue playing post-study. The second part of the dissertation explores word- and phoneme-level pronunciation verification for child speech therapy applications. Word-level pronunciation verification is accomplished using a child-specific template-matching framework, where an utterance is compared against correctly and incorrectly pronounced examples of the word. This framework identified mispronounced words better than both a standard automated baseline and co-located caregivers. Phoneme-level mispronunciation detection is investigated using a technique from the second-language learning literature: training phoneme-specific classifiers with phonetic posterior features. This method also outperformed the standard baseline and, more significantly, identified mispronunciations better than student clinicians.

Texas A&M Repository

Automatic Screening of Childhood Speech Sound Disorders and Detection of Associated Pronunciation Errors

Author: Shahin Mostafa
Publication venue: UNSW, Sydney
Publication date: 01/01/2023

Speech disorders in children can affect their fluency and intelligibility. Delay in their diagnosis and treatment increases the risk of social impairment and learning disabilities. With the significant shortage of Speech and Language Pathologists (SLPs), there is an increasing interest in Computer-Aided Speech Therapy tools with automatic detection and diagnosis capability. However, the scarcity and unreliable annotation of disordered child speech corpora, along with the high acoustic variation in child speech data, have impeded the development of reliable automatic detection and diagnosis of childhood speech sound disorders. Therefore, this thesis investigates two types of detection systems that can be achieved with minimum dependency on annotated mispronounced speech data. First, a novel approach was proposed that adopts paralinguistic features, which represent the prosodic, spectral, and voice quality characteristics of the speech, to perform segment- and subject-level classification of Typically Developing (TD) and Speech Sound Disordered (SSD) child speech using a binary Support Vector Machine (SVM) classifier. As paralinguistic features are both language- and content-independent, they can be extracted from an unannotated speech signal. Second, a novel Mispronunciation Detection and Diagnosis (MDD) approach was introduced to detect the pronunciation errors made due to SSDs and provide low-level diagnostic information that can be used in constructing formative feedback and a detailed diagnostic report. Unlike existing MDD methods, where detection and diagnosis are performed at the phoneme level, the proposed method achieved MDD at the speech attribute level, namely the manner and place of articulation. The speech attribute features describe the involved articulators and their interactions when making a speech sound, allowing a low-level description of the pronunciation error to be provided.
Two novel methods to model speech attributes are further proposed in this thesis: a frame-based (phoneme-alignment) method leveraging the Multi-Task Learning (MTL) criterion and training a separate model for each attribute, and an alignment-free, jointly learnt method based on the Connectionist Temporal Classification (CTC) sequence-to-sequence criterion. The proposed techniques have been evaluated using standard and publicly accessible adult and child speech corpora, while the MDD method has been validated using L2 speech corpora.

UNSWorks

Proceedings of the Sixteenth Australasian International Conference on Speech Science and Technology

Publication venue: ASSTA
Publication date: 31/12/2016

UCL Discovery

Psychological Engagement in Choice and Judgment Under Risk and Uncertainty

Author: Baranova Victoria
Buontempo Michele
Heilman Renata
Kusev Petko
Publication venue: The Psychonomic Society

Theories of choice and judgment assume that agents behave rationally, choose the higher expected value option, and evaluate the choice consistently (Expected Utility Theory; Von Neumann & Morgenstern, 1947). However, researchers in decision-making showed that human behaviour is different in choice and judgment tasks (Slovic & Lichtenstein, 1968; 1971; 1973). In this research, we propose that psychological engagement and control deprivation predict behavioural inconsistencies and utilitarian performance with judgment and choice. Moreover, we explore the influences of engagement and control deprivation on agents' behaviours, while manipulating content of utility (Kusev et al., 2011; Hertwig & Gigerenzer, 1999; Tversky & Kahneman, 1996) and decision reward (Kusev et al., 2013; Shafir et al., 2002).

University of Huddersfield Repository

Grafting Acoustic Instruments and Signal Processing: Creative Control and Augmented Expressivity

Author: Freed Adrian
Overholt Daniel
Publication date: 01/01/2013

VBN

A case study of a school in Zimbabwe: Investigating challenges faced by rural O-level students and strategies used by teachers in the English Reading-comprehension classes

Author: Kanyoka Maxwell Obediah
Publication date: 01/01/2018

UDORA - University of Derby Online Research Archive

Multimedia Development of English Vocabulary Learning in Primary School

Author: Syaiful Rohim Aim
Publication venue: ICCE 2014 Organizing Committee, Japan
Publication date: 01/01/2014

In this paper, we describe a prototype of a web-based intelligent handwriting education system for autonomous learning of Bengali characters. The Bengali language is used by more than 211 million people in India and Bangladesh. Due to socio-economic limitations, not everyone in this population has the chance to go to school. This research project aimed to develop an intelligent Bengali handwriting education system. As an intelligent tutor, the system can automatically check for handwriting errors, such as stroke production errors, stroke sequence errors, and stroke relationship errors, and immediately provide feedback so that students can correct themselves. Our proposed system can be accessed from a smartphone or iPhone, which allows students to practice their Bengali handwriting anytime and anywhere. Bengali characters are multi-stroke, with extremely long cursive shapes and with variability in both stroke order and stroke direction. Because of this structural complexity, recognition speed is a crucial issue when applying traditional online handwriting recognition algorithms to Bengali language learning. In this work, we have adopted a hierarchical recognition approach to improve the recognition speed, which makes our system suitable for web-based language learning. We applied a writing-speed-free recognition methodology together with the hierarchical recognition algorithm. This supports learners of all ages, especially children and older people. The experimental results showed that our proposed hierarchical recognition algorithm can provide higher accuracy than a traditional multi-stroke recognition algorithm while tolerating more writing variability.

UHAMKA Repository

A Sound Approach to Language Matters: In Honor of Ocke-Schwen Bohn

Author: Avesani Cinzia
Baker Brett Joseph
Balling Laura Winther
Behne Dawn M.
Best Catherine
Bundgaard-Nielsen Rikke
Carlet Angélica
Cebrian Juli
Christensen Ken Ramshøj
Cooper Angela
Flege James Emil
Hejná Michaela
Hejná Mísa
Horslund Camilla Søballe
Hua Congehao
Højen Anders
Jespersen Anna
Jespersen Anna Bothe
Jongman Allard
Jørgensen Henrik
Karmeli Sophia
Kizach Johannes
Kluge Denise Cristina
Lee Goun
Li Bin
Li Yingjie
Masapollo Matthew
Mooshammer Christine
Mora Joan C.
Mora-Plaza Ingrid
Niebuhr Oliver
Nyvad Anne Mette
Piske Thorsten
Polka Linda
Rasmussen Sidsel
Ruan Yufang
Sereno Joan A.
Steinlen Anja
Sørensen Mette Hjortshøj
Tyler Michael
Vayra Mario
Vikner Sten
Wang Yue
Wayland Ratree
Whalen D. H.
Wood Johanna
Yan Mengzhu
Publication venue: Aarhus University Library
Publication date: 16/05/2019

The contributions in this Festschrift were written by Ocke’s current and former PhD students, colleagues and research collaborators. The Festschrift is divided into six sections, moving from the smallest building blocks of language, through gradually expanding objects of linguistic inquiry, to the highest levels of description, all of which have formed a part of Ocke’s career, in connection with his teaching and/or his academic productions: “Segments”, “Perception of Accent”, “Between Sounds and Graphemes”, “Prosody”, “Morphology and Syntax” and “Second Language Acquisition”. Each one of these illustrates a sound approach to language matters.

AU Library Scholarly Publishing Services: E-books (Aarhus University)

Pre-service Teachers' Self-Reflection in Micro Teaching Class

Author: Tutyandari Caecilia
Publication date: 01/01/2014

Repository Universitas Sanata Dharma