Correlating ASR Errors with Developmental Changes in Speech Production: A Study of 3-10-Year-Old European Portuguese Children's Speech
Automatically recognising children's speech is a very difficult task. This difficulty can be attributed to the high variability in children's speech, both within and across speakers. The variability is due to developmental changes in children's anatomy, speech production skills, and so on, and manifests itself, for example, in fundamental and formant frequencies, the frequency of disfluencies, and pronunciation quality. In this paper, we report the results of acoustic and auditory analyses of 3-10-year-old European Portuguese children's speech. Furthermore, we are able to correlate some of the pronunciation error patterns revealed by our analyses - such as the truncation of consonant clusters - with the errors made by a children's speech recogniser trained on speech collected from the same age group. Other pronunciation error patterns seem to have little or no impact on speech recognition performance. In future work, we will attempt to use our findings to improve the performance of our recogniser.
Automatically Recognising European Portuguese Children's Speech
This paper reports findings from an analysis of errors made by an automatic speech recogniser trained and tested with 3-10-year-old European Portuguese children's speech. We expected and were able to identify frequent pronunciation error patterns in the children's speech. Furthermore, we were able to correlate some of these pronunciation error patterns with automatic speech recognition errors. The findings reported in this paper are of phonetic interest but will also be useful for improving the performance of automatic speech recognisers aimed at children representing the target population of the study.
A process-oriented language for describing aspects of reading comprehension
Includes bibliographical references (p. 36-38). The research described herein was supported in part by the National Institute of Education under Contract No. MS-NIE-C-400-76-011.
Pronunciation Portfolio: How were, are, and will be you?
No two students are the same. There are about 2 billion students of English on this planet, and each student is always evolving through training. This means that there are about 2 billion types of English pronunciation. Despite this tremendous number of pronunciations, there has so far been no good method to represent each pronunciation individually. This study introduces a novel method to represent individual pronunciations. The method is based on a physical implementation of structural phonology, and the implementation can be regarded as a mathematical interpretation of Saussure's claim that language is a system of conceptual differences and phonic differences. Each student's pronunciation is acoustically and entirely represented as a phonological structure with no dimensions indicating non-linguistic features such as age, gender, speaker, microphone, room, line, etc. This paper examines whether the structural representation can provide a good tool for pronunciation assessment. Results of experiments with good and intentionally bad pronunciations of a single speaker showed that all the students used in the experiment are acoustically located between the two pronunciations, indicating that the students are judged to be acoustically closer to the speaker than the speaker himself is. This result shows that the proposed method can delete the irrelevant factors effectively and is extremely reliable in CALL.
Automatic transcription and phonetic labelling of dyslexic children's reading in Bahasa Melayu
Automatic speech recognition (ASR) is potentially helpful for children who suffer from dyslexia. Highly phonetically similar errors in dyslexic children's reading affect the accuracy of ASR. Thus, this study aims to evaluate whether the accuracy of ASR using automatic transcription and phonetic labelling of dyslexic children's reading in BM is acceptable. Three objectives were set: first, to produce manual transcription and phonetic labelling; second, to construct automatic transcription and phonetic labelling using forced alignment; and third, to compare the accuracy obtained with automatic transcription and phonetic labelling against that obtained with manual transcription and phonetic labelling. To accomplish these goals, several methods were used, including manual speech labelling and segmentation, forced alignment, and Hidden Markov Model (HMM) and Artificial Neural Network (ANN) training; accuracy was measured using Word Error Rate (WER) and False Alarm Rate (FAR). A total of 585 speech files were used for the manual transcription, forced alignment, and training experiments. The ASR engine using automatic transcription and phonetic labelling obtained an optimum accuracy of 76.04%, with a WER of 23.96% and a FAR of 17.9%. These results are very close to those of the engine using manual transcription, namely 76.26% accuracy, a WER of 23.97%, and a FAR of 17.9%. In conclusion, the accuracy of automatic transcription and phonetic labelling is acceptable for helping dyslexic children learn with ASR in Bahasa Melayu (BM).
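The abstract above reports Word Error Rate (WER) figures. As a point of reference, WER is conventionally computed as the word-level Levenshtein (edit) distance between the reference and hypothesis transcriptions, divided by the number of reference words. A minimal sketch follows; the function name and the toy strings are illustrative, not taken from the study:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit-distance table over word tokens
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution in a five-word reference gives WER = 0.2
print(wer("o gato comeu o peixe", "o gato come o peixe"))
```

A WER of 23.96%, as reported above, thus means that roughly one in four reference words was substituted, deleted, or required an insertion.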
Promoting Phonological Awareness in Young Children through At-Home Activities: A Video Curriculum
Research relating phonological awareness, beginning reading acquisition, and parental involvement in children's literacy development was read, evaluated, and summarized. A positive relationship between phonological awareness and learning to read was indicated by this review, and a correlation between parental literacy activities and children's language and reading acquisition was found. Studies suggesting the existence of a developmental sequence of phonological skills were examined. The literature review provided a rationale and design for phonological awareness instruction. A research-supported curriculum containing a teacher's manual, take-home interactive video activities and activity sheets, and assessments was created.
Atypical cortical tracking of the speech envelope in children who stutter: a potential contributor towards phonological processing differences
A growing body of evidence suggests that individuals with developmental stuttering exhibit phonological processing differences when compared to fluent peers. However, it is not yet clear which factors contribute to this atypical processing. It has been argued that the speech mechanisms which process these phonological units are monitored within a hierarchical system, whose foundation is controlled by low-frequency neural oscillating networks (Giraud & Poeppel, 2015). Thus, phonological processing differences may arise from impairments in fundamental mechanisms associated with low-frequency neural oscillating networks, such as temporal speech encoding. For this reason, this study investigated cortical temporal response functions in 14 children who stutter (3-7 years of age) compared to 13 normally fluent peers. EEG data were recorded as participants encoded natural speech during a dichotic listening task. Comparing the groups, the results provide evidence that children who stutter show significantly weaker cortical tracking for unattended speech and more efficient cortical tracking for attended speech, suggesting that phonological processing is atypical at the level of speech envelope encoding. Considering these findings, we propose that children who stutter may increase cognitive effort during speech and language processing in order to compensate for an atypical phonological processing mechanism.
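Cortical tracking of the speech envelope, as studied above, is commonly quantified by estimating a temporal response function (TRF): a set of weights describing how the EEG at time t reflects the envelope at earlier lags. The simplest estimator is reverse correlation, i.e. cross-correlating the EEG with the lagged envelope; real analyses (e.g. the regularized-regression approach the abstract's "cortical temporal response functions" implies) are more elaborate. A minimal sketch on synthetic data, with names and parameters chosen for illustration only:

```python
import random

def trf_by_reverse_correlation(envelope, eeg, max_lag):
    """Approximate TRF weights by cross-correlating EEG with the lagged
    speech envelope; valid when the envelope is close to white noise."""
    n = len(envelope)
    weights = []
    for lag in range(max_lag + 1):
        # Correlate eeg(t) with envelope(t - lag); circular indexing keeps
        # the sketch short (a real analysis would zero-pad instead).
        w = sum(eeg[t] * envelope[(t - lag) % n] for t in range(n)) / n
        weights.append(w)
    return weights

# Synthetic check: "EEG" that is just the envelope delayed by 3 samples,
# so the estimated TRF should peak at lag 3.
random.seed(0)
env = [random.gauss(0.0, 1.0) for _ in range(2000)]
eeg = [env[(t - 3) % len(env)] for t in range(len(env))]
trf = trf_by_reverse_correlation(env, eeg, max_lag=7)
```

Stronger or weaker "cortical tracking" between groups then corresponds to larger or smaller TRF responses (or prediction accuracies) for the attended and unattended speech streams.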
Gender detection in children's speech utterances for human-robot interaction
The human voice essentially carries paralinguistic information used in many real-time applications. Detecting gender in children's speech is considered a more challenging task than in adults' speech. In this study, a system for human-robot interaction (HRI) is proposed to detect gender in children's speech utterances without depending on the text. The robot's perception includes three phases: the feature extraction phase, where four formants are measured at each glottal pulse, a median is then calculated across these measurements, and three types of features are computed, namely formant average (AF), formant dispersion (DF), and formant position (PF); the feature standardization phase, where the measured feature dimensions are standardized using the z-score method; and the semantic understanding phase, where the children's gender is detected using a logistic regression classifier. At the same time, the robot's action is delivered as a speech response using the text-to-speech (TTS) technique. Experiments are conducted on the Carnegie Mellon University (CMU) Kids dataset to measure the suggested system's performance. The suggested system achieves an overall accuracy of 98%. The results show a clear improvement in accuracy of up to 13% compared to related works that used the CMU Kids dataset.
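The feature pipeline above can be illustrated concretely. Under the common definitions (formant average = mean of the formant medians; formant dispersion = mean spacing between adjacent formants; z-score = centre by the mean and scale by the standard deviation), a minimal sketch follows. The exact formulas used by the paper, including its formant-position feature, may differ, so this is an assumption-laden illustration rather than the authors' method:

```python
def formant_features(formants):
    """Formant average (AF) and formant dispersion (DF) from per-speaker
    median formant frequencies [F1, F2, F3, F4] in Hz (common definitions;
    the paper's exact formulas may differ)."""
    af = sum(formants) / len(formants)
    # Mean spacing between adjacent formants simplifies to (F4 - F1) / 3
    df = (formants[-1] - formants[0]) / (len(formants) - 1)
    return af, df

def zscore(values):
    """Standardize one feature dimension to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

# Toy formant medians in Hz
af, df = formant_features([500.0, 1500.0, 2500.0, 3500.0])
standardized = zscore([500.0, 1500.0, 2500.0, 3500.0])
```

Standardizing each feature dimension this way puts AF, DF, and PF on comparable scales before they reach the logistic regression classifier.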