8,899 research outputs found

    Using State-of-the-Art Speech Models to Evaluate Oral Reading Fluency in Ghana

    Full text link
    This paper reports on a set of three recent experiments utilizing large-scale speech models to evaluate the oral reading fluency (ORF) of students in Ghana. While ORF is a well-established measure of foundational literacy, assessing it typically requires one-on-one sessions between a student and a trained evaluator, a process that is time-consuming and costly. Automating the evaluation of ORF could support better literacy instruction, particularly in education contexts where formative assessment is uncommon due to large class sizes and limited resources. To our knowledge, this research is among the first to examine the use of the most recent versions of large-scale speech models (Whisper V2 wav2vec2.0) for ORF assessment in the Global South. We find that Whisper V2 produces transcriptions of Ghanaian students reading aloud with a Word Error Rate of 13.5. This is close to the model's average WER on adult speech (12.8) and would have been considered state-of-the-art for children's speech transcription only a few years ago. We also find that when these transcriptions are used to produce fully automated ORF scores, they closely align with scores generated by expert human graders, with a correlation coefficient of 0.96. Importantly, these results were achieved on a representative dataset (i.e., students with regional accents, recordings taken in actual classrooms), using a free and publicly available speech model out of the box (i.e., no fine-tuning). This suggests that using large-scale speech models to assess ORF may be feasible to implement and scale in lower-resource, linguistically diverse educational contexts

    Access to recorded interviews: A research agenda

    Get PDF
    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed

    Computational Language Assessment in patients with speech, language, and communication impairments

    Full text link
    Speech, language, and communication symptoms enable the early detection, diagnosis, treatment planning, and monitoring of neurocognitive disease progression. Nevertheless, traditional manual neurologic assessment, the speech and language evaluation standard, is time-consuming and resource-intensive for clinicians. We argue that Computational Language Assessment (C.L.A.) is an improvement over conventional manual neurological assessment. Using machine learning, natural language processing, and signal processing, C.L.A. provides a neuro-cognitive evaluation of speech, language, and communication in elderly and high-risk individuals for dementia. ii. facilitates the diagnosis, prognosis, and therapy efficacy in at-risk and language-impaired populations; and iii. allows easier extensibility to assess patients from a wide range of languages. Also, C.L.A. employs Artificial Intelligence models to inform theory on the relationship between language symptoms and their neural bases. It significantly advances our ability to optimize the prevention and treatment of elderly individuals with communication disorders, allowing them to age gracefully with social engagement.Comment: 36 pages, 2 figures, to be submite

    Automatic Framework to Aid Therapists to Diagnose Children who Stutter

    Get PDF
    • …
    corecore