
    Automating Intended Target Identification for Paraphasias in Discourse using a large language model

    Purpose: To date, there are no automated tools for the identification and fine-grained classification of paraphasias within discourse, the production of which is the hallmark characteristic of most people with aphasia (PWA). In this work, we fine-tune a large language model (LLM) to automatically predict paraphasia targets in Cinderella story retellings.
    Method: Data consisted of 332 Cinderella story retellings containing 2,489 paraphasias from PWA, for which research assistants identified the intended targets. We supplemented these training data with 256 sessions from control participants, to which we added 2,415 synthetic paraphasias. We conducted four experiments using different training data configurations to fine-tune the LLM to automatically “fill in the blank” of the paraphasia with a predicted target, given the context of the rest of the story retelling. We tested the experiments' predictions against our human-identified targets and stratified our results by the ambiguity of the targets and by clinical factors.
    Results: The model trained on controls and PWA achieved 50.7% accuracy at exactly matching the human-identified target. Fine-tuning on PWA data, with or without controls, led to comparable performance. The model performed better on targets with less human ambiguity and on paraphasias from participants with fluent or less severe aphasia.
    Conclusions: We were able to automatically identify the intended target of paraphasias in discourse, using just the surrounding language, about half of the time. These findings take us a step closer to automatic aphasic discourse analysis. In future work, we will incorporate phonological information from the paraphasia to further improve predictive utility.
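The exact-match evaluation described above can be sketched as a small scoring function. The predicted and human-identified targets below are invented examples, not data from the study.

```python
# Minimal sketch of exact-match scoring of predicted paraphasia targets
# against human-identified targets. Example word pairs are hypothetical.

def exact_match_accuracy(predicted, gold):
    """Fraction of model predictions that exactly match the human target."""
    if len(predicted) != len(gold):
        raise ValueError("prediction/target lists must be the same length")
    hits = sum(p.strip().lower() == g.strip().lower()
               for p, g in zip(predicted, gold))
    return hits / len(gold)

predicted = ["slipper", "carriage", "ball", "prince"]
gold      = ["slipper", "coach",    "ball", "prince"]
print(exact_match_accuracy(predicted, gold))  # 0.75
```

Stratifying by target ambiguity or clinical factors would amount to computing the same score over the corresponding subsets of prediction/target pairs.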

    Towards Automatic Speech-Language Assessment for Aphasia Rehabilitation

    Speech-based technology has the potential to reinforce traditional aphasia therapy through the development of automatic speech-language assessment systems. Such systems can provide clinicians with supplementary information to assist with progress monitoring and treatment planning, and can provide support for on-demand auxiliary treatment. However, current technology cannot support this type of application due to the difficulties associated with aphasic speech processing. The focus of this dissertation is on the development of computational methods that can accurately assess aphasic speech across a range of clinically-relevant dimensions. The first part of the dissertation focuses on novel techniques for assessing aphasic speech intelligibility in constrained contexts. The second part investigates acoustic modeling methods that lead to significant improvement in aphasic speech recognition and allow the system to work with unconstrained speech samples. The final part demonstrates the efficacy of speech recognition-based analysis in automatic paraphasia detection, extraction of clinically-motivated quantitative measures, and estimation of aphasia severity. The methods and results presented in this work will enable robust technologies for accurately recognizing and assessing aphasic speech, and will provide insights into the link between computational methods and clinical understanding of aphasia.
    Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/140840/1/ducle_1.pd

    Uncovering the potential for a weakly supervised end-to-end model in recognising speech from patients with post-stroke aphasia

    Post-stroke speech and language deficits (aphasia) significantly impact patients' quality of life. Many people with mild symptoms remain undiagnosed, and the majority do not receive the intensive doses of therapy recommended, due to healthcare costs and/or inadequate services. Automatic Speech Recognition (ASR) may help overcome these difficulties by improving diagnostic rates and providing feedback during tailored therapy. However, its performance is often unsatisfactory due to the high variability of speech errors and the scarcity of training datasets. This study assessed the performance of Whisper, a recently released end-to-end model, on speech from patients with post-stroke aphasia (PWA). We tuned its hyperparameters to achieve the lowest word error rate (WER) on aphasic speech. WER was significantly higher in PWA than in age-matched controls (38.5% vs. 10.3%, p < 0.001). We demonstrated that worse WER was related to more severe aphasia as measured by expressive (overt naming and spontaneous speech production) and receptive (written and spoken comprehension) language assessments. Stroke lesion size did not affect the performance of Whisper. Linear mixed models accounting for demographic factors, therapy duration, and time since stroke confirmed worse Whisper performance with left hemispheric frontal lesions. We discuss the implications of these findings for how future ASR can be improved for PWA.
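The WER metric used above is the word-level Levenshtein (edit) distance between the hypothesis and the reference transcript, normalized by reference length. A minimal sketch, with illustrative sentences that are not from the study:

```python
# Word error rate (WER) via word-level Levenshtein distance:
# (substitutions + deletions + insertions) / reference word count.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words
```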

    Discourse Production of Czech Speakers with Aphasia: An Exploration Using Usage-Based Linguistics

    Research in linguistic aphasiology has long been dominated by structuralist, rule-based approaches to the study of language. However, recent work has shown that analyses based in constructivist, usage-based frameworks can explain patterns of language processing in aphasia that are difficult to accommodate in structuralist models. The present work follows up on these findings and aims to provide additional evidence for the benefits of the usage-based model, using data from Czech speakers with aphasia, an understudied language in this context. The aims of the study were threefold: to create a collection of samples of aphasic connected speech available to other researchers, to provide a description of the patterns of aphasic discourse production in Czech, and, most importantly, to show the potential benefits of usage-based construction grammar for aphasia research. A corpus of the speech of eleven persons with fluent and non-fluent aphasia of varying degrees of severity was created. The corpus consists of more than 23,000 word positions produced by speakers with aphasia in tasks used to elicit conversational, narrative, descriptive, and procedural discourse. The corpus is lemmatized and morphologically tagged, and the transcripts are aligned with audio recordings. It also includes a smaller sample of the speech production of three neurotypical speakers with comparable...
    Institute of Czech Language and Theory of Communication, Faculty of Arts
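A lemmatized, task-annotated corpus of this kind lends itself to simple per-task tallies. A minimal sketch of counting word positions per elicitation task, assuming a hypothetical form/lemma/tag token format and task labels (not the corpus's actual annotation scheme):

```python
# Tally word positions per discourse-elicitation task from tagged
# utterances. Token format and example utterances are illustrative.

from collections import Counter

# Hypothetical tagged utterances: (task, "form/lemma/tag ..." tokens)
utterances = [
    ("narrative",    "šel/jít/VB ven/ven/DB"),
    ("narrative",    "pes/pes/NN štěkal/štěkat/VB"),
    ("conversation", "ano/ano/TT"),
]

positions = Counter()
for task, tagged in utterances:
    positions[task] += len(tagged.split())  # one word position per token

print(dict(positions))  # {'narrative': 4, 'conversation': 1}
```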

    A tale of two lexica: Investigating computational pressures on word representation with neural networks

    Introduction: The notion of a single localized store of word representations has become increasingly less plausible as evidence has accumulated for the widely distributed neural representation of wordform grounded in motor, perceptual, and conceptual processes. Here, we attempt to combine machine learning methods and neurobiological frameworks to propose a computational model of the brain systems potentially responsible for wordform representation. We tested the hypothesis that the functional specialization of word representation in the brain is driven partly by computational optimization. This hypothesis directly addresses the distinct problems of mapping sound to articulation vs. mapping sound to meaning.
    Results: We found that artificial neural networks trained on the mapping between sound and articulation performed poorly in recognizing the mapping between sound and meaning, and vice versa. Moreover, a network trained on both tasks simultaneously could not discover the features required for efficient mapping between sound and higher-level cognitive states, compared with the other two models. Furthermore, these networks developed internal representations reflecting specialized task-optimized functions without explicit training.
    Discussion: Together, these findings demonstrate that different task-directed representations lead to more focused responses and better performance of a machine or algorithm and, hypothetically, the brain. We thus propose that the functional specialization of word representation mirrors a computational optimization strategy, given the nature of the tasks the human brain faces.
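The core observation, that a model optimized for one mapping transfers poorly to a different mapping over the same inputs, can be illustrated with a toy linear model on synthetic data. This sketch uses numpy least squares, not the paper's actual networks, and all data are randomly generated.

```python
# Toy illustration of task-specific representations: a model fit to one
# input-output mapping fails on a different mapping over the same inputs.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))        # "sound" inputs
W_artic = rng.normal(size=(16, 8))    # true sound -> articulation mapping
W_sem = rng.normal(size=(16, 8))      # true sound -> meaning mapping
Y_artic = X @ W_artic
Y_sem = X @ W_sem

# Fit a linear map to the articulation task only
W_hat, *_ = np.linalg.lstsq(X, Y_artic, rcond=None)

mse_artic = float(np.mean((X @ W_hat - Y_artic) ** 2))  # trained task: near zero
mse_sem = float(np.mean((X @ W_hat - Y_sem) ** 2))      # untrained task: large
print(mse_artic < 1e-6, mse_sem > 1.0)  # True True
```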

    Modestly Modular vs. Massively Modular Approaches to Phonology

    Ph.D. Thesis. This thesis considers the extent to which phonology (that is, the phonological processor) can be considered a module of the mind. It is divided into two parts. In the first, an approach of 'modest' modularity due to Fodor (1983) is explored. In the second, the 'massive' modularity model, due to evolutionary psychologists in general and Carruthers (2006a) in particular, is examined. Whilst for Fodor (1983, 2000) the mind is only modular around its periphery (i.e. only its input and output systems are modules), for massive modularists the mind is modular through and through, up to and including its central capacities. The two authors therefore differ in their definitions of modularity: Fodor (1983, 2000) sees 'informational encapsulation' as essential to modularity, whereas for Carruthers (2006a) domain specificity is much more important. The thesis concludes that whether phonology is a module or not depends on the definition of modularity: although a substance-free phonology with no phonetic grounding could count as strong evidence for the informational encapsulation (and therefore the modularity) of phonology by Fodor's (1983) standards, some aphasiology data have shown that semantic treatments can remediate phonological word-finding difficulties in aphasia, which would indicate that phonology is not domain-specific, and is therefore amodular in the terms of massive modularists like Carruthers (2006a). In order to answer whether phonology is modular, then, we must first define, once and for all, what modularity (and indeed phonology) means. Until then, the debate remains, and so does my resolve to settle it.
    Arts and Humanities Research Council, Northern Bridge Doctoral Training Partnership