297 research outputs found

    An Investigation Into the Feasibility of Streamlining Language Sample Analysis Through Computer-Automated Transcription and Scoring

    The purpose of the study was to investigate the feasibility of streamlining the transcription and scoring portion of language sample analysis (LSA) through computer automation. LSA is a gold-standard procedure for examining children's language abilities that is underutilized by speech-language pathologists due to its time-consuming nature. To decrease the time associated with the process, the accuracy of transcripts produced automatically with Google Cloud Speech and the accuracy of scores generated by a hard-coded scoring function called the Literate Language Use in Narrative Analysis (LLUNA) were evaluated. A collection of narrative transcripts and audio recordings of narrative samples was selected to evaluate the accuracy of these automated systems. Samples were previously elicited from school-age children between the ages of 6;0-11;11 who were either typically developing (TD), at risk for language-related learning disabilities (AR), or had developmental language disorder (DLD). Transcription error of Google Cloud Speech transcripts was evaluated with a weighted word-error rate (WERw). Score accuracy was evaluated with a quadratic weighted kappa (Kqw). Results indicated an average WERw of 48% across all language sample recordings, with a median WERw of 40%. Several recording characteristics of the samples were associated with transcription error, including the codec used to record the audio sample and the presence of background noise. Transcription error was lower on average for samples that were recorded with a lossless codec and contained no background noise. Scoring accuracy of LLUNA was high across all six measures of literate language when scores were generated from traditionally produced transcripts, regardless of age or language ability (TD, DLD, AR). Adverbs were the most variable in their score accuracy. Scoring accuracy dropped when LLUNA generated scores from transcripts produced by Google Cloud Speech; however, LLUNA was more likely to generate accurate scores when transcripts had low to moderate levels of transcription error. This work provides additional support for the use of automated transcription under the right recording conditions and for automated scoring of literate language indices. It also provides preliminary support for streamlining the entire LSA process by automating both transcription and scoring when high-quality recordings of language samples are utilized.
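
    As a point of reference for the two metrics named above, the sketch below computes a conventional word error rate with the jiwer package and a quadratic weighted kappa with scikit-learn. It is a minimal illustration, not the study's code: the sample-level weighting behind WERw, the LLUNA scoring function, and the example strings and scores are all placeholders.

```python
# Minimal metric sketch (not the study's code): conventional WER via jiwer and
# quadratic weighted kappa via scikit-learn. The study's weighted WER (WERw)
# and the LLUNA scorer are not reproduced; all example data are placeholders.
import jiwer
from sklearn.metrics import cohen_kappa_score

reference = "the boy ran to the store and bought a red balloon"
hypothesis = "the boy ran to the store and bot a red balloon"

wer = jiwer.wer(reference, hypothesis)  # fraction of word-level errors
print(f"WER: {wer:.2%}")

# Agreement between manually derived scores and automated scores, with larger
# disagreements penalised more heavily (quadratic weights).
manual_scores = [3, 2, 4, 1, 3, 2]
automated_scores = [3, 2, 3, 1, 2, 2]
kappa_qw = cohen_kappa_score(manual_scores, automated_scores, weights="quadratic")
print(f"Quadratic weighted kappa: {kappa_qw:.2f}")
```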

    A longitudinal observational study of home-based conversations for detecting early dementia:protocol for the CUBOId TV task

    INTRODUCTION: Limitations in effective dementia therapies mean that early diagnosis and monitoring are critical for disease management, but current clinical tools are impractical and/or unreliable, and disregard short-term symptom variability. Behavioural biomarkers of cognitive decline, such as speech, sleep and activity patterns, can manifest prodromal pathological changes. They can be continuously measured at home with smart sensing technologies, and permit leveraging of interpersonal interactions for optimising diagnostic and prognostic performance. Here we describe the ContinUous behavioural Biomarkers Of cognitive Impairment (CUBOId) study, which explores the feasibility of multimodal data fusion for in-home monitoring of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). The report focuses on a subset of CUBOId participants who perform a novel speech task, the ‘TV task’, designed to track changes in ecologically valid conversations with disease progression. METHODS AND ANALYSIS: CUBOId is a longitudinal observational study. Participants have diagnoses of MCI or AD, and controls are their live-in partners with no such diagnosis. Multimodal activity data were passively acquired from wearables and in-home fixed sensors over timespans of 8–25 months. At two time points, participants completed the TV task over 5 days by recording audio of their conversations as they watched a favourite TV programme, with further testing to be completed after removal of the sensor installations. Behavioural testing is supported by neuropsychological assessment for deriving ground truths on cognitive status. Deep learning will be used to generate fused multimodal activity-speech embeddings for optimisation of diagnostic and predictive performance from speech alone. ETHICS AND DISSEMINATION: CUBOId was approved by an NHS Research Ethics Committee (Wales REC; ref: 18/WA/0158) and is sponsored by the University of Bristol. It is supported by the National Institute for Health Research Clinical Research Network West of England. Results will be reported at conferences and in peer-reviewed scientific journals.
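
    The fused activity-speech embeddings mentioned in the analysis plan can be pictured with a simple late-fusion layout. The sketch below is a generic concatenation-based fusion head in PyTorch, not the CUBOId model; the embedding sizes and the two-class output are illustrative assumptions.

```python
# Generic late-fusion sketch (not the CUBOId architecture): concatenate an
# activity embedding and a speech embedding, then classify cognitive status.
# The embedding dimensions and the binary output are illustrative assumptions.
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, activity_dim=64, speech_dim=128, hidden=64, n_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(activity_dim + speech_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, activity_emb, speech_emb):
        fused = torch.cat([activity_emb, speech_emb], dim=-1)
        return self.mlp(fused)

head = FusionHead()
logits = head(torch.randn(8, 64), torch.randn(8, 128))  # batch of 8 participants
print(logits.shape)  # torch.Size([8, 2])
```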

    Speech and natural language processing for the assessment of customer satisfaction and neuro-degenerative diseases

    ABSTRACT: Nowadays, interest in the automatic analysis of speech and text in different scenarios has been increasing. Currently, acoustic analysis is frequently used to extract non-verbal information related to para-linguistic aspects such as articulation and prosody. Linguistic analysis focuses on capturing verbal information from written sources, which can be suitable for evaluating customer satisfaction or, in healthcare applications, for assessing the state of patients with depression or other cognitive conditions. In call centers, many of the speech recordings collected are related to the opinions of customers in different industry sectors. Only a small proportion of these calls are evaluated, and these processes can be automated using acoustic and linguistic analysis. In the assessment of neurodegenerative diseases such as Alzheimer's Disease (AD) and Parkinson's Disease (PD), the symptoms are progressive and directly linked to dementia, cognitive decline, and motor impairments. This calls for continuous evaluation of the neurological state, since patients become dependent, need intensive care, and show a decline in their ability to perform activities of daily living. This thesis proposes methodologies for acoustic and linguistic analyses in different scenarios related to customer satisfaction, cognitive disorders in AD, and depression in PD. The experiments include the evaluation of customer satisfaction, the assessment of genetic AD, linguistic analysis to discriminate PD, depression assessment in PD, and user state modeling based on the arousal plane for the evaluation of customer satisfaction, AD, and depression in PD. The acoustic features are mainly focused on articulation and prosody analyses, while the linguistic features are based on natural language processing techniques. Deep learning approaches based on convolutional and recurrent neural networks are also considered in this thesis.
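
    As a rough illustration of the prosody side of such acoustic analysis, the sketch below extracts fundamental-frequency and energy statistics from a recording with librosa. It is a generic feature extractor, not the thesis pipeline; the file path, pitch range, and feature set are assumptions.

```python
# Rough prosody-feature sketch (not the thesis pipeline): F0 contour statistics
# and frame energy from a speech recording. "speech.wav" is a placeholder path.
import numpy as np
import librosa

y, sr = librosa.load("speech.wav", sr=16000)

# Fundamental frequency via probabilistic YIN; unvoiced frames come back as NaN.
f0, voiced_flag, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr)

features = {
    "f0_mean_hz": float(np.nanmean(f0)),
    "f0_std_hz": float(np.nanstd(f0)),
    "voiced_ratio": float(np.mean(voiced_flag)),
    "energy_mean": float(np.mean(librosa.feature.rms(y=y))),
}
print(features)
```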

    Single-Case Pilot Study For Longitudinal Analysis Of Referential Failures And Sentiment In Schizophrenic Speech From Client-Centered Psychotherapy Recordings

    Though computational linguistic analyses have revealed the presence of distinctly characteristic language features in schizophrenic disordered speech, the relative stability of these language features in longitudinal samples is still unknown. This longitudinal pilot study analyzed schizophrenic disordered speech data from the archival therapy audio recordings of one patient spanning 23 years. End-to-end Neural Coreference Resolution software was used to analyze transcribed speech data from three therapy sessions to identify ambiguous pronouns, referred to as referential failures, which were reviewed and confirmed by multiple raters. Speech samples were analyzed using Google Cloud Natural Language API software for sentiment variables (i.e., score, valence, and magnitude). Referential failures and sentiment variables were analyzed within each session and across all sessions combined to examine the relationships between these variables within single sessions and over the span of 23 years. Results and implications of this study are discussed.
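
    For context on the sentiment variables, Google's document-level sentiment analysis returns a score (the valence of the text) and a magnitude (the overall strength of emotion). The sketch below shows a minimal call with the google-cloud-language client; it assumes configured credentials, uses a made-up sentence, and leaves out the coreference step entirely.

```python
# Minimal sketch of document-level sentiment with the google-cloud-language
# client (assumes application credentials are configured). The coreference
# analysis from the study is not reproduced; the text is a placeholder.
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()
document = language_v1.Document(
    content="I told them it was fine, but it never really was.",
    type_=language_v1.Document.Type.PLAIN_TEXT,
)
response = client.analyze_sentiment(request={"document": document})

sentiment = response.document_sentiment
print(f"score (valence): {sentiment.score:+.2f}, magnitude: {sentiment.magnitude:.2f}")
```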

    Measuring the Severity of Depression from Text using Graph Representation Learning

    The common practice in psychology for measuring the severity of a patient's depressive symptoms is based on an interactive conversation between a clinician and the patient. In this dissertation, we focus on predicting a score representing the severity of depression from the transcript of such a conversation. We first present a generic graph neural network (GNN) to automatically rate severity using patient transcripts. We also test a few sequence-based deep models on the same task. We then propose a novel form for node attributes within a GNN-based model that captures a node-specific embedding for every word in the vocabulary. This provides a global representation of each node, coupled with node-level updates according to associations between words in a transcript. Furthermore, we evaluate the performance of our GNN-based model on a Twitter sentiment dataset, classifying three different sentiments, and on Alzheimer's data, differentiating individuals with Alzheimer's disease from healthy individuals. In addition to applying the GNN model to learn a prediction model from the text, we provide post-hoc explanations of the model's decisions for all three tasks using the model's gradients.
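
    A compact way to picture a GNN-based severity model is graph-level regression over word nodes. The sketch below uses PyTorch Geometric on a toy word graph; it does not reproduce the dissertation's node-attribute scheme, and the graph, features, and dimensions are all illustrative.

```python
# Toy graph-level regression sketch with PyTorch Geometric (not the
# dissertation's model): word nodes with feature vectors, co-occurrence edges,
# and a pooled prediction of a single severity score. Sizes are illustrative.
import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool

class SeverityGNN(torch.nn.Module):
    def __init__(self, in_dim=32, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.out = torch.nn.Linear(hidden, 1)  # regression head: severity score

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index).relu()
        return self.out(global_mean_pool(h, batch))

# One toy transcript graph: 5 word nodes and a few co-occurrence edges.
graph = Data(
    x=torch.randn(5, 32),
    edge_index=torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]], dtype=torch.long),
)
batch = torch.zeros(5, dtype=torch.long)  # all nodes belong to graph 0
score = SeverityGNN()(graph.x, graph.edge_index, batch)
print(score.shape)  # torch.Size([1, 1])
```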

    Only Words Count; the Rest Is Mere Chattering: A Cross-Disciplinary Approach to the Verbal Expression of Emotional Experience

    The analysis of sequences of words and of prosody, meter, and rhythm provided in an interview addressing the capacity to identify and describe emotions represents a powerful tool to reveal emotional processing. The ability to express and identify emotions was analyzed by means of the Toronto Structured Interview for Alexithymia (TSIA), and TSIA transcripts were analyzed with Natural Language Processing to shed light on verbal features. The brain correlates of the capacity to translate emotional experience into words were determined through cortical thickness measures. A machine learning methodology showed that individuals with deficits in identifying and describing emotions (n = 7) produced language distortions, frequently used the present tense of auxiliary verbs, used few possessive determiners, and scarcely connected their speech, in comparison to individuals without deficits (n = 7). Interestingly, they showed high cortical thickness at the left temporal pole and low cortical thickness at the isthmus of the right cingulate cortex. Overall, we identified the neuro-linguistic pattern of the expression of emotional experience.
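
    The lexical markers highlighted above (present-tense auxiliary verbs, possessive determiners, connectedness of the speech) can be counted with off-the-shelf NLP tooling. The sketch below uses spaCy as a generic stand-in for the study's pipeline; the model name, the example sentence, and the exact feature definitions are assumptions.

```python
# Generic sketch (not the study's pipeline) counting a few of the features
# mentioned above with spaCy: present-tense auxiliary verbs, possessive
# determiners, and connectives. Assumes en_core_web_sm is installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("I am fine. It is my problem, but I do not talk about it.")

present_aux = sum(
    1 for t in doc if t.pos_ == "AUX" and "Pres" in t.morph.get("Tense")
)
possessive_det = sum(1 for t in doc if "Yes" in t.morph.get("Poss"))
connectives = sum(1 for t in doc if t.pos_ in ("CCONJ", "SCONJ"))

print({"present_aux": present_aux,
       "possessive_det": possessive_det,
       "connectives": connectives})
```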

    Prediction of Alzheimer's disease and semantic dementia from scene description: toward better language and topic generalization

    Data segmentation by the language and topic of psycholinguistic tests is increasingly becoming a significant obstacle to the generalization of prediction models. It limits our ability to understand the core of linguistic and cognitive dysfunction because the models overfit the details of a particular language or topic. In this work, we study potential approaches to overcome such limitations. We discuss the properties of various FastText word embedding models for English and French and propose a set of features derived from these properties. We show that despite the differences in the languages and the embedding algorithms, a universal language-agnostic set of word-vector features can capture cognitive dysfunction. We argue that in the context of scarce data, hand-crafted word-vector features are a reasonable alternative to feature learning, which allows us to generalize over language and topic boundaries.
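
    To make the word-vector features concrete, the sketch below derives two simple language-agnostic statistics from pretrained FastText vectors loaded through gensim. The model name, the example sentence, and the two statistics are placeholders, not the feature set proposed in the work.

```python
# Illustrative sketch (not the proposed feature set): simple statistics over
# pretrained FastText vectors via gensim. The model name and sentence are
# placeholders; the first call downloads the vectors.
import numpy as np
import gensim.downloader as api

vectors = api.load("fasttext-wiki-news-subwords-300")  # English FastText vectors

tokens = "the family is having a picnic by the lake".split()
vecs = np.array([vectors[t] for t in tokens if t in vectors])

features = {
    # average length of the word vectors in the description
    "mean_vector_norm": float(np.linalg.norm(vecs, axis=1).mean()),
    # average cosine similarity of adjacent words, a crude coherence proxy
    "adjacent_similarity": float(np.mean([
        vectors.similarity(a, b)
        for a, b in zip(tokens, tokens[1:])
        if a in vectors and b in vectors
    ])),
}
print(features)
```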