18 research outputs found

MultiMediate '22: Backchannel Detection and Agreement Estimation in Group Interactions

Author: André Elisabeth
Bulling Andreas
Dietz Michael
Gebhard Patrick
Lindsay Hali
Müller Philipp
Schiller Dominik
Thomas Dominike
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/09/2022

Backchannels, i.e. short interjections of the listener, serve important meta-conversational purposes like signifying attention or indicating agreement. Despite their key role, automatic analysis of backchannels in group interactions has been largely neglected so far. The MultiMediate challenge addresses, for the first time, the tasks of backchannel detection and agreement estimation from backchannels in group conversations. This paper describes the MultiMediate challenge and presents a novel set of annotations consisting of 7234 backchannel instances for the MPIIGroupInteraction dataset. Each backchannel was additionally annotated with the extent to which it expresses agreement towards the current speaker. In addition to an analysis of the collected annotations, we present baseline results for both challenge tasks. Comment: ACM Multimedia 202

arXiv.org e-Print Archive

Multilingual Learning for Mild Cognitive Impairment Screening from a Clinical Speech Task

Author: Kröger Insa
König Alexandra
Lindsay Hali
Linz Nicklas
Müller Philipp
Ramakers Inez H.G.B.
Tröger Johannes
Verhey Frans R.J.
Zeghari Radia
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2021

The Semantic Verbal Fluency Task (SVF) is an efficient and minimally invasive speech-based screening tool for Mild Cognitive Impairment (MCI). In the SVF, testees have to produce as many words for a given semantic category as possible within 60 seconds. State-of-the-art approaches for automatic evaluation of the SVF employ word embeddings to analyze semantic similarities in these word sequences. While these approaches have proven promising in a variety of test languages, the small amount of data available for any given language limits the performance. In this paper, we for the first time investigate multilingual learning approaches for MCI classification from the SVF in order to combat data scarcity. To allow for cross-language generalisation, these approaches either rely on translation to a shared language, or make use of several distinct word embeddings. In evaluations on a multilingual corpus of older French, Dutch, and German participants (Controls=66, MCI=66), we show that our multilingual approaches clearly improve over single-language baselines

Maastricht University Research Portal

Language Impairment in Alzheimer’s Disease—Robust and Explainable Evidence for AD-Related Deterioration of Spontaneous Speech Through Multilingual Machine Learning

Author: König Alexandra
Lindsay Hali
Tröger Johannes
Publication venue: Frontiers Media SA
Publication date: 19/05/2021

Alzheimer’s disease (AD) is a pervasive neurodegenerative disease that affects millions worldwide and is most prominently associated with broad cognitive decline, including language impairment. Picture description tasks are routinely used to monitor language impairment in AD. Because an in-depth analysis of the resulting spontaneous speech requires substantial manual effort, advanced natural language processing (NLP) combined with machine learning (ML) represents a promising opportunity. In this applied research field, though, NLP and ML methodology do not necessarily ensure robust, clinically actionable insights into cognitive language impairment in AD, and additional precautions must be taken to ensure clinical validity and generalizability of results. In this study, we add generalizability through multilingual feature statistics to computational approaches for the detection of language impairment in AD. We include 154 participants (78 healthy subjects, 76 patients with AD) from two different languages (106 English speaking and 47 French speaking). Each participant completed a picture description task, in addition to a battery of neuropsychological tests. Each response was recorded and manually transcribed. From this, task-specific, semantic, syntactic and paralinguistic features are extracted using NLP resources. Using inferential statistics, we determined language features, excluding task-specific features, that are significant in both languages and therefore represent “generalizable” signs for cognitive language impairment in AD. In a second step, we evaluated all features as well as the generalizable ones for English, French and both languages in a binary discrimination ML scenario (AD vs. healthy) using a variety of classifiers. The generalizable language feature set outperforms the all language feature set in English, French and the multilingual scenarios. Semantic features are the most generalizable while paralinguistic features show no overlap between languages. The multilingual model shows an equal distribution of error in both English and French. By leveraging multilingual statistics combined with a theory-driven approach, we identify AD-related language impairment that generalizes beyond a single corpus or language to model language impairment as a clinically relevant cognitive symptom. We find a primary impairment in semantics in addition to mild syntactic impairment, possibly confounded by additional impaired cognitive functions.

INRIA a CCSD electronic archive server

PubMed Central

The importance of sharing patient-generated clinical speech and language data

Author: Fraser Kathleen
König Alexandra
Lindsay Hali
Linz Nicklas
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2019

Increased access to large datasets has driven progress in NLP. However, most computational studies of clinically-validated, patient-generated speech and language involve very few datapoints, as such data are difficult (and expensive) to collect. In this position paper, we argue that we must find ways to promote data sharing across research groups, in order to build datasets of a more appropriate size for NLP and machine learning analysis. We review the benefits and challenges of sharing clinical language data, and suggest several concrete actions by both clinical and NLP researchers to encourage multi-site and multidisciplinary data sharing. We also propose the creation of a collaborative data sharing platform, to allow NLP researchers to take a more active responsibility for data transcription, annotation, and curation.

Crossref

INRIA a CCSD electronic archive server

The use of artificial intelligence and automatic speech and image analysis for remote cognitive testing

Author: König Alexandra
Lindsay Hali
Ramakers Inez H.G.B.
Tröger Johannes
Publication venue: HAL CCSD
Publication date: 25/10/2019

INRIA a CCSD electronic archive server

Language Modelling for the Clinical Semantic Verbal Fluency Task

Author: Alexandersson Jan
Lindsay Hali
König Alexandra
Linz Nicklas
Peter Jessica
Robert Philippe
Tröger Johannes
Publication venue
Publication date: 05/08/2018

Semantic Verbal Fluency (SVF) tests are common neuropsychological tasks, in which patients are asked to name as many words belonging to a semantic category as they can in 60 seconds. These tests are sensitive to even early forms of dementia caused by e.g. Alzheimer's disease. Performance is usually measured as the total number of correct responses. Clinical research has shown that not only the raw count, but also production strategy is a relevant clinical marker. We employed language modelling (LM) as a natural technique to model production in this task. Comparing different LMs, we show that perplexity of a person's SVF production predicts dementia well (F1 = 0.83). Demented patients show significantly lower perplexity and are thus more predictable. Persons in advanced stages of dementia differ in predictability of word choice and production strategy, whereas people in early stages differ only in predictability of production strategy.

Bern Open Repository and Information System (BORIS)

Patients with amnestic MCI Fail to Adapt Executive Control When Repeatedly Tested with Semantic Verbal Fluency Tasks

Author: Klöppel Stefan
Kray Jutta
Lindsay Hali
Linz Nicklas
Mina Mario
Peter Jessica
Tröger Johannes
Publication venue: Cambridge University Press (CUP)
Publication date: 01/07/2022

Objective: Semantic verbal fluency (SVF) tasks require individuals to name items from a specified category within a fixed time. An impaired SVF performance is well documented in patients with amnestic Mild Cognitive Impairment (aMCI). The two leading theoretical views suggest either loss of semantic knowledge or impaired executive control to be responsible. Method: We assessed SVF 3 times on 2 consecutive days in 29 healthy controls (HC) and 29 patients with aMCI to determine which of the two views holds true. Results: When doing the task for the first time, patients with aMCI produced fewer and more common words with a shorter mean response latency. When tested repeatedly, only healthy volunteers increased performance. Likewise, only the performance of HC indicated two distinct retrieval processes: a prompt retrieval of readily available items at the beginning of the task and an active search through semantic space towards the end. With repeated assessment, the pool of readily available items became larger in HC, but not in patients with aMCI. Conclusion: The production of fewer and more common words in aMCI points to a smaller search set and supports the loss of semantic knowledge view. The failure to improve performance as well as the lack of distinct retrieval processes point to an additional impairment in executive control. Our data did not clearly favour one theoretical view over the other, but rather indicate that the impairment of patients with aMCI in SVF is due to a combination of both.

Bern Open Repository and Information System (BORIS)

Generating Synthetic Clinical Speech Data Through Simulated ASR Deletion Error

Author: Alexandersson Jan
Lindsay Hali
Linz Nicklas
Mina Mario
Müller Philipp
Ramakers Inez
Tröger Johannes
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2022

Training classification models on clinical speech is a time-saving and effective solution for many healthcare challenges, such as screening for Alzheimer’s Disease over the phone. One of the primary limiting factors of the success of artificial intelligence (AI) solutions is the amount of relevant data available. Clinical data is expensive to collect, not sufficient for large-scale machine learning or neural methods, and often not shareable between institutions due to data protection laws. With the increasing demand for AI in health systems, generating synthetic clinical data that maintains the nuance of underlying patient pathology is the next pressing task. Previous work has shown that automated evaluation of clinical speech tasks via automatic speech recognition (ASR) is comparable to manually annotated results in diagnostic scenarios even though ASR systems produce errors during the transcription process. In this work, we propose to generate synthetic clinical data by simulating ASR deletion errors on the transcript to produce additional data. We compare the synthetic data to the real data with traditional machine learning methods to test the feasibility of the proposed method. Using a dataset of 50 cognitively impaired and 50 control Dutch speakers, ten additional data points are synthetically generated for each subject, increasing the training size from 100 to 1000 training points. We find consistent and comparable performance of models trained on only synthetic data (AUC=0.77) to real data (AUC=0.77) in a variety of traditional machine learning scenarios. Additionally, linear models are not able to distinguish between real and synthetic data.

Maastricht University Research Portal

Language Modelling for the Clinical Semantic Verbal Fluency Task

Author: Alexandersson Jan
König Alexandra
Lindsay Hali
Linz Nicklas
Peter Jessica
Robert Philippe
Tröger Johannes
Publication venue: HAL CCSD
Publication date: 08/05/2018

Semantic Verbal Fluency (SVF) tests are common neuropsychological tasks, in which patients are asked to name as many words belonging to a semantic category as they can in 60 seconds. These tests are sensitive to even early forms of dementia caused by e.g. Alzheimer's disease. Performance is usually measured as the total number of correct responses. Clinical research has shown that not only the raw count, but also production strategy is a relevant clinical marker. We employed language modelling (LM) as a natural technique to model production in this task. Comparing different LMs, we show that perplexity of a person's SVF production predicts dementia well (F1 = 0.83). Demented patients show significantly lower perplexity and are thus more predictable. Persons in advanced stages of dementia differ in predictability of word choice and production strategy, whereas people in early stages differ only in predictability of production strategy.

INRIA a CCSD electronic archive server

Bern Open Repository and Information System (BORIS)