Deep Learning-based Cognitive Impairment Diseases Prediction and Assistance using Multimodal Data

Author: Ortiz Pérez David
Publication date: 25/07/2023

In this project, we propose a mobile robot-based system capable of analyzing data from elderly people and patients with cognitive impairment diseases, such as aphasia or dementia. The project entails the deployment of two primary tasks that will be performed by the robot. The first task is the detection of these diseases in their early stages to initiate professional treatment, thereby improving the patient's quality of life. The other task focuses on automatic emotion detection, particularly during interactions with other people, in this case, clinicians. Additionally, the project aims to examine how the combination of different modalities, such as audio or text, can influence the model's results. Extensive research has been conducted on various dementia and aphasia datasets, as well as the implemented tasks. For this purpose, we utilized the DementiaBank and AphasiaBank datasets, which contain multimodal data in different formats, including video, audio, and audio transcriptions. We employed diverse models for the prediction task, including Convolutional Neural Networks for audio classification, Transformers for text classification, and a multimodal model combining both approaches. These models were evaluated on a separate test set, and the best results were obtained with the text modality, which reached 90.36% accuracy in detecting dementia. Additionally, we conducted a detailed analysis of the available data to explain the results obtained and to examine the model's explainability. The pipeline for automatic emotion recognition was evaluated by manually reviewing initial frames of one hundred randomly selected video samples from the dataset. This pipeline was also employed to recognize emotions in both healthy participants and those with aphasia. The study revealed that individuals with aphasia express different emotional states than healthy individuals when listening to someone's speech, primarily because of their difficulties in understanding and producing speech, which negatively affects their mood. Analyzing their emotional state can facilitate improved interactions by avoiding conversations that may have a negative impact on their mood, thus providing better assistance.

Repositorio Institucional de la Universidad de Alicante

Natural Language Processing: Emerging Neural Approaches and Applications

Publication venue: 'MDPI AG'
Publication date: 06/05/2022

This Special Issue highlights the most recent research being carried out in the NLP field to discuss related open issues, with a particular focus on emerging approaches for language learning, understanding, production, and grounding interactively or autonomously from data in cognitive and neural systems, as well as on their potential or real applications in different domains.

Directory of Open Access Books (DOAB)

Sequence labeling to detect stuttering events in read speech

Author: Alharbi S., Brumfitt S., Green P., Hasan M., Simons A.J.H.
Publication venue: 'Elsevier BV'
Publication date: 01/07/2020

Stuttering is a speech disorder that, if treated during childhood, may be prevented from persisting into adolescence. A clinician must first determine the severity of stuttering, assessing a child during a conversational or reading task and recording each instance of disfluency, either in real time or after transcribing the recorded session and analysing the transcript. The current study evaluates the ability of two machine learning approaches, namely conditional random fields (CRF) and bi-directional long short-term memory (BLSTM), to detect stuttering events in transcriptions of stuttering speech. The two approaches are compared for their performance both on ideal hand-transcribed data and on the output of automatic speech recognition (ASR). We also study the effect of data augmentation to improve performance. A corpus of 35 speakers’ read speech (13K words) was supplemented with a corpus of 63 speakers’ spontaneous speech (11K words) and an artificially-generated corpus (50K words). Experimental results show that, without feature engineering, BLSTM classifiers outperform CRF classifiers by 33.6%. However, adding features to support the CRF classifier yields performance improvements of 45% and 18% over the CRF baseline and BLSTM results, respectively. Moreover, adding more data to train the CRF and BLSTM classifiers consistently improves the results.

White Rose Research Online

Optimizing text mining methods for improving biomedical natural language processing

Author: Mehryary Farrokh
Publication venue: University of Turku (Turun yliopisto)
Publication date: 04/02/2022

The overwhelming amount and the increasing rate of publication in the biomedical domain make it difficult for life sciences researchers to acquire and maintain all the information that is necessary for their research. PubMed (the primary citation database for the biomedical literature) currently contains over 21 million article abstracts, and more than one million of them were published in 2020 alone. Even though existing article databases provide capable keyword search services, typical everyday queries usually return thousands of relevant articles. For instance, a cancer research scientist may need to acquire a complete list of genes that interact with the BRCA1 (breast cancer 1) gene. The PubMed keyword search for BRCA1 returns over 16,500 article abstracts, making manual inspection of the retrieved documents impractical. Missing even one of the interacting gene partners in this scenario may jeopardize the successful development of a potential new drug or vaccine. Although manually curated databases of biomolecular interactions exist, they are usually not up to date and they require notable human effort to maintain. To summarize, new discoveries are constantly being shared within the community via scientific publishing, but unfortunately the probability of missing vital information for research in life sciences is increasing. In response to this problem, the biomedical natural language processing (BioNLP) community of researchers has emerged. It strives to assist life sciences researchers by building modern language processing and text mining tools that can be applied at large scale to scan the whole publicly available literature and to extract, classify, and aggregate the information found within. Such tools keep life sciences researchers up to date with recent relevant discoveries and facilitate their research in numerous fields such as molecular biology, biomedical engineering, bioinformatics, genetic engineering, and biochemistry.
My research has almost exclusively focused on biomedical relation and event extraction tasks. These foundational information extraction tasks deal with the automatic detection of biological processes, interactions, and relations described in the biomedical literature. More precisely, biomedical relation and event extraction systems can scan through a vast amount of biomedical text and automatically detect and extract the semantic relations of biomedical named entities (e.g., genes, proteins, chemical compounds, and diseases). The structured outputs of such systems (i.e., the extracted relations or events) can be stored as relational databases or molecular interaction networks, which can easily be queried, filtered, analyzed, visualized, and integrated with other structured data sources. Extracting biomolecular interactions has always been the primary interest of BioNLP researchers because knowledge about such interactions is crucially important in various research areas, including precision medicine, drug discovery, drug repurposing, hypothesis generation, construction and curation of signaling pathways, and protein function and structure prediction. State-of-the-art relation and event extraction methods are based on supervised machine learning, requiring manually annotated data for training. Manual annotation for the biomedical domain requires domain expertise and is time-consuming. Hence, having minimal training data for building information extraction systems is a common case in the biomedical domain. This calls for the development of methods that can make the most of the available training data, and this thesis gathers all my research efforts and contributions in that direction. It is worth mentioning that biomedical natural language processing has undergone a revolution since I started my research in this field almost ten years ago.
As a member of the BioNLP community, I have witnessed the emergence, improvement, and in some cases the disappearance of many methods, each pushing the performance of the best previous method one step further. I can broadly divide the last ten years into three periods. When I started my research, feature-based methods that relied on heavy feature engineering were dominant and popular. Then, significant advancements in hardware technology, as well as several breakthroughs in algorithms and methods, enabled machine learning practitioners to seriously utilize artificial neural networks for real-world applications. In this period, convolutional, recurrent, and attention-based neural network models became dominant and superior. Finally, the introduction of transformer-based language representation models such as BERT and GPT impacted the field and resulted in unprecedented performance improvements on many data sets. When reading this thesis, I ask the reader to take into account the course of history and to judge the methods and results based on what could have been done in that particular period.

UTUPub