20 research outputs found

    PT-Net: A Multi-Model Machine Learning Approach for Smarter Next-Generation Wearable Tremor Suppression Devices for Parkinson's Disease Tremor

    According to the World Health Organization (WHO), Parkinson's Disease (PD) is the second most common neurodegenerative condition and can cause tremor and other motor and non-motor symptoms. Medication and deep brain stimulation (DBS) are often used to treat tremor; however, medication is not always effective and has adverse effects, and DBS is invasive and carries a significant risk of complications. Wearable tremor suppression devices (WTSDs) have been proposed as a possible alternative, but their effectiveness is limited by the tremor models they use, which introduce a phase delay that decreases device performance. Additionally, the availability of tremor datasets is limited, which prevents the rapid advancement of these devices. To address these challenges, PD tremor data were collected at the Wearable Biomechatronics Laboratory (WearMe Lab) to develop methods and data-driven models that improve the performance of WTSDs in managing tremor, with the potential to be integrated with the wearable tremor suppression glove being developed at the WearMe Lab. A predictive model was introduced and showed improved motion estimation, with an average estimation accuracy of 99.2%. The model was also able to predict motion multiple steps ahead, negating the phase delay introduced by previous models and achieving prediction accuracies of 97%, 94%, 92%, and 90% for predicting voluntary motion 10, 20, 50, and 100 steps ahead, respectively. Tremor and task classification models were also developed, with mean classification accuracies of 91.2% and 91.1%, respectively. These models can be used to fine-tune the parameters of existing estimators based on the type of tremor and task, increasing their suppression capabilities. To address the absence of a mathematical model for generating tremor data and the limited access to existing PD tremor datasets, an open-source generative model was developed to produce data with characteristics, distribution, and patterns similar to real data. The reliability of the generated data was evaluated using four different methods, showing that the generative model can produce data with a distribution, patterns, and characteristics similar to real data. The development of data-driven models and methods to improve the performance of wearable tremor suppression devices for Parkinson's disease can potentially offer a noninvasive and effective alternative to medication and deep brain stimulation. The proposed predictive model, classification models, and open-source generative model provide a promising framework for the advancement of wearable technology for tremor suppression, potentially leading to a significant improvement in the quality of life for individuals with Parkinson's disease.
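    To make the multi-step-ahead idea concrete, the following Python sketch trains a small LSTM regressor to predict a signal several steps ahead. It is not the PT-Net architecture itself (this abstract does not detail its layers); the sine-like signal, window length, and prediction horizon are placeholder assumptions.

    # Minimal sketch of multi-step-ahead motion prediction; an LSTM regressor on a
    # synthetic sine-like signal stands in for the thesis's predictive model.
    import torch
    import torch.nn as nn

    class MotionPredictor(nn.Module):
        def __init__(self, hidden=64, horizon=10):
            super().__init__()
            self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, horizon)   # predicts `horizon` future samples

        def forward(self, x):                        # x: (batch, window, 1)
            out, _ = self.lstm(x)
            return self.head(out[:, -1, :])          # (batch, horizon)

    # Synthetic stand-in for voluntary motion: a slow sine plus noise.
    t = torch.linspace(0, 100, 10_000)
    signal = torch.sin(0.5 * t) + 0.05 * torch.randn_like(t)

    window, horizon = 100, 10                        # e.g. predict 10 steps ahead
    X = torch.stack([signal[i:i + window]
                     for i in range(len(signal) - window - horizon)])
    Y = torch.stack([signal[i + window:i + window + horizon]
                     for i in range(len(signal) - window - horizon)])

    model = MotionPredictor(horizon=horizon)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for epoch in range(5):                           # a few epochs for illustration only
        opt.zero_grad()
        loss = loss_fn(model(X.unsqueeze(-1)), Y)
        loss.backward()
        opt.step()
        print(f"epoch {epoch}: MSE {loss.item():.5f}")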

    DEEP-AD: The deep learning model for diagnostic classification and prognostic prediction of Alzheimer's disease

    In terms of context, the aim of this dissertation is to aid neuroradiologists in their clinical judgment regarding the early detection of Alzheimer's disease (AD) by using deep learning (DL). To that aim, the system design research methodology is adopted in this dissertation to achieve three goals. The first goal is to investigate the DL models that have performed well at identifying patterns associated with AD, as well as the accuracy attained so far, their limitations, and the remaining gaps. A systematic literature review (SLR) revealed a shortage of empirical studies on the early identification of AD through DL. In this regard, thirteen empirical studies were identified and examined. We concluded that three-dimensional (3D) DL models have been generated far less often and that their performance is also inadequate to qualify them for clinical trials. The second goal is to provide the neuroradiologist with the computer-interpretable information they need to analyze neuroimaging biomarkers. Given this context, the next step in this dissertation is to find the optimal DL model to analyze neuroimaging biomarkers. This has been achieved in two steps. In the first step, eight state-of-the-art DL models were implemented by training from scratch using end-to-end learning (E2EL) for two binary classification tasks (AD vs. CN and AD vs. stable MCI) and compared by utilizing MRI scans from publicly accessible neuroimaging biomarker datasets. Comparative analysis was carried out by utilizing efficiency-effects graphs, comprehensive indicators, and ranking mechanisms. For the training of the AD vs. sMCI task, the EfficientNet-B0 model obtained the highest value of the comprehensive indicator and has the fewest parameters. DenseNet264 performed better than the others in terms of evaluation metrics, but since it has the most parameters, it costs more to train. For the AD vs. CN task, DenseNet264 achieved 100% training accuracy and 99.56% testing accuracy. However, the classification accuracy was still only 82.5% for the AD vs. sMCI task. In the second step, a fusion of transfer learning (TL) with E2EL was applied to train EfficientNet-B0 for the AD vs. sMCI task, which achieved 95.29% training accuracy and 93.10% testing accuracy. Additionally, we implemented EfficientNet-B0 for the multiclass AD vs. CN vs. sMCI classification task with E2EL, to be used in an ensemble of models, and achieved 85.66% training accuracy and 87.38% testing accuracy. To evaluate the model's robustness, neuroradiologists must validate the implemented model. As a result, the third goal of this dissertation is to create a tool that neuroradiologists may use at their convenience. To achieve this objective, this dissertation proposes a web-based application (DEEP-AD) created as an ensemble of EfficientNet-B0 and DenseNet264 (based on the contribution of goal 2). The DEEP-AD prototype has undergone repeated evaluation and improvement. First, we validated it on 41 subjects from Spanish MRI datasets (acquired from HT Medica, Madrid, Spain), achieving an accuracy of 82.90%, which was later verified by neuroradiologists. The results of these evaluation studies showed the accomplishment of the stated goals and indicated relevant directions for future research in applied DL for the early detection of AD in clinical settings.
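    As a hedged illustration of the second step above (fusing transfer learning with end-to-end fine-tuning of EfficientNet-B0 for the binary AD vs. sMCI task), the Python sketch below loads ImageNet-pretrained weights via torchvision and assumes 2D MRI slices as input; the dissertation's actual preprocessing, hyperparameters, and data handling are not reproduced, and all names are illustrative.

    # Sketch only: pretrained EfficientNet-B0 with a replaced classification head,
    # fine-tuned end-to-end on a placeholder batch standing in for MRI slices.
    import torch
    import torch.nn as nn
    from torchvision import models

    def build_ad_smci_model(num_classes: int = 2) -> nn.Module:
        weights = models.EfficientNet_B0_Weights.IMAGENET1K_V1   # transfer learning
        model = models.efficientnet_b0(weights=weights)
        in_features = model.classifier[1].in_features            # 1280 for EfficientNet-B0
        model.classifier[1] = nn.Linear(in_features, num_classes)
        return model

    model = build_ad_smci_model()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)   # whole network trainable (E2EL)
    criterion = nn.CrossEntropyLoss()

    # One illustrative training step on random data (AD vs. sMCI labels are dummies).
    x = torch.randn(4, 3, 224, 224)
    y = torch.randint(0, 2, (4,))
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()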

    Detection and Prediction of Freezing of Gait in Parkinson’s Disease using Wearable Sensors and Machine Learning

    Freezing of gait (FOG) is a brief episodic absence of forward body progression despite the intention to walk. Appearing mostly in mid-to-late-stage Parkinson's disease (PD), freezing manifests as a sudden loss of lower-limb function and is closely linked to falling, decreased functional mobility, and loss of independence. Wearable-sensor-based devices can detect freezes already in progress and intervene by delivering auditory, visual, or tactile stimuli called cues. Cueing has been shown to reduce FOG duration and allow walking to continue. However, FOG detection and cueing systems require data from the freeze episode itself and are thus unable to prevent freezing. Anticipating the FOG episode before onset and supplying a timely cue could prevent the freeze from occurring altogether. FOG has been predicted in offline analyses by training machine learning models to identify wearable-sensor signal patterns known to precede FOG. The most commonly used sensors for FOG detection and prediction are inertial measurement units (IMUs) that include an accelerometer, a gyroscope, and sometimes a magnetometer. Currently, the best FOG prediction systems use data collected from multiple sensors on various body locations to develop person-specific models. Multi-sensor systems are more complex and may be challenging to integrate into real-life assistive devices. The ultimate goal of FOG prediction systems is a user-friendly assistive device that can be used by anyone experiencing FOG. To achieve this goal, person-independent models with high FOG prediction performance and a minimal number of conveniently located sensors are needed. The objectives of this thesis were: to develop and evaluate FOG detection and prediction models using IMU and plantar pressure data; to determine whether event-based or period-of-gait-disruption FOG definitions give better classification performance for FOG detection and prediction; and to evaluate FOG prediction models that use a single unilateral plantar pressure insole sensor or bilateral sensors. In this thesis, IMU (accelerometer and gyroscope) and plantar pressure insole sensors were used to collect data from 11 people with FOG while they walked a freeze-provoking path. A custom-made synchronization and labeling program was used to synchronize the IMU and plantar pressure data and annotate FOG episodes. Data were divided into overlapping 1 s windows with a 0.2 s shift between consecutive windows. Time-domain, Fourier-transform-based, and wavelet-transform-based features were extracted from the data. A total of 861 features were extracted from each of the 71,000 data windows. To evaluate the effectiveness of FOG detection and prediction models using plantar pressure and IMU data features, three feature sets were compared: plantar pressure, IMU, and both plantar pressure and IMU features. Minimum-redundancy maximum-relevance (mRMR) and Relief-F feature selection were performed prior to training boosted ensembles of decision trees. The binary classification models identified Total-FOG or Non-FOG states, wherein the Total-FOG class included windows with data from 2 s before the FOG onset until the end of the FOG episode. The plantar-pressure-only model had the greatest sensitivity, and the IMU-only model had the greatest specificity. The best overall model used the combination of plantar pressure and IMU features, achieving 76.4% sensitivity and 86.2% specificity.
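    A minimal Python sketch of the windowing-and-classification pipeline described above follows: 1 s windows with a 0.2 s shift, a few time-domain features, univariate feature selection standing in for mRMR/Relief-F, and a boosted ensemble of decision trees. The sampling rate, channel count, and labels are synthetic placeholders rather than the thesis data.

    # Sliding-window feature extraction and a boosted decision-tree classifier (sketch).
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    fs = 100                                    # assumed sampling rate (Hz)
    win, shift = fs, int(0.2 * fs)              # 1 s windows, 0.2 s shift
    signal = np.random.randn(60 * fs, 6)        # placeholder: 6 IMU/plantar-pressure channels
    labels = np.random.randint(0, 2, 60 * fs)   # placeholder per-sample FOG labels

    X, y = [], []
    for start in range(0, len(signal) - win, shift):
        w = signal[start:start + win]
        # a few illustrative time-domain features per channel
        feats = np.concatenate([w.mean(0), w.std(0), np.abs(np.diff(w, axis=0)).mean(0)])
        X.append(feats)
        y.append(int(labels[start:start + win].mean() > 0.5))   # window label by majority
    X, y = np.array(X), np.array(y)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = make_pipeline(SelectKBest(f_classif, k=10),
                        GradientBoostingClassifier(random_state=0))
    clf.fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))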
Next, the Total-FOG class components were evaluated individually (i.e., Pre-FOG windows, freeze windows, and transition windows between Pre-FOG and FOG). The best model, which used plantar pressure and IMU features, detected windows that contained both Pre-FOG and FOG data with 85.2% sensitivity, which is equivalent to detecting FOG less than 1 s after the freeze began. Models using both plantar pressure and IMU features performed better than models that used either sensor type alone. Datasets used to train machine learning models often generate ground-truth FOG labels based on visual observation of specific lower-limb movements (event-based definition) or an overall inability to walk effectively (period-of-gait-disruption definition). FOG definition ambiguity may affect FOG detection and prediction model performance, especially with respect to multiple FOG episodes in rapid succession. This research examined the effects of defining FOG either as a period of gait disruption (merging successive FOG episodes) or based on an event (no merging) on FOG detection and prediction. Plantar pressure and lower-limb acceleration data were used to extract a set of features and train decision tree ensembles. FOG was labeled using an event-based definition. Additional datasets were then produced by merging FOG episodes that occurred in rapid succession. A merging threshold was introduced, whereby FOG episodes separated by less than the threshold were merged into one episode (a minimal merging rule is sketched after this abstract). FOG detection and prediction models were trained for merging thresholds of 0, 1, 2, and 3 s. Merging had little effect on FOG detection model performance; however, for the prediction model, merging resulted in slightly later FOG identification and lower precision. FOG prediction models may therefore benefit from using event-based FOG definitions and avoiding merging multiple FOG episodes in rapid succession. Despite the known asymmetry of PD motor symptom manifestation, the difference between the more severely affected side (MSS) and the less severely affected side (LSS) is rarely considered in FOG detection and prediction studies. The additional information provided by the MSS or LSS, if any, may be beneficial to FOG prediction models, especially if using a single sensor. To examine the effect of using data from the MSS, LSS, or both limbs, multiple FOG prediction models were trained and compared. Three datasets were created using plantar pressure data from the MSS, the LSS, and both sides together. Feature selection was performed, and FOG prediction models were trained using the top 5, 10, 15, 20, 25, or 30 features for each dataset. The best models were the MSS model with 15 features and the LSS and bilateral models with 5 features each. The LSS model reached the highest sensitivity (79.5%) and identified the highest percentage of FOG episodes (94.9%). The MSS model achieved the highest specificity (84.9%) and the lowest false positive (FP) rate (2 FP/walking trial). Overall, the bilateral model was best: it had 77.3% sensitivity, 82.9% specificity, and identified 94.3% of FOG episodes an average of 1.1 s before FOG onset. Compared to the bilateral model, the LSS model had a higher false positive rate; however, the bilateral and LSS models were similar in all other evaluation metrics. Therefore, using the LSS model instead of the bilateral model would produce similar FOG prediction performance at the cost of slightly more false positives. Given the advantages of single-sensor systems, the increased FP rate may be acceptable.
Therefore, a single plantar pressure sensor placed on the LSS could be used to develop a FOG prediction system with performance similar to that of a bilateral system.
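    The merging rule referenced in this abstract can be illustrated with a short Python sketch: FOG episodes whose gap is shorter than the merging threshold are fused into a single period-of-gait-disruption episode. The episode list and threshold values are illustrative, not taken from the thesis dataset.

    # Merge successive FOG episodes separated by less than a threshold (sketch).
    def merge_fog_episodes(episodes, threshold_s):
        """episodes: list of (start_s, end_s) tuples, in seconds."""
        merged = []
        for start, end in sorted(episodes):
            if merged and start - merged[-1][1] < threshold_s:
                merged[-1] = (merged[-1][0], max(end, merged[-1][1]))   # fuse with previous episode
            else:
                merged.append((start, end))
        return merged

    episodes = [(10.0, 12.5), (13.0, 14.0), (20.0, 21.0)]
    print(merge_fog_episodes(episodes, threshold_s=1.0))   # -> [(10.0, 14.0), (20.0, 21.0)]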

    Signal processing and analytics of multimodal biosignals

    Biosignals have been extensively studied by researchers for applications in diagnosis, therapy, and monitoring. As these signals are complex, they have to be crafted into features for machine learning to work. This raises the question of how to extract features that are relevant yet invariant to uncontrolled extraneous factors. In the last decade or so, deep learning has been used to extract features from the raw signals automatically. Furthermore, with the proliferation of sensors, more raw signals are now available, making it possible to use multi-view learning to improve on the predictive performance of deep learning. The purpose of this work is to develop an effective deep learning model of the biosignals and make use of the multi-view information in the sequential data. This thesis describes two proposed methods, namely: (1) the use of a deep temporal convolution network to provide the temporal context of the signals to the deeper layers of a deep belief net, and (2) the use of multi-view spectral embedding to blend the complementary data in an ensemble. This work uses several annotated biosignal datasets that are available in the open domain. They are non-stationary, noisy, and non-linear signals. Using these signals in their raw form without feature engineering would yield poor results with traditional machine learning techniques. By passing more useful abstractions through the deep belief net and blending the complementary data in an ensemble, performance improves in terms of accuracy and variance, as shown by the results of 10-fold validation.
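    As a hedged illustration of the first proposed method (a deep temporal convolution network that supplies temporal context before a deeper model), the Python sketch below pools 1D convolutional features over time; a plain fully connected head stands in for the deep belief net, and the channel count and input length are assumptions.

    # Temporal convolution front end for raw biosignals (sketch, not the thesis architecture).
    import torch
    import torch.nn as nn

    class TemporalConvFrontEnd(nn.Module):
        def __init__(self, channels=1, n_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(channels, 16, kernel_size=7, padding=3), nn.ReLU(),
                nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),                 # collapse the time axis to one context vector
            )
            self.head = nn.Sequential(nn.Flatten(), nn.Linear(32, n_classes))

        def forward(self, x):                            # x: (batch, channels, time)
            return self.head(self.features(x))

    model = TemporalConvFrontEnd()
    x = torch.randn(8, 1, 3000)                          # e.g. 30 s of a 100 Hz biosignal
    print(model(x).shape)                                # torch.Size([8, 2])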

    Analysis and automatic identification of spontaneous emotions in speech from human-human and human-machine communication

    This research mainly focuses on improving our understanding of human-human and human-machine interactions by analysing participants' emotional status. For this purpose, we have developed and enhanced Speech Emotion Recognition (SER) systems for both types of interaction in real-life scenarios, with an explicit emphasis on the Spanish language. In this framework, we have conducted an in-depth analysis of how humans express emotions using speech when communicating with other persons or machines in actual situations. Thus, we have analysed and studied the way in which emotional information is expressed in a variety of true-to-life environments, which is a crucial aspect for the development of SER systems. This study aimed to comprehensively understand the challenge we wanted to address: identifying emotional information in speech using machine learning technologies. Neural networks have been demonstrated to be adequate tools for identifying events in speech and language. Most of them aimed to make local comparisons between some specific aspects; thus, the experimental conditions were tailored to each particular analysis. The experiments across the different articles (from P1 to P19) are hardly comparable due to our continuous learning in dealing with the difficult task of identifying emotions in speech. In order to make a fair comparison, additional unpublished results are presented in the Appendix. These experiments were carried out under identical and rigorous conditions. This general comparison offers an overview of the advantages and disadvantages of the different methodologies for the automatic recognition of emotions in speech.
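    As a loose illustration of a basic SER front end, the Python sketch below pools MFCC statistics over an utterance and fits a simple classifier; the thesis itself uses neural networks on real-life Spanish corpora, so the synthetic audio, dummy labels, and SVM here are placeholder assumptions only.

    # Utterance-level MFCC statistics with a simple classifier (sketch).
    import numpy as np
    import librosa
    from sklearn.svm import SVC

    def utterance_features(y, sr=16000, n_mfcc=13):
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)           # (n_mfcc, frames)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])     # pooled statistics

    # Placeholder "utterances": noise bursts standing in for neutral vs. emotional speech.
    rng = np.random.default_rng(0)
    X = np.array([utterance_features(rng.standard_normal(16000)) for _ in range(40)])
    y = np.array([0] * 20 + [1] * 20)                                    # dummy emotion labels

    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict(X[:5]))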

    XV. Magyar Számítógépes Nyelvészeti Konferencia


    State of the art of audio- and video based solutions for AAL

    Working Group 3: Audio- and Video-based AAL Applications
    It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need to take action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive with respect to the hindrance other wearable sensors may cause to one's activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they have a large sensing range, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals' activities and health status can be derived from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature.
A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethically aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential arising from the silver economy is overviewed.

    Deep learning of brain asymmetry digital biomarkers to support early diagnosis of cognitive decline and dementia

    Early identification of degenerative processes in the human brain is essential for proper care and treatment. This may involve different instrumental diagnostic methods, including the most common: computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET) scans. These technologies provide detailed information about the shape, size, and function of the human brain. Structural and functional cerebral changes can be detected by computational algorithms and used to diagnose dementia and its stages (amnestic early mild cognitive impairment, EMCI; Alzheimer's Disease, AD), and they can help monitor the progression of the disease. Shifts in the degree of asymmetry between the left and right hemispheres can indicate the onset or progression of a pathological process in the brain. In this vein, this study proposes a new digital biomarker for the diagnosis of early dementia based on the detection of image asymmetries and a cross-sectional comparison of NC (cognitively normal), EMCI, and AD subjects. Features of brain asymmetry extracted from MRI scans in the ADNI and OASIS databases are used to analyze structural brain changes and for machine learning classification of the pathology. The experimental part of the study includes results of supervised machine learning algorithms and transfer learning architectures of convolutional neural networks for distinguishing between cognitively normal subjects and patients with early or progressive dementia. The proposed pipeline offers a low-cost imaging biomarker for the classification of dementia. It can also potentially be helpful for other degenerative brain disorders accompanied by changes in brain asymmetry.
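    One simple way to quantify the left-right asymmetry exploited above is sketched in Python below: flip a midline-aligned MRI slice about the sagittal axis and take a normalised absolute difference. The study's actual feature extraction and ADNI/OASIS preprocessing are richer; the slice and the index definition here are illustrative assumptions.

    # Left-right asymmetry index for a single 2D slice (sketch).
    import numpy as np

    def asymmetry_index(slice_2d: np.ndarray) -> float:
        """Mean |L - R| / (|L| + |R|), assuming the midline lies at the image centre."""
        mirrored = np.flip(slice_2d, axis=1)                  # left-right flip
        num = np.abs(slice_2d - mirrored)
        den = np.abs(slice_2d) + np.abs(mirrored) + 1e-8      # avoid division by zero
        return float((num / den).mean())

    slice_2d = np.random.rand(256, 256)                       # placeholder MRI slice
    print("asymmetry index:", asymmetry_index(slice_2d))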