
    Emotion and Stress Recognition Related Sensors and Machine Learning Technologies

    This book includes impactful chapters which present scientific concepts, frameworks, architectures and ideas on sensing technologies and machine learning techniques. These are relevant in tackling the following challenges: (i) the field readiness and use of intrusive sensor systems and devices for capturing biosignals, including EEG sensor systems, ECG sensor systems and electrodermal activity sensor systems; (ii) the quality assessment and management of sensor data; (iii) data preprocessing, noise filtering and calibration concepts for biosignals; (iv) the field readiness and use of nonintrusive sensor technologies, including visual sensors, acoustic sensors, vibration sensors and piezoelectric sensors; (v) emotion recognition using mobile phones and smartwatches; (vi) body area sensor networks for emotion and stress studies; (vii) the use of experimental datasets in emotion recognition, including dataset generation principles and concepts, quality assurance and emotion elicitation material and concepts; (viii) machine learning techniques for robust emotion recognition, including graphical models, neural network methods, deep learning methods, statistical learning and multivariate empirical mode decomposition; (ix) subject-independent emotion and stress recognition concepts and systems, including facial expression-based systems, speech-based systems, EEG-based systems, ECG-based systems, electrodermal activity-based systems, multimodal recognition systems and sensor fusion concepts; and (x) emotion and stress estimation and forecasting from a nonlinear dynamical system perspective.
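    As a concrete illustration of the sensing-plus-learning themes above, the sketch below low-pass filters a synthetic electrodermal activity (EDA) signal and trains a statistical classifier on simple window features. It is a minimal, hedged example: the sampling rate, cutoff, features, and data are illustrative assumptions, not prescriptions from the book.

```python
# Minimal sketch: filter a synthetic EDA signal, extract per-window statistics,
# and classify stress vs. no-stress. All numbers here are assumptions.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
fs = 32  # assumed EDA sampling rate in Hz

def lowpass(x, cutoff=1.0):
    # EDA is dominated by slow tonic/phasic components; a 1 Hz cutoff is a common choice.
    b, a = butter(4, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, x)

def features(window):
    # Simple statistical features per 60 s window.
    return [window.mean(), window.std(), np.ptp(window), np.gradient(window).mean()]

# Synthetic 60 s windows: "stress" windows get an extra accumulating phasic drift.
X, y = [], []
for label in (0, 1):
    for _ in range(50):
        sig = rng.normal(2.0, 0.05, fs * 60).cumsum() * 0.001 + 2.0
        if label:
            sig += 0.3 * (rng.random(sig.size) < 0.01).cumsum() * 0.01
        X.append(features(lowpass(sig)))
        y.append(label)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("CV accuracy:", cross_val_score(clf, np.array(X), y, cv=5).mean())
```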

    InMyFace: Inertial and Mechanomyography-Based Sensor Fusion for Wearable Facial Activity Recognition

    Recognizing facial activity is a well-understood (but non-trivial) computer vision problem. However, reliable solutions require a camera with a good view of the face, which is often unavailable in wearable settings. Furthermore, in wearable applications, where systems accompany users throughout their daily activities, a permanently running camera can be problematic for privacy (and legal) reasons. This work presents an alternative solution based on the fusion of wearable inertial sensors, planar pressure sensors, and acoustic mechanomyography (muscle sounds). The sensors were placed unobtrusively in a sports cap to monitor facial muscle activities related to facial expressions. We present our integrated wearable sensor system, describe data fusion and analysis methods, and evaluate the system in an experiment with thirteen subjects from different cultural backgrounds (eight countries) and both sexes (six women and seven men). In a one-model-per-user scheme and using a late fusion approach, the system yielded an average F1 score of 85.00% for the case where all sensing modalities are combined. With a cross-user validation and a one-model-for-all-users scheme, an F1 score of 79.00% was obtained for thirteen participants (six females and seven males). Moreover, in a hybrid fusion (cross-user) approach with six classes, an average F1 score of 82.00% was obtained for eight users. The results are competitive with state-of-the-art non-camera-based solutions for a cross-user study. In addition, our unique set of participants demonstrates the inclusiveness and generalizability of the approach. (Comment: Submitted to Information Fusion, Elsevier.)
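    The late fusion scheme mentioned above can be illustrated with a short sketch: one classifier per modality, with class posteriors summed before the final decision. Everything here (the stand-in features for the inertial, pressure, and mechanomyography channels, the logistic models, and the six-class setup) is a synthetic assumption, not the paper's pipeline.

```python
# Minimal late-fusion sketch: train one classifier per modality, then
# combine their predicted class probabilities and score with F1.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
n, n_classes = 600, 6  # e.g., six facial-activity classes (assumed)
y = rng.integers(0, n_classes, n)

# Each modality sees the labels through its own noisy feature view.
modalities = [rng.normal(y[:, None], 2.0 + i, (n, 8)) for i in range(3)]

idx_tr, idx_te = train_test_split(np.arange(n), test_size=0.3, random_state=1)
proba = np.zeros((idx_te.size, n_classes))
for X in modalities:
    clf = LogisticRegression(max_iter=1000).fit(X[idx_tr], y[idx_tr])
    proba += clf.predict_proba(X[idx_te])  # late fusion: sum of posteriors

pred = proba.argmax(axis=1)
print("macro F1:", f1_score(y[idx_te], pred, average="macro"))
```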

    Estimation of the QoE for video streaming services based on facial expressions and gaze direction

    As multimedia technologies evolve, the need to control their quality becomes ever more important, making Quality of Experience (QoE) measurement a key priority. Machine Learning (ML) can support this task by providing models to analyse the information extracted from the multimedia. ML model applications can be divided into the following categories: 1) QoE modelling: ML is used to define QoE models which provide an output (e.g., a perceived QoE score) for any given input (e.g., a QoE influence factor). 2) QoE monitoring in the case of encrypted traffic: ML is used to analyse passively monitored traffic data to gain insight into degradations perceived by end-users. 3) Big data analytics: ML is used to extract meaningful and useful information from the collected data, which can further be converted into actionable knowledge and utilised in managing QoE. The QoE estimation task can be carried out using two approaches: the objective approach and the subjective one. As the two names indicate, they refer to the kind of information that the model analyses. The objective approach analyses objective features extracted from the network connection and from the media used; the state of the art also includes approaches that use features extracted from human behaviour. The subjective approach, instead, derives from the rating approach, where participants are asked to rate the perceived quality on different scales. This approach is time-consuming, and for this reason not all users agree to complete the questionnaire. The direct evolution of this approach is therefore the adoption of ML models: a model can substitute the questionnaire and evaluate the QoE depending on the data it analyses. By modelling the human response to perceived multimedia quality, QoE researchers found that various parameters can be extracted from users, such as electroencephalogram (EEG) brain waves and electrocardiogram (ECG) signals. The main problem with these techniques is the hardware: the user must wear electrodes for ECG and EEG, and even if the results obtained from these methods are relevant, their usage in a real context may not be feasible. For this reason, my studies have focused on developing a completely unobtrusive Machine Learning framework based on facial reactions.
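    A minimal sketch of the unobtrusive direction described above, assuming hypothetical facial and gaze features (brow-furrow intensity, blink rate, gaze-on-screen ratio) mapped to a synthetic mean opinion score by a regressor; the thesis's actual features and model may differ.

```python
# Minimal sketch: regress a synthetic mean opinion score (MOS) from
# hypothetical facial-expression and gaze features of a streaming session.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 300
# Hypothetical per-session features (all assumptions, not the thesis's set).
X = np.column_stack([
    rng.beta(2, 5, n),         # brow-furrow intensity (0..1)
    rng.poisson(15, n) / 60,   # blinks per second
    rng.uniform(0.3, 1.0, n),  # fraction of gaze time on the video region
])
# Synthetic MOS on a 1..5 scale: frustration cues lower it, attention raises it.
mos = np.clip(3 + 2 * X[:, 2] - 3 * X[:, 0] - X[:, 1] + rng.normal(0, 0.3, n), 1, 5)

reg = GradientBoostingRegressor(random_state=2)
print("CV R^2:", cross_val_score(reg, X, mos, cv=5, scoring="r2").mean())
```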

    Facial expression recognition and intensity estimation.

    Doctoral Degree. University of KwaZulu-Natal, Durban. Facial expression is one of the most profound non-verbal channels through which the human emotional state is inferred from the deformation or movement of face components when facial muscles are activated. Facial Expression Recognition (FER) is one of the relevant research fields in Computer Vision (CV) and Human-Computer Interaction (HCI). Its applications include, but are not limited to, robotics, gaming, medicine, education, security and marketing. FER conveys a wealth of information, and categorising this information into primary emotion states only limits its performance. This thesis investigates an approach that simultaneously predicts the emotional state of facial expression images and the corresponding degree of intensity. The task also extends to resolving the ambiguous nature of FER and annotation inconsistencies with a label distribution learning method that considers correlation among data. We first propose a multi-label approach for FER and its intensity estimation using advanced machine learning techniques. According to our findings, this approach had not previously been considered for emotion and intensity estimation in the field. The approach uses problem transformation to present FER as a multi-label task, such that every facial expression image has unique emotion information alongside the corresponding degree of intensity at which the emotion is displayed. A Convolutional Neural Network (CNN) with a sigmoid function at the final layer is the classifier for the model. The model, termed ML-CNN (Multi-label Convolutional Neural Network), successfully achieves concurrent prediction of emotion and intensity estimation. ML-CNN prediction is challenged by overfitting and by intra-class and inter-class variations. We employ a Visual Geometry Group (VGG-16) pretrained network to resolve the overfitting challenge, and the aggregation of island loss and binary cross-entropy loss to minimise the effect of intra-class and inter-class variations. The enhanced ML-CNN model shows promising results, outperforming other standard multi-label algorithms. Finally, we approach data annotation inconsistency and ambiguity in FER data using Isomap manifold learning with Graph Convolutional Networks (GCN). The GCN uses the distance along the Isomap manifold as the edge weight, which appropriately models the similarity between adjacent nodes for emotion prediction. The proposed method produces promising results in comparison with state-of-the-art methods. The author's list of publications is on page xi of this thesis.
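    The core ML-CNN idea, a network whose sigmoid outputs treat each emotion label and each intensity level as an independent binary target trained with binary cross-entropy, can be sketched as follows. The layer sizes and the 6-emotion/4-intensity label space are illustrative assumptions, not the thesis's exact architecture or its VGG-16 variant.

```python
# Minimal multilabel-CNN sketch: sigmoid head over emotion + intensity labels,
# trained with binary cross-entropy (here folded into BCEWithLogitsLoss).
import torch
import torch.nn as nn

N_EMOTIONS, N_INTENSITIES = 6, 4  # assumed label-space sizes

class MultiLabelCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Each output is an independent Bernoulli probability after the sigmoid.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 12 * 12, N_EMOTIONS + N_INTENSITIES),
        )

    def forward(self, x):  # x: (batch, 1, 48, 48) grayscale face crops
        return self.head(self.features(x))  # raw logits; sigmoid applied in the loss

model = MultiLabelCNN()
x = torch.randn(8, 1, 48, 48)  # dummy batch of face images
target = torch.randint(0, 2, (8, N_EMOTIONS + N_INTENSITIES)).float()
loss = nn.BCEWithLogitsLoss()(model(x), target)  # per-label binary cross-entropy
loss.backward()
print("loss:", loss.item())
```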

    A novel NMF-based DWI CAD framework for prostate cancer.

    In this thesis, a computer-aided diagnostic (CAD) framework for detecting prostate cancer in DWI data is proposed. The proposed CAD method consists of two frameworks that use nonnegative matrix factorization (NMF) to learn meaningful features from sets of high-dimensional data. The first technique is a three-dimensional (3D) level-set DWI prostate segmentation algorithm guided by a novel probabilistic speed function. This speed function is driven by the features learned by NMF from 3D appearance, shape, and spatial data. The second technique is a probabilistic classifier that seeks to label a prostate segmented from DWI data as either malignant, containing cancer, or benign, containing no cancer. This approach uses NMF-based feature fusion to create a feature space where data classes are clustered. In addition, using DWI data acquired at a wide range of b-values (i.e., magnetic field strengths) is investigated. Experimental analysis indicates that, for both of these frameworks, using NMF produces more accurate segmentation and classification results, respectively, and that combining the information from DWI data at several b-values can assist in detecting prostate cancer.
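    A minimal sketch of the NMF feature-learning step described above: a nonnegative data matrix is factored into parts-based components and a classifier operates in the reduced space. The data is synthetic, and the thesis's level-set segmentation and feature-fusion machinery are not reproduced here.

```python
# Minimal sketch: NMF dimensionality reduction feeding a classifier.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
# 200 nonnegative "voxel-intensity" vectors drawn from two latent classes.
y = rng.integers(0, 2, 200)
basis = rng.random((2, 50))
X = np.abs(rng.normal(0, 0.05, (200, 50))) + basis[y]

pipe = make_pipeline(
    NMF(n_components=8, init="nndsvda", max_iter=500, random_state=3),
    LogisticRegression(max_iter=1000),
)
print("CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```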

    Role of machine learning in early diagnosis of kidney diseases.

    Machine learning (ML) and deep learning (DL) approaches have been used as indispensable tools in modern artificial intelligence-based computer-aided diagnostic (AI-based CAD) systems that can provide non-invasive, early, and accurate diagnosis of a given medical condition. These AI-based CAD systems have proven themselves to be reproducible and have the generalization ability to diagnose new unseen cases with several diseases and medical conditions in different organs (e.g., kidneys, prostate, brain, liver, lung, breast, and bladder). In this dissertation, we will focus on the role of such AI-based CAD systems in the early diagnosis of two kidney diseases, namely: acute rejection (AR) post kidney transplantation and renal cancer (RC). A new renal computer-assisted diagnostic (Renal-CAD) system was developed to precisely diagnose AR post kidney transplantation at an early stage. The developed Renal-CAD system performs the following main steps: (1) auto-segmentation of the renal allograft from surrounding tissues from diffusion-weighted magnetic resonance imaging (DW-MRI) and blood oxygen level-dependent MRI (BOLD-MRI); (2) extraction of image markers, namely: voxel-wise apparent diffusion coefficients (ADCs), calculated from DW-MRI scans at 11 different low and high b-values and then represented as cumulative distribution functions (CDFs), and transverse relaxation rate (R2*) values, extracted from the segmented kidneys using BOLD-MRI scans at different echo times; (3) integration of the multimodal image markers with the associated clinical biomarkers, serum creatinine (SCr) and creatinine clearance (CrCl); and (4) diagnosing renal allograft status as non-rejection (NR) or AR by utilizing these integrated biomarkers and the developed deep learning classification model built on stacked auto-encoders (SAEs). Using a leave-one-subject-out cross-validation approach along with SAEs on a total of 30 patients with transplanted kidneys (AR = 10 and NR = 20), the Renal-CAD system demonstrated 93.3% accuracy, 90.0% sensitivity, and 95.0% specificity in differentiating AR from NR. Robustness of the Renal-CAD system was also confirmed by an area under the curve (AUC) value of 0.92. Using a stratified 10-fold cross-validation approach, the Renal-CAD system demonstrated its reproducibility and robustness with a diagnostic accuracy of 86.7%, sensitivity of 80.0%, specificity of 90.0%, and AUC of 0.88. In addition, a new renal cancer CAD (RC-CAD) system for precise diagnosis of RC at an early stage was developed, which incorporates the following main steps: (1) estimating morphological features by applying a new parametric spherical harmonic technique; (2) extracting appearance-based features, namely: first-order textural features, and second-order textural features extracted after constructing the gray-level co-occurrence matrix (GLCM); (3) estimating functional features by constructing wash-in/wash-out slopes to quantify the enhancement variations across different contrast-enhanced computed tomography (CE-CT) phases; and (4) integrating all the aforementioned features and modeling a two-stage multilayer perceptron artificial neural network (MLP-ANN) classifier to classify the renal tumor as benign or malignant and identify the malignancy subtype. On a total of 140 RC patients (malignant = 70 patients (ccRCC = 40 and nccRCC = 30) and benign angiomyolipoma tumors = 70), the developed RC-CAD system was validated using a leave-one-subject-out cross-validation approach.
The developed RC-CAD system achieved a sensitivity of 95.3% ± 2.0%, a specificity of 99.9% ± 0.4%, and a Dice similarity coefficient of 0.98 ± 0.01 in differentiating malignant from benign renal tumors, as well as an overall accuracy of 89.6% ± 5.0% in the sub-typing of RCC. The diagnostic abilities of the developed RC-CAD system were further validated using a randomly stratified 10-fold cross-validation approach. The results obtained using the proposed MLP-ANN classification model outperformed other machine learning classifiers (e.g., support vector machines, random forests, and relational functional gradient boosting) as well as other approaches from the literature. In summary, machine and deep learning approaches have shown the potential to be utilized in building AI-based CAD systems, as evidenced by the promising diagnostic performance obtained by both the Renal-CAD and RC-CAD systems. For the Renal-CAD, the integration of functional markers extracted from multimodal MRIs with clinical biomarkers using an SAE classification model potentially improved the final diagnostic results, as evidenced by high accuracy, sensitivity, and specificity. The developed Renal-CAD demonstrated high feasibility and efficacy for early, accurate, and non-invasive identification of AR. For the RC-CAD, integrating morphological, textural, and functional features extracted from CE-CT images using an MLP-ANN classification model enhanced the final results in terms of accuracy, sensitivity, and specificity, making the proposed RC-CAD a reliable noninvasive diagnostic tool for RC. The early and accurate diagnosis of AR or RC will help physicians provide early intervention with the appropriate treatment plan to prolong the life span of the diseased kidney, increase the survival chances of the patient, and thus improve healthcare outcomes in the U.S. and worldwide.
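One ingredient of the Renal-CAD pipeline, representing voxel-wise ADC values as a cumulative distribution function (CDF) sampled at fixed points and evaluating with leave-one-subject-out cross-validation, can be sketched as below. The stacked auto-encoder classifier is replaced by logistic regression for brevity, and all ADC values are synthetic.

```python
# Minimal sketch: ADC CDF features + leave-one-subject-out cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(4)
grid = np.linspace(0.5e-3, 3.0e-3, 20)  # ADC sample points (mm^2/s, assumed)

def cdf_features(adc_voxels):
    # Fraction of voxels with ADC at or below each grid point.
    return [(adc_voxels <= g).mean() for g in grid]

# 30 synthetic subjects: acute rejection (AR) shifts ADC downward (assumption).
X, y = [], []
for label in [1] * 10 + [0] * 20:  # 10 AR, 20 non-rejection, as in the abstract
    mean_adc = 1.4e-3 if label else 1.9e-3
    X.append(cdf_features(rng.normal(mean_adc, 0.4e-3, 5000)))
    y.append(label)

clf = LogisticRegression(max_iter=1000)
acc = cross_val_score(clf, np.array(X), y, cv=LeaveOneOut()).mean()
print("LOSO accuracy:", acc)
```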

    Handling Class Imbalance Using Swarm Intelligence Techniques, Hybrid Data and Algorithmic Level Solutions

    This research focuses mainly on the binary class imbalance problem in data mining. It investigates the use of combined approaches of data and algorithmic level solutions. Moreover, it examines the use of swarm intelligence and population-based techniques to combat the class imbalance problem at all levels, including the data, algorithmic, and feature levels. It also introduces various solutions to the class imbalance problem, in which swarm intelligence techniques like Stochastic Diffusion Search (SDS) and Dispersive Flies Optimisation (DFO) are used. The algorithms were evaluated using experiments on imbalanced datasets, in which the Support Vector Machine (SVM) was used as a classifier. SDS was used to perform informed undersampling of the majority class to balance the dataset. The results indicate that this algorithm improves the classifier performance and can be used on imbalanced datasets. Moreover, SDS was extended further to perform feature selection on high-dimensional datasets. Experimental results show that SDS can be used to perform feature selection and improve the classifier performance on imbalanced datasets. Further experiments evaluated DFO as an algorithmic level solution to optimise the SVM kernel parameters when learning from imbalanced datasets. Based on the promising results of DFO in these experiments, the novel approach was extended further to provide a hybrid algorithm that simultaneously optimises the kernel parameters and performs feature selection.
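    The data-level idea, rebalancing by undersampling the majority class before training an SVM, can be sketched as follows. The informed, SDS-driven selection from the thesis is replaced here by plain random undersampling, so this is an assumption-laden simplification rather than the proposed algorithm.

```python
# Minimal sketch: majority-class undersampling before SVM training,
# compared against training on the raw imbalanced data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=5)

rng = np.random.default_rng(5)
maj, mino = np.where(y_tr == 0)[0], np.where(y_tr == 1)[0]
keep = rng.choice(maj, size=mino.size, replace=False)  # shrink majority class
idx = np.concatenate([keep, mino])

for name, (Xf, yf) in {"imbalanced": (X_tr, y_tr),
                       "undersampled": (X_tr[idx], y_tr[idx])}.items():
    clf = SVC(kernel="rbf").fit(Xf, yf)
    print(name, "minority-class F1:", round(f1_score(y_te, clf.predict(X_te)), 3))
```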

    Alzheimer’s Dementia Recognition Through Spontaneous Speech


    CAD system for early diagnosis of diabetic retinopathy based on 3D extracted imaging markers.

    This dissertation makes significant contributions to the field of ophthalmology, addressing the segmentation of retinal layers and the diagnosis of diabetic retinopathy (DR). The first contribution is a novel 3D segmentation approach that leverages the patient-specific anatomy of retinal layers. This approach demonstrates superior accuracy in segmenting all retinal layers from a 3D retinal image compared to current state-of-the-art methods. It also offers enhanced speed, enabling potential clinical applications. The proposed segmentation approach holds great potential for supporting surgical planning and guidance in retinal procedures such as retinal detachment repair or macular hole closure. Surgeons can benefit from the accurate delineation of retinal layers, enabling better understanding of the anatomical structure and more effective surgical interventions. Moreover, real-time guidance systems can be developed to assist surgeons during procedures, improving overall patient outcomes. The second contribution of this dissertation is the introduction of a novel computer-aided diagnosis (CAD) system for precise identification of diabetic retinopathy. The CAD system utilizes 3D-OCT imaging and employs an innovative approach that extracts two distinct features: first-order reflectivity and 3D thickness. These features are then fused and used to train and test a neural network classifier. The proposed CAD system exhibits promising results, surpassing other machine learning and deep learning algorithms commonly employed in DR detection. This demonstrates the effectiveness of the comprehensive analysis approach employed by the CAD system, which considers both low-level and high-level data from the 3D retinal layers. The CAD system presents a groundbreaking contribution to the field, as it goes beyond conventional methods, optimizing backpropagated neural networks to integrate multiple levels of information effectively. By achieving superior performance, the proposed CAD system showcases its potential in accurately diagnosing DR and aiding in the prevention of vision loss. In conclusion, this dissertation presents novel approaches for the segmentation of retinal layers and the diagnosis of diabetic retinopathy. The proposed methods exhibit significant improvements in accuracy, speed, and performance compared to existing techniques, opening new avenues for clinical applications and advancements in the field of ophthalmology. By addressing future research directions, such as testing on larger datasets, exploring alternative algorithms, and incorporating user feedback, the proposed methods can be further refined and developed into robust, accurate, and clinically valuable tools for diagnosing and monitoring retinal diseases.
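    The feature-fusion step described above, concatenating per-layer reflectivity and thickness markers and training a backpropagated neural network, can be sketched as follows. The twelve layers and all feature values are synthetic stand-ins for the dissertation's 3D-OCT markers.

```python
# Minimal sketch: fuse reflectivity and thickness features per retinal layer,
# then classify DR vs. normal with a small backpropagated neural network.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n, layers = 200, 12
y = rng.integers(0, 2, n)  # 0 = normal, 1 = diabetic retinopathy (synthetic)

reflectivity = rng.normal(0.5 + 0.1 * y[:, None], 0.1, (n, layers))
thickness = rng.normal(50 - 5 * y[:, None], 5, (n, layers))  # microns (assumed)
X = np.hstack([reflectivity, thickness])  # fused feature vector per eye

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=6)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```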

    Fear Classification using Affective Computing with Physiological Information and Smart-Wearables

    International Mention in the doctoral degree. Among the 17 Sustainable Development Goals (SDGs) proposed within the 2030 Agenda and adopted by all of the United Nations member states, the fifth SDG is a call for action to effectively turn gender equality into a fundamental human right and an essential foundation for a better world. It includes the eradication of all types of violence against women. From the technological perspective, the range of available solutions intended to prevent this social problem is very limited. Moreover, most of the solutions are based on a panic-button approach, leaving aside the usage and integration of current state-of-the-art technologies, such as the Internet of Things (IoT), affective computing, cyber-physical systems, and smart sensors. Thus, the main purpose of this research is to provide new insight into the design and development of tools to prevent and combat Gender-based Violence risky situations, and even aggressions, from a technological perspective, but without leaving aside the different sociological considerations directly related to the problem. To achieve such an objective, we rely on the application of affective computing from a realist point of view, i.e. targeting the generation of systems and tools capable of being implemented and used nowadays or within an achievable time-frame. This pragmatic vision is channelled through: 1) an exhaustive study of the existing technological tools and mechanisms oriented to the fight against Gender-based Violence; 2) the proposal of a new smart-wearable system intended to deal with some of the technological limitations encountered; 3) a novel fear-related emotion classification approach to disentangle the relation between emotions and physiology; and 4) the definition and release of a new multi-modal dataset for emotion recognition in women. Firstly, different fear classification systems using a reduced set of physiological signals are explored and designed. This is done by employing open datasets together with a combination of time-, frequency- and non-linear-domain techniques. This design process is encompassed by trade-offs between physiological considerations and embedded capabilities; the latter are of paramount importance due to the edge-computing focus of this research. Two results are highlighted in this first task: the fear classification system that employed the DEAP dataset, achieving an AUC of 81.60% and a G-mean of 81.55% on average for a subject-independent approach with only two physiological signals; and the fear classification system that employed the MAHNOB dataset, achieving an AUC of 86.00% and a G-mean of 73.78% on average for a subject-independent approach with only three physiological signals and a Leave-One-Subject-Out configuration. A detailed comparison with other emotion recognition systems proposed in the literature is presented, which shows that the obtained metrics are in line with the state-of-the-art. Secondly, Bindi is presented. This is an end-to-end autonomous multimodal system leveraging affective IoT through auditory and physiological commercial off-the-shelf smart sensors, hierarchical multisensorial fusion, and a secured server architecture to combat Gender-based Violence by automatically detecting risky situations based on a multimodal intelligence engine and then triggering a protection protocol. Specifically, this research focuses on the hardware and software design of one of the two edge-computing devices within Bindi.
This is a bracelet integrating three physiological sensors, actuators, power-monitoring integrated chips, and a System-on-Chip with wireless capabilities. Within this context, different embedded design space explorations are presented: embedded filtering evaluation, online physiological signal quality assessment, feature extraction, and power consumption analysis. The reported results in all these processes are successfully validated and, for some of them, even compared against standard physiological measurement equipment. Amongst the different results obtained regarding the embedded design and implementation within the bracelet of Bindi, it should be highlighted that its low power consumption provides a battery life of approximately 40 hours when using a 500 mAh battery. Finally, the particularities of our use case and the scarcity of open multimodal datasets dealing with emotional immersive technology, labelling methodology considering the gender perspective, balanced stimuli distribution regarding the target emotions, and recovery processes based on the physiological signals of the volunteers to quantify and isolate the emotional activation between stimuli, led us to the definition and elaboration of the Women and Emotion Multi-modal Affective Computing (WEMAC) dataset. This is a multimodal dataset in which 104 women who had never experienced Gender-based Violence performed different emotion-related stimuli visualisations in a laboratory environment. The previous binary fear classification systems were improved and applied to this novel multimodal dataset. For instance, the proposed multimodal fear recognition system using this dataset reports up to 60.20% accuracy and a 67.59% F1-score, which are competitive results in comparison with state-of-the-art systems dealing with similar multi-modal use cases. In general, this PhD thesis has opened a new research line within the research group in which it was developed. Moreover, this work has established a solid base from which to expand knowledge and continue research targeting the generation of both mechanisms to help vulnerable groups and socially oriented technology. Doctoral Programme in Electrical, Electronic and Automation Engineering, Universidad Carlos III de Madrid. Committee: President: David Atienza Alonso; Secretary: Susana Patón Álvarez; Member: Eduardo de la Torre Arnan.
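The subject-independent evaluation protocol highlighted above, Leave-One-Subject-Out cross-validation scored with AUC and G-mean, can be sketched as follows. The subjects, physiological features, and classifier are synthetic placeholders, not the thesis's fear-recognition systems.

```python
# Minimal sketch: Leave-One-Subject-Out evaluation with AUC and G-mean.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, recall_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(7)
subjects = np.repeat(np.arange(15), 40)              # 15 subjects, 40 windows each
y = rng.integers(0, 2, subjects.size)                # fear vs. no-fear labels
X = rng.normal(y[:, None], 1.5, (subjects.size, 6))  # stand-ins for GSR/BVP/temperature features

aucs, gmeans = [], []
for tr, te in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    p = clf.predict_proba(X[te])[:, 1]
    pred = (p > 0.5).astype(int)
    sens = recall_score(y[te], pred)               # sensitivity (true positive rate)
    spec = recall_score(y[te], pred, pos_label=0)  # specificity (true negative rate)
    aucs.append(roc_auc_score(y[te], p))
    gmeans.append(np.sqrt(sens * spec))

print(f"mean AUC: {np.mean(aucs):.3f}  mean G-mean: {np.mean(gmeans):.3f}")
```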