    Intelligent Advanced User Interfaces for Monitoring Mental Health Wellbeing

    It has become pressing to develop objective, automatic measurements integrated into intelligent diagnostic tools for detecting and monitoring depressive states, enabling more precise diagnoses and clinical decision-making. The challenge is to exploit behavioral and physiological biomarkers and develop Artificial Intelligence (AI) models able to extract information from a complex combination of signals considered key symptoms. The proposed AI models should help clinicians rapidly formulate accurate diagnoses and suggest personalized intervention plans ranging from coaching activities (exploiting, for example, serious games) and support networks (via chats or social networks) to alerts to caregivers, doctors, and care control centers, reducing the considerable burden on national health care institutions in terms of the medical and social costs associated with depression care.

    Internet of Things Enabled Technologies for Behaviour Analytics in Elderly Person Care: A Survey

    Advances in sensor technology over recent years have provided new ways for researchers to monitor the elderly in uncontrolled environments. Sensors have become smaller and cheaper and can be worn on the body, potentially creating a network of sensors. Smartphones are also more common in the average household and can provide some behavioural analysis through their built-in sensors. As a result, researchers are able to monitor behaviours in a more natural setting, which can lead to more useful data. This is important for those who may be suffering from mental illness, as it allows continuous, non-invasive monitoring in order to diagnose symptoms from different behaviours. However, there are various challenges that need to be addressed, ranging from issues with sensors to the involvement of human factors. It is vital that these challenges are taken into consideration along with the major behavioural symptoms that can appear in an elderly person. For a person suffering with dementia, the application of sensor technologies can improve the person's quality of life and also monitor the progress of the disease through behavioural analysis. This paper considers the behaviours that can be associated with dementia and how these behaviours can be monitored through sensor technology. We also provide an insight into some sensors and algorithms gathered through a survey, in order to set out the advantages and disadvantages of these technologies and to present any challenges that future research may face.

    Predicting depression and suicidal tendencies by analyzing online activities using machine learning in android devices

    Artificial Intelligence (AI) has brought about a profound transformation in the realm of technology, with Machine Learning (ML) within AI playing a crucial role in today's healthcare systems. Advanced systems with intellectual abilities resembling those of humans are being created and used to carry out intricate tasks. Applications such as object recognition, classification, Optical Character Recognition (OCR), and Natural Language Processing (NLP), among others, have started producing impressive results with algorithms trained on the very large datasets now readily available. In view of the socio-economic implications of the pandemic threat posed to the world by COVID-19, this research aims to improve the quality of life of people suffering from mild depression by diagnosing symptoms in a timely manner using AI on Android devices, especially phones. In cases of severe depression, which is highly likely to lead to suicide, valuable lives can also be saved if adequate help can be dispatched to such patients in time. This can be achieved through automatic analysis of users’ data, including text messages, emails, voice calls and internet search history, among other mobile phone activities, using text mining (text analytics), the process of deriving meaningful information from natural language text. Machine Learning models continuously analyse users’ behaviour from text and voice communications and data, identifying any negative tendencies in the behaviour over a certain period of time; this information is then used to make inferences about the mental health state of the patient and to request appropriate healthcare immediately, before it is too late. In this research, an Android application capable of performing the aforementioned tasks in real time has been developed and tested for various performance features, with an average accuracy of 95%.
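
    The idea of spotting negative tendencies "over a certain period of time" can be illustrated with a minimal sketch. The abstract does not describe the actual model, so everything here is an assumption made for illustration: the toy lexicon `NEGATIVE_TERMS`, the per-message score, the rolling window, and the threshold are all hypothetical stand-ins for a trained classifier.

```python
from collections import deque

# Hypothetical toy lexicon; the paper does not publish its vocabulary or model.
NEGATIVE_TERMS = {"hopeless", "worthless", "alone", "tired", "empty"}


def negativity(message: str) -> float:
    """Fraction of tokens in the message that appear in the toy lexicon."""
    tokens = [t.strip(".,!?;:").lower() for t in message.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(t in NEGATIVE_TERMS for t in tokens) / len(tokens)


def flag_sustained_negativity(messages, window=3, threshold=0.25):
    """Return the message indices at which the rolling mean score over the
    last `window` messages reaches `threshold` (both values are assumptions)."""
    recent = deque(maxlen=window)
    flagged = []
    for i, msg in enumerate(messages):
        recent.append(negativity(msg))
        if len(recent) == window and sum(recent) / window >= threshold:
            flagged.append(i)
    return flagged


messages = [
    "I feel hopeless",
    "so tired and empty",
    "lunch was great",
    "alone again, worthless",
    "hopeless and tired",
]
print(flag_sustained_negativity(messages))
```

    Averaging over a window rather than reacting to a single message reflects the abstract's emphasis on behaviour over a period of time; a real system would replace the lexicon score with a validated model.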

    Natural Language Processing Methods for Acoustic and Landmark Event-Based Features in Speech-Based Depression Detection

    The processing of speech as an explicit sequence of events is common in automatic speech recognition (linguistic events), but has received relatively little attention in paralinguistic speech classification despite its potential for characterizing broad acoustic event sequences. This paper proposes a framework for analyzing speech as a sequence of acoustic events, and investigates its application to depression detection. In this framework, acoustic space regions are tokenized to 'words' representing speech events at fixed or irregular intervals. This tokenization allows the exploitation of acoustic word features using proven natural language processing methods. A key advantage of this framework is its ability to accommodate heterogeneous event types: herein we combine acoustic words and speech landmarks, which are articulation-related speech events. Another advantage is the option to fuse such heterogeneous events at various levels, including the embedding level. Evaluation of the proposed framework on both controlled, laboratory-grade supervised audio recordings and unsupervised, self-administered smartphone recordings highlights its merits across both datasets, with the proposed landmark-dependent acoustic words achieving improvements in F1(depressed) of up to 15% and 13% for SH2-FS and DAIC-WOZ respectively, relative to acoustic speech baseline approaches.
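
    The tokenization step can be sketched as follows, assuming that acoustic space regions are represented by a codebook of centroids and that each frame is mapped to its nearest centroid. The codebook values, the `w<k>` token naming, and the use of bigram counts as the downstream text feature are illustrative assumptions, not details taken from the paper; landmarks and embedding-level fusion are not covered.

```python
from collections import Counter


def nearest_centroid(frame, codebook):
    """Index of the codebook entry closest to `frame` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(frame, codebook[k])),
    )


def tokenize(frames, codebook):
    """Map each acoustic frame to an 'acoustic word' such as 'w2'."""
    return [f"w{nearest_centroid(frame, codebook)}" for frame in frames]


def ngram_counts(tokens, n=2):
    """Bag of n-grams over the token sequence, as used for ordinary text."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical 2-D codebook and frames, for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
frames = [(0.1, 0.0), (0.9, 1.2), (0.1, 1.9), (0.8, 0.9)]

tokens = tokenize(frames, codebook)
print(tokens)
print(ngram_counts(tokens))
```

    Once speech is a token sequence, standard text tooling such as n-gram counts can be applied without change, which is the point the abstract makes about reusing natural language processing methods.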

    Artificial Intelligence for Suicide Assessment using Audiovisual Cues: A Review

    Death by suicide is the seventh leading cause of death worldwide. Recent advances in Artificial Intelligence (AI), specifically AI applications in image and voice processing, have created a promising opportunity to revolutionize suicide risk assessment. Consequently, we have witnessed a fast-growing body of research that applies AI to extract audiovisual non-verbal cues for mental illness assessment. However, the majority of recent works focus on depression, despite the evident difference between depression symptoms and suicidal behavior and non-verbal cues. This paper reviews recent works that study suicide ideation and suicide behavior detection through audiovisual feature analysis, mainly the analysis of suicidal voice/speech acoustic features and suicidal visual cues. Automatic suicide assessment is a promising research direction that is still in its early stages. Accordingly, there is a lack of large datasets that can be used to train machine learning and deep learning models proven to be effective in other, similar tasks. Comment: Manuscript submitted to Artificial Intelligence Reviews (2022)

    Dual-level segmentation method for feature extraction enhancement strategy in speech emotion recognition

    The speech segmentation approach can be one of the significant factors contributing to the overall performance of a Speech Emotion Recognition (SER) system. An utterance may contain more than one perceived emotion, and the boundaries between changes of emotion within an utterance are challenging to determine. Speech segmented through the conventional fixed window does not correspond to the signal changes: because the segment point is random, an arbitrarily segmented frame is produced, and the segment boundary might fall within the sentence or in between emotional changes. This study introduced an improvement to segment-based segmentation on a fixed-window Relative Time Interval (RTI) by using a Signal Change (SC) segmentation approach to discover the signal boundary with respect to the signal transition. A segment-based feature extraction enhancement strategy using a dual-level segmentation method was proposed: RTI-SC segmentation, which builds on the conventional approach. Instead of segmenting the whole utterance at the relative time interval, this study implements peak analysis to obtain segment boundaries defined by the maximum peak value within each temporary RTI segment. In peak selection, over-segmentation might occur due to connections with the input signal, impacting the boundary selection decision. Two approaches to finding the maximum peaks were implemented: first, peak selection by distance allocation, and second, peak selection by a maximum function. The substitution of the temporary RTI segment with the segment based on signal change was intended to better capture high-level statistical features within the signal transition. The signal's prosodic, spectral, and wavelet properties were integrated to build a fine feature set based on the proposed method. Thirty-six low-level descriptors and 12 statistical features, with their derivatives, were extracted from each segment, resulting in a fixed vector dimension. 
Correlation-based Feature Subset Selection (CFS) with the Best First search method was applied for dimensionality reduction before a Support Vector Machine (SVM) with Sequential Minimal Optimization (SMO) was used for classification. The performance of the feature fusion constructed from the proposed method was evaluated through speaker-dependent and speaker-independent tests on the EMO-DB and RAVDESS databases. The results indicated that the prosodic and spectral features derived from the dual-level segmentation method offered a higher recognition rate for most speaker-independent tasks, with a significant improvement in overall accuracy, reaching 82.2% (150 features), the highest among the segmentation approaches used in this study. The proposed method outperformed the baseline approach in single-emotion assessment with both the full-dimension and the optimized feature sets. The highest accuracy for each emotion was mostly obtained with the proposed method. On the EMO-DB database, accuracy improved for happy (67.6%), anger (89%), fear (85.5%), and disgust (79.3%), while neutral and sadness obtained accuracy similar to the baseline method (91% and 93.5%, respectively). An accuracy of 100% for the boredom emotion (female speaker) was observed in the speaker-dependent test, the highest single-emotion result reported in this study.
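
    The dual-level idea (temporary RTI segments first, then boundaries placed at the maximum peak inside each one) can be sketched roughly as below. This is an interpretation of the abstract rather than the authors' implementation: the function names, the use of absolute amplitude as the "peak" criterion, and the `min_distance` rule standing in for "peak selection by distance allocation" are all assumptions.

```python
def rti_bounds(n_samples, n_intervals):
    """Split [0, n_samples) into n_intervals near-equal temporary RTI segments."""
    edges = [round(i * n_samples / n_intervals) for i in range(n_intervals + 1)]
    return list(zip(edges[:-1], edges[1:]))


def peak_boundaries(signal, n_intervals, min_distance=0):
    """Pick the sample with the largest absolute amplitude in each temporary
    RTI segment; drop peaks closer than `min_distance` samples to the last
    accepted peak (an assumed way to limit over-segmentation)."""
    peaks = []
    for start, end in rti_bounds(len(signal), n_intervals):
        if start == end:
            continue  # empty interval when the signal is very short
        idx = max(range(start, end), key=lambda i: abs(signal[i]))
        if not peaks or idx - peaks[-1] >= min_distance:
            peaks.append(idx)
    return peaks


def segments_from_boundaries(signal, boundaries):
    """Cut the signal at the chosen boundaries instead of at the RTI edges."""
    cuts = [0] + [b for b in boundaries if 0 < b < len(signal)] + [len(signal)]
    return [signal[a:b] for a, b in zip(cuts[:-1], cuts[1:])]


# Toy signal with one clear peak in each third.
signal = [0.1, 0.9, 0.2, 0.1, -0.8, 0.3, 0.2, 0.1, 0.7, 0.0, 0.1, 0.2]
bounds = peak_boundaries(signal, n_intervals=3)
print(bounds)
print([len(s) for s in segments_from_boundaries(signal, bounds)])
```

    Statistical features would then be computed per resulting segment; the feature set, CFS selection, and SVM stages described in the abstract are outside this sketch.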

    What you say or how you say it? Depression detection through joint modeling of linguistic and acoustic aspects of speech

    Depression is one of the most common mental health issues. (It affects more than 4% of the world’s population, according to recent estimates.) This article shows that the joint analysis of linguistic and acoustic aspects of speech allows one to discriminate between depressed and nondepressed speakers with an accuracy above 80%. The approach used in the work is based on networks designed for sequence modeling (bidirectional Long Short-Term Memory networks) and multimodal analysis methodologies (late fusion, joint representation and gated multimodal units). The experiments were performed over a corpus of 59 interviews (roughly 4 hours of material) involving 29 individuals diagnosed with depression and 30 control participants. In addition to an accuracy of 80%, the results show that multimodal approaches perform better than unimodal ones owing to people’s tendency to manifest their condition through one modality only, a source of diversity across unimodal approaches. In addition, the experiments show that it is possible to measure the “confidence” of the approach and automatically identify a subset of the test data in which the performance is above a predefined threshold. It is possible to effectively detect depression by using unobtrusive and inexpensive technologies based on the automatic analysis of speech and language.
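
    Of the multimodal strategies listed, late fusion is the simplest to illustrate. The sketch below assumes each unimodal model outputs a probability of the "depressed" class and fuses them with a weighted average; the confidence measure (distance of the fused probability from 0.5) is an assumption chosen for illustration, since the abstract does not say how confidence is computed. All probabilities and labels are made-up toy values.

```python
def late_fusion(p_text, p_audio, w_text=0.5):
    """Weighted average of per-sample posteriors from the two unimodal models."""
    return [w_text * t + (1 - w_text) * a for t, a in zip(p_text, p_audio)]


def confidence(p):
    """Assumed confidence measure: 0 at p = 0.5, 1 at p = 0 or p = 1."""
    return abs(p - 0.5) * 2


def confident_subset(probs, min_conf):
    """Indices of samples whose fused decision is at least `min_conf` confident."""
    return [i for i, p in enumerate(probs) if confidence(p) >= min_conf]


def accuracy(probs, labels, indices=None):
    """Accuracy of thresholding `probs` at 0.5, optionally on a subset."""
    if indices is None:
        indices = range(len(probs))
    indices = list(indices)
    if not indices:
        raise ValueError("no samples selected")
    correct = sum((probs[i] >= 0.5) == bool(labels[i]) for i in indices)
    return correct / len(indices)


# Toy posteriors from a linguistic and an acoustic model (hypothetical values).
p_text = [0.9, 0.4, 0.2, 0.6]
p_audio = [0.7, 0.8, 0.1, 0.3]
labels = [1, 1, 0, 1]

fused = late_fusion(p_text, p_audio)
subset = confident_subset(fused, min_conf=0.5)
print(accuracy(fused, labels), subset, accuracy(fused, labels, subset))
```

    In this toy data the second sample is only recognised because the acoustic model compensates for the linguistic one, which mirrors the abstract's remark that people may manifest their condition through one modality only.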

    Investigation of Low-Cost Wearable Internet of Things Enabled Technology for Physical Activity Recognition in the Elderly

    Technological advances in mobile sensing have produced new opportunities for researchers to monitor the elderly in uncontrolled environments. Sensors have become smaller and cheaper and can be worn on the body, potentially creating a network of sensors. Smartphones are also more common in the average household and can provide some behavioural analysis through their built-in sensors. As a result, researchers are able to monitor behaviours in a more naturalistic setting, which can lead to more contextually meaningful data. For those suffering with a mental illness, non-invasive and continuous monitoring can be achieved. Applying sensors to real-world environments can help improve the quality of life of an elderly person with a mental illness and monitor their condition through behavioural analysis. To achieve this, the selected classifiers must be able to accurately detect when an activity has taken place. In this thesis we aim to provide a framework for the investigation of activity recognition in the elderly using low-cost wearable sensors, which has resulted in the following contributions: 1. Classification of eighteen activities, broken down into three disparate categories typical in a home setting: dynamic, sedentary and transitional. These were detected using two Shimmer3 IMU devices located on the participants’ wrist and waist to create a low-cost, contextually deployable solution for elderly care monitoring. 2. Through the categorisation of the performed activities, with time-domain and frequency-domain features extracted from the Shimmer devices’ accelerometer and gyroscope used as inputs, we achieved high classification accuracy with a Convolutional Neural Network (CNN) model applied to the data set gathered from participants recruited to the study through Join Dementia Research. The model was evaluated by adjusting its variables and tracking the resulting changes in performance. 
Performance statistics were generated by the model for comparison and evaluation. Our results indicate that a low epoch count of 200 with the ReLU activation function can yield a high accuracy of 86% on the wrist data set and 85% on the waist data set, using only two low-cost wearable devices.
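
    As a rough illustration of the time-domain and frequency-domain features mentioned above, the sketch below computes a few common ones for a single-axis sensor window. The specific features (mean, standard deviation, RMS, dominant frequency), the window length, and the 32 Hz sampling rate are assumptions; the abstract does not list the features or the Shimmer3 configuration actually used. A naive DFT keeps the example dependency-free.

```python
import cmath
import math


def time_features(window):
    """Mean, standard deviation and RMS of one sensor-axis window."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    return {"mean": mean, "std": math.sqrt(var), "rms": rms}


def dominant_frequency(window, sample_rate_hz):
    """Frequency (Hz) of the largest non-DC bin of a naive DFT of the window."""
    n = len(window)
    mean = sum(window) / n
    centered = [x - mean for x in window]  # remove the DC offset (e.g. gravity)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate_hz / n


# Synthetic 1-second window: a 2 Hz oscillation sampled at an assumed 32 Hz.
SAMPLE_RATE_HZ = 32
window = [math.sin(2 * math.pi * 2 * i / SAMPLE_RATE_HZ) for i in range(32)]

features = time_features(window)
features["dominant_hz"] = dominant_frequency(window, SAMPLE_RATE_HZ)
print(features)
```

    Feature vectors of this kind, computed per axis and per device, would form the classifier input; the CNN architecture and training settings reported in the abstract are not reproduced here.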