1,664 research outputs found

    Spectral Representation of Behaviour Primitives for Depression Analysis


    Artificial Intelligence for Suicide Assessment using Audiovisual Cues: A Review

    Death by suicide is the seventh leading cause of death worldwide. Recent advances in Artificial Intelligence (AI), specifically AI applications in image and voice processing, have created a promising opportunity to revolutionize suicide risk assessment. Subsequently, we have witnessed a fast-growing body of research that applies AI to extract audiovisual non-verbal cues for mental illness assessment. However, the majority of the recent works focus on depression, despite the evident difference between depression symptoms and suicidal behavior and non-verbal cues. This paper reviews recent works that study suicide ideation and suicide behavior detection through audiovisual feature analysis, mainly suicidal voice/speech acoustic features analysis and suicidal visual cues. Automatic suicide assessment is a promising research direction that is still in the early stages. Accordingly, there is a lack of large datasets that can be used to train machine learning and deep learning models proven to be effective in other, similar tasks. Comment: Manuscript submitted to Artificial Intelligence Reviews (2022)

    The Verbal and Non Verbal Signals of Depression -- Combining Acoustics, Text and Visuals for Estimating Depression Level

    Depression is a serious medical condition that affects a large number of people around the world. It significantly affects the way one feels, causing a persistent lowering of mood. In this paper, we propose a novel attention-based deep neural network which facilitates the fusion of various modalities. We use this network to regress the depression level. Acoustic, text and visual modalities have been used to train our proposed network. Various experiments have been carried out on the benchmark dataset, namely, Distress Analysis Interview Corpus - a Wizard of Oz (DAIC-WOZ). From the results, we empirically justify that the fusion of all three modalities helps in giving the most accurate estimation of depression level. Our proposed approach outperforms the state-of-the-art by 7.17% on root mean squared error (RMSE) and 8.08% on mean absolute error (MAE). Comment: 10 pages including references, 2 figures
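
The two error metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, and reading "outperforms by X%" as a relative reduction in error is an assumption, since the abstract does not define it.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def relative_improvement(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error.

    Assumption: improvement is measured relative to the baseline error.
    """
    return 100.0 * (baseline_error - new_error) / baseline_error
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason depression-severity regression results are commonly reported with both.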

    MLGaze: Machine Learning-Based Analysis of Gaze Error Patterns in Consumer Eye Tracking Systems

    Analyzing the gaze accuracy characteristics of an eye tracker is a critical task as its gaze data is frequently affected by non-ideal operating conditions in various consumer eye tracking applications. In this study, gaze error patterns produced by a commercial eye tracking device were studied with the help of machine learning algorithms, such as classifiers and regression models. Gaze data were collected from a group of participants under multiple conditions that commonly affect eye trackers operating on desktop and handheld platforms. These conditions (referred to here as error sources) include user distance, head pose, and eye-tracker pose variations, and the collected gaze data were used to train the classifier and regression models. It was seen that while the impact of the different error sources on gaze data characteristics was nearly impossible to distinguish by visual inspection or from data statistics, machine learning models were successful in identifying the impact of the different error sources and predicting the variability in gaze error levels due to these conditions. The objective of this study was to investigate the efficacy of machine learning methods towards the detection and prediction of gaze error patterns, which would enable an in-depth understanding of the data quality and reliability of eye trackers under unconstrained operating conditions. Coding resources for all the machine learning methods adopted in this study were included in an open repository named MLGaze to allow researchers to replicate the principles presented here using data from their own eye trackers. Comment: https://github.com/anuradhakar49/MLGaz

    Objective and automated assessment of surgical technical skills with IoT systems: A systematic literature review

    The assessment of surgical technical skills to be acquired by novice surgeons has traditionally been done by an expert surgeon and is therefore of a subjective nature. Nevertheless, the recent advances in IoT, the possibility of incorporating sensors into objects and environments in order to collect large amounts of data, and the progress in machine learning are facilitating a more objective and automated assessment of surgical technical skills. This paper presents a systematic literature review of papers published after 2013 discussing the objective and automated assessment of surgical technical skills. 101 out of an initial list of 537 papers were analyzed to identify: 1) the sensors used; 2) the data collected by these sensors and the relationship between these data, surgical technical skills and surgeons' levels of expertise; 3) the statistical methods and algorithms used to process these data; and 4) the feedback provided based on the outputs of these statistical methods and algorithms.
In particular, 1) mechanical and electromagnetic sensors are widely used for tool tracking, while inertial measurement units are widely used for body tracking; 2) path length, number of sub-movements, smoothness, fixation, saccade and total time are the main indicators obtained from raw data and serve to assess surgical technical skills such as economy, efficiency, hand tremor, or mind control, and distinguish between two or three levels of expertise (novice/intermediate/advanced surgeons); 3) SVM (Support Vector Machines) and Neural Networks are the preferred statistical methods and algorithms for processing the data collected, while new opportunities are opened up to combine various algorithms and use deep learning; and 4) feedback is provided by matching performance indicators and a lexicon of words and visualizations, although there is considerable room for research in the context of feedback and visualizations, taking, for example, ideas from learning analytics. This work was supported in part by the FEDER/Ministerio de Ciencia, Innovación y Universidades; Agencia Estatal de Investigación, through the Smartlet Project under Grant TIN2017-85179-C3-1-R, and in part by the Madrid Regional Government through the e-Madrid-CM Project under Grant S2018/TCS-4307, a project which is co-funded by the European Structural Funds (FSE and FEDER). Partial support has also been received from the European Commission through Erasmus+ Capacity Building in the Field of Higher Education projects, more specifically through projects LALA (586120-EPP-1-2017-1-ES-EPPKA2-CBHE-JP), InnovaT (598758-EPP-1-2018-1-AT-EPPKA2-CBHE-JP), and PROF-XXI (609767-EPP-1-2019-1-ES-EPPKA2-CBHE-JP).
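
Two of the indicators listed in this abstract, path length and total time, can be computed directly from tracked tool positions. The sketch below is only an illustration under assumptions: the review does not prescribe a data format, so the position tuples, timestamp list, and function names are hypothetical.

```python
import math


def path_length(positions):
    """Sum of straight-line distances between consecutive tool positions.

    `positions` is assumed to be a list of (x, y, z) tuples in one unit.
    """
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def total_time(timestamps):
    """Elapsed time between the first and last sample, in the input unit."""
    return timestamps[-1] - timestamps[0]


# Illustrative (made-up) trajectory: two straight segments of length 5 and 12.
example_positions = [(0, 0, 0), (3, 4, 0), (3, 4, 12)]
example_timestamps = [0.0, 0.5, 2.0]
```

A shorter path and a shorter total time for the same task are the kind of raw indicators the review associates with economy and efficiency of movement.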

    Human behaviour-based automatic depression analysis using hand-crafted statistics and deep learned spectral features

    Depression is a serious mental disorder that affects millions of people all over the world. Traditional clinical diagnosis methods are subjective, complicated and need extensive participation of experts. Audio-visual automatic depression analysis systems predominantly base their predictions on very brief sequential segments, sometimes as little as one frame. Such data contains much redundant information, causes a high computational load, and negatively affects the detection accuracy. Final decision making at the sequence level is then based on the fusion of frame or segment level predictions. However, this approach loses longer term behavioural correlations, as the behaviours themselves are abstracted away by the frame-level predictions. We propose to use automatically detected human behaviour primitives, such as gaze directions and facial action units (AUs), as low-dimensional multi-channel time series data, which can then be used to create two sequence descriptors. The first calculates the sequence-level statistics of the behaviour primitives and the second casts the problem as a Convolutional Neural Network problem operating on a spectral representation of the multichannel behaviour signals. The results of depression detection (binary classification) and severity estimation (regression) experiments conducted on the AVEC 2016 DAIC-WOZ database show that both methods achieved significant improvement compared to the previous state of the art in terms of depression severity estimation.
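
The two sequence descriptors described in this abstract can be sketched roughly as below. This is an assumption-laden illustration, not the authors' method: the abstract says only "sequence-level statistics" and "spectral representation", so the specific statistics, the use of a plain discrete Fourier transform, and all function names are hypothetical choices.

```python
import cmath
import math


def sequence_statistics(channel):
    """Sequence-level statistics of one behaviour-primitive channel."""
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((x - mean) ** 2 for x in channel) / n
    return {"mean": mean, "std": math.sqrt(variance),
            "min": min(channel), "max": max(channel)}


def amplitude_spectrum(channel, num_freqs):
    """Amplitudes of the first num_freqs DFT components, scaled by 1/n."""
    n = len(channel)
    amplitudes = []
    for k in range(num_freqs):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(channel))
        amplitudes.append(abs(coeff) / n)
    return amplitudes


def spectral_map(channels, num_freqs):
    """One row per channel, one column per frequency component."""
    return [amplitude_spectrum(ch, num_freqs) for ch in channels]
```

Keeping a fixed number of frequency components per channel yields a map of the same shape for any sequence length, which is the kind of input a convolutional network can consume.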

    Proceedings of the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    This book is a collection of 15 reviewed technical reports summarizing the presentations at the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory. The covered topics include image processing, optical signal processing, visual inspection, pattern recognition and classification, human-machine interaction, world and situation modeling, autonomous system localization and mapping, information fusion, and trust propagation in sensor networks.

    Classification of Drivers' Workload Using Physiological Signals in Conditional Automation

    The use of automation in cars is increasing. In future vehicles, drivers will no longer be in charge of the main driving task and may be allowed to perform a secondary task. However, they might be requested to regain control of the car if a hazardous situation occurs (i.e., conditionally automated driving). Performing a secondary task might increase drivers' mental workload and consequently decrease the takeover performance if the workload level exceeds a certain threshold. Knowledge about the driver's mental state might hence be useful for increasing safety in conditionally automated vehicles. Measuring drivers' workload continuously is essential to support the driver and hence limit the number of accidents in takeover situations. This goal can be achieved using machine learning techniques to evaluate and classify the drivers' workload in real-time. To evaluate the usefulness of physiological data as an indicator for workload in conditionally automated driving, three physiological signals from 90 subjects were collected during 25 min of automated driving in a fixed-base simulator. Half of the participants performed a verbal cognitive task to induce mental workload while the other half only had to monitor the environment of the car. Three classifiers, sensor fusion and levels of data segmentation were compared. Results show that the best model was able to successfully classify the condition of the driver with an accuracy of 95%. In some cases, the model benefited from sensor fusion. Increasing the segmentation level (e.g., size of the time window to compute physiological indicators) increased the performance of the model for windows smaller than 4 min, but decreased it for windows larger than 4 min. In conclusion, the study showed that a high level of drivers' mental workload can be accurately detected while driving in conditional automation based on 4-min recordings of respiration and skin conductance.
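
The segmentation step mentioned in this abstract, computing physiological indicators over time windows of a chosen size, can be sketched as below. The abstract does not list the indicators or state whether windows overlap, so the non-overlapping windows, the mean and range indicators, and the function name are assumptions made for illustration.

```python
def window_indicators(signal, sampling_rate_hz, window_seconds):
    """Split a signal into non-overlapping windows and summarize each one.

    Trailing samples that do not fill a complete window are dropped.
    """
    size = int(sampling_rate_hz * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    indicators = []
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        indicators.append({
            "start_s": start / sampling_rate_hz,   # window start time
            "mean": sum(window) / size,            # average level
            "range": max(window) - min(window),    # peak-to-peak amplitude
        })
    return indicators
```

Each window's indicators would then form one feature vector for a classifier; varying `window_seconds` corresponds to the segmentation levels the study compares.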

    Multimodal Data Analysis of Dyadic Interactions for an Automated Feedback System Supporting Parent Implementation of Pivotal Response Treatment

    Parents fulfill a pivotal role in early childhood development of social and communication skills. In children with autism, the development of these skills can be delayed. Applied behavioral analysis (ABA) techniques have been created to aid in skill acquisition. Among these, pivotal response treatment (PRT) has been empirically shown to foster improvements. Research into PRT implementation has also shown that parents can be trained to be effective interventionists for their children. The current difficulty in PRT training is how to disseminate training to parents who need it, and how to support and motivate practitioners after training. Evaluation of the parents’ fidelity to implementation is often undertaken using video probes that depict the dyadic interaction occurring between the parent and the child during PRT sessions. These videos are time-consuming for clinicians to process, and often result in only minimal feedback for the parents. Current trends in technology could be utilized to alleviate the manual cost of extracting data from the videos, affording greater opportunities for providing clinician-created feedback as well as automated assessments. The naturalistic context of the video probes along with the dependence on ubiquitous recording devices creates a difficult scenario for classification tasks. The domain of the PRT video probes can be expected to have high levels of both aleatory and epistemic uncertainty. Addressing these challenges requires examination of the multimodal data along with implementation and evaluation of classification algorithms. This is explored through the use of a new dataset of PRT videos. The relationship between the parent and the clinician is important. The clinician can provide support and help build self-efficacy in addition to providing knowledge and modeling of treatment procedures.
Facilitating this relationship along with automated feedback not only provides the opportunity to present expert feedback to the parent, but also allows the clinician to aid in personalizing the classification models. By utilizing a human-in-the-loop framework, clinicians can aid in addressing the uncertainty in the classification models by providing additional labeled samples. This will allow the system to improve classification and provides a person-centered approach to extracting multimodal data from PRT video probes. Dissertation/Thesis, Doctoral Dissertation, Computer Science 201
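
One common way to realize a human-in-the-loop step like the one described here is uncertainty sampling: the samples whose predicted class probabilities are least decisive are sent to the clinician for labeling. The abstract does not say how samples are chosen, so the entropy criterion and every name in this sketch are assumptions rather than the dissertation's method.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` most uncertain samples.

    `predictions` maps a sample id to its predicted class probabilities.
    """
    ranked = sorted(predictions,
                    key=lambda sample_id: entropy(predictions[sample_id]),
                    reverse=True)
    return ranked[:budget]


# Illustrative (made-up) classifier outputs for three video segments.
example_predictions = {
    "segment_a": [0.5, 0.5],
    "segment_b": [0.9, 0.1],
    "segment_c": [0.6, 0.4],
}
```

The clinician's labels for the selected segments would then be added to the training set before the classifier is retrained.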