2,341 research outputs found
Depression Estimation Using Audiovisual Features and Fisher Vector Encoding
We investigate the use of two visual descriptors: Local Binary Patterns-Three Orthogonal Planes (LBP-TOP) and Dense Trajectories for depression assessment on the AVEC 2014 challenge dataset. We encode the visual information generated by the two descriptors using Fisher Vector encoding, which has been shown to be one of the best performing methods to encode visual data for image classification. We also incorporate audio features in the final system to introduce multiple input modalities. The results produced using Linear Support Vector regression outperform the baseline method [16].
LEARNet Dynamic Imaging Network for Micro Expression Recognition
Unlike prevalent facial expressions, micro expressions have subtle,
involuntary muscle movements which are short-lived in nature. These minute
muscle movements reflect true emotions of a person. Due to the short duration
and low intensity, these micro-expressions are very difficult to perceive and
interpret correctly. In this paper, we propose the dynamic representation of
micro-expressions to preserve facial movement information of a video in a
single frame. We also propose a Lateral Accretive Hybrid Network (LEARNet) to
capture micro-level features of an expression in the facial region. The LEARNet
refines the salient expression features in an accretive manner by incorporating
accretion layers (AL) in the network. The response of the AL holds the hybrid
feature maps generated by prior laterally connected convolution layers.
Moreover, the LEARNet architecture incorporates the cross decoupled relationship
between convolution layers which helps in preserving the tiny but influential
facial muscle change information. The visual responses of the proposed LEARNet
depict the effectiveness of the system by preserving both high- and micro-level
edge features of facial expression. The effectiveness of the proposed LEARNet
is evaluated on four benchmark datasets: CASME-I, CASME-II, CAS(ME)^2 and SMIC.
The experimental results show a significant improvement of
4.03%, 1.90%, 1.79% and 2.82% as compared with ResNet on CASME-I, CASME-II,
CAS(ME)^2 and SMIC datasets respectively. Comment: Dynamic imaging, accretion, lateral, micro expression recognition
Face Image and Video Analysis in Biometrics and Health Applications
Computer Vision (CV) enables computers and systems to derive meaningful information from acquired visual inputs, such as images and videos, and make decisions based on the extracted information. Its goal is to acquire, process, analyze, and understand the information by developing a theoretical and algorithmic model. Biometrics are distinctive and measurable human characteristics used to label or describe individuals by combining computer vision with knowledge of human physiology (e.g., face, iris, fingerprint) and behavior (e.g., gait, gaze, voice). The face is one of the most informative biometric traits. Many studies have investigated the human face from the perspectives of different disciplines, ranging from computer vision and deep learning to neuroscience and biometrics. In this work, we analyze face characteristics from digital images and videos in the areas of morphing attack and defense, and autism diagnosis. For face morphing attack generation, we proposed a transformer-based generative adversarial network to generate more visually realistic morphing attacks by combining different losses, such as face matching distance, facial landmark based loss, perceptual loss and pixel-wise mean square error. In the face morphing attack detection study, we designed a fusion-based few-shot learning (FSL) method to learn discriminative features from face images for few-shot morphing attack detection (FS-MAD), and extended the current binary detection into multiclass classification, namely, few-shot morphing attack fingerprinting (FS-MAF). In the autism diagnosis study, we developed a discriminative few-shot learning method to analyze hour-long video data and explored the fusion of facial dynamics for facial trait classification of autism spectrum disorder (ASD) in three severity levels. The results show outstanding performance of the proposed fusion-based few-shot framework on the dataset.
Besides, we further explored the possibility of performing face micro-expression spotting and feature analysis on autism video data to classify ASD and control groups. The results indicate the effectiveness of subtle facial expression changes for autism diagnosis.
Multi-Modality Human Action Recognition
Human action recognition is useful in many applications, e.g. video surveillance, HCI (human-computer interaction), video retrieval, gaming and security. Recently, human action recognition has become an active research topic in computer vision and pattern recognition, and a number of action recognition approaches have been proposed. However, most of these approaches are designed for RGB image sequences, where the action data was collected by an RGB/intensity camera. Thus the recognition performance is usually affected by occlusion, background, and lighting conditions of the image sequences. If data sources other than the RGB video can be utilized alongside the image sequences, human actions could be better represented and recognized by the designed computer vision system. In this dissertation, multi-modality human action recognition is studied. On one hand, we introduce the study of multi-spectral action recognition, which involves information from spectra beyond the visible, e.g. infrared and near infrared. Action recognition in individual spectra is explored and new methods are proposed. Cross-spectral action recognition is then investigated and novel approaches are proposed. On the other hand, since depth imaging technology has made significant progress recently, so that depth information can be captured simultaneously with RGB videos, depth-based human action recognition is also investigated. I first propose a method combining different types of depth data to recognize human actions. Then a thorough evaluation is conducted on spatiotemporal interest point (STIP) based features for depth-based action recognition. Finally, I advocate the study of fusing different features for depth-based action analysis. Moreover, human depression recognition is studied by combining a facial appearance model with a facial dynamic model.
Advances in Emotion Recognition: Link to Depressive Disorder
Emotion recognition enables real-time analysis, tagging, and inference of cognitive affective states from human facial expression, speech and tone, body posture and physiological signals, as well as text on social network platforms. Emotion patterns can be decoded through computational modeling of explicit and implicit features extracted through wearable and other devices. Meanwhile, emotion recognition and computation are critical to the detection and diagnosis of potential patients with mood disorders. The chapter aims to summarize the main findings in the area of affective recognition and its applications in major depressive disorder (MDD), which have made rapid progress in the last decade.
The role of HG in the analysis of temporal iteration and interaural correlation
Employing Information and Communications Technologies in Homes and Cities for the Health and Well-Being of Older People
He X and Sheriff RE (Eds.) Employing ICT in Homes and Cities for the Health and Well-Being of Older People. Workshop Proceedings of ICT4HOP'16. 15-17 Aug 2016. Sichuan University, Chengdu, China. British Council, Researcher Links, Newton Fund, NSF
Intelligent System for Depression Scale Estimation with Facial Expressions and Case Study in Industrial Intelligence
As a mental disorder, depression affects people's lives and work. Researchers have proposed various industrial intelligent systems in the pattern recognition field for audiovisual depression detection. This paper presents an end-to-end trainable intelligent system to generate high-level representations over the entire video clip. Specifically, a three-dimensional (3D) convolutional neural network equipped with a spatiotemporal feature aggregation module (STFAM) is trained from scratch on Audio/Visual Emotion Challenge (AVEC) 2013 and AVEC2014 data, and can model the discriminative patterns closely related to depression. In the STFAM, channel and spatial attention mechanisms and an aggregation method, namely 3D DEP-NetVLAD, are integrated to learn a compact representation based on the feature maps. Extensive experiments on the two databases (i.e., AVEC2013 and AVEC2014) illustrate that the proposed intelligent system can efficiently model the underlying depression patterns and obtain better performance than most video-based depression recognition approaches. Case studies are presented to describe the applicability of the proposed intelligent system for industrial intelligence.
Modern Views of Machine Learning for Precision Psychiatry
In light of the NIMH's Research Domain Criteria (RDoC), the advent of
functional neuroimaging, novel technologies and methods provide new
opportunities to develop precise and personalized prognosis and diagnosis of
mental disorders. Machine learning (ML) and artificial intelligence (AI)
technologies are playing an increasingly critical role in the new era of
precision psychiatry. Combining ML/AI with neuromodulation technologies can
potentially provide explainable solutions in clinical practice and effective
therapeutic treatment. Advanced wearable and mobile technologies also call for
the new role of ML/AI for digital phenotyping in mobile mental health. In this
review, we provide a comprehensive review of the ML methodologies and
applications by combining neuroimaging, neuromodulation, and advanced mobile
technologies in psychiatry practice. Additionally, we review the role of ML in
molecular phenotyping and cross-species biomarker identification in precision
psychiatry. We further discuss explainable AI (XAI) and causality testing in a
closed-human-in-the-loop manner, and highlight the ML potential in multimedia
information extraction and multimodal data fusion. Finally, we discuss
conceptual and practical challenges in precision psychiatry and highlight ML
opportunities in future research