70 research outputs found

Spatio-Temporal Facial Expression Recognition Using Convolutional Neural Networks and Conditional Random Fields

Author: Hasani Behzad
Mahoor Mohammad H.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/04/2017

Automated Facial Expression Recognition (FER) has been a challenging task for decades. Many existing works use hand-crafted features such as LBP, HOG, LPQ, and Histogram of Optical Flow (HOF), combined with classifiers such as Support Vector Machines, for expression recognition. These methods often require rigorous hyperparameter tuning to achieve good results. Recently, Deep Neural Networks (DNN) have been shown to outperform traditional methods in visual object recognition. In this paper, we propose a two-part network consisting of a DNN-based architecture followed by a Conditional Random Field (CRF) module for facial expression recognition in videos. The first part captures the spatial relation within facial images using convolutional layers followed by three Inception-ResNet modules and two fully-connected layers. To capture the temporal relation between the image frames, we use a linear-chain CRF in the second part of our network. We evaluate our proposed network on three publicly available databases, viz. CK+, MMI, and FERA. Experiments are performed in subject-independent and cross-database manners. Our experimental results show that cascading the deep network architecture with the CRF module considerably improves the recognition of facial expressions in videos; in particular, it outperforms the state-of-the-art methods in the cross-database experiments and yields comparable results in the subject-independent experiments.
Comment: To appear in 12th IEEE Conference on Automatic Face and Gesture Recognition Worksho
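The abstract does not say how the linear-chain CRF is decoded, so the following is only a minimal sketch of standard Viterbi decoding over per-frame scores, not the authors' implementation. The function name `viterbi_decode`, the score values, and the two-label setup are assumptions made for illustration; in the paper's setting the per-frame scores would come from the deep network.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence for a linear-chain model.

    emissions[t][k]   : score of label k at frame t (hypothetical values here)
    transitions[i][j] : score of moving from label i to label j
    """
    num_labels = len(transitions)
    scores = list(emissions[0])          # best score of any path ending in each label
    backpointers = []                    # best previous label, per frame and label

    for frame_scores in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(num_labels):
            best_prev = max(range(num_labels),
                            key=lambda i: scores[i] + transitions[i][j])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev] + transitions[best_prev][j]
                              + frame_scores[j])
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final label.
    best_last = max(range(num_labels), key=lambda j: scores[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[best_last]


# Three frames, two labels. Frame-by-frame argmax would give [0, 1, 0];
# the transition scores favour staying in the same label.
frame_scores = [[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
stay_bonus = [[1.0, -1.0], [-1.0, 1.0]]
best_path, best_score = viterbi_decode(frame_scores, stay_bonus)
print(best_path, best_score)
```

In this toy input the middle frame weakly prefers label 1, but the transition scores make the temporally consistent sequence the better overall choice, which is the kind of smoothing a chain model adds on top of per-frame predictions.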

arXiv.org e-Print Archive

Crossref

Ad-Corre: Adaptive Correlation-Based Loss for Facial Expression Recognition in the Wild

Author: Fard Ali Pourramezan
Mahoor Mohammad H
Publication venue: Digital Commons @ DU
Publication date: 03/03/2022

Automated Facial Expression Recognition (FER) in the wild using deep neural networks is still challenging due to intra-class variations and inter-class similarities in facial images. Deep Metric Learning (DML) is among the widely used methods to deal with these issues by improving the discriminative power of the learned embedded features. This paper proposes an Adaptive Correlation (Ad-Corre) Loss to guide the network towards generating embedded feature vectors with high correlation for within-class samples and less correlation for between-class samples. Ad-Corre consists of three components called Feature Discriminator, Mean Discriminator, and Embedding Discriminator. We design the Feature Discriminator component to guide the network to create embedded feature vectors that are highly correlated if they belong to the same class, and less correlated if they belong to different classes. In addition, the Mean Discriminator component leads the network to make the mean embedded feature vectors of different classes less similar to each other. We use the Xception network as the backbone of our model, and contrary to previous work, we propose an embedding feature space that contains k feature vectors. Then, the Embedding Discriminator component penalizes the network so that it generates embedded feature vectors that are dissimilar. We trained our model using the combination of our proposed loss functions, called Ad-Corre Loss, jointly with the cross-entropy loss. We achieved a very promising recognition accuracy on AffectNet, RAF-DB, and FER-2013. Our extensive experiments and ablation study indicate the power of our method to cope well with challenging FER tasks in the wild. The code is available on GitHub.
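The abstract gives no formulas for the Ad-Corre components, so the sketch below is not the Ad-Corre loss. It only illustrates the general idea of a correlation-based penalty: Pearson correlation between pairs of embedding vectors, rewarded when the pair shares a class and penalized when it does not. The function names, the specific penalty terms, and the toy vectors are all assumptions.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def toy_correlation_loss(embeddings, labels):
    """Average pairwise penalty (an illustrative choice, not the paper's):

    same class      -> 1 - r       (zero when perfectly correlated)
    different class -> max(0, r)   (zero when uncorrelated or anti-correlated)
    """
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            r = pearson(embeddings[i], embeddings[j])
            total += (1.0 - r) if labels[i] == labels[j] else max(0.0, r)
            pairs += 1
    return total / pairs


vectors = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
well_separated = toy_correlation_loss(vectors, [0, 0, 1])
badly_separated = toy_correlation_loss(vectors, [0, 1, 0])
print(well_separated, badly_separated)
```

The first labeling groups the two correlated vectors in one class, so the penalty is close to zero; the second labeling puts them in different classes, so the penalty is larger.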

University of Denver

MC-ViViT: Multi-branch Classifier-ViViT to Detect Mild Cognitive Impairment in Older Adults using Facial Videos

Author: Dodge Hiroko H.
Mahoor Mohammad H.
Sun Jian
Publication date: 11/04/2023

Deep machine learning models, including Convolutional Neural Networks (CNN), have been successful in the detection of Mild Cognitive Impairment (MCI) using medical images, questionnaires, and videos. This paper proposes a novel Multi-branch Classifier-Video Vision Transformer (MC-ViViT) model to distinguish people with MCI from those with normal cognition by analyzing facial features. The data comes from I-CONECT, a behavioral intervention trial aimed at improving cognitive function by providing frequent video chats. MC-ViViT extracts spatiotemporal features of videos in one branch and augments representations by the MC module. The I-CONECT dataset is challenging because it is imbalanced, containing Hard-Easy and Positive-Negative samples, which impedes the performance of MC-ViViT. We propose a loss function for Hard-Easy and Positive-Negative Samples (HP Loss) by combining Focal loss and AD-CORRE loss to address the imbalance problem. Our experimental results on the I-CONECT dataset show the great potential of MC-ViViT in predicting MCI, with a high accuracy of 90.63% on some of the interview videos.
Comment: 12 pages, 5 tables, 5 figures, 17 equation
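The abstract does not state how HP Loss combines its two terms, so the sketch below covers only the standard binary focal loss that it names as one ingredient. The function name and the default values of `gamma` and `alpha` are assumptions chosen for illustration, not settings taken from the paper.

```python
import math


def binary_focal_loss(prob_positive, label, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for one sample.

    prob_positive : predicted probability of the positive class, in (0, 1)
    label         : 1 for positive, 0 for negative
    gamma         : focusing parameter; larger values down-weight easy samples
    alpha         : weight of the positive class (1 - alpha for the negative class)
    """
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confidently correct) versus a hard positive (confidently wrong).
easy_loss = binary_focal_loss(0.9, 1)
hard_loss = binary_focal_loss(0.1, 1)
print(easy_loss, hard_loss)
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is a quick way to sanity-check an implementation.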

arXiv.org e-Print Archive

A Music-Therapy Robotic Platform for Children with Autism: A Pilot Study

Author: Dino Francesca
Feng Huanghao
Mahoor Mohammad H
Publication venue: Digital Commons @ DU
Publication date: 23/05/2022

Children with Autism Spectrum Disorder (ASD) experience deficits in verbal and nonverbal communication skills including motor control, turn-taking, and emotion recognition. Innovative technology, such as socially assistive robots, has been shown to be a viable method for Autism therapy. This paper presents a novel robot-based music-therapy platform for modeling and improving the social responses and behaviors of children with ASD. Our autonomous social interactive system consists of three modules. Module one provides an autonomous initiative positioning system for the robot, NAO, to properly localize and play the instrument (Xylophone) using the robot’s arms. Module two allows NAO to play customized songs composed by individuals. Module three provides a real-life music therapy experience to the users. We adopted Short-time Fourier Transform and Levenshtein distance to fulfill the design requirements: 1) “music detection” and 2) “smart scoring and feedback”, which allow NAO to understand music and provide additional practice and oral feedback to the users as applicable. We designed and implemented six Human-Robot-Interaction (HRI) sessions including four intervention sessions. Nine children with ASD and seven Typically Developing children participated in a total of fifty HRI experimental sessions. Using our platform, we collected and analyzed data on social behavioral changes and emotion recognition using Electrodermal Activity (EDA) signals. The results of our experiments demonstrate that most of the participants were able to complete motor control tasks with 70% accuracy. Six out of the nine ASD participants showed stable turn-taking behavior when playing music. The results of automated emotion classification using Support Vector Machines illustrate that emotional arousal in the ASD group can be detected and well recognized via EDA bio-signals.
In summary, the results of our data analyses, including emotion classification using EDA signals, indicate that the proposed robot-based music-therapy platform is an attractive and promising assistive tool to facilitate the improvement of fine motor control and turn-taking skills in children with ASD.
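The abstract names Levenshtein distance as part of the "smart scoring and feedback" requirement but does not describe the scoring itself. The sketch below shows the standard edit-distance computation applied to note sequences; the note encoding, the example melody, and the normalized `similarity` score are hypothetical and are not taken from the paper.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))   # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(previous[j] + 1,          # delete x
                               current[j - 1] + 1,       # insert y
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


# Hypothetical target melody and a detected performance, one symbol per note.
target_notes = ["C", "D", "E", "C"]
played_notes = ["C", "E", "E"]

distance = levenshtein(target_notes, played_notes)
# One possible normalized score in [0, 1]; this formula is an assumption.
similarity = 1.0 - distance / max(len(target_notes), len(played_notes))
print(distance, similarity)
```

Because the function works on any sequences, the same code handles strings of characters or lists of detected note names.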

University of Denver

PubMed Central