44 research outputs found

Improving Speech-related Facial Action Unit Recognition by Audiovisual Information Fusion

Author: Meng Zibo
Publication venue: Scholar Commons
Publication date: 01/01/2018

Despite great progress on posed facial displays and controlled image acquisition, the performance of facial action unit (AU) recognition degrades significantly for spontaneous facial displays. Recognizing AUs accompanied by speech is even more challenging: they are generally activated at low intensity, with subtle changes in facial appearance and geometry, and, more importantly, they often introduce ambiguity in detecting other co-occurring AUs, e.g., by producing non-additive appearance changes. All current AU recognition systems use information extracted only from the visual channel. However, sound is highly correlated with the visual channel in human communication. We therefore propose to exploit both audio and visual information for AU recognition. First, a feature-level fusion method combining audio and visual features is introduced. Features are extracted independently from the visual and audio channels, aligned to handle the difference in time scales and the time shift between the two signals, and then integrated via feature-level fusion for AU recognition. Second, a novel approach is developed that recognizes speech-related AUs exclusively from audio signals, based on the fact that facial activities are highly correlated with voice during speech. Specifically, dynamic and physiological relationships between AUs and phonemes are modeled through a continuous time Bayesian network (CTBN); AU recognition is then performed by probabilistic inference via the CTBN model. Third, a novel audiovisual fusion framework is developed that aims to make the best use of visual and acoustic cues in recognizing speech-related facial AUs. In particular, a dynamic Bayesian network (DBN) is employed to explicitly model the semantic and dynamic physiological relationships between AUs and phonemes, as well as measurement uncertainty. AU recognition is then conducted by probabilistic inference via the DBN model. To evaluate the proposed approaches, a pilot AU-coded audiovisual database was collected. Experiments on this dataset demonstrate that the proposed frameworks yield significant improvement in recognizing speech-related AUs compared to state-of-the-art visual-based methods. The improvement is even greater for AUs whose visual observations are impaired during speech.
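The feature-level fusion step described in the abstract (independent extraction, temporal alignment, then integration) can be illustrated with a minimal sketch. The abstract does not say how alignment or fusion is implemented, so everything here is an assumption: the function name `align_and_fuse`, the mean-pooling of audio frames over each video frame's time window, the `shift_s` lag parameter, and fusion by plain concatenation.

```python
def align_and_fuse(visual, audio, video_fps, audio_fps, shift_s=0.0):
    """Concatenate each video frame's features with time-aligned audio features.

    visual:  list of feature vectors, one per video frame
    audio:   list of feature vectors, one per audio analysis frame
    shift_s: assumed lag of the audio stream relative to video, in seconds
    """
    audio_dim = len(audio[0])
    fused = []
    for i, v in enumerate(visual):
        # Time span covered by video frame i, moved into audio time by the shift.
        t0 = i / video_fps + shift_s
        t1 = (i + 1) / video_fps + shift_s
        # Audio frame indices falling inside that span (at least one if available).
        lo = max(0, int(round(t0 * audio_fps)))
        hi = min(len(audio), max(lo + 1, int(round(t1 * audio_fps))))
        window = audio[lo:hi]
        if window:
            # Mean-pool the audio frames so both streams share the video time scale.
            pooled = [sum(col) / len(window) for col in zip(*window)]
        else:
            # No audio available for this frame: pad with zeros.
            pooled = [0.0] * audio_dim
        # Feature-level fusion by concatenation (an assumed choice).
        fused.append(list(v) + pooled)
    return fused


# Hypothetical toy streams: 1 video frame per second, 2 audio frames per second.
visual = [[1.0], [2.0]]
audio = [[0.0], [2.0], [4.0], [6.0]]
fused = align_and_fuse(visual, audio, video_fps=1, audio_fps=2)
shifted = align_and_fuse(visual, audio, video_fps=1, audio_fps=2, shift_s=0.5)
```

The fused vectors would then be passed to whatever AU classifier is in use; the pooling and shift handling are the parts most likely to differ from the actual method.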

Scholar Commons - Institutional Repository of the University of South Carolina

Island Loss for Learning Discriminative Features in Facial Expression Recognition

Authors: Cai Jie, Khan Ahmed Shehab, Li Zhiyuan, Meng Zibo, O'Reilly James, Tong Yan
Publication date: 23/10/2017

Over the past few years, Convolutional Neural Networks (CNNs) have shown promise on facial expression recognition. However, performance degrades dramatically under real-world settings due to variations introduced by subtle facial appearance changes, head pose variations, illumination changes, and occlusions. In this paper, a novel island loss (IL) is proposed to enhance the discriminative power of the deeply learned features. Specifically, the IL is designed to reduce intra-class variations while simultaneously enlarging inter-class differences. Experimental results on four benchmark expression databases demonstrate that the CNN with the proposed island loss (IL-CNN) outperforms baseline CNN models trained with either the traditional softmax loss or the center loss, and achieves comparable or better performance than state-of-the-art methods for facial expression recognition. Comment: 8 pages, 3 figures
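The abstract states only the goal of the loss: shrink intra-class variation while enlarging inter-class differences. One way such a loss can be built, shown here as a sketch and not as the paper's exact formulation, is a center-loss term that pulls each feature toward its class center plus a penalty on the cosine similarity between every pair of distinct class centers. The function name `island_style_loss`, the weight `lam`, and the specific form of both terms are assumptions.

```python
import math


def island_style_loss(features, labels, centers, lam=1.0):
    """Center-pull term plus a pairwise center-similarity penalty (assumed form).

    features: list of feature vectors
    labels:   class index for each feature vector
    centers:  one (nonzero) center vector per class
    lam:      assumed weight balancing the two terms
    """
    # Intra-class term: half the squared distance of each feature to its class center.
    center_term = 0.0
    for x, y in zip(features, labels):
        center_term += 0.5 * sum((a - b) ** 2 for a, b in zip(x, centers[y]))

    # Inter-class term: (cosine similarity + 1) over ordered pairs of distinct centers.
    # It is 0 when centers point in opposite directions and grows as they align.
    pair_term = 0.0
    for j, cj in enumerate(centers):
        for k, ck in enumerate(centers):
            if j == k:
                continue
            dot = sum(a * b for a, b in zip(cj, ck))
            norm = math.sqrt(sum(a * a for a in cj)) * math.sqrt(sum(b * b for b in ck))
            pair_term += dot / norm + 1.0

    return center_term + lam * pair_term


# Hypothetical 2-D example with two classes.
opposite = island_style_loss([[1.0, 0.0], [-1.0, 0.0]], [0, 1], [[1.0, 0.0], [-1.0, 0.0]])
orthogonal = island_style_loss([[1.0, 0.0], [0.0, 1.0]], [0, 1], [[1.0, 0.0], [0.0, 1.0]], lam=0.5)
```

In training, a term like this would typically be added to the softmax loss, with the centers updated alongside the network weights; those details are not given in the abstract.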

arXiv.org e-Print Archive

Crossref

Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition

Authors: Cai Jie, Han Shizhong, Li Zhiyuan, Meng Zibo, O'Reilly James, Tong Yan, Wang Xiaofeng
Publication date: 22/11/2017

Recognizing facial action units (AUs) during spontaneous facial displays is a challenging problem. Most recently, Convolutional Neural Networks (CNNs) have shown promise for facial AU recognition, where predefined and fixed convolution filter sizes are employed. To achieve the best performance, the optimal filter size is often found empirically through extensive experimental validation. Such a process is expensive, especially as the network becomes deeper. This paper proposes a novel Optimized Filter Size CNN (OFS-CNN), in which the filter sizes and weights of all convolutional layers are learned simultaneously from the training data. Specifically, the filter size is defined as a continuous variable, which is optimized by minimizing the training loss. Experimental results on two AU-coded spontaneous databases show that the proposed OFS-CNN is capable of estimating the optimal filter size for varying image resolutions and outperforms traditional CNNs with the best filter size obtained by exhaustive search. The OFS-CNN also beats the CNN using multiple filter sizes and, more importantly, is much more efficient during testing with the proposed forward-backward propagation algorithm.
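To make the idea of a continuous filter size concrete, the sketch below shows one way a real-valued size can change a convolution output smoothly: the output is a linear blend of the outputs for the two nearest odd integer sizes. This is an illustrative assumption in 1-D, not the OFS-CNN formulation or its forward-backward propagation algorithm, which the abstract does not describe. The names `conv1d_same`, `center_crop`, and `conv_continuous_size` are hypothetical.

```python
import math


def conv1d_same(x, w):
    """1-D correlation with zero padding so the output has len(x); len(w) must be odd."""
    r = len(w) // 2
    out = []
    for i in range(len(x)):
        s = 0.0
        for j, wj in enumerate(w):
            idx = i + j - r
            if 0 <= idx < len(x):
                s += wj * x[idx]
        out.append(s)
    return out


def center_crop(w, size):
    """Keep the central `size` taps of an odd-length filter."""
    start = (len(w) - size) // 2
    return w[start:start + size]


def conv_continuous_size(x, w_max, k):
    """Blend the outputs of the two nearest odd filter sizes around a real-valued k.

    w_max: weights at the largest allowed (odd) size
    k:     continuous filter size, clamped to [1, len(w_max)]
    """
    k_max = len(w_max)
    k = min(max(float(k), 1.0), float(k_max))
    k_lo = int(math.floor(k))
    if k_lo % 2 == 0:
        k_lo -= 1  # largest odd integer not exceeding k
    k_hi = min(k_lo + 2, k_max)
    # Blend weight in [0, 1); the output varies linearly in k between odd sizes.
    alpha = (k - k_lo) / 2.0 if k_hi > k_lo else 0.0
    y_lo = conv1d_same(x, center_crop(w_max, k_lo))
    y_hi = conv1d_same(x, center_crop(w_max, k_hi))
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(y_lo, y_hi)]


# Hypothetical signal and an all-ones filter of maximum size 5.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
w_max = [1.0, 1.0, 1.0, 1.0, 1.0]
y3 = conv_continuous_size(x, w_max, 3.0)
y4 = conv_continuous_size(x, w_max, 4.0)
y5 = conv_continuous_size(x, w_max, 5.0)
```

Because the output is piecewise linear in `k`, a gradient of the training loss with respect to `k` exists between odd sizes, which is the property a learned filter size needs.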

arXiv.org e-Print Archive

Crossref

Platelet Distribution Width Levels Can Be a Predictor in the Diagnosis of Persistent Organ Failure in Acute Pancreatitis

Authors: Feiyang Wang, Heshui Wu, Shoukang Li, Yushun Zhang, Zibo Meng
Publication venue: Hindawi Limited
Publication date: 01/01/2017

Purpose. Changes in serum platelet indices such as platelet distribution width (PDW) have been reported in a range of inflammatory reactions and clinical diseases. However, the relationship between PDW and the incidence of persistent organ failure (POF) in acute pancreatitis (AP) has not yet been elucidated. Materials and Methods. A total of 135 patients with AP admitted to our center within 72 hours of symptom onset between December 2014 and January 2016 were included in this retrospective study. Demographic parameters on admission, organ failure assessment, laboratory data, and in-hospital mortality were compared between patients with and without POF. Multivariable logistic regression analyses were used to evaluate the predictive value of serum PDW for POF. Results. Thirty patients were diagnosed with POF. Patients with POF showed a significantly higher serum PDW on admission than patients without POF (17.60 ± 1.96% versus 14.88 ± 2.24%, P<0.001). After multivariable analysis, a high PDW level remained a risk factor for POF (odds ratio 39.42, 95% CI: 8.64–179.77; P<0.001). A PDW cutoff of 16.45% predicted POF with an area under the curve (AUC) of 0.870, a sensitivity of 0.867, and a specificity of 0.771. Conclusions. Our results indicate that serum PDW on admission could be a predictive factor for POF in AP and may serve as a potential prognostic factor.
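For readers unfamiliar with how a cutoff such as 16.45% yields a sensitivity, a specificity, and an AUC, the sketch below computes all three. The PDW values and labels are invented purely for illustration and are not the study's data; only the 16.45 cutoff comes from the abstract. Treating values at or above the cutoff as positive is also an assumption.

```python
def sensitivity_specificity(values, has_pof, cutoff):
    """Score the rule 'value >= cutoff predicts POF' against the true labels."""
    tp = fn = tn = fp = 0
    for v, y in zip(values, has_pof):
        predicted = v >= cutoff  # assumed convention: at or above the cutoff is positive
        if y and predicted:
            tp += 1
        elif y:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(values, has_pof):
    """AUC as the fraction of (POF, non-POF) pairs ranked correctly; ties count half."""
    pos = [v for v, y in zip(values, has_pof) if y]
    neg = [v for v, y in zip(values, has_pof) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up PDW values (%) and POF labels, for illustration only.
pdw = [18.0, 17.0, 16.0, 15.0, 14.0, 17.5]
pof = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(pdw, pof, cutoff=16.45)
area = auc(pdw, pof)
```

Sensitivity is the share of POF patients at or above the cutoff, specificity is the share of non-POF patients below it, and the AUC summarizes ranking quality across all possible cutoffs.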

Crossref

Directory of Open Access Journals