Py-Feat: Python Facial Expression Analysis Toolbox
Studying facial expressions is a notoriously difficult endeavor. Recent
advances in the field of affective computing have yielded impressive progress
in automatically detecting facial expressions from pictures and videos.
However, much of this work has yet to be widely disseminated in social science
domains such as psychology. Current state-of-the-art models require
considerable domain expertise that is not traditionally incorporated into
social science training programs. Furthermore, there is a notable absence of
user-friendly and open-source software that provides a comprehensive set of
tools and functions that support facial expression research. In this paper, we
introduce Py-Feat, an open-source Python toolbox that provides support for
detecting, preprocessing, analyzing, and visualizing facial expression data.
Py-Feat makes it easy for domain experts to disseminate and benchmark computer
vision models and also for end users to quickly process, analyze, and visualize
facial expression data. We hope this platform will facilitate increased use of
facial expression data in human behavior research.
Comment: 25 pages, 3 figures, 5 table
Baseline CNN structure analysis for facial expression recognition
We present a baseline convolutional neural network (CNN) structure and image
preprocessing methodology to improve CNN-based facial expression recognition.
To identify the most efficient network structure, we investigated
four network structures that are known to show good performance in facial
expression recognition. Moreover, we also investigated the effect of input
image preprocessing methods. Five types of data input (raw, histogram
equalization, isotropic smoothing, diffusion-based normalization, difference of
Gaussian) were tested, and the accuracy was compared. We trained 20 different
CNN models (4 networks x 5 data input types) and verified the performance of
each network with test images from five different databases. The experimental
results showed that a three-layer structure consisting of a simple convolutional
and a max pooling layer with histogram-equalized image input was the most
efficient. We describe the detailed training procedure and analyze the test
accuracy results based on extensive observation.
Comment: 6 pages, RO-MAN2016 Conferenc
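
For readers unfamiliar with the preprocessing step the abstract above reports as most effective, the following is a minimal sketch of histogram equalization for an 8-bit grayscale image. It is not the authors' implementation: the function name equalize_histogram and the NumPy-only approach are assumptions made for illustration, and the paper may have used a different variant or library.

```python
import numpy as np


def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Histogram-equalize a 2-D uint8 grayscale image (illustrative sketch)."""
    # Count how many pixels take each of the 256 possible intensities.
    hist = np.bincount(image.ravel(), minlength=256)
    # Cumulative distribution of intensities.
    cdf = hist.cumsum()
    # CDF value at the darkest intensity actually present in the image.
    cdf_min = cdf[cdf > 0][0]
    total = image.size
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return image.copy()
    # Map each intensity so the output CDF is approximately linear over 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    # Apply the lookup table to every pixel.
    return lut[image]


if __name__ == "__main__":
    # A low-contrast 2x2 image whose values span only 100..103.
    low_contrast = np.array([[100, 101], [102, 103]], dtype=np.uint8)
    print(equalize_histogram(low_contrast))
```

The lookup table spreads the intensities that are present across the full 0 to 255 range, which is why this kind of normalization can reduce lighting differences between face images drawn from different databases.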