10,938 research outputs found
Temporal Exemplar-based Bayesian Networks for facial expression recognition
Proceedings of the International Conference on Machine Learning and Applications, 2008, p. 16-22.
We present Temporal Exemplar-based Bayesian Networks (TEBNs) for facial expression recognition. The proposed Bayesian Networks (BNs) consist of three layers: an Observation layer, an Exemplars layer and a Prior Knowledge layer. In the Exemplars layer, an exemplar-based model is integrated with BNs to improve the accuracy of probability estimation. In the Prior Knowledge layer, static BNs are extended to temporal BNs by considering historical observations to model the temporal behavior of facial expressions. Experiments on the CMU expression database illustrate that the proposed TEBNs are very efficient in modeling the evolution of facial deformation. © 2008 IEEE.
Taking the bite out of automated naming of characters in TV video
We investigate the problem of automatically labelling appearances of characters in TV or film material with their names. This is tremendously challenging due to the huge variation in imaged appearance of each character and the weakness and ambiguity of available annotation. However, we demonstrate that high precision can be achieved by combining multiple sources of information, both visual and textual. The principal novelties that we introduce are: (i) automatic generation of time stamped character annotation by aligning subtitles and transcripts; (ii) strengthening the supervisory information by identifying when characters are speaking. In addition, we incorporate complementary cues of face matching and clothing matching to propose common annotations for face tracks, and consider choices of classifier which can potentially correct errors made in the automatic extraction of training data from the weak textual annotation. Results are presented on episodes of the TV series "Buffy the Vampire Slayer".
LOMo: Latent Ordinal Model for Facial Analysis in Videos
We study the problem of facial analysis in videos. We propose a novel weakly
supervised learning method that models the video event (expression, pain etc.)
as a sequence of automatically mined, discriminative sub-events (e.g. onset and
offset phase for smile, brow lower and cheek raise for pain). The proposed
model is inspired by the recent works on Multiple Instance Learning and latent
SVM/HCRF; it extends such frameworks to model the ordinal or temporal aspect in
the videos, approximately. We obtain consistent improvements over relevant
competitive baselines on four challenging and publicly available video based
facial analysis datasets for prediction of expression, clinical pain and intent
in dyadic conversations. In combination with complementary features, we report
state-of-the-art results on these datasets. Comment: 2016 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR)
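The idea of scoring a video as an ordered sequence of sub-events can be illustrated with a small sketch. This is not the authors' formulation: the abstract says the ordinal aspect is modelled only approximately, whereas the sketch below swaps in a hard ordering constraint solved exactly by dynamic programming. The function name `best_ordered_assignment` and the `scores` layout (one row of per-frame responses per sub-event template) are assumptions made for illustration.

```python
from typing import List, Tuple


def best_ordered_assignment(scores: List[List[float]]) -> Tuple[float, List[int]]:
    """Place K sub-events on frames t_0 < t_1 < ... < t_{K-1}, maximising the summed score.

    scores[k][t] is the (hypothetical) response of sub-event template k on frame t.
    Returns the best total score and the chosen frame index for each sub-event.
    """
    num_events = len(scores)
    num_frames = len(scores[0])
    if num_events > num_frames:
        raise ValueError("need at least as many frames as sub-events")

    neg_inf = float("-inf")
    # best[k][t]: best total for sub-events 0..k with sub-event k on some frame <= t.
    best = [[neg_inf] * num_frames for _ in range(num_events)]
    # choice[k][t]: the frame used for sub-event k in that best partial solution.
    choice = [[-1] * num_frames for _ in range(num_events)]

    for k in range(num_events):
        # Sub-event k cannot sit before frame k, since k earlier sub-events precede it.
        for t in range(k, num_frames):
            previous = 0.0 if k == 0 else best[k - 1][t - 1]
            place_here = previous + scores[k][t]
            if t > k and best[k][t - 1] >= place_here:
                # Keeping the earlier placement of sub-event k is at least as good.
                best[k][t] = best[k][t - 1]
                choice[k][t] = choice[k][t - 1]
            else:
                best[k][t] = place_here
                choice[k][t] = t

    # Backtrack from the last sub-event to recover the chosen frames.
    frames: List[int] = []
    t = num_frames - 1
    for k in range(num_events - 1, -1, -1):
        frame = choice[k][t]
        frames.append(frame)
        t = frame - 1
    frames.reverse()
    return best[num_events - 1][num_frames - 1], frames


# Toy example: two sub-events (say, "onset" then "offset") over four frames.
toy_scores = [
    [1.0, 5.0, 0.0, 0.0],  # sub-event 0 responds most on frame 1
    [4.0, 0.0, 2.0, 3.0],  # sub-event 1 peaks on frame 0, which would violate the order
]
total, frames = best_ordered_assignment(toy_scores)
print(total, frames)
```

In the toy example the second template responds most strongly on frame 0, but the ordering constraint forces it to a frame after the first sub-event, so the best assignment uses frames 1 and 3.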
Familiarization through Ambient Images Alone
The term "ambient images" has begun to show up in much of the current literature on facial recognition. Ambient images refer to naturally occurring views of a face that capture the idiosyncratic ways in which a target face may vary (Ritchie & Burton, 2017). Much of the literature on ambient images has concluded that exposing people to ambient images of a target face can lead to improved facial recognition for that target face. Some studies have even suggested that familiarity is the result of increased exposure to ambient images of a target face (Burton, Kramer, Ritchie, & Jenkins, 2016). The current study extended the literature on ambient images. Using the face sorting paradigm from Jenkins, White, Van Montfort, and Burton (2011), the current study served three purposes. First, this study captured whether there was an incremental benefit in showing ambient images. In particular, we observed whether performance improved as participants were shown a low, medium, or high number of ambient images. Next, this study attempted to provide a strong enough manipulation that participants would be able to perform the face sorting task perfectly after being exposed to a high number (45 total) of ambient images. Lastly, this study introduced time data as a measure of face familiarity. The results found support for one aim of this study and partial support for another aim of this study. Time data were found to be an effective quantitative measure of familiarity. Also, there was some evidence of an incremental benefit of ambient images, but that benefit disappeared after viewing around 15 unique exemplar presentations of a novel identity's face. Lastly, exposing participants to 45 ambient images alone did not cause them to reach perfect performance. The paper concludes with a discussion on the need to extend past ambient images to understand how to best mimic natural familiarity in a lab setting.
Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g. onset and offset phase for "smile", running and jumping for
"highjump"). The proposed model is inspired by the recent works on Multiple
Instance Learning and latent SVM/HCRF -- it extends such frameworks to model
the ordinal aspect in the videos, approximately. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method. Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
Mean value coordinates-based caricature and expression synthesis
We present a novel method for caricature synthesis based on mean value coordinates (MVC). Our method can be applied to any single frontal face image to learn a specified caricature face pair for frontal and 3D caricature synthesis. This technique only requires one or a small number of exemplar pairs and a natural frontal face image training set, while the system can transfer the style of the exemplar pair across individuals. Further exaggeration can be fulfilled in a controllable way. Our method is further applied to facial expression transfer, interpolation, and exaggeration, which are applications of expression editing. Additionally, we have extended our approach to 3D caricature synthesis based on the 3D version of MVC. With experiments we demonstrate that the transferred expressions are credible and the resulting caricatures can be characterized and recognized.
A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"
Recently, technologies such as face detection, facial landmark localisation
and face recognition and verification have matured enough to provide effective
and efficient solutions for imagery captured under arbitrary conditions
(referred to as "in-the-wild"). This is partially attributed to the fact that
comprehensive "in-the-wild" benchmarks have been developed for face detection,
landmark localisation and recognition/verification. A very important technology
that has not been thoroughly evaluated yet is deformable face tracking
"in-the-wild". Until now, the performance has mainly been assessed
qualitatively by visually assessing the result of a deformable face tracking
technology on short videos. In this paper, we perform the first, to the best of
our knowledge, thorough evaluation of state-of-the-art deformable face tracking
pipelines using the recently introduced 300VW benchmark. We evaluate many
different architectures focusing mainly on the task of on-line deformable face
tracking. In particular, we compare the following general strategies: (a)
generic face detection plus generic facial landmark localisation, (b) generic
model free tracking plus generic facial landmark localisation, as well as (c)
hybrid approaches using state-of-the-art face detection, model free tracking
and facial landmark localisation technologies. Our evaluation reveals future
avenues for further research on the topic. Comment: E. Antonakos and P. Snape
contributed equally and have joint second authorship