Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g. onset and offset phase for "smile", running and jumping for
"highjump"). The proposed model is inspired by recent work on Multiple
Instance Learning and latent SVM/HCRF; it extends such frameworks to
approximately model the ordinal aspect of the videos. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
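
As a rough illustration of what "ordered sub-events" can mean, the sketch below picks one frame per sub-event so that the summed template responses are maximal while the chosen frames stay in temporal order. This is not the paper's formulation: the abstract says the ordinal aspect is modeled only approximately, whereas this example assumes a hard ordering constraint. The function name, the score table and the two-template "onset/offset" example are all hypothetical.

```python
def best_ordered_assignment(scores):
    """Pick one frame per sub-event, in strictly increasing frame order,
    maximizing the summed scores. scores[k][t] is the (assumed) response
    of sub-event template k on frame t. Returns (total, frames)."""
    num_events, num_frames = len(scores), len(scores[0])
    if num_events > num_frames:
        raise ValueError("need at least as many frames as sub-events")
    neg_inf = float("-inf")
    best = [[neg_inf] * num_frames for _ in range(num_events)]
    back = [[-1] * num_frames for _ in range(num_events)]
    best[0] = list(scores[0])
    for k in range(1, num_events):
        run_max, run_arg = neg_inf, -1  # best of row k-1 over frames before t
        for t in range(num_frames):
            if run_max > neg_inf:
                best[k][t] = scores[k][t] + run_max
                back[k][t] = run_arg
            if best[k - 1][t] > run_max:
                run_max, run_arg = best[k - 1][t], t
    # Trace back from the best final frame of the last sub-event.
    t = max(range(num_frames), key=lambda i: best[-1][i])
    total = best[-1][t]
    frames = [t]
    for k in range(num_events - 1, 0, -1):
        t = back[k][t]
        frames.append(t)
    frames.reverse()
    return total, frames


# Hypothetical responses of an "onset" and an "offset" template on 4 frames.
scores = [[2, 9, 1, 3],   # onset
          [8, 1, 4, 7]]   # offset
total, frames = best_ordered_assignment(scores)
```

Ignoring order, the best frames would be 1 for onset and 0 for offset; the ordering constraint instead selects frames 1 and 3.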
LOMo: Latent Ordinal Model for Facial Analysis in Videos
We study the problem of facial analysis in videos. We propose a novel weakly
supervised learning method that models the video event (expression, pain etc.)
as a sequence of automatically mined, discriminative sub-events (e.g. onset and
offset phase for smile, brow lower and cheek raise for pain). The proposed
model is inspired by recent work on Multiple Instance Learning and latent
SVM/HCRF; it extends such frameworks to approximately model the ordinal or
temporal aspect of the videos. We obtain consistent improvements over relevant
competitive baselines on four challenging and publicly available video based
facial analysis datasets for prediction of expression, clinical pain and intent
in dyadic conversations. In combination with complementary features, we report
state-of-the-art results on these datasets.
Comment: 2016 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR
Weakly supervised coupled networks for visual sentiment analysis
Automatic assessment of sentiment from visual content
has gained considerable attention with the increasing tendency
of expressing opinions on-line. In this paper, we solve
the problem of visual sentiment analysis using the high-level
abstraction in the recognition process. Existing methods
based on convolutional neural networks learn sentiment
representations from the holistic image appearance. However,
different image regions can have a different influence
on the intended expression. This paper presents a weakly
supervised coupled convolutional network with two branches
to leverage the localized information. The first branch
detects a sentiment-specific soft map by training a fully convolutional
network with the cross-spatial pooling strategy,
which only requires image-level labels, thereby significantly
reducing the annotation burden. The second branch utilizes
both the holistic and localized information by coupling
the sentiment map with deep features for robust classification.
We integrate the sentiment detection and classification
branches into a unified deep framework and optimize
the network in an end-to-end manner. Extensive experiments
on six benchmark datasets demonstrate that the
proposed method performs favorably against the state-of-the-art
methods for visual sentiment analysis.
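
One plausible reading of "coupling the sentiment map with deep features" is sketched below: each feature channel is weighted element-wise by the soft map, and the pooled result is concatenated with the pooled unweighted features. The element-wise weighting, the average pooling, the concatenation and all names here are assumptions made for illustration, not details confirmed by the abstract.

```python
def global_average(feature_maps):
    """Mean activation per channel; feature_maps is indexed [C][H][W]."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]


def couple(feature_maps, sentiment_map):
    """Concatenate a holistic descriptor with a map-weighted (localized) one.

    sentiment_map is an [H][W] soft map with values in [0, 1] (assumed)."""
    weighted = [[[v * w for v, w in zip(f_row, m_row)]
                 for f_row, m_row in zip(ch, sentiment_map)]
                for ch in feature_maps]
    return global_average(feature_maps) + global_average(weighted)


# Two 2x2 feature channels and a map that keeps only the right column.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, 4.0], [4.0, 0.0]]]
soft_map = [[0.0, 1.0],
            [0.0, 1.0]]
descriptor = couple(features, soft_map)  # [2.5, 2.0, 1.5, 1.0]
```

The first half of the descriptor reflects the whole image, the second half only the regions the map highlights.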
Viraliency: Pooling Local Virality
In our overly-connected world, the automatic recognition of virality - the
quality of an image or video to be rapidly and widely spread in social networks
- is of crucial importance, and has recently awakened the interest of the
computer vision community. Concurrently, recent progress in deep learning
architectures showed that global pooling strategies allow the extraction of
activation maps, which highlight the parts of the image most likely to contain
instances of a certain class. We extend this concept by introducing a pooling
layer that learns the size of the support area to be averaged: the learned
top-N average (LENA) pooling. We hypothesize that the latent concepts (feature
maps) describing virality may require such a rich pooling strategy. We assess
the effectiveness of the LENA layer by appending it on top of a convolutional
siamese architecture and evaluate its performance on the task of predicting and
localizing virality. We report experiments on two publicly available datasets
annotated for virality and show that our method outperforms state-of-the-art
approaches.
Comment: Accepted at IEEE CVPR 201
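
To make the idea of a top-N average concrete, here is a minimal sketch that averages the N largest activations of a single feature map. In the abstract N is learned by the layer; here it is a fixed, hand-chosen argument, and the function name and example values are hypothetical.

```python
def top_n_average_pool(feature_map, n):
    """Average the n largest activations of one 2-D feature map."""
    values = sorted((v for row in feature_map for v in row), reverse=True)
    if not 1 <= n <= len(values):
        raise ValueError("n must be between 1 and the number of activations")
    return sum(values[:n]) / n


fmap = [[0.1, 0.9, 0.2],
        [0.4, 0.8, 0.0],
        [0.3, 0.5, 0.6]]

max_pooled = top_n_average_pool(fmap, 1)  # equals global max pooling
avg_pooled = top_n_average_pool(fmap, 9)  # equals global average pooling
top3 = top_n_average_pool(fmap, 3)        # mean of 0.9, 0.8 and 0.6
```

With N = 1 the operation reduces to global max pooling and with N equal to the map size it reduces to global average pooling, so N interpolates between the two.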
Facial Action Unit Detection Using Attention and Relation Learning
Attention mechanisms have recently attracted increasing attention in the field
of facial action unit (AU) detection. By finding the region of interest of each
AU with an attention mechanism, AU-related local features can be captured.
Most existing attention-based AU detection works use prior knowledge to
predefine fixed attentions or refine the predefined attentions within a small
range, which limits their capacity to model various AUs. In this paper, we
propose an end-to-end deep learning based attention and relation learning
framework for AU detection with only AU labels, which has not been explored
before. In particular, multi-scale features shared by each AU are first
learned, and then both channel-wise and spatial attentions are adaptively
learned to select and extract AU-related local features. Moreover, pixel-level
relations for AUs are further captured to refine spatial attentions so as to
extract more relevant local features. Without changing the network
architecture, our framework can be easily extended for AU intensity estimation.
Extensive experiments show that our framework (i) soundly outperforms the
state-of-the-art methods for both AU detection and AU intensity estimation on
the challenging BP4D, DISFA, FERA 2015 and BP4D+ benchmarks, (ii) can
adaptively capture the correlated regions of each AU, and (iii) also works well
under severe occlusions and large poses.
Comment: This paper is accepted by IEEE Transactions on Affective Computin