31,339 research outputs found
A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"
Recently, technologies such as face detection, facial landmark localisation
and face recognition and verification have matured enough to provide effective
and efficient solutions for imagery captured under arbitrary conditions
(referred to as "in-the-wild"). This is partially attributed to the fact that
comprehensive "in-the-wild" benchmarks have been developed for face detection,
landmark localisation and recognition/verification. A very important technology
that has not been thoroughly evaluated yet is deformable face tracking
"in-the-wild". Until now, the performance has mainly been assessed
qualitatively by visually assessing the result of a deformable face tracking
technology on short videos. In this paper, we perform the first, to the best of
our knowledge, thorough evaluation of state-of-the-art deformable face tracking
pipelines using the recently introduced 300VW benchmark. We evaluate many
different architectures focusing mainly on the task of on-line deformable face
tracking. In particular, we compare the following general strategies: (a)
generic face detection plus generic facial landmark localisation, (b) generic
model free tracking plus generic facial landmark localisation, as well as (c)
hybrid approaches using state-of-the-art face detection, model free tracking
and facial landmark localisation technologies. Our evaluation reveals future
avenues for further research on the topic.Comment: E. Antonakos and P. Snape contributed equally and have joint second
authorshi
Online Object Tracking with Proposal Selection
Tracking-by-detection approaches are some of the most successful object
trackers in recent years. Their success is largely determined by the detector
model they learn initially and then update over time. However, under
challenging conditions where an object can undergo transformations, e.g.,
severe rotation, these methods are found to be lacking. In this paper, we
address this problem by formulating it as a proposal selection task and making
two contributions. The first one is introducing novel proposals estimated from
the geometric transformations undergone by the object, and building a rich
candidate set for predicting the object location. The second one is devising a
novel selection strategy using multiple cues, i.e., detection score and
edgeness score computed from state-of-the-art object edges and motion
boundaries. We extensively evaluate our approach on the visual object tracking
2014 challenge and online tracking benchmark datasets, and show the best
performance.Comment: ICCV 201
Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition
We present a unified framework for understanding human social behaviors in
raw image sequences. Our model jointly detects multiple individuals, infers
their social actions, and estimates the collective actions with a single
feed-forward pass through a neural network. We propose a single architecture
that does not rely on external detection algorithms but rather is trained
end-to-end to generate dense proposal maps that are refined via a novel
inference scheme. The temporal consistency is handled via a person-level
matching Recurrent Neural Network. The complete model takes as input a sequence
of frames and outputs detections along with the estimates of individual actions
and collective activities. We demonstrate state-of-the-art performance of our
algorithm on multiple publicly available benchmarks
- …