Spontaneous Subtle Expression Detection and Recognition based on Facial Strain
Optical strain is an extension of optical flow that is capable of quantifying
subtle changes on faces and representing the minute facial motion intensities
at the pixel level. This is computationally essential for the relatively new
field of spontaneous micro-expression, where subtle expressions can be
technically challenging to pinpoint. In this paper, we present a novel method
for detecting and recognizing micro-expressions by utilizing facial optical
strain magnitudes to construct optical strain features and optical strain
weighted features. The two sets of features are then concatenated to form the
resultant feature histogram. Experiments were performed on the CASME II and
SMIC databases. On both databases, we demonstrate the usefulness of optical
strain information and, more importantly, that our best approaches
outperform the original baseline results for both detection and recognition
tasks. A comparison of the proposed method with other existing spatio-temporal
feature extraction approaches is also presented.
Comment: 21 pages (including references), single column format, accepted to
Signal Processing: Image Communication journal
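The abstract does not spell out how optical strain is derived from optical flow. A minimal sketch follows, assuming the commonly used infinitesimal strain tensor, eps = 1/2 [grad(u) + grad(u)^T], with the per-pixel magnitude sqrt(e_xx^2 + e_yy^2 + 2 e_xy^2). The function name, the finite-difference scheme, and the list-of-lists flow layout are illustrative assumptions, not the authors' implementation; the dense flow field (u, v) is assumed to have been computed elsewhere.

```python
import math

def optical_strain_magnitude(u, v):
    """Per-pixel optical strain magnitude from a dense flow field.

    u, v: 2-D lists (rows x cols) holding the horizontal and vertical
    flow components. Hypothetical helper, written for illustration only.
    """
    rows, cols = len(u), len(u[0])

    def d_dx(f, r, c):
        # Central difference in the interior, one-sided at the borders.
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        return (f[r][hi] - f[r][lo]) / (hi - lo) if hi != lo else 0.0

    def d_dy(f, r, c):
        lo, hi = max(r - 1, 0), min(r + 1, rows - 1)
        return (f[hi][c] - f[lo][c]) / (hi - lo) if hi != lo else 0.0

    mag = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            e_xx = d_dx(u, r, c)                          # normal strain in x
            e_yy = d_dy(v, r, c)                          # normal strain in y
            e_xy = 0.5 * (d_dy(u, r, c) + d_dx(v, r, c))  # shear strain
            # e_xy equals e_yx, so the shear term is counted twice.
            mag[r][c] = math.sqrt(e_xx ** 2 + e_yy ** 2 + 2.0 * e_xy ** 2)
    return mag
```

Under this definition a rigid translation (constant flow) yields zero strain everywhere, which is why strain can highlight small local deformations that raw flow magnitude would mix with global head motion.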
Carried baggage detection and recognition in video surveillance with foreground segmentation
Security cameras installed in public spaces or in private organizations continuously
record video data with the aim of detecting and preventing crime. For that reason,
video content analysis applications, either for real time (i.e. analytic) or post-event
(i.e. forensic) analysis, have gained high interest in recent years. In this thesis,
the primary focus is on two key aspects of video analysis, reliable moving object
segmentation and carried object detection & identification.
A novel moving object segmentation scheme by background subtraction is presented
in this thesis. The scheme relies on background modelling which is based
on multi-directional gradient and phase congruency. As a post processing step,
the detected foreground contours are refined by classifying the edge segments as
either belonging to the foreground or background. In addition, a contour completion
technique based on anisotropic diffusion is introduced to this area for the first time. The proposed
method targets cast shadow removal, gradual illumination change invariance, and
closed contour extraction.
A state-of-the-art carried object detection method is employed as a benchmark
algorithm. This method includes silhouette analysis by comparing human temporal
templates with unencumbered human models. The implementation aspects of
the algorithm are improved by automatically estimating the viewing direction of
the pedestrian and are extended by a carried luggage identification module. As
the temporal template is a frequency template and the information that it provides
is not sufficient, a colour temporal template is introduced. The standard
steps of the state-of-the-art algorithm are revisited from a perspective
extended with colour information, resulting in more accurate carried
object segmentation.
The experiments conducted in this research show that the proposed closed
foreground segmentation technique attains all the aforementioned goals. The incremental
improvements applied to the state-of-the-art carried object detection
algorithm reveal the full potential of the scheme. The experiments demonstrate
the ability of the proposed carried object detection algorithm to outperform the
state-of-the-art method.
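The silhouette analysis described above can be sketched roughly as follows, assuming that a temporal (frequency) template is the per-pixel average of aligned binary silhouettes and that candidate carried-object pixels are those occupied often in the template but absent from an unencumbered human model. The function names, the threshold, and the data layout are hypothetical; alignment, viewing-direction estimation, and the colour template from the thesis are not modelled here.

```python
def temporal_template(silhouettes):
    """Average aligned binary silhouettes into a frequency map in [0, 1].

    silhouettes: list of 2-D lists of 0/1 values, all of the same size.
    """
    n = len(silhouettes)
    rows, cols = len(silhouettes[0]), len(silhouettes[0][0])
    return [[sum(s[r][c] for s in silhouettes) / n for c in range(cols)]
            for r in range(rows)]

def protrusion_mask(template, model, threshold=0.5):
    """Mark pixels that the person occupies frequently but the
    unencumbered model does not (assumed threshold, illustration only)."""
    return [[1 if (t - m) > threshold else 0 for t, m in zip(t_row, m_row)]
            for t_row, m_row in zip(template, model)]

# Usage: two aligned frames, with a bag-like protrusion in the last column.
frames = [[[0, 1, 1], [0, 1, 0]],
          [[0, 1, 1], [0, 1, 1]]]
model = [[0, 1, 0], [0, 1, 0]]
mask = protrusion_mask(temporal_template(frames), model)
```

Pixels that appear in only some frames fall below the threshold and are discarded, which matches the intuition that swinging limbs are transient while a carried object stays attached to the body.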
SCOTCH and SODA: A Transformer Video Shadow Detection Framework
Shadows in videos are difficult to detect because of the large shadow
deformation between frames. In this work, we argue that accounting for shadow
deformation is essential when designing a video shadow detection method. To
this end, we introduce the shadow deformation attention trajectory (SODA), a
new type of video self-attention module, specially designed to handle the large
shadow deformations in videos. Moreover, we present a new shadow contrastive
learning mechanism (SCOTCH) which aims at guiding the network to learn a
unified shadow representation from massive positive shadow pairs across
different videos. We demonstrate empirically the effectiveness of our two
contributions in an ablation study. Furthermore, we show that SCOTCH and SODA
significantly outperform existing techniques for video shadow detection. Code
is available at the project page:
https://lihaoliu-cambridge.github.io/scotch_and_soda/
Comment: Accepted to CVPR 202
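The abstract does not give the form of the shadow contrastive objective. As a rough illustration of contrastive learning over positive pairs, here is a generic InfoNCE-style loss averaged over several positives. It is a stand-in, not the SCOTCH loss itself; the function names, the temperature value, and the use of cosine similarity are assumptions made for this sketch.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Generic InfoNCE-style loss, averaged over all positives.

    anchor: embedding of one shadow region (hypothetical input).
    positives: embeddings of shadow regions, possibly from other videos.
    negatives: embeddings of non-shadow regions.
    """
    neg_sum = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    total = 0.0
    for p in positives:
        pos = math.exp(cosine(anchor, p) / temperature)
        # Low when the positive is much closer to the anchor than the negatives.
        total += -math.log(pos / (pos + neg_sum))
    return total / len(positives)
```

Minimising a loss of this kind pulls shadow embeddings from different videos together and pushes non-shadow embeddings away, which is one way to read the "unified shadow representation" the abstract describes.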