Mean Oriented Riesz Features for Micro Expression Classification
Micro-expressions are brief and subtle facial expressions that appear and
vanish on the face in a fraction of a second. This kind of facial expression
usually occurs in high-stakes situations and is considered to reflect a
human's real intent. There has been some interest in micro-expression
analysis; however, the great majority of methods are based on classically
established computer vision techniques such as local binary patterns,
histograms of gradients and optical flow. A novel methodology for
micro-expression recognition using the Riesz pyramid, a multi-scale steerable
Hilbert transform, is presented. The image sequence is transformed with this
tool, and the image phase variations are then extracted and filtered as
proxies for motion. Furthermore, the dominant orientation constancy of the
Riesz transform is exploited to average the micro-expression sequence into an
image pair. Based on that, the Mean Oriented Riesz Feature description is
introduced. Finally, the performance of our methods is tested on two
spontaneous micro-expression databases and compared to state-of-the-art
methods.
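The core step, extracting local phase and dominant orientation from the Riesz transform, can be sketched at a single scale as follows. This is a minimal NumPy illustration under the standard monogenic-signal formulation, not the authors' implementation; a real Riesz pyramid applies it band by band to a multi-scale decomposition:

```python
import numpy as np

def riesz_phase(img):
    """Single-scale approximation of the Riesz transform (monogenic signal).

    Returns local amplitude, phase and dominant orientation for a bandpassed
    image. Frequency-domain transfer functions of the two Riesz components:
    R1 = -i*wx/|w|, R2 = -i*wy/|w|.
    """
    h, w = img.shape
    wy = np.fft.fftfreq(h)[:, None]
    wx = np.fft.fftfreq(w)[None, :]
    mag = np.sqrt(wx**2 + wy**2)
    mag[0, 0] = 1.0                                    # avoid division by zero at DC
    F = np.fft.fft2(img)
    r1 = np.real(np.fft.ifft2(F * (-1j * wx / mag)))   # horizontal Riesz component
    r2 = np.real(np.fft.ifft2(F * (-1j * wy / mag)))   # vertical Riesz component
    amplitude = np.sqrt(img**2 + r1**2 + r2**2)
    phase = np.arctan2(np.sqrt(r1**2 + r2**2), img)    # local phase, in [0, pi]
    orientation = np.arctan2(r2, r1)                   # dominant orientation
    return amplitude, phase, orientation
```

For a pure horizontal sinusoid the Riesz pair behaves like a quadrature filter: the amplitude is constant and the phase tracks the oscillation, which is the property the paper's phase-based motion proxies rely on.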
MIMAMO Net: Integrating Micro- and Macro-motion for Video Emotion Recognition
Spatial-temporal feature learning is of vital importance for video emotion
recognition. Previous deep network structures often focused on macro-motion
which extends over long time scales, e.g., on the order of seconds. We believe
integrating structures capturing information about both micro- and macro-motion
will benefit emotion prediction, because humans perceive both micro- and
macro-expressions. In this paper, we propose to combine micro- and macro-motion
features to improve video emotion recognition with a two-stream recurrent
network, named MIMAMO (Micro-Macro-Motion) Net. Specifically, smaller and
shorter micro-motions are analyzed by a two-stream network, while larger and
more sustained macro-motions can be well captured by a subsequent recurrent
network. Assigning specific interpretations to the roles of different parts of
the network enables us to make choices of parameters based on prior knowledge:
choices that turn out to be optimal. One of the important innovations in our
model is the use of interframe phase differences rather than optical flow as
input to the temporal stream. Compared with the optical flow, phase differences
require less computation and are more robust to illumination changes. Our
proposed network achieves state-of-the-art performance on two video emotion
datasets, the OMG emotion dataset and the Aff-Wild dataset. The most
significant gains are for arousal prediction, for which motion information is
intuitively more informative. Source code is available at
https://github.com/wtomin/MIMAMO-Net.
Comment: Accepted by AAAI 2020
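The temporal-stream input described above, interframe phase differences, can be illustrated with a single complex band-pass filter: a translation of image content shifts the local phase of the filtered response, so the phase difference between frames is a cheap, illumination-robust motion proxy. The filter below is a hypothetical single orientation and scale for illustration; the actual network uses a full multi-orientation filter bank:

```python
import numpy as np

def phase_difference(frame1, frame2, fx=0.1, fy=0.0, sigma=0.03):
    """Interframe phase difference from a complex band-pass (Gabor-like) filter.

    A one-sided Gaussian window in the frequency domain yields a complex-valued
    response; angle(c2 * conj(c1)) measures the per-pixel phase shift between
    the two frames along the filter orientation.
    """
    h, w = frame1.shape
    wy = np.fft.fftfreq(h)[:, None]
    wx = np.fft.fftfreq(w)[None, :]
    # one-sided Gaussian window centred at (fx, fy)
    G = np.exp(-((wx - fx)**2 + (wy - fy)**2) / (2 * sigma**2))
    c1 = np.fft.ifft2(np.fft.fft2(frame1) * G)
    c2 = np.fft.ifft2(np.fft.fft2(frame2) * G)
    return np.angle(c2 * np.conj(c1))   # phase shift per pixel, in radians
```

Unlike optical flow, this requires only two FFTs and a multiply per filter, which is the computational advantage the abstract refers to.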
Micro Expression Spotting through Appearance Based Descriptor and Distance Analysis
Micro-Expressions (MEs) are a typical kind of expression that are subtle and short-lived in nature and reveal the hidden emotion of human beings. Because it requires processing an entire video, ME recognition constitutes a huge computational burden and also consumes considerable time. Hence, ME spotting is required, which locates the exact frames during which the ME movement persists. Spotting is regarded as a primary step for ME recognition. This paper proposes a new method for ME spotting which comprises three stages: pre-processing, feature extraction and discrimination. Pre-processing aligns the facial region in every frame based on three landmark points derived from three landmark regions. To do the alignment, an in-plane rotation matrix is used which rotates the non-aligned coordinates into aligned coordinates. For feature extraction, two texture-based descriptors are deployed: Local Binary Pattern (LBP) and Local Mean Binary Pattern (LMBP). Finally, at the discrimination stage, feature difference analysis is employed through the Chi-Squared Distance (CSD), and the distance of each frame is compared with a threshold to spot three frames, namely onset, apex and offset. Simulations are carried out on the standard CASME dataset, and performance is verified through feature difference and F1-score. The obtained results show that the proposed method is superior to the state-of-the-art methods.
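The feature difference analysis described above can be sketched as follows. This is a simplified whole-frame illustration comparing every frame against the first frame; the paper's pipeline operates on aligned facial blocks with both LBP and LMBP and uses a sliding-window reference, so treat the specifics here as assumptions:

```python
import numpy as np

def lbp_histogram(img):
    """Basic 8-neighbour Local Binary Pattern histogram (256 bins, L1-normalised)."""
    c = img[1:-1, 1:-1]
    code = np.zeros_like(c, dtype=np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (n >= c).astype(np.uint8) << bit
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def chi_squared(h1, h2, eps=1e-10):
    """Chi-squared distance between two normalised histograms."""
    return 0.5 * np.sum((h1 - h2)**2 / (h1 + h2 + eps))

def spot(frames, threshold):
    """Flag frames whose LBP feature difference from the reference frame
    exceeds a threshold -- a simplified stand-in for the paper's feature
    difference analysis."""
    ref = lbp_histogram(frames[0])
    dists = np.array([chi_squared(lbp_histogram(f), ref) for f in frames])
    return dists, dists > threshold
```

Frames whose distance rises above and falls back below the threshold delimit the onset and offset, with the apex at the distance peak.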
Simultaneous magnification of subtle motions and color variations in videos using Riesz pyramids
Videos often contain subtle motions and color variations that cannot be easily observed. Examples include head motion and changes in facial skin color due to blood flow driven by the heart's pumping rhythm. A few techniques have been developed to magnify these subtle signals. However, they are not easily applied to many applications. First of all, previous techniques were targeted specifically towards magnification of either motion or color variations; trying to magnify both aspects by applying two of these techniques in sequence does not produce good results. We present a method for magnifying subtle motions and color variations in videos simultaneously. Our approach is based on the Riesz pyramid, which was previously used only for motion magnification. Besides modifying the local phases of the coefficients of this pyramid, we show how altering its local amplitudes and its residue can be used to produce intensity (color) magnification. We demonstrate the effectiveness of our technique on multiple videos by revealing both subtle signals simultaneously. Finally, we also developed an Android application as a proof of concept that can be used for magnifying either motion or color changes.
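A toy per-band sketch of the idea, assuming the amplitude and phase stacks have already been obtained from a Riesz decomposition: temporal phase variations encode motion and are scaled by one factor, while temporal amplitude variations encode intensity and are scaled by another. The paper temporally band-passes these signals; here, as a simplifying assumption, the temporal mean is removed instead:

```python
import numpy as np

def magnify_band(amplitude, phase, alpha=10.0, beta=5.0):
    """Toy per-pixel magnification of one Riesz-pyramid band over time.

    amplitude, phase : (T, H, W) stacks from a Riesz decomposition.
    alpha scales temporal phase deviations (motion magnification);
    beta scales temporal amplitude deviations (intensity/color magnification).
    """
    dphi = phase - phase.mean(axis=0, keepdims=True)        # phase variation ~ motion
    da = amplitude - amplitude.mean(axis=0, keepdims=True)  # amplitude variation ~ intensity
    amp_out = amplitude + beta * da
    phase_out = phase + alpha * dphi
    # reconstruct the (real) band from the magnified monogenic components
    return amp_out * np.cos(phase_out)
```

Because the two scalings act on independent components of the same decomposition, both effects can be produced in a single pass, which is the advantage over chaining two separate magnification techniques.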
Facial Micro- and Macro-Expression Spotting and Generation Methods
Facial micro-expression (ME) recognition requires a face movement interval as input, but computational methods for spotting MEs still underperform. This is due to the lack of large-scale long-video datasets, and ME generation methods are still in their infancy. This thesis presents methods to address these data deficiency issues and introduces a new method for spotting macro- and micro-expressions simultaneously.
This thesis introduces SAMM Long Videos (SAMM-LV), which contains 147 annotated long videos, and develops a baseline method to facilitate the ME Grand Challenge 2020. Further, reference-guided style transfer with StarGANv2 is applied to SAMM-LV to generate a synthetic dataset, namely SAMM-SYNTH. The quality of SAMM-SYNTH is evaluated using facial action units detected by OpenFace. Quantitative measurement shows high correlations between the original and synthetic data on two Action Units (AU12 and AU6).
For facial expression spotting, a two-stream 3D Convolutional Neural Network with temporally oriented frame skips that can spot micro- and macro-expressions simultaneously is proposed. This method achieves state-of-the-art performance on SAMM-LV and is competitive on CAS(ME)2; it was used as the baseline result of the ME Grand Challenge 2021. The F1-score improves to 0.1036 when trained with composite data consisting of SAMM-LV and SAMM-SYNTH. On the unseen ME Grand Challenge 2022 evaluation dataset, it achieves an F1-score of 0.1531.
Finally, a new sequence generation method is proposed to explore the capability of deep learning networks. It generates spontaneous facial expressions using only two input sequences, without any labels. SSIM and NIQE were used for image quality analysis, and the generated data achieved scores of 0.87 and 23.14, respectively. By visualising the movements using optical flow values and absolute frame differences, this method demonstrates its potential in generating subtle MEs. For realism evaluation, the generated videos were rated using two facial expression recognition networks.
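The spotting F1-scores quoted above follow the evaluation protocol of the ME Grand Challenges, under which a predicted [onset, offset] interval counts as a true positive when its intersection-over-union with an unmatched ground-truth interval reaches 0.5. A minimal sketch of that metric, assuming frame-index intervals:

```python
def interval_iou(a, b):
    """IoU of two inclusive [onset, offset] frame intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union

def spotting_f1(predicted, ground_truth, iou_thresh=0.5):
    """F1-score for expression spotting: each ground-truth interval can be
    matched by at most one prediction with IoU >= iou_thresh."""
    matched = set()
    tp = 0
    for p in predicted:
        for i, g in enumerate(ground_truth):
            if i not in matched and interval_iou(p, g) >= iou_thresh:
                matched.add(i)
                tp += 1
                break
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(ground_truth)
    return 2 * precision * recall / (precision + recall)
```

The low absolute values reported (around 0.1) reflect how strict this interval-overlap criterion is on long, mostly neutral videos.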
Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics
Three recent breakthroughs due to AI in arts and science serve as motivation:
an award-winning digital image, protein folding, and fast matrix
multiplication.
Many recent developments in artificial neural networks, particularly deep
learning (DL), applied and relevant to computational mechanics (solid, fluids,
finite-element technology) are reviewed in detail. Both hybrid and pure machine
learning (ML) methods are discussed. Hybrid methods combine traditional PDE
discretizations with ML methods either (1) to help model complex nonlinear
constitutive relations, (2) to nonlinearly reduce the model order for efficient
simulation (turbulence), or (3) to accelerate the simulation by predicting
certain components in the traditional integration methods. Here, methods (1)
and (2) relied on Long-Short-Term Memory (LSTM) architecture, with method (3)
relying on convolutional neural networks. Pure ML methods to solve (nonlinear)
PDEs are represented by Physics-Informed Neural Network (PINN) methods, which
could be combined with an attention mechanism to address discontinuous
solutions.
Both LSTM and attention architectures, together with modern optimizers and
classic optimizers generalized to include stochasticity for DL networks, are
extensively
reviewed. Kernel machines, including Gaussian processes, are covered to
sufficient depth for more advanced works such as shallow networks with
infinite width. The review does not only address experts: readers are assumed
to be familiar with computational mechanics, but not with DL, whose concepts
and applications are built up from the basics, aiming to bring first-time
learners quickly to the forefront of research. The history and limitations of
AI are recounted and discussed, with particular attention to pointing out
misstatements or misconceptions of the classics, even in well-known
references. Positioning and pointing control of a large-deformable beam is
given as an example.
Comment: 275 pages, 158 figures. Appeared online on 2023.03.01 at
CMES-Computer Modeling in Engineering & Science
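Since the LSTM architecture carries two of the three hybrid-method classes reviewed, a minimal forward step of a standard LSTM cell may help fix notation. This is an illustrative NumPy sketch of the textbook cell, not code from the review:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One forward step of a standard LSTM cell.

    Gates are stacked along the first axis of W (input-to-hidden), U
    (hidden-to-hidden) and b, in the order: input gate i, forget gate f,
    cell candidate g, output gate o.  Shapes: x (n_in,), h_prev and c_prev
    (n_h,), W (4*n_h, n_in), U (4*n_h, n_h), b (4*n_h,).
    """
    n_h = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0 * n_h:1 * n_h])      # input gate
    f = sigmoid(z[1 * n_h:2 * n_h])      # forget gate
    g = np.tanh(z[2 * n_h:3 * n_h])      # candidate cell state
    o = sigmoid(z[3 * n_h:4 * n_h])      # output gate
    c = f * c_prev + i * g               # new cell state
    h = o * np.tanh(c)                   # new hidden state
    return h, c
```

The additive cell-state update c = f * c_prev + i * g is what lets gradients flow over long sequences, the property exploited when LSTMs model history-dependent constitutive relations or reduced-order dynamics.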