
Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction

Authors: Akar Gozde Bozdagi; Ozkan Savas
Publication date: 24/08/2017

Frame-level visual features are generally aggregated in time with techniques such as LSTM, Fisher Vectors, and NetVLAD to produce a robust video-level representation. Here we introduce a learnable aggregation technique whose primary objective is to retain the short-time temporal structure between frame-level features and their spatial interdependencies in the representation. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a score of 65% on the test dataset in the official MAP evaluation, only one misclassified decision short of the best reported result in the ChaLearn Challenge (66.7%). Lastly, we believe that this method can be extended to other problems, such as action/event recognition, in the future.

Comment: Submitted to International Conference on Computer Vision Workshop

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)

Fusing deep learned and hand-crafted features of appearance, shape, and dynamics for automatic pain estimation

Authors: Egede Joy Onyekachukwu; Martinez Brais; Valstar Michel F.
Publication date: 17/01/2017

Automatic continuous-time, continuous-value assessment of a patient's pain from face video is highly sought after by the medical profession. Despite recent advances in deep learning that attain impressive results in many domains, pain estimation risks not benefiting from them because data sets of considerable size are difficult to obtain. In this work we propose a combination of hand-crafted and deep-learned features that makes the most of deep learning techniques in small-sample settings. Encoding shape, appearance, and dynamics, our method significantly outperforms the current state of the art, attaining an RMSE of less than 1 point on a 16-level pain scale while also achieving a 67.3% Pearson correlation coefficient between our predicted pain-level time series and the ground truth.

Nottingham ePrints

arXiv.org e-Print Archive

Nottingham eTheses

Crossref