Search CORE

21,028 research outputs found

Distinguishing Posed and Spontaneous Smiles by Facial Dynamics

Author: B Mandal
B Mandal
B Mandal
B Mandal
C Cortes
EG Krumhuber
G Farnebäck
H Dibeklioglu
H Dibeklioğlu
HY Wu
J Cohn
J Hadwin
K Schmidt
K Schmidt
M Hoque
P Ekman
P Ekman
P Ekman
P Ekman
PF Felzenszwalb
Q Xu
V Ojansivu
X Yu
Z Ambadar
Z Zeng
Publication venue
Publication date: 17/02/2017
Field of study

Smile is one of the key elements in identifying emotions and present state of mind of an individual. In this work, we propose a cluster of approaches to classify posed and spontaneous smiles using deep convolutional neural network (CNN) face features, local phase quantization (LPQ), dense optical flow and histogram of gradient (HOG). Eulerian Video Magnification (EVM) is used for micro-expression smile amplification along with three normalization procedures for distinguishing posed and spontaneous smiles. Although the deep CNN face model is trained with large number of face images, HOG features outperforms this model for overall face smile classification task. Using EVM to amplify micro-expressions did not have a significant impact on classification accuracy, while the normalizing facial features improved classification accuracy. Unlike many manual or semi-automatic methodologies, our approach aims to automatically classify all smiles into either `spontaneous' or `posed' categories, by using support vector machines (SVM). Experimental results on large UvA-NEMO smile database show promising results as compared to other relevant methods.Comment: 16 pages, 8 figures, ACCV 2016, Second Workshop on Spontaneous Facial Behavior Analysi

arXiv.org e-Print Archive

Crossref

Real-time human action recognition on an embedded, reconfigurable video processing architecture

Author: A. Aizerman
A.F. Bobick
B. Farnell
Chris Bailey
G.R. Bradski
Hongying Meng
J.K. Aggarwal
Michael Freeman
Nick Pears
T. Moeslund
T. Ogata
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008
Field of study

Copyright @ 2008 Springer-Verlag.In recent years, automatic human motion recognition has been widely researched within the computer vision and image processing communities. Here we propose a real-time embedded vision solution for human motion recognition implemented on a ubiquitous device. There are three main contributions in this paper. Firstly, we have developed a fast human motion recognition system with simple motion features and a linear Support Vector Machine (SVM) classifier. The method has been tested on a large, public human action dataset and achieved competitive performance for the temporal template (eg. “motion history image”) class of approaches. Secondly, we have developed a reconfigurable, FPGA based video processing architecture. One advantage of this architecture is that the system processing performance can be reconfiured for a particular application, with the addition of new or replicated processing cores. Finally, we have successfully implemented a human motion recognition system on this reconfigurable architecture. With a small number of human actions (hand gestures), this stand-alone system is performing reliably, with an 80% average recognition rate using limited training data. This type of system has applications in security systems, man-machine communications and intelligent environments.DTI and Broadcom Ltd

Crossref

UCL Discovery

Brunel University Research Archive

Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

Author: Jansson Ylva
Lindeberg Tony
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to state-of-the-art. Especially, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.Comment: 29 pages, 16 figure

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

FPGA implementation of real-time human motion recognition on a reconfigurable video processing architecture

Author: Bailey Chris
Freeman Micheal
Meng Hongying
Pears Nick
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2008
Field of study

In recent years, automatic human motion recognition has been widely researched within the computer vision and image processing communities. Here we propose a real-time embedded vision solution for human motion recognition implemented on a ubiquitous device. There are three main contributions in this paper. Firstly, we have developed a fast human motion recognition system with simple motion features and a linear Support Vector Machine(SVM) classifier. The method has been tested on a large, public human action dataset and achieved competitive performance for the temporal template (eg. ``motion history image") class of approaches. Secondly, we have developed a reconfigurable, FPGA based video processing architecture. One advantage of this architecture is that the system processing performance can be reconfigured for a particular application, with the addition of new or replicated processing cores. Finally, we have successfully implemented a human motion recognition system on this reconfigurable architecture. With a small number of human actions (hand gestures), this stand-alone system is performing reliably, with an 80% average recognition rate using limited training data. This type of system has applications in security systems, man-machine communications and intelligent environments

University of Lincoln Institutional Repository

Multi-View Region Adaptive Multi-temporal DMM and RGB Action Recognition

Author: Al-Faris Mahmoud
Chiverton John P.
Ndzi David L.
Yang Yanyan
Publication venue
Publication date: 12/04/2019
Field of study

Human action recognition remains an important yet challenging task. This work proposes a novel action recognition system. It uses a novel Multiple View Region Adaptive Multi-resolution in time Depth Motion Map (MV-RAMDMM) formulation combined with appearance information. Multiple stream 3D Convolutional Neural Networks (CNNs) are trained on the different views and time resolutions of the region adaptive Depth Motion Maps. Multiple views are synthesised to enhance the view invariance. The region adaptive weights, based on localised motion, accentuate and differentiate parts of actions possessing faster motion. Dedicated 3D CNN streams for multi-time resolution appearance information (RGB) are also included. These help to identify and differentiate between small object interactions. A pre-trained 3D-CNN is used here with fine-tuning for each stream along with multiple class Support Vector Machines (SVM)s. Average score fusion is used on the output. The developed approach is capable of recognising both human action and human-object interaction. Three public domain datasets including: MSR 3D Action,Northwestern UCLA multi-view actions and MSR 3D daily activity are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms.Comment: 14 pages, 6 figures, 13 tables. Submitte

arXiv.org e-Print Archive

Research Repository and Portal - University of the West of Scotland

Exploiting Image-trained CNN Architectures for Unconstrained Video Classification

Author: Andrews Walter
Luisier Florian
Salakhutdinov Ruslan
Srivastava Nitish
Zha Shengxin
Publication venue
Publication date: 01/01/2015
Field of study

We conduct an in-depth exploration of different strategies for doing event detection in videos using convolutional neural networks (CNNs) trained for image classification. We study different ways of performing spatial and temporal pooling, feature normalization, choice of CNN layers as well as choice of classifiers. Making judicious choices along these dimensions led to a very significant increase in performance over more naive approaches that have been used till now. We evaluate our approach on the challenging TRECVID MED'14 dataset with two popular CNN architectures pretrained on ImageNet. On this MED'14 dataset, our methods, based entirely on image-trained CNN features, can outperform several state-of-the-art non-CNN models. Our proposed late fusion of CNN- and motion-based features can further increase the mean average precision (mAP) on MED'14 from 34.95% to 38.74%. The fusion approach achieves the state-of-the-art classification performance on the challenging UCF-101 dataset

arXiv.org e-Print Archive

Crossref