A Survey on Ear Biometrics
Recognizing people by their ears has recently received significant attention in the literature. Several reasons account for this trend: first, ear recognition does not suffer from some problems associated with other non-contact biometrics, such as face recognition; second, it is the most promising candidate for combination with the face in the context of multi-pose face recognition; and third, the ear can be used for human recognition in surveillance videos where the face may be partially or completely occluded. Further, the ear appears to change little with age. Although current ear detection and recognition systems have reached a certain level of maturity, their success remains limited to controlled indoor conditions. In addition to variation in illumination, open research problems include hair occlusion, earprint forensics, ear symmetry, ear classification, and ear individuality. This paper provides a detailed survey of research conducted in ear detection and recognition. It offers an up-to-date review of the existing literature, revealing the current state of the art both for those working in this area and for those who might wish to exploit this new approach. Furthermore, it discusses some unsolved ear recognition problems as well as the ear databases available to researchers.
Online Mutual Foreground Segmentation for Multispectral Stereo Videos
The segmentation of video sequences into foreground and background regions is
a low-level process commonly used in video content analysis and smart
surveillance applications. Using a multispectral camera setup can improve this
process by providing more diverse data to help identify objects despite adverse
imaging conditions. The registration of several data sources is, however, not
trivial if the appearance of the objects produced by each sensor differs
substantially. The problem is further complicated by parallax effects, which
cannot be ignored when using close-range stereo pairs. In this work, we present a new
method to simultaneously tackle multispectral segmentation and stereo
registration. Using an iterative procedure, we estimate the labeling result for
one problem using the provisional result of the other. Our approach is based on
the alternating minimization of two energy functions that are linked through
the use of dynamic priors. We rely on the integration of shape and appearance
cues to find proper multispectral correspondences, and to properly segment
objects in low contrast regions. We also formulate our model as a frame
processing pipeline using higher order terms to improve the temporal coherence
of our results. Our method is evaluated under different configurations on
multiple multispectral datasets, and our implementation is available online.
Comment: Preprint accepted for publication in IJCV (December 2018).
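The alternating scheme is easier to see in miniature. Below is a minimal, hypothetical Python/numpy sketch, not the authors' implementation (which minimizes two linked energy functions with shape and appearance cues and higher-order temporal terms): it alternates a toy 1-D stereo registration step with a segmentation step, each reusing the other's provisional result as a dynamic prior. The synthetic images, the candidate shift range, and the 0.5 fusion threshold are all invented for illustration.

```python
import numpy as np

def alternate_minimization(rgb, lwir, n_iters=5):
    """Toy alternating scheme: each step minimizes a simple cost for one
    problem (registration or segmentation) using the other's provisional
    result as a dynamic prior. Inputs are intensity images in [0, 1] where
    foreground pixels are assumed bright in both spectra."""
    mask = rgb > rgb.mean()   # provisional foreground mask from appearance alone
    shift = 0                 # provisional stereo registration (1-D pixel shift)
    for _ in range(n_iters):
        # Registration step: choose the horizontal shift that best aligns the
        # LWIR image with the current foreground mask (shape cue as prior).
        shift = min(range(-3, 4),
                    key=lambda s: np.abs(np.roll(lwir, s, axis=1)[mask] - 1.0).sum())
        # Segmentation step: re-segment by fusing RGB appearance with the
        # registered LWIR evidence (appearance cue plus dynamic prior).
        prior = np.roll(lwir, shift, axis=1)
        mask = (0.5 * rgb + 0.5 * prior) > 0.5
    return mask, shift

# A tiny synthetic stereo pair: the same object appears one pixel to the
# right in the LWIR view due to parallax.
rgb  = np.zeros((8, 8)); rgb[2:6, 2:6]  = 1.0
lwir = np.zeros((8, 8)); lwir[2:6, 3:7] = 1.0
mask, shift = alternate_minimization(rgb, lwir)
print(shift)  # -1: shifting the LWIR view left by one pixel aligns the views
```

The paper's energies are far richer, but the control flow, two minimizations alternating and exchanging provisional results as priors, is the same idea.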
EmoNets: Multimodal deep learning approaches for emotion recognition in video
The task in the Emotion Recognition in the Wild (EmotiW) Challenge is to
assign one of seven emotions to short video clips extracted from Hollywood
style movies. The videos depict acted-out emotions under realistic conditions
with a large degree of variation in attributes such as pose and illumination,
making it worthwhile to explore approaches which consider combinations of
features from multiple modalities for label assignment. In this paper we
present our approach to learning several specialist models using deep learning
techniques, each focusing on one modality. Among these are a convolutional
neural network, which captures visual information in detected faces; a deep
belief net, which models the audio stream; a K-Means-based "bag-of-mouths"
model, which extracts visual features around the mouth region; and a
relational autoencoder, which addresses spatio-temporal aspects of the videos.
We explore multiple methods for combining cues from these modalities into one
common classifier. This fusion achieves considerably greater accuracy than our
strongest single-modality classifier alone. Our method was the winning
submission in the 2013 EmotiW challenge and achieved a test-set accuracy of
47.67% on the 2014 dataset.
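As a rough sketch of what combining per-modality cues can look like, the following implements one simple strategy, weighted late fusion of class probabilities. The modality outputs, the weights, and the `late_fusion` helper are hypothetical, and the winning entry explored several different combination methods.

```python
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def late_fusion(modality_probs, weights=None):
    """Weighted average of per-modality class-probability vectors. In practice
    the weights would be tuned on a validation set."""
    probs = np.asarray(modality_probs)            # (n_modalities, n_classes)
    if weights is None:
        weights = np.full(len(probs), 1.0 / len(probs))
    fused = np.average(probs, axis=0, weights=weights)
    return fused / fused.sum()                    # renormalize to a distribution

# Hypothetical softmax outputs from three specialist models for one clip.
face_cnn   = np.array([0.05, 0.02, 0.08, 0.60, 0.10, 0.10, 0.05])
audio_dbn  = np.array([0.10, 0.05, 0.10, 0.40, 0.15, 0.10, 0.10])
bag_mouths = np.array([0.08, 0.04, 0.08, 0.50, 0.10, 0.12, 0.08])

fused = late_fusion([face_cnn, audio_dbn, bag_mouths], weights=[0.5, 0.3, 0.2])
print(EMOTIONS[int(np.argmax(fused))])  # -> happy
```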
Infrared face recognition: a comprehensive review of methodologies and databases
Automatic face recognition is an area with immense practical potential which
includes a wide range of commercial and law enforcement applications. Hence it
is unsurprising that it continues to be one of the most active research areas
of computer vision. Even after over three decades of intense research, the
state-of-the-art in face recognition continues to improve, benefitting from
advances in a range of different research fields such as image processing,
pattern recognition, computer graphics, and physiology. Systems based on
visible spectrum images, the most researched face recognition modality, have
reached a significant level of maturity with some practical success. However,
they continue to face challenges in the presence of illumination, pose and
expression changes, as well as facial disguises, all of which can significantly
decrease recognition accuracy. Amongst various approaches which have been
proposed in an attempt to overcome these limitations, the use of infrared (IR)
imaging has emerged as a particularly promising research direction. This paper
presents a comprehensive and timely review of the literature on this subject.
Our key contributions are: (i) a summary of the inherent properties of infrared
imaging that make this modality promising in the context of face recognition,
(ii) a systematic review of the most influential approaches, with a focus on
emerging common trends as well as key differences between alternative
methodologies, (iii) a description of the main databases of infrared facial
images available to the researcher, and lastly (iv) a discussion of the most
promising avenues for future research.
Comment: Pattern Recognition, 2014.
Neural correlates of the processing of co-speech gestures
In communicative situations, speech is often accompanied by gestures. For example, speakers tend to illustrate certain contents of speech by means of iconic gestures, hand movements that bear a formal relationship to the content of the speech. The meaning of an iconic gesture is determined both by its form and by the speech context in which it is performed; gesture and speech thus interact in comprehension. Using fMRI, the present study investigated which brain areas are involved in this interaction process. Participants watched videos in which sentences containing an ambiguous word (e.g. "She touched the mouse") were accompanied by either a meaningless grooming movement, a gesture supporting the more frequent dominant meaning (e.g. the animal), or a gesture supporting the less frequent subordinate meaning (e.g. the computer device). We hypothesized that brain areas involved in the interaction of gesture and speech would show greater activation to gesture-supported sentences than to sentences accompanied by a meaningless grooming movement. The main result is that, when contrasted with grooming, both types of gesture (dominant and subordinate) activated an array of brain regions consisting of the left posterior superior temporal sulcus (STS), the inferior parietal lobule bilaterally, and the ventral precentral sulcus bilaterally. Given the crucial role of the STS in audiovisual integration, this activation might reflect the interaction between the meaning of the gesture and the ambiguous sentence. The activations in inferior frontal and inferior parietal regions may reflect a mechanism for determining the goal of co-speech hand movements through an observation-execution matching process.