225 research outputs found
Face recognition using statistical adapted local binary patterns.
Biometrics is the study of methods for recognizing humans based on their behavioral and physical characteristics or traits. Face recognition is one of the biometric modalities that has received a great amount of attention from researchers during the past few decades because of its potential applications in a variety of security domains. Face recognition, however, is concerned not only with recognizing human faces but also with recognizing the faces of non-biological entities, or avatars. The need for secure and affordable virtual worlds is attracting the attention of many researchers who seek fast, automatic, and reliable ways to identify virtual worlds’ avatars. In this work, I propose new techniques for recognizing avatar faces, which can also be applied to recognize human faces. The proposed methods are based mainly on a well-known and efficient local texture descriptor, Local Binary Pattern (LBP). I apply different versions of LBP, such as Hierarchical Multi-scale Local Binary Patterns and Adaptive Local Binary Pattern with Directional Statistical Features, in the wavelet space and discuss the effect of this application on the performance of each LBP version. In addition, I use a new version of LBP called Local Difference Pattern (LDP) with other well-known descriptors and classifiers to differentiate between human and avatar face images. The original LBP achieves a high recognition rate if the tested images are clean, but its performance degrades if the images are corrupted by noise. To deal with this problem, I propose a new definition of the original LBP in which the descriptor does not threshold all the neighborhood pixels against the central pixel value. Instead, a weight is computed for each pixel in the neighborhood, a new value is calculated for each pixel, and simple statistical operations are then used to compute a new threshold, which changes automatically based on the pixels’ values. This threshold can be applied with the original LBP or any other version of LBP, and can be extended to work with Local Ternary Pattern (LTP) or any version of LTP to produce different versions of LTP for recognizing noisy avatar and human face images
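For readers unfamiliar with the baseline, the original LBP that this abstract starts from can be sketched in a few lines. This is a minimal illustration of the standard 3x3 operator only, not the adaptive threshold proposed in the thesis; the neighbor ordering, the `>=` comparison convention, and the names `lbp_code` and `lbp_histogram` are assumptions made for the example.

```python
def lbp_code(image, row, col):
    """Return the 8-bit original LBP code for the interior pixel at (row, col)."""
    # Neighbor offsets, clockwise from the top-left pixel (an arbitrary but fixed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor sets its bit when it is at least as bright as the center pixel.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram (256 bins) of LBP codes over all interior pixels of the image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch, 1, 1))  # 241
```

Because every comparison is made against the single center value, a small perturbation of that value can flip several bits at once, which is the noise sensitivity the abstract describes.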
Reconhecimento de padrões em expressões faciais: algoritmos e aplicações (Pattern recognition in facial expressions: algorithms and applications)
Advisor: Hélio Pedrini. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação. Abstract: Emotion recognition has become a relevant research topic for the scientific community, since it plays an essential role in the continuous improvement of human-computer interaction systems. It can be applied in various areas, for instance medicine, entertainment, surveillance, biometrics, education, social networks, and affective computing. There are some open challenges related to the development of emotion systems based on facial expressions, such as data that reflect more spontaneous emotions and real scenarios. In this doctoral dissertation, we propose different methodologies for the development of emotion recognition systems based on facial expressions, as well as their applicability to solving other similar problems. The first is an emotion recognition methodology for occluded facial expressions based on the Census Transform Histogram (CENTRIST). Occluded facial expressions are reconstructed using an algorithm based on Robust Principal Component Analysis (RPCA). Extraction of facial expression features is then performed by CENTRIST, as well as Local Binary Patterns (LBP), Local Gradient Coding (LGC), and an LGC extension. The generated feature space is reduced by applying Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms are used for classification. This method reached competitive accuracy rates for occluded and non-occluded facial expressions. The second addresses dynamic facial expression recognition based on Visual Rhythms (VR) and Motion History Images (MHI), such that a fusion of both descriptors encodes appearance, shape, and motion information of the video sequences. For feature extraction, Weber Local Descriptor (WLD), CENTRIST, Histogram of Oriented Gradients (HOG), and Gray-Level Co-occurrence Matrix (GLCM) are employed. This approach offers a new direction for dynamic facial expression recognition, together with an analysis of the relevance of facial parts. The third is an effective method for audio-visual emotion recognition based on speech and facial expressions. The methodology involves a hybrid neural network to extract audio and visual features from videos. For audio extraction, a Convolutional Neural Network (CNN) based on the log Mel-spectrogram is used, whereas a CNN built on the Census Transform is employed for visual extraction. The audio and visual features are reduced by PCA and LDA, and classified through KNN, SVM, Logistic Regression (LR), and Gaussian Naïve Bayes (GNB). This approach achieves competitive recognition rates, especially on spontaneous data. The fourth investigates the problem of detecting Down syndrome from photographs. A geometric descriptor is proposed to extract facial features. Experiments performed on a public data set show the effectiveness of the developed methodology. The last methodology is about recognizing genetic disorders in photos. This method focuses on extracting facial features using deep features and anthropometric measurements. Experiments are conducted on a public data set, achieving competitive recognition rates. Degree: Doctorate in Computer Science (Doutora em Ciência da Computação). 140532/2019-6, CNPQ, CAPE
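The Motion History Image named in the second methodology can be illustrated with a short sketch. It follows one common formulation (moving pixels are stamped with a duration value and still pixels decay by one per frame); the thesis may use a different variant, and the function name `update_mhi`, the parameters `tau` and `diff_threshold`, and the frame-differencing motion test are all assumptions for illustration.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, diff_threshold=10):
    """Return a new motion history image after observing `frame`.

    Frames and the MHI are lists of rows of integer gray levels.
    """
    new_mhi = []
    for r, row in enumerate(frame):
        new_row = []
        for c, value in enumerate(row):
            # Simple frame differencing decides whether the pixel moved.
            moved = abs(value - prev_frame[r][c]) > diff_threshold
            # Moving pixels are stamped with tau; still pixels decay toward zero.
            new_row.append(tau if moved else max(0, mhi[r][c] - 1))
        new_mhi.append(new_row)
    return new_mhi


# A bright region sweeping left to right across a one-row "video".
frames = [
    [[0, 0, 0]],
    [[50, 0, 0]],
    [[50, 50, 0]],
]
mhi = [[0, 0, 0]]
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)
print(mhi)  # [[4, 5, 0]]
```

More recent motion keeps a higher value, so a single image summarizes where and how recently motion occurred; texture descriptors such as those listed in the abstract can then be computed on it.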
Face recognition in the wild.
Research in face recognition deals with problems related to Age, Pose, Illumination and Expression (A-PIE), and seeks approaches that are invariant to these factors. Video images add a temporal aspect to the image acquisition process. Another degree of complexity, above and beyond A-PIE recognition, arises when multiple pieces of information are known about people, which may be distorted, partially occluded, or disguised, and when the imaging conditions are totally unorthodox. A-PIE recognition in these circumstances becomes truly “wild”, and Face Recognition in the Wild has therefore emerged as a field of research in the past few years. Its main purpose is to challenge constrained approaches to automatic face recognition by emulating some of the virtues of the Human Visual System (HVS), which is very tolerant of age, occlusion, and distortions in the imaging process. The HVS also integrates information about individuals and adds context to recognize people within an activity or behavior. Machine vision has a very long way to go to emulate the HVS, and face recognition in the wild is a step along that path. In this thesis, Face Recognition in the Wild is defined as unconstrained face recognition under A-PIE+; the (+) connotes any alterations to the design scenario of the face recognition system. This thesis evaluates the Biometric Optical Surveillance System (BOSS) developed at the CVIP Lab, using low-resolution imaging sensors. Specifically, the thesis tests the BOSS using cell phone cameras, and examines the potential of facial biometrics on smart portable devices such as iPhones, iPads, and tablets. For quantitative evaluation, the thesis focused on a specific testing scenario of the BOSS software using iPhone 4 cell phones and a laptop. Testing was carried out indoors, at the CVIP Lab, using 21 subjects at distances of 5, 10 and 15 feet, with three poses, two expressions and two illumination levels. The three steps (detection, representation and matching) of the BOSS system were tested in this imaging scenario. False positives in facial detection increased with distance and with pose angles above ±15°. The overall identification rate (face detection at confidence levels above 80%) also degraded with distance, pose, and expression. The indoor lighting added further challenges by inducing shadows, which affected the image quality and the overall performance of the system. While this limited number of subjects and somewhat constrained imaging environment do not fully support a “wild” imaging scenario, they did provide insight into the issues with automatic face recognition. The recognition rate curves demonstrate the limits of low-resolution cameras for face recognition at a distance (FRAD), yet they also provide a plausible defense for possible A-PIE face recognition on portable devices
Hybrid component-based face recognition.
Masters Degree. University of KwaZulu-Natal, Pietermaritzburg. Facial recognition (FR) is the trusted biometric method for authentication. Compared
to other biometrics such as signature, which can be compromised, facial recognition
is non-intrusive and can be captured at a distance in a concealed manner.
It has a significant role in conveying the identity of a person in social interaction
and its performance largely depends on a variety of factors such as illumination, facial
pose, expression, age span, hair, facial wear, and motion. In the light of these
considerations this dissertation proposes a hybrid component-based approach that
seeks to utilise any successfully detected components.
This research proposes a facial recognition technique to recognize faces at component
level. It employs the texture descriptors Grey-Level Co-occurrence (GLCM),
Gabor Filters, Speeded-Up Robust Features (SURF) and Scale Invariant Feature Transforms
(SIFT), and the shape descriptor Zernike Moments. The advantage of using
the texture attributes is their simplicity. However, they cannot completely characterise
the whole face, hence the Zernike Moments descriptor was used to
compute the shape properties of the selected facial components. These descriptors
are effective facial components feature representations and are robust to illumination
and pose changes.
Experiments were performed on four different state-of-the-art facial databases,
FERET, FEI, SCface and CMU, and Error-Correcting Output Code (ECOC) was
used for classification. The results show that component-based facial recognition is
more effective than whole-face recognition, and the proposed methods achieve a recognition
accuracy rate of 98.75%. This approach performs well compared to other component-based
facial recognition approaches
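As an illustration of one of the texture descriptors listed above, the Grey-Level Co-occurrence Matrix can be sketched as follows. This is a generic, non-symmetric GLCM with a single pixel offset and one derived statistic (contrast); the dissertation's offsets, quantization, and chosen statistics are not given in the abstract, so the function names `glcm` and `contrast` and the parameters `levels`, `dr` and `dc` are assumptions.

```python
def glcm(image, levels, dr=0, dc=1):
    """Count how often gray level i is followed by gray level j at offset (dr, dc)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: squared gray-level difference weighted by pair probability."""
    total = sum(sum(row) for row in matrix)
    return sum((i - j) ** 2 * count / total
               for i, row in enumerate(matrix)
               for j, count in enumerate(row))


# A 4-level image; the default offset pairs each pixel with its right-hand neighbor.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
m = glcm(image, levels=4)
print(m)            # [[2, 2, 1, 0], [0, 2, 0, 0], [0, 0, 3, 1], [0, 0, 0, 1]]
print(contrast(m))  # 7/12, about 0.583
```

In a component-based setting, such statistics would be computed per detected facial component and concatenated into a feature vector before classification.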
Out-of-plane action unit recognition using recurrent neural networks
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. Johannesburg, 2015. The face is a fundamental tool to assist in interpersonal communication and interaction between people.
Humans use facial expressions to consciously or subconsciously express their emotional states, such as
anger or surprise. As humans, we are able to easily identify changes in facial expressions even in complicated
scenarios, but the task of facial expression recognition and analysis is complex and challenging
to a computer. The automatic analysis of facial expressions by computers has applications in several scientific
subjects such as psychology, neurology, pain assessment, lie detection, intelligent environments,
psychiatry, and emotion and paralinguistic communication. We look at methods of facial expression
recognition, and in particular, the recognition of Facial Action Coding System’s (FACS) Action Units
(AUs). Movements of individual muscles on the face are encoded by FACS from slightly different, instant
changes in facial appearance. Contractions of specific facial muscles are related to a set of units
called AUs. We make use of Speeded Up Robust Features (SURF) to extract keypoints from the face and
use the SURF descriptors to create feature vectors. SURF provides smaller sized feature vectors than
other commonly used feature extraction techniques. SURF is comparable to or outperforms other methods
with respect to distinctiveness, robustness, and repeatability. It is also much faster than other feature
detectors and descriptors. The SURF descriptor is scale and rotation invariant and is unaffected by small
viewpoint changes or illumination changes. We use the SURF feature vectors to train a recurrent neural
network (RNN) to recognize AUs from the Cohn-Kanade database. An RNN is able to handle temporal
data received from image sequences in which an AU or combination of AUs are shown to develop from
a neutral face. We are recognizing AUs as they provide a more fine-grained means of measurement that
is independent of age, ethnicity, gender and different expression appearance. In addition to recognizing
FACS AUs from the Cohn-Kanade database, we use our trained RNNs to recognize the development
of pain in human subjects. We make use of the UNBC-McMaster pain database which contains image
sequences of people experiencing pain. In some cases, the pain results in their face moving out-of-plane
or some degree of in-plane movement. The temporal processing ability of RNNs can assist in classifying
AUs where the face is occluded and not facing frontally for some part of the sequence. Results are
promising when tested on the Cohn-Kanade database. We see higher overall recognition rates for upper
face AUs than lower face AUs. Since keypoints are globally extracted from the face in our system, local
feature extraction could provide improved recognition results in future work. We also see satisfactory
recognition results when tested on samples with out-of-plane head movement, showing the temporal
processing ability of RNNs
Biometric Systems
Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECG) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study
Innovative local texture descriptors with application to eye detection
Local Binary Patterns (LBP), which is one of the well-known texture descriptors, has broad applications in pattern recognition and computer vision. The attractive properties of LBP are its tolerance to illumination variations and its computational simplicity. However, LBP only compares a pixel with those in its own neighborhood and encodes little information about the relationship of the local texture with the features. This dissertation introduces a new Feature Local Binary Patterns (FLBP) texture descriptor that can compare a pixel with those in its own neighborhood as well as in other neighborhoods and encodes the information of both local texture and features. The features encoded in FLBP are broadly defined, such as edges, Gabor wavelet features, and color features. Specifically, a binary image is first derived by extracting feature pixels from a given image, and then a distance vector field is obtained by computing the distance vector between each pixel and its nearest feature pixel defined in the binary image. Based on the distance vector field and the FLBP parameters, the FLBP representation of the given image is derived. The feasibility of the proposed FLBP is demonstrated on eye detection using the BioID and the FERET databases. Experimental results show that the FLBP method significantly improves upon the LBP method in terms of both the eye detection rate and the eye center localization accuracy.
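The distance vector field step described above can be sketched with a brute-force search. This covers only the construction of the field from a binary feature image; how FLBP then uses the field is not reproduced here. The Euclidean metric, the pixel-to-feature direction of the vector, and the function name `distance_vector_field` are assumptions for illustration.

```python
def distance_vector_field(binary):
    """For each pixel, return the (d_row, d_col) vector to its nearest feature pixel.

    `binary` is a list of rows in which feature pixels are 1 and all others 0.
    """
    features = [(r, c)
                for r, row in enumerate(binary)
                for c, value in enumerate(row) if value]
    if not features:
        raise ValueError("binary image contains no feature pixels")
    field = []
    for r, row in enumerate(binary):
        field_row = []
        for c in range(len(row)):
            # Brute-force nearest feature pixel under squared Euclidean distance.
            fr, fc = min(features, key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
            field_row.append((fr - r, fc - c))
        field.append(field_row)
    return field


binary = [
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
field = distance_vector_field(binary)
print(field[0][0])  # (1, 2)
print(field[1][2])  # (0, 0): a feature pixel is its own nearest feature
```

The brute-force search costs time proportional to pixels times feature pixels; a real implementation would more likely use a distance transform, but the result is the same field.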
As LBP is sensitive to noise especially in near-uniform image regions, Local Ternary Patterns (LTP) was proposed to address this problem by extending LBP to three-valued codes. However, further research reveals that both LTP and LBP achieve similar results for face and facial expression recognition, while LTP has a higher computational cost than LBP. To improve upon LTP, this dissertation introduces another new local texture descriptor: Local Quaternary Patterns (LQP) and its extension, Feature Local Quaternary Patterns (FLQP). LQP encodes four relationships of local texture, and therefore, it includes more information of local texture than the LBP and the LTP. FLQP, which encodes both local and feature information, is expected to perform even better than LQP for texture description and pattern analysis. The LQP and FLQP are applied to eye detection on the BioID database. Experimental results show that both FLQP and LQP achieve better eye detection performance than FLTP, LTP, FLBP and LBP. The FLQP method achieves the highest eye detection rate
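The three-valued coding that LTP adds to LBP can be sketched as below. The sketch shows the common formulation in which the ternary code is split into an "upper" and a "lower" binary pattern; it does not implement LQP or FLQP, whose definitions are not given in the abstract. The tolerance parameter `t`, the neighbor ordering, and the name `ltp_codes` are assumptions.

```python
# Neighbor offsets, clockwise from the top-left pixel (an arbitrary but fixed order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def ltp_codes(image, row, col, t):
    """Return the (upper, lower) 8-bit patterns of the LTP code at (row, col).

    A neighbor codes +1 if it is at least t brighter than the center, -1 if it is
    at least t darker, and 0 otherwise. The +1 positions form the upper pattern
    and the -1 positions form the lower pattern.
    """
    center = image[row][col]
    upper = lower = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        diff = image[row + dr][col + dc] - center
        if diff >= t:
            upper |= 1 << bit
        elif diff <= -t:
            lower |= 1 << bit
    return upper, lower


patch = [
    [52, 50, 49],
    [58, 50, 41],
    [50, 51, 60],
]
print(ltp_codes(patch, 1, 1, t=5))  # (144, 8)
```

Neighbors within `t` of the center (52, 49 and 51 here) contribute to neither pattern, which is how LTP suppresses small fluctuations in near-uniform regions, at the cost of producing two codes per pixel instead of one.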
Pengenalan Ekspresi Wajah dengan Metode Viola Jones dan Convolutional Neural Network (Facial Expression Recognition Using the Viola Jones Method and a Convolutional Neural Network)
Currently, the use of artificial intelligence is growing rapidly, including its use for recognizing human facial expressions. Recognizing human facial expressions is a complex task. In this study, deep learning is applied to determine how accurately facial expressions can be recognized. The method used is a combination of Viola Jones and a Convolutional Neural Network: Viola Jones is used at the segmentation stage and the Convolutional Neural Network is used to classify the data. The facial expression dataset analyzed covers happiness, anger, disgust, sadness, fear, surprise and normal, totaling 2205 samples. Testing with a confusion matrix gave an accuracy rate of 96.14%. These results indicate that the proposed method has good accuracy for recognizing facial expressions