A survey of face detection, extraction and recognition
The goal of this paper is to present a critical survey of the existing literature on human face recognition from the last four to five years. Interest and research activity in face recognition have increased significantly over the past few years, especially after the airliner attacks of September 11, 2001 in the United States. While this growth is largely driven by growing application demands, such as static matching of controlled photographs in mug-shot matching, credit-card verification, matching against surveillance video images, identification for law enforcement, and authentication for banking and security-system access, advances in signal analysis techniques, such as wavelets and neural networks, are also important catalysts. As the number of proposed techniques increases, surveys and evaluations become increasingly important.
Face Recognition with Multi-stage Matching Algorithms
For every face recognition method, the primary goals are to achieve higher recognition accuracy at lower computational cost. However, as the gallery size increases, especially when each probe image corresponds to only one training image, face recognition becomes more and more challenging. First, a larger gallery requires more computation and memory. More importantly, the degradation of recognition accuracy caused by large gallery sizes becomes an even more significant problem to be solved.
A coarse-grained parallel algorithm that divides the training and probe images equally among multiple processors is proposed to deal with the large computational cost and memory usage of the feature-based Non-Graph Matching (NGM) method. First, each processor completes its own training workload and stores the extracted feature information. Each processor then carries out the matching process for its own probe images, exchanging its stored feature information with the other processors. Finally, one processor collects the recognition results from the others. Owing to the well-balanced workload, the speedup increases with the number of processors and parallel efficiency is well maintained. Moreover, the memory usage of each processor decreases markedly as the number of processors increases. In sum, the parallel algorithm reduces both the running time and the per-processor memory usage.
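The division of work described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: `extract_features` is a stand-in for NGM feature extraction, nearest-neighbour matching stands in for NGM matching, and a thread pool stands in for the multi-processor setup.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def extract_features(image):
    # Stand-in for NGM feature extraction: a flattened pixel vector.
    return np.asarray(image, dtype=float).ravel()

def recognize(probe, gallery_feats, labels):
    # Nearest-neighbour matching stands in for NGM feature matching.
    f = extract_features(probe)
    dists = [np.linalg.norm(f - g) for g in gallery_feats]
    return labels[int(np.argmin(dists))]

def parallel_recognize(probes, gallery, labels, workers=2):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Training phase: gallery images are split evenly and each worker
        # extracts features for its own shard.
        gallery_feats = list(pool.map(extract_features, gallery))
        # Matching phase: probe images are likewise split across workers;
        # the shared feature list plays the role of the exchanged features.
        results = list(pool.map(
            lambda p: recognize(p, gallery_feats, labels), probes))
    return results
```

In a real multi-machine deployment the feature exchange would go over message passing (e.g. MPI) rather than shared memory, but the partitioning logic is the same.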
To address the accuracy degradation problem, a set of multi-stage matching algorithms that determine the recognition result step by step is proposed. Each stage picks a small proportion of the most similar candidates for the next stage and removes the rest; this picking and removing repeats until the number of remaining candidates is small enough to produce the final recognition result. Three multi-stage matching algorithms (n-ary elimination, divide and conquer, and two-stage hybrid) are introduced into the matching process of traditional face recognition methods, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Non-Graph Matching (NGM). N-ary elimination performs the multi-stage matching from a global perspective by ranking the similarities and picking the best candidates. Divide and conquer implements it from a local perspective by dividing the candidates into groups and selecting the best of each group. The two-stage hybrid uses a holistic method to choose a small set of candidates and then a feature-based method to determine the final result among them. Three conclusions can be drawn from the experimental results. First, the multi-stage matching algorithms achieve higher recognition accuracy. Second, the larger the gallery, the greater the accuracy improvement they bring. Finally, they incur little extra computational cost.
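The n-ary elimination idea can be illustrated with a minimal sketch: at each stage, keep only the top fraction of gallery candidates ranked by similarity to the probe, and stop once few enough remain to name a winner. The similarity function (negative Euclidean distance) and the keep fraction are illustrative choices, not the settings used in the work above.

```python
import numpy as np

def n_ary_elimination(probe, gallery, labels, keep=0.5, stop_at=1):
    candidates = list(range(len(gallery)))
    while len(candidates) > stop_at:
        # Rank the remaining candidates by similarity to the probe.
        sims = {i: -np.linalg.norm(probe - gallery[i]) for i in candidates}
        ranked = sorted(candidates, key=lambda i: sims[i], reverse=True)
        # Keep only the best fraction; never drop below `stop_at`.
        candidates = ranked[:max(stop_at, int(len(ranked) * keep))]
    return labels[candidates[0]]
```

In practice each stage could also use a progressively more expensive similarity measure, which is where the cost savings of staged matching come from.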
Evaluation and analysis of hybrid intelligent pattern recognition techniques for speaker identification
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The rapid pace of technological progress in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from the voice regardless of the content (i.e. text-independent), and to design efficient methods of combining face and voice to produce a robust authentication system.
A novel approach to speaker identification is developed using wavelet analysis and multiple neural networks, including the Probabilistic Neural Network (PNN), General Regression Neural Network (GRNN) and Radial Basis Function Neural Network (RBF NN), combined with an AND voting scheme. This approach is tested on the GRID and VidTIMIT corpora, and comprehensive test results have been validated against state-of-the-art approaches. The system was found to be competitive: it improved the recognition rate by 15% compared to classical Mel-Frequency Cepstral Coefficients (MFCC), and reduced the recognition time by 40% compared to the Back-Propagation Neural Network (BPNN), Gaussian Mixture Models (GMM) and Principal Component Analysis (PCA).
Another novel approach, based on vowel formant analysis, is implemented using Linear Discriminant Analysis (LDA). Vowel-formant-based speaker identification is well suited to real-time implementation and requires only a few bytes of information to be stored per speaker, making it both storage- and time-efficient. Tested on GRID and VidTIMIT, the proposed scheme was found to be 85.05% accurate when Linear Predictive Coding (LPC) is used to extract the vowel formants, which is much higher than the accuracy of BPNN and GMM. Since the proposed scheme requires no training time beyond creating a small database of vowel formants, it is faster as well. Furthermore, an increasing number of speakers makes it difficult for BPNN and GMM to sustain their accuracy, whereas the proposed score-based methodology scales almost linearly.
Finally, a novel audio-visual fusion-based identification system is implemented using GMM and MFCC for speaker identification and PCA for face recognition. The results of speaker identification and face recognition are fused at different levels, namely the feature, score and decision levels. Both the score-level and decision-level (with OR voting) fusions were shown to outperform feature-level fusion in terms of accuracy and error resilience. This result is in line with the distinct nature of the two modalities, whose characteristics are lost when combined at the feature level. The GRID and VidTIMIT test results validate that the proposed scheme is one of the best candidates for the fusion of face and voice owing to its low computational time and high recognition accuracy.
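Score-level fusion of the two modalities can be sketched as follows. This is a hedged illustration of the general technique, not the thesis' exact procedure: min-max normalisation and a weighted sum are common choices, and the weight `w_voice` is an assumed parameter.

```python
def min_max_norm(scores):
    # Bring each matcher's scores to a common [0, 1] scale before fusing.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse_scores(voice_scores, face_scores, w_voice=0.5):
    # voice_scores[i] / face_scores[i]: similarity of the test sample to
    # enrolled identity i, from the speaker and face matchers respectively.
    v = min_max_norm(voice_scores)
    f = min_max_norm(face_scores)
    fused = [w_voice * a + (1 - w_voice) * b for a, b in zip(v, f)]
    # The identity with the highest fused score is the decision.
    return max(range(len(fused)), key=fused.__getitem__)
```

Decision-level fusion with OR voting would instead accept an identity if either matcher accepts it, which trades some accuracy for error resilience.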
FACE RECOGNITION TECHNIQUES (TECHNIKI ROZPOZNAWANIA TWARZY)
The problem of face recognition is discussed and the main recognition methods are considered. A calibrated stereo pair of face images is used, and a depth map is computed with a correlation algorithm; as a result, a 3D mask of the face is obtained. Three anthropomorphic points are then used to construct a coordinate system that makes it possible to superimpose the tested mask.
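The correlation-based depth-map step can be illustrated with a minimal block-matching sketch: for each pixel of the left image, search along the same scanline of the rectified right image for the best-matching patch (a sum-of-squared-differences cost is used here for brevity); depth is then inversely proportional to the resulting disparity. Window size and search range are arbitrary choices for illustration.

```python
import numpy as np

def disparity_map(left, right, window=3, max_disp=16):
    h, w = left.shape
    r = window // 2
    disp = np.zeros((h, w))
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = left[y-r:y+r+1, x-r:x+r+1]
            best_cost, best_d = np.inf, 0
            # Search along the scanline for the best-correlating patch.
            for d in range(min(max_disp, x - r) + 1):
                cand = right[y-r:y+r+1, x-d-r:x-d+r+1]
                cost = np.sum((patch - cand) ** 2)  # SSD matching cost
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    # Depth is proportional to baseline * focal_length / disparity.
    return disp
```

Real systems use normalised cross-correlation and sub-pixel refinement, but the scanline search above is the core of the correlation algorithm.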
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, and computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
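Sparse coding in the sense described above, representing a signal as a linear combination of a few dictionary atoms, can be sketched with a plain Orthogonal Matching Pursuit. This is one standard greedy solver, not the monograph's specific algorithm; production toolboxes such as scikit-learn provide tuned implementations.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy OMP: D holds unit-norm atoms as columns, x is the signal."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit the coefficients on the chosen support by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = x - D @ coef
    return coef
```

Dictionary learning, the focus of the monograph, alternates this sparse-coding step with an update of the atoms in `D` to fit the training data.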
DEFORM'06 - Proceedings of the Workshop on Image Registration in Deformable Environments
Preface. These are the proceedings of DEFORM'06, the Workshop on Image Registration in Deformable Environments, associated with BMVC'06, the 17th British Machine Vision Conference, held in Edinburgh, UK, in September 2006. The goal of DEFORM'06 was to bring together people from different domains with an interest in deformable image registration. In response to our Call for Papers, we received 17 submissions and selected 8 for oral presentation at the workshop. In addition to the regular papers, Andrew Fitzgibbon from Microsoft Research Cambridge gave an invited talk at the workshop. The conference website, including the online proceedings, remains open; see http://comsee.univ-bpclermont.fr/events/DEFORM06. We would like to thank the BMVC'06 co-chairs, Mike Chantler, Manuel Trucco and especially Bob Fisher for his great help with the local arrangements, Andrew Fitzgibbon, and the Programme Committee members who provided insightful reviews of the submitted papers. Special thanks go to Marc Richetin, head of the CNRS Research Federation TIMS, which sponsored the workshop. August 2006. Adrien Bartoli, Nassir Navab, Vincent Lepetit
Contributions on Automatic Recognition of Faces using Local Texture Features
One of the most prominent topics in the field of computer vision arises from automatic facial analysis. In particular, the precise detection of human faces and their biometric analysis are problems that have generated special interest owing to the large number of applications that currently make use of these mechanisms.
This doctoral thesis separately analyses the problems of precise face detection based on eye localisation and face recognition based on the extraction of local texture features. The algorithms developed address the problem of extracting identity from a face image (in frontal or semi-frontal view) in partially controlled scenarios. The goal is to develop robust algorithms that can easily be incorporated into real applications, such as advanced banking security or the definition of commercial strategies in the retail sector.
Regarding the extraction of local textures, an exhaustive analysis of the most widespread descriptors has been carried out, with special emphasis on the study of Histograms of Oriented Gradients (HOG features). On normalised representations of the face, these descriptors provide discriminative information about the facial elements (eyes, mouth, etc.) while being robust to illumination variations and small displacements.
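The core of the HOG descriptor discussed above can be sketched for a single cell: gradient orientations within the cell are accumulated into a small histogram, weighted by gradient magnitude, and normalised. The cell size, the 9 unsigned-orientation bins, and the L2 normalisation are common defaults used here for illustration, not necessarily the thesis' settings.

```python
import numpy as np

def cell_hog(patch, n_bins=9):
    # Image gradients along the two axes of the cell.
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    # Unsigned orientations in [0, 180) degrees, as is usual for HOG.
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)
    hist = np.zeros(n_bins)
    bin_width = 180.0 / n_bins
    idx = np.minimum((ang / bin_width).astype(int), n_bins - 1)
    # Each pixel votes for its orientation bin, weighted by magnitude.
    np.add.at(hist, idx.ravel(), mag.ravel())
    # L2 normalisation gives the robustness to illumination changes
    # mentioned above.
    return hist / (np.linalg.norm(hist) + 1e-12)
```

A full descriptor concatenates such histograms over a grid of cells, with block-wise normalisation across neighbouring cells.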
Different classification algorithms, all based on a supervised-learning strategy, have been chosen to perform face detection and recognition. In particular, boosting classifiers and Support Vector Machines (SVM) over HOG descriptors have been used for eye localisation. For face recognition, a new algorithm, HOG-EBGM (HOG over Elastic Bunch Graph Matching), has been developed. Given a face image, the scheme followed by this algorithm can be summarised in a few steps: in a first stage, the …
Monzó Ferrer, D. (2012). Contributions on Automatic Recognition of Faces using Local Texture Features [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16698