
    Gender classification using facial components.

    Master’s degree. University of KwaZulu-Natal, Durban. Gender classification is important in facial analysis because its output can feed other systems such as face recognition. Humans classify gender with great accuracy, but passing this ability to machines is a complex task because of variables such as lighting. In this research we treat gender classification as a binary problem with two classes, male and female. Two datasets are used: FG-NET and the Pilot Parliaments Benchmark (PPB). Two appearance-based feature extractors are used, the Local Binary Pattern (LBP) and the Local Directional Pattern (LDP), with the Active Shape Model (ASM) included through fusion. The classifiers are a Support Vector Machine with a Radial Basis Function kernel and an Artificial Neural Network trained with backpropagation. The average detection rate is 90.6% on FG-NET against 87.5% on PPB. Gender is then detected from individual facial components such as the nose and eyes. The forehead recorded the highest accuracy at 92%, followed by the nose at 90%, the cheeks at 89.2%, and the eyes at 87%; the mouth recorded the lowest accuracy at 75%. Feature fusion is therefore carried out to improve classification accuracy, especially for the mouth and eyes, which scored lowest. Fusing the eyes (87%) with the forehead (92%) raises accuracy to 93%. Fusing the mouth (75%) with the nose (90%) yields 87%. Fusion here is performed by addition, and in both cases it improves on the weaker component. Fusion is then carried out between appearance-based and shape-based features. On FG-NET, LBP and LDP reach accuracies of 85.33% and 89.53% respectively; on PPB they reach 83.13% and 89.3%. As expected, and as shown by previous researchers, LDP obtains higher classification accuracy than LBP because it uses gradient information rather than raw pixel intensity. We then fuse the LDP and LBP vectors with that of the ASM, carry out dimensionality reduction, and fuse by addition. On the PPB dataset, fusion of LDP and ASM records 81.56% and 94.53%, with FG-NET recording 89.53%.
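
The LBP descriptor and the "fusion by addition" step mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the basic 3x3 LBP operator with a `>=` threshold, an arbitrary clockwise neighbour ordering, and it reads "fusion by addition" as element-wise addition of equal-length feature vectors. The function names `lbp_code`, `lbp_histogram`, and `fuse_by_addition` are hypothetical, and the classifier stage is left out.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code for the pixel at (r, c).

    Each neighbour whose value is >= the centre contributes one set bit.
    """
    center = img[r][c]
    # Clockwise from the top-left neighbour; the ordering is an assumption
    # made for illustration, any fixed ordering gives a valid descriptor.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256


def fuse_by_addition(a, b):
    """Element-wise sum of two feature vectors of equal length (assumed reading)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return [x + y for x, y in zip(a, b)]


# Two tiny grayscale patches standing in for facial components.
patch_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
patch_b = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
fused = fuse_by_addition(lbp_histogram(patch_a), lbp_histogram(patch_b))
```

In this reading, two component descriptors (for example eyes and forehead) must share a dimension before they can be added, which is one possible reason for the dimensionality reduction step the abstract places before fusion.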

    Joint optimization of manifold learning and sparse representations for face and gesture analysis

    Face and gesture understanding algorithms are powerful enablers in intelligent vision systems for surveillance, security, entertainment, and smart spaces. In the future, complex networks of sensors and cameras may disperse directions to lost tourists, perform directory lookups in the office lobby, or contact the proper authorities in case of an emergency. To be effective, these systems will need to embrace human subtleties while interacting with people in their natural conditions. Computer vision and machine learning techniques have recently become adept at solving face and gesture tasks using posed datasets in controlled conditions. However, spontaneous human behavior under unconstrained conditions, or "in the wild", is more complex and is subject to considerable variability from one person to the next. Uncontrolled conditions such as lighting, resolution, noise, occlusions, pose, and temporal variations complicate the matter further. This thesis advances the field of face and gesture analysis by introducing a new machine learning framework based upon dimensionality reduction and sparse representations that is shown to be robust in posed as well as natural conditions. Dimensionality reduction methods take complex objects, such as facial images, and attempt to learn lower-dimensional representations embedded in the higher-dimensional data. These alternate feature spaces are computationally more efficient and often more discriminative. The performance of various dimensionality reduction methods on geometric and appearance-based facial attributes is studied, leading to robust facial pose and expression recognition models. The parsimonious nature of sparse representations (SR) has successfully been exploited for the development of highly accurate classifiers for various applications. Despite the successes of SR techniques, large dictionaries and high-dimensional data can make these classifiers computationally demanding. 
Further, sparse classifiers are subject to the adverse effects of a phenomenon known as coefficient contamination, where, for example, variations in pose may affect identity and expression recognition. This thesis analyzes the interaction between dimensionality reduction and sparse representations to present a unified sparse representation classification framework that addresses both computational complexity and coefficient contamination. Semi-supervised dimensionality reduction is shown to mitigate the coefficient contamination problems associated with SR classifiers. The combination of semi-supervised dimensionality reduction with SR systems forms the cornerstone for a new face and gesture framework called Manifold based Sparse Representations (MSR). MSR is shown to deliver state-of-the-art facial understanding capabilities. To demonstrate the applicability of MSR to new domains, MSR is expanded to include temporal dynamics. The joint optimization of dimensionality reduction and SRs for classification purposes is a relatively new field. Combining both concepts into a single objective function produces a relation that is neither convex nor directly solvable. This thesis studies this problem to introduce a new jointly optimized framework. This framework, termed LGE-KSVD, utilizes variants of Linear extension of Graph Embedding (LGE) along with modified K-SVD dictionary learning to jointly learn the dimensionality reduction matrix, sparse representation dictionary, sparse coefficients, and sparsity-based classifier. By injecting LGE concepts directly into the K-SVD learning procedure, this research removes the support constraints K-SVD imparts on dictionary element discovery. Results are shown for facial recognition, facial expression recognition, and human activity analysis; with the addition of a concept called active difference signatures, the framework also delivers robust gesture recognition from Kinect or similar depth cameras.
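
The general idea behind the sparse representation classifiers discussed in this abstract can be sketched as follows: code a test sample over a dictionary of labelled training samples, then assign the class whose atoms alone reconstruct it with the smallest residual. This is only an illustration under stated assumptions, not the MSR or LGE-KSVD method. Plain matching pursuit is swapped in for an l1 solver to keep the example dependency-free, and the names `matching_pursuit` and `src_classify` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit Euclidean norm."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def matching_pursuit(atoms, y, n_iter=10):
    """Greedy sparse coding of y over unit-norm atoms (stand-in for l1 minimisation)."""
    coeffs = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        scores = [dot(a, residual) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[k]) < 1e-12:
            break  # residual is already explained by the chosen atoms
        coeffs[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs


def src_classify(atoms, labels, y, n_iter=10):
    """Return the label whose atoms alone give the smallest reconstruction residual."""
    atoms = [normalize(a) for a in atoms]
    coeffs = matching_pursuit(atoms, y, n_iter)
    best_label, best_err = None, float("inf")
    for label in sorted(set(labels)):
        recon = [0.0] * len(y)
        for a, lab, c in zip(atoms, labels, coeffs):
            if lab == label:
                recon = [r + c * x for r, x in zip(recon, a)]
        err = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, recon)))
        if err < best_err:
            best_label, best_err = label, err
    return best_label


# Toy dictionary: two training samples per class (made-up values).
dictionary = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
              [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["a", "a", "b", "b"]
pred_a = src_classify(dictionary, labels, [1.0, 0.05, 0.0])
pred_b = src_classify(dictionary, labels, [0.0, 1.0, 0.1])
```

Coefficient contamination, in the abstract's sense, would show up here as non-zero coefficients on atoms of the wrong class. The cost of scoring every atom at each iteration also hints at why large dictionaries and high-dimensional data make such classifiers expensive.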