
    Deep Directional Statistics: Pose Estimation with Uncertainty Quantification

    Modern deep learning systems successfully solve many perception tasks, such as object pose estimation, when the input image is of high quality. However, in challenging imaging conditions, such as low-resolution images or images corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable, we would like our models to quantify their uncertainty in order to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper, we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over the object pose angle. Because a single von Mises distribution makes strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model with both a finite and an infinite number of mixture components. Our model allows for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state of the art.
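
    The core of the described approach is a mixture-of-von-Mises likelihood over angles. The snippet below is a minimal, hypothetical sketch of such a negative log-likelihood loss in PyTorch, not the authors' implementation; it assumes a network that outputs per-component means, concentrations, and mixture logits.

```python
# Hedged sketch: negative log-likelihood of a mixture of von Mises
# distributions, usable as a training loss for angular regression.
import math
import torch

def von_mises_mixture_nll(theta, mu, kappa, logits):
    """theta: (B,) target angles in radians; mu, kappa, logits: (B, K)
    component means, concentrations (> 0), and unnormalised weights."""
    log_weights = torch.log_softmax(logits, dim=-1)
    # log density of each component: kappa*cos(theta - mu) - log(2*pi*I0(kappa));
    # for very large kappa, I0 overflows and a log-I0 approximation is needed.
    log_pdf = (kappa * torch.cos(theta.unsqueeze(-1) - mu)
               - torch.log(2 * math.pi * torch.special.i0(kappa)))
    # log-sum-exp over components gives the mixture log-likelihood
    return -torch.logsumexp(log_weights + log_pdf, dim=-1).mean()
```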

    Face recognition technologies for evidential evaluation of video traces

    Human recognition from video traces is an important task in forensic investigations and evidence evaluation. Compared with other biometric traits, the face is one of the most widely used modalities for human recognition because its collection is non-intrusive and requires little cooperation from subjects. Moreover, face images taken at a long distance can still provide reasonable resolution, whereas most other biometric modalities, such as iris and fingerprint, do not have this merit. In this chapter, we discuss automatic face recognition technologies for the evidential evaluation of video traces. We first introduce the general concepts of both forensic and automatic face recognition, then analyse the difficulties of face recognition from videos. We summarise and categorise the approaches for handling different uncontrollable factors in difficult recognition conditions. Finally, we discuss some challenges and trends in face recognition research in both forensics and biometrics. Given its merits, tested in many deployed systems, and its great potential in other emerging applications, considerable research and development effort is expected to be devoted to face recognition in the near future.

    Joint optimization of manifold learning and sparse representations for face and gesture analysis

    Face and gesture understanding algorithms are powerful enablers in intelligent vision systems for surveillance, security, entertainment, and smart spaces. In the future, complex networks of sensors and cameras may disperse directions to lost tourists, perform directory lookups in the office lobby, or contact the proper authorities in case of an emergency. To be effective, these systems will need to embrace human subtleties while interacting with people in their natural conditions. Computer vision and machine learning techniques have recently become adept at solving face and gesture tasks using posed datasets in controlled conditions. However, spontaneous human behavior under unconstrained conditions, or in the wild, is more complex and is subject to considerable variability from one person to the next. Uncontrolled conditions such as lighting, resolution, noise, occlusions, pose, and temporal variations complicate the matter further. This thesis advances the field of face and gesture analysis by introducing a new machine learning framework based upon dimensionality reduction and sparse representations that is shown to be robust in posed as well as natural conditions. Dimensionality reduction methods take complex objects, such as facial images, and attempt to learn lower-dimensional representations embedded in the higher-dimensional data. These alternate feature spaces are computationally more efficient and often more discriminative. The performance of various dimensionality reduction methods on geometric and appearance-based facial attributes is studied, leading to robust facial pose and expression recognition models. The parsimonious nature of sparse representations (SR) has successfully been exploited to develop highly accurate classifiers for various applications. Despite the successes of SR techniques, large dictionaries and high-dimensional data can make these classifiers computationally demanding. Further, sparse classifiers are subject to the adverse effects of a phenomenon known as coefficient contamination, where, for example, variations in pose may affect identity and expression recognition. This thesis analyzes the interaction between dimensionality reduction and sparse representations to present a unified sparse representation classification framework that addresses both computational complexity and coefficient contamination. Semi-supervised dimensionality reduction is shown to mitigate the coefficient contamination problems associated with SR classifiers. The combination of semi-supervised dimensionality reduction with SR systems forms the cornerstone of a new face and gesture framework called Manifold based Sparse Representations (MSR). MSR is shown to deliver state-of-the-art facial understanding capabilities. To demonstrate the applicability of MSR to new domains, MSR is expanded to include temporal dynamics. The joint optimization of dimensionality reduction and SRs for classification purposes is a relatively new field. The combination of both concepts into a single objective function produces a problem that is neither convex nor directly solvable. This thesis studies this problem and introduces a new jointly optimized framework. This framework, termed LGE-KSVD, utilizes variants of the Linear extension of Graph Embedding (LGE) along with modified K-SVD dictionary learning to jointly learn the dimensionality reduction matrix, sparse representation dictionary, sparse coefficients, and sparsity-based classifier. By injecting LGE concepts directly into the K-SVD learning procedure, this research removes the support constraints K-SVD imposes on dictionary element discovery. Results are shown for facial recognition, facial expression recognition, and human activity analysis; with the addition of a concept called active difference signatures, the framework delivers robust gesture recognition from Kinect or similar depth cameras.
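
    As a rough illustration of the sparse-representation classification idea this thesis builds on, the sketch below uses PCA as a stand-in for the LGE-style embedding and orthogonal matching pursuit as the sparse coder; it is a generic SRC baseline under those assumptions, not the LGE-KSVD method itself.

```python
# Hedged sketch: sparse-representation classification (SRC) on top of a
# learned low-dimensional projection. PCA and OMP stand in for the thesis's
# LGE embedding and learned K-SVD dictionary.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import OrthogonalMatchingPursuit

def src_predict(X_train, y_train, x_test, dim=50, n_nonzero=10):
    """X_train: (N, d) training faces, y_train: (N,) labels, x_test: (d,)."""
    pca = PCA(n_components=dim).fit(X_train)
    D = pca.transform(X_train).T                      # dictionary: one column per training sample
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    z = pca.transform(x_test.reshape(1, -1)).ravel()
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                    fit_intercept=False).fit(D, z)
    coef = omp.coef_
    # assign the class whose atoms best reconstruct the projected test sample
    residuals = {c: np.linalg.norm(z - D[:, y_train == c] @ coef[y_train == c])
                 for c in np.unique(y_train)}
    return min(residuals, key=residuals.get)
```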

    On automatic age estimation from facial profile view

    In recent years, automatic facial age estimation has gained popularity due to its numerous applications. Much work has been done on frontal images, and lately minimal estimation errors have been achieved on most of the benchmark databases. However, in reality, images obtained in unconstrained environments are not always frontal. For instance, when conducting a demographic study or crowd analysis, one may only obtain profile images of the face. To the best of our knowledge, no attempt has been made to estimate ages from side-view face images. Here we address this by using a pre-trained deep residual neural network (ResNet) to extract features. We then utilize a sparse partial least squares regression approach to estimate ages. Despite having less information compared to frontal images, our results show that the extracted deep features achieve promising performance.
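
    The pipeline described above (deep features from a pre-trained ResNet feeding a regressor) can be sketched as follows; ordinary PLS regression stands in here for the sparse partial least squares variant used in the paper, and the data variables in the commented usage are assumptions.

```python
# Hedged sketch: ResNet feature extraction followed by PLS regression on age.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.cross_decomposition import PLSRegression

resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
resnet.fc = torch.nn.Identity()        # keep the 2048-d pooled features
resnet.eval()

preprocess = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor(),
                        T.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])])

def extract_features(pil_images):
    batch = torch.stack([preprocess(img) for img in pil_images])
    with torch.no_grad():
        return resnet(batch).numpy()   # (N, 2048) feature matrix

# Illustrative usage (train_imgs, train_ages, test_imgs are hypothetical):
# pls = PLSRegression(n_components=20).fit(extract_features(train_imgs), train_ages)
# predicted_ages = pls.predict(extract_features(test_imgs))
```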

    A Hierarchical Framework for Facial Age Estimation

    Age estimation is a complex multi-class classification or regression problem. To address the problems of unevenly distributed age databases and the neglect of ordinal information, this paper presents a hierarchical age estimation system comprising age group estimation followed by specific age estimation. In our system, two novel classifiers, sequence k-nearest neighbor (SKNN) and ranking-KNN, are introduced to predict the age group and the age value, respectively. Notably, ranking-KNN utilizes the ordinal information between samples in the estimation process rather than regarding samples as separate individuals. Tested on the FG-NET database, our system achieves a mean absolute error (MAE) of 4.97 for age estimation.
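
    A generic two-stage sketch of such a hierarchy is shown below: a coarse classifier picks an age group, then a per-group regressor refines the exact age. Plain KNN models stand in for the paper's SKNN and ranking-KNN classifiers, and the group boundaries are illustrative.

```python
# Hedged sketch: hierarchical age estimation with off-the-shelf KNN models.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.metrics import mean_absolute_error

def fit_hierarchical(X, ages, group_edges=(13, 20, 40, 60)):
    groups = np.digitize(ages, group_edges)           # coarse age bands
    group_clf = KNeighborsClassifier(n_neighbors=5).fit(X, groups)
    # one fine-grained regressor per band (each band assumed to have >= 5 samples)
    regressors = {g: KNeighborsRegressor(n_neighbors=5).fit(X[groups == g],
                                                            ages[groups == g])
                  for g in np.unique(groups)}
    return group_clf, regressors

def predict_ages(group_clf, regressors, X):
    pred_groups = group_clf.predict(X)
    return np.array([regressors[g].predict(x.reshape(1, -1))[0]
                     for g, x in zip(pred_groups, X)])

# Evaluation with the same metric as the paper:
# mae = mean_absolute_error(true_ages, predict_ages(clf, regs, X_test))
```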

    Classification of two-dimensional face images for distinguishing children from adults based on anthropometry

    Classification of face images can be done in various ways. This research uses two-dimensional photographs of people's faces to detect children in images. An algorithm for classifying images into children and adults is developed, and existing algorithms are analysed. This algorithm will also be used for age estimation. Through analysis of state-of-the-art research on facial landmarks for age estimation, combined with the changes that occur in human face morphology during growth and aging, the facial landmarks needed for age classification and estimation are identified. The algorithm is based on ratios of Euclidean distances between those landmarks. Based on these ratios, children can be detected and age can be estimated.
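
    The ratio-based rule can be illustrated with a short sketch; the landmark names and the threshold below are assumptions for illustration, not the values derived in this research.

```python
# Hedged sketch: child/adult decision from ratios of Euclidean distances
# between facial landmarks (hypothetical landmarks and threshold).
import numpy as np

def euclidean(p, q):
    return np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))

def is_child(landmarks, threshold=0.42):
    """landmarks: dict of 2D points, e.g. {'eye_l': (x, y), 'eye_r': (x, y),
    'forehead': (x, y), 'chin': (x, y)}."""
    eye_dist = euclidean(landmarks['eye_l'], landmarks['eye_r'])
    face_height = euclidean(landmarks['forehead'], landmarks['chin'])
    # facial proportions shift with growth, so this ratio separates age groups
    return (eye_dist / face_height) > threshold
```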

    Doctor of Philosophy

    The statistical study of anatomy is one of the primary focuses of medical image analysis. It is well-established that the appropriate mathematical settings for such analyses are Riemannian manifolds and Lie group actions. Statistically defined atlases, in which a mean anatomical image is computed from a collection of static three-dimensional (3D) scans, have become commonplace. Within the past few decades, these efforts, which constitute the field of computational anatomy, have seen great success in enabling quantitative analysis. However, most of the analysis within computational anatomy has focused on collections of static images in population studies. The recent emergence of large-scale longitudinal imaging studies and four-dimensional (4D) imaging technology presents new opportunities for studying dynamic anatomical processes such as motion, growth, and degeneration. In order to make use of this new data, it is imperative that computational anatomy be extended with methods for the statistical analysis of longitudinal and dynamic medical imaging. In this dissertation, the deformable template framework is used for the development of 4D statistical shape analysis, with applications in motion analysis for individualized medicine and the study of growth and disease progression. A new method for estimating organ motion directly from raw imaging data is introduced and tested extensively. Polynomial regression, the staple of curve regression in Euclidean spaces, is extended to the setting of Riemannian manifolds. This polynomial regression framework enables rigorous statistical analysis of longitudinal imaging data. Finally, a new diffeomorphic model of irrotational shape change is presented. This new model presents striking practical advantages over standard diffeomorphic methods, while the study of this new space promises to illuminate aspects of the structure of the diffeomorphism group.
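
    As a rough mathematical sketch of the polynomial-regression extension mentioned above (generic notation, not necessarily the dissertation's exact formulation), fitting a k-th order Riemannian polynomial $\gamma$ to observations $y_i$ at times $t_i$ can be posed as

    $$\hat\gamma \;=\; \arg\min_{\gamma} \sum_{i=1}^{N} d_M\big(\gamma(t_i),\, y_i\big)^2 \quad \text{subject to} \quad \underbrace{\nabla_{\dot\gamma}\cdots\nabla_{\dot\gamma}}_{k\ \text{times}}\,\dot\gamma \equiv 0,$$

    where $d_M$ is the geodesic distance on the manifold $M$; the constraint states that the k-th covariant derivative of the velocity vanishes, generalising "polynomial of order k", and k = 1 recovers geodesic regression.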