
    An efficient and practical 3D face scanner using near infrared and visible photometric stereo

    This paper is concerned with the acquisition of model data for automatic 3D face recognition applications. As 3D methods become progressively more popular in face recognition research, the need for fast and accurate data capture has become crucial. This paper is motivated by this need and offers three primary contributions. Firstly, the paper demonstrates that four-source photometric stereo offers a potential means for data capture that is computationally and financially viable and easily deployable in commercial settings. We have shown that both visible light and the less intrusive near infrared light are suitable for facial illumination. The second contribution is a detailed set of experimental results that compare the accuracy of the device to ground truth, which was captured using a commercial projected-pattern range finder. Importantly, we show that not only is near infrared light a valid alternative to the more commonly exploited visible light, but that it actually gives more accurate reconstructions. Finally, we assess the validity of the Lambertian assumption on skin reflectance data and show that better results may be obtained by incorporating more advanced reflectance functions, such as the Oren–Nayar model.
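    The four-source Lambertian recovery the abstract describes can be sketched in a few lines: under the Lambertian model each pixel intensity is the dot product of a known light direction with the albedo-scaled surface normal, so four images give an overdetermined linear system per pixel. The function name and the synthetic setup below are illustrative, not the authors' implementation.

```python
import numpy as np

# Minimal sketch of four-source Lambertian photometric stereo.
# With known unit light directions L (4x3) and observed intensities I,
# the model I = L @ (albedo * n) lets us recover albedo and the surface
# normal n per pixel via a least-squares solve.

def photometric_stereo(images, lights):
    """images: (4, H, W) intensities; lights: (4, 3) unit light directions."""
    n_img, h, w = images.shape
    I = images.reshape(n_img, -1)                    # (4, H*W)
    # Solve L g = I for g = albedo * normal at every pixel at once.
    g, *_ = np.linalg.lstsq(lights, I, rcond=None)   # (3, H*W)
    albedo = np.linalg.norm(g, axis=0)               # length of g is the albedo
    normals = np.where(albedo > 1e-8, g / np.maximum(albedo, 1e-8), 0.0)
    return albedo.reshape(h, w), normals.reshape(3, h, w)
```

    With four lights instead of the minimal three, the extra equation makes the per-pixel solve robust to noise and lets shadowed measurements be detected as large residuals.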

    Multispectral scleral patterns for ocular biometric recognition

    Biometrics is the science of recognizing people based on their physical or behavioral traits, such as the face, fingerprints, iris, and voice. Among the various traits studied in the literature, ocular biometrics has gained popularity due to the significant progress made in iris recognition. However, iris recognition is unfavorably influenced by a non-frontal gaze direction of the eye with respect to the acquisition device. In such scenarios, additional parts of the eye, such as the sclera (the white of the eye), may be of significance. In this dissertation, we investigate the use of the sclera texture and the vasculature patterns evident in the sclera as potential biometric cues. Iris patterns are better discerned in the near infrared (NIR) spectrum, while vasculature patterns are better discerned in the visible (RGB) spectrum. Therefore, multispectral images of the eye, consisting of both NIR and RGB channels, were used in this work in order to ensure that both the iris and the vasculature patterns are successfully imaged.

    The contributions of this work include the following. Firstly, a multispectral ocular database was assembled by collecting high-resolution color infrared images of the left and right eyes of 103 subjects using the DuncanTech MS 3100 multispectral camera. Secondly, a novel segmentation algorithm was designed to localize the spatial extent of the iris, sclera, and pupil in the ocular images. The proposed segmentation algorithm is a combination of region-based and edge-based schemes that exploits the multispectral information. Thirdly, different feature extraction and matching methods were used to determine the potential of utilizing the sclera and the accompanying vasculature pattern as biometric cues. The three specific matching methods considered in this work were keypoint-based matching, direct correlation matching, and minutiae matching based on blood vessel bifurcations.
    Fourthly, the potential of designing a bimodal ocular system that combines the sclera biometric with the iris biometric was explored.

    Experiments convey the efficacy of the proposed segmentation algorithm in localizing the sclera and the iris. The use of keypoint-based matching was observed to result in the best recognition performance for the scleral patterns. Finally, the possibility of utilizing the scleral patterns in conjunction with the iris for recognizing ocular images exhibiting non-frontal gaze directions was established.
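    Of the three matching schemes named above, direct correlation matching is the simplest to illustrate: two registered sclera-region patches are compared by zero-mean normalized cross-correlation and the score is thresholded. The function names and the threshold are illustrative assumptions; registration of the patches is assumed to have been done beforehand.

```python
import numpy as np

# Hedged sketch of direct correlation matching between two registered
# sclera vasculature patches: scores near +1 indicate strongly similar
# patterns, near 0 indicates no linear relationship.

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def is_match(probe, gallery, threshold=0.7):
    # The 0.7 threshold here is illustrative; in practice it would be
    # tuned on a validation set to trade off false accepts and rejects.
    return ncc(probe, gallery) >= threshold
```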

    Subspace Representations for Robust Face and Facial Expression Recognition

    Analyzing human faces and modeling their variations have always been of interest to the computer vision community. Face analysis based on 2D intensity images is a challenging problem, complicated by variations in pose, lighting, blur, and non-rigid facial deformations due to facial expressions. Among the different sources of variation, facial expressions are of interest as important channels of non-verbal communication. Facial expression analysis is also affected by changes in viewpoint and inter-subject variations in performing different expressions. This dissertation attempts to address some of the challenges involved in developing robust algorithms for face and facial expression recognition by exploiting the idea of proper subspace representations of data. Variations in the visual appearance of an object mostly arise from changes in illumination and pose. We therefore first present a video-based sequential algorithm for estimating the face albedo as an illumination-insensitive signature for face recognition. We show that by knowing/estimating the pose of the face at each frame of a sequence, the albedo can be efficiently estimated using a Kalman filter. We then extend this to the case of unknown pose by simultaneously tracking the pose and updating the albedo through an efficient Bayesian inference method performed using a Rao-Blackwellized particle filter. Since understanding the effects of blur, especially motion blur, is an important problem in unconstrained visual analysis, we then propose a blur-robust recognition algorithm for faces with spatially varying blur. We model a blurred face as a weighted average of geometrically transformed instances of its clean face. We then build a matrix, for each gallery face, whose column space spans the space of all the motion-blurred images obtained from the clean face. This matrix representation is then used to define a proper objective function and perform blur-robust face recognition.
    To develop robust and generalizable models for expression analysis, one needs to break the dependence of the models on the choice of the coordinate frame of the camera. To this end, we build models for expressions on the affine shape-space (Grassmann manifold), as an approximation to the projective shape-space, by using a Riemannian interpretation of the deformations that facial expressions cause on different parts of the face. This representation enables us to perform various expression analysis and recognition algorithms without the need for pose normalization as a preprocessing step. There is a large degree of inter-subject variation in performing various expressions, which poses an important challenge to developing robust facial expression recognition algorithms. To address this challenge, we propose a dictionary-based approach for facial expression analysis that decomposes expressions in terms of action units (AUs). First, we construct an AU dictionary using domain experts' knowledge of AUs. To incorporate the high-level knowledge regarding expression decomposition and AUs, we then perform structure-preserving sparse coding by imposing two layers of grouping over the AU-dictionary atoms as well as over the test image matrix columns. We use the computed sparse code matrix for each expressive face to perform expression decomposition and recognition. Most of the existing methods for the recognition of faces and expressions consider either the expression-invariant face recognition problem or the identity-independent facial expression recognition problem. We propose joint face and facial expression recognition using a dictionary-based component separation (DCS) algorithm. In this approach, the given expressive face is viewed as a superposition of a neutral face component and a facial expression component, which is sparse with respect to the whole image.
    This assumption leads to a dictionary-based component separation algorithm, which benefits from the ideas of sparsity and morphological diversity. The DCS algorithm uses data-driven dictionaries to decompose an expressive test face into its constituent components. The sparse codes obtained from this decomposition are then used for joint face and expression recognition.
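    The sparse coding that underlies both the AU-dictionary decomposition and the component separation above amounts to solving an L1-regularized least-squares problem over a fixed dictionary. A minimal sketch, using plain ISTA (iterative soft-thresholding) rather than the structure-preserving variant the dissertation develops; the function name and parameters are illustrative.

```python
import numpy as np

# Sketch of basic sparse coding: minimize 0.5 * ||y - D x||^2 + lam * ||x||_1
# over codes x for a fixed dictionary D, via iterative soft-thresholding.

def ista(D, y, lam=0.1, n_iter=500):
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)           # gradient of the smooth data term
        z = x - grad / L                   # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x
```

    The soft-thresholding step is what drives most coefficients exactly to zero, so the recovered code selects only the few dictionary atoms (e.g. action units) needed to explain the input.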

    Specular reflection removal and bloodless vessel segmentation for 3-D heart model reconstruction from single view images

    A three-dimensional (3D) human heart model is attracting attention for its role in medical imaging for education and clinical purposes. Analysing 2D images to obtain meaningful information requires a certain level of expertise; moreover, it is time consuming and requires special devices to obtain such images. In contrast, a 3D model conveys much more information. Reconstructing a 3D human heart model from medical imaging devices requires several input images, while reconstruction from a single-view image is challenging due to the colour properties of the heart image, light reflections, and its featureless surface. The lights and illumination conditions of the operating room cause specular reflections on the wet heart surface, which introduce noise into the reconstruction process. An image-based technique is used for the proposed human heart surface reconstruction. It is important that these reflections be eliminated to allow proper 3D reconstruction and avoid an imperfect final output. The specular reflection detection and correction process examines the surface properties. As a first step, reflections were detected using the standard deviation of the RGB colour channels and the maximum value of the blue channel to recover colour devoid of specularities. The results show the accurate and efficient performance of the specularity removal process, with 88.7% similarity to the ground truth. A realistic 3D heart model reconstruction was developed based on the extraction of pixel information from digital images, allowing novice surgeons to reduce the time needed for cardiac surgery training and enhancing their perception of the Operating Theatre (OT). Cardiac medical imaging devices such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT), or echocardiography provide cardiac information. However, images from these medical modalities are not adequate to precisely simulate the real environment or to be used in a training simulator for cardiac surgery.
    The proposed method exploits and develops techniques based on analysing real coloured images taken during cardiac surgery in order to obtain meaningful information about the heart's anatomical structures. Another issue is distinguishing the different vessels on the human heart surface. The most important vessel region is the bloodless (lacking blood) vessels, and surgeons face difficulties in locating the bloodless vessel region during surgery. The thesis proposes a technique for identifying the vessels' Region of Interest (ROI) to avoid surgical injuries by examining an enhanced input image. The proposed method locates the vessels' ROI using the Decorrelation Stretch technique, which clearly enhances the heart's surface image. Through this enhancement, the surgeon becomes able to effectively identify the vessels' ROI from textured and coloured surface images in order to perform the surgery. In addition, after enhancement and segmentation of the vessels' ROI, a 3D reconstruction of this ROI takes place and is then visualized over the 3D heart model. Experiments for each phase in the research framework were qualitatively and quantitatively evaluated. The dataset consists of 213 real human heart images collected during cardiac surgery using a digital camera. The experimental results of the proposed methods were compared with manually hand-labelled ground-truth data. The cost reduction in false positives and false negatives of the proposed specular detection and correction processes was less than 24% compared to other methods. In addition, the Root Mean Square Error (RMSE) was used to measure the correctness of the z-axis values, showing that the 3D model is reconstructed more accurately than with other methods. Finally, the 94.42% accuracy rate achieved by the proposed vessel segmentation method using the RGB colour space is comparable to other colour spaces. Experimental results show significant efficiency and robustness compared to existing state-of-the-art methods.
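    The specular-detection rule described above (low spread across the R, G, B values of a pixel, i.e. near-white, combined with a blue channel near the image maximum) can be sketched as a simple mask. The function name and both thresholds are illustrative assumptions, not values from the thesis.

```python
import numpy as np

# Hedged sketch of specular-highlight detection: a pixel is flagged when
# its R, G, B values are nearly equal (low per-pixel channel standard
# deviation) and its blue channel is close to the image's maximum blue.

def specular_mask(img, std_thresh=12.0, blue_frac=0.9):
    """img: (H, W, 3) uint8 RGB image -> boolean mask of specular pixels."""
    rgb = img.astype(np.float64)
    channel_std = rgb.std(axis=2)                  # per-pixel spread of R, G, B
    blue_peak = blue_frac * rgb[..., 2].max()      # fraction of the max blue value
    return (channel_std < std_thresh) & (rgb[..., 2] >= blue_peak)
```

    The flagged pixels would then be corrected, e.g. by inpainting from non-specular neighbours, before the reconstruction step.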

    Integration of Image Processing Algorithm and Path Planning for Search and Rescue Robot

    The focus of this project was to explore algorithms and techniques used in motion detection, object recognition and facial recognition, path finding, and obstacle avoidance. To apply these algorithms, OpenCV was used. In terms of hardware, a Raspberry Pi was used to perform the image processing and control the robot's movement. The image processing pipeline combined several OpenCV operations, such as grayscale conversion (cv2.cvtColor) followed by cv2.GaussianBlur and contour detection (cv2.findContours). Path finding and obstacle avoidance were achieved by integrating an ultrasonic sensor into the system. Path finding is done by utilizing the coordinates of the bounding box. As a result, the robot turned around, 90 degrees each time, to view each of the four directions in search of the target. Once motion was detected, the robot would stop at that direction and approach until the ultrasonic sensor detected something. The robot would then run a scan on the target using facial recognition to determine whether it was human.
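    The motion-detection stage described above (grayscale, blur/denoise, difference, contour bounding box) can be sketched with plain NumPy in place of the cv2.cvtColor / cv2.GaussianBlur / cv2.findContours calls, so it runs without a camera or OpenCV installed. The function names are illustrative, not from the project's code.

```python
import numpy as np

# Sketch of frame-differencing motion detection: convert to grayscale,
# threshold the absolute difference between frames, and return a bounding
# box around the changed pixels (the role cv2.boundingRect plays in the
# OpenCV pipeline described above).

def to_gray(frame):
    # Standard luma weights, as cv2.cvtColor(..., cv2.COLOR_RGB2GRAY) applies.
    return frame @ np.array([0.299, 0.587, 0.114])

def motion_bbox(prev_frame, frame, thresh=25):
    """Return (x, y, w, h) of the motion region between two RGB frames, or None."""
    diff = np.abs(to_gray(frame) - to_gray(prev_frame))
    ys, xs = np.nonzero(diff > thresh)
    if xs.size == 0:
        return None                        # no motion detected
    return (int(xs.min()), int(ys.min()),
            int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1))
```

    The bounding-box centre is what the robot would steer toward; a Gaussian blur before differencing (omitted here) suppresses single-pixel sensor noise.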

    Visual Tracking: An Experimental Survey

    There is a large variety of trackers that have been proposed in the literature during the last two decades, with mixed success. Object tracking in realistic scenarios is a difficult problem, and it therefore remains one of the most active areas of research in computer vision. A good tracker should perform well in a large number of videos involving illumination changes, occlusion, clutter, camera motion, low contrast, specularities, and at least six more aspects. However, the performance of proposed trackers has typically been evaluated on fewer than ten videos, or on special-purpose datasets. In this paper, we aim to evaluate trackers systematically and experimentally on 315 video fragments covering the above aspects. We selected a set of nineteen trackers to include a wide variety of algorithms often cited in the literature, supplemented with trackers appearing in 2010 and 2011 for which the code was publicly available. We demonstrate that trackers can be evaluated objectively by survival curves, Kaplan-Meier statistics, and Grubbs testing. We find that in evaluation practice the F-score is as effective as the object tracking accuracy (OTA) score. The analysis under a large variety of circumstances provides objective insight into the strengths and weaknesses of trackers.
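    The survival-curve idea mentioned above can be illustrated simply: for each video, record the fraction of frames a tracker survives before failing, then plot the proportion of videos on which it is still "alive" as a function of that fraction. With no censoring, the Kaplan-Meier estimator reduces to the empirical survivor function sketched below; the function name is illustrative.

```python
import numpy as np

# Sketch of a survival curve for tracker evaluation: the fraction of
# videos on which a tracker survives past each observed failure time.

def survival_curve(survival_fractions):
    """survival_fractions: per-video fraction of frames tracked (0..1).
    Returns (t, S): sorted failure times and the fraction of videos
    still surviving past each of them."""
    t = np.sort(np.asarray(survival_fractions, dtype=float))
    n = t.size
    S = 1.0 - np.arange(1, n + 1) / n      # step down at each failure time
    return t, S
```

    Plotting one such curve per tracker makes them directly comparable: a curve that stays high for longer indicates a tracker that keeps up with more videos for a larger portion of their frames.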