167 research outputs found

    Using retinex for point selection in 3D shape registration

    Inspired by retinex theory, we propose a novel method for selecting key points from a depth map of a 3D freeform shape; we also use these key points as a basis for shape registration. To find key points, first, depths are transformed using the Hotelling method and normalized to reduce their dependence on a particular viewpoint. Adaptive smoothing is then applied using weights which decrease with spatial gradient and local inhomogeneity; this preserves local features such as edges and corners while ensuring smoothed depths are not reduced. Key points are those with locally maximal depths, faithfully capturing shape. We show how such key points can be used in an efficient registration process, using two state-of-the-art iterative closest point variants. A comparative study with leading alternatives, using real range images, shows that our approach provides informative, expressive, and repeatable points leading to the most accurate registration results. © 2014 Elsevier Ltd
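The final selection step named in this abstract (key points are those with locally maximal depths) can be illustrated with a small sketch. It is not the authors' implementation: the Hotelling transform and the adaptive smoothing are assumed to have been applied already, and the function name `local_maxima` and the 3x3 neighbourhood are assumptions made for illustration.

```python
def local_maxima(depth):
    """Return (row, col) positions whose depth is strictly greater than
    every neighbour in the surrounding 3x3 window.

    `depth` is a list of equal-length rows holding already-smoothed depths
    (hypothetical input; the smoothing itself is not shown here).
    """
    rows, cols = len(depth), len(depth[0])
    points = []
    for r in range(rows):
        for c in range(cols):
            centre = depth[r][c]
            # Collect the neighbours that fall inside the map.
            neighbours = [
                depth[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if all(centre > v for v in neighbours):
                points.append((r, c))
    return points


depth_map = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
print(local_maxima(depth_map))
```

A strict comparison is used so that flat regions produce no key points; a real pipeline would likely need a tie-breaking rule for plateaus.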

    Region-based saliency estimation for 3D shape analysis and understanding

    The detection of salient regions is an important pre-processing step for many 3D shape analysis and understanding tasks. This paper proposes a novel method for saliency detection in 3D free form shapes. Firstly, we smooth the surface normals by a bilateral filter. Such a method is capable of smoothing the surfaces and retaining the local details. Secondly, a novel method is proposed for the estimation of the saliency value of each vertex. To this end, two new features are defined: Retinex-based Importance Feature (RIF) and Relative Normal Distance (RND). They are based on the human visual perception characteristics and surface geometry respectively. Since the vertex based method cannot guarantee that the detected salient regions are semantically continuous and complete, we propose to refine such values based on surface patches. The detected saliency is finally used to guide the existing techniques for mesh simplification, interest point detection, and overlapping point cloud registration. The comparative studies based on real data from three publicly accessible databases show that the proposed method usually outperforms five selected state-of-the-art ones both qualitatively and quantitatively for saliency detection and 3D shape analysis and understanding.
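The first step of this abstract, bilateral smoothing of surface normals, can be sketched as below. This is a generic bilateral filter under stated assumptions, not the paper's exact formulation: Gaussian weights on both vertex distance and normal difference, a caller-supplied neighbour list, and hypothetical parameter names `sigma_s` and `sigma_r`.

```python
import math


def bilateral_smooth_normals(positions, normals, neighbours,
                             sigma_s=1.0, sigma_r=0.5):
    """Smooth unit normals with a bilateral weighting.

    positions, normals: lists of 3-tuples, one per vertex.
    neighbours: neighbours[i] lists the vertex indices adjacent to vertex i.
    sigma_s, sigma_r: spatial and range bandwidths (illustrative defaults).
    """
    smoothed = []
    for i, n_i in enumerate(normals):
        acc = [0.0, 0.0, 0.0]
        # The vertex itself takes part in the average with weight 1.
        for j in [i] + list(neighbours[i]):
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            r2 = sum((a - b) ** 2 for a, b in zip(n_i, normals[j]))
            # Close vertices with similar normals get large weights;
            # normals across a sharp edge get small ones.
            w = math.exp(-d2 / (2 * sigma_s ** 2)) * math.exp(-r2 / (2 * sigma_r ** 2))
            for k in range(3):
                acc[k] += w * normals[j][k]
        length = math.sqrt(sum(a * a for a in acc))
        if length == 0.0:
            smoothed.append(tuple(n_i))  # degenerate case: keep the original
        else:
            smoothed.append(tuple(a / length for a in acc))
    return smoothed


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
neighbours = [[1], [0]]
flat = bilateral_smooth_normals(positions, [(0.0, 0.0, 1.0)] * 2, neighbours)
edge = bilateral_smooth_normals(
    positions, [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)], neighbours)
```

The range term is what retains local detail: in the `edge` case the two normals differ strongly, so each one is barely pulled towards the other.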

    Building colour terms: A combined GIS and stereo vision approach to identifying building pixels in images to determine appropriate colour terms

    Color information is a useful attribute to include in a building’s description to assist the listener in identifying the intended target. Often this information is only available as image data, and not readily accessible for use in constructing referring expressions for verbal communication. The method presented uses a GIS building polygon layer in conjunction with street-level captured imagery to automatically filter foreground objects and select pixels which correspond to building façades. These selected pixels are then used to define the most appropriate color term for the building, and the corresponding fuzzy color term histogram. The technique uses a single camera capturing images at a high frame rate, with the baseline distance between frames calculated from a GPS speed log. The expected distance from the camera to the building is measured from the polygon layer and refined from the calculated depth map, after which building pixels are selected. In addition, significant foreground planar surfaces between the known road edge and building façade are identified as possible boundary walls and hedges. The output is a dataset of the most appropriate color terms for both the building and boundary walls. Initial trials demonstrate the usefulness of the technique in automatically capturing color terms for buildings in urban regions.
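The baseline and depth steps in this abstract can be sketched with standard stereo geometry. This is an illustration under assumptions, not the authors' code: the camera is assumed to move sideways relative to the façade so that consecutive frames act as a stereo pair, a pinhole camera model is assumed, and the names `focal_px` and `disparity_px` are hypothetical.

```python
def baseline_from_speed(speed_m_per_s, frame_rate_hz):
    """Distance travelled between two consecutive frames, in metres."""
    if frame_rate_hz <= 0:
        raise ValueError("frame rate must be positive")
    return speed_m_per_s / frame_rate_hz


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Illustrative numbers only: 10 m/s from the GPS log, 20 frames per second.
baseline = baseline_from_speed(10.0, 20.0)           # 0.5 m between frames
depth = depth_from_disparity(1000.0, baseline, 25.0)  # 20.0 m to the surface
print(baseline, depth)
```

A depth computed this way could then be compared with the camera-to-building distance taken from the GIS polygon layer, keeping only pixels whose depth is close to the expected value.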

    Cross-Spectral Face Recognition Between Near-Infrared and Visible Light Modalities.

    In this thesis, improvement of face recognition performance with the use of images from the visible (VIS) and near-infrared (NIR) spectrum is attempted. Face recognition systems can be adversely affected by scenarios which encounter a significant amount of illumination variation across images of the same subject. Cross-spectral face recognition systems using images collected across the VIS and NIR spectrum can counter the ill-effects of illumination variation by standardising both sets of images. A novel preprocessing technique is proposed, which attempts the transformation of faces across both modalities to a feature space with enhanced correlation. Direct matching across the modalities is not possible due to the inherent spectral differences between NIR and VIS face images. Compared to a VIS light source, NIR radiation has a greater penetrative depth when incident on human skin. This fact, in addition to the greater number of scattering interactions within the skin by rays from the NIR spectrum can alter the morphology of the human face enough to disable a direct match with the corresponding VIS face. Several ways to bridge the gap between NIR-VIS faces have been proposed previously. Mostly of a data-driven approach, these techniques include standardised photometric normalisation techniques and subspace projections. A generative approach driven by a true physical model has not been investigated till now. In this thesis, it is proposed that a large proportion of the scattering interactions present in the NIR spectrum can be accounted for using a model for subsurface scattering. A novel subsurface scattering inversion (SSI) algorithm is developed that implements an inversion approach based on translucent surface rendering by the computer graphics field, whereby the reversal of the first order effects of subsurface scattering is attempted. 
The SSI algorithm is then evaluated against several preprocessing techniques, and using various permutations of feature extraction and subspace projection algorithms. The results of this evaluation show an improvement in cross-spectral face recognition performance using SSI over existing Retinex-based approaches. The best-performing combination uses an existing photometric normalisation technique, Sequential Chain, and achieves a Rank 1 recognition rate of 92.5%. In addition, the improvement in performance using non-linear projection models shows that an element of non-linearity exists in the relationship between NIR and VIS.

    Illumination Processing in Face Recognition


    Enhanced Augmented Reality Framework for Sports Entertainment Applications

    Augmented Reality (AR) superimposes virtual information on real-world data, such as displaying useful information on videos/images of a scene. This dissertation presents an Enhanced AR (EAR) framework for displaying useful information on images of a sports game. The challenge in such applications is robust object detection and recognition. This is even more challenging when there is strong sunlight. We address the phenomenon where a captured image is degraded by strong sunlight. The developed framework consists of an image enhancement technique to improve the accuracy of subsequent player and face detection. The image enhancement is followed by player detection, face detection, recognition of players, and display of personal information of players. First, an algorithm based on Multi-Scale Retinex (MSR) is proposed for image enhancement. For the tasks of player and face detection, we use an adaptive boosting algorithm with Haar-like features for both feature selection and classification. The player face recognition algorithm uses adaptive boosting with LDA for feature selection and a nearest neighbor classifier for classification. The framework can be deployed in any sport where a viewer captures images. Display of player-specific information enhances the end-user experience. Detailed experiments are performed on 2096 diverse images captured using a digital camera and smartphone. The images contain players in different poses, expressions, and illuminations. The player face recognition module requires players' faces to be frontal or within ±35° of pose variation. The work demonstrates the great potential of computer vision based approaches for future development of AR applications. COMSATS Institute of Information Technolog
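Multi-Scale Retinex, the basis of the enhancement step mentioned above, averages several single-scale retinex outputs, each being the log of the signal minus the log of a Gaussian-smoothed copy. The sketch below shows the textbook form on a single scanline to stay dependency-free; it is not the dissertation's algorithm, and the scales in `sigmas`, the equal weights, and the `+ 1.0` offset inside the logarithm are assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Gaussian blur of a 1-D signal with replicated borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the border
            acc += w * signal[j]
        out.append(acc)
    return out


def multi_scale_retinex(signal, sigmas=(1.0, 3.0, 6.0)):
    """Equal-weight average of single-scale retinex outputs."""
    weight = 1.0 / len(sigmas)
    out = [0.0] * len(signal)
    for sigma in sigmas:
        smooth = blur(signal, sigma)
        for i, (value, surround) in enumerate(zip(signal, smooth)):
            # log(value) - log(surround), offset by 1 to avoid log(0).
            out[i] += weight * (math.log(value + 1.0) - math.log(surround + 1.0))
    return out


scanline = [10.0] * 10 + [200.0] * 10  # dark region followed by a bright one
enhanced = multi_scale_retinex(scanline)
```

Uniform regions map to values near zero while the response is concentrated around the intensity step, which is why the output is usually rescaled to the display range before use.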

    Multi-Modal Enhancement Techniques for Visibility Improvement of Digital Images

    Image enhancement techniques for visibility improvement of 8-bit color digital images based on spatial domain, wavelet transform domain, and multiple image fusion approaches are investigated in this dissertation research. In the category of spatial domain approach, two enhancement algorithms are developed to deal with problems associated with images captured from scenes with high dynamic ranges. The first technique is based on an illuminance-reflectance (I-R) model of the scene irradiance. The dynamic range compression of the input image is achieved by a nonlinear transformation of the estimated illuminance based on a windowed inverse sigmoid transfer function. A single-scale neighborhood dependent contrast enhancement process is proposed to enhance the high frequency components of the illuminance, which compensates for the contrast degradation of the mid-tone frequency components caused by dynamic range compression. The intensity image obtained by integrating the enhanced illuminance and the extracted reflectance is then converted to an RGB color image through linear color restoration utilizing the color components of the original image. The second technique, named AINDANE, is a two-step approach comprising adaptive luminance enhancement and adaptive contrast enhancement. An image dependent nonlinear transfer function is designed for dynamic range compression and a multiscale image dependent neighborhood approach is developed for contrast enhancement. Real time processing of video streams is realized with the I-R model based technique due to its high speed processing capability while AINDANE produces higher quality enhanced images due to its multi-scale contrast enhancement property. Both the algorithms exhibit balanced luminance, contrast enhancement, higher robustness, and better color consistency when compared with conventional techniques. 
In the transform domain approach, wavelet transform based image denoising and contrast enhancement algorithms are developed. The denoising is treated as a maximum a posteriori (MAP) estimator problem; a bivariate probability density function model is introduced to explore the interlevel dependency among the wavelet coefficients. In addition, an approximate solution to the MAP estimation problem is proposed to avoid the use of complex iterative computations to find a numerical solution. This relatively low complexity image denoising algorithm implemented with dual-tree complex wavelet transform (DT-CWT) produces high quality denoised images.