
    HUMAN FACE RECOGNITION BASED ON FRACTAL IMAGE CODING

    Human face recognition is an important area in the field of biometrics. It has been an active area of research for several decades, but it remains a challenging problem because of the complexity of the human face. In this thesis we describe fully automatic solutions that can locate faces and then perform identification and verification. We present a solution for face localisation using eye locations. We derive an efficient representation for the decision hyperplane of linear and nonlinear Support Vector Machines (SVMs). For this we introduce the novel concept of ρ and η prototypes. The standard formulation for the decision hyperplane is reformulated and expressed in terms of the two prototypes. Different kernels are treated separately to achieve further classification efficiency and to facilitate adaptation to the fast Fourier transform for fast eye detection. Using the eye locations, we extract and normalise the face for size and in-plane rotation. Our method produces a more efficient representation of the SVM decision hyperplane than the well-known reduced set methods. As a result, our eye detection subsystem is faster and more accurate.

    The use of fractals and fractal image coding for object recognition has been proposed and used by others. Fractal codes have been used as features for recognition, but doing so requires accounting for the distance between codes and ensuring the continuity of the code parameters. We use a method based on fractal image coding for recognition, which we call the Fractal Neighbour Distance (FND). The FND relies on the Euclidean metric and the uniqueness of the attractor of a fractal code. An advantage of using the FND over fractal codes as features is that we need not worry about the uniqueness of, and distance between, codes; we only require the uniqueness of the attractor, which is already an implied property of a properly generated fractal code. Similar methods to the FND have been proposed by others, but what distinguishes our work is that we investigate the FND in greater detail and use our findings to improve the recognition rate.

    Our investigations reveal that the FND has some inherent invariance to translation, scale, rotation and changes in illumination. These invariances are image dependent and are affected by the fractal encoding parameters. The parameters with the greatest effect on recognition accuracy are the contrast scaling factor, the luminance shift factor and the type of range block partitioning. The contrast scaling factor affects whether a fractal decoding process converges, and its eventual rate of convergence. We propose a novel method of controlling the convergence rate by altering the contrast scaling factor in a controlled manner, which has not been possible before. This helped us improve the recognition rate because, under certain conditions, a slower rate of convergence yields better results. We also investigate the effects of varying the luminance shift factor, and examine three types of range block partitioning scheme: quad-tree, HV and uniform partitioning. We performed experiments on various face datasets, and the results show that our method indeed performs better than many accepted methods such as eigenfaces. The experiments also show that the FND-based classifier increases the separation between classes. The standard FND is further improved by incorporating localised weights.
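    The following minimal Python sketch illustrates one plausible reading of the standard FND. The code layout (a uniform partitioning of fixed-size blocks), the iteration count and the helper names are illustrative assumptions, not the thesis's exact formulation.

    import numpy as np

    # Hypothetical fractal-code layout: each entry maps a (2B x 2B) domain
    # block, downsampled to (B x B), onto a range block, with a contrast
    # scaling factor s and a luminance shift o (the two parameters the
    # abstract highlights).
    # code = [(r_row, r_col, d_row, d_col, s, o), ...]

    def apply_fractal_code(code, img, block=8):
        """One fractal decoding iteration: apply every block mapping once."""
        out = np.zeros_like(img, dtype=float)
        for rr, rc, dr, dc, s, o in code:
            dom = img[dr:dr + 2 * block, dc:dc + 2 * block].astype(float)
            # Downsample the domain block to range-block size by 2x2 averaging.
            dom = dom.reshape(block, 2, block, 2).mean(axis=(1, 3))
            out[rr:rr + block, rc:rc + block] = s * dom + o
        return out

    def fractal_neighbour_distance(code_a, img_b, iters=3, block=8):
        """Push img_b a few decoding steps towards the attractor of code_a.

        Because the attractor of a properly generated fractal code is
        unique, img_b moves little under code_a's transform only if it
        already resembles the face that code_a encodes; the Euclidean
        norm of that movement serves as the distance.
        """
        x = img_b.astype(float)
        for _ in range(iters):
            x = apply_fractal_code(code_a, x, block)
        return float(np.linalg.norm(x - img_b.astype(float)))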
    A local search algorithm is introduced to find the best-matching local feature using this locally weighted FND. The scores from a set of these locally weighted FND operations are then combined into a global score, which is used as a measure of the similarity between two face images. Each local FND operation possesses the distortion-invariant properties described above. Combined with the search procedure, the method has the potential to be invariant to a larger class of non-linear distortions. We also present a set of locally weighted FNDs that concentrate on the upper part of the face, encompassing the eyes and nose. This design was motivated by the fact that the region around the eyes carries more discriminative information. Better performance is achieved by using different sets of weights for identification and verification. For facial verification, performance is further improved by using normalised scores and client-specific thresholding. In this case, our results are competitive with current state-of-the-art methods, and in some cases outperform all those to which they were compared. For facial identification, under some conditions the weighted FND performs better than the standard FND. However, the weighted FND still has its shortcomings on some datasets, where its performance is not much better than that of the standard FND. To alleviate this problem we introduce a voting scheme that operates on normalised versions of the weighted FND. Although this method brings no improvement at lower matching ranks, it yields significant improvements at larger matching ranks. Our methods offer advantages over some well-accepted approaches such as eigenfaces, neural networks and those based on statistical learning theory. Among these advantages: new faces can be enrolled without re-training involving the whole database; faces can be removed from the database without re-training; there are inherent invariances to face distortions; the method is relatively simple to implement; and it is not model-based, so there are no model parameters that need to be tweaked.
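    As a sketch of how such local scores might be combined, the snippet below builds on the FND sketch above; the region layout, the plain averaging and the normalisation scheme are assumptions for illustration, not the thesis's exact design.

    import numpy as np

    def local_fnd_search(probe, gallery_code, centre, weighted_fnd, search=2):
        """Local search: best locally weighted FND over small shifts.

        weighted_fnd(code, img, centre) is assumed to evaluate a locally
        weighted FND on the region around `centre`; the +/- `search`
        pixel shifts find the best-matching local feature.
        """
        best = np.inf
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                shifted = (centre[0] + dy, centre[1] + dx)
                best = min(best, weighted_fnd(gallery_code, probe, shifted))
        return best

    def global_score(probe, gallery_code, centres, weighted_fnd):
        """Combine local scores from regions concentrated on eyes and nose."""
        scores = [local_fnd_search(probe, gallery_code, c, weighted_fnd)
                  for c in centres]
        return float(np.mean(scores))

    def verify(score, client_mean, client_std, threshold):
        """Client-specific thresholding on a normalised score."""
        return (score - client_mean) / client_std < threshold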

    Multimodal Remote Sensing Image Registration with Accuracy Estimation at Local and Global Scales

    This paper focuses on the potential accuracy of remote sensing image registration. We investigate how this accuracy can be estimated without available ground truth and used to improve the registration quality of mono- and multi-modal image pairs. At the local scale of image fragments, the Cramer-Rao lower bound (CRLB) on registration error is estimated for each local correspondence between a coarsely registered pair of images. This CRLB is defined by local image texture and noise properties. In contrast to the standard approach, where registration accuracy is only evaluated at the output of the registration process, we use this valuable information as additional input knowledge. It greatly helps in detecting and discarding outliers and in refining the estimation of the geometrical transformation model parameters. Based on these ideas, a new area-based registration method called RAE (Registration with Accuracy Estimation) is proposed. In addition to its ability to automatically register very complex multimodal image pairs with high accuracy, the RAE method provides registration accuracy at the global scale, either as the covariance matrix of the estimation error of the geometrical transformation model parameters or as the point-wise registration standard deviation. This accuracy does not depend on the availability of any ground truth and characterises each pair of registered images individually. Thus, the RAE method can identify image areas for which a predefined registration accuracy is guaranteed. The RAE method proves successful, reaching subpixel accuracy while registering eight complex mono/multimodal and multitemporal image pairs, including optical-to-optical, optical-to-radar, optical-to-Digital Elevation Model (DEM) and DEM-to-radar cases. The other methods employed in the comparisons fail to provide stably accurate results on the same test cases.
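    A minimal sketch of the weighting idea follows: per-correspondence CRLB variances weight an affine least-squares fit and drive outlier rejection. The affine model, the isotropic treatment of the CRLB and the 3-sigma gate are simplifying assumptions, not the paper's exact estimator.

    import numpy as np

    def fit_affine_weighted(src, dst, sigma2):
        """Weighted least-squares affine fit from matched points.

        src, dst: (N, 2) matched coordinates from coarse registration.
        sigma2:   (N,) local CRLB on registration error per match;
                  a low bound means a reliable match, hence a high weight.
        """
        n = src.shape[0]
        A = np.zeros((2 * n, 6))
        b = np.zeros(2 * n)
        A[0::2, 0:2] = src; A[0::2, 2] = 1.0
        A[1::2, 3:5] = src; A[1::2, 5] = 1.0
        b[0::2], b[1::2] = dst[:, 0], dst[:, 1]
        w = np.repeat(1.0 / np.sqrt(sigma2), 2)
        Aw, bw = A * w[:, None], b * w
        params, *_ = np.linalg.lstsq(Aw, bw, rcond=None)
        # Parameter covariance: a global-scale accuracy estimate
        # (assumes enough well-spread matches for full rank).
        cov = np.linalg.inv(Aw.T @ Aw)
        return params, cov

    def discard_outliers(src, dst, sigma2, params, k=3.0):
        """Drop matches whose residual exceeds k times the local CRLB std."""
        a11, a12, tx, a21, a22, ty = params
        pred = np.stack([a11 * src[:, 0] + a12 * src[:, 1] + tx,
                         a21 * src[:, 0] + a22 * src[:, 1] + ty], axis=1)
        keep = np.linalg.norm(pred - dst, axis=1) < k * np.sqrt(sigma2)
        return src[keep], dst[keep], sigma2[keep]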

    Computational Modeling for Abnormal Brain Tissue Segmentation, Brain Tumor Tracking, and Grading

    This dissertation proposes novel texture feature-based computational models for the quantitative analysis of abnormal tissues in two neurological disorders: brain tumor and stroke. Brain tumors are cells with uncontrolled growth in the brain tissues and one of the major causes of cancer death. Brain strokes, on the other hand, occur due to a sudden interruption of the blood supply, which damages normal brain tissue and frequently causes death or persistent disability. Clinical management of brain tumors and stroke lesions depends critically on robust quantitative analysis using different imaging modalities, including Magnetic Resonance (MR) and Digital Pathology (DP) images. Due to uncontrolled growth and infiltration into the surrounding tissues, tumor regions appear with significant texture variation in the static MRI volume and also in longitudinal imaging studies. Consequently, this study developed computational models using novel texture features to segment abnormal brain tissues (tumor and stroke lesions), to track the change of tumor volume in longitudinal images, and to grade tumors in MR images. Manual delineation and analysis of these abnormal tissues at large scale is tedious, error-prone, and often suffers from inter-observer variability; efficient computational models for robust segmentation of the different abnormal tissues are therefore required to support the diagnosis and analysis processes. In this study, brain tissues are characterised with a novel computational model of multi-fractal texture features for multi-class brain tumor tissue segmentation (BTS), and the method is extended to ischemic stroke lesions in MRI. The robustness of the proposed segmentation methods is evaluated on large private and public-domain clinical datasets, on which they offer performance competitive with state-of-the-art methods. Further, I analyse the dynamic texture behaviour of tumor volume in longitudinal imaging and develop a post-processing framework using three-dimensional (3D) texture features. These post-processing methods are shown to reduce false positives in the BTS results and to improve the overall segmentation in longitudinal imaging. Using these improved segmentation results, the change of tumor volume is quantified into three types (stable, progressing, and shrinking) as observed from the volumetric changes of the different tumor tissues in longitudinal images. This study also investigates a novel non-invasive glioma grading, for the first time in the literature, that uses structural MRI only. Such non-invasive grading may be useful before an invasive biopsy is recommended. The study further develops an automatic glioma grading scheme using invasive cell nuclei morphology in DP images for cross-validation on the same patients. In summary, the texture-based computational models proposed in this study are expected to facilitate the clinical management of patients with brain tumors and strokes by automating large-scale imaging data analysis, reducing human error and inter-observer variability, and producing repeatable brain tumor quantitation and grading.
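    To convey the flavour of fractal texture analysis, a minimal box-counting fractal dimension estimator is sketched below; the dissertation's multi-fractal features are more elaborate, and the binarisation threshold and patch handling here are assumptions for illustration.

    import numpy as np

    def box_counting_dimension(patch, threshold=0.5):
        """Box-counting fractal dimension estimate for one image patch.

        The patch is binarised at `threshold`, boxes of shrinking side
        length are counted if they contain any foreground, and the slope
        of log(count) versus log(1/size) estimates the dimension; such a
        per-patch value can serve as one texture feature for segmentation.
        """
        binary = patch > threshold
        sizes, counts = [], []
        s = min(binary.shape) // 2
        while s >= 2:
            h = binary.shape[0] // s * s
            w = binary.shape[1] // s * s
            view = binary[:h, :w].reshape(h // s, s, w // s, s)
            counts.append(int(view.any(axis=(1, 3)).sum()))
            sizes.append(s)
            s //= 2
        slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes, float)),
                              np.log(np.asarray(counts, float) + 1e-12), 1)
        return slope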

    Invariant Scattering Transform for Medical Imaging

    The invariant scattering transform opens a new area of research that merges signal processing with deep learning for computer vision. Deep learning algorithms can nowadays solve a variety of problems in the medical sector. Medical images are used to detect diseases such as brain cancer and tumors, Alzheimer's disease, breast cancer, Parkinson's disease and many others. During the pandemic in 2020, machine learning and deep learning played a critical role in detecting COVID-19, including mutation analysis, prediction, diagnosis and decision making. Medical images such as X-ray, MRI (magnetic resonance imaging) and CT scans are used for detecting diseases. The scattering transform is another deep learning method for medical imaging: it builds useful signal representations for image classification. It is a wavelet-based technique that is impactful for medical image classification problems. This article discusses the scattering transform as an efficient system for medical image analysis, in which the signal information is scattered through a deep convolutional network. A step-by-step case study is presented in this work.
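    A minimal numeric sketch of a two-layer scattering transform follows; the Gabor-like filter bank, the scales, angles and pooling settings are illustrative assumptions rather than the paper's configuration (in practice a dedicated library would be used).

    import numpy as np
    from scipy.ndimage import gaussian_filter
    from scipy.signal import fftconvolve

    def gabor_bank(size=16, scales=(1, 2), n_angles=4):
        """Small bank of real Gabor-like wavelets, one per scale and angle."""
        ys, xs = np.mgrid[-size // 2:size // 2, -size // 2:size // 2]
        bank = []
        for j in scales:
            sigma = 2.0 * j
            env = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))
            for k in range(n_angles):
                theta = np.pi * k / n_angles
                u = xs * np.cos(theta) + ys * np.sin(theta)
                bank.append(env * np.cos(2 * np.pi * u / (4 * j)))
        return bank

    def scattering_features(img):
        """First- and second-order scattering coefficients of one image.

        S0 = blur(x), S1 = blur(|x * psi|), S2 = blur(||x * psi| * psi'|):
        the modulus plus local averaging yields locally translation-
        invariant, deformation-stable features.
        """
        bank = gabor_bank()
        blur = lambda z: gaussian_filter(z, sigma=4)[::8, ::8]
        feats, first = [blur(img)], []
        for psi in bank:
            u1 = np.abs(fftconvolve(img, psi, mode='same'))
            first.append(u1)
            feats.append(blur(u1))
        for u1 in first:
            for psi in bank[:4]:  # second layer kept small for brevity
                feats.append(blur(np.abs(fftconvolve(u1, psi, mode='same'))))
        return np.concatenate([f.ravel() for f in feats])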

    Understanding cities with machine eyes: A review of deep computer vision in urban analytics

    Modelling urban systems has interested planners and modellers for decades. Different models have been built on mathematics, cellular automata, complexity, and scaling. While most of these models tend to be simplifications of reality, today, amid the paradigm shifts of artificial intelligence across the different fields of science, the applications of computer vision show promising potential for understanding the realistic dynamics of cities. While cities are complex by nature, computer vision is making progress on a variety of complex physical and non-physical visual tasks. In this article, we review the tasks and algorithms of computer vision and their applications to understanding cities. We attempt to subdivide computer vision algorithms into tasks, and cities into layers, to show where computer vision is intensively applied and where further research is needed. We focus on highlighting the potential role of computer vision in understanding urban systems related to the built environment, the natural environment, human interaction, transportation, and infrastructure. After showing the diversity of computer vision algorithms and applications, we discuss the challenges that remain in understanding the integration between these different layers of cities, and their interactions with one another, using deep learning and computer vision. We also offer recommendations for practice and policy-making towards AI-generated urban policies.

    An Integrated Multi-modal Registration Technique for Medical Imaging

    Registration of medical images is essential for aligning different modalities in time and space, consolidating their strengths for enhanced diagnosis and for the effective planning of treatment or therapeutic interventions. The primary objective of this study is to develop an integrated registration method that is effective for registering both brain and whole-body images. In the proposed method we seek to combine, in one setting, the excellent registration results that the FMRIB Software Library (FSL) produces on brain images with the excellent results of Statistical Parametric Mapping (SPM) when registering whole-body images. To assess attainment of these objectives, the following registration tasks were performed: (1) FDG_CT with FLT_CT images, (2) pre-operation MRI with intra-operation CT images, (3) brain-only MRI with corresponding PET images, and (4) MRI T1 with T2, T1 with FLAIR, and T1 with GE images. The results of the proposed method were then compared with those obtained using existing state-of-the-art registration methods, namely SPM and FSL. Initially, three slices were chosen from the reference image, and the normalised mutual information (NMI) was calculated between each of them and every slice in the moving image; the three pairs with the highest NMI values were selected. Wavelet decomposition is applied to minimise the computational requirements. An initial search applying a genetic algorithm is conducted on the three pairs to obtain three sets of registration parameters, and the Powell method is applied to the reference and moving images to validate these three sets. A linear interpolation is then used to obtain the registration parameters for all remaining slices. Finally, the registered image is displayed in alignment with the reference image to show the performance of the three methods, namely the proposed method, SPM and FSL, gauged by the average NMI values obtained in the registration results. Visual observations are also provided in support of these NMI values. For comparative purposes, tests using different multi-modal imaging platforms are performed.
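    Since the pipeline hinges on NMI-based slice pairing, a minimal sketch of that step is given below; the bin count and the exhaustive pairing loop are simplifying assumptions.

    import numpy as np

    def normalized_mutual_information(a, b, bins=64):
        """NMI between two slices: (H(A) + H(B)) / H(A, B)."""
        hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        pxy = hist / hist.sum()

        def entropy(p):
            p = p[p > 0]
            return float(-(p * np.log(p)).sum())

        return (entropy(pxy.sum(axis=1)) + entropy(pxy.sum(axis=0))) \
            / entropy(pxy.ravel())

    def best_slice_pairs(ref_slices, mov_slices, k=3):
        """Rank (reference, moving) slice pairs by NMI and keep the top k."""
        scores = [(normalized_mutual_information(r, m), i, j)
                  for i, r in enumerate(ref_slices)
                  for j, m in enumerate(mov_slices)]
        return sorted(scores, reverse=True)[:k]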

    Connected Attribute Filtering Based on Contour Smoothness


    Deep Image Translation With an Affinity-Based Change Prior for Unsupervised Multimodal Change Detection

    Image translation with convolutional neural networks has recently been used as an approach to multimodal change detection. Existing approaches train the networks by exploiting supervised information about the change areas, which, however, is not always available. A main challenge in the unsupervised problem setting is to prevent change pixels from affecting the learning of the translation function. We propose two new network architectures trained with loss functions weighted by priors that reduce the impact of change pixels on the learning objective. The change prior is derived in an unsupervised fashion from relational pixel information captured by domain-specific affinity matrices. Specifically, we use the vertex degrees associated with an absolute affinity difference matrix and demonstrate their utility in combination with cycle consistency and adversarial training. The proposed neural networks are compared with state-of-the-art algorithms. Experiments conducted on three real datasets show the effectiveness of our methodology.
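    A minimal sketch of how such a prior could be computed for one patch pair follows; the Gaussian kernel, its width h and the min-max normalisation are assumptions for illustration, not the paper's exact formulation.

    import numpy as np

    def change_prior(x_feats, y_feats, h=0.1):
        """Per-pixel change prior from a pair of affinity matrices.

        x_feats, y_feats: (n, d_x) and (n, d_y) feature vectors for the
        same n pixel locations in the two modalities. The vertex degrees
        of the absolute affinity difference are large where relational
        structure disagrees across modalities, i.e. at likely change
        pixels, so 1 - degree can down-weight them in the translation loss.
        """
        def affinity(z):
            d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
            return np.exp(-d2 / (2.0 * h ** 2))

        diff = np.abs(affinity(x_feats) - affinity(y_feats))
        degree = diff.mean(axis=1)                       # vertex degrees
        degree = (degree - degree.min()) / (degree.ptp() + 1e-12)
        return 1.0 - degree                              # 1 = likely unchanged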