120 research outputs found

    Multi-focus Image Fusion with Sparse Feature Based Pulse Coupled Neural Network

    In order to better extract the focused regions and effectively improve the quality of the fused image, a novel multi-focus image fusion scheme with a sparse-feature-based pulse coupled neural network (PCNN) is proposed. The registered source images are decomposed into principal matrices and sparse matrices by robust principal component analysis (RPCA). The salient features of the sparse matrices construct the sparse feature space of the source images, and these sparse features are used to motivate the PCNN neurons. The focused regions of the source images are detected from the output of the PCNN and integrated to construct the final fused image. Experimental results show that the proposed scheme extracts the focused regions and improves fusion quality better than other existing fusion methods in both the spatial and transform domains.
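
    As a rough illustration of this pipeline, the Python sketch below decomposes each registered grayscale source image with a simplified inexact-ALM RPCA, drives a basic PCNN with the absolute sparse component, and takes each fused pixel from the source whose neuron fires more often. It is a minimal sketch, not the paper's implementation: the PCNN model, its parameters, and the linking kernel are illustrative assumptions.

    import numpy as np
    from scipy.ndimage import convolve

    def rpca(D, lam=None, tol=1e-7, max_iter=500):
        """Decompose D into a low-rank part L and a sparse part S (inexact ALM sketch)."""
        lam = lam or 1.0 / np.sqrt(max(D.shape))
        norm_D = np.linalg.norm(D, 'fro')
        Y = D / max(np.linalg.norm(D, 2), np.abs(D).max() / lam)
        mu, rho = 1.25 / np.linalg.norm(D, 2), 1.5
        S = np.zeros_like(D)
        for _ in range(max_iter):
            # singular-value thresholding recovers the low-rank component
            U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
            L = (U * np.maximum(sig - 1.0 / mu, 0)) @ Vt
            # soft-thresholding recovers the sparse component
            R = D - L + Y / mu
            S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0)
            Z = D - L - S
            Y, mu = Y + mu * Z, mu * rho
            if np.linalg.norm(Z, 'fro') / norm_D < tol:
                break
        return L, S

    def pcnn_fire_counts(stim, iterations=200, alpha_t=0.2, v_t=20.0, beta=0.2):
        """Accumulated firing counts of a simplified PCNN driven by a stimulus map."""
        F = stim / (stim.max() + 1e-12)              # normalised feeding input
        Y, T = np.zeros_like(F), np.ones_like(F)     # firing state, dynamic threshold
        counts = np.zeros_like(F)
        kernel = np.array([[0.5, 1, 0.5], [1, 0, 1], [0.5, 1, 0.5]])
        for _ in range(iterations):
            Lk = convolve(Y, kernel, mode='constant')  # linking from neighbours
            U = F * (1 + beta * Lk)                    # internal activity
            Y = (U > T).astype(float)
            T = np.exp(-alpha_t) * T + v_t * Y         # threshold decay and recharge
            counts += Y
        return counts

    # focus decision: a pixel is taken from the source whose sparse feature fires more
    _, S1 = rpca(img1)
    _, S2 = rpca(img2)
    fused = np.where(pcnn_fire_counts(np.abs(S1)) >= pcnn_fire_counts(np.abs(S2)),
                     img1, img2)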

    The Nonsubsampled Contourlet Transform Based Statistical Medical Image Fusion Using Generalized Gaussian Density

    We propose a novel medical image fusion scheme based on the statistical dependencies between coefficients in the nonsubsampled contourlet transform (NSCT) domain, in which the probability density function of the NSCT coefficients is concisely fitted with a generalized Gaussian density (GGD), and the similarity of two subbands is computed as the Jensen-Shannon divergence of their fitted GGDs. To preserve more useful information from the source images, new fusion rules are developed for subbands of different frequencies: the low-frequency subbands are fused using two activity measures based on regional standard deviation and Shannon entropy, while the high-frequency subbands are merged via weight maps determined by the saliency values of pixels. The experimental results demonstrate that the proposed method significantly outperforms conventional NSCT-based medical image fusion approaches in both visual perception and evaluation indices.
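
    The NSCT decomposition itself is not reproduced here; the Python sketch below shows only the statistical core of such a scheme: fitting a zero-mean GGD to a subband's coefficients by moment matching, and computing the Jensen-Shannon divergence of two fitted GGDs by numerical integration (no general closed form exists). The brentq bracket and the integration grid are illustrative assumptions.

    import numpy as np
    from scipy.special import gamma
    from scipy.optimize import brentq

    def fit_ggd(coeffs):
        """Moment-matching fit of a zero-mean generalized Gaussian density."""
        m1, m2 = np.mean(np.abs(coeffs)), np.mean(coeffs ** 2)
        ratio = m1 ** 2 / m2                         # invertible function of the shape
        f = lambda b: gamma(2 / b) ** 2 / (gamma(1 / b) * gamma(3 / b)) - ratio
        beta = brentq(f, 0.05, 10.0)                 # shape parameter
        alpha = m1 * gamma(1 / beta) / gamma(2 / beta)  # scale parameter
        return alpha, beta

    def ggd_pdf(x, alpha, beta):
        return beta / (2 * alpha * gamma(1 / beta)) * np.exp(-(np.abs(x) / alpha) ** beta)

    def jsd_ggd(params1, params2, lo=-50.0, hi=50.0, n=20001):
        """Jensen-Shannon divergence between two GGDs, integrated on a grid."""
        x = np.linspace(lo, hi, n)
        dx = x[1] - x[0]
        p, q = ggd_pdf(x, *params1), ggd_pdf(x, *params2)
        m = 0.5 * (p + q)
        kl = lambda a, b: np.sum(a * np.log((a + 1e-300) / (b + 1e-300))) * dx
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    A larger divergence between corresponding subbands signals lower statistical similarity, which the fusion rules can then weigh.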

    Comparative Analysis and Fusion of MRI and PET Images based on Wavelets for Clinical Diagnosis

    Medical imaging modalities such as Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), Single Photon Emission Computed Tomography (SPECT), and Computed Tomography (CT) play a crucial role in clinical diagnosis and treatment planning. The images obtained from these modalities contain complementary information about the imaged organ, and image fusion algorithms bring this disparate information together into a single image, helping doctors diagnose disorders quickly. This paper proposes a novel technique for the fusion of MRI and PET images based on the YUV color space and the wavelet transform. Quality assessment based on entropy showed that the method achieves promising results for medical image fusion. The paper presents a comparative analysis of the fusion of MRI and PET images using different wavelet families at various decomposition levels for the detection of brain tumors as well as Alzheimer's disease. The quality assessment and visual analysis showed that the discrete Meyer (dmey) wavelet at decomposition level 3 is optimal for the fusion of MRI and PET images. The paper also compared several fusion rules (average, maximum, and minimum), finding that the maximum fusion rule outperformed the other two.
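
    A minimal PyWavelets sketch of this kind of fusion, assuming a grayscale float MRI and an RGB PET image in [0, 1]: the PET image is moved to YUV, its luminance channel is fused with the MRI using the dmey wavelet at level 3, and the result is converted back to RGB. Averaging the approximation band while taking the maximum over detail bands is an illustrative split, not necessarily the paper's exact rule.

    import numpy as np
    import pywt

    RGB2YUV = np.array([[0.299, 0.587, 0.114],
                        [-0.147, -0.289, 0.436],
                        [0.615, -0.515, -0.100]])
    YUV2RGB = np.array([[1.0, 0.0, 1.140],
                        [1.0, -0.395, -0.581],
                        [1.0, 2.032, 0.0]])

    def fuse_mri_pet(mri, pet_rgb, wavelet='dmey', level=3):
        """Fuse a grayscale MRI with the luminance of a pseudocoloured PET image."""
        yuv = pet_rgb @ RGB2YUV.T
        c_mri = pywt.wavedec2(mri, wavelet, level=level)
        c_pet = pywt.wavedec2(yuv[..., 0], wavelet, level=level)
        fused = [0.5 * (c_mri[0] + c_pet[0])]        # average rule on the approximation
        for bands_m, bands_p in zip(c_mri[1:], c_pet[1:]):
            fused.append(tuple(np.where(np.abs(m) >= np.abs(p), m, p)  # max rule
                               for m, p in zip(bands_m, bands_p)))
        y = pywt.waverec2(fused, wavelet)
        yuv[..., 0] = y[:mri.shape[0], :mri.shape[1]]  # crop reconstruction padding
        return np.clip(yuv @ YUV2RGB.T, 0.0, 1.0)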

    Cross View Action Recognition

    Cross View Action Recognition (CVAR) appraises a system's ability to recognise actions from viewpoints that are unfamiliar to it. State-of-the-art methods that train on large amounts of data rely on variation in the training data itself to cope with viewpoint changes. These methods therefore require not only a large-scale dataset of appropriate classes each time they are trained, but also a correspondingly large amount of computational power, leading to high costs in time, effort, funds, and electrical energy. In this thesis, we propose a methodological pipeline that tackles viewpoint change while training on small datasets and consuming sustainable amounts of resources. Our method feeds optical flow to a stream of a pre-trained model used as-is to obtain a feature; this feature is then used to train a custom-designed classifier that promotes view-invariant properties. Our method uses only video as input, in contrast to methods that approach CVAR with depth or pose input at the expense of increased sensor costs. We present a number of comparative analyses that guided the design of the pipeline, further assessing the contribution of each component. The technique can also be adapted to existing, trained classifiers with minimal fine-tuning, as this work demonstrates by comparing shallow classifiers, deep pre-trained classifiers, and our proposed classifier trained from scratch. Additionally, we present a set of qualitative results that deepen our understanding of the relationship between viewpoints in the feature space.
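
    The abstract does not name the pre-trained stream or the classifier architecture, so the following PyTorch sketch is purely hypothetical: a frozen ImageNet-pre-trained ResNet-18 stands in for the as-is stream over optical flow rendered as 3-channel images, with a small classifier trained from scratch on top.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    class FlowFeatureClassifier(nn.Module):
        """Frozen pre-trained stream over optical-flow input + small trainable classifier."""
        def __init__(self, num_classes):
            super().__init__()
            backbone = resnet18(weights='IMAGENET1K_V1')
            backbone.fc = nn.Identity()          # expose the 512-d penultimate feature
            for p in backbone.parameters():
                p.requires_grad = False          # the stream is used as-is, not fine-tuned
            self.backbone = backbone
            self.classifier = nn.Sequential(     # custom classifier trained from scratch
                nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, num_classes))

        def forward(self, flow_rgb):             # flow assumed rendered as a 3-channel image
            with torch.no_grad():
                feature = self.backbone(flow_rgb)
            return self.classifier(feature)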

    Large-Scale Light Field Capture and Reconstruction

    This thesis discusses approaches and techniques to convert Sparsely-Sampled Light Fields (SSLFs) into Densely-Sampled Light Fields (DSLFs), which can be used for visualization on 3DTV and Virtual Reality (VR) devices. As an exemplar, a movable 1D large-scale light field acquisition system for capturing SSLFs in real-world environments is evaluated. The system consists of 24 sparsely placed RGB cameras and two Kinect V2 sensors, and the real-world SSLF data captured with it can be leveraged to reconstruct real-world DSLFs. To this end, three challenging problems must be solved: (i) how to estimate the rigid transformation from the coordinate system of a Kinect V2 to the coordinate system of an RGB camera; (ii) how to register the two Kinect V2 sensors across a large displacement; (iii) how to reconstruct a DSLF from an SSLF with moderate and large disparity ranges. To overcome these three challenges, we propose: (i) a novel self-calibration method, which exploits geometric constraints from the scene and the cameras, for estimating the rigid transformations from the camera coordinate frame of one Kinect V2 to the camera coordinate frames of the 12 nearest RGB cameras; (ii) a novel coarse-to-fine approach for recovering the rigid transformation from the coordinate system of one Kinect to that of the other by means of local color and geometry information; (iii) several novel algorithms, in two groups, for reconstructing a DSLF from an input SSLF: novel view synthesis methods inspired by state-of-the-art video frame interpolation algorithms, and Epipolar-Plane Image (EPI) inpainting methods inspired by Shearlet Transform (ST)-based DSLF reconstruction approaches.
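
    Challenges (i) and (ii) both hinge on estimating a rigid transformation between two coordinate systems. The sketch below is not the thesis's self-calibration or coarse-to-fine method; it only shows the classical closed-form Kabsch/SVD solution for corresponding 3-D point sets that such registration methods typically build on.

    import numpy as np

    def rigid_transform(P, Q):
        """Least-squares rigid transform (R, t) with Q ≈ R @ P + t; P and Q are 3xN."""
        cP, cQ = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
        H = (P - cP) @ (Q - cQ).T               # cross-covariance of centred points
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection solution
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = cQ - R @ cP
        return R, t

    Given point correspondences, R and t map points expressed in one sensor's frame (for example a Kinect V2) into the other camera's frame.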

    Advanced Feature Learning and Representation in Image Processing for Anomaly Detection

    Techniques for improving the information quality present in imagery for feature extraction are proposed in this thesis. Specifically, two methods are presented: soft feature extraction and improved Evolution-COnstructed (iECO) features. Soft features extract image-space knowledge by performing a per-pixel weighting based on an importance map; through them, one can extract features relevant to identifying a given object rather than its background. The iECO features framework uses evolutionary computation algorithms to learn an optimal series of image transforms, specific to a given feature descriptor, to best extract discriminative information. That is, a composition of image transforms is learned from training data to give a feature descriptor the best opportunity to extract its information for the application at hand. The proposed techniques are applied to an automatic explosive hazard detection application and achieve significant results.
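
    A minimal sketch of the soft-feature idea, assuming a grayscale float image and a same-sized importance map; HOG from scikit-image is only a stand-in descriptor, since the thesis defines soft features relative to whichever descriptor is in use, and the cell/block sizes are illustrative.

    import numpy as np
    from skimage.feature import hog

    def soft_feature(image, importance, orientations=9):
        """Weight pixels by a normalised importance map before computing a descriptor."""
        w = importance / (importance.max() + 1e-12)   # importance map scaled to [0, 1]
        weighted = image * w                          # per-pixel soft weighting
        return hog(weighted, orientations=orientations,
                   pixels_per_cell=(8, 8), cells_per_block=(2, 2))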