88 research outputs found

    Design and Analysis of A New Illumination Invariant Human Face Recognition System

    In this dissertation we propose the design and analysis of a new illumination-invariant face recognition system. We show that multiscale analysis of the facial structure and features of face images leads to superior recognition rates for images under varying illumination. We model an image I(x,y) as a black box consisting of a combination of illumination and reflectance. A new approximation is proposed to enhance the illumination-removal phase. Because illumination resides in the low-frequency part of images, a high-performance multiresolution transformation is employed to accurately separate the frequency contents of input images, followed by a fine-tuning process. After a mask is extracted, a feature vector is formed; principal component analysis (PCA) is used for dimensionality reduction, followed by an extreme learning machine (ELM) as the classifier. We then analyze the effect of the frequency selectivity of the transformation's subbands on the performance of the proposed face recognition system: we first propose a method to tune the characteristics of a multiresolution transformation, and then analyze how these specifications affect the recognition rate. In addition, we show that the proposed face recognition system can be further improved in terms of both computational time and accuracy. The motivation for this improvement is that although illumination mostly lies in the low-frequency part of images, these low-frequency components may have a low- or high-resonance nature. Therefore, for the first time, we introduce a resonance-based analysis of face images rather than the traditional frequency-domain approaches. We found that energy selectivity of the subbands of the resonance-based decomposition can lead to superior results with less computational complexity. The method is free of any prior information about the face shape; it is systematic and can be applied separately to each image.
Several experiments are performed on well-known databases such as Yale B, Extended Yale B, CMU-PIE, FERET, AT&T, and LFW. Illustrative examples are given, and the results confirm the effectiveness of the method compared with current results in the literature.
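As a rough illustration of the illumination/reflectance separation described above (not the dissertation's tuned multiresolution or resonance-based transform), a homomorphic low-pass split followed by PCA might look like the following sketch; the Gaussian filter, its bandwidth, and the component count are illustrative assumptions:

```python
import numpy as np

def remove_illumination(image, sigma=8.0):
    """Homomorphic illumination removal: model the image as
    I(x, y) = L(x, y) * R(x, y), take logs so the product becomes
    a sum, estimate the low-frequency illumination with a Gaussian
    low-pass in the frequency domain, and subtract it to keep the
    reflectance. A hedged stand-in for the thesis's transform."""
    log_img = np.log1p(image.astype(np.float64))
    h, w = log_img.shape
    # Centred Gaussian low-pass filter (assumed shape/bandwidth).
    yy, xx = np.mgrid[0:h, 0:w]
    dist2 = (yy - h / 2) ** 2 + (xx - w / 2) ** 2
    lowpass = np.exp(-dist2 / (2 * sigma ** 2))
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))
    illumination = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * lowpass)))
    return log_img - illumination  # log-reflectance feature map

def pca_reduce(features, n_components=50):
    """Project feature vectors onto their top principal components."""
    centred = features - features.mean(axis=0)
    # SVD of the centred data yields the principal axes.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T
```

The reduced vectors would then be passed to a classifier such as the ELM mentioned in the abstract.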

    Enhanced processing methods for light field imaging

    The light field camera provides rich textural and geometric information, but it remains challenging to use it efficiently and accurately for computer vision problems. Light field image processing is divided into multiple levels. First, low-level processing mainly includes the acquisition of light field images and their preprocessing. Second, mid-level processing consists of depth estimation, light field encoding, and the extraction of cues from the light field. Third, high-level processing involves 3D reconstruction, target recognition, visual odometry, image reconstruction, and other advanced applications. We propose a series of improved algorithms for each of these levels. The light field signal contains rich angular information; by contrast, traditional computer vision methods, as used for 2D images, often cannot make full use of the high-frequency part of that angular information. We propose a fast pre-estimation algorithm that enhances light field features, improving speed and accuracy while retaining full use of the angular information. Light field filtering and refocusing are essential operations in light field signal processing. Modern frequency-domain filtering and wavelet techniques have effectively improved light field filtering accuracy but may fail at object edges. We adapted sub-window filtering to the light field to improve the reconstruction of object edges. Light field images can be used to analyze the effects of scattering and refraction phenomena, but there are still insufficient metrics to evaluate the results. Therefore, we propose a physical-rendering-based light field dataset that simulates light field images distorted by a transparent medium, such as atmospheric turbulence or a water surface. The neural network is an essential method for processing complex light field data, and we propose an efficient 3D convolutional autoencoder network for the light field structure.
This network overcomes the severe distortion caused by high-intensity turbulence at limited angular resolution and solves the difficulty of pixel matching between distorted images. This work emphasises the application and usefulness of light field imaging in computer vision whilst improving light field image processing speed and accuracy through signal processing, computer graphics, computer vision, and artificial neural networks.
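The refocusing cue mentioned among the mid-level operations can be sketched as a classic shift-and-sum over the angular dimensions. The 4D array layout L[u, v, y, x], the parameter name alpha, and the integer-shift simplification are assumptions for illustration, not the thesis's algorithm:

```python
import numpy as np

def refocus(light_field, alpha):
    """Shift-and-sum refocusing of a 4D light field L[u, v, y, x]:
    each angular view (u, v) is translated in proportion to its
    offset from the central view, then all views are averaged.
    `alpha` selects the synthetic focal plane; integer pixel shifts
    are used here for simplicity (real pipelines interpolate)."""
    n_u, n_v, height, width = light_field.shape
    cu, cv = (n_u - 1) / 2, (n_v - 1) / 2
    out = np.zeros((height, width), dtype=np.float64)
    for u in range(n_u):
        for v in range(n_v):
            dy = int(round(alpha * (u - cu)))
            dx = int(round(alpha * (v - cv)))
            # np.roll applies the per-view disparity shift.
            out += np.roll(light_field[u, v], (dy, dx), axis=(0, 1))
    return out / (n_u * n_v)
```

Points on the chosen focal plane add coherently across views and stay sharp, while off-plane points are blurred by the misaligned sum.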

    The multifocal visual evoked cortical potential in visual field mapping: a methodological study.

    The application of multifocal techniques to the visual evoked cortical potential permits objective electrophysiological mapping of the visual field. The multifocal visual evoked cortical potential (mfVECP) presents several technical challenges. Signals are small and influenced by a number of noise sources, and waveforms vary both across the visual field and between subjects owing to the complex geometry of the visual cortex. Together these factors hamper the ability to distinguish between an mfVECP response from the healthy visual pathway and a response that is reduced or absent and therefore representative of pathology. This thesis presents a series of methodological investigations aimed at maximising the information available in the recorded electrophysiological response, thereby improving the performance of the mfVECP. A novel method of calculating the signal-to-noise ratio (SNR) of mfVECP waveform responses is introduced. A noise estimate unrelated to the cortical response to the visual stimulus is created by cross-correlating the physiological record with members of the orthogonal set of m-sequences that are not used to control any stimulus region. This metric is compared with the approach of defining noise within a delayed time window and shows good correlation. ROC analysis indicates a small improvement in the ability to distinguish between physiological waveform responses and noise. Defining the signal window as 45-250 ms is recommended. Signal quality is improved by post-acquisition bandwidth filtering: a wide range of bandwidths is compared, and the greatest gains are seen with a 3-20 Hz bandpass applied after cross-correlation. Responses evoked when stimulation is delivered using a cathode ray tube (CRT) and a liquid crystal display (LCD) projector system are compared; the mode of stimulus delivery affects the waveshape of responses.
A significantly higher SNR is seen in waveforms evoked by an m=16 bit m-sequence delivered by a CRT monitor; differences for shorter m-sequences were not statistically significant. The area of the visual field which can usefully be tested is investigated by increasing the field of view of stimulation from 20° to 40° of radius in 10° increments. A field of view of 30° of radius is shown to provide stimulation of as much of the visual field as possible without losing signal quality. Stimulation rates of 12.5 to 75 Hz are compared. Slowing the stimulation rate increases waveform amplitudes, latencies and SNR values. The best performance was achieved with 25 Hz stimulation: a six-minute recording stimulated at 25 Hz is shown to be superior to an eight-minute, 75 Hz acquisition. An electrophysiology system capable of providing multifocal stimulation, synchronising with the acquisition of data from a large number of electrodes, and performing cross-correlation has been created. This powerful system permits the interrogation of the dipoles evoked within the complex geometry of the visual cortex from a very large number of orientations, which improves detection ability. The system has been used to compare the performance of 16 monopolar recording channels in detecting responses to stimulation throughout the visual field, and a selection of four electrodes which maximise the available information throughout the visual field has been made. It is shown that several combinations of four electrodes provide good responses throughout the visual field, but that it is important to have them distributed across both hemispheres and above and below Oz. This series of investigations has identified methods of maximising the available information in mfVECP recordings and progresses the technique towards becoming a robust clinical tool.
A powerful multichannel multifocal electrophysiology system has been created, with the ability to simultaneously acquire data from a very large number of bipolar recording channels and thereby detect many small dipole responses to stimulation of many small areas of the visual field. This will be an invaluable tool in future investigations. Performance has been shown to improve when the presence or absence of a waveform is determined by the novel SNR metric, when data are filtered post-acquisition through a 3-20 Hz bandpass after cross-correlation, and when a CRT is used to deliver the stimulus. The field of view of stimulation can usefully be extended to a radius of 30° when a 60-region dartboard pattern is employed. Performance can be enhanced while acquisition time is reduced by 25%, by using a 25 Hz stimulation rate instead of the frequently employed 75 Hz.
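The noise-estimation idea above, correlating the record against orthogonal (time-shifted) m-sequences that drive no stimulus region, can be sketched as follows. The LFSR tap positions, the RMS-ratio SNR definition, and the slice-based analysis window are illustrative assumptions, not the thesis's exact implementation:

```python
import numpy as np

def m_sequence(taps, n_bits, seed=1):
    """Generate a +/-1 maximal-length sequence from a Fibonacci LFSR.
    `taps` are 0-indexed feedback bit positions; taps=(0, 2) with
    n_bits=5 uses a primitive feedback polynomial and yields a
    length-31 m-sequence."""
    length = 2 ** n_bits - 1
    state = seed
    out = np.empty(length)
    for i in range(length):
        out[i] = 1.0 if state & 1 else -1.0
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1  # XOR the tapped bits
        state = (state >> 1) | (fb << (n_bits - 1))
    return out

def sequence_snr(record, stim_seq, noise_seq, window):
    """Estimate response SNR: cross-correlate the physiological record
    with the stimulus m-sequence (signal estimate) and with an unused
    orthogonal m-sequence (noise estimate), then compare RMS amplitudes
    inside the analysis window (e.g. the 45-250 ms interval)."""
    signal = np.correlate(record, stim_seq, mode="full")[window]
    noise = np.correlate(record, noise_seq, mode="full")[window]
    return np.sqrt(np.mean(signal ** 2) / np.mean(noise ** 2))
```

Because cyclic shifts of one base m-sequence are mutually near-orthogonal, a shifted copy that controls no stimulus region extracts only physiological noise, giving the response-free noise floor the metric needs.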

    Signal processing algorithms for enhanced image fusion performance and assessment

    The dissertation presents several signal processing algorithms for image fusion in noisy multimodal conditions. It introduces a novel image fusion method which performs well for image sets heavily corrupted by noise. As opposed to current image fusion schemes, the method requires no a priori knowledge of the noise component. The image is decomposed with Chebyshev polynomials (CP) as basis functions to perform fusion at the feature level. The properties of CP, namely fast convergence and smooth approximation, render it ideal for heuristic and indiscriminate denoising fusion tasks. Quantitative evaluation using objective fusion assessment methods shows favourable performance of the proposed scheme compared to previous efforts on image fusion, notably for heavily corrupted images. The approach is further improved by combining the advantages of CP with a state-of-the-art fusion technique, independent component analysis (ICA), for joint fusion processing based on region saliency. Whilst CP fusion is robust under severe noise conditions, it is prone to eliminating high-frequency information from the images involved, thereby limiting image sharpness. Fusion using ICA, on the other hand, performs well in transferring edges and other salient features of the input images into the composite output. The combination of both methods, coupled with several mathematical morphological operations in an algorithm fusion framework, is considered a viable solution. Again, according to the quantitative metrics, the results of our proposed approach are very encouraging as far as joint fusion and denoising are concerned. Another focus of this dissertation is a novel metric for image fusion evaluation that is based on texture. The conservation of background textural details is important in many fusion applications, as they help define the image depth and structure, which may prove crucial in many surveillance and remote sensing applications.
Our work aims to evaluate the performance of image fusion algorithms based on their ability to retain textural details through the fusion process. This is done by utilising the gray-level co-occurrence matrix (GLCM) model to extract second-order statistical features for the derivation of an image textural measure, which is then used to replace the edge-based calculations in an objective fusion metric. Performance evaluation on established fusion methods verifies that the proposed metric is viable, especially for multimodal scenarios.
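The GLCM-based texture measure can be illustrated with a minimal sketch. The single pixel offset, the quantisation to 8 grey levels, and the particular second-order features shown (contrast, energy, homogeneity) are standard textbook choices, not necessarily the exact features used in the proposed metric:

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Grey-level co-occurrence matrix for one pixel offset (dx, dy).
    The image is quantised to `levels` grey levels; entry (i, j)
    counts how often level i co-occurs with level j at the offset,
    normalised to a joint probability distribution."""
    q = (image.astype(np.float64) / (image.max() + 1e-12) * (levels - 1)).astype(int)
    h, w = q.shape
    mat = np.zeros((levels, levels))
    for y in range(h - dy):
        for x in range(w - dx):
            mat[q[y, x], q[y + dy, x + dx]] += 1
    return mat / mat.sum()

def texture_features(p):
    """Second-order statistics from a normalised GLCM: contrast
    (local intensity variation), energy (textural uniformity) and
    homogeneity (closeness to the diagonal)."""
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    energy = np.sum(p ** 2)
    homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
    return contrast, energy, homogeneity
```

A fusion metric of the kind described would compare such features between each source image and the fused output, rewarding fusion algorithms that preserve the sources' textural statistics.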

    Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing
