
    Real-Time Restoration of Images Degraded by Uniform Motion Blur in Foveal Active Vision Systems

    Foveated, log-polar, or space-variant image architectures provide a high-resolution, wide-field workspace at a small pixel computation load. These characteristics are ideal for mobile robotic and active vision applications. Recently we have described a generalization of the Fourier Transform (the fast exponential chirp transform) which allows frame-rate computation of full-field 2D frequency transforms on a log-polar image format. In the present work, we use Wiener filtering, performed using the Exponential Chirp Transform, on log-polar (foveated) image formats to de-blur images which have been degraded by uniform camera motion. Funding: Defense Advanced Research Projects Agency and Office of Naval Research (N00014-96-C-0178); Office of Naval Research Multidisciplinary University Research Initiative (N00014-95-1-0409).
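
    As a rough illustration of the restoration step, the sketch below applies a classical Wiener filter to uniform horizontal motion blur, using a plain 2D FFT on a Cartesian test image in place of the paper's exponential chirp transform and log-polar format; the PSF length and the noise-to-signal constant k are placeholder values, not the paper's.

    import numpy as np

    def motion_blur_psf(shape, length=15):
        """Point-spread function of uniform horizontal camera motion."""
        psf = np.zeros(shape)
        psf[0, :length] = 1.0 / length   # a 1-pixel-high horizontal streak
        return psf

    def wiener_deblur(blurred, psf, k=0.01):
        """Frequency-domain Wiener filter; k stands in for the noise-to-signal ratio."""
        H = np.fft.fft2(psf)
        G = np.fft.fft2(blurred)
        W = np.conj(H) / (np.abs(H) ** 2 + k)     # Wiener transfer function
        return np.real(np.fft.ifft2(W * G))

    # Blur a test image by circular convolution with the PSF, add noise, restore.
    rng = np.random.default_rng(0)
    image = rng.random((256, 256))
    psf = motion_blur_psf(image.shape)
    blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(psf)))
    restored = wiener_deblur(blurred + 0.01 * rng.standard_normal(image.shape), psf)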

    Automated choroidal segmentation of 1060 nm OCT in healthy and pathologic eyes using a statistical model

    A two-stage statistical model based on texture and shape for fully automatic choroidal segmentation of normal and pathologic eyes imaged by a 1060 nm optical coherence tomography (OCT) system is developed. A novel dynamic programming approach is implemented to determine the location of the retinal pigment epithelium/Bruch’s membrane/choriocapillaris (RBC) boundary. The choroid–sclera interface (CSI) is segmented using a statistical model. The algorithm is robust even in the presence of speckle noise, low signal (thick choroid), retinal pigment epithelium (RPE) detachments and atrophy, drusen, shadowing and other artifacts. Evaluation against a set of 871 manually segmented cross-sectional scans from 12 eyes achieves an average error rate of 13%, computed per tomogram as the ratio of incorrectly classified pixels to the total layer surface. For the first time, a fully automatic choroidal segmentation algorithm is successfully applied to a wide range of clinical volumetric OCT data.
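
    The dynamic programming step can be sketched as a minimum-cost path traced column by column through a cost image, one boundary row per A-scan; the cost used below (a negated vertical gradient on random data) and the max_jump continuity constraint are illustrative stand-ins for the paper's texture and shape terms.

    import numpy as np

    def dp_boundary(cost, max_jump=2):
        """Return one row index per column tracing the cheapest connected path."""
        rows, cols = cost.shape
        acc = cost.copy()                          # accumulated path costs
        back = np.zeros((rows, cols), dtype=int)   # backpointers for traceback
        for c in range(1, cols):
            for r in range(rows):
                lo, hi = max(0, r - max_jump), min(rows, r + max_jump + 1)
                prev = int(np.argmin(acc[lo:hi, c - 1])) + lo
                acc[r, c] += acc[prev, c - 1]
                back[r, c] = prev
        path = np.empty(cols, dtype=int)
        path[-1] = int(np.argmin(acc[:, -1]))
        for c in range(cols - 1, 0, -1):           # trace the path back
            path[c - 1] = back[path[c], c]
        return path

    # A bright-to-dark boundary appears as a low-cost valley in the negated gradient.
    scan = np.random.default_rng(1).random((64, 128))
    boundary = dp_boundary(-np.gradient(scan, axis=0))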

    Full Reference Objective Quality Assessment for Reconstructed Background Images

    With an increased interest in applications that require a clean background image, such as video surveillance, object tracking, street view imaging and location-based services on web-based maps, multiple algorithms have been developed to reconstruct a background image from cluttered scenes. Traditionally, statistical measures and existing image quality techniques have been applied to evaluate the quality of reconstructed background images. Though these quality assessment methods have been widely used in the past, their performance in evaluating the perceived quality of the reconstructed background image has not been verified. In this work, we discuss the shortcomings of existing metrics and propose a full-reference Reconstructed Background image Quality Index (RBQI) that combines color and structural information at multiple scales using a probability summation model to predict the perceived quality of the reconstructed background image given a reference image. To compare the performance of the proposed quality index with existing image quality assessment measures, we construct two different datasets consisting of reconstructed background images and corresponding subjective scores. The quality assessment measures are evaluated by correlating their objective scores with human subjective ratings. The correlation results show that the proposed RBQI outperforms all the existing approaches. Additionally, the constructed datasets and the corresponding subjective scores provide a benchmark for evaluating future metrics developed to assess the perceived quality of reconstructed background images. Comment: Associated source code: https://github.com/ashrotre/RBQI; associated database: https://drive.google.com/drive/folders/1bg8YRPIBcxpKIF9BIPisULPBPcA5x-Bk?usp=sharing (email for permissions: ashrotre@asu.edu).
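
    The probability summation model can be sketched as follows: per-pixel distortions at several scales pass through a psychometric function and are pooled so that a visible artifact anywhere raises the overall detection probability; the threshold, exponent and plain absolute-difference distortion maps below are generic placeholders, not RBQI's fitted parameters or its color and structure features.

    import numpy as np

    def pooled_detection_probability(distortion_maps, threshold=0.1, beta=3.5):
        """Probability summation over pixels and scales: with per-pixel
        p_i = 1 - exp(-|d_i/t|^beta), the pooled probability is
        P = 1 - prod(1 - p_i) = 1 - exp(-sum |d_i/t|^beta)."""
        s = sum(np.sum(np.abs(d / threshold) ** beta) for d in distortion_maps)
        return 1.0 - float(np.exp(-s))

    rng = np.random.default_rng(5)
    reference = rng.random((64, 64))
    reconstructed = np.clip(reference + 0.05 * rng.standard_normal((64, 64)), 0, 1)
    maps = [np.abs(reference - reconstructed),                       # fine scale
            np.abs(reference[::2, ::2] - reconstructed[::2, ::2])]   # coarse scale
    p_detect = pooled_detection_probability(maps)   # high p -> low perceived quality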

    Content-prioritised video coding for British Sign Language communication.

    Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality. BSL is a highly structured, linguistically complete natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications, which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities. This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264). Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. The research community benefits from a new approach to video coding optimisation and better understanding of the communication needs of deaf people.
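
    The foveated pre-processing stage can be approximated by blending progressively blurred copies of a frame according to eccentricity from a fixation point, as sketched below for a single-channel frame; the fixed fixation and the linear eccentricity-to-blur mapping are assumptions, not the thesis' eye-movement-derived HVS model.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def foveate(frame, fixation, sigma_max=6.0, levels=5):
        """Blend blurred copies of `frame` by normalised eccentricity."""
        h, w = frame.shape
        yy, xx = np.mgrid[0:h, 0:w]
        ecc = np.hypot(yy - fixation[0], xx - fixation[1])
        ecc /= ecc.max()                                  # eccentricity in [0, 1]
        sigmas = np.linspace(0.0, sigma_max, levels)
        stack = np.stack([frame if s == 0 else gaussian_filter(frame, s)
                          for s in sigmas])               # (levels, h, w)
        # Per-pixel linear blend between the two nearest blur levels.
        level = ecc * (levels - 1)
        lo = np.floor(level).astype(int)
        hi = np.minimum(lo + 1, levels - 1)
        frac = level - lo
        low = np.take_along_axis(stack, lo[None], 0)[0]
        high = np.take_along_axis(stack, hi[None], 0)[0]
        return (1 - frac) * low + frac * high

    frame = np.random.default_rng(2).random((144, 176))   # QCIF-sized test frame
    foveated = foveate(frame, fixation=(72, 88))          # fixate the frame centre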

    Retinal vessel segmentation: An efficient graph cut approach with Retinex and local phase

    Our application concerns the automated detection of vessels in retinal images, which supports understanding of disease mechanisms and the diagnosis and treatment of retinal and a number of systemic diseases. We propose a new framework for segmenting retinal vasculatures with much improved accuracy and efficiency. The proposed framework consists of three technical components: Retinex-based image inhomogeneity correction, local phase-based vessel enhancement and graph cut-based active contour segmentation. These procedures are applied in the following order. Underpinned by the Retinex theory, the inhomogeneity correction step addresses challenges presented by image intensity inhomogeneities and the relatively low contrast of thin vessels against the background. The local phase enhancement technique is employed to enhance vessels for its superiority in preserving vessel edges. The graph cut-based active contour method is used for its efficiency and effectiveness in segmenting the vessels from the images enhanced by the local phase filter. We have demonstrated its performance by applying it to four public retinal image datasets (three of color fundus photography and one of fluorescein angiography). Statistical analysis demonstrates that each component of the framework provides the expected level of performance. The proposed framework is compared with widely used unsupervised and supervised methods, showing that the overall framework outperforms its competitors. For example, the achieved sensitivity (0.744), specificity (0.978) and accuracy (0.953) for the DRIVE dataset are very close to those of the manual annotations obtained by the second observer.
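
    The first component of the framework can be illustrated with single-scale Retinex, one common instantiation of Retinex-based inhomogeneity correction: the slowly varying illumination is estimated with a wide Gaussian and subtracted in the log domain. The scale sigma below is a placeholder, and the local-phase and graph-cut stages are not reproduced.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def single_scale_retinex(image, sigma=40.0, eps=1e-6):
        """R(x, y) = log I(x, y) - log (G_sigma * I)(x, y), rescaled to [0, 1]."""
        img = image.astype(float) + eps
        illumination = gaussian_filter(img, sigma)   # smooth illumination estimate
        r = np.log(img) - np.log(illumination + eps)
        r -= r.min()
        return r / (r.max() + eps)

    # Retinal vessel pipelines typically operate on the green fundus channel.
    green = np.random.default_rng(3).random((256, 256))
    corrected = single_scale_retinex(green)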

    Space-variant picture coding

    Space-variant picture coding techniques exploit the strong spatial non-uniformity of the human visual system in order to increase coding efficiency in terms of perceived quality per bit. This thesis extends space-variant coding research in two directions. The first of these directions is foveated coding. Past foveated coding research has been dominated by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing an additive multi-viewer sensitivity function based on an established eye resolution model, and, from this, a blur map that is optimal in the sense of discarding frequencies in least-noticeable-first order. Furthermore, for the application of a blur map, a novel algorithm is presented for the efficient computation of high-accuracy smoothly space-variant Gaussian blurring, using a specialised filter bank which approximates perfect space-variant Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to the brute-force approach of employing a separate low-pass filter at each image location. The second direction is that of artificially increasing the depth-of-field of an image, an idea borrowed from photography with the advantage of allowing an image to be reduced in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth-of-field algorithms are presented herein, with the desirable properties of aiming to mimic occlusion effects as they occur in natural blurring, and of handling any number of blurring and occlusion levels with the same level of computational complexity. The merits of this coding approach have been investigated by subjective experiments comparing it with single-viewer foveated image coding. The results found the depth-based preblurring to be generally significantly preferable to the same level of foveation blurring.
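
    The additive multi-viewer idea can be sketched as follows: each viewer contributes a sensitivity that falls with eccentricity from their fixation, the contributions are summed, and blur is permitted where the pooled sensitivity is low; the fall-off function below is a crude placeholder for the established eye resolution model the thesis actually uses.

    import numpy as np

    def multi_viewer_blur_map(shape, fixations, half_res_ecc=0.06):
        """Per-pixel blur strength in [0, 1] given several viewers' fixations."""
        h, w = shape
        yy, xx = np.mgrid[0:h, 0:w]
        sensitivity = np.zeros(shape)
        for fy, fx in fixations:
            ecc = np.hypot(yy - fy, xx - fx) / max(h, w)        # crude eccentricity proxy
            sensitivity += half_res_ecc / (half_res_ecc + ecc)  # falls off with ecc
        sensitivity /= sensitivity.max()
        return 1.0 - sensitivity      # blur hardest where no viewer resolves detail

    # Two hypothetical viewers fixating different regions of a CIF-sized image.
    blur_map = multi_viewer_blur_map((288, 352), fixations=[(100, 120), (180, 260)])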

    Models of learning in the visual system: dependence on retinal eccentricity

    In the primary visual cortex of primates, relatively more space is devoted to the representation of the central visual field than to the representation of the peripheral visual field. Experimentally testable theories about the factors and mechanisms which may have determined this inhomogeneous mapping may provide valuable insights into general processing principles in the visual system. Therefore, I investigated which visual situations this inhomogeneous representation of the visual field is well adapted to, and which mechanisms could support its refinement and stabilization during individual development. Furthermore, I studied possible functional consequences of the inhomogeneous representation for visual processing at central and peripheral locations of the visual field. Vision plays an important role during navigation; thus, visual processing should be well adapted to self-motion. I therefore assumed that spatially inhomogeneous retinal velocity distributions, caused by static objects during self-motion along the direction of gaze, are transformed on average into spatially homogeneous cortical velocity distributions. This would have the advantage that the cortical mechanisms concerned with the processing of self-motion could be identical in their spatial and temporal properties across the representation of the whole visual field. This is the case if the arrangement of objects relative to the observer corresponds to an ellipsoid with the observer at its center. I used the resulting flow field to train a network model of pulse-coding neurons with a Hebbian learning rule. The distribution of the learned receptive fields is in agreement with the inhomogeneous cortical representation of the visual field. These results suggest that self-motion may have played an important role in the evolution of the visual system and that the inhomogeneous cortical representation of the visual field can be refined and stabilized by Hebbian learning mechanisms during ontogenesis under natural viewing conditions. In addition to the processing of self-motion, an important task of the visual system is the grouping and segregation of local features within a visual scene into coherent objects. I therefore asked how the corresponding mechanisms depend on the represented position of the visual field. It is assumed that neuronal connections within the primary visual cortex subserve this grouping process. These connections develop after eye-opening in a manner dependent on the visual input. How does the lateral connectivity depend on the represented position of the visual field? With increasing eccentricity, primary cortical receptive fields become larger and the cortical magnification of the visual field declines. I therefore investigated the spatial statistics of real-world scenes with respect to the spatial filter properties of cortical neurons at different locations of the visual field. I show that correlations between collinearly arranged filters of the same size and orientation increase with increasing filter size. However, when distance is measured relative to filter size, collinear correlations decline more steeply with increasing distance for larger filters. This provides evidence against a homogeneous cortical connectivity across the whole visual field with respect to the coding of spatial object properties. Two major retino-cortical pathways are the magnocellular (M) and the parvocellular (P) pathways.
    While neurons along the M-pathway display temporal bandpass characteristics, neurons along the P-pathway show temporal lowpass characteristics. The ratio of P- to M-cells is not constant across the whole visual field, but declines with increasing retinal eccentricity. I therefore investigated how the different temporal response properties of neurons in the M- and P-pathways influence self-organization in the visual cortex, and discussed possible consequences for the coding of visual objects at different locations of the visual field. Specifically, I studied the influence of stimulus motion on the self-organization of lateral connections in a network model of spiking neurons with Hebbian learning. Low stimulus velocities lead to horizontal connections well adapted to the coding of the spatial structure within the visual input, while higher stimulus velocities lead to connections which subserve the coding of the stimulus movement direction. This suggests that the temporal lowpass properties of P-neurons subserve the coding of spatial stimulus attributes (form) in the visual cortex, while the temporal bandpass properties of M-neurons support the coding of spatio-temporal stimulus attributes (movement direction). Hence, the central representation of the visual field may be well adapted to the encoding of spatial object properties due to the strong contribution of P-neurons, while the peripheral representation may be better adapted to the processing of motion.
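
    The Hebbian mechanism invoked throughout can be sketched with a rate-based stand-in for the spiking model: Oja's normalised Hebb rule driven by inputs whose "central" components are mutually correlated, so the learned weights concentrate on the over-represented part of the input. The input statistics and dimensions are toy assumptions, not the flow fields used in the thesis.

    import numpy as np

    rng = np.random.default_rng(4)
    n_inputs, n_steps, lr = 16, 5000, 0.01
    w = rng.standard_normal(n_inputs) * 0.1          # feed-forward weights
    for _ in range(n_steps):
        x = rng.standard_normal(n_inputs)            # stand-in for retinal inputs
        x[:8] += x[:8].mean()                        # correlate the "central" half
        y = float(w @ x)                             # linear postsynaptic rate
        w += lr * y * (x - y * w)                    # Oja's rule: Hebb term + decay
    # w converges toward the input's leading principal component, i.e. toward
    # the correlated (over-represented) part of the simulated visual field.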