
    Combined object recognition approaches for mobile robotics

    There are numerous solutions to simple object recognition problems when the machine operates under strict environmental conditions (such as controlled lighting). Object recognition in real-world environments, however, poses greater difficulty. Ideally, mobile robots will function in real-world environments without the aid of fiducial markers, so more robust methods are needed to perform object recognition reliably. A combined approach of multiple techniques improves recognition results. Active vision and peripheral-foveal vision, systems designed to improve the information gathered for the purposes of object recognition, are examined. In addition, five object recognition methods that either make use of some form of active vision or could leverage active vision and/or peripheral-foveal vision systems are also investigated: affine-invariant image patches, perceptual organization, 3D morphable models (3DMMs), active viewpoint, and adaptive color segmentation. The current state of the art in these areas of vision research and observations on areas of future research are presented. Examples of state-of-the-art methods employed in other vision applications that have not yet been used for object recognition are also mentioned. Lastly, the future direction of the research field is hypothesized.

    Real-Time Anisotropic Diffusion using Space-Variant Vision

    Many computer and robot vision applications require multi-scale image analysis. Classically, this has been accomplished through the use of a linear scale-space, which is constructed by convolution of visual input with Gaussian kernels of varying size (scale). This has been shown to be equivalent to the solution of a linear diffusion equation on an infinite domain, as the Gaussian is the Green's function of such a system (Koenderink, 1984). Recently, much work has focused on the use of a variable conductance function resulting in anisotropic diffusion described by a nonlinear partial differential equation (PDE). The use of anisotropic diffusion with a conductance coefficient that is a decreasing function of the gradient magnitude has been shown to enhance edges while decreasing some types of noise (Perona and Malik, 1987). Unfortunately, the solution of the anisotropic diffusion equation requires the numerical integration of a nonlinear PDE, which is a costly process when carried out on a fixed mesh such as a typical image. In this paper we show that the complex log transformation, variants of which are universally used in mammalian retino-cortical systems, allows the nonlinear diffusion equation to be integrated at exponentially enhanced rates due to the non-uniform mesh spacing inherent in the log domain. The enhanced integration rates, coupled with the intrinsic compression of the complex log transformation, yield a speed increase of between two and three orders of magnitude, providing a means of performing real-time image enhancement using anisotropic diffusion. Office of Naval Research (N00014-95-I-0409)
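    The Perona-Malik diffusion this abstract builds on can be sketched on a uniform Cartesian mesh as follows; the time step, iteration count, and conductance constant are illustrative choices, and this is the costly fixed-mesh baseline, not the paper's accelerated log-domain integration:

```python
import numpy as np

def perona_malik(img, n_iter=20, kappa=20.0, dt=0.2):
    """Perona-Malik anisotropic diffusion on a uniform grid.

    The conductance g(s) = exp(-(s/kappa)^2) decreases with gradient
    magnitude, so strong edges diffuse slowly (and are preserved)
    while low-contrast noise is smoothed away.
    """
    def g(d):
        return np.exp(-(d / kappa) ** 2)

    u = img.astype(float).copy()
    for _ in range(n_iter):
        # Differences to the four nearest neighbours (wrap-around
        # boundaries via np.roll keep the sketch short).
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        # Explicit Euler step of the nonlinear diffusion PDE;
        # dt <= 0.25 keeps the 4-neighbour scheme stable.
        u = u + dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u
```

    On a typical image every pixel is visited every iteration, which is exactly the per-step cost the log-polar mesh reduces.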

    A biologically inspired computational vision front-end based on a self-organised pseudo-randomly tessellated artificial retina

    This paper considers the construction of a biologically inspired front-end for computer vision based on an artificial retina pyramid with a self-organised, pseudo-random receptive field tessellation. The organisation of photoreceptors and receptive fields in biological retinae locally resembles a hexagonal mosaic, whereas globally these are organised with a very densely tessellated central foveal region which seamlessly merges into an increasingly sparsely tessellated periphery. In contrast, conventional computer vision approaches use a rectilinear sampling tessellation which samples the whole field of view with uniform density. Scale-space interest points which are suitable for higher-level attention and reasoning tasks are efficiently extracted by our vision front-end by performing hierarchical feature extraction on the pseudo-randomly spaced visual information. All operations were conducted on a geometrically irregular foveated representation (data structure for visual information) which is radically different from the uniform rectilinear arrays used in conventional computer vision.
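    A minimal sketch of the dense-fovea/sparse-periphery idea: pseudo-random node placement whose density falls off with eccentricity. This assumes a simple radius-squashing distribution for illustration, not the paper's self-organising scheme:

```python
import numpy as np

def foveal_tessellation(n_points=1000, radius=1.0, seed=0):
    """Pseudo-random retina-like node placement on a disc.

    Squashing a uniform radius variable (r = R * u**2) concentrates
    nodes near the centre (the fovea) and thins them out towards the
    periphery; a uniform-density disc would use sqrt(u) instead.
    """
    rng = np.random.default_rng(seed)
    u = rng.random(n_points)
    theta = rng.random(n_points) * 2.0 * np.pi
    r = radius * u ** 2
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)
```

    With this distribution half of all nodes land within a quarter of the retina's radius, mirroring the foveal oversampling the abstract describes.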

    Real-Time Restoration of Images Degraded by Uniform Motion Blur in Foveal Active Vision Systems

    Foveated, log-polar, or space-variant image architectures provide a high-resolution and wide-field workspace, while imposing a small pixel computation load. These characteristics are ideal for mobile robotic and active vision applications. Recently we have described a generalization of the Fourier transform (the fast exponential chirp transform) which allows frame-rate computation of full-field 2D frequency transforms on a log-polar image format. In the present work, we use Wiener filtering, performed using the exponential chirp transform, on log-polar (foveated) image formats to de-blur images which have been degraded by uniform camera motion. Defense Advanced Research Projects Agency and Office of Naval Research (N00014-96-C-0178); Office of Naval Research Multidisciplinary University Research Initiative (N00014-95-1-0409)
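    The Wiener deconvolution step can be sketched in the frequency domain; here a standard Cartesian FFT stands in for the paper's exponential chirp transform, and the noise-to-signal ratio `nsr` is an assumed regularisation constant:

```python
import numpy as np

def motion_kernel(length, shape):
    """Horizontal uniform motion-blur kernel of the given length,
    zero-padded to the image shape for FFT-domain use."""
    k = np.zeros(shape)
    k[0, :length] = 1.0 / length
    return k

def wiener_deblur(blurred, kernel, nsr=0.01):
    """Frequency-domain Wiener deconvolution.

    W = conj(H) / (|H|^2 + nsr) inverts the blur transfer function H
    where its magnitude is large, and rolls off where |H| is small so
    noise is not amplified without bound.
    """
    H = np.fft.fft2(kernel, s=blurred.shape)
    G = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(W * G))
```

    Performing the same filtering through the exponential chirp transform keeps the whole pipeline on the compressed log-polar format, which is where the frame-rate claim comes from.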

    An Adaptable Foveating Vision Chip


    Object Detection Through Exploration With A Foveated Visual Field

    We present a foveated object detector (FOD) as a biologically inspired alternative to the sliding window (SW) approach, which is the dominant method of search in computer vision object detection. Similar to the human visual system, the FOD has higher resolution at the fovea and lower resolution at the visual periphery. Consequently, more computational resources are allocated at the fovea and relatively fewer at the periphery. The FOD processes the entire scene, uses retina-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image, and integrates observations across multiple fixations. Our approach combines modern object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We assessed various eye movement strategies on the PASCAL VOC 2007 dataset and show that the FOD performs on par with the SW detector while bringing significant computational cost savings.
    Comment: An extended version of this manuscript was published in PLOS Computational Biology (October 2017) at https://doi.org/10.1371/journal.pcbi.100574
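    The resolution fall-off that drives the FOD's cost savings can be sketched as a per-pixel pooling-width map around the current fixation. The fovea radius and slope below are illustrative parameters, not values from the paper:

```python
import numpy as np

def pooling_map(shape, fixation, fovea_radius=8.0, slope=0.25):
    """Per-pixel pooling-region width for a given fixation point.

    Width is 1 (full resolution) inside the fovea and grows linearly
    with eccentricity outside it, a rough stand-in for V1-style
    peripheral pooling. Wider pooling regions mean fewer effective
    samples, and hence fewer classifier evaluations, in the periphery.
    """
    ys, xs = np.indices(shape)
    ecc = np.hypot(ys - fixation[0], xs - fixation[1])
    return np.where(ecc <= fovea_radius,
                    1.0,
                    1.0 + slope * (ecc - fovea_radius))
```

    Summing `1 / width**2` over the map gives a rough effective sample count per fixation, which is how a few fixations can cover a scene far more cheaply than a dense sliding window.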