Image Extraction by Wide Angle Foveated Lens for Overt-Attention
This paper defines Wide Angle Foveated (WAF) imaging. The proposed model combines a Cartesian coordinate system, a log-polar coordinate system, and a unique camera model composed of planar projection and spherical projection, so that a single imaging device can serve all-purpose use. For pattern recognition, the central field of view (FOV) is given translation-invariance, and the intermediate FOV rotation- and scale-invariance. Further, the peripheral FOV is well suited to controlling the camera's view direction, because its image height is linear in the incident angle to the camera model's optical center. This imaging model is therefore especially useful when the camera is moved dynamically, that is, for overt attention. Moreover, simulation results of image extraction show the advantages of the proposed model in terms of the magnification factor of its central FOV, the accuracy of its scale-invariance, and its flexibility in describing other WAF vision sensors.
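The kind of piecewise image-height curve described here can be sketched in a few lines. This is an illustrative sketch only: the boundary angles, the focal scale `F`, and the peripheral slope `K` are assumptions for demonstration, not values from the paper.

```python
import math

F = 1.0                      # scale of the central planar projection (assumed)
THETA1 = math.radians(10)    # hypothetical end of the central FOV
THETA2 = math.radians(30)    # hypothetical end of the intermediate FOV
K = 1.5                      # hypothetical slope of the peripheral f-theta part

def image_height(theta):
    """Piecewise image height r(theta) of a WAF-style lens model:
    a planar (translation-invariant) centre, a logarithmic (scale-invariant)
    intermediate band, and an f-theta periphery where r is linear in the
    incident angle -- convenient for view-direction control."""
    if theta <= THETA1:
        return F * math.tan(theta)                       # planar projection
    r1 = F * math.tan(THETA1)
    if theta <= THETA2:
        # logarithmic band: a change of scale becomes a radial shift
        return r1 * (1.0 + math.log(math.tan(theta) / math.tan(THETA1)))
    r2 = r1 * (1.0 + math.log(math.tan(THETA2) / math.tan(THETA1)))
    return r2 + K * (theta - THETA2)                     # f-theta periphery
```

Each branch is continuous with the next at the boundary angles, so the image height grows monotonically from the fovea out to the periphery.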
Wide-Angle Foveation for All-Purpose Use
This paper proposes a model of a wide-angle space-variant image that provides a guide for designing a fovea sensor. First, an advanced wide-angle foveated (AdWAF) model is formulated, taking all-purpose use into account. The proposed model uses both Cartesian (linear) coordinates and logarithmic coordinates, in both planar projection and spherical projection. The model thus divides its wide-angle field of view into four areas, so that it can flexibly represent images formed by various types of lenses. The first simulation compares the model with other lens models in terms of image height and resolution. The result shows that the AdWAF model can reduce image data by 13.5% compared to a log-polar lens model when both have the same resolution in the central field of view. The AdWAF image is remapped from an actual input image taken by the prototype fovea lens, a wide-angle foveated (WAF) lens, using the proposed model. The second simulation compares the model with other foveation models used for the existing log-polar chip and vision system. The third simulation estimates the scale-invariant property by comparison with the existing fovea lens and the log-polar lens. The AdWAF model gives its planar logarithmic part a complete scale-invariant property, while the fovea lens has up to 7.6% error in its spherical logarithmic part. The fourth simulation computes optical flow in order to examine the unidirectional property when a fovea sensor based on the AdWAF model moves, compared to a pinhole camera. The result, obtained by using the concept of a virtual cylindrical screen, indicates that the proposed model has advantages in computation and application of the optical flow when the fovea sensor moves forward.
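The scale- and rotation-invariance that the logarithmic parts of such models provide can be seen in a minimal log-polar mapping. This is a generic sketch, not the AdWAF formulation itself; `rho0` is an arbitrary reference radius.

```python
import math

def to_logpolar(x, y, rho0=1.0):
    """Map Cartesian (x, y) to log-polar (u, v), with u = log(r / rho0)
    and v the polar angle. Scaling about the origin becomes a pure shift
    in u, and rotation a pure shift in v, which is why logarithmic bands
    give scale- and rotation-invariance for pattern recognition."""
    r = math.hypot(x, y)
    return math.log(r / rho0), math.atan2(y, x)
```

For example, doubling a point's distance from the origin changes its `u` coordinate by exactly `log(2)` while leaving `v` untouched.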
Hierarchical Object-Based Visual Attention for Machine Vision
Institute of Perception, Action and Behaviour
Human vision uses mechanisms of covert attention to selectively process interesting
information and overt eye movements to extend this selectivity ability. Thus, visual
tasks can be effectively dealt with by limited processing resources. Modelling visual
attention for machine vision systems is not only critical but also challenging. In the
machine vision literature there have been many conventional attention models developed
but they are all space-based only and cannot perform object-based selection. In
consequence, they fail to work in real-world visual environments due to the intrinsic
limitations of the space-based attention theory upon which these models are built.
The aim of the work presented in this thesis is to provide a novel human-like visual
selection framework based on the object-based attention theory recently being developed
in psychophysics. The proposed solution, a Hierarchical Object-based Attention
Framework (HOAF) based on grouping competition, consists of two closely-coupled
visual selection models of (1) hierarchical object-based visual (covert) attention and
(2) object-based attention-driven (overt) saccadic eye movements. The Hierarchical
Object-based Attention Model (HOAM) is the primary selection mechanism and the
Object-based Attention-Driven Saccading model (OADS) has a supporting role, both
of which are combined in the integrated visual selection framework HOAF.
This thesis first describes the proposed object-based attention model HOAM which
is the primary component of the selection framework HOAF. The model is based on
recent psychophysical results on object-based visual attention and adopts grouping-based
competition to integrate object-based and space-based attention, so as
to achieve object-based hierarchical selectivity. The behaviour of the model is demonstrated
on a number of synthetic images simulating psychophysical experiments and
real-world natural scenes. The experimental results showed that the performance of
our object-based attention model HOAM concurs with the main findings in the psychophysical
literature on object-based and space-based visual attention. Moreover,
HOAM has outstanding hierarchical selectivity from far to near and from coarse to fine
by features, objects, spatial regions, and their groupings in complex natural scenes.
This successful performance arises from three original mechanisms in the model:
grouping-based saliency evaluation, integrated competition between groupings, and
hierarchical selectivity. The model is the first implemented machine vision model of
integrated object-based and space-based visual attention.
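As a toy illustration of hierarchical selectivity (not the thesis's HOAM implementation), attention can be modelled as descending from the most salient grouping to its most salient subgrouping, coarse to fine:

```python
def select_hierarchically(grouping):
    """Follow the most salient child at each level of a nested grouping.
    grouping: dict with 'name', 'saliency', and an optional 'children' list.
    Returns the coarse-to-fine path of attended groupings."""
    path = [grouping["name"]]
    node = grouping
    while node.get("children"):
        node = max(node["children"], key=lambda g: g["saliency"])
        path.append(node["name"])
    return path
```

With a scene containing a salient red grouping that itself contains a salient red circle, the returned path walks scene, then grouping, then object, mirroring selection by groupings, spatial regions, and features in turn.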
The thesis then addresses another proposed model of Object-based Attention-Driven
Saccadic eye movements (OADS) built upon the object-based attention model HOAM,
as an overt saccading component within the object-based selection framework HOAF.
This model, like our object-based attention model HOAM, is also the first implemented
machine vision saccading model which makes a clear distinction between (covert) visual
attention and overt saccading movements in a two-level selection system – an
important feature of human vision but not yet explored in conventional machine vision
saccading systems. In the saccading model OADS, a log-polar retina-like sensor
is employed to simulate the human-like foveation imaging for space variant sensing.
Through a novel mechanism for attention-driven orienting, the sensor fixates on
new destinations determined by object-based attention. Hence it helps attention to
selectively process interesting objects located at the periphery of the whole field of
view to accomplish the large-scale visual selection tasks. By another proposed novel
mechanism for temporary inhibition of return, OADS can simulate the human saccading/
attention behaviour to refixate/reattend interesting objects for further detailed
inspection.
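A minimal sketch of temporary inhibition of return as described for OADS follows; the decay constant and the update rule are assumptions for illustration, not the thesis's exact mechanism.

```python
def next_fixation(saliency, ior, decay=0.5):
    """Pick the most salient target after discounting by inhibition of
    return, then decay all existing inhibitions and fully inhibit the winner.
    saliency: {target: value}; ior: {target: inhibition in [0, 1]}."""
    scores = {t: s * (1.0 - ior.get(t, 0.0)) for t, s in saliency.items()}
    winner = max(scores, key=scores.get)
    new_ior = {t: v * decay for t, v in ior.items()}
    new_ior[winner] = 1.0   # just-fixated target is fully suppressed
    return winner, new_ior
```

Because inhibition decays rather than being permanent, a previously attended target can win again a few saccades later, modelling refixation of interesting objects for further detailed inspection.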
This thesis concludes that the proposed human-like visual selection solution –
HOAF, which is inspired by psychophysical object-based attention theory and grouping-based
competition, is particularly useful for machine vision. HOAF is a general and
effective visual selection framework integrating object-based attention and attention-driven
saccadic eye movements with biological plausibility and object-based hierarchical
selectivity from coarse to fine in a space-time context.
Perception-driven approaches to real-time remote immersive visualization
In remote immersive visualization systems, real-time 3D perception through RGB-D cameras, combined with modern Virtual Reality (VR) interfaces, enhances the user's sense of presence in a remote scene through 3D reconstruction rendered in an immersive display. This is particularly valuable when there is a need to visualize, explore, and perform tasks in environments that are inaccessible because they are too hazardous or too distant. However, a remote visualization system requires that the entire pipeline, from 3D data acquisition to VR rendering, satisfy demanding speed, throughput, and visual-realism constraints. In particular, when using point clouds, there is a fundamental quality difference between the data acquired from the physical world and the data displayed, because network latency and throughput limitations negatively impact the sense of presence and can provoke cybersickness. This thesis presents research that addresses these problems by taking the human visual system as inspiration, from sensor data acquisition to VR rendering. The human visual system does not have uniform vision across the field of view: visual acuity is sharpest at the center and falls off towards the periphery. Peripheral vision provides lower resolution and guides eye movements so that central vision visits all the crucial, interesting parts of a scene. As a first contribution, the thesis develops remote visualization strategies that exploit this acuity fall-off to facilitate the processing, transmission, buffering, and VR rendering of 3D reconstructed scenes while simultaneously reducing throughput requirements and latency. As a second contribution, the thesis investigates attentional mechanisms to select and draw user engagement to specific information in the dynamic spatio-temporal environment.
It proposes a strategy that analyzes the remote scene with respect to its 3D structure, its layout, and the spatial, functional, and semantic relationships between objects in the scene. The strategy focuses on analyzing the scene with models of human visual perception, devoting a greater proportion of computational resources to objects of interest and creating a more realistic visualization. As a supplementary contribution, a new volumetric point-cloud density-based Peak Signal-to-Noise Ratio (PSNR) metric is proposed to evaluate the introduced techniques. An in-depth evaluation of the presented systems, a comparative examination of the proposed point-cloud metric, user studies, and experiments demonstrate that the methods introduced in this thesis are visually superior while significantly reducing latency and throughput.
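The acuity-fall-off idea for point-cloud transmission can be sketched as eccentricity-dependent random subsampling. The fall-off constant `e2` (borrowed from common foveation models) and the probabilistic keep rule are assumptions for illustration, not the thesis's exact scheme.

```python
import math
import random

def keep_probability(ecc_deg, e2=2.3):
    """Acuity-style fall-off: 1 at the gaze direction, decreasing with
    eccentricity in degrees. e2 is a hypothetical half-resolution constant."""
    return e2 / (e2 + ecc_deg)

def foveate(points, gaze, rng):
    """Keep each unit-vector point with a probability set by its angular
    distance from the unit gaze vector -- dense fovea, sparse periphery."""
    kept = []
    for p in points:
        cos_a = sum(a * b for a, b in zip(p, gaze))
        ecc = math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
        if rng.random() < keep_probability(ecc):
            kept.append(p)
    return kept
```

Points on the gaze direction are always kept, while peripheral points are thinned out, which is how throughput can be reduced without degrading the region the user is looking at.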
Space-variant picture coding
Space-variant picture coding techniques exploit the strong spatial non-uniformity of
the human visual system in order to increase coding efficiency in terms of perceived quality
per bit. This thesis extends space-variant coding research in two directions. The first of
these directions is in foveated coding. Past foveated coding research has been dominated
by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer
and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing
an additive multi-viewer sensitivity function based on an established eye resolution
model, and, from this, a blur map that is optimal in the sense of discarding frequencies in
least-noticeable-first order. Furthermore, for the application of a blur map, a novel algorithm
is presented for the efficient computation of high-accuracy smoothly space-variant
Gaussian blurring, using a specialised filter bank which approximates perfect space-variant
Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to
the brute force approach of employing a separate low-pass filter at each image location.
The second direction is that of artificially increasing the depth-of-field of an image, an
idea borrowed from photography with the advantage of allowing an image to be reduced
in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth-of-field algorithms are presented herein, with the desirable properties of aiming to mimic
occlusion effects as occur in natural blurring, and of handling any number of blurring
and occlusion levels with the same level of computational complexity. The merits of this
coding approach have been investigated by subjective experiments to compare it with
single-viewer foveated image coding. The results found the depth-based preblurring to
generally be significantly preferable to the same level of foveation blurring.
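The filter-bank idea behind efficient space-variant Gaussian blurring can be sketched in 1-D. This is a simplification with assumed scales and linear blending, not the thesis's exact construction: the signal is blurred at a few fixed Gaussian scales, and each sample then blends the two nearest scales according to its locally desired sigma, avoiding a separate kernel per sample.

```python
import math

def gauss_kernel(sigma):
    """Normalized 1-D Gaussian kernel; a delta when sigma is zero."""
    if sigma <= 0.0:
        return [1.0]
    radius = max(1, int(3 * sigma))
    k = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur(signal, sigma):
    """Uniform Gaussian blur with edge clamping."""
    k = gauss_kernel(sigma)
    r = len(k) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(k):
            idx = min(max(i + j - r, 0), len(signal) - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

def space_variant_blur(signal, sigma_map, scales=(0.0, 1.0, 2.0, 4.0)):
    """Approximate a per-sample Gaussian blur by blending a small bank of
    fixed-scale blurs -- far cheaper than one kernel per sample."""
    bank = [blur(signal, s) for s in scales]
    out = []
    for i, sig in enumerate(sigma_map):
        sig = min(max(sig, scales[0]), scales[-1])
        for a in range(len(scales) - 1):
            if scales[a] <= sig <= scales[a + 1]:
                t = (sig - scales[a]) / (scales[a + 1] - scales[a])
                out.append((1 - t) * bank[a][i] + t * bank[a + 1][i])
                break
    return out
```

With four scales, the cost is four uniform blurs plus a per-sample blend, regardless of how smoothly the blur map varies across the image.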
Foveated Vision Models for Search and Recognition
Computer vision has made significant progress in recent years thanks to advances in neural network architectures and computing power. At the sensory level, current machine vision systems sample visual data uniformly to make predictions about the scene. This is in contrast with the human visual system, which has high visual acuity only in a small central region, the fovea, and much coarser sampling away from the center. There has been renewed interest, particularly in the context of active vision for robotic navigation and scene exploration, in developing biologically motivated methods that can leverage such foveated computation. While foveated vision offers computational savings at or near the region of interest, it requires eye movements to scan the scene for effective image understanding. The hypothesis is that methods combining non-uniform sampling of the field of view with eye movements will lead to a new class of active vision systems that are computationally optimized for specific tasks of interest. Inspired by these observations, this research provides, for the first time, a comprehensive study of human visual search in the constrained setting of person identification in the wild. A novel video database is created that systematically tests how different parts of a person contribute to eye movements and person identification. Our study shows that search errors can dominate the overall recognition accuracy in human subject experiments. This calls for new strategies for integrating eye tracking with foveated image representations, and two specific approaches are investigated. In the first approach, a deep neural network method is developed to model eye movements, using a long short-term memory (LSTM) to model successive fixations. The proposed method outperforms the state of the art while simplifying the feature extraction procedure.
The second approach focuses on a foveated image model that leverages multiple fixations. A convolutional neural network method is proposed that works directly on foveated input images and achieves competitive recognition rates compared to standard neural networks operating on the same number of input pixels. Overall, the thesis investigates the requirements and implementations that could support active foveated vision, and lays the groundwork for future studies in this area.
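A retina-like, non-uniform sampling pattern of the kind such foveated inputs rely on can be sketched as rings whose radii grow geometrically with eccentricity. This is a generic illustration; the thesis's actual sampling scheme may differ.

```python
import math

def foveated_ring_samples(cx, cy, n_rings=4, per_ring=8, r0=1.0, growth=2.0):
    """Sample locations around a fixation point (cx, cy) whose radial
    spacing grows geometrically with eccentricity -- dense at the fovea,
    coarse in the periphery, with a constant pixel budget per ring."""
    pts = [(cx, cy)]
    r = r0
    for _ in range(n_rings):
        for k in range(per_ring):
            a = 2 * math.pi * k / per_ring
            pts.append((cx + r * math.cos(a), cy + r * math.sin(a)))
        r *= growth
    return pts
```

Because each ring holds the same number of samples while covering a larger area, the total pixel count stays small even as the covered field of view grows, which is the source of the computational savings.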
Active Vision for Scene Understanding
Visual perception is one of the most important sources of information for both humans and robots. A particular challenge is the acquisition and interpretation of complex unstructured scenes. This work contributes to active vision for humanoid robots. A semantic model of the scene is created, which is extended by successively changing the robot's view in order to explore interaction possibilities of the scene.
Perceptive agents with attentive interfaces : learning and vision for man-machine systems
Thesis (Ph.D.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (leaves 107-116). By Trevor Jackson Darrell.