Search CORE

75 research outputs found

Wide-Angle Foveation for All-Purpose Use

Author: Shimizu Sota
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2008
Field of study

This paper proposes a model of a wide-angle space-variant image that provides a guide for designing a fovea sensor. First, an advanced wide-angle foveated (AdWAF) model is formulated, taking all-purpose use into account. This proposed model uses both Cartesian (linear) coordinates and logarithmic coordinates in both planar projection and spherical projection. Thus, this model divides its wide-angle field of view into four areas, such that it can represent an image by various types of lenses, flexibly. The first simulation compares with other lens models, in terms of image height and resolution. The result shows that the AdWAF model can reduce image data by 13.5%, compared to a log-polar lens model, both having the same resolution in the central field of view. The AdWAF image is remapped from an actual input image by the prototype fovea lens, a wide-angle foveated (WAF) lens, using the proposed model. The second simulation compares with other foveation models used for the existing log-polar chip and vision system. The third simulation estimates a scale-invariant property by comparing with the existing fovea lens and the log-polar lens. The AdWAF model gives its planar logarithmic part a complete scale-invariant property, while the fovea lens has 7.6% error at most in its spherical logarithmic part. The fourth simulation computes optical flow in order to examine the unidirectional property when the fovea sensor by the AdWAF model moves, compared to the pinhole camera. The result obtained by using a concept of a virtual cylindrical screen indicates that the proposed model has advantages in terms of computation and application of the optical flow when the fovea sensor moves forward

Caltech Authors

Embodied-driven design : a framework to configure body representation & mapping(本文)

Author: Saraiji MHD Yamen
サライジムハマドヤメン
Publication venue: 慶應義塾大学大学院メディアデザイン研究科
Publication date
Field of study

KeiO Academic Resource Archive

Interactive Object Learning and Recognition with Multiclass Support Vector Machines

Author: Ales Ude
Publication venue: 'IntechOpen'
Publication date: 01/03/2010
Field of study

IntechOpen

Crossref

A distributed camera system for multi-resolution surveillance

Author: Bellotto Nicola
Benfold Ben
Bibby Charles
Fenandez Carles
Gonzales Jordi
Reid Ian
Roth Daniel
Sommerlade Eric
Van Gool Luc
Publication venue
Publication date: 01/01/2009
Field of study

We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor. Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database. Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines if active zoom cameras should be dispatched to observe a particular target, and this message is effected via writing demands into another database table. We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility to multi-camera systems for intelligent surveillance

University of Lincoln Institutional Repository

Crossref

Adelaide Research & Scholarship

Content-prioritised video coding for British Sign Language communication.

Author: Muir Laura Joy
Publication venue
Publication date: 31/10/2007
Field of study

Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality. BSL is a highly structured, linguistically complete, natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities. This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264). Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. The research community benefits from a new approach to video coding optimisation and better understanding of the communication needs of deaf people

Open Access Institutional Repository at Robert Gordon University

Wide-Angle Foveation for All-Purpose Use

Author: S. Shimizu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Space-variant picture coding

Author: Popkin Timothy John
Publication venue: 'Queen Mary University of London'
Publication date: 01/01/2010
Field of study

PhDSpace-variant picture coding techniques exploit the strong spatial non-uniformity of the human visual system in order to increase coding efficiency in terms of perceived quality per bit. This thesis extends space-variant coding research in two directions. The first of these directions is in foveated coding. Past foveated coding research has been dominated by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing an additive multi-viewer sensitivity function based on an established eye resolution model, and, from this, a blur map that is optimal in the sense of discarding frequencies in least-noticeable- rst order. Furthermore, for the application of a blur map, a novel algorithm is presented for the efficient computation of high-accuracy smoothly space-variant Gaussian blurring, using a specialised filter bank which approximates perfect space-variant Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to the brute force approach of employing a separate low-pass filter at each image location. The second direction is that of artifi cially increasing the depth-of- field of an image, an idea borrowed from photography with the advantage of allowing an image to be reduced in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth of field algorithms are presented herein, with the desirable properties of aiming to mimic occlusion eff ects as occur in natural blurring, and of handling any number of blurring and occlusion levels with the same level of computational complexity. The merits of this coding approach have been investigated by subjective experiments to compare it with single-viewer foveated image coding. The results found the depth-based preblurring to generally be significantly preferable to the same level of foveation blurring

Queen Mary Research Online

OpenGrey Repository

A perceptual comparison of empirical and predictive region-of-interest video

Author: Ghinea G
Gulliver SR
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009
Field of study

When viewing multimedia presentations, a user only attends to a relatively small part of the video display at any one point in time. By shifting allocation of bandwidth from peripheral areas to those locations where a user’s gaze is more likely to rest, attentive displays can be produced. Attentive displays aim to reduce resource requirements while minimizing negative user perception—understood in this paper as not only a user’s ability to assimilate and understand information but also his/her subjective satisfaction with the video content. This paper introduces and discusses a perceptual comparison between two region-of-interest display (RoID) adaptation techniques. A RoID is an attentive display where bandwidth has been preallocated around measured or highly probable areas of user gaze. In this paper, video content was manipulated using two sources of data: empirical measured data (captured using eye-tracking technology) and predictive data (calculated from the physical characteristics of the video data). Results show that display adaptation causes significant variation in users’ understanding of specific multimedia content. Interestingly, RoID adaptation and the type of video being presented both affect user perception of video quality. Moreover, the use of frame rates less than 15 frames per second, for any video adaptation technique, caused a significant reduction in user perceived quality, suggesting that although users are aware of video quality reduction, it does impact level of information assimilation and understanding. Results also highlight that user level of enjoyment is significantly affected by the type of video yet is not as affected by the quality or type of video adaptation—an interesting implication in the field of entertainment

Central Archive at the University of Reading

CiteSeerX

Crossref

Brunel University Research Archive

Recommended from our members

Peripheral Representations: from Perception to Visual Search

Author: Deza Arturo
Publication venue: eScholarship, University of California
Publication date: 01/01/2018
Field of study

The human visual field is composed of a high acuity region at the center of gaze called the fovea, andits complement, the visual periphery. Not much is known about the computations and representationsof the visual periphery, as most of the focus in the field of human (and machine) vision is geared towardsfoveal vision. Thus, the focus of this thesis will be on understanding the computations performed bythe human visual system in the visual periphery. In doing so, I will begin by modelling the perceptionof clutter and how it changes as a function of the behavioural task and point of fixation, developinga collection of foveated clutter models that enhance non-foveated models. I will then propose a newmetamer model that renders how the information is distorted in the visual field, and what this tells usabout the computations done in the visual periphery. Finally, I will conclude with the design of twohybrid man-machine collaborative visual search systems that try to overcome the limitations in humanvisual search imposed by the visual periphery and observer inefficiencies in terminating exploration

eScholarship - University of California