University of Edinburgh. College of Science and Engineering. School of Informatics. Institute of Perception, Action and Behaviour.
Abstract
Human vision uses mechanisms of covert attention to selectively process interesting
information and overt eye movements to extend this selectivity. Visual tasks can
thus be handled effectively with limited processing resources. Modelling visual
attention for machine vision systems is both critical and challenging. Many
conventional attention models have been developed in the machine vision literature,
but they are all purely space-based and cannot perform object-based selection.
Consequently, they fail in real-world visual environments because of the intrinsic
limitations of the space-based attention theory upon which they are built.
The aim of the work presented in this thesis is to provide a novel human-like visual
selection framework based on the object-based attention theory recently developed
in psychophysics. The proposed solution, a Hierarchical Object-based Attention
Framework (HOAF) based on grouping competition, consists of two closely coupled
visual selection models of (1) hierarchical object-based visual (covert) attention and
(2) object-based attention-driven (overt) saccadic eye movements. The Hierarchical
Object-based Attention Model (HOAM) is the primary selection mechanism and the
Object-based Attention-Driven Saccading model (OADS) has a supporting role, both
of which are combined in the integrated visual selection framework HOAF.
This thesis first describes the proposed object-based attention model HOAM which
is the primary component of the selection framework HOAF. The model is based on
recent psychophysical results on object-based visual attention and adopts grouping-based
competition to integrate object-based and space-based attention so as
to achieve object-based hierarchical selectivity. The behaviour of the model is demonstrated
on a number of synthetic images simulating psychophysical experiments and
real-world natural scenes. The experimental results showed that the performance of
our object-based attention model HOAM concurs with the main findings in the psychophysical
literature on object-based and space-based visual attention. Moreover,
HOAM has outstanding hierarchical selectivity from far to near and from coarse to fine
by features, objects, spatial regions, and their groupings in complex natural scenes.
This successful performance arises from three original mechanisms in the model:
grouping-based saliency evaluation, integrated competition between groupings, and
hierarchical selectivity. The model is the first implemented machine vision model of
integrated object-based and space-based visual attention.
The thesis then addresses another proposed model of Object-based Attention-Driven
Saccadic eye movements (OADS) built upon the object-based attention model HOAM,
as an overt saccading component within the object-based selection framework HOAF.
This model, like our object-based attention model HOAM, is also the first implemented
machine vision saccading model that makes a clear distinction between (covert) visual
attention and overt saccading movements in a two-level selection system, an
important feature of human vision that has not yet been explored in conventional machine vision
saccading systems. In the saccading model OADS, a log-polar retina-like sensor
is employed to simulate human-like foveated imaging for space-variant sensing.
Through a novel mechanism for attention-driven orienting, the sensor fixates on
new destinations determined by object-based attention. It thereby helps attention
selectively process interesting objects located at the periphery of the field of
view and so accomplish large-scale visual selection tasks. Through a second novel
mechanism, temporary inhibition of return, OADS can simulate human saccading and
attention behaviour, refixating and reattending interesting objects for further detailed
inspection.
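As a rough illustration of what space-variant, retina-like sensing means, the following minimal sketch samples an image on a log-polar lattice around a fixation point. It is not the sensor used in OADS: the function name `log_polar_sample`, its parameters (`n_rings`, `n_wedges`, `r_min`), and the nearest-neighbour sampling are all assumptions made for this example, since the abstract does not specify the sensor geometry.

```python
import math


def log_polar_sample(image, cx, cy, n_rings=8, n_wedges=16, r_min=1.0):
    """Sample a 2-D grid (a list of rows) on a log-polar lattice centred at (cx, cy).

    Hypothetical helper for illustration only. Returns an n_rings x n_wedges
    grid; samples that fall outside the image are None.
    """
    height, width = len(image), len(image[0])
    # Outermost ring reaches the image corner farthest from the fixation point.
    r_max = max(math.hypot(x - cx, y - cy)
                for x in (0, width - 1) for y in (0, height - 1))
    r_max = max(r_max, 2.0 * r_min)  # keep the radii strictly increasing
    # Radii grow geometrically: dense sampling near the fovea, sparse in the periphery.
    growth = (r_max / r_min) ** (1.0 / (n_rings - 1)) if n_rings > 1 else 1.0
    grid = []
    for i in range(n_rings):
        radius = r_min * growth ** i
        ring = []
        for j in range(n_wedges):
            theta = 2.0 * math.pi * j / n_wedges
            # Nearest-neighbour lookup (an assumed simplification).
            x = int(round(cx + radius * math.cos(theta)))
            y = int(round(cy + radius * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            ring.append(image[y][x] if inside else None)
        grid.append(ring)
    return grid
```

In a sketch like this, a saccade amounts to calling the function again with `(cx, cy)` set to the location chosen by attention, so that an object previously seen only coarsely in the periphery is resampled at foveal resolution.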
This thesis concludes that the proposed human-like visual selection solution,
HOAF, which is inspired by psychophysical object-based attention theory and grouping-based
competition, is particularly useful for machine vision. HOAF is a general and
effective visual selection framework integrating object-based attention and attention-driven
saccadic eye movements with biological plausibility and object-based hierarchical
selectivity from coarse to fine in a space-time context.