Disentangling top-down from bottom-up influences on attentional allocation in dynamic scenes

Carmi, Ran; Itti, Lawrence

conference paper review

oai:caltechconf.library.caltech.edu:2

Publication date: 15 May 2004
Abstract

Motivation: Attentional allocation is determined by the interplay between bottom-up and top-down influences. Here we try to quantify the relative contributions of different influences on attentional allocation in dynamic scenes, and to examine how they change over time.
Methods: To manipulate the availability of top-down influences on attentional allocation, heterogeneous video clips were cut into clippets (M = 2 s), which were scrambled and re-assembled into MTV-style clips. Two groups of eight subjects each were instructed to "follow the main actors and actions". One group viewed the original stimuli while the other group viewed the MTV-style clips. Eye positions were recorded using an ISCAN eye-tracker (240 Hz, yielding a total of more than a million samples for each group), and segmented into saccades, blinks, and fixation/smooth pursuit periods. A saliency-based model of attention capture (Itti & Koch 2000) was used to probe the relative contribution of bottom-up influences on attentional allocation, based on a novel performance metric, the Chance-Adjusted Saliency Accumometric (CASA). CASA values were computed as the weighted sum of differences between normalized saliency at human vs. random saccade targets.
Results: Total CASA based on the full saliency model was 6% higher in the MTV group than in the original group. In both the original and MTV groups, CASA based on either motion or flicker features alone was ~95% of the CASA based on the full saliency model. CASA based on either color, intensity, or orientation features alone was ~66% of the full-model CASA. Generally, CASA values were higher for earlier saccades after stimulus onset (clip or clippet start) than for later saccades, but they tapered off and fluctuated around a fairly high value after the first several saccades.
Conclusions: The 6% CASA difference between the original and MTV groups shows that eliminating visual context beyond the first ~2 s of viewing barely increased the overall relative weight of bottom-up influences on attentional allocation. Our results imply that the relative weight of top-down influences on attentional allocation in dynamic scenes does not increase with viewing time (beyond the first ~2 s). We also found that motion or flicker alone is about 1.5 times as strong a bottom-up attractor of attention as color, intensity, or orientation alone (~95% vs. ~66% of the full-model CASA).
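
The abstract describes CASA only as a weighted sum of differences between normalized saliency at human and random saccade targets. The following is a minimal sketch of a metric of that general shape, not the authors' implementation. The function names (`normalize`, `casa_like_score`), the max-based normalization, the uniform default weights, and the uniform sampling of random targets are all assumptions made for illustration.

```python
import random


def normalize(saliency_map):
    """Scale a 2-D saliency map (list of rows) so that its maximum is 1.0.

    Assumption: max-normalization; the abstract does not state which
    normalization was actually used.
    """
    peak = max(max(row) for row in saliency_map)
    if peak == 0:
        return [[0.0 for _ in row] for row in saliency_map]
    return [[value / peak for value in row] for row in saliency_map]


def casa_like_score(saccades, weights=None, rng=None):
    """Weighted sum of (saliency at human target - saliency at random target).

    saccades: list of (saliency_map, (row, col)) pairs, one per saccade,
              where (row, col) is the human saccade target.
    weights:  one weight per saccade; defaults to uniform weights (assumption).
    rng:      random.Random instance used to draw the random targets.
    """
    if not saccades:
        raise ValueError("at least one saccade is required")
    if rng is None:
        rng = random.Random(0)
    if weights is None:
        weights = [1.0 / len(saccades)] * len(saccades)

    total = 0.0
    for weight, (saliency_map, (row, col)) in zip(weights, saccades):
        norm = normalize(saliency_map)
        # Assumption: the random target is drawn uniformly over map locations.
        rand_row = rng.randrange(len(norm))
        rand_col = rng.randrange(len(norm[0]))
        total += weight * (norm[row][col] - norm[rand_row][rand_col])
    return total
```

Under these assumptions, a score near zero means human saccade targets are no more salient than chance locations, while a positive score means they land on locations the model rates as more salient than chance.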

Last time updated on 09/10/2012

This paper was published in CaltechCONF.
