Task-set switching with natural scenes: Measuring the cost of deploying top-down attention
In many everyday situations, we bias our perception from the top down, based on a task or an agenda. Frequently, this entails shifting attention to a specific attribute of a particular object or scene. To explore the cost of shifting top-down attention to a different stimulus attribute, we adopt the task-set switching paradigm, in which switch trials are contrasted with repeat trials in mixed-task blocks and with single-task blocks. Using two tasks that relate to the content of a natural scene in a gray-level photograph and two tasks that relate to the color of the frame around the image, we were able to distinguish switch costs with and without shifts of attention. We found a significant cost in reaction time of 23–31 ms for switches that require shifting attention to other stimulus attributes, but no significant switch cost for switching the task set within an attribute. We conclude that deploying top-down attention to a different attribute incurs a significant cost in reaction time, but that biasing to a different feature value within the same stimulus attribute is effortless.
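As a minimal sketch (not the authors' analysis code) of how such a switch cost could be computed from trial-level data; the column names `block_type`, `trial_type`, `attribute_shift`, and `rt_ms` are illustrative assumptions:

```python
import pandas as pd

def switch_cost(trials: pd.DataFrame) -> float:
    """Mean RT on switch trials minus mean RT on repeat trials (mixed blocks only)."""
    mixed = trials[trials["block_type"] == "mixed"]
    rt_switch = mixed.loc[mixed["trial_type"] == "switch", "rt_ms"].mean()
    rt_repeat = mixed.loc[mixed["trial_type"] == "repeat", "rt_ms"].mean()
    return rt_switch - rt_repeat

def cost_by_attribute_shift(trials: pd.DataFrame) -> dict:
    """Separate switch costs for switches that do vs. do not require shifting
    attention to a different stimulus attribute (scene content vs. frame color)."""
    return {shifted: switch_cost(group)
            for shifted, group in trials.groupby("attribute_shift")}
```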
Modeling feature sharing between object detection and top-down attention
Visual search and other attentionally demanding processes are often guided from the top down when a specific task is given (e.g. Wolfe et al., Vision Research 44, 2004). In the simplified stimuli commonly used in visual search experiments, e.g. red and horizontal bars, the selection of potential features that might be biased for is obvious (by design). In a natural setting with real-world objects, the selection of these features is not obvious, and there is some debate about which features can be used for top-down guidance and how a specific task maps to them (Wolfe and Horowitz, Nat. Rev. Neurosci. 2004).
Learning to detect objects provides the visual system with an effective set of features suitable for the detection task, and with a mapping from these features to an abstract representation of the object.
We suggest a model in which V4-type features are shared between object detection and top-down attention. As the model familiarizes itself with objects, i.e., learns to detect them, it acquires a representation for features to solve the detection task. We propose that, via cortical feedback connections, top-down processes can re-use these same features to bias attention to locations with a higher probability of containing the target object. We propose a model architecture that allows for such processing, and we present a computational implementation of the model that performs visual search in natural scenes for a given object category, e.g. for faces. We compare the performance of our model to pure bottom-up selection as well as to top-down attention using simple features such as hue.
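As a rough illustration of the biasing step (a sketch, not the published implementation), the following assumes precomputed V4-like feature maps and detection weights, and re-uses the detection weights to produce a top-down location bias:

```python
import numpy as np

def top_down_bias_map(feature_maps: np.ndarray, detection_weights: np.ndarray) -> np.ndarray:
    """feature_maps: (n_features, H, W) V4-like responses; detection_weights:
    (n_features,) weights learned for detection, re-used here to bias attention.
    Returns an (H, W) map of relative target likelihood per location."""
    biased = np.tensordot(detection_weights, feature_maps, axes=1)  # weighted sum over features
    biased -= biased.min()
    return biased / (biased.max() + 1e-9)  # normalize to [0, 1]

def next_fixation(bias_map: np.ndarray) -> tuple:
    """Attend first to the location with the highest top-down weighted response."""
    return np.unravel_index(np.argmax(bias_map), bias_map.shape)
```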
Saliency on a chip: a digital approach with an FPGA
Selective-visual-attention algorithms have been successfully implemented in analog VLSI circuits [1]. However, in addition to the usual issues of analog VLSI, such as the need to fine-tune a large number of biases, these implementations lack the spatial resolution and pre-processing capabilities to be truly useful for image-processing applications. Here we take an alternative approach and implement a neuro-mimetic algorithm for selective visual attention in digital hardware.
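For readers unfamiliar with the underlying saliency computation, here is a minimal software sketch of a hardware-friendly center-surround pipeline; the box filters, scales, and 8-bit quantization are assumptions for illustration, not the chip's actual design:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def center_surround(intensity: np.ndarray, center: int, surround: int) -> np.ndarray:
    """Absolute difference between a fine (center) and a coarse (surround)
    box-filtered copy of the image; box filters are cheap to realize in hardware."""
    fine = uniform_filter(intensity.astype(np.float32), size=center)
    coarse = uniform_filter(intensity.astype(np.float32), size=surround)
    return np.abs(fine - coarse)

def saliency_map(intensity: np.ndarray) -> np.ndarray:
    """Sum center-surround maps over a few scales and quantize to 8 bits,
    mimicking the limited precision of a fixed-point digital pipeline."""
    acc = sum(center_surround(intensity, c, 4 * c) for c in (3, 7, 15))
    acc = acc - acc.min()
    return np.round(255.0 * acc / (acc.max() + 1e-9)).astype(np.uint8)
```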
Is bottom-up attention useful for object recognition?
A key problem in learning multiple objects from unlabeled images is that it is a priori impossible to tell which part of the image corresponds to each individual object and which part is irrelevant clutter that is not associated with the objects. We investigate empirically to what extent pure bottom-up attention can extract useful information about the location, size, and shape of objects from images, and we demonstrate how this information can be utilized to enable unsupervised learning of objects from unlabeled images. Our experiments demonstrate that the proposed approach to using bottom-up attention is indeed useful for a variety of applications.
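A minimal sketch of the general idea (not the paper's pipeline): threshold a bottom-up saliency map, take connected salient regions as candidate object locations and extents, and crop them for downstream unsupervised learning. The saliency map itself is assumed to come from any standard bottom-up model.

```python
import numpy as np
from scipy.ndimage import label, find_objects

def salient_regions(image: np.ndarray, saliency: np.ndarray, frac: float = 0.25) -> list:
    """Crop connected regions whose saliency exceeds a fraction of the map's
    maximum; each crop is a candidate object with an implied location and size."""
    mask = saliency >= frac * saliency.max()
    labeled, _ = label(mask)
    return [image[slc] for slc in find_objects(labeled)]

# Downstream, features of these crops can be clustered (e.g. with k-means) so
# that crops of the same object category group together without any labels.
```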
Measuring Symmetry in Real-World Scenes Using Derivatives of the Medial Axis Radius Function
Symmetry has been shown to be an important principle that guides the grouping of scene information. Previously, we described a method for measuring the local ribbon-symmetry content of line drawings of real-world scenes (Rezanejad et al., MODVIS 2017) and demonstrated that this information has important behavioral consequences (Wilder et al., MODVIS 2017). Here, we describe a continuous, local version of the symmetry measure that allows both ribbon and taper symmetry to be captured. Our original method looked at the difference in radius between successive maximal discs along a symmetric axis; the number of radius differences in a local region that exceeded a threshold, normalized by the total number of differences, served as the symmetry score at an axis point. We now use the derivative of the radius function along the symmetric axis between two contours, which yields a continuous estimate of the score and does not require a threshold. By replacing the first derivative with a second derivative, we can generalize the method so that pairs of contours that taper with respect to one another also express high symmetry. Such situations arise, for example, when parallel lines in the 3D world project onto a 2D image. This generalization will allow us to determine the relative importance of taper and ribbon symmetries in natural scenes.
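A minimal numerical sketch of the scoring idea, under assumed conventions: `radius` holds the medial-axis radius sampled at successive axis points, and the specific mapping from derivative magnitude to a score in (0, 1] is an illustrative choice, not the authors' exact formula.

```python
import numpy as np

def ribbon_symmetry_score(radius: np.ndarray, spacing: float = 1.0) -> np.ndarray:
    """Continuous ribbon-symmetry score: near 1 where the radius function is
    locally constant (parallel contours), falling off as |dR/ds| grows."""
    dr = np.gradient(radius, spacing)   # first derivative along the axis
    return 1.0 / (1.0 + np.abs(dr))     # illustrative mapping to (0, 1]

def taper_symmetry_score(radius: np.ndarray, spacing: float = 1.0) -> np.ndarray:
    """Taper-symmetry score: uses the second derivative, so contours that taper
    linearly with respect to one another still receive a high score."""
    d2r = np.gradient(np.gradient(radius, spacing), spacing)
    return 1.0 / (1.0 + np.abs(d2r))
```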