164 research outputs found

    Object Detection Through Exploration With A Foveated Visual Field

    We present a foveated object detector (FOD) as a biologically-inspired alternative to the sliding window (SW) approach, which is the dominant method of search in computer vision object detection. Similar to the human visual system, the FOD has higher resolution at the fovea and lower resolution at the visual periphery. Consequently, more computational resources are allocated at the fovea and relatively fewer at the periphery. The FOD processes the entire scene, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image and integrates observations across multiple fixations. Our approach combines modern object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We assessed various eye movement strategies on the PASCAL VOC 2007 dataset and show that the FOD performs on par with the SW detector while bringing significant computational cost savings. Comment: An extended version of this manuscript was published in PLOS Computational Biology (October 2017) at https://doi.org/10.1371/journal.pcbi.100574

    Perception of the visual environment

    The eyes are the front end to the vast majority of the human behavioural repertoire. The manner in which our eyes sample the environment places fundamental constraints upon the information that is available for subsequent processing in the brain: the small window of clear vision at the centre of gaze can only be directed at an average of about three locations in the environment in every second. We are largely unaware of these continual movements, making eye movements a valuable objective measure that can provide a window into the cognitive processes underlying many of our behaviours. The valuable resource of high quality vision must be allocated with care in order to provide the right information at the right time for the behaviours we engage in. However, the mechanisms that underlie the decisions about where and when to move the eyes remain to be fully understood. In this chapter I consider what has been learnt about targeting the eyes in a range of different experimental paradigms, from simple stimulus arrays of only a few isolated targets, to complex arrays and photographs of real environments, and finally to natural task settings. Much has been learnt about how we view photographs, and current models incorporate low-level image salience, motor biases to favour certain ways of moving the eyes, higher-level expectations of what objects look like and expectations about where we will find objects in a scene. Finally, I consider the fate of information that has received overt visual attention. While much of the detailed information from what we look at is lost, some remains, yet what we retain, and the factors that govern what is remembered and what is forgotten, are not well understood. It appears that our expectations about what we will need to know later in the task are important in determining what we represent and retain in visual memory, and that our representations are shaped by the interactions that we engage in with objects.

    Evolution and Optimality of Similar Neural Mechanisms for Perception and Action during Search

    A prevailing theory proposes that the brain's two visual pathways, the ventral and dorsal, lead to visual processing and world representations for conscious perception that differ from those for action. Others have claimed that perception and action share much of their visual processing. But which of these two neural architectures is favored by evolution? Successful visual search is life-critical and here we investigate the evolution and optimality of neural mechanisms mediating perception and eye movement actions for visual search in natural images. We implement an approximation to the ideal Bayesian searcher with two separate processing streams, one controlling the eye movements and the other stream determining the perceptual search decisions. We virtually evolved the neural mechanisms of the searchers' two separate pathways built from linear combinations of primary visual cortex receptive fields (V1) by making the simulated individuals' probability of survival depend on the perceptual accuracy in finding targets in cluttered backgrounds. We find that for a variety of targets, backgrounds, and dependence of target detectability on retinal eccentricity, the mechanisms of the searchers' two processing streams converge to similar representations, showing that mismatches in the mechanisms for perception and eye movements lead to suboptimal search. Three exceptions that resulted in partial or no convergence were a case of an organism for which the targets are equally detectable across the retina, an organism with sufficient time to foveate all possible target locations, and a strict two-pathway model with no interconnections and differential pre-filtering based on parvocellular and magnocellular lateral geniculate cell properties. Thus, similar neural mechanisms for perception and eye movement actions during search are optimal and should be expected from the effects of natural selection on an organism that has limited time to search for food, for which food is not equi-detectable across its retina, and whose perception and action neural pathways are interconnected.

    Reinforcement learning or active inference?

    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming, namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
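For readers unfamiliar with the mountain-car benchmark mentioned above, here is a minimal sketch of the task itself. It assumes the classic textbook formulation of the dynamics (position in [-1.2, 0.5], velocity in [-0.07, 0.07], three discrete actions), which is not necessarily the variant used in the paper, and it does not implement the free-energy scheme. The function names `step`, `run`, `always_right` and `energy_pump` are hypothetical. The sketch only illustrates why the problem is a useful benchmark: the engine is too weak to climb the hill directly, so the agent must first move away from the goal.

```python
import math

# Bounds of the classic formulation (assumed here, not taken from the paper).
X_MIN, X_MAX = -1.2, 0.5      # position range; the goal is at X_MAX
V_MIN, V_MAX = -0.07, 0.07    # velocity range


def step(x, v, action):
    """Advance the car one time step. `action` is -1 (left), 0, or +1 (right)."""
    v = v + 0.001 * action - 0.0025 * math.cos(3 * x)  # engine plus gravity
    v = min(max(v, V_MIN), V_MAX)
    x = x + v
    if x < X_MIN:             # inelastic left wall: stop the car there
        x, v = X_MIN, 0.0
    return x, v


def run(policy, x=-0.5, v=0.0, max_steps=1000):
    """Return the number of steps taken to reach the goal, or None on timeout."""
    for t in range(max_steps):
        if x >= X_MAX:
            return t
        x, v = step(x, v, policy(x, v))
    return None


def always_right(x, v):
    """Naive policy: always push towards the goal."""
    return 1


def energy_pump(x, v):
    """Heuristic policy: push in the direction of travel to build up momentum."""
    return 1 if v >= 0 else -1


if __name__ == "__main__":
    print("always right:", run(always_right))   # times out
    print("energy pump: ", run(energy_pump))    # reaches the goal
```

Pushing towards the goal at every step never succeeds from the valley floor, whereas rocking back and forth does. Any learning scheme, whether reward-based or not, has to discover this indirect route.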

    Human Visual Search Does Not Maximize the Post-Saccadic Probability of Identifying Targets

    Researchers have conjectured that eye movements during visual search are selected to minimize the number of saccades. The optimal Bayesian eye movement strategy minimizing saccades does not simply direct the eye to whichever location is judged most likely to contain the target but makes use of the entire retina as an information gathering device during each fixation. Here we show that human observers do not minimize the expected number of saccades when planning saccades in a simple visual search task composed of three tokens. In this task, the optimal eye movement strategy varied, depending on the spacing between tokens (in the first experiment) or the size of tokens (in the second experiment), and changed abruptly once the separation or size surpassed a critical value. None of our observers changed strategy as a function of separation or size. Human performance fell far short of ideal, both qualitatively and quantitatively.

    Generalized information theory meets human cognition: Introducing a unified framework to model uncertainty and information search

    Searching for information is critical in many situations. In medicine, for instance, careful choice of a diagnostic test can help narrow down the range of plausible diseases that the patient might have. In a probabilistic framework, test selection is often modeled by assuming that people’s goal is to reduce uncertainty about possible states of the world. In cognitive science, psychology, and medical decision making, Shannon entropy is the most prominent and most widely used model to formalize probabilistic uncertainty and the reduction thereof. However, a variety of alternative entropy metrics (Hartley, Quadratic, Tsallis, Rényi, and more) are popular in the social and the natural sciences, computer science, and philosophy of science. Particular entropy measures have been predominant in particular research areas, and it is often an open issue whether these divergences emerge from different theoretical and practical goals or are merely due to historical accident. Cutting across disciplinary boundaries, we show that several entropy and entropy reduction measures arise as special cases in a unified formalism, the Sharma-Mittal framework. Using mathematical results, computer simulations, and analyses of published behavioral data, we discuss four key questions: How do various entropy models relate to each other? What insights can be obtained by considering diverse entropy models within a unified framework? What is the psychological plausibility of different entropy models? What new questions and insights for research on human information acquisition follow? Our work provides several new pathways for theoretical and empirical research, reconciling apparently conflicting approaches and empirical findings within a comprehensive and unified information-theoretic formalism.
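To make the "special cases" claim concrete, the following is a minimal sketch of a two-parameter Sharma-Mittal entropy. It assumes the commonly stated parameterization with an order r and a degree t, uses natural logarithms, and handles the limits r → 1 and t → 1 explicitly. The function name `sharma_mittal` and the tolerance argument are hypothetical choices for illustration, not taken from the paper.

```python
import math


def sharma_mittal(p, r, t, tol=1e-9):
    """Sharma-Mittal entropy of order r and degree t (natural-log units).

    General form (r != 1, t != 1):
        H = (1 - (sum_i p_i**r) ** ((t - 1) / (r - 1))) / (t - 1)
    The cases r == 1 and/or t == 1 are evaluated as limits.
    """
    p = [x for x in p if x > 0]          # zero-probability outcomes contribute nothing
    shannon = -sum(x * math.log(x) for x in p)
    if abs(r - 1) < tol:
        if abs(t - 1) < tol:
            return shannon               # r -> 1, t -> 1: Shannon entropy
        return (1 - math.exp((1 - t) * shannon)) / (t - 1)
    power_sum = sum(x ** r for x in p)
    if abs(t - 1) < tol:
        return math.log(power_sum) / (1 - r)   # t -> 1: Rényi entropy of order r
    return (1 - power_sum ** ((t - 1) / (r - 1))) / (t - 1)


if __name__ == "__main__":
    p = [0.5, 0.25, 0.25]
    print("Shannon   (r=1, t=1):", sharma_mittal(p, 1, 1))
    print("Hartley   (r=0, t=1):", sharma_mittal(p, 0, 1))
    print("Quadratic (r=2, t=2):", sharma_mittal(p, 2, 2))
    print("Rényi-2   (r=2, t=1):", sharma_mittal(p, 2, 1))
    print("Tsallis-3 (r=3, t=3):", sharma_mittal(p, 3, 3))
```

Setting r = t gives the Tsallis family (with the Quadratic measure at r = t = 2), while t = 1 gives the Rényi family (with Hartley at r = 0), so one function covers the measures named in the abstract.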

    Combining eye and hand in search is suboptimal

    When performing everyday tasks, we often move our eyes and hand together: we look where we are reaching in order to better guide the hand. This coordinated pattern with the eye leading the hand is presumably optimal behaviour. But eyes and hands can move to different locations if they are involved in different tasks. To find out whether this leads to optimal performance, we studied the combination of visual and haptic search. We asked ten participants to perform a combined visual and haptic search for a target that was present in both modalities and compared their search times to those on visual only and haptic only search tasks. Without distractors, search times were shorter for visual search than for haptic search. With many visual distractors, search times were longer for visual than for haptic search. For the combined search, performance was poorer than the optimal strategy whereby each modality searched a different part of the display. The results are consistent with several alternative accounts, for instance with vision and touch searching independently at the same time.

    Assessing Visual Attention Using Eye Tracking Sensors in Intelligent Cognitive Therapies Based on Serious Games

    This study examines the use of eye tracking sensors as a means to identify children's behavior in attention-enhancement therapies. For this purpose, a set of data collected from 32 children with different attention skills is analyzed during their interaction with a set of puzzle games. The authors of this study hypothesize that participants with better performance may have quantifiably different eye-movement patterns from users with poorer results. The use of eye trackers outside the research community may help to extend their potential with available intelligent therapies, bringing state-of-the-art technologies to users. The use of gaze data constitutes a new information source in intelligent therapies that may help to build new approaches that are fully customized to end users' needs. This may be achieved by implementing machine learning algorithms for classification. The initial study of the dataset yielded a classification accuracy of 0.88 (±0.11) with a random forest classifier, using cross-validation and hierarchical tree-based feature selection. Further approaches need to be examined in order to establish more detailed attention behaviors and patterns among children with and without attention problems.

    Keeping an eye on noisy movements: On different approaches to perceptual-motor skill research and training

    Contemporary theorising on the complementary nature of perception and action in expert performance has led to the emergence of different emphases in studying movement coordination and gaze behaviour. On the one hand, coordination research has examined the role that variability plays in movement control, evidencing that variability facilitates individualised adaptations during both learning and performance. On the other hand, and at odds with this principle, the majority of gaze behaviour studies have tended to average data over participants and trials, proposing the importance of universal 'optimal' gaze patterns in a given task, for all performers, irrespective of stage of learning. In this article, new lines of inquiry are considered with the aim of reconciling these two distinct approaches. The role that inter- and intra-individual variability may play in gaze behaviours is considered, before suggesting directions for future research.