1,145 research outputs found

    Driving forces in free visual search: An ethology


    Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy

    In this paper we consider the problem of deploying attention to subsets of video streams in order to collate the data and information most relevant to a given task. We formalize this monitoring problem as a foraging problem and propose a probabilistic framework that models the observer's attentive behavior as that of a forager. From moment to moment, the forager focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The proposed approach is well suited to multi-stream video summarization, and it can also serve as a preliminary step for more sophisticated video surveillance, e.g. activity and behavior analysis. Experimental results on the UCR Videoweb Activities Dataset, a publicly available dataset, illustrate the utility of the proposed technique. Comment: Accepted to IEEE Transactions on Image Processing.
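
    As a hedged illustration of the foraging idea described above (not the authors' actual model), the Python sketch below treats each stream as a foraging patch: a Beta posterior tracks each stream's rate of interesting events, and the observer leaves the attended stream when its expected yield falls below the average across streams, a marginal-value-theorem-style giving-up rule. The class name StreamForager, the Thompson-sampling switch, and all parameter values are illustrative assumptions.

    import random

    class StreamForager:
        """Sketch of a foraging-style attention policy over video streams.

        Illustrative only: a Beta(alpha, beta) posterior per stream over
        the probability of an interesting event, with a giving-up rule
        that leaves the attended "patch" when it yields less than the
        mean stream.
        """

        def __init__(self, n_streams):
            self.alpha = [1.0] * n_streams   # Beta prior: one pseudo-hit
            self.beta = [1.0] * n_streams    # and one pseudo-miss each
            self.current = 0                 # index of the attended stream

        def expected_rate(self, s):
            return self.alpha[s] / (self.alpha[s] + self.beta[s])

        def observe(self, event_detected):
            """Update the attended stream's posterior with one frame."""
            if event_detected:
                self.alpha[self.current] += 1.0
            else:
                self.beta[self.current] += 1.0

        def maybe_switch(self):
            """Leave the current stream if it yields less than average."""
            rates = [self.expected_rate(s) for s in range(len(self.alpha))]
            if rates[self.current] < sum(rates) / len(rates):
                # Thompson sampling: move to the stream whose sampled
                # rate is highest under the current posteriors.
                samples = [random.betavariate(a, b)
                           for a, b in zip(self.alpha, self.beta)]
                self.current = max(range(len(samples)),
                                   key=samples.__getitem__)
            return self.current

    # Toy run: stream 2 has the highest true event rate, so the forager
    # should come to spend most of its time there.
    random.seed(0)
    true_rates = [0.05, 0.10, 0.40, 0.08]
    forager = StreamForager(len(true_rates))
    for _ in range(500):
        forager.observe(random.random() < true_rates[forager.current])
        forager.maybe_switch()
    print("attending stream", forager.current)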

    A BIASED COMPETITION COMPUTATIONAL MODEL OF SPATIAL AND OBJECT-BASED ATTENTION MEDIATING ACTIVE VISUAL SEARCH

    A computational cognitive neuroscience approach was used to examine processes of visual attention in the human and monkey brain. The aim of the work was to produce a biologically plausible neurodynamical model of both spatial and object-based attention that accounted for observations in monkey visual areas V4, inferior temporal cortex (IT) and the lateral intraparietal area (LIP), and was able to produce search scan path behaviour similar to that observed in humans and monkeys. Of particular current interest in the visual attention literature is the biased competition hypothesis (Desimone & Duncan, 1995). The model presented here is the first active vision implementation of biased competition, in which attentional shifts are overt. Retinal inputs therefore change during the scan path, and this approach raised issues, such as memory for searched locations across saccades, not addressed by previous models with static retinas. This is the first model to examine the different time courses associated with spatial and object-based effects at the cellular level. Single cell recordings in areas V4 (Luck et al., 1997; Chelazzi et al., 2001) and IT (Chelazzi et al., 1993, 1998) were replicated such that attentional effects occurred at the appropriate time after onset of the stimulus. Object-based effects at the cellular level of the model led to systems level behaviour that replicated that observed during active visual search for orientation and colour feature conjunction targets in psychophysical investigations. This provides a valuable insight into the link between cellular and systems level behaviour in natural systems. At the systems level, the simulated search process showed selectivity in its scan path similar to that observed in humans (Scialfa & Joffe, 1998; Williams & Reingold, 2001) and monkeys (Motter & Belky, 1998b), being guided to target-coloured locations in preference to locations containing the target orientation or blank areas. A connection between the ventral and dorsal visual processing streams (Ungerleider & Mishkin, 1982) is suggested to contribute to this selectivity and priority in the featural guidance of search. Such selectivity and avoidance of blank areas has potential applications in computer vision. Simulation of lesions within the model and comparison with patient data provided further verification of the model. Simulation of visual neglect due to parietal cortical lesion suggests that the model has the capability to provide insights into the neural correlates of the conscious perception of stimuli. The biased competition approach described here provides an extendable framework within which further "bottom-up" stimulus and "top-down" mnemonic and cognitive biases can be added, in order to further examine exogenous versus endogenous factors in the capture of attention.
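
    The core of the biased competition hypothesis can be conveyed in a few lines. The sketch below is a generic rate-model illustration under stated assumptions, not the dissertation's neurodynamical model: units representing competing stimuli share a pool of inhibition, and a small top-down bias toward the attended stimulus is enough to let its response win the competition. All parameter values are arbitrary.

    import numpy as np

    # Sketch of biased competition with a pooled-inhibition rate model.
    # Not the dissertation's model; parameters are arbitrary.

    def biased_competition(bottom_up, top_down_bias, steps=200, dt=0.05,
                           inhibition=1.5, tau=1.0):
        """Settle a pool of rate units that compete via shared inhibition."""
        r = np.zeros_like(bottom_up)            # firing rates
        for _ in range(steps):
            pooled = inhibition * r.sum()       # shared competition signal
            drive = bottom_up + top_down_bias - pooled
            r += (dt / tau) * (-r + np.maximum(drive, 0.0))
        return r

    stimuli = np.array([1.0, 1.0, 1.0])   # three equally salient stimuli
    bias = np.array([0.0, 0.3, 0.0])      # top-down bias toward stimulus 1
    print(biased_competition(stimuli, bias))
    # the biased unit settles at a clearly higher rate than the others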

    Perception of the visual environment

    The eyes are the front end to the vast majority of the human behavioural repertoire. The manner in which our eyes sample the environment places fundamental constraints upon the information that is available for subsequent processing in the brain: the small window of clear vision at the centre of gaze can only be directed at an average of about three locations in the environment every second. We are largely unaware of these continual movements, making eye movements a valuable objective measure that can provide a window into the cognitive processes underlying many of our behaviours. The valuable resource of high-quality vision must be allocated with care in order to provide the right information at the right time for the behaviours we engage in. However, the mechanisms that underlie the decisions about where and when to move the eyes remain to be fully understood. In this chapter I consider what has been learnt about targeting the eyes in a range of different experimental paradigms, from simple stimulus arrays of only a few isolated targets, to complex arrays and photographs of real environments, and finally to natural task settings. Much has been learnt about how we view photographs, and current models incorporate low-level image salience, motor biases that favour certain ways of moving the eyes, higher-level expectations of what objects look like, and expectations about where we will find objects in a scene. Finally, I consider the fate of information that has received overt visual attention. While much of the detailed information from what we look at is lost, some remains; yet what we retain, and the factors that govern what is remembered and what is forgotten, are not well understood. It appears that our expectations about what we will need to know later in a task are important in determining what we represent and retain in visual memory, and that our representations are shaped by the interactions we engage in with objects.
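
    The chapter's description of current scene-viewing models, low-level salience combined with motor biases such as a preference for central locations, can be given a minimal sketch. The blending weight, the Gaussian centre bias, and the argmax target selection below are assumptions for illustration, not a model proposed in the chapter.

    import numpy as np

    # Sketch: blend a bottom-up salience map with a Gaussian centre bias
    # (a simple stand-in for the motor biases mentioned above) and take
    # the maximum as the next fixation target. Weights are illustrative.

    def next_fixation(salience, sigma_frac=0.25, weight=0.5):
        h, w = salience.shape
        ys, xs = np.mgrid[0:h, 0:w]
        sigma = sigma_frac * min(h, w)
        centre_bias = np.exp(-((ys - h / 2) ** 2 + (xs - w / 2) ** 2)
                             / (2 * sigma ** 2))
        combined = ((1 - weight) * salience / salience.max()
                    + weight * centre_bias)
        return np.unravel_index(np.argmax(combined), combined.shape)

    rng = np.random.default_rng(0)
    sal = rng.random((60, 80))        # stand-in for a real salience map
    print(next_fixation(sal))         # (row, col) of the chosen target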

    VISUAL SALIENCY ANALYSIS, PREDICTION, AND VISUALIZATION: A DEEP LEARNING PERSPECTIVE

    In recent years, great progress has been made in the prediction of human eye fixations. Several studies have employed deep learning to achieve high prediction accuracy, relying on networks pre-trained for object classification: they either treat fixation prediction as a transfer-learning problem or use the weights of the pre-trained network as the initialization for learning a saliency model. The use of such pre-trained networks is due to the relatively small datasets of human fixations available for training a deep learning model. Another, less often addressed, problem is that the amount of computation such deep learning models require demands expensive hardware. In this dissertation, two approaches are proposed to tackle these problems. The first approach, codenamed DeepFeat, incorporates the deep features of convolutional neural networks pre-trained for object and scene classification. It is the first approach to use deep features without further learning. Performance of the DeepFeat model is extensively evaluated over a variety of datasets using a variety of implementations. The second approach is a deep learning saliency model, codenamed ClassNet. Two main differences separate ClassNet from other deep learning saliency models: it is the only deep learning saliency model that learns its weights from scratch, and it treats the prediction of human fixations as a classification problem, whereas other deep learning saliency models treat it as a regression problem or as a classification of a regression problem.
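
    A minimal sketch of the DeepFeat idea as summarized above: read activations from a network pre-trained for classification and combine them into a saliency map with no further learning. The choice of VGG-16, the layer cut-off, and the simple channel-average combination are assumptions for illustration, not the dissertation's exact recipe (running the sketch downloads the pre-trained VGG-16 weights).

    import torch
    import torch.nn.functional as F
    import torchvision.models as models
    import torchvision.transforms.functional as TF

    # Sketch of "deep features without further learning": average the
    # channels of a mid-level VGG-16 activation and upsample the result
    # into a saliency map. Layer choice and combination rule are assumed.

    def deepfeat_saliency(image):       # image: float tensor, (3, H, W)
        backbone = models.vgg16(
            weights=models.VGG16_Weights.DEFAULT).features.eval()
        x = TF.normalize(image, mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]).unsqueeze(0)
        with torch.no_grad():
            feats = backbone[:23](x)    # up to conv4_3 + ReLU: (1,512,h,w)
        sal = feats.squeeze(0).mean(dim=0)            # channel average
        sal = (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)
        return F.interpolate(sal[None, None], size=image.shape[1:],
                             mode="bilinear",
                             align_corners=False).squeeze()

    saliency_map = deepfeat_saliency(torch.rand(3, 224, 224))
    print(saliency_map.shape)           # torch.Size([224, 224])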

    Salience Models: A Computational Cognitive Neuroscience Review

    The seminal model by Laurent Itti and Christof Koch demonstrated that we can compute the entire flow of visual processing from input to resulting fixations. Despite many replications and follow-ups, few have matched the impact of the original model, so what made it so groundbreaking? We have selected five key contributions that distinguish the original salience model by Itti and Koch; namely, its contributions to our theoretical, neural, and computational understanding of visual processing, as well as its spatial and temporal predictions for fixation distributions. Over the last 20 years, advances in the field have produced various techniques and approaches to salience modelling, many of which tried to improve on or add to the initial Itti and Koch model. One of the most recent trends has been to adopt the computational power of deep neural networks; however, this has also shifted the models' primary focus to spatial classification. We present a review of recent approaches to modelling salience, starting from direct variations of the Itti and Koch salience model and moving to sophisticated deep-learning architectures, and discuss the models from the point of view of their contribution to computational cognitive neuroscience.
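
    To make the centre-surround principle at the heart of the Itti and Koch model concrete, the sketch below computes feature contrast on a single intensity channel as differences of Gaussian blurs at two scales. The full model additionally builds colour-opponency and orientation pyramids and adds winner-take-all dynamics with inhibition of return; the scales and normalisation here are simplified assumptions.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    # Sketch of centre-surround contrast on one intensity channel, the
    # core operation of the Itti and Koch model. Scales are illustrative.

    def center_surround_saliency(img, center_sigmas=(1, 2), delta=3):
        maps = []
        for c in center_sigmas:
            center = gaussian_filter(img, c)
            surround = gaussian_filter(img, c * delta)
            maps.append(np.abs(center - surround))     # feature contrast
        # normalise each map to [0, 1] before combining across scales
        maps = [(m - m.min()) / (np.ptp(m) + 1e-8) for m in maps]
        return sum(maps) / len(maps)

    rng = np.random.default_rng(1)
    image = rng.random((64, 64))
    image[20:28, 30:38] += 2.0                 # a bright, salient patch
    sal = center_surround_saliency(image)
    print(np.unravel_index(sal.argmax(), sal.shape))
    # the maximum falls at or near the bright patch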