
    Multi-feature Bottom-up Processing and Top-down Selection for an Object-based Visual Attention Model

    Artificial vision systems cannot process all the information they receive from the world in real time, because doing so would be prohibitively expensive and inefficient in computational terms. However, inspired by biological perception systems, it is possible to develop an artificial attention model that selects only the relevant parts of the scene, as human vision does. This paper presents an attention model that directs attention to perceptual units of visual information, called proto-objects, and uses a linear combination of multiple low-level features (such as colour, symmetry or shape) to calculate the saliency of each of them. The model addresses not only bottom-up processing but also the top-down component of attention: it is shown how a high-level task can modulate the global saliency computation by modifying the weights of the linear feature combination. Funding: Ministerio de Economía y Competitividad (MINECO), projects TIN2008-06196 and TIN2012-38079-C03-03; Campus de Excelencia Internacional Andalucía Tech.
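
    As a rough illustration of the scheme the abstract describes, the sketch below scores proto-objects by a weighted linear combination of low-level features and lets a high-level task re-weight that combination; the feature set, weight values and ProtoObject structure are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: per-proto-object saliency as a weighted linear
# combination of low-level features, with task-dependent re-weighting.
from dataclasses import dataclass

@dataclass
class ProtoObject:
    # Normalised low-level feature values in [0, 1] (assumed features).
    colour: float
    symmetry: float
    shape: float

# Default bottom-up weights (assumed equal here).
BOTTOM_UP = {"colour": 1/3, "symmetry": 1/3, "shape": 1/3}
# A hypothetical task bias, e.g. "find the red mug" boosts colour.
TASK_BIAS = {"colour": 0.6, "symmetry": 0.2, "shape": 0.2}

def saliency(obj: ProtoObject, weights: dict) -> float:
    """Linear combination of this object's feature values."""
    return sum(weights[f] * getattr(obj, f) for f in weights)

objs = [ProtoObject(0.9, 0.2, 0.4), ProtoObject(0.3, 0.8, 0.7)]
# The bottom-up winner and the task-modulated winner can differ:
print(max(objs, key=lambda o: saliency(o, BOTTOM_UP)))
print(max(objs, key=lambda o: saliency(o, TASK_BIAS)))
```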

    Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy

    In this paper we consider the problem of deploying attention across subsets of video streams so as to collate the data and information most relevant to a given task. We formalize this monitoring problem as a foraging problem and propose a probabilistic framework that models the observer's attentive behavior as that of a forager. From moment to moment, the forager focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The proposed approach is suitable for multi-stream video summarization and can also serve as a preliminary step for more sophisticated video surveillance, e.g. activity and behavior analysis. Experimental results on the UCR Videoweb Activities Dataset, a publicly available dataset, illustrate the utility of the proposed technique. Comment: accepted to IEEE Transactions on Image Processing.
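
    The sketch below illustrates the foraging idea in minimal form: attend the currently most profitable stream and switch when its estimated gain drops below the average across streams (a marginal-value-style giving-up rule). The gain estimate and the toy event model are assumptions for illustration, not the paper's Bayesian framework.

```python
# Hedged sketch of foraging-style stream selection over video streams.
import random

def expected_gain(stats):
    # Toy estimate: rate of interesting detections per observed frame.
    return stats["detections"] / max(stats["frames"], 1)

def step(streams, current):
    stats = streams[current]
    stats["frames"] += 1
    stats["detections"] += random.random() < stats["p_event"]  # observe
    gains = {k: expected_gain(v) for k, v in streams.items()}
    mean_gain = sum(gains.values()) / len(gains)
    # Give up on the current "patch" when it pays less than average.
    if gains[current] < mean_gain:
        current = max(gains, key=gains.get)
    return current

# Three hypothetical streams with different event probabilities.
streams = {i: {"frames": 1, "detections": 0, "p_event": p}
           for i, p in enumerate([0.05, 0.2, 0.5])}
cam = 0
for _ in range(200):
    cam = step(streams, cam)
print("settled on stream", cam)
```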

    From perception to action and vice versa: a new architecture showing how perception and action can modulate each other simultaneously

    Presented at the 6th European Conference on Mobile Robots (ECMR), Sep 25-27, 2013, Barcelona, Spain. Artificial vision systems cannot process all the information they receive from the world in real time, because doing so would be prohibitively expensive and inefficient in computational terms. However, inspired by biological perception systems, it is possible to develop an artificial attention model that selects only the relevant parts of the scene, as human vision does. From the Automated Planning point of view, a relevant area can be seen as an area where the objects involved in the execution of a plan are located. Thus, the planning system should guide the attention model to track relevant objects. At the same time, the perceived objects may constrain the plan or provide new information suggesting that the current plan be modified. Therefore, a plan that is being executed should be adapted or recomputed to take into account the information actually perceived from the world. In this work, we introduce an architecture that creates a symbiosis between the planning and attention modules of a robotic system, linking visual features with high-level behaviours. The architecture is based on the interaction of an oversubscription planner, which produces plans constrained by the information perceived by the vision system, and an object-based attention system able to focus on the objects relevant to the plan being executed. Funding: Spanish MINECO projects TIN2008-06196, TIN2012-38079-C03-03 and TIN2012-38079-C03-02; Universidad de Málaga; Campus de Excelencia Internacional Andalucía Tech.
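
    A minimal sketch of the loop this architecture implies is given below: the planner's current step tells attention which object to track, and a missing object triggers a replan over what is actually visible. All names (make_plan, attend) and the toy replanning rule are hypothetical placeholders, not the actual planner or attention system.

```python
# Hedged sketch of a planning/attention symbiosis on a toy world.
from collections import deque

def make_plan(goal_objects):
    # Hypothetical "planner": one pick-up step per goal object.
    return deque(("pick_up", obj) for obj in goal_objects)

def attend(world, obj):
    # Hypothetical object-based attention: is the object visible?
    return obj in world

def run(goal_objects, world):
    plan = make_plan(goal_objects)
    while plan:
        action, obj = plan[0]
        if not attend(world, obj):
            # Percept contradicts the plan: replan over the remaining,
            # actually-visible goal objects.
            plan = make_plan(o for _, o in plan if o in world)
            continue
        plan.popleft()
        print(action, obj)

run(["cup", "book"], world={"cup"})   # "book" forces a replan
```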

    Real-time Neuromorphic Visual Pre-Processing and Dynamic Saliency

    The human brain is by far the most computationally complex, efficient, and reliable computing system operating under such low-power, small-size, and light-weight constraints. Within the field of neuromorphic engineering, we seek to design systems that mimic the human brain as a means of reaching these desirable properties. This doctoral work focuses on vision, specifically visual saliency and related visual tasks, with bio-inspired, real-time processing. The human visual system, from the retina through the visual cortical hierarchy, is responsible for extracting and processing visual information, forming our visual perception. This information is transmitted through the layers of the visual system via spikes (action potentials), representing information in the temporal domain. Our objective is to exploit this neural communication protocol and functionality within the systems we design. This approach is essential for the advancement of autonomous mobile agents (e.g. drones/MAVs, cars), which must perform visual tasks under size and power constraints for which traditional CPU or GPU implementations do not suffice. Although the high-level objective is a complete visual processor with direct physical and functional correlates to the human visual system, we focus on three specific tasks.

    The first focus of this thesis is the integration of motion into a biologically plausible proto-object-based visual saliency model. Laurent Itti, one of the pioneers of the field, defines visual saliency as "the distinct subjective perceptual quality which makes some items in the world stand out from their neighbors and immediately grab our attention." From humans to insects, visual saliency is important for extracting only the interesting regions of visual stimuli for further processing. Prior to this doctoral work, Russell et al. (2014) designed a model of proto-object-based visual saliency with biological correlates, but only for static images. Motion, however, is a naturally occurring phenomenon that plays an essential role in both human and animal visual processing, so an ideal saliency model should also consider motion exhibited within the visual scene. This work describes a novel dynamic proto-object-based visual saliency model that extends the Russell et al. model to consider not only static but also temporal information. The model was validated with metrics measuring how accurately it predicts human eye fixations and saccades on a public dataset of videos with eye-tracking data, and it outperformed other state-of-the-art models in computing dynamic visual saliency. A model that accurately predicts where humans look can serve as a front-end to visual processors performing tasks such as object detection, recognition, or tracking, reducing the data passed downstream and increasing processing speed; it also has more direct applications in artificial intelligence in mimicking the functionality of the human visual system.

    The second focus is the implementation of this saliency model on an FPGA (Field Programmable Gate Array) for real-time processing. The model was initially designed in MATLAB, a software approach running on a CPU, which limits processing speed and consumes unnecessary power in overhead; this is detrimental for integration with an autonomous mobile system that must operate in real time. The FPGA implementation allows a low-power, high-speed approach to computing visual saliency. A few FPGA-based implementations of visual saliency exist, but none are based on the notion of proto-objects; this work presents, to our knowledge, the first FPGA implementation of an object-based visual saliency model. Such an implementation meets the low-power, light-weight, and small-size specifications sought in neuromorphic engineering. The FPGA model is validated against the software model using the same eye fixation and saccade prediction metrics.

    The third focus is the design of a generic neuromorphic platform, on both FPGA and VLSI (Very-Large-Scale Integration) technology, for performing visual tasks, including those needed to compute visual saliency. Visual processing tasks such as image filtering and image dewarping are demonstrated on this platform, which consists of an array of hardware-based generalized integrate-and-fire neurons and allows the saliency model's computation to be offloaded onto hardware. We first demonstrate an FPGA emulation of the system, showing its dewarping and filtering capabilities as well as its integration with a neuromorphic camera, the ATIS (Asynchronous Time-based Image Sensor). We then demonstrate the platform implemented in CMOS technology, specifically designed for low mismatch, high density, and low power. Such a VLSI platform further bridges the gap between engineering and biology and moves us closer to a complete neuromorphic visual processor.
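
    As a toy illustration of the generalized integrate-and-fire neurons underlying the platform described above, the sketch below simulates a single leaky integrate-and-fire unit; the parameter values and the simple Euler update are assumptions for illustration, not the dissertation's hardware design.

```python
# Hedged sketch: one leaky integrate-and-fire (LIF) neuron.
import numpy as np

def lif(input_current, dt=1e-3, tau=20e-3, v_rest=0.0,
        v_thresh=1.0, v_reset=0.0, r=1.0):
    """Simulate one LIF neuron; return membrane trace and spike times."""
    v, trace, spikes = v_rest, [], []
    for t, i_in in enumerate(input_current):
        # Leaky integration: dv/dt = (-(v - v_rest) + R*I) / tau
        v += dt * (-(v - v_rest) + r * i_in) / tau
        if v >= v_thresh:           # threshold crossing -> spike
            spikes.append(t * dt)
            v = v_reset             # reset after the spike
        trace.append(v)
    return np.array(trace), spikes

_, spike_times = lif(np.full(500, 1.5))  # constant suprathreshold drive
print(len(spike_times), "spikes in 0.5 s")
```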

    A computer vision model for visual-object-based attention and eye movements

    Post-print version of a paper published in Computer Vision and Image Understanding. Copyright © 2008 Elsevier B.V. This paper presents a new computational framework for modelling visual-object-based attention and attention-driven eye movements within an integrated system, in a biologically inspired approach. Attention operates at multiple levels of visual selection, by space, feature, object and group, depending on the nature of the targets and the visual task. Attentional shifts and gaze shifts are built upon common processing circuits and control mechanisms but are distinguished by their different functional roles, working together to fulfil flexible visual selection tasks in complicated visual environments. The framework integrates important aspects of human visual attention and eye movements, resulting in sophisticated performance in complicated natural scenes. The proposed approach aims to provide a useful visual selection system for computer vision, especially in cluttered natural visual environments. Funding: National Natural Science Foundation of China.
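
    One standard mechanism in models of this family is winner-take-all selection with inhibition of return, which yields a sequence of attention-driven gaze shifts; the sketch below shows it in minimal form. The random saliency map, suppression radius and fixation count are illustrative assumptions, not the paper's exact mechanism.

```python
# Hedged sketch: winner-take-all fixations with inhibition of return.
import numpy as np

def scanpath(saliency, n_fixations=5, radius=3):
    sal = saliency.copy()
    path = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(sal.argmax(), sal.shape)  # winner-take-all
        path.append((y, x))
        # Inhibition of return: suppress the attended neighbourhood
        # so the next fixation moves elsewhere.
        sal[max(y - radius, 0):y + radius + 1,
            max(x - radius, 0):x + radius + 1] = 0
    return path

print(scanpath(np.random.rand(32, 32)))
```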

    Identification of Invariant Sensorimotor Structures as a Prerequisite for the Discovery of Objects

    Perceiving the surrounding environment in terms of objects is useful for any general-purpose intelligent agent. In this paper, we investigate a fundamental mechanism that makes object perception possible: the identification of spatio-temporally invariant structures in the sensorimotor experience of an agent. Taking inspiration from the Sensorimotor Contingencies Theory, we define a computational model of this mechanism through a sensorimotor, unsupervised and predictive approach, based on processing the unsupervised interaction of an artificial agent with its environment. We show how spatio-temporally invariant structures in the environment induce regularities in the agent's sensorimotor experience, and how the agent, while building a predictive model of that experience, can capture them as densely connected subgraphs in a graph of sensory states connected by motor commands. The approach focuses on elementary mechanisms and is illustrated with a set of simple experiments in which an agent interacts with an environment. We show how the agent can build an internal model of moving but spatio-temporally invariant structures by performing Spectral Clustering on the graph modelling its overall sensorimotor experience. We systematically examine the properties of the model, shedding light more broadly on how this paradigm differs from methods based on supervised processing of collections of static images. Comment: 24 pages, 10 figures, published in Frontiers in Robotics and AI.
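
    The clustering step described above can be sketched concretely: treat sensory states as graph nodes, motor-command transitions as weighted edges, and recover densely connected subgraphs by spectral clustering. The toy graph and the use of scikit-learn below are assumptions for illustration, not the paper's exact pipeline.

```python
# Hedged sketch: spectral clustering of a toy sensorimotor graph.
import numpy as np
from sklearn.cluster import SpectralClustering

n = 12
adj = np.zeros((n, n))
# Two dense blocks of sensory states (e.g. two persistent structures),
# plus one weak cross link from exploratory motor commands.
adj[:6, :6] = 1
adj[6:, 6:] = 1
adj[0, 6] = adj[6, 0] = 0.1
np.fill_diagonal(adj, 0)

labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(adj)
print(labels)   # the two subgraphs emerge as two "object" candidates
```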

    Seeing, Sensing, and Scrutinizing

    Large changes in a scene often become difficult to notice if made during an eye movement, image flicker, movie cut, or other such disturbance. It is argued here that this "change blindness" can serve as a useful tool to explore various aspects of vision. This argument centers on the proposal that focused attention is needed for the explicit perception of change. Given this, the study of change perception provides a useful way to determine the nature of visual attention, and to cast new light on the ways in which it is, and is not, involved in visual perception. To illustrate the power of this approach, the paper surveys its use in exploring three aspects of vision. The first concerns the general nature of "seeing": to explain why change blindness is easily induced in experiments but apparently not in everyday life, it is proposed that perception involves a "virtual representation", in which object representations do not accumulate but are formed as needed; an architecture containing both attentional and nonattentional streams is proposed as a way to implement this scheme. The second concerns the ability of observers to detect change even without a visual experience of it. This "sensing" takes at least two forms: detection without visual experience (but still with conscious awareness), and detection without any awareness at all; both are attributed to the operation of a nonattentional visual stream. The third concerns the nature of visual attention itself, i.e. the mechanisms engaged when "scrutinizing" items: experiments using controlled stimuli show various limits on visual search for change, and these limits provide a powerful means of mapping out the attentional mechanisms involved.
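
    The flicker paradigm implied by "image flicker" above can be sketched as a simple stimulus loop: alternate original and modified images with blank gaps until the observer responds. The timings below (240 ms image, 80 ms blank) are common values assumed for illustration, and the display/response callbacks are stubs to be replaced by a real experiment backend.

```python
# Hedged sketch of a flicker-paradigm trial loop.
import itertools
import time

def flicker_trial(original, modified, show, key_pressed,
                  image_ms=240, blank_ms=80):
    # Cycle: original, blank, modified, blank, ... until a response.
    for frame in itertools.cycle([original, None, modified, None]):
        show(frame)                      # None means a blank field
        time.sleep((image_ms if frame is not None else blank_ms) / 1000)
        if key_pressed():                # observer reports the change
            return True

# Smoke test with stub callbacks (replace with a real display backend):
frames = []
print(flicker_trial("A", "B", show=frames.append,
                    key_pressed=lambda: len(frames) >= 8,
                    image_ms=1, blank_ms=1))
```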