Search CORE

Crowdsourcing in Computer Vision

Author: Fei-Fei Li
Grauman Kristen
Kovashka Adriana
Russakovsky Olga
Publication venue: 'Now Publishers'
Publication date: 01/01/2016

Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding for a vast number of visual perception tasks. In this survey, we describe the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. We begin by discussing data collection on both classic (e.g., object recognition) and recent (e.g., visual story-telling) vision tasks. We then summarize key design decisions for creating effective data collection interfaces and workflows, and present strategies for intelligently selecting the most important data instances to annotate. Finally, we conclude with some thoughts on the future of crowdsourcing in computer vision. Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201

arXiv.org e-Print Archive

Crossref

Towards Active Event Recognition

Author: Demiris Y
Ognibene D
Publication venue: AAAI Press
Publication date: 31/08/2013

Directing robot attention to recognise activities and to anticipate events like goal-directed actions is a crucial skill for human-robot interaction. Unfortunately, issues like intrinsic time constraints, the spatially distributed nature of the entailed information sources, and the existence of a multitude of unobservable states affecting the system, like latent intentions, have long rendered achievement of such skills a rather elusive goal. The problem tests the limits of current attention control systems. It requires an integrated solution for tracking, exploration and recognition, which traditionally have been seen as separate problems in active vision. We propose a probabilistic generative framework based on a mixture of Kalman filters and information gain maximisation that uses predictions in both recognition and attention-control. This framework can efficiently use the observations of one element in a dynamic environment to provide information on other elements, and consequently enables guided exploration. Interestingly, the sensors-control policy, directly derived from first principles, represents the intuitive trade-off between finding the most discriminative clues and maintaining overall awareness. Experiments on a simulated humanoid robot observing a human executing goal-oriented actions demonstrated improvement on recognition time and precision over baseline systems.
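The information-gain idea in this abstract can be illustrated with a small sketch. This is not the authors' framework: it uses one independent Kalman-style Gaussian belief per element rather than a mixture, and the names `expected_info_gain`, `covs`, `"hand"` and `"cup"` are hypothetical. It only shows how a sensor could be pointed at the element whose observation is expected to reduce uncertainty the most.

```python
import numpy as np

def expected_info_gain(P, H, R):
    """Expected entropy reduction (in nats) of a Gaussian belief with
    covariance P after a linear observation z = H x + noise, noise ~ N(0, R)."""
    S = H @ P @ H.T + R                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    P_post = (np.eye(P.shape[0]) - K @ H) @ P
    # Gaussian entropy difference: 0.5 * (log det P - log det P_post)
    return 0.5 * (np.linalg.slogdet(P)[1] - np.linalg.slogdet(P_post)[1])

# Assumed toy setup: the sensor observes position directly with small noise.
H = np.eye(2)
R = 0.1 * np.eye(2)

# Hypothetical current beliefs: the hand is poorly localised, the cup is well known.
covs = {
    "hand": np.diag([4.0, 4.0]),
    "cup": np.diag([0.2, 0.2]),
}

gains = {name: expected_info_gain(P, H, R) for name, P in covs.items()}
look_at = max(gains, key=gains.get)          # element with the largest expected gain
```

In this toy case the more uncertain element is selected; a fuller treatment would also account for how observing one element informs beliefs about the others, as the abstract describes.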

Spiral - Imperial College Digital Repository

How active perception and attractor dynamics shape perceptual categorization: A computational model

Author: Giovanni Pezzulo
Jean Charles Quinton
Nicola Catenacci Volpi
Publication venue: 'Elsevier BV'
Publication date: 23/07/2014

We propose a computational model of perceptual categorization that fuses elements of grounded and sensorimotor theories of cognition with dynamic models of decision-making. We assume that category information consists in anticipated patterns of agent–environment interactions that can be elicited through overt or covert (simulated) eye movements, object manipulation, etc. This information is first encoded when category information is acquired, and then re-enacted during perceptual categorization. The perceptual categorization consists in a dynamic competition between attractors that encode the sensorimotor patterns typical of each category; action prediction success counts as “evidence” for a given category and contributes to falling into the corresponding attractor. The evidence accumulation process is guided by an active perception loop, and the active exploration of objects (e.g., visual exploration) aims at eliciting expected sensorimotor patterns that count as evidence for the object category. We present a computational model incorporating these elements and describing action prediction, active perception, and attractor dynamics as key elements of perceptual categorizations. We test the model in three simulated perceptual categorization tasks, and we discuss its relevance for grounded and sensorimotor theories of cognition. Peer reviewed

Crossref

HAL Clermont Université

University of Hertfordshire Research Archive

The Cat Is On the Mat. Or Is It a Dog? Dynamic Competition in Perceptual Decision Making

Author: Barca Laura
Catenacci Volpi Nicola
Pezzulo Giovanni
Quinton Jean Charles
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/09/2013

Recent neurobiological findings suggest that the brain solves simple perceptual decision-making tasks by means of a dynamic competition in which evidence is accumulated in favor of the alternatives. However, it is unclear if and how the same process applies in more complex, real-world tasks, such as the categorization of ambiguous visual scenes, and what elements are considered as evidence in this case. Furthermore, dynamic decision models typically consider evidence accumulation as a passive process, disregarding the role of active perception strategies. In this paper, we adopt the principles of dynamic competition and active vision for the realization of a biologically motivated computational model, which we test in a visual categorization task. Moreover, our system uses predictive power of the features as the main dimension for both evidence accumulation and the guidance of active vision. Comparison of human and synthetic data in a common experimental setup suggests that the proposed model captures essential aspects of how the brain solves perceptual ambiguities in time. Our results point to the importance of the proposed principles of dynamic competition, parallel specification, and selection of multiple alternatives through prediction, as well as active guidance of perceptual strategies for perceptual decision-making and the resolution of perceptual ambiguities. These principles could apply to both the simple perceptual decision problems studied in neuroscience and the more complex ones addressed by vision research. Peer reviewed
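The notion of dynamic competition through evidence accumulation can be sketched with a generic, noise-free leaky competing accumulator. This is an illustrative stand-in, not the model proposed in the paper: the function `race`, its parameters, and the input values are all assumptions chosen for the example.

```python
def race(inputs, leak=0.1, inhibition=0.2, dt=0.1, threshold=2.0, max_steps=1000):
    """Deterministic leaky competing accumulators (Euler integration).

    Each alternative accumulates its own input, decays through leakage, and is
    suppressed by the summed activity of the other alternatives. Returns
    (index of the first accumulator to reach threshold, step count), or
    (None, max_steps) if no accumulator reaches it.
    """
    x = [0.0] * len(inputs)
    for step in range(1, max_steps + 1):
        total = sum(x)
        new_x = []
        for xi, inp in zip(x, inputs):
            others = total - xi                      # activity of competitors
            dx = inp - leak * xi - inhibition * others
            new_x.append(max(0.0, xi + dt * dx))     # activity cannot go negative
        x = new_x
        best = max(range(len(x)), key=lambda i: x[i])
        if x[best] >= threshold:
            return best, step
    return None, max_steps

# Hypothetical evidence strengths for two categories (e.g. "cat" vs "dog").
winner, steps = race([1.0, 0.8])
```

With these assumed inputs the first alternative wins because it receives slightly stronger evidence; adding noise or making the inputs depend on where the system looks would move the sketch closer to the active-perception setting the abstract describes.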

Crossref

HAL Clermont Université

University of Hertfordshire Research Archive

Towards Contextual Action Recognition and Target Localization with Active Allocation of Attention

Author: D. Marr
D.H. Ballard
G.C.H.E. Croon de
J. Schmidhuber
J.J. Heisz
K. Kastella
M. Suzuki
M.F. Land
R. Bajcsy
U. Sailer
Y. Demiris
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Exploratory gaze movements are fundamental for gathering the most relevant information regarding the partner during social interactions. We have designed and implemented a system for dynamic attention allocation which is able to actively control gaze movements during a visual action recognition task. During the observation of a partner's reaching movement, the robot is able to contextually estimate the goal position of the partner's hand and the location in space of the candidate targets, while moving its gaze around with the purpose of optimizing the gathering of information relevant for the task. Experimental results on a simulated environment show that active gaze control provides a relevant advantage with respect to typical passive observation, both in terms of estimation precision and of time required for action recognition. © 2012 Springer-Verlag

Crossref

Spiral - Imperial College Digital Repository

Attention and Anticipation in Fast Visual-Inertial Navigation

Author: Carlone Luca
Karaman Sertac
Publication date: 22/03/2018

We study a Visual-Inertial Navigation (VIN) problem in which a robot needs to estimate its state using an on-board camera and an inertial sensor, without any prior knowledge of the external environment. We consider the case in which the robot can allocate limited resources to VIN, due to tight computational constraints. Therefore, we answer the following question: under limited resources, what are the most relevant visual cues to maximize the performance of visual-inertial navigation? Our approach has four key ingredients. First, it is task-driven, in that the selection of the visual cues is guided by a metric quantifying the VIN performance. Second, it exploits the notion of anticipation, since it uses a simplified model for forward-simulation of robot dynamics, predicting the utility of a set of visual cues over a future time horizon. Third, it is efficient and easy to implement, since it leads to a greedy algorithm for the selection of the most relevant visual cues. Fourth, it provides formal performance guarantees: we leverage submodularity to prove that the greedy selection cannot be far from the optimal (combinatorial) selection. Simulations and real experiments on agile drones show that our approach ensures state-of-the-art VIN performance while maintaining a lean processing time. In the easy scenarios, our approach outperforms appearance-based feature selection in terms of localization errors. In the most challenging scenarios, it enables accurate visual-inertial navigation while appearance-based feature selection fails to track the robot's motion during aggressive maneuvers. Comment: 20 pages, 7 figures, 2 tables
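The greedy selection mentioned in this abstract can be sketched generically as follows. The abstract does not state the performance metric, so this example assumes a log-determinant of an accumulated information matrix as the objective; `greedy_select`, `info_mats` and `prior` are hypothetical names, and the toy matrices are made up for illustration.

```python
import numpy as np

def greedy_select(info_mats, prior, k):
    """Greedily pick up to k candidates that maximise
    log det(prior + sum of the selected information matrices)."""
    selected = []
    total = prior.copy()
    remaining = set(range(len(info_mats)))
    for _ in range(min(k, len(info_mats))):
        best, best_val = None, -np.inf
        for i in sorted(remaining):
            # Objective value if candidate i were added to the current set.
            val = np.linalg.slogdet(total + info_mats[i])[1]
            if val > best_val:
                best, best_val = i, val
        selected.append(best)
        total = total + info_mats[best]
        remaining.remove(best)
    return selected, total

# Assumed toy problem: a 2-D state and three candidate cues.
e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
info_mats = [
    9.0 * (e1 @ e1.T),   # strong cue about the first axis
    1.0 * (e1 @ e1.T),   # weak, redundant cue about the first axis
    4.0 * (e2 @ e2.T),   # medium cue about the second axis
]
prior = np.eye(2)

selected, total = greedy_select(info_mats, prior, k=2)
```

Here the second pick is the cue about the unobserved axis rather than the redundant one, which is the diminishing-returns behaviour that submodularity formalises. For monotone submodular objectives, greedy selection of this kind carries a known constant-factor approximation guarantee.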

arXiv.org e-Print Archive

DSpace@MIT

Crossref