248,859 research outputs found

Bridging Between Computer and Robot Vision Through Data Augmentation: A Case Study on Object Recognition

Author: Caputo Barbara
Carlucci Fabio Maria
Colosi Mirco
D'Innocente Antonio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Despite the impressive progress brought by deep networks in visual object recognition, robot vision is still far from being a solved problem. The most successful convolutional architectures are developed starting from ImageNet, a large-scale collection of images of object categories downloaded from the Web. Such images are very different from the situated and embodied visual experience of robots deployed in unconstrained settings. To reduce the gap between these two visual experiences, this paper proposes a simple yet effective data augmentation layer that zooms in on the object of interest and simulates the object detection outcome of a robot vision system. The layer, which can be used with any convolutional deep architecture, yields an increase in object recognition performance of up to 7% in experiments performed over three different benchmark databases. An implementation of our robot data augmentation layer has been made publicly available.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

Author: Ardila Diego
Cadieu Charles F.
DiCarlo James J.
Hong Ha
Majaj Najib J.
Pinto Nicolas
Solomon Ethan A.
Yamins Daniel L. K.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 12/06/2014

The primate visual system achieves remarkable visual object recognition performance even in brief presentations and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. To accurately produce such a comparison, a major difficulty has been a unifying metric that accounts for experimental limitations such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations such as the complexity of the decoding classifier and the number of classifier training examples. In this work we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of "kernel analysis" that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds. Comment: 35 pages, 12 figures, extends and expands upon arXiv:1301.353

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

FigShare

Visual object recognition and tracking

Author: Chang Chu-Yin
English James D.
Tardella Neil M.
Publication date: 16/03/2010

This invention describes a method for identifying and tracking an object from two-dimensional data pictorially representing said object. An object-tracking system processes said two-dimensional data using at least one tracker-identifier belonging to the system and provides an output signal containing: a) a type of the object, and/or b) a position or an orientation of the object in three dimensions, and/or c) an articulation or a shape change of said object in said three dimensions.