Integrating a Non-Uniformly Sampled Software Retina with a Deep CNN Model
We present a biologically inspired method for pre-processing images applied to CNNs
that reduces their memory requirements while increasing their invariance to scale and rotation
changes. Our method is based on the mammalian retino-cortical transform: a
mapping between a pseudo-randomly tessellated retina model (used to sample an input
image) and a CNN. The aim of this first pilot study is to
demonstrate a functional retina-integrated CNN implementation,
which produced the following results: a network using
the full retino-cortical transform yielded an F1 score of 0.80 on a test set during a 4-way
classification task, while an identical network not using the proposed method yielded an
F1 score of 0.86 on the same task. The method reduced the visual data by a factor of 7, the input
data to the CNN by 40%, and the number of CNN training epochs by 64%. These results
demonstrate the viability of our method and hint at the potential of exploiting functional
traits of natural vision systems in CNNs.
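The retino-cortical transform described above can be illustrated with a minimal log-polar sampling sketch. This is not the authors' pseudo-randomly tessellated retina model; it is a simplified stand-in assuming a plain log-polar grid, and `n_rings`, `n_sectors` and the 128×128 test image are arbitrary illustrative choices:

```python
import numpy as np

def retina_sample(image, n_rings=32, n_sectors=64):
    """Sample a grayscale image on a log-polar grid: receptive fields
    grow with eccentricity, so the fixation point is sampled densely
    and the periphery coarsely, mimicking the mammalian retina."""
    h, w = image.shape
    cy, cx = h / 2.0, w / 2.0
    max_r = min(cy, cx) - 1
    # Exponentially spaced ring radii: dense near the fovea, sparse outside.
    radii = max_r * (np.logspace(0, 1, n_rings) - 1) / 9.0
    angles = np.linspace(0, 2 * np.pi, n_sectors, endpoint=False)
    ys = (cy + np.outer(radii, np.sin(angles))).round().astype(int)
    xs = (cx + np.outer(radii, np.cos(angles))).round().astype(int)
    # The (rings x sectors) array is the "cortical" image fed to a CNN.
    return image[ys.clip(0, h - 1), xs.clip(0, w - 1)]

img = np.arange(128 * 128, dtype=float).reshape(128, 128)
cortical = retina_sample(img)
print(cortical.shape)            # (32, 64)
print(cortical.size / img.size)  # 0.125: an eighth of the raw pixels
```

The data reduction falls out of the sampling geometry: the cortical image has a fixed rings × sectors size regardless of input resolution.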
Efficient Egocentric Visual Perception Combining Eye-tracking, a Software Retina and Deep Learning
We present ongoing work to harness biological approaches to achieving highly
efficient egocentric perception by combining the space-variant imaging
architecture of the mammalian retina with Deep Learning methods. By
pre-processing images collected by means of eye-tracking glasses to control the
fixation locations of a software retina model, we demonstrate that we can
reduce the input to a DCNN by a factor of 3, reduce the required number of
training epochs and obtain over 98% classification rates when training and
validating the system on a database of over 26,000 images of 9 object classes.
Comment: Accepted for: EPIC Workshop at the European Conference on Computer Vision, ECCV201
Parallel stereo vision algorithm
Integrating a stereo-photogrammetric robot
head into a real-time system requires software
solutions that rapidly resolve the stereo correspondence
problem. The stereo-matcher presented in this
paper therefore uses code parallelisation and was
tested on three different processors with x87 and AVX.
The results show that a 5-megapixel colour image can
be matched in 5.55 seconds, or in 3.3 seconds as
monochrome.
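The stereo correspondence step that the abstract parallelises can be sketched, under assumptions, as classic winner-takes-all SAD block matching (the paper's actual matcher is not specified beyond this); `max_disp`, the window size, and the toy 10×20 images are illustrative only:

```python
import numpy as np

def sad_disparity(left, right, max_disp=8, half=1):
    """Winner-takes-all stereo block matching: for every pixel, try
    horizontal shifts d = 0..max_disp-1 of the right image and keep the
    d minimising the sum of absolute differences (SAD) over a
    (2*half+1)^2 window. SIMD matchers (e.g. AVX) vectorise this loop."""
    h, w = left.shape
    k = 2 * half + 1
    cost = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        diff = np.full((h, w), np.inf)
        if d == 0:
            diff = np.abs(left - right)
        else:
            diff[:, d:] = np.abs(left[:, d:] - right[:, : w - d])
        # Box-sum the per-pixel differences over the matching window.
        padded = np.pad(diff, half, mode="edge")
        wins = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
        cost[d] = wins.sum(axis=(2, 3))
    return cost.argmin(axis=0)  # per-pixel disparity map

rng = np.random.default_rng(0)
right = rng.random((10, 20))
left = np.zeros_like(right)
left[:, 3:] = right[:, :-3]  # left view = right view shifted by 3 pixels
disp = sad_disparity(left, right)
print(disp[5, 4:])           # interior pixels recover disparity 3
```

Parallel implementations typically split rows across cores and compute the SAD of several candidate disparities per SIMD instruction.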
Object Edge Contour Localisation Based on HexBinary Feature Matching
This paper addresses the issue of localising object
edge contours in cluttered backgrounds to support robotics
tasks such as grasping and manipulation and also to improve
the potential perceptual capabilities of robot vision systems. Our
approach is based on coarse-to-fine matching of a new recursively
constructed hierarchical, dense, edge-localised descriptor,
the HexBinary, based on the HexHoG descriptor structure first
proposed in [1]. Since Binary String image descriptors [2]–
[5] require much lower computational resources, but provide
similar or even better matching performance than Histogram
of Oriented Gradients (HoG) descriptors, we have replaced
the HoG base descriptor fields used in HexHoG with Binary
Strings generated from first and second order polar derivative
approximations. The ALOI [6] dataset is used to evaluate
the HexBinary descriptors which we demonstrate to achieve
a superior performance to that of HexHoG [1] for pose
refinement. The validation of our object contour localisation
system shows promising results, correctly labelling ~86% of edgel positions and mis-labelling ~3%.
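The computational advantage of Binary String descriptors cited above comes from comparing them with XOR and a popcount rather than float arithmetic. Below is a minimal sketch of Hamming-distance matching of packed binary descriptors; the descriptor length (256 bits) and database size are arbitrary, and this is not the HexBinary construction itself:

```python
import numpy as np

def hamming_match(query, database):
    """Match one binary descriptor against a database by Hamming
    distance. Descriptors are packed uint8 arrays; XOR marks the
    differing bits and a popcount counts them, which is why binary
    strings are far cheaper to compare than float HoG histograms."""
    xor = np.bitwise_xor(database, query)           # differing bits, per byte
    dists = np.unpackbits(xor, axis=1).sum(axis=1)  # popcount per descriptor
    return int(dists.argmin()), int(dists.min())

rng = np.random.default_rng(1)
db = rng.integers(0, 256, size=(100, 32), dtype=np.uint8)  # 100 x 256-bit
q = db[42].copy()
q[0] ^= 0b00000101  # corrupt two bits of entry 42
idx, dist = hamming_match(q, db)
print(idx, dist)    # 42 2: nearest match found at Hamming distance 2
```

Random 256-bit strings differ in roughly 128 bits on average, so a near-duplicate at distance 2 stands out unambiguously.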
A Portable Active Binocular Robot Vision Architecture for Scene Exploration
We present a portable active binocular robot vision architecture
that integrates a number of visual behaviours. This vision
architecture inherits the abilities of vergence, localisation,
recognition and simultaneous identification of multiple target
object instances. To demonstrate the portability of our vision
architecture, we carry out qualitative and comparative analysis
under two different hardware robotic settings, feature extraction
techniques and viewpoints. Our portable active binocular robot
vision architecture achieved average recognition rates of 93.5%
for fronto-parallel viewpoints and 83% for anthropomorphic
viewpoints, respectively.
Interactive Perception Based on Gaussian Process Classification for House-Hold Objects Recognition and Sorting
We present an interactive perception model for
object sorting based on Gaussian Process (GP) classification
that is capable of recognising object categories from point
cloud data. In our approach, FPFH features are extracted from
point clouds to describe the local 3D shape of objects and
a Bag-of-Words coding method is used to obtain an object-level
vocabulary representation. Multi-class Gaussian Process
classification is employed to provide a probabilistic estimate of
the identity of the object, and serves a key role in the interactive
perception cycle – modelling perception confidence. We show
results from simulated input data on both SVM and GP based
multi-class classifiers to validate the recognition accuracy of our
proposed perception model. Our results demonstrate that by
using a GP-based classifier, we obtain true positive classification
rates of up to 80%. Our semi-autonomous object sorting
experiments show that the proposed GP-based interactive
sorting approach outperforms random sorting by up to 30%
when applied to scenes comprising configurations of household
objects.
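The FPFH + Bag-of-Words coding step described above can be sketched as nearest-word vector quantisation followed by histogramming. The vocabulary size (50) is an illustrative assumption, as is the random data; the 33-D feature dimension matches FPFH's usual descriptor length:

```python
import numpy as np

def bow_encode(features, vocab):
    """Bag-of-Words coding: assign each local feature (e.g. an FPFH
    descriptor) to its nearest visual word and return a normalised
    histogram of word counts as the object-level representation."""
    # Squared Euclidean distance from every feature to every word.
    d2 = ((features[:, None, :] - vocab[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                  # hard assignment per feature
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    return hist / hist.sum()                   # normalise to sum to 1

rng = np.random.default_rng(2)
vocab = rng.random((50, 33))   # 50-word vocabulary of 33-D FPFH-like features
feats = rng.random((200, 33))  # 200 local descriptors from one point cloud
h = bow_encode(feats, vocab)
print(h.shape)                 # (50,) fixed-length object descriptor
```

The resulting fixed-length histogram is what a multi-class classifier such as a Gaussian Process can consume, regardless of how many points the cloud contained.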
Recognising the Clothing Categories from Free-Configuration Using Gaussian-Process-Based Interactive Perception
In this paper, we propose a Gaussian Process-based interactive perception approach for recognising highly-wrinkled clothes. We have integrated this recognition method within a clothes sorting pipeline for the pre-washing stage of an autonomous laundering process. Our approach differs from reported clothing manipulation approaches by allowing the robot to update its perception confidence via numerous interactions with the garments. The classifiers predominantly reported in clothing perception studies (e.g. SVM, Random Forest) do not provide true classification probabilities, due to their inherent structure. In contrast, probabilistic classifiers (of which the Gaussian Process is a popular example) are able to provide predictive probabilities. In our approach, we employ multi-class Gaussian Process classification, using the Laplace approximation for posterior inference and optimising hyper-parameters via marginal likelihood maximisation. Our experimental results show that our approach is able to recognise unknown garments from highly-occluded and wrinkled configurations, and demonstrates a substantial improvement over non-interactive perception approaches.
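The interactive update of perception confidence described above can be caricatured as naive-Bayes fusion of the per-interaction predictive class probabilities. The fusion rule and the three-class probability vectors below are assumptions for illustration, not the paper's actual inference:

```python
import numpy as np

def fuse_observations(prob_seq):
    """Interactive perception as evidence accumulation: multiply the
    predictive class probabilities from successive interactions
    (naive-Bayes fusion, assuming conditionally independent views)
    and renormalise. The robot can stop interacting once the top
    class probability exceeds a confidence threshold."""
    post = np.ones_like(prob_seq[0])
    for p in prob_seq:
        post *= p
        post /= post.sum()
    return post

obs = [np.array([0.40, 0.35, 0.25]),  # ambiguous first view of the garment
       np.array([0.50, 0.30, 0.20]),  # after one interaction
       np.array([0.60, 0.25, 0.15])]  # after a second interaction
post = fuse_observations(obs)
print(post.argmax(), round(post[0], 2))  # class 0 at confidence 0.78
```

Individually weak views agree on class 0, so the fused confidence (0.78) exceeds any single-view probability, which is the payoff of interacting rather than classifying once.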