Two-Stage Transfer Learning for Heterogeneous Robot Detection and 3D Joint Position Estimation in a 2D Camera Image using CNN
Collaborative robots are becoming more common on factory floors as well as in
regular environments; however, their safety is still not a fully solved issue.
Collision detection does not always perform as expected and collision avoidance
is still an active research area. Collision avoidance works well for fixed
robot-camera setups; however, if they are shifted around, the Eye-to-Hand
calibration becomes invalid, making it difficult to accurately run many of the
existing collision avoidance algorithms. We approach the problem by presenting
a stand-alone system capable of detecting the robot and estimating its
position, including individual joints, by using a simple 2D colour image as an
input, where no Eye-to-Hand calibration is needed. As an extension of previous
work, a two-stage transfer learning approach is used to re-train a
multi-objective convolutional neural network (CNN) to allow it to be used with
heterogeneous robot arms. Our method is capable of detecting the robot in
real time, and new robot types can be added using significantly smaller
training datasets than a fully trained network requires. We
present the data collection approach, the structure of the multi-objective CNN, the
two-stage transfer learning training, and test results using real robots from
Universal Robots, Kuka, and Franka Emika. Finally, we analyse possible
application areas of our method together with possible improvements.
Comment: 6+n pages, ICRA 2019 submission
Information recovery from rank-order encoded images
The time to detection of a visual stimulus by the primate eye is recorded at
100–150 ms. This near-instantaneous recognition is in spite of the considerable
processing required by the several stages of the visual pathway to recognise and
react to a visual scene. How this is achieved is still a matter of speculation.
Rank-order codes have been proposed as a means of encoding by the primate
eye in the rapid transmission of the initial burst of information from the sensory
neurons to the brain. We study the efficiency of rank-order codes in encoding
perceptually-important information in an image. VanRullen and Thorpe built a
model of the ganglion cell layers of the retina to simulate and study the viability
of rank-order as a means of encoding by retinal neurons. We validate their model
and quantify the information retrieved from rank-order encoded images in terms
of the visually-important information recovered. Towards this goal, we apply
the "perceptual information preservation algorithm" proposed by Petrovic and
Xydeas, after slight modification. We observe a low information recovery due
to losses suffered during the rank-order encoding and decoding processes. We
propose to minimise these losses to recover maximum information in minimum
time from rank-order encoded images. We first maximise information recovery by
using the pseudo-inverse of the filter-bank matrix to minimise losses during rank-order
decoding. We then apply the biological principle of lateral inhibition to
minimise losses during rank-order encoding. In doing so, we propose the Filter-overlap
Correction algorithm. To test the performance of rank-order codes in
a biologically realistic model, we design and simulate a model of the foveal-pit
ganglion cells of the retina keeping close to biological parameters. We use this
as a rank-order encoder and analyse its performance relative to VanRullen and
Thorpe's retinal model.
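The decoding step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the random filter bank, the ideal look-up table built from the true magnitudes, and the helper names `rank_order_encode` and `rank_order_decode` are all assumptions made for illustration. The sketch only shows why decoding with the pseudo-inverse of an overlapping (non-orthogonal) filter bank loses less than decoding with its transpose.

```python
import numpy as np

def rank_order_encode(image_vec, filters):
    """Return filter indices ordered by decreasing |response|, plus response signs.

    Assumption: the sign is known from cell identity (ON vs OFF cells),
    so only the firing order carries magnitude information.
    """
    coeffs = filters @ image_vec
    order = np.argsort(-np.abs(coeffs), kind="stable")
    return order, np.sign(coeffs)

def rank_order_decode(order, signs, filters, lut, use_pinv=False):
    """Rebuild the signal from ranks using a look-up table of magnitudes."""
    est = np.zeros(filters.shape[0])
    est[order] = lut[: len(order)]      # k-th filter to fire gets the k-th LUT value
    est *= signs
    # Transpose decoding ignores filter overlap; the pseudo-inverse corrects for it.
    synthesis = np.linalg.pinv(filters) if use_pinv else filters.T
    return synthesis @ est

# Hypothetical toy setup: 12 overlapping filters over an 8-sample signal.
rng = np.random.default_rng(0)
filters = rng.normal(size=(12, 8))
image = rng.normal(size=8)

order, signs = rank_order_encode(image, filters)
# Ideal LUT (assumption): the true magnitudes in firing order.
lut = np.sort(np.abs(filters @ image))[::-1]

err_transpose = np.linalg.norm(rank_order_decode(order, signs, filters, lut) - image)
err_pinv = np.linalg.norm(
    rank_order_decode(order, signs, filters, lut, use_pinv=True) - image
)
print(err_transpose, err_pinv)
```

With an ideal look-up table the pseudo-inverse path recovers the toy signal up to floating-point error, so any remaining loss in a realistic model would come from the rank-to-magnitude approximation rather than from filter overlap.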
Deep Architectures and Ensembles for Semantic Video Classification
This work addresses the problem of accurate semantic labelling of short
videos. To this end, a multitude of different deep nets is investigated, ranging from
traditional recurrent neural networks (LSTM, GRU), temporal agnostic networks
(FV, VLAD, BoW), fully connected neural networks mid-stage AV fusion and others.
Additionally, we also propose a residual architecture-based DNN for video
classification, with state-of-the-art classification performance at
significantly reduced complexity. Furthermore, we propose four new approaches
to diversity-driven multi-net ensembling, one based on a fast correlation measure
and three incorporating a DNN-based combiner. We show that significant
performance gains can be achieved by ensembling diverse nets and we investigate
factors contributing to high diversity. Based on the extensive YouTube-8M
dataset, we provide an in-depth evaluation and analysis of their behaviour. We
show that the performance of the ensemble is state-of-the-art, achieving the
highest accuracy on the YouTube-8M Kaggle test data. The performance of the
ensemble of classifiers was also evaluated on the HMDB51 and UCF101 datasets,
and we show that the resulting method achieves comparable accuracy with
state-of-the-art methods using similar input features.
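The idea of diversity-driven ensembling based on a correlation measure can be sketched as follows. The abstract does not specify the actual measure or selection rule, so this is only an assumed illustration: it uses plain Pearson correlation between per-sample scores and a greedy selection, and the names `pearson`, `select_diverse`, and `average_ensemble`, as well as the toy scores, are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length score lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def select_diverse(scores, k):
    """Greedily pick k nets, starting from net 0, each time adding the net
    whose highest correlation with the already chosen nets is lowest."""
    chosen = [0]
    while len(chosen) < k:
        candidates = [i for i in range(len(scores)) if i not in chosen]
        best = min(
            candidates,
            key=lambda i: max(pearson(scores[i], scores[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen

def average_ensemble(scores, members):
    """Average the per-sample scores of the selected nets."""
    n_samples = len(scores[0])
    return [
        sum(scores[m][t] for m in members) / len(members) for t in range(n_samples)
    ]

# Hypothetical per-sample scores from three nets on four validation samples.
scores = [
    [0.9, 0.2, 0.8, 0.1],  # net 0
    [0.8, 0.3, 0.9, 0.2],  # net 1, behaves almost like net 0
    [0.6, 0.4, 0.3, 0.7],  # net 2, behaves differently
]

members = select_diverse(scores, 2)
combined = average_ensemble(scores, members)
print(members, combined)
```

In this toy case the greedy rule skips the near-duplicate net and pairs net 0 with the less correlated net 2; a learned combiner, as mentioned in the abstract, would replace the simple average.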
MERLiN: Mixture Effect Recovery in Linear Networks
Causal inference concerns the identification of cause-effect relationships
between variables, e.g. establishing whether a stimulus affects activity in a
certain brain region. The observed variables themselves often do not constitute
meaningful causal variables, however, and linear combinations need to be
considered. In electroencephalographic studies, for example, one is not
interested in establishing cause-effect relationships between electrode signals
(the observed variables), but rather between cortical signals (the causal
variables) which can be recovered as linear combinations of electrode signals.
We introduce MERLiN (Mixture Effect Recovery in Linear Networks), a family of
causal inference algorithms that implement a novel means of constructing causal
variables from non-causal variables. We demonstrate through application to EEG
data how the basic MERLiN algorithm can be extended for application to
different (neuroimaging) data modalities. Given an observed linear mixture, the
algorithms can recover a causal variable that is a linear effect of another
given variable. That is, MERLiN allows us to recover a cortical signal that is
affected by activity in a certain brain region, while not being a direct effect
of the stimulus. The Python/Matlab implementation for all presented algorithms
is available on https://github.com/sweichwald/MERLi