Metric Learning for Generalizing Spatial Relations to New Objects
Human-centered environments are rich with a wide variety of spatial relations
between everyday objects. For autonomous robots to operate effectively in such
environments, they should be able to reason about these relations and
generalize them to objects with different shapes and sizes. For example, having
learned to place a toy inside a basket, a robot should be able to generalize
this concept using a spoon and a cup. This requires a robot to have the
flexibility to learn arbitrary relations in a lifelong manner, making it
challenging for an expert to pre-program it with sufficient knowledge to do so
beforehand. In this paper, we address the problem of learning spatial relations
by introducing a novel method from the perspective of distance metric learning.
Our approach enables a robot to reason about the similarity between pairwise
spatial relations, thereby enabling it to use its previous knowledge when
presented with a new relation to imitate. We show how this makes it possible to
learn arbitrary spatial relations from non-expert users using a small number of
examples and in an interactive manner. Our extensive evaluation with real-world
data demonstrates the effectiveness of our method in reasoning about a
continuous spectrum of spatial relations and generalizing them to new objects.
Comment: Accepted at the 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems. The new Freiburg Spatial Relations Dataset and a demo
video of our approach running on the PR-2 robot are available at our project
website: http://spatialrelations.cs.uni-freiburg.d
Learning Human Motion Models for Long-term Predictions
We propose a new architecture for the learning of predictive spatio-temporal
motion models from data alone. Our approach, dubbed the Dropout Autoencoder
LSTM, is capable of synthesizing natural looking motion sequences over long
time horizons without catastrophic drift or motion degradation. The model
consists of two components, a 3-layer recurrent neural network to model
temporal aspects and a novel auto-encoder that is trained to implicitly recover
the spatial structure of the human skeleton via randomly removing information
about joints during training time. This Dropout Autoencoder (D-AE) is then used
to filter each predicted pose of the LSTM, reducing accumulation of error and
hence drift over time. Furthermore, we propose new evaluation protocols to
assess the quality of synthetic motion sequences even when no ground truth
data exists. The proposed protocols can be used to assess generated sequences
of arbitrary length. Finally, we evaluate our proposed method on two of the
largest motion-capture datasets available to date and show that our model
outperforms the state-of-the-art on a variety of actions, including cyclic and
acyclic motion, and that it can produce natural looking sequences over longer
time horizons than previous methods.
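The two ideas in this abstract, dropping joint information during training and filtering every predicted pose before it is fed back, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `drop_joints`, `filtered_rollout`, the zero-filling used to "remove" a joint, and the stand-in callables `predict_next` and `denoise`. It is not the authors' implementation.

```python
import random

def drop_joints(pose, drop_prob, rng):
    """Return a copy of `pose` with whole joints zeroed at random.

    `pose` is a list of (x, y, z) tuples, one per joint. Zero-filling is an
    assumed way of "removing information"; the abstract does not specify it.
    """
    return [
        (0.0, 0.0, 0.0) if rng.random() < drop_prob else joint
        for joint in pose
    ]

def filtered_rollout(predict_next, denoise, seed_pose, steps):
    """Roll a predictor forward, filtering each predicted pose.

    `predict_next` stands in for the recurrent model and `denoise` for the
    trained autoencoder (both hypothetical callables). The filtered pose is
    fed back as the next input, which is what limits error accumulation.
    """
    poses = []
    pose = seed_pose
    for _ in range(steps):
        pose = denoise(predict_next(pose))
        poses.append(pose)
    return poses
```

During training, an autoencoder would receive `drop_joints(pose, ...)` as input and the intact `pose` as target, so it has to infer the missing joints from the rest of the skeleton.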
Disturbance Grassmann Kernels for Subspace-Based Learning
In this paper, we focus on subspace-based learning problems, where data
elements are linear subspaces instead of vectors. To handle this kind of data,
Grassmann kernels were proposed to measure the space structure and used with
classifiers, e.g., Support Vector Machines (SVMs). However, the existing
discriminative algorithms mostly ignore the instability of subspaces, which
can cause the classifiers to be misled by disturbed instances. Thus we propose
considering all potential disturbances of subspaces during learning to
obtain more robust classifiers. Firstly, we derive the dual optimization of
linear classifiers with disturbance subject to a known distribution, resulting
in a new kernel, the Disturbance Grassmann (DG) kernel. Secondly, we investigate
two kinds of disturbance, relevant to the subspace matrix and singular values
of bases, with which we extend the Projection kernel on Grassmann manifolds to
two new kernels. Experiments on action data indicate that the proposed kernels
perform better than state-of-the-art subspace-based methods, even in a
worse environment.
Comment: This paper includes 3 figures and 10 pages, and has been accepted to
SIGKDD'1
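For readers unfamiliar with the Projection kernel that this abstract extends, the standard form for two subspaces with orthonormal basis matrices X and Y is k(X, Y) = ||XᵀY||_F². The sketch below computes only that baseline kernel in plain Python; the function name and the list-of-rows matrix layout are assumptions, and the proposed DG kernels themselves are not reproduced here.

```python
def projection_kernel(X, Y):
    """Projection kernel k(X, Y) = ||X^T Y||_F^2 between two subspaces.

    X and Y are D x p and D x q matrices given as lists of rows, and their
    columns are assumed to be orthonormal bases of the subspaces.
    """
    dim = len(X)
    p, q = len(X[0]), len(Y[0])
    total = 0.0
    for i in range(p):
        for j in range(q):
            # Inner product of the i-th basis vector of X and the j-th of Y.
            dot = sum(X[d][i] * Y[d][j] for d in range(dim))
            total += dot * dot
    return total

# Subspaces of R^3: span{e1, e2} and span{e3}.
plane = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
axis = [[0.0], [0.0], [1.0]]
same = projection_kernel(plane, plane)        # identical subspaces
orthogonal = projection_kernel(plane, axis)   # orthogonal subspaces
```

The kernel equals the subspace dimension for identical subspaces and zero for orthogonal ones, so it can be passed to a kernel classifier such as an SVM as a similarity between subspaces.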
On Interpretability of Deep Learning based Skin Lesion Classifiers using Concept Activation Vectors
Deep learning based medical image classifiers have shown remarkable prowess
in various application areas like ophthalmology, dermatology, pathology, and
radiology. However, the acceptance of these Computer-Aided Diagnosis (CAD)
systems in real clinical setups is severely limited primarily because their
decision-making process remains largely obscure. This work aims at elucidating
a deep learning based medical image classifier by verifying that the model
learns and utilizes similar disease-related concepts as described and employed
by dermatologists. We used a well-trained and high performing neural network
developed by REasoning for COmplex Data (RECOD) Lab for classification of three
skin tumours, i.e. Melanocytic Naevi, Melanoma and Seborrheic Keratosis and
performed a detailed analysis on its latent space. Two well established and
publicly available skin disease datasets, PH2 and derm7pt, are used for
experimentation. Human understandable concepts are mapped to the RECOD image
classification model with the help of Concept Activation Vectors (CAVs),
introducing a novel training and significance testing paradigm for CAVs. Our
results on an independent evaluation set clearly show that the classifier
learns and encodes human understandable concepts in its latent representation.
Additionally, TCAV scores (Testing with CAVs) suggest that the neural network
indeed makes use of disease-related concepts in the correct way when making
predictions. We anticipate that this work can not only increase the confidence of
medical practitioners in CAD but also serve as a stepping stone for further
development of CAV-based neural network interpretation methods.
Comment: Accepted for the IEEE International Joint Conference on Neural
Networks (IJCNN) 202
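As a rough illustration of the TCAV score mentioned in this abstract: in its usual formulation, the score for a class and a concept is the fraction of class examples whose class logit increases when the layer activations move along the concept's CAV, i.e. whose directional derivative is positive. The sketch below assumes the per-example gradients have already been computed; the function name and the toy numbers are hypothetical, and the paper's own training and significance-testing paradigm is not shown.

```python
def tcav_score(gradients, cav):
    """Fraction of examples with a positive directional derivative along `cav`.

    `gradients` holds, per example, the gradient of the class logit with
    respect to the chosen layer's activations (flattened to a list of floats).
    `cav` is the concept activation vector for that same layer.
    """
    positive = 0
    for grad in gradients:
        directional_derivative = sum(g * c for g, c in zip(grad, cav))
        if directional_derivative > 0:
            positive += 1
    return positive / len(gradients)

# Toy example: four gradients in a 2-dimensional activation space.
toy_gradients = [[1.0, 0.0], [-1.0, 0.0], [0.5, 0.5], [0.0, -1.0]]
toy_cav = [1.0, 0.0]
score = tcav_score(toy_gradients, toy_cav)
```

A score near 1 suggests the concept pushes predictions toward the class for most examples, while a score near 0.5 is what a random direction would typically produce, which is why a significance test against random CAVs is needed.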