Optimization Beyond the Convolution: Generalizing Spatial Relations with End-to-End Metric Learning
To operate intelligently in domestic environments, robots require the ability
to understand arbitrary spatial relations between objects and to generalize
them to objects of varying sizes and shapes. In this work, we present a novel
end-to-end approach to generalize spatial relations based on distance metric
learning. We train a neural network to transform 3D point clouds of objects to
a metric space that captures the similarity of the depicted spatial relations,
using only geometric models of the objects. Our approach employs gradient-based
optimization to compute object poses in order to imitate an arbitrary target
relation by reducing the distance to it under the learned metric. Our results
based on simulated and real-world experiments show that the proposed method
enables robots to generalize spatial relations to unknown objects over a
continuous spectrum.
Comment: Accepted for publication at ICRA 2018. Supplementary video:
http://spatialrelations.cs.uni-freiburg.de
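The idea of imitating a target relation by descending the distance under a learned metric can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the fixed matrix `W` and the function `embed` stand in for the trained network, only a translation is optimized rather than a full object pose, and finite differences replace backpropagation. It is not the authors' implementation.

```python
import numpy as np

# Hypothetical stand-in for the learned embedding (the paper trains a neural
# network; this fixed linear map is only an illustrative assumption).
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.5]])


def embed(reference, obj):
    """Map a pair of point clouds (N x 3 arrays) to a toy 4-D relation embedding."""
    offset = obj.mean(axis=0) - reference.mean(axis=0)
    return W @ offset


def distance(translation, reference, obj, target_embedding):
    """Squared distance to the target relation after translating `obj`."""
    moved = obj + translation
    return float(np.sum((embed(reference, moved) - target_embedding) ** 2))


def numerical_gradient(f, x, eps=1e-5):
    """Central-difference gradient of the scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad


def imitate_relation(reference, obj, target_embedding, lr=0.1, steps=200):
    """Gradient descent on the translation of `obj` to reduce the metric distance."""
    translation = np.zeros(3)
    objective = lambda t: distance(t, reference, obj, target_embedding)
    for _ in range(steps):
        translation -= lr * numerical_gradient(objective, translation)
    return translation, objective(translation)


rng = np.random.default_rng(0)

# Target relation: a small object roughly 0.3 units above a reference object.
target_ref = rng.normal(size=(40, 3))
target_obj = 0.1 * rng.normal(size=(20, 3)) + target_ref.mean(axis=0) + [0.0, 0.0, 0.3]
target_embedding = embed(target_ref, target_obj)

# New, differently shaped objects that start in an unrelated arrangement.
reference = rng.normal(size=(50, 3))
obj = rng.normal(size=(30, 3)) + np.array([1.0, -0.5, 0.0])

translation, final_distance = imitate_relation(reference, obj, target_embedding)
print("translation:", translation, "final distance:", final_distance)
```

Because the toy embedding depends only on the centroid offset, the optimization recovers the target offset for the new object pair; a learned embedding would capture much richer geometry than this.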
Metric Learning for Generalizing Spatial Relations to New Objects
Human-centered environments are rich with a wide variety of spatial relations
between everyday objects. For autonomous robots to operate effectively in such
environments, they should be able to reason about these relations and
generalize them to objects with different shapes and sizes. For example, having
learned to place a toy inside a basket, a robot should be able to generalize
this concept using a spoon and a cup. This requires a robot to have the
flexibility to learn arbitrary relations in a lifelong manner, making it
challenging for an expert to pre-program it with sufficient knowledge to do so
beforehand. In this paper, we address the problem of learning spatial relations
by introducing a novel method from the perspective of distance metric learning.
Our approach enables a robot to reason about the similarity between pairwise
spatial relations, thereby enabling it to use its previous knowledge when
presented with a new relation to imitate. We show how this makes it possible to
learn arbitrary spatial relations from non-expert users using a small number of
examples and in an interactive manner. Our extensive evaluation with real-world
data demonstrates the effectiveness of our method in reasoning about a
continuous spectrum of spatial relations and generalizing them to new objects.
Comment: Accepted at the 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems. The new Freiburg Spatial Relations Dataset and a demo
video of our approach running on the PR-2 robot are available at our project
website: http://spatialrelations.cs.uni-freiburg.d
Automated CT and MRI Liver Segmentation and Biometry Using a Generalized Convolutional Neural Network.
Purpose: To assess feasibility of training a convolutional neural network (CNN) to automate liver segmentation across different imaging modalities and techniques used in clinical practice and apply this to enable automation of liver biometry.
Methods: We trained a 2D U-Net CNN for liver segmentation in two stages using 330 abdominal MRI and CT exams acquired at our institution. First, we trained the neural network with non-contrast multi-echo spoiled-gradient-echo (SGPR) images with 300 MRI exams to provide multiple signal-weightings. Then, we used transfer learning to generalize the CNN with additional images from 30 contrast-enhanced MRI and CT exams. We assessed the performance of the CNN using a distinct multi-institutional data set curated from multiple sources (n = 498 subjects). Segmentation accuracy was evaluated by computing Dice scores. Utilizing these segmentations, we computed liver volume from CT and T1-weighted (T1w) MRI exams, and estimated hepatic proton-density fat fraction (PDFF) from multi-echo T2*w MRI exams. We compared quantitative volumetry and PDFF estimates between automated and manual segmentation using Pearson correlation and Bland-Altman statistics.
Results: Dice scores were 0.94 ± 0.06 for CT (n = 230), 0.95 ± 0.03 for T1w MR (n = 100), and 0.92 ± 0.05 for T2*w MR (n = 169). Liver volume measured by manual and automated segmentation agreed closely for CT (95% limit-of-agreement (LoA) = [-298 mL, 180 mL]) and T1w MR (LoA = [-358 mL, 180 mL]). Hepatic PDFF measured by the two segmentations also agreed closely (LoA = [-0.62%, 0.80%]).
Conclusions: Utilizing a transfer-learning strategy, we have demonstrated the feasibility of a CNN to be generalized to perform liver segmentations across different imaging techniques and modalities. With further refinement and validation, CNNs may have broad applicability for multimodal liver volumetry and hepatic tissue characterization.
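The two agreement measures named in the abstract, Dice scores and Bland-Altman limits of agreement, can be sketched as follows. This is a minimal illustration on made-up toy data, not the study's evaluation code; the function names, the empty-mask convention, and the example arrays are assumptions.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(pred, truth).sum() / total


def limits_of_agreement(automated, manual):
    """Bland-Altman bias and 95% limits of agreement (bias ± 1.96 SD of differences)."""
    diff = np.asarray(automated, dtype=float) - np.asarray(manual, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the paired differences
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Toy masks (illustrative only): 3 voxels each, 2 of them overlapping.
pred_mask = np.array([[0, 1, 1], [0, 1, 0]])
true_mask = np.array([[0, 1, 0], [0, 1, 1]])
print("Dice:", dice_score(pred_mask, true_mask))

# Toy liver volumes in mL (illustrative only, not data from the study).
auto_volumes = [1500.0, 1600.0, 1400.0, 1550.0]
manual_volumes = [1510.0, 1580.0, 1420.0, 1540.0]
bias, (lower, upper) = limits_of_agreement(auto_volumes, manual_volumes)
print("bias:", bias, "LoA:", (lower, upper))
```

A narrow interval centred near zero indicates that the automated and manual measurements can be used interchangeably for the quantity in question.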
Deep Reinforcement Learning on a Budget: 3D Control and Reasoning Without a Supercomputer
An important goal of research in Deep Reinforcement Learning in mobile
robotics is to train agents capable of solving complex tasks, which require a
high level of scene understanding and reasoning from an egocentric perspective.
When trained from simulations, optimal environments should satisfy a currently
unobtainable combination of high-fidelity photographic observations, massive
amounts of different environment configurations and fast simulation speeds. In
this paper we argue that research on training agents capable of complex
reasoning can be simplified by decoupling from the requirement of high fidelity
photographic observations. We present a suite of tasks requiring complex
reasoning and exploration in continuous, partially observable 3D environments.
The objective is to provide challenging scenarios and a robust baseline agent
architecture that can be trained on mid-range consumer hardware in under 24 h.
Our scenarios combine two key advantages: (i) they are based on a simple but
highly efficient 3D environment (ViZDoom) which allows high-speed simulation
(12,000 fps); (ii) the scenarios provide the user with a range of difficulty
settings, in order to identify the limitations of current state-of-the-art
algorithms and network architectures. We aim to increase accessibility to the
field of Deep-RL by providing baselines for challenging scenarios where new
ideas can be iterated on quickly. We argue that the community should be able to
address challenging problems in reasoning of mobile agents without the need for
a large compute infrastructure.