33,747 research outputs found
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 page
ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids
We introduce an unsupervised feature learning approach that embeds 3D shape
information into a single-view image representation. The main idea is a
self-supervised training objective that, given only a single 2D image, requires
all unseen views of the object to be predictable from learned features. We
implement this idea as an encoder-decoder convolutional neural network. The
network maps an input image of an unknown category and unknown viewpoint to a
latent space, from which a deconvolutional decoder can best "lift" the image to
its complete viewgrid showing the object from all viewing angles. Our
class-agnostic training procedure encourages the representation to capture
fundamental shape primitives and semantic regularities in a data-driven
manner---without manual semantic labels. Our results on two widely-used shape
datasets show 1) our approach successfully learns to perform "mental rotation"
even for objects unseen during training, and 2) the learned latent space is a
powerful representation for object recognition, outperforming several existing
unsupervised feature learning methods.Comment: To appear at ECCV 201
Crossmodal Attentive Skill Learner
This paper presents the Crossmodal Attentive Skill Learner (CASL), integrated
with the recently-introduced Asynchronous Advantage Option-Critic (A2OC)
architecture [Harb et al., 2017] to enable hierarchical reinforcement learning
across multiple sensory inputs. We provide concrete examples where the approach
not only improves performance in a single task, but accelerates transfer to new
tasks. We demonstrate the attention mechanism anticipates and identifies useful
latent features, while filtering irrelevant sensor modalities during execution.
We modify the Arcade Learning Environment [Bellemare et al., 2013] to support
audio queries, and conduct evaluations of crossmodal learning in the Atari 2600
game Amidar. Finally, building on the recent work of Babaeizadeh et al. [2017],
we open-source a fast hybrid CPU-GPU implementation of CASL.Comment: International Conference on Autonomous Agents and Multiagent Systems
(AAMAS) 2018, NIPS 2017 Deep Reinforcement Learning Symposiu
- …