Reinforcement Learning-based Visual Navigation with Information-Theoretic Regularization
To enhance the cross-target and cross-scene generalization of target-driven
visual navigation based on deep reinforcement learning (RL), we introduce an
information-theoretic regularization term into the RL objective. The
regularization maximizes the mutual information between the agent's navigation
actions and its visual observation transforms, promoting more informed
navigation decisions. To this end, the agent models the action-observation
dynamics by learning a variational generative model, with which it generates
(imagines) the next observation from its current observation and the
navigation target. The agent thereby learns the causality between navigation
actions and the changes in its observations, which allows it to predict the
next navigation action by comparing the current and the imagined next
observations. Cross-target and cross-scene evaluations on
the AI2-THOR framework show that our method attains a clear improvement in
average success rate over several state-of-the-art models. We
further evaluate our model in two real-world settings: navigation in unseen
indoor scenes from the discrete Active Vision Dataset (AVD) and in continuous
real-world environments with a TurtleBot. We demonstrate that our navigation
model successfully completes the navigation tasks in these scenarios.
Videos and models can be found in the supplementary material.
Comment: 11 pages, corresponding authors: Kai Xu ([email protected]) and
Jun Wang ([email protected]).
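Illustrative sketch (not from the paper): the mechanism described in the abstract, an agent that learns a variational generative model to imagine the next observation and predicts the action by comparing current and imagined observations, could be outlined in PyTorch roughly as follows. All module names, feature dimensions, and architectural choices here are assumptions for illustration only; the KL and reconstruction terms stand in for the variational bound behind the paper's information-theoretic regularization, whose exact form the abstract does not specify.

```python
import torch
import torch.nn as nn

class ImaginationNav(nn.Module):
    """Hypothetical sketch: a conditional variational model that 'imagines'
    the next observation from (current observation, navigation target), plus
    a policy head that compares the current and imagined observations."""

    def __init__(self, obs_dim=512, target_dim=512, latent_dim=64, n_actions=6):
        super().__init__()
        # Variational posterior q(z | o_t, g) over the observation transform
        self.enc = nn.Sequential(nn.Linear(obs_dim + target_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        # Generator p(o_{t+1} | z, o_t): imagines the next observation
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + obs_dim, 256), nn.ReLU(),
            nn.Linear(256, obs_dim),
        )
        # Policy head: predicts the action from (o_t, imagined o_{t+1})
        self.policy = nn.Sequential(
            nn.Linear(2 * obs_dim, 256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, obs, target):
        h = self.enc(torch.cat([obs, target], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        imagined = self.dec(torch.cat([z, obs], dim=-1))      # imagined o_{t+1}
        action_logits = self.policy(torch.cat([obs, imagined], dim=-1))
        # KL(q(z|o_t,g) || N(0,I)): the variational regularizer; paired with a
        # reconstruction loss on the true o_{t+1}, it forms the generative bound.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1).mean()
        return action_logits, imagined, kl

# Toy usage with random features (batch of 8):
model = ImaginationNav()
obs, goal = torch.randn(8, 512), torch.randn(8, 512)
logits, imagined, kl = model(obs, goal)
```

In this sketch the KL term would be added to the RL loss alongside the policy objective, reflecting the abstract's idea of regularizing navigation with an action-observation generative model.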