37 research outputs found
Memory Augmented Control Networks
Planning problems in partially observable environments cannot be solved
directly with convolutional networks and require some form of memory. But, even
memory networks with sophisticated addressing schemes are unable to learn
intelligent reasoning satisfactorily due to the complexity of simultaneously
learning to access memory and plan. To mitigate these challenges we introduce
the Memory Augmented Control Network (MACN). The proposed network architecture
consists of three main parts. The first part uses convolutions to extract
features and the second part uses a neural network-based planning module to
pre-plan in the environment. The third part uses a network controller that
learns to store those specific instances of past information that are necessary
for planning. The performance of the network is evaluated in discrete grid
world environments for path planning in the presence of simple and complex
obstacles. We show that our network learns to plan and can generalize to new
environments
Value Propagation Networks
We present Value Propagation (VProp), a set of parameter-efficient
differentiable planning modules built on Value Iteration which can successfully
be trained using reinforcement learning to solve unseen tasks, has the
capability to generalize to larger map sizes, and can learn to navigate in
dynamic environments. We show that the modules enable learning to plan when the
environment also includes stochastic elements, providing a cost-efficient
learning system to build low-level size-invariant planners for a variety of
interactive navigation problems. We evaluate on static and dynamic
configurations of MazeBase grid-worlds, with randomly generated environments of
several different sizes, and on a StarCraft navigation scenario, with more
complex dynamics, and pixels as input.Comment: Updated to match ICLR 2019 OpenReview's versio