Planning problems in partially observable environments cannot be solved
directly with convolutional networks and require some form of memory. However, even
memory networks with sophisticated addressing schemes are unable to learn
intelligent reasoning satisfactorily due to the complexity of simultaneously
learning to access memory and plan. To mitigate these challenges, we introduce
the Memory Augmented Control Network (MACN). The proposed network architecture
consists of three main parts. The first part uses convolutions to extract
features, and the second part uses a neural network-based planning module to
pre-plan in the environment. The third part uses a network controller that
learns to store those specific instances of past information that are necessary
for planning. The performance of the network is evaluated in discrete grid
world environments for path planning in the presence of simple and complex
obstacles. We show that our network learns to plan and can generalize to new
environments.
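
To make the three-part data flow concrete, the following is a minimal sketch in plain Python. It is not the MACN implementation: every part is a hand-written stand-in chosen for illustration. The names `conv3x3`, `plan_values`, and `Memory` are hypothetical, the planning step is assumed here to be classical value iteration rather than a learned module, and the memory is a simple append-and-attend store rather than a trained controller.

```python
# Illustrative stand-ins only; none of these functions come from the paper.
import math
from typing import List, Tuple

Grid = List[List[float]]
MOVES = ((-1, 0), (1, 0), (0, -1), (0, 1))  # up, down, left, right


def conv3x3(grid: Grid, kernel: Grid) -> Grid:
    """Part 1 stand-in: one zero-padded 3x3 filter (cross-correlation)."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        out[r][c] += grid[rr][cc] * kernel[dr + 1][dc + 1]
    return out


def plan_values(obstacles: Grid, goal: Tuple[int, int],
                iters: int = 20, gamma: float = 0.9) -> Grid:
    """Part 2 stand-in (assumption): value iteration on the observed grid."""
    h, w = len(obstacles), len(obstacles[0])
    values = [[0.0] * w for _ in range(h)]
    for _ in range(iters):
        new = [[0.0] * w for _ in range(h)]
        for r in range(h):
            for c in range(w):
                if obstacles[r][c] > 0:
                    continue  # obstacle cells keep value 0
                if (r, c) == goal:
                    new[r][c] = 1.0
                    continue
                best = 0.0
                for dr, dc in MOVES:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and obstacles[rr][cc] == 0:
                        best = max(best, values[rr][cc])
                new[r][c] = gamma * best
        values = new
    return values


class Memory:
    """Part 3 stand-in: store past vectors, read them back by similarity."""

    def __init__(self) -> None:
        self.slots: List[List[float]] = []

    def write(self, vec: List[float]) -> None:
        self.slots.append(list(vec))

    def read(self, key: List[float]) -> List[float]:
        if not self.slots:
            return [0.0] * len(key)
        scores = [sum(k * s for k, s in zip(key, slot)) for slot in self.slots]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # stable softmax
        total = sum(weights)
        return [sum(wt / total * slot[i] for wt, slot in zip(weights, self.slots))
                for i in range(len(key))]


# Usage: a 3x3 grid with one obstacle in the centre and the goal top-right.
obstacles = [[0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0]]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
features = conv3x3(obstacles, identity)       # part 1: extract features
values = plan_values(obstacles, goal=(0, 2))  # part 2: pre-plan
memory = Memory()
memory.write(values[2])                       # part 3: keep a past plan row
recalled = memory.read(values[2])
```

In this toy setting the value map decays by `gamma` per step along the shortest obstacle-free path to the goal, and the memory returns a similarity-weighted blend of whatever was written to it. The sketch is meant only to show how the output of one stage can feed the next.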