The Multi-Robot Continuous Control (MRCC) problem in Deep Reinforcement Learning requires a single neural controller (agent) to learn to control the behavior of multiple robot bodies. When learning to control a single robot body, sensors and motors are connected arbitrarily to the input and output layers of the neural controller, respectively, and this arrangement does not affect the learnability of target robot behaviors. Whether, and how, such an arrangement affects learnability in MRCC, where the controller must handle multiple robots with different body plans, remains unknown.
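To make the notion of an arrangement concrete, the sketch below models it as a pair of permutations: one routing raw sensor readings to controller inputs, and one routing controller outputs to motors. This is a minimal illustration under assumed conventions, not the thesis's implementation; the function names `rewire_observation` and `rewire_action` and the array sizes are placeholders of my own.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sensors, n_motors = 8, 4
sensor_to_input = rng.permutation(n_sensors)  # arbitrary wiring: sensor i -> input sensor_to_input[i] position
output_to_motor = rng.permutation(n_motors)   # arbitrary wiring: output i -> motor output_to_motor[i]

def rewire_observation(obs):
    """Reorder raw sensor readings into the controller's input layout."""
    return obs[sensor_to_input]

def rewire_action(act):
    """Route controller outputs to the robot's motors."""
    motor_cmd = np.empty_like(act)
    motor_cmd[output_to_motor] = act
    return motor_cmd

# Example: one observation-action cycle under this arrangement.
obs = rng.standard_normal(n_sensors)          # raw sensor readings from one body
controller_input = rewire_observation(obs)    # what the policy network sees
act = rng.standard_normal(n_motors)           # raw controller outputs
motor_command = rewire_action(act)            # what the body's motors receive
```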
In this thesis, I demonstrate the following: (1) A neural controller can learn locomotion for a small number of robot bodies under an arbitrary arrangement of sensors to controller inputs and of controller outputs to motors, which explains why arrangement can be ignored in this regime; such arbitrary arrangements, however, perform poorly as the number of robot bodies increases. (2) For a given set of robot bodies, some arrangements make the MRCC problem easier than others, and in certain cases the variation in learnability across arrangements is pronounced. This holds both for bodies with parametric differences (e.g., short versus long legs) and for bodies with topological differences (e.g., differing numbers of legs). Arrangement thus offers a heretofore unrecognized optimization opportunity in MRCC: searching for arrangements that increasingly facilitate learning of a single policy for different robots.
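The optimization opportunity named above amounts to a search over permutations. As a hedged sketch, assuming a scoring routine that trains a shared policy under a candidate arrangement and returns an aggregate learnability measure, a simple random search could look like the following; `evaluate_learnability` is a hypothetical stand-in (here a dummy stub so the sketch runs), since the thesis does not specify this interface.

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate_learnability(arrangement):
    # Stand-in for the expensive inner loop: train one policy for all
    # bodies under `arrangement` and return, e.g., mean locomotion
    # reward across bodies. A random score keeps this sketch runnable.
    return rng.random()

def search_arrangements(n_sensors, n_trials=50):
    """Random search over sensor-to-input permutations."""
    best_score, best_arrangement = -np.inf, None
    for _ in range(n_trials):
        candidate = rng.permutation(n_sensors)
        score = evaluate_learnability(candidate)
        if score > best_score:
            best_score, best_arrangement = score, candidate
    return best_arrangement, best_score

best_arrangement, best_score = search_arrangements(n_sensors=8)
```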