We present a machine learning framework for multi-agent systems to learn both
the optimal policy for maximizing reward and an encoding of the
high-dimensional visual observations. The encoding is useful for sharing local visual
observations with other agents under communication resource constraints. The
actor-encoder encodes the raw images and chooses an action based on its local
observations and the messages received from the other agents. The machine learning agent
generates not only an actuator command for the physical device but also a
communication message to the other agents. We formulate a reinforcement
learning problem in which the action space is extended to include the communication
action. The feasibility of the reinforcement learning framework is
demonstrated using a 3D simulation environment with two collaborating agents.
The environment provides realistic visual observations to be used and shared
between the two agents.
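As a minimal sketch of the extended action space, in our own notation rather than the paper's: each agent's action pairs its actuator command with its outgoing message, and the policy conditions on the encoded local observation together with the messages received from the other agent.

a_i = (u_i, m_i) \in \mathcal{A}_i = \mathcal{U}_i \times \mathcal{M}_i,
\qquad a_i \sim \pi_{\theta}\bigl(\,\cdot \mid \phi_{\theta}(o_i),\, m_{-i}\bigr),

where u_i denotes the physical actuator command, m_i the communication message sent to the other agent, o_i the raw visual observation, m_{-i} the received messages, and \phi_{\theta} the learned encoder. The symbols \mathcal{U}_i, \mathcal{M}_i, and \phi_{\theta} are assumed here for illustration and are not taken from the paper.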