This paper proposes a generative probabilistic model integrating emergent
communication and multi-agent reinforcement learning. The agents plan their
actions by probabilistic inference, called control as inference, and
communicate using messages that are latent variables and estimated based on the
planned actions. Through these messages, each agent can send information about
its actions and know information about the actions of another agent. Therefore,
the agents change their actions according to the estimated messages to achieve
cooperative tasks. This inference of messages can be considered as
communication, and this procedure can be formulated by the Metropolis-Hasting
naming game. Through experiments in the grid world environment, we show that
the proposed PGM can infer meaningful messages to achieve the cooperative task