We present a computational framework for Theory of Mind (ToM): the human ability to make joint inferences about the unobservable beliefs and preferences underlying the observed actions of other agents. These mental state attributions can be understood as Bayesian inferences in a probabilistic generative model of rational action, or planning under uncertain and incomplete information, formalized as a Partially Observable Markov Decision Process (POMDP). That is, we posit that ToM inferences approximately reconstruct the combination of a reward function and belief state trajectory for an agent based on observing that agent's action sequence in a given environment. We test this POMDP model by showing human subjects the trajectories of agents moving in simple spatial environments and asking for joint inferences about the agents' utilities and their beliefs about unobserved aspects of the environment. Our model performs substantially better than two simpler variants: one in which preferences are inferred without reference to the agent's beliefs, and another in which beliefs are inferred without reference to the agent's dynamic observations in the environment. We find that preference inferences are substantially more robust and consistent with our model's predictions than are belief inferences, in line with classic work showing that the ability to infer goals is more concretely grounded in visual data, develops earlier in infancy, and can be localized to specific neurons in the primate brain.
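The joint inference described above can be sketched in miniature. The following is an illustrative toy, not the paper's model: it assumes a discrete grid of candidate reward functions and belief states, a hand-specified table of action utilities standing in for POMDP Q-values, and a softmax ("noisily rational") action likelihood; the posterior over (reward, belief) pairs then follows from Bayes' rule with a uniform prior. All hypothesis names, utilities, and the inverse-temperature parameter `beta` are made-up assumptions for illustration.

```python
import itertools
import math

# Hypothetical toy scenario: an agent repeatedly chooses "left" or "right".
# We jointly infer which goal it prefers (reward hypothesis) and what it
# believes about a hidden part of the environment (belief hypothesis).
rewards = ["prefers_A", "prefers_B"]          # candidate reward functions
beliefs = ["thinks_clear", "thinks_blocked"]  # candidate belief states

# Made-up utilities standing in for Q-values from POMDP planning.
Q = {
    ("prefers_A", "thinks_clear"):   {"left": 1.0, "right": 0.0},
    ("prefers_A", "thinks_blocked"): {"left": 0.2, "right": 0.8},
    ("prefers_B", "thinks_clear"):   {"left": 0.0, "right": 1.0},
    ("prefers_B", "thinks_blocked"): {"left": 0.1, "right": 0.9},
}

def action_likelihood(action, reward, belief, beta=2.0):
    """Softmax ('noisily rational') probability of one action given a
    (reward, belief) hypothesis; beta controls decision noise."""
    q = Q[(reward, belief)]
    z = sum(math.exp(beta * v) for v in q.values())
    return math.exp(beta * q[action]) / z

def joint_posterior(actions):
    """P(reward, belief | actions) by Bayes' rule over the discrete
    hypothesis grid, assuming a uniform prior and i.i.d. action noise."""
    post = {}
    for r, b in itertools.product(rewards, beliefs):
        like = 1.0
        for a in actions:
            like *= action_likelihood(a, r, b)
        post[(r, b)] = like  # uniform prior: posterior ∝ likelihood
    z = sum(post.values())
    return {h: p / z for h, p in post.items()}

# Observing two leftward moves favors the hypothesis under which
# moving left is most valuable.
post = joint_posterior(["left", "left"])
best = max(post, key=post.get)
```

In the full model, the utility table would be replaced by Q-values computed from POMDP planning in the actual environment, and the belief hypotheses would evolve with the agent's observations along its trajectory.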