Monte Carlo methods have become increasingly relevant for the control of
non-differentiable systems, approximate dynamics models, and learning from
data. These methods scale to high-dimensional spaces and are effective for the
non-convex optimization problems often encountered in robot learning. We examine sample-based
methods from the perspective of inference-based control, specifically posterior
policy iteration. From this perspective, we highlight how Gaussian noise priors
produce rough control actions that are unsuitable for deployment on physical
robots. Considering smoother Gaussian process priors, as used in episodic
reinforcement learning and motion planning, we demonstrate how smoother model
predictive control can be achieved using online sequential inference. This
inference is realized through an efficient factorization of the action
distribution and a novel means of optimizing the likelihood temperature to
improve importance sampling accuracy. We evaluate this approach on several
high-dimensional robot control tasks, matching the sample efficiency of prior
heuristic methods while also ensuring smoothness. Simulation results can be
seen at https://monte-carlo-ppi.github.io/.Comment: 43 pages, 37 figures. Conference on Robot Learning 202