Imitation learning considerably simplifies policy synthesis compared to
alternatives such as reinforcement learning, by exploiting access to expert
demonstrations. For such
imitation policies, errors away from the training samples are particularly
critical. Even rare slip-ups in the policy action outputs can compound quickly
over time, since they lead to unfamiliar future states where the policy is
even more likely to err, eventually causing task failures. We revisit simple
supervised ``behavior cloning'' for conveniently training the policy from
nothing more than pre-recorded demonstrations, but carefully design the model
class to counter the compounding error phenomenon. Our ``memory-consistent
neural network'' (MCNN) outputs are hard-constrained to stay within clearly
specified permissible regions anchored to prototypical ``memory'' training
samples. We provide a guaranteed upper bound for the sub-optimality gap induced
by MCNN policies. Using MCNNs on 9 imitation learning tasks, with MLP,
Transformer, and Diffusion backbones, spanning dexterous robotic manipulation
and driving, proprioceptive inputs and visual inputs, and varying sizes and
types of demonstration data, we find large and consistent gains in performance,
validating that MCNNs are better suited than vanilla deep neural networks for
imitation learning applications. Website:
https://sites.google.com/view/mcnn-imitation
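
For concreteness, the following minimal sketch (Python/NumPy) shows one way such a memory-anchored hard constraint could be realized: the output interpolates between the nearest memory's stored action and a bounded correction from the backbone network, so it matches the memory action exactly at the memory and never leaves a fixed-size box around it. The exponential-decay blending rule, the tanh box constraint, and all names here (mcnn_forward, raw_net, lam, L) are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    def mcnn_forward(x, memories_x, memories_y, raw_net, lam=1.0, L=0.5):
        # Hypothetical MCNN-style forward pass (illustrative assumption).
        # Find the nearest prototypical "memory" state to the query x.
        dists = np.linalg.norm(memories_x - x, axis=1)
        i = np.argmin(dists)
        x_m, y_m = memories_x[i], memories_y[i]
        # Blending weight decays with distance from the memory; w = 1 at the memory itself.
        w = np.exp(-lam * np.linalg.norm(x - x_m))
        # tanh caps the backbone's contribution to [-L, L] per dimension, so the
        # output stays inside a box of half-width L anchored at the memory action.
        correction = L * np.tanh(raw_net(x))
        return w * y_m + (1.0 - w) * (y_m + correction)

    # Toy usage with random stand-ins for the memories and the backbone.
    rng = np.random.default_rng(0)
    mem_x = rng.normal(size=(16, 4))           # prototypical "memory" states
    mem_y = rng.normal(size=(16, 2))           # corresponding expert actions
    backbone = lambda s: rng.normal(size=2)    # stand-in for any learned backbone
    action = mcnn_forward(rng.normal(size=4), mem_x, mem_y, backbone)

Because the weight decays with distance from the nearest memory, such a policy reproduces the expert's recorded action on the data and caps how far the network can stray from it elsewhere, which matches the intuition of countering compounding errors.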