We introduce MoCA, a Motion-Conditioned Image Animation approach for video
editing. It leverages a simple decomposition of the video editing problem into
image editing followed by motion-conditioned image animation. Furthermore,
given the lack of robust evaluation datasets for video editing, we introduce a
new benchmark that measures edit capability across a wide variety of tasks,
such as object replacement, background changes, style changes, and motion
edits. We present a comprehensive human evaluation of the latest video editing
methods alongside MoCA on our proposed benchmark. MoCA establishes a new
state of the art, with human-preference win rates over notable recent
approaches including Dreamix (63%), MasaCtrl (75%), and Tune-A-Video (72%),
and especially significant improvements for motion edits.

Comment: Project page: https://facebookresearch.github.io/MoC