Oops! Predicting Unintentional Action in Video
From just a short glance at a video, we can often tell whether a person's
action is intentional or not. Can we train a model to recognize this? We
introduce a dataset of in-the-wild videos of unintentional action, as well as a
suite of tasks for recognizing, localizing, and anticipating its onset. We
train a supervised neural network as a baseline and analyze its performance
compared to human consistency on the tasks. We also investigate self-supervised
representations that leverage natural signals in our dataset, and show the
effectiveness of an approach that uses the intrinsic speed of video to perform
competitively with highly supervised pretraining. However, a significant gap
between machine and human performance remains. The project website is available
at https://oops.cs.columbia.edu.
Comment: 11 pages, 9 figures