Automatic Generation of Labeled Data for Video-Based Human Pose Analysis via NLP applied to YouTube Subtitles
With recent advancements in computer vision as well as machine learning (ML),
video-based at-home exercise evaluation systems have become a popular topic of
current research. However, the performance of such systems depends heavily on the amount of
available training data. Since labeled datasets specific to exercising are
rare, we propose a method that makes use of the abundance of fitness videos
available online. Specifically, we exploit the fact that such videos often not
only show the exercises but also provide language as an additional source of
information. With push-ups as an example, we show that through the analysis of
subtitle data using natural language processing (NLP), it is possible to create
a labeled (irrelevant, relevant correct, relevant incorrect) dataset containing
relevant information for pose analysis. In particular, we show that irrelevant
clips () have significantly different joint visibility values compared
to relevant clips (). Inspecting cluster centroids also shows different
poses for the different classes.
Comment: 4 pages, 5 figures
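
The three-way labeling described above (irrelevant, relevant correct, relevant incorrect) could be illustrated with a very small rule-based sketch. This is not the authors' NLP pipeline, which the abstract does not specify; the `SubtitleSegment` type, the cue lists, and the `label_segment` function are all hypothetical names chosen for illustration, and simple keyword matching stands in for whatever analysis the paper actually uses.

```python
from dataclasses import dataclass


@dataclass
class SubtitleSegment:
    """One subtitle cue with its time span in seconds (hypothetical type)."""
    start: float
    end: float
    text: str


# Assumed cue lists for illustration only; a real system would need a much
# richer vocabulary or a trained language model.
EXERCISE_TERMS = ("push-up", "push up", "pushup")
ERROR_CUES = ("don't", "do not", "avoid", "mistake", "wrong", "incorrect")


def label_segment(segment: SubtitleSegment) -> str:
    """Assign one of the three class labels from the subtitle text alone."""
    text = segment.text.lower()
    # No mention of the exercise: the clip is treated as irrelevant.
    if not any(term in text for term in EXERCISE_TERMS):
        return "irrelevant"
    # The exercise is mentioned together with an error cue.
    if any(cue in text for cue in ERROR_CUES):
        return "relevant incorrect"
    return "relevant correct"


segments = [
    SubtitleSegment(0.0, 3.5, "Welcome back to the channel"),
    SubtitleSegment(3.5, 8.0, "Keep your back straight during the push-up"),
    SubtitleSegment(8.0, 12.0, "A common mistake in push ups is sagging hips"),
]
labels = [label_segment(s) for s in segments]
```

The time span on each segment is what would let a label be transferred to the corresponding video clip for pose analysis; how the clips are cut and aligned is outside this sketch.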