Learning automatically the structure of object categories remains an
important open problem in computer vision. In this paper, we propose a novel
unsupervised approach that can discover and learn landmarks in object
categories, thus characterizing their structure. Our approach is based on
factorizing image deformations, as induced by a viewpoint change or an object
deformation, by learning a deep neural network that detects landmarks
consistently with such visual effects. Furthermore, we show that the learned
landmarks establish meaningful correspondences between different object
instances in a category without having to impose this requirement explicitly.
We assess the method qualitatively on a variety of object types, natural and
man-made. We also show that our unsupervised landmarks are highly predictive of
manually-annotated landmarks in face benchmark datasets, and can be used to
regress these with a high degree of accuracy.Comment: To be published in ICCV 201