conference paper

Equivariant Unsupervised Object Detection with Learnable Riesz Transform and Composite Spatial Transformers

Abstract

Building models robust to transformations such as rotation, scale, and translation is a challenge in machine learning and computer vision. Existing approaches often provide only partial and discrete equivariance (group equivariance) or rely on supervision or very abundant data to learn equivariant representations. To achieve fine-grained equivariance from limited data, we combine and improve on both approaches. We propose a novel, learnable, Riesz-transform-based architecture that achieves built-in group equivariance for translation, rotation, and scale. We combine it with a Spatial Transformer Network (STN) tailored for the sequential estimation of composite transformations, reducing the combinatorial data requirements for learning fine-grained equivariance. Improved generalization guarantees and extensive experiments demonstrate that our approach improves over state-of-the-art methods in unsupervised representation learning and object discovery, particularly in low-data regimes.
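
For readers unfamiliar with the Riesz transform mentioned above, the following is a minimal sketch of the classical (fixed, non-learnable) first-order Riesz transform of a 2D image, computed in the Fourier domain with the multiplier -i ω_j / |ω|. It is not the learnable architecture proposed in the paper; the function name `riesz_transform` and the use of NumPy's FFT with periodic boundary conditions are assumptions made purely for illustration.

```python
import numpy as np

def riesz_transform(image):
    """Classical first-order Riesz transform of a 2D array (illustrative only).

    Assumes periodic boundaries, because it relies on the discrete FFT.
    Returns the two components (along x and along y).
    """
    h, w = image.shape
    # Discrete frequency grids along rows (y) and columns (x).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    norm = np.sqrt(fx ** 2 + fy ** 2)
    norm[0, 0] = 1.0  # avoid 0/0 at the DC term; the numerator is 0 there anyway

    spectrum = np.fft.fft2(image)
    # Frequency-domain multiplier of the j-th Riesz component: -i * w_j / |w|.
    r_x = np.real(np.fft.ifft2(-1j * fx / norm * spectrum))
    r_y = np.real(np.fft.ifft2(-1j * fy / norm * spectrum))
    return r_x, r_y

# Usage: the transform commutes with (circular) translations of the input.
rng = np.random.default_rng(0)
img = rng.standard_normal((16, 16))
r_x, r_y = riesz_transform(img)
s_x, s_y = riesz_transform(np.roll(img, (3, 5), axis=(0, 1)))
print(np.allclose(s_x, np.roll(r_x, (3, 5), axis=(0, 1))))
```

Because the multiplier depends only on the direction ω / |ω| and not on the magnitude |ω|, the continuous Riesz transform also commutes with rescaling of the input, which is the property that makes it a natural building block for scale-equivariant models. The discrete sketch above only demonstrates translation equivariance.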


Licence: info:eu-repo/semantics/OpenAccess