We consider the problem of simultaneously clustering and learning a linear
representation of data lying close to a union of low-dimensional manifolds, a
fundamental task in machine learning and computer vision. When the manifolds
are assumed to be linear subspaces, this reduces to the classical problem of
subspace clustering, which has been studied extensively over the past two
decades. Unfortunately, many real-world datasets, such as natural images, cannot
be well approximated by linear subspaces. On the other hand, numerous works
have attempted to learn an appropriate transformation of the data, such that
the data is mapped from a union of general non-linear manifolds to a union of
linear subspaces (with points from the same manifold being mapped to the same
subspace). However, many existing works have limitations such as assuming
knowledge of the membership of samples to clusters, requiring high sampling
density, or being shown theoretically to learn trivial representations. In this
paper, we propose to optimize the Maximal Coding Rate Reduction metric with
respect to both the data representation and a novel doubly stochastic cluster
membership, inspired by state-of-the-art subspace clustering results. We give a
parameterization of such a representation and membership, allowing efficient
mini-batching and one-shot initialization. Experiments on CIFAR-10, -20, -100,
and TinyImageNet-200 datasets show that the proposed method is much more
accurate and scalable than state-of-the-art deep clustering methods, and
further learns a latent linear representation of the data.
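The core objective referenced above, Maximal Coding Rate Reduction (MCR²), admits a compact closed form: the coding rate of all representations minus the membership-weighted coding rates of each cluster. The following is a minimal numpy sketch of that objective with soft memberships; the function names, the value of eps, and the toy two-subspace data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    # R(Z) = 1/2 logdet(I + d/(n * eps^2) * Z Z^T), Z is (d, n)
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * Z @ Z.T)
    return 0.5 * logdet

def mcr2(Z, Pi, eps=0.5):
    # Pi is an (n, k) soft membership matrix with rows summing to 1.
    # Objective: expand all points globally, compress each cluster separately.
    d, n = Z.shape
    compress = 0.0
    for j in range(Pi.shape[1]):
        w = Pi[:, j]                      # soft weights for cluster j
        n_j = w.sum()                     # effective cluster size
        if n_j < 1e-8:
            continue
        _, logdet = np.linalg.slogdet(
            np.eye(d) + (d / (n_j * eps**2)) * (Z * w) @ Z.T
        )
        compress += (n_j / n) * 0.5 * logdet
    return coding_rate(Z, eps) - compress
```

Intuitively, a membership that matches the underlying subspaces makes each cluster cheap to encode, yielding a large positive objective, while a uniform membership makes the compression term coincide with the global coding rate and drives the objective to zero.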