3 research outputs found
Do Not Escape From the Manifold: Discovering the Local Coordinates on the Latent Space of GANs
The discovery of disentanglement properties in the latent space of GANs has
motivated much research into finding semantically meaningful directions in
it. In this paper, we suggest that the disentanglement property is closely
related to the geometry of the latent space. In this regard, we propose an
unsupervised method for finding the semantic-factorizing directions on the
intermediate latent space of GANs based on the local geometry. Intuitively, our
proposed method, called Local Basis, finds the principal variation of the
latent space in the neighborhood of the base latent variable. Experimental
results show that the local principal variation corresponds to the semantic
factorization and that traversing along it yields robust image
traversal. Moreover, we suggest an explanation for the limited success in
finding the global traversal directions in the latent space, especially W-space
of StyleGAN2. We show that W-space is warped globally by comparing the local
geometry, discovered by Local Basis, using a metric on the Grassmannian
manifold. The global warpage implies that the latent space is not well-aligned
globally and therefore the global traversal directions are bound to show
limited success on it.
Comment: 23 pages, 19 figures
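The two geometric ideas in this abstract can be illustrated with a small NumPy sketch. It reads "principal variation in the neighborhood of the base latent variable" as the top singular vectors of the Jacobian of a latent mapping at a point, and compares two such local subspaces with the geodesic distance on the Grassmannian manifold, computed from principal angles. Both readings are assumptions made for illustration: the abstract does not spell out the algorithm or say which Grassmannian metric is used. `toy_mapping`, `numerical_jacobian`, `local_basis`, and `grassmann_distance` are hypothetical names, and the toy network merely stands in for a GAN sub-network.

```python
import numpy as np

def toy_mapping(z, A, B):
    """Hypothetical stand-in for a GAN sub-network mapping an input latent z
    to an intermediate latent w (not StyleGAN2's actual mapping network)."""
    return B @ np.tanh(A @ z)

def numerical_jacobian(f, z, eps=1e-5):
    """Central-difference Jacobian of f at z, with shape (dim_w, dim_z)."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        cols.append((f(z + step) - f(z - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def local_basis(f, z, k):
    """Top-k left singular vectors of the Jacobian at z: orthonormal directions
    in w-space along which f varies most in a neighborhood of z."""
    J = numerical_jacobian(f, z)
    U, S, _ = np.linalg.svd(J, full_matrices=False)
    return U[:, :k], S[:k]

def grassmann_distance(U1, U2):
    """Geodesic distance between the subspaces spanned by the orthonormal
    columns of U1 and U2, i.e. the 2-norm of their principal angles."""
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(np.linalg.norm(angles))

# Assumed toy setup: 6-dimensional input latent, 8-dimensional intermediate latent.
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 6))
B = rng.normal(size=(8, 8))
f = lambda z: toy_mapping(z, A, B)

z1, z2 = rng.normal(size=6), rng.normal(size=6)
U1, S1 = local_basis(f, z1, k=3)
U2, S2 = local_basis(f, z2, k=3)

d_self = grassmann_distance(U1, U1)   # close to zero, up to floating-point error
d_cross = grassmann_distance(U1, U2)  # larger when the local subspaces differ
print(d_self, d_cross)
```

In this toy setting, a large distance between the local subspaces at two base points is the kind of signal the abstract describes as warpage: a direction that is principal at one point need not lie in the principal subspace at another, which is why a single global direction may traverse poorly.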
Disentangling Visual Embeddings with Minimal Distributional Assumptions
Interest in understanding and factorizing embedding spaces learned by deep
encoders is growing. Concept discovery methods search the embedding spaces for
interpretable latent components like object shape or color and disentangle them
into individual axes in the embedding space. Yet, the applicability of modern
disentanglement learning techniques or independent component analysis (ICA) is
limited for vision tasks: they either require training a model of
the complex image-generating process or rely on rigid stochastic independence
assumptions on the component distribution that are violated in practice. In this
work, we identify components in encoder embedding spaces without distributional
assumptions and without training a generator. Instead, we utilize functional
compositionality properties of image-generating processes. We derive two novel
post-hoc component discovery methods and prove theoretical identifiability
guarantees. We study them in realistic visual disentanglement tasks with
correlated components and violated functional assumptions. Our approaches
stably maintain superior performance against 300+ state-of-the-art
disentanglement and component analysis models.
Comment: 23 pages. The first two authors contributed equally