3 research outputs found

    Do Not Escape From the Manifold: Discovering the Local Coordinates on the Latent Space of GANs

    Full text link
    The discovery of the disentanglement properties of the latent space in GANs motivated a lot of research to find the semantically meaningful directions on it. In this paper, we suggest that the disentanglement property is closely related to the geometry of the latent space. In this regard, we propose an unsupervised method for finding the semantic-factorizing directions on the intermediate latent space of GANs based on the local geometry. Intuitively, our proposed method, called Local Basis, finds the principal variation of the latent space in the neighborhood of the base latent variable. Experimental results show that the local principal variation corresponds to the semantic factorization and traversing along it provides strong robustness to image traversal. Moreover, we suggest an explanation for the limited success in finding the global traversal directions in the latent space, especially W-space of StyleGAN2. We show that W-space is warped globally by comparing the local geometry, discovered from Local Basis, through the metric on Grassmannian Manifold. The global warpage implies that the latent space is not well-aligned globally and therefore the global traversal directions are bound to show limited success on it.Comment: 23 pages, 19 figure

    Disentangling Visual Embeddings with Minimal Distributional Assumptions

    Full text link
    Interest in understanding and factorizing embedding spaces learned by deep encoders is growing. Concept discovery methods search the embedding spaces for interpretable latent components like object shape or color and disentangle them into individual axes in the embedding space. Yet, the applicability of modern disentanglement learning techniques or independent component analysis (ICA) is limited when it comes to vision tasks: They either require training a model of the complex image-generating process or their rigid stochastic independence assumptions on the component distribution are violated in practice. In this work, we identify components in encoder embedding spaces without distributional assumptions and without training a generator. Instead, we utilize functional compositionality properties of image-generating processes. We derive two novel post-hoc component discovery methods and prove theoretical identifiability guarantees. We study them in realistic visual disentanglement tasks with correlated components and violated functional assumptions. Our approaches stably maintain superior performance against 300+ state-of-the-art disentanglement and component analysis models.Comment: 23 pages. The first two authors contributed equall
    corecore