Closed-Loop Unsupervised Representation Disentanglement with $\beta$-VAE Distillation and Diffusion Probabilistic Feedback

Jin, Xin; Li, Bohan; Li, Ziqiang; Liu, Jinming; Xie, BAAO; Yang, Tao; Zeng, Wenjun; Zhang, Wenyao

Publication date: 4 February 2024

Abstract

Representation disentanglement may help AI fundamentally understand the real world and thus benefit both discrimination and generation tasks. It currently has at least three unresolved core issues: (i) heavy reliance on label annotation and synthetic data, which causes poor generalization to natural scenarios; (ii) heuristic, hand-crafted disentangling constraints, which make it hard to adaptively reach an optimal training trade-off; and (iii) the lack of a reasonable evaluation metric, especially for real label-free data. To address these challenges, we propose a Closed-Loop unsupervised representation Disentanglement approach dubbed CL-Dis. Specifically, we use a diffusion-based autoencoder (Diff-AE) as the backbone while resorting to $\beta$-VAE as a co-pilot to extract semantically disentangled representations. The strong generation ability of the diffusion model and the good disentanglement ability of the VAE model are complementary. To strengthen disentangling, VAE-latent distillation and diffusion-wise feedback are interconnected in a closed-loop system for further mutual promotion. Then, a self-supervised Navigation strategy is introduced to identify interpretable semantic directions in the disentangled latent space. Finally, a new metric based on content tracking is designed to evaluate the disentanglement effect. Experiments demonstrate the superiority of CL-Dis on applications such as real image manipulation and visual analysis.
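
The abstract does not spell out any loss function. For orientation only, the following is a minimal sketch of the standard $\beta$-VAE objective on which the "co-pilot" builds, assuming a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term. The function name `beta_vae_loss` and the default `beta=4.0` are illustrative assumptions, not taken from the paper, and the distillation and diffusion-feedback terms of CL-Dis are not shown.

```python
import numpy as np


def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Standard beta-VAE objective (illustrative sketch, not the CL-Dis loss).

    x, x_recon : arrays of shape (batch, features)
    mu, log_var: posterior mean and log-variance, shape (batch, latent_dim)
    beta       : weight on the KL term; beta > 1 encourages disentanglement
    """
    # Reconstruction term: squared error summed over features, averaged over the batch.
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior,
    # summed over latent dimensions and averaged over the batch.
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    return recon + beta * kl, recon, kl


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 16))
    x_recon = x + 0.1 * rng.normal(size=x.shape)
    mu = rng.normal(size=(8, 4))
    log_var = np.zeros((8, 4))
    total, recon, kl = beta_vae_loss(x, x_recon, mu, log_var)
    print(f"total={total:.4f} recon={recon:.4f} kl={kl:.4f}")
```

With `beta = 1` this reduces to the ordinary VAE objective; larger values trade reconstruction quality for a more factorized latent code, which is the kind of hand-tuned trade-off listed as issue (ii) above.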

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2402.02346

Last time updated on 28/08/2024