We never go out of Style: Motion Disentanglement by Subspace Decomposition of Latent Space
Real-world objects perform complex motions that involve multiple independent
motion components. For example, while talking, a person continuously changes
their facial expression, head pose, and body pose. In this work, we propose a novel method
to decompose motion in videos by using a pretrained image GAN model. We
discover disentangled motion subspaces in the latent space of widely used
style-based GAN models that are semantically meaningful and control a single
explainable motion component. The proposed method requires only a few
ground-truth video sequences to obtain such subspaces. We extensively evaluate
the disentanglement properties of motion subspaces on face and car datasets,
quantitatively and qualitatively. Further, we present results for multiple
downstream tasks such as motion editing and selective motion transfer, e.g.,
transferring only facial expressions without training specifically for that task.
Comment: AI for content creation, CVPRW-202
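A minimal sketch of how such a motion subspace might be discovered, assuming per-frame latent codes of a video have already been obtained by inverting frames into a pretrained style-based GAN's latent space. All variable names, shapes, and the choice of SVD over frame-to-frame latent differences are illustrative assumptions, not the authors' exact method:

```python
import numpy as np

# Hypothetical input: per-frame latent codes of one video, obtained by
# inverting each frame into a pretrained StyleGAN latent space (T x D).
T, D = 120, 512
w = np.random.randn(T, D)  # stand-in for real inverted latents

# Frame-to-frame differences approximate instantaneous motion directions.
deltas = w[1:] - w[:-1]                # (T-1) x D
deltas = deltas - deltas.mean(axis=0)  # center before decomposition

# SVD of the stacked motion directions yields an orthonormal basis whose
# leading right-singular vectors span a low-rank motion subspace.
_, S, Vt = np.linalg.svd(deltas, full_matrices=False)
k = 8                         # subspace rank (a free design choice)
motion_basis = Vt[:k]         # k x D

# Editing: move a source latent along one basis direction to vary a
# single motion component while (ideally) leaving the others fixed.
w_src = w[0]
alpha = 3.0                   # edit strength
w_edited = w_src + alpha * motion_basis[0]
```

Feeding `w_edited` back through the GAN's generator would then render the edited frame; selective motion transfer would amount to copying only the projections onto the chosen subspace from a driving sequence.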
Exploring Attribute Variations in Style-based GANs using Diffusion Models
Existing attribute editing methods treat semantic attributes as binary,
resulting in a single edit per attribute. However, attributes such as
eyeglasses, smiles, or hairstyles exhibit a vast range of diversity. In this
work, we formulate the task of diverse attribute editing by modeling
the multidimensional nature of attribute edits. This enables users to generate
multiple plausible edits per attribute. We capitalize on disentangled latent
spaces of pretrained GANs and train a Denoising Diffusion Probabilistic Model
(DDPM) to learn the latent distribution for diverse edits. Specifically, we
train DDPM over a dataset of edit latent directions obtained by embedding image
pairs with a single attribute change. This leads to latent subspaces that
enable diverse attribute editing. Applying diffusion in the highly compressed
latent space allows us to model rich distributions of edits within limited
computational resources. Through extensive qualitative and quantitative
experiments conducted across a range of datasets, we demonstrate the
effectiveness of our approach for diverse attribute editing. We also showcase
results of our method applied to 3D editing of various face attributes.
Comment: NeurIPS Workshop on Diffusion Models 202
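A minimal sketch of training a DDPM over edit latent directions, assuming a dataset of difference vectors dw = w_edited - w_source from a pretrained GAN's latent space. The MLP denoiser, noise schedule, and all sizes are illustrative assumptions; only the standard epsilon-prediction diffusion loss is taken as given:

```python
import torch
import torch.nn as nn

# Hypothetical setup: edit directions are 512-d latent vectors; we fit a
# small DDPM to their distribution to sample diverse edits per attribute.
D, T_STEPS = 512, 1000
betas = torch.linspace(1e-4, 0.02, T_STEPS)
alphas_cum = torch.cumprod(1.0 - betas, dim=0)

class Denoiser(nn.Module):
    """Tiny MLP noise predictor; the timestep enters as a scalar feature."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, 1024), nn.SiLU(),
            nn.Linear(1024, 1024), nn.SiLU(),
            nn.Linear(1024, dim),
        )
    def forward(self, x, t):
        t_feat = t.float().unsqueeze(-1) / T_STEPS  # normalized timestep
        return self.net(torch.cat([x, t_feat], dim=-1))

model = Denoiser(D)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
dw = torch.randn(4096, D)  # stand-in for real edit-direction dataset

for step in range(100):    # toy training loop
    x0 = dw[torch.randint(0, dw.size(0), (64,))]
    t = torch.randint(0, T_STEPS, (64,))
    noise = torch.randn_like(x0)
    a = alphas_cum[t].unsqueeze(-1)
    xt = a.sqrt() * x0 + (1 - a).sqrt() * noise     # forward diffusion
    loss = ((model(xt, t) - noise) ** 2).mean()     # epsilon-prediction loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the diffusion runs over compact latent vectors rather than pixels, both training and ancestral sampling stay cheap; sampled directions are added to a source latent to realize distinct edits of the same attribute.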
Strata-NeRF: Neural Radiance Fields for Stratified Scenes
Neural Radiance Field (NeRF) approaches learn the underlying 3D
representation of a scene and generate photo-realistic novel views with high
fidelity. However, most proposed settings concentrate on modelling a single
object or a single level of a scene. In the real world, though, we may capture
a scene at multiple levels, resulting in a layered capture. For example,
tourists usually capture a monument's exterior structure before capturing the
inner structure. Modelling such scenes in 3D with seamless switching between
levels can drastically improve immersive experiences. However, most existing
techniques struggle to model such scenes. We propose Strata-NeRF, a single
neural radiance field that implicitly captures a scene with multiple levels.
Strata-NeRF achieves this by conditioning the NeRFs on Vector Quantized (VQ)
latent representations, which allow sudden changes in scene structure. We
evaluate the effectiveness of our approach on a multi-layered synthetic dataset
comprising diverse scenes and then further validate its generalization on the
real-world RealEstate10K dataset. We find that Strata-NeRF effectively captures
stratified scenes, produces fewer artifacts, and synthesizes higher-fidelity
views than existing approaches.
Comment: ICCV 2023, Project Page: https://ankitatiisc.github.io/Strata-NeRF
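A minimal sketch of conditioning a radiance-field MLP on a vector-quantized per-level code, so that switching codebook entries can model abrupt structural changes between scene levels. The class name, network sizes, and the straight-through VQ estimator are illustrative assumptions rather than the paper's exact architecture:

```python
import torch
import torch.nn as nn

class VQConditionedNeRF(nn.Module):
    """Radiance-field MLP conditioned on a quantized scene-level latent."""
    def __init__(self, pos_dim=63, code_dim=64, n_codes=16, hidden=256):
        super().__init__()
        self.codebook = nn.Embedding(n_codes, code_dim)  # one code per level
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),   # RGB + density
        )

    def quantize(self, z):
        # Nearest-codebook-entry lookup with a straight-through gradient.
        d = torch.cdist(z, self.codebook.weight)   # B x n_codes
        idx = d.argmin(dim=-1)
        zq = self.codebook(idx)
        return z + (zq - z).detach()               # straight-through estimator

    def forward(self, x_enc, z):
        # x_enc: positionally encoded 3D sample points; z: continuous level latent.
        zq = self.quantize(z)
        out = self.mlp(torch.cat([x_enc, zq], dim=-1))
        rgb, sigma = out[..., :3].sigmoid(), out[..., 3].relu()
        return rgb, sigma
```

The discrete bottleneck is the key design choice here: a continuous latent would interpolate smoothly between levels, whereas a quantized code lets the field snap between distinct level-specific geometries during volume rendering.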