10 research outputs found

    A Decoupled 3D Facial Shape Model by Adversarial Training

    Data-driven generative 3D face models are used to compactly encode facial shape data into meaningful parametric representations. A desirable property of these models is their ability to effectively decouple natural sources of variation, in particular identity and expression. While factorized representations have been proposed for that purpose, they are still limited in the variability they can capture and may present modeling artifacts when applied to tasks such as expression transfer. In this work, we explore a new direction with Generative Adversarial Networks and show that they contribute to better face modeling performance, especially in decoupling natural factors, while also achieving more diverse samples. To train the model, we introduce a novel architecture that combines a 3D generator with a 2D discriminator that leverages conventional CNNs, where the two components are bridged by a geometry mapping layer. We further present a training scheme, based on auxiliary classifiers, to explicitly disentangle identity and expression attributes. Through quantitative and qualitative results on standard face datasets, we illustrate the benefits of our model and demonstrate that it outperforms competing state-of-the-art methods in terms of decoupling and diversity. Comment: camera-ready version for ICCV'19.
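
    The abstract describes a 3D generator bridged to a conventional 2D CNN discriminator by a geometry mapping layer, with auxiliary classifiers used to disentangle identity and expression. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the geometry mapping is approximated by a fixed vertex-to-pixel scatter (the paper's actual layer may differ), and the mesh resolution, layer sizes, and class counts are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): a 3D generator whose per-vertex
# output is resampled onto a 2D "geometry image" so a conventional CNN
# discriminator with auxiliary identity/expression heads can consume it.
import torch
import torch.nn as nn

N_VERTS, IMG = 1024, 32     # assumed mesh size and 2D grid resolution
N_ID, N_EXPR = 10, 7        # assumed identity / expression class counts

class Generator(nn.Module):
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + N_ID + N_EXPR, 256), nn.ReLU(),
            nn.Linear(256, N_VERTS * 3))

    def forward(self, z, id_onehot, expr_onehot):
        x = torch.cat([z, id_onehot, expr_onehot], dim=1)
        return self.net(x).view(-1, N_VERTS, 3)   # per-vertex 3D coordinates

class GeometryMapping(nn.Module):
    """Scatters per-vertex xyz into a 3 x IMG x IMG geometry image via a
    fixed vertex-to-pixel assignment (random here; UV-based in practice)."""
    def __init__(self):
        super().__init__()
        self.register_buffer("pix", torch.randint(0, IMG * IMG, (N_VERTS,)))

    def forward(self, verts):                      # (B, N_VERTS, 3)
        B = verts.shape[0]
        img = torch.zeros(B, 3, IMG * IMG, device=verts.device)
        img[:, :, self.pix] = verts.transpose(1, 2)
        return img.view(B, 3, IMG, IMG)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten())
        feat = 64 * (IMG // 4) ** 2
        self.adv = nn.Linear(feat, 1)              # real/fake score
        self.id_head = nn.Linear(feat, N_ID)       # auxiliary identity classifier
        self.expr_head = nn.Linear(feat, N_EXPR)   # auxiliary expression classifier

    def forward(self, img):
        h = self.cnn(img)
        return self.adv(h), self.id_head(h), self.expr_head(h)
```

    Training would combine the usual adversarial loss on the real/fake score with cross-entropy losses on the two auxiliary heads, pushing identity and expression information into their respective conditioning codes.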

    Generating High-Resolution 3D Faces and Bodies Using VQ-VAE-2 with PixelSNAIL Networks on 2D Representations

    Modeling and representing 3D shapes of the human body and face is a prominent field due to its applications in the healthcare, clothing, and movie industries. In our work, we tackled the problem of 3D face and body synthesis by reducing 3D meshes to 2D image representations. We show that the face can naturally be modeled on a 2D grid. At the same time, for more challenging 3D body geometries, we proposed a novel non-bijective 3D–2D conversion method representing the 3D body mesh as a plurality of rendered projections on the 2D grid. Then, we trained a state-of-the-art vector-quantized variational autoencoder (VQ-VAE-2) to learn a latent representation of 2D images and fit a PixelSNAIL autoregressive model to sample novel synthetic meshes. We evaluated our method versus a classical one based on principal component analysis (PCA) by sampling from the empirical cumulative distribution of the PCA scores. We used the empirical distributions of two commonly used metrics, specificity and diversity, to quantitatively demonstrate that the synthetic faces generated with our method are statistically closer to real faces when compared with the PCA ones. Our experiment on the 3D body geometry requires further research to match the test set statistics but shows promising results.
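
    The PCA baseline in this abstract samples new shapes from the empirical cumulative distribution of the PCA scores. A minimal sketch of that baseline, assuming flattened fixed-topology meshes and an illustrative component count (the paper's exact settings are not given here):

```python
# Minimal sketch (illustrative, not the paper's code) of the PCA baseline:
# fit PCA to flattened training meshes, then sample each score column by
# inverse-transform sampling from its empirical CDF.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
train = rng.standard_normal((500, 3 * 1024))  # stand-in for 500 meshes of 1024 vertices

pca = PCA(n_components=50)
scores = pca.fit_transform(train)             # (500, 50) PCA scores

def sample_empirical(col, n, rng):
    """Inverse-transform sampling from the empirical CDF of one score column."""
    sorted_col = np.sort(col)
    u = rng.uniform(0, 1, size=n)
    # interpolate the empirical quantile function
    return np.interp(u, np.linspace(0, 1, len(sorted_col)), sorted_col)

n_samples = 10
new_scores = np.stack(
    [sample_empirical(scores[:, j], n_samples, rng) for j in range(scores.shape[1])],
    axis=1)
synthetic = pca.inverse_transform(new_scores)  # (10, 3*1024) synthetic meshes
```

    Sampling each score column independently is a natural choice because PCA scores are decorrelated on the training set, though full independence is still an assumption of this sketch.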

    DSM-Net: Disentangled Structured Mesh Net for Controllable Generation of Fine Geometry

    3D shape generation is a fundamental operation in computer graphics. While significant progress has been made, especially with recent deep generative models, it remains a challenge to synthesize high-quality geometric shapes with rich detail and complex structure, in a controllable manner. To tackle this, we introduce DSM-Net, a deep neural network that learns a disentangled structured mesh representation for 3D shapes, where two key aspects of shapes, geometry and structure, are encoded in a synergistic manner to ensure plausibility of the generated shapes, while also being disentangled as much as possible. This supports a range of novel shape generation applications with intuitive control, such as interpolation of structure (geometry) while keeping geometry (structure) unchanged. To achieve this, we simultaneously learn structure and geometry through variational autoencoders (VAEs) in a hierarchical manner for both, with bijective mappings at each level. In this manner, we effectively encode geometry and structure in separate latent spaces, while ensuring their compatibility: the structure is used to guide the geometry and vice versa. At the leaf level, the part geometry is represented using a conditional part VAE, to encode high-quality geometric details, guided by the structure context as the condition. Our method not only supports controllable generation applications, but also produces high-quality synthesized shapes, outperforming state-of-the-art methods.
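
    At the leaf level, the abstract describes a conditional part VAE that encodes part geometry guided by a structure context. A minimal PyTorch sketch under assumed dimensions, not DSM-Net's actual architecture:

```python
# Minimal sketch (assumptions throughout, not DSM-Net itself) of a leaf-level
# conditional part VAE: part geometry is encoded and decoded conditioned on a
# structure-context vector, so geometric detail lives in the latent code while
# the condition carries structural information. Dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

GEOM_DIM, CTX_DIM, Z_DIM = 3 * 256, 64, 32  # assumed part / context / latent dims

class ConditionalPartVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(GEOM_DIM + CTX_DIM, 256), nn.ReLU())
        self.mu, self.logvar = nn.Linear(256, Z_DIM), nn.Linear(256, Z_DIM)
        self.dec = nn.Sequential(
            nn.Linear(Z_DIM + CTX_DIM, 256), nn.ReLU(),
            nn.Linear(256, GEOM_DIM))

    def forward(self, geom, ctx):
        h = self.enc(torch.cat([geom, ctx], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        recon = self.dec(torch.cat([z, ctx], dim=1))             # decoder is conditioned too
        return recon, mu, logvar

def vae_loss(recon, geom, mu, logvar, beta=1e-3):
    rec = F.mse_loss(recon, geom)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kld
```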

    DSG-Net: Learning disentangled structure and geometry for 3D shape generation

    3D shape generation is a fundamental operation in computer graphics. While significant progress has been made, especially with recent deep generative models, it remains a challenge to synthesize high-quality shapes with rich geometric details and complex structures, in a controllable manner. To tackle this, we introduce DSG-Net, a deep neural network that learns a disentangled structured & geometric mesh representation for 3D shapes, where two key aspects of shapes, geometry and structure, are encoded in a synergistic manner to ensure plausibility of the generated shapes, while also being disentangled as much as possible. This supports a range of novel shape generation applications with disentangled control, such as interpolation of structure (geometry) while keeping geometry (structure) unchanged. To achieve this, we simultaneously learn structure and geometry through variational autoencoders (VAEs) in a hierarchical manner for both, with bijective mappings at each level. In this manner, we effectively encode geometry and structure in separate latent spaces, while ensuring their compatibility: the structure is used to guide the geometry and vice versa. At the leaf level, the part geometry is represented using a conditional part VAE, to encode high-quality geometric details, guided by the structure context as the condition. Our method not only supports controllable generation applications, but also produces high-quality synthesized shapes, outperforming state-of-the-art methods.
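
    The disentangled control described here amounts to interpolating one latent code while holding the other fixed. A minimal sketch with a hypothetical `decode` standing in for a trained DSG-Net-style decoder:

```python
# Minimal sketch (hypothetical API) of disentangled control: with separate
# structure and geometry latents, interpolating one code while holding the
# other fixed changes only that factor of the generated shape.
import torch

def decode(z_struct, z_geom):
    # placeholder for a trained decoder mapping (z_struct, z_geom) to a mesh
    return torch.cat([z_struct, z_geom], dim=-1)

z_struct_a, z_struct_b = torch.randn(2, 1, 64).unbind(0)  # structure codes of two shapes
z_geom = torch.randn(1, 32)                               # shared geometry code, kept fixed

for t in torch.linspace(0, 1, 5):
    z_struct = (1 - t) * z_struct_a + t * z_struct_b      # lerp structure only
    mesh = decode(z_struct, z_geom)                       # geometry stays unchanged
```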
