131 research outputs found
A Style-Based Generator Architecture for Generative Adversarial Networks
We propose an alternative generator architecture for generative adversarial
networks, borrowing from style transfer literature. The new architecture leads
to an automatically learned, unsupervised separation of high-level attributes
(e.g., pose and identity when trained on human faces) and stochastic variation
in the generated images (e.g., freckles, hair), and it enables intuitive,
scale-specific control of the synthesis. The new generator improves the
state-of-the-art in terms of traditional distribution quality metrics, leads to
demonstrably better interpolation properties, and also better disentangles the
latent factors of variation. To quantify interpolation quality and
disentanglement, we propose two new, automated methods that are applicable to
any generator architecture. Finally, we introduce a new, highly varied and
high-quality dataset of human faces.Comment: CVPR 2019 final versio
MegaPortraits: One-shot Megapixel Neural Head Avatars
In this work, we advance the neural head avatar technology to the megapixel
resolution while focusing on the particularly challenging task of cross-driving
synthesis, i.e., when the appearance of the driving image is substantially
different from the animated source image. We propose a set of new neural
architectures and training methods that can leverage both medium-resolution
video data and high-resolution image data to achieve the desired levels of
rendered image quality and generalization to novel views and motion. We
demonstrate that suggested architectures and methods produce convincing
high-resolution neural avatars, outperforming the competitors in the
cross-driving scenario. Lastly, we show how a trained high-resolution neural
avatar model can be distilled into a lightweight student model which runs in
real-time and locks the identities of neural avatars to several dozens of
pre-defined source images. Real-time operation and identity lock are essential
for many practical applications head avatar systems
Re-Training StyleGAN -- A First Step Towards Building Large, Scalable Synthetic Facial Datasets
StyleGAN is a state-of-art generative adversarial network architecture that
generates random 2D high-quality synthetic facial data samples. In this paper,
we recap the StyleGAN architecture and training methodology and present our
experiences of retraining it on a number of alternative public datasets.
Practical issues and challenges arising from the retraining process are
discussed. Tests and validation results are presented and a comparative
analysis of several different re-trained StyleGAN weightings is provided 1. The
role of this tool in building large, scalable datasets of synthetic facial data
is also discussed
TEGLO: High Fidelity Canonical Texture Mapping from Single-View Images
Recent work in Neural Fields (NFs) learn 3D representations from
class-specific single view image collections. However, they are unable to
reconstruct the input data preserving high-frequency details. Further, these
methods do not disentangle appearance from geometry and hence are not suitable
for tasks such as texture transfer and editing. In this work, we propose TEGLO
(Textured EG3D-GLO) for learning 3D representations from single view
in-the-wild image collections for a given class of objects. We accomplish this
by training a conditional Neural Radiance Field (NeRF) without any explicit 3D
supervision. We equip our method with editing capabilities by creating a dense
correspondence mapping to a 2D canonical space. We demonstrate that such
mapping enables texture transfer and texture editing without requiring meshes
with shared topology. Our key insight is that by mapping the input image pixels
onto the texture space we can achieve near perfect reconstruction (>= 74 dB
PSNR at 1024^2 resolution). Our formulation allows for high quality 3D
consistent novel view synthesis with high-frequency details at megapixel image
resolution
- …