Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
3D-aware generative adversarial networks (GANs) synthesize high-fidelity and
multi-view-consistent facial images using only collections of single-view 2D
imagery. Towards fine-grained control over facial attributes, recent efforts
incorporate 3D Morphable Face Models (3DMMs) to describe deformation in
generative radiance fields either explicitly or implicitly. Explicit methods
provide fine-grained expression control but cannot handle topological changes
caused by hair and accessories, while implicit ones can model varied topologies
but generalize poorly because their deformation fields are unconstrained.
We propose a novel 3D GAN framework for unsupervised learning of generative,
high-quality and 3D-consistent facial avatars from unstructured 2D images. To
achieve both deformation accuracy and topological flexibility, we propose a 3D
representation called Generative Texture-Rasterized Tri-planes. The proposed
representation learns Generative Neural Textures on top of parametric mesh
templates and then projects them onto three orthogonally viewed feature planes
through rasterization, forming a tri-plane feature representation for volume
rendering. In this way, we combine the fine-grained expression control of
mesh-guided explicit deformation with the flexibility of implicit volumetric
representation. We further propose dedicated modules for modeling the mouth
interior, which is not covered by the 3DMM. Our method demonstrates
state-of-the-art 3D-aware synthesis quality and animation ability through
extensive experiments. Furthermore, serving as 3D prior, our animatable 3D
representation boosts multiple applications including one-shot facial avatars
and 3D-aware stylization.
Comment: Project page: https://mrtornado24.github.io/Next3D
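To make the representation concrete, here is a minimal PyTorch sketch of tri-plane feature sampling of the kind this abstract describes. It assumes the three orthogonal feature planes have already been produced by rasterizing the generative neural texture from the mesh; the plane resolution, channel count, and summation-based aggregation are illustrative assumptions, not the authors' exact design.
```python
# Minimal tri-plane feature sampling sketch (illustrative, not the
# authors' code). Planes are assumed to come from rasterizing a
# generative neural texture onto the axis-aligned XY, XZ, YZ planes.
import torch
import torch.nn.functional as F

def sample_triplane(planes: torch.Tensor, pts: torch.Tensor) -> torch.Tensor:
    """planes: (3, C, H, W) feature planes; pts: (N, 3) points in [-1, 1]^3.
    Returns (N, C) aggregated features for volume rendering."""
    # Project each 3D point onto the three orthogonal planes.
    coords = torch.stack([pts[:, [0, 1]],   # XY plane
                          pts[:, [0, 2]],   # XZ plane
                          pts[:, [1, 2]]])  # YZ plane -> (3, N, 2)
    grid = coords.unsqueeze(1)               # (3, 1, N, 2) for grid_sample
    feats = F.grid_sample(planes, grid, mode='bilinear',
                          align_corners=False)   # (3, C, 1, N)
    return feats.squeeze(2).sum(dim=0).t()       # sum over planes -> (N, C)

planes = torch.randn(3, 32, 256, 256)   # toy feature planes
pts = torch.rand(4096, 3) * 2 - 1       # query points along camera rays
features = sample_triplane(planes, pts)  # fed to an MLP for density/color
```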
Breathing New Life into 3D Assets with Generative Repainting
Diffusion-based text-to-image models ignited immense attention from the
vision community, artists, and content creators. Broad adoption of these models
is driven by significant improvements in generation quality and by efficient
conditioning on various modalities, not just text. However, lifting the rich
generative priors of these 2D models into 3D is challenging. Recent works have
proposed various pipelines powered by the entanglement of diffusion models and
neural fields. We explore the power of pretrained 2D diffusion models and
standard 3D neural radiance fields as independent, standalone tools and
demonstrate their ability to work together in a non-learned fashion. Such
modularity has the intrinsic advantage of easing partial upgrades, which has
become an important property in such a fast-paced domain. Our pipeline accepts any
legacy renderable geometry, such as textured or untextured meshes, orchestrates
the interaction between 2D generative refinement and 3D consistency enforcement
tools, and outputs a painted input geometry in several formats. We conduct a
large-scale study on a wide range of objects and categories from the
ShapeNetSem dataset and demonstrate the advantages of our approach, both
qualitatively and quantitatively. Project page:
https://www.obukhov.ai/repainting_3d_asset
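The alternation between the two standalone tools can be pictured with a short schematic sketch. Every function below (render, diffusion_img2img, fit_nerf, bake_texture) is a hypothetical placeholder for a pipeline stage named in the abstract, not the authors' actual API, and the number of rounds is an arbitrary assumption.
```python
# Schematic sketch of the repaint/reconcile alternation described above.
# All helper names are hypothetical stand-ins for pipeline stages.
def repaint_asset(mesh, prompt, views, num_rounds=4):
    images = [render(mesh, v) for v in views]          # legacy geometry in
    for _ in range(num_rounds):
        # 2D stage: refine each view independently with a pretrained,
        # frozen text-conditioned diffusion model (image-to-image mode).
        images = [diffusion_img2img(img, prompt) for img in images]
        # 3D stage: fit a standard NeRF to the refined views; re-rendering
        # from it enforces multi-view consistency across the set.
        field = fit_nerf(images, views)
        images = [field.render(v) for v in views]
    # Bake the final consistent views back onto the input geometry.
    return bake_texture(mesh, images, views)
```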
Surface-based flow visualization
This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by Elsevier and can be found at: http://www.journals.elsevier.com/computers-and-graphics/.
With increasing computing power, it is possible to process more complex fluid simulations. However, a gap still remains between increasing data sizes and our ability to visualize them. Despite the great amount of progress that has been made in the field of flow visualization over the last two decades, a number of challenges remain. Whilst the visualization of 2D flow has many good solutions, the visualization of 3D flow still poses many problems. Challenges such as domain coverage, speed of computation, and perception remain key directions for further research. Flow visualization with a focus on surface-based techniques forms the basis of this literature survey, including surface construction techniques and visualization methods applied to surfaces. We detail our investigation into these algorithms with discussions of their applicability and their relative strengths and drawbacks. We review the most important challenges when considering such visualizations. The result is an up-to-date overview of the current state of the art that highlights both solved and unsolved problems in this rapidly evolving branch of research.
Keywords: Flow visualization, Survey, Surface
Computational Aesthetics for Fashion
The online fashion industry is growing fast and, with it, the need for advanced systems able to solve different tasks automatically and accurately. With the rapid advance of digital technologies, Deep Learning has played an important role in Computational Aesthetics, an interdisciplinary area that tries to bridge fine art, design, and computer science. Specifically, Computational Aesthetics aims to automate human aesthetic judgments with computational methods. In this thesis, we focus on three applications of computer vision in fashion, and we discuss how Computational Aesthetics helps solve them accurately.
Generative Semi-supervised Learning with Meta-Optimized Synthetic Samples
Semi-supervised learning (SSL) is a promising approach for training deep
classification models using labeled and unlabeled datasets. However, existing
SSL methods rely on a large unlabeled dataset, which may not always be
available in many real-world applications due to legal constraints (e.g.,
GDPR). In this paper, we investigate the research question: Can we train SSL
models without real unlabeled datasets? Instead of using real unlabeled
datasets, we propose an SSL method using synthetic datasets generated from
generative foundation models trained on datasets containing millions of samples
in diverse domains (e.g., ImageNet). Our main idea is to identify synthetic
samples that emulate unlabeled samples from generative foundation models and
to train classifiers on these synthetic samples. To achieve this,
our method is formulated as an alternating optimization problem: (i)
meta-learning of generative foundation models and (ii) SSL of classifiers using
real labeled and synthetic unlabeled samples. For (i), we propose a
meta-learning objective that optimizes latent variables to generate samples
that resemble real labeled samples and minimize the validation loss. For (ii),
we propose a simple unsupervised loss function that regularizes the feature
extractors of classifiers to maximize the performance improvement obtained from
synthetic samples. We confirm that our method outperforms baselines using
generative foundation models on SSL. We also demonstrate that our methods
outperform SSL using real unlabeled datasets in scenarios with extremely small
amounts of labeled data. This suggests that synthetic samples have the
potential to provide performance gains more efficiently than real unlabeled
data.
Comment: Accepted to the 15th Asian Conference on Machine Learning (ACML2023); a preprint of the camera-ready version
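The alternating optimization reads naturally as two interleaved gradient loops. The sketch below is an illustrative PyTorch rendering of that structure; every helper (inner_adapt, validation_loss, resemblance_loss, supervised_loss, unsupervised_regularizer) is a hypothetical stand-in for a component named in the abstract, and the learning rates and round count are placeholders.
```python
# Illustrative sketch of the alternating optimization described above.
# Helper functions are hypothetical stand-ins, not the paper's code.
import torch

def train(classifier, generator, z, labeled, val, rounds=100):
    opt_z = torch.optim.Adam([z], lr=1e-3)                      # stage (i)
    opt_c = torch.optim.Adam(classifier.parameters(), lr=1e-4)  # stage (ii)
    for _ in range(rounds):
        # (i) Meta-learning of latents: one differentiable inner step on
        # the synthetic samples, so the validation loss of the adapted
        # classifier carries gradients back to z (second-order in practice).
        synthetic = generator(z)
        adapted = inner_adapt(classifier, labeled, synthetic)
        loss_meta = (validation_loss(adapted, val)
                     + resemblance_loss(synthetic, labeled))
        opt_z.zero_grad(); loss_meta.backward(); opt_z.step()
        # (ii) SSL of the classifier: supervised loss on real labeled data
        # plus an unsupervised regularizer on synthetic "unlabeled" samples.
        loss_ssl = (supervised_loss(classifier, labeled)
                    + unsupervised_regularizer(classifier,
                                               generator(z).detach()))
        opt_c.zero_grad(); loss_ssl.backward(); opt_c.step()
    return classifier
```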
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars
Synthesizing high-fidelity head avatars is a central problem for computer
vision and graphics. While head avatar synthesis algorithms have advanced
rapidly, the best ones still face great obstacles in real-world scenarios. One
of the key causes is inadequate datasets: 1) current public datasets can
only support researchers to explore high-fidelity head avatars in one or two
task directions; 2) these datasets usually contain digital head assets with
limited data volume, and narrow distribution over different attributes. In this
paper, we present RenderMe-360, a comprehensive 4D human head dataset built to
drive advances in head avatar research. It contains massive data assets, with 243+
million complete head frames, and over 800k video sequences from 500 different
identities captured by synchronized multi-view cameras at 30 FPS. It is a
large-scale digital library for head avatars with three key attributes: 1) High
Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K
cameras in 360 degrees. 2) High Diversity: The collected subjects vary from
different ages, eras, ethnicities, and cultures, providing abundant materials
with distinctive styles in appearance and geometry. Moreover, each subject is
asked to perform various motions, such as expressions and head rotations, which
further extend the richness of assets. 3) Rich Annotations: we provide
annotations with different granularities: cameras' parameters, matting, scan,
2D/3D facial landmarks, FLAME fitting, and text description.
Based on the dataset, we build a comprehensive benchmark for head avatar
research, with 16 state-of-the-art methods performed on five main tasks: novel
view synthesis, novel expression synthesis, hair rendering, hair editing, and
talking head generation. Our experiments uncover the strengths and weaknesses
of current methods. RenderMe-360 opens the door for future exploration in head
avatars.
Comment: Technical Report; Project Page: 36; Github Link: https://github.com/RenderMe-360/RenderMe-36
AI-generated Content for Various Data Modalities: A Survey
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D
assets, and other media using AI algorithms. Due to its wide range of
applications and the demonstrated potential of recent works, AIGC has recently
attracted substantial attention, and AIGC methods have been
developed for various data modalities, such as image, video, text, 3D shape (as
voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human
avatar (body and head), 3D motion, and audio -- each presenting different
characteristics and challenges. Furthermore, there have also been many
significant developments in cross-modality AIGC methods, where generative
methods can receive conditioning input in one modality and produce outputs in
another. Examples include going from various modalities to image, video, 3D
shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and avatar),
and audio modalities. In this paper, we provide a comprehensive review of AIGC
methods across different data modalities, including both single-modality and
cross-modality methods, highlighting the various challenges, representative
works, and recent technical directions in each setting. We also survey the
representative datasets throughout the modalities, and present comparative
results for various modalities. Moreover, we also discuss the challenges and
potential future research directions.
Mobile Wound Assessment and 3D Modeling from a Single Image
The prevalence of camera-enabled mobile phones has made mobile wound assessment a viable treatment option for millions of previously difficult-to-reach patients. We have designed a complete mobile wound assessment platform to ameliorate the many challenges related to chronic wound care. Chronic wounds and infections are the most severe, costly, and fatal types of wounds, placing them at the center of mobile wound assessment. Wound physicians assess thousands of single-view wound images from all over the world, and it may be difficult to determine the location of the wound on the body, for example, if the image is taken at close range. In our solution, end-users capture an image of the wound with their mobile camera. The wound image is segmented and classified using modern convolutional neural networks and is stored securely in the cloud for remote tracking. We use an interactive, semi-automated approach to allow users to specify the location of the wound on the body. To accomplish this, we have created, to the best of our knowledge, the first 3D human surface anatomy labeling system, based on the current NYU and Anatomy Mapper labeling systems. To interactively view wounds in 3D, we present an efficient projective texture mapping algorithm for texturing wounds onto a 3D human anatomy model; a sketch of the idea follows below. In so doing, we have demonstrated an approach to 3D wound reconstruction that works even for a single wound image.
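The projective texture mapping step admits a compact sketch: project each mesh vertex through the camera that captured the wound photo and reuse the resulting pixel coordinates as texture coordinates. The NumPy function below is a minimal illustration under standard pinhole-camera assumptions, not the authors' implementation; occlusion handling is deliberately omitted.
```python
# Minimal projective texture mapping sketch (illustrative only).
# Each visible vertex is projected through the wound photo's camera;
# the normalized pixel coordinates become its UV texture coordinates.
import numpy as np

def projective_uvs(vertices, K, R, t, img_w, img_h):
    """vertices: (N, 3) mesh vertices in world space.
    K: (3, 3) intrinsics; R, t: world-to-camera rotation/translation.
    Returns (N, 2) UVs in [0, 1] and a visibility mask (in front of the
    camera and inside the image; depth-buffer occlusion tests omitted)."""
    cam = vertices @ R.T + t            # world -> camera space
    in_front = cam[:, 2] > 1e-6         # keep points in front of camera
    proj = cam @ K.T                    # camera -> homogeneous pixels
    px = proj[:, :2] / proj[:, 2:3]     # perspective divide
    inside = ((px[:, 0] >= 0) & (px[:, 0] < img_w) &
              (px[:, 1] >= 0) & (px[:, 1] < img_h))
    uv = px / np.array([img_w, img_h])  # normalize to texture coordinates
    return uv, in_front & inside
```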