SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections
In this work, we present SceneDreamer, an unconditional generative model for
unbounded 3D scenes, which synthesizes large-scale 3D landscapes from random
noise. Our framework is learned from in-the-wild 2D image collections only,
without any 3D annotations. At the core of SceneDreamer is a principled
learning paradigm comprising 1) an efficient yet expressive 3D scene
representation, 2) a generative scene parameterization, and 3) an effective
renderer that can leverage the knowledge from 2D images. Our approach begins
with an efficient bird's-eye-view (BEV) representation generated from simplex
noise, which includes a height field for surface elevation and a semantic field
for detailed scene semantics. This BEV scene representation enables 1)
representing a 3D scene with quadratic complexity, 2) disentangled geometry and
semantics, and 3) efficient training. Moreover, we propose a novel generative
neural hash grid to parameterize the latent space based on 3D positions and
scene semantics, aiming to encode generalizable features across various scenes.
Lastly, a neural volumetric renderer, learned from 2D image collections through
adversarial training, is employed to produce photorealistic images. Extensive
experiments demonstrate the effectiveness of SceneDreamer and its superiority over
state-of-the-art methods in generating vivid yet diverse unbounded 3D worlds.
Comment: Project page: https://scene-dreamer.github.io/ Code: https://github.com/FrozenBurning/SceneDreame
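For intuition, here is a minimal sketch of what a BEV scene representation built from procedural noise might look like: a height field plus a per-cell semantic field derived from elevation. The value-noise generator (a stand-in for the simplex noise used in the paper), the grid sizes, and the elevation-to-semantics thresholds are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def value_noise(res=256, grid=8, seed=0):
    """Smooth 2D noise via bilinear upsampling of a coarse random grid.
    A simple stand-in for the simplex noise SceneDreamer starts from."""
    rng = np.random.default_rng(seed)
    coarse = rng.standard_normal((grid + 1, grid + 1))
    xs = np.linspace(0, grid, res, endpoint=False)   # positions in coarse-grid units
    x0 = xs.astype(int)
    t = xs - x0
    # separable bilinear interpolation: first along rows, then along columns
    rows = (1 - t)[:, None] * coarse[x0] + t[:, None] * coarse[x0 + 1]
    return (1 - t)[None, :] * rows[:, x0] + t[None, :] * rows[:, x0 + 1]

def bev_scene(res=256, seed=0):
    """Hypothetical BEV representation: a height field plus a semantic field."""
    height = value_noise(res, grid=8, seed=seed)
    height = (height - height.min()) / (height.max() - height.min() + 1e-8)
    # toy semantics derived from elevation: 0=water, 1=sand, 2=grass, 3=rock, 4=snow
    semantics = np.digitize(height, bins=[0.3, 0.35, 0.6, 0.8])
    return height, semantics

height, semantics = bev_scene()
print(height.shape, semantics.shape, np.unique(semantics))
```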
Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
High-quality HDRIs (High Dynamic Range Images), typically HDR panoramas, are
one of the most popular ways to create photorealistic lighting and 360-degree
reflections of 3D scenes in graphics. Given the difficulty of capturing HDRIs,
a versatile and controllable generative model is highly desired, where layman
users can intuitively control the generation process. However, existing
state-of-the-art methods still struggle to synthesize high-quality panoramas
for complex scenes. In this work, we propose a zero-shot text-driven framework,
Text2Light, to generate 4K+ resolution HDRIs without paired training data.
Given a free-form text as the description of the scene, we synthesize the
corresponding HDRI with two dedicated steps: 1) text-driven panorama generation
in low dynamic range (LDR) and low resolution, and 2) super-resolution inverse
tone mapping to scale up the LDR panorama both in resolution and dynamic range.
Specifically, to achieve zero-shot text-driven panorama generation, we first
build dual codebooks as the discrete representation for diverse environmental
textures. Then, driven by the pre-trained CLIP model, a text-conditioned global
sampler learns to sample holistic semantics from the global codebook according
to the input text. Furthermore, a structure-aware local sampler learns to
synthesize LDR panoramas patch-by-patch, guided by holistic semantics. To
achieve super-resolution inverse tone mapping, we derive a continuous
representation of 360-degree imaging from the LDR panorama as a set of
structured latent codes anchored to the sphere. This continuous representation
enables a versatile module to upscale the resolution and dynamic range
simultaneously. Extensive experiments demonstrate the superior capability of
Text2Light in generating high-quality HDR panoramas. In addition, we show the
feasibility of our work in realistic rendering and immersive VR.
Comment: SIGGRAPH Asia 2022; Project page: https://frozenburning.github.io/projects/text2light/ Code: https://github.com/FrozenBurning/Text2Ligh
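As a rough illustration of the "latent codes anchored to the sphere" idea, the sketch below maps equirectangular pixels to unit view directions and queries a set of spherical anchors, so the same codes can be read out at any target resolution. The nearest-neighbour lookup, the anchor count, and the code dimension are simplifying assumptions rather than Text2Light's actual module.

```python
import numpy as np

def equirect_directions(h, w):
    """Unit view directions for every pixel of an equirectangular panorama."""
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi        # longitude in [-pi, pi)
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi        # latitude in [pi/2, -pi/2]
    lon, lat = np.meshgrid(lon, lat)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)                    # (h, w, 3)

def query_sphere_codes(dirs, anchors, codes):
    """Return, for each query direction, the latent code of the nearest anchor.
    `anchors` (n, 3) and `codes` (n, d) stand in for codes anchored to the sphere;
    nearest-neighbour lookup replaces whatever interpolation the real module uses."""
    sim = dirs.reshape(-1, 3) @ anchors.T                      # cosine similarity
    return codes[sim.argmax(axis=1)].reshape(*dirs.shape[:2], -1)

# toy usage: the same anchored codes queried on a denser pixel grid for upscaling
rng = np.random.default_rng(0)
anchors = rng.standard_normal((128, 3))
anchors /= np.linalg.norm(anchors, axis=1, keepdims=True)
codes = rng.standard_normal((128, 16))
low = query_sphere_codes(equirect_directions(64, 128), anchors, codes)
high = query_sphere_codes(equirect_directions(256, 512), anchors, codes)
print(low.shape, high.shape)    # (64, 128, 16) (256, 512, 16)
```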
Day-Ahead Energy Planning with 100% Electric Vehicle Penetration in the Nordic Region by 2050
This paper presents the day-ahead energy planning of passenger cars with 100% electric vehicle (EV) penetration in the Nordic region by 2050. EVs will play an important role in future energy systems, as they can both reduce greenhouse gas (GHG) emissions from the transport sector and provide the demand-side flexibility required by smart grids. On the other hand, EVs will increase electricity consumption. To quantify the consumption increase caused by 100% EV penetration in the Nordic region, and thereby facilitate power system planning studies, the day-ahead energy planning of EVs is investigated under different charging scenarios. Five EV charging scenarios are considered: uncontrolled charging all day, uncontrolled charging at home, timed charging, spot-price-based charging all day, and spot-price-based charging at home. The resulting demand profiles show that timed charging is the least favorable option and that spot-price-based charging may induce high peak demands. EV charging will account for a considerable share of energy consumption in the future Nordic power system.
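A toy sketch of the spot-price-based charging scenario: charge greedily in the cheapest hours of the availability window until the daily energy need is met. The price series, battery demand, and charger rating are illustrative values, not the paper's data or model.

```python
import numpy as np

def spot_price_charging(prices, required_kwh, max_kw, available_hours):
    """Greedy spot-price-based charging: fill the cheapest available hours first.
    Hourly slots; energy per slot is capped by the charger rating."""
    schedule = np.zeros(len(prices))
    remaining = required_kwh
    for h in sorted(available_hours, key=lambda h: prices[h]):
        if remaining <= 0:
            break
        energy = min(max_kw, remaining)     # 1-hour slot at max_kw delivers max_kw kWh
        schedule[h] = energy
        remaining -= energy
    return schedule

# illustrative day-ahead spot prices (EUR/kWh) and a home-charging window
prices = np.array([0.30, 0.28, 0.25, 0.22, 0.20, 0.21, 0.27, 0.35, 0.40, 0.38,
                   0.33, 0.31, 0.30, 0.29, 0.28, 0.30, 0.36, 0.45, 0.48, 0.42,
                   0.36, 0.32, 0.29, 0.27])
home_hours = list(range(0, 7)) + list(range(18, 24))
schedule = spot_price_charging(prices, required_kwh=10.0, max_kw=3.7,
                               available_hours=home_hours)
print(schedule)   # charging concentrated in the cheapest night hours
```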
SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis
Neural Radiance Field (NeRF) significantly degrades when only a limited
number of views are available. To complement the lack of 3D information,
depth-based models, such as DSNeRF and MonoSDF, explicitly assume the
availability of accurate depth maps of multiple views. They linearly scale the
accurate depth maps as supervision to guide the predicted depth of few-shot
NeRFs. However, accurate depth maps are difficult and expensive to capture due
to the wide range of depth values encountered in the wild.
In this work, we present a new Sparse-view NeRF (SparseNeRF) framework that
exploits depth priors from real-world inaccurate observations. The inaccurate
depth observations are either from pre-trained depth models or coarse depth
maps of consumer-level depth sensors. Since coarse depth maps are not strictly
scaled to the ground-truth depth maps, we propose a simple yet effective
constraint, a local depth ranking method, on NeRFs such that the expected depth
ranking of the NeRF is consistent with that of the coarse depth maps in local
patches. To preserve the spatial continuity of the estimated depth of NeRF, we
further propose a spatial continuity constraint to encourage the consistency of
the expected depth continuity of NeRF with coarse depth maps. Surprisingly,
with simple depth ranking constraints, SparseNeRF outperforms all
state-of-the-art few-shot NeRF methods (including depth-based models) on
standard LLFF and DTU datasets. Moreover, we collect a new dataset NVS-RGBD
that contains real-world depth maps from Azure Kinect, ZED 2, and iPhone 13
Pro. Extensive experiments on NVS-RGBD dataset also validate the superiority
and generalizability of SparseNeRF. Project page is available at
https://sparsenerf.github.io/.
Comment: Technical report; project page: https://sparsenerf.github.io
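Below is a minimal sketch of what a local depth ranking constraint and a spatial continuity term could look like in PyTorch. The patch size, the random-pair sampling, the hinge margin, and the smoothness mask are simplifying assumptions; they illustrate the idea of enforcing rank consistency with a coarse depth map rather than reproducing SparseNeRF's implementation.

```python
import torch

def local_depth_ranking_loss(pred_depth, coarse_depth, patch=8, margin=1e-4):
    """Hinge-style ranking loss: within each local patch, a pixel pair whose depth
    order contradicts the coarse depth map is penalized."""
    h, w = pred_depth.shape
    hp, wp = h - h % patch, w - w % patch        # crop to a multiple of the patch size

    def to_patches(x):
        x = x[:hp, :wp].reshape(hp // patch, patch, wp // patch, patch)
        return x.permute(0, 2, 1, 3).reshape(-1, patch * patch)

    p, c = to_patches(pred_depth), to_patches(coarse_depth)
    # sample one random pixel pair per patch
    idx = torch.randint(0, patch * patch, (p.shape[0], 2), device=pred_depth.device)
    pi, pj = p.gather(1, idx[:, :1]), p.gather(1, idx[:, 1:])
    ci, cj = c.gather(1, idx[:, :1]), c.gather(1, idx[:, 1:])
    sign = torch.sign(cj - ci)                   # +1 where the coarse map says i is closer
    return torch.relu(sign * (pi - pj) + margin).mean()

def depth_continuity_loss(pred_depth, coarse_depth, tau=0.01):
    """Encourage the predicted depth to be smooth wherever the coarse depth is smooth."""
    dp = (pred_depth[:, 1:] - pred_depth[:, :-1]).abs()
    dc = (coarse_depth[:, 1:] - coarse_depth[:, :-1]).abs()
    return (dp * (dc < tau)).mean()

# toy usage on random maps; in practice pred would be the NeRF's expected depth
pred = torch.rand(64, 64, requires_grad=True)
coarse = torch.rand(64, 64)
loss = local_depth_ranking_loss(pred, coarse) + depth_continuity_loss(pred, coarse)
loss.backward()
print(float(loss))
```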
PERF: Panoramic Neural Radiance Field from a Single Panorama
Neural Radiance Field (NeRF) has achieved substantial progress in novel view
synthesis given multi-view images. Recently, some works have attempted to train
a NeRF from a single image with 3D priors. They mainly focus on a limited field
of view with a few occlusions, which greatly limits their scalability to
real-world 360-degree panoramic scenarios with large-size occlusions. In this
paper, we present PERF, a 360-degree novel view synthesis framework that trains
a panoramic neural radiance field from a single panorama. Notably, PERF allows
3D roaming in a complex scene without expensive and tedious image collection.
To achieve this goal, we propose a novel collaborative RGBD inpainting method
and a progressive inpainting-and-erasing method to lift a 360-degree 2D scene
to a 3D scene. Specifically, we first predict a panoramic depth map as
initialization given a single panorama and reconstruct visible 3D regions with
volume rendering. Then we introduce a collaborative RGBD inpainting approach
into a NeRF for completing RGB images and depth maps from random views, which
is derived from an RGB Stable Diffusion model and a monocular depth estimator.
Finally, we introduce an inpainting-and-erasing strategy to avoid inconsistent
geometry between a newly-sampled view and reference views. The two components
are integrated into the learning of NeRFs in a unified optimization framework
and achieve promising results. Extensive experiments on Replica and a new
dataset PERF-in-the-wild demonstrate the superiority of our PERF over
state-of-the-art methods. PERF can be widely used for real-world applications
such as panorama-to-3D, text-to-3D, and 3D scene stylization. Project page and
code are available at https://perf-project.github.io/ and
https://github.com/perf-project/PeRF.
Comment: Project page: https://perf-project.github.io/ Code: https://github.com/perf-project/PeR
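To illustrate the erasing step, the sketch below flags newly inpainted pixels whose depth disagrees with geometry already reconstructed from the reference views, so they can be discarded before the next inpainting round. The relative-error test, the threshold, and the random toy inputs are assumptions for illustration only, not PERF's exact criterion.

```python
import numpy as np

def erase_mask(inpainted_depth, reference_depth, known_mask, rel_thresh=0.05):
    """Mark inpainted pixels whose depth conflicts with already-reconstructed
    geometry (rendered into the new view), so they can be erased before the
    next round of inpainting."""
    rel_err = np.abs(inpainted_depth - reference_depth) / np.maximum(reference_depth, 1e-6)
    return (rel_err > rel_thresh) & known_mask   # conflict only counts where geometry is known

# toy usage on random data; in PERF these maps would come from volume rendering
# of the current NeRF and from the RGBD inpainting of a newly sampled view
rng = np.random.default_rng(0)
ref = rng.uniform(1.0, 5.0, size=(128, 256))          # depth from reconstructed geometry
inp = ref + rng.normal(0.0, 0.2, size=ref.shape)      # inpainted depth with noise
known = rng.uniform(size=ref.shape) > 0.5              # where reference geometry exists
mask = erase_mask(inp, ref, known)
print(mask.mean())   # fraction of pixels scheduled for erasing
```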
PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation
We present PrimDiffusion, the first diffusion-based framework for 3D human
generation. Devising diffusion models for 3D human generation is difficult due
to the intensive computational cost of 3D representations and the articulated
topology of 3D humans. To tackle these challenges, our key insight is operating
the denoising diffusion process directly on a set of volumetric primitives,
which models the human body as a number of small volumes with radiance and
kinematic information. This volumetric primitives representation marries the
capacity of volumetric representations with the efficiency of primitive-based
rendering. Our PrimDiffusion framework has three appealing properties: 1)
compact and expressive parameter space for the diffusion model, 2) flexible 3D
representation that incorporates human prior, and 3) decoder-free rendering for
efficient novel-view and novel-pose synthesis. Extensive experiments validate
that PrimDiffusion outperforms state-of-the-art methods in 3D human generation.
Notably, compared to GAN-based methods, our PrimDiffusion supports real-time
rendering of high-quality 3D humans once the denoising process is done. We also
demonstrate the flexibility of our framework
on training-free conditional generation such as texture transfer and 3D
inpainting.
Comment: NeurIPS 2023; Project page: https://frozenburning.github.io/projects/primdiffusion/ Code: https://github.com/FrozenBurning/PrimDiffusio
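As a rough picture of the primitive-based parameter space, the sketch below lays out K primitives, each with a pose, a scale, and a small radiance grid, flattens them into one tensor, and applies a generic diffusion-style noising step. The primitive count, voxel resolution, channel layout, and noise schedule are illustrative assumptions, not PrimDiffusion's configuration.

```python
import torch

# A toy layout for the parameter space a diffusion model could denoise:
# K volumetric primitives, each with a pose, a scale and a small radiance grid.
K, V = 512, 8                                   # primitives, voxels per side
pose = torch.zeros(K, 3 + 4)                    # position (3) + rotation quaternion (4)
pose[:, 3] = 1.0                                # identity rotation
scale = torch.full((K, 3), 0.05)                # per-axis extent of each primitive
radiance = torch.rand(K, 4, V, V, V)            # RGB + density stored per voxel

# flatten everything into one (K, D) tensor: a compact, fixed-size parameter
# space that a denoising network can treat like any other sample
x0 = torch.cat([pose, scale, radiance.reshape(K, -1)], dim=1)
print(x0.shape)                                 # torch.Size([512, 2058])

# one step of a generic variance-preserving forward (noising) process on that tensor
t = torch.tensor(0.3)                           # noise level in [0, 1]
noise = torch.randn_like(x0)
x_t = torch.sqrt(1 - t) * x0 + torch.sqrt(t) * noise
print(x_t.shape)
```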
- …