WaveNeRF: Wavelet-based Generalizable Neural Radiance Fields
Neural Radiance Field (NeRF) has shown impressive performance in novel view
synthesis via implicit scene representation. However, it usually suffers from
poor scalability, as it requires densely sampled images for each new scene.
Several studies have attempted to mitigate this problem by integrating the
Multi-View Stereo (MVS) technique into NeRF, but they still entail a
cumbersome fine-tuning process for new scenes. Notably, the rendering quality
drops severely without this fine-tuning process, and the errors mainly
appear around high-frequency features. In light of this observation, we
design WaveNeRF, which integrates wavelet frequency decomposition into MVS and
NeRF to achieve generalizable yet high-quality synthesis without any per-scene
optimization. To preserve high-frequency information when generating 3D feature
volumes, WaveNeRF builds Multi-View Stereo in the Wavelet domain by integrating
the discrete wavelet transform into the classical cascade MVS, which
disentangles high-frequency information explicitly. With that, disentangled
frequency features can be injected into classic NeRF via a novel hybrid neural
renderer to yield faithful high-frequency details, and an intuitive
frequency-guided sampling strategy can be designed to suppress artifacts around
high-frequency regions. Extensive experiments over three widely studied
benchmarks show that WaveNeRF achieves superior generalizable radiance field
modeling when only given three images as input.
Comment: Accepted to ICCV 2023. Project website: https://mxuai.github.io/WaveNeRF
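The frequency disentanglement described above can be illustrated with a minimal single-level 2D Haar discrete wavelet transform. This is only a sketch of the general technique: the actual WaveNeRF pipeline applies the DWT to multi-view feature volumes inside a cascade MVS network, not to a raw 2D array as shown here.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT: splits a 2D map into one low-frequency
    subband (LL) and three high-frequency subbands (LH, HL, HH).
    x: (H, W) array with even H and W."""
    # Pairwise averages (low-pass) and differences (high-pass) along rows.
    lo_r = (x[0::2, :] + x[1::2, :]) / 2.0
    hi_r = (x[0::2, :] - x[1::2, :]) / 2.0
    # Repeat along columns to obtain the four subbands.
    ll = (lo_r[:, 0::2] + lo_r[:, 1::2]) / 2.0
    lh = (lo_r[:, 0::2] - lo_r[:, 1::2]) / 2.0
    hl = (hi_r[:, 0::2] + hi_r[:, 1::2]) / 2.0
    hh = (hi_r[:, 0::2] - hi_r[:, 1::2]) / 2.0
    return ll, lh, hl, hh

feat = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(feat)
```

The LL subband is a smoothed, downsampled copy of the input, while LH/HL/HH isolate the high-frequency detail that, per the abstract, a hybrid renderer can re-inject to recover fine details.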
C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction
There is an emerging effort to combine two popular 3D frameworks,
Multi-View Stereo (MVS) and Neural Implicit Surfaces (NIS), with a specific
focus on the few-shot / sparse-view setting. In this paper, we introduce a
novel integration scheme that combines multi-view stereo with neural signed
distance function representations, which potentially overcomes the limitations
of both methods. MVS uses per-view depth estimation and cross-view fusion to
generate accurate surfaces, while NIS relies on a common coordinate volume.
Based on this strategy, we propose to construct per-view cost frustums for
finer geometry estimation, and then fuse cross-view frustums and estimate the
implicit signed distance functions to tackle artifacts caused by noise
and holes in the produced surface reconstruction. We further apply a cascade
frustum fusion strategy to effectively capture global-local information and
structural consistency. Finally, we apply cascade sampling and a
pseudo-geometric loss to foster stronger integration between the two
architectures. Extensive experiments demonstrate that our method reconstructs
robust surfaces and outperforms existing state-of-the-art methods.
Comment: Accepted by ICCV 2023
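The cross-view fusion step can be sketched with a common MVS pattern: aggregate per-view features at each 3D sample point via their mean and variance (variance encodes cross-view photometric consistency), producing a fused feature that a downstream SDF network would consume. This is an assumed, simplified stand-in; the paper's actual cascade frustum fusion is more elaborate.

```python
import numpy as np

def fuse_views(per_view_feats):
    """Fuse per-view features for N 3D points.
    per_view_feats: (V, N, C) array, V views, C channels per point.
    Returns (N, 2C): per-point mean and variance across views; low variance
    suggests the views agree, i.e. the point is near the true surface."""
    mean = per_view_feats.mean(axis=0)
    var = per_view_feats.var(axis=0)
    return np.concatenate([mean, var], axis=-1)

# Toy example: 2 views, 5 sample points, 8 feature channels.
feats = np.stack([np.ones((5, 8)), 3.0 * np.ones((5, 8))])
fused = fuse_views(feats)
```

In a full pipeline, `fused` would be fed (with positional encodings) to an MLP predicting the signed distance at each point; the names here are hypothetical.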
Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells
Learning-based multi-view stereo (MVS) methods focus on predicting accurate
depth maps to achieve an accurate and complete 3D representation. Despite their
excellent performance, existing methods ignore the fact that a suitable depth
geometry is also critical in MVS. In this paper, we demonstrate that different
depth geometries show significant performance gaps, even under the same depth
prediction error. We therefore introduce an ideal depth geometry composed of
Saddle-Shaped Cells, whose predicted depth map oscillates upward and downward
around the ground-truth surface rather than maintaining a continuous and
smooth depth plane. To achieve this, we develop a coarse-to-fine framework called
Dual-MVSNet (DMVSNet), which can produce an oscillating depth plane.
Technically, we predict two depth values for each pixel (Dual-Depth), and
propose a novel loss function and a checkerboard-shaped selecting strategy to
constrain the predicted depth geometry. Compared to existing methods, DMVSNet
achieves a high rank on the DTU benchmark and obtains top performance on
challenging scenes of Tanks and Temples, demonstrating its strong performance
and generalization ability. Our method also points to a new research direction
of considering depth geometry in MVS.
Comment: Accepted by ICCV 2023
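The checkerboard-shaped selecting strategy can be illustrated roughly as follows: given two depth hypotheses per pixel (Dual-Depth), alternate between them in a checkerboard pattern so the composed depth map oscillates around the surface. This is an assumed illustration of the idea only; DMVSNet's actual selection rule and loss are defined in the paper.

```python
import numpy as np

def checkerboard_select(depth_a, depth_b):
    """Compose two per-pixel depth hypotheses by alternating them in a
    checkerboard pattern, so adjacent pixels come from different hypotheses
    and the result oscillates rather than forming one smooth plane."""
    h, w = depth_a.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    mask = (yy + xx) % 2 == 0  # True on "black" checkerboard squares
    return np.where(mask, depth_a, depth_b)

# Toy example: one hypothesis slightly above, one slightly below a surface
# at depth 1.0.
above = np.full((2, 2), 1.1)
below = np.full((2, 2), 0.9)
composed = checkerboard_select(above, below)
```

If `above` and `below` bracket the true surface, the composed map crosses it between neighboring pixels, which is the saddle-shaped oscillation the abstract argues for.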