CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting
With the advent of diffusion-based generative models and their ability to
generate text-conditioned images, content generation has been massively
invigorated. Recently, these models have been shown to provide useful guidance
for the generation of 3D graphics assets. However, existing work in
text-conditioned 3D generation faces fundamental constraints: (i) an inability
to generate detailed, multi-object scenes, (ii) an inability to textually
control multi-object configurations, and (iii) an inability to compose scenes
in a physically realistic way.
In this work, we propose CG3D, a method for compositionally generating scalable
3D assets that resolves these constraints. We find that explicit Gaussian
radiance fields, parameterized to allow for compositions of objects, possess
the capability to enable semantically and physically consistent scenes. By
utilizing a guidance framework built around this explicit representation, we
show state-of-the-art results that can even exceed the guiding diffusion
model in terms of object combinations and physical accuracy.
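The compositional property the abstract leans on can be sketched concretely (a minimal illustration, not CG3D's actual code; `place_object`, `compose_scene`, and the rigid-pose parameterization are names assumed here): because each object is an explicit set of 3D Gaussians, a per-object rotation R, translation t, and scale s place it in the shared scene in closed form, mapping N(mu, Sigma) to N(s*R*mu + t, s^2 * R Sigma R^T).

```python
import numpy as np

def place_object(means, covs, R, t, s=1.0):
    """Rigidly place one object's 3D Gaussians into the scene frame.

    means -- (N, 3) Gaussian centers in the object's local frame
    covs  -- (N, 3, 3) Gaussian covariances in the local frame
    R     -- (3, 3) rotation, t -- (3,) translation, s -- uniform scale
    A Gaussian N(mu, Sigma) maps to N(s*R*mu + t, s^2 * R Sigma R^T).
    """
    new_means = s * means @ R.T + t
    new_covs = (s ** 2) * (R @ covs @ R.T)   # matmul broadcasts over N
    return new_means, new_covs

def compose_scene(objects):
    """Concatenate independently posed objects into one Gaussian scene."""
    means, covs = zip(*(place_object(*obj) for obj in objects))
    return np.concatenate(means), np.concatenate(covs)
```

Since the pose acts on means and covariances in closed form, composing a scene is just concatenating independently posed Gaussian sets, which is plausibly why the explicit representation lends itself to multi-object control.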
Polarized 3D: High-Quality Depth Sensing with Polarization Cues
Coarse depth maps can be enhanced by using the shape information from polarization cues. We propose a framework to combine surface normals from polarization (hereafter polarization normals) with an aligned depth map. Polarization normals have not been used for depth enhancement before because they suffer from physics-based artifacts such as azimuthal ambiguity, refractive distortion, and fronto-parallel signal degradation. Our framework overcomes these key challenges, allowing the benefits of polarization to be used to enhance depth maps. Our results demonstrate improvement with respect to state-of-the-art 3D reconstruction techniques.
Funding: Charles Stark Draper Laboratory (Doctoral Fellowship); Singapore. Ministry of Education (Academic Research Foundation MOE2013-T2-1-159); Singapore. National Research Foundation (Singapore University of Technology and Design
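The azimuthal ambiguity mentioned above can be shown in a few lines (illustrative only; the function names and the nearest-normal tie-break are assumptions, not the paper's method): the polarization angle determines the normal's azimuth only up to a pi flip, so each pixel has two candidate normals, and an aligned coarse depth map can break the tie by picking the candidate closest to the coarse surface normal.

```python
import numpy as np

def candidate_normals(phi, theta):
    """Two unit normals consistent with polarization cues.

    phi   -- measured polarization angle (radians); the surface azimuth is
             only determined up to a pi flip (the azimuthal ambiguity)
    theta -- zenith angle, e.g. estimated from the degree of polarization
    """
    cands = []
    for azimuth in (phi, phi + np.pi):
        cands.append([np.sin(theta) * np.cos(azimuth),
                      np.sin(theta) * np.sin(azimuth),
                      np.cos(theta)])
    return np.asarray(cands)

def disambiguate(phi, theta, coarse_normal):
    """Break the pi ambiguity with the normal implied by the coarse depth
    map: keep the candidate with the larger cosine similarity."""
    cands = candidate_normals(phi, theta)
    return cands[np.argmax(cands @ coarse_normal)]
```

Refractive distortion and fronto-parallel degradation corrupt the zenith estimate rather than the azimuth, so they need separate treatment; that is where the paper's full framework goes beyond this toy tie-break.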
MIME: Minority Inclusion for Majority Group Enhancement of AI Performance
Several papers have rightly included minority groups in artificial
intelligence (AI) training data to improve test inference for minority groups
and/or society-at-large. A society-at-large consists of both minority and
majority stakeholders. A common misconception is that minority inclusion does
not increase performance for majority groups alone. In this paper, we make the
surprising finding that including minority samples can reduce test error for
the majority group. In other words, minority group inclusion leads to majority
group enhancements (MIME) in performance. A theoretical existence proof of the
MIME effect is presented and found to be consistent with experimental results
on six different datasets. Project webpage:
https://visual.ee.ucla.edu/mime.htm
Resolving Multi-path Interference in Time-of-Flight Imaging via Modulation Frequency Diversity and Sparse Regularization
Time-of-flight (ToF) cameras calculate depth maps by reconstructing phase
shifts of amplitude-modulated signals. For broad illumination or transparent
objects, reflections from multiple scene points can illuminate a given pixel,
giving rise to an erroneous depth map. We report here a sparsity regularized
solution that separates K interfering components using multiple modulation
frequency measurements. The method maps ToF imaging to the general framework of
spectral estimation theory and has applications in improving depth profiles and
exploiting multiple scattering.
Comment: 11 pages, 4 figures; appeared with minor changes in Optics Letters.
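The mapping to spectral estimation can be illustrated with a toy model (a sketch under simplifying assumptions, not the paper's algorithm; the greedy OMP solver and the delay grid are choices made here): K returns with amplitudes a_k and delays tau_k yield complex phasor measurements y_m = sum_k a_k * exp(-2j*pi*f_m*tau_k) across modulation frequencies f_m, so recovering the pairs (a_k, tau_k) over a grid of candidate delays is exactly a sparse spectral-estimation problem.

```python
import numpy as np

def tof_measurements(freqs, amps, delays):
    """Complex ToF phasors y_m = sum_k a_k * exp(-2j*pi*f_m*tau_k)."""
    return np.exp(-2j * np.pi * np.outer(freqs, delays)) @ amps

def omp_delays(y, freqs, delay_grid, k):
    """Greedily recover k (amplitude, delay) pairs by orthogonal matching
    pursuit over a dictionary of candidate delays (sparse regularization)."""
    A = np.exp(-2j * np.pi * np.outer(freqs, delay_grid))
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.conj().T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    return coef, delay_grid[support]

# Two interfering returns (direct bounce + multipath) seen at 8 frequencies.
freqs = np.linspace(10e6, 150e6, 8)       # modulation frequencies (Hz)
true_delays = np.array([10e-9, 35e-9])    # 10 ns and 35 ns round-trip delays
true_amps = np.array([1.0, 0.4])
y = tof_measurements(freqs, true_amps, true_delays)
amps, delays = omp_delays(y, freqs, np.arange(0.0, 60e-9, 1e-9), k=2)
```

On this noiseless example both delays and amplitudes are recovered from the 8-frequency measurements, whereas a single-frequency phase measurement would report one corrupted depth.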
Coded time of flight cameras: sparse deconvolution to address multipath interference and recover time profiles
Time of flight cameras produce real-time range maps at a relatively low cost using continuous wave amplitude modulation and demodulation. However, they are geared to measure range (or phase) for a single reflected bounce of light and suffer from systematic errors due to multipath interference.
We re-purpose the conventional time of flight device for a new goal: to recover per-pixel sparse time profiles expressed as a sequence of impulses. With this modification, we show that we can not only address multipath interference but also enable new applications such as recovering depth of near-transparent surfaces, looking through diffusers and creating time-profile movies of sweeping light.
Our key idea is to formulate the forward amplitude-modulated light propagation as a convolution with custom codes, record samples by introducing a simple sequence of electronic time delays, and perform sparse deconvolution to recover sequences of Diracs that correspond to multipath returns. Applications to computer vision include ranging of near-transparent objects and subsurface imaging through diffusers. Our low-cost prototype may lead to new insights regarding forward and inverse problems in light transport.
Funding: United States. Defense Advanced Research Projects Agency (DARPA Young Faculty Award); Alfred P. Sloan Foundation (Fellowship); Massachusetts Institute of Technology. Media Laboratory. Camera Culture Grou
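A toy version of the pipeline described above (illustrative only; the random binary code, the greedy recovery, and the sizes are assumptions, not the paper's design): the pixel records a circular convolution of the emitted code with a sparse time profile, and sparse deconvolution recovers the impulse positions and heights.

```python
import numpy as np

def circulant(code):
    """Matrix whose k-th column is the code cyclically shifted by k."""
    return np.stack([np.roll(code, k) for k in range(len(code))], axis=1)

def sparse_deconvolve(y, code, k):
    """Recover k impulses from y = C @ x by greedy sparse deconvolution."""
    C = circulant(code)
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(C.T @ residual))))
        coef, *_ = np.linalg.lstsq(C[:, support], y, rcond=None)
        residual = y - C[:, support] @ coef
    x = np.zeros(len(code))
    x[support] = coef
    return x

# A broadband binary code: its sharp autocorrelation is what makes the
# deconvolution well posed (a plain sinusoid would not be).
rng = np.random.default_rng(0)
code = rng.choice([-1.0, 1.0], size=64)
profile = np.zeros(64)
profile[5], profile[20] = 1.0, 0.5        # direct bounce + one multipath return
y = circulant(code) @ profile             # what the coded ToF pixel records
recovered = sparse_deconvolve(y, code, k=2)
```

The recovered impulse train is the per-pixel "time profile": its first spike gives the corrected range, and later spikes are the multipath returns a conventional single-phase measurement would smear together.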