243,306 research outputs found
Learning to Synthesize a 4D RGBD Light Field from a Single Image
We present a machine learning algorithm that takes as input a 2D RGB image
and synthesizes a 4D RGBD light field (color and depth of the scene in each ray
direction). For training, we introduce the largest public light field dataset,
consisting of over 3300 plenoptic camera light fields of scenes containing
flowers and plants. Our synthesis pipeline consists of a convolutional neural
network (CNN) that estimates scene geometry, a stage that renders a Lambertian
light field using that geometry, and a second CNN that predicts occluded rays
and non-Lambertian effects. Our algorithm builds on recent view synthesis
methods, but is unique in predicting RGBD for each light field ray and
improving unsupervised single image depth estimation by enforcing consistency
of ray depths that should intersect the same scene point. Please see our
supplementary video at https://youtu.be/yLCvWoQLnmsComment: International Conference on Computer Vision (ICCV) 201
Neural View-Interpolation for Sparse Light Field Video
We suggest representing light field (LF) videos as "one-off" neural networks (NN), i.e., a learned mapping from view-plus-time coordinates to high-resolution color values, trained on sparse views. Initially, this sounds like a bad idea for three main reasons: First, a NN LF will likely have less quality than a same-sized pixel basis representation. Second, only few training data, e.g., 9 exemplars per frame are available for sparse LF videos. Third, there is no generalization across LFs, but across view and time instead. Consequently, a network needs to be trained for each LF video. Surprisingly, these problems can turn into substantial advantages: Other than the linear pixel basis, a NN has to come up with a compact, non-linear i.e., more intelligent, explanation of color, conditioned on the sparse view and time coordinates. As observed for many NN however, this representation now is interpolatable: if the image output for sparse view coordinates is plausible, it is for all intermediate, continuous coordinates as well. Our specific network architecture involves a differentiable occlusion-aware warping step, which leads to a compact set of trainable parameters and consequently fast learning and fast execution
Transport-Based Neural Style Transfer for Smoke Simulations
Artistically controlling fluids has always been a challenging task.
Optimization techniques rely on approximating simulation states towards target
velocity or density field configurations, which are often handcrafted by
artists to indirectly control smoke dynamics. Patch synthesis techniques
transfer image textures or simulation features to a target flow field. However,
these are either limited to adding structural patterns or augmenting coarse
flows with turbulent structures, and hence cannot capture the full spectrum of
different styles and semantically complex structures. In this paper, we propose
the first Transport-based Neural Style Transfer (TNST) algorithm for volumetric
smoke data. Our method is able to transfer features from natural images to
smoke simulations, enabling general content-aware manipulations ranging from
simple patterns to intricate motifs. The proposed algorithm is physically
inspired, since it computes the density transport from a source input smoke to
a desired target configuration. Our transport-based approach allows direct
control over the divergence of the stylization velocity field by optimizing
incompressible and irrotational potentials that transport smoke towards
stylization. Temporal consistency is ensured by transporting and aligning
subsequent stylized velocities, and 3D reconstructions are computed by
seamlessly merging stylizations from different camera viewpoints.Comment: ACM Transaction on Graphics (SIGGRAPH ASIA 2019), additional
materials: http://www.byungsoo.me/project/neural-flow-styl
Depth Assisted Full Resolution Network for Single Image-based View Synthesis
Researches in novel viewpoint synthesis majorly focus on interpolation from
multi-view input images. In this paper, we focus on a more challenging and
ill-posed problem that is to synthesize novel viewpoints from one single input
image. To achieve this goal, we propose a novel deep learning-based technique.
We design a full resolution network that extracts local image features with the
same resolution of the input, which contributes to derive high resolution and
prevent blurry artifacts in the final synthesized images. We also involve a
pre-trained depth estimation network into our system, and thus 3D information
is able to be utilized to infer the flow field between the input and the target
image. Since the depth network is trained by depth order information between
arbitrary pairs of points in the scene, global image features are also involved
into our system. Finally, a synthesis layer is used to not only warp the
observed pixels to the desired positions but also hallucinate the missing
pixels with recorded pixels. Experiments show that our technique performs well
on images of various scenes, and outperforms the state-of-the-art techniques
- …