10,756 research outputs found

    Learning to Synthesize a 4D RGBD Light Field from a Single Image

    Full text link
    We present a machine learning algorithm that takes as input a 2D RGB image and synthesizes a 4D RGBD light field (color and depth of the scene in each ray direction). For training, we introduce the largest public light field dataset, consisting of over 3300 plenoptic camera light fields of scenes containing flowers and plants. Our synthesis pipeline consists of a convolutional neural network (CNN) that estimates scene geometry, a stage that renders a Lambertian light field using that geometry, and a second CNN that predicts occluded rays and non-Lambertian effects. Our algorithm builds on recent view synthesis methods, but is unique in predicting RGBD for each light field ray and improving unsupervised single image depth estimation by enforcing consistency of ray depths that should intersect the same scene point. Please see our supplementary video at https://youtu.be/yLCvWoQLnmsComment: International Conference on Computer Vision (ICCV) 201

    Neural View-Interpolation for Sparse Light Field Video

    No full text
    We suggest representing light field (LF) videos as "one-off" neural networks (NN), i.e., a learned mapping from view-plus-time coordinates to high-resolution color values, trained on sparse views. Initially, this sounds like a bad idea for three main reasons: First, a NN LF will likely have less quality than a same-sized pixel basis representation. Second, only few training data, e.g., 9 exemplars per frame are available for sparse LF videos. Third, there is no generalization across LFs, but across view and time instead. Consequently, a network needs to be trained for each LF video. Surprisingly, these problems can turn into substantial advantages: Other than the linear pixel basis, a NN has to come up with a compact, non-linear i.e., more intelligent, explanation of color, conditioned on the sparse view and time coordinates. As observed for many NN however, this representation now is interpolatable: if the image output for sparse view coordinates is plausible, it is for all intermediate, continuous coordinates as well. Our specific network architecture involves a differentiable occlusion-aware warping step, which leads to a compact set of trainable parameters and consequently fast learning and fast execution

    Pycortex: an interactive surface visualizer for fMRI.

    Get PDF
    Surface visualizations of fMRI provide a comprehensive view of cortical activity. However, surface visualizations are difficult to generate and most common visualization techniques rely on unnecessary interpolation which limits the fidelity of the resulting maps. Furthermore, it is difficult to understand the relationship between flattened cortical surfaces and the underlying 3D anatomy using tools available currently. To address these problems we have developed pycortex, a Python toolbox for interactive surface mapping and visualization. Pycortex exploits the power of modern graphics cards to sample volumetric data on a per-pixel basis, allowing dense and accurate mapping of the voxel grid across the surface. Anatomical and functional information can be projected onto the cortical surface. The surface can be inflated and flattened interactively, aiding interpretation of the correspondence between the anatomical surface and the flattened cortical sheet. The output of pycortex can be viewed using WebGL, a technology compatible with modern web browsers. This allows complex fMRI surface maps to be distributed broadly online without requiring installation of complex software

    Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control

    Get PDF
    We propose Neural Actor (NA), a new method for high-quality synthesis of humans from arbitrary viewpoints and under arbitrary controllable poses. Our method is built upon recent neural scene representation and rendering works which learn representations of geometry and appearance from only 2D images. While existing works demonstrated compelling rendering of static scenes and playback of dynamic scenes, photo-realistic reconstruction and rendering of humans with neural implicit methods, in particular under user-controlled novel poses, is still difficult. To address this problem, we utilize a coarse body model as the proxy to unwarp the surrounding 3D space into a canonical pose. A neural radiance field learns pose-dependent geometric deformations and pose- and view-dependent appearance effects in the canonical space from multi-view video input. To synthesize novel views of high fidelity dynamic geometry and appearance, we leverage 2D texture maps defined on the body model as latent variables for predicting residual deformations and the dynamic appearance. Experiments demonstrate that our method achieves better quality than the state-of-the-arts on playback as well as novel pose synthesis, and can even generalize well to new poses that starkly differ from the training poses. Furthermore, our method also supports body shape control of the synthesized results

    A low-cost, practical acquisition and rendering pipeline for real-time free-viewpoint video communication

    Get PDF
    We present a semiautomatic real-time pipeline for capturing and rendering free-viewpoint video using passive stereo matching. The pipeline is simple and achieves agreeable quality in real time on a system of commodity web cameras and a single desktop computer. We suggest an automatic algorithm to compute a constrained search space for an efficient and robust hierarchical stereo reconstruction algorithm. Due to our fast reconstruction times, we can eliminate the need for an expensive global surface reconstruction with a combination of high coverage and aggressive filtering. Finally, we employ a novel color weighting scheme that generates credible new viewpoints without noticeable seams, while keeping the computational complexity low. The simplicity and low cost of the system make it an accessible and more practical alternative for many applications compared to previous methods

    3D-TV Production from Conventional Cameras for Sports Broadcast

    Get PDF
    3DTV production of live sports events presents a challenging problem involving conflicting requirements of main- taining broadcast stereo picture quality with practical problems in developing robust systems for cost effective deployment. In this paper we propose an alternative approach to stereo production in sports events using the conventional monocular broadcast cameras for 3D reconstruction of the event and subsequent stereo rendering. This approach has the potential advantage over stereo camera rigs of recovering full scene depth, allowing inter-ocular distance and convergence to be adapted according to the requirements of the target display and enabling stereo coverage from both existing and ‘virtual’ camera positions without additional cameras. A prototype system is presented with results of sports TV production trials for rendering of stereo and free-viewpoint video sequences of soccer and rugby
    • …
    corecore