Light Field Super-Resolution Via Graph-Based Regularization
Light field cameras capture the 3D information in a scene with a single
exposure. This special feature makes light field cameras very appealing for a
variety of applications: from post-capture refocus, to depth estimation and
image-based rendering. However, light field cameras suffer by design from
strong limitations in their spatial resolution, which should therefore be
augmented by computational methods. On the one hand, off-the-shelf single-frame
and multi-frame super-resolution algorithms are not ideal for light field data,
as they do not consider its particular structure. On the other hand, the few
super-resolution algorithms explicitly tailored for light field data exhibit
significant limitations, such as the need to estimate an explicit disparity map
at each view. In this work we propose a new light field super-resolution
algorithm meant to address these limitations. We adopt a multi-frame-like
super-resolution approach, where the complementary information in the different
light field views is used to augment the spatial resolution of the whole light
field. We show that coupling the multi-frame approach with a graph regularizer
that enforces the light field structure via nonlocal self-similarities makes it
possible to avoid the costly and challenging disparity estimation step for all
the views. Extensive experiments show that the new algorithm compares favorably to
the other state-of-the-art methods for light field super-resolution, both in
terms of PSNR and visual quality.
Comment: This new version includes more material. In particular, we added: a
new section on the computational complexity of the proposed algorithm,
experimental comparisons with a CNN-based super-resolution algorithm, and new
experiments on a third dataset.
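As a rough illustration of the graph regularizer mentioned in the abstract above, the sketch below evaluates a standard graph smoothness penalty: the weighted sum of squared differences between connected pixels. The abstract gives no formula, so the function name `graph_smoothness`, the edge-list format, and the toy weights are assumptions made for illustration, not the authors' implementation.

```python
def graph_smoothness(values, edges):
    """Weighted sum of squared differences over graph edges.

    values: sequence of pixel intensities, indexed by node id.
    edges:  iterable of (i, j, weight) tuples (hypothetical format);
            a large weight means nodes i and j are expected to look alike,
            e.g. similar patches found in neighbouring light field views.
    """
    return sum(w * (values[i] - values[j]) ** 2 for i, j, w in edges)


# Toy example: three pixels, two similarity edges (weights are made up).
pixels = [0.0, 2.0, 3.0]
similarity_edges = [(0, 1, 1.0), (1, 2, 0.5)]
penalty = graph_smoothness(pixels, similarity_edges)
```

In a regularized reconstruction, a penalty of this kind would be added to a data-fidelity term, so that pixels linked by strong edges are pushed toward similar values without computing an explicit disparity map.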
Volumetric Isosurface Rendering with Deep Learning-Based Super-Resolution
Rendering an accurate image of an isosurface in a volumetric field typically
requires large numbers of data samples. Reducing the number of required samples
lies at the core of research in volume rendering. With the advent of deep
learning networks, a number of architectures have been proposed recently to
infer missing samples in multi-dimensional fields, for applications such as
image super-resolution and scan completion. In this paper, we investigate the
use of such architectures for learning the upscaling of a low-resolution
sampling of an isosurface to a higher resolution, with high fidelity
reconstruction of spatial detail and shading. We introduce a fully
convolutional neural network that learns a latent representation generating a
smooth, edge-aware normal field and ambient occlusions from a low-resolution
normal and depth field. By adding a frame-to-frame motion loss into the
learning stage, the upscaling can consider temporal variations and achieves
improved frame-to-frame coherence. We demonstrate the quality of the network
for isosurfaces which were never seen during training, and discuss remote and
in-situ visualization as well as focus+context visualization as potential
applications.
Depth of field guided visualisation on light field displays
Light field displays are capable of realistic visualization of arbitrary 3D content. However, due to the finite number of light rays reproduced by the display, its bandwidth is limited in terms of angular and spatial resolution. Consequently, 3D content that falls outside of that bandwidth will cause aliasing during visualization. Therefore, a light field to be visualized must be properly preprocessed. In this thesis, we propose three methods that properly filter the parts of the input light field that would cause aliasing. The first method is based on a 2D FIR circular filter that is applied over the 4D light field. The second method utilizes the structured nature of the epipolar plane images representing the light field. The third method adopts real-time multi-layer depth-of-field rendering using tiled splatting. We also establish a connection between the lens parameters in the proposed depth-of-field rendering and the display’s bandwidth in order to determine the optimal amount of blur. Because the light field is prepared for light field displays, a stage that simultaneously renders adjacent views is added to the proposed real-time rendering pipeline. The rendering performance of the proposed methods is demonstrated on Holografika’s Holovizio 722RC projection-based light field display.
Graph-Based Light Field Super-Resolution
Light field cameras can capture the 3D information in a scene with a single exposure. This special feature makes light field cameras very appealing for a variety of applications: from post-capture refocus, to depth estimation and image-based rendering. However, light field cameras exhibit a very limited spatial resolution, which should therefore be increased by computational methods. Off-the-shelf single-frame and multi-frame super-resolution algorithms are not ideal for light field data, as they ignore its particular structure. A few super-resolution algorithms explicitly devised for light field data exist, but they exhibit significant limitations, such as the need to carry out an explicit disparity estimation step for one or several light field views. In this work we present a new light field super-resolution algorithm meant to address these limitations. We adopt a multi-frame-like super-resolution approach, where the information in the different light field views is used to augment the spatial resolution of the whole light field. In particular, we show that coupling the multi-frame paradigm with a graph regularizer that enforces the light field structure makes it possible to avoid the costly and challenging disparity estimation step. Our experiments show that the proposed method compares favorably to state-of-the-art light field super-resolution algorithms, both in terms of PSNR and visual quality.
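The abstract above reports results in terms of PSNR. A minimal sketch of the usual definition follows, assuming images are passed as flat sequences of intensities with a peak value of 255 (8-bit data); the function name `psnr` and the input format are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities (assumed format).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error between the reference and the reconstruction.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference. PSNR does not always track perceived quality, which is presumably why the abstract reports visual quality alongside it.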
Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction
The ultimate goal of many image-based modeling systems is to render
photo-realistic novel views of a scene without visible artifacts. Existing
evaluation metrics and benchmarks focus mainly on the geometric accuracy of the
reconstructed model, which is, however, a poor predictor of visual accuracy.
Furthermore, using only geometric accuracy by itself does not allow evaluating
systems that either lack a geometric scene representation or utilize coarse
proxy geometry. Examples include light field or image-based rendering systems.
We propose a unified evaluation approach based on novel view prediction error
that is able to analyze the visual quality of any method that can render novel
views from input images. One of the key advantages of this approach is that it
does not require ground truth geometry. This dramatically simplifies the
creation of test datasets and benchmarks. It also allows us to evaluate the
quality of an unknown scene during the acquisition and reconstruction process,
which is useful for acquisition planning. We evaluate our approach on a range
of methods including standard geometry-plus-texture pipelines as well as
image-based rendering techniques, compare it to existing geometry-based
benchmarks, and demonstrate its utility for a range of use cases.
Comment: 10 pages, 12 figures, paper was submitted to ACM Transactions on
Graphics for review.
Neural View-Interpolation for Sparse Light Field Video
We suggest representing light field (LF) videos as "one-off" neural networks (NNs), i.e., a learned mapping from view-plus-time coordinates to high-resolution color values, trained on sparse views. Initially, this sounds like a bad idea for three main reasons. First, an NN LF will likely have lower quality than a same-sized pixel basis representation. Second, only little training data, e.g., 9 exemplars per frame, is available for sparse LF videos. Third, there is no generalization across LFs, but across view and time instead. Consequently, a network needs to be trained for each LF video. Surprisingly, these problems can turn into substantial advantages: unlike the linear pixel basis, an NN has to come up with a compact, non-linear, i.e., more intelligent, explanation of color, conditioned on the sparse view and time coordinates. As observed for many NNs, however, this representation is now interpolatable: if the image output for sparse view coordinates is plausible, it is plausible for all intermediate, continuous coordinates as well. Our specific network architecture involves a differentiable occlusion-aware warping step, which leads to a compact set of trainable parameters and consequently fast learning and fast execution.