Steered mixture-of-experts for light field images and video : representation and coding
Research in light field (LF) processing has increased markedly over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, such 2-D regular grids are less suited for high-dimensional data such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
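The gist of the SMoE representation can be illustrated in 2-D: steered Gaussian kernels softly gate a set of experts, and each pixel intensity is the gating-weighted sum of the expert predictions. The sketch below is a minimal, hypothetical illustration of that idea (function and parameter names are my own, and constant experts are assumed for simplicity; the paper's full model is richer):

```python
import numpy as np

def smoe_reconstruct(coords, centers, inv_covs, experts):
    """Soft-gated mixture sketch: each steered kernel votes for a
    constant expert value, and votes are blended per query point.

    coords:   (N, 2) query pixel positions
    centers:  (K, 2) kernel centers
    inv_covs: (K, 2, 2) inverse covariance ("steering") matrices
    experts:  (K,) constant intensity predicted by each expert
    """
    diff = coords[:, None, :] - centers[None, :, :]          # (N, K, 2)
    # Mahalanobis distance of each point under each steered kernel
    d2 = np.einsum('nki,kij,nkj->nk', diff, inv_covs, diff)  # (N, K)
    w = np.exp(-0.5 * d2)
    w /= w.sum(axis=1, keepdims=True)                        # normalized gating
    return w @ experts                                       # (N,) intensities
```

Because the kernels are continuous functions of position, the same model can be evaluated at any coordinate, which is what gives the representation its intrinsic view interpolation and super-resolution properties.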
A Survey of Signal Processing Problems and Tools in Holographic Three-Dimensional Television
Diffraction and holography are fertile areas for application of signal theory and processing. Recent work on 3DTV displays has posed particularly challenging signal processing problems. Various procedures to compute Rayleigh-Sommerfeld, Fresnel and Fraunhofer diffraction exist in the literature. Diffraction between parallel planes and tilted planes can be efficiently computed. Discretization and quantization of diffraction fields yield interesting theoretical and practical results, and allow efficient schemes compared to commonly used Nyquist sampling. The literature on computer-generated holography provides a good resource for holographic 3DTV related issues. Fast algorithms to compute Fourier, Walsh-Hadamard, fractional Fourier, linear canonical, Fresnel, and wavelet transforms, as well as optimization-based techniques such as best orthogonal basis, matching pursuit, basis pursuit etc., are especially relevant signal processing techniques for wave propagation, diffraction, holography, and related problems. Atomic decompositions, multiresolution techniques, Gabor functions, and Wigner distributions are among the signal processing techniques which have been or may be applied to problems in optics. Research aimed at solving such problems at the intersection of wave optics and signal processing promises not only to facilitate the development of 3DTV systems, but also to contribute to fundamental advances in optics and signal processing theory. © 2007 IEEE
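The efficiently computable diffraction between parallel planes mentioned in the abstract is commonly implemented with the angular spectrum method: FFT the field, multiply by a propagation transfer function, and inverse FFT. The following is a minimal sketch under standard assumptions (monochromatic scalar field, square sampling grid, evanescent components discarded); the function name and signature are illustrative, not from the paper:

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a sampled complex field between parallel planes.

    field:      (N, N) complex amplitude at z = 0, sample pitch dx [m]
    wavelength: optical wavelength [m]
    z:          propagation distance [m]
    Returns the complex field at distance z (angular spectrum method).
    """
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                    # spatial frequencies
    FX, FY = np.meshgrid(fx, fx, indexing='ij')
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z) * (arg > 0)             # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

Since the transfer function has unit magnitude on propagating components, the method conserves energy, and propagation over z = 0 reduces to the identity.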
Vision technology/algorithms for space robotics applications
Automation and robotics for space applications have been advocated for increased productivity, improved reliability, greater flexibility, and higher safety, as well as for automating time-consuming tasks, increasing the productivity and performance of crew-accomplished tasks, and performing tasks beyond the capability of the crew. This paper provides a review of efforts currently in progress in the area of robotic vision. Both systems and algorithms are discussed. The evolution of future vision/sensing is projected to include the fusion of multisensors ranging from microwave to optical, with multimode capability covering position, attitude, recognition, and motion parameters. The key features of the overall system design will be small size and weight, fast signal processing, robust algorithms, and accurate parameter determination. These aspects of vision/sensing are also discussed.
Capture, processing, and display of real-world 3D objects using digital holography
"Digital holography for 3D and 4D real-world objects'
capture, processing, and display" (acronym "Real 3D") is a
research project funded under the Information and Communication Technologies theme of the European Commission's Seventh Framework Programme, and brings together nine participants from academia and industry (see www.digitalholography.eu).This three-year project marks the beginning a long-term effort to facilitate the entry of a new technology (digital holography) into the three-dimensional capture and display markets. Its progress
at the end of year 2 is summarised
Tensor4D : Efficient Neural 4D Decomposition for High-fidelity Dynamic Reconstruction and Rendering
We present Tensor4D, an efficient yet effective approach to dynamic scene
modeling. The key to our solution is an efficient 4D tensor decomposition
method so that the dynamic scene can be directly represented as a 4D
spatio-temporal tensor. To tackle the accompanying memory issue, we decompose
the 4D tensor hierarchically by projecting it first into three time-aware
volumes and then nine compact feature planes. In this way, spatial information
over time can be simultaneously captured in a compact and memory-efficient
manner. When applying Tensor4D for dynamic scene reconstruction and rendering,
we further factorize the 4D fields to different scales in the sense that
structural motions and dynamic detailed changes can be learned from coarse to
fine. The effectiveness of our method is validated on both synthetic and
real-world scenes. Extensive experiments show that our method is able to
achieve high-quality dynamic reconstruction and rendering from sparse-view
camera rigs or even a monocular camera. The code and dataset will be released
at https://liuyebin.com/tensor4d/tensor4d.html.
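The core memory argument of plane-based 4D decompositions can be made concrete: instead of storing an n^4 spatio-temporal tensor, one stores a few rank-R pairs of 2-D feature planes over complementary axis pairs and rebuilds values on demand. The sketch below is a simplified plane-pair (HexPlane-style) reconstruction, not Tensor4D's exact hierarchical projection through time-aware volumes; all names are illustrative:

```python
import numpy as np

def plane_pair_reconstruct(A_xy, B_zt, A_xz, B_yt, A_xt, B_yz):
    """Rebuild a 4D tensor T[x, y, z, t] from three complementary
    pairs of rank-R 2-D feature planes, each of shape (R, n, n).

    Each pair contributes sum_r A[r, ., .] * B[r, ., .] over its axes,
    so storage is 6 * R * n^2 values instead of n^4.
    """
    t  = np.einsum('rij,rkl->ijkl', A_xy, B_zt)  # (x,y) x (z,t)
    t += np.einsum('rik,rjl->ijkl', A_xz, B_yt)  # (x,z) x (y,t)
    t += np.einsum('ril,rjk->ijkl', A_xt, B_yz)  # (x,t) x (y,z)
    return t
```

For n = 128 and R = 16, this is roughly 1.6M stored values versus 268M for the dense 4D tensor, which is the kind of compaction that makes dynamic-scene fields trainable in memory.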
A Novel Light Field Coding Scheme Based on Deep Belief Network and Weighted Binary Images for Additive Layered Displays
Light field displays provide an immersive experience by offering
binocular depth cues and motion parallax. Glasses-free tensor light field
display is becoming a prominent area of research in auto-stereoscopic display
technology. Stacking light attenuating layers is one of the approaches to
implement a light field display with a good depth of field, wide viewing angles
and high resolution. This paper presents a compact and efficient representation
of light field data based on scalable compression of the binary represented
image layers suitable for additive layered display using a Deep Belief Network
(DBN). The proposed scheme learns and optimizes the additive layer patterns
using a convolutional neural network (CNN). Weighted binary images represent
the optimized patterns, reducing the file size and introducing scalable
encoding. The DBN further compresses the weighted binary patterns into a latent
space representation followed by encoding the latent data using an h.254 codec.
The proposed scheme is compared with benchmark codecs such as h.264 and h.265
and achieved competitive performance on light field data
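In an additive layered display, each view of the light field is formed by summing the layer patterns along the rays for that view direction, so each layer is sampled with a parallax shift proportional to its depth. A minimal sketch of that forward model, assuming integer-pixel horizontal shifts and hypothetical names of my own choosing:

```python
import numpy as np

def render_view(layers, depths, view_shift):
    """Render one view of an additive layered display.

    layers:     (L, H, W) per-layer patterns in the stack
    depths:     (L,) depth of each layer, in pixels of parallax
                per unit of view direction
    view_shift: horizontal view direction; each layer is shifted by
                depth * view_shift, then all layers are summed.
    """
    h, w = layers.shape[1:]
    out = np.zeros((h, w))
    for layer, depth in zip(layers, depths):
        shift = int(round(depth * view_shift))
        out += np.roll(layer, shift, axis=1)  # integer-pixel parallax
    return out
```

Optimizing the layer patterns so that this sum matches every target view simultaneously is the inverse problem the CNN in the abstract above is trained to solve.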
Special issue on advances in three-dimensional television and video: Guest editorial
Motion-enhanced Holography
Holographic displays, which enable pixel-level depth control and aberration correction, are considered the key technology for next-generation virtual reality (VR) and augmented reality (AR) applications. However, traditional holographic systems suffer from a limited spatial bandwidth product (SBP), which makes it impossible for them to reproduce realistic 3D displays. Time-multiplexed holography creates different speckle patterns over time and then averages them to achieve a speckle-free 3D display. However, this approach requires spatial light modulators (SLMs) with ultra-fast refresh rates, and current algorithms cannot update holograms at such speeds. To overcome this challenge, we propose a novel architecture, motion-enhanced holography, that achieves realistic 3D holographic displays without artifacts by continuously shifting a special hologram. We introduce an iterative algorithm to synthesize motion-enhanced holograms and demonstrate that our method achieves a 10 dB improvement in the peak signal-to-noise ratio (PSNR) of 3D focal stacks in numerical simulations compared to traditional holographic systems. Furthermore, we validate this idea in optical experiments, utilizing a high-speed and high-precision programmable three-axis displacement stage to display full-color and high-quality 3D focal stacks.
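The benefit of averaging speckle patterns over time is quantifiable: for fully developed speckle, intensity is exponentially distributed with contrast (std/mean) equal to 1, and averaging N independent patterns reduces the contrast to roughly 1/sqrt(N). A small Monte Carlo sketch of that statistic (the function and its defaults are illustrative, not from the paper):

```python
import numpy as np

def speckle_contrast(n_frames, n_pixels=100_000, seed=0):
    """Average n_frames independent fully-developed speckle intensity
    patterns and return the speckle contrast (std / mean) of the result.

    Fully developed speckle intensity is modeled as exponentially
    distributed with unit mean; averaging N independent frames
    drops the contrast from 1 toward 1 / sqrt(N).
    """
    rng = np.random.default_rng(seed)
    frames = rng.exponential(1.0, size=(n_frames, n_pixels))
    avg = frames.mean(axis=0)
    return avg.std() / avg.mean()
```

This is why time-multiplexed approaches demand high SLM refresh rates: halving the perceived speckle contrast again requires four times as many frames within one perceptual integration window.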