    Steered mixture-of-experts for light field images and video : representation and coding

    Research in light field (LF) processing has increased considerably over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
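
The kernel-based model described in the abstract above can be illustrated with a minimal sketch. It assumes the common mixture-of-experts form in which each kernel is a Gaussian gate paired with a linear expert, and the reconstruction is the gate-weighted sum of the experts. The function name `smoe_reconstruct`, the parameter layout, and the toy values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def smoe_reconstruct(x, centers, covs, priors, slopes, offsets):
    """Evaluate a Gaussian-gated mixture of linear experts (assumed form).

    x: (N, D) query coordinates; centers: (K, D); covs: (K, D, D);
    priors: (K,); slopes: (K, D); offsets: (K,).
    Returns the reconstructed values (N,) and the gate weights (N, K).
    """
    diff = x[:, None, :] - centers[None, :, :]              # (N, K, D)
    inv = np.linalg.inv(covs)                               # (K, D, D)
    maha = np.einsum("nkd,kde,nke->nk", diff, inv, diff)    # squared Mahalanobis distance
    norm = np.sqrt(np.linalg.det(2.0 * np.pi * covs))       # Gaussian normalizers, (K,)
    gate = priors * np.exp(-0.5 * maha) / norm              # unnormalized gates
    gate /= gate.sum(axis=1, keepdims=True)                 # gates sum to 1 per query point
    experts = np.einsum("nkd,kd->nk", diff, slopes) + offsets  # linear expert per kernel
    return (gate * experts).sum(axis=1), gate

# Toy 2-D example (hypothetical values): two kernels with constant experts.
centers = np.array([[0.25, 0.5], [0.75, 0.5]])
covs = np.stack([0.01 * np.eye(2), 0.01 * np.eye(2)])
priors = np.array([0.5, 0.5])
slopes = np.zeros((2, 2))
offsets = np.array([0.2, 0.8])

queries = np.array([[0.25, 0.5], [0.5, 0.5], [0.75, 0.5]])
values, gates = smoe_reconstruct(queries, centers, covs, priors, slopes, offsets)
```

Because every query point is evaluated independently from the kernel parameters, a sketch of this kind is consistent with the random-access and pixel-parallel reconstruction properties the abstract lists; evaluating at coordinates between captured views corresponds to view interpolation.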

    A Survey of Signal Processing Problems and Tools in Holographic Three-Dimensional Television

    Diffraction and holography are fertile areas for application of signal theory and processing. Recent work on 3DTV displays has posed particularly challenging signal processing problems. Various procedures to compute Rayleigh-Sommerfeld, Fresnel and Fraunhofer diffraction exist in the literature. Diffraction between parallel planes and tilted planes can be efficiently computed. Discretization and quantization of diffraction fields yield interesting theoretical and practical results, and allow efficient schemes compared to commonly used Nyquist sampling. The literature on computer-generated holography provides a good resource for holographic 3DTV related issues. Fast algorithms to compute Fourier, Walsh-Hadamard, fractional Fourier, linear canonical, Fresnel, and wavelet transforms, as well as optimization-based techniques such as best orthogonal basis, matching pursuit, basis pursuit, etc., are especially relevant signal processing techniques for wave propagation, diffraction, holography, and related problems. Atomic decompositions, multiresolution techniques, Gabor functions, and Wigner distributions are among the signal processing techniques which have been or may be applied to problems in optics. Research aimed at solving such problems at the intersection of wave optics and signal processing promises not only to facilitate the development of 3DTV systems, but also to contribute to fundamental advances in optics and signal processing theory. © 2007 IEEE
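
As one concrete instance of efficiently computing diffraction between parallel planes, the sketch below applies the standard Fresnel transfer function in the Fourier domain using FFTs. It is a generic textbook formulation, not a method taken from the survey; the function name `fresnel_propagate` and the wavelength, sample pitch, and distance values are illustrative assumptions.

```python
import numpy as np

def fresnel_propagate(field, wavelength, dx, z):
    """Propagate a sampled complex field by distance z between parallel planes.

    Uses the Fresnel (paraxial) transfer function
    H(fx, fy) = exp(j k z) * exp(-j pi lambda z (fx^2 + fy^2)).
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)          # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=dx)          # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    k = 2.0 * np.pi / wavelength
    H = np.exp(1j * k * z) * np.exp(-1j * np.pi * wavelength * z * (FX**2 + FY**2))
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Hypothetical setup: a square aperture, red laser light, 10 micrometre sampling.
aperture = np.zeros((64, 64), dtype=complex)
aperture[24:40, 24:40] = 1.0
propagated = fresnel_propagate(aperture, wavelength=633e-9, dx=10e-6, z=0.05)
recovered = fresnel_propagate(propagated, wavelength=633e-9, dx=10e-6, z=-0.05)
```

The transfer function has unit magnitude, so the total energy of the field is preserved and propagating by `-z` undoes the forward step. The FFT makes the propagation periodic, so a practical implementation would need zero padding and attention to the sampling conditions the survey discusses.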

    Vision technology/algorithms for space robotics applications

    Automation and robotics for space applications have been proposed to increase productivity, reliability, flexibility, and safety; to automate time-consuming tasks; to improve the productivity and performance of crew-accomplished tasks; and to perform tasks beyond the capability of the crew. This paper provides a review of efforts currently in progress in the area of robotic vision. Both systems and algorithms are discussed. The evolution of future vision/sensing is projected to include the fusion of multisensors ranging from microwave to optical with multimode capability to include position, attitude, recognition, and motion parameters. The key features of the overall system design will be small size and weight, fast signal processing, robust algorithms, and accurate parameter determination. These aspects of vision/sensing are also discussed.

    Capture, processing, and display of real-world 3D objects using digital holography

    "Digital holography for 3D and 4D real-world objects' capture, processing, and display" (acronym "Real 3D") is a research project funded under the Information and Communication Technologies theme of the European Commission's Seventh Framework Programme, and brings together nine participants from academia and industry (see www.digitalholography.eu). This three-year project marks the beginning of a long-term effort to facilitate the entry of a new technology (digital holography) into the three-dimensional capture and display markets. Its progress at the end of year 2 is summarised.

    Tensor4D : Efficient Neural 4D Decomposition for High-fidelity Dynamic Reconstruction and Rendering

    We present Tensor4D, an efficient yet effective approach to dynamic scene modeling. The key to our solution is an efficient 4D tensor decomposition method so that the dynamic scene can be directly represented as a 4D spatio-temporal tensor. To tackle the accompanying memory issue, we decompose the 4D tensor hierarchically by projecting it first into three time-aware volumes and then nine compact feature planes. In this way, spatial information over time can be simultaneously captured in a compact and memory-efficient manner. When applying Tensor4D to dynamic scene reconstruction and rendering, we further factorize the 4D fields to different scales in the sense that structural motions and dynamic detailed changes can be learned from coarse to fine. The effectiveness of our method is validated on both synthetic and real-world scenes. Extensive experiments show that our method is able to achieve high-quality dynamic reconstruction and rendering from sparse-view camera rigs or even a monocular camera. The code and dataset will be released at https://liuyebin.com/tensor4d/tensor4d.html
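
The hierarchical decomposition in the abstract above (one 4D tensor, then three time-aware volumes, then nine feature planes) can be sketched as an indexing scheme. The axis pairings, the nearest-neighbour lookup, the random plane contents, and the names `VOLUMES`, `planes_of`, and `query` are all assumptions made for illustration; the published method may use different pairings, and it learns its features rather than drawing them at random.

```python
import numpy as np

rng = np.random.default_rng(0)
RES, CH = 16, 4  # hypothetical plane resolution and channels per plane

# Assumed layout: each time-aware volume spans two spatial axes plus time.
VOLUMES = [("x", "y", "t"), ("x", "z", "t"), ("y", "z", "t")]

def planes_of(volume):
    """Split one 3-D volume into its three 2-D axis-pair planes."""
    a, b, c = volume
    return [(a, b), (a, c), (b, c)]

# Nine feature planes, keyed by (volume index, axis pair).
planes = {
    (v, pair): rng.standard_normal((RES, RES, CH))
    for v, vol in enumerate(VOLUMES)
    for pair in planes_of(vol)
}

def query(point):
    """Look up a 4-D point (dict of axis -> coordinate in [0, 1]).

    Returns the concatenation of one feature vector per plane,
    using nearest-neighbour sampling for simplicity.
    """
    idx = {ax: min(int(c * RES), RES - 1) for ax, c in point.items()}
    feats = [planes[key][idx[key[1][0]], idx[key[1][1]]] for key in planes]
    return np.concatenate(feats)

early = query({"x": 0.3, "y": 0.6, "z": 0.1, "t": 0.1})
late = query({"x": 0.3, "y": 0.6, "z": 0.1, "t": 0.9})
```

Storage in this sketch grows with `9 * RES**2 * CH` values instead of `RES**4 * CH` for a dense 4D grid, which is the memory saving the abstract refers to. Only the planes that include the time axis change between `early` and `late`.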

    A Novel Light Field Coding Scheme Based on Deep Belief Network and Weighted Binary Images for Additive Layered Displays

    Light field display caters to the viewer's immersive experience by providing binocular depth sensation and motion parallax. Glasses-free tensor light field display is becoming a prominent area of research in auto-stereoscopic display technology. Stacking light attenuating layers is one of the approaches to implement a light field display with a good depth of field, wide viewing angles and high resolution. This paper presents a compact and efficient representation of light field data based on scalable compression of the binary-represented image layers suitable for additive layered display using a Deep Belief Network (DBN). The proposed scheme learns and optimizes the additive layer patterns using a convolutional neural network (CNN). Weighted binary images represent the optimized patterns, reducing the file size and introducing scalable encoding. The DBN further compresses the weighted binary patterns into a latent space representation followed by encoding the latent data using an h.254 codec. The proposed scheme is compared with benchmark codecs such as h.264 and h.265 and achieves competitive performance on light field data.
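
To illustrate how a weighted binary representation can give scalable encoding, the sketch below splits an 8-bit layer pattern into binary images with power-of-two weights and rebuilds it from a subset of them. The power-of-two weighting and the helper names `to_weighted_binary` and `from_weighted_binary` are assumptions for illustration; the abstract does not state how the paper chooses its weights, and the CNN and DBN stages are not modeled here.

```python
import numpy as np

def to_weighted_binary(img, bits=8):
    """Split a uint8 image into (weight, binary image) pairs, largest weight first."""
    return [(1 << b, (img >> b) & 1) for b in range(bits - 1, -1, -1)]

def from_weighted_binary(pairs, keep=None):
    """Rebuild the image from the first `keep` pairs (all pairs when keep is None)."""
    used = pairs if keep is None else pairs[:keep]
    return sum(w * p.astype(np.int64) for w, p in used)

# Hypothetical 8x8 layer pattern with random 8-bit values.
rng = np.random.default_rng(1)
layer = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)

pairs = to_weighted_binary(layer)
full = from_weighted_binary(pairs)            # all eight binary images
coarse = from_weighted_binary(pairs, keep=4)  # only the four largest weights
max_error = int(np.abs(layer.astype(np.int64) - coarse).max())
```

Using all binary images reproduces the pattern exactly, while dropping the four smallest weights bounds the per-pixel error by 15. A decoder can therefore stop after any prefix of the binary images and still obtain a usable approximation.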