930 research outputs found

Steered mixture-of-experts for light field images and video : representation and coding

Author: Lambert Peter
Sikora Thomas
Van Wallendael Glenn
Verhack Ruben
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

Research in light field (LF) processing has increased substantially over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
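
The idea of a set of kernels defining a continuous approximation can be illustrated with a toy 1-D example. This is only a sketch under assumptions the abstract does not state: each kernel is taken to be a Gaussian window with a constant value, and the tuple layout `(center, bandwidth, value)` and the name `smoe_reconstruct` are hypothetical. It is not the authors' implementation.

```python
import math

def smoe_reconstruct(x, kernels):
    """Evaluate a toy 1-D kernel mixture at an arbitrary position x.

    Each kernel is a (center, bandwidth, value) tuple: a Gaussian window
    (assumed here) paired with a constant expert value (also assumed).
    """
    weights = [math.exp(-0.5 * ((x - c) / s) ** 2) for c, s, _ in kernels]
    total = sum(weights)
    if total == 0.0:
        # All windows underflowed: fall back to the nearest kernel's value.
        return min(kernels, key=lambda k: abs(x - k[0]))[2]
    # Normalized window weights blend the expert values smoothly.
    return sum(w * k[2] for w, k in zip(weights, kernels)) / total

# Two hypothetical kernels: a dark region around 0 and a bright one around 4.
kernels = [(0.0, 1.0, 10.0), (4.0, 1.0, 50.0)]

# Any position can be evaluated on its own, including non-integer ones.
midpoint_value = smoe_reconstruct(2.0, kernels)
half_sample_value = smoe_reconstruct(0.5, kernels)
```

Because each position is computed independently from the kernel parameters alone, the sketch mirrors, in miniature, the random access, pixel-parallel reconstruction, and interpolation properties the abstract lists.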

Ghent University Academic Bibliography

Progressive modeling of steered mixture-of-experts for light field video approximation

Author: Courteaux Martijn
Lambert Peter
Sikora Thomas
Van Wallendael Glenn
Verhack Ruben
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Steered Mixture-of-Experts (SMoE) is a novel framework for the approximation, coding, and description of image modalities. The future goal is to arrive at a representation for Six Degrees-of-Freedom (6DoF) image data. The goal of this paper is to introduce SMoE for 4D light field videos by including the temporal dimension. However, these videos contain vast amounts of samples due to the large number of views per frame. Previous work on static light field images mitigated the problem by applying a hard subdivision to the modeling problem. However, such a hard subdivision introduces visually disturbing block artifacts on moving objects in dynamic image data. We propose a novel modeling method that does not result in block artifacts, minimizes the computational complexity, and allows for a varying spread of kernels in the spatio-temporal domain. Experiments validate that we can progressively model light field videos with increasing objective quality up to 0.97 SSIM.

Crossref

Ghent University Academic Bibliography

Archivsystem Ask23

Steered Mixture-of-Experts for image and light field representation, processing and coding : a universal approach for immersive experiences of camera-captured scenes

Author: Verhack Ruben
Publication venue: Universiteit Gent. Faculteit Ingenieurswetenschappen en Architectuur
Publication date: 01/01/2020

Ghent University Academic Bibliography

Hard real-time, pixel-parallel rendering of light field videos using steered mixture-of-experts

Author: Avramelos Vasileios
Lambert Peter
Saenen Ignace
Van Wallendael Glenn
Verhack Ruben
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Steered Mixture-of-Experts (SMoE) is a novel framework for the approximation, coding, and description of image modalities such as light field images and video. The future goal is to arrive at a representation for Six Degrees-of-Freedom (6DoF) image data. Previous research has shown the feasibility of real-time pixel-parallel rendering of static light field images. Each pixel is independently reconstructed by kernels that lie in its vicinity. The number of kernels involved forms the bottleneck on the achievable framerate. The goal of this paper is twofold. Firstly, we introduce pixel-level rendering of light field video, as previous work only rendered static content. Secondly, we investigate rendering using a predefined number of most significant kernels. As such, we can meet hard real-time constraints by trading off reconstruction quality.
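
Rendering with a predefined number of most significant kernels can be sketched as follows. The abstract does not define "most significant", so this toy 1-D example assumes it means the largest Gaussian window weight at the pixel; the function name `render_pixel`, the parameter `k`, and the `(center, bandwidth, value)` layout are all hypothetical.

```python
import math

def render_pixel(x, kernels, k=None):
    """Reconstruct one pixel from its k highest-weighted kernels.

    kernels: list of (center, bandwidth, value) tuples (assumed layout).
    k: fixed kernel budget per pixel; None means use every kernel.
    """
    weighted = [(math.exp(-0.5 * ((x - c) / s) ** 2), v) for c, s, v in kernels]
    # Rank kernels by their weight at this pixel, strongest first.
    weighted.sort(key=lambda wv: wv[0], reverse=True)
    if k is not None:
        weighted = weighted[:k]  # fixed work per pixel bounds render time
    total = sum(w for w, _ in weighted)
    if total == 0.0:
        return weighted[0][1]  # degenerate case: all weights underflowed
    return sum(w * v for w, v in weighted) / total

kernels = [(0.0, 1.0, 10.0), (1.0, 1.0, 20.0), (5.0, 1.0, 100.0)]
full_quality = render_pixel(0.5, kernels)       # all three kernels
budgeted = render_pixel(0.5, kernels, k=2)      # drops the distant kernel
```

With a fixed `k`, the per-pixel cost no longer depends on how many kernels the model contains, at the price of a small reconstruction error wherever discarded kernels still carried weight.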

Ghent University Academic Bibliography

Steered mixture-of-experts for light field video coding

Author: Avramelos Vasileios
Lambert Peter
Saenen Ignace
Sikora Thomas
Van Wallendael Glenn
Verhack Ruben
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/01/2018

Ghent University Academic Bibliography

Steered mixture-of-experts for light field coding, depth estimation, and processing

Author: Jongebloed R.
Lambert Peter
Lange L.
Sikora T.
Van Wallendael Glenn
Verhack Ruben
Publication venue
Publication date: 01/01/2017

Ghent University Academic Bibliography

Hierarchical learning of sparse image representations using steered mixture-of-experts

Author: Jongebloed R.
Lange L.
Sikora T.
Verhack Ruben
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Although previous research showed highly efficient compression results for low bit-rates using Steered Mixture-of-Experts (SMoE), higher rates still pose a challenge because the non-convex optimization problem becomes more difficult as the number of components increases. Therefore, a novel estimation method based on Hidden Markov Random Fields is introduced that takes the spatial dependencies of neighboring pixels into account, combined with a tree-structured splitting strategy. Experimental evaluations for images show that our approach outperforms state-of-the-art techniques using only one robust parameter set. For video and light field modeling, even more gain can be expected.

Crossref

Ghent University Academic Bibliography

Steered mixture-of-experts for light field coding, depth estimation, and processing

Author: Jongebloed R.
Lambert Peter
Lange L.
Sikora T.
Van Wallendael Glenn
Verhack Ruben
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Crossref

Ghent University Academic Bibliography

Random access prediction structures for light field video coding with MV-HEVC

Author: Avramelos Vasileios
De Praeter Johan
Lambert Peter
Van Wallendael Glenn
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Computational imaging and light field technology promise to deliver the required six-degrees-of-freedom for natural scenes in virtual reality. Existing extensions of standardized video coding formats, such as multi-view coding and multi-view plus depth, are the most conventional light field video coding solutions at the moment. The latest multi-view coding format, which is a direct extension of the high efficiency video coding (HEVC) standard, is called multi-view HEVC (or MV-HEVC). MV-HEVC treats each light field view as a separate video sequence, and uses syntax elements similar to standard HEVC for exploiting redundancies between neighboring views. To achieve this, inter-view and temporal prediction schemes are deployed with the aim of finding the best trade-off between coding performance and reconstruction quality. The number of possible prediction structures is unlimited and many have been proposed in the literature. Although some of them are efficient in terms of compression ratio, they complicate random access due to the dependencies on previously decoded pixels or frames. Random access is an important feature in video delivery, and a crucial requirement in multi-view video coding. In this work, we propose and compare different prediction structures for coding light field video using MV-HEVC with a focus on both compression efficiency and random accessibility. Experiments on three different short-baseline light field video sequences show the trade-off between bit-rate and distortion, as well as the average number of views/frames that must be decoded to display any random frame at any time instance. The findings of this work indicate the most appropriate prediction structure depending on the available bandwidth and the required degree of random access.
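
The random-access metric mentioned above, the average number of frames that must be decoded to display an arbitrary frame, can be sketched by walking a prediction structure's reference graph. The two structures below (`chain` and `star`) are hypothetical toy examples, not the structures evaluated in the paper, and the function names are assumptions.

```python
def decode_cost(target, refs):
    """Count the frames that must be decoded to display `target`.

    refs maps each frame id to the list of frames it predicts from.
    The cost is the size of the target's transitive reference set,
    including the target itself.
    """
    needed, stack = set(), [target]
    while stack:
        frame = stack.pop()
        if frame not in needed:
            needed.add(frame)
            stack.extend(refs.get(frame, ()))
    return len(needed)

def average_decode_cost(refs):
    """Average decode cost over every frame in the structure."""
    return sum(decode_cost(f, refs) for f in refs) / len(refs)

# Hypothetical structures over four views of one time instance.
chain = {0: [], 1: [0], 2: [1], 3: [2]}   # each view predicts from its neighbor
star = {0: [], 1: [0], 2: [0], 3: [0]}    # every view predicts from view 0

chain_cost = average_decode_cost(chain)
star_cost = average_decode_cost(star)
```

In this toy comparison the chained structure needs more decodes on average than the star structure, which illustrates why a structure that compresses well can still be a poor choice when random access matters.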

Ghent University Academic Bibliography

Adapting Computer Vision Models To Limitations On Input Dimensionality And Model Complexity

Author: Abbas Alhabib
Publication venue: UCL (University College London)
Publication date: 28/02/2020

When considering instances of distributed systems where visual sensors communicate with remote predictive models, data traffic is limited by the capacity of communication channels, and hardware limits the processing of collected data prior to transmission. We study novel methods of adapting visual inference to limitations on complexity and data availability at test time, wherever the aforementioned limitations exist. The contributions detailed in this thesis consider both task-specific and task-generic approaches to reducing the data requirement for inference, and we evaluate our proposed methods on a wide range of computer vision tasks. This thesis makes four distinct contributions: (i) We investigate multi-class action classification via two-stream convolutional neural networks that directly ingest information extracted from compressed video bitstreams. We show that selective access to macroblock motion vector information provides a good low-dimensional approximation of the underlying optical flow in visual sequences. (ii) We devise a bitstream cropping method by which AVC/H.264 and H.265 bitstreams are reduced to the minimum amount of necessary elements for optical flow extraction, while maintaining compliance with codec standards. We additionally study the effect of codec rate-quality control on the sparsity and noise incurred in optical flow derived from the resulting bitstreams, and do so for multiple coding standards. (iii) We demonstrate degrees of variability in the amount of data required for action classification, and leverage this to reduce the dimensionality of input volumes by inferring the required temporal extent for accurate classification prior to processing via learnable machines. (iv) We extend the Mixtures-of-Experts (MoE) paradigm to adapt the data cost of inference for any set of constituent experts.
We postulate that the minimum acceptable data cost of inference varies for different input space partitions, and consider mixtures where each expert is designed to meet a different set of constraints on input dimensionality. To take advantage of the flexibility of such mixtures in processing different input representations and modalities, we train biased gating functions such that experts requiring less information to make their inferences are favoured over others. Finally, we note that our proposed data utility optimization solutions include a learnable component which considers specified priorities on the amount of information to be used prior to inference, and can be realized for any combination of tasks, modalities, and constraints on available data.
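
One way to picture a gating function biased toward experts that need less data is to subtract a cost penalty from each expert's gating score before the softmax. The abstract does not say how the bias is implemented, so the additive penalty, the name `biased_gate`, and the `penalty` parameter are assumptions made purely for illustration.

```python
import math

def biased_gate(scores, costs, penalty):
    """Softmax gating with a data-cost penalty (assumed form of the bias).

    scores: raw gating scores, one per expert.
    costs: data cost of each expert (e.g. relative input size).
    penalty: how strongly cheaper experts are favoured; 0 disables the bias.
    """
    adjusted = [s - penalty * c for s, c in zip(scores, costs)]
    peak = max(adjusted)  # subtract the max for numerical stability
    exps = [math.exp(a - peak) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]

# Two hypothetical experts with equal scores but different data costs.
unbiased = biased_gate([1.0, 1.0], costs=[1.0, 4.0], penalty=0.0)
biased = biased_gate([1.0, 1.0], costs=[1.0, 4.0], penalty=0.5)
```

With the penalty at zero both experts receive equal weight; raising it shifts the gate toward the expert with the lower data cost, which is the qualitative behaviour the abstract describes.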

UCL Discovery