1,668 research outputs found
Reducing the complexity of a multiview H.264/AVC and HEVC hybrid architecture
With the advent of 3D displays, an efficient encoder is required to compress the video information needed by them. Moreover, for gradual market acceptance of this new technology, it is advisable to offer backward compatibility with existing devices. Thus, a multiview H.264/Advance Video Coding (AVC) and High Efficiency Video Coding (HEVC) hybrid architecture was proposed in the standardization process of HEVC. However, it requires long encoding times due to the use of HEVC. With the aim of tackling this problem, this paper presents an algorithm that reduces the complexity of this hybrid architecture by reducing the encoding complexity of the HEVC views. By using Na < ve-Bayes classifiers, the proposed technique exploits the information gathered in the encoding of the H.264/AVC view to make decisions on the splitting of coding units in HEVC side views. Given the novelty of the proposal, the only similar work found in the literature is an unoptimized version of the algorithm presented here. Experimental results show that the proposed algorithm can achieve a good tradeoff between coding efficiency and complexity
Optimized Data Representation for Interactive Multiview Navigation
In contrary to traditional media streaming services where a unique media
content is delivered to different users, interactive multiview navigation
applications enable users to choose their own viewpoints and freely navigate in
a 3-D scene. The interactivity brings new challenges in addition to the
classical rate-distortion trade-off, which considers only the compression
performance and viewing quality. On the one hand, interactivity necessitates
sufficient viewpoints for richer navigation; on the other hand, it requires to
provide low bandwidth and delay costs for smooth navigation during view
transitions. In this paper, we formally describe the novel trade-offs posed by
the navigation interactivity and classical rate-distortion criterion. Based on
an original formulation, we look for the optimal design of the data
representation by introducing novel rate and distortion models and practical
solving algorithms. Experiments show that the proposed data representation
method outperforms the baseline solution by providing lower resource
consumptions and higher visual quality in all navigation configurations, which
certainly confirms the potential of the proposed data representation in
practical interactive navigation systems
Rate-Distortion Analysis of Multiview Coding in a DIBR Framework
Depth image based rendering techniques for multiview applications have been
recently introduced for efficient view generation at arbitrary camera
positions. Encoding rate control has thus to consider both texture and depth
data. Due to different structures of depth and texture images and their
different roles on the rendered views, distributing the available bit budget
between them however requires a careful analysis. Information loss due to
texture coding affects the value of pixels in synthesized views while errors in
depth information lead to shift in objects or unexpected patterns at their
boundaries. In this paper, we address the problem of efficient bit allocation
between textures and depth data of multiview video sequences. We adopt a
rate-distortion framework based on a simplified model of depth and texture
images. Our model preserves the main features of depth and texture images.
Unlike most recent solutions, our method permits to avoid rendering at encoding
time for distortion estimation so that the encoding complexity is not
augmented. In addition to this, our model is independent of the underlying
inpainting method that is used at decoder. Experiments confirm our theoretical
results and the efficiency of our rate allocation strategy
A multi-camera approach to image-based rendering and 3-D/Multiview display of ancient chinese artifacts
published_or_final_versio
Optimal layered representation for adaptive interactive multiview video streaming
We consider an interactive multiview video streaming (IMVS) system where clients select their preferred viewpoint in a given navigation window. To provide high quality IMVS, many high quality views should be transmitted to the clients. However, this is not always possible due to the limited and heterogeneous capabilities of the clients. In this paper, we propose a novel adaptive IMVS solution based on a layered multiview representation where camera views are organized into layered subsets to match the different clients constraints. We formulate an optimization problem for the joint selection of the views subsets and their encoding rates. Then, we propose an optimal and a reduced computational complexity greedy algorithms, both based on dynamic-programming. Simulation results show the good performance of our novel algorithms compared to a baseline algorithm, proving that an effective IMVS adaptive solution should consider the scene content and the client capabilities and their preferences in navigation
Learning sparse representations of depth
This paper introduces a new method for learning and inferring sparse
representations of depth (disparity) maps. The proposed algorithm relaxes the
usual assumption of the stationary noise model in sparse coding. This enables
learning from data corrupted with spatially varying noise or uncertainty,
typically obtained by laser range scanners or structured light depth cameras.
Sparse representations are learned from the Middlebury database disparity maps
and then exploited in a two-layer graphical model for inferring depth from
stereo, by including a sparsity prior on the learned features. Since they
capture higher-order dependencies in the depth structure, these priors can
complement smoothness priors commonly used in depth inference based on Markov
Random Field (MRF) models. Inference on the proposed graph is achieved using an
alternating iterative optimization technique, where the first layer is solved
using an existing MRF-based stereo matching algorithm, then held fixed as the
second layer is solved using the proposed non-stationary sparse coding
algorithm. This leads to a general method for improving solutions of state of
the art MRF-based depth estimation algorithms. Our experimental results first
show that depth inference using learned representations leads to state of the
art denoising of depth maps obtained from laser range scanners and a time of
flight camera. Furthermore, we show that adding sparse priors improves the
results of two depth estimation methods: the classical graph cut algorithm by
Boykov et al. and the more recent algorithm of Woodford et al.Comment: 12 page
Fast Motion Estimation Algorithms for Block-Based Video Coding Encoders
The objective of my research is reducing the complexity of video coding standards in real-time scalable and multi-view applications
- …