1,082 research outputs found
Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing
Free-viewpoint video conferencing allows a participant to observe the remote
3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint
image is commonly synthesized using two pairs of transmitted texture and depth
maps from two neighboring captured viewpoints via depth-image-based rendering
(DIBR). To maintain high quality of synthesized images, it is imperative to
contain the adverse effects of network packet losses that may arise during
texture and depth video transmission. Towards this end, we develop an
integrated approach that exploits the representation redundancy inherent in the
multiple streamed videos: a voxel in the 3D scene visible to two captured views
is sampled and coded twice in the two views. In particular, at the receiver we
first develop an error concealment strategy that adaptively blends
corresponding pixels in the two captured views during DIBR, so that pixels from
the more reliable transmitted view are weighted more heavily. We then couple it
with a sender-side optimization of reference picture selection (RPS) during
real-time video coding, so that blocks containing samples of voxels that are
visible in both views are more error-resiliently coded in one view only, given
that adaptive blending will erase errors in the other view. Further, synthesized
view distortion sensitivities to texture versus depth errors are analyzed, so
that relative importance of texture and depth code blocks can be computed for
system-wide RPS optimization. Experimental results show that the proposed
scheme can outperform the use of a traditional feedback channel by up to 0.82
dB on average at 8% packet loss rate, and by as much as 3 dB for particular
frames.
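The receiver-side blending described in this abstract can be illustrated with a minimal sketch. It assumes each view supplies a per-pixel reliability score in [0, 1]; the function name `blend_pixel` and the normalized-weight rule are illustrative assumptions, not the weighting derived in the paper.

```python
def blend_pixel(left, right, rel_left, rel_right):
    """Blend two co-located pixel values from the left and right views.

    rel_left / rel_right are hypothetical reliability scores in [0, 1]
    (for example, an estimate that the pixel was decoded without error).
    """
    total = rel_left + rel_right
    if total == 0:
        # Neither view is trusted: fall back to a plain average.
        return 0.5 * (left + right)
    w_left = rel_left / total  # the more reliable view gets the larger weight
    return w_left * left + (1.0 - w_left) * right


# A pixel whose left-view sample is three times as reliable as the right one.
value = blend_pixel(100, 200, 0.9, 0.3)
```

With equal reliabilities the rule reduces to ordinary averaging; as one score approaches zero, the output approaches the sample from the other view.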
Reducing the complexity of a multiview H.264/AVC and HEVC hybrid architecture
With the advent of 3D displays, an efficient encoder is required to compress the video information needed by them. Moreover, for gradual market acceptance of this new technology, it is advisable to offer backward compatibility with existing devices. Thus, a multiview H.264/Advanced Video Coding (AVC) and High Efficiency Video Coding (HEVC) hybrid architecture was proposed in the standardization process of HEVC. However, it requires long encoding times due to the use of HEVC. With the aim of tackling this problem, this paper presents an algorithm that reduces the complexity of this hybrid architecture by reducing the encoding complexity of the HEVC views. By using Naïve-Bayes classifiers, the proposed technique exploits the information gathered in the encoding of the H.264/AVC view to make decisions on the splitting of coding units in HEVC side views. Given the novelty of the proposal, the only similar work found in the literature is an unoptimized version of the algorithm presented here. Experimental results show that the proposed algorithm can achieve a good tradeoff between coding efficiency and complexity.
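As a rough illustration of the classifier idea, the sketch below trains a small Gaussian Naïve-Bayes model that predicts whether a coding unit should be split. The two features (a partition count and a motion-vector magnitude taken from the H.264/AVC view) and the toy training data are assumptions made for the example; the abstract does not state which features the paper uses.

```python
import math


def fit_gaussian_nb(samples, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for cls in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[cls] = (n / len(samples), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Hypothetical features: (H.264 partition count, motion-vector magnitude).
samples = [(1, 0.2), (1, 0.5), (2, 0.4), (8, 3.0), (6, 2.5), (7, 3.5)]
labels = [False, False, False, True, True, True]  # True = split the coding unit
model = fit_gaussian_nb(samples, labels)
split_decision = predict(model, (7, 3.0))
```

In an encoder, a decision like `split_decision` would let the HEVC side view skip part of the exhaustive rate-distortion search over coding-unit sizes, which is where the complexity saving would come from.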
Optimized Data Representation for Interactive Multiview Navigation
In contrast to traditional media streaming services where a unique media
content is delivered to different users, interactive multiview navigation
applications enable users to choose their own viewpoints and freely navigate in
a 3-D scene. The interactivity brings new challenges in addition to the
classical rate-distortion trade-off, which considers only the compression
performance and viewing quality. On the one hand, interactivity necessitates
sufficient viewpoints for richer navigation; on the other hand, it requires
low bandwidth and delay costs for smooth navigation during view
transitions. In this paper, we formally describe the novel trade-offs posed by
the navigation interactivity and classical rate-distortion criterion. Based on
an original formulation, we look for the optimal design of the data
representation by introducing novel rate and distortion models and practical
solving algorithms. Experiments show that the proposed data representation
method outperforms the baseline solution by providing lower resource
consumption and higher visual quality in all navigation configurations, which
certainly confirms the potential of the proposed data representation in
practical interactive navigation systems.
Depth map compression via 3D region-based representation
In 3D video, view synthesis is used to create new virtual views between
encoded camera views. Errors in the coding of the depth maps introduce
geometry inconsistencies in synthesized views. In this paper, a new 3D plane
representation of the scene is presented which improves the performance of
current standard video codecs in the view synthesis domain. Two image segmentation
algorithms are proposed for generating a color and depth segmentation.
Using both partitions, depth maps are segmented into regions without
sharp discontinuities without having to explicitly signal all depth edges. The
resulting regions are represented using a planar model in the 3D world scene.
This 3D representation allows an efficient encoding while preserving the 3D
characteristics of the scene. The 3D planes open up the possibility to code
multiview images with a unique representation.
Postprint (author's final draft)
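The planar-region idea can be sketched as a least-squares fit of one plane to the depth samples of a region. This is a simplification: it fits z = a*x + b*y + c in image coordinates rather than in the 3D world scene, and the function name `fit_plane` is an assumption made for the example.

```python
def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples."""
    sxx = sxy = sx = syy = sy = sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    n = float(len(points))
    # Normal equations: m * [a, b, c]^T = r
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    r = [sxz, syz, sz]
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate region: samples do not span a plane")
    coeffs = []
    for k in range(3):
        # Cramer's rule: replace column k with the right-hand side.
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = r[i]
        coeffs.append(det3(mk) / d)
    return tuple(coeffs)


# Depth samples of a 3x3 region lying exactly on z = 2x - y + 5.
region = [(x, y, 2 * x - y + 5) for x in range(3) for y in range(3)]
a, b, c = fit_plane(region)
```

Three coefficients per region replace a per-pixel depth value, which is the intuition behind the compact encoding the abstract describes.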
In-Network View Synthesis for Interactive Multiview Video Systems
To enable interactive multiview video systems with a minimum view-switching
delay, multiple camera views are sent to the users, which are used as reference
images to synthesize additional virtual views via depth-image-based rendering.
In practice, bandwidth constraints may however restrict the number of reference
views sent to clients per time unit, which may in turn limit the quality of the
synthesized viewpoints. We argue that the reference view selection should
ideally be performed close to the users, and we study the problem of in-network
reference view synthesis such that the navigation quality is maximized at the
clients. We consider a distributed cloud network architecture where data stored
in a main cloud is delivered to end users with the help of cloudlets, i.e.,
resource-rich proxies close to the users. In order to satisfy last-hop
bandwidth constraints from the cloudlet to the users, a cloudlet re-samples
viewpoints of the 3D scene into a discrete set of views (combination of
received camera views and virtual views synthesized) to be used as reference
for the synthesis of additional virtual views at the client. This in-network
synthesis leads to better viewpoint sampling given a bandwidth constraint
compared to simple selection of camera views, but it may carry a
distortion penalty in the cloudlet-synthesized reference views. We therefore
cast a new reference view selection problem where the best subset of views is
defined as the one minimizing the distortion over a view navigation window
defined by the user under some transmission bandwidth constraints. We show that
the view selection problem is NP-hard, and propose an effective polynomial time
algorithm using dynamic programming to solve the optimization problem.
Simulation results finally confirm the performance gain offered by virtual view
synthesis in the network.
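A much simplified variant of reference view selection can be solved with dynamic programming, as sketched below. The sketch assumes that all views cost the same bandwidth (so the budget is a number of views, at least 2), that the first and last views of the navigation window are always sent, and that distortion grows with the squared distance to the nearest reference. These assumptions and the function names are illustrative; they are not the problem formulation or the algorithm of the paper.

```python
def segment_cost(positions, i, j):
    """Hypothetical distortion of the viewpoints strictly between references
    i and j: squared distance to the nearer of the two references."""
    return sum(min(positions[v] - positions[i], positions[j] - positions[v]) ** 2
               for v in range(i + 1, j))


def select_views(positions, budget):
    """Pick exactly `budget` reference views (2 <= budget <= len(positions)),
    including the first and last, that minimize the total distortion."""
    n = len(positions)
    inf = float("inf")
    # best[k][j]: minimum distortion with k references, the latest at index j.
    best = [[inf] * n for _ in range(budget + 1)]
    back = [[None] * n for _ in range(budget + 1)]
    best[1][0] = 0.0
    for k in range(2, budget + 1):
        for j in range(1, n):
            for i in range(j):
                if best[k - 1][i] == inf:
                    continue
                cost = best[k - 1][i] + segment_cost(positions, i, j)
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = i
    # Walk the back-pointers from the last view to recover the selection.
    chosen, j = [], n - 1
    for k in range(budget, 0, -1):
        chosen.append(j)
        j = back[k][j]
    return best[budget][n - 1], chosen[::-1]


# Seven equally spaced viewpoints, bandwidth for three reference views.
total_distortion, references = select_views([0, 1, 2, 3, 4, 5, 6], 3)
```

Because the cost of a segment depends only on its two bounding references, the optimal selection decomposes over consecutive reference pairs, which is what makes a dynamic program applicable under these assumptions.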
Multi-View Video Packet Scheduling
In multiview applications, multiple cameras acquire the same scene from
different viewpoints and generally produce correlated video streams. This
results in large amounts of highly redundant data. In order to save resources,
it is critical to handle this correlation properly during encoding and
transmission of the multiview data. In this work, we propose a
correlation-aware packet scheduling algorithm for multi-camera networks, where
information from all cameras is transmitted over a bottleneck channel to
clients that reconstruct the multiview images. The scheduling algorithm relies
on a new rate-distortion model that captures the importance of each view in the
scene reconstruction. We propose a problem formulation for the optimization of
the packet scheduling policies, which adapt to variations in the scene content.
Then, we design a low complexity scheduling algorithm based on a trellis search
that selects the subset of candidate packets to be transmitted towards
effective multiview reconstruction at clients. Extensive simulation results
confirm the gain of our scheduling algorithm when inter-source correlation
information is used in the scheduler, compared to scheduling policies with no
information about the correlation or non-adaptive scheduling policies. We
finally show that increasing the optimization horizon in the packet scheduling
algorithm improves the transmission performance, especially in scenarios where
the level of correlation rapidly varies with time.