Multi-view video coding via virtual view generation
In this paper, a multi-view video coding method based on the generation of virtual picture sequences is proposed. Pictures are synthesized to better exploit the redundancies between neighbouring views in a multi-view sequence. Pictures are synthesized through a 3D warping method to estimate certain views in a multi-view set. Depth maps and the associated colour video sequences are used for view generation and tests. The H.264/AVC-based MVC draft software is used for coding the colour videos and depth maps, as well as the views that are predicted from the virtually generated views. Results for coding these views with the proposed method are compared against the reference H.264/AVC simulcast method under several low-delay coding scenarios. The rate-distortion performance of the proposed method outperforms that of the reference method at all bit-rates.
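As a rough illustration of depth-based view synthesis, the sketch below shifts each pixel horizontally in proportion to its depth value, which is valid when the depth map stores normalized disparity for rectified, parallel cameras. The function name and the scaling constant are illustrative assumptions, not from the paper, which performs a full 3D warp.

```python
import numpy as np

def warp_view(colour, depth, shift_per_level=0.05):
    """Toy 1-D warp: synthesise a neighbouring view by shifting pixels
    horizontally in proportion to their depth level. Pixels processed
    later overwrite earlier ones, a crude stand-in for occlusion
    handling; disoccluded holes are left as zeros."""
    h, w = colour.shape
    out = np.zeros_like(colour)
    for y in range(h):
        for x in range(w):
            nx = x + int(round(depth[y, x] * shift_per_level))
            if 0 <= nx < w:
                out[y, nx] = colour[y, x]
    return out
```

With a constant-depth frame, the whole image simply translates, leaving a hole at the exposed border.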
A temporal subsampling approach for multiview depth map compression
In this letter, a new method is proposed for multiview depth map compression. The idea is to skip parts of certain depth map viewpoints without encoding them, and to predict the skipped parts by exploiting multiview correspondences together with a small number of transmitted flags, with the aim of substantially reducing the bit rate allocated to depth map sequences. Multiview correspondences are exploited for each skipped depth map frame by making use of the depth map frames belonging to neighboring views and captured at the same time instant. A prediction depth map frame is constructed block by block, selecting on a free-viewpoint-quality basis from a couple of candidate predictors generated through the implicit and explicit use of the 3-D scene geometry. Especially at lower bit rates, dropping the higher temporal layers of certain depth map viewpoints and replacing them with the corresponding predictors generated by the proposed multiview-aided approach saves a great amount of bit rate for those viewpoints. At the same time, the perceived quality of the reconstructed stereoscopic videos is maintained, as demonstrated through a set of subjective tests.
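A minimal sketch of the block-wise predictor selection, assuming the encoder chooses per block, via a transmitted flag, whichever candidate predictor (e.g. a depth frame warped from a neighbouring view) best matches the skipped frame. SAD against the original frame stands in here for the paper's free-viewpoint-quality criterion.

```python
import numpy as np

def predict_skipped_depth(candidates, reference, block=8):
    """Assemble a prediction for a skipped depth frame block by block.

    `candidates` are hypothetical depth frames derived from
    neighbouring views; `reference` (the original skipped frame) is
    used only to pick, per block, the candidate with the lowest sum of
    absolute differences, i.e. the index the encoder would signal.
    """
    h, w = reference.shape
    out = np.empty_like(reference)
    for y in range(0, h, block):
        for x in range(0, w, block):
            ref_blk = reference[y:y+block, x:x+block].astype(int)
            errs = [np.abs(c[y:y+block, x:x+block].astype(int) - ref_blk).sum()
                    for c in candidates]
            best = candidates[int(np.argmin(errs))]
            out[y:y+block, x:x+block] = best[y:y+block, x:x+block]
    return out
```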
Low-delay random view access in multi-view coding using a bit-rate adaptive downsampling approach
In this paper, a new multi-view coding (MVC) scheme is proposed and evaluated. The scheme offers improved low-delay random view access capability while achieving compression performance comparable to the reference multi-view coding scheme currently in use. The proposed scheme uses the concept of multiple-resolution view coding, exploiting the trade-off between quantization distortion and downsampling distortion at varying bit-rates, which in turn provides improved coding efficiency. Bipredictive (B) coded views, used in the conventional MVC method, are replaced with predictively coded downscaled views, reducing the view dependency in a multi-view set and hence the random view access delay, while preserving compression performance. Results show that the proposed method reduces the random view access delay in an MVC system significantly, while offering objective and subjective performance similar to the conventional MVC method.
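The delay benefit can be illustrated by counting decoding dependencies. The sketch below walks a hypothetical inter-view dependency graph; the two example graphs in the test stand in for a conventional IBP prediction structure versus an all-P structure predicted from the base view, and are illustrative rather than taken from the paper.

```python
def random_access_cost(view, deps):
    """Count how many other views must be decoded before `view` can be
    accessed, by transitively following the inter-view dependency
    graph `deps` (a dict mapping a view to the views it predicts from).
    """
    seen = set()

    def visit(v):
        if v in seen:
            return
        seen.add(v)
        for d in deps.get(v, ()):
            visit(d)

    visit(view)
    return len(seen) - 1  # exclude the requested view itself
```

For a five-view IBP set, accessing an outer B view costs three extra decodes; with all views predicted directly from the base view, it costs one.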
Bit-rate-adaptive downsampling for the coding of multi-view video with depth information
In this paper, the potential for improving the compression efficiency of multi-view video coding with depth information is explored. The proposed technique applies downsampling prior to encoding for arbitrary views and depth maps, using a bit-rate adaptive downscaling-ratio decision. Colour and depth videos are treated separately due to their different characteristics and their different effects on synthesized free-viewpoint videos. The inter-view references, if present, are downsampled to the same resolution as the input video to be coded. Results for several multi-view plus depth sequences indicate that bit-rate adaptive mixed spatial resolution coding of both views and depth maps achieves bit-rate savings over full-resolution and fixed depth-to-colour ratio multi-view coding when the quality of the synthesized viewpoints is considered. The computational complexity at the encoder is significantly reduced at the same time, since fewer blocks are coded and hence fewer block mode decisions are carried out.
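A toy version of the downscaling-ratio decision, with purely illustrative threshold values: at low target bit-rates quantization distortion dominates, so coding at reduced resolution wins, while at high rates full resolution is preferred. The actual decision in the paper is driven by the rate-distortion trade-off rather than fixed thresholds.

```python
def choose_downscale_ratio(target_kbps, thresholds=(1000, 3000)):
    """Pick a per-dimension downscaling ratio from the target bit-rate.

    The threshold values are hypothetical: below `thresholds[0]` kbps
    we code at half resolution, in the mid range at three-quarter
    resolution, and above `thresholds[1]` kbps at full resolution.
    """
    low, high = thresholds
    if target_kbps < low:
        return 0.5
    if target_kbps < high:
        return 0.75
    return 1.0
```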
Edge and motion-adaptive median filtering for multi-view depth map enhancement
We present a novel multi-view depth map enhancement method deployed as a post-processing step on initially estimated depth maps, which are incoherent in the temporal and inter-view dimensions. The proposed method is based on edge and motion-adaptive median filtering and allows for improved quality of virtual view synthesis. To enforce spatial, temporal and inter-view coherence in the multi-view depth maps, the median filtering is applied to 4-dimensional windows consisting of the spatially neighbouring depth map values taken at different viewpoints and time instants. These windows have locally adaptive shapes in the presence of edges or motion, preserving sharpness and realistic rendering. We show that our enhancement method reduces the coding bit-rate required to represent the depth maps and also improves the quality of views synthesized at an arbitrary virtual viewpoint. At the same time, the method carries a low additional computational complexity.
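A simplified spatial-temporal version of the adaptive window might look as follows; the paper's windows are 4-D, additionally spanning neighbouring views, and its edge/motion detection is more elaborate than the precomputed masks assumed here.

```python
import numpy as np

def adaptive_median(depth_stack, edge_mask, motion_mask, t, radius=1):
    """Edge- and motion-adaptive median over a (T, H, W) depth stack.

    At pixels flagged as edge or motion the window collapses to the
    current frame only, preserving sharp transitions; elsewhere the
    median also spans time, enforcing temporal coherence.
    """
    T, H, W = depth_stack.shape
    out = np.empty((H, W), depth_stack.dtype)
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            if edge_mask[y, x] or motion_mask[y, x]:
                win = depth_stack[t, y0:y1, x0:x1]   # spatial window only
            else:
                win = depth_stack[:, y0:y1, x0:x1]   # include temporal axis
            out[y, x] = np.median(win)
    return out
```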
Utilisation of edge adaptive upsampling in compression of depth map videos for enhanced free-viewpoint rendering
In this paper we propose a novel video-object edge-adaptive upsampling scheme for application in video-plus-depth and Multi-View plus Depth (MVD) video coding chains with reduced resolution. The proposed scheme improves the rate-distortion performance of reduced-resolution depth map coders by taking into account the rendering distortion induced in free-viewpoint videos. The inherent loss of fine detail due to downsampling, particularly at video object boundaries, causes significant visual artefacts in rendered free-viewpoint images. The proposed edge-adaptive upsampling filter allows the conservation and better reconstruction of such critical object boundaries. Furthermore, the proposed scheme does not require the edge information to be communicated to the decoder, as the edge information used in the adaptive upsampling is derived from the reconstructed colour video. Test results show that a gain of as much as 1.2 dB in free-viewpoint video quality can be achieved with the proposed method compared to the scheme that uses the linear MPEG re-sampling filter. The proposed approach is suitable for video-plus-depth as well as MVD applications, in which it is critical to satisfy bandwidth constraints while maintaining high free-viewpoint image quality.
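A minimal sketch of colour-guided upsampling in this spirit: for each high-resolution pixel, rather than interpolating across a depth discontinuity, pick the nearby low-resolution depth sample whose co-located colour best matches the target pixel. The function, the 2x2 candidate neighbourhood and the greyscale guide are illustrative simplifications of the paper's filter.

```python
import numpy as np

def edge_adaptive_upsample(depth_lr, colour_hr, scale=2):
    """Upsample a low-res depth map guided by the reconstructed
    high-res colour frame (greyscale here for simplicity), so that
    depth edges snap to colour edges instead of being smeared."""
    H, W = colour_hr.shape
    h, w = depth_lr.shape
    out = np.empty((H, W), depth_lr.dtype)
    for y in range(H):
        for x in range(W):
            cy, cx = y // scale, x // scale
            best_diff, best_d = None, 0
            for dy in (0, 1):              # 2x2 nearest low-res samples
                for dx in (0, 1):
                    sy, sx = min(cy + dy, h - 1), min(cx + dx, w - 1)
                    # colour at the high-res position co-located with
                    # this low-res depth sample
                    gy, gx = min(sy * scale, H - 1), min(sx * scale, W - 1)
                    diff = abs(int(colour_hr[gy, gx]) - int(colour_hr[y, x]))
                    if best_diff is None or diff < best_diff:
                        best_diff, best_d = diff, depth_lr[sy, sx]
            out[y, x] = best_d
    return out
```

When the colour and depth edges are aligned, the upsampled depth map reproduces a sharp boundary at full resolution instead of a blurred ramp.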
Edge-adaptive upsampling of depth map videos for enhanced free viewpoint video quality
Quality enhancement of free-viewpoint videos is addressed for 3D video systems that use the colour texture video plus depth map representation format. More specifically, a novel and efficient shape-adaptive filter is presented for upsampling depth map videos that are of lower resolution than their colour texture counterparts. Depth map videos may be measured or estimated at lower resolution, and depth map reconstruction likewise takes place at low resolution when reduced-resolution compression techniques are used. The proposed design is based on the observation that significant transitions in depth intensity across depth map frames influence the overall quality of the generated free-viewpoint videos. Hence, sharpness and accuracy in the free-viewpoint videos rendered via depth-map-based 3D geometry, especially across object borders, are targeted. Accordingly, significant enhancement of rendered free-viewpoint video quality is obtained when the proposed method is applied on top of MPEG spatial scalability filters.
Display-dependent preprocessing of depth maps based on just-noticeable depth difference modeling
This paper addresses the sensitivity of human vision to spatial depth variations in a 3-D video scene seen on a stereoscopic display, based on an experimentally derived just noticeable depth difference (JNDD) model. The main target is to exploit the depth perception sensitivity of humans to suppress unnecessary spatial depth details, hence reducing the transmission overhead allocated to depth maps. Based on the derived JNDD model, depth map sequences are preprocessed to suppress depth details that are not perceivable by viewers and to minimize the rendering artefacts caused by optical noise, which is triggered by inaccuracies in the depth estimation process. Theoretical and experimental evidence is provided to show that the proposed depth-adaptive preprocessing filter does not alter the 3-D visual quality or the view synthesis quality for free-viewpoint video applications. Experimental results suggest that the bit rate for depth map coding can be reduced by up to 78% for depth maps captured with depth-range cameras and up to 24% for depth maps estimated with computer vision algorithms, without affecting the 3-D visual quality or the arbitrary view synthesis quality.
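The preprocessing idea can be sketched with a flat threshold; note that the paper derives a display-dependent JNDD model rather than the single hypothetical constant used here.

```python
import numpy as np

def jndd_smooth(depth, jndd=4):
    """Suppress depth details below a just-noticeable depth difference.

    `jndd` is a hypothetical flat threshold. Scanning each row left to
    right, a pixel is snapped to its left neighbour's value when the
    difference is imperceptible, flattening sub-threshold variation
    that would otherwise cost bits to encode.
    """
    out = depth.astype(np.int32).copy()
    for y in range(out.shape[0]):
        for x in range(1, out.shape[1]):
            if abs(out[y, x] - out[y, x - 1]) < jndd:
                out[y, x] = out[y, x - 1]
    return out.astype(depth.dtype)
```

Sub-threshold ramps collapse into flat runs while perceivable depth steps survive intact.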
A scalable multi-view audiovisual entertainment framework with content-aware distribution
Delivery of 3D immersive entertainment to the home remains a highly challenging problem, due to the large amount of data involved and the need to support a wide variety of displays. Supporting such displays may require different numbers of views, delivered over time-varying networks. This calls for a delivery scheme featuring scalable compression, to adapt to varying network conditions, and error resiliency, to overcome disturbing losses in 3D perception. Audio and video attention models can be used to design an optimal content-aware compression and transmission scheme by prioritizing the most visually important areas of the video. This paper gives an overview of a content-aware, scalable multi-view audiovisual entertainment delivery framework, and results are presented to evaluate the error-robustness improvements achievable with such a system.
Utilisation of motion similarity in colour-plus-depth 3D video for improved error resiliency
Robust 3D stereoscopic video transmission over error-prone networks is a challenging task, as the perceived 3D video quality must be sustained in the presence of channel losses. The colour-plus-depth format, on the other hand, has been popular for representing stereoscopic video due to its flexibility, its low encoding cost compared to left-right stereoscopic video, and its backwards compatibility. Traditionally, the similarities between the colour and depth map videos, including similarities in motion, image gradients and segments, are not exploited during 3D video coding; the two components are encoded separately. In this work, we propose to exploit the similarity in the motion characteristics of the colour and depth map videos by computing a single set of motion vectors and duplicating it for the sake of error resiliency. Since previous research has shown that stereoscopic video quality is primarily affected by the colour texture quality, the motion vectors are computed for the colour video component and the corresponding vectors are reused to encode the depth maps. Because the colour motion vectors are protected by duplication, the results show that both the colour video quality and the overall stereoscopic video quality are maintained in error-prone conditions, at the expense of a slight loss in depth map coding performance. Furthermore, the total encoding time is reduced, since motion vectors are not calculated for the depth maps.
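The motion-vector sharing idea can be sketched with a toy full-search block matcher: vectors are estimated once on the colour component and then reused to motion-compensate the depth map. Block size, search range and the frame contents in the usage example are illustrative, not the paper's encoder settings.

```python
import numpy as np

def motion_search(cur, ref, block=4, rng=2):
    """Full-search motion estimation: for each block of `cur`, find the
    in-bounds displacement into `ref` with the lowest SAD."""
    h, w = cur.shape
    mvs = {}
    for y in range(0, h, block):
        for x in range(0, w, block):
            best, best_mv = None, (0, 0)
            for dy in range(-rng, rng + 1):
                for dx in range(-rng, rng + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    sad = np.abs(cur[y:y+block, x:x+block].astype(int)
                                 - ref[yy:yy+block, xx:xx+block].astype(int)).sum()
                    if best is None or sad < best:
                        best, best_mv = sad, (dy, dx)
            mvs[(y, x)] = best_mv
    return mvs

def compensate(ref, mvs, block=4):
    """Predict a frame from `ref` using shared motion vectors; the same
    vectors computed on colour are applied to the depth map."""
    h, w = ref.shape
    out = np.empty_like(ref)
    for (y, x), (dy, dx) in mvs.items():
        out[y:y+block, x:x+block] = ref[y+dy:y+dy+block, x+dx:x+dx+block]
    return out
```

When colour and depth move together, the colour-derived vectors predict the depth frame correctly, so no separate depth motion search is needed.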