Co-projection-plane based 3-D padding for polyhedron projection for 360-degree video
Polyhedron projection for 360-degree video is becoming increasingly popular
because it introduces much less geometric distortion than the
equirectangular projection. However, polyhedron projection exhibits
obvious texture discontinuities in the areas near face boundaries.
Such a texture discontinuity may lead to serious quality degradation when
motion compensation crosses the discontinuous face boundary. To solve this
problem, we first propose placing the corresponding neighboring faces at
suitable positions as extensions of the current face to keep approximate
texture continuity. We then propose a co-projection-plane based 3-D padding
method that projects the reference pixels in the neighboring face onto the
current face to guarantee exact texture continuity. Under the proposed
scheme, the reference pixel is always projected onto the same plane as the
current pixel during motion compensation, which resolves the texture
discontinuity problem. The proposed scheme is implemented in the
reference software of High Efficiency Video Coding. Compared with the existing
method, the proposed algorithm can significantly improve the rate-distortion
performance. The experimental results demonstrate that the texture
discontinuity at the face boundary is handled well by the proposed
algorithm.
Comment: 6 pages, 9 figures
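The geometric idea behind projecting a neighboring face onto the current face's plane can be sketched for the cube case. This is a minimal illustration, not the paper's implementation: it assumes a cube centered at the origin with faces at ±1, takes the current face to be the one on the plane z = 1, and uses the hypothetical function name `co_plane_to_cube`. A padding position (u, v) outside the face still lies on the plane z = 1; the ray from the cube center through that position identifies which neighboring face, and which point on it, supplies the texture.

```python
def co_plane_to_cube(u, v):
    """Map a point (u, v, 1) on the extended plane of the +z face to the cube surface.

    Assumed convention: cube centered at the origin with faces at +/-1.
    Returns (face, (x, y, z)): the face hit by the ray from the center
    through (u, v, 1), and the hit point on that face.
    """
    ax, ay = abs(u), abs(v)
    if ax <= 1.0 and ay <= 1.0:
        # The position is inside the current face, so no padding is needed.
        return "+z", (u, v, 1.0)
    if ax >= ay:
        face = "+x" if u > 0 else "-x"
        m = ax
    else:
        face = "+y" if v > 0 else "-y"
        m = ay
    # Scale the ray so that its dominant component lands on the cube surface.
    return face, (u / m, v / m, 1.0 / m)
```

For example, the padding position (2.0, 0.0) next to the +z face maps to the point (1.0, 0.0, 0.5) on the +x face, so the padded sample is taken from halfway up that neighboring face rather than from a naive copy of its boundary pixels.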
Bridge the Gap Between VQA and Human Behavior on Omnidirectional Video: A Large-Scale Dataset and a Deep Learning Model
Omnidirectional video enables spherical stimuli with a 360 × 180° viewing range. Meanwhile, only the viewport region of omnidirectional
video can be seen by the observer through head movement (HM), and an even
smaller region within the viewport can be clearly perceived through eye
movement (EM). Thus, the subjective quality of omnidirectional video may be
correlated with HM and EM of human behavior. To fill in the gap between
subjective quality and human behavior, this paper proposes a large-scale visual
quality assessment (VQA) dataset of omnidirectional video, called VQA-OV, which
collects 60 reference sequences and 540 impaired sequences. Our VQA-OV dataset
provides not only the subjective quality scores of sequences but also the HM
and EM data of subjects. By mining our dataset, we find that the subjective
quality of omnidirectional video is indeed related to HM and EM. Hence, we
develop a deep learning model, which embeds HM and EM, for objective VQA on
omnidirectional video. Experimental results show that our model significantly
improves the state-of-the-art performance of VQA on omnidirectional video.
Comment: Accepted by ACM MM 201
Dynamic Adaptive Point Cloud Streaming
High-quality point clouds have recently gained interest as an emerging form
of representing immersive 3D graphics. Unfortunately, these 3D media are bulky
and severely bandwidth intensive, which makes them difficult to stream to
resource-limited and mobile devices. This has prompted researchers to propose
efficient and adaptive approaches for streaming high-quality point clouds.
In this paper, we run a pilot study towards dynamic adaptive point cloud
streaming, and extend the concept of dynamic adaptive streaming over HTTP
(DASH) towards DASH-PC, a dynamic adaptive bandwidth-efficient and view-aware
point cloud streaming system. DASH-PC can tackle the huge bandwidth demands of
dense point cloud streaming while at the same time semantically linking to
human visual acuity to maintain high visual quality when needed. In order to
describe the various quality representations, we propose multiple thinning
approaches to spatially sub-sample point clouds in the 3D space, and design a
DASH Media Presentation Description manifest specific for point cloud
streaming. Our initial evaluations show that we can achieve significant
bandwidth and performance improvements on dense point cloud streaming with minor
negative quality impacts compared to the baseline scenario in which no adaptation
is applied.
Comment: 6 pages, 23rd ACM Packet Video (PV'18) Workshop, June 12-15, 2018,
Amsterdam, Netherlands
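The abstract does not say which thinning approaches DASH-PC uses, so the following is only a sketch of one common way to spatially sub-sample a point cloud: voxel-grid thinning, which keeps the first point that falls into each cubic cell. The function name `voxel_thin` and the `voxel_size` parameter are assumptions made for illustration. A larger cell size gives a sparser, lower-bandwidth representation, in the way a DASH manifest might list several quality levels.

```python
import math


def voxel_thin(points, voxel_size):
    """Keep one point per cubic voxel of edge length `voxel_size`.

    `points` is an iterable of (x, y, z) tuples. The first point seen in
    each voxel is kept; input order is preserved among the kept points.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    kept = {}
    for p in points:
        # Integer voxel index of the cell that contains the point.
        key = tuple(math.floor(c / voxel_size) for c in p)
        kept.setdefault(key, p)
    return list(kept.values())


cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.5, 0.0, 0.0), (-0.5, 0.0, 0.0)]
coarse = voxel_thin(cloud, 2.0)  # large cells, fewer points
fine = voxel_thin(cloud, 1.0)    # small cells, more points
```

Keeping the first point per voxel is the simplest choice; averaging the points in each cell is a common alternative that moves kept points off their original positions.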
Analysis of Neural Video Compression Networks for 360-Degree Video Coding
With the increasing efforts of bringing high-quality virtual reality
technologies into the market, efficient 360-degree video compression gains in
importance. As such, the state-of-the-art H.266/VVC video coding standard
integrates dedicated tools for 360-degree video, and considerable efforts have
been put into designing 360-degree projection formats with improved compression
efficiency. For the fast-evolving field of neural video compression networks
(NVCs), the effects of different 360-degree projection formats on the overall
compression performance have not yet been investigated. It is thus unclear
whether a resampling from the conventional equirectangular projection (ERP) to
other projection formats yields similar gains for NVCs as for hybrid video
codecs, and which formats perform best. In this paper, we analyze several
generations of NVCs and an extensive set of 360-degree projection formats with
respect to their compression performance for 360-degree video. Based on our
analysis, we find that projection format resampling yields significant
improvements in compression performance also for NVCs. The adjusted cubemap
projection (ACP) and equatorial cylindrical projection (ECP) perform
best and achieve rate savings of more than 55% compared to ERP based on WS-PSNR
for the most recent NVC. Remarkably, the observed rate savings are higher than
for H.266/VVC, emphasizing the importance of projection format resampling for
NVCs.
Comment: 5 pages, 4 figures, 1 table, accepted for Picture Coding Symposium
2024 (PCS 2024)
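The rate savings above are measured with WS-PSNR, which weights each pixel's squared error by the area it covers on the sphere. For ERP this weight is the cosine of the pixel row's latitude. The sketch below follows that standard definition for a single-channel ERP image; the function name `ws_psnr_erp` and the list-of-rows input format are assumptions for illustration, and the code is not taken from the paper or from any reference software.

```python
import math


def ws_psnr_erp(ref, dist, max_value=255.0):
    """WS-PSNR between two single-channel ERP images given as lists of rows.

    Each row i of an image with `height` rows gets the weight
    cos((i + 0.5 - height / 2) * pi / height), the cosine of its latitude.
    """
    height = len(ref)
    weighted_error = 0.0
    weight_sum = 0.0
    for i in range(height):
        w = math.cos((i + 0.5 - height / 2.0) * math.pi / height)
        for r, d in zip(ref[i], dist[i]):
            weighted_error += w * (r - d) ** 2
            weight_sum += w
    ws_mse = weighted_error / weight_sum
    if ws_mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / ws_mse)
```

Because rows near the poles get small weights, the same pixel error lowers WS-PSNR less at the top or bottom of the ERP image than near the equator, which reflects how strongly ERP oversamples the polar regions.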