
    Co-projection-plane based 3-D padding for polyhedron projection for 360-degree video

    Polyhedron projection for 360-degree video is becoming increasingly popular because it introduces much less geometry distortion than the equirectangular projection. However, polyhedron projection produces a clearly visible texture discontinuity in the area near each face boundary. This discontinuity may cause serious quality degradation when motion compensation crosses the discontinuous face boundary. To solve this problem, we first propose to place the corresponding neighboring faces at suitable positions as extensions of the current face, which keeps the texture approximately continuous. We then propose a co-projection-plane based 3-D padding method that projects the reference pixels in the neighboring face onto the current face to guarantee exact texture continuity. Under the proposed scheme, the reference pixel is always projected onto the same plane as the current pixel during motion compensation, so the texture discontinuity problem is solved. The proposed scheme is implemented in the reference software of High Efficiency Video Coding. Compared with the existing method, the proposed algorithm significantly improves rate-distortion performance. The experimental results demonstrate that the proposed algorithm handles the texture discontinuity at the face boundary well. Comment: 6 pages, 9 figures
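The idea of projecting a neighboring face onto the plane of the current face can be illustrated with a small geometric sketch. It is not the paper's implementation: it assumes a cube centered at the origin with faces at distance 1, takes the current face to be the plane z = 1 and the neighbor to be the face x = 1, and uses a hypothetical function name. A point (u, v) in the padded region of the current face plane (u >= 1) is mapped to the neighboring face along the ray through the cube center.

```python
def extended_to_neighbor(u, v):
    """Map a point (u, v, 1) on the extended plane z = 1 (u >= 1)
    to coordinates (y, z) on the neighboring cube face x = 1.

    Assumption: unit cube centered at the origin; the ray from the
    center through (u, v, 1) is scaled by 1/u so that x becomes 1.
    """
    if u < 1.0:
        raise ValueError("point lies inside the current face, no padding needed")
    return (v / u, 1.0 / u)


# On the shared edge (u = 1) the mapping is the identity in v and z = 1,
# so the padded texture joins the current face without a jump.
print(extended_to_neighbor(1.0, 0.25))
print(extended_to_neighbor(2.0, 1.0))
```

Sampling the neighboring face at the returned coordinates gives the value to store at (u, v) in the padded reference area, which is why the reference pixel and the current pixel end up on the same plane.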

    Bridge the Gap Between VQA and Human Behavior on Omnidirectional Video: A Large-Scale Dataset and a Deep Learning Model

    Omnidirectional video provides spherical stimuli with a $360 \times 180^\circ$ viewing range. However, the observer can see only the viewport region of omnidirectional video, selected through head movement (HM), and can clearly perceive an even smaller region within the viewport, selected through eye movement (EM). Thus, the subjective quality of omnidirectional video may be correlated with the HM and EM of human behavior. To bridge the gap between subjective quality and human behavior, this paper proposes a large-scale visual quality assessment (VQA) dataset of omnidirectional video, called VQA-OV, which contains 60 reference sequences and 540 impaired sequences. Our VQA-OV dataset provides not only the subjective quality scores of the sequences but also the HM and EM data of the subjects. By mining our dataset, we find that the subjective quality of omnidirectional video is indeed related to HM and EM. Hence, we develop a deep learning model, which embeds HM and EM, for objective VQA on omnidirectional video. Experimental results show that our model significantly improves on the state-of-the-art performance of VQA on omnidirectional video. Comment: Accepted by ACM MM 201

    Dynamic Adaptive Point Cloud Streaming

    High-quality point clouds have recently gained interest as an emerging form of representing immersive 3D graphics. Unfortunately, these 3D media are bulky and severely bandwidth intensive, which makes them difficult to stream to resource-limited and mobile devices. This has prompted researchers to propose efficient and adaptive approaches for streaming high-quality point clouds. In this paper, we run a pilot study towards dynamic adaptive point cloud streaming, and extend the concept of dynamic adaptive streaming over HTTP (DASH) towards DASH-PC, a dynamic adaptive, bandwidth-efficient and view-aware point cloud streaming system. DASH-PC can tackle the huge bandwidth demands of dense point cloud streaming while semantically linking to human visual acuity to maintain high visual quality when needed. To describe the various quality representations, we propose multiple thinning approaches to spatially sub-sample point clouds in 3D space, and design a DASH Media Presentation Description manifest specific to point cloud streaming. Our initial evaluations show that we can achieve significant bandwidth and performance improvements on dense point cloud streaming with minor negative quality impacts compared to the baseline scenario in which no adaptation is applied. Comment: 6 pages, 23rd ACM Packet Video (PV'18) Workshop, June 12--15, 2018, Amsterdam, Netherlands
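The abstract does not say which thinning approaches DASH-PC uses. As a generic illustration of spatial sub-sampling, the sketch below keeps one point per cubic voxel; the function name `voxel_thin` and the `voxel_size` parameter are assumptions made for this example, not part of the paper.

```python
def voxel_thin(points, voxel_size):
    """Spatially sub-sample a point cloud by keeping the first point
    that falls into each cubic voxel of edge length voxel_size.

    Illustrative only: a stand-in for 'thinning', not the DASH-PC method.
    """
    kept = {}
    for p in points:
        # Integer voxel index along each axis (floor division handles negatives).
        key = tuple(int(c // voxel_size) for c in p)
        kept.setdefault(key, p)
    return list(kept.values())


cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.5, 0.0, 0.0), (-0.1, 0.0, 0.0)]
# Larger voxels give a sparser, lower-bitrate representation.
print(len(voxel_thin(cloud, 1.0)), len(voxel_thin(cloud, 4.0)))
```

Running the same cloud through several voxel sizes yields a ladder of density levels, which is the kind of set of quality representations a streaming manifest could advertise.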

    Analysis of Neural Video Compression Networks for 360-Degree Video Coding

    With increasing efforts to bring high-quality virtual reality technologies to market, efficient 360-degree video compression is gaining importance. Accordingly, the state-of-the-art H.266/VVC video coding standard integrates dedicated tools for 360-degree video, and considerable effort has gone into designing 360-degree projection formats with improved compression efficiency. For the fast-evolving field of neural video compression networks (NVCs), the effects of different 360-degree projection formats on overall compression performance have not yet been investigated. It is thus unclear whether resampling from the conventional equirectangular projection (ERP) to other projection formats yields gains for NVCs similar to those for hybrid video codecs, and which formats perform best. In this paper, we analyze several generations of NVCs and an extensive set of 360-degree projection formats with respect to their compression performance for 360-degree video. Based on our analysis, we find that projection format resampling also yields significant improvements in compression performance for NVCs. The adjusted cubemap projection (ACP) and equatorial cylindrical projection (ECP) perform best and achieve rate savings of more than 55% compared to ERP based on WS-PSNR for the most recent NVC. Remarkably, the observed rate savings are higher than for H.266/VVC, emphasizing the importance of projection format resampling for NVCs. Comment: 5 pages, 4 figures, 1 table, accepted for Picture Coding Symposium 2024 (PCS 2024)
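The abstract reports results in WS-PSNR without defining it. A minimal sketch of the commonly used definition for ERP frames follows: each row is weighted by the cosine of its latitude, so oversampled polar rows count less. The function names and the nested-list image representation are assumptions for this example, and the paper's exact evaluation setup is not given here.

```python
import math


def erp_row_weight(j, height):
    """Cosine-of-latitude weight for row j of an ERP image with `height` rows."""
    return math.cos((j + 0.5 - height / 2.0) * math.pi / height)


def ws_psnr(ref, dist, max_value=255.0):
    """Weighted-to-spherically-uniform PSNR for two ERP images
    given as equally sized lists of rows of sample values."""
    height = len(ref)
    num = 0.0  # weighted sum of squared errors
    den = 0.0  # sum of weights
    for j in range(height):
        w = erp_row_weight(j, height)
        for r, d in zip(ref[j], dist[j]):
            num += w * (r - d) ** 2
            den += w
    wmse = num / den
    if wmse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / wmse)


ref = [[10, 10], [10, 10], [10, 10], [10, 10]]
dist = [[12, 10], [10, 10], [10, 10], [10, 10]]  # error only in a polar row
print(ws_psnr(ref, dist))
```

Because the error in this example sits in a polar row with a small weight, the score is higher than a plain PSNR over the same pixels would be.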