Spatial Perceptual Quality Aware Adaptive Volumetric Video Streaming
Volumetric video offers a highly immersive viewing experience, but poses
challenges in ensuring quality of experience (QoE) due to its high bandwidth
requirements. In this paper, we explore the effect of viewing distance
introduced by six degrees of freedom (6DoF) spatial navigation on users'
perceived quality. By considering human visual resolution limitations, we
propose a visual acuity model that describes the relationship between the
virtual viewing distance and the tolerable boundary point cloud density. The
proposed model satisfies spatial visual requirements during 6DoF exploration.
Additionally, it dynamically adjusts quality levels to balance perceptual
quality and bandwidth consumption. Furthermore, we present a QoE model that precisely represents users' perceived quality at different viewing distances. Extensive experimental results demonstrate that the proposed scheme can effectively improve the overall average QoE by up to 26% over real networks and user traces, compared to existing baselines. Comment: Accepted by IEEE Globecom 202
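The abstract does not give the model's closed form, so the following is an illustration only: a minimal sketch of how a viewing-distance-to-density bound could be derived, assuming a fixed angular acuity limit of about one arcminute (a common rule of thumb for human vision) and hypothetical helper names not taken from the paper.

```python
import math

ARCMIN = math.radians(1.0 / 60.0)  # assumed acuity limit (~1 arcminute)

def min_point_density(distance_m, acuity_rad=ARCMIN):
    """Tolerable boundary density (points/m^2): the coarsest sampling whose
    inter-point spacing still subtends no more than the acuity angle at the
    given virtual viewing distance."""
    spacing = distance_m * math.tan(acuity_rad)  # largest imperceptible gap
    return 1.0 / spacing ** 2

def pick_quality(distance_m, levels):
    """Choose the cheapest level meeting the boundary density.
    `levels` is a list of (name, density) pairs sorted by ascending density."""
    needed = min_point_density(distance_m)
    for name, density in levels:
        if density >= needed:
            return name
    return levels[-1][0]  # fall back to the densest available level

levels = [("low", 1e6), ("mid", 5e6), ("high", 2e7)]
```

Because the bound scales as 1/distance², a viewer backing away from an object quadratically relaxes the density (and hence bitrate) the client must fetch, which is the lever such a scheme uses to trade perceptual quality against bandwidth.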
From capturing to rendering: volumetric media delivery with six degrees of freedom
Technological improvements are rapidly advancing holographic-type content distribution. Significant research efforts have been made to meet the low latency and high bandwidth requirements set forth by interactive applications such as remote surgery and virtual reality. Recent research made six degrees of freedom (6DoF) for immersive media possible, where users may both move their head and change their position within a scene. In this article, we present the status and challenges of 6DoF applications based on volumetric media, focusing on the key aspects required to deliver such services. Furthermore, we present results from a subjective study to highlight relevant directions for future research.
User centered adaptive streaming of dynamic point clouds with low complexity tiling
In recent years, the development of devices for the acquisition and rendering of 3D content has facilitated the diffusion of immersive virtual reality experiences. In particular, the point cloud representation has emerged as a popular format for volumetric photorealistic reconstructions of dynamic real-world objects, due to its simplicity and versatility. To optimize the delivery of the large amount of data needed to provide these experiences, adaptive streaming over HTTP is a promising solution. In order to ensure the best quality of experience within the bandwidth constraints, adaptive streaming is combined with tiling to optimize the quality of what is being visualized by the user at a given moment; as such, it has been successfully used in the past for omnidirectional content. However, its adoption in the point cloud streaming scenario has only been studied to optimize multi-object delivery. In this paper, we present a low-complexity tiling approach to perform adaptive streaming of point cloud content. Tiles are defined by segmenting each point cloud object into several parts, which are then independently encoded. In order to evaluate the approach, we first collect real navigation paths, obtained through a user study in 6 degrees of freedom with 26 participants. The variation in movements and interaction behaviour among users indicates that user-centered adaptive delivery could lead to significant gains in terms of perceived quality. Evaluation of the performance of the proposed tiling approach against state-of-the-art solutions for point cloud compression, performed on the collected navigation paths, confirms that considerable gains can be obtained by exploiting user-adaptive streaming, achieving bitrate gains of up to 57% with respect to a non-adaptive approach with the same codec. Moreover, we demonstrate that the selection of navigation data has an impact on the relative objective scores.
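The abstract leaves the tile-rate selection rule unspecified; as an illustration only, here is a hypothetical greedy sketch of user-centered tile adaptation: every tile gets the base rate, then tiles the user is most likely to see are upgraded first until the bandwidth budget runs out. The names and the visibility-weight model are assumptions, not the paper's method.

```python
def allocate_tile_bitrates(tiles, rates, budget):
    """Greedy per-tile rate selection (sketch). `tiles` maps tile id to a
    visibility weight in [0, 1] estimated from the user's navigation path;
    `rates` is an ascending list of encoded bitrates per tile."""
    choice = {t: 0 for t in tiles}            # index into `rates` per tile
    spent = len(tiles) * rates[0]             # every tile starts at base rate
    # Visit tiles most-visible first and push each as high as budget allows.
    for t in sorted(tiles, key=tiles.get, reverse=True):
        while (choice[t] + 1 < len(rates)
               and spent + rates[choice[t] + 1] - rates[choice[t]] <= budget):
            spent += rates[choice[t] + 1] - rates[choice[t]]
            choice[t] += 1
    return {t: rates[i] for t, i in choice.items()}
```

For example, with rates [1, 2, 4] Mb/s, tiles {"front": 1.0, "side": 0.4, "back": 0.0}, and a 7 Mb/s budget, the front tile is driven to the top rate while the occluded back tile stays at the base rate, which is exactly the bitrate saving a user-adaptive scheme exploits.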
Joint Communication and Computational Resource Allocation for QoE-driven Point Cloud Video Streaming
Point cloud video is the most popular representation of holograms, the medium for presenting natural content in VR/AR/MR, and is expected to be the next-generation video format. A point cloud video system provides users an immersive viewing experience with six degrees of freedom and has wide applications in many fields such as online education and entertainment. To further enhance these applications, point cloud video streaming is in critical demand. The inherent challenges lie in the large data size, caused by the need to record three-dimensional coordinates in addition to color information, and the associated high computational complexity of encoding. To this end, this paper proposes a communication and computation resource allocation scheme for QoE-driven point cloud video streaming. In particular, we maximize system resource utilization by selecting tile quantities, transmission forms, and quality levels so as to maximize the quality of experience. Extensive simulations are conducted, and the results show superior performance over the existing scheme.
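The paper's optimization is not spelled out in the abstract; purely as an illustration of joint communication/computation allocation, a hypothetical greedy sketch: each tile starts at its cheapest option, and the single upgrade with the best QoE gain per unit of combined cost is applied repeatedly while both budgets hold. All names and the cost model are assumptions.

```python
def allocate_resources(options, bw_budget, cpu_budget):
    """Greedy joint allocation sketch. `options[t]` lists (qoe, bw, cpu)
    tuples for tile t, ascending in qoe; returns the chosen index per tile."""
    pick = {t: 0 for t in options}
    bw = sum(opts[0][1] for opts in options.values())
    cpu = sum(opts[0][2] for opts in options.values())
    while True:
        best, best_ratio = None, 0.0
        for t, i in pick.items():
            if i + 1 < len(options[t]):
                q0, b0, c0 = options[t][i]
                q1, b1, c1 = options[t][i + 1]
                db, dc = b1 - b0, c1 - c0
                if bw + db <= bw_budget and cpu + dc <= cpu_budget:
                    ratio = (q1 - q0) / max(db + dc, 1e-9)
                    if ratio > best_ratio:
                        best, best_ratio = t, ratio
        if best is None:          # no upgrade fits both budgets any more
            return pick
        q0, b0, c0 = options[best][pick[best]]
        q1, b1, c1 = options[best][pick[best] + 1]
        bw, cpu = bw + b1 - b0, cpu + c1 - c0
        pick[best] += 1
```

The point of the two budgets is the abstract's trade-off: a tile can be cheap in bandwidth but expensive in encoding compute (or vice versa, e.g. pre-encoded transmission forms), so the two resources must be allocated jointly rather than independently.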
Machine Learning for Multimedia Communications
Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of the learning-based architectures enables us to accurately model image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy, and even user perception modeling have widely benefited from the recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though only a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.
BOLA360: Near-optimal View and Bitrate Adaptation for 360-degree Video Streaming
Recent advances in omnidirectional cameras and AR/VR headsets have spurred
the adoption of 360-degree videos that are widely believed to be the future of
online video streaming. 360-degree videos allow users to wear a head-mounted
display (HMD) and experience the video as if they are physically present in the
scene. Streaming high-quality 360-degree videos at scale is an unsolved problem
that is more challenging than traditional (2D) video delivery. The data rate
required to stream 360-degree videos is an order of magnitude more than
traditional videos. Further, the penalty for rebuffering events where the video
freezes or displays a blank screen is more severe as it may cause
cybersickness. We propose an online adaptive bitrate (ABR) algorithm for
360-degree videos called BOLA360 that runs inside the client's video player and
orchestrates the download of video segments from the server so as to maximize
the quality-of-experience (QoE) of the user. BOLA360 conserves bandwidth by
downloading only those video segments that are likely to fall within the
field-of-view (FOV) of the user. In addition, BOLA360 continually adapts the
bitrate of the downloaded video segments so as to enable a smooth playback
without rebuffering. We prove that BOLA360 is near-optimal with respect to an
optimal offline algorithm that maximizes QoE. Further, we evaluate BOLA360 on a
wide range of network and user head movement profiles and show that it provides
higher QoE than state-of-the-art algorithms. While ABR
algorithms for traditional (2D) videos have been well-studied over the last
decade, our work is the first ABR algorithm for 360-degree videos with both
theoretical and empirical guarantees on its performance. Comment: 25 pages
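The full BOLA360 rule also weighs view probabilities across tiles; as a simplified illustration of the BOLA family of Lyapunov-based ABR rules it builds on (not the paper's exact algorithm), one decision step maximizes a drift-plus-penalty score over the available bitrates, and waits when every score is negative:

```python
def bola_choice(buffer_s, utilities, sizes, V=0.9, gamma=5.0):
    """One BOLA-style decision (sketch): pick the bitrate index maximizing
    (V * (utility + gamma) - buffer) / size. `utilities` are per-bitrate
    utility values (e.g. log of the rate), `sizes` the segment sizes.
    Returns None when no option scores positive, i.e. pause downloading."""
    best, best_score = None, 0.0
    for i, (u, s) in enumerate(zip(utilities, sizes)):
        score = (V * (u + gamma) - buffer_s) / s
        if score > best_score:
            best, best_score = i, score
    return best
```

The buffer level acts as the control signal: an empty buffer favors small, low-bitrate segments (fast refill, no rebuffering), a full buffer favors larger, higher-quality segments, and a very full buffer makes the player idle, which is how the scheme trades quality against rebuffering risk without a throughput predictor.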
EdgeRIC: Empowering Realtime Intelligent Optimization and Control in NextG Networks
Radio Access Networks (RAN) are increasingly softwarized and accessible via
data-collection and control interfaces. RAN intelligent control (RIC) is an
approach to manage these interfaces at different timescales. In this paper, we
develop a RIC platform called RICworld, consisting of (i) EdgeRIC, which is
colocated, but decoupled from the RAN stack, and can access RAN and
application-level information to execute AI-optimized and other policies in
realtime (sub-millisecond) and (ii) DigitalTwin, a full-stack, trace-driven
emulator for training AI-based policies offline. We demonstrate that realtime
EdgeRIC operates as if embedded within the RAN stack and significantly
outperforms a cloud-based near-realtime RIC (> 15 ms latency) in terms of
attained throughput. We train AI-based polices on DigitalTwin, execute them on
EdgeRIC, and show that these policies are robust to channel dynamics, and
outperform queueing-model based policies by 5% to 25% on throughput and
application-level benchmarks in a variety of mobile environments. Comment: 16 pages, 15 figures
Joint optimization of bitrate selection and beamforming for holographic video cooperative streaming in VLC systems
Holographic video streaming requires ultrahigh channel capacity, which might not be achieved by the existing radio frequency-based wireless networks. To address this challenge, we propose a holographic video cooperative streaming framework by integrating coordinated multipoint transmission and beamforming technologies in visible light communication (VLC) systems. This framework enables simultaneous video streaming with an ultrahigh data rate for multiple users in the VLC system, resulting in a more efficient and effective streaming process. By mathematically modeling the streaming framework, we formulate a joint bitrate selection and beamforming problem, aiming to maximize the average video quality experienced by all users. The problem is a non-convex mixed-integer problem and is NP-hard in general. We propose an algorithm with polynomial time complexity for the problem using an alternating optimization technique along with an appropriate rounding operation. Numerical results demonstrate the superiority of the proposed joint bitrate selection and beamforming solution over baselines.
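The problem-specific subsolvers are not given in the abstract; the following is only a generic skeleton of the alternating-optimization pattern such an algorithm follows: fix the (integer) bitrate selection and optimize the beamformer, then fix the beamformer and re-select bitrates, until the objective stops improving. `update_x` and `update_y` stand in for the paper's actual subproblem solvers and are assumptions.

```python
def alternating_opt(f, x0, y0, update_x, update_y, rounds=20, tol=1e-6):
    """Alternating maximization skeleton. `f(x, y)` is the objective
    (e.g. average video quality), `update_x(y)` solves the continuous
    beamforming subproblem for fixed bitrates y, and `update_y(x)` solves
    the (rounded) bitrate-selection subproblem for fixed beamformer x."""
    x, y, best = x0, y0, f(x0, y0)
    for _ in range(rounds):
        x = update_x(y)           # continuous block, y held fixed
        y = update_y(x)           # integer block (with rounding), x held fixed
        val = f(x, y)
        if val - best <= tol:     # no meaningful improvement: stop
            break
        best = val
    return x, y, best
```

Each block update can only improve (or keep) the objective, so on a bounded objective the iterates converge to a block-coordinate stationary point; the rounding step is what keeps the integer bitrate variables feasible between continuous relaxations.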
Neural Radiance Fields: Past, Present, and Future
The various aspects of modeling and interpreting 3D environments and surroundings have enticed humans to progress their research in 3D Computer Vision, Computer Graphics, and Machine Learning. The paper by Mildenhall et al. on NeRFs (Neural Radiance Fields) led to a boom in Computer Graphics, Robotics, and Computer Vision, and high-resolution, low-storage Augmented Reality and Virtual Reality-based 3D models have gained traction from researchers, with more than 1000 preprints related to NeRFs published. This paper serves as a bridge for people starting to study these fields by building from the basics of Mathematics, Geometry, Computer Vision, and Computer Graphics to the difficulties encountered in Implicit Representations at the intersection of all these disciplines. This survey provides the history of rendering, Implicit Learning, and NeRFs, the progression of research on NeRFs, and the potential applications and implications of NeRFs in today's world. In doing so, this survey categorizes all the NeRF-related research in terms of the datasets used, objective functions, applications solved, and evaluation criteria for these applications. Comment: 413 pages, 9 figures, 277 citations
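At the core of every NeRF is the volume-rendering quadrature from Mildenhall et al.: along a ray, each sample's opacity is alpha_i = 1 - exp(-sigma_i * delta_i), its weight is the transmittance accumulated in front of it times that opacity, and the pixel is the weighted sum of sample colors. A minimal scalar sketch (one color channel, plain Python rather than a real NeRF's batched tensors):

```python
import math

def composite(sigmas, colors, deltas):
    """NeRF-style compositing along one ray. `sigmas` are predicted
    densities, `colors` the per-sample color values (one channel here),
    `deltas` the distances between consecutive samples. Returns the
    rendered value and the residual transmittance (background weight)."""
    pixel, T = 0.0, 1.0
    for sigma, c, d in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * d)  # opacity of this segment
        pixel += T * alpha * c              # weight = transmittance * opacity
        T *= 1.0 - alpha                    # light surviving past the sample
    return pixel, T
```

Because every operation is differentiable, the photometric loss between rendered and ground-truth pixels can be backpropagated into the network that predicts sigma and color, which is what makes the implicit representation trainable from posed images alone.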