7,445 research outputs found
Navigation domain representation for interactive multiview imaging
Enabling users to interactively navigate through different viewpoints of a
static scene is a new interesting functionality in 3D streaming systems. While
it opens exciting perspectives towards rich multimedia applications, it
requires the design of novel representations and coding techniques in order to
solve the new challenges imposed by interactive navigation. Interactivity
clearly brings new design constraints: the encoder is unaware of the exact
decoding process, while the decoder has to reconstruct information from
incomplete subsets of data since the server can generally not transmit images
for all possible viewpoints due to resource constrains. In this paper, we
propose a novel multiview data representation that permits to satisfy bandwidth
and storage constraints in an interactive multiview streaming system. In
particular, we partition the multiview navigation domain into segments, each of
which is described by a reference image and some auxiliary information. The
auxiliary information enables the client to recreate any viewpoint in the
navigation segment via view synthesis. The decoder is then able to navigate
freely in the segment without further data request to the server; it requests
additional data only when it moves to a different segment. We discuss the
benefits of this novel representation in interactive navigation systems and
further propose a method to optimize the partitioning of the navigation domain
into independent segments, under bandwidth and storage constraints.
Experimental results confirm the potential of the proposed representation;
namely, our system leads to similar compression performance as classical
inter-view coding, while it provides the high level of flexibility that is
required for interactive streaming. Hence, our new framework represents a
promising solution for 3D data representation in novel interactive multimedia
services
Multi-View Video Packet Scheduling
In multiview applications, multiple cameras acquire the same scene from
different viewpoints and generally produce correlated video streams. This
results in large amounts of highly redundant data. In order to save resources,
it is critical to handle properly this correlation during encoding and
transmission of the multiview data. In this work, we propose a
correlation-aware packet scheduling algorithm for multi-camera networks, where
information from all cameras are transmitted over a bottleneck channel to
clients that reconstruct the multiview images. The scheduling algorithm relies
on a new rate-distortion model that captures the importance of each view in the
scene reconstruction. We propose a problem formulation for the optimization of
the packet scheduling policies, which adapt to variations in the scene content.
Then, we design a low complexity scheduling algorithm based on a trellis search
that selects the subset of candidate packets to be transmitted towards
effective multiview reconstruction at clients. Extensive simulation results
confirm the gain of our scheduling algorithm when inter-source correlation
information is used in the scheduler, compared to scheduling policies with no
information about the correlation or non-adaptive scheduling policies. We
finally show that increasing the optimization horizon in the packet scheduling
algorithm improves the transmission performance, especially in scenarios where
the level of correlation rapidly varies with time
Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing
Free-viewpoint video conferencing allows a participant to observe the remote
3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint
image is commonly synthesized using two pairs of transmitted texture and depth
maps from two neighboring captured viewpoints via depth-image-based rendering
(DIBR). To maintain high quality of synthesized images, it is imperative to
contain the adverse effects of network packet losses that may arise during
texture and depth video transmission. Towards this end, we develop an
integrated approach that exploits the representation redundancy inherent in the
multiple streamed videos a voxel in the 3D scene visible to two captured views
is sampled and coded twice in the two views. In particular, at the receiver we
first develop an error concealment strategy that adaptively blends
corresponding pixels in the two captured views during DIBR, so that pixels from
the more reliable transmitted view are weighted more heavily. We then couple it
with a sender-side optimization of reference picture selection (RPS) during
real-time video coding, so that blocks containing samples of voxels that are
visible in both views are more error-resiliently coded in one view only, given
adaptive blending will erase errors in the other view. Further, synthesized
view distortion sensitivities to texture versus depth errors are analyzed, so
that relative importance of texture and depth code blocks can be computed for
system-wide RPS optimization. Experimental results show that the proposed
scheme can outperform the use of a traditional feedback channel by up to 0.82
dB on average at 8% packet loss rate, and by as much as 3 dB for particular
frames
Multi-Stream Management for Supporting Multi-Party 3D Tele-Immersive Environments
Three-dimensional tele-immersive (3DTI) environments have great potential to promote collaborative work among geographically distributed participants. However, extensive application of 3DTI environments is still hindered by problems pertaining to scalability, manageability and reliance of special-purpose components. Thus, one critical question is how to organize the acquisition, transmission and display of large volume real-time 3D visual data over commercially available computing and networking infrastructures so that .everybody. would be able to install and enjoy 3DTI environments for high quality tele-collaboration.
In the thesis, we explore the design space from the angle of multi-stream Quality-of-Service (QoS) management to support multi-party 3DTI communication. In 3DTI environments, multiple correlated 3D video streams are deployed to provide a comprehensive representation of the physical scene. Traditional QoS approach in 2D and single-stream scenario has become inadequate. On the other hand, the existence of multiple streams provides unique opportunity for QoS provisioning.
We propose an innovative cross-layer hierarchical and distributed multi-stream management middleware framework for QoS provisioning to fully enable multi-party 3DTI communication over general delivery infrastructure. The major contributions are as follows. First, we introduce the view model for representing the user interest in the application layer. The design revolves around the concept of view-aware multi-stream coordination, which leverages the central role of view semantics in 3D video systems. Second, in the stream differentiation layer we present the design of view to stream mapping, where a subset of relevant streams are selected based on the relative importance of each stream to the current view. Conventional streaming controllers focus on a fixed set of streams specified by the application. Different from all the others, in our management framework the application layer only specifies the view information while the underlying controller dynamically determines the set of streams to be managed. Third, in the stream coordination layer we present two designs applicable in different situations. In the case of end-to-end 3DTI communication, a learning-based controller is embedded which provides bandwidth allocation for relevant streams. In the case of multi-party 3DTI communication, we propose a novel ViewCast protocol to coordinate the multi-stream content dissemination upon an end-system overlay network
An Efficient Transport Protocol for delivery of Multimedia An Efficient Transport Protocol for delivery of Multimedia Content in Wireless Grids
A grid computing system is designed for solving complicated scientific and
commercial problems effectively,whereas mobile computing is a traditional
distributed system having computing capability with mobility and adopting
wireless communications. Media and Entertainment fields can take advantage from
both paradigms by applying its usage in gaming applications and multimedia data
management. Multimedia data has to be stored and retrieved in an efficient and
effective manner to put it in use. In this paper, we proposed an application
layer protocol for delivery of multimedia data in wireless girds i.e.
multimedia grid protocol (MMGP). To make streaming efficient a new video
compression algorithm called dWave is designed and embedded in the proposed
protocol. This protocol will provide faster, reliable access and render an
imperceptible QoS in delivering multimedia in wireless grid environment and
tackles the challenging issues such as i) intermittent connectivity, ii) device
heterogeneity, iii) weak security and iv) device mobility.Comment: 20 pages, 15 figures, Peer Reviewed Journa
Mobile graphics: SIGGRAPH Asia 2017 course
Peer ReviewedPostprint (published version
Optimized Data Representation for Interactive Multiview Navigation
In contrary to traditional media streaming services where a unique media
content is delivered to different users, interactive multiview navigation
applications enable users to choose their own viewpoints and freely navigate in
a 3-D scene. The interactivity brings new challenges in addition to the
classical rate-distortion trade-off, which considers only the compression
performance and viewing quality. On the one hand, interactivity necessitates
sufficient viewpoints for richer navigation; on the other hand, it requires to
provide low bandwidth and delay costs for smooth navigation during view
transitions. In this paper, we formally describe the novel trade-offs posed by
the navigation interactivity and classical rate-distortion criterion. Based on
an original formulation, we look for the optimal design of the data
representation by introducing novel rate and distortion models and practical
solving algorithms. Experiments show that the proposed data representation
method outperforms the baseline solution by providing lower resource
consumptions and higher visual quality in all navigation configurations, which
certainly confirms the potential of the proposed data representation in
practical interactive navigation systems
Perception-driven approaches to real-time remote immersive visualization
In remote immersive visualization systems, real-time 3D perception through RGB-D cameras, combined with modern Virtual Reality (VR) interfaces, enhances the user’s sense of presence in a remote scene through 3D reconstruction rendered in a remote immersive visualization system. Particularly, in situations when there is a need to visualize, explore and perform tasks in inaccessible environments, too hazardous or distant. However, a remote visualization system requires the entire pipeline from 3D data acquisition to VR rendering satisfies the speed, throughput, and high visual realism. Mainly when using point-cloud, there is a fundamental quality difference between the acquired data of the physical world and the displayed data because of network latency and throughput limitations that negatively impact the sense of presence and provoke cybersickness. This thesis presents state-of-the-art research to address these problems by taking the human visual system as inspiration, from sensor data acquisition to VR rendering. The human visual system does not have a uniform vision across the field of view; It has the sharpest visual acuity at the center of the field of view. The acuity falls off towards the periphery. The peripheral vision provides lower resolution to guide the eye movements so that the central vision visits all the interesting crucial parts. As a first contribution, the thesis developed remote visualization strategies that utilize the acuity fall-off to facilitate the processing, transmission, buffering, and rendering in VR of 3D reconstructed scenes while simultaneously reducing throughput requirements and latency. As a second contribution, the thesis looked into attentional mechanisms to select and draw user engagement to specific information from the dynamic spatio-temporal environment. It proposed a strategy to analyze the remote scene concerning the 3D structure of the scene, its layout, and the spatial, functional, and semantic relationships between objects in the scene. The strategy primarily focuses on analyzing the scene with models the human visual perception uses. It sets a more significant proportion of computational resources on objects of interest and creates a more realistic visualization. As a supplementary contribution, A new volumetric point-cloud density-based Peak Signal-to-Noise Ratio (PSNR) metric is proposed to evaluate the introduced techniques. An in-depth evaluation of the presented systems, comparative examination of the proposed point cloud metric, user studies, and experiments demonstrated that the methods introduced in this thesis are visually superior while significantly reducing latency and throughput
- …