222 research outputs found
Navigation domain representation for interactive multiview imaging
Enabling users to interactively navigate through different viewpoints of a
static scene is a new interesting functionality in 3D streaming systems. While
it opens exciting perspectives towards rich multimedia applications, it
requires the design of novel representations and coding techniques in order to
solve the new challenges imposed by interactive navigation. Interactivity
clearly brings new design constraints: the encoder is unaware of the exact
decoding process, while the decoder has to reconstruct information from
incomplete subsets of data since the server can generally not transmit images
for all possible viewpoints due to resource constrains. In this paper, we
propose a novel multiview data representation that permits to satisfy bandwidth
and storage constraints in an interactive multiview streaming system. In
particular, we partition the multiview navigation domain into segments, each of
which is described by a reference image and some auxiliary information. The
auxiliary information enables the client to recreate any viewpoint in the
navigation segment via view synthesis. The decoder is then able to navigate
freely in the segment without further data request to the server; it requests
additional data only when it moves to a different segment. We discuss the
benefits of this novel representation in interactive navigation systems and
further propose a method to optimize the partitioning of the navigation domain
into independent segments, under bandwidth and storage constraints.
Experimental results confirm the potential of the proposed representation;
namely, our system leads to similar compression performance as classical
inter-view coding, while it provides the high level of flexibility that is
required for interactive streaming. Hence, our new framework represents a
promising solution for 3D data representation in novel interactive multimedia
services
Optimized Packet Scheduling in Multiview Video Navigation Systems
In multiview video systems, multiple cameras generally acquire the same scene
from different perspectives, such that users have the possibility to select
their preferred viewpoint. This results in large amounts of highly redundant
data, which needs to be properly handled during encoding and transmission over
resource-constrained channels. In this work, we study coding and transmission
strategies in multicamera systems, where correlated sources send data through a
bottleneck channel to a central server, which eventually transmits views to
different interactive users. We propose a dynamic correlation-aware packet
scheduling optimization under delay, bandwidth, and interactivity constraints.
The optimization relies both on a novel rate-distortion model, which captures
the importance of each view in the 3D scene reconstruction, and on an objective
function that optimizes resources based on a client navigation model. The
latter takes into account the distortion experienced by interactive clients as
well as the distortion variations that might be observed by clients during
multiview navigation. We solve the scheduling problem with a novel
trellis-based solution, which permits to formally decompose the multivariate
optimization problem thereby significantly reducing the computation complexity.
Simulation results show the gain of the proposed algorithm compared to baseline
scheduling policies. More in details, we show the gain offered by our dynamic
scheduling policy compared to static camera allocation strategies and to
schemes with constant coding strategies. Finally, we show that the best
scheduling policy consistently adapts to the most likely user navigation path
and that it minimizes distortion variations that can be very disturbing for
users in traditional navigation systems
Interactive Multiview Video System With Low Complexity 2D Look Around at Decoder
Multiview video with interactive and smooth view switching at the receiver is a challenging application with several issues in terms of effective use of storage and bandwidth resources, reactivity of the system, quality of the viewing experience and system complexity. The classical decoding system for generating virtual views first projects a reference or encoded frame to a given viewpoint and then fills in the holes due to potential occlusions. This last step still constitutes a complex operation with specific software or hardware at the receiver and requires a certain quantity of information from the neighboring frames for insuring consistency between the virtual images. In this work we propose a new approach that shifts most of the burden due to interactivity from the decoder to the encoder, by anticipating the navigation of the decoder and sending auxiliary information that guarantees temporal and interview consistency. This leads to an additional cost in terms of transmission rate and storage, which we minimize by using optimization techniques based on the user behavior modeling. We show by experiments that the proposed system represents a valid solution for interactive multiview systems with classical decoders
Interactive multiview video system with low decoding complexity
Research in multimedia is always investigating new ways of improving the immersive experience of the users. One current solution consists in designing systems which offer a high level of interactivity, such as multiview content navigation where the point of view can be changed while watching at a video sequence (e.g., free viewpoint television, gaming, etc.). The coding algorithm designed for the transmission of such media streams must be adapted to these novel decoder needs. However, video plus depth data transmission is usually performed by considering the information flows as two sequences encoded with MVC schemes. Whereas it achieves good compression performance, this coding approach is not appropriate for interactive applications since the decoding of a frame often requires the prior transmission and decoding of several reference frames. Moreover, the techniques recently developed to improve interactivity are generally implemented at the decoder, whose computational complexity requirements are augmented. In this paper, we propose a novel coding scheme for video plus depth sequences that is adapted to user navigation; contrarily to several common approaches, the additional complexity is added on the encoder side so that the decoder stays simple. We further propose to limit the additional bandwidth imposed by interactivity requirements by designing a rate allocation algorithm that builds on a model of the user behavior. A first version of our novel coding architecture is evaluated in terms of rate-distortion performance, where it is shown to offer a high interactivity at a reasonable bandwidth cost
FVV Live: A real-time free-viewpoint video system with consumer electronics hardware
FVV Live is a novel end-to-end free-viewpoint video system, designed for low
cost and real-time operation, based on off-the-shelf components. The system has
been designed to yield high-quality free-viewpoint video using consumer-grade
cameras and hardware, which enables low deployment costs and easy installation
for immersive event-broadcasting or videoconferencing.
The paper describes the architecture of the system, including acquisition and
encoding of multiview plus depth data in several capture servers and virtual
view synthesis on an edge server. All the blocks of the system have been
designed to overcome the limitations imposed by hardware and network, which
impact directly on the accuracy of depth data and thus on the quality of
virtual view synthesis. The design of FVV Live allows for an arbitrary number
of cameras and capture servers, and the results presented in this paper
correspond to an implementation with nine stereo-based depth cameras.
FVV Live presents low motion-to-photon and end-to-end delays, which enables
seamless free-viewpoint navigation and bilateral immersive communications.
Moreover, the visual quality of FVV Live has been assessed through subjective
assessment with satisfactory results, and additional comparative tests show
that it is preferred over state-of-the-art DIBR alternatives
Recommended from our members
End-to-end 3D video communication over heterogeneous networks
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Three-dimensional technology, more commonly referred to as 3D technology, has revolutionised many fields including entertainment, medicine, and communications to name a few. In addition to 3D films, games, and sports channels, 3D perception has made tele-medicine a reality. By the year 2015, 30% of the all HD panels at home will be 3D enabled, predicted by consumer electronics manufacturers. Stereoscopic cameras, a comparatively mature technology compared to other 3D systems, are now being used by ordinary citizens to produce 3D content and share at a click of a button just like they do with the 2D counterparts via sites like YouTube. But technical challenges still exist, including with autostereoscopic multiview displays. 3D content requires many complex considerations--including how to represent it, and deciphering what is the best compression format--when considering transmission or storage, because of its increased amount of data. Any decision must be taken in the light of the available bandwidth or storage capacity, quality and user expectations. Free viewpoint navigation also remains partly unsolved. The most pressing issue getting in the way of widespread uptake of consumer 3D systems is the ability to deliver 3D content to heterogeneous consumer displays over the heterogeneous networks. Optimising 3D video communication solutions must consider the entire pipeline, starting with optimisation at the video source to the end display and transmission optimisation. Multi-view offers the most compelling solution for 3D videos with motion parallax and freedom from wearing headgear for 3D video perception. Optimising multi-view video for delivery and display could increase the demand for true 3D in the consumer market. This thesis focuses on an end-to-end quality optimisation in 3D video communication/transmission, offering solutions for optimisation at the compression, transmission, and decoder levels.Brunel University - Isambard Research Scholarshi
- …