Multi-View Video Packet Scheduling
In multiview applications, multiple cameras acquire the same scene from
different viewpoints and generally produce correlated video streams. This
results in large amounts of highly redundant data. In order to save resources,
it is critical to properly handle this correlation during encoding and
transmission of the multiview data. In this work, we propose a
correlation-aware packet scheduling algorithm for multi-camera networks, where
information from all cameras is transmitted over a bottleneck channel to
clients that reconstruct the multiview images. The scheduling algorithm relies
on a new rate-distortion model that captures the importance of each view in the
scene reconstruction. We propose a problem formulation for the optimization of
the packet scheduling policies, which adapt to variations in the scene content.
Then, we design a low complexity scheduling algorithm based on a trellis search
that selects the subset of candidate packets to be transmitted towards
effective multiview reconstruction at clients. Extensive simulation results
confirm the gain of our scheduling algorithm when inter-source correlation
information is used in the scheduler, compared to scheduling policies with no
information about the correlation or non-adaptive scheduling policies. We
finally show that increasing the optimization horizon in the packet scheduling
algorithm improves the transmission performance, especially in scenarios where
the level of correlation rapidly varies with time.
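As a rough illustration of what a trellis search over scheduling decisions can look like, the sketch below runs a Viterbi-style dynamic program that picks one camera per transmission slot. The model is a toy assumption and not the rate-distortion model of the paper: `gain[t][c]` is a hypothetical distortion reduction for sending camera `c` in slot `t`, and `corr[p][c]` is a hypothetical correlation in [0, 1] that discounts a packet when the previous packet came from a correlated camera.

```python
def trellis_schedule(gain, corr):
    """Viterbi-style search: choose one camera per slot to maximize total gain.

    gain[t][c]: hypothetical distortion reduction of camera c's packet in slot t.
    corr[p][c]: hypothetical correlation in [0, 1] between cameras p and c.
    """
    n_slots, n_cams = len(gain), len(gain[0])
    # best[c] = best total gain of any path that ends with camera c
    best = [float(g) for g in gain[0]]
    back = []  # back[t - 1][c] = best predecessor of camera c at slot t
    for t in range(1, n_slots):
        new_best, pointers = [], []
        for c in range(n_cams):
            # Toy correlation model (an assumption): a packet is worth less
            # when the previous packet came from a highly correlated camera.
            scores = [best[p] + gain[t][c] * (1.0 - corr[p][c])
                      for p in range(n_cams)]
            p_star = max(range(n_cams), key=scores.__getitem__)
            new_best.append(scores[p_star])
            pointers.append(p_star)
        best = new_best
        back.append(pointers)
    # Trace the best path backwards through the trellis.
    c = max(range(n_cams), key=best.__getitem__)
    total = best[c]
    path = [c]
    for pointers in reversed(back):
        c = pointers[c]
        path.append(c)
    path.reverse()
    return path, total


if __name__ == "__main__":
    gain = [[3, 1], [1, 4], [2, 2]]
    corr = [[1.0, 0.5], [0.5, 1.0]]
    print(trellis_schedule(gain, corr))
```

The search keeps only one surviving path per camera and slot, so its cost grows linearly with the number of slots instead of exponentially, which is the usual motivation for a trellis formulation.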
Optimized Data Representation for Interactive Multiview Navigation
In contrast to traditional media streaming services where a unique media
content is delivered to different users, interactive multiview navigation
applications enable users to choose their own viewpoints and freely navigate in
a 3-D scene. The interactivity brings new challenges in addition to the
classical rate-distortion trade-off, which considers only the compression
performance and viewing quality. On the one hand, interactivity necessitates
sufficient viewpoints for richer navigation; on the other hand, it requires
low bandwidth and delay costs for smooth navigation during view
transitions. In this paper, we formally describe the novel trade-offs posed by
the navigation interactivity and classical rate-distortion criterion. Based on
an original formulation, we look for the optimal design of the data
representation by introducing novel rate and distortion models and practical
solving algorithms. Experiments show that the proposed data representation
method outperforms the baseline solution by providing lower resource
consumption and higher visual quality in all navigation configurations, which
confirms the potential of the proposed data representation in
practical interactive navigation systems.
Rate-Distortion Analysis of Multiview Coding in a DIBR Framework
Depth-image-based rendering (DIBR) techniques for multiview applications have
recently been introduced for efficient view generation at arbitrary camera
positions. Encoding rate control thus has to consider both texture and depth
data. Due to the different structures of depth and texture images and their
different roles in the rendered views, distributing the available bit budget
between them requires careful analysis. Information loss due to
texture coding affects the value of pixels in synthesized views, while errors in
depth information lead to shifts in objects or unexpected patterns at their
boundaries. In this paper, we address the problem of efficient bit allocation
between textures and depth data of multiview video sequences. We adopt a
rate-distortion framework based on a simplified model of depth and texture
images. Our model preserves the main features of depth and texture images.
Unlike most recent solutions, our method avoids rendering at encoding
time for distortion estimation, so the encoding complexity is not
increased. In addition, our model is independent of the underlying
inpainting method used at the decoder. Experiments confirm our theoretical
results and the efficiency of our rate allocation strategy.
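To make the texture/depth bit allocation problem concrete, here is a minimal sketch under a textbook high-rate assumption, D = a * 2^(-2 Rt) + b * 2^(-2 Rd) with Rt + Rd fixed. This exponential model and the constants `a_texture` and `b_depth` are illustrative assumptions; they are not the simplified image model used in the paper.

```python
import math


def distortion(rate_texture, rate_depth, a_texture, b_depth):
    """Assumed high-rate model: each component decays as 2^(-2R)."""
    return a_texture * 2 ** (-2 * rate_texture) + b_depth * 2 ** (-2 * rate_depth)


def allocate_bits(total_rate, a_texture, b_depth):
    """Split total_rate between texture and depth to minimize the assumed model.

    Setting the derivative to zero under Rt + Rd = total_rate gives
    Rt = total_rate / 2 + (1 / 4) * log2(a_texture / b_depth).
    The result is clamped so that neither rate becomes negative.
    """
    rate_texture = total_rate / 2 + 0.25 * math.log2(a_texture / b_depth)
    rate_texture = min(max(rate_texture, 0.0), total_rate)
    return rate_texture, total_rate - rate_texture


if __name__ == "__main__":
    rt, rd = allocate_bits(4.0, 16.0, 1.0)
    print(rt, rd, distortion(rt, rd, 16.0, 1.0))
```

Under this assumed model the component with the larger weight receives more bits, and no view rendering is needed to evaluate the split, which mirrors the motivation stated in the abstract.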
Fine color guidance in diffusion models and its application to image compression at extremely low bitrates
This study addresses the challenge of controlling the global color aspect of
images generated with a diffusion model, without training or fine-tuning.
We rewrite the guidance equations to ensure that the outputs are closer to a
known color map, without hindering the quality of the generation. Our
method leads to new guidance equations. We show that, in the color guidance
context, the scaling of the guidance should not decrease but remain high
throughout the diffusion process. In a second contribution, our guidance is
applied in a compression framework, where we combine both semantic and general
color information on the image to decode the images at low cost. We show that our
method is effective at improving the fidelity and realism of compressed images at
extremely low bit rates, when compared to other classical or more
semantic-oriented approaches. Comment: Submitted to IEEE Transactions on Image Processing (TIP)
Navigation domain representation for interactive multiview imaging
Enabling users to interactively navigate through different viewpoints of a
static scene is a new interesting functionality in 3D streaming systems. While
it opens exciting perspectives towards rich multimedia applications, it
requires the design of novel representations and coding techniques in order to
solve the new challenges imposed by interactive navigation. Interactivity
clearly brings new design constraints: the encoder is unaware of the exact
decoding process, while the decoder has to reconstruct information from
incomplete subsets of data since the server can generally not transmit images
for all possible viewpoints due to resource constraints. In this paper, we
propose a novel multiview data representation that makes it possible to satisfy bandwidth
and storage constraints in an interactive multiview streaming system. In
particular, we partition the multiview navigation domain into segments, each of
which is described by a reference image and some auxiliary information. The
auxiliary information enables the client to recreate any viewpoint in the
navigation segment via view synthesis. The decoder is then able to navigate
freely in the segment without further data request to the server; it requests
additional data only when it moves to a different segment. We discuss the
benefits of this novel representation in interactive navigation systems and
further propose a method to optimize the partitioning of the navigation domain
into independent segments, under bandwidth and storage constraints.
Experimental results confirm the potential of the proposed representation;
namely, our system leads to similar compression performance as classical
inter-view coding, while it provides the high level of flexibility that is
required for interactive streaming. Hence, our new framework represents a
promising solution for 3D data representation in novel interactive multimedia
services.
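The partitioning of a navigation domain into segments can be sketched as a small dynamic program over an ordered, one-dimensional set of viewpoints. Everything specific here is an assumption made for illustration: the `segment_cost` callable, the per-segment limit `max_segment_cost` standing in for a bandwidth constraint, and the summed cost standing in for storage. The optimization method of the paper may differ.

```python
def partition_views(n_views, segment_cost, max_segment_cost):
    """Split views 0..n_views-1 into contiguous segments of minimum total cost.

    segment_cost(i, j): hypothetical cost of one segment covering views [i, j).
    max_segment_cost: hypothetical per-segment limit (stand-in for bandwidth).
    Returns (list of (start, end) pairs, total cost).
    """
    inf = float("inf")
    best = [0.0] + [inf] * n_views  # best[j]: min cost to cover views [0, j)
    cut = [0] * (n_views + 1)       # cut[j]: start of the last segment
    for j in range(1, n_views + 1):
        for i in range(j):
            cost = segment_cost(i, j)
            if cost <= max_segment_cost and best[i] + cost < best[j]:
                best[j] = best[i] + cost
                cut[j] = i
    if best[n_views] == inf:
        raise ValueError("no feasible partition under the segment limit")
    segments, j = [], n_views
    while j > 0:
        segments.append((cut[j], j))
        j = cut[j]
    segments.reverse()
    return segments, best[n_views]


if __name__ == "__main__":
    # Toy cost: 10 units for the reference image plus auxiliary
    # information that grows quadratically with the segment width.
    toy_cost = lambda i, j: 10 + (j - i) ** 2
    print(partition_views(6, toy_cost, max_segment_cost=100))
```

With the toy cost above, wide segments pay for more auxiliary information while narrow segments pay for more reference images, so the minimum lies in between.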
Machine Learning for Multimedia Communications
Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of the learning-based architectures enables us to accurately model the image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy or even user perception modeling have widely benefited from the recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.
Image embedding and user multi-preference modeling for data collection sampling
This work proposes an end-to-end user-centric sampling method aimed at selecting the images from an image collection that are able to maximize the information perceived by a given user. As main contributions, we first introduce novel metrics that assess the amount of perceived information retained by the user when experiencing a set of images. Given the actual information present in a set of images, which is the volume spanned by the set in the corresponding latent space, we show how to take into account the user’s preferences in such a volume calculation to build a user-centric metric for the perceived information. Finally, we propose a sampling strategy seeking the minimum set of images that maximizes the information perceived by a given user. Experiments using the COCO dataset show the ability of the proposed approach to accurately integrate user preference while keeping a reasonable diversity in the sampled image set.
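One common way to compute the volume spanned by a set of latent vectors is the square root of the Gram determinant. The sketch below uses that formula and, as an assumption for illustration only, folds user preference in by scaling each embedding with a hypothetical per-image weight before the volume is computed. The metric actually defined in the paper may differ.

```python
import math


def gram_volume(vectors, weights=None):
    """Volume of the parallelotope spanned by the (optionally weighted) vectors.

    vectors: list of equal-length embeddings (lists of floats).
    weights: hypothetical per-image user-preference scores; None means 1.0 each.
    """
    n = len(vectors)
    if weights is None:
        weights = [1.0] * n
    scaled = [[w * x for x in v] for v, w in zip(vectors, weights)]
    # Gram matrix of pairwise inner products.
    gram = [[sum(a * b for a, b in zip(u, v)) for v in scaled] for u in scaled]
    # Determinant by Gaussian elimination with partial pivoting.
    det = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(gram[r][k]))
        if abs(gram[pivot][k]) < 1e-12:
            return 0.0  # linearly dependent set: no volume
        if pivot != k:
            gram[k], gram[pivot] = gram[pivot], gram[k]
            det = -det
        det *= gram[k][k]
        for r in range(k + 1, n):
            factor = gram[r][k] / gram[k][k]
            for c in range(k, n):
                gram[r][c] -= factor * gram[k][c]
    return math.sqrt(max(det, 0.0))


if __name__ == "__main__":
    print(gram_volume([[1.0, 0.0], [0.0, 1.0]]))              # distinct images
    print(gram_volume([[1.0, 0.0], [1.0, 0.0]]))              # redundant images
    print(gram_volume([[1.0, 0.0], [0.0, 1.0]], [2.0, 3.0]))  # weighted
```

Redundant embeddings collapse the volume toward zero, while larger weights on preferred images inflate it, so a sampler that maximizes this quantity trades diversity against preference.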
Correlation-aware packet scheduling in multi-camera networks
In multiview applications, multiple cameras acquire the same scene from different viewpoints and generally produce correlated video streams. This results in large amounts of highly redundant data. In order to save resources, it is critical to properly handle this correlation during encoding and transmission of the multiview data. In this work, we propose a correlation-aware packet scheduling algorithm for multi-camera networks, where information from all cameras is transmitted over a bottleneck channel to clients that reconstruct the multiview images. The scheduling algorithm relies on a new rate-distortion model that captures the importance of each view in the scene reconstruction. We propose a problem formulation for the optimization of the packet scheduling policies, which adapt to variations in the scene content. Then, we design a low complexity scheduling algorithm based on a trellis search that selects the subset of candidate packets to be transmitted towards effective multiview reconstruction at clients. Extensive simulation results confirm the gain of our scheduling algorithm when inter-source correlation information is used in the scheduler, compared to scheduling policies with no information about the correlation or non-adaptive scheduling policies. We finally show that increasing the optimization horizon in the packet scheduling algorithm improves the transmission performance, especially in scenarios where the level of correlation rapidly varies with time. © 2013 IEEE
Optimized Packet Scheduling in Multiview Video Navigation Systems
In multiview video systems, multiple cameras generally acquire the same scene
from different perspectives, such that users have the possibility to select
their preferred viewpoint. This results in large amounts of highly redundant
data, which needs to be properly handled during encoding and transmission over
resource-constrained channels. In this work, we study coding and transmission
strategies in multicamera systems, where correlated sources send data through a
bottleneck channel to a central server, which eventually transmits views to
different interactive users. We propose a dynamic correlation-aware packet
scheduling optimization under delay, bandwidth, and interactivity constraints.
The optimization relies both on a novel rate-distortion model, which captures
the importance of each view in the 3D scene reconstruction, and on an objective
function that optimizes resources based on a client navigation model. The
latter takes into account the distortion experienced by interactive clients as
well as the distortion variations that might be observed by clients during
multiview navigation. We solve the scheduling problem with a novel
trellis-based solution, which formally decomposes the multivariate
optimization problem, thereby significantly reducing the computational complexity.
Simulation results show the gain of the proposed algorithm compared to baseline
scheduling policies. More specifically, we show the gain offered by our dynamic
scheduling policy compared to static camera allocation strategies and to
schemes with constant coding strategies. Finally, we show that the best
scheduling policy consistently adapts to the most likely user navigation path
and that it minimizes distortion variations that can be very disturbing for
users in traditional navigation systems.