Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing
Free-viewpoint video conferencing allows a participant to observe the remote
3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint
image is commonly synthesized using two pairs of transmitted texture and depth
maps from two neighboring captured viewpoints via depth-image-based rendering
(DIBR). To maintain high quality of synthesized images, it is imperative to
contain the adverse effects of network packet losses that may arise during
texture and depth video transmission. Towards this end, we develop an
integrated approach that exploits the representation redundancy inherent in the
multiple streamed videos: a voxel in the 3D scene visible to two captured views
is sampled and coded twice in the two views. In particular, at the receiver we
first develop an error concealment strategy that adaptively blends
corresponding pixels in the two captured views during DIBR, so that pixels from
the more reliable transmitted view are weighted more heavily. We then couple it
with a sender-side optimization of reference picture selection (RPS) during
real-time video coding, so that blocks containing samples of voxels that are
visible in both views are more error-resiliently coded in one view only, given
that adaptive blending will erase errors in the other view. Further, synthesized
view distortion sensitivities to texture versus depth errors are analyzed, so
that relative importance of texture and depth code blocks can be computed for
system-wide RPS optimization. Experimental results show that the proposed
scheme can outperform the use of a traditional feedback channel by up to 0.82
dB on average at 8% packet loss rate, and by as much as 3 dB for particular
frames.
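The receiver-side adaptive blending described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `blend_pixel` and `blend_views`, the per-pixel reliability scores, and the fallback to a plain average are all assumptions made for illustration. The abstract states only that pixels from the more reliable view are weighted more heavily.

```python
def blend_pixel(left, right, w_left, w_right):
    """Blend two corresponding pixel values using reliability weights.

    w_left and w_right are hypothetical non-negative reliability scores
    (higher means the view is less likely to be corrupted by packet loss).
    """
    total = w_left + w_right
    if total == 0:
        # Assumption: with no reliability information, use a plain average.
        return 0.5 * (left + right)
    return (w_left * left + w_right * right) / total


def blend_views(left_view, right_view, left_rel, right_rel):
    """Blend two already-warped views, given as flat lists of pixel values."""
    return [
        blend_pixel(l, r, wl, wr)
        for l, r, wl, wr in zip(left_view, right_view, left_rel, right_rel)
    ]
```

With weights (1.0, 0.0) the blend returns the left pixel unchanged, which matches the idea that an error in one view can be suppressed when the other view is known to be reliable.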
Graph Element Networks: adaptive, structured computation and memory
We explore the use of graph neural networks (GNNs) to model spatial processes
in which there is no a priori graphical structure. Similar to finite element
analysis, we assign nodes of a GNN to spatial locations and use a computational
process defined on the graph to model the relationship between an initial
function defined over a space and a resulting function in the same space. We
use GNNs as a computational substrate, and show that the locations of the nodes
in space as well as their connectivity can be optimized to focus on the most
complex parts of the space. Moreover, this representational strategy allows the
learned input-output relationship to generalize over the size of the underlying
space and run the same model at different levels of precision, trading
computation for accuracy. We demonstrate this method on a traditional PDE
problem, a physical prediction problem from robotics, and learning to predict
scene images from novel viewpoints.
Comment: Accepted to ICML 201
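The encode, propagate, decode pattern suggested by this abstract can be sketched in one dimension. This is an illustrative toy under strong assumptions, not the paper's model: the learned networks are replaced by fixed distance-based soft weights and neighbor averaging, and every name here (`soft_weights`, `encode`, `message_pass`, `decode`, `temperature`, `mix`) is hypothetical.

```python
import math


def soft_weights(x, node_positions, temperature=0.1):
    """Softmax over negative squared distances from x to each node."""
    scores = [-((x - p) ** 2) / temperature for p in node_positions]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode(samples, node_positions):
    """Pool sampled function values (x, f(x)) into per-node states."""
    n = len(node_positions)
    states, mass = [0.0] * n, [0.0] * n
    for x, value in samples:
        for i, w in enumerate(soft_weights(x, node_positions)):
            states[i] += w * value
            mass[i] += w
    return [s / m if m > 0 else 0.0 for s, m in zip(states, mass)]


def message_pass(states, edges, mix=0.5):
    """One round in which each node mixes its state with its neighbors' mean."""
    neighbors = {i: [] for i in range(len(states))}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = []
    for i, s in enumerate(states):
        if neighbors[i]:
            avg = sum(states[j] for j in neighbors[i]) / len(neighbors[i])
            updated.append((1 - mix) * s + mix * avg)
        else:
            updated.append(s)
    return updated


def decode(x, states, node_positions):
    """Read out a value at query location x from nearby node states."""
    weights = soft_weights(x, node_positions)
    return sum(w * s for w, s in zip(weights, states))
```

Because node positions are ordinary arguments, the same functions run with more or fewer nodes, which loosely mirrors the abstract's point about trading computation for accuracy. In the paper, node locations and connectivity are optimized, which this sketch does not attempt.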