Disparity map generation based on trapezoidal camera architecture for multiview video
Visual content acquisition is a strategic functional block of any visual system. Despite its wide possibilities,
the arrangement of cameras for the acquisition of good quality visual content for use in multi-view video
remains a huge challenge. This paper presents the mathematical description of trapezoidal camera
architecture and relationships which facilitate the determination of camera position for visual content
acquisition in multi-view video, and depth map generation. The strong point of Trapezoidal Camera
Architecture is that it allows for adaptive camera topology by which points within the scene, especially the
occluded ones, can be optically and geometrically viewed from several different viewpoints either on the
edge of the trapezoid or inside it. The concept of maximum independent set, trapezoid characteristics, and
the fact that the positions of cameras (with the exception of a few) differ in their vertical coordinate
description could very well be used to address the issue of occlusion, which continues to be a major
problem in computer vision with regard to depth map generation.
Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications
Three-dimensional television (3D-TV) has gained increasing popularity in the broadcasting domain, as it enables enhanced viewing experiences in comparison to conventional two-dimensional (2D) TV. However, its application has been constrained by the lack of essential content, i.e., stereoscopic videos. To alleviate this content shortage, an economical and practical solution is to reuse the huge media resources that are available in monoscopic 2D and convert them to stereoscopic 3D. Although stereoscopic video can be generated from monoscopic sequences using depth measurements extracted from cues like focus blur, motion and size, the quality of the resulting video may be poor, as such measurements are usually arbitrarily defined and appear inconsistent with the real scenes. To help solve this problem, a novel method for object-based stereoscopic video generation is proposed which features i) optical-flow based occlusion reasoning in determining depth ordinal, ii) object segmentation using improved region-growing from masks of determined depth layers, and iii) a hybrid depth estimation scheme using content-based matching (inside a small library of true stereo image pairs) and depth-ordinal based regularization. Comprehensive experiments have validated the effectiveness of our proposed 2D-to-3D conversion method in generating stereoscopic videos of consistent depth measurements for 3D-TV applications.
3D video coding and transmission
The capture, transmission, and display of
3D content have gained a lot of attention in the last few
years. 3D multimedia content is no longer confined to
cinema theatres but is being transmitted using stereoscopic
video over satellite, shared on Blu-Ray™ disks,
or sent over Internet technologies. Stereoscopic displays
are needed at the receiving end and the viewer needs to
wear special glasses to present the two versions of the
video to the human vision system that then generates
the 3D illusion. To be more effective and improve the
immersive experience, more views are acquired from a
larger number of cameras and presented on different displays,
such as autostereoscopic and light field displays.
These multiple views, combined with depth data, also
allow enhanced user experiences and new forms of interaction
with the 3D content from virtual viewpoints.
This type of audiovisual information is represented by a
huge amount of data that needs to be compressed and
transmitted over bandwidth-limited channels. Part of
the COST Action IC1105 "3D Content Creation, Coding
and Transmission over Future Media Networks" (3DConTourNet)
focuses on this research challenge.
From NeRFLiX to NeRFLiX++: A General NeRF-Agnostic Restorer Paradigm
Neural radiance fields (NeRF) have shown great success in novel view
synthesis. However, recovering high-quality details from real-world scenes is
still challenging for existing NeRF-based approaches, due to potentially
imperfect calibration information and scene representation inaccuracy. Even
with high-quality training frames, the synthetic novel views produced by NeRF
models still suffer from notable rendering artifacts, such as noise and blur.
To address this, we propose NeRFLiX, a general NeRF-agnostic restorer paradigm
that learns a degradation-driven inter-viewpoint mixer. Specifically, we design a
NeRF-style degradation modeling approach and construct large-scale training
data, enabling deep neural networks to effectively remove NeRF-native rendering
artifacts. Moreover, beyond the degradation removal,
we propose an inter-viewpoint aggregation framework that fuses highly related
high-quality training images, pushing the performance of cutting-edge NeRF
models to entirely new levels and producing highly photo-realistic synthetic
views. Based on this paradigm, we further present NeRFLiX++ with a stronger
two-stage NeRF degradation simulator and a faster inter-viewpoint mixer,
achieving superior performance with significantly improved computational
efficiency. Notably, NeRFLiX++ is capable of restoring photo-realistic
ultra-high-resolution outputs from noisy low-resolution NeRF-rendered views.
Extensive experiments demonstrate the excellent restoration ability of
NeRFLiX++ on various novel view synthesis benchmarks.
Comment: 17 pages, 16 figures. Project Page:
https://redrock303.github.io/nerflix_plus/. arXiv admin note: text overlap
with arXiv:2303.0691