1,756 research outputs found
A multi-camera approach to image-based rendering and 3-D/Multiview display of ancient Chinese artifacts
Providing 3D video services: the challenge from 2D to 3DTV quality of experience
Recently, three-dimensional (3D) video has decisively burst onto the entertainment industry scene, and has arrived in households even before the standardization process has been completed. 3D television (3DTV) adoption and deployment can be seen as a major leap in television history, similar to previous transitions from black and white (B&W) to color, from analog to digital television (TV), and from standard definition to high definition. In this paper, we analyze current 3D video technology trends in order to define a taxonomy of the availability and possible introduction of 3D-based services. We also propose an audiovisual network services architecture which provides a smooth transition from two-dimensional (2D) to 3DTV in an Internet Protocol (IP)-based scenario. Based on subjective assessment tests, we also analyze the factors that will influence the quality of experience in those 3D video services, focusing on the effects of both coding and transmission errors. In addition, examples of the application of the architecture and results of assessment tests are provided.
Pseudo-Dolly-In Video Generation Combining 3D Modeling and Image Reconstruction
This paper proposes a pseudo-dolly-in video generation method that reproduces motion parallax by applying image reconstruction processing to multi-view videos. Since dolly-in video is taken by moving a camera forward to reproduce motion parallax, it can convey a sense of immersion. However, at a sporting event in a large-scale space, moving a camera is difficult. Our research generates dolly-in video from multi-view images captured by fixed cameras. By applying the Image-Based Modeling technique, dolly-in video can be generated. Unfortunately, the video quality is often degraded by 3D estimation error. On the other hand, Bullet-Time realizes high-quality video observation. However, moving the virtual viewpoint away from the capturing positions is difficult. To solve these problems, we propose a method to generate a pseudo-dolly-in image by integrating 3D estimation and image reconstruction techniques into Bullet-Time and show its effectiveness by applying it to multi-view videos captured at an actual soccer stadium. In the experiment, we compared the proposed method with digital zoom images and with the dolly-in video generated from the Image-Based Modeling and Rendering method. Published in: 2017 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct) Date of Conference: 9-13 Oct. 2017 Conference Location: Nantes, France
CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus
We present a robust estimator for fitting multiple parametric models of the
same form to noisy measurements. Applications include finding multiple
vanishing points in man-made scenes, fitting planes to architectural imagery,
or estimating multiple rigid motions within the same sequence. In contrast to
previous works, which resorted to hand-crafted search strategies for multiple
model detection, we learn the search strategy from data. A neural network
conditioned on previously detected models guides a RANSAC estimator to
different subsets of all measurements, thereby finding model instances one
after another. We train our method supervised as well as self-supervised. For
supervised training of the search strategy, we contribute a new dataset for
vanishing point estimation. Leveraging this dataset, the proposed algorithm is
superior with respect to other robust estimators as well as to designated
vanishing point estimation algorithms. For self-supervised learning of the
search, we evaluate the proposed algorithm on multi-homography estimation and
demonstrate an accuracy that is superior to state-of-the-art methods. Comment: CVPR 202
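For orientation, the idea of finding model instances one after another can be sketched with plain sequential RANSAC on 2D lines. This is not CONSAC: the abstract describes a neural network that guides sampling, whereas the sketch below samples uniformly and simply removes the inliers of each detected model. The function names, the line model, and all parameter values are assumptions made for illustration.

```python
import random


def fit_line(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0:
        return None  # degenerate sample: both points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def inliers_of(line, points, threshold):
    """Points whose perpendicular distance to the line is within the threshold."""
    a, b, c = line
    return [pt for pt in points if abs(a * pt[0] + b * pt[1] + c) <= threshold]


def sequential_ransac(points, num_models, iterations=200, threshold=0.05, seed=0):
    """Detect up to num_models lines, removing each model's inliers before the next search."""
    rng = random.Random(seed)
    remaining = list(points)
    models = []
    for _ in range(num_models):
        if len(remaining) < 2:
            break
        best_line, best_inliers = None, []
        for _ in range(iterations):
            p, q = rng.sample(remaining, 2)  # uniform minimal sample (hand-crafted strategy)
            line = fit_line(p, q)
            if line is None:
                continue
            inliers = inliers_of(line, remaining, threshold)
            if len(inliers) > len(best_inliers):
                best_line, best_inliers = line, inliers
        if best_line is None:
            break
        models.append((best_line, best_inliers))
        remaining = [pt for pt in remaining if pt not in best_inliers]
    return models
```

A learned approach in the spirit of the abstract would replace the uniform `rng.sample` call with sampling weights conditioned on the models found so far.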
Two-View Geometry Scoring Without Correspondences
Camera pose estimation for two-view geometry traditionally relies on RANSAC. Normally, a multitude of image correspondences leads to a pool of proposed hypotheses, which are then scored to find a winning model. The inlier count is generally regarded as a reliable indicator of 'consensus'. We examine this scoring heuristic, and find that it favors disappointing models under certain circumstances. As a remedy, we propose the Fundamental Scoring Network (FSNet), which infers a score for a pair of overlapping images and any proposed fundamental matrix. It does not rely on sparse correspondences, but rather embodies a two-view geometry model through an epipolar attention mechanism that predicts the pose error of the two images. FSNet can be incorporated into traditional RANSAC loops. We evaluate FSNet on fundamental and essential matrix estimation on indoor and outdoor datasets, and establish that FSNet can successfully identify good poses for pairs of images with few or unreliable correspondences. In addition, we show that naively combining FSNet with the MAGSAC++ scoring approach achieves state-of-the-art results.
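The conventional inlier-count heuristic that the abstract examines can be sketched as follows. This illustrates only the baseline scoring step, not FSNet; the function names, the one-sided point-to-epipolar-line distance, and the 1-pixel threshold are assumptions chosen for illustration.

```python
def epipolar_distance(F, x1, x2):
    """Distance from point x2 (image 2) to the epipolar line F * x1, with x1 from image 1."""
    u, v = x1
    # Epipolar line (a, b, c) in image 2 for the homogeneous point (u, v, 1).
    a, b, c = (row[0] * u + row[1] * v + row[2] for row in F)
    norm = (a * a + b * b) ** 0.5
    if norm == 0:
        return float("inf")  # degenerate line, treat as an outlier
    return abs(a * x2[0] + b * x2[1] + c) / norm


def inlier_count(F, matches, threshold=1.0):
    """Number of correspondences whose epipolar distance is within the threshold."""
    return sum(1 for x1, x2 in matches if epipolar_distance(F, x1, x2) <= threshold)


def best_hypothesis(hypotheses, matches, threshold=1.0):
    """Pick the fundamental-matrix hypothesis with the largest inlier count."""
    return max(hypotheses, key=lambda F: inlier_count(F, matches, threshold))
```

Because the score depends entirely on the supplied correspondences, it degrades when matches are few or unreliable, which is the situation the abstract targets.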
NASA Automated Rendezvous and Capture Review. Executive summary
In support of the Cargo Transfer Vehicle (CTV) Definition Studies in FY-92, the Advanced Program Development division of the Office of Space Flight at NASA Headquarters conducted an evaluation and review of the United States capabilities and state-of-the-art in Automated Rendezvous and Capture (AR&C). This review was held in Williamsburg, Virginia on 19-21 Nov. 1991 and included over 120 attendees from U.S. government organizations, industries, and universities. One hundred abstracts were submitted to the organizing committee for consideration. Forty-two were selected for presentation. The review was structured to include five technical sessions. The forty-two papers addressed topics in the following five categories: (1) hardware systems and components; (2) software systems; (3) integrated systems; (4) operations; and (5) supporting infrastructure.
Single-Image Depth Prediction Makes Feature Matching Easier
Good local features improve the robustness of many 3D re-localization and
multi-view reconstruction pipelines. The problem is that viewing angle and
distance severely impact the recognizability of a local feature. Attempts to
improve appearance invariance by choosing better local feature points or by
leveraging outside information, have come with pre-requisites that made some of
them impractical. In this paper, we propose a surprisingly effective
enhancement to local feature extraction, which improves matching. We show that
CNN-based depths inferred from single RGB images are quite helpful, despite
their flaws. They allow us to pre-warp images and rectify perspective
distortions, to significantly enhance SIFT and BRISK features, enabling more
good matches, even when cameras are looking at the same scene but in opposite
directions. Comment: 14 pages, 7 figures, accepted for publication at the European
Conference on Computer Vision (ECCV) 202