Learning Single-Image Depth from Videos using Quality Assessment Networks
Depth estimation from a single image in the wild remains a challenging
problem. One main obstacle is the lack of high-quality training data for images
in the wild. In this paper we propose a method to automatically generate such
data through Structure-from-Motion (SfM) on Internet videos. The core of this
method is a Quality Assessment Network that identifies high-quality
reconstructions obtained from SfM. Using this method, we collect single-view
depth training data from a large number of YouTube videos and construct a new
dataset called YouTube3D. Experiments show that YouTube3D is useful in training
depth estimation networks and advances the state of the art of single-view
depth estimation in the wild.
An evaluation framework for stereo-based driver assistance
This is the post-print version of the article. Copyright @ 2012 Springer Verlag.
The accuracy of stereo algorithms or optical flow methods is commonly assessed by comparing the results against the Middlebury
database. However, equivalent data for automotive or robotics applications
rarely exist, as they are difficult to obtain. As our main contribution, we introduce an evaluation framework tailored to stereo-based driver assistance that delivers excellent performance measures while
circumventing manual labeling effort. Within this framework one can combine several ways of ground-truthing and different comparison metrics, and use large image databases.
Using our framework, we show examples of several types of ground-truthing techniques: implicit ground-truthing (e.g. sequences recorded without a crash occurring), robotic vehicles with high-precision sensors, and, to a small extent, manual labeling. To show the effectiveness of our evaluation framework, we compare three different stereo algorithms at the
pixel and object level. In more detail, we evaluate an intermediate representation
called the Stixel World. Besides evaluating the accuracy of the Stixels, we investigate the completeness (equivalent to the detection rate) of the Stixel World vs. the number of phantom Stixels. Among many findings, using this framework enables us to reduce the number of phantom Stixels by a factor of three compared to the base parametrization. This base parametrization has already been optimized by test-driving vehicles over distances exceeding 10,000 km.
General Dynamic Scene Reconstruction from Multiple View Video
This paper introduces a general approach to dynamic scene reconstruction from
multiple moving cameras without prior knowledge or limiting constraints on the
scene structure, appearance, or illumination. Existing techniques for dynamic
scene reconstruction from multiple wide-baseline camera views primarily focus
on accurate reconstruction in controlled environments, where the cameras are
fixed and calibrated and background is known. These approaches are not robust
for general dynamic scenes captured with sparse moving cameras. Previous
approaches for outdoor dynamic scene reconstruction assume prior knowledge of
the static background appearance and structure. The primary contributions of
this paper are twofold: an automatic method for initial coarse dynamic scene
segmentation and reconstruction without prior knowledge of background
appearance or structure; and a general robust approach for joint segmentation
refinement and dense reconstruction of dynamic scenes from multiple
wide-baseline static or moving cameras. Evaluation is performed on a variety of
indoor and outdoor scenes with cluttered backgrounds and multiple dynamic
non-rigid objects such as people. Comparison with state-of-the-art approaches
demonstrates improved accuracy in both multiple view segmentation and dense
reconstruction. The proposed approach also eliminates the requirement for prior
knowledge of scene structure and appearance.
Point Pair Feature based Object Detection for Random Bin Picking
Point pair features are a popular representation for free-form 3D object
detection and pose estimation. In this paper, their performance in an
industrial random bin picking context is investigated. A new method to generate
representative synthetic datasets is proposed. This allows us to investigate the
influence of a high degree of clutter and the presence of self-similar
features, which are typical of our application. We provide an overview of
solutions proposed in the literature and discuss their strengths and weaknesses. A
simple heuristic method to drastically reduce the computational complexity is
introduced, which results in improved robustness, speed and accuracy compared
to the naive approach.
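For readers unfamiliar with the representation, here is a minimal sketch of how a point pair feature is commonly defined for two oriented surface points: the distance between the points plus three angles involving their normals. The abstract does not spell out a formula, so this follows the widely used four-component definition and is not necessarily the exact variant used in the paper. The function names (`point_pair_feature`, `quantize`) and the step sizes are hypothetical choices for illustration.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(_dot(a, a))


def _angle(a, b):
    """Angle in radians between two non-zero vectors, in [0, pi]."""
    cos = _dot(a, b) / (_norm(a) * _norm(b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))


def point_pair_feature(p1, n1, p2, n2):
    """Return (|d|, angle(n1, d), angle(n2, d), angle(n1, n2)), d = p2 - p1.

    Assumes p1 != p2 and non-zero normals.
    """
    d = _sub(p2, p1)
    return (_norm(d), _angle(n1, d), _angle(n2, d), _angle(n1, n2))


def quantize(feature, dist_step, angle_step):
    """Discretize a feature so similar pairs share a hash-table key.

    The step sizes are illustrative parameters, not values from the paper.
    """
    dist, a1, a2, a3 = feature
    return (
        int(dist // dist_step),
        int(a1 // angle_step),
        int(a2 // angle_step),
        int(a3 // angle_step),
    )


# Two points one unit apart on a flat surface, both normals pointing up.
feature = point_pair_feature((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
key = quantize(feature, dist_step=0.5, angle_step=1.0)
```

Because the feature depends only on relative distances and angles, it is unchanged by rigid motion of the object. That is what makes it usable for pose estimation, and it also suggests why self-similar features (many pairs mapping to the same key) matter in the bin picking setting described above.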