Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation
The task of a visual landmark recognition system is to identify photographed
buildings or objects in query photos and to provide the user with relevant
information on them. With their increasing coverage of the world's landmark
buildings and objects, Internet photo collections are now being used as a
source for building such systems in a fully automatic fashion. This process
typically consists of three steps: clustering large amounts of images by the
objects they depict; determining object names from user-provided tags; and
building a robust, compact, and efficient recognition index. To date,
however, there is little empirical information on how well current approaches
for those steps perform in a large-scale open-set mining and recognition task.
Furthermore, there is little empirical information on how recognition
performance varies for different types of landmark objects and where there is
still potential for improvement. With this paper, we intend to fill these gaps.
Using a dataset of 500k images from Paris, we analyze each component of the
landmark recognition pipeline in order to answer the following questions: How
many and what kinds of objects can be discovered automatically? How can we best
use the resulting image clusters to recognize the object in a query? How can
the object be efficiently represented in memory for recognition? How reliably
can semantic information be extracted? And finally: What are the limiting
factors in the resulting pipeline from query to semantics? We evaluate how
different choices of methods and parameters for the individual pipeline steps
affect overall system performance and examine their effects for different query
categories such as buildings, paintings, or sculptures.
Depth mapping of integral images through viewpoint image extraction with a hybrid disparity analysis algorithm
Integral imaging is a technique capable of displaying 3-D images with continuous parallax in full natural color. It is one of the most promising methods for producing smooth 3-D images. Extracting depth information from integral images has various applications, ranging from remote inspection, robotic vision, medical imaging, and virtual reality to content-based image coding and manipulation for integral-imaging-based 3-D TV. This paper presents a method of generating a depth map from unidirectional integral images through viewpoint image extraction, using a hybrid disparity analysis algorithm that combines multi-baseline, neighbourhood constraint, and relaxation strategies. It is shown that a depth map with few areas of uncertainty can be obtained from both computer-generated and photographically generated integral images using this approach. Acceptable depth maps can be achieved from photographically captured integral images containing complicated object scenes.
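The viewpoint image extraction step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a unidirectional integral image stored as a list of pixel rows, cylindrical lenses aligned with the pixel columns, and a lens pitch that is a whole number of pixels; the function name `extract_viewpoint_images` and the parameter `pitch` are hypothetical and do not come from the paper. Under those assumptions, viewpoint image k collects the k-th pixel column under every lens.

```python
def extract_viewpoint_images(integral_image, pitch):
    """Split a unidirectional integral image into `pitch` viewpoint images.

    integral_image: list of rows, each a list of pixel values.
    pitch: number of pixel columns under one lens (assumed to be an integer).
    Viewpoint image k takes columns k, k + pitch, k + 2 * pitch, ...
    """
    width = len(integral_image[0])
    if width % pitch != 0:
        raise ValueError("image width must be a multiple of the lens pitch")
    return [[row[k::pitch] for row in integral_image] for k in range(pitch)]


# Toy image: 2 rows, 6 columns, 3 pixels per lens, so 2 lenses per row.
image = [[0, 1, 2, 3, 4, 5],
         [6, 7, 8, 9, 10, 11]]
views = extract_viewpoint_images(image, pitch=3)
```

Each extracted viewpoint image has one pixel per lens in the horizontal direction, so disparity between neighbouring viewpoint images can then be analysed much as in conventional multi-view stereo.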
PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
We present a compact but effective CNN model for optical flow, called
PWC-Net. PWC-Net has been designed according to simple and well-established
principles: pyramidal processing, warping, and the use of a cost volume. Cast
in a learnable feature pyramid, PWC-Net uses the current optical flow
estimate to warp the CNN features of the second image. It then uses the warped
features and features of the first image to construct a cost volume, which is
processed by a CNN to estimate the optical flow. PWC-Net is 17 times smaller in
size and easier to train than the recent FlowNet2 model. Moreover, it
outperforms all published optical flow methods on the MPI Sintel final pass and
KITTI 2015 benchmarks, running at about 35 fps on Sintel resolution (1024x436)
images. Our models are available on https://github.com/NVlabs/PWC-Net.
Comment: CVPR 2018 camera ready version (with GitHub link to Caffe and PyTorch
code)
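As a rough illustration of the cost volume named in the abstract above, the following sketch correlates each feature vector of the first image with feature vectors of the second image inside a small search window. It is a toy, pure-Python version under stated assumptions: features are nested lists of shape H x W x C, the second feature map is taken to be already warped, out-of-bounds positions get a cost of zero, and the names `cost_volume` and `max_disp` are hypothetical. It is not the released PWC-Net code.

```python
def cost_volume(feat1, feat2_warped, max_disp=1):
    """Correlation cost volume between two H x W x C feature maps.

    For every pixel of feat1 and every offset (dy, dx) with
    |dy|, |dx| <= max_disp, store the mean (over channels) of the dot
    product between the feat1 vector and the feat2_warped vector at the
    offset position. Out-of-bounds offsets are assigned 0.0 (assumption).
    """
    h, w, c = len(feat1), len(feat1[0]), len(feat1[0][0])
    offsets = [(dy, dx)
               for dy in range(-max_disp, max_disp + 1)
               for dx in range(-max_disp, max_disp + 1)]
    volume = []
    for y in range(h):
        row = []
        for x in range(w):
            costs = []
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    dot = sum(a * b for a, b in
                              zip(feat1[y][x], feat2_warped[yy][xx]))
                    costs.append(dot / c)
                else:
                    costs.append(0.0)
            row.append(costs)
        volume.append(row)
    return volume


# Toy 1 x 3 feature maps with 2 channels; feat2 is feat1 shifted right by one.
feat1 = [[[1, 0], [0, 1], [1, 1]]]
feat2 = [[[0, 0], [1, 0], [0, 1]]]
vol = cost_volume(feat1, feat2, max_disp=1)
```

With `max_disp=1` there are nine offsets per pixel, and index 5 corresponds to the offset (0, 1). In this toy input the highest cost for the first two pixels sits at that index, matching the one-pixel shift. In the full model, a CNN processes such a volume to estimate the flow.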
Local Stereo Matching Using Adaptive Local Segmentation
We propose a new dense local stereo matching framework for gray-level images based on an adaptive local segmentation using a dynamic threshold. We define a new validity domain of the fronto-parallel assumption based on the local intensity variations in the 4-neighborhood of the matching pixel. The preprocessing step smooths low-textured areas and sharpens texture edges, whereas the postprocessing step detects and recovers occluded and unreliable disparities. The algorithm achieves high stereo reconstruction quality in regions with uniform intensities as well as in textured regions. The algorithm is robust against local radiometric differences and successfully recovers disparities around object edges, disparities of thin objects, and disparities of occluded regions. Moreover, our algorithm intrinsically prevents errors caused by occlusion from propagating into nonoccluded regions. It has only a small number of parameters. The performance of our algorithm is evaluated on the Middlebury test bed stereo images. It ranks highly on the evaluation list, outperforming many local and global stereo algorithms that use color images. Among the local algorithms relying on the fronto-parallel assumption, our algorithm is the best ranked. We also demonstrate that our algorithm works well on practical examples, such as disparity estimation of a tomato seedling and 3D reconstruction of a face.
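For context on the fronto-parallel assumption referred to in the abstract above, here is a minimal sketch of plain window-based local matching on a single rectified scanline. It is a generic sum-of-absolute-differences (SAD) baseline, not the paper's adaptive local segmentation: every pixel in a fixed window is assumed to share the centre pixel's disparity, which is exactly the assumption the paper restricts to a validity domain. The function name `sad_disparity` and the parameters `max_disp` and `radius` are hypothetical.

```python
def sad_disparity(left_row, right_row, max_disp=3, radius=1):
    """Estimate a disparity per pixel of one scanline with window SAD.

    A left pixel at x is compared with the right pixel at x - d for each
    candidate disparity d in 0..max_disp. All pixels in the window are
    assumed to have the same disparity (fronto-parallel assumption).
    Window positions outside the row are clamped to the nearest pixel.
    """
    n = len(left_row)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            cost = 0
            for k in range(-radius, radius + 1):
                xl = min(max(x + k, 0), n - 1)
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left_row[xl] - right_row[xr])
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# Toy gray-level scanlines: the left row is the right row shifted by 2 pixels.
right = [10, 20, 80, 40, 90, 30, 70, 50, 60, 15]
left = [10, 10, 10, 20, 80, 40, 90, 30, 70, 50]
disp = sad_disparity(left, right)
```

In the textured interior of this toy row the estimated disparity is 2. A fixed window like this tends to fail at object edges and in uniform regions, which is the weakness the adaptive segmentation described above is designed to address.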