Cross-View Image Matching for Geo-localization in Urban Environments
In this paper, we address the problem of cross-view image geo-localization.
Specifically, we aim to estimate the GPS location of a query street view image
by finding the matching images in a reference database of geo-tagged bird's eye
view images, or vice versa. To this end, we present a new framework for
cross-view image geo-localization by taking advantage of the tremendous success
of deep convolutional neural networks (CNNs) in image classification and object
detection. First, we employ the Faster R-CNN to detect buildings in the query
and reference images. Next, for each building in the query image, we retrieve
the nearest neighbors from the reference buildings using a Siamese network
trained on both positive (matching) and negative (non-matching) image pairs. To find the
correct nearest neighbor for each query building, we develop an efficient multiple nearest
neighbors matching method based on dominant sets. We evaluate the proposed
framework on a new dataset that consists of pairs of street view and bird's eye
view images. Experimental results show that the proposed method achieves better
geo-localization accuracy than other approaches and is able to generalize to
images at unseen locations.
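
The abstract above describes training a Siamese network on matching and non-matching pairs, then retrieving nearest neighbors for each query building. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes embeddings are plain lists of floats, uses the widely used margin-based contrastive loss as a stand-in (the abstract does not name the loss), and the names `contrastive_loss`, `k_nearest`, and the `margin` parameter are hypothetical. The dominant-sets matching step is not shown.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(pairs, margin=1.0):
    """Mean contrastive loss over (emb_a, emb_b, is_match) triples.

    Matching pairs are penalized by their squared distance; non-matching
    pairs are penalized only when closer than `margin` (an assumed value).
    """
    total = 0.0
    for emb_a, emb_b, is_match in pairs:
        d = euclidean(emb_a, emb_b)
        if is_match:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
    return total / len(pairs)


def k_nearest(query, references, k=3):
    """Indices of the k reference embeddings closest to the query."""
    order = sorted(range(len(references)),
                   key=lambda i: euclidean(query, references[i]))
    return order[:k]


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for building descriptors.
    refs = [[5.0, 5.0], [1.0, 0.0], [0.0, 2.0]]
    print(k_nearest([0.0, 0.0], refs, k=2))
    print(contrastive_loss([([0.0, 0.0], [0.0, 0.0], False)]))
```

In a setup like this, the candidate lists returned by `k_nearest` for every building in a query image would be the input to a joint selection step such as the dominant-sets matching the abstract mentions.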
CASENet: Deep Category-Aware Semantic Edge Detection
Boundary and edge cues are highly beneficial in improving a wide variety of
vision tasks such as semantic segmentation, object recognition, stereo, and
object proposal generation. Recently, the problem of edge detection has been
revisited and significant progress has been made with deep learning. While
classical edge detection is a challenging binary problem in itself,
category-aware semantic edge detection is by nature an even more challenging
multi-label problem. We model the problem such that each edge pixel can be
associated with more than one class, since such pixels appear in contours or junctions
belonging to two or more semantic classes. To this end, we propose a novel
end-to-end deep semantic edge learning architecture based on ResNet and a new
skip-layer architecture where category-wise edge activations at the top
convolution layer share and are fused with the same set of bottom layer
features. We then propose a multi-label loss function to supervise the fused
activations. We show that our proposed architecture benefits this problem with
better performance, and we outperform the current state-of-the-art semantic
edge detection methods by a large margin on standard data sets such as SBD and
Cityscapes.
Comment: Accepted to CVPR 201
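
The multi-label formulation in the abstract above, where one edge pixel may belong to several classes at once, can be illustrated with independent per-class sigmoid cross-entropy. This is a minimal sketch under that assumption, not the paper's exact loss: the abstract does not give the formula, no class reweighting is applied here, and the function names and the nested-list data layout are hypothetical.

```python
import math


def bce_with_logits(z, y):
    """Numerically stable binary cross-entropy for one logit z and label y."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


def multilabel_edge_loss(logits, labels):
    """Mean per-class sigmoid cross-entropy over all pixels.

    logits, labels: one inner list per pixel, one entry per semantic class.
    A pixel may carry several positive labels, e.g. [1, 1, 0] for an edge
    shared by the first two classes.
    """
    total = 0.0
    count = 0
    for pixel_logits, pixel_labels in zip(logits, labels):
        for z, y in zip(pixel_logits, pixel_labels):
            total += bce_with_logits(z, y)
            count += 1
    return total / count


if __name__ == "__main__":
    # One pixel on a boundary shared by classes 0 and 1, absent from class 2.
    labels = [[1, 1, 0]]
    good = multilabel_edge_loss([[20.0, 20.0, -20.0]], labels)
    bad = multilabel_edge_loss([[20.0, -20.0, -20.0]], labels)
    print(good < bad)
```

Because each class gets its own sigmoid rather than competing in a softmax, a confident prediction for one class does not suppress the others, which is what lets a single pixel be labeled with two or more classes.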