1,558 research outputs found
Cross-View Image Matching for Geo-localization in Urban Environments
In this paper, we address the problem of cross-view image geo-localization.
Specifically, we aim to estimate the GPS location of a query street view image
by finding the matching images in a reference database of geo-tagged bird's eye
view images, or vice versa. To this end, we present a new framework for
cross-view image geo-localization by taking advantage of the tremendous success
of deep convolutional neural networks (CNNs) in image classification and object
detection. First, we employ the Faster R-CNN to detect buildings in the query
and reference images. Next, for each building in the query image, we retrieve
the nearest neighbors from the reference buildings using a Siamese network
trained on both positive matching image pairs and negative pairs. To find the
correct NN for each query building, we develop an efficient multiple nearest
neighbors matching method based on dominant sets. We evaluate the proposed
framework on a new dataset that consists of pairs of street view and bird's eye
view images. Experimental results show that the proposed method achieves better
geo-localization accuracy than other approaches and is able to generalize to
images at unseen locations
Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks
This work addresses the problem of vehicle identification through
non-overlapping cameras. As our main contribution, we introduce a novel dataset
for vehicle identification, called Vehicle-Rear, that contains more than three
hours of high-resolution videos, with accurate information about the make,
model, color and year of nearly 3,000 vehicles, in addition to the position and
identification of their license plates. To explore our dataset we design a
two-stream CNN that simultaneously uses two of the most distinctive and
persistent features available: the vehicle's appearance and its license plate.
This is an attempt to tackle a major problem: false alarms caused by vehicles
with similar designs or by very close license plate identifiers. In the first
network stream, shape similarities are identified by a Siamese CNN that uses a
pair of low-resolution vehicle patches recorded by two different cameras. In
the second stream, we use a CNN for OCR to extract textual information,
confidence scores, and string similarities from a pair of high-resolution
license plate patches. Then, features from both streams are merged by a
sequence of fully connected layers for decision. In our experiments, we
compared the two-stream network against several well-known CNN architectures
using single or multiple vehicle features. The architectures, trained models,
and dataset are publicly available at https://github.com/icarofua/vehicle-rear
- …