View Independent Vehicle Make, Model and Color Recognition Using Convolutional Neural Network
This paper describes the details of Sighthound's fully automated vehicle
make, model and color recognition system. The backbone of our system is a deep
convolutional neural network that is not only computationally inexpensive, but
also provides state-of-the-art results on several competitive benchmarks.
Additionally, our deep network is trained on a large dataset of several million
images that are labeled through a semi-automated process. Finally, we test our
system on several public datasets as well as our own internal test dataset. Our
results show that we outperform other methods on all benchmarks by significant
margins. Our model is available to developers through the Sighthound Cloud API
at https://www.sighthound.com/products/cloud
Comment: 7 Pages
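The abstract mentions labeling millions of images through a semi-automated process but gives no details. A common pattern for such pipelines, sketched here as a purely hypothetical illustration (the function name and threshold are assumptions, not from the paper), is to auto-accept high-confidence model predictions and queue the rest for human review:

```python
def route_for_labeling(predictions, threshold=0.9):
    """Hypothetical semi-automated labeling loop: each prediction is a
    (item_id, label, confidence) triple. Confident predictions are accepted
    automatically; the rest are routed to a human annotator."""
    auto, review = [], []
    for item_id, label, conf in predictions:
        (auto if conf >= threshold else review).append((item_id, label))
    return auto, review
```

As the model improves over successive training rounds, more items clear the threshold and annotation cost drops.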
Incorporating Intra-Class Variance to Fine-Grained Visual Recognition
Fine-grained visual recognition aims to capture discriminative
characteristics amongst visually similar categories. State-of-the-art
methods have significantly improved fine-grained recognition performance
through deep metric learning with triplet networks. However, the impact
of intra-category variance on the performance of recognition and robust feature
representation has not been well studied. In this paper, we propose to leverage
intra-class variance in metric learning of triplet network to improve the
performance of fine-grained recognition. Through partitioning training images
within each category into a few groups, we form the triplet samples across
different categories as well as different groups, which is called Group
Sensitive TRiplet Sampling (GS-TRS). Accordingly, the triplet loss function is
strengthened by incorporating intra-class variance with GS-TRS, which may
contribute to the optimization objective of the triplet network. Extensive
experiments over benchmark datasets CompCar and VehicleID show that the
proposed GS-TRS significantly outperforms state-of-the-art approaches in
both classification and retrieval tasks.
Comment: 6 pages, 5 figures
Part-based Multi-stream Model for Vehicle Searching
Driven by growing demands in public security and intelligent transportation
systems, searching for a specific vehicle has become increasingly important.
Current studies usually treat the vehicle as a single integral object and
train a distance metric to measure the similarity between vehicles. However,
vehicles with different identities can look nearly identical in raw images,
which also contain background pixels that may disturb distance metric
learning. In this paper, we propose a method to segment an
original vehicle image into several discriminative foreground parts, and these
parts consist of fine-grained regions, which we call discriminative
patches. After that, these parts combined with the raw image are fed into the
proposed deep learning network. We can easily measure the similarity of two
vehicle images by computing the Euclidean distance between their features from the FC
layer. Two main contributions of this paper are as follows. Firstly, a method
is proposed to estimate whether a patch in a raw vehicle image is discriminative or
not. Secondly, a new Part-based Multi-Stream Model (PMSM) is designed and
optimized for vehicle retrieval and re-identification tasks. We evaluate the
proposed method on the VehicleID dataset, and the experimental results show
that our method can outperform the baseline.
Comment: Published in International Conference on Pattern Recognition 201
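The matching step described above, comparing FC-layer feature vectors by Euclidean distance and ranking a gallery for retrieval, is simple to sketch. The function names below are illustrative, not the paper's:

```python
import numpy as np

def euclidean_distance(feat_a, feat_b):
    """Distance between two feature vectors (e.g. taken from the FC layer);
    a smaller distance suggests the two vehicle images share an identity."""
    return float(np.linalg.norm(np.asarray(feat_a) - np.asarray(feat_b)))

def rank_gallery(query_feat, gallery_feats):
    """Rank gallery features by ascending distance to the query, as done
    when evaluating retrieval and re-identification."""
    dists = [euclidean_distance(query_feat, g) for g in gallery_feats]
    return list(np.argsort(dists))
```

Standard re-identification metrics (e.g. CMC curves) are then computed from the resulting ranked lists.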
Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks
This work addresses the problem of vehicle identification through
non-overlapping cameras. As our main contribution, we introduce a novel dataset
for vehicle identification, called Vehicle-Rear, that contains more than three
hours of high-resolution videos, with accurate information about the make,
model, color and year of nearly 3,000 vehicles, in addition to the position and
identification of their license plates. To explore our dataset we design a
two-stream CNN that simultaneously uses two of the most distinctive and
persistent features available: the vehicle's appearance and its license plate.
This is an attempt to tackle a major problem: false alarms caused by vehicles
with similar designs or by very close license plate identifiers. In the first
network stream, shape similarities are identified by a Siamese CNN that uses a
pair of low-resolution vehicle patches recorded by two different cameras. In
the second stream, we use a CNN for OCR to extract textual information,
confidence scores, and string similarities from a pair of high-resolution
license plate patches. Then, features from both streams are merged by a
sequence of fully connected layers for the final decision. In our experiments, we
compared the two-stream network against several well-known CNN architectures
using single or multiple vehicle features. The architectures, trained models,
and dataset are publicly available at https://github.com/icarofua/vehicle-rear
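The fusion stage described above, merging the appearance-stream and plate-stream features through fully connected layers to reach a decision, can be sketched as a small forward pass. This is a minimal illustration; the layer sizes, activation choices, and function name are assumptions, not the paper's exact architecture:

```python
import numpy as np

def fusion_head(shape_feats, plate_feats, w1, b1, w2, b2):
    """Hypothetical fusion head: concatenate features from the two streams,
    apply one fully connected layer with ReLU, then a sigmoid to score
    whether the two captures show the same vehicle."""
    x = np.concatenate([shape_feats, plate_feats])  # merge the two streams
    h = np.maximum(0.0, w1 @ x + b1)                # fully connected + ReLU
    logit = float(w2 @ h + b2)
    return 1.0 / (1.0 + np.exp(-logit))             # match probability in (0, 1)
```

Thresholding this score trades off false alarms (similar designs, near-identical plates) against missed matches.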