Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks
This work addresses the problem of vehicle identification through
non-overlapping cameras. As our main contribution, we introduce a novel dataset
for vehicle identification, called Vehicle-Rear, that contains more than three
hours of high-resolution videos, with accurate information about the make,
model, color and year of nearly 3,000 vehicles, in addition to the position and
identification of their license plates. To explore our dataset we design a
two-stream CNN that simultaneously uses two of the most distinctive and
persistent features available: the vehicle's appearance and its license plate.
This is an attempt to tackle a major problem: false alarms caused by vehicles
with similar designs or by very close license plate identifiers. In the first
network stream, shape similarities are identified by a Siamese CNN that uses a
pair of low-resolution vehicle patches recorded by two different cameras. In
the second stream, we use a CNN for OCR to extract textual information,
confidence scores, and string similarities from a pair of high-resolution
license plate patches. Then, features from both streams are merged by a
sequence of fully connected layers for the final decision. In our experiments, we
compared the two-stream network against several well-known CNN architectures
using single or multiple vehicle features. The architectures, trained models,
and dataset are publicly available at https://github.com/icarofua/vehicle-rear.
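One concrete ingredient of the second stream is the "string similarity" between a pair of OCR-decoded plate strings. A minimal sketch of how such a similarity could be computed as a normalized edit distance follows; this is an illustrative reconstruction, not the authors' implementation, and the function names (`levenshtein`, `plate_similarity`) are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def plate_similarity(p1: str, p2: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical plate strings."""
    if not p1 and not p2:
        return 1.0
    return 1.0 - levenshtein(p1, p2) / max(len(p1), len(p2))
```

A single-character OCR confusion (e.g. `3` read as `8`) then yields a similarity just below 1.0 rather than a hard mismatch, which is the kind of soft signal a fusion layer can weigh against appearance features.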
Deep Visual Feature Learning for Vehicle Detection, Recognition and Re-identification
Along with the ever-increasing number of motor vehicles in current transportation systems, intelligent video surveillance and management, one of the important fields of artificial intelligence, becomes ever more necessary. Vehicle-related problems are being widely explored and applied in practice. Among various techniques, computer vision and machine learning algorithms have been the most popular, since a vast amount of video/image surveillance data is nowadays available for research. In this thesis, vision-based approaches for vehicle detection, recognition, and re-identification are extensively investigated. Moreover, to address different challenges, several novel methods are proposed that overcome weaknesses of previous works and achieve compelling performance.
Deep visual feature learning has been widely researched in the past five years and has made huge progress in many applications, including image classification, image retrieval, object detection, image segmentation and image generation. Compared with traditional machine learning methods, which consist of hand-crafted feature extraction and shallow model learning, deep neural networks can learn hierarchical feature representations from low-level to high-level features, yielding more robust recognition accuracy. For some specific tasks, researchers prefer to embed feature learning and classification/regression methods into end-to-end models, which benefits both accuracy and efficiency. In this thesis, deep models are mainly investigated to study the research problems.
Vehicle detection is the most fundamental task in intelligent video surveillance but faces many challenges, such as severe illumination and viewpoint variations, occlusions and multi-scale problems. Moreover, learning vehicles’ diverse attributes is also an interesting and valuable problem. To address these tasks and their difficulties, a fast framework of Detection and Annotation for Vehicles (DAVE) is presented, which effectively combines vehicle detection and attribute annotation. DAVE consists of two convolutional neural networks (CNNs): a fast vehicle proposal network (FVPN) for extracting vehicle-like objects, and an attributes learning network (ALN) aiming to verify each proposal and infer each vehicle’s pose, color and type simultaneously. These two nets are jointly optimized so that the abundant latent knowledge learned from the ALN can be exploited to guide FVPN training. Once the model is trained, it can achieve efficient vehicle detection and annotation for real-world traffic surveillance data.
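The joint optimization of the two networks can be pictured, in heavily simplified form, as one objective summing the FVPN's detection loss with the ALN's per-attribute cross-entropies. The sketch below is hypothetical, not the thesis's actual loss; the weight and function names are illustrative.

```python
import math

def softmax_ce(logits, target):
    """Cross-entropy of a single attribute head (e.g. pose) from raw logits,
    computed in a numerically stable way via the log-sum-exp trick."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def joint_loss(det_loss, attr_logits, attr_targets, attr_weight=1.0):
    """DAVE-style joint objective: detection loss plus a weighted sum of
    attribute cross-entropies (pose, color, type), optimized together."""
    attr_loss = sum(softmax_ce(l, t) for l, t in zip(attr_logits, attr_targets))
    return det_loss + attr_weight * attr_loss
```

In such a scheme, gradients from the attribute heads flow into shared layers, which is one plausible mechanism for the "latent knowledge" of the ALN guiding FVPN training.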
The second research problem of the thesis focuses on vehicle re-identification (re-ID). Vehicle re-ID aims to identify a target vehicle across different cameras with non-overlapping views. It has received far less attention in the computer vision community than the prevalent person re-ID problem. Possible reasons for this slow progress are the lack of appropriate research data and the special 3D structure of a vehicle. Previous works have generally focused on some specific views (e.g. front), but these methods are less effective in realistic scenarios where vehicles usually appear at arbitrary viewpoints to cameras. In this thesis, I focus on the uncertainty of vehicle viewpoint in re-ID, proposing four different approaches to address the multi-view vehicle re-ID problem: (1) The Spatially Concatenated ConvNet (SCCN), in an encoder-decoder architecture, is proposed to learn transformations across different viewpoints of a vehicle, and then spatially concatenate all the feature maps for further fusing them into a multi-view feature representation. (2) A Cross-View Generative Adversarial Network (XVGAN) is designed to take an input image’s feature as conditional embedding to effectively infer cross-view images. The features of the inferred and original images are combined to learn distance metrics for re-ID. (3) The advantages of a bi-directional Long Short-Term Memory (LSTM) loop for modeling transformations across continuous view variation of a vehicle are investigated. (4) A Viewpoint-aware Attentive Multi-view Inference (VAMI) model is proposed, adopting a viewpoint-aware attention model to select core regions at different viewpoints and then performing multi-view feature inference by an adversarial training architecture.
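Whatever multi-view feature each of the four models produces, re-ID ultimately reduces to ranking a gallery of vehicle features by similarity to a query feature. A minimal sketch of that shared retrieval step, using cosine similarity and hypothetical function names (and assuming non-zero feature vectors):

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def rank_gallery(query, gallery):
    """Return gallery indices sorted from most to least similar to the query.
    This is the generic retrieval step applied to whatever multi-view
    embedding a re-ID model emits."""
    scores = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -scores[i])
```

Metrics such as mean average precision are then computed over these ranked lists, which is why the four approaches above all aim to make same-identity embeddings close regardless of viewpoint.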
LSF-IDM: Automotive Intrusion Detection Model with Lightweight Attribution and Semantic Fusion
Autonomous vehicles (AVs) are more vulnerable to network attacks due to the
high connectivity and diverse communication modes between vehicles and external
networks. Deep learning-based Intrusion detection, an effective method for
detecting network attacks, can provide functional safety as well as a real-time
communication guarantee for vehicles, thereby being widely used for AVs.
Existing methods work well for simple-mode cyber-attacks but yield high
false-alarm rates in resource-limited environments when the attack is
concealed within a contextual feature. In this paper, we present a novel
automotive intrusion detection model with lightweight attribution and semantic
fusion, named LSF-IDM. Our motivation is based on the observation that, when
malicious packets are injected into the in-vehicle networks (IVNs), the packet
log exhibits a strict contextual ordering because of the periodicity and
broadcast nature of the CAN bus. Therefore, this model first captures the
context as the semantic feature of messages by the BERT language framework.
Thereafter, a lightweight model (e.g., BiLSTM) learns the fused feature from
an input packet's classification label and its output distribution in BERT via
knowledge distillation. Experimental results demonstrate the effectiveness of our
methods in defending against several representative attacks from IVNs. We also
perform a difference analysis of the proposed method against lightweight models
and BERT to attain a deeper understanding of how the model balances detection
performance and model complexity.
Comment: 18 pages, 8 figures
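The distillation step, in which the lightweight student matches both the hard label and BERT's softened output distribution, can be sketched as a generic knowledge-distillation objective. The temperature, weighting, and function names below are illustrative assumptions, not the paper's exact formulation.

```python
import math

def softmax(logits, temp=1.0):
    """Temperature-scaled softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp((x - m) / temp) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distill_loss(student_logits, teacher_logits, target, temp=2.0, alpha=0.5):
    """Generic KD objective: hard-label cross-entropy on the student plus a
    KL term pulling the student's softened distribution toward the teacher's."""
    # hard-label cross-entropy on the student
    p_s = softmax(student_logits)
    ce = -math.log(p_s[target])
    # KL(teacher || student) at temperature `temp`
    p_t = softmax(teacher_logits, temp)
    q_s = softmax(student_logits, temp)
    kl = sum(pt * math.log(pt / qs) for pt, qs in zip(p_t, q_s))
    return alpha * ce + (1.0 - alpha) * (temp ** 2) * kl
```

When the student reproduces the teacher's distribution exactly, the KL term vanishes and only the hard-label loss remains, which is what lets a small BiLSTM inherit BERT's contextual knowledge at a fraction of the inference cost.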
StRDAN: Synthetic-to-Real Domain Adaptation Network for Vehicle Re-Identification
Vehicle re-identification aims to identify the same vehicle across different
images. This is challenging but essential for analyzing and predicting traffic
flow in the city. Although deep learning methods have achieved enormous
progress for this task, their large data requirement is a critical shortcoming.
Therefore, we propose a synthetic-to-real domain adaptation network (StRDAN)
framework, which can be trained with inexpensive large-scale synthetic and real
data to improve performance. The StRDAN training method combines domain
adaptation and semi-supervised learning methods and their associated losses.
StRDAN offers significant improvement over the baseline model, which can only
be trained using real data, for VeRi and CityFlow-ReID datasets, achieving 3.1%
and 12.9% improved mean average precision, respectively.
Comment: 7 pages, 2 figures, CVPR Workshop Paper (Revised
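The combination of "domain adaptation and semi-supervised learning methods and their associated losses" can be pictured as a single weighted sum of loss terms. The weights and function name below are purely illustrative assumptions, not taken from the paper.

```python
def combined_reid_loss(sup_real, sup_synth, domain_loss, pseudo_loss,
                       w_domain=0.1, w_pseudo=0.5):
    """Hypothetical weighted combination of the loss families the abstract
    names: supervised ID losses on real and synthetic data, a domain-adaptation
    term aligning the two distributions, and a semi-supervised (pseudo-label)
    term on unlabeled data."""
    return sup_real + sup_synth + w_domain * domain_loss + w_pseudo * pseudo_loss
```

Tuning such weights controls how strongly the cheap synthetic data influences training relative to the scarce real labels.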
- …