Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks
This work addresses the problem of vehicle identification through
non-overlapping cameras. As our main contribution, we introduce a novel dataset
for vehicle identification, called Vehicle-Rear, that contains more than three
hours of high-resolution videos, with accurate information about the make,
model, color and year of nearly 3,000 vehicles, in addition to the position and
identification of their license plates. To explore our dataset we design a
two-stream CNN that simultaneously uses two of the most distinctive and
persistent features available: the vehicle's appearance and its license plate.
This is an attempt to tackle a major problem: false alarms caused by vehicles
with similar designs or by very close license plate identifiers. In the first
network stream, shape similarities are identified by a Siamese CNN that uses a
pair of low-resolution vehicle patches recorded by two different cameras. In
the second stream, we use a CNN for OCR to extract textual information,
confidence scores, and string similarities from a pair of high-resolution
license plate patches. Then, features from both streams are merged by a
sequence of fully connected layers for decision. In our experiments, we
compared the two-stream network against several well-known CNN architectures
using single or multiple vehicle features. The architectures, trained models,
and dataset are publicly available at https://github.com/icarofua/vehicle-rear
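The abstract above says the OCR stream contributes string similarities between the two license plate readings. As a minimal sketch of what such a feature could look like, the snippet below computes a normalized edit-distance similarity. The choice of Levenshtein distance and the names `levenshtein` and `plate_similarity` are assumptions made for illustration; the abstract does not say which similarity measure the authors use.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def plate_similarity(a: str, b: str) -> float:
    """Hypothetical feature: 1.0 for identical strings, lower as they diverge."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Two readings that differ in a single character, such as `ABC1234` and `A8C1234`, still score high under this measure, which is the "very close license plate identifiers" case the abstract names as a source of false alarms.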
Leveraging Model Fusion for Improved License Plate Recognition
License Plate Recognition (LPR) plays a critical role in various
applications, such as toll collection, parking management, and traffic law
enforcement. Although LPR has witnessed significant advancements through the
development of deep learning, there has been a noticeable lack of studies
exploring the potential improvements in results by fusing the outputs from
multiple recognition models. This research aims to fill this gap by
investigating the combination of up to 12 different models using
straightforward approaches, such as selecting the most confident prediction or
employing majority vote-based strategies. Our experiments encompass a wide
range of datasets, revealing substantial benefits of fusion approaches in both
intra- and cross-dataset setups. Essentially, fusing multiple models
considerably reduces the likelihood of obtaining subpar performance on a particular
dataset/scenario. We also found that combining models based on their speed is
an appealing approach. Specifically, for applications where the recognition
task can tolerate some additional time, though not excessively, an effective
strategy is to combine 4-6 models. These models may not be the most accurate
individually, but their fusion strikes an optimal balance between accuracy and
speed.
Comment: Accepted for presentation at the Iberoamerican Congress on Pattern
Recognition (CIARP) 202
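The two fusion strategies named in the abstract above, selecting the most confident prediction and majority voting, can be sketched as follows. This is an illustration only: the input format (one `(text, confidence)` pair per model), the function names, and the tie-breaking rule in the vote are assumptions, not details taken from the paper.

```python
from collections import Counter


def fuse_most_confident(predictions):
    """Return the plate text of the single most confident model.

    predictions: list of (plate_text, confidence) pairs, one per model.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return max(predictions, key=lambda p: p[1])[0]


def fuse_majority_vote(predictions):
    """Return the plate text predicted by the most models.

    Assumed tie-break: among equally frequent texts, prefer the one
    with the highest individual confidence.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(text for text, _ in predictions)
    best_conf = {}
    for text, conf in predictions:
        best_conf[text] = max(conf, best_conf.get(text, 0.0))
    return max(counts, key=lambda t: (counts[t], best_conf[t]))


preds = [("ABC1234", 0.80), ("ABC1234", 0.75), ("A8C1234", 0.95)]
print(fuse_most_confident(preds))  # A8C1234
print(fuse_majority_vote(preds))   # ABC1234
```

The example shows how the two rules can disagree: one overconfident model wins under the first rule but is outvoted under the second.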
A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector
Automatic License Plate Recognition (ALPR) has been a frequent topic of
research due to many practical applications. However, many of the current
solutions are still not robust in real-world situations, commonly depending on
many constraints. This paper presents a robust and efficient ALPR system based
on the state-of-the-art YOLO object detector. The Convolutional Neural Networks
(CNNs) are trained and fine-tuned for each ALPR stage so that they are robust
under different conditions (e.g., variations in camera, lighting, and
background). Especially for character segmentation and recognition, we design a
two-stage approach employing simple data augmentation tricks such as inverted
License Plates (LPs) and flipped characters. The resulting ALPR approach
achieved impressive results on two datasets. First, on the SSIG dataset,
composed of 2,000 frames from 101 vehicle videos, our system achieved a
recognition rate of 93.53% and 47 Frames Per Second (FPS), performing better
than both Sighthound and OpenALPR commercial systems (89.80% and 93.03%,
respectively) and considerably outperforming previous results (81.80%). Second,
targeting a more realistic scenario, we introduce a larger public dataset,
called the UFPR-ALPR dataset, designed for ALPR. This dataset contains 150 videos
and 4,500 frames captured when both camera and vehicles are moving and also
contains different types of vehicles (cars, motorcycles, buses and trucks). In
our proposed dataset, the trial versions of commercial systems achieved
recognition rates below 70%. On the other hand, our system performed better,
with a recognition rate of 78.33% and 35 FPS.
Comment: Accepted for presentation at the International Joint Conference on
Neural Networks (IJCNN) 201
Do We Train on Test Data? The Impact of Near-Duplicates on License Plate Recognition
This work draws attention to the large fraction of near-duplicates in the
training and test sets of datasets widely adopted in License Plate Recognition
(LPR) research. These duplicates refer to images that, although different, show
the same license plate. Our experiments, conducted on the two most popular
datasets in the field, show a substantial decrease in recognition rate when six
well-known models are trained and tested under fair splits, that is, in the
absence of duplicates in the training and test sets. Moreover, in one of the
datasets, the ranking of models changed considerably when they were trained and
tested under duplicate-free splits. These findings suggest that such duplicates
have significantly biased the evaluation and development of deep learning-based
models for LPR. The list of near-duplicates we have found and proposals for
fair splits are publicly available for further research at
https://raysonlaroca.github.io/supp/lpr-train-on-test/
Comment: Accepted for presentation at the International Joint Conference on
Neural Networks (IJCNN) 202
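A duplicate-free split in the sense described above, where no license plate appears in both the training and test sets, can be approximated by grouping images by plate text before splitting. The sketch below is one simple way to do this, not the authors' procedure: the function name, the `(image_id, plate_text)` input format, and the `test_fraction` and `seed` parameters are all assumptions.

```python
import random
from collections import defaultdict


def duplicate_free_split(samples, test_fraction=0.2, seed=0):
    """Split samples so that every plate text lands in exactly one subset.

    samples: list of (image_id, plate_text) pairs.
    Returns (train, test), each a list of the same kind of pairs.
    """
    # Group image ids by the plate they show.
    groups = defaultdict(list)
    for image_id, plate_text in samples:
        groups[plate_text].append(image_id)

    # Shuffle the distinct plates, not the images, then cut.
    plates = sorted(groups)
    random.Random(seed).shuffle(plates)
    n_test = max(1, round(len(plates) * test_fraction))
    test_plates = set(plates[:n_test])

    train = [(i, p) for i, p in samples if p not in test_plates]
    test = [(i, p) for i, p in samples if p in test_plates]
    return train, test
```

Because the cut is made over distinct plates, the resulting test fraction is measured in plates rather than images, so the image counts can deviate from `test_fraction` when some plates have many more images than others.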
Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers
Recent years have seen significant developments in the field of License Plate
Recognition (LPR) through the integration of deep learning techniques and the
increasing availability of training data. Nevertheless, reconstructing license
plates (LPs) from low-resolution (LR) surveillance footage remains challenging.
To address this issue, we introduce a Single-Image Super-Resolution (SISR)
approach that integrates attention and transformer modules to enhance the
detection of structural and textural features in LR images. Our approach
incorporates sub-pixel convolution layers (also known as PixelShuffle) and a
loss function that uses an Optical Character Recognition (OCR) model for
feature extraction. We trained the proposed architecture on synthetic images
created by applying heavy Gaussian noise to high-resolution LP images from two
public datasets, followed by bicubic downsampling. As a result, the generated
images have a Structural Similarity Index Measure (SSIM) of less than 0.10. Our
results show that our approach for reconstructing these low-resolution
synthesized images outperforms existing ones in both quantitative and
qualitative measures. Our code is publicly available at
https://github.com/valfride/lpr-rsr-ext
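The sub-pixel convolution (PixelShuffle) layer mentioned above upsamples by rearranging channels into spatial positions: a tensor with C·r² channels of size H×W becomes C channels of size rH×rW. The following dependency-free sketch shows only that rearrangement on nested lists, using the channel ordering of the common PixelShuffle convention; it is not taken from the authors' code, and the function name is hypothetical.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Output pixel (c, y*r + i, x*r + j) is read from input channel
    c*r*r + i*r + j at spatial position (y, x).
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    h, w = len(x[0]), len(x[0][0])
    c_out = channels_in // (r * r)

    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for col in range(w):
                        out[c][y * r + i][col * r + j] = src[y][col]
    return out


# Four 1x1 channels become one 2x2 channel.
print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))  # [[[1, 2], [3, 4]]]
```

In a real network the preceding convolution learns the C·r² channels, so the upsampling itself has no parameters beyond that convolution.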