3,753 research outputs found
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication.
Deep Learning for Vanishing Point Detection Using an Inverse Gnomonic Projection
We present a novel approach for vanishing point detection from uncalibrated
monocular images. In contrast to the state of the art, we make no a priori
assumptions about the observed scene. Our method is based on a convolutional
neural network (CNN) which does not use natural images, but a Gaussian sphere
representation arising from an inverse gnomonic projection of lines detected in
an image. This allows us to rely on synthetic data for training, eliminating
the need for labelled images. Our method achieves competitive performance on
three horizon estimation benchmark datasets. We further highlight some
additional use cases for which our vanishing point detection algorithm can be
used.
Comment: Accepted for publication at German Conference on Pattern Recognition
(GCPR) 2017. This research was supported by German Research Foundation DFG
within Priority Research Programme 1894 "Volunteered Geographic Information:
Interpretation, Visualisation and Social Computing"
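The Gaussian sphere representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a pinhole model with image coordinates already centred on the principal point and a nominal focal length `focal`; both are assumptions made here for illustration, and the function names are hypothetical rather than taken from the paper. Each image line maps to a great circle on the unit sphere, represented by its normal, and two lines that are parallel in 3D share a direction orthogonal to both normals.

```python
import numpy as np

def inverse_gnomonic(point_xy, focal=1.0):
    """Map an image-plane point (x, y) to the unit sphere centred at the camera."""
    v = np.array([point_xy[0], point_xy[1], focal], dtype=float)
    return v / np.linalg.norm(v)

def line_to_great_circle_normal(p1, p2, focal=1.0):
    """Return the unit normal of the great circle that an image line maps to."""
    a = inverse_gnomonic(p1, focal)
    b = inverse_gnomonic(p2, focal)
    n = np.cross(a, b)
    return n / np.linalg.norm(n)

def vanishing_direction(n1, n2):
    """Direction on the sphere shared by two great circles (defined up to sign)."""
    d = np.cross(n1, n2)
    return d / np.linalg.norm(d)

# Two image lines that are projections of parallel 3D lines with direction (1, 0, 1).
n1 = line_to_great_circle_normal((0.0, 0.5), (1 / 3, 1 / 3))
n2 = line_to_great_circle_normal((0.0, -0.5), (1 / 3, -1 / 3))
d = vanishing_direction(n1, n2)
# Projecting d back onto the image plane gives the vanishing point.
vp = d[:2] / d[2]
```

This covers only the geometric mapping; the CNN that operates on the sphere representation is not sketched here.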
MFL-YOLO: An Object Detection Model for Damaged Traffic Signs
Traffic signs are important facilities to ensure traffic safety and smooth
flow, but may be damaged due to many reasons, which poses a great safety
hazard. Therefore, it is important to study a method to detect damaged traffic
signs. Object detection techniques for damaged traffic signs are still
lacking. Since damaged traffic signs are close in appearance to normal ones, it
is difficult to capture the detailed local damage features of damaged traffic
signs using traditional object detection methods. In this paper, we propose an
improved object detection method based on YOLOv5s, namely MFL-YOLO (Mutual
Feature Levels Loss enhanced YOLO). We designed a simple cross-level loss
function so that each level of the model has its own role, which is beneficial
for the model to be able to learn more diverse features and improve the fine
granularity. The method can be applied as a plug-and-play module and it does
not increase the structural complexity or the computational complexity while
improving the accuracy. We also replaced the traditional convolution and CSP
with the GSConv and VoVGSCSP in the neck of YOLOv5s to reduce the scale and
computational complexity. Compared with YOLOv5s, our MFL-YOLO improves F1 score
and mAP by 4.3 and 5.1, respectively, while reducing the FLOPs by 8.9%. The
Grad-CAM heat
map visualization shows that our model can better focus on the local details of
the damaged traffic signs. In addition, we also conducted experiments on
CCTSDB2021 and TT100K to further validate the generalization of our model.
Comment: 11 pages, 8 figures, 4 tables
Traffic Sign Detection and Recognition with Voice Assistant
There is a multitude of applications for image detection and recognition across different fields. Some of these systems help people drive, for example in autonomous driving. Another line of work uses classification models to give drivers details about their surroundings while driving. In places like Guadalajara, such models are a valuable tool for reducing traffic accidents. This document explains the development of a model for detecting and recognizing traffic signs. The model is intended to provide details about the meaning of the traffic signs, close to real time, as additional information for the driver. The system could be used by anyone but is aimed specifically at people with visual deficiencies. By using robust machine learning and Deep Learning (DL), the expectation is to achieve high accuracy in traffic sign detection and recognition. The system is expected to be available and affordable for most drivers in Guadalajara.
ITESO, A. C
Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates competitive
performance compared to the state of the art. In particular, it is shown that binary
versions of our dynamic texture descriptors achieve improved performance
compared to a large range of similar methods using different primitives either
handcrafted or learned from data. Further, our qualitative and quantitative
investigation into parameter choices and the use of different sets of receptive
fields highlights the robustness and flexibility of our approach. Together,
these results support the descriptive power of this family of time-causal
spatio-temporal receptive fields, validate our approach for dynamic texture
recognition and point towards the possibility of designing a range of video
analysis methods based on these new time-causal spatio-temporal primitives.
Comment: 29 pages, 16 figures
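The idea of a binary joint histogram over receptive field responses can be sketched as follows. This is an illustrative simplification, not the descriptor defined in the paper: the choice of sign thresholding at zero, the bit ordering, and the function name `binary_joint_histogram` are all assumptions, and the computation of the spatio-temporal receptive field responses themselves is left out.

```python
import numpy as np

def binary_joint_histogram(responses):
    """Build a normalised joint histogram from K binarised filter responses.

    responses: array of shape (K, ...) with K filter responses per space-time point.
    Returns a vector of length 2**K that sums to 1.
    """
    responses = np.asarray(responses, dtype=float)
    k = responses.shape[0]
    # Assumed binarisation: 1 where the response is positive, 0 otherwise.
    bits = (responses.reshape(k, -1) > 0).astype(np.int64)
    # Combine the K bits at each point into one integer code.
    weights = 1 << np.arange(k, dtype=np.int64)
    codes = (bits * weights[:, None]).sum(axis=0)
    hist = np.bincount(codes, minlength=2 ** k).astype(float)
    return hist / hist.sum()

# Toy example with K = 2 responses at four points.
toy = np.array([[1.0, -1.0, 1.0, -1.0],
                [1.0, 1.0, -1.0, -1.0]])
hist = binary_joint_histogram(toy)
```

With K responses the histogram has 2**K bins, so the descriptor length grows quickly with the number of receptive fields used.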