SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
Vision-based vehicle detection approaches have achieved remarkable success in
recent years with the development of deep convolutional neural networks (CNNs).
However, existing CNN-based algorithms suffer from the problem that
convolutional features are scale-sensitive in the object detection task, while
traffic images and videos commonly contain vehicles with a large variance of
scales. In this paper, we delve into the source of scale sensitivity and
reveal two key issues: 1) existing RoI pooling destroys the structure of
small-scale objects; 2) the large intra-class distance caused by a large
variance of scales exceeds the representation capability of a single network.
Based on these findings, we present a scale-insensitive convolutional neural
network (SINet) for the fast detection of vehicles with a large variance of
scales. First, we present
a context-aware RoI pooling to maintain the contextual information and
original structure of small-scale objects. Second, we present a multi-branch
decision network to minimize the intra-class distance of features. These
lightweight techniques introduce no extra time complexity yet bring a
prominent improvement in detection accuracy. The proposed techniques can be
plugged into any deep network architecture and keep it trainable end-to-end.
Our SINet achieves
state-of-the-art performance in terms of accuracy and speed (up to 37 FPS) on
the KITTI benchmark and a new highway dataset, which contains a large variance
of scales and extremely small objects.
Comment: Accepted by IEEE Transactions on Intelligent Transportation Systems
(T-ITS)
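The abstract's context-aware RoI pooling keeps contextual information around small objects. One plausible ingredient of such a scheme — enlarging a proposal box with surrounding context before pooling — can be sketched as follows; the function name and `context_ratio` parameter are hypothetical illustrations, not the paper's exact formulation:

```python
def expand_roi_with_context(box, context_ratio, img_w, img_h):
    """Enlarge an RoI (x1, y1, x2, y2) by a context margin and clip to the image.

    `context_ratio` is a hypothetical parameter: 0.5 adds 50% of the box's
    width/height on each side, pulling in surrounding context that helps
    preserve the appearance of small-scale objects.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    dx, dy = w * context_ratio, h * context_ratio
    nx1 = max(0.0, x1 - dx)
    ny1 = max(0.0, y1 - dy)
    nx2 = min(float(img_w), x2 + dx)
    ny2 = min(float(img_h), y2 + dy)
    return (nx1, ny1, nx2, ny2)
```

The clipping step matters for proposals near image borders, where the expanded box would otherwise fall outside the frame.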
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an
in-depth analysis of their challenges as well as technical improvements in
recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
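Among the fundamental building blocks and metrics the survey covers, Intersection-over-Union (IoU) underlies most detection benchmarks. A minimal sketch, assuming axis-aligned `(x1, y1, x2, y2)` boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    iw = max(0.0, ix2 - ix1)
    ih = max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5, the convention popularised by the PASCAL VOC benchmark.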
Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns
We introduce Deep Thermal Imaging, a new approach for close-range automatic
recognition of materials to enhance how people and ubiquitous technologies
understand their proximal environment. Our approach uses a low-cost mobile
thermal camera integrated into a smartphone to capture thermal textures. A deep
neural network classifies these textures into material types. This approach
works effectively without the need for ambient light sources or direct contact
with materials. Furthermore, the use of a deep learning network removes the
need to handcraft the set of features for different materials. We evaluated the
performance of the system by training it to recognise 32 material types in both
indoor and outdoor environments. Our approach produced recognition accuracies
above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584
images of 17 outdoor materials. We conclude by discussing its potentials for
real-time use in HCI applications and future directions.
Comment: Proceedings of the 2018 CHI Conference on Human Factors in Computing
Systems
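Classifying thermal textures rather than absolute temperatures suggests normalising each captured patch by its own dynamic range before it reaches the network. The following is a hypothetical preprocessing sketch in that spirit, not necessarily the authors' exact pipeline:

```python
def normalize_thermal_patch(patch):
    """Rescale a 2-D grid of temperature readings to [0, 1] by its own range.

    Hypothetical preprocessing sketch: normalising each patch by its local
    dynamic range makes the spatial texture pattern, rather than the absolute
    temperature, the signal a classifier sees.
    """
    flat = [t for row in patch for t in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform patch: no texture information to keep
        return [[0.0 for _ in row] for row in patch]
    return [[(t - lo) / (hi - lo) for t in row] for row in patch]
```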
Fast object detection in compressed JPEG Images
Object detection in still images has drawn a lot of attention over the past
few years, and with the advent of Deep Learning impressive performance has
been achieved in numerous industrial applications. Most of these deep learning
models rely on RGB images to localize and identify objects in the image.
However, in some application scenarios, images are compressed either for
storage savings or for fast transmission. Therefore, a time-consuming image
decompression step is compulsory in order to apply the aforementioned deep
models. To
alleviate this drawback, we propose a fast deep architecture for object
detection in JPEG images, one of the most widespread compression formats. We
train a neural network to detect objects based on the blockwise DCT (discrete
cosine transform) coefficients produced by the JPEG compression algorithm. We
modify the well-known Single Shot multibox Detector (SSD) by replacing its
first layers with one convolutional layer dedicated to process the DCT inputs.
Experimental evaluations on PASCAL VOC and an industrial dataset comprising
images of road traffic surveillance show that the model is faster than regular
SSD with promising detection performance. To the best of our knowledge, this
paper is the first to address detection in compressed JPEG images.
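The detector above consumes the blockwise DCT coefficients that JPEG already computes. For reference, the forward 8×8 type-II DCT with JPEG's normalisation can be sketched directly (a naive O(n⁴) version for clarity; real codecs use fast factorisations):

```python
import math

def dct2_block(block):
    """Forward 8x8 DCT-II with JPEG's orthonormal scaling.

    For a constant block of value v, only the DC coefficient F[0][0] = 8*v
    is nonzero; all AC coefficients vanish.
    """
    n = 8
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            cu = 1 / math.sqrt(2) if u == 0 else 1.0
            cv = 1 / math.sqrt(2) if v == 0 else 1.0
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = 0.25 * cu * cv * s
    return out
```

Feeding these 8×8 coefficient blocks to a network, as the paper proposes, skips the inverse DCT and colour reconstruction that a full decode would require.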
UDP-YOLO: High Efficiency and Real-Time Performance of Autonomous Driving Technology
In recent years, autonomous driving technology has gradually entered our field of vision. It senses the surrounding environment using radar, laser, ultrasound, GPS, computer vision and other technologies, identifies obstacles and various signboards, and plans a suitable path to control the vehicle. However, problems arise when this technology is applied in foggy environments: the probability of recognizing objects is low, or some objects cannot be recognized at all, because the blur introduced by fog makes the planned path wrong. In view of this defect, and considering that autonomous driving must respond quickly to objects while driving, this paper extends the dark channel prior defogging algorithm and proposes the UDP-YOLO network to apply it to autonomous driving. The work is divided into two parts: 1) image processing: the dataset is first classified as foggy or fog-free, the foggy images are then defogged by the defogging algorithm, and finally the defogged images undergo adaptive brightness enhancement; 2) target detection: the proposed UDP-YOLO network is used to detect objects in the defogged dataset. The results show that the performance of the proposed model is greatly improved while its speed remains balanced.
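The dark channel prior that the defogging step builds on is simple to state: in haze-free outdoor images, most local patches contain some pixel that is dark in at least one colour channel, while fog lifts this minimum. A minimal pure-Python sketch of computing the dark channel (not the full dehazing pipeline):

```python
def dark_channel(image, patch_radius):
    """Dark channel prior: per-pixel minimum over RGB, then a local minimum.

    `image` is an H x W grid of (r, g, b) tuples in [0, 1]. Haze-free regions
    tend to have a dark channel near 0; fog raises it, which is what the
    dehazing step estimates and removes.
    """
    h, w = len(image), len(image[0])
    per_pixel_min = [[min(px) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            lo = 1.0
            for di in range(-patch_radius, patch_radius + 1):
                for dj in range(-patch_radius, patch_radius + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        lo = min(lo, per_pixel_min[ii][jj])
            dark[i][j] = lo
    return dark
```

A full dehazer would additionally estimate atmospheric light from the brightest dark-channel pixels and invert the haze imaging model, which is outside this sketch.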
Wrong Way Vehicle Detection in Single and Double Lane
Wrong-way driving is one of the primary causes of traffic jams and accidents globally. Identifying vehicles travelling in the wrong direction can therefore reduce accidents and congestion. Surveillance footage has become an important source of data thanks to the accessibility of low-priced cameras and the expanding use of real-time traffic management systems. In this paper, we propose a technique for automatically identifying vehicles moving against traffic. Our system uses the You Only Look Once (YOLO) algorithm to recognize and track vehicles in video inputs, and the centroid tracking method to determine each vehicle's orientation inside a given region of interest (ROI), in order to identify vehicles travelling in the wrong direction. It functions in three steps. The Deep SORT tracking method is particularly good at detecting and tracking objects, and the centroid tracking technique can effectively monitor the direction of travel. Experiments with a variety of traffic videos show that the suggested method can detect and identify wrong-way moving vehicles in a variety of lighting and weather scenarios. The interface of the system is simple and easy to use.
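The centroid-based direction check described above can be sketched in a few lines: compare a track's centroid positions over time inside the ROI and flag motion opposing the lane's allowed direction. The function and its `allowed_direction` parameter are hypothetical illustrations, assuming image coordinates where y grows downward:

```python
def wrong_way(centroids, allowed_direction):
    """Flag a track as wrong-way from its centroid history inside an ROI.

    `centroids` is a list of (x, y) positions over successive frames.
    A hypothetical `allowed_direction` of "down" means y should increase
    over time (image coordinates grow downward).
    """
    if len(centroids) < 2:
        return False  # not enough history to judge direction
    dy = centroids[-1][1] - centroids[0][1]
    if allowed_direction == "down":
        return dy < 0
    if allowed_direction == "up":
        return dy > 0
    raise ValueError("allowed_direction must be 'up' or 'down'")
```

Using the first and last centroids rather than consecutive frames makes the decision robust to small per-frame jitter in the detector's boxes.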
A Novel Driver Distraction Behavior Detection Based on Self-Supervised Learning Framework with Masked Image Modeling
Driver distraction causes a significant number of traffic accidents every
year, resulting in economic losses and casualties. Currently, the level of
automation in commercial vehicles is far from completely unmanned, and drivers
still play an important role in operating and controlling the vehicle.
Therefore, driver distraction behavior detection is crucial for road safety. At
present, driver distraction detection primarily relies on traditional
Convolutional Neural Networks (CNN) and supervised learning methods. However,
there are still challenges such as the high cost of labeled datasets, limited
ability to capture high-level semantic information, and weak generalization
performance. In order to solve these problems, this paper proposes a new
self-supervised learning method based on masked image modeling for driver
distraction behavior detection. Firstly, a self-supervised learning framework
for masked image modeling (MIM) is introduced to reduce the heavy human and
material cost of dataset labeling. Secondly, the Swin
Transformer is employed as an encoder. Performance is enhanced by reconfiguring
the Swin Transformer block and adjusting the distribution of the number of
window multi-head self-attention (W-MSA) and shifted window multi-head
self-attention (SW-MSA) detection heads across all stages, which makes the
model more lightweight. Finally, various data augmentation strategies are used along
with the best random masking strategy to strengthen the model's recognition and
generalization ability. Test results on a large-scale driver distraction
behavior dataset show that the self-supervised learning method proposed in this
paper achieves an accuracy of 99.60%, approaching the excellent performance
of advanced supervised learning methods.
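The random masking strategy at the heart of masked image modeling is easy to sketch: hide a fixed fraction of image patches and ask the model to reconstruct them. A minimal version, where `mask_ratio` is a hypothetical hyperparameter (MIM methods commonly mask a large fraction of patches):

```python
import random

def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Choose which patches of an image to hide for masked image modeling.

    Returns a boolean list: True means the patch is masked and must be
    reconstructed by the model during self-supervised pretraining.
    """
    rng = random.Random(seed)
    n_masked = int(round(num_patches * mask_ratio))
    masked_ids = set(rng.sample(range(num_patches), n_masked))
    return [i in masked_ids for i in range(num_patches)]
```

Because no labels are involved, this pretext task sidesteps the dataset-labeling cost the abstract highlights; labels are only needed for the final fine-tuning stage.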