Modeling Camera Effects to Improve Visual Learning from Synthetic Data
Recent work has focused on generating synthetic imagery to increase the size
and variability of training data for learning visual tasks in urban scenes.
This includes increasing the occurrence of occlusions or varying environmental
and weather effects. However, few have addressed modeling variation in the
sensor domain. Sensor effects can degrade real images, limiting
generalizability of network performance on visual tasks trained on synthetic
data and tested in real environments. This paper proposes an efficient,
automatic, physically-based augmentation pipeline that varies sensor effects
(chromatic aberration, blur, exposure, noise, and color cast) in synthetic
imagery. In particular, it shows that augmenting synthetic
training datasets with the proposed pipeline reduces the gap between
synthetic and real domains for the task of object detection in urban driving
scenes.
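The five sensor effects named in the abstract above can be illustrated with a minimal sketch. This is not the paper's pipeline: the function name `augment_sensor_effects`, the order of the steps, the parameter values, and the simple models used for each effect (channel shift, Gaussian blur, multiplicative gain, additive Gaussian noise, per-channel scaling) are all assumptions chosen for illustration.

```python
import numpy as np


def augment_sensor_effects(img, rng, *, ca_shift=1, blur_sigma=1.0,
                           exposure_gain=1.2, noise_std=0.02,
                           color_cast=(1.05, 1.0, 0.95)):
    """Apply crude stand-ins for five sensor effects to an RGB image.

    img: float array of shape (H, W, 3) with values in [0, 1].
    All parameters are hypothetical defaults, not values from the paper.
    """
    out = img.astype(np.float64).copy()

    # 1) Chromatic aberration: shift red and blue in opposite directions.
    #    np.roll wraps around at the border, which is a simplification.
    out[..., 0] = np.roll(out[..., 0], ca_shift, axis=1)
    out[..., 2] = np.roll(out[..., 2], -ca_shift, axis=1)

    # 2) Blur: separable Gaussian filter applied along rows, then columns.
    radius = max(1, int(3 * blur_sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * blur_sigma ** 2))
    kernel /= kernel.sum()
    for axis in (0, 1):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out)

    # 3) Exposure: a single multiplicative gain.
    out = out * exposure_gain

    # 4) Noise: additive zero-mean Gaussian noise.
    out = out + rng.normal(0.0, noise_std, size=out.shape)

    # 5) Color cast: independent scaling of the R, G, and B channels.
    out = out * np.asarray(color_cast)

    return np.clip(out, 0.0, 1.0)
```

In a training setup, one would presumably draw the parameters at random per image so that the detector sees a range of sensor conditions rather than one fixed distortion.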
VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection
Although traffic sign detection has been studied for years and great progress
has been made with the rise of deep learning techniques, many
problems remain to be addressed. For complicated real-world traffic scenes,
there are two main challenges. First, traffic signs are usually small
objects, which makes them more difficult to detect than large ones. Second, it
is hard to distinguish false targets that resemble real traffic signs in
complex street scenes without context information. To handle these problems, we
propose a novel end-to-end deep learning method for traffic sign detection in
complex environments. Our contributions are as follows: 1) We propose a
multi-resolution feature fusion network architecture which exploits densely
connected deconvolution layers with skip connections, and can learn more
effective features for small objects; 2) We frame traffic sign
detection as a spatial sequence classification and regression task, and propose
a vertical spatial sequence attention (VSSA) module to gain more context
information for better detection performance. To comprehensively evaluate the
proposed method, we conduct experiments on several traffic sign datasets as well as
a general object detection dataset, and the results show the
effectiveness of the proposed method.
MFMAN-YOLO: A Method for Detecting Pole-like Obstacles in Complex Environment
In real-world traffic, there are various uncertainties and complexities in
road and weather conditions. To solve the problem that the feature information
of pole-like obstacles in complex environments is easily lost, resulting in low
detection accuracy and low real-time performance, a multi-scale hybrid
attention mechanism detection algorithm is proposed in this paper. First, the
Monge-Kantorovich (MK) optimal transport function is incorporated, both to
resolve overlapping prediction boxes through optimal matching and,
because the MK function can be regularized, to prevent model
over-fitting. Then, the features at different scales are up-sampled separately
according to the optimized efficient multi-scale feature pyramid. Finally, the
extraction of multi-scale feature space channel information is enhanced in
complex environments based on the hybrid attention mechanism, which suppresses
irrelevant background information and focuses on the
feature information of pole-like obstacles. In addition, this paper conducts real
road test experiments in a variety of complex environments. The experimental
results show that the detection precision, recall, and average precision of the
method are 94.7%, 93.1%, and 97.4%, respectively, and the detection frame rate
is 400 f/s. The method can detect pole-like obstacles in complex
road environments accurately and in real time, which further promotes innovation
and progress in the field of automatic driving.
Comment: 11 pages
Remote Sensing Object Detection Meets Deep Learning: A Meta-review of Challenges and Advances
Remote sensing object detection (RSOD), one of the most fundamental and
challenging tasks in the remote sensing field, has received longstanding
attention. In recent years, deep learning techniques have demonstrated robust
feature representation capabilities and led to a big leap in the development of
RSOD techniques. In this era of rapid technical evolution, this paper aims to
present a comprehensive review of the recent achievements in deep-learning-based
RSOD methods. More than 300 papers are covered in this review. We
identify five main challenges in RSOD, including multi-scale object detection,
rotated object detection, weak object detection, tiny object detection, and
object detection with limited supervision, and systematically review the
corresponding methods developed in a hierarchical division manner. We also
review the widely used benchmark datasets and evaluation metrics within the
field of RSOD, as well as the application scenarios for RSOD. Future research
directions are provided for further promoting the research in RSOD.
Comment: Accepted by IEEE Geoscience and Remote Sensing Magazine. More than
300 papers relevant to the RSOD field were reviewed in this survey
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, and text detection, and provides an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication
Conteo de vehículos a partir de vídeos usando machine learning
This work presents a framework for vehicle counting from videos, using deep neural networks as detectors. The framework has 4 stages: preprocessing, detection and classification, tracking, and post-processing. For the detection stage, several deep object detectors are compared and 3 new ones based on Tiny YOLOv3 are proposed.
For the tracking stage, a new IOU-based tracker is compared against the classic ones: Boosting, KCF, TLD, Medianflow, MOSSE and CSRT. The comparison is based on 8 multi-object tracking metrics over the Bog19 dataset.
The Bog19 dataset is a collection of annotated videos from the city of Bogota. The annotated classes include bicycles, buses, cars, motorbikes and trucks. Finally, the system is evaluated on the task of vehicle counting on this dataset.
For the counting task, the combinations of the proposed detectors with the Medianflow and MOSSE trackers obtain the best results. The resulting detectors match the performance of state-of-the-art detectors while running faster.
Master's in Engineering - Systems and Computing Engineering
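The IOU-based tracking idea mentioned in the vehicle-counting abstract can be sketched as follows. This is an illustrative greedy matcher, not the tracker from the thesis: the function names `iou` and `update_tracks`, the `(x1, y1, x2, y2)` box format, the 0.5 threshold, and the policy of dropping unmatched tracks immediately are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def update_tracks(tracks, detections, next_id, iou_threshold=0.5):
    """Associate the current frame's detections with existing tracks.

    tracks: dict mapping track id -> last known box.
    detections: list of boxes detected in the current frame.
    Returns (updated_tracks, next_id). Tracks with no match are dropped,
    and each unmatched detection starts a new track.
    """
    updated = {}
    unmatched = list(detections)
    for track_id, box in tracks.items():
        if not unmatched:
            break
        # Greedily pick the detection that overlaps this track the most.
        best = max(unmatched, key=lambda det: iou(box, det))
        if iou(box, best) >= iou_threshold:
            updated[track_id] = best
            unmatched.remove(best)
    for det in unmatched:
        updated[next_id] = det
        next_id += 1
    return updated, next_id
```

Under these assumptions, calling `update_tracks` once per frame and reading `next_id` at the end gives the number of distinct track identities created, which is one simple way to turn tracking output into a count.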
Deep Learning-Based Object Detection in Maritime Unmanned Aerial Vehicle Imagery: Review and Experimental Comparisons
With the advancement of maritime unmanned aerial vehicles (UAVs) and deep
learning technologies, the application of UAV-based object detection has become
increasingly significant in the fields of maritime industry and ocean
engineering. Endowed with intelligent sensing capabilities, the maritime UAVs
enable effective and efficient maritime surveillance. To further promote the
development of maritime UAV-based object detection, this paper provides a
comprehensive review of challenges, related methods, and UAV aerial datasets.
Specifically, in this work, we first briefly summarize four challenges for
object detection on maritime UAVs, i.e., object feature diversity, device
limitation, maritime environment variability, and dataset scarcity. We then
focus on computational methods that improve maritime UAV-based object detection
performance, including scale-aware, small object detection, view-aware,
rotated object detection, and lightweight methods, among others. Next, we review the
UAV aerial image/video datasets and propose a maritime UAV aerial dataset named
MS2ship for ship detection. Furthermore, we conduct a series of experiments to
present the performance evaluation and robustness analysis of object detection
methods on maritime datasets. Finally, we discuss the outlook for
future work on maritime UAV-based object detection. The MS2ship dataset is
available at https://github.com/zcj234/MS2ship.
Comment: 32 pages, 18 figures