6,078 research outputs found

    Modeling Camera Effects to Improve Visual Learning from Synthetic Data

    Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few studies have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting the generalizability of networks that are trained on synthetic data and tested in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline to vary sensor effects (chromatic aberration, blur, exposure, noise, and color cast) for synthetic imagery. In particular, this paper illustrates that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real domains for the task of object detection in urban driving scenes.
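
To make the idea of sensor-domain augmentation concrete, here is a minimal sketch of three of the listed effects (exposure, color cast, and noise). It is not the paper's pipeline: the function name `augment_sensor_effects`, its parameters, and the simple multiplicative and additive models are assumptions chosen for illustration, and chromatic aberration and blur are left out.

```python
import random


def augment_sensor_effects(image, exposure_gain=1.0,
                           color_cast=(1.0, 1.0, 1.0),
                           noise_sigma=0.0, seed=None):
    """Apply exposure, color cast, and additive Gaussian noise to an RGB image.

    `image` is a list of rows; each row is a list of (r, g, b) floats in [0, 1].
    All models here are simplified assumptions, not the paper's physical models.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    out = []
    for row in image:
        new_row = []
        for pixel in row:
            new_pixel = []
            for value, cast in zip(pixel, color_cast):
                v = value * exposure_gain * cast   # global gain, then per-channel cast
                v += rng.gauss(0.0, noise_sigma)   # additive zero-mean sensor noise
                new_pixel.append(min(1.0, max(0.0, v)))  # clip to the valid range
            new_row.append(tuple(new_pixel))
        out.append(new_row)
    return out


if __name__ == "__main__":
    synthetic = [[(0.25, 0.25, 0.25), (0.8, 0.8, 0.8)]]
    # Overexpose and shift toward yellow by halving the blue channel.
    print(augment_sensor_effects(synthetic, exposure_gain=2.0,
                                 color_cast=(1.0, 1.0, 0.5)))
```

In a training setup, parameters like these would typically be sampled at random per image, so that the detector sees a range of sensor conditions rather than a single one.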

    VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection

    Although traffic sign detection has been studied for years and great progress has been made with the rise of deep learning techniques, many problems remain to be addressed. For complicated real-world traffic scenes, there are two main challenges. First, traffic signs are usually small objects, which are more difficult to detect than large ones. Second, without context information it is hard to reject false targets that resemble real traffic signs in complex street scenes. To handle these problems, we propose a novel end-to-end deep learning method for traffic sign detection in complex environments. Our contributions are as follows: 1) We propose a multi-resolution feature fusion network architecture which exploits densely connected deconvolution layers with skip connections, and can learn more effective features for small objects; 2) We frame traffic sign detection as a spatial sequence classification and regression task, and propose a vertical spatial sequence attention (VSSA) module to gain more context information for better detection performance. To comprehensively evaluate the proposed method, we conduct experiments on several traffic sign datasets as well as a general object detection dataset, and the results show the effectiveness of the proposed method.

    MFMAN-YOLO: A Method for Detecting Pole-like Obstacles in Complex Environment

    In real-world traffic, there are various uncertainties and complexities in road and weather conditions. To solve the problem that the feature information of pole-like obstacles in complex environments is easily lost, resulting in low detection accuracy and low real-time performance, a multi-scale hybrid attention mechanism detection algorithm is proposed in this paper. First, the Monge-Kantorovich (MK) optimal transport function is incorporated, both to resolve overlapping prediction frames through optimal matching and, because the MK function can be regularized, to prevent model over-fitting; then, the features at different scales are up-sampled separately according to the optimized efficient multi-scale feature pyramid. Finally, the extraction of multi-scale feature space channel information is enhanced in complex environments based on the hybrid attention mechanism, which suppresses irrelevant background information and focuses on the feature information of pole-like obstacles. In addition, this paper conducts real road test experiments in a variety of complex environments. The experimental results show that the detection precision, recall, and average precision of the method are 94.7%, 93.1%, and 97.4%, respectively, and the detection frame rate is 400 f/s. This method can detect pole-like obstacles in a complex road environment accurately and in real time, which further promotes innovation and progress in the field of automatic driving. Comment: 11 pages

    Remote Sensing Object Detection Meets Deep Learning: A Meta-review of Challenges and Advances

    Remote sensing object detection (RSOD), one of the most fundamental and challenging tasks in the remote sensing field, has received longstanding attention. In recent years, deep learning techniques have demonstrated robust feature representation capabilities and led to a big leap in the development of RSOD techniques. In this era of rapid technical evolution, this review aims to present a comprehensive survey of recent achievements in deep learning based RSOD methods. More than 300 papers are covered in this review. We identify five main challenges in RSOD, namely multi-scale object detection, rotated object detection, weak object detection, tiny object detection, and object detection with limited supervision, and systematically review the corresponding methods in a hierarchical manner. We also review the widely used benchmark datasets and evaluation metrics within the field of RSOD, as well as the application scenarios for RSOD. Future research directions are provided to further promote research in RSOD. Comment: Accepted by IEEE Geoscience and Remote Sensing Magazine. More than 300 papers relevant to the RSOD field were reviewed in this survey

    Object Detection in 20 Years: A Survey

    Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics are covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc., and makes an in-depth analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible publication

    Conteo de vehículos a partir de vídeos usando machine learning (Vehicle counting from videos using machine learning)

    This work presents a framework for vehicle counting from videos, using deep neural networks as detectors. The framework has 4 stages: preprocessing, detection and classification, tracking, and post-processing. For the detection stage, several deep object detectors are compared and 3 new ones are proposed based on Tiny YOLOv3. For the tracking stage, a new tracker based on IOU is compared against the classic ones: Boosting, KCF, TLD, MedianFlow, MOSSE and CSRT. The comparison is based on 8 multi-object tracking metrics over the Bog19 dataset. The Bog19 dataset is a collection of annotated videos from the city of Bogotá. The annotations include bicycles, buses, cars, motorbikes and trucks. Finally, the system is evaluated for the task of vehicle counting on this dataset. For the counting task, the combinations of the proposed detectors with the MedianFlow and MOSSE trackers obtain the best results. The detectors found have the same performance as those of the state of the art but with a higher speed. Magíster en Ingeniería - Ingeniería de Sistemas y Computación (Master of Engineering, Systems and Computing Engineering)
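
The abstract above mentions a tracker based on IOU (intersection over union) without giving details. As a rough illustration of the general idea only, the sketch below computes IOU for two axis-aligned boxes and greedily links existing tracks to new detections. The function names, the `(x1, y1, x2, y2)` box format, the greedy strategy, and the 0.5 threshold are all assumptions for this example, not the thesis's actual tracker.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # zero when boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(tracks, detections, threshold=0.5):
    """Greedily assign each track the unused detection with the highest IOU.

    `tracks` maps a track id to its last known box; `detections` is a list of
    boxes from the current frame. Returns {track_id: detection_index}.
    The threshold value is an assumed default, not taken from the source.
    """
    matches, used = {}, set()
    for track_id, track_box in tracks.items():
        best_index, best_score = None, threshold
        for index, det_box in enumerate(detections):
            if index in used:
                continue
            score = iou(track_box, det_box)
            if score >= best_score:
                best_index, best_score = index, score
        if best_index is not None:
            matches[track_id] = best_index
            used.add(best_index)
    return matches


if __name__ == "__main__":
    tracks = {1: (0, 0, 10, 10)}
    detections = [(50, 50, 60, 60), (1, 1, 11, 11)]
    print(match_tracks(tracks, detections))
```

Detections left unmatched would normally start new tracks, and tracks left unmatched for several frames would be closed; counting then reduces to counting tracks that cross a reference line or region.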

    Deep Learning-Based Object Detection in Maritime Unmanned Aerial Vehicle Imagery: Review and Experimental Comparisons

    With the advancement of maritime unmanned aerial vehicles (UAVs) and deep learning technologies, the application of UAV-based object detection has become increasingly significant in the fields of maritime industry and ocean engineering. Endowed with intelligent sensing capabilities, maritime UAVs enable effective and efficient maritime surveillance. To further promote the development of maritime UAV-based object detection, this paper provides a comprehensive review of challenges, related methods, and UAV aerial datasets. Specifically, we first briefly summarize four challenges for object detection on maritime UAVs, i.e., object feature diversity, device limitation, maritime environment variability, and dataset scarcity. We then focus on computational methods to improve maritime UAV-based object detection performance, covering scale-aware methods, small object detection, view-aware methods, rotated object detection, lightweight methods, and others. Next, we review the UAV aerial image/video datasets and propose a maritime UAV aerial dataset named MS2ship for ship detection. Furthermore, we conduct a series of experiments to present the performance evaluation and robustness analysis of object detection methods on maritime datasets. Finally, we discuss the outlook for future work on maritime UAV-based object detection. The MS2ship dataset is available at https://github.com/zcj234/MS2ship. Comment: 32 pages, 18 figures