
    FINE-TUNING DEEP LEARNING MODELS FOR PEDESTRIAN DETECTION

    Object detection in high-resolution images is a new challenge that the remote sensing community faces following the introduction of unmanned aerial vehicles and monitoring cameras. One point of interest is detecting and tracking persons in the images. Unlike general objects, pedestrians can take different poses and undergo constant morphological changes while moving, so this task calls for an intelligent solution. Fine-tuning has attracted great interest among researchers owing to its relevance for retraining convolutional networks for many interesting applications, and fine-tuned models have shown state-of-the-art performance in object classification, detection, and segmentation. In the present work, we evaluate the performance of fine-tuned models under varying amounts of training data by comparing Faster Region-based Convolutional Neural Network (Faster R-CNN) Inception v2, Single Shot MultiBox Detector (SSD) Inception v2, and SSD Mobilenet v2. To this end, the effect of varying the training data on performance metrics such as accuracy, precision, F1-score, and recall is taken into account. After testing the detectors, we found that precision and recall are the metrics most sensitive to the amount of training data. Across five variations of the amount of training data, proportions of 60%-80% consistently achieve highly comparable performance, and in every variation Faster R-CNN Inception v2 outperforms SSD Inception v2 and SSD Mobilenet v2 on the evaluated metrics, although the SSD models converge more quickly during training. Overall, partitioning 80% of the total data for fine-tuning produces efficient detectors even with only 700 data samples.
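    The evaluation protocol described above lends itself to a short sketch. The Python below illustrates the general pattern of partitioning a fixed pool of samples at several training proportions and scoring a detector with the metrics the study compares; the exact split fractions and the fine-tune/evaluate hooks are illustrative assumptions, not the authors' code.

```python
import random

def split_dataset(samples, train_fraction):
    """Randomly partition samples into train/test sets at the given proportion."""
    shuffled = random.sample(samples, len(samples))
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def detection_metrics(tp, fp, fn, tn=0):
    """Compute the metrics compared in the study from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn) if tp + fp + fn + tn else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# 700 samples, as in the study; the five split proportions are assumed here,
# since the abstract only names the 60%-80% range and the 80% recommendation.
samples = list(range(700))
for frac in (0.2, 0.4, 0.6, 0.8, 0.9):
    train, test = split_dataset(samples, frac)
    # fine_tune(detector, train) and evaluate(detector, test) would run here
    print(f"train={frac:.0%}: {len(train)} fine-tuning samples, {len(test)} held out")
```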

    Satellite image analysis platform for the discovery of water resources through the application of artificial intelligence techniques

    Spain is the European country with the second-largest number of swimming pools. However, the legal literature estimates that 20% of these pools are either undeclared or irregular. To detect illegal or irregular structures, the administration employs a body of staff who manually analyze satellite or drone imagery, a method that is costly in effort, human resources, and time, and that depends on the subjective judgment of the person carrying it out. This work proposes a platform based on multi-agent systems that incorporates computer vision techniques to detect illegal structures automatically, including, for example, irregular water basins. To this end, geographic information system (GIS) tools based on orthophotography are combined with advanced convolutional-network-based object detection. The multi-agent architecture also makes the system modular, allowing the different parts of the system to work together and balance the workload. The proposed system has been validated through testing in different Spanish cities and has shown promising results, with an agreement rate above 97% and an F1-score of 97.1%.
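    As a minimal sketch of the tile-and-detect pattern such a platform implies, the code below slices a large orthophoto into overlapping windows and runs a detector over each one; the tile size, overlap, and detect_pools hook are assumptions for illustration, since the abstract does not specify the platform's CNN or agent framework.

```python
from typing import Callable, Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height) in orthophoto pixels

def iter_tiles(width: int, height: int, tile: int = 512, overlap: int = 64) -> Iterator[Window]:
    """Yield overlapping windows covering the full orthophoto."""
    step = tile - overlap
    for y in range(0, height, step):
        for x in range(0, width, step):
            yield (x, y, min(tile, width - x), min(tile, height - y))

def scan_orthophoto(width: int, height: int,
                    detect_pools: Callable[[Window], List[dict]]) -> List[dict]:
    """Run the detector over every tile. Because tiles are independent,
    a multi-agent system can distribute them to balance the workload."""
    detections: List[dict] = []
    for window in iter_tiles(width, height):
        detections.extend(detect_pools(window))
    return detections

# Stub detector: a real one would run the convolutional network on the tile.
print(len(scan_orthophoto(2048, 2048, lambda window: [])))
```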

    A Robust Object Detection System for Driverless Vehicles through Sensor Fusion and Artificial Intelligence Techniques

    Since the early 1990s, various research domains have been concerned with the concept of autonomous driving, leading to the widespread implementation of numerous advanced driver assistance features. However, fully automated vehicles have not yet been introduced to the market. The process of autonomous driving can be outlined through the following stages: environment perception, ego-vehicle localization, trajectory estimation, path planning, and vehicle control. Environment perception relies partly on computer vision algorithms that detect and track surrounding objects. Object detection by autonomous vehicles is considered challenging for several reasons, such as the presence of multiple dynamic objects in the same scene, interactions between objects, real-time speed requirements, and diverse weather conditions (e.g., rain, snow, fog). Although many studies have addressed object detection for autonomous vehicles, it remains a challenging task, and improving its performance in diverse driving scenes is an ongoing field of research. This thesis aims to develop novel methods for the detection and 3D localization of surrounding dynamic objects in driving scenes under different rainy weather conditions.

    Firstly, owing to the frequent occurrence of rain and its negative effect on object detection, a real-time lightweight deraining network is proposed that operates on individual images. Rain streaks and accumulated rain streaks introduce distinct visual degradation effects in captured images, and the proposed network removes both through the progressive operation of two main stages: rain streak removal and rain streak accumulation removal. The rain streak removal stage is based on a Residual Network (ResNet) to maintain real-time performance without adding computational complexity; recursive computation further allows network parameters to be shared. Distant rain streaks, meanwhile, accumulate and induce a distortion similar to fog, so they can be mitigated in a way similar to defogging; this stage relies on a transmission-guided lightweight network (TGL-Net). The proposed deraining network was evaluated on five datasets with synthetic rain of different properties and on two further datasets with real rainy scenes.

    Secondly, a novel sensory system is proposed that achieves real-time detection of multiple dynamic objects in driving scenes. It combines a monocular camera and a 2D Light Detection and Ranging (LiDAR) sensor in a complementary fusion approach. YOLOv3, a baseline real-time object detection algorithm, is used to detect and classify objects in the camera images, localizing each detection with a bounding box. Since objects in a driving scene are dynamic and often occlude one another, an algorithm was developed to differentiate objects whose bounding boxes overlap. The pixel locations of the bounding boxes are then converted into real-world angular coordinates, and a 2D LiDAR provides depth measurements while keeping computational requirements low, saving resources for other autonomous driving operations. A novel technique was developed and tested for processing the 2D LiDAR measurements and mapping them to the corresponding bounding boxes, and the detection accuracy of the proposed system was evaluated manually in different real-time scenarios.

    Finally, the effectiveness of the proposed deraining network was validated in terms of its impact on object detection on de-rained images. Compared with existing baseline deraining networks, the proposed network runs 2.23× faster than their average running time while achieving a 1.2× improvement when tested on the different synthetic datasets. Tests on the LiDAR measurements showed an average error of ±0.04 m in real driving scenes. When deraining and object detection were tested jointly, performing deraining ahead of object detection yielded a 1.45× improvement in object detection precision.
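    The pixel-to-angle conversion and LiDAR association described above can be illustrated with a short sketch. The field of view, image width, and median-range rule below are assumptions for illustration only; the thesis's actual mapping technique is not spelled out in the abstract.

```python
import math

def pixel_to_azimuth(u: float, image_width: int, hfov_deg: float) -> float:
    """Map a horizontal pixel coordinate to a camera-frame azimuth in degrees,
    assuming a pinhole camera with a known horizontal field of view."""
    focal_px = (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)
    return math.degrees(math.atan((u - image_width / 2) / focal_px))

def box_depth(box, scan, image_width=1280, hfov_deg=90.0):
    """Assign a depth to a bounding box from a planar 2D LiDAR scan.
    `box` is (u_min, u_max) in pixels; `scan` is a list of (azimuth_deg, range_m).
    The median of the returns inside the box's angular span is used here,
    which is an assumed rule, not necessarily the thesis's."""
    a_min = pixel_to_azimuth(box[0], image_width, hfov_deg)
    a_max = pixel_to_azimuth(box[1], image_width, hfov_deg)
    ranges = sorted(r for a, r in scan if a_min <= a <= a_max)
    return ranges[len(ranges) // 2] if ranges else None

# Toy scan over +/-45 degrees and a pedestrian box spanning pixels 600-700.
scan = [(float(az), 8.0 + 0.01 * az) for az in range(-45, 46)]
print(box_depth((600, 700), scan))
```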