
    Deep neural network for traffic sign recognition systems: An analysis of spatial transformers and stochastic optimisation methods

    This paper presents a Deep Learning approach for traffic sign recognition systems. Several classification experiments are conducted over publicly available traffic sign datasets from Germany and Belgium using a Deep Neural Network comprising convolutional layers and Spatial Transformer Networks. These experiments are designed to measure the impact of diverse factors, with the end goal of designing a Convolutional Neural Network that improves the state of the art in traffic sign classification. First, different adaptive and non-adaptive stochastic gradient descent optimisation algorithms, such as SGD, SGD-Nesterov, RMSprop and Adam, are evaluated. Subsequently, multiple combinations of Spatial Transformer Networks placed at distinct positions within the main neural network are analysed. The proposed Convolutional Neural Network achieves an accuracy of 99.71% on the German Traffic Sign Recognition Benchmark, outperforming previous state-of-the-art methods while also being more efficient in terms of memory requirements. Funding: Ministerio de Economía y Competitividad TIN2017-82113-C2-1-R; Ministerio de Economía y Competitividad TIN2013-46801-C4-1-
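
    The architecture and optimiser comparison described above lend themselves to a short sketch. The following PyTorch fragment is a minimal illustration only, not the network reported in the paper: the layer sizes, the assumed 32x32 RGB inputs and the placement of a single Spatial Transformer in front of the convolutional stack are all assumptions made for the example.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class STN(nn.Module):
            # Predicts a 2x3 affine transform from the input and resamples it with that
            # transform, so the classifier sees a spatially normalised sign crop.
            def __init__(self):
                super().__init__()
                self.loc = nn.Sequential(
                    nn.Conv2d(3, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
                    nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU())
                # 10 x 4 x 4 localisation features for the assumed 32x32 input.
                self.fc_loc = nn.Sequential(
                    nn.Flatten(), nn.Linear(10 * 4 * 4, 32), nn.ReLU(), nn.Linear(32, 6))
                # Start from the identity transform so early training is undistorted.
                self.fc_loc[-1].weight.data.zero_()
                self.fc_loc[-1].bias.data.copy_(
                    torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

            def forward(self, x):
                theta = self.fc_loc(self.loc(x)).view(-1, 2, 3)
                grid = F.affine_grid(theta, x.size(), align_corners=False)
                return F.grid_sample(x, grid, align_corners=False)

        class SignNet(nn.Module):
            def __init__(self, num_classes=43):  # GTSRB has 43 sign classes
                super().__init__()
                self.stn = STN()  # one possible placement: before the convolutional stack
                self.features = nn.Sequential(
                    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
                self.classifier = nn.Sequential(
                    nn.Flatten(), nn.Linear(64 * 8 * 8, num_classes))

            def forward(self, x):
                return self.classifier(self.features(self.stn(x)))

        model = SignNet()
        # Any of the optimisers compared in the paper can be swapped in here:
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        # optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9, nesterov=True)
        # optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)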

    Artificial intelligence techniques applied to traffic sign detection and classification systems

    This thesis, presented as a collection of research articles, studies and analyses solutions for traffic sign detection and classification systems, which pose a challenge in current applications such as road safety and driver assistance, autonomous cars, vertical signage maintenance, and traffic scene analysis. Traffic signs are a fundamental asset of the road network because they are meant to be easily perceived by pedestrians and drivers in order to warn and guide them both by day and by night. The fact that signs are designed to be unique and to have distinguishable characteristics, such as simple shapes and uniform colours, means that their detection and recognition is a constrained problem. However, developing a real-time sign recognition system still presents challenges due to response times, which are crucial for making decisions in the environment, and to the variability of traffic scene images, which may include images at different scales, complicated viewpoints, occlusions, and different lighting conditions. Any traffic sign detection and classification system must address these challenges. In this work, a traffic sign classification system based on Deep Learning is presented. Specifically, the main components of the proposed Deep Neural Network are convolutional layers and Spatial Transformer Networks. The network is fed with RGB images of traffic signs from different countries such as Germany, Belgium and Spain. For the German signs, which belong to the German Traffic Sign Recognition Benchmark (GTSRB) dataset, the proposed network architecture and optimisation parameters achieve 99.71% accuracy, surpassing both the human visual system and all previous state-of-the-art results, while also being more efficient in terms of memory requirements. At the time of writing this thesis, our method holds the first position in the worldwide ranking. Regarding the problem of traffic sign detection, several object detection systems from the state of the art are analysed; they are specifically modified and adapted to our problem domain in order to apply transfer learning to neural networks. Multiple performance metrics are also studied for each detection model in order to tell the reader which sign detector would be best given the constraints of the environment where the solution is to be deployed, such as accuracy, memory consumption or execution speed. Our study shows that the Faster R-CNN Inception ResNet V2 model obtains the best accuracy (95.77% mAP), while R-FCN ResNet 101 achieves the best trade-off between execution time (85.45 ms per image) and accuracy (95.15% mAP).
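
    For the detection part, the thesis benchmarks detectors from the TensorFlow Object Detection API (Faster R-CNN Inception ResNet V2, R-FCN ResNet 101). The sketch below only illustrates the transfer-learning step on an analogous, readily available torchvision detector; it is not the thesis code, and the number of sign categories is a placeholder.

        import torch
        import torchvision
        from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

        # Load a detector pre-trained on COCO and keep its backbone weights.
        model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

        # Replace the box-classification head with one sized for the sign categories
        # (background + N sign classes; N = 4 is only a placeholder here).
        num_classes = 1 + 4
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

        # Fine-tune on the traffic sign dataset; depending on the memory/speed budget,
        # either only the new head or the whole network is updated.
        params = [p for p in model.parameters() if p.requires_grad]
        optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)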

    Free-Shape Polygonal Object Localization

    Polygonal objects are prevalent in man-made scenes. Early approaches to detecting them relied mainly on geometry, while subsequent ones also incorporated appearance-based cues. It has recently been shown that this can be done fast by searching for cycles in graphs of line fragments, provided that the cycle scoring function can be expressed as additive terms attached to individual fragments. In this paper, we propose an approach that eliminates this restriction. Given a weighted line-fragment graph, we use its cyclomatic number to partition the graph into manageably-sized sub-graphs that preserve nodes and edges with a high weight and are most likely to contain object contours. Object contours are then detected as maximally scoring elementary circuits enumerated in each sub-graph. Our approach can be used with any cycle scoring function, and multiple candidates that share line fragments can be found, unlike in other approaches, which rely on a greedy strategy to find candidates. We demonstrate that our approach significantly outperforms the state of the art for the detection of building rooftops in aerial images and polygonal object categories from ImageNet.
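
    A minimal sketch of the cycle-search idea, not the authors' partitioning algorithm: prune the weakest edges of a weighted line-fragment graph until its cycle space, measured by the cyclomatic number, is small enough to enumerate, then score the elementary cycles with an arbitrary (here non-additive) function. The graph, weights and bound are made up, and NetworkX 3.1 or newer is assumed for simple_cycles on undirected graphs.

        import math
        import networkx as nx

        def cyclomatic_number(g):
            # E - V + C: the dimension of the cycle space of an undirected graph.
            return g.number_of_edges() - g.number_of_nodes() + nx.number_connected_components(g)

        def best_cycle(g, score, max_cyclomatic=10):
            # Drop the weakest edges until the cycle space is small enough to enumerate,
            # then return the maximally scoring elementary cycle.
            pruned = g.copy()
            for u, v, _ in sorted(g.edges(data="weight"), key=lambda e: e[2]):
                if cyclomatic_number(pruned) <= max_cyclomatic:
                    break
                pruned.remove_edge(u, v)
            return max(nx.simple_cycles(pruned), key=score, default=None)

        # Toy line-fragment graph: nodes are fragments, weights are grouping affinities.
        g = nx.Graph()
        g.add_weighted_edges_from([(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.7),
                                   (3, 0, 0.9), (1, 3, 0.2)])

        def product_score(cycle):
            # Any scoring function works here, additive or not; this one multiplies
            # the affinities along the closed contour.
            return math.prod(g[u][v]["weight"] for u, v in zip(cycle, cycle[1:] + cycle[:1]))

        print(best_cycle(g, product_score))  # -> the outer quadrilateral 0-1-2-3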

    Real-time landing place assessment in man-made environments

    We propose a novel approach to real-time landing site detection and assessment in unconstrained man-made environments using passive sensors. Because this task must be performed in a few seconds or less, existing methods are often limited to simple local intensity and edge variation cues. By contrast, we show how to efficiently take into account the potential sites' global shape, which is a critical cue in man-made scenes. Our method relies on a new segmentation algorithm and shape regularity measure to look for polygonal regions in video sequences. In this way, we enforce both temporal consistency and geometric regularity, resulting in very reliable and consistent detections. We demonstrate our approach on the detection of landable sites such as rural fields, building rooftops and runways from color and infrared monocular sequences, significantly outperforming the state of the art.
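
    The paper's segmentation algorithm and regularity measure are not reproduced here; the snippet below only shows one plausible way to score how polygonal a candidate region is, using standard OpenCV contour tools: approximate the region boundary with a few-vertex polygon and check how much of the region's area that polygon explains. The threshold fraction and the synthetic mask are illustrative assumptions.

        import cv2
        import numpy as np

        def polygonal_regularity(contour, eps_frac=0.02):
            # Approximate the region boundary by a polygon with few vertices and check
            # how well that polygon's area matches the region's area; regions with clean,
            # straight sides keep almost all of their area. Illustrative measure only.
            perimeter = cv2.arcLength(contour, True)
            polygon = cv2.approxPolyDP(contour, eps_frac * perimeter, True)
            area, poly_area = cv2.contourArea(contour), cv2.contourArea(polygon)
            if len(polygon) < 3 or area == 0 or poly_area == 0:
                return 0.0
            return min(area, poly_area) / max(area, poly_area)

        # Synthetic example: a binary mask containing one quadrilateral region.
        mask = np.zeros((120, 160), np.uint8)
        cv2.fillConvexPoly(mask, np.array([[20, 20], [140, 30], [130, 100], [25, 90]], np.int32), 255)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        print(polygonal_regularity(contours[0]))  # close to 1.0 for this clean quadrilateral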

    Monocular vision based navigation using image moments of polygonal features

    This thesis presents a novel monocular-vision-based localization and mapping algorithm using moments of polygonal features. The landmarks we use are polygonal regions instead of a dense set of feature points, which can significantly reduce the computational complexity of data association and produce a map that is geometrically and structurally more meaningful. Each region can be characterized by its depth and orientation with respect to the camera, and a polygon detection and tracking algorithm is developed. The monocular vision Simultaneous Localization and Mapping (SLAM) problem is formulated as a filtering problem that incorporates the image moments of the closed regions, or polygons, being tracked. The observability of the SLAM estimator is further improved by both the additional measurements with respect to the initial view location and the use of image moments. We analyze the performance of our SLAM algorithm with numerical simulations and experimental results. We also compare our results with ORB-SLAM to show the effectiveness of our algorithm in outdoor environments.
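
    The SLAM formulation itself is not reproduced here; the sketch below only shows the kind of low-order image moments of a tracked polygonal region (signed area and centroid via the shoelace formula) that could serve as filter measurements. The example vertices are made up.

        import numpy as np

        def polygon_moments(vertices):
            # Low-order image moments of a simple polygon given its vertices in order,
            # computed with the shoelace formula: m00 (signed area) and the centroid.
            x, y = vertices[:, 0], vertices[:, 1]
            xn, yn = np.roll(x, -1), np.roll(y, -1)
            cross = x * yn - xn * y
            m00 = cross.sum() / 2.0                       # signed area
            cx = ((x + xn) * cross).sum() / (6.0 * m00)   # centroid x
            cy = ((y + yn) * cross).sum() / (6.0 * m00)   # centroid y
            return m00, cx, cy

        # A tracked quadrilateral region in image coordinates (made-up values).
        quad = np.array([[120.0, 80.0], [310.0, 95.0], [300.0, 240.0], [115.0, 230.0]])
        area, cx, cy = polygon_moments(quad)
        print(area, cx, cy)  # scalars of this kind are what a filter update could consume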

    Localizing Polygonal Objects in Man-Made Environments

    Object detection is a significant challenge in Computer Vision and has received a lot of attention in the field. One such challenge addressed in this thesis is the detection of polygonal objects, which are prevalent in man-made environments. Shape analysis is an important cue for detecting these objects. We propose a contour-based object detection framework to deal with the related challenges, including how to efficiently detect polygonal shapes and how to exploit them for object detection. First, we propose an efficient component tree segmentation framework for stable region extraction and a multi-resolution line segment detection algorithm, which form the bases of our detection framework. Our component tree segmentation algorithm explores the optimal threshold for each branch of the component tree; it achieves a significant improvement over image thresholding segmentation and performance comparable to more sophisticated methods at only a fraction of the computation time. Our line segment detector overcomes several inherent limitations of the Hough transform and achieves performance comparable to state-of-the-art line segment detectors, while better capturing dominant structures and remaining more stable under low-quality imaging conditions. Second, we propose a global shape analysis measure for simple polygon detection and use it to develop an approach for real-time landing site detection in unconstrained man-made environments. Since the task of detecting landing sites must be performed in a few seconds or less, existing methods are often limited to simple local intensity and edge variation cues. By contrast, we show how to efficiently take into account the potential sites' global shape, which is a critical cue in man-made scenes. Our method relies on a component tree segmentation algorithm and a new shape regularity measure to look for polygonal regions in video sequences. In this way, we enforce both temporal consistency and geometric regularity, resulting in reliable and consistent detections. Third, we propose a generic contour-grouping-based object detection approach that explores promising cycles in a line fragment graph. Previous contour-based methods are limited to additive scoring functions. In this thesis, we propose an approximate search approach that eliminates this restriction. Given a weighted line fragment graph, we prune its cycle space by removing cycles containing weak nodes or weak edges, until the upper bound of the cycle space is less than the threshold defined by the cyclomatic number. Object contours are then detected as maximally scoring elementary circuits in the pruned cycle space. Furthermore, we propose another, more efficient algorithm, which reconstructs the graph by grouping the strongest edges iteratively until the number of cycles reaches the upper bound. Our approximate search approaches can be used with any cycle scoring function. Moreover, unlike other contour-grouping-based approaches, our approach does not rely on a greedy strategy for finding multiple candidates and is capable of finding multiple candidates sharing common line fragments. We demonstrate that our approach significantly outperforms the state of the art.
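
    The thesis's per-branch optimal-threshold component tree algorithm is not reproduced here. As a rough, readily available stand-in for the same idea, stable extremal region extraction from the component tree, the snippet below uses OpenCV's MSER; the parameters and the input file name are placeholders.

        import cv2

        # Stand-in sketch: MSER also walks nested thresholded components of an intensity
        # image and keeps the regions that stay stable across thresholds. It is not the
        # thesis's per-branch optimal-threshold algorithm, only an available analogue.
        gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
        mser = cv2.MSER_create(5, 60, 14400)  # delta, min_area, max_area
        regions, boxes = mser.detectRegions(gray)

        # Each region is a point set; its fitted polygon or bounding box can be handed
        # to the later shape-analysis and contour-grouping stages.
        print(len(regions), "stable regions")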

    The regular polygon detector

    This paper describes a robust regular polygon detector. Given image edges, we derive the a posteriori probability for a mixture of regular polygons, and thus the probability density function for the appearance of a set of regular polygons. Likely regula

    Regular Polygon Detection as an Interest Point Operator for SLAM

    We present a new interest point operator based on the regular polygon detector developed by Loy and Barnes [2004]. This operator finds square-like features as a basis for scene reconstruction and visual Simultaneous Localisation and Mapping (SLAM) from robot camera sequences. In this paper we show results from the application of this detector as an interest point operator on a robot camera sequence in an indoor office environment. The detector shows good results for non-trivial frame baselines.

    Improved Signal To Noise Ratio And Computational Speed For Gradient-Based Detection Algorithms

    Image gradient-based feature detectors offer great advantages over their standard edge-only equivalents. In driver support systems research, the radial symmetry detection algorithm has given real-time results for speed sign recognition. The regular polygon detector is a scan-line algorithm for these features, facilitating recognition of other road signs such as stop and give-way signs. Radial symmetry has also been applied to real-time face detection, and the polygon detector is showing promising results as a feature detector for SLAM. However, gradient-based feature detection is more sensitive to noise than standard edge-based algorithms. As the total gradient magnitude at a pixel decreases, the component of the gradient at that point arising from image noise increases. When a pixel votes in its gradient direction out to an extended radius, its vote position is more likely to be inaccurate if the gradient magnitude is low. In this paper, we analyse the performance of the radial symmetry and regular polygon detector algorithms under changes to the threshold on gradient magnitude. We show that the number of pixels correctly voting on a circle is not greatly reduced by thresholds that decrease the total number of voting pixels in the image to 20%. This greatly reduces the noise component in the image with only a slight impact on the signal, improving performance, particularly for the regular polygon detector, where the voting mechanism is complex and constitutes a large amount of the per-pixel processing. This facilitates a real-time implementation, which is presented here.
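
    A small sketch of the thresholding step analysed above: compute image gradients, then let only the pixels with the strongest gradient magnitudes vote. Choosing the threshold as the 80th percentile, so that roughly 20% of pixels keep voting, is an assumption made to mirror the figure quoted in the abstract; the input file name is a placeholder.

        import cv2
        import numpy as np

        gray = cv2.imread("road_scene.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)  # placeholder

        # Image gradients: the voting direction comes from (gx, gy), the vote strength
        # from the gradient magnitude.
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
        magnitude = cv2.magnitude(gx, gy)

        # Keep only the strongest ~20% of pixels as voters: low-magnitude gradients are
        # dominated by noise, so dropping them removes most noise votes while preserving
        # nearly all of the votes that genuinely lie on sign edges.
        threshold = np.percentile(magnitude, 80.0)
        voters = magnitude >= threshold
        print(voters.mean())  # fraction of pixels still allowed to vote (about 0.2)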