Object Recognition from very few Training Examples for Enhancing Bicycle Maps
In recent years, data-driven methods have shown great success for extracting
information about the infrastructure in urban areas. These algorithms are
usually trained on large datasets consisting of thousands or millions of
labeled training examples. While large datasets have been published for
cars, very little labeled data is available for cyclists, although the
appearance, point of view, and positioning of the relevant objects differ. Unfortunately,
labeling data is costly and requires a huge amount of work. In this paper, we
thus address the problem of learning with very few labels. The aim is to
recognize particular traffic signs in crowdsourced data to collect information
which is of interest to cyclists. We propose a system for object recognition
that is trained with only 15 examples per class on average. To achieve this, we
combine the advantages of convolutional neural networks and random forests to
learn a patch-wise classifier. In the next step, we map the random forest to a
neural network and transform the classifier to a fully convolutional network.
Thereby, the processing of full images is significantly accelerated and
bounding boxes can be predicted. Finally, we integrate data of the Global
Positioning System (GPS) to localize the predictions on the map. In comparison
to Faster R-CNN and other networks for object recognition or algorithms for
transfer learning, we considerably reduce the required amount of labeled data.
We demonstrate good performance on the recognition of traffic signs for
cyclists as well as their localization in maps.

Comment: Submitted to IV 2018. This research was supported by German Research Foundation DFG within Priority Research Programme 1894 "Volunteered Geographic Information: Interpretation, Visualization and Social Computing".
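The step of transforming the patch-wise classifier into a fully convolutional network can be illustrated with a short sketch. This is a minimal, hypothetical example and not the paper's actual model (architecture, patch size, and names are assumptions): a fully connected head trained on fixed-size patches is rewritten as an equivalent convolution, so dense per-location class scores for a full image are obtained in a single forward pass.

```python
import torch
import torch.nn as nn

# Hypothetical patch-wise classifier: conv features followed by a
# fully connected head trained on fixed-size 32x32 patches.
class PatchClassifier(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.fc = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

def to_fully_convolutional(model, num_classes=10):
    """Replace the FC head with an equivalent 8x8 convolution so the
    network slides over arbitrarily large images in one pass."""
    conv_head = nn.Conv2d(32, num_classes, kernel_size=8)
    # Reuse the trained FC weights: (classes, 32*8*8) -> (classes, 32, 8, 8)
    conv_head.weight.data = model.fc.weight.data.view(num_classes, 32, 8, 8)
    conv_head.bias.data = model.fc.bias.data
    return nn.Sequential(model.features, conv_head)

patch_net = PatchClassifier()
fcn = to_fully_convolutional(patch_net)
scores = fcn(torch.randn(1, 3, 480, 640))  # dense class-score map
print(scores.shape)  # (1, 10, 113, 153): one score per patch location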
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable: they should provide easy-to-interpret rationales for their behavior, so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller (introspective explanations) or informed by the neural controller's output (rationalizations). Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles.

In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolutional network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior.

In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of the controller and the explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment: strong and weak alignment.

These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match the training data. In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed).

Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
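The visual-attention controller of Chapter 3 can be sketched compactly. The following is a toy illustration under assumed shapes, not the dissertation's actual model: spatial soft attention over CNN features produces both the steering prediction and an attention map that serves as the visual explanation. The causal filtering step could then be approximated by occluding attended regions and checking whether the prediction changes.

```python
import torch
import torch.nn as nn

class AttentionController(nn.Module):
    """Toy end-to-end controller: spatial soft attention over CNN
    features, followed by a steering-angle regressor. The attention
    map doubles as a visual explanation of what influenced the output."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
        )
        self.attn = nn.Conv2d(64, 1, 1)      # one attention logit per cell
        self.head = nn.Linear(64, 1)         # steering angle

    def forward(self, img):
        f = self.encoder(img)                               # (B, 64, H, W)
        a = torch.softmax(self.attn(f).flatten(2), dim=-1)  # (B, 1, H*W)
        ctx = (f.flatten(2) * a).sum(-1)     # attention-weighted pooling
        return self.head(ctx), a             # prediction + attention map

model = AttentionController()
steer, attention = model(torch.randn(1, 3, 90, 160))
```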
Reliable localization methods for intelligent vehicles based on environment perception
In the recent past, autonomous vehicles and Intelligent Transport
Systems (ITS) were seen as a potential future of transportation. Today, thanks to all the
technological advances in recent years, the feasibility of such systems is no longer a
question. Some of these autonomous driving technologies are already sharing our
roads, and even commercial vehicles are including more Advanced Driver-Assistance
Systems (ADAS) over the years. As a result, transportation is becoming more efficient
and the roads are considerably safer.
One of the fundamental pillars of an autonomous system is self-localization. An
accurate and reliable estimation of the vehicle’s pose in the world is essential to
navigation. Within the context of outdoor vehicles, the Global Navigation Satellite
System (GNSS) is the predominant localization system. However, these systems are
far from perfect, and their performance is degraded in environments with limited
satellite visibility. Additionally, their dependence on the environment can make
them unreliable if the environment changes.
Accordingly, the goal of this thesis is to exploit the perception of the environment
to enhance localization systems in intelligent vehicles, with special attention to
their reliability. To this end, this thesis presents several contributions: First, a study
on exploiting 3D semantic information in LiDAR odometry is presented, providing
interesting insights regarding the contribution to the odometry output of each type
of element in the scene. The experimental results have been obtained using a public
dataset and validated on a real-world platform. Second, a method to estimate the
localization error using landmark detections is proposed, which is later exploited
by a landmark placement optimization algorithm. This method, which has been
validated in a simulation environment, is able to determine a set of landmarks
such that the localization error never exceeds a predefined limit. Finally, a cooperative
localization algorithm based on a Genetic Particle Filter is proposed to utilize vehicle
detections in order to enhance the estimation provided by GNSS systems. Multiple
experiments are carried out in different simulation environments to validate the
proposed method.En un pasado no muy lejano, los vehículos autónomos y los Sistemas Inteligentes
del Transporte (ITS) se veían como un futuro para el transporte con gran potencial.
Hoy, gracias a todos los avances tecnológicos de los últimos años, la viabilidad
de estos sistemas ha dejado de ser una incógnita. Algunas de estas tecnologías
de conducción autónoma ya están compartiendo nuestras carreteras, e incluso los
vehículos comerciales cada vez incluyen más Sistemas Avanzados de Asistencia a la
Conducción (ADAS) con el paso de los años. Como resultado, el transporte es cada
vez más eficiente y las carreteras son considerablemente más seguras.
Uno de los pilares fundamentales de un sistema autónomo es la autolocalización.
Una estimación precisa y fiable de la posición del vehículo en el mundo es esencial
para la navegación. En el contexto de los vehículos circulando en exteriores, el
Sistema Global de Navegación por Satélite (GNSS) es el sistema de localización predominante.
Sin embargo, estos sistemas están lejos de ser perfectos, y su rendimiento
se degrada en entornos donde la visibilidad de los satélites es limitada. Además, los
cambios en el entorno pueden provocar cambios en la estimación, lo que los hace
poco fiables en ciertas situaciones.
Por ello, el objetivo de esta tesis es utilizar la percepción del entorno para mejorar
los sistemas de localización en vehículos inteligentes, con una especial atención a
la fiabilidad de estos sistemas. Para ello, esta tesis presenta varias aportaciones:
En primer lugar, se presenta un estudio sobre cómo aprovechar la información
semántica 3D en la odometría LiDAR, generando una base de conocimiento sobre la
contribución de cada tipo de elemento del entorno a la salida de la odometría. Los
resultados experimentales se han obtenido utilizando una base de datos pública y se
han validado en una plataforma de conducción del mundo real. En segundo lugar,
se propone un método para estimar el error de localización utilizando detecciones
de puntos de referencia, que posteriormente es explotado por un algoritmo de
optimización de posicionamiento de puntos de referencia. Este método, que ha
sido validado en un entorno de simulación, es capaz de determinar un conjunto de
puntos de referencia para el cual el error de localización nunca supere un límite
previamente fijado. Por último, se propone un algoritmo de localización cooperativa
basado en un Filtro Genético de Partículas para utilizar las detecciones de vehículos
con el fin de mejorar la estimación proporcionada por los sistemas GNSS. El método
propuesto ha sido validado mediante múltiples experimentos en diferentes entornos
de simulación.Programa de Doctorado en Ingeniería Eléctrica, Electrónica y Automática por la Universidad Carlos III de MadridSecretario: Joshué Manuel Pérez Rastelli.- Secretario: Jorge Villagrá Serrano.- Vocal: Enrique David Martí Muño
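The abstract names a Genetic Particle Filter but does not spell the algorithm out, so the following is only a generic sketch of the idea under assumed noise models: particles are weighted both by a GNSS fix and by a range measurement to a detected neighboring vehicle, then resampled with a mutation step in the style of genetic algorithms. All names, parameters, and distributions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def gnss_likelihood(particles, gnss_fix, sigma=3.0):
    """Weight particles by agreement with the (noisy) GNSS position."""
    d2 = np.sum((particles - gnss_fix) ** 2, axis=1)
    return np.exp(-d2 / (2 * sigma ** 2))

def detection_likelihood(particles, neighbor_pos, measured_range, sigma=0.5):
    """Weight particles by the range measured to a detected vehicle
    whose own (communicated) position estimate is neighbor_pos."""
    r = np.linalg.norm(particles - neighbor_pos, axis=1)
    return np.exp(-(r - measured_range) ** 2 / (2 * sigma ** 2))

def genetic_resample(particles, weights, mutation_std=0.3):
    """Resampling with a 'genetic' flavor: select parents in proportion
    to fitness (weights) and mutate offspring with Gaussian noise."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights / weights.sum())
    return particles[idx] + rng.normal(0, mutation_std, particles.shape)

# One cooperative update step: fuse GNSS with a vehicle detection.
particles = rng.normal([0.0, 0.0], 5.0, size=(500, 2))
w = gnss_likelihood(particles, gnss_fix=np.array([1.0, 2.0]))
w *= detection_likelihood(particles, neighbor_pos=np.array([10.0, 2.0]),
                          measured_range=9.0)
particles = genetic_resample(particles, w)
estimate = particles.mean(axis=0)  # fused pose estimate
```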
Perception and intelligent localization for autonomous driving
Computer vision and sensor fusion are relatively recent subjects, yet they are widely adopted in the development of autonomous robots that must adapt to their surrounding environment. This thesis approaches both in order to achieve perception in the context of autonomous driving. Using cameras to this end is a rather complex process: unlike classic sensors, which deterministically provide the same type of precise information, the successive images acquired by a camera are replete with the most varied information, all of it ambiguous and extremely difficult to extract. Still, cameras are the closest robotic sensing has come to the system of greatest importance in human perception, the visual system. Computer vision is a scientific discipline that encompasses areas such as signal processing, artificial intelligence, mathematics, control theory, neurobiology, and physics.
The platform supporting the work developed in this thesis is ROTA (RObô Triciclo Autónomo), together with all the elements that make up its environment. In this context, the thesis describes approaches introduced to solve the challenges the robot faces in its environment: detection of lane markings and their perception, detection of obstacles, traffic lights, the crosswalk zone, and the roadwork zone. It also describes a calibration system and the removal of the image perspective, developed to map the perceived elements to real-world distances. Building on the perception system, self-localization is addressed as well, integrated in a distributed architecture that allows navigation with long-term planning. All the work developed in the course of this thesis is essentially centered on robotic perception in the context of autonomous driving.
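The removal of the image perspective mentioned above is commonly implemented as an inverse perspective mapping (bird's-eye view) via a plane-to-plane homography. Below is a minimal sketch with OpenCV, assuming a flat road and four calibrated point correspondences; all coordinates and the frame itself are placeholders, not values from the thesis.

```python
import cv2
import numpy as np

# Four points on the road plane in the image (e.g. lane-marking corners,
# obtained during calibration) and their known ground coordinates in
# centimeters. All values here are illustrative placeholders.
image_pts = np.float32([[220, 460], [420, 460], [560, 680], [80, 680]])
ground_pts = np.float32([[0, 0], [200, 0], [200, 300], [0, 300]])

# Homography that removes the perspective of the (assumed planar) road.
H = cv2.getPerspectiveTransform(image_pts, ground_pts)

frame = np.zeros((720, 1280, 3), np.uint8)   # stand-in for a camera frame
birds_eye = cv2.warpPerspective(frame, H, (200, 300))

# Map a single detected point (e.g. a lane marking) to ground coordinates.
pt = cv2.perspectiveTransform(np.float32([[[300, 500]]]), H)[0, 0]
print(f"ground position: {pt[0]:.1f} cm right, {pt[1]:.1f} cm ahead")
```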
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.

Comment: This work has been submitted to the IEEE TPAMI for possible publication.
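Among the "fundamental building blocks of the detection system" the survey covers, intersection over union (IoU) and greedy non-maximum suppression (NMS) appear in virtually every detector. A compact, standalone NumPy sketch of both (not code from the survey):

```python
import numpy as np

def iou(box, boxes):
    """Intersection over union between one box and an array of boxes,
    all in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    discard boxes that overlap it too much, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        overlaps = iou(boxes[i], boxes[order[1:]])
        order = order[1:][overlaps <= iou_threshold]
    return keep
```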
Cooperative Road Sign and Traffic Light Using Near Infrared Identification and Zigbee Smartdust Technologies
Vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I as well as I2V) applications are developing very fast. They rely on telecommunication and localization technologies to detect, identify and geo-localize the sources of information (such as vehicles, roadside objects, or pedestrians). This paper presents an original approach on how two different technologies (a near infrared identification sensor and a Zigbee smartdust sensor) can work together in order to create an improved system. After an introduction of these two sensors, two concrete applications will be presented: a road sign detection application and a cooperative traffic light application. These applications show how the coupling of the two sensors enables robust detection and how they complement each other to add dynamic information to road-side objects.
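The claimed benefit of coupling the two sensors is that each compensates the other's weakness: the near-infrared sensor physically detects and localizes an identified object, while the Zigbee smartdust channel carries its dynamic state. A toy sketch of that join on a shared identifier follows; the paper does not define message formats, so every field and name here is an assumption.

```python
from dataclasses import dataclass

# Illustrative data model only: the paper does not specify message
# formats, so all fields and names here are assumptions.
@dataclass
class IRDetection:          # near-infrared sensor: identity + bearing
    object_id: int
    bearing_deg: float

@dataclass
class ZigbeeMessage:        # smartdust node: identity + dynamic state
    object_id: int
    kind: str               # e.g. "traffic_light", "road_sign"
    state: str              # e.g. "red", "green", sign contents

def fuse(ir_detections, zigbee_messages):
    """Couple the two sensors by joining on the shared object identity:
    IR confirms the object is physically there (and where), Zigbee
    attaches dynamic information to it."""
    by_id = {m.object_id: m for m in zigbee_messages}
    fused = []
    for det in ir_detections:
        msg = by_id.get(det.object_id)
        if msg is not None:   # only trust objects seen by both sensors
            fused.append((det.object_id, msg.kind, msg.state,
                          det.bearing_deg))
    return fused

print(fuse([IRDetection(7, -12.5)],
           [ZigbeeMessage(7, "traffic_light", "red")]))
```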