194 research outputs found

    Efficient Pedestrian Detection in Urban Traffic Scenes

    Pedestrians are important participants in urban traffic environments and thus form an interesting object category for autonomous cars. Automatic pedestrian detection is an essential task for protecting pedestrians from collisions. In this thesis, we investigate and develop novel approaches that interpret the spatial and temporal characteristics of pedestrians in three different aspects: shape, cognition and motion. The distinctive upright human body shape, especially the geometry of the head and shoulder area, is the characteristic that most clearly discriminates pedestrians from other object categories. Inspired by the success of Haar-like features for detecting human faces, which also exhibit a uniform shape structure, we propose particular Haar-like features for pedestrians. Tailored to a pre-defined statistical pedestrian shape model, Haar-like templates with multiple modalities are designed to describe local differences in the shape structure. Cognition theories aim to explain how the human visual system processes visual input both accurately and quickly. By emulating the center-surround mechanism of the human visual system, we design multi-channel, multi-direction and multi-scale contrast features and boost them to respond to the appearance of pedestrians; in this way, our detector can be regarded as a top-down saliency system. In the last part of this thesis, we exploit the temporal characteristics of moving pedestrians and employ motion information both for feature design and for region-of-interest (ROI) selection. Motion segmentation on optical flow fields enables us to select the blobs most likely to contain moving pedestrians; a combination of Histogram of Oriented Gradients (HOG) and motion self-difference features further enables robust detection. We test our three approaches on image and video data captured in urban traffic scenes, which are rather challenging due to dynamic and complex backgrounds. The achieved results demonstrate that our approaches reach and surpass state-of-the-art performance and can also be employed in other applications, such as indoor robotics or public surveillance.
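    As a rough illustration of the HOG-plus-motion cue described above, the following minimal sketch pairs OpenCV HOG descriptors with a crude motion self-difference measure computed from dense optical flow. It is not the thesis's actual implementation: the window size, flow parameters, and the particular self-difference definition are all assumptions.

    import cv2
    import numpy as np

    def hog_and_motion_features(prev_gray, curr_gray, window):
        """Hypothetical sketch: HOG appearance features plus a crude
        motion self-difference cue for one 64x128 detection window."""
        x, y = window
        patch = curr_gray[y:y + 128, x:x + 64]

        # Appearance: standard HOG descriptor on the candidate window.
        hog = cv2.HOGDescriptor()  # default 64x128 person window
        appearance = hog.compute(patch).ravel()

        # Motion: dense optical flow between consecutive frames.
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, curr_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        win_flow = flow[y:y + 128, x:x + 64]

        # "Self-difference" here (an assumption, not the thesis's exact
        # definition): compare mean flow in the upper and lower halves of
        # the window, since leg motion differs from torso motion.
        upper = win_flow[:64].mean(axis=(0, 1))
        lower = win_flow[64:].mean(axis=(0, 1))
        motion = np.abs(upper - lower)

        return np.concatenate([appearance, motion])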

    Driver Behavior Analysis Based on Real On-Road Driving Data in the Design of Advanced Driving Assistance Systems

    The number of vehicles on the roads increases every day. According to the National Highway Traffic Safety Administration (NHTSA), the overwhelming majority of serious crashes (over 94 percent) are caused by human error. The broad aim of this research is to develop a driver behavior model from real on-road data for the design of Advanced Driving Assistance Systems (ADASs). For several decades, these systems have been a focus of many researchers and vehicle manufacturers seeking to increase vehicle and road safety and to assist drivers in different driving situations. Some studies have concentrated on the driver as the main actor in most driving circumstances: the way a driver monitors the traffic environment partially indicates the driver's level of awareness. As an objective, we carry out a quantitative and qualitative analysis of driver behavior to identify the relationship between a driver's intentions and his/her actions. The RoadLAB project developed an instrumented vehicle equipped with an On-Board Diagnostics (OBD-II) system, a stereo imaging system, and a non-contact eye-tracker system to record synchronized driving data about the driver's cephalo-ocular behavior, the vehicle itself, and the traffic environment. We analyze several behavioral features of the drivers to uncover potentially relevant relationships between driver behavior and the anticipation of the next driver maneuver, as well as to reach a better understanding of driver behavior during the act of driving. Moreover, we detect and classify road lanes in urban and suburban areas, as they provide contextual information. Our experimental results show that our proposed models reached an F1 score of 84% for driver maneuver prediction and an accuracy of 94% for lane type classification.
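    To make the reported metrics concrete, here is a minimal sketch of how such a maneuver-prediction model might be trained and scored with scikit-learn. The random-forest classifier, the feature layout, and the random stand-in data are assumptions for illustration, not the dissertation's actual pipeline.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, f1_score
    from sklearn.model_selection import train_test_split

    # Hypothetical synchronized samples: each row bundles cephalo-ocular,
    # vehicle (OBD-II) and environment features; labels are upcoming
    # maneuvers (0 = keep lane, 1 = turn left, 2 = turn right).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 12))      # stand-in for real RoadLAB features
    y = rng.integers(0, 3, size=1000)    # stand-in for maneuver labels

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                              random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)

    print("F1 (macro):", f1_score(y_te, pred, average="macro"))
    print("Accuracy:  ", accuracy_score(y_te, pred))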

    Predictive Model of Driver's Eye Fixation for Maneuver Prediction in the Design of Advanced Driving Assistance Systems

    Over the last few years, Advanced Driver Assistance Systems (ADAS) have been shown to significantly reduce the number of vehicle accidents. According to the National Highway Traffic Safety Administration (NHTSA), driver errors contribute to 94% of road collisions. This research aims to develop a predictive model of driver eye fixation by analyzing eye and head (cephalo-ocular) information for maneuver prediction in an Advanced Driving Assistance System (ADAS). Several ADASs have been developed to help drivers perform driving tasks in complex environments, and many studies have been conducted on improving automated systems. Some research has relied on the fact that the driver plays a crucial role in most driving scenarios, recognizing the driver as the central element in ADASs. The way in which a driver monitors the surrounding environment is at least partially descriptive of the driver's situation awareness. This thesis's primary goal is the quantitative and qualitative analysis of driver behavior to determine the relationship between driver intent and actions. The RoadLab initiative provided an instrumented vehicle equipped with an on-board diagnostic system, an eye-gaze tracker, and a stereo vision system for the extraction of relevant features from the driver, the vehicle, and the environment. Several driver behavioral features are investigated to determine whether there is a relevant relation between the driver's eye fixations and the prediction of driving maneuvers.
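    Eye-tracker output is typically a raw stream of gaze points from which fixations must first be extracted before they can be related to maneuvers. Purely as an illustration (the thesis does not specify its fixation algorithm here), a minimal dispersion-threshold (I-DT) fixation detector might look like the sketch below; both thresholds are assumed values.

    def idt_fixations(gaze, max_dispersion=1.0, min_samples=6):
        """Minimal dispersion-threshold (I-DT) sketch.

        gaze: list of (x, y) gaze points in degrees of visual angle,
        sampled at a fixed rate. Returns (start, end) index pairs of
        detected fixations. Thresholds are illustrative assumptions.
        """
        fixations, start = [], 0
        while start < len(gaze):
            end = start + min_samples
            if end > len(gaze):
                break
            xs, ys = zip(*gaze[start:end])
            # Dispersion = (max x - min x) + (max y - min y) over the window.
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) <= max_dispersion:
                # Grow the window while dispersion stays under the threshold.
                while end < len(gaze):
                    xs = xs + (gaze[end][0],)
                    ys = ys + (gaze[end][1],)
                    if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                        break
                    end += 1
                fixations.append((start, end))
                start = end
            else:
                start += 1
        return fixations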

    Visual Analysis in Traffic & Re-identification


    Deep reinforcement learning for multi-modal embodied navigation

    This work focuses on an Outdoor Micro-Navigation (OMN) task in which the goal is to navigate to a specified street address using multiple modalities, including images, scene text, and GPS. This task is a significant challenge for many Blind and Visually Impaired (BVI) people, which we demonstrate through interviews and market research, and we restrict our problem definition to their needs. To investigate the feasibility of solving this task with Deep Reinforcement Learning (DRL), we first introduce two partially observable grid-worlds, Grid-Street and Grid City, containing houses, street numbers, and navigable regions. In these environments, we train an agent to find specific houses using local observations under a variety of training procedures; the agent is parameterized by a neural network and trained with reinforcement learning methods. Next, we introduce the Sidewalk Environment for Visual Navigation (SEVN), which contains panoramic images with labels for house numbers, doors, and street name signs, together with formulations for several navigation tasks. In SEVN, we train another neural network model using Proximal Policy Optimization (PPO) to fuse multi-modal observations in the form of variable-resolution images, visible text, and simulated GPS data, and to use this representation to navigate to goal doors. Our best model used all available modalities and was able to navigate to over 100 goals with an 85% success rate. We found that models with access to only a subset of these modalities performed significantly worse, supporting the need for a multi-modal approach to the OMN task. We hope that this thesis provides a foundation for further research into the creation of agents that assist members of the BVI community in navigating the world safely.
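    To suggest what such a multi-modal fusion policy might look like (a generic sketch, not the actual SEVN model; the encoder sizes, modality shapes, and action space are all assumptions), image, text, and GPS encoders could be combined in PyTorch as follows:

    import torch
    import torch.nn as nn

    class FusionPolicy(nn.Module):
        """Illustrative multi-modal policy: encodes an image, a bag of
        recognized text tokens, and a 2-D GPS offset, then fuses them
        into actor and critic heads for PPO. Sizes are assumptions."""

        def __init__(self, vocab_size=1000, n_actions=4):
            super().__init__()
            self.image_enc = nn.Sequential(        # tiny CNN for 3x84x84 input
                nn.Conv2d(3, 16, 8, stride=4), nn.ReLU(),
                nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
                nn.Flatten(), nn.LazyLinear(128), nn.ReLU())
            self.text_enc = nn.EmbeddingBag(vocab_size, 32)  # mean over tokens
            self.gps_enc = nn.Sequential(nn.Linear(2, 32), nn.ReLU())
            self.policy = nn.Linear(128 + 32 + 32, n_actions)  # actor head
            self.value = nn.Linear(128 + 32 + 32, 1)           # critic head

        def forward(self, image, text_tokens, gps):
            z = torch.cat([self.image_enc(image),
                           self.text_enc(text_tokens),
                           self.gps_enc(gps)], dim=-1)
            return self.policy(z), self.value(z)

    # Dummy batch just to show the expected shapes.
    policy = FusionPolicy()
    logits, value = policy(torch.zeros(8, 3, 84, 84),
                           torch.zeros(8, 5, dtype=torch.long),
                           torch.zeros(8, 2))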

    A PhD Dissertation on Road Topology Classification for Autonomous Driving

    Road topology classification is a crucial point if we want to develop complete and safe autonomous driving systems. It is logical to think that a thorough understanding of the environment surrounding the ego-vehicle, as happens when a human being is the decision-maker at the wheel, is an indispensable condition if we want to advance towards level 4 or 5 autonomous vehicles. If the driver, whether an autonomous system or a human being, does not have access to information about the environment, the decrease in safety is critical and an accident is almost instantaneous, e.g., when a driver falls asleep at the wheel. Throughout this doctoral thesis, we present two deep learning systems that help an autonomous driving system understand the environment it is in at a given instant. The first one, 3D-Deep, and its optimization, 3D-Deepest, is a new network architecture for semantic road segmentation that integrates data sources of different types. Road segmentation is vital for an autonomous vehicle, since the road is the medium on which it should drive in 99.9% of cases. The second is an urban intersection classification system using different approaches based on metric learning, temporal integration, and synthetic image generation. Safety is a crucial point in any autonomous system, and even more so in a driving system. Intersections are among the places within cities where safety is most critical: cars follow intersecting trajectories and can therefore collide, and pedestrians cross the road at most intersections regardless of whether there are crosswalks, which alarmingly increases the risk of run-overs and collisions. Combining both systems substantially improves the understanding of the environment and can be considered to increase safety, paving the way towards a fully autonomous vehicle.
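    To give a flavor of the metric-learning approach to intersection classification (a generic sketch under assumed sizes and margin, not the thesis's architecture), a triplet-loss setup could look like this:

    import torch
    import torch.nn as nn

    # Hypothetical embedding network mapping an intersection image to a
    # 64-D descriptor; same-topology intersections should land close
    # together in the embedding space.
    embed = nn.Sequential(
        nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
        nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, 64))

    triplet = nn.TripletMarginLoss(margin=1.0)

    # anchor/positive share the same intersection topology class,
    # negative comes from a different class (dummy tensors shown here).
    anchor = embed(torch.randn(8, 3, 96, 96))
    positive = embed(torch.randn(8, 3, 96, 96))
    negative = embed(torch.randn(8, 3, 96, 96))
    loss = triplet(anchor, positive, negative)
    loss.backward()

    # At test time, a query embedding would be classified by nearest
    # neighbors among labeled reference embeddings.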

    A monocular color vision system for road intersection detection

    Urban driving has become a focus of autonomous robotics in recent years. Many groups seek to benefit from research in this field, including the military, which hopes to deploy autonomous rescue forces to battle-torn cities, and consumers, who will benefit from the safety and convenience resulting from new technologies finding their way into consumer automobiles. One key aspect of autonomous urban driving is localization, the ability of the robot to determine its position on a road network. Any information that can be obtained about the surrounding area, including stop signs, road lines, and intersecting roads, can aid this localization. The work here combines previously established computer vision methods for identifying roads and develops a new method that can identify both the road and any intersecting roads present in front of a vehicle using a single color camera. Computer vision systems rely on a few basic methods to understand and identify what they are looking at; two valuable ones are the detection of edges present in an image and the analysis of the colors that compose it. The method described here uses edge information to find road lines and color information to find the road area and any similarly colored intersecting roads. This work demonstrates that combining edge detection and color analysis leverages their respective strengths, compensates for their weaknesses, and yields a method that can successfully detect road lanes and intersecting roads at speeds fast enough for use in autonomous urban driving.
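    A compressed sketch of the edge-plus-color idea follows. It is purely illustrative: the HSV color range, Canny thresholds, Hough parameters, and the crude intersection heuristic are assumptions, not the thesis's tuned values.

    import cv2
    import numpy as np

    def detect_road_and_lines(bgr):
        """Illustrative combination of color analysis (road region) and
        edge detection (lane lines) on a single color image."""
        # Color cue: roughly gray, low-saturation pixels as road candidates.
        hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
        road_mask = cv2.inRange(hsv, (0, 0, 40), (180, 60, 200))

        # Edge cue: Canny edges, then probabilistic Hough for line segments.
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)
        lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                                minLineLength=40, maxLineGap=10)

        # A laterally wide road mask in the lower image could hint at an
        # intersecting road; this crude width test stands in for that idea.
        lower = road_mask[int(road_mask.shape[0] * 0.6):]
        widths = lower.sum(axis=1) / 255
        has_intersection = widths.max() > 0.8 * road_mask.shape[1]

        return road_mask, lines, has_intersection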