DeepSignals: Predicting Intent of Drivers Through Visual Signals
Detecting the intention of drivers is an essential task in self-driving,
necessary to anticipate sudden events like lane changes and stops. Turn signals
and emergency flashers communicate such intentions, providing seconds of
potentially critical reaction time. In this paper, we propose to detect these
signals in video sequences by using a deep neural network that reasons about
both spatial and temporal information. Our experiments on more than a million
frames show high per-frame accuracy in very challenging scenarios.
Comment: To be presented at the IEEE International Conference on Robotics and
Automation (ICRA), 201
Nighttime Driver Behavior Prediction Using Taillight Signal Recognition via CNN-SVM Classifier
This paper aims to enhance the ability to predict nighttime driving behavior
by identifying taillights of both human-driven and autonomous vehicles. The
proposed model incorporates a customized detector designed to accurately detect
front-vehicle taillights on the road. At the beginning of the detector, a
learnable pre-processing block is implemented, which extracts deep features
from input images and calculates the data rarity for each feature. In the next
step, drawing inspiration from soft attention, a weighted binary mask is
designed that guides the model to focus more on predetermined regions. This
research utilizes Convolutional Neural Networks (CNNs) to extract
distinguishing characteristics from these areas, then reduces dimensions using
Principal Component Analysis (PCA). Finally, the Support Vector Machine (SVM)
is used to predict the behavior of the vehicles. To train and evaluate the
model, a large-scale dataset is collected from two types of dash-cams and
Insta360 cameras from the rear view of Ford Motor Company vehicles. This
dataset includes over 12k frames captured during both daytime and nighttime
hours. To address the limited nighttime data, a unique pixel-wise image
processing technique is implemented to convert daytime images into realistic
night images. The findings from the experiments demonstrate that the proposed
methodology can accurately categorize vehicle behavior with 92.14% accuracy,
97.38% specificity, 92.09% sensitivity, 92.10% F1-measure, and 0.895 Cohen's
Kappa Statistic. Further details are available at
https://github.com/DeepCar/Taillight_Recognition.
Comment: 12 pages, 10 figures
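The metrics reported in this abstract (accuracy, specificity, sensitivity, F1-measure, Cohen's Kappa) can all be derived from a confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. The function name `classification_metrics`, the choice of macro-averaging over classes, and the toy 2x2 matrix are assumptions made for illustration, and they do not reproduce the paper's figures.

```python
from typing import Dict, List


def classification_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute summary metrics from a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Per-class values are macro-averaged (assumption).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(k)]                       # true counts
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts

    accuracy = sum(cm[i][i] for i in range(k)) / total

    sens, spec, f1 = [], [], []
    for c in range(k):
        tp = cm[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        sens.append(recall)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        denom = precision + recall
        f1.append(2 * precision * recall / denom if denom else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(row_sums[c] * col_sums[c] for c in range(k)) / (total * total)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sum(sens) / k,
        "specificity": sum(spec) / k,
        "f1": sum(f1) / k,
        "kappa": kappa,
    }


# Hypothetical toy matrix (not data from the paper).
toy_cm = [[8, 2],
          [1, 9]]
metrics = classification_metrics(toy_cm)
```

Kappa is lower than accuracy whenever some agreement would be expected by chance, which is why the two are usually reported together.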
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of controller and explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment, strong- and weak-alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match training data. 
In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
A comparison among deep learning techniques in an autonomous driving context
Artificial intelligence is currently one of the research fields receiving ever more attention. The growth in computing power available to researchers and developers is reviving the potential that was expressed at a theoretical level in the early days of artificial intelligence. Among all the fields of artificial intelligence, the one currently attracting the most interest is autonomous driving. Many car manufacturers and the most prestigious American colleges are investing more and more resources in this technology. Researching and describing the wide range of technologies available for autonomous driving is part of the comparison carried out in this thesis. The case study focuses on a company that, starting from scratch, would like to develop an autonomous driving system without data, in a short time, and using only sensors it builds itself. Starting from neural networks and classical algorithms, the work goes as far as algorithms such as A3C in order to describe the full range of possibilities. The selected technologies are compared in two experiments. The first is a pure computer vision experiment using DeepTesla. It compares traditional computer vision techniques, CNNs, and CNNs combined with LSTMs. The goal is to identify which algorithm performs best when processing images only. The second is an experiment on CARLA, a simulator based on Unreal Engine. In this experiment, the results obtained in the simulated environment with CNNs combined with LSTMs are compared with the results obtained with A3C. The goal is to understand whether these techniques can drive autonomously using the data provided by the simulator. The comparison aims to identify the weaknesses and possible future improvements of each proposed algorithm, in order to find a feasible solution that delivers very good results in a short time.
End to End Learning in Autonomous Driving Systems
Convolutional neural networks have advanced visual perception significantly in recent years. Two major ingredients that enable this success are the composition of simple modules into a complex network and end-to-end optimization. However, this success has not yet revolutionized robotics as much as vision, even though robotics suffers from similar problems as traditional computer vision, i.e. the imperfection of manually designed system pipelines. This thesis investigates end-to-end learning for the autonomous driving system, a concrete robotic application. End-to-end learning can produce reasonable driving behaviors, even in complex urban driving scenarios. Representation learning in end-to-end driving models is crucial, and auxiliary vision tasks such as semantic segmentation can help to form a more informative driving representation, especially when training data is limited. Naive convolutional neural networks are usually only capable of reactive control and cannot perform complex reasoning in a particular scenario. This thesis also studies how to handle scene-conditioned driving behavior, which goes beyond the capability of reactive control. Alongside the end-to-end structure, learning methods also play a critical role. Imitation learning methods acquire meaningful behaviors, but the robot usually cannot master the skill. Reinforcement learning, on the contrary, either barely learns anything if the environment is too complex or masters the skill otherwise. To get the best of both worlds, this thesis proposes an algorithmically unified method to learn from both demonstration data and the environment.
Machine Learning for Next-Generation Intelligent Transportation Systems: A Survey
Intelligent Transportation Systems, or ITS for short, include a variety of services and applications such as road traffic management, traveler information systems, public transit system management, and autonomous vehicles, to name a few. It is expected that ITS will be an integral part of urban planning and future cities as it will contribute to improved road and traffic safety, transportation and transit efficiency, as well as to increased energy efficiency and reduced environmental pollution. On the other hand, ITS poses a variety of challenges due to its scalability and diverse quality-of-service needs, as well as the massive amounts of data it will generate. In this survey, we explore the use of Machine Learning (ML), which has recently gained significant traction, to enable ITS. We provide a comprehensive survey of the current state of the art of how ML technology has been applied to a broad range of ITS applications and services, such as cooperative driving and road hazard warning, and identify future directions for how ITS can use and benefit from ML technology.
Marijuana Intoxication Detection Using Convolutional Neural Network
Machine learning is a broad area of computer science, widely used for data analysis, whose algorithms can learn and improve with experience through training. Its methods include supervised learning, unsupervised learning, dimensionality reduction, and deep learning. These techniques are applied in fields such as medicine, automotive, finance, and many more. In this thesis, a convolutional neural network (CNN), a deep learning technique, is applied to identify whether a person is under the influence of marijuana or sober, using changes in facial features such as redness of the eyes, watery eyes, and drowsiness after smoking marijuana. CNNs are a state-of-the-art method for tasks such as image classification and pattern recognition. Because a CNN learns from training on an image dataset, it is a suitable method for identifying a person's sobriety from facial features. The proposed methodology is divided into three components: dataset creation, face detection to extract the input image from real-time video, and tuning and training the CNN model to make a prediction. The purpose of this thesis is to develop a CNN model that may help reduce impaired driving incidents if implemented in vehicles in the future. Impaired driving is a major criminal cause of vehicle accidents in Canada. It is a serious problem that endangers the lives of pedestrians and of the impaired drivers themselves. This thesis presents how machine learning can be applied to predict a driver's sobriety, which may help reduce impaired driving incidents in the future if implemented in vehicles.
Trajectory Prediction with Event-Based Cameras for Robotics Applications
This thesis presents the study, analysis, and implementation of a framework to perform trajectory prediction using an event-based camera for robotics applications. Event-based perception represents a novel computation paradigm based on unconventional sensing technology that holds promise for data acquisition, transmission, and processing at very low latency and power consumption, crucial in the future of robotics. An event-based camera, in particular, is a sensor that responds to light changes in the scene, producing an asynchronous and sparse output over a wide illumination dynamic range. Such cameras capture only relevant spatio-temporal information - mostly driven by motion - at a high rate, avoiding the inherent redundancy in static areas of the field of view. For these reasons, this device is a potential key tool for robots that must function in highly dynamic and/or rapidly changing scenarios, or where the optimisation of resources is fundamental, like robots with on-board systems. Prediction skills are something humans rely on daily - even unconsciously - for instance when driving, playing sports, or collaborating with other people. In the same way, predicting the trajectory or the end-point of a moving target allows a robot to plan appropriate actions and their timing in advance, and to interact with the target in many different ways. Moreover, prediction also helps compensate for the robot's internal delays in the perception-action chain, due for instance to limited sensors and/or actuators. The question I address in this work is whether event-based cameras are advantageous in trajectory prediction for robotics. In particular, I ask whether the classical deep learning architectures used for this task can accommodate event-based data, working asynchronously, and what benefits they bring with respect to standard cameras.
The a priori hypothesis is that, because the sampling of the scene is driven by motion, such a device would allow more meaningful information acquisition, improving the prediction accuracy and processing data only when needed - without any information loss or redundant acquisition. To test the hypothesis, experiments are mostly carried out using the neuromorphic iCub, a custom version of the iCub humanoid platform that mounts two event-based cameras in the eyeballs, along with standard RGB cameras. To further motivate the work on iCub, a preliminary step is the evaluation of the robot's internal delays, which the prediction should compensate for in order to interact in real time with the perceived object.
The first part of this thesis covers the implementation of the event-based framework for prediction, to answer the question of whether Long Short-Term Memory neural networks, the architecture used in this work, can be combined with event-based cameras. The task considered is a handover in human-robot interaction, during which the trajectory of the object in the human's hand must be inferred. Results show that the proposed pipeline can predict both spatial and temporal coordinates of the incoming trajectory with higher accuracy than model-based regression methods.
Moreover, the pipeline exhibits fast recovery from failure cases and an adaptive prediction horizon. Next, I investigated how advantageous the event-based sampling approach is compared with the classical fixed-rate approach. The test case is the trajectory prediction of a bouncing ball, implemented with the pipeline previously introduced. The two sampling methods are compared in terms of error at different working rates, showing how the spatial sampling of the event-based approach achieves lower error and also adapts the computational load dynamically, depending on the motion in the scene. Results from both works show that merging event-based data and Long Short-Term Memory networks looks promising for spatio-temporal feature prediction in highly dynamic tasks, and paves the way for further studies of the temporal aspect and for a wide range of applications, not only robotics-related. Ongoing work is now focusing on the robot control side, finding the best way to exploit the spatio-temporal information provided by the predictor and defining the optimal robot behavior.
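The contrast between fixed-rate and event-based sampling can be illustrated on a one-dimensional signal. The sketch below is a simplified stand-in, not the thesis pipeline: it uses a send-on-delta rule (emit a sample only when the value has changed by at least a threshold since the last emitted sample), whereas a real event camera applies a comparable rule per pixel to brightness changes. The function names, the threshold, and the synthetic signal are all hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[int, int]  # (time index, value)


def fixed_rate_samples(signal: List[int], period: int) -> List[Sample]:
    """Keep every `period`-th sample, regardless of what the signal does."""
    return [(i, signal[i]) for i in range(0, len(signal), period)]


def event_samples(signal: List[int], threshold: int) -> List[Sample]:
    """Keep a sample only when it differs from the last kept one by >= threshold."""
    samples = [(0, signal[0])]
    last = signal[0]
    for i in range(1, len(signal)):
        if abs(signal[i] - last) >= threshold:
            samples.append((i, signal[i]))
            last = signal[i]
    return samples


# Hypothetical target position (integer units): still for 50 steps, then moving.
signal = [0] * 50 + list(range(1, 51))

fixed = fixed_rate_samples(signal, period=10)
events = event_samples(signal, threshold=5)

# Count how many samples each scheme spends on the static half of the signal.
fixed_static = sum(1 for i, _ in fixed if i < 50)
event_static = sum(1 for i, _ in events if i < 50)
```

On this toy signal the fixed-rate scheme spends half of its samples on the static segment, while the event-driven scheme emits only the initial sample there and concentrates the rest where the target moves.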
Future work will see the shift of the full pipeline - prediction and robot control - to a spiking implementation. First steps in this direction have already been made thanks to a collaboration with a group from the University of Zurich, with which I propose a closed-loop motor controller implemented on a mixed-signal analog/digital neuromorphic processor, emulating a classical PID controller by means of spiking neural networks.
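For reference, the classical PID controller that the spiking implementation emulates can be sketched in a few lines of conventional software. This is only the textbook discrete-time form, not the neuromorphic controller described in the thesis. The class name `PID`, the gains, the time step, and the toy integrator plant in `simulate` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete-time PID controller (hypothetical gains)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, setpoint: float, measurement: float, dt: float) -> float:
        error = setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first call, to avoid a spurious kick.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps: int = 2000, dt: float = 0.01, setpoint: float = 1.0) -> float:
    """Drive a toy plant (position integrates the command) toward the setpoint."""
    controller = PID(kp=2.0, ki=0.5, kd=0.1)
    position = 0.0
    for _ in range(steps):
        command = controller.step(setpoint, position, dt)
        position += command * dt  # assumed plant: dx/dt = command
    return position


final_position = simulate()
```

With these assumed gains the closed loop is stable and the position settles close to the setpoint within the simulated 20 seconds.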