3,099 research outputs found
Low-Complexity Driving Event Detection from Side Information of a 3D Video Encoder
Mobile phones are often found in cars, for instance when they are used as navigation assistants. This work proposes to use their camera, which is often already pointed at the road, to perform low-complexity analysis of the driving context, with the final aim of detecting potentially unsafe conditions. Since content-understanding algorithms are typically too complex to run in real time on a mobile device, a driving event detection algorithm is presented based on the side information available from video encoders, which are highly optimized components of mobile phones. A set of informative and easy-to-extract features has been identified in the side information and then further reduced and adapted to the specific events of interest. A detection algorithm based on support vector machines has been designed and trained on several hours of video annotated by a human operator to extract the events of interest. The detection algorithm is shown to achieve a good identification rate for the considered events and feature sets. Moreover, results also show that the use of a stereoscopic camera significantly improves the performance of the detection algorithm in most cases.
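The abstract does not specify which side-information features are used; the sketch below is a hypothetical illustration of the general idea, summarizing a frame's motion vectors and intra-coding ratio (quantities most encoders expose cheaply) into a small feature vector that a classifier such as an SVM could consume. All names and the feature choices are assumptions, not taken from the paper.

```python
import math

def frame_features(motion_vectors, intra_ratio):
    """Summarize one frame's encoder side information as a feature vector.

    motion_vectors: list of (dx, dy) motion vectors, one per macroblock
    intra_ratio: fraction of intra-coded macroblocks (a scene-change proxy)
    """
    n = len(motion_vectors)
    mags = [math.hypot(dx, dy) for dx, dy in motion_vectors]
    mean_mag = sum(mags) / n
    var_mag = sum((m - mean_mag) ** 2 for m in mags) / n
    # A horizontal bias in the motion field hints at turning; a vertical
    # bias hints at braking-induced pitch.
    mean_dx = sum(dx for dx, _ in motion_vectors) / n
    mean_dy = sum(dy for _, dy in motion_vectors) / n
    return [mean_mag, var_mag, mean_dx, mean_dy, intra_ratio]

# Example: a mostly rightward motion field, as during a turn
mvs = [(4.0, 0.5), (5.0, -0.5), (4.5, 0.0), (5.5, 0.0)]
features = frame_features(mvs, intra_ratio=0.1)
```

Because these statistics come from data the encoder has already computed, extracting them adds almost no load on the device, which is the point of the side-information approach.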
Predicting pedestrian crossing intentions using contextual information
The urban environment is one of the most complex scenarios for an autonomous vehicle,
as it is shared with other types of users known as vulnerable road users, with pedestrians
as their principal representative. These users are characterized by their highly dynamic
behavior. Despite the large number of interactions between vehicles and pedestrians, the
safety of pedestrians has not increased at the same rate as that of vehicle occupants. For
this reason, it is necessary to address this problem. One possible strategy would be to
anticipate pedestrian behavior to minimize risky situations, especially during the crossing.
The objective of this doctoral thesis is to achieve such anticipation through the development
of pedestrian crossing-action prediction techniques based on deep learning.
Before the design and implementation of the prediction systems, a classification system
has been developed to discern the pedestrians involved in the road scene. The system,
based on convolutional neural networks, has been trained and validated with a customized
dataset. This set has been built from several existing sets and augmented by including
images obtained from the Internet. This pre-anticipation step would reduce unnecessary
processing within the vehicle perception system.
After this step, two systems have been developed to address the prediction problem.
The first system is composed of convolutional and recurrent encoder networks. It
obtains a short-term prediction of the crossing action performed one second in the future.
The input information to the model is mainly image-based, which provides additional
pedestrian context. In addition, the use of pedestrian-related variables and architectural
improvements allows better results on the JAAD dataset.
The second system is an end-to-end architecture based on the combination of
three-dimensional convolutional neural networks and/or the Transformer architecture encoder.
In this model, most of the proposed and investigated improvements are focused on transformations
of the input data. After an extensive set of individual tests, several models
have been trained, evaluated, and compared with other methods using both JAAD and
PIE datasets. The obtained results are among the best in the state of the art, validating the
proposed architecture.
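The second system's improvements center on input-data transformations, but the abstract does not detail them. A common preprocessing step in crossing-prediction pipelines on JAAD/PIE-style data is to normalize the pedestrian's bounding-box sequence to image-relative coordinates and derive frame-to-frame displacements as a motion cue; the sketch below illustrates that kind of transformation under those assumptions, with illustrative names and data.

```python
def normalize_boxes(boxes, img_w, img_h):
    """Scale (x1, y1, x2, y2) pixel boxes to [0, 1] image coordinates."""
    return [(x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h)
            for x1, y1, x2, y2 in boxes]

def box_deltas(norm_boxes):
    """Frame-to-frame displacement of the box center, a simple motion cue."""
    centers = [((x1 + x2) / 2, (y1 + y2) / 2)
               for x1, y1, x2, y2 in norm_boxes]
    return [(cx2 - cx1, cy2 - cy1)
            for (cx1, cy1), (cx2, cy2) in zip(centers, centers[1:])]

# A pedestrian track drifting rightward across a 1920x1080 image
track = [(100, 500, 150, 700), (196, 500, 246, 700)]
norm = normalize_boxes(track, 1920, 1080)
deltas = box_deltas(norm)
```

Normalizing to image-relative units makes the sequence comparable across cameras and resolutions before it is fed to a 3D-CNN or Transformer encoder.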
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of controller and explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment, strong- and weak-alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match training data. 
In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
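The causal filtering step described for the visual explanations can be sketched abstractly: occlude each attended region and keep only those whose removal measurably changes the controller's output. This is a minimal stand-in with a toy controller, not the dissertation's implementation; all names and the threshold are assumptions.

```python
def masked(image, region):
    """Zero out a rectangular region (x0, y0, x1, y1) of a 2D image."""
    x0, y0, x1, y1 = region
    out = [row[:] for row in image]
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = 0.0
    return out

def causal_filter(model, image, regions, eps=1e-3):
    """Keep only attended regions whose occlusion changes the model output."""
    base = model(image)
    return [r for r in regions if abs(model(masked(image, r)) - base) > eps]

# Toy controller whose "steering" depends only on the top row of the image
model = lambda img: sum(img[0])
image = [[1.0] * 4 for _ in range(4)]
regions = [(0, 0, 2, 1),  # overlaps the top row -> truly causal
           (0, 2, 2, 4)]  # bottom rows only -> spurious attention
kept = causal_filter(model, image, regions)
```

The spurious region is filtered out because occluding it leaves the output unchanged, which is exactly how causal filtering prunes attention maps down to the regions that actually drive the control signal.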
Pattern Anomaly Detection based on Sequence-to-Sequence Regularity Learning
Anomaly detection in traffic surveillance videos is a challenging task due to the ambiguity of anomaly definition and the complexity of scenes. In this paper, we propose to detect anomalous trajectories for vehicle behavior analysis via learning regularities in data. First, we train a sequence-to-sequence model under the autoencoder architecture and propose a new reconstruction error function for model optimization and anomaly evaluation. As such, the model is forced to learn the regular trajectory patterns in an unsupervised manner. Then, at the inference stage, we use the learned model to encode the test trajectory sample into a compact representation and generate a new trajectory sequence in the learned regular pattern. An anomaly score is computed based on the deviation of the generated trajectory from the test sample. Finally, we can find out the anomalous trajectories with an adaptive threshold. We evaluate the proposed method on two real-world traffic datasets and the experiments show favorable results against state-of-the-art algorithms. This paper's research on sequence-to-sequence regularity learning can provide theoretical and practical support for pattern anomaly detection.
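The scoring pipeline above (reconstruct the trajectory in the learned regular pattern, then score the deviation) can be sketched with a deliberately simple stand-in: here a moving average plays the role of the learned seq2seq decoder, since it, too, reproduces smooth "regular" trajectories well and erratic ones poorly. The model, names, and data are illustrative assumptions, not the paper's method.

```python
import math

def reconstruct(traj):
    """Stand-in for the learned seq2seq decoder: a 3-point moving average
    that approximates smooth ('regular') trajectories and fails on zigzags."""
    n = len(traj)
    out = []
    for i in range(n):
        window = traj[max(0, i - 1):min(n, i + 2)]
        out.append((sum(p[0] for p in window) / len(window),
                    sum(p[1] for p in window) / len(window)))
    return out

def anomaly_score(traj):
    """Mean point-wise deviation between a trajectory and its reconstruction;
    an adaptive threshold would then be set from scores on normal data."""
    rec = reconstruct(traj)
    return sum(math.dist(p, q) for p, q in zip(traj, rec)) / len(traj)

smooth = [(i, i) for i in range(8)]                 # regular, straight motion
zigzag = [(i, (-1) ** i * 2.0) for i in range(8)]   # erratic motion
```

A trajectory that follows the regular pattern reconstructs almost exactly and scores near zero, while an erratic one accumulates large deviations, which is the property the adaptive threshold exploits.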
Combined Learned and Classical Methods for Real-Time Visual Perception in Autonomous Driving
Autonomy, robotics, and Artificial Intelligence (AI) are among the main defining themes of next-generation societies. Among the most important applications of these technologies is driving automation, which spans from different Advanced Driver Assistance Systems (ADAS) to full self-driving vehicles. Driving automation promises to reduce accidents, increase safety, and increase access to mobility for more people, such as the elderly and the handicapped. However, one of the main challenges facing autonomous vehicles is robust perception, which can enable safe interaction and decision making. Among the many sensors used to perceive the environment, each with its own capabilities and limitations, vision is by far one of the main sensing modalities. Cameras are cheap and can provide rich information about the observed scene. Therefore, this dissertation develops a set of visual perception algorithms with a focus on autonomous driving as the target application area. This dissertation starts by addressing the problem of real-time motion estimation of an agent using only the visual input from a camera attached to it, a problem known as visual odometry. The visual odometry algorithm can achieve low drift rates over long traveled distances. This is made possible through the innovative local mapping approach used. This visual odometry algorithm was then combined with my multi-object detection and tracking system. The tracking system operates in a tracking-by-detection paradigm where an object detector based on convolutional neural networks (CNNs) is used. Therefore, the combined system can detect and track other traffic participants both in the image domain and in the 3D world frame while simultaneously estimating vehicle motion. This is a necessary requirement for obstacle avoidance and safe navigation. Finally, the operational range of traditional monocular cameras was expanded with the capability to infer depth and thus replace stereo and RGB-D cameras.
This is accomplished through a single-stream convolutional neural network which can output both depth prediction and semantic segmentation. Semantic segmentation is the process of classifying each pixel in an image and is an important step toward scene understanding. A literature survey, algorithm descriptions, and comprehensive evaluations on real-world datasets are presented. Ph.D. dissertation, College of Engineering & Computer Science, University of Michigan. https://deepblue.lib.umich.edu/bitstream/2027.42/153989/1/Mohamed Aladem Final Dissertation.pdf
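The tracking-by-detection paradigm mentioned above is commonly built on associating existing track boxes with new detections by overlap; the sketch below shows a greedy IoU-based association, a standard baseline for this step. It is an illustrative assumption about the pipeline, not the dissertation's specific tracker.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(tracks, detections, thr=0.3):
    """Greedily match track boxes to detections in decreasing IoU order."""
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)),
                   reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < thr or ti in used_t or di in used_d:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    return matches

# Two tracks and two slightly shifted detections in the next frame
tracks = [(0, 0, 10, 10), (50, 50, 60, 60)]
dets = [(52, 50, 62, 60), (1, 0, 11, 10)]
matches = associate(tracks, dets)
```

Unmatched detections would spawn new tracks and unmatched tracks would age out, which, combined with ego-motion from visual odometry, lets the system place tracked objects in the 3D world frame.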