Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.
Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 pages
Intent prediction of vulnerable road users for trusted autonomous vehicles
This study investigated how vulnerable road users (such as pedestrians and cyclists) could come to place greater trust in the autonomous vehicles they will interact with in urban traffic environments. It focused on understanding the behaviour of such road users at a deeper level by predicting their future intentions using only vehicle-based sensors and AI techniques. The findings showed that personal and body-language attributes of vulnerable road users, in addition to their past motion trajectories and physics attributes in the environment, led to more accurate predictions of their intended actions.
Predicting pedestrian crossing intentions using contextual information
The urban environment is one of the most complex scenarios for an autonomous vehicle,
as it is shared with other types of users known as vulnerable road users, with pedestrians
as their principal representative. These users are characterized by their great dynamicity.
Despite a large number of interactions between vehicles and pedestrians, the safety of
pedestrians has not increased at the same rate as that of vehicle occupants. For this
reason, it is necessary to address this problem. One possible strategy would be anticipating
pedestrian behavior to minimize risky situations, especially during the crossing.
The objective of this doctoral thesis is to achieve such anticipation through the development
of pedestrian crossing action prediction techniques based on deep learning.
Before the design and implementation of the prediction systems, a classification system
has been developed to discern the pedestrians involved in the road scene. The system,
based on convolutional neural networks, has been trained and validated with a customized
dataset. This set has been built from several existing sets and augmented by including
images obtained from the Internet. This pre-anticipation step would reduce unnecessary
processing within the vehicle perception system.
After this step, two systems have been developed as a proposal to solve the prediction
problem.
The first system is composed of convolutional and recurrent encoder networks. It
obtains a short-term prediction of the crossing action performed one second in the future.
The input information to the model is mainly image-based, which provides additional
pedestrian context. In addition, the use of pedestrian-related variables and architectural
improvements allows better results on the JAAD dataset.
The second system is an end-to-end architecture based on the combination of three-dimensional
convolutional neural networks and/or the Transformer architecture encoder.
In this model, most of the proposed and investigated improvements are focused on transformations
of the input data. After an extensive set of individual tests, several models
have been trained, evaluated, and compared with other methods using both the JAAD and
PIE datasets. The results obtained are among the best in the state of the art, validating the
proposed architecture.
Multiple path prediction for traffic scenes using LSTMs and mixture density models
This work presents an analysis of predicting multiple future paths of moving objects in traffic scenes by leveraging Long Short-Term Memory architectures (LSTMs) and Mixture Density Networks (MDNs) in a single-shot manner. Path prediction estimates the future positions of objects, which is useful in important applications such as security monitoring systems, Autonomous Driver Assistance Systems and assistive technologies. Typical approaches use the observed positions (tracklets) of objects in video frames to predict their future paths as a sequence of position values, which can be treated as a time series. LSTMs have achieved good performance on time series, but they have the limitation of predicting only a single path per tracklet. Path prediction is not a deterministic task and requires predicting with a level of uncertainty. Predicting multiple paths instead of a single one is therefore a more realistic way of approaching this task. In this work, predicting a set of future paths with associated uncertainty was achieved by combining LSTMs and MDNs. The evaluation was made on the KITTI and CityFlow datasets on three types of objects, four prediction horizons and two different points of view (image coordinates and bird's-eye view).
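To make the idea of a mixture density output concrete, the following is a minimal sketch, not the authors' implementation. It assumes that, for one future time step, a network has already produced the parameters of a mixture of isotropic 2-D Gaussians (weights, means, standard deviations). The function names mixture_nll and rank_hypotheses, the isotropic-covariance choice and the example numbers are all illustrative assumptions. The sketch shows the negative log-likelihood that such a model would typically be trained on, and how the mixture components can be read as several candidate positions, each with an associated weight.

```python
import math

def mixture_nll(weights, means, sigmas, target):
    """Negative log-likelihood of a 2-D target under an isotropic Gaussian mixture.

    weights: mixture weights (positive, summing to 1)
    means:   (x, y) mean of each component
    sigmas:  standard deviation of each component (isotropic, an assumption)
    target:  observed (x, y) future position
    """
    log_terms = []
    for w, (mx, my), s in zip(weights, means, sigmas):
        sq_dist = (target[0] - mx) ** 2 + (target[1] - my) ** 2
        # log of the 2-D isotropic Gaussian density
        log_density = -sq_dist / (2 * s * s) - math.log(2 * math.pi * s * s)
        log_terms.append(math.log(w) + log_density)
    # log-sum-exp over components for numerical stability
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))

def rank_hypotheses(weights, means):
    """Return (mean, weight) pairs sorted from most to least probable component."""
    order = sorted(range(len(weights)), key=lambda k: weights[k], reverse=True)
    return [(means[k], weights[k]) for k in order]

# Hypothetical mixture for one future time step: two candidate positions.
weights = [0.3, 0.7]
means = [(0.0, 1.0), (1.0, 0.0)]
sigmas = [0.5, 0.5]

for mean, weight in rank_hypotheses(weights, means):
    print(mean, weight)
print(mixture_nll(weights, means, sigmas, (1.0, 0.1)))
```

Each mixture component plays the role of one predicted path hypothesis, and its weight and standard deviation express the uncertainty attached to that hypothesis.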
VIENA2: A Driving Anticipation Dataset
Action anticipation is critical in scenarios where one needs to react before
the action is finalized. This is, for instance, the case in automated driving,
where a car needs to, e.g., avoid hitting pedestrians and respect traffic
lights. While solutions have been proposed to tackle subsets of the driving
anticipation tasks, by making use of diverse, task-specific sensors, there is
no single dataset or framework that addresses them all in a consistent manner.
In this paper, we therefore introduce a new, large-scale dataset, called
VIENA2, covering 5 generic driving scenarios, with a total of 25 distinct
action classes. It contains more than 15K full-HD, 5 s long videos acquired in
various driving conditions, weather conditions, times of day and environments, complemented
with a common and realistic set of sensor measurements. This amounts to more
than 2.25M frames, each annotated with an action label, corresponding to 600
samples per action class. We discuss our data acquisition strategy and the
statistics of our dataset, and benchmark state-of-the-art action anticipation
techniques, including a new multi-modal LSTM architecture with an effective
loss function for action anticipation in driving scenarios.
Comment: Accepted in ACCV 201
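The dataset figures quoted in this abstract can be cross-checked with a few lines of arithmetic. This is a sketch under one assumption: the frame rate is not stated in the abstract, and 30 frames per second is assumed here because it reproduces the quoted frame count. The abstract gives the video and frame counts as lower bounds ("more than"), so the results below are lower bounds as well.

```python
videos = 15_000          # "more than 15K" videos
seconds_per_video = 5    # each video is 5 s long
num_classes = 25         # distinct action classes
assumed_fps = 30         # assumption: not stated in the abstract

total_frames = videos * seconds_per_video * assumed_fps
samples_per_class = videos // num_classes

print(total_frames)       # 2250000
print(samples_per_class)  # 600
```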
Autonomous Vehicles Drive into Shared Spaces: eHMI Design Concept Focusing on Vulnerable Road Users
In comparison to conventional traffic designs, shared spaces promote a more
pleasant urban environment with slower motorized movement, smoother traffic,
and less congestion. In the foreseeable future, shared spaces will be populated
with a mixture of autonomous vehicles (AVs) and vulnerable road users (VRUs)
like pedestrians and cyclists. However, a driverless AV lacks a way to
communicate with VRUs when the two must negotiate and reach an
agreement, which brings new challenges to the safety and smoothness of the
traffic. To find a feasible solution to integrating AVs seamlessly into
shared-space traffic, we first identified the possible issues that the
shared-space designs have not considered regarding the role of AVs. Then an online
questionnaire was used to ask participants how they would like the driver
of a manually driven vehicle to communicate with VRUs in a shared space. We
found that when the driver wanted to give suggestions to the VRUs in a
negotiation, participants considered communication via the driver's body
language necessary. In addition, when the driver conveyed information about
her/his intentions and cautions to the VRUs, participants selected different
communication methods with respect to their transport modes (as a driver,
pedestrian, or cyclist). These results suggest that novel eHMIs might be useful
for AV-VRU communication when the original drivers are not present. Hence, a
potential eHMI design concept was proposed for different VRUs to meet their
various expectations. In the end, we further discussed the effects of the eHMIs
on improving the sociality in shared spaces and the autonomous driving systems
- …