CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction
Pedestrian path prediction is an essential topic in computer vision and video
understanding. Having insight into the movement of pedestrians is crucial for
ensuring safe operation in a variety of applications including autonomous
vehicles, social robots, and environmental monitoring. Current works in this
area utilize complex generative or recurrent methods to capture many possible
futures. However, despite the inherent real-time nature of predicting future
paths, little work has been done to explore accurate and computationally
efficient approaches for this task. To this end, we propose a convolutional
approach for real-time pedestrian path prediction, CARPe. It utilizes a
variation of Graph Isomorphism Networks in combination with an agile
convolutional neural network design to form a fast and accurate path prediction
approach. Notable results in both inference speed and prediction accuracy are
achieved, improving FPS considerably in comparison to current state-of-the-art
methods while delivering competitive accuracy on well-known path prediction
datasets.
Comment: AAAI-21 Camera Ready
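The abstract does not detail the network, but the core Graph Isomorphism Network (GIN) update it builds on is standard: each node combines its own features, weighted by (1 + eps), with the sum of its neighbors' features, then passes the result through an MLP. A minimal numpy sketch, with a single-layer ReLU "MLP" and a toy pedestrian interaction graph as illustrative assumptions:

```python
import numpy as np

def gin_layer(H, A, W, eps=0.0):
    """One Graph Isomorphism Network update (sketch).

    H: (num_nodes, dim) node features, e.g. per-pedestrian states
    A: (num_nodes, num_nodes) adjacency matrix, no self-loops
    W: (dim, dim) weight of a one-layer MLP (illustrative stand-in)
    """
    # (1 + eps) * own feature + sum of neighbor features, then MLP.
    agg = (1.0 + eps) * H + A @ H
    return np.maximum(agg @ W, 0.0)  # ReLU

# Toy example: 3 pedestrians with 2-d features on a chain graph 0-1-2.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
W = np.eye(2)
out = gin_layer(H, A, W)  # each row now mixes a node with its neighbors
```

With `W` set to the identity, the output simply shows the aggregation: node 1 (the middle of the chain) accumulates both neighbors' features, while the endpoints see only node 1.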
Trajectory Prediction for Autonomous Driving based on Multi-Head Attention with Joint Agent-Map Representation
Predicting the trajectories of surrounding agents is an essential ability for
autonomous vehicles navigating through complex traffic scenes. The future
trajectories of agents can be inferred using two important cues: the locations
and past motion of agents, and the static scene structure. Due to the high
variability in scene structure and agent configurations, prior work has
employed the attention mechanism, applied separately to the scene and agent
configuration to learn the most salient parts of both cues. However, the two
cues are tightly linked. The agent configuration can inform what part of the
scene is most relevant to prediction. The static scene in turn can help
determine the relative influence of agents on each other's motion. Moreover,
the distribution of future trajectories is multimodal, with modes corresponding
to the agent's intent. The agent's intent also informs what part of the scene
and agent configuration is relevant to prediction. We thus propose a novel
approach applying multi-head attention by considering a joint representation of
the static scene and surrounding agents. We use each attention head to generate
a distinct future trajectory to address multimodality of future trajectories.
Our model achieves state-of-the-art results on the nuScenes prediction
benchmark and generates diverse future trajectories compliant with scene
structure and agent configuration.
Comment: Revised submission for RA-
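The key idea, one attention head per trajectory mode over a joint agent-map context, can be sketched in a few lines. This is a hypothetical numpy illustration of the general mechanism, not the authors' model: the single-query form, weight shapes, and the absence of a trajectory decoder are all simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multimodal_attention(query, context, Wq, Wk, Wv, num_heads):
    """Scaled dot-product attention with one head per mode (sketch).

    query:   (d,)  target-agent state
    context: (n, d) joint agent-map context vectors
    Wq/Wk/Wv: (num_heads, d, d) per-head projection weights
    Returns (num_heads, d): one feature vector per head, each of
    which a decoder would map to a distinct future trajectory.
    """
    modes = []
    for h in range(num_heads):
        q = Wq[h] @ query                  # project the query
        K = context @ Wk[h].T              # (n, d) keys
        V = context @ Wv[h].T              # (n, d) values
        attn = softmax(K @ q / np.sqrt(len(q)))  # (n,) weights
        modes.append(attn @ V)             # head output = one mode
    return np.stack(modes)

# Toy example: 4 joint agent-map vectors, 3 heads -> 3 candidate modes.
rng = np.random.default_rng(0)
d, n, heads = 8, 4, 3
out = multimodal_attention(
    rng.normal(size=d), rng.normal(size=(n, d)),
    rng.normal(size=(heads, d, d)), rng.normal(size=(heads, d, d)),
    rng.normal(size=(heads, d, d)), heads)
```

Because each head has its own projections, the heads can learn to attend to different parts of the joint agent-map context, which is what lets each one specialize to a different intent or mode.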