CAR-Net: Clairvoyant Attentive Recurrent Network
We present an interpretable framework for path prediction that leverages
dependencies between agents' behaviors and their spatial navigation
environment. We exploit two sources of information: the past motion trajectory
of the agent of interest and a wide top-view image of the navigation scene. We
propose a Clairvoyant Attentive Recurrent Network (CAR-Net) that learns where
to look in a large image of the scene when solving the path prediction task.
Our method can attend to any area, or combination of areas, within the raw
image (e.g., road intersections) when predicting the trajectory of the agent.
This allows us to visualize fine-grained semantic elements of navigation scenes
that influence the prediction of trajectories. To study the impact of space on
agents' trajectories, we build a new dataset made of top-view images of
hundreds of scenes (Formula One racing tracks) where agents' behaviors are
heavily influenced by known areas in the images (e.g., upcoming turns). CAR-Net
successfully attends to these salient regions. Additionally, CAR-Net reaches
state-of-the-art accuracy on the standard trajectory forecasting benchmark,
Stanford Drone Dataset (SDD). Finally, we show CAR-Net's ability to generalize
to unseen scenes.
Comment: The 2nd and 3rd authors contributed equally.
Realtime Multilevel Crowd Tracking using Reciprocal Velocity Obstacles
We present a novel, realtime algorithm to compute the trajectory of each
pedestrian in moderately dense crowd scenes. Our formulation is based on an
adaptive particle filtering scheme that uses a multi-agent motion model based
on velocity-obstacles, and takes into account local interactions as well as
physical and personal constraints of each pedestrian. Our method dynamically
changes the number of particles allocated to each pedestrian based on different
confidence metrics. Additionally, we use a new high-definition crowd video
dataset, which is used to evaluate the performance of different pedestrian
tracking algorithms. This dataset consists of videos of indoor and outdoor
scenes, recorded at different locations with 30-80 pedestrians. We highlight
the performance benefits of our algorithm over prior techniques using this
dataset. In practice, our algorithm can compute trajectories of tens of
pedestrians on a multi-core desktop CPU at interactive rates (27-30 frames per
second). To the best of our knowledge, our approach is 4-5 times faster than
prior methods that provide similar accuracy.
Survey on Vision-based Path Prediction
Path prediction is a fundamental task for estimating how pedestrians or
vehicles are going to move in a scene. Because path prediction as a task of
computer vision uses video as input, the information used for prediction,
such as the environment surrounding the target and the internal state of the
target, must be estimated from the video in addition to predicting paths.
Many prediction approaches that include understanding the environment and the
internal state have been proposed. In this survey, we systematically summarize
methods of path prediction that take video as input and extract features
from the video. Moreover, we introduce datasets used to evaluate path
prediction methods quantitatively.
Comment: DAPI 201
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.
Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 pages
Cognitive visual tracking and camera control
Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real-time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, will serve to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario shows the effectiveness of our approach and its potential for real applications of cognitive vision.
LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning
We present a novel procedural framework to generate an arbitrary number of
labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to
design accurate algorithms or training models for crowded scene understanding.
Our overall approach is composed of two components: a procedural simulation
framework for generating crowd movements and behaviors, and a procedural
rendering framework to generate different videos or images. Each video or image
is automatically labeled based on the environment, number of pedestrians,
density, behavior, flow, lighting conditions, viewpoint, noise, etc.
Furthermore, we can increase the realism by combining synthetically-generated
behaviors with real-world background videos. We demonstrate the benefits of
LCrowdV over prior labeled crowd datasets by improving the accuracy of
pedestrian detection and crowd behavior classification algorithms. LCrowdV
would be released on the WWW.