2,546 research outputs found
Pedestrian Movement Direction Recognition Using Convolutional Neural Networks
Pedestrian movement direction recognition is an important factor in autonomous driver assistance and security surveillance systems. Pedestrians are the most crucial and fragile moving objects in streets, roads, and events, where thousands of people may gather on a regular basis. People flow analysis on zebra crossings and in shopping centers or events such as demonstrations are a key element to improve safety and to enable autonomous cars to drive in real life environments. This paper focuses on deep learning techniques such as convolutional neural networks (CNN) to achieve a reliable detection of pedestrians moving in a particular direction. We propose a CNN-based technique that leverages current pedestrian detection techniques (histograms of oriented gradients-linSVM) to generate a sum of subtracted frames (flow estimation around the detected pedestrian), which are used as an input for the proposed modified versions of various state-of-the-art CNN networks, such as AlexNet, GoogleNet, and ResNet. Moreover, we have also created a new data set for this purpose, and analyzed the importance of training in a known data set for the neural networks to achieve reliable results.This work was supported by the Feder funds, Spanish Government through the COMBAHO Project, under Grant TIN2016-76515-R, and in part by the University of Alicante Project under Grant GRE16-19
CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction
Pedestrian path prediction is an essential topic in computer vision and video
understanding. Having insight into the movement of pedestrians is crucial for
ensuring safe operation in a variety of applications including autonomous
vehicles, social robots, and environmental monitoring. Current works in this
area utilize complex generative or recurrent methods to capture many possible
futures. However, despite the inherent real-time nature of predicting future
paths, little work has been done to explore accurate and computationally
efficient approaches for this task. To this end, we propose a convolutional
approach for real-time pedestrian path prediction, CARPe. It utilizes a
variation of Graph Isomorphism Networks in combination with an agile
convolutional neural network design to form a fast and accurate path prediction
approach. Notable results in both inference speed and prediction accuracy are
achieved, improving FPS considerably in comparison to current state-of-the-art
methods while delivering competitive accuracy on well-known path prediction
datasets.Comment: AAAI-21 Camera Read
ContextVP: Fully Context-Aware Video Prediction
Video prediction models based on convolutional networks, recurrent networks,
and their combinations often result in blurry predictions. We identify an
important contributing factor for imprecise predictions that has not been
studied adequately in the literature: blind spots, i.e., lack of access to all
relevant past information for accurately predicting the future. To address this
issue, we introduce a fully context-aware architecture that captures the entire
available past context for each pixel using Parallel Multi-Dimensional LSTM
units and aggregates it using blending units. Our model outperforms a strong
baseline network of 20 recurrent convolutional layers and yields
state-of-the-art performance for next step prediction on three challenging
real-world video datasets: Human 3.6M, Caltech Pedestrian, and UCF-101.
Moreover, it does so with fewer parameters than several recently proposed
models, and does not rely on deep convolutional networks, multi-scale
architectures, separation of background and foreground modeling, motion flow
learning, or adversarial training. These results highlight that full awareness
of past context is of crucial importance for video prediction.Comment: 19 pages. ECCV 2018 oral presentation. Project webpage is at
https://wonmin-byeon.github.io/publication/2018-ecc
Survey on Vision-based Path Prediction
Path prediction is a fundamental task for estimating how pedestrians or
vehicles are going to move in a scene. Because path prediction as a task of
computer vision uses video as input, various information used for prediction,
such as the environment surrounding the target and the internal state of the
target, need to be estimated from the video in addition to predicting paths.
Many prediction approaches that include understanding the environment and the
internal state have been proposed. In this survey, we systematically summarize
methods of path prediction that take video as input and and extract features
from the video. Moreover, we introduce datasets used to evaluate path
prediction methods quantitatively.Comment: DAPI 201
Dynamic Occupancy Grid Prediction for Urban Autonomous Driving: A Deep Learning Approach with Fully Automatic Labeling
Long-term situation prediction plays a crucial role in the development of
intelligent vehicles. A major challenge still to overcome is the prediction of
complex downtown scenarios with multiple road users, e.g., pedestrians, bikes,
and motor vehicles, interacting with each other. This contribution tackles this
challenge by combining a Bayesian filtering technique for environment
representation, and machine learning as long-term predictor. More specifically,
a dynamic occupancy grid map is utilized as input to a deep convolutional
neural network. This yields the advantage of using spatially distributed
velocity estimates from a single time step for prediction, rather than a raw
data sequence, alleviating common problems dealing with input time series of
multiple sensors. Furthermore, convolutional neural networks have the inherent
characteristic of using context information, enabling the implicit modeling of
road user interaction. Pixel-wise balancing is applied in the loss function
counteracting the extreme imbalance between static and dynamic cells. One of
the major advantages is the unsupervised learning character due to fully
automatic label generation. The presented algorithm is trained and evaluated on
multiple hours of recorded sensor data and compared to Monte-Carlo simulation
Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps
Grid maps are widely used in robotics to represent obstacles in the
environment and differentiating dynamic objects from static infrastructure is
essential for many practical applications. In this work, we present a methods
that uses a deep convolutional neural network (CNN) to infer whether grid cells
are covering a moving object or not. Compared to tracking approaches, that use
e.g. a particle filter to estimate grid cell velocities and then make a
decision for individual grid cells based on this estimate, our approach uses
the entire grid map as input image for a CNN that inspects a larger area around
each cell and thus takes the structural appearance in the grid map into account
to make a decision. Compared to our reference method, our concept yields a
performance increase from 83.9% to 97.2%. A runtime optimized version of our
approach yields similar improvements with an execution time of just 10
milliseconds.Comment: This is a shorter version of the masters thesis of Florian Piewak and
it was accapted at IV 201
- …