84,047 research outputs found
FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras
In this paper, we develop deep spatio-temporal neural networks to
sequentially count vehicles from low quality videos captured by city cameras
(citycams). Citycam videos have low resolution, low frame rate, high occlusion
and large perspective, making most existing methods lose their efficacy. To
overcome limitations of existing methods and incorporate the temporal
information of traffic video, we design a novel FCN-rLSTM network to jointly
estimate vehicle density and vehicle count by connecting fully convolutional
neural networks (FCN) with long short term memory networks (LSTM) in a residual
learning fashion. Such design leverages the strengths of FCN for pixel-level
prediction and the strengths of LSTM for learning complex temporal dynamics.
The residual learning connection reformulates the vehicle count regression as
learning residual functions with reference to the sum of densities in each
frame, which significantly accelerates the training of networks. To preserve
feature map resolution, we propose a Hyper-Atrous combination to integrate
atrous convolution in FCN and combine feature maps of different convolution
layers. FCN-rLSTM enables refined feature representation and a novel end-to-end
trainable mapping from pixels to vehicle count. We extensively evaluated the
proposed method on different counting tasks with three datasets, with
experimental results demonstrating their effectiveness and robustness. In
particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21
on TRANCOS, and reduces the MAE from 2.74 to 1.53 on WebCamT. Training process
is accelerated by 5 times on average.Comment: Accepted by International Conference on Computer Vision (ICCV), 201
LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning
We present a novel procedural framework to generate an arbitrary number of
labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to
design accurate algorithms or training models for crowded scene understanding.
Our overall approach is composed of two components: a procedural simulation
framework for generating crowd movements and behaviors, and a procedural
rendering framework to generate different videos or images. Each video or image
is automatically labeled based on the environment, number of pedestrians,
density, behavior, flow, lighting conditions, viewpoint, noise, etc.
Furthermore, we can increase the realism by combining synthetically-generated
behaviors with real-world background videos. We demonstrate the benefits of
LCrowdV over prior lableled crowd datasets by improving the accuracy of
pedestrian detection and crowd behavior classification algorithms. LCrowdV
would be released on the WWW
Online real-time crowd behavior detection in video sequences
Automatically detecting events in crowded scenes is a challenging task in Computer Vision. A number of offline approaches have been proposed for solving the problem of crowd behavior detection, however the offline assumption limits their application in real-world video surveillance systems. In this paper, we propose an online and real-time method for detecting events in crowded video sequences. The proposed approach is based on the combination of visual feature extraction and image segmentation and it works without the need of a training phase. A quantitative experimental evaluation has been carried out on multiple publicly available video sequences, containing data from various crowd scenarios and different types of events, to demonstrate the effectiveness of the approach
Texture-based crowd detection and localisation
This paper presents a crowd detection system based on texture analysis. The state-of-the-art techniques based on co-occurrence matrix have been revisited and a novel set of features proposed. These features provide a richer description of the co-occurrence matrix, and can be exploited to obtain stronger classification results, especially when smaller portions of the image are considered. This is extremely useful for crowd localisation: acquired images are divided into smaller regions in order to perform a classification on each one. A thorough evaluation of the proposed system on a real world data set is also presented: this validates the improvements in reliability of the crowd detection and localisation
- …