7,802 research outputs found
Crowd Counting with Decomposed Uncertainty
Research in neural networks in the field of computer vision has achieved
remarkable accuracy for point estimation. However, the uncertainty in the
estimation is rarely addressed. Uncertainty quantification accompanied by point
estimation can lead to a more informed decision, and even improve the
prediction quality. In this work, we focus on uncertainty estimation in the
domain of crowd counting. With increasing occurrences of heavily crowded events
such as political rallies, protests, concerts, etc., automated crowd analysis
is becoming an increasingly crucial task. The stakes can be very high in many
of these real-world applications. We propose a scalable neural network
framework with quantification of decomposed uncertainty using a bootstrap
ensemble. We demonstrate that the proposed uncertainty quantification method
provides additional insight to the crowd counting problem and is simple to
implement. We also show that our proposed method exhibits the state of the art
performances in many benchmark crowd counting datasets.Comment: Accepted in AAAI 2020 (Main Technical Track
FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras
In this paper, we develop deep spatio-temporal neural networks to
sequentially count vehicles from low quality videos captured by city cameras
(citycams). Citycam videos have low resolution, low frame rate, high occlusion
and large perspective, making most existing methods lose their efficacy. To
overcome limitations of existing methods and incorporate the temporal
information of traffic video, we design a novel FCN-rLSTM network to jointly
estimate vehicle density and vehicle count by connecting fully convolutional
neural networks (FCN) with long short term memory networks (LSTM) in a residual
learning fashion. Such design leverages the strengths of FCN for pixel-level
prediction and the strengths of LSTM for learning complex temporal dynamics.
The residual learning connection reformulates the vehicle count regression as
learning residual functions with reference to the sum of densities in each
frame, which significantly accelerates the training of networks. To preserve
feature map resolution, we propose a Hyper-Atrous combination to integrate
atrous convolution in FCN and combine feature maps of different convolution
layers. FCN-rLSTM enables refined feature representation and a novel end-to-end
trainable mapping from pixels to vehicle count. We extensively evaluated the
proposed method on different counting tasks with three datasets, with
experimental results demonstrating their effectiveness and robustness. In
particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21
on TRANCOS, and reduces the MAE from 2.74 to 1.53 on WebCamT. Training process
is accelerated by 5 times on average.Comment: Accepted by International Conference on Computer Vision (ICCV), 201
- …