PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting
Crowd counting, i.e., estimating the number of people in a crowded area, has
attracted much interest in the research community. Although many attempts have
been reported, crowd counting remains an open real-world problem due to the
vast scale variations in crowd density within the region of interest, and severe
occlusion among the crowd. In this paper, we propose a novel Pyramid
Density-Aware Attention-based network, abbreviated as PDANet, that leverages
attention, pyramid scale feature, and two-branch decoder modules for
density-aware crowd counting. PDANet uses these modules to extract
features at different scales, focus on the relevant information, and suppress
misleading information. We also address the variation of crowdedness levels among
different images with an exclusive Density-Aware Decoder (DAD). For this
purpose, a classifier evaluates the density level of the input features and
then passes them to the corresponding high- or low-crowdedness DAD module.
Finally, we generate an overall density map by considering the summation of the low-
and high-crowdedness density maps as spatial attention. We also employ two
losses to create a precise density map for the input scene. Extensive
evaluations on challenging benchmark datasets demonstrate that the proposed
PDANet outperforms well-known state-of-the-art methods in terms of the
accuracy of both the counts and the generated density maps.
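
The abstract above does not spell out how a density map yields a count. A minimal sketch of the common convention, assumed here rather than taken from PDANet: each annotated head contributes a small Gaussian of unit mass, so summing the map recovers the number of people. The function names, `sigma`, `radius`, and the sample annotations are all hypothetical.

```python
import math


def density_map(points, height, width, sigma=1.5, radius=4):
    """Build a density map by placing one unit of mass per annotated point.

    `points` holds in-bounds (row, col) positions; `sigma` and `radius`
    are illustrative values, not settings from any paper.
    """
    grid = [[0.0] * width for _ in range(height)]
    for row, col in points:
        # Collect Gaussian weights for the in-bounds cells around the point.
        weights = {}
        for r in range(max(0, row - radius), min(height, row + radius + 1)):
            for c in range(max(0, col - radius), min(width, col + radius + 1)):
                dist2 = (r - row) ** 2 + (c - col) ** 2
                weights[(r, c)] = math.exp(-dist2 / (2.0 * sigma ** 2))
        # Normalize so each point contributes exactly 1 to the total,
        # even when the kernel is clipped at the image border.
        total = sum(weights.values())
        for (r, c), w in weights.items():
            grid[r][c] += w / total
    return grid


def count_from_density(grid):
    """The estimated count is the sum of all density values."""
    return sum(sum(row) for row in grid)


heads = [(2, 3), (10, 10), (0, 0), (15, 19)]  # hypothetical annotations
dmap = density_map(heads, height=16, width=20)
print(round(count_from_density(dmap), 6))
```

Normalizing each kernel over its in-bounds cells is one design choice among several; it keeps the total equal to the number of annotations even for points at the image border.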
Understanding Traffic Density from Large-Scale Web Camera Data
Understanding traffic density from large-scale web camera (webcam) videos is
a challenging problem because such videos have low spatial and temporal
resolution, high occlusion, and large perspective. To understand traffic
density in depth, we explore both deep-learning-based and optimization-based methods. To
avoid individual vehicle detection and tracking, both methods map the image
to a vehicle density map, one based on rank-constrained regression and the
other on fully convolutional networks (FCN). The regression-based
method learns different weights for different blocks in the image to increase
the degrees of freedom of the weights and embed perspective information. The FCN-based
method jointly estimates the vehicle density map and vehicle count with a residual
learning framework to perform end-to-end dense prediction, allowing arbitrary
image resolution and adapting to different vehicle scales and perspectives. We
analyze and compare both methods, and draw insights from the optimization-based
method to improve the deep model. Since existing datasets do not cover all the
challenges in our work, we collected and labelled a large-scale traffic video
dataset, containing 60 million frames from 212 webcams. Both methods are
extensively evaluated and compared on different counting tasks and datasets.
The FCN-based method significantly reduces the mean absolute error from 10.99 to
5.31 on the public dataset TRANCOS compared with the state-of-the-art baseline.
Comment: Accepted by CVPR 2017. Preprint version was uploaded on
http://welcome.isr.tecnico.ulisboa.pt/publications/understanding-traffic-density-from-large-scale-web-camera-data
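
The mean absolute error quoted above is the average absolute difference between predicted and true counts over a set of images. A small sketch of the metric follows; the function name and the sample counts are illustrative and are not data from TRANCOS or from the paper.

```python
def mean_absolute_error(predicted_counts, true_counts):
    """Average of |predicted - true| over all evaluated images."""
    if len(predicted_counts) != len(true_counts):
        raise ValueError("both sequences must have the same length")
    if not true_counts:
        raise ValueError("at least one image is required")
    total_error = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total_error / len(true_counts)


predicted = [12.4, 30.1, 7.0]  # hypothetical per-image vehicle counts
actual = [10, 33, 7]           # hypothetical ground-truth counts
print(mean_absolute_error(predicted, actual))
```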