PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting
Crowd counting, i.e., estimating the number of people in a crowded area, has
attracted much interest in the research community. Although many attempts have
been reported, crowd counting remains an open real-world problem due to the
vast scale variations in crowd density within the area of interest and severe
occlusion among the crowd. In this paper, we propose a novel Pyramid
Density-Aware Attention-based network, abbreviated as PDANet, that leverages
attention, pyramid-scale feature, and two-branch decoder modules for
density-aware crowd counting. PDANet uses these modules to extract
features at different scales, focus on relevant information, and suppress
misleading information. We also address the variation of crowdedness levels among
different images with an exclusive Density-Aware Decoder (DAD). For this
purpose, a classifier evaluates the density level of the input features and
then passes them to the corresponding high-crowded or low-crowded DAD module.
Finally, we generate an overall density map by using the summation of the
low-crowded and high-crowded density maps as spatial attention. Meanwhile, we employ two
losses to create a precise density map for the input scene. Extensive
evaluations conducted on challenging benchmark datasets demonstrate
the superior performance of the proposed PDANet over well-known
state-of-the-art methods in terms of counting accuracy and generated density maps.
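The routing idea described in this abstract (a classifier picks a density level, then a matching decoder produces the density map) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function names, the threshold of 10.0, and the toy decoders are all hypothetical stand-ins for learned network modules. The one standard fact it relies on is that, in density-map-based counting, the estimated count is the sum of the density map.

```python
from typing import Callable, List

DensityMap = List[List[float]]


def count_from_density_map(density_map: DensityMap) -> float:
    """Estimated head count: the sum of all values in the density map."""
    return sum(sum(row) for row in density_map)


def route_to_decoder(
    features: DensityMap,
    is_high_density: Callable[[DensityMap], bool],
    low_decoder: Callable[[DensityMap], DensityMap],
    high_decoder: Callable[[DensityMap], DensityMap],
) -> DensityMap:
    """Send the features to one of two decoders, chosen by a density classifier."""
    decoder = high_decoder if is_high_density(features) else low_decoder
    return decoder(features)


# Hypothetical toy stand-ins; real modules would be trained networks.
def toy_classifier(features: DensityMap) -> bool:
    # Assumed rule: call the input "high density" above an arbitrary threshold.
    return count_from_density_map(features) > 10.0


def toy_low_decoder(features: DensityMap) -> DensityMap:
    # Passes values through unchanged.
    return [[value for value in row] for row in features]


def toy_high_decoder(features: DensityMap) -> DensityMap:
    # Halves every value, purely to make the two branches distinguishable.
    return [[value * 0.5 for value in row] for row in features]


sparse_features = [[0.5, 0.25], [0.25, 0.0]]
dense_features = [[4.0, 4.0], [4.0, 4.0]]

sparse_map = route_to_decoder(
    sparse_features, toy_classifier, toy_low_decoder, toy_high_decoder
)
dense_map = route_to_decoder(
    dense_features, toy_classifier, toy_low_decoder, toy_high_decoder
)

sparse_count = count_from_density_map(sparse_map)
dense_count = count_from_density_map(dense_map)
```

The sketch leaves out the attention and pyramid modules and the step that combines the low and high density maps, since the abstract does not specify them in enough detail to reproduce.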
Crowd Counting via Segmentation Guided Attention Networks and Curriculum Loss
Automatic crowd behaviour analysis is an important task for intelligent transportation systems, enabling effective flow control and dynamic route planning for varying road participants. Crowd counting is one of the keys to automatic crowd behaviour analysis. Crowd counting using deep convolutional neural networks (CNNs) has achieved encouraging progress in recent years. Researchers have devoted much effort to the design of various CNN architectures, most of which are based on the pre-trained VGG16 model. Because of its insufficient expressive capacity, the VGG16 backbone is usually followed by another cumbersome network specially designed for good counting performance. Although VGG models have been outperformed by Inception models in image classification tasks, the existing crowd counting networks built with Inception modules still have only a small number of layers with basic types of Inception modules. To fill this gap, in this paper we first benchmark the baseline Inception-v3 model on commonly used crowd counting datasets and achieve surprisingly good performance, comparable with or better than most existing crowd counting models. Subsequently, we push the boundary of this disruptive work further by proposing a Segmentation Guided Attention Network (SGANet) with Inception-v3 as the backbone and a novel curriculum loss for crowd counting. We conduct thorough experiments to compare the performance of our SGANet with prior art, and the proposed model achieves state-of-the-art performance with an MAE of 57.6, 6.3 and 87.6 on ShanghaiTechA, ShanghaiTechB and UCF_QNRF, respectively.
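The MAE figures quoted above are mean absolute errors between predicted and ground-truth counts, averaged over the test images. A minimal sketch of that metric follows; the function name and the sample counts are made up for illustration and are not taken from the paper or its datasets.

```python
from typing import Sequence


def mean_absolute_error(
    predicted_counts: Sequence[float], true_counts: Sequence[float]
) -> float:
    """Average of |predicted - true| over all images."""
    if len(predicted_counts) != len(true_counts):
        raise ValueError("predicted and true counts must have the same length")
    if not predicted_counts:
        raise ValueError("at least one image is required")
    total_error = sum(
        abs(predicted - true)
        for predicted, true in zip(predicted_counts, true_counts)
    )
    return total_error / len(predicted_counts)


# Illustrative counts only (hypothetical values, one entry per image).
predicted = [100.0, 250.0, 40.0, 60.0]
ground_truth = [110.0, 240.0, 45.0, 55.0]

mae = mean_absolute_error(predicted, ground_truth)
```

A lower MAE means the predicted counts are, on average, closer to the true counts.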