PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting
Crowd counting, i.e., estimating the number of people in a crowded area, has
attracted much interest in the research community. Although many attempts have
been reported, crowd counting remains an open real-world problem due to the
vast scale variations in crowd density within the area of interest and the
severe occlusion among the crowd. In this paper, we propose a novel Pyramid
Density-Aware Attention-based network, abbreviated as PDANet, that leverages
attention, pyramid-scale feature, and two-branch decoder modules for
density-aware crowd counting. The PDANet utilizes these modules to extract
different scale features, focus on the relevant information, and suppress the
misleading ones. We also address the variation of crowdedness levels among
different images with an exclusive Density-Aware Decoder (DAD). For this
purpose, a classifier evaluates the density level of the input features and
then passes them to the corresponding high- or low-crowdedness DAD module.
Finally, we generate an overall density map by summing the low- and
high-crowdedness density maps, treated as spatial attention. Meanwhile, we employ two
losses to produce a precise density map for the input scene. Extensive
evaluations on challenging benchmark datasets demonstrate the superior
performance of the proposed PDANet, in terms of counting accuracy and the
quality of the generated density maps, over well-known state-of-the-art methods.
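The abstract's final step, summing the low- and high-crowdedness branch outputs into one density map whose integral gives the count, can be sketched as follows. This is a minimal illustration of the general idea, not the paper's implementation; the classifier score, function names, and blending scheme are assumptions.

```python
import numpy as np

def combine_density_maps(density_score, low_map, high_map):
    """Blend the two branch outputs into one density map.

    density_score: scalar in [0, 1] from a hypothetical density-level
    classifier (1.0 = highly crowded scene).
    low_map / high_map: density estimates from the low- and high-crowdedness
    decoder branches.
    """
    # Weighted sum of the two branch predictions, acting as a simple
    # attention over crowdedness levels.
    return (1.0 - density_score) * low_map + density_score * high_map

low = np.full((4, 4), 0.1)    # toy low-crowd branch output
high = np.full((4, 4), 0.9)   # toy high-crowd branch output
combined = combine_density_maps(0.75, low, high)
count = combined.sum()        # the crowd count is the map's integral
```

Summing the final map recovers the estimated count, which is why both branches must predict calibrated densities rather than raw activations.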
Counting with Focus for Free
This paper aims to count arbitrary objects in images. The leading counting
approaches start from point annotations per object from which they construct
density maps. Then, their training objective transforms input images to density
maps through deep convolutional networks. We posit that the point annotations
serve more supervision purposes than just constructing density maps. We
introduce ways to repurpose the points for free. First, we propose supervised
focus from segmentation, where points are converted into binary maps. The
binary maps are combined with a network branch and accompanying loss function
to focus on areas of interest. Second, we propose supervised focus from global
density, where the ratio of point annotations to image pixels is used in
another branch to regularize the overall density estimation. To assist both the
density estimation and the focus from segmentation, we also introduce an
improved kernel size estimator for the point annotations. Experiments on six
datasets show that all our contributions reduce the counting error, regardless
of the base network, resulting in state-of-the-art accuracy using only a single
network. Finally, we are the first to count on WIDER FACE, allowing us to show
the benefits of our approach in handling varying object scales and crowding
levels. Code is available at
https://github.com/shizenglin/Counting-with-Focus-for-Free
Comment: ICCV, 201
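The density-map construction the abstract starts from, placing a normalized Gaussian at each point annotation so the map's integral equals the object count, can be sketched as below. This is the common baseline construction, not the paper's improved kernel-size estimator, which would adapt `sigma` per point; all names here are illustrative.

```python
import numpy as np

def points_to_density_map(points, shape, sigma=2.0):
    """Build a ground-truth density map from point annotations.

    Each annotated location contributes a Gaussian normalized to sum to 1,
    so density.sum() equals the number of annotated objects.
    """
    h, w = shape
    density = np.zeros((h, w), dtype=np.float64)
    yy, xx = np.mgrid[0:h, 0:w]
    for py, px in points:
        g = np.exp(-((yy - py) ** 2 + (xx - px) ** 2) / (2.0 * sigma ** 2))
        g /= g.sum()  # normalize so each object contributes exactly 1
        density += g
    return density

pts = [(10, 10), (20, 30)]            # two annotated head locations
dmap = points_to_density_map(pts, (40, 40))
# summing the map recovers the object count
```

Normalizing after truncation to the image grid keeps the count exact even for points near the border, where part of the Gaussian would otherwise fall outside the map.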
- …