6,552 research outputs found
Counting with Focus for Free
This paper aims to count arbitrary objects in images. The leading counting
approaches start from point annotations per object from which they construct
density maps. Then, their training objective transforms input images to density
maps through deep convolutional networks. We posit that the point annotations
serve more supervision purposes than just constructing density maps. We
introduce ways to repurpose the points for free. First, we propose supervised
focus from segmentation, where points are converted into binary maps. The
binary maps are combined with a network branch and accompanying loss function
to focus on areas of interest. Second, we propose supervised focus from global
density, where the ratio of point annotations to image pixels is used in
another branch to regularize the overall density estimation. To assist both the
density estimation and the focus from segmentation, we also introduce an
improved kernel size estimator for the point annotations. Experiments on six
datasets show that all our contributions reduce the counting error, regardless
of the base network, resulting in state-of-the-art accuracy using only a single
network. Finally, we are the first to count on WIDER FACE, allowing us to show
the benefits of our approach in handling varying object scales and crowding
levels. Code is available at
https://github.com/shizenglin/Counting-with-Focus-for-FreeComment: ICCV, 201
DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation
In real-world crowd counting applications, the crowd densities vary greatly
in spatial and temporal domains. A detection based counting method will
estimate crowds accurately in low density scenes, while its reliability in
congested areas is downgraded. A regression based approach, on the other hand,
captures the general density information in crowded regions. Without knowing
the location of each person, it tends to overestimate the count in low density
areas. Thus, exclusively using either one of them is not sufficient to handle
all kinds of scenes with varying densities. To address this issue, a novel
end-to-end crowd counting framework, named DecideNet (DEteCtIon and Density
Estimation Network) is proposed. It can adaptively decide the appropriate
counting mode for different locations on the image based on its real density
conditions. DecideNet starts with estimating the crowd density by generating
detection and regression based density maps separately. To capture inevitable
variation in densities, it incorporates an attention module, meant to
adaptively assess the reliability of the two types of estimations. The final
crowd counts are obtained with the guidance of the attention module to adopt
suitable estimations from the two kinds of density maps. Experimental results
show that our method achieves state-of-the-art performance on three challenging
crowd counting datasets.Comment: CVPR 201
- …