975 research outputs found

    Counting with Focus for Free

    This paper aims to count arbitrary objects in images. The leading counting approaches start from per-object point annotations, from which they construct density maps; their training objective then transforms input images into density maps through deep convolutional networks. We posit that the point annotations can serve more supervision purposes than just constructing density maps, and we introduce ways to repurpose the points for free. First, we propose supervised focus from segmentation, where points are converted into binary maps. The binary maps are combined with a network branch and an accompanying loss function to focus on areas of interest. Second, we propose supervised focus from global density, where the ratio of point annotations to image pixels is used in another branch to regularize the overall density estimation. To assist both the density estimation and the focus from segmentation, we also introduce an improved kernel size estimator for the point annotations. Experiments on six datasets show that all our contributions reduce the counting error, regardless of the base network, resulting in state-of-the-art accuracy with only a single network. Finally, we are the first to count on WIDER FACE, which allows us to show the benefits of our approach in handling varying object scales and crowding levels. Code is available at https://github.com/shizenglin/Counting-with-Focus-for-Free
    Comment: ICCV, 201
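    The point-annotation pipeline described above can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: `density_map`, `focus_targets`, and the fixed `sigma` are hypothetical names, and the paper's kernel size estimator would set the Gaussian width adaptively per point rather than using one constant.

```python
import numpy as np

def density_map(points, shape, sigma=4.0):
    """Place a normalized Gaussian at each annotated point.

    Illustrative sketch: a fixed sigma stands in for the paper's
    improved per-point kernel size estimator.
    """
    h, w = shape
    dmap = np.zeros((h, w), dtype=np.float64)
    ys, xs = np.mgrid[0:h, 0:w]
    for (py, px) in points:
        g = np.exp(-((ys - py) ** 2 + (xs - px) ** 2) / (2 * sigma ** 2))
        g /= g.sum()  # each object contributes exactly 1 to the total count
        dmap += g
    return dmap

def focus_targets(points, shape):
    """Derive the two 'free' supervision signals from the same points:
    a binary segmentation map (focus from segmentation) and the global
    point-to-pixel ratio (focus from global density)."""
    h, w = shape
    seg = np.zeros((h, w), dtype=np.uint8)
    for (py, px) in points:
        seg[py, px] = 1  # a real pipeline would dilate these point seeds
    ratio = len(points) / (h * w)
    return seg, ratio
```

    Because each Gaussian is renormalized after boundary truncation, integrating the density map recovers the object count exactly, which is what makes the density-map formulation a counting objective.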

    PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting

    Crowd counting, i.e., estimating the number of people in a crowded area, has attracted much interest in the research community. Although many attempts have been reported, crowd counting remains an open real-world problem due to the vast variation in crowd density within the area of interest and severe occlusion among the crowd. In this paper, we propose a novel Pyramid Density-Aware Attention-based network, abbreviated as PDANet, that leverages attention, pyramid scale features, and a two-branch decoder for density-aware crowd counting. PDANet uses these modules to extract features at different scales, focus on relevant information, and suppress misleading information. We also address the variation in crowdedness across images with an exclusive Density-Aware Decoder (DAD): a classifier evaluates the density level of the input features and passes them to the corresponding high- and low-crowded DAD modules. Finally, we generate an overall density map by treating the summation of the low- and high-crowded density maps as spatial attention. Meanwhile, we employ two losses to produce a precise density map for the input scene. Extensive evaluations on challenging benchmark datasets demonstrate the superior performance of the proposed PDANet, in terms of both counting accuracy and the quality of the generated density maps, over well-known state-of-the-art methods.
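    The density-aware routing in the abstract can be sketched as follows. All names here (`density_aware_decode` and the three callables) are hypothetical stand-ins for PDANet's learned modules; weighting the two branch outputs by the classifier's confidence before summing is one plausible reading of the described fusion, not the paper's exact formulation.

```python
import numpy as np

def density_aware_decode(features, classify, decode_low, decode_high):
    """Route features through low- and high-crowded decoder branches,
    then fuse their density maps, weighted by a density-level classifier.

    `classify` returns the probability that the scene is densely crowded;
    `decode_low` / `decode_high` each return a density map.
    """
    p_high = classify(features)       # confidence that the scene is dense
    low_map = decode_low(features)    # branch tuned for sparse scenes
    high_map = decode_high(features)  # branch tuned for dense scenes
    # Confidence-weighted summation of the two branch maps,
    # mirroring the paper's spatial-attention-style fusion.
    return (1.0 - p_high) * low_map + p_high * high_map
```

    In the actual network the classifier would hard- or soft-route features to one DAD module; the soft weighting above keeps the sketch differentiable end to end.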

    CASA-Crowd: A Context-Aware Scale Aggregation CNN-Based Crowd Counting Technique

    The accuracy of object-based computer vision techniques declines in the face of major challenges arising from large scale variation, varying shape, perspective variation, and a lack of side information. To handle these challenges, most crowd counting methods use multi-column architectures (restricting themselves to a set of specific density scenes) or deploy deeper, multi-network models for density estimation. However, these techniques suffer from several drawbacks: the columns of a multi-column network extract near-identical features, the architectures are computationally complex, density is overestimated in sparse areas and underestimated in dense areas, and averaging feature maps reduces the quality of the density map. To overcome these drawbacks and to provide state-of-the-art counting accuracy at comparable computational cost, we propose a deeper and wider network, a Context-Aware Scale Aggregation CNN-based Crowd Counting method (CASA-Crowd), to obtain deep, scale-varying, and perspective-varying features. Further, we include dilated convolutions with varying filter sizes to obtain contextual information. Because the different dilation rates yield varying receptive field sizes, they are especially useful for overcoming perspective distortion. The quality of the density map is enhanced while the spatial dimensions are preserved, at comparable computational complexity. We evaluate our method on three well-known datasets: UCF_CC_50, ShanghaiTech Part_A, and ShanghaiTech Part_B.
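    The receptive-field effect of dilation that the abstract relies on can be made concrete with a minimal sketch. `dilated_conv2d` is an illustrative single-channel implementation, not CASA-Crowd's code; it shows how a k x k kernel with dilation rate r covers an effective window of k + (k - 1)(r - 1) pixels per side without adding parameters.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate=1):
    """Valid-mode 2D cross-correlation with a dilated (atrous) kernel.

    The kernel taps are spaced `rate` pixels apart, so the effective
    receptive field grows with the dilation rate while the number of
    weights stays fixed -- the mechanism CASA-Crowd uses to aggregate
    context at several scales.
    """
    kh, kw = kernel.shape
    eff_h = kh + (kh - 1) * (rate - 1)  # effective window height
    eff_w = kw + (kw - 1) * (rate - 1)  # effective window width
    oh = x.shape[0] - eff_h + 1
    ow = x.shape[1] - eff_w + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Sample the input every `rate` pixels within the window.
            patch = x[i:i + eff_h:rate, j:j + eff_w:rate]
            out[i, j] = np.sum(patch * kernel)
    return out
```

    With rate=1 this reduces to an ordinary convolution; a 3 x 3 kernel at rate=2 already sees a 5 x 5 window, which is why stacking branches with different rates captures multiple perspective scales cheaply.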