Search CORE

1,407 research outputs found

Recommended from our members

MFP-Net: Multi-scale feature pyramid network for crowd counting

Author: Lei T
Li S
Nandi AK
Wang R
Zhang D
Zhang W
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 09/06/2021
Field of study

Copyright © 2021 The Authors.. Although deep learning has been widely used for dense crowd counting, it still faces two challenges. Firstly, the popular network models are sensitive to scale variance of human head, human occlusions, and complex background due to repeated utilization of vanilla convolution kernels. Secondly, the vanilla feature fusion often depends on summation or concatenation, which ignores the correlation of different features leading to information redundancy and low robustness to background noise. To address these issues, a multi-scale feature pyramid network (MFP-Net) for dense crowd counting is proposed in this paper. The proposed MFP-Net makes two contributions. Firstly, the feature pyramid fusion module is designed that adopts rich convolutions with different depths and scales, not only to expand the receptive field, but also to improve the inference speed of models by using parallel group convolution. Secondly, a feature attention-aware module is added in the feature fusion stage. The module can achieve local and global information fusion by capturing the importance of the spatial and channel domains to improve model robustness. The proposed MFP-Net is evaluated on five publicly available datasets, and experiments show that the MFP-Net not only provides better crowd counting results than comparative models, but also requires fewer parameters.National Natural Science Foundation of China. Grant Numbers: 61871259, 61861024, 61701387; Natural Science Basic Research Program of Shaanxi. Grant Number: 2021JC-47; Science and Technology Program of Shaanxi Province of China. Grant Number: 2020NY-172

Brunel University Research Archive

PDANet: Pyramid Density-aware Attention Net for Accurate Crowd Counting

Author: Amirgholipour Saeed
He Xiangjian
Jia Wenjing
Liu Lei
Wang Dadong
Publication venue
Publication date: 28/04/2020
Field of study

Crowd counting, i.e., estimating the number of people in a crowded area, has attracted much interest in the research community. Although many attempts have been reported, crowd counting remains an open real-world problem due to the vast scale variations in crowd density within the interested area, and severe occlusion among the crowd. In this paper, we propose a novel Pyramid Density-Aware Attention-based network, abbreviated as PDANet, that leverages the attention, pyramid scale feature and two branch decoder modules for density-aware crowd counting. The PDANet utilizes these modules to extract different scale features, focus on the relevant information, and suppress the misleading ones. We also address the variation of crowdedness levels among different images with an exclusive Density-Aware Decoder (DAD). For this purpose, a classifier evaluates the density level of the input features and then passes them to the corresponding high and low crowded DAD modules. Finally, we generate an overall density map by considering the summation of low and high crowded density maps as spatial attention. Meanwhile, we employ two losses to create a precise density map for the input scene. Extensive evaluations conducted on the challenging benchmark datasets well demonstrate the superior performance of the proposed PDANet in terms of the accuracy of counting and generated density maps over the well-known state of the arts

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

Author: Costeira João P.
Moura José M. F.
Wu Guanhang
Zhang Shanghang
Publication venue
Publication date: 31/07/2017
Field of study

In this paper, we develop deep spatio-temporal neural networks to sequentially count vehicles from low quality videos captured by city cameras (citycams). Citycam videos have low resolution, low frame rate, high occlusion and large perspective, making most existing methods lose their efficacy. To overcome limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network to jointly estimate vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with long short term memory networks (LSTM) in a residual learning fashion. Such design leverages the strengths of FCN for pixel-level prediction and the strengths of LSTM for learning complex temporal dynamics. The residual learning connection reformulates the vehicle count regression as learning residual functions with reference to the sum of densities in each frame, which significantly accelerates the training of networks. To preserve feature map resolution, we propose a Hyper-Atrous combination to integrate atrous convolution in FCN and combine feature maps of different convolution layers. FCN-rLSTM enables refined feature representation and a novel end-to-end trainable mapping from pixels to vehicle count. We extensively evaluated the proposed method on different counting tasks with three datasets, with experimental results demonstrating their effectiveness and robustness. In particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21 on TRANCOS, and reduces the MAE from 2.74 to 1.53 on WebCamT. Training process is accelerated by 5 times on average.Comment: Accepted by International Conference on Computer Vision (ICCV), 201

arXiv.org e-Print Archive

Crossref

Counting with Focus for Free

Author: Mettes Pascal
Shi Zenglin
Snoek Cees G. M.
Publication venue
Publication date: 01/01/2019
Field of study

This paper aims to count arbitrary objects in images. The leading counting approaches start from point annotations per object from which they construct density maps. Then, their training objective transforms input images to density maps through deep convolutional networks. We posit that the point annotations serve more supervision purposes than just constructing density maps. We introduce ways to repurpose the points for free. First, we propose supervised focus from segmentation, where points are converted into binary maps. The binary maps are combined with a network branch and accompanying loss function to focus on areas of interest. Second, we propose supervised focus from global density, where the ratio of point annotations to image pixels is used in another branch to regularize the overall density estimation. To assist both the density estimation and the focus from segmentation, we also introduce an improved kernel size estimator for the point annotations. Experiments on six datasets show that all our contributions reduce the counting error, regardless of the base network, resulting in state-of-the-art accuracy using only a single network. Finally, we are the first to count on WIDER FACE, allowing us to show the benefits of our approach in handling varying object scales and crowding levels. Code is available at https://github.com/shizenglin/Counting-with-Focus-for-FreeComment: ICCV, 201

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE