16 research outputs found

    Realtime Multilevel Crowd Tracking using Reciprocal Velocity Obstacles

    We present a novel, realtime algorithm to compute the trajectory of each pedestrian in moderately dense crowd scenes. Our formulation is based on an adaptive particle filtering scheme that uses a multi-agent motion model based on velocity obstacles and takes into account local interactions as well as the physical and personal constraints of each pedestrian. Our method dynamically changes the number of particles allocated to each pedestrian based on different confidence metrics. Additionally, we use a new high-definition crowd video dataset to evaluate the performance of different pedestrian tracking algorithms. This dataset consists of videos of indoor and outdoor scenes, recorded at different locations with 30-80 pedestrians. We highlight the performance benefits of our algorithm over prior techniques using this dataset. In practice, our algorithm can compute trajectories of tens of pedestrians on a multi-core desktop CPU at interactive rates (27-30 frames per second). To the best of our knowledge, our approach is 4-5 times faster than prior methods that provide similar accuracy.
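
    The abstract above says the number of particles per pedestrian changes with confidence, but does not give the rule. The following is a minimal sketch of one plausible allocation step only, assuming a fixed total budget and a single confidence score in [0, 1] per pedestrian; the function name `allocate_particles` and its parameters are hypothetical and are not taken from the paper.

```python
def allocate_particles(confidences, total_budget=1000, min_particles=10):
    """Split a particle budget so that low-confidence tracks get more particles.

    Assumption (not from the paper): each pedestrian has one confidence
    score in [0, 1], and the spare budget is shared in proportion to
    (1 - confidence).
    """
    if not confidences:
        return []
    n = len(confidences)
    spare = total_budget - min_particles * n
    if spare < 0:
        raise ValueError("budget too small for the minimum allocation")
    # Clamp scores to [0, 1] and turn them into uncertainty weights.
    weights = [1.0 - min(max(c, 0.0), 1.0) for c in confidences]
    total_weight = sum(weights)
    if total_weight == 0:
        # Every track is fully confident: share the spare budget evenly.
        return [min_particles + spare // n] * n
    # Rounding down keeps the total at or below the budget.
    return [min_particles + int(spare * w / total_weight) for w in weights]


if __name__ == "__main__":
    print(allocate_particles([0.9, 0.5, 0.1], total_budget=300))
```

    A tracker would call a step like this once per frame, after updating the confidence scores, and then resample each pedestrian's particle set to the new size.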

    Learning a perspective-embedded deconvolution network for crowd counting

    © 2017 IEEE. We present a novel deep learning framework for crowd counting by learning a perspective-embedded deconvolution network. Perspective is an inherent property of most surveillance scenes. Unlike traditional approaches that exploit perspective as a separate normalization step, we propose to fuse the perspective into a deconvolution network, aiming to obtain a robust, accurate and consistent crowd density map. Through layer-wise fusion, we merge perspective maps at different resolutions into the deconvolution network. With the injection of perspective, our network is driven to learn to combine the underlying geometric constraints of the scene adaptively, thus enabling an accurate mapping from high-level feature maps to the pixel-wise crowd density map. In addition, our network can generate a density map for arbitrary-sized input in an end-to-end fashion. The proposed method achieves competitive results on the WorldExpo2010 crowd dataset.

    Fusion of thermal and visible imagery for effective detection and tracking of salient objects in videos

    In this paper, we present an efficient approach to detect and track salient objects in videos. In general, color visible images in red-green-blue (RGB) are more distinguishable to human visual perception, yet they suffer from illumination noise and shadows. By contrast, thermal images are less sensitive to these noise effects, although their distinguishability varies with environmental settings. Fusing these two modalities therefore provides an effective way to tackle this problem. First, a background model is extracted, followed by background subtraction for foreground detection in visible images. Meanwhile, adaptive thresholding is applied for foreground detection in the thermal domain, as human objects tend to be at a higher temperature and thus brighter than the background. To deal with occlusion, prediction-based forward tracking and backward tracking are employed to identify separate objects even when foreground detection fails. The proposed method is evaluated on OTCBVS, a publicly available color-thermal benchmark dataset. Promising results show that the proposed fusion-based approach can successfully detect and track multiple human objects.
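
    The two foreground detectors described above can be sketched on small grayscale frames stored as nested lists. This is an illustration under stated assumptions, not the authors' implementation: the fixed difference threshold, the mean-plus-k-standard-deviations rule for the thermal threshold, and the union (logical OR) fusion rule are all assumptions, since the abstract does not specify them. All function names are hypothetical.

```python
from statistics import mean, pstdev


def visible_foreground(frame, background, diff_thresh=30):
    """Background subtraction: a pixel is foreground if it differs enough
    from the background model."""
    return [[abs(p - b) > diff_thresh for p, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]


def thermal_foreground(thermal, k=1.0):
    """Adaptive thresholding: the threshold follows the frame statistics
    (assumed here to be mean + k * standard deviation)."""
    pixels = [p for row in thermal for p in row]
    thresh = mean(pixels) + k * pstdev(pixels)
    return [[p > thresh for p in row] for row in thermal]


def fuse_masks(mask_a, mask_b):
    """Assumed fusion rule: a pixel is foreground if either modality says so."""
    return [[a or b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


if __name__ == "__main__":
    background = [[10, 10, 10], [10, 10, 10]]
    frame = [[10, 200, 10], [10, 10, 10]]      # one changed visible pixel
    thermal = [[20, 20, 20], [20, 20, 250]]    # one hot thermal pixel
    fused = fuse_masks(visible_foreground(frame, background),
                       thermal_foreground(thermal))
    for row in fused:
        print(row)
```

    A union keeps objects that only one modality detects, at the cost of also keeping each modality's false positives; an intersection would make the opposite trade.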

    A Recent Trend in Individual Counting Approach Using Deep Network

    In video surveillance, counting individuals is regarded as a crucial task. Of all the existing individual counting techniques, regression can offer better performance in overcrowded areas. However, regression cannot provide per-individual detail, so it fails to locate individuals. By contrast, the density map approach is very effective at overcoming counting problems in situations such as heavy overlap and low resolution. Nevertheless, this approach may break down when only the heads of individuals appear in the video scene, and it is also restricted by the types of features used. The popular technique for obtaining the pertinent information automatically is the Convolutional Neural Network (CNN). However, CNN-based counting schemes cannot sufficiently tackle three difficulties: non-uniform density distributions, changes of scale, and drastic scale variation. In this study, we review current counting techniques that relate to deep networks in different crowded-scene applications. The goal of this work is to assess the effectiveness of CNNs applied to popular individual counting approaches for attaining higher precision.
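
    In the density map approach mentioned above, the crowd count is the sum of the map over all pixels. The sketch below builds a ground-truth-style map from annotated head positions by placing one Gaussian kernel per head, which is a common construction in the crowd counting literature rather than something this review specifies. The kernel width, the truncation radius and the function names are assumptions made for illustration.

```python
import math


def density_map(points, height, width, sigma=1.5, radius=4):
    """Place one truncated Gaussian per annotated head at (row, col).

    Each kernel is renormalized over the pixels that fall inside the image,
    so every in-bounds head contributes exactly 1 to the total.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for py, px in points:
        cells = []
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                dist_sq = (y - py) ** 2 + (x - px) ** 2
                cells.append((y, x, math.exp(-dist_sq / (2 * sigma ** 2))))
        total = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / total
    return dmap


def count_from_density(dmap):
    """The estimated count is the integral (sum) of the density map."""
    return sum(sum(row) for row in dmap)


if __name__ == "__main__":
    heads = [(2, 2), (5, 7), (0, 0)]
    dmap = density_map(heads, height=10, width=10)
    print(round(count_from_density(dmap), 6))
```

    A CNN-based counter would be trained to regress such a map from the image, so the same summation recovers its predicted count while the map itself shows where people are.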

    Real-time crowd density mapping using a novel sensory fusion model of infrared and visual systems

    Crowd dynamics management has received significant attention in recent years in both research and industry, in an attempt to improve the safety and management of large-scale events and of large public places such as stadiums, theatres, railway stations, subways and other places where a high flow of people at high densities is expected. Failure to detect crowd behaviour at the right time could lead to unnecessary injuries and fatalities. Over the past decades there have been many crowd incidents that caused major injuries and fatalities and led to physical damage. Examples of crowd disasters in past decades include the tragedy at Hillsborough football stadium in Sheffield, where at least 93 football supporters were killed and 400 injured in 1989 in Britain's worst-ever sporting disaster (BBC, 1989). Recently in Cambodia a pedestrian stampede during the Water Festival celebration resulted in 345 deaths and 400 injuries (BBC, 2010), and in 2011 at least 16 people were killed and 50 others were injured in a stampede in the northern Indian town of Haridwar (BBC, 2011). Such disasters could be avoided, or losses reduced, by using different technologies. Crowd simulation models have been found effective in predicting potential crowd hazards in critical situations and thus help in reducing fatalities. However, there is a need to combine advances in simulation with real-time crowd characterisation, such as the estimation of real-time density, in order to provide an accurate prognosis of crowd behaviour and enhance crowd management and safety, particularly in mega events such as the Hajj. This paper addresses the use of novel sensory technology to estimate people's dynamic density during one of the Hajj activities. The ultimate goal is that accurate real-time estimation of density in different areas within the crowd could help to improve the decision-making process and provide more accurate prediction of crowd dynamics.
    This paper investigates the use of infrared and visual cameras supported by auxiliary sensors and artificial intelligence to evaluate the accuracy of crowd density estimation in an open space during the Muslim pilgrimage to Makkah (Mecca).

    Fast heuristic method to detect people in frontal depth images

    This paper presents a new method for detecting people using only depth images captured by a camera in a frontal position. The approach first detects all the objects present in the scene and determines their average depth (distance to the camera). Next, for each object, a 3D Region of Interest (ROI) around it is processed to determine whether the characteristics of the object correspond to the biometric characteristics of a human head. Results are presented for three public datasets captured by three depth sensors with different spatial resolutions and different operating principles (structured light, active stereo vision and Time of Flight). These results demonstrate that our method can run in realtime on a low-cost CPU platform with high accuracy: processing times are below 1 ms per frame for a 512 × 424 image resolution with a precision of 99.26%, and below 4 ms per frame for a 1280 × 720 image resolution with a precision of 99.77%.


    Crowd counting in video falls into two broad categories: (a) ROI counting, which estimates the total number of people in a region at a given time instant, and (b) LOI counting, which counts the people who cross a detection line during a given time period. LOI counting can be developed using feature tracking techniques, where features are tracked into trajectories and these trajectories are clustered into object tracks, or by extracting and counting crowd blobs from a temporal slice of the video. ROI counting can be developed using two kinds of techniques: detection-based techniques, and feature-based and pixel regression techniques. Detection-based methods detect people individually and count them, using any of the following: background differencing, joint motion and appearance segmentation, silhouette or shape matching, and standard object recognition. Regression approaches extract features such as foreground pixels and interest points, form vectors from those features, and use machine learning algorithms to estimate the number of pedestrians or people. According to a recent survey, common features include edges, wavelet coefficients, and combinations of large sets of features. Common regression methods include Linear Regression, Neural Networks, Gaussian Process Regression and Discrete Classifiers. This paper aims to present a survey of a decade of work on people (crowd) counting in surveillance videos.
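
    The difference between the two categories described above can be illustrated with already-tracked positions. This is a minimal sketch that assumes pedestrian positions and trajectories are given as (x, y) tuples, that the region is an axis-aligned rectangle, and that the detection line is horizontal; the function names `roi_count` and `loi_count` are hypothetical and do not come from the survey.

```python
def roi_count(positions, roi):
    """ROI counting: number of people inside a rectangle at one time instant.

    positions: list of (x, y) for the current frame.
    roi: (x_min, y_min, x_max, y_max), inclusive bounds.
    """
    x_min, y_min, x_max, y_max = roi
    return sum(1 for x, y in positions
               if x_min <= x <= x_max and y_min <= y <= y_max)


def loi_count(tracks, line_y):
    """LOI counting: number of line-crossing events over a time window.

    tracks: list of trajectories, each a list of (x, y) in time order.
    A crossing is a change of side relative to the line y = line_y between
    consecutive points, so a person who crosses twice is counted twice.
    """
    crossings = 0
    for track in tracks:
        for (_, y_prev), (_, y_next) in zip(track, track[1:]):
            if (y_prev >= line_y) != (y_next >= line_y):
                crossings += 1
    return crossings


if __name__ == "__main__":
    print(roi_count([(1, 1), (5, 5), (9, 9)], roi=(0, 0, 5, 5)))
    tracks = [
        [(0, 0), (0, 5), (0, 12)],   # crosses once
        [(1, 20), (1, 15)],          # never crosses
        [(2, 8), (2, 11), (2, 9)],   # crosses, then crosses back
    ]
    print(loi_count(tracks, line_y=10))
```

    ROI counting needs only a single frame, whereas LOI counting depends on trajectories over time, which is why the text above ties it to feature tracking or temporal slices.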