
    Sparsity Driven People Localization with a Heterogeneous Network of Cameras

    This paper addresses the problem of localizing people in low and high density crowds with a network of heterogeneous cameras. The problem is recast as a linear inverse problem. It relies on deducing the discretized occupancy vector of people on the ground from the noisy binary silhouettes observed as foreground pixels in each camera. This inverse problem is regularized by imposing a sparse occupancy vector, i.e., one made of few non-zero elements, while a particular dictionary of silhouettes linearly maps these non-empty grid locations to the multiple silhouettes viewed by the camera network. The proposed framework is (i) generic to any scene of people, i.e., people located in low and high density crowds, (ii) scalable to any number of cameras and already working with a single camera, (iii) unconstrained by the scene surface to be monitored, and (iv) versatile with respect to the cameras' geometry, e.g., planar or omnidirectional. Qualitative and quantitative results are presented on the APIDIS and the PETS 2009 Benchmark datasets. The proposed algorithm successfully detects people occluding each other given severely degraded extracted features, while outperforming state-of-the-art people localization techniques.

    Sparsity Driven People Localization with a Heterogeneous Network of Cameras

    In this paper, we propose to study the problem of localization of a dense set of people with a network of heterogeneous cameras. We propose to recast the problem as a linear inverse problem. The proposed framework is generic to any scene, scalable in the number of cameras and versatile with respect to their geometry, e.g. planar or omnidirectional. It relies on deducing an "occupancy vector", i.e. the discretized occupancy of people on the ground, from the noisy binary silhouettes observed as foreground pixels in each camera. This inverse problem is regularized by imposing a sparse occupancy vector, i.e. one made of few non-zero elements, while a particular dictionary of silhouettes linearly maps these non-empty grid locations to the multiple silhouettes viewed by the camera network. This constitutes a linearization of the problem, where non-linearities, such as occlusions, are treated as additional noise on the observed silhouettes. Mathematically, we express the final inverse problem either as Basis Pursuit DeNoising or Lasso convex optimization programs. The sparsity measure is reinforced by iteratively re-weighting the ℓ1-norm of the occupancy vector to better approximate its ℓ0 "norm", and a new kind of "repulsive" sparsity is used to further adapt the Lasso procedure to the occupancy reconstruction. Practically, an adaptive sampling process is proposed to reduce the computation cost and monitor a large occupancy area. Qualitative and quantitative results are presented on a basketball game. The proposed algorithm successfully detects people occluding each other given severely degraded extracted features, while outperforming state-of-the-art people localization techniques.
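The iteratively re-weighted ℓ1 recovery described above can be sketched in a few lines. The following is a minimal illustration (not the authors' implementation): it solves the Lasso subproblem with plain ISTA and then re-weights each coefficient's ℓ1 penalty by the inverse of its current magnitude, pushing the penalty toward an ℓ0-like one. The dictionary `D` stands in for the silhouette dictionary mapping ground-grid cells to stacked camera foreground pixels; all parameter values are illustrative.

```python
import numpy as np

def reweighted_lasso(D, y, lam=0.1, n_reweights=3, n_iters=200, eps=1e-3):
    """Recover a sparse occupancy vector x from silhouette observations y ~ D x.

    D: dictionary mapping ground-grid cells to stacked camera silhouettes.
    Iteratively re-weighted ISTA: the l1 penalty on each coefficient is
    scaled by 1/(|x_i| + eps), better approximating the l0 "norm".
    """
    m, n = D.shape
    x = np.zeros(n)
    w = np.ones(n)                      # per-coefficient l1 weights
    L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    for _ in range(n_reweights):
        for _ in range(n_iters):
            grad = D.T @ (D @ x - y)
            z = x - grad / L
            thresh = lam * w / L
            x = np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)  # soft threshold
        w = 1.0 / (np.abs(x) + eps)     # re-weight toward an l0-like penalty
    return x
```

In the paper's setting, the non-zero entries of the recovered vector indicate the occupied grid cells on the ground plane.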

    Probabilistic speed-density relationship for pedestrians based on data driven space and time representation

    This paper proposes a mathematical framework that provides a detailed characterization of pedestrian flow. It is specifically designed to address the heterogeneity of the pedestrian population, which is reflected through the pedestrian flow indicators. The key components of the presented work are: (i) a data-driven space discretization framework based on Voronoi tessellations that allows a pedestrian-oriented definition of the density indicator; (ii) a statistical, data-driven approach to time aggregation, allowing a pedestrian-oriented definition of the speed indicator; (iii) a probabilistic model for the speed-density relationship, so as to capture the empirically observed heterogeneity among pedestrians. The estimation and validation of the proposed model are performed on the basis of pedestrian tracking input. Data is collected in a Lausanne railway station where a large-scale network of cameras has been installed to automatically locate and track thousands of pedestrians. Additionally, the performance of this methodology is compared against empirical data with well-accepted models published in the literature, with the aim of improving research on pedestrian flow characterization.
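The Voronoi-based density indicator mentioned in point (i) can be illustrated concretely: each pedestrian's individual density is the inverse of the area of their Voronoi cell, so tightly packed pedestrians get small cells and high densities. The sketch below (a simplified illustration, not the paper's code) uses `scipy.spatial.Voronoi` and skips pedestrians on the convex hull of the group, whose cells are unbounded.

```python
import numpy as np
from scipy.spatial import Voronoi

def voronoi_densities(points):
    """Per-pedestrian density as the inverse area of each Voronoi cell.

    points: (N, 2) array of pedestrian ground positions.
    Returns a dict {point_index: density}; pedestrians whose cell is
    unbounded (on the boundary of the group) are skipped.
    """
    vor = Voronoi(points)
    densities = {}
    for i, region_idx in enumerate(vor.point_region):
        region = vor.regions[region_idx]
        if not region or -1 in region:      # unbounded cell: no finite area
            continue
        poly = vor.vertices[region]
        # Shoelace formula for the polygon area
        x, y = poly[:, 0], poly[:, 1]
        area = 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
        densities[i] = 1.0 / area
    return densities
```

For a regular grid of pedestrians spaced 1 m apart, each interior cell is a unit square, giving a density of 1 pedestrian/m².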

    Towards Global People Detection and Tracking using Multiple Depth Sensors


    EFFICIENT SCALE INVARIANT FEATURE BASED METHOD FOR CROWD LOCALIZATION

    Visual surveillance has been a very active research topic in the last few decades due to the growing importance of security in public areas. With the increasing number of CCTV networks in public areas and the enhanced computing power of modern computers, the possibility of entrusting an automatic system with the security and monitoring of events involving large crowds is within reach. Crowd detection and localization in surveillance video is the first step in an automatic crowd monitoring system, and the performance of the whole system depends on this step. Detecting a crowd is a challenging task because crowds come in different shapes, sizes and colors, against cluttered backgrounds and varying illumination conditions. As the size of the crowd increases, managing the crowd becomes more complex.

    Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

    People detection in single 2D images has improved greatly in recent years. However, comparatively little of this progress has percolated into multi-camera multi-people tracking algorithms, whose performance still degrades severely when scenes become very crowded. In this work, we introduce a new architecture that combines Convolutional Neural Nets and Conditional Random Fields to explicitly model these ambiguities. One of its key ingredients is a set of high-order CRF terms that model potential occlusions and give our approach its robustness even when many people are present. Our model is trained end-to-end, and we show that it outperforms several state-of-the-art algorithms on challenging scenes.

    Temporal Smoothing for Joint Probabilistic People Detection in a Depth Sensor Network

    Wide-area indoor people detection in a network of depth sensors is the basis for many applications, e.g. people counting or customer behavior analysis. Existing probabilistic methods use approximate stochastic inference to estimate the marginal probability distribution of people present in the scene for a single time step. In this work we investigate how the temporal context, given by a time series of multi-view depth observations, can be exploited to regularize a mean-field variational inference optimization process. We present a probabilistic grid-based dynamic model and deduce the corresponding mean-field update equations to effectively approximate the joint probability distribution of people present in the scene across space and time. Our experiments show that the proposed temporal regularization leads to a more robust estimation of the desired probability distribution and increases the detection performance.
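The idea of temporally regularized mean-field updates can be sketched on a toy model (a simplified illustration, not the paper's model): each ground-grid cell has a per-frame occupancy probability initialized from the depth evidence, and a pairwise temporal potential of assumed weight `w_temporal` rewards agreement between consecutive frames, so a single frame with corrupted evidence is pulled back toward its neighbors.

```python
import numpy as np

def temporal_mean_field(unary_logits, w_temporal=2.0, n_sweeps=10):
    """Toy mean-field smoothing of per-cell occupancy probabilities over time.

    unary_logits: array (T, N) of per-frame, per-grid-cell evidence
    (log-odds of occupancy from the depth observations).
    A pairwise temporal potential rewards agreement between consecutive
    frames; q is the factorized (mean-field) approximate posterior.
    """
    T, N = unary_logits.shape
    q = 1.0 / (1.0 + np.exp(-unary_logits))   # initialize from the unaries
    for _ in range(n_sweeps):
        for t in range(T):
            msg = np.zeros(N)
            if t > 0:
                msg += w_temporal * (2 * q[t - 1] - 1)   # +w if neighbor occupied
            if t < T - 1:
                msg += w_temporal * (2 * q[t + 1] - 1)
            q[t] = 1.0 / (1.0 + np.exp(-(unary_logits[t] + msg)))
    return q
```

Running this on a sequence where one frame's evidence contradicts its neighbors shows the temporal term overriding the outlier frame, which is the qualitative effect the abstract describes.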

    Joint Probabilistic People Detection in Overlapping Depth Images

    Privacy-preserving high-quality people detection is a vital computer vision task for various indoor scenarios, e.g. people counting, customer behavior analysis, ambient assisted living or smart homes. In this work a novel approach for people detection in multiple overlapping depth images is proposed. We present a probabilistic framework utilizing a generative scene model to jointly exploit the multi-view image evidence, allowing us to detect people from arbitrary viewpoints. Our approach makes use of mean-field variational inference not only to estimate the maximum a posteriori (MAP) state but also to approximate the posterior probability distribution of people present in the scene. Evaluation shows state-of-the-art results on a novel data set for indoor people detection and tracking in depth images from the top view with high perspective distortions. Furthermore, we demonstrate that our approach (compared to the mono-view setup) successfully exploits the multi-view image evidence and robustly converges in only a few iterations.

    Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes

    Humans navigate crowded spaces such as a university campus by following common sense rules based on social etiquette. In this paper, we argue that in order to enable the design of new target tracking or trajectory forecasting methods that can take full advantage of these rules, we need to have access to better data in the first place. To that end, we contribute a new large-scale dataset that collects videos of various types of targets (not just pedestrians, but also bikers, skateboarders, cars, buses and golf carts) navigating a real-world outdoor environment such as a university campus. Moreover, we introduce a new characterization that describes the "social sensitivity" at which two targets interact. We use this characterization to define "navigation styles" and to improve both forecasting models and state-of-the-art multi-target tracking, whereby the learnt forecasting models help the data association step.