222 research outputs found

    A sparsity-driven approach to multi-camera tracking in visual sensor networks

    In this paper, a sparsity-driven approach is presented for multi-camera tracking in visual sensor networks (VSNs). VSNs consist of battery-powered image sensors, embedded processors and wireless transceivers. Since energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. Motivated by the goal of tracking in a bandwidth-constrained environment, we present a sparsity-driven method to compress the features extracted by the camera nodes, which are then transmitted across the network for distributed inference. We have designed special overcomplete dictionaries that match the structure of the features, leading to very parsimonious yet accurate representations. We have tested our method in indoor and outdoor people-tracking scenarios. Our experimental results demonstrate how our approach leads to communication savings without significant loss in tracking performance.
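
    The compression scheme described above can be illustrated with generic sparse coding: represent each feature vector with a few atoms of an overcomplete dictionary and transmit only the nonzero coefficients. Below is a minimal numpy sketch using orthogonal matching pursuit with a random dictionary for illustration; the paper's specially designed, feature-matched dictionaries are not reproduced here.

    ```python
    import numpy as np

    def omp(D, x, n_nonzero):
        """Orthogonal matching pursuit: approximate x as a sparse
        combination of dictionary atoms (columns of D)."""
        residual = x.copy()
        support = []
        coeffs = np.zeros(D.shape[1])
        for _ in range(n_nonzero):
            # Pick the atom most correlated with the current residual.
            idx = int(np.argmax(np.abs(D.T @ residual)))
            support.append(idx)
            # Re-fit coefficients on the chosen support by least squares.
            sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
            residual = x - D[:, support] @ sol
        coeffs[support] = sol
        return coeffs  # only (support, sol) needs to be transmitted

    # Toy example: a 64-dim feature compressed to 5 dictionary coefficients.
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 256))      # overcomplete: 256 atoms, 64 dims
    D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
    x = D[:, [3, 17, 90]] @ np.array([1.0, -0.5, 2.0])  # truly sparse signal
    a = omp(D, x, n_nonzero=5)
    print("reconstruction error:", np.linalg.norm(D @ a - x))
    ```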

    PhD forum: multi-view occupancy maps using a network of low resolution visual sensors

    An occupancy map provides an abstract top view of a scene and can be used for many applications such as domotics, surveillance, elderly care and video teleconferencing. Such maps can be accurately estimated from multiple camera views. However, using a network of regular high-resolution cameras makes the system expensive and quickly raises privacy concerns (e.g. in elderly homes). Furthermore, their power consumption makes battery operation difficult. A solution could be the use of a network of low-resolution visual sensors, but their limited resolution could degrade the accuracy of the maps. In this paper we used simulations to determine the minimum resolution required to derive accurate occupancy maps, which were then used to track people. Multi-view occupancy maps were computed from foreground silhouettes derived via an analysis of moving edges. Ground occupancies computed from each view were fused in a Dempster-Shafer framework. Tracking was done via a Bayes filter using the occupancy map per time instance as measurement. We found that for a room of 8.8 by 9.2 m, 4 cameras with a resolution as low as 64 by 48 pixels were sufficient to estimate accurate occupancy maps and track up to 4 people. These findings indicate that it is possible to use low-resolution visual sensors to build a cheap, power-efficient and privacy-friendly system for occupancy monitoring.
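
    The per-cell fusion step can be sketched with standard Dempster's rule of combination over the frame {occupied, free}; the masses below are illustrative values, not the paper's calibrated ones.

    ```python
    def dempster_combine(m1, m2):
        """Combine two basic belief assignments over the frame
        {occupied (O), free (F)}; 'U' is the universal set O|F."""
        keys = ("O", "F", "U")
        combined = {k: 0.0 for k in keys}
        conflict = 0.0
        for a in keys:
            for b in keys:
                mass = m1[a] * m2[b]
                if a == b:
                    combined[a] += mass
                elif "U" in (a, b):              # U intersect X = X
                    combined[a if b == "U" else b] += mass
                else:                            # O intersect F = empty
                    conflict += mass
        norm = 1.0 - conflict
        return {k: v / norm for k, v in combined.items()}

    # Two cameras weakly agree that a ground cell is occupied.
    cam1 = {"O": 0.6, "F": 0.1, "U": 0.3}
    cam2 = {"O": 0.5, "F": 0.2, "U": 0.3}
    print(dempster_combine(cam1, cam2))  # belief in "O" rises above either input
    ```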

    Multicamera trajectory analysis for semantic behaviour characterisation

    In this paper we propose an innovative approach to behaviour recognition in a multicamera environment, based on translating video activity into semantics. First, we fuse tracks from individual cameras through clustering, employing soft-computing techniques. Then, we introduce a higher-level module able to translate the fused tracks into semantic information. With our proposed approach, we address the challenge set in PETS 2014 on recognising behaviours of interest around a parked vehicle, namely the abnormal behaviour of someone walking around the vehicle.
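
    As one hypothetical way to translate a fused track into the PETS 2014 "walking around a parked vehicle" label, one could threshold the angular range the track sweeps around the vehicle centre. The feature and threshold below are assumptions for illustration, not the paper's soft-computing pipeline.

    ```python
    import math

    def walks_around_vehicle(track, vehicle_xy, min_coverage_deg=270.0):
        """Flag a track as 'walking around the vehicle' if the bearings
        from the vehicle centre to the track sweep a large angular range."""
        angles = [math.atan2(y - vehicle_xy[1], x - vehicle_xy[0])
                  for x, y in track]
        # Accumulate unwrapped angular change along the trajectory.
        swept = 0.0
        for a0, a1 in zip(angles, angles[1:]):
            d = a1 - a0
            d = (d + math.pi) % (2 * math.pi) - math.pi  # wrap to (-pi, pi]
            swept += abs(d)
        return math.degrees(swept) >= min_coverage_deg

    # A track circling a vehicle parked at the origin.
    circle = [(math.cos(t) * 3, math.sin(t) * 3)
              for t in [i * 0.2 for i in range(35)]]
    print(walks_around_vehicle(circle, (0.0, 0.0)))   # True
    ```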

    Multi-camera cooperative scene interpretation

    In our society, video processing has become a convenient and widely used tool to assist, protect and simplify the daily life of people in areas such as surveillance and video conferencing. The growing number of cameras and the ability to handle and analyze the resulting vast amounts of video data enable the development of multi-camera applications that cooperatively use multiple sensors. In many applications, however, bandwidth constraints, privacy issues, and difficulties in storing and analyzing large amounts of video data make such applications costly and technically challenging. In this thesis, we deploy techniques ranging from low-level to high-level approaches, specifically designed for multi-camera networks. As a low-level approach, we designed a novel foreground detection algorithm for real-time tracking applications, concentrating on difficult and changing illumination conditions. The main part of this dissertation focuses on a detailed analysis of two novel state-of-the-art real-time tracking approaches: a multi-camera tracking approach based on occupancy maps and a distributed multi-camera tracking approach with a feedback loop. As a high-level application, we propose an approach to understand the dynamics in meetings (so-called smart meetings) using a multi-camera setup consisting of fixed ambient and portable close-up cameras. For all methods, we provide qualitative and quantitative results from several experiments and compare against state-of-the-art methods.
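
    The dissertation's foreground detector is not detailed in this abstract; as a point of reference, a common baseline that tolerates gradual illumination change is a per-pixel running mean/variance background model with an adaptive threshold. The sketch below is that generic baseline, not the thesis's algorithm.

    ```python
    import numpy as np

    class RunningAverageBackground:
        """Per-pixel running mean/variance; slow adaptation lets the model
        follow gradual illumination changes while flagging fast changes."""
        def __init__(self, first_frame, alpha=0.02, k=2.5):
            self.mean = first_frame.astype(np.float64)
            self.var = np.full_like(self.mean, 25.0)   # initial std of 5
            self.alpha, self.k = alpha, k

        def apply(self, frame):
            frame = frame.astype(np.float64)
            diff = frame - self.mean
            foreground = diff ** 2 > (self.k ** 2) * self.var
            # Update the model only where the pixel looks like background.
            bg = ~foreground
            self.mean[bg] += self.alpha * diff[bg]
            self.var[bg] += self.alpha * (diff[bg] ** 2 - self.var[bg])
            return foreground

    # Toy usage on random grayscale frames.
    rng = np.random.default_rng(1)
    frames = rng.normal(128, 3, size=(10, 48, 64))
    bgmodel = RunningAverageBackground(frames[0])
    for f in frames[1:]:
        mask = bgmodel.apply(f)
    print("foreground pixels in last frame:", int(mask.sum()))
    ```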

    3D Tracking Using Multi-view Based Particle Filters

    Visual surveillance and monitoring of indoor environments using multiple cameras has become a field of great activity in computer vision. Typical 3D tracking and positioning systems rely on several independent 2D tracking modules applied over individual camera streams, fused using geometrical relationships across cameras. As 2D tracking systems suffer inherent difficulties due to point-of-view limitations (perceptually similar foreground and background regions causing fragmentation of moving objects, occlusions), 3D tracking based on partially erroneous 2D tracks is likely to fail when handling multiple-people interaction. To overcome this problem, this paper proposes a Bayesian framework for combining 2D low-level cues from multiple cameras directly into the 3D world through 3D particle filters. This method makes it possible to estimate the probability of a certain volume being occupied by a moving object, and thus to segment and track multiple people across the monitored area. The proposed method is built on simple, binary 2D moving-region segmentations on each camera, considered as different state observations. In addition, the method proves well suited for integrating additional 2D low-level cues to increase system robustness to occlusions: along this line, a naïve color-based (HSI) appearance model has been integrated, resulting in clear performance improvements when dealing with complex scenarios.
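
    The central idea, weighting 3D hypotheses directly by the 2D foreground evidence in every view, can be sketched as a particle-filter likelihood: a 3D particle scores highly when its projection lands on foreground in most cameras. Below is a toy sketch assuming known camera matrices; the paper's exact observation model and the HSI appearance term are omitted.

    ```python
    import numpy as np

    def likelihood(particles, masks, projections):
        """Weight 3D particles by how many camera foreground masks their
        projections hit. particles: (N,3); masks: list of HxW bool arrays;
        projections: list of 3x4 camera matrices (assumed known)."""
        n = particles.shape[0]
        homog = np.hstack([particles, np.ones((n, 1))])     # (N,4)
        weights = np.ones(n)
        for P, mask in zip(projections, masks):
            uvw = homog @ P.T                               # (N,3)
            u = (uvw[:, 0] / uvw[:, 2]).astype(int)
            v = (uvw[:, 1] / uvw[:, 2]).astype(int)
            h, w = mask.shape
            inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
            hit = np.zeros(n, dtype=bool)
            hit[inside] = mask[v[inside], u[inside]]
            # Soft vote: a foreground hit keeps weight, a miss decays it.
            weights *= np.where(hit, 1.0, 0.1)
        return weights / weights.sum()

    def resample(particles, weights, rng):
        """Multinomial resampling step of the particle filter."""
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        return particles[idx] + rng.normal(0, 0.05, particles.shape)

    # Toy demo: one overhead camera mapping x,y metres to pixels 10:1.
    P = np.array([[10.0, 0, 0, 32], [0, 10.0, 0, 24], [0, 0, 0, 1]])
    mask = np.zeros((48, 64), dtype=bool)
    mask[20:30, 30:40] = True                               # a blob
    rng = np.random.default_rng(2)
    parts = rng.uniform(-3, 3, (200, 3))
    w = likelihood(parts, [mask], [P])
    parts = resample(parts, w, rng)
    print("particles concentrated near blob:", parts[:, :2].mean(axis=0))
    ```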

    A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

    Tracking humans that are interacting with other subjects or with the environment remains unsolved in visual tracking, because the visibility of the humans of interest in videos is unknown and may vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we consider the visibility status of a subject as a fluent variable, whose change is mostly attributed to the subject's interaction with its surroundings, e.g., crossing behind another object, entering a building, or getting into a vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the causal-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graph model to jointly reason about visibility fluent changes (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search for a feasible causal graph structure that enables fast search algorithms, e.g., dynamic programming. We apply the proposed method to challenging video sequences to evaluate its ability to estimate visibility fluent changes of subjects and to track subjects of interest over time. Results with comparisons demonstrate that our method outperforms alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions.
    Comment: accepted by CVPR 201
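
    Because each subject's visibility fluent forms a chain over time, the joint reasoning admits dynamic programming. Below is a minimal Viterbi-style sketch over three assumed fluent states; the transition and observation scores are illustrative stand-ins for the C-AOG energies.

    ```python
    import numpy as np

    STATES = ["visible", "occluded", "contained"]   # e.g. inside a vehicle

    def best_fluent_path(obs_scores, trans_scores):
        """Viterbi decoding of the most probable visibility-fluent sequence.
        obs_scores: (T, S) log-likelihood of each state per frame;
        trans_scores: (S, S) log-probability of state transitions."""
        T, S = obs_scores.shape
        dp = np.full((T, S), -np.inf)
        back = np.zeros((T, S), dtype=int)
        dp[0] = obs_scores[0]
        for t in range(1, T):
            cand = dp[t - 1][:, None] + trans_scores    # (S, S)
            back[t] = cand.argmax(axis=0)
            dp[t] = cand.max(axis=0) + obs_scores[t]
        # Trace back the optimal sequence of fluent values.
        path = [int(dp[-1].argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t][path[-1]]))
        return [STATES[s] for s in reversed(path)]

    # Toy scores: detector confidence drops mid-sequence (subject occluded).
    obs = np.log(np.array([[0.9, 0.05, 0.05],
                           [0.8, 0.15, 0.05],
                           [0.2, 0.70, 0.10],
                           [0.1, 0.75, 0.15],
                           [0.85, 0.10, 0.05]]))
    trans = np.log(np.array([[0.8, 0.15, 0.05],
                             [0.2, 0.70, 0.10],
                             [0.1, 0.20, 0.70]]))
    print(best_fluent_path(obs, trans))
    ```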

    Parameter-unaware autocalibration for occupancy mapping

    People localization and occupancy mapping are common and important tasks for multi-camera systems. In this paper, we present a novel approach that overcomes the hurdle of manually calibrating the extrinsics of a multi-camera system. Our approach is completely parameter-unaware, meaning that the user does not need to know the focal length, position or viewing angle of any camera in advance, nor will these values be calibrated as such. The only requirements for the multi-camera setup are that the views overlap substantially and that the cameras are mounted at approximately the same height, requirements satisfied by most typical multi-camera configurations. The proposed method uses the observed height of an object or person moving through the space to estimate the distance to that object or person. Using this distance to backproject the lowest point of each detected object, we obtain a rotated and anisotropically scaled view of the ground plane for each camera. An algorithm is presented to estimate the anisotropic scaling parameters and rotation for each camera, after which ground-plane positions can be computed up to an isotropic scale factor. Lens distortion is not taken into account. The method is tested in simulation, yielding average accuracies within 5 cm, and in a real multi-camera environment with an accuracy within 15 cm.
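
    The two geometric steps, mapping each detection to ground coordinates up to a per-camera rotation and anisotropic scale and then aligning the cameras, can be sketched as below. This is a hypothetical simplification: it assumes the apparent height of a person is inversely proportional to distance under a pinhole model, and it fits the alignment as a general 2x2 linear map (which also admits shear) rather than the paper's constrained rotation-plus-scale estimate.

    ```python
    import numpy as np

    def pseudo_ground_point(u_foot, v_top, v_bottom):
        """Map a detection to ground-plane coordinates up to an unknown
        per-camera rotation and anisotropic scale: the apparent height
        h = v_bottom - v_top is inversely proportional to distance."""
        h = v_bottom - v_top
        depth = 1.0 / h                 # relative distance along the optic axis
        lateral = u_foot / h            # relative sideways offset
        return np.array([lateral, depth])

    def align_cameras(pts_a, pts_b):
        """Least-squares 2x2 linear map taking camera B's pseudo-ground
        points onto camera A's (subsumes rotation + anisotropic scale)."""
        M, *_ = np.linalg.lstsq(pts_b, pts_a, rcond=None)
        return M.T                      # pts_a ~= (M.T @ pts_b.T).T

    # A person with foot column 120, top/bottom rows 40 and 200:
    print(pseudo_ground_point(120.0, 40.0, 200.0))

    # Toy check: recover a synthetic rotation+scale between two cameras.
    rng = np.random.default_rng(3)
    pts_a = rng.uniform(1, 5, (50, 2))
    theta = 0.4
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]]) @ np.diag([2.0, 0.5])
    pts_b = pts_a @ np.linalg.inv(A).T  # camera B's distorted view
    M = align_cameras(pts_a, pts_b)
    print("recovered map matches A:", np.allclose(M, A))
    ```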

    Towards Global People Detection and Tracking using Multiple Depth Sensors
