10 research outputs found

    Principled Detection-by-Classification from Multiple Views

    Machine-learning based classification techniques have been shown to be effective at detecting objects in complex scenes. However, the final results are often obtained from the alarms produced by the classifiers through a post-processing step that typically relies on ad hoc heuristics. Spatially close alarms are assumed to be triggered by the same target and grouped together. Here we replace those heuristics with a principled Bayesian approach, which uses knowledge about both the classifier response model and the scene geometry to combine multiple classification answers. We demonstrate its effectiveness for multi-view pedestrian detection. We estimate the marginal probabilities of presence of people at any location in a scene, given the responses of classifiers evaluated in each view. Our approach naturally takes into account both the occlusions and the very low metric accuracy of the classifiers due to their invariance to translation and scale. Results show our method produces one order of magnitude fewer false positives than a method that is representative of typical state-of-the-art approaches. Moreover, the framework we propose is generic and could be applied to any detection-by-classification task.

    Conditional Random Fields for Multi-Camera Object Detection

    We formulate a model for multi-class object detection in a multi-camera environment. To our knowledge, this is the first time that this problem has been addressed while taking different object classes into account simultaneously. Given several images of the scene taken from different angles, our system estimates the ground plane location of the objects from the output of several object detectors applied at each viewpoint. We cast the problem as an energy minimization modeled with a Conditional Random Field (CRF). Instead of predicting the presence of an object at each image location independently, we simultaneously predict the labeling of the entire scene. Our CRF is able to take into account occlusions between objects and contextual constraints among them. We propose an effective iterative strategy that renders the underlying optimization problem tractable, and learn the parameters of the model with the max-margin paradigm. We evaluate the performance of our model on several challenging multi-camera pedestrian detection datasets, namely PETS 2009 and the EPFL terrace sequence. We also introduce a new dataset in which multiple classes of objects appear simultaneously in the scene. On this dataset we show that our method effectively handles occlusions in the multi-class case.

    Detection-based multi-human tracking using a CRF model


    Incorporating Perception Uncertainty in Human-Aware Navigation: A Comparative Study

    In this work, we present a novel approach to human-aware navigation by probabilistically modelling the uncertainty of perception for a social robotic system and investigating its effect on the overall social navigation performance. The model of the social costmap around a person has been extended to consider this new uncertainty factor, which has been widely neglected despite playing an important role in situations with noisy perception. A social path planner based on the fast marching method has been augmented to account for the uncertainty in the positions of people. The effectiveness of the proposed approach has been tested in extensive experiments carried out with real robots and in simulation. Real experiments have been conducted, given noisy perception, in the presence of single/multiple, static/dynamic humans. Results show that, as the complexity of the environment increases, this approach achieves trajectories that keep a more appropriate social distance from people than those of the basic navigation approach and of the human-aware navigation approach that relies solely on perfect perception. Accounting for uncertainty of perception is shown to result in smoother trajectories with lower jerk that are more natural from the point of view of humans.

    Global Optimisation of Multi‐Camera Moving Object Detection

    An important task in intelligent video surveillance is to detect multiple pedestrians. These pedestrians may be occluded by each other in a camera view. To overcome this problem, multiple cameras can be deployed to provide complementary information, and homography mapping has been widely used for the association and fusion of multi‐camera observations. The intersection regions of the foreground projections usually indicate the locations of moving objects. However, many false positives may be generated from the intersections of non‐corresponding foreground regions. In this thesis, an algorithm for multi‐camera pedestrian detection is proposed. The first stage of this work is to propose pedestrian candidate locations on the top view. Two approaches are proposed in this stage. The first approach is a top‐down approach which is based on the probabilistic occupancy map framework. The ground plane is discretized into a grid, and the likelihood of pedestrian presence at each location is estimated by comparing a rectangle, of the average size of the pedestrians standing there, with the foreground silhouettes in all camera views. The second approach is a bottom‐up approach, which is based on multi‐plane homography mapping. The foreground regions in all camera views are projected and overlaid in the top view according to the multi‐plane homographies, and the potential locations of pedestrians are estimated from the intersection regions. In the second stage, which borrows an idea from the Quine‐McCluskey (QM) method for logic function minimisation, essential candidates are initially identified, each of which covers at least a significant part of the foreground that is not covered by the other candidates. Then non‐essential candidates are selected to cover the remaining foregrounds by following a repeated process, which alternates between merging redundant candidates and finding emerging essential candidates. Then, an alternative approach to the QM method, Petrick’s method, is used for finding the minimum set of pedestrian candidates to cover all the foreground regions. These two methods are non‐iterative and can greatly increase the computational speed. No similar work has been proposed before. Experiments on benchmark video datasets have demonstrated the good performance of the proposed algorithm in comparison with other state‐of‐the‐art methods for pedestrian detection.
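
To make the covering step concrete, here is a minimal sketch of Petrick's method applied to a toy covering problem. It is not the thesis implementation: the function name `petrick_min_cover`, the candidate labels, and the integer region ids are all hypothetical, and the foreground is assumed to be already split into discrete regions.

```python
def petrick_min_cover(coverage):
    """Return a smallest set of candidates that covers every region.

    coverage: dict mapping a candidate label to the set of foreground
    region ids it covers (both are illustrative assumptions).
    """
    regions = set().union(*coverage.values())
    # Petrick's method: build a product of sums, one clause per region,
    # where each clause lists the candidates able to cover that region.
    products = {frozenset()}
    for region in sorted(regions):
        clause = [c for c, covered in coverage.items() if region in covered]
        # Distribute the clause over the products accumulated so far.
        expanded = {p | {c} for p in products for c in clause}
        # Absorption law: drop any product that strictly contains another.
        products = {p for p in expanded if not any(q < p for q in expanded)}
    # Pick the cover with the fewest candidates (ties broken alphabetically).
    return min(products, key=lambda p: (len(p), sorted(p)))


coverage = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}, "D": {1, 4}}
best = petrick_min_cover(coverage)
```

In this toy case both {A, C} and {B, D} cover all four regions, and the alphabetical tie-break selects {A, C}. The expansion can grow exponentially in the worst case, so a sketch like this only suits small candidate sets.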

    Robust moving object detection by information fusion from multiple cameras

    Moving object detection is an essential process before tracking and event recognition in video surveillance can take place. To monitor a wider field of view and avoid occlusions in pedestrian tracking, multiple cameras are usually used and homography can be employed to associate multiple camera views. Foreground regions detected from each of the multiple camera views are projected into a virtual top view according to the homography for a plane. The intersection regions of the foreground projections indicate the locations of moving objects on that plane. The homography mapping for a set of parallel planes at different heights can increase the robustness of the detection. However, homography mapping is very time consuming and the intersections of non-corresponding foreground regions can cause false-positive detections. In this thesis, a real-time moving object detection algorithm using multiple cameras is proposed. Unlike pixelwise homography mapping, which projects binary foreground images, the approach in this thesis approximates the contour of each foreground region with a polygon and transmits and projects only the polygon vertices. The foreground projections are rebuilt from the projected polygons in the reference view. The experimental results have shown that this method can be run in real time and generate results similar to those using foreground images. To identify the false-positive detections, both geometrical information and colour cues are utilized. The former is a height matching algorithm based on the geometry between the camera views. The latter is a colour matching algorithm based on the Mahalanobis distance of the colour distributions of two foreground regions. Since height matching is uncertain in scenarios with adjacent pedestrians and colour matching cannot handle occluded pedestrians, the two algorithms are combined to improve the robustness of the foreground intersection classification. The robustness of the proposed algorithm is demonstrated in real-world image sequences.
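
The vertex-only projection described in this abstract can be sketched as follows. This is an illustrative assumption of how polygon vertices might be mapped through a planar homography, not the thesis code: the function name `project_polygon` and the example matrices are hypothetical, and the homography is assumed to be given as a 3x3 row-major nested list.

```python
def project_polygon(H, vertices):
    """Map (x, y) polygon vertices through a 3x3 homography H.

    H is a row-major nested list; vertices is a list of (x, y) pairs.
    Each vertex is lifted to homogeneous coordinates (x, y, 1),
    multiplied by H, and divided by the resulting third coordinate.
    """
    projected = []
    for x, y in vertices:
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if w == 0:
            raise ValueError("vertex maps to a point at infinity")
        projected.append((u / w, v / w))
    return projected


# Hypothetical homography: scale by 2, then translate by (1, 3).
H = [[2, 0, 1], [0, 2, 3], [0, 0, 1]]
top_view = project_polygon(H, [(0, 0), (1, 0), (1, 1)])
```

Projecting a handful of vertices costs a few multiplications per vertex, whereas a pixelwise mapping touches every foreground pixel, which is consistent with the speed-up the abstract reports.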

    Pedestrian localization, tracking and behavior analysis from multiple cameras

    Video surveillance is currently undergoing rapid growth. However, while thousands of cameras are being installed in public places all over the world, computer programs that could reliably detect and track people in order to analyze their behavior are not yet operational. Challenges are numerous, including low image quality, suboptimal scene lighting, changing appearances of pedestrians, occlusions by the environment and between people, and complex interacting trajectories in crowds. In this thesis, we propose a complete approach for detecting and tracking an unknown number of interacting people from multiple cameras located at eye level. Our system works reliably in spite of significant occlusions and delivers metrically accurate trajectories for each tracked individual. Furthermore, we develop a method for representing the most common types of motion in a specific environment and learning them automatically from image data. We demonstrate that a generative model for detection can effectively handle occlusions in each time frame independently, even when the only data available comes from the output of a simple background subtraction algorithm and when the number of individuals is unknown a priori. We then advocate that multi-people tracking can be achieved by detecting people in individual frames and then linking detections across frames. We formulate the linking step as a problem of finding the most probable state of a hidden Markov process given the set of images and frame-independent detections. We first propose to solve this problem by optimizing trajectories independently with Dynamic Programming. In a second step, we reformulate the problem as a constrained flow optimization resulting in a convex problem that can be solved using standard Linear Programming techniques and is far simpler formally and algorithmically than existing techniques. We show that the particular structure of this framework lets us solve it equivalently using the k-shortest paths algorithm, which leads to a much faster optimization. Finally, we introduce a novel behavioral model to describe pedestrians' motions, which is able to capture sophisticated motion patterns resulting from the mixture of different categories of random trajectories. Due to its simplicity, this model can be learned from video sequences in a totally unsupervised manner through an Expectation-Maximization procedure. We show that this behavior model can be used to make tracking systems more robust in ambiguous situations. Moreover, we demonstrate its ability to characterize and detect atypical individual motions.
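
The dynamic-programming linking step mentioned in this abstract can be illustrated with a small Viterbi-style sketch for a single person. It is a simplification under stated assumptions, not the thesis method: locations form a hypothetical 1-D grid, the motion model is uniform over moves of at most `max_step` cells, and the function name `viterbi_track` is invented for illustration.

```python
import math


def viterbi_track(detection_probs, max_step=1):
    """Most probable location sequence for one person.

    detection_probs[t][i] is the per-frame probability that grid cell i
    is occupied at frame t. The assumed motion model allows a move of at
    most max_step cells between consecutive frames, all equally likely.
    """
    n = len(detection_probs[0])
    eps = 1e-9  # avoids log(0) for zero-probability cells
    score = [math.log(p + eps) for p in detection_probs[0]]
    back = []  # back[t][i]: best predecessor of cell i at frame t + 1
    for probs in detection_probs[1:]:
        new_score, pointers = [], []
        for i in range(n):
            lo, hi = max(0, i - max_step), min(n, i + max_step + 1)
            best_prev = max(range(lo, hi), key=lambda j: score[j])
            new_score.append(score[best_prev] + math.log(probs[i] + eps))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Backtrack from the best final cell to recover the trajectory.
    path = [max(range(n), key=lambda i: score[i])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


frames = [
    [0.9, 0.1, 0.1, 0.1],
    [0.1, 0.8, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1],
]
trajectory = viterbi_track(frames)
```

Here the recovered trajectory follows the strongest detection in each frame, moving one cell per frame. Optimizing each trajectory independently in this way is what the abstract's later flow formulation is meant to improve on, since independent optimization does not prevent two tracks from claiming the same location.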