487 research outputs found

    Visual Tracking: From An Individual To Groups Of Animals

    This thesis is concerned with the development and application of visual tracking techniques to the domain of animal monitoring. The development and evaluation of a system which uses image analysis to control the robotic placement of a sensor on the back of a feeding pig is presented first. This single-target monitoring application is then followed by the evaluation of suitable techniques for tracking groups of animals, of which the most suitable existing technique is found to be a Markov chain Monte Carlo particle filtering algorithm with a Markov random field motion prior (MCMC MRF, Khan et al. 2004). Finally, a new tracking technique is developed which uses social motion information present in groups of social targets to guide the tracking. This is used in the new Motion Parameter Sharing (MPS) algorithm. MPS is designed to improve the tracking of groups of targets with coordinated motion by incorporating motion information from targets that have been moving in a similar way. Situations where coordinated motion information should improve tracking include animal flocking, people moving as a group or any situation where some targets are moving in a correlated fashion. This new method is tested on a variety of real and artificial data sequences, and its performance is compared to that of the MCMC MRF algorithm. The new MPS algorithm is found to outperform the MCMC MRF algorithm on a number of different types of sequences (including occlusion events and noisy sequences) where correlated motion is present between targets. This improvement is apparent both in the accuracy of target location and robustness of tracking, the latter of which is greatly improved.

    BTLD+: A BAYESIAN APPROACH TO TRACKING LEARNING DETECTION BY PARTS

    The contribution proposed in this thesis focuses on this particular instance of the visual tracking problem, referred to as Adaptive Appearance Tracking. We propose different approaches based on the Tracking Learning Detection (TLD) decomposition proposed in [55]. TLD decomposes visual tracking into three components, namely the tracker, the learner and the detector. The tracker and the detector are two competing processes for target localization based on complementary sources of information. The former searches for local features between consecutive frames in order to localize the target; the latter exploits an on-line appearance model to detect confident hypotheses over the entire image. The learner selects the final solution among the provided hypotheses. It updates the target appearance model, reinitializes the tracker if necessary, and bootstraps the detector's appearance model. In particular, we investigated different approaches to enforce TLD stability. First, we replaced the tracker component with a novel one based on MCMC particle filtering; afterwards, we proposed a robust appearance modeling component able to characterize deformable objects in static images; finally, we integrated a modeling component able to incorporate the learning of local visual features into the whole approach, leading to a coupled layered representation of the target appearance.

    Video foreground extraction for mobile camera platforms

    Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, it is often the case that the algorithm needs to operate under the following challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles. The first problem addresses passenger detection and tracking tasks for public transport buses, investigating the problem of changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Markov Chain Monte Carlo tracking algorithm. Using the SVM classifier, the appearance transformation models capture changes in the appearance of the foreground objects across two consecutive frames under low frame rate conditions. In the second problem, we present a system for pedestrian detection involving scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data. In the first stage, SIFT Homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination.
This produces clusters of images that are differential in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which are in turn used to detect candidate foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be detected. In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradient) techniques (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches include: a) a new histogram feature that is formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook based HOG feature with the branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and c) the codebook based HOGB approach. In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations. The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis.

    Multiple human tracking in RGB-depth data: A survey

    © The Institution of Engineering and Technology. Multiple human tracking (MHT) is a fundamental task in many computer vision applications. Appearance-based approaches, primarily formulated on RGB data, are constrained and affected by problems arising from occlusions and/or illumination variations. In recent years, the arrival of cheap RGB-depth devices has led to many new approaches to MHT, and many of these integrate colour and depth cues to improve each and every stage of the process. In this survey, the authors present the common processing pipeline of these methods and review their methodology based (a) on how they implement this pipeline and (b) on what role depth plays within each stage of it. They identify and introduce existing, publicly available, benchmark datasets and software resources that fuse colour and depth data for MHT. Finally, they present a brief comparative evaluation of the performance of those works that have applied their methods to these datasets.

    Rich probabilistic models for semantic labeling

    The aim of this monograph is to explore the methods and applications of semantic labeling. Our contributions to this rapidly developing topic concern particular aspects of modeling and inference in probabilistic models and their applications in the interdisciplinary fields of computer vision, medical image processing, and remote sensing.

    Efficient Belief Propagation for Perception and Manipulation in Clutter

    Autonomous service robots are required to perform tasks in common human indoor environments. To achieve goals associated with these tasks, the robot should continually perceive, reason about its environment, and plan to manipulate objects, which we term goal-directed manipulation. Perception remains the most challenging aspect of all stages, as common indoor environments typically pose problems in recognizing objects under inherent occlusions with physical interactions among themselves. Despite recent progress in the field of robot perception, accommodating perceptual uncertainty due to partial observations remains challenging and needs to be addressed to achieve the desired autonomy. In this dissertation, we address the problem of perception under uncertainty for robot manipulation in cluttered environments using generative inference methods. Specifically, we aim to enable robots to perceive partially observable environments by maintaining an approximate probability distribution as a belief over possible scene hypotheses. This belief representation captures uncertainty resulting from inter-object occlusions and physical interactions, which are inherently present in cluttered indoor environments. The research efforts presented in this thesis are towards developing appropriate state representations and inference techniques to generate and maintain such belief over contextually plausible scene states. We focus on providing the following features to generative inference while addressing the challenges due to occlusions: 1) generating and maintaining plausible scene hypotheses, 2) reducing the inference search space that typically grows exponentially with respect to the number of objects in a scene, 3) preserving scene hypotheses over continual observations. To generate and maintain plausible scene hypotheses, we propose physics informed scene estimation methods that combine a Newtonian physics engine within a particle based generative inference framework.
The proposed variants of our method with and without a Monte Carlo step showed promising results on generating and maintaining plausible hypotheses under complete occlusions. We show that estimating such scenarios would not be possible by the commonly adopted 3D registration methods without the notion of a physical context that our method provides. To scale up the context informed inference to accommodate a larger number of objects, we describe a factorization of scene state into object and object-parts to perform collaborative particle-based inference. This resulted in the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm that caters to the demands of the high-dimensional multimodal nature of cluttered scenes while being computationally tractable. We demonstrate that PMPNBP is orders of magnitude faster than the state-of-the-art Nonparametric Belief Propagation method. Additionally, we show that PMPNBP successfully estimates poses of articulated objects under various simulated occlusion scenarios. To extend our PMPNBP algorithm for tracking object states over continuous observations, we explore ways to propose and preserve hypotheses effectively over time. This resulted in an augmentation-selection method, where hypotheses are drawn from various proposals followed by the selection of a subset using PMPNBP that explained the current state of the objects. We discuss and analyze our augmentation-selection method with its counterparts in belief propagation literature. Furthermore, we develop an inference pipeline for pose estimation and tracking of articulated objects in clutter. In this pipeline, the message passing module with the augmentation-selection method is informed by segmentation heatmaps from a trained neural network. 
In our experiments, we show that our proposed pipeline can effectively maintain belief and track articulated objects over a sequence of observations under occlusion. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163159/1/kdesingh_1.pd

    Algorithms for trajectory integration in multiple views

    PhD. This thesis addresses the problem of deriving a coherent and accurate localization of moving objects from partial visual information when data are generated by cameras placed at different viewing angles with respect to the scene. The framework is built around applications of scene monitoring with multiple cameras. Firstly, we demonstrate how a geometric-based solution exploits the relationships between corresponding feature points across views and improves accuracy in object location. Then, we improve the estimation of object location with geometric transformations that account for lens distortions. Additionally, we study the integration of the partial visual information generated by each individual sensor and their combination into one single frame of observation that considers object association and data fusion. Our approach is fully image-based, relies only on 2D constructs and does not require any complex computation in 3D space. We exploit the continuity and coherence in objects' motion when crossing cameras' fields of view. Additionally, we work under the assumption of a planar ground plane and a wide baseline (i.e. cameras' viewpoints are far apart). The main contributions are: i) the development of a framework for distributed visual sensing that accounts for inaccuracies in the geometry of multiple views; ii) the reduction of trajectory mapping errors using a statistical-based homography estimation; iii) the integration of a polynomial method for correcting inaccuracies caused by the cameras' lens distortion; iv) a global trajectory reconstruction algorithm that associates and integrates fragments of trajectories generated by each camera.