2,573 research outputs found

    Online Metric-Weighted Linear Representations for Robust Visual Tracking

    In this paper, we propose a visual tracker based on a metric-weighted linear representation of appearance. In order to capture the interdependence of different feature dimensions, we develop two online distance metric learning methods using proximity comparison information and structured output learning. The learned metric is then incorporated into a linear representation of appearance. We show that online distance metric learning significantly improves the robustness of the tracker, especially on sequences exhibiting drastic appearance changes. In order to bound growth in the number of training samples, we design a time-weighted reservoir sampling method. Moreover, we enable our tracker to automatically perform object identification during the process of object tracking, by introducing a collection of static template samples belonging to several object classes of interest. Object identification results for an entire video sequence are achieved by systematically combining the tracking information and visual recognition at each frame. Experimental results on challenging video sequences demonstrate the effectiveness of the method for both inter-frame tracking and object identification. Comment: 51 pages. Appearing in IEEE Transactions on Pattern Analysis and Machine Intelligence
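    The time-weighted reservoir sampling idea can be illustrated with a small sketch. The snippet below is a minimal, assumed formulation based on the Efraimidis-Spirakis weighted reservoir scheme with an exponential recency weight; the paper's exact weighting may differ, and the `TimeWeightedReservoir` class and `decay` parameter are illustrative names rather than the authors' implementation.

```python
import heapq
import math
import random

class TimeWeightedReservoir:
    """Fixed-size reservoir whose items are kept with probability proportional
    to a time-dependent weight (Efraimidis-Spirakis A-Res scheme).  Newer
    samples receive larger weights, so the reservoir is biased towards recent
    appearance samples while its size stays bounded."""

    def __init__(self, capacity, decay=0.05):
        self.capacity = capacity
        self.decay = decay          # assumed: controls how quickly old samples fade
        self.heap = []              # min-heap of (key, frame_index, sample)

    def add(self, sample, frame_index):
        weight = math.exp(self.decay * frame_index)   # time-weighted priority
        key = random.random() ** (1.0 / weight)       # A-Res random key
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, (key, frame_index, sample))
        elif key > self.heap[0][0]:
            heapq.heapreplace(self.heap, (key, frame_index, sample))

    def samples(self):
        return [s for _, _, s in self.heap]

# usage: feed one appearance sample per tracked frame
reservoir = TimeWeightedReservoir(capacity=100)
for t in range(1000):
    reservoir.add(sample=f"feature_vector_{t}", frame_index=t)
print(len(reservoir.samples()))   # never exceeds the capacity (100)
```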

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    No abstract available

    Optimisation of Mobile Communication Networks - OMCO NET

    The mini-conference “Optimisation of Mobile Communication Networks” focuses on advanced methods for search and optimisation applied to wireless communication networks. It is sponsored by the Research & Enterprise Fund of Southampton Solent University. The conference strives to widen knowledge of advanced search methods capable of optimising wireless communication networks. The aim is to provide a forum for the exchange of recent knowledge, new ideas and trends in this progressive and challenging area. The conference will popularise successful new approaches to resolving hard tasks such as minimisation of transmit power and cooperative and optimal routing.

    BFO vs. BSO for video object tracking using particle filter (PF)

    In this paper, we introduce a new algorithm for video tracking, which is the process of locating a moving object (or multiple objects) over time using a camera. A new particle filter based on bacterial foraging optimization (PF-BFO) is introduced in the field of video object tracking. This paper reviews the particle filter and its use for tracking. Particle Swarm Optimization (PSO) is also described, and the combination of PSO with PF (PF-PSO) for video object tracking is reviewed. Bacterial Foraging Optimization (BFO) is a novel heuristic algorithm inspired by the foraging behavior of E. coli. After analyzing its optimization mechanism, a series of measures is taken to improve the classic BFO by using the particle filter. PSO is a meta-heuristic that, like ACO, is also inspired by insect life; both methods use a population of entities. A comparison between PF-BFO and PF-PSO for video object tracking is presented in this work. The results show that PF is a strong tool in the tracking field and that the PF-BFO method gives outstanding performance compared with PF-PSO.
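    For context, the baseline that both PF-BFO and PF-PSO build on is the bootstrap particle filter. The sketch below shows a generic predict-update-resample cycle for 2-D position tracking under assumed Gaussian motion and measurement models; it is a plain particle filter, not the BFO- or PSO-enhanced variants compared in the paper, and the noise parameters are illustrative.

```python
import numpy as np

def particle_filter_step(particles, weights, measurement,
                         motion_std=2.0, meas_std=5.0):
    """One predict-update-resample cycle of a bootstrap particle filter for
    2-D position tracking.  particles: (N, 2) array of position hypotheses,
    weights: (N,) array summing to one, measurement: observed (x, y)."""
    n = len(particles)
    # predict: random-walk motion model
    particles = particles + np.random.normal(0.0, motion_std, particles.shape)
    # update: Gaussian likelihood of the measurement given each particle
    d2 = np.sum((particles - measurement) ** 2, axis=1)
    weights = weights * np.exp(-0.5 * d2 / meas_std ** 2)
    weights += 1e-300                       # avoid an all-zero weight vector
    weights /= weights.sum()
    # resample: multinomial resampling to combat weight degeneracy
    idx = np.random.choice(n, size=n, p=weights)
    return particles[idx], np.full(n, 1.0 / n)

# usage: track a target drifting to the right in a 100x100 image
particles = np.random.uniform(0, 100, size=(500, 2))
weights = np.full(500, 1.0 / 500)
for t in range(50):
    z = np.array([10.0 + t, 50.0]) + np.random.normal(0, 3, 2)  # noisy detection
    particles, weights = particle_filter_step(particles, weights, z)
estimate = particles.mean(axis=0)           # posterior mean position
```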

    Audioā€Visual Speaker Tracking

    Target motion tracking has found application in interdisciplinary fields of computer vision and signal processing, including but not limited to surveillance and security, forensic science, intelligent transportation systems, driving assistance, monitoring of prohibited areas, medical science, robotics, action and expression recognition, individual speaker discrimination in multi-speaker environments, and video conferencing. Among these applications, speaker tracking in enclosed spaces has been gaining relevance due to widespread advances in devices and technologies and the need for seamless solutions for real-time tracking and localization of speakers. However, speaker tracking is a challenging task in real-life scenarios, as several distinctive issues influence the tracking process, such as occlusions and an unknown number of speakers. One approach to overcoming these issues is to use multi-modal information, as it conveys complementary information about the state of the speakers compared to single-modal tracking. Several multi-modal approaches have been proposed, which can be classified into two categories, namely deterministic and stochastic. This chapter aims to provide multimedia researchers with a state-of-the-art overview of tracking methods used for combining multiple modalities to accomplish various multimedia analysis tasks, classifying them into different categories and listing new and future trends in this field.

    Iterative Estimation of Rigid-Body Transformations: Application to Robust Object Tracking and Iterative Closest Point

    Closed-form solutions are traditionally used in computer vision for estimating rigid-body transformations. Here we suggest an iterative solution for estimating rigid-body transformations and prove its global convergence. We show that for a number of applications involving repeated estimation of rigid-body transformations, an iterative scheme is preferable to a closed-form solution. We illustrate this experimentally on two applications, 3D object tracking and image registration with Iterative Closest Point. Our results show that for those problems, using an iterative and continuous estimation process is more robust than using many independent closed-form estimations.
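    As a rough illustration of the iterative alternative to closed-form estimation, the sketch below refines a 2-D rigid transform (rotation angle and translation) by gradient descent on the sum of squared point-to-point residuals. It is not the paper's formulation (which covers the 3-D case and proves global convergence); the function name, learning rate, and iteration count are assumed for the example.

```python
import numpy as np

def refine_rigid_2d(src, dst, theta=0.0, t=np.zeros(2), lr=0.05, iters=500):
    """Iteratively refine a 2-D rigid transform (rotation angle theta,
    translation t) mapping src points onto dst points by gradient descent on
    the sum of squared residuals.  Warm-starting theta and t from the previous
    frame's estimate is what makes a continuous, iterative scheme attractive
    for tracking."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    t = np.asarray(t, float).copy()
    for _ in range(iters):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        dR = np.array([[-s, -c], [c, -s]])        # derivative of R w.r.t. theta
        residual = src @ R.T + t - dst            # (N, 2) residuals
        grad_theta = 2.0 * np.sum(residual * (src @ dR.T))
        grad_t = 2.0 * residual.sum(axis=0)
        theta -= lr * grad_theta / len(src)
        t -= lr * grad_t / len(src)
    return theta, t

# usage: recover a known 30-degree rotation and a small translation
rng = np.random.default_rng(0)
src = rng.uniform(-1, 1, size=(100, 2))
true_theta, true_t = np.deg2rad(30), np.array([0.3, -0.2])
R = np.array([[np.cos(true_theta), -np.sin(true_theta)],
              [np.sin(true_theta),  np.cos(true_theta)]])
dst = src @ R.T + true_t
theta, t = refine_rigid_2d(src, dst)              # converges near (0.52, [0.3, -0.2])
```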

    Efficient Belief Propagation for Perception and Manipulation in Clutter

    Autonomous service robots are required to perform tasks in common human indoor environments. To achieve the goals associated with these tasks, the robot should continually perceive and reason about its environment, and plan to manipulate objects, which we term goal-directed manipulation. Perception remains the most challenging aspect of all stages, as common indoor environments typically pose problems for recognizing objects under the inherent occlusions and physical interactions among them. Despite recent progress in the field of robot perception, accommodating perceptual uncertainty due to partial observations remains challenging and needs to be addressed to achieve the desired autonomy.

    In this dissertation, we address the problem of perception under uncertainty for robot manipulation in cluttered environments using generative inference methods. Specifically, we aim to enable robots to perceive partially observable environments by maintaining an approximate probability distribution as a belief over possible scene hypotheses. This belief representation captures uncertainty resulting from inter-object occlusions and physical interactions, which are inherently present in cluttered indoor environments. The research efforts presented in this thesis are towards developing appropriate state representations and inference techniques to generate and maintain such a belief over contextually plausible scene states. We focus on providing the following features to generative inference while addressing the challenges due to occlusions: 1) generating and maintaining plausible scene hypotheses, 2) reducing the inference search space, which typically grows exponentially with the number of objects in a scene, and 3) preserving scene hypotheses over continual observations.

    To generate and maintain plausible scene hypotheses, we propose physics-informed scene estimation methods that combine a Newtonian physics engine with a particle-based generative inference framework. The proposed variants of our method, with and without a Monte Carlo step, showed promising results in generating and maintaining plausible hypotheses under complete occlusions. We show that estimating such scenarios would not be possible with the commonly adopted 3D registration methods, which lack the notion of physical context that our method provides. To scale the context-informed inference up to a larger number of objects, we describe a factorization of the scene state into objects and object parts to perform collaborative particle-based inference. This resulted in the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm, which caters to the high-dimensional, multimodal nature of cluttered scenes while remaining computationally tractable. We demonstrate that PMPNBP is orders of magnitude faster than the state-of-the-art Nonparametric Belief Propagation method. Additionally, we show that PMPNBP successfully estimates the poses of articulated objects under various simulated occlusion scenarios.

    To extend our PMPNBP algorithm to tracking object states over continuous observations, we explore ways to propose and preserve hypotheses effectively over time. This resulted in an augmentation-selection method, where hypotheses are drawn from various proposals, followed by the selection, using PMPNBP, of a subset that explains the current state of the objects. We discuss and analyze our augmentation-selection method against its counterparts in the belief propagation literature. Furthermore, we develop an inference pipeline for pose estimation and tracking of articulated objects in clutter. In this pipeline, the message-passing module with the augmentation-selection method is informed by segmentation heatmaps from a trained neural network. In our experiments, we show that our proposed pipeline can effectively maintain the belief and track articulated objects over a sequence of observations under occlusion.

    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/163159/1/kdesingh_1.pd
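    The "pull" flavour of message passing described above can be caricatured in a few lines: each particle on the receiving node pulls support from the sender's weighted belief samples through the pairwise potential, rather than having samples pushed to it. The sketch below is a heavily simplified, assumed reading of that idea for scalar states; `pull_message` and the Gaussian potential are illustrative and are not the PMPNBP algorithm itself.

```python
import numpy as np

def pull_message(receiver_particles, sender_belief_particles,
                 sender_belief_weights, pairwise_potential):
    """Approximate a message to a receiver node by letting each receiver-side
    particle 'pull' support from the sender's weighted belief samples through
    the pairwise potential.  Returns normalized message weights, one per
    receiver particle."""
    weights = np.zeros(len(receiver_particles))
    for i, x_r in enumerate(receiver_particles):
        # support pulled from every sender belief sample
        support = np.array([pairwise_potential(x_r, x_s)
                            for x_s in sender_belief_particles])
        weights[i] = np.dot(sender_belief_weights, support)
    weights += 1e-12                    # guard against an all-zero message
    return weights / weights.sum()

# usage: two scalar nodes whose states should differ by roughly 1.0
potential = lambda xr, xs: np.exp(-0.5 * ((xr - xs) - 1.0) ** 2)
receiver = np.linspace(-2, 4, 50)                  # candidate receiver states
sender = np.random.normal(0.0, 0.3, size=200)      # sender belief samples
sender_w = np.full(200, 1.0 / 200)
msg = pull_message(receiver, sender, sender_w, potential)
# msg peaks near receiver states around 1.0, as the pairwise potential dictates
```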