65 research outputs found

    PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects

    In this paper, we present a novel algorithm for fast tracking of generic objects in videos. The algorithm uses two components: a detector that makes use of the generalised Hough transform with pixel-based descriptors, and a probabilistic segmentation method based on global models for foreground and background. These components are combined for tracking, and they adapt each other in a co-training manner. Through effective model adaptation and segmentation, the algorithm is able to track objects that undergo rigid and non-rigid deformations and considerable shape and appearance variations. The proposed tracking method has been thoroughly evaluated on challenging standard videos, and outperforms state-of-the-art tracking methods designed for the same task. Finally, the proposed models allow for an extremely efficient implementation, and thus tracking is very fast.
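
As a rough illustration of the Hough-voting idea mentioned in this abstract, the sketch below lets each pixel vote for an object center based on a quantized descriptor code. It is a minimal sketch under stated assumptions, not the PixelTrack implementation: the function names `learn_votes` and `vote_for_center`, the integer descriptor codes, and the unweighted voting are all hypothetical, and the segmentation and co-training steps are not shown.

```python
from collections import Counter, defaultdict


def learn_votes(codes, object_pixels, center):
    """Store, per descriptor code, the displacements from pixel to object center.

    codes: 2D list of hashable descriptor codes (hypothetical quantization).
    object_pixels: iterable of (row, col) positions belonging to the object.
    center: (row, col) of the object center in this frame.
    """
    table = defaultdict(list)
    cy, cx = center
    for y, x in object_pixels:
        table[codes[y][x]].append((cy - y, cx - x))
    return table


def vote_for_center(codes, table):
    """Accumulate votes from every pixel and return the most voted position."""
    acc = Counter()
    for y, row in enumerate(codes):
        for x, code in enumerate(row):
            # Pixels whose code was never seen on the object cast no votes.
            for dy, dx in table.get(code, ()):
                acc[(y + dy, x + dx)] += 1
    return acc.most_common(1)[0][0] if acc else None
```

In this toy form the vote table is learned once from the first frame; an adaptive tracker would also update it over time, which is omitted here.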

    Visual tracking of non-rigid objects with partial occlusion through elastic structure of local patches and hierarchical diffusion

    In this paper, a tracking method based on sequential Bayesian inference is proposed. The proposed method focuses on solving both the problem of tracking under partial occlusions and the problem of non-rigid object tracking in real-time on a desktop personal computer (PC). The proposed method is mainly composed of two parts: (1) modeling the target object using an elastic structure of local patches for robust performance; and (2) an efficient hierarchical diffusion method to perform the tracking procedure in real-time. The elastic structure of local patches allows the proposed method to handle partial occlusions and non-rigid deformations through the relationship among neighboring patches. The proposed hierarchical diffusion method generates samples from the region where the posterior is concentrated to reduce computation time. The method is extensively tested on a number of challenging image sequences with occlusion and non-rigid deformation. The experimental results show the real-time capability and the robustness of the proposed method under various situations.

    Visual object tracking performance measures revisited

    Visual tracking evaluation features a large variety of performance measures and suffers from a lack of consensus about which measures should be used in experiments. This makes cross-paper tracker comparison difficult. Furthermore, as some measures may be less effective than others, the tracking results may be skewed or biased towards particular tracking aspects. In this paper we revisit the popular performance measures and tracker performance visualizations and analyze them theoretically and experimentally. We show that several measures are equivalent in terms of the information they provide for tracker comparison and, crucially, that some are more brittle than others. Based on our analysis we narrow down the set of potential measures to only two complementary ones, describing accuracy and robustness, thus pushing towards homogenization of the tracker evaluation methodology. These two measures can be intuitively interpreted and visualized and have been employed by the recent Visual Object Tracking (VOT) challenges as the foundation for the evaluation methodology.
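
The two measures named in this abstract can be illustrated with a short sketch. It assumes, as a simplification not spelled out in the abstract, that accuracy is the average bounding-box overlap (intersection over union) on frames where the tracker still overlaps the target, and that robustness is approximated by counting frames with zero overlap. The function names and the (x, y, w, h) box format are hypothetical, and no re-initialization protocol is modeled.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def accuracy_and_failures(predicted, ground_truth):
    """Return (mean overlap on tracked frames, number of zero-overlap frames)."""
    overlaps = [iou(p, t) for p, t in zip(predicted, ground_truth)]
    tracked = [o for o in overlaps if o > 0]
    accuracy = sum(tracked) / len(tracked) if tracked else 0.0
    failures = sum(1 for o in overlaps if o == 0)
    return accuracy, failures
```

Keeping the two numbers separate reflects the abstract's point that accuracy and robustness are complementary: a tracker can overlap well while it tracks and still fail often.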

    Automatic tracking for RGB and color object detection

    This article presents a tracking algorithm for RGB (red, green, blue) images with an applied focus on video-surveillance systems. Procedures for segmentation, detection, and finally tracking were developed, along with the integration of virtual information into a real environment. Several experiments were carried out based on the proposed detection estimates, which compare color wavelength and distance, and on criteria used in the literature: precision, recall, and F-measure (an overall assessment of correct detection). The results show a detection performance of 83% using wavelength comparison and an F-measure of 0.882, indicating accurate detection during tracking with the proposed algorithm.
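
For readers unfamiliar with the detection criteria named in this abstract, the sketch below computes precision, recall, and the F-measure from detection counts using the standard definitions. The function name and the counts in the usage note are hypothetical and are not taken from the article.

```python
def detection_scores(true_pos, false_pos, false_neg):
    """Return (precision, recall, F-measure) from raw detection counts."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    # The F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

For example, `detection_scores(8, 2, 2)` gives a precision, recall, and F-measure of 0.8 each, since 8 of 10 detections are correct and 8 of 10 true objects are found.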

    Bio-mimetic models for moving object detection and tracking

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2014. Jin Young Choi. In this thesis, we propose bio-mimetic models for motion detection and visual tracking to overcome the limitations of existing methods in actual environments. The models are inspired by the theory that there are four different forms of visual memory for human visual perception when representing a scene: visible persistence, informational persistence, visual short-term memory (VSTM), and visual long-term memory (VLTM). We view our problem as a problem of modeling and representing an observed scene with temporary short-term models (TSTM) and conservative long-term models (CLTM). We study how to build efficient and effective models for TSTM and CLTM, and how to use them together to obtain robust detection and tracking results under occlusions, clumsy initializations, background clutter, drifting, and non-rigid deformations encountered in actual environments. First, we propose an efficient representation of TSTM to be used for moving object detection on non-stationary cameras, which runs within 5.8 milliseconds (ms) on a PC, and in real-time on mobile devices. To achieve real-time capability with robust performance, our method models the background through the proposed dual-mode kernel model (DMKM) and compensates for the motion of the camera by mixing neighboring models. Modeling through DMKM prevents the background model from being contaminated by foreground pixels, while still allowing the model to adapt to changes of the background. Mixing neighboring models reduces the errors arising from motion compensation, and their influence is further reduced by keeping the age of the model. Also, to decrease the computational load, the proposed method applies one DMKM to multiple pixels without performance degradation. Experimental results show the computational lightness and the real-time capability of our method on a smartphone with robust detection performance.
Second, using concepts from both TSTM and CLTM, a new visual tracking method based on a novel tri-model is proposed. The proposed method aims to solve the problems of occlusions, background clutter, and drifting simultaneously with the new tri-model. The proposed tri-model is composed of three models, which learn the target object, the background, and other non-target moving objects online, respectively. The proposed scheme performs tracking by finding the best explanation of the scene with the three learned models. By utilizing the information in the background and the foreground models as well as the target object model, our method obtains robust results under occlusions and background clutter. Also, the target object model is updated in a conservative way to prevent drifting. Furthermore, our method is not restricted to bounding boxes when representing the target object, and is able to give pixel-wise tracking results. Third, we go beyond pixel-wise modeling and propose a local-feature-based tracking model using both TSTM and CLTM to track objects in case of uncertain initializations and severe occlusions. To track objects accurately in such situations, the proposed scheme uses "motion saliency" and "descriptor saliency" of local features and performs tracking based on the generalized Hough transform (GHT). The proposed motion saliency of a local feature utilizes the instantaneous velocity of features to form TSTM and emphasizes features having distinctive motions, compared to the motions of local features which do not belong to the object. The descriptor saliency models local features as CLTM and emphasizes features which are likely to belong to the object in terms of their feature descriptors. Through these saliencies, the proposed method tries to "learn and find" the target object rather than looking for what was given at initialization, becoming robust to initialization problems.
Also, our tracking result is obtained by combining the results of each local feature of the target and the surroundings, thus being robust against severe occlusions as well. The proposed method is compared against eight other methods, with nine image sequences and one hundred random initializations. The experimental results show that our method outperforms all other compared methods. Fourth and last, we focus on building robust CLTM with local patches and their neighboring structures. The proposed method is based on sequential Bayesian inference and focuses on solving both the problem of tracking under partial occlusions and the problem of non-rigid object tracking in real-time on desktop personal computers (PC). The proposed scheme is mainly composed of two parts: (1) modeling the target object using an elastic structure of local patches for robust performance; and (2) an efficient hierarchical diffusion method to perform the tracking process in real-time. The elastic structure of local patches allows the proposed scheme to handle partial occlusions and non-rigid deformations through the relationship among neighboring patches. The proposed hierarchical diffusion generates samples from the region where the posterior is concentrated to reduce computation time. The method is extensively tested on a number of challenging image sequences with occlusion and non-rigid deformation.
The experimental results show the real-time capability and the robustness of the proposed scheme under various situations.

    Table of contents:
    1 Introduction
    1.1 Background and Research Issues
    1.1.1 Issues in Motion Detection
    1.1.2 Issues in Object Tracking
    1.2 The Human Visual Memory
    1.2.1 Sensory Memory
    1.2.2 Visual Short-Term Memory
    1.2.3 Visual Long-Term Memory
    1.3 Bio-mimetic Framework for Detection and Tracking
    1.4 Contents of the Research
    2 Detection by Pixel-wise Dual-Mode Kernel Model
    2.1 Proposed Method
    2.1.1 Approximated Gaussian Kernel Model
    2.1.2 Dual-Mode Kernel Model (DMKM)
    2.1.3 Motion Compensation by Mixing Models
    2.1.4 Detection of Foreground Pixels
    2.2 Experimental Results
    2.2.1 Runtime Comparison
    2.2.2 Qualitative Comparison
    2.2.3 Quantitative Comparison
    2.2.4 Effects of Dual-Mode Kernel Model
    2.2.5 Effects of Motion Compensation
    2.2.6 Mobile Results
    2.3 Remarks and Discussion
    3 Tracking by Pixel-wise Tri-Model Representation
    3.1 Tri-Model Framework
    3.1.1 Overall Scheme
    3.1.2 Advantages
    3.1.3 Practical Approximation
    3.2 Tracking with the Tri-Model
    3.2.1 Likelihood of the Tri-Model
    3.2.2 Likelihood Maximization
    3.2.3 Estimating Pixel-Wise Labels
    3.3 Learning the Tri-Model
    3.3.1 Target Model
    3.3.2 Background Model
    3.3.3 Foreground Model
    3.4 Experimental Results
    3.4.1 Experimental Settings
    3.4.2 Tracking Accuracy: Bounding Box
    3.4.3 Tracking Accuracy: Pixel-Wise
    3.5 Remarks and Discussion
    4 Tracking by Feature-point-wise Saliency Model
    4.1 Proposed Method
    4.1.1 Tracking based on GHT
    4.1.2 Descriptor Saliency and Feature DB Update
    4.1.3 Motion Saliency
    4.2 Experimental Results
    4.2.1 Tracking with Inaccurate Initializations
    4.2.2 Tracking Under Occlusions
    4.3 Remarks and Discussion
    5 Tracking by Patch-wise Elastic Structure Model
    5.1 Tracking with Elastic Structure of Local Patches
    5.1.1 Sequential Bayesian Inference Framework
    5.1.2 Elastic Structure of Local Patches
    5.1.3 Modeling a Single Patch
    5.1.4 Modeling the Relationship between Patches
    5.1.5 Model Update
    5.1.6 Hierarchical Diffusion
    5.1.7 Summary of the Proposed Method
    5.2 Experiments
    5.2.1 Parameter Effects
    5.2.2 Performance Evaluation
    5.2.3 Discussion on Translation, Rotation, Illumination Changes
    5.2.4 Discussion on Partial Occlusions
    5.2.5 Discussion on Non-Rigid Deformations
    5.2.6 Discussion on Additional Cases
    5.2.7 Summary of Tracking Results
    5.2.8 Effectiveness of Hierarchical Diffusion
    5.2.9 Limitations
    5.3 Remarks and Discussion
    6 Concluding Remarks and Future Works
    Bibliography
    Abstract in Korean

    Visual tracking over multiple temporal scales

    Visual tracking is the task of repeatedly inferring the state (position, motion, etc.) of the desired target in an image sequence. It is an important scientific problem as humans can visually track targets in a broad range of settings. However, visual tracking algorithms struggle to robustly follow a target in unconstrained scenarios. Among the many challenges faced by visual trackers, two important ones are occlusions and abrupt motion variations. Occlusions take place when one or more other objects obscure the camera's view of the tracked target. A target may exhibit abrupt variations in apparent motion due to its own unexpected movement, camera movement, and low frame rate image acquisition. Each of these issues can cause a tracker to lose its target. This thesis introduces the idea of learning and propagation of tracking information over multiple temporal scales to overcome occlusions and abrupt motion variations. A temporal scale is a specific sequence of moments in time. Models (describing appearance and/or motion of the target) can be learned from the target tracking history over multiple temporal scales and applied over multiple temporal scales in the future. With the rise of multiple motion model tracking frameworks, there is a need for a broad range of search methods and ways of selecting between the available motion models. The potential benefits of learning over multiple temporal scales are first assessed by studying both motion and appearance variations in the ground-truth data associated with several image sequences. A visual tracker operating over multiple temporal scales is then proposed that is capable of handling occlusions and abrupt motion variations. Experiments are performed to compare the performance of the tracker with competing methods, and to analyze the impact on performance of various elements of the proposed approach. Results reveal a simple, yet general framework for dealing with occlusions and abrupt motion variations.
In refining the proposed framework, a search method is generalized for multiple competing hypotheses in visual tracking, and a new motion model selection criterion is proposed.

    Symmetry-Driven Accumulation of Local Features for Human Characterization and Re-identification

    This work proposes a method to characterize the appearance of individuals exploiting body visual cues. The method is based on a symmetry-driven appearance-based descriptor and a matching policy that allows an individual to be recognized. The descriptor encodes three complementary visual characteristics of the human appearance: the overall chromatic content, the spatial arrangement of colors into stable regions, and the presence of recurrent local motifs with high entropy. The characteristics are extracted by following symmetry and asymmetry perceptual principles, which make it possible to segregate meaningful body parts and to focus on the human body only, pruning out the background clutter. The descriptor handles the case where only a single image of the individual is available, as well as the case where multiple pictures of the same identity are available, as in a tracking scenario. The descriptor is dubbed Symmetry-Driven Accumulation of Local Features (SDALF). Our approach is applied to two different scenarios: re-identification and multi-target tracking. In the former, we show the capabilities of SDALF in encoding distinctive aspects of an individual, focusing on its robustness to very low resolution images, occlusions, pose changes, and variations of viewpoint and scene illumination. SDALF has been tested on various benchmark datasets, obtaining generally convincing performance and setting the state of the art in some cases. The latter scenario shows the benefits of using SDALF as the observation model for different trackers, boosting their performance in several respects on the CAVIAR dataset.