    Self-correcting Bayesian target tracking

    Visual tracking, a building block for many applications, faces challenges such as occlusions, illumination changes, background clutter and variable motion dynamics that may degrade tracking performance and are likely to cause failures. In this thesis, we propose a Track-Evaluate-Correct framework (self-correction) for existing trackers in order to achieve robust tracking. For a tracker in the framework, we embed an evaluation block to check the status of tracking quality and a correction block to avoid upcoming failures or to recover from failures. We present a generic representation and formulation of self-correcting tracking for Bayesian trackers using a Dynamic Bayesian Network (DBN). The self-correcting tracking operates similarly to a self-aware system, where parameters are tuned in the model, or different models are fused or selected in a piece-wise way, to deal with tracking challenges and failures. In the DBN model representation, the parameter tuning, fusion and model selection are driven by evaluation and correction variables that correspond to the evaluation and correction blocks, respectively. The inferences of variables in the DBN model are used to explain the operation of self-correcting tracking. The specific contributions under the generic self-correcting framework are correlation-based self-correcting tracking for an extended object with model points, and tracker-level fusion, as described below. To improve the probabilistic tracking of an extended object with a set of model points, we use the Track-Evaluate-Correct framework to achieve self-correcting tracking. The framework combines the tracker with an online performance measure and a correction technique. We correlate model point trajectories to improve online the accuracy of a failed or uncertain tracker. A model point tracker gets assistance from neighbouring trackers whenever degradation in its performance is detected using the online performance measure. The correction of the model point state is based on the correlation information from the states of other trackers. Partial Least Squares regression is used to model the correlation of point tracker states from short windowed trajectories adaptively. Experimental results on data obtained from optical motion capture systems show the improvement in tracking performance of the proposed framework compared to the baseline tracker and other state-of-the-art trackers. The proposed framework allows appropriate re-initialisation of local trackers to recover from failures caused by clutter and missed detections in the motion capture data. Finally, we propose a tracker-level fusion framework to obtain self-correcting tracking. The fusion framework combines trackers that address different tracking challenges to improve the overall performance. As a novelty of the proposed framework, we include an online performance measure to identify the track quality level of each tracker and guide the fusion. The trackers in the framework assist each other through appropriate mixing of their prior states. Moreover, the track quality level is used to update the target appearance model. We demonstrate the framework with two Bayesian trackers on video sequences with various challenges and show its robustness compared to the independent use of the trackers in the framework, as well as compared to other state-of-the-art trackers. The performance-measure-based appearance model update and prior mixing allow the proposed framework to deal with tracking challenges.
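    A minimal sketch of the correction step described above, assuming scikit-learn's PLSRegression: a failed point tracker's state is regressed on its neighbours' states over a short, recent trajectory window, and the fitted model predicts a corrected state. The window length, dimensions and data below are illustrative placeholders, not values from the thesis.

    ```python
    # Sketch: correcting a failed model-point tracker from correlated neighbours
    # using Partial Least Squares regression (scikit-learn). Window size and
    # trajectory data are hypothetical placeholders.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(0)
    W = 30  # short trajectory window (frames), illustrative

    # Neighbouring trackers' 2D states over the window: (W, n_neighbours * 2)
    neighbour_traj = rng.normal(size=(W, 6))
    # The (previously reliable) states of the tracker being corrected: (W, 2)
    target_traj = neighbour_traj[:, :2] + 0.1 * rng.normal(size=(W, 2))

    # Fit the correlation model adaptively on the short window
    pls = PLSRegression(n_components=2)
    pls.fit(neighbour_traj, target_traj)

    # When the online performance measure flags a failure, predict a corrected
    # state from the neighbours' current states instead of trusting the tracker.
    current_neighbours = neighbour_traj[-1:].copy()
    corrected_state = pls.predict(current_neighbours)
    print("corrected state:", corrected_state.ravel())
    ```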

    Ego-Downward and Ambient Video based Person Location Association

    Using an egocentric camera for localization and tracking is highly desirable for urban navigation and indoor assistive systems when GPS is unavailable or insufficiently accurate. Traditional hand-designed feature tracking and estimation approaches fail without visible features. Recently, several works have explored using context features for localization; however, all of them suffer severe accuracy loss when no visual context information is available. To provide a possible solution to this problem, this paper proposes a camera system with both an ego-downward and a third-person static view to perform localization and tracking with a learning approach. We also propose a novel action and motion verification model for cross-view verification and localization. We performed comparative experiments on our collected dataset, which accounts for same-dressing, gender, and background diversity. Results indicate that the proposed model achieves an 18.32% improvement in accuracy. Finally, we tested the model on multi-person scenarios and obtained an average accuracy of 67.767%.
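    A rough sketch of the cross-view verification idea, under loose assumptions: motion embeddings from the ego-downward and ambient views are compared with cosine similarity and thresholded to decide whether they describe the same person. The encoders, embedding size and threshold are hypothetical; the paper's actual model is a learned action and motion verification network.

    ```python
    # Sketch: cross-view identity verification by comparing embeddings from the
    # ego-downward view and the ambient (third-person static) view. The
    # embeddings and the decision threshold are hypothetical placeholders.
    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Assume some learned encoders produced these motion embeddings (D = 128)
    rng = np.random.default_rng(1)
    ego_embedding = rng.normal(size=128)
    ambient_embedding = ego_embedding + 0.2 * rng.normal(size=128)

    THRESHOLD = 0.7  # illustrative decision boundary
    score = cosine_similarity(ego_embedding, ambient_embedding)
    same_person = score > THRESHOLD
    print(f"similarity={score:.3f}, same person: {same_person}")
    ```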

    Visual tracking using structural local DCT sparse appearance model with occlusion detection

    In this paper, a structural local DCT sparse appearance model with occlusion detection is proposed for visual tracking in a particle filter framework. The energy compaction property of the 2D-DCT is exploited to reduce the size of the dictionary as well as that of the candidate samples, so that the computational cost of l1-minimization can be lowered. Further, a holistic image reconstruction procedure is proposed for robust occlusion detection and used for the appearance model update, thus avoiding degradation of the appearance model in the presence of occlusion or outliers. Also, a patch occlusion ratio is introduced in the confidence score computation to enhance tracking performance. Quantitative and qualitative performance evaluations on two popular benchmark datasets demonstrate that the proposed tracking algorithm generally outperforms several state-of-the-art methods.
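    A small sketch of the dimensionality-reduction idea, assuming scipy's dctn and scikit-learn's Lasso as the l1 solver: the 2D-DCT concentrates patch energy in the low-frequency corner, so keeping only that block shrinks both the dictionary and the candidate before sparse coding. Patch sizes, the number of kept coefficients and the regularization weight are illustrative.

    ```python
    # Sketch: using 2D-DCT energy compaction to shrink patches before sparse
    # coding. Patch size, kept-coefficient block and l1 solver settings are
    # illustrative assumptions, not the paper's exact configuration.
    import numpy as np
    from scipy.fft import dctn
    from sklearn.linear_model import Lasso

    def compact_dct_features(patch: np.ndarray, keep: int = 8) -> np.ndarray:
        """2D-DCT of a patch, keeping only the low-frequency keep x keep block."""
        coeffs = dctn(patch, norm="ortho")
        return coeffs[:keep, :keep].ravel()

    rng = np.random.default_rng(0)
    patches = [rng.random((16, 16)) for _ in range(20)]     # dictionary patches
    dictionary = np.stack([compact_dct_features(p) for p in patches], axis=1)

    candidate = compact_dct_features(rng.random((16, 16)))  # candidate sample

    # l1-regularized reconstruction of the candidate over the reduced dictionary
    solver = Lasso(alpha=0.01, positive=True, max_iter=5000)
    solver.fit(dictionary, candidate)
    reconstruction_error = np.linalg.norm(candidate - dictionary @ solver.coef_)
    print("sparse code nnz:", np.count_nonzero(solver.coef_),
          "reconstruction error:", round(float(reconstruction_error), 4))
    ```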

    Visual Tracking Based on Correlation Filter and Robust Coding in Bilateral 2DPCA Subspace

    The success of correlation filters in visual tracking has attracted much attention in computer vision due to their high efficiency and performance. However, they are not equipped with a mechanism to cope with challenging situations such as scale variations, out-of-view targets, and camera motion. To deal with such situations, a collaborative tracking scheme based on discriminative and generative models is proposed. Instead of finding all the affine motion parameters of the target from the combined likelihood of these models, correlation filters, based on the discriminative model, are used to find the position of the target, whereas 2D robust coding in a bilateral 2DPCA subspace, based on the generative model, is used to find the remaining affine motion parameters. Further, a 2D robust coding distance is proposed to differentiate candidate samples from the subspace and to compute the observation likelihood in the generative model. In addition, a robust occlusion map is generated from the weights obtained during the residual minimization, and a novel update mechanism of the appearance model is proposed for both the correlation filters and the bilateral 2DPCA subspace. The proposed method is evaluated on the challenging image sequences of the OTB-50, VOT2016, and UAV20L benchmark datasets, and its performance is compared with that of state-of-the-art tracking algorithms. In contrast to OTB-50 and VOT2016, UAV20L contains long-duration sequences with additional challenges introduced by both camera motion and viewpoints in three dimensions. Quantitative and qualitative performance evaluations on the three benchmark datasets demonstrate that the proposed tracking algorithm outperforms the state-of-the-art methods.
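    An illustrative sketch of the correlation-filter stage, using a single-channel MOSSE-style filter as a stand-in for the paper's discriminative model: the filter is learned in the Fourier domain against a Gaussian target response, and the response peak on a search patch localizes the target. The sizes, regularization value and synthetic data are assumptions.

    ```python
    # Sketch: a single-channel MOSSE-style correlation filter for locating the
    # target. Patch sizes and the Gaussian response width are toy values.
    import numpy as np

    def gaussian_response(h, w, sigma=2.0):
        ys, xs = np.mgrid[0:h, 0:w]
        cy, cx = h // 2, w // 2
        return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

    rng = np.random.default_rng(0)
    template = rng.random((64, 64))            # target appearance patch
    G = np.fft.fft2(gaussian_response(64, 64)) # desired response, centred peak
    F = np.fft.fft2(template)

    lam = 1e-2                                 # regularization term
    H = (G * np.conj(F)) / (F * np.conj(F) + lam)  # closed-form filter

    # Search patch: the template shifted by (5, 3); the response peak recovers it
    search = np.roll(template, shift=(5, 3), axis=(0, 1))
    response = np.real(np.fft.ifft2(H * np.fft.fft2(search)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    print("response peak offset:", ((dy - 32) % 64, (dx - 32) % 64))
    ```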

    3D Face Tracking Using Stereo Cameras with Whole Body View

    Visual tracking tasks associated with people tracking are in great demand for modern applications designed to make human life easier and safer. In this thesis, a special case of people tracking is explored: 3D face tracking in whole-body view video, where the tracked face typically occupies no more than 5-10% of the frame area. Currently there is no reliable tracker that can track a face in long-term whole-body view videos with luminance cameras in 3D space. I followed a non-classical approach to designing a 3D tracker: first, a 2D face tracking algorithm was developed in one view and then extended into stereo tracking. I recorded and annotated my own extensive dataset specifically for 2D face tracking in whole-body view video and evaluated 17 state-of-the-art 2D tracking algorithms. Based on the TLD tracker, I developed a face-adapted median flow tracker that shows superior results compared to state-of-the-art generic trackers. I explored different ways of extending 2D tracking into 3D and developed a method of using the epipolar constraint to check the consistency of 3D tracking results (sketched below). This method detects tracking failures early and improves overall 3D tracking accuracy. I demonstrated how a Kinect-based method can be compared to visual tracking methods, and compared four different visual tracking methods running on low-resolution fisheye stereo video with the Kinect face tracking application. My main contributions are: (1) a face adaptation of generic trackers that improves tracking performance in long-term whole-body view videos; and (2) a method of using the epipolar constraint to check the consistency of 3D tracking results.
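    A minimal sketch of the epipolar consistency check, assuming a known fundamental matrix between the stereo views: the symmetric distance of each tracked point to the epipolar line induced by its counterpart is thresholded to flag inconsistent stereo tracks. The matrix, points and pixel threshold below are illustrative.

    ```python
    # Sketch: checking stereo-tracking consistency with the epipolar constraint.
    # The fundamental matrix F and the tracked face points are hypothetical.
    import numpy as np

    def epipolar_distance(F, x_left, x_right):
        """Symmetric point-to-epipolar-line distance for homogeneous points."""
        l_right = F @ x_left          # epipolar line in the right image
        l_left = F.T @ x_right        # epipolar line in the left image
        d_right = abs(x_right @ l_right) / np.hypot(l_right[0], l_right[1])
        d_left = abs(x_left @ l_left) / np.hypot(l_left[0], l_left[1])
        return 0.5 * (d_right + d_left)

    F = np.array([[0.0, -1e-4, 0.02],        # illustrative fundamental matrix
                  [1e-4, 0.0, -0.03],
                  [-0.02, 0.03, 1.0]])
    x_l = np.array([320.0, 240.0, 1.0])      # tracked face centre, left view
    x_r = np.array([300.0, 238.0, 1.0])      # tracked face centre, right view

    MAX_PIXELS = 3.0                         # illustrative consistency threshold
    if epipolar_distance(F, x_l, x_r) > MAX_PIXELS:
        print("epipolar check failed: likely 2D tracking failure in one view")
    else:
        print("stereo track is epipolar-consistent")
    ```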

    Multi-Speaker Tracking with Audio-Visual Information for Robot Perception

    Robot perception plays a crucial role in human-robot interaction (HRI). The perception system provides the robot with information about its surroundings and enables it to give feedback. In a conversational scenario, a group of people may chat in front of the robot and move freely. In such situations, robots are expected to understand where the people are, who is speaking, and what they are talking about. This thesis concentrates on answering the first two questions, namely speaker tracking and diarization. We use different modalities of the robot's perception system to achieve this goal. Like seeing and hearing for a human being, audio and visual information are the critical cues for a robot in a conversational scenario. The advancement of computer vision and audio processing over the last decade has revolutionized robot perception abilities. This thesis makes the following contributions. We first develop a variational Bayesian framework for tracking multiple objects. The variational Bayesian framework gives closed-form tractable solutions, which makes the tracking process efficient. The framework is first applied to visual multiple-person tracking, with birth and death processes built jointly into the framework to deal with the varying number of people in the scene. Furthermore, we exploit the complementarity of vision and robot motor information: on the one hand, the robot's active motion can be integrated into the visual tracking system to stabilize the tracking; on the other hand, visual information can be used to perform motor servoing. Moreover, audio and visual information are then combined in the variational framework, to estimate the smooth trajectories of speaking people and to infer the acoustic status of a person: speaking or silent. In addition, we apply the model to acoustic-only speaker localization and tracking, where online dereverberation techniques are applied first, followed by the tracking system. Finally, a variant of the acoustic speaker tracking model based on the von Mises distribution is proposed, which is specifically adapted to directional data. All the proposed methods are validated on datasets appropriate to each application.
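    A small sketch of the directional-data variant, assuming scipy's von Mises distribution: candidate directions of arrival are scored against the tracker's predicted speaker azimuth, standing in for the observation likelihood of the von Mises based acoustic tracker. The concentration parameter and angles are illustrative.

    ```python
    # Sketch: von Mises observation likelihood for a direction-of-arrival (DOA)
    # based speaker tracker. The predicted direction, concentration kappa and
    # candidate DOAs are hypothetical placeholders.
    import numpy as np
    from scipy.stats import vonmises

    predicted_doa = np.deg2rad(30.0)   # tracker's predicted speaker azimuth
    kappa = 8.0                        # concentration (inverse of spread)

    candidate_doas = np.deg2rad(np.array([25.0, 90.0, 33.0, 180.0]))
    likelihoods = vonmises.pdf(candidate_doas, kappa, loc=predicted_doa)

    # The candidate closest (circularly) to the prediction scores highest
    best = int(np.argmax(likelihoods))
    print("best candidate azimuth (deg):", np.rad2deg(candidate_doas[best]))
    ```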

    Predictive Duty Cycling of Radios and Cameras using Augmented Sensing in Wireless Camera Networks

    Energy efficiency dominates practically every aspect of the design of wireless camera networks (WCNs), and duty cycling of radios and cameras is an important tool for achieving high energy efficiency. However, duty cycling in WCNs is made complex by the camera nodes having to anticipate the arrival of objects in their field of view. What adds to this complexity is the fact that radio duty cycling and camera duty cycling are tightly coupled notions in WCNs. In this dissertation, we present a predictive framework that provides camera nodes with the ability to anticipate the arrival of an object in the field of view of their cameras. This allows a predictive adaptation of network parameters simultaneously in multiple layers. Such an anticipatory approach is made possible by enabling each camera node in the network to track an object beyond its direct sensing range and to adapt network parameters in multiple layers before the object arrives in its sensing range. The proposed framework exploits a single spare bit in the MAC header of the 802.15.4 protocol to create this beyond-the-sensing-range capability for the camera nodes. In this manner, our approach to notifying the nodes about the current state of the object location entails no additional communication overhead. Our experimental evaluations, based on large-scale simulations as well as an Imote2-based wireless camera network, demonstrate that the proposed predictive adaptation approach, while providing comparable application-level performance, significantly reduces energy consumption compared to approaches addressing only single-layer adaptation or those with reactive adaptation.
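    A toy sketch of the spare-bit idea: one reserved bit of the 802.15.4 frame control field carries the object-location state, so notification costs no extra bytes on the air. Treating bit 7 as the spare bit is an illustrative assumption, not necessarily the dissertation's exact choice.

    ```python
    # Sketch: piggybacking object-presence state on a spare MAC-header bit.
    # Bit 7 of the 802.15.4 frame control field is reserved in the 2003/2006
    # specs; using it as the spare bit here is an illustrative assumption.
    SPARE_BIT = 7

    def set_object_flag(frame_control: int, object_nearby: bool) -> int:
        """Encode one bit of tracking state into the 16-bit frame control field."""
        if object_nearby:
            return frame_control | (1 << SPARE_BIT)
        return frame_control & ~(1 << SPARE_BIT)

    def read_object_flag(frame_control: int) -> bool:
        """Decode the piggybacked bit on the receiving camera node."""
        return bool(frame_control & (1 << SPARE_BIT))

    fc = 0x8841  # example frame control value (data frame, short addressing)
    fc = set_object_flag(fc, object_nearby=True)
    assert read_object_flag(fc)  # downstream node wakes its camera/radio early
    print(f"frame control with piggybacked flag: 0x{fc:04X}")
    ```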

    Horizontal flow fields observed in Hinode G-band images. I. Methods

    Context: The interaction of plasma motions and magnetic fields is an important mechanism driving solar activity in all its facets. For example, photospheric flows are responsible for the advection of magnetic flux, the redistribution of flux during the decay of sunspots, and the build-up of magnetic shear in flaring active regions. Aims: Systematic studies based on G-band data from the Japanese Hinode mission provide the means to gather statistical properties of horizontal flow fields. This facilitates comparative studies of solar features, e.g., G-band bright points, magnetic knots, pores, and sunspots at various stages of evolution and in distinct magnetic environments, thus enhancing our understanding of the dynamic Sun. Methods: We adapted Local Correlation Tracking (LCT) to measure horizontal flow fields based on G-band images obtained with the Solar Optical Telescope on board Hinode. In total, about 200 time-series with durations between 1-16 h and cadences between 15-90 s were analyzed. Selecting both a high-cadence (dt = 15 s) and a long-duration (dT = 16 h) time-series enabled us to optimize and validate the LCT input parameters, ensuring a robust, reliable, uniform, and accurate processing of a huge data volume. Results: The LCT algorithm produces the best results for G-band images with a cadence of 60-90 s. If the cadence is lower, the velocity of slowly moving features is not reliably detected; if it is higher, the scene on the Sun will have evolved too much to bear any resemblance to the earlier situation. Consequently, in both instances horizontal proper motions are underestimated. The most reliable and yet detailed flow maps are produced using a Gaussian kernel with a size of 2560 km x 2560 km and a full-width-at-half-maximum (FWHM) of 1200 km (corresponding to the size of a typical granule) as the sampling window. Comment: 12 pages, 8 figures, 4 tables, accepted for publication in Astronomy and Astrophysics.
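    An illustrative sketch of the LCT principle: subimages from two consecutive frames are apodized with a Gaussian sampling window, and the cross-correlation peak between them gives the local displacement (which, divided by the cadence, yields a velocity). The toy sizes below are in pixels, not the physical Hinode values quoted above.

    ```python
    # Sketch of Local Correlation Tracking: a Gaussian-windowed subimage from
    # frame 1 is cross-correlated with the co-located subimage from frame 2;
    # the correlation peak offset is the local displacement. Toy sizes only.
    import numpy as np

    def gaussian_window(n: int, fwhm: float) -> np.ndarray:
        sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
        ax = np.arange(n) - n // 2
        g = np.exp(-ax**2 / (2 * sigma**2))
        return np.outer(g, g)

    def lct_displacement(sub1: np.ndarray, sub2: np.ndarray, fwhm: float):
        """Displacement of sub2 relative to sub1 via FFT cross-correlation."""
        n = sub1.shape[0]
        w = gaussian_window(n, fwhm)
        a = (sub1 - sub1.mean()) * w
        b = (sub2 - sub2.mean()) * w
        cc = np.real(np.fft.ifft2(np.fft.fft2(a).conj() * np.fft.fft2(b)))
        dy, dx = np.unravel_index(np.argmax(cc), cc.shape)
        # unwrap circular lags into signed displacements
        return ((dy + n // 2) % n - n // 2, (dx + n // 2) % n - n // 2)

    rng = np.random.default_rng(0)
    frame1 = rng.random((32, 32))
    frame2 = np.roll(frame1, shift=(1, 2), axis=(0, 1))  # features moved (1, 2) px
    print("measured displacement (px):", lct_displacement(frame1, frame2, fwhm=12.0))
    ```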

    A facility to Search for Hidden Particles (SHiP) at the CERN SPS

    A new general-purpose fixed-target facility is proposed at the CERN SPS accelerator, aimed at exploring the domain of hidden particles and at making measurements with tau neutrinos. Hidden particles are predicted by a large number of models beyond the Standard Model. The high intensity of the 400 GeV SPS beam allows probing a wide variety of models containing light long-lived exotic particles with masses below O(10) GeV/c^2, including very weakly interacting low-energy SUSY states. The experimental programme of the proposed facility is capable of being extended in the future, e.g. to include direct searches for Dark Matter and Lepton Flavour Violation. Comment: Technical Proposal.