396 research outputs found

    Local Geometric Consensus: A General Purpose Point Pattern-Based Tracking Algorithm

    Proceedings of ACM ISMAR 2015, Fukuoka, Japan. International audience. We present a method that quickly and robustly matches 2D and 3D point patterns based solely on their spatial distribution, and that can also exploit other cues when available. This method can be easily adapted to many transformations, such as similarity transformations in 2D/3D and affine and perspective transformations in 2D. It is based on local geometric consensus among several local matchings and a refinement scheme. We provide two implementations of this general scheme, one for the 2D homography case (which can be used for marker or image tracking) and one for the 3D similarity case. We demonstrate the robustness and speed of our proposal on both synthetic and real images and show that our method can be used to augment any planar object (textured or textureless) as well as 3D objects.
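
The idea of scoring a candidate transformation by how many points agree with it can be illustrated with a minimal sketch. This is not the authors' Local Geometric Consensus algorithm; it is a generic hypothesize-and-verify example for the 2D similarity case, assuming points are stored as complex numbers so that a similarity transform is `q = a*p + b`. The function names, the tolerance, and the toy data are all hypothetical.

```python
import cmath

def similarity_from_pairs(p0, p1, q0, q1):
    """Return (a, b) such that q = a*p + b, from two point correspondences."""
    a = (q1 - q0) / (p1 - p0)   # complex factor: scale and rotation
    b = q0 - a * p0             # translation
    return a, b

def consensus(a, b, model, scene, tol=1e-6):
    """Count model points that land within tol of some scene point."""
    count = 0
    for p in model:
        mapped = a * p + b
        if any(abs(mapped - q) <= tol for q in scene):
            count += 1
    return count

# Toy data (assumed): a model pattern, and a scene that contains the
# transformed pattern plus one outlier point.
model = [complex(0, 0), complex(1, 0), complex(0, 2), complex(3, 1)]
true_a = 2 * cmath.exp(1j * cmath.pi / 6)   # scale 2, rotation of 30 degrees
true_b = complex(5, -1)
scene = [true_a * p + true_b for p in model] + [complex(10, 10)]

# Hypothesize a transform from one local match, then verify it globally.
a, b = similarity_from_pairs(model[0], model[1], scene[0], scene[1])
votes = consensus(a, b, model, scene)
```

In a real matcher the initial correspondences are unknown, so many local hypotheses would be generated and the one with the largest consensus kept before refinement.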

    Visual pose estimation system for autonomous rendezvous of spacecraft

    In this work, a tracker spacecraft equipped with a short-range vision system is tasked with visually identifying a target spacecraft and determining its relative angular velocity and relative linear velocity using only visual information from onboard cameras. Focusing on methods that are feasible for implementation on relatively simple spacecraft hardware, we locate and track objects in three-dimensional space using conventional high-resolution cameras, saving cost and power compared to laser or infrared ranging systems. Identification of the target is done by means of visual feature detection and tracking across rapid, successive frames, taking the perspective matrix of the camera system into account, and building feature maps in three dimensions over time. Features detected in two-dimensional images are matched and triangulated to provide three-dimensional feature maps using structure-from-motion techniques. This methodology allows one, two, or more cameras with known baselines to be used for triangulation, with more images resulting in higher accuracy. Triangulated points are organized by means of orientation histogram descriptors and used to identify and track parts of the target spacecraft over time. This allows some estimation of the target spacecraft's motion even if parts of the spacecraft are obscured or in shadow. The state variables with respect to the camera system are extracted as a relative rotation quaternion and relative translation vector for the target. Robust tracking of the state variables for the target spacecraft is accomplished by an embedded adaptive unscented Kalman filter. In addition to estimation of the target quaternion from visual information, the adaptive filter can also identify when tracking errors have occurred by measurement of the residual. 
Significant variations in lighting can be tolerated as long as the movement of the satellite is consistent with the system model, and illumination changes slowly enough for state variables to be estimated periodically. Inertial measurements over short periods of time can then be used to determine the movement of both the tracker and target spacecraft. In addition, with a sufficient number of features tracked, the center of mass of the target can be located. This method is tested using laboratory images of spacecraft movement with a simulated spacecraft movement model. Varying conditions are applied to demonstrate the effectiveness and limitations of the system for online estimation of the movement of a target spacecraft at close range.
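
Triangulation with a known baseline can be sketched for the simplest configuration. The example below assumes an ideal rectified stereo pair (parallel optical axes, identical focal length, image coordinates measured in pixels from the principal point), which the abstract does not specify; the function name and the numeric values are hypothetical and are not taken from the work itself.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline_m):
    """Recover a 3D point in the left-camera frame from a rectified stereo match.

    Uses the standard relation Z = f * B / d, where d is the disparity.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point cannot be triangulated")
    z = focal_px * baseline_m / disparity
    return (x_left * z / focal_px, y * z / focal_px, z)

# Assumed setup: 800 px focal length, 0.1 m baseline, feature seen at
# x = 100 px in the left image and x = 80 px in the right image.
point = triangulate_rectified(100.0, 80.0, 40.0, focal_px=800.0, baseline_m=0.1)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters more for distant points, which is one reason additional views tend to improve accuracy.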

    Proposal Flow: Semantic Correspondences from Object Proposals

    Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category. We introduce a novel approach to semantic flow, dubbed proposal flow, that establishes reliable correspondences using object proposals. Unlike prevailing semantic flow approaches that operate on pixels or regularly sampled local regions, proposal flow benefits from the characteristics of modern object proposals, which exhibit high repeatability at multiple scales, and can take advantage of both local and geometric consistency constraints among proposals. We also show that the corresponding sparse proposal flow can effectively be transformed into a conventional dense flow field. We introduce two new challenging datasets that can be used to evaluate both general semantic flow techniques and region-based approaches such as proposal flow. We use these benchmarks to compare different matching algorithms, object proposals, and region features within proposal flow to the state of the art in semantic flow. This comparison, along with experiments on standard datasets, demonstrates that proposal flow significantly outperforms existing semantic flow methods in various settings. Comment: arXiv admin note: text overlap with arXiv:1511.0506
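
One plausible way to turn a sparse box-to-box match into per-pixel flow is to map each pixel through its relative position inside the matched boxes. The sketch below assumes axis-aligned boxes given as `(x0, y0, x1, y1)` and a single matched pair; it illustrates the general idea only and should not be read as the paper's exact procedure. The function name and the box values are hypothetical.

```python
def box_flow(src_box, tgt_box, x, y):
    """Displacement of pixel (x, y) induced by a matched pair of boxes."""
    sx0, sy0, sx1, sy1 = src_box
    tx0, ty0, tx1, ty1 = tgt_box
    # Relative position of the pixel inside the source box, in [0, 1].
    u = (x - sx0) / (sx1 - sx0)
    v = (y - sy0) / (sy1 - sy0)
    # Same relative position inside the target box.
    x_t = tx0 + u * (tx1 - tx0)
    y_t = ty0 + v * (ty1 - ty0)
    return (x_t - x, y_t - y)

# Assumed matched proposals: the centre of the source box maps to the
# centre of the target box.
flow = box_flow((10, 10, 30, 50), (100, 20, 140, 100), 20, 30)
```

A dense field would then need a rule for pixels covered by several matched proposals, for example keeping the displacement from the highest-scoring match.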

    A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

    Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually inspecting the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model-free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model-free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic. Comment: E. Antonakos and P. Snape contributed equally and have joint second authorship.

    Vision-based retargeting for endoscopic navigation

    Endoscopy is a standard procedure for visualising the human gastrointestinal tract. With the advances in biophotonics, imaging techniques such as narrow band imaging, confocal laser endomicroscopy, and optical coherence tomography can be combined with normal endoscopy to assist the early diagnosis of diseases such as cancer. In the past decade, optical biopsy has emerged as an effective tool for tissue analysis, allowing in vivo and in situ assessment of pathological sites with real-time feature-enhanced microscopic images. However, the non-invasive nature of optical biopsy leads to an intra-examination retargeting problem, which is associated with the difficulty of re-localising a biopsied site consistently throughout the whole examination. In addition to intra-examination retargeting, retargeting of a pathological site is even more challenging across examinations, due to tissue deformation and changing tissue morphologies and appearances. The purpose of this thesis is to address both the intra- and inter-examination retargeting problems associated with optical biopsy. We propose a novel vision-based framework for intra-examination retargeting. The proposed framework is based on combining visual tracking and detection with online learning of the appearance of the biopsied site. Furthermore, a novel cascaded detection approach based on random forests and structured support vector machines is developed to achieve efficient retargeting. For reliable inter-examination retargeting, this thesis casts the task as an image retrieval problem, for which an online scene association approach is proposed to summarise an endoscopic video collected in the first examination into distinctive scenes. A hashing-based approach is then used to learn the intrinsic representations of these scenes, such that retargeting can be achieved in subsequent examinations by retrieving the relevant images using the learnt representations. 
For performance evaluation of the proposed frameworks, extensive phantom, ex vivo and in vivo experiments have been conducted, with results demonstrating the robustness and potential clinical value of the proposed methods. Open Access
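
Retrieval with hashed representations usually reduces to comparing short binary codes. The sketch below assumes that each stored scene has already been encoded as an integer bit code by some learnt hash function (not shown), and it retrieves the scene whose code is closest to the query in Hamming distance. The codes, scene names, and function names are hypothetical and do not come from the thesis.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")

def retrieve(query: int, database: dict) -> str:
    """Return the name of the stored scene with the closest code."""
    return min(database, key=lambda name: hamming(query, database[name]))

# Assumed 8-bit codes for three scenes summarised from a first examination.
scenes = {
    "scene_a": 0b10110010,
    "scene_b": 0b01001101,
    "scene_c": 0b11100001,
}

# Code of a frame observed in a later examination (also assumed).
best = retrieve(0b10110011, scenes)
```

A linear scan is adequate for a handful of scenes; larger collections would typically index the codes rather than compare against every entry.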