16 research outputs found

    On Pairwise Costs for Network Flow Multi-Object Tracking

    Multi-object tracking has recently been approached with min-cost network flow optimization techniques. Such methods simultaneously resolve multiple object tracks in a video and enable modeling of dependencies among tracks. Min-cost network flow methods also fit well within the "tracking-by-detection" paradigm, where object trajectories are obtained by connecting per-frame outputs of an object detector. Object detectors, however, often fail due to occlusions and clutter in the video. To cope with such situations, we propose to add pairwise costs to the min-cost network flow framework. While finding integer solutions to such a problem becomes NP-hard, we design a convex relaxation solution with an efficient rounding heuristic which empirically gives certificates of small suboptimality. We evaluate two particular types of pairwise costs and demonstrate improvements over recent tracking methods in real-world video sequences.

    Dynamic Body VSLAM with Semantic Constraints

    Image-based reconstruction of urban environments is a challenging problem that involves optimizing a large number of variables and has several sources of error, such as the presence of dynamic objects. Since most large-scale approaches assume that the observed scene is static, dynamic objects are relegated to the noise modeling section of such systems. This is an approach of convenience, since the RANSAC-based framework used to compute most multiview geometric quantities for static scenes naturally confines dynamic objects to the class of outlier measurements. However, reconstructing dynamic objects along with the static environment helps us get a complete picture of an urban environment. Such understanding can then be used for important robotic tasks like path planning for autonomous navigation, obstacle tracking and avoidance, and other areas. In this paper, we propose a system for robust SLAM that works in both static and dynamic environments. To overcome the challenge of dynamic objects in the scene, we propose a new model to incorporate semantic constraints into the reconstruction algorithm. While some of these constraints are based on multi-layered dense CRFs trained over appearance as well as motion cues, other proposed constraints can be expressed as additional terms in the bundle adjustment optimization process, which iteratively refines 3D structure and camera / object motion trajectories. We show results on the challenging KITTI urban dataset for accuracy of motion segmentation and reconstruction of the trajectory and shape of moving objects relative to ground truth. We are able to show average relative error reduction by a significant amount for moving object trajectory reconstruction relative to state-of-the-art methods like VISO 2, as well as standard bundle adjustment algorithms.

    A Theory of Refractive Photo-Light-Path Triangulation

    3D reconstruction of transparent refractive objects like a plastic bottle is challenging: they lack appearance-related visual cues and merely reflect and refract light from the surrounding environment. Amongst several approaches to reconstruct such objects, the seminal work of Light-Path triangulation is highly popular because of its general applicability and analysis of minimal scenarios. A light-path is defined as the piece-wise linear path taken by a ray of light as it passes from source, through the object and into the camera. Transparent refractive objects not only affect the geometric configuration of light-paths but also their radiometric properties. In this paper, we describe a method that combines both geometric and radiometric information to do reconstruction. We show two major consequences of the addition of radiometric cues to the light-path setup. Firstly, we extend the case of scenarios in which reconstruction is plausible while reducing the minimal requirements for a unique reconstruction. This happens as a consequence of the fact that radiometric cues add an additional known variable to the already existing system of equations. Secondly, we present a simple algorithm for reconstruction, owing to the nature of the radiometric cue. We present several synthetic experiments to validate our theories, and show high quality reconstructions in challenging scenarios.

    Estimation de la forme d'objets spéculaires à partir d'un système multi-vues

    No full text
    The task of understanding, 3D reconstruction and analysis of the multiple view geometry related to transparent objects is one of the long-standing challenging problems in computer vision. In this thesis, we look at novel approaches to analyze images of transparent surfaces to deduce their geometric and photometric properties. At first, we analyze the multiview geometry of the simple case of planar refraction. We show how the image of a 3D line is a quartic curve in an image, and thus derive the first imaging model that accounts for planar refraction. We use this approach to then derive other properties that involve multiple cameras, like fundamental and homography matrices. Finally, we propose approaches to estimate the refractive surface parameters and camera poses, given images. We then extend our approach to derive algorithms for recovering the geometry of multiple planar refractive surfaces from a single image. We propose a simple technique to compute the normal of such surfaces in various scenarios, by equating our setup to an axial camera. We then show that the same model could be used to reconstruct reflective surfaces using a piecewise planar assumption. We show encouraging 3D reconstruction results, and analyse the accuracy of results obtained using this approach. We then focus our attention on using both geometric and photometric cues for reconstructing transparent 3D surfaces. We show that in the presence of known illumination, we can recover the shape of such objects from single or multiple views. The cornerstone of our approach is the Fresnel equations, and we both derive and analyze their use for 3D reconstruction. Finally, we show our approach could be used to produce high quality reconstructions, and discuss other potential future applications.
    One of the simplest models of a refractive surface is a planar surface. Although it is ubiquitous in our world in the form of transparent panes, windows, or the surface of still water, very little is known about the multi-view geometry induced by refraction at such a surface. In the first part of this thesis, we analyze the multi-view geometry of a refractive surface. We consider the case where one or more cameras in one medium (e.g. air) view a scene in another medium (e.g. water), with a planar interface between the two media. An underwater photograph, for example, fits this description. Since the perspective projection model does not apply to this scenario, we derive the camera model and its associated projection matrix. We show that 3D lines in the scene correspond to quartic curves in the images. An interesting point about this configuration is that, for a homogeneous refractive index, there is a unique curve in the image for each 3D line in the world. We then describe and develop elements of multi-view geometry such as the fundamental and homography matrices associated with the scene, and give elements for estimating camera pose from several viewpoints. We also show that when the medium is denser, the horizon line corresponds to a conic that can be decomposed to deduce the parameters of the interface.
    Next, we extend our approach by proposing algorithms to estimate the geometry of multiple planar refractive surfaces from a single image. A typical example of such a scenario is looking through an aquarium. We propose a simple method to compute the normals of such surfaces in various scenarios, by reducing the system to an axial camera. In our case, this allows the use of RANSAC-based approaches such as the "8-point" algorithm for computing the fundamental matrix, in a manner similar to the estimation of axial distortion in the computer vision literature. We also show that the same model can be directly adapted to reconstruct reflective surfaces under the assumption that the surfaces are piecewise planar. We present encouraging 3D reconstruction results and analyze their accuracy.
    While the two previous approaches focus only on reconstructing one or more planar refractive surfaces using geometric information alone, specular surfaces also change the way light energy at the surface is redistributed. The corresponding underlying model can be explained by the Fresnel equations. By exploiting both geometric and photometric information, we propose a method to reconstruct the shape of arbitrary specular surfaces. We show that our approach involves a simple acquisition setup. First, we analyze several minimal cases for shape reconstruction and derive a new constraint that combines geometry and Fresnel theory for transparent surfaces. Next, we illustrate the complementary nature of these attributes, which help us obtain additional information about the object that is difficult to obtain otherwise. Finally, we discuss the practical aspects of our reconstruction algorithm and present results on difficult, non-trivial data.
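
    The photometric cue in this abstract rests on the Fresnel equations. As a minimal sketch, and not the thesis's reconstruction method, the snippet below evaluates the standard unpolarized Fresnel reflectance at a smooth dielectric interface. The function name `fresnel_reflectance` and the default indices (1.0 for air, 1.5 for a glass-like material) are illustrative assumptions.

```python
import math


def fresnel_reflectance(theta_i, n1=1.0, n2=1.5):
    """Fraction of unpolarized light reflected at a dielectric interface.

    theta_i is the angle of incidence in radians, measured from the normal.
    n1 and n2 are the refractive indices of the incident and transmitting
    media (assumed values, not taken from the thesis).
    """
    # Snell's law gives the sine of the transmission angle.
    sin_t = n1 / n2 * math.sin(theta_i)
    if abs(sin_t) > 1.0:
        return 1.0  # total internal reflection: all light is reflected
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t * sin_t)
    # Amplitude reflection coefficients for s- and p-polarized light.
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n1 * cos_t - n2 * cos_i) / (n1 * cos_t + n2 * cos_i)
    # Unpolarized light is the average of the two polarized reflectances.
    return 0.5 * (r_s * r_s + r_p * r_p)
```

    Because the reflected fraction depends on the angle between the incoming ray and the surface normal, a measured intensity under known illumination constrains the normal, which is the kind of extra information a combined geometric and photometric approach can use.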

    Multi-View Geometry of the Refractive Plane

    Transparent refractive objects are one of the main problems in geometric vision that have been largely unexplored. The imaging and multi-view geometry of scenes with transparent or translucent objects with refractive properties is relatively less well understood than for opaque objects. The main objective of our work is to analyze the underlying multi-view relationships between cameras, when the scene being viewed contains a single refractive planar surface separating two different media. Such a situation might occur in scenarios like underwater photography. Our main result is to show the existence of geometric entities like the fundamental matrix, and the homography matrix in such instances. In addition, under special circumstances we also show how to compute the relative pose between two cameras immersed in one of the two media.
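
    The setting described here, a single planar interface between two media, can be illustrated with the vector form of Snell's law. The following is a minimal sketch under stated assumptions and does not reproduce the paper's derivation of fundamental or homography matrices. The function name `refract` and the default indices (1.0 for air, 1.33 for water) are hypothetical choices for illustration.

```python
import math


def refract(direction, normal, n1=1.0, n2=1.33):
    """Refract a unit ray direction at a planar interface (Snell's law).

    direction: unit 3-vector of the incoming ray.
    normal: unit 3-vector of the interface, pointing toward the incident side.
    n1, n2: refractive indices of the incident and transmitting media
            (assumed values for air and water).
    Returns the unit refracted direction, or None on total internal reflection.
    """
    dx, dy, dz = direction
    nx, ny, nz = normal
    eta = n1 / n2
    # Cosine of the incidence angle (the normal opposes the incoming ray).
    cos_i = -(dx * nx + dy * ny + dz * nz)
    # Squared cosine of the transmission angle; negative means no refraction.
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if k < 0.0:
        return None
    s = eta * cos_i - math.sqrt(k)
    return (eta * dx + s * nx, eta * dy + s * ny, eta * dz + s * nz)
```

    For example, a camera in air looking down at a water surface with normal `(0, 0, 1)` would call `refract(d, (0.0, 0.0, 1.0))` for each viewing ray `d`. The bending of every ray at the interface is why the usual perspective projection model no longer applies in such scenes.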

    Reliable 2D Tracking using good texture and edge features for Robotic Vision

    No full text
    We present an algorithm for highly reliable tracking of planar objects using visual cues like texture and contour in the presence of feature correspondence errors. These two cues are integrated using a probabilistic formulation. The integration is based on quality ("goodness") factors. The goodness criterion is a generalization of the well-known "good features to track" concept to both the point and edge cases. The motion model of the object is computed as a homography between reference and current frames. A probabilistic formulation of the problem is proposed and implemented using particle filters. Tracking for geometric computation is useful in applications like object grasping, 3D reconstruction, augmented reality, etc. The algorithm combines contour and texture information in a novel manner to achieve robustness that outperforms state-of-the-art methods, as supported by experimental results.
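
    The motion model named in this abstract is a homography between the reference and current frames. As a small sketch of what that means in practice, assuming the 3x3 matrix is already known (estimating it is the tracker's job and is not shown), the hypothetical helper `apply_homography` maps a reference-frame point into the current frame.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists.

    H and the function name are illustrative; the abstract does not
    specify how the homography is represented or estimated.
    """
    x, y = point
    # Multiply H by the homogeneous point (x, y, 1).
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the third coordinate to return to image coordinates.
    return (xh / w, yh / w)
```

    A homography is defined only up to scale, so multiplying every entry of `H` by the same non-zero constant leaves the mapped point unchanged.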

    Combining texture and edge planar trackers based on a local quality metric

    No full text
    A new probabilistic tracking framework for integrating information available from various visual cues is presented in this paper. The framework allows selection of "good" features for each cue, along with factors quantifying their "goodness", in order to select the best form of combination. Two particle-filter-based trackers, which use edge and texture features, run independently. The output of the master tracker is computed by democratic integration using the "goodness" weights. The final output is used as a prior for both trackers in the next iteration. Finally, particle filters are used to deal with non-Gaussian errors in feature extraction / prior computation. Results are shown for planar object tracking.
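
    The abstract does not give the integration formula, so the following is only one plausible reading: the master output is a goodness-weighted average of the individual tracker estimates. The function name `fuse_estimates` and the representation of each estimate as a flat list of parameters are assumptions made for illustration.

```python
def fuse_estimates(estimates, goodness):
    """Combine tracker estimates by normalized goodness weights.

    estimates: list of equal-length parameter lists, one per tracker
               (e.g. edge tracker and texture tracker).
    goodness:  list of non-negative quality scores, one per tracker.
    This weighted average is an assumed stand-in for the paper's
    democratic integration step, not its published formula.
    """
    if len(estimates) != len(goodness):
        raise ValueError("need exactly one goodness score per estimate")
    total = sum(goodness)
    if total <= 0:
        raise ValueError("at least one goodness score must be positive")
    weights = [g / total for g in goodness]
    dim = len(estimates[0])
    # Weighted sum per parameter across all trackers.
    return [sum(w * e[i] for w, e in zip(weights, estimates)) for i in range(dim)]
```

    With this reading, a tracker whose features are currently poor (low goodness) contributes little to the fused output, and the fused output can then be fed back to both trackers as the prior for the next frame.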