5,131 research outputs found

    An Appearance-Based Method for Parametric Video Registration

    Get PDF
    In this paper we address the problem of multi frame video registration using the combination of an appearance-based technique and a parametric model of the transformations. This technique uses an image that is selected as reference frame, and therefore, estimates the transformation that occurred to each frame in the sequence respect to this absolute referenced one. Both global and local information are employed to the estimation of these registered images. Global information is applied in terms of linear appearance subspace constraints, under the subspace constancy assumption [4], where variabilities of each frame respect to the reference frame are encoded. Local information is used by means of a polynomial parametric model that estimates the velocities field evoluton in each frame. The objective function to be minimized considers both issues at the same time, i.e., the appearance representation and the time evolution across the sequence. This function is the connection between the global coordinates in the subspace representation and the time evolution and the parametric optical flow estimates. Thus, the appearance constraints result to take into account al the images in a sequence in order to estimate the transformation parameters

    Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes

    Full text link
    Unsupervised deep learning for optical flow computation has achieved promising results. Most existing deep-net based methods rely on image brightness consistency and local smoothness constraint to train the networks. Their performance degrades at regions where repetitive textures or occlusions occur. In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning. In particular, we investigate multiple ways of enforcing the epipolar constraint in flow estimation. To alleviate a "chicken-and-egg" type of problem encountered in dynamic scenes where multiple motions may be present, we propose a low-rank constraint as well as a union-of-subspaces constraint for training. Experimental results on various benchmarking datasets show that our method achieves competitive performance compared with supervised methods and outperforms state-of-the-art unsupervised deep-learning methods.Comment: CVPR 201

    Better Feature Tracking Through Subspace Constraints

    Full text link
    Feature tracking in video is a crucial task in computer vision. Usually, the tracking problem is handled one feature at a time, using a single-feature tracker like the Kanade-Lucas-Tomasi algorithm, or one of its derivatives. While this approach works quite well when dealing with high-quality video and "strong" features, it often falters when faced with dark and noisy video containing low-quality features. We present a framework for jointly tracking a set of features, which enables sharing information between the different features in the scene. We show that our method can be employed to track features for both rigid and nonrigid motions (possibly of few moving bodies) even when some features are occluded. Furthermore, it can be used to significantly improve tracking results in poorly-lit scenes (where there is a mix of good and bad features). Our approach does not require direct modeling of the structure or the motion of the scene, and runs in real time on a single CPU core.Comment: 8 pages, 2 figures. CVPR 201

    Reducing "Structure From Motion": a General Framework for Dynamic Vision - Part 2: Experimental Evaluation

    Get PDF
    A number of methods have been proposed in the literature for estimating scene-structure and ego-motion from a sequence of images using dynamical models. Although all methods may be derived from a "natural" dynamical model within a unified framework, from an engineering perspective there are a number of trade-offs that lead to different strategies depending upon the specific applications and the goals one is targeting. Which one is the winning strategy? In this paper we analyze the properties of the dynamical models that originate from each strategy under a variety of experimental conditions. For each model we assess the accuracy of the estimates, their robustness to measurement noise, sensitivity to initial conditions and visual angle, effects of the bas-relief ambiguity and occlusions, dependence upon the number of image measurements and their sampling rate

    Reducing "Structure From Motion": a General Framework for Dynamic Vision - Part 1: Modeling

    Get PDF
    The literature on recursive estimation of structure and motion from monocular image sequences comprises a large number of different models and estimation techniques. We propose a framework that allows us to derive and compare all models by following the idea of dynamical system reduction. The "natural" dynamic model, derived by the rigidity constraint and the perspective projection, is first reduced by explicitly decoupling structure (depth) from motion. Then implicit decoupling techniques are explored, which consist of imposing that some function of the unknown parameters is held constant. By appropriately choosing such a function, not only can we account for all models seen so far in the literature, but we can also derive novel ones

    Reducing “Structure from Motion”: a general framework for dynamic vision. 2. Implementation and experimental assessment

    Get PDF
    For pt.1 see ibid., p.933-42 (1998). A number of methods have been proposed in the literature for estimating scene-structure and ego-motion from a sequence of images using dynamical models. Despite the fact that all methods may be derived from a “natural” dynamical model within a unified framework, from an engineering perspective there are a number of trade-offs that lead to different strategies depending upon the applications and the goals one is targeting. We want to characterize and compare the properties of each model such that the engineer may choose the one best suited to the specific application. We analyze the properties of filters derived from each dynamical model under a variety of experimental conditions, assess the accuracy of the estimates, their robustness to measurement noise, sensitivity to initial conditions and visual angle, effects of the bas-relief ambiguity and occlusions, dependence upon the number of image measurements and their sampling rate

    Dynamic Body VSLAM with Semantic Constraints

    Full text link
    Image based reconstruction of urban environments is a challenging problem that deals with optimization of large number of variables, and has several sources of errors like the presence of dynamic objects. Since most large scale approaches make the assumption of observing static scenes, dynamic objects are relegated to the noise modeling section of such systems. This is an approach of convenience since the RANSAC based framework used to compute most multiview geometric quantities for static scenes naturally confine dynamic objects to the class of outlier measurements. However, reconstructing dynamic objects along with the static environment helps us get a complete picture of an urban environment. Such understanding can then be used for important robotic tasks like path planning for autonomous navigation, obstacle tracking and avoidance, and other areas. In this paper, we propose a system for robust SLAM that works in both static and dynamic environments. To overcome the challenge of dynamic objects in the scene, we propose a new model to incorporate semantic constraints into the reconstruction algorithm. While some of these constraints are based on multi-layered dense CRFs trained over appearance as well as motion cues, other proposed constraints can be expressed as additional terms in the bundle adjustment optimization process that does iterative refinement of 3D structure and camera / object motion trajectories. We show results on the challenging KITTI urban dataset for accuracy of motion segmentation and reconstruction of the trajectory and shape of moving objects relative to ground truth. We are able to show average relative error reduction by a significant amount for moving object trajectory reconstruction relative to state-of-the-art methods like VISO 2, as well as standard bundle adjustment algorithms

    Motion from "X" by Compensating "Y"

    Get PDF
    This paper analyzes the geometry of the visual motion estimation problem in relation to transformations of the input (images) that stabilize particular output functions such as the motion of a point, a line and a plane in the image. By casting the problem within the popular "epipolar geometry", we provide a common framework for including constraints such as point, line of plane fixation by just considering "slices" of the parameter manifold. The models we provide can be used for estimating motion from a batch using the preferred optimization techniques, or for defining dynamic filters that estimate motion from a causal sequence. We discuss methods for performing the necessary compensation by either controlling the support of the camera or by pre-processing the images. The compensation algorithms may be used also for recursively fitting a plane in 3-D both from point-features or directly from brightness. Conversely, they may be used for estimating motion relative to the plane independent of its parameters
    • …
    corecore