4,536 research outputs found

    Deep Forward and Inverse Perceptual Models for Tracking and Prediction

    Full text link
    We consider the problems of learning forward models that map state to high-dimensional images and inverse models that map high-dimensional images to state in robotics. Specifically, we present a perceptual model for generating video frames from state with deep networks, and provide a framework for its use in tracking and prediction tasks. We show that our proposed model greatly outperforms standard deconvolutional methods and GANs for image generation, producing clear, photo-realistic images. We also develop a convolutional neural network model for state estimation and compare the result to an Extended Kalman Filter to estimate robot trajectories. We validate all models on a real robotic system.Comment: 8 pages, International Conference on Robotics and Automation (ICRA) 201

    Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval

    Full text link
    Humans use context and scene knowledge to easily localize moving objects in conditions of complex illumination changes, scene clutter and occlusions. In this paper, we present a method to leverage human knowledge in the form of annotated video libraries in a novel search and retrieval based setting to track objects in unseen video sequences. For every video sequence, a document that represents motion information is generated. Documents of the unseen video are queried against the library at multiple scales to find videos with similar motion characteristics. This provides us with coarse localization of objects in the unseen video. We further adapt these retrieved object locations to the new video using an efficient warping scheme. The proposed method is validated on in-the-wild video surveillance datasets where we outperform state-of-the-art appearance-based trackers. We also introduce a new challenging dataset with complex object appearance changes.Comment: Under review with the IEEE Transactions on Circuits and Systems for Video Technolog

    Statistical M-Estimation and Consistency in Large Deformable Models for Image Warping

    Get PDF
    The problem of defining appropriate distances between shapes or images and modeling the variability of natural images by group transformations is at the heart of modern image analysis. A current trend is the study of probabilistic and statistical aspects of deformation models, and the development of consistent statistical procedure for the estimation of template images. In this paper, we consider a set of images randomly warped from a mean template which has to be recovered. For this, we define an appropriate statistical parametric model to generate random diffeomorphic deformations in two-dimensions. Then, we focus on the problem of estimating the mean pattern when the images are observed with noise. This problem is challenging both from a theoretical and a practical point of view. M-estimation theory enables us to build an estimator defined as a minimizer of a well-tailored empirical criterion. We prove the convergence of this estimator and propose a gradient descent algorithm to compute this M-estimator in practice. Simulations of template extraction and an application to image clustering and classification are also provided

    Dense Motion Estimation for Smoke

    Full text link
    Motion estimation for highly dynamic phenomena such as smoke is an open challenge for Computer Vision. Traditional dense motion estimation algorithms have difficulties with non-rigid and large motions, both of which are frequently observed in smoke motion. We propose an algorithm for dense motion estimation of smoke. Our algorithm is robust, fast, and has better performance over different types of smoke compared to other dense motion estimation algorithms, including state of the art and neural network approaches. The key to our contribution is to use skeletal flow, without explicit point matching, to provide a sparse flow. This sparse flow is upgraded to a dense flow. In this paper we describe our algorithm in greater detail, and provide experimental evidence to support our claims.Comment: ACCV201

    Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging

    Full text link
    Many graphics and vision problems can be expressed as non-linear least squares optimizations of objective functions over visual data, such as images and meshes. The mathematical descriptions of these functions are extremely concise, but their implementation in real code is tedious, especially when optimized for real-time performance on modern GPUs in interactive applications. In this work, we propose a new language, Opt (available under http://optlang.org), for writing these objective functions over image- or graph-structured unknowns concisely and at a high level. Our compiler automatically transforms these specifications into state-of-the-art GPU solvers based on Gauss-Newton or Levenberg-Marquardt methods. Opt can generate different variations of the solver, so users can easily explore tradeoffs in numerical precision, matrix-free methods, and solver approaches. In our results, we implement a variety of real-world graphics and vision applications. Their energy functions are expressible in tens of lines of code, and produce highly-optimized GPU solver implementations. These solver have performance competitive with the best published hand-tuned, application-specific GPU solvers, and orders of magnitude beyond a general-purpose auto-generated solver
    corecore