1,186 research outputs found

    Learning how to be robust: Deep polynomial regression

    Polynomial regression is a recurrent problem with a large number of applications. In computer vision it often appears in motion analysis. Whatever the application, standard methods for regression of polynomial models tend to deliver biased results when the input data is heavily contaminated by outliers. Moreover, the problem is even harder when outliers have strong structure. Departing from problem-tailored heuristics for robust estimation of parametric models, we explore deep convolutional neural networks. Our work aims to find a generic approach for training deep regression models without an explicit need for supervised annotation. We bypass the need for a tailored loss function on the regression parameters by attaching to our model a differentiable hard-wired decoder corresponding to the polynomial operation at hand. We demonstrate the value of our findings by comparing with standard robust regression methods. Furthermore, we demonstrate how to use such models for a real computer vision problem, i.e., video stabilization. The qualitative and quantitative experiments show that neural networks are able to learn robustness for general polynomial regression, with results that clearly surpass those of traditional robust estimation methods. Comment: 18 pages, conferenc
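
The claim that standard regression is biased by structured outliers can be illustrated with a minimal sketch for the degree-1 case. This is not the deep model described in the abstract: it compares ordinary least squares with the classical Theil-Sen estimator, swapped in here only as a simple robust baseline. The data, the function names `ols_line` and `theil_sen_line`, and the outlier placement are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def ols_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def theil_sen_line(xs, ys):
    """Theil-Sen fit: median of pairwise slopes, then median intercept."""
    slopes = [
        (yj - yi) / (xj - xi)
        for (xi, yi), (xj, yj) in combinations(zip(xs, ys), 2)
        if xj != xi
    ]
    b = median(slopes)
    a = median(y - b * x for x, y in zip(xs, ys))
    return a, b


# Hypothetical data: the true model is y = 2 + 0.5*x, and the last three
# samples are shifted together to mimic structured (clustered) outliers.
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x for x in xs]
for i in (17, 18, 19):
    ys[i] += 30.0

ols_a, ols_b = ols_line(xs, ys)
ts_a, ts_b = theil_sen_line(xs, ys)
print(f"OLS:       intercept={ols_a:.3f}, slope={ols_b:.3f}")
print(f"Theil-Sen: intercept={ts_a:.3f}, slope={ts_b:.3f}")
```

In this toy setting the least-squares slope is pulled well away from 0.5 by the three shifted samples, while the median-based fit recovers the underlying line. Theil-Sen only covers straight lines, so it does not address the general polynomial case the abstract targets.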

    An Appearance-Based Method for Parametric Video Registration

    In this paper we address the problem of multi-frame video registration using a combination of an appearance-based technique and a parametric model of the transformations. This technique selects one image as the reference frame and then estimates the transformation that each frame in the sequence has undergone with respect to this absolute reference. Both global and local information are employed in the estimation of these registered images. Global information is applied in terms of linear appearance subspace constraints, under the subspace constancy assumption [4], where the variability of each frame with respect to the reference frame is encoded. Local information is used by means of a polynomial parametric model that estimates the evolution of the velocity field in each frame. The objective function to be minimized considers both issues at the same time, i.e., the appearance representation and the time evolution across the sequence. This function connects the global coordinates in the subspace representation with the time evolution and the parametric optical flow estimates. Thus, the appearance constraints take into account all the images in a sequence in order to estimate the transformation parameters

    Vision and Learning for Deliberative Monocular Cluttered Flight

    Cameras provide a rich source of information while being passive, cheap and lightweight for small and medium Unmanned Aerial Vehicles (UAVs). In this work we present the first implementation of receding horizon control, which is widely used in ground vehicles, with monocular vision as the only sensing mode for autonomous UAV flight in dense clutter. We make it feasible on UAVs via a number of contributions: a novel coupling of perception and control via relevant and diverse multiple interpretations of the scene around the robot, leveraging recent advances in machine learning to showcase anytime budgeted cost-sensitive feature selection, and fast non-linear regression for monocular depth prediction. We empirically demonstrate the efficacy of our novel pipeline via real-world experiments of more than 2 km through dense trees with a quadrotor built from off-the-shelf parts. Moreover, our pipeline is designed to combine information from other modalities, such as stereo and lidar, if available

    Event-Based Motion Segmentation by Motion Compensation

    In contrast to traditional cameras, whose pixels have a common exposure time, event-based cameras are novel bio-inspired sensors whose pixels work independently and asynchronously output intensity changes (called "events"), with microsecond resolution. Since events are caused by the apparent motion of objects, event-based cameras sample visual information based on the scene dynamics and are, therefore, a more natural fit than traditional cameras to acquire motion, especially at high speeds, where traditional cameras suffer from motion blur. However, distinguishing between events caused by different moving objects and by the camera's ego-motion is a challenging task. We present the first per-event segmentation method for splitting a scene into independently moving objects. Our method jointly estimates the event-object associations (i.e., segmentation) and the motion parameters of the objects (or the background) by maximization of an objective function, which builds upon recent results on event-based motion compensation. We provide a thorough evaluation of our method on a public dataset, outperforming the state of the art by as much as 10%. We also show the first quantitative evaluation of a segmentation algorithm for event cameras, yielding around 90% accuracy at 4 pixels of relative displacement. Comment: When viewed in Acrobat Reader, several of the figures animate. Video: https://youtu.be/0q6ap_OSBA

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling