1,186 research outputs found
Learning how to be robust: Deep polynomial regression
Polynomial regression is a recurrent problem with a large number of
applications. In computer vision it often appears in motion analysis. Whatever
the application, standard methods for regression of polynomial models tend to
deliver biased results when the input data is heavily contaminated by outliers.
Moreover, the problem is even harder when outliers have strong structure.
Departing from problem-tailored heuristics for robust estimation of parametric
models, we explore deep convolutional neural networks. Our work aims to find a
generic approach for training deep regression models without the explicit need
of supervised annotation. We bypass the need for a tailored loss function on
the regression parameters by attaching to our model a differentiable hard-wired
decoder corresponding to the polynomial operation at hand. We demonstrate the
value of our findings by comparing with standard robust regression methods.
Furthermore, we demonstrate how to use such models for a real computer vision
problem, i.e., video stabilization. The qualitative and quantitative
experiments show that neural networks are able to learn robustness for general
polynomial regression, with results that well overpass scores of traditional
robust estimation methods.Comment: 18 pages, conferenc
An Appearance-Based Method for Parametric Video Registration
In this paper we address the problem of multi frame video registration using the combination of an appearance-based technique and a parametric model of the transformations. This technique uses an image that is selected as reference frame, and therefore, estimates the transformation that occurred to each frame in the sequence respect to this absolute referenced one. Both global and local information are employed to the estimation of these registered images. Global information is applied in terms of linear appearance subspace constraints, under the subspace constancy assumption [4], where variabilities of each frame respect to the reference frame are encoded. Local information is used by means of a polynomial parametric model that estimates the velocities field evoluton in each frame. The objective function to be minimized considers both issues at the same time, i.e., the appearance representation and the time evolution across the sequence. This function is the connection between the global coordinates in the subspace representation and the time evolution and the parametric optical flow estimates. Thus, the appearance constraints result to take into account al the images in a sequence in order to estimate the transformation parameters
Vision and Learning for Deliberative Monocular Cluttered Flight
Cameras provide a rich source of information while being passive, cheap and
lightweight for small and medium Unmanned Aerial Vehicles (UAVs). In this work
we present the first implementation of receding horizon control, which is
widely used in ground vehicles, with monocular vision as the only sensing mode
for autonomous UAV flight in dense clutter. We make it feasible on UAVs via a
number of contributions: novel coupling of perception and control via relevant
and diverse, multiple interpretations of the scene around the robot, leveraging
recent advances in machine learning to showcase anytime budgeted cost-sensitive
feature selection, and fast non-linear regression for monocular depth
prediction. We empirically demonstrate the efficacy of our novel pipeline via
real world experiments of more than 2 kms through dense trees with a quadrotor
built from off-the-shelf parts. Moreover our pipeline is designed to combine
information from other modalities like stereo and lidar as well if available
Event-Based Motion Segmentation by Motion Compensation
In contrast to traditional cameras, whose pixels have a common exposure time,
event-based cameras are novel bio-inspired sensors whose pixels work
independently and asynchronously output intensity changes (called "events"),
with microsecond resolution. Since events are caused by the apparent motion of
objects, event-based cameras sample visual information based on the scene
dynamics and are, therefore, a more natural fit than traditional cameras to
acquire motion, especially at high speeds, where traditional cameras suffer
from motion blur. However, distinguishing between events caused by different
moving objects and by the camera's ego-motion is a challenging task. We present
the first per-event segmentation method for splitting a scene into
independently moving objects. Our method jointly estimates the event-object
associations (i.e., segmentation) and the motion parameters of the objects (or
the background) by maximization of an objective function, which builds upon
recent results on event-based motion-compensation. We provide a thorough
evaluation of our method on a public dataset, outperforming the
state-of-the-art by as much as 10%. We also show the first quantitative
evaluation of a segmentation algorithm for event cameras, yielding around 90%
accuracy at 4 pixels relative displacement.Comment: When viewed in Acrobat Reader, several of the figures animate. Video:
https://youtu.be/0q6ap_OSBA
Model-Based Environmental Visual Perception for Humanoid Robots
The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling
- …