6,496 research outputs found
Learning to Navigate the Energy Landscape
In this paper, we present a novel and efficient architecture for addressing
computer vision problems that use `Analysis by Synthesis'. Analysis by
synthesis involves the minimization of the reconstruction error which is
typically a non-convex function of the latent target variables.
State-of-the-art methods adopt a hybrid scheme where discriminatively trained
predictors like Random Forests or Convolutional Neural Networks are used to
initialize local search algorithms. While these methods have been shown to
produce promising results, they often get stuck in local optima. Our method
goes beyond the conventional hybrid architecture by not only proposing multiple
accurate initial solutions but by also defining a navigational structure over
the solution space that can be used for extremely efficient gradient-free local
search. We demonstrate the efficacy of our approach on the challenging problem
of RGB Camera Relocalization. To make the RGB camera relocalization problem
particularly challenging, we introduce a new dataset of 3D environments which
are significantly larger than those found in other publicly-available datasets.
Our experiments reveal that the proposed method is able to achieve
state-of-the-art camera relocalization results. We also demonstrate the
generalizability of our approach on Hand Pose Estimation and Image Retrieval
tasks
The World of Fast Moving Objects
The notion of a Fast Moving Object (FMO), i.e. an object that moves over a
distance exceeding its size within the exposure time, is introduced. FMOs may,
and typically do, rotate with high angular speed. FMOs are very common in
sports videos, but are not rare elsewhere. In a single frame, such objects are
often barely visible and appear as semi-transparent streaks.
A method for the detection and tracking of FMOs is proposed. The method
consists of three distinct algorithms, which form an efficient localization
pipeline that operates successfully in a broad range of conditions. We show
that it is possible to recover the appearance of the object and its axis of
rotation, despite its blurred appearance. The proposed method is evaluated on a
new annotated dataset. The results show that existing trackers are inadequate
for the problem of FMO localization and a new approach is required. Two
applications of localization, temporal super-resolution and highlighting, are
presented
A group-theoretic approach to formalizing bootstrapping problems
The bootstrapping problem consists in designing agents that learn a model of themselves and the world, and utilize it to achieve useful tasks. It is different from other learning problems as the agent starts with uninterpreted observations and commands, and with minimal prior information about the world. In this paper, we give a mathematical formalization of this aspect of the problem. We argue that the vague constraint of having "no prior information" can be recast as a precise algebraic condition on the agent: that its behavior is invariant to particular classes of nuisances on the world, which we show can be well represented by actions of groups (diffeomorphisms, permutations, linear transformations) on observations and commands. We then introduce the class of bilinear gradient dynamics sensors (BGDS) as a candidate for learning generic robotic sensorimotor cascades. We show how framing the problem as rejection of group nuisances allows a compact and modular analysis of typical preprocessing stages, such as learning the topology of the sensors. We demonstrate learning and using such models on real-world range-finder and camera data from publicly available datasets
A one-parameter family of interpolating kernels for Smoothed Particle Hydrodynamics studies
A set of interpolating functions of the type f(v)={(sin[v pi/2])/(v pi/2)}^n
is analyzed in the context of the smoothed-particle hydrodynamics (SPH)
technique. The behaviour of these kernels for several values of the parameter n
has been studied either analytically as well as numerically in connection with
several tests carried out in two dimensions. The main advantage of this kernel
relies in its flexibility because for n=3 it is similar to the standard widely
used cubic-spline, whereas for n>3 the interpolating function becomes more
centrally condensed, being well suited to track discontinuities such as shock
fronts and thermal waves.Comment: 36 pages, 12 figures (low-resolution), published in J.C.
- …