Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians
Convolutional neural nets (CNNs) have demonstrated remarkable performance in
recent history. Such approaches tend to work in a unidirectional bottom-up
feed-forward fashion. However, practical experience and biological evidence
tell us that feedback plays a crucial role, particularly for detailed spatial
understanding tasks. This work explores bidirectional architectures that also
reason with top-down feedback: neural units are influenced by both lower and
higher-level units.
We do so by treating units as rectified latent variables in a quadratic
energy function, which can be seen as a hierarchical Rectified Gaussian model
(RGs). We show that RGs can be optimized with a quadratic program (QP), which
can in turn be optimized with a recurrent neural network (with rectified linear
units). This allows RGs to be trained with GPU-optimized gradient descent. From
a theoretical perspective, RGs help establish a connection between CNNs and
hierarchical probabilistic models. From a practical perspective, RGs are well
suited for detailed spatial tasks that can benefit from top-down reasoning. We
illustrate them on the challenging task of keypoint localization under
occlusions, where local bottom-up evidence may be misleading. We demonstrate
state-of-the-art results on challenging benchmarks.
Comment: To appear in CVPR 201
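The link between a non-negative quadratic program and a recurrent network of rectified linear units can be illustrated with a small sketch. This is not the authors' hierarchical model or their optimizer: it is a flat toy problem solved with projected gradient descent, chosen only because each projected step has the form of a ReLU layer with tied weights. The names `solve_nonneg_qp`, `A`, and `b` are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def solve_nonneg_qp(A, b, steps=500):
    """Minimize 0.5 * z^T A z - b^T z subject to z >= 0 (A symmetric positive definite)."""
    # Step size 1/L, where L is the largest eigenvalue of A.
    eta = 1.0 / np.linalg.eigvalsh(A)[-1]
    W = np.eye(len(b)) - eta * A   # recurrent weights, shared across iterations
    u = eta * b                    # constant input term
    z = np.zeros_like(b, dtype=float)
    for _ in range(steps):
        # One projected gradient step is exactly one ReLU layer: z <- max(0, W z + u).
        z = relu(W @ z + u)
    return z

# Toy problem (assumed values): the unconstrained minimizer has a negative
# second coordinate, so the constraint z >= 0 is active there.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
z_star = solve_nonneg_qp(A, b)
```

Unrolling the loop for a fixed number of steps gives a feed-forward network with tied weights, which is the sense in which such a program can be trained by gradient descent.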
Understanding the Limitations of CNN-based Absolute Camera Pose Regression
Visual localization is the task of accurate camera pose estimation in a known
scene. It is a key problem in computer vision and robotics, with applications
including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality.
Traditionally, the localization problem has been tackled using 3D geometry.
Recently, end-to-end approaches based on convolutional neural networks have
become popular. These methods learn to directly regress the camera pose from an
input image. However, they do not achieve the same level of pose accuracy as 3D
structure-based methods. To understand this behavior, we develop a theoretical
model for camera pose regression. We use our model to predict failure cases for
pose regression techniques and verify our predictions through experiments. We
furthermore use our model to show that pose regression is more closely related
to pose approximation via image retrieval than to accurate pose estimation via
3D structure. A key result is that current approaches do not consistently
outperform a handcrafted image retrieval baseline. This clearly shows that
additional research is needed before pose regression algorithms are ready to
compete with structure-based methods.
Comment: Initial version of a paper accepted to CVPR 201
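Pose approximation via image retrieval, the kind of baseline the abstract compares against, can be sketched as a nearest-neighbor lookup. This is a minimal illustration under stated assumptions, not the paper's baseline: the descriptors are made-up 2D vectors, poses are reduced to 3D positions with orientation omitted, and `approximate_pose_by_retrieval` is a hypothetical name.

```python
import numpy as np

def approximate_pose_by_retrieval(query_desc, db_descs, db_poses):
    """Return the stored pose of the database image whose descriptor is closest to the query."""
    dists = np.linalg.norm(db_descs - query_desc, axis=1)  # Euclidean distance to each database image
    nearest = int(np.argmin(dists))
    return db_poses[nearest], nearest

# Toy database (assumed values): three images with 2D descriptors and 3D camera positions.
db_descs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
db_poses = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 5.0, 0.0]])

pose, idx = approximate_pose_by_retrieval(np.array([0.9, 0.1]), db_descs, db_poses)
```

The returned pose is always one of the stored poses, so accuracy is bounded by how densely the database covers the scene. This is why retrieval is described as pose approximation rather than pose estimation.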
Automatic Craniomaxillofacial Landmark Digitization via Segmentation-Guided Partially-Joint Regression Forest Model and Multiscale Statistical Features
The goal of this paper is to automatically digitize craniomaxillofacial (CMF) landmarks efficiently and accurately from cone-beam computed tomography (CBCT) images, by addressing the challenge caused by large morphological variations across patients and image artifacts of CBCT images.
A Collaborative Visual Localization Scheme for a Low-Cost Heterogeneous Robotic Team with Non-Overlapping Perspectives
This paper presents and evaluates a relative localization scheme for a heterogeneous team of low-cost mobile robots. An error-state, complementary Kalman Filter was developed to fuse analytically derived uncertainty of stereoscopic pose measurements of an aerial robot, made by a ground robot, with the inertial/visual proprioceptive measurements of both robots. Results show that the sources of error (image quantization, asynchronous sensors, and a non-stationary bias) were sufficiently modeled to estimate the pose of the aerial robot. In both simulation and experiments, we demonstrate the proposed methodology with a heterogeneous robot team, consisting of a UAV and a UGV tasked with collaboratively localizing themselves while avoiding obstacles in an unknown environment. The team is able to identify a goal location and obstacles in the environment and plan a path for the UGV to the goal location. The results demonstrate localization accuracies of 2 cm to 4 cm, on average, while the robots operate at a distance of between 1 m and 4 m from each other.
Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression
This paper addresses the problem of localizing audio sources using binaural
measurements. We propose a supervised formulation that simultaneously localizes
multiple sources at different locations. The approach is intrinsically
efficient because, contrary to prior work, it relies neither on source
separation, nor on monaural segregation. The method starts with a training
stage that establishes a locally-linear Gaussian regression model between the
directional coordinates of all the sources and the auditory features extracted
from binaural measurements. While fixed-length wide-spectrum sounds (white
noise) are used for training to reliably estimate the model parameters, we show
that the testing (localization) can be extended to variable-length
sparse-spectrum sounds (such as speech), thus enabling a wide range of
realistic applications. Indeed, we demonstrate that the method can be used for
audio-visual fusion, namely to map speech signals onto images and hence to
spatially align the audio and visual modalities, which makes it possible to discriminate
between speaking and non-speaking faces. We release a novel corpus of real-room
recordings that allow quantitative evaluation of the co-localization method in
the presence of one or two sound sources. Experiments demonstrate increased
accuracy and speed relative to several state-of-the-art methods.
Comment: 15 pages, 8 figures
A selective overview of nonparametric methods in financial econometrics
This paper gives a brief overview of the nonparametric techniques that are
useful for financial econometric problems. The problems include estimation and
inferences of instantaneous returns and volatility functions of
time-homogeneous and time-dependent diffusion processes, and estimation of
transition densities and state price densities. We first briefly describe the
problems and then outline the main techniques and results. Some useful
probabilistic aspects of diffusion processes are also briefly summarized to
facilitate our presentation and applications.
Comment: 32 pages, including 7 figures
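One standard nonparametric technique for the instantaneous return (drift) of a time-homogeneous diffusion is kernel regression of scaled increments on the current state. The sketch below is a generic Nadaraya-Watson estimator with a Gaussian kernel, not necessarily the estimator the paper surveys. The function name `nw_drift`, the bandwidth `h`, and the noise-free test path are assumptions for illustration.

```python
import numpy as np

def nw_drift(x, path, dt, h):
    """Nadaraya-Watson estimate of the drift mu(x) from a path sampled every dt time units."""
    states = path[:-1]                                 # X_i
    increments = np.diff(path) / dt                    # (X_{i+1} - X_i) / dt
    weights = np.exp(-0.5 * ((states - x) / h) ** 2)   # Gaussian kernel centered at x
    return np.sum(weights * increments) / np.sum(weights)

# Noise-free toy path (assumed): Euler steps of dX = -X dt, so the true drift is mu(x) = -x.
dt = 0.01
path = np.empty(301)
path[0] = 2.0
for i in range(300):
    path[i + 1] = path[i] - path[i] * dt

drift_at_1 = nw_drift(1.0, path, dt, h=0.1)  # should be close to -1
```

Replacing the scaled increments with squared increments divided by `dt` gives the analogous kernel estimate of the squared volatility function.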